@cloud-fan
Contributor

What changes were proposed in this pull request?

One step of the release process is to generate the list of contributor names and put it in the release notes. The script does a lot of work to find the full names of contributors, but I think it's more useful to show the github user id for credits, instead of the full name.

This PR simplifies this step: contributors are now listed in the form "github-user-id (Full Name)", or by GitHub user ID alone if the user has not set a full name in their profile.
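
A minimal sketch of the formatting rule described above, not the release script's actual code. The function name `format_contributor`, its parameters, and the sample user are hypothetical.

```python
from typing import Optional


def format_contributor(github_username: str, full_name: Optional[str] = None) -> str:
    """Return 'username (Full Name)', or just 'username' when no full name is set."""
    # Treat a missing or whitespace-only profile name as "no full name".
    name = (full_name or "").strip()
    return f"{github_username} ({name})" if name else github_username


# Hypothetical sample data, for illustration only.
print(format_contributor("some-user", "Some User"))
print(format_contributor("some-user"))
```

Falling back to the bare user ID keeps the list complete without anyone having to look up names by hand.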

Why are the changes needed?

Simplify release process.

Does this PR introduce any user-facing change?

no

How was this patch tested?

Manually, by generating the contributor list for 4.1.0.

Was this patch authored or co-authored using generative AI tooling?

cursor 2.3.29

@github-actions

github-actions bot commented Jan 8, 2026

JIRA Issue Information

=== Improvement SPARK-54960 ===
Summary: Simplify generating contributors of the release process
Assignee: None
Status: Open
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

@cloud-fan
Contributor Author

cc @dongjoon-hyun @HyukjinKwon

# The PR number and GitHub username are in the commit message
# itself and cannot be accessed through any GitHub API
pr_number = None
github_username = None
Member

I think we should probably skip the case where github_username is not found. In 99% of cases github_username should be available, but there are exceptions, such as deleted accounts or co-authored commits whose co-authors don't have GitHub accounts.

Contributor Author

Yes, it's skipped; see https://github.com/apache/spark/pull/53728/files#diff-b5c7570e43a02752a0585f7f3de43edfc172d278540864775de7fd41e206139aR167

This function only lists commits, and it should still list all of them.
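
The split discussed in this thread can be sketched roughly as follows: every commit is listed, and commits without a GitHub username are skipped only when the contributor list is built. This is an illustration, not the code in this PR. The `Commit` class, the helper names, the sample messages, and the assumed `Closes #<number> from <username>/<branch>.` line in the commit message are all assumptions.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed commit-message convention: "Closes #<pr> from <username>/<branch>."
CLOSES_RE = re.compile(r"Closes #(\d+) from ([A-Za-z0-9-]+)/")


@dataclass
class Commit:
    message: str
    pr_number: Optional[int] = None
    github_username: Optional[str] = None


def parse_commit(message: str) -> Commit:
    """List step: always returns a Commit, even if nothing could be parsed."""
    commit = Commit(message=message)
    match = CLOSES_RE.search(message)
    if match:
        commit.pr_number = int(match.group(1))
        commit.github_username = match.group(2)
    return commit


def contributors(commits: List[Commit]) -> List[str]:
    """Credit step: skip commits with no username, de-duplicate, and sort."""
    return sorted({c.github_username for c in commits if c.github_username})


# Hypothetical sample messages, for illustration only.
commits = [
    parse_commit("Fix a bug\n\nCloses #123 from some-user/fix-bug."),
    parse_commit("Commit whose author has no GitHub account"),
]
print(len(commits))           # both commits are still listed
print(contributors(commits))  # only the commit with a username is credited
```

Keeping the filter in the credit step, rather than in the listing step, means the commit list stays complete for any other use.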

Member

@dongjoon-hyun dongjoon-hyun left a comment

I understand that tags and release tags already work like that inside the GitHub repository, and I understand the rationale of this proposal. However, why should the Apache Software Foundation community enforce that? This sounds a little controversial to me because it's not a perfect solution either.

I think it's more useful to show the github user id for credits, instead of the full name.

If we need this change explicitly, could you please send a short announcement to dev@spark before merging this, @cloud-fan? To be clear, I would not disagree with this change, provided it is announced once on the ASF mailing list.

@cloud-fan
Contributor Author

@dongjoon-hyun this proposal does not drop the full name. It just removes the human toil of figuring out the full name by reading it from the GitHub user profile. In most cases we just pick the full name from the GitHub user profile anyway; it's very rare that you happen to know the person and can correct their full name.

@dongjoon-hyun
Member

dongjoon-hyun commented Jan 9, 2026

To be clear, I'm not talking about dropping information (the full name). Instead, it's about requiring the ASF community and contributors to expose additional information. IIUC, there are three fundamental changes in this PR.

  1. This PR proposes to ignore the author name of the commit. It requires contributors to use their GitHub information. I believe this is an acceptable change because that is the majority case.
  2. This PR proposes to expose GitHub IDs, which forces the Apache Spark community to show company prefixes from company-owned IDs like @databricks-david-lewis, @mswit-databricks, @michal-databricks, and @vladimirg-db.

Please don't get me wrong. We can do this, but we need to discuss it officially on the mailing list instead of landing it as a submarine patch, because this PR is also proposed by an employee of a company that seems to have a policy of using company-owned IDs like the above for some purpose.

That's all I asked for in the previous comment, @cloud-fan.

Member

@dongjoon-hyun dongjoon-hyun left a comment

Thank you for sharing the proposal with the broader audience, @cloud-fan.

+1 from my side.

@cloud-fan
Contributor Author

thanks for the review, merging to master!

@cloud-fan cloud-fan closed this in de34309 Jan 12, 2026