Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Avoid processing a single commit multiple times#1058

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
smortex wants to merge1 commit intogithub-changelog-generator:master
base:master
Choose a base branch
Loading
fromsmortex:avoid-processing-a-single-commit-multiple-time

Conversation

smortex
Copy link
Contributor

@smortexsmortex commentedMay 31, 2025
edited
Loading

When listing the commits that appear in a tag, some commits are added multiple time to the resulting Set. As a Set cannot contain duplicate, this is only a waste of computing resources, but with large repositories with a lot of branches and merges, this can require a tremendous amount of time.

To illustrate the issue, consider the following git history with 5 commits:

A---B---C---D     \ /      E

Assuming A is the root commit, and we callcommits_in_tag(D). Let's inspect the value ofshas andqueue at the beginning of the loop, and what is pushed to these variables on each iteration:

  1. shas = {}; queue = [D] #=> shas << D; queue << C
  2. shas = {D}; queue = [C] #=> shas << C; queue << [B, E]
  3. shas = {D, C}; queue = [B, E] #=> shas << B; queue << A
  4. shas = {D, C, B}; queue = [E, A] #=> shas << E; queue << B
  5. shas = {D, C, B, E}; queue = [A, B] #=> shas << A
  6. shas = {D, C, B, E, A}; queue = [B] #=> shas << B; queue << A
  7. shas = {D, C, B, E, A}; queue = [A] #=> shas << A
  8. shas = {D, C, B, E, A}; queue = [] #=> exit loop

As you can see above, 6 and 7 attempt to add B and A to the set, but they where already added at step 3 and 5, so these action are no-op.

This change checks if the current commit is already present inshas, and if so skip to the next queued commit. The above example becomes:

  1. shas = {}; queue = [D] #=> shas << D; queue << C
  2. shas = {D}; queue = [C] #=> shas << C; queue << [B, E]
  3. shas = {D, C}; queue = [B, E] #=> shas << B; queue << A
  4. shas = {D, C, B}; queue = [E, A] #=> shas << E; queue << B
  5. shas = {D, C, B, E}; queue = [A, B] #=> shas << A
  6. shas = {D, C, B, E, A}; queue = [B] #=> next
  7. shas = {D, C, B, E, A}; queue = [] #=> exit loop

While we where previously unable to generate the ChangeLog of one of our projects in a reasonable amount of time (Ctrl+C after 15+ minutes of waiting at 100% CPU), this change allows us to generate it in ~45s on my laptop.

kenyon reacted with thumbs up emoji
@smortex
Copy link
ContributorAuthor

While we where previously unable to generate the ChangeLog of one of our projects in a reasonable amount of time (Ctrl+C after 15+ minutes of waiting at 100% CPU), this change allows us to generate it in ~45s on my laptop.

The repo in question:https://github.com/OpenVoxProject/openvox-agent

bundle exec bin/github_changelog_generator --user OpenVoxProject --project openvox-agent

When listing the commits that appear in a tag, some commits are addedmultiple time to the resulting Set.  As a Set cannot contain duplicate,this is only a waste of computing resources, but with large repositorieswith a lot of branches and merges, this can require a tremendous amountof time.To illustrate the issue, consider the following git history with 5commits:```A---B---C---D     \ /      E```Assuming A is the root commit, and we call `commits_in_tag(D)`.  Let'sinspect the value of `shas` and `queue` at the beginning of the loop,and what is pushed to these variables on each iteration:1. `shas = {}; queue = [D] #=> shas << D; queue << C`2. `shas = {D}; queue = [C] #=> shas << C; queue << [B, E]`3. `shas = {D, C}; queue = [B, E] #=> shas << B; queue << A`4. `shas = {D, C, B}; queue = [E, A] #=> shas << E; queue << B`5. `shas = {D, C, B, E}; queue = [A, B] #=> shas << A`6. `shas = {D, C, B, E, A}; queue = [B] #=> shas << B; queue << A`7. `shas = {D, C, B, E, A}; queue = [A] #=> shas << A`8. `shas = {D, C, B, E, A}; queue = [] #=> exit loop`As you can see above, 6 and 7 attempt to add B and A to the set, butthey where already added at step 3 and 5, so these action are no-op.This change checks if the current commit is already present in `shas`,and if so skip to the next queued commit.  The above example becomes:1. `shas = {}; queue = [D] #=> shas << D; queue << C`2. `shas = {D}; queue = [C] #=> shas << C; queue << [B, E]`3. `shas = {D, C}; queue = [B, E] #=> shas << B; queue << A`4. `shas = {D, C, B}; queue = [E, A] #=> shas << E; queue << B`5. `shas = {D, C, B, E}; queue = [A, B] #=> shas << A`6. `shas = {D, C, B, E, A}; queue = [B] #=> next`7. `shas = {D, C, B, E, A}; queue = [] #=> exit loop`While we where previously unable to generate the ChangeLog of one of ourprojects in a reasonable amount of time (Ctrl+C after 15+ minutes ofwaiting at 100% CPU), this change allows us to generate it in ~45s on mylaptop.
@smortexsmortexforce-pushed theavoid-processing-a-single-commit-multiple-time branch from4fa93b7 toa0344afCompareMay 31, 2025 07:19
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@kenyonkenyonkenyon left review comments

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

2 participants
@smortex@kenyon

[8]ページ先頭

©2009-2025 Movatter.jp