Add more checks for the validity of refnames #1672
This change adds checks based on the rules described in [0] in order to more robustly check a refname's validity.

[0]: https://git-scm.com/docs/git-check-ref-format
facutuesca commented Sep 21, 2023
To add a bit more context: I followed the general approach mentioned by @Byron and used by gitoxide, which is a …

For comparison, here's the naive approach, where the logic separation matches the docs (one rule per …

    def _check_ref_name_valid_naive(ref_path: PathLike) -> None:
        # Based on https://git-scm.com/docs/git-check-ref-format/
        if any([component.startswith(".") or component.endswith(".lock") for component in ref_path.split("/")]):
            raise ValueError(
                f"Invalid reference '{ref_path}': components cannot start with '.' or end with '.lock'"
            )
        elif ".." in str(ref_path):
            raise ValueError(f"Invalid reference '{ref_path}': references cannot contain '..'")
        elif any([ord(c) < 32 or ord(c) == 127 or c in [" ", "~", "^", ":"] for c in ref_path]):
            raise ValueError(
                f"Invalid reference '{ref_path}': references cannot contain ASCII control characters, spaces, tildes (~), carets (^) or colons (:)"
            )
        elif any([c in ["?", "*", "["] for c in ref_path]):
            raise ValueError(
                f"Invalid reference '{ref_path}': references cannot contain question marks (?), asterisks (*) or open brackets ([)"
            )
        elif ref_path.startswith("/") or ref_path.endswith("/") or "//" in ref_path:
            raise ValueError(f"Invalid reference '{ref_path}': references cannot start or end with '/', or contain '//'")
        elif ref_path.endswith("."):
            raise ValueError(f"Invalid reference '{ref_path}': references cannot end with '.'")
        elif "@{" in ref_path:
            raise ValueError(f"Invalid reference '{ref_path}': references cannot contain '@{{'")
        elif ref_path == "@":
            raise ValueError(f"Invalid reference '{ref_path}': references cannot be '@'")
        elif "\\" in ref_path:
            raise ValueError(f"Invalid reference '{ref_path}': references cannot contain '\\'")

The naive approach is IMO more readable, but around half as fast as the one in the PR. Although, for reference, on my MacBook M1 Pro, for a refname 25 characters long:

So we are talking about minimal amounts either way. I'll leave the choice of which algorithm to use up to the maintainers.
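For anyone wanting to reproduce a per-call comparison like the one mentioned above, here is a minimal sketch using the standard `timeit` module. The helper `seconds_per_call` and the stand-in checker `contains_double_dot` are hypothetical names introduced only for illustration; they are not part of the PR, and the stand-in covers a single rule.

```python
import timeit


def seconds_per_call(func, arg, number=100_000):
    """Return the average wall-clock time of func(arg) over `number` calls."""
    total = timeit.timeit(lambda: func(arg), number=number)
    return total / number


def contains_double_dot(ref):
    # Hypothetical stand-in for a refname checker; swap in the real functions to compare them.
    return ".." in ref


sample_ref = "refs/heads/some-feature-x"  # 25 characters, as in the comment above
per_call = seconds_per_call(contains_double_dot, sample_ref, number=10_000)
print(f"{per_call * 1e9:.0f} ns per call")
```

Absolute numbers depend heavily on the machine and interpreter, so only the ratio between two checkers measured in the same run is meaningful.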
Thanks a million, I love this implementation!
Strangely enough, I find the faster version (the one here) more readable as well and would want to keep it for that reason alone.
There is one issue I see that might be hard to solve, but it's time to at least try. It's the general problem of how to interact with paths without running into decoding problems (i.e. Python tries to decode a path as encoding X, and fails, even though it's a valid filesystem path). Maybe @EliahKagan also has ideas regarding this topic.

    # Based on the rules described in https://git-scm.com/docs/git-check-ref-format/#_description
    previous: Union[str, None] = None
    one_before_previous: Union[str, None] = None
    for c in str(ref_path):

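The fragment above is the start of a single-pass scan that remembers the previous one or two characters. As a rough illustration of that idea (not the PR's actual code), the sketch below checks a subset of the rules from the `git check-ref-format` documentation in one loop. The function name `check_ref_name_single_pass` and the error messages are assumptions made for this example.

```python
from typing import Optional


def check_ref_name_single_pass(ref_path: str) -> None:
    """Raise ValueError if ref_path breaks one of a subset of the refname rules."""
    previous: Optional[str] = None
    one_before_previous: Optional[str] = None
    for c in ref_path:
        if c in " ~^:?*[\\":
            raise ValueError(f"Invalid reference '{ref_path}': forbidden character {c!r}")
        if ord(c) < 32 or ord(c) == 127:
            raise ValueError(f"Invalid reference '{ref_path}': ASCII control character")
        if c == ".":
            if previous is None or previous == "/":
                raise ValueError(f"Invalid reference '{ref_path}': component starts with '.'")
            if previous == ".":
                raise ValueError(f"Invalid reference '{ref_path}': contains '..'")
        elif c == "/":
            if previous is None or previous == "/":
                raise ValueError(f"Invalid reference '{ref_path}': starts with '/' or contains '//'")
        elif c == "{" and previous == "@":
            raise ValueError(f"Invalid reference '{ref_path}': contains '@{{'")
        one_before_previous = previous
        previous = c

    # Rules that depend on how the name ends.
    if previous == ".":
        raise ValueError(f"Invalid reference '{ref_path}': ends with '.'")
    if previous == "/":
        raise ValueError(f"Invalid reference '{ref_path}': ends with '/'")
    if previous == "@" and one_before_previous is None:
        raise ValueError(f"Invalid reference '{ref_path}': cannot be '@'")
    # The '.lock' suffix rule is per component, so it is checked separately here.
    if any(component.endswith(".lock") for component in ref_path.split("/")):
        raise ValueError(f"Invalid reference '{ref_path}': component ends with '.lock'")
```

Tracking `previous` and `one_before_previous` lets multi-character patterns such as `..`, `//`, `@{` and a bare `@` be detected without rescanning the string once per rule.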
Is there a way to avoid converting to `str`? I assume this tries to decode `ref_path` with the current string encoding, which changes depending on the interpreter or user configuration and generally causes a lot of trouble.
EliahKagan commented Sep 22, 2023
Unless this PR worsens that problem in some way, which I believe it does not, I would recommend it be fixed separately and later. The code this is replacing already had:
GitPython/git/refs/symbolic.py, lines 171 to 172 in d40320b:

    if ".." in str(ref_path):
        raise ValueError(f"Invalid reference '{ref_path}'")

But actually even that neither introduced nor exacerbated the problem. From the commit prior to #1644 being merged:
GitPython/git/refs/symbolic.py, lines 164 to 174 in 830025b:

    @classmethod
    def _get_ref_info_helper(
        cls, repo: "Repo", ref_path: Union[PathLike, None]
    ) -> Union[Tuple[str, None], Tuple[None, str]]:
        """Return: (str(sha), str(target_ref_path)) if available, the sha the file at
        rela_path points to, or None. target_ref_path is the reference we
        point to, or None"""
        tokens: Union[None, List[str], Tuple[str, str]] = None
        repodir = _git_dir(repo, ref_path)
        try:
            with open(os.path.join(repodir, str(ref_path)), "rt", encoding="UTF-8") as fp:

Note how `str(ref_path)` was passed to `os.path.join`, which when given `str`s returns a `str`, thus a `str` was being passed to `open`. Note also that, while this `str` call was actually redundant (`os.path.join` accepts path-like objects since Python 3.6), even it was not the cause of `str` and not `bytes` being used. The annotation on `ref_path` is `Union[PathLike, None]`, where `PathLike` is:

Line 43 in 830025b:

    PathLike = Union[str, "os.PathLike[str]"]

Where both alternatives--`str` and `os.PathLike[str]`--represent text that has already been decoded.
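The point about `os.path.join` can be checked with a small standalone sketch. The directory and ref names below are hypothetical placeholders; only standard-library calls are used.

```python
import os
from pathlib import PurePosixPath

repodir = "/tmp/repo/.git"  # hypothetical repository directory
ref_path = PurePosixPath("refs/heads/main")  # an os.PathLike[str] object

# Passing str(ref_path) and passing the path-like object directly give the same str.
joined_with_str = os.path.join(repodir, str(ref_path))
joined_direct = os.path.join(repodir, ref_path)

# os.fspath shows which representation a path-like object carries: here, str.
underlying = os.fspath(ref_path)

# Mixing str and bytes components is rejected rather than silently decoded.
try:
    os.path.join(repodir, b"refs/heads/main")
    mixed_allowed = True
except TypeError:
    mixed_allowed = False
```

Since both alternatives in the `PathLike` alias are text-based, the `str(...)` call does not itself introduce a decoding step; any decoding happened before the value reached this code.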
So unless I'm missing something--which I admit I could be--I don't think it makes conceptual sense to do anything about that in this pull request. Furthermore, unless the judgment that CVE-2023-41040 was a security vulnerability was mistaken, or something about the variation explicated in #1644 (comment) is less exploitable, it seems to me that this pull request is fixing a vulnerability. Assuming that is the case, then I think this should avoid attempting to make far-reaching changes beyond those that pertain to the vulnerability, and that although reviewing these changes for correctness should not be rushed, other kinds of delays should be mostly avoided. With good regression tests included, as seems to be the case, the code could be improved on later in other ways.
Thanks a lot for the thorough assessment, I wholeheartedly agree.
The 'how to handle paths correctly' issue is definitely one of the big breaking points in GitPython, but maybe, for other reasons, this wasn't ever a problem here.
Knowing this is on your radar, maybe one day there will be a solution to it. `gitoxide` already solves this problem, but it's easier when you have an actual type system and a standard library that makes you aware every step of the way.
Is the ultimate goal to support both `str`-based and `bytes`-based ref names and paths?
The goal is correctness, and it's vital that one doesn't try to decode paths to fit some interpreter-controlled encoding. Paths are paths, and if you are lucky, they can be turned into bytes. On Unix, that's always possible and a no-op, but on Windows it may require a conversion. The question is just how these things are supposed to work in Python.
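To make the decoding concern concrete, here is a small sketch of a byte string that is a perfectly usable path on Unix but is not valid UTF-8. The ref name is a made-up example. It uses the `surrogateescape` error handler explicitly, which is the handler `os.fsdecode` and `os.fsencode` apply on POSIX systems; behaviour on Windows differs.

```python
# Latin-1 encoded "café": valid as raw path bytes on Unix, invalid as UTF-8.
raw = b"refs/heads/caf\xe9"

# A strict decode fails, which is the kind of error the comment above warns about.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape, undecodable bytes are smuggled through as lone surrogates,
# so the original bytes can be recovered exactly.
text = raw.decode("utf-8", errors="surrogateescape")
round_tripped = text.encode("utf-8", errors="surrogateescape")
```

The resulting `str` is not safe to print or re-encode strictly, so this only preserves the bytes; it does not make the name meaningful text.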
Does this relate (conceptually, I mean) to the issue in rust-lang/rust#12056?
A great find :) - yes, that's absolutely related. `gitoxide` internally handles git-paths as bundles of bytes without known encoding, and just like `git`, it assumes at least ASCII. Conversions do happen but they all go through `gix-path` to have a central place for it.
Doing something like it would be needed here as well, even though I argue that before that happens universally, there should be some clear definition of what GitPython is supposed to be.
When I took it over by contributing massively, just like you do now, I needed more control for the use-case I had in mind, and started implementing all these sloppy pure-Python components that don't even get the basics right. With that I turned GitPython into some strange hybrid which I think didn't do it any good besides maybe being a little faster for some use cases. After all, manipulating an index in memory has advantages, but there are also other ways to do it while relying on `git` entirely.
Maybe this is thinking a step too far, but I strongly believe that the true benefit of GitPython is to be able to call `git` in a simple manner and to be compliant naturally due to using `git` directly. This should be its identity.
But then again, it's worth recognizing that changing the various pure-Python implementations to use `git` under the hood probably isn't possible in a non-breaking way.
Another avenue would be to try and get the APIs to use types that don't suffer from encoding/decoding issues related to paths, and then one day make the jump to replacing the pure-Python implementations with the Python bindings of `gitoxide`.
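As an illustration of the "rely on `git` directly" idea discussed above, refname validation could in principle be delegated to the `git check-ref-format` command, which exits with status 0 for a well-formed name. This is only a sketch under stated assumptions: `is_valid_refname` is a hypothetical helper, not a GitPython API, it needs a `git` executable on `PATH`, and it does not guard against names beginning with `-` being read as options.

```python
import shutil
import subprocess


def is_valid_refname(refname: str) -> bool:
    """Return True if `git check-ref-format` accepts refname (exit status 0)."""
    result = subprocess.run(
        ["git", "check-ref-format", refname],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # a non-zero exit status is an expected outcome, not an error
    )
    return result.returncode == 0


if shutil.which("git") is not None:
    print(is_valid_refname("refs/heads/main"))
    print(is_valid_refname("refs/heads/../x"))
```

Spawning a process per check is far slower than an in-process scan, which is one reason a pure-Python check may still be preferred on hot paths.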
Byron commented Sep 22, 2023
It looks like CI improved and now that the PR was merged, it failed CI due to a lint: https://github.com/gitpython-developers/GitPython/actions/runs/6271134895/job/17030195508#step:4:122. A quick fix will be appreciated.

Edit: I quickly fixed it myself - it seems like sometimes I forget that I am still able to edit text, despite it being Python.
EliahKagan commented Sep 22, 2023
@Byron Because the forthcoming 3.1.37 release that will include this patch will be a security fix, either the existing advisory/CVE should be updated with a correction (both a note and the version change), or a new CVE should be created for the variant of the vulnerability reported at #1644 (comment). I am not sure which of those things should be done here. Usually I would lean toward regarding such things as new bugs meriting new advisories/CVEs, which is also what I see more often. But I do not know that that's the best approach here, because the variant of the exploit where an absolute (or otherwise non-relative) path is used does seem to match the description in the summary section of CVE-2023-41040 even though it doesn't resemble any of the examples.

To be clear, I don't mean that this situation is necessarily ambiguous, but instead that I do not have the knowledge and experience to know how it ought to be handled. Either way, this need not delay the release, of course. (Sorry if you're already on top of the CVE/advisory matter and this comment is just noise.)
Thanks for the hint, it's appreciated! I think it's fair to say that I am not on top of CVEs and that I have no intention to be. Even though this sounds harsh, it's just the current reality. Thus far, members of the community have picked up the necessary work around CVEs, which I definitely appreciate, and I would be glad if this kept happening.
A new release was created: https://pypi.org/project/GitPython/3.1.37/
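For downstream projects that want to confirm they are on a release containing this change, a version comparison along the following lines could be used. It is a sketch only: the helper names are hypothetical, it assumes plain numeric `X.Y.Z` version strings, and `importlib.metadata.version` raises `PackageNotFoundError` if GitPython is not installed.

```python
from importlib import metadata
from typing import Tuple

FIXED_IN = (3, 1, 37)  # the release announced above


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn 'X.Y.Z' into a tuple of ints (assumes purely numeric components)."""
    return tuple(int(part) for part in version.split(".")[:3])


def includes_refname_fix(version: str) -> bool:
    """Return True if the given GitPython version is 3.1.37 or newer."""
    return parse_release(version) >= FIXED_IN


def installed_gitpython_includes_fix() -> bool:
    # Looks up the installed distribution's version string.
    return includes_refname_fix(metadata.version("GitPython"))
```

Comparing integer tuples avoids the string-comparison pitfall where `"3.1.4"` would sort after `"3.1.37"`.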
EliahKagan commented Sep 22, 2023
Given the 3.1.37 release title ("3.1.37 - a proper fix CVE-2023-41040") I'm thinking this is intuitively being regarded as fixing the originally reported vulnerability, so perhaps that advisory should be updated, rather than a new one created? I am still not sure. As noted in #1638 (comment), you (@Byron) could update the local advisory. If you do, a PR could then also be opened on https://github.com/github/advisory-database (where github/advisory-database#2690 was opened) to change the global advisory accordingly. I don't know if there's anything else that would need to be done.

@stsewd Do you have any opinion about what ought to be done here? Would you have any objection to the local advisory being edited this way? Would you instead prefer that this variant, where an absolute path is used, be regarded as a related but separate vulnerability altogether? (I know @Byron can edit the advisory, but I wanted to check in case you had an opinion on this.)

One source of my hesitancy here is that I think a new CVE may still be needed in this kind of situation. That seems common (courtesy of this SO answer).
A good point - I am still getting used to advisories and the local ones are indeed editable. So that one has been adjusted. I kindly ask somebody else to create a PR for the global database though - it seems GitHub makes it hard/impossible to use the web interface for that.
To me, CVEs are good to create a far-reaching 'ping' to users of GitPython. Some might see it earlier than the new release. To me it's a question of how much time one wants to spend to create such a ping, and judging from the CVEs I have seen, it's quite expensive.
EliahKagan commented Sep 22, 2023
Thanks, but I'm not actually sure if it was a good point. Maybe a new advisory ought to have been, or ought to be, created. I really don't know the proper thing to do here.
If there is an uproar because of how this was handled, it will be possible to undo changes to the local CVE and create a new one. So I think nothing is lost, and I think it's OK to choose less expensive options in dealing with this.
Hi, a new CVE/advisory is usually created for this type of situation, and in the description you can put something like "this was due to an incomplete fix of [link to the other CVE]". I'm not opposed to editing the current one, but I guess editing doesn't have the same "ping to everyone to upgrade" effect as a new one.
This PR contains the following updates:

| Package | Change |
|---|---|
| [GitPython](https://togithub.com/gitpython-developers/GitPython) | `==3.1.36` -> `==3.1.37` |

### Release Notes

gitpython-developers/GitPython (GitPython)

### [`v3.1.37`](https://togithub.com/gitpython-developers/GitPython/releases/tag/3.1.37): - a proper fix CVE-2023-41040

[Compare Source](https://togithub.com/gitpython-developers/GitPython/compare/3.1.36...3.1.37)

#### What's Changed

- Improve Python version and OS compatibility, fixing deprecations by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1654
- Better document env_case test/fixture and cwd by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1657
- Remove spurious executable permissions by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1658
- Fix up checks in Makefile and make them portable by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1661
- Fix URLs that were redirecting to another license by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1662
- Assorted small fixes/improvements to root dir docs by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1663
- Use venv instead of virtualenv in test_installation by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1664
- Omit py_modules in setup by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1665
- Don't track code coverage temporary files by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1666
- Configure tox by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1667
- Format tests with black and auto-exclude untracked paths by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1668
- Upgrade and broaden flake8, fixing style problems and bugs by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1673
- Fix rollback bug in SymbolicReference.set_reference by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1675
- Remove `@NoEffect` annotations by @EliahKagan in https://github.com/gitpython-developers/GitPython/pull/1677
- Add more checks for the validity of refnames by @facutuesca in https://github.com/gitpython-developers/GitPython/pull/1672

**Full Changelog**: gitpython-developers/GitPython@3.1.36...3.1.37
EliahKagan commented Sep 24, 2023
Thanks! @Byron Based on this, and also what I am now seeing is the recent history of this practice being followed for GitPython in CVE-2022-24439/CVE-2023-40267, I recommend making a new advisory. Maybe there is some way I can help with this? However, if for any reason you would still prefer this route not be taken, then I can definitely go ahead and open a PR to update the global advisory with the version change. (I am unsure if that would cause Dependabot to notify users of the security update or not, but I imagine that, if it would not, then a reviewer on the PR would mention that.)
I have three ideas of what I could do, but I don't know what, if any of them, would help or be wanted. This depends, in part, on what takes up the time for you.
Let's try something: updating version numbers is much cheaper than creating a new 'follow-up' CVE, for all sides actually. One could ask in the PR of the version change if notifications will be sent, and if unknown, @stsewd could probably help to tell as well. If no notification is sent, you could create a new CVE - you would be able to do this here in GitPython and from there it can be elevated, along with requesting a global CVE for it - this is easily done through the maintainer interface. The rest we can take from there should it come to that.
Sounds good; I will do this. I noticed in the local advisory that, while
I'm making the PR through the structured "Suggest improvements" template, in which every field is pretty specific. I'll either include it somewhere if it fits, or otherwise try and add it into the created PR or add a comment with it.
Thanks for telling me about that. That is much nicer than the particular specific I had suggested might be used. Of course, I'll still save that for if the above proves insufficient, as you say.
Byron commented Sep 24, 2023
Thanks for the heads-up, that's an oversight that is now corrected. And thanks again for your help!
EliahKagan commented Sep 24, 2023
No problem! I've submitted the proposed edit to the global advisory in PR github/advisory-database#2753.
github/advisory-database#2753 has been merged and the global GitHub advisory for CVE-2023-41040 has thus been updated.
Further update: The change to the global advisory has caused Dependabot security alerts to be raised, as desired. For example, dmvassallo/EmbeddingScratchwork#248 is a PR opened automatically to resolve a new Dependabot security alert in a project where GitPython had already been upgraded to the previously listed version. Note that this does not necessarily apply for tools that are less closely coupled to the GitHub ecosystem, and I don't know, for example, if any new Renovatebot PRs will be generated.
Bump gitpython from 3.1.35 to 3.1.37

Bumps gitpython from 3.1.35 to 3.1.37.

Release notes, sourced from gitpython's releases: 3.1.37 - a proper fix CVE-2023-41040

Full Changelog: gitpython-developers/GitPython@3.1.36...3.1.37

Commits

- b27a89f fix makefile to compare commit hashes only
- 0bd2890 prepare next release
- 832b6ee remove unnecessary list comprehension to fix CI
- e98f57b Merge pull request #1672 from trail-of-forks/robust-refname-checks
- 1774f1e Merge pull request #1677 from EliahKagan/no-noeffect
- a4701a0 Remove @NoEffect annotations
- d40320b Merge pull request #1675 from EliahKagan/rollback
- d1c1f31 Merge pull request #1673 from EliahKagan/flake8
- e480985 Tweak rollback logic in log.to_file
- ff84b26 Refactor try-finally cleanup in git/
- Additional commits viewable in compare view

Reviewed-by: Vladimir Vshivkov