Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork34.1k
Comments
bpo-39503: CVE-2020-8492: Fix AbstractBasicAuthHandler#18284
bpo-39503: CVE-2020-8492: Fix AbstractBasicAuthHandler#18284vstinner merged 2 commits intopython:masterfromvstinner:urllib_basic_auth_regex
Conversation
vstinner commentedJan 30, 2020
mschwager commentedJan 30, 2020
This fix looks good to me! |
Lib/urllib/request.py Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
(?:.*,)* is equivalent to(?:.*,)?.
But since this regular expresion is only used withsearch().(?:.*,)*[ \t]* can be removed at all.
I'll analyze whether it is correct or there is an error in the regular expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Well, I am cannot say that I completely understand the code, but to give it some sense we can either
- Replace
rx.search()withrx.match()and replace(?:.*,)*with(?:.*,)?.
or
- Keep
rx.search()and replace(?:.*,)*with(?:^|,).
Do not keep(?:[^,]*,)*. It is a waster of resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Humm, options 1 and 2 are not equivalent if the field value contains more than one challenge. Option 2 is closer to the current behavior. But correct support of more than one challenge need rewriting the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The current patch seems to give an O(n^3) time to evaluate - much better than O(2^n), but still very slow - with 2000 commas it takes about a minute to evalute. With 65000 it takes much, much longer. Testing using the code fromhere gave the following (commas, seconds) values:[(100, 0.124), (250, 0.261), (500, 0.923), (750, 2.85), (1000, 6.433), (1250, 12.608), (1500, 21.576), (2000, 50.751)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I was wrong, the first option is equivalent to the current behavior (returns the last realm).
serhiy-storchaka commentedJan 30, 2020
Would be nice to add a test. |
mschwager commentedJan 30, 2020
Just a heads up,CVE-2020-8492 has been created. I'm not sure how Python CVEs are generally tracked, but it may be useful to include the information on the bug tracker issue 👍 |
vstinner commentedFeb 3, 2020 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Not only We can either simplify the regex to prevent the "catastrophic backtracking" or even remove the prefix. UPDATE: Oops, my example was wrong, I fixed it :-) |
bcaller commentedFeb 4, 2020
Does this also fixhttps://bugs.python.org/issue38826 ? |
encukou commentedMar 24, 2020
It seems to me that the >>>header='basic realm="1", x, other realm="2"'>>>re.search("(?:.*,)*[\t]*([^\t]+)[\t]+",header).group(1)'other'>>>re.search("[\t]*([^\t]+)[\t]+",header).group(1)'basic'>>> I don't see a way to fix this by just changing the regex while preserving the previous behavior. Then again, corner cases of the previous behavior might be wrong. |
vstinner commentedMar 25, 2020
I rebased my PR and added more tests. |
vstinner commentedMar 25, 2020
@serhiy-storchaka: I don't understand if you consider that the fix is wrong or that the fix is not enough (it remains possible to create a denial of service)? |
serhiy-storchaka commentedMar 25, 2020
@vstinner your fix helps, but we can do better. It has cubic complexity, my suggestion has quadratic complexity. It is possible to implement an algorithm with linear complexity, but not with such small changes. |
davidfraser commentedMar 25, 2020
I also added some tests and implemented a simpler complexity regex - seemaster...davidfraser:urllib_basic_auth_regex |
davidfraser commentedMar 25, 2020
It's worth seeing how the results of this regex are actually used Note my comment inf79379c: Note that the original regex was roughly O(2**n) |
vstinner commentedMar 25, 2020
WWW-Authenticate is badly specified. The RFC doesn't specify if a single HTTP header can contain multiple challenges. I found these resources:
A variant is to have multiple WWW-Authenticate: one challenge per WWW-Authenticate header. By the way, AbstractBasicAuthHandler code contains this interesting comment: Current behavior:
For example, |
vstinner commentedMar 25, 2020
Sorry, I misunderstood this proposition. In fact, I proposed something similar except that I missed the "start of the string" (regex I decided to write a way more complex change to not only fix the vulnerability, but also fix the parser since it didn't look possible to fix the regex without changing the behavior. Currently, the code uses thelast realm if there are multiple challenges per header. I fixed this behavior to use the realm of thefirst Basic challenge. I also modified the code to support multiple headers, except of only parsing the first one. |
vstinner commentedMar 25, 2020
Ok, the PR is now ready for a new round of reviews. I fixed the vulnerability but I also changed the code to parse all WWW-Authenticate HTTP Headers and accept multiple challenges per header. |
The AbstractBasicAuthHandler class of the urllib.request module usesan inefficient regular expression which can be exploited by anattacker to cause a denial of service. Fix the regex to prevent thecatastrophic backtracking. Vulnerability reported by Ben Callerand Matt Schwager.AbstractBasicAuthHandler of urllib.request now parses allWWW-Authenticate HTTP headers and accepts multiple challenges perheader: use the realm of the first Basic challenge.Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
vstinner commentedMar 25, 2020
Oh, Ben Caller reported the issue at 2019-11-17:https://bugs.python.org/issue38826 |
vstinner commentedMar 25, 2020
orsenthil commentedMar 26, 2020
Thanks,@vstinner - This was a good detection and change. I am reviewing the patch further. |
| basic = f'Basic realm="{realm}"' | ||
| basic2 = f'Basic realm="{realm2}"' | ||
| other_no_realm = 'Otherscheme xxx' | ||
| digest = (f'Digest realm="{realm2}", ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
What was the motivation of adding adigest test case here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I tried to write a more realistic test than "Otherscheme xxx". The Digest challenge uses "realm" and the test ensures that it's skipped. It uses use multiple fields separated by commas, some use quotes: test that the parser handles that properly.
I picked the example from Wikipedia :-)https://fr.wikipedia.org/wiki/Authentification_HTTP#Demande_d'identification
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
perfect. sounds good to me. :)
orsenthil left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Thanks for the patch and the change. The entire addition is meaningful and looks good to me.
- The change of regex only the security issue keeps the patch simple.
I only had a question in the tests, but it is not a critical question. I thought, I am missing something by adding of "digest" test case in line 1469, when the multiple headers challenge is covered again in the same test case at line 1499.
LGTM.
The AbstractBasicAuthHandler class of the urllib.request module usesan inefficient regular expression which can be exploited by anattacker to cause a denial of service. Fix the regex to prevent thecatastrophic backtracking. Vulnerability reported by Ben Callerand Matt Schwager.AbstractBasicAuthHandler of urllib.request now parses allWWW-Authenticate HTTP headers and accepts multiple challenges perheader: use the realm of the first Basic challenge.Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>(cherry picked from commit0b297d4)Co-authored-by: Victor Stinner <vstinner@python.org>
bedevere-bot commentedApr 2, 2020
GH-19292 is a backport of this pull request to the3.7 branch. |
bedevere-bot commentedApr 2, 2020
|
miss-islington commentedApr 2, 2020
Thanks@vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.8. |
miss-islington commentedApr 2, 2020
Thanks@vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.7. |
The AbstractBasicAuthHandler class of the urllib.request module usesan inefficient regular expression which can be exploited by anattacker to cause a denial of service. Fix the regex to prevent thecatastrophic backtracking. Vulnerability reported by Ben Callerand Matt Schwager.AbstractBasicAuthHandler of urllib.request now parses allWWW-Authenticate HTTP headers and accepts multiple challenges perheader: use the realm of the first Basic challenge.Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>(cherry picked from commit0b297d4)Co-authored-by: Victor Stinner <vstinner@python.org>
bedevere-bot commentedApr 2, 2020
GH-19296 is a backport of this pull request to the3.8 branch. |
The AbstractBasicAuthHandler class of the urllib.request module usesan inefficient regular expression which can be exploited by anattacker to cause a denial of service. Fix the regex to prevent thecatastrophic backtracking. Vulnerability reported by Ben Callerand Matt Schwager.AbstractBasicAuthHandler of urllib.request now parses allWWW-Authenticate HTTP headers and accepts multiple challenges perheader: use the realm of the first Basic challenge.Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>(cherry picked from commit0b297d4)Co-authored-by: Victor Stinner <vstinner@python.org>
bedevere-bot commentedApr 2, 2020
GH-19297 is a backport of this pull request to the3.7 branch. |
…-19296)The AbstractBasicAuthHandler class of the urllib.request module usesan inefficient regular expression which can be exploited by anattacker to cause a denial of service. Fix the regex to prevent thecatastrophic backtracking. Vulnerability reported by Ben Callerand Matt Schwager.AbstractBasicAuthHandler of urllib.request now parses allWWW-Authenticate HTTP headers and accepts multiple challenges perheader: use the realm of the first Basic challenge.Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>Co-authored-by: Victor Stinner <vstinner@python.org>(cherry picked from commit0b297d4)
…-19297)The AbstractBasicAuthHandler class of the urllib.request module usesan inefficient regular expression which can be exploited by anattacker to cause a denial of service. Fix the regex to prevent thecatastrophic backtracking. Vulnerability reported by Ben Callerand Matt Schwager.AbstractBasicAuthHandler of urllib.request now parses allWWW-Authenticate HTTP headers and accepts multiple challenges perheader: use the realm of the first Basic challenge.Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>Co-authored-by: Victor Stinner <vstinner@python.org>(cherry picked from commit0b297d4)
…-19304)The AbstractBasicAuthHandler class of the urllib.request module usesan inefficient regular expression which can be exploited by anattacker to cause a denial of service. Fix the regex to prevent thecatastrophic backtracking. Vulnerability reported by Ben Callerand Matt Schwager.AbstractBasicAuthHandler of urllib.request now parses allWWW-Authenticate HTTP headers and accepts multiple challenges perheader: use the realm of the first Basic challenge.Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>(cherry picked from commit0b297d4)
…9305)The AbstractBasicAuthHandler class of the urllib.request module usesan inefficient regular expression which can be exploited by anattacker to cause a denial of service. Fix the regex to prevent thecatastrophic backtracking. Vulnerability reported by Ben Callerand Matt Schwager.AbstractBasicAuthHandler of urllib.request now parses allWWW-Authenticate HTTP headers and accepts multiple challenges perheader: use the realm of the first Basic challenge.
Uh oh!
There was an error while loading.Please reload this page.
The AbstractBasicAuthHandler class of the urllib.request module uses
an inefficient regular expression which can be exploited by an
attacker to cause a denial of service. Fix the regex to prevent the
catastrophic backtracking.
Vulnerability reported by Matt Schwager.
https://bugs.python.org/issue39503