Description
Background
RFC 3986 (spec for URIs) defines a valid port string with the following grammar rule:
port = *DIGIT
Here's the WHATWG URL spec definition:
"""
A URL-port string must be one of the following:
- the empty string
- one or more ASCII digits representing a decimal number no greater than $2^{16} - 1$.
""" [1]
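To make the two definitions concrete, here is a minimal sketch of what each grammar accepts. The helper names `is_rfc3986_port` and `is_whatwg_port` are hypothetical and not part of `urllib`; this only illustrates the rules quoted above, not how CPython implements them.

```python
import re

# Hypothetical helpers for illustration; neither exists in urllib.
# [0-9] is used instead of \d because \d also matches non-ASCII digits in str patterns.
_ASCII_DIGITS = re.compile(r"[0-9]*")


def is_rfc3986_port(s: str) -> bool:
    """RFC 3986: port = *DIGIT (zero or more ASCII digits, no upper bound)."""
    return _ASCII_DIGITS.fullmatch(s) is not None


def is_whatwg_port(s: str) -> bool:
    """WHATWG: the empty string, or ASCII digits with value at most 2**16 - 1."""
    if s == "":
        return True
    if not is_rfc3986_port(s):
        return False
    # Drop leading zeros and bound the length so int() never sees a huge string.
    digits = s.lstrip("0")
    return len(digits) <= 5 and int(digits or "0", 10) <= 2**16 - 1


for candidate in ["", "80", "65535", "65536", "-0", "+80"]:
    print(repr(candidate), is_rfc3986_port(candidate), is_whatwg_port(candidate))
```

Under either rule, a string containing `+` or `-` is rejected, because neither character is an ASCII digit.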
The bug
This is the port string parsing code from Lib/urllib/parse.py:166-176:

    def port(self):
        port = self._hostinfo[1]
        if port is not None:
            try:
                port = int(port, 10)
            except ValueError:
                message = f'Port could not be cast to integer value as {port!r}'
                raise ValueError(message) from None
            if not (0 <= port <= 65535):
                raise ValueError("Port out of range 0-65535")
        return port

This will erroneously accept the strings "-0" and f"+{x}" for any value of x in the valid range. Given that + and - are not digits, this behavior is in violation of both specifications.
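The root cause is that `int()` accepts an optional leading sign, so the cast succeeds before the range check ever runs. The sketch below shows that, along with one possible stricter check. `parse_port_strict` is a hypothetical name used only for illustration, and it is not necessarily the fix CPython would adopt. It assumes, as in the code above, that a missing port has already been mapped to `None`.

```python
# int() accepts an optional sign, so both casts succeed.
print(int("-0", 10))   # 0
print(int("+80", 10))  # 80


def parse_port_strict(port):
    """Hypothetical stricter variant: require ASCII digits before casting."""
    if port is None:
        # Assumption: the caller has already turned a missing port into None.
        return None
    if not (port.isascii() and port.isdigit()):
        raise ValueError(f"Port could not be cast to integer value as {port!r}")
    value = int(port, 10)
    if not (0 <= value <= 65535):
        raise ValueError("Port out of range 0-65535")
    return value


print(parse_port_strict("80"))  # 80

for bad in ["-0", "+80"]:
    try:
        parse_port_strict(bad)
    except ValueError as exc:
        print(f"rejected {bad!r}: {exc}")
```

Checking `isascii()` together with `isdigit()` matters because `str.isdigit()` alone also returns true for non-ASCII digit characters.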
This bug is easily reproducible with the following snippet:
    from urllib.parse import urlparse

    url1 = urlparse("http://python.org:-0")
    url2 = urlparse("http://python.org:+80")

    print(url1.port)  # prints 0, but error is expected
    print(url2.port)  # prints 80, but error is expected

Happy to submit a PR, but don't want to step on any toes over at #25774.
My environment
- CPython version tested on: 3.10.6
- Operating system and architecture: Arch Linux x86_64
Footnotes
[1] Given that this is urlparse and not uriparse, it seems appropriate that we do not accept port numbers outside range(2**16), even though such numbers are allowed by RFC 3986.