
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2008-02-25 01:55 by jaredgrubb, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Pull Requests

| URL | Status | Linked |
|---|---|---|
| PR 13401 | merged | Anthony Sottile, 2019-05-18 01:39 |

Messages (8)

msg62956 | Author: Jared Grubb (jaredgrubb) | Date: 2008-02-25 01:59

tokenize does not handle line joining properly, as the following string fails the CPython tokenizer but passes the tokenize module.

Example 1:

    >>> s = "if 1:\n  \\\n  #hey\n  print 1"
    >>> exec s
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 3
        #hey
           ^
    SyntaxError: invalid syntax
    >>> tokenize.tokenize(StringIO(s).readline)
    1,0-1,2:    NAME        'if'
    1,3-1,4:    NUMBER      '1'
    1,4-1,5:    OP          ':'
    1,5-1,6:    NEWLINE     '\n'
    2,0-2,2:    INDENT      '  '
    3,2-3,6:    COMMENT     '#hey'
    3,6-3,7:    NEWLINE     '\n'
    4,2-4,7:    NAME        'print'
    4,8-4,9:    NUMBER      '1'
    5,0-5,0:    DEDENT      ''
    5,0-5,0:    ENDMARKER   ''

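
The comparison in this message can be approximated on Python 3 with the sketch below. It is not the reporter's code: the helper names `compiles` and `tokenizes` are hypothetical, `print 1` is respelled as `print(1)`, and the two-space indent is inferred from the token positions. Whether the two checks disagree depends on the interpreter version, so no particular result is claimed here.

```python
import io
import tokenize


def compiles(source):
    """Return True if the built-in compiler accepts the source."""
    try:
        compile(source, "<string>", "exec")
    except SyntaxError:
        return False
    return True


def tokenizes(source):
    """Return True if the tokenize module consumes the source without error."""
    try:
        for _ in tokenize.generate_tokens(io.StringIO(source).readline):
            pass
    except (tokenize.TokenError, SyntaxError):
        return False
    return True


# Python 3 spelling of the string from the report (two-space indent assumed).
SOURCE = "if 1:\n  \\\n  #hey\n  print(1)"

if __name__ == "__main__":
    print("compile accepts: ", compiles(SOURCE))
    print("tokenize accepts:", tokenizes(SOURCE))
```

If the two lines printed differ, the compiler and the `tokenize` module disagree about the input, which is the mismatch this issue describes.
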
msg62960 | Author: Jared Grubb (jaredgrubb) | Date: 2008-02-25 02:22

CPython allows \ at EOF, but tokenize does not.

    >>> s = 'print 1\\\n'
    >>> exec s
    1
    >>> tokenize.tokenize(StringIO(s).readline)
    1,0-1,5:    NAME    'print'
    1,6-1,7:    NUMBER  '1'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tokenize.py", line 153, in tokenize
        tokenize_loop(readline, tokeneater)
      File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tokenize.py", line 159, in tokenize_loop
        for token_info in generate_tokens(readline):
      File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tokenize.py", line 283, in generate_tokens
        raise TokenError, ("EOF in multi-line statement", (lnum, 0))
    tokenize.TokenError: ('EOF in multi-line statement', (2, 0))

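
The transcript in this message shows two tokens being emitted before the error is raised. A minimal Python 3 sketch of that observation follows; the helper name `tokens_until_error` is hypothetical and `print 1` is respelled as `print(1)`. Which tokens appear, and whether an error is raised at all, may vary between Python versions.

```python
import io
import tokenize


def tokens_until_error(source):
    """Collect tokens until tokenize fails; return (tokens, error or None)."""
    tokens = []
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            tokens.append(tok)
    except (tokenize.TokenError, SyntaxError) as exc:
        return tokens, exc
    return tokens, None


# Python 3 spelling of the string from this message:
# a statement, a backslash, a newline, then end of input.
SOURCE = "print(1)\\\n"

if __name__ == "__main__":
    tokens, error = tokens_until_error(SOURCE)
    for tok in tokens:
        print(tokenize.tok_name[tok.type], repr(tok.string))
    print("error:", repr(error))
```
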
msg116977 | Author: Mark Lawrence (BreamoreBoy) | Date: 2010-09-20 21:26

Nobody appears to be interested, so I'll close this in a couple of weeks unless someone objects or a patch is provided.

msg116985 | Author: Raymond Hettinger (rhettinger) | Date: 2010-09-20 21:51

Mark, please stop closing these based on age. There needs to be a determination whether this is a valid bug. If so, then a patch is needed. If not, it can be closed.

msg143716 | Author: Meador Inge (meador.inge) | Date: 2011-09-08 01:39

That syntax error is coming from the CPython parser and *not* the tokenizer. Both CPython and the 'tokenize' module produce the same tokenization:

    [meadori@motherbrain cpython]$ cat repro.py
    if 1:
      \

      pass
    [meadori@motherbrain cpython]$ ./python tokenize.py repro.py
    0,0-0,0:    ENCODING    'utf-8'
    1,0-1,2:    NAME        'if'
    1,3-1,4:    NUMBER      '1'
    1,4-1,5:    OP          ':'
    1,5-1,6:    NEWLINE     '\n'
    2,0-2,2:    INDENT      '  '
    3,0-3,1:    NEWLINE     '\n'
    4,2-4,6:    NAME        'pass'
    4,6-4,7:    NEWLINE     '\n'
    5,0-5,0:    DEDENT      ''
    5,0-5,0:    ENDMARKER   ''
    [44319 refs]
    [meadori@motherbrain cpython]$ ./python -d repro.py | grep Token | tail -10
      File "repro.py", line 3
        ^
    SyntaxError: invalid syntax
    [44305 refs]
    Token NEWLINE/'' ... It's a token we know
    Token DEDENT/'' ... It's a token we know
    Token NEWLINE/'' ... It's a token we know
    Token ENDMARKER/'' ... It's a token we know
    Token NAME/'if' ... It's a keyword
    Token NUMBER/'1' ... It's a token we know
    Token COLON/':' ... It's a token we know
    Token NEWLINE/'' ... It's a token we know
    Token INDENT/'' ... It's a token we know
    Token NEWLINE/'' ... It's a token we know

The NEWLINE INDENT NEWLINE tokenization causes the parser to choke because 'suite' nonterminals:

    suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT

are defined as NEWLINE INDENT.

It seems appropriate that the NEWLINE after INDENT should be dropped by both tokenizers. In other words, I think:

    """
    if 1:
      \

      pass
    """

should produce the same tokenization as:

    """
    if 1:
      pass
    """

This seems consistent with how explicit line joining is defined [2].

[1] http://hg.python.org/cpython/file/92842e347d98/Grammar/Grammar
[2] http://docs.python.org/reference/lexical_analysis.html#explicit-line-joining

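
One way to check the proposal in this message is to compare the token type sequences of the two forms. The sketch below is only an illustration: `token_names`, `PLAIN` and `JOINED` are hypothetical names, and `JOINED` is an assumed reconstruction of `repro.py` based on the token positions shown. The output for `JOINED` depends on the Python version, and some versions may reject it outright.

```python
import io
import tokenize


def token_names(source):
    """Return the sequence of token type names for a source string."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


PLAIN = "if 1:\n  pass\n"
# Assumed reconstruction of repro.py: an indented backslash,
# an empty line, then the indented body.
JOINED = "if 1:\n  \\\n\n  pass\n"

if __name__ == "__main__":
    print(token_names(PLAIN))
    try:
        print(token_names(JOINED))
    except (tokenize.TokenError, SyntaxError) as exc:
        print("tokenize rejected the joined form:", exc)
```

If the two printed lists match, the tokenizer in use already behaves as the message proposes. An extra NEWLINE after INDENT in the second list corresponds to the sequence the parser was reported to reject.
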
msg339576 | Author: Anthony Sottile (Anthony Sottile) | Date: 2019-04-07 14:32

Here's an example in the wild which still reproduces with python3.8a3: https://github.com/SecureAuthCorp/impacket/blob/194b22ed2fc85c4f241375fb7ebe4e0d89626c8c/impacket/examples/remcomsvc.py#L1669

This was reported as a bug on flake8: https://gitlab.com/pycqa/flake8/issues/532

Here's the reproduction with python3.8:

    $ python3.8 --version --version
    Python 3.8.0a3 (default, Mar 27 2019, 03:46:44) [GCC 7.3.0]
    $ python3.8 impacket/examples/remcomsvc.py
    $ python3.8 -mtokenize impacket/examples/remcomsvc.py
    impacket/examples/remcomsvc.py:1670:0: error: EOF in multi-line statement

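
A tool that runs `tokenize` over a file needs to turn such a failure into a located message like the one in this reproduction. The sketch below shows one way to do that from Python. `tokenize_report` and the file name `demo.py` are hypothetical, and the output format only imitates the line shown in the reproduction rather than reusing any tool's actual code. The message text and the position reported may differ between Python versions.

```python
import io
import tokenize


def tokenize_report(filename, source):
    """Return a 'file:line:col: error: message' string, or None on success."""
    try:
        for _ in tokenize.generate_tokens(io.StringIO(source).readline):
            pass
    except tokenize.TokenError as exc:
        # TokenError carries (message, (line, column)), matching the
        # ('EOF in multi-line statement', (2, 0)) value quoted in this issue.
        message, (line, column) = exc.args
    except SyntaxError as exc:
        message, line, column = exc.msg, exc.lineno, exc.offset
    else:
        return None
    return f"{filename}:{line}:{column}: error: {message}"


if __name__ == "__main__":
    # An unclosed parenthesis is a simple way to trigger an EOF error.
    print(tokenize_report("demo.py", "x = (\n"))
```
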
msg342807 | Author: miss-islington (miss-islington) | Date: 2019-05-18 18:27

New changeset abea73bf4a320ff658c9a98fef3d948a142e61a9 by Miss Islington (bot) (Anthony Sottile) in branch 'master':
bpo-2180: Treat line continuation at EOF as a `SyntaxError` (GH-13401)
https://github.com/python/cpython/commit/abea73bf4a320ff658c9a98fef3d948a142e61a9

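
The effect described in the commit title can be observed from Python code with a sketch like the one below. It assumes an interpreter that includes this change (Python 3.8 or later), and the helper name `rejected_by_compiler` is hypothetical.

```python
def rejected_by_compiler(source):
    """Return the SyntaxError raised by compile(), or None if the source compiles."""
    try:
        compile(source, "<string>", "exec")
    except SyntaxError as exc:
        return exc
    return None


if __name__ == "__main__":
    # A backslash with nothing after it: the case named in the commit title.
    print(repr(rejected_by_compiler("x = 5\\")))
    # An ordinary continuation that is followed by more input.
    print(repr(rejected_by_compiler("x = 5 + \\\n    1\n")))
```

With the change in place, the first call is expected to return a `SyntaxError`, while the second returns `None` because the continuation is followed by the rest of the expression.
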
msg342817 | Author: Gregory P. Smith (gregory.p.smith) | Date: 2019-05-18 21:02

Thanks for figuring this one out Anthony! :)

History

| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:56:31 | admin | set | github: 46433 |
| 2019-05-18 21:02:56 | gregory.p.smith | set | status: open -> closed; resolution: fixed; messages: +msg342817; stage: patch review -> commit review |
| 2019-05-18 18:27:30 | miss-islington | set | nosy: +miss-islington; messages: +msg342807 |
| 2019-05-18 07:16:47 | gregory.p.smith | set | assignee: gregory.p.smith; nosy: +gregory.p.smith |
| 2019-05-18 01:39:12 | Anthony Sottile | set | keywords: +patch; stage: needs patch -> patch review; pull_requests: +pull_request13312 |
| 2019-04-07 14:32:44 | Anthony Sottile | set | nosy: +Anthony Sottile; messages: +msg339576; versions: + Python 3.8, Python 3.9, - Python 3.1, Python 2.7, Python 3.2 |
| 2014-02-03 19:15:35 | BreamoreBoy | set | nosy: -BreamoreBoy |
| 2011-09-08 01:39:11 | meador.inge | set | messages: +msg143716; stage: test needed -> needs patch |
| 2010-09-27 03:19:42 | meador.inge | set | nosy: +meador.inge |
| 2010-09-20 21:51:51 | rhettinger | set | status: pending -> open; nosy: +rhettinger; messages: +msg116985; assignee: jhylton -> (no value) |
| 2010-09-20 21:26:23 | BreamoreBoy | set | status: open -> pending; nosy: +BreamoreBoy; messages: +msg116977 |
| 2010-08-21 17:06:34 | BreamoreBoy | unlink | issue1230484 dependencies |
| 2010-08-21 17:03:39 | BreamoreBoy | set | versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6 |
| 2009-02-16 02:26:11 | ajaksu2 | link | issue1230484 dependencies |
| 2009-02-16 02:20:41 | ajaksu2 | set | stage: test needed; versions: + Python 2.6, - Python 2.5 |
| 2008-03-20 03:08:15 | jafo | set | assignee: jhylton; nosy: +jhylton |
| 2008-02-25 02:22:29 | jaredgrubb | set | messages: +msg62960 |
| 2008-02-25 01:59:17 | jaredgrubb | set | messages: +msg62956 |
| 2008-02-25 01:55:51 | jaredgrubb | create | |