
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2014-04-18 10:49 by Aivar.Annamaa, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | ||
|---|---|---|
| File name | Uploaded | Description |
| py34_ast_call_bug.py | Aivar.Annamaa, 2014-04-18 10:49 | Small demonstration of the bug |
| Messages (21) | |||
|---|---|---|---|
| msg216777 | Author: Aivar Annamaa (Aivar.Annamaa)* | Date: 2014-04-18 10:49 | |
The following program gives the correct result in Python versions older than 3.4, but an incorrect result in 3.4:

    import ast

    tree = ast.parse("sin(0.5)")
    first_stmt = tree.body[0]
    call = first_stmt.value
    print("col_offset of call expression:", call.col_offset)
    print("col_offset of func of the call:", call.func.col_offset)

It should print:

    col_offset of call expression: 0
    col_offset of func of the call: 0

but in 3.4 it prints:

    col_offset of call expression: 3
    col_offset of func of the call: 0

| msg216778 | Author: Aivar Annamaa (Aivar.Annamaa)* | Date: 2014-04-18 10:58 | |
... also, lineno is wrong for both the Call and the call's func when func and arguments are on different lines:

    import ast

    tree = ast.parse("(sin\n(0.5))")
    first_stmt = tree.body[0]
    call = first_stmt.value
    print("col_offset of call expression:", call.col_offset)
    print("col_offset of func of the call:", call.func.col_offset)
    print("lineno of call expression:", call.lineno)
    print("lineno of func of the call:", call.lineno)
    # lineno-s should be 1 for both call and func

| msg216821 | Author: Benjamin Peterson (benjamin.peterson)* | Date: 2014-04-19 00:38 | |
I suspect this was an intentional result of #16795.
| msg216846 | Author: Aivar Annamaa (Aivar.Annamaa)* | Date: 2014-04-19 06:14 | |
Regarding #16795, the documentation says "The lineno is the line number of source text and the col_offset is the UTF-8 byte offset of the first token that generated the node", not that lineno and col_offset indicate a suitable position to mention in the error messages related to this node.

IMO lineno and col_offset should stay as a predictable means for finding the (beginning of the) source text of the node. Error-reporting code could inspect the situation and compute locations suitable for its purpose.

Alternatively, these attributes could be left for the purposes mentioned in #16795, and parser developers could introduce new attributes in AST nodes which indicate both the start and end positions of the corresponding source. (Hopefully this would also resolve #18374 and #16806.)

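As a small illustration of the documented definition quoted above, `col_offset` counts UTF-8 bytes rather than characters. The following is only a sketch; the helper name `byte_to_char_offset` is hypothetical and introduced here for illustration.

```python
import ast


def byte_to_char_offset(line: str, byte_offset: int) -> int:
    """Convert a UTF-8 byte offset within `line` to a character index."""
    return len(line.encode("utf-8")[:byte_offset].decode("utf-8"))


source = "'é' + x"
binop = ast.parse(source).body[0].value  # BinOp node for: 'é' + x
name = binop.right                       # Name node for x

print(name.col_offset)                               # offset in UTF-8 bytes
print(byte_to_char_offset(source, name.col_offset))  # index in characters
```

Because `é` occupies two bytes in UTF-8, the byte offset of `x` is one larger than its character index, so tools that highlight source text need a conversion of this kind.
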
| msg221360 | Author: Aivar Annamaa (Aivar.Annamaa)* | Date: 2014-06-23 14:34 | |
Just found out that ast.Attribute in Python 3.4 has a similar problem.
| msg235245 | Author: Mark Shannon (Mark.Shannon)* | Date: 2015-02-02 12:21 | |
This is caused by https://hg.python.org/cpython/rev/7c5c678e4164/ which is a supposed fix for http://bugs.python.org/issue16795 which claims to make "some changes to AST to make it more useful for static language analysis", seemingly by breaking all existing static analysis tools.

Could we just revert https://hg.python.org/cpython/rev/7c5c678e4164/ ?

| msg235246 | Author: Mark Shannon (Mark.Shannon)* | Date: 2015-02-02 12:23 | |
It is now very hard to determine accurate locations for an expression such as `(x+y).attr`, as the column offset of the leftmost subexpression is no longer the same as the column offset reported for the expression itself.
| msg235261 | Author: Mark Shannon (Mark.Shannon)* | Date: 2015-02-02 14:41 | |
This also breaks the col_offset for subscripts like `x[y]` and, of course, any statement with one of these expressions as its leftmost sub-expression.
| msg235266 | Author: Roundup Robot (python-dev) | Date: 2015-02-02 15:53 | |
New changeset 7d1c32ddc432 by Benjamin Peterson in branch '3.4':
revert lineno and col_offset changes from #16795 (closes #21295)
https://hg.python.org/cpython/rev/7d1c32ddc432

New changeset 8ab6b404248c by Benjamin Peterson in branch 'default':
merge 3.4 (#21295)
https://hg.python.org/cpython/rev/8ab6b404248c

| msg237577 | Author: Sven Brauch (scummos)* | Date: 2015-03-08 22:28 | |
Why did you not CC me in this discussion? It is not very nice to have this behaviour changed back from what I relied upon in a minor version without notice.

Which regression was effectively caused by this patch, except for the documentation being out of date?

| msg237581 | Author: Mark Shannon (Mark.Shannon)* | Date: 2015-03-08 22:44 | |
You are on the nosy list. You should have been sent an email.

This bug is the regression. https://hg.python.org/cpython/rev/7c5c678e4164/ resulted in incorrect column offsets for many compound expressions.

| msg237585 | Author: Sven Brauch (scummos)* | Date: 2015-03-09 00:39 | |
Hmm, strange, I did not receive any emails.

"Incorrect" by what definition of incorrect? The word does not really help to clarify the issue you see with this change, since the behaviour was changed on purpose. What is the (preferably real-world) application which is broken by this change?

| msg237670 | Author: Mark Shannon (Mark.Shannon)* | Date: 2015-03-09 16:02 | |
The column offset has always been the offset of the start of the expression. Therefore the expression `x.y` should have the same offset as the sub-expression `x`. Likewise for calls, `f(args)` should have the same offset as the `f` sub-expression.

Our static analysis tool is a real-world use case: http://semmle.com/2014/06/semmle-analysis-now-includes-python/

Presumably the submitter of this issue also had a real-world use case.

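The convention described here can be checked directly. This is a minimal sketch, assuming a current Python 3 interpreter in which the 3.4.0 change is no longer in effect; the helper `offsets` is hypothetical and only covers the three node kinds discussed in this issue.

```python
import ast


def offsets(source: str):
    """Return (col_offset of the expression, col_offset of its leftmost part)."""
    node = ast.parse(source).body[0].value
    # Call keeps its leftmost sub-expression in .func;
    # Attribute and Subscript keep it in .value.
    leftmost = node.func if isinstance(node, ast.Call) else node.value
    return node.col_offset, leftmost.col_offset


for src in ("x.y", "f(args)", "x[y]"):
    print(src, offsets(src))
```

In each pair the two numbers agree, which is the property static analysis tools rely on to locate the start of a compound expression.
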
| msg237671 | Author: Aivar Annamaa (Aivar.Annamaa)* | Date: 2015-03-09 16:09 | |
Yes, I also need col_offset to work as advertised because of a real-world use case: Thonny (http://thonny.cs.ut.ee/) is a visual Python debugger which highlights the (sub)expression about to be evaluated.
| msg237672 | Author: Sven Brauch (scummos)* | Date: 2015-03-09 16:15 | |
But if you need the start of the full expression, can't you just go up in the "parent" chain until the parent is not an expression any more?

Could an additional API be introduced which provides the value I am looking for as well as the one you need?

I was not on the nosy list, by the way; I just put myself there after I commented. And that was after 3.4.3, after I noticed my software was suddenly broken by a patch release of Python.

| msg237675 | Author: Mark Shannon (Mark.Shannon)* | Date: 2015-03-09 16:44 | |
How do I get the start of `(x+y).bit_length()` in `total += (x+y).bit_length()`? With your change, I can't get it from `x`, `x+y`, or from the whole statement.

The primary purpose of the locations is for tracebacks, not for static tools. Also, most tools need to support earlier versions of Python, and consistency between versions is the most important thing.

A third-party parser that supported full, accurate locations would be great, but I don't think the builtin parser is the place for it.

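The parent-chain idea raised earlier in the thread can be sketched for this statement as follows. AST nodes carry no parent pointers, so the sketch builds its own map; the function name `outermost_expression` is hypothetical, and the offset it prints reflects whichever interpreter runs it (under the 3.4.0 to 3.4.2 behaviour discussed here the reported offset differed).

```python
import ast


def outermost_expression(tree: ast.AST, target: ast.AST) -> ast.AST:
    """Climb from `target` until the parent is no longer an expression node."""
    # Build a child -> parent map, since ast nodes do not record their parent.
    parents = {}
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            parents[child] = parent
    node = target
    while isinstance(parents.get(node), ast.expr):
        node = parents[node]
    return node


tree = ast.parse("total += (x+y).bit_length()")
x_name = next(
    n for n in ast.walk(tree) if isinstance(n, ast.Name) and n.id == "x"
)
outer = outermost_expression(tree, x_name)
print(type(outer).__name__, outer.col_offset)
```

Climbing from `x` passes through the `BinOp` and `Attribute` nodes and stops at the `Call`, whose own `col_offset` is then the only source for the start of `(x+y).bit_length()`; the approach is only as good as that one value.
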
| msg251522 | Author: Radek Novacek (rnovacek) | Date: 2015-09-24 13:35 | |
I've run the tests from the first and second comments using Python 3.5.0, and it seems to produce correct results:

    >>> import ast
    >>> tree = ast.parse("sin(0.5)")
    >>> first_stmt = tree.body[0]
    >>> call = first_stmt.value
    >>> print("col_offset of call expression:", call.col_offset)
    col_offset of call expression: 0
    >>> print("col_offset of func of the call:", call.func.col_offset)
    col_offset of func of the call: 0

    >>> tree = ast.parse("(sin\n(0.5))")
    >>> first_stmt = tree.body[0]
    >>> call = first_stmt.value
    >>> print("col_offset of call expression:", call.col_offset)
    col_offset of call expression: 1
    >>> print("col_offset of func of the call:", call.func.col_offset)
    col_offset of func of the call: 1
    >>> print("lineno of call expression:", call.lineno)
    lineno of call expression: 1
    >>> print("lineno of func of the call:", call.lineno)
    lineno of func of the call: 1

| msg252380 | Author: Radek Novacek (rnovacek) | Date: 2015-10-06 08:42 | |
There is still a problem with col_offset in some situations. For example, the col_offset of the ast.Attribute should be 4 but is 0 instead:

    >>> for x in ast.walk(ast.parse('foo.bar')):
    ...     if hasattr(x, 'col_offset'):
    ...         print("%s: %d" % (x, x.col_offset))
    ...
    <_ast.Expr object at 0x7fcdc84722b0>: 0
    <_ast.Attribute object at 0x7fcdc84723c8>: 0
    <_ast.Name object at 0x7fcdc8472438>: 0

Is there any solution to this problem? It causes problems in Python support in KDevelop (kdev-python).

| msg252381 | Author: Aivar Annamaa (Aivar.Annamaa)* | Date: 2015-10-06 09:00 | |
Radek, the source corresponding to the Attribute node does start at col 0 in your example.
| msg252384 | Author: Radek Novacek (rnovacek) | Date: 2015-10-06 10:37 | |
Aivar, I have to admit that my knowledge of this is limited, but as I understand it, the attribute is "bar" in the "foo.bar" expression.

I can get the beginning of the expression with:

    >>> ast.parse('foo.bar').body[0].value.value.col_offset
    0

But how can I get the position of 'bar'? My guess is this:

    >>> ast.parse('foo.bar').body[0].value.col_offset

but it still returns 0.

Why do these two col_offsets return the same value? How can I get the position of 'bar' in 'foo.bar'?

| msg252386 | Author: Aivar Annamaa (Aivar.Annamaa)* | Date: 2015-10-06 10:56 | |
An ast.Attribute node actually means "the attribute of something", i.e. the node includes this "something" as a subnode.

> How can I get the position of 'bar' in 'foo.bar'?

I don't know a good way to do this, because bar is not an AST node for Python. If Python AST nodes included information about where a node ends in the source, I would take the ending col of node.value (foo in your example) and add 2.

In my own program (http://thonny.cs.ut.ee, it's a Python IDE for beginners) I'm using a really contrived algorithm for determining the end positions of nodes. See the function mark_text_ranges here:
https://bitbucket.org/plas/thonny/src/b8860704c99d47760ffacfaa335d2f8772721ba4/thonny/ast_utils.py?at=master&fileviewer=file-view-default

I'm not happy with my solution, but I don't know any other way.

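Python 3.8 and newer add `end_lineno` and `end_col_offset` to AST nodes, which is the kind of end-position information wished for above. The following is a minimal sketch under that assumption; `attr_name_col` is a hypothetical helper, not part of the `ast` module, and it assumes the attribute name sits on the node's last line and is spelled in the source exactly as stored in `node.attr`.

```python
import ast


def attr_name_col(node: ast.Attribute) -> int:
    """Column (UTF-8 byte offset) at which the attribute name starts.

    Relies on the attribute name being the last token of the Attribute
    node, so its start is the node's end minus the name's byte length.
    """
    return node.end_col_offset - len(node.attr.encode("utf-8"))


attribute = ast.parse("foo.bar").body[0].value
print(attribute.col_offset)      # start of the whole expression foo.bar
print(attr_name_col(attribute))  # start of bar
```

Working backwards from the end of the node, rather than forwards from the end of `node.value`, keeps the result correct when there is whitespace around the dot.
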
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:02 | admin | set | github: 65494 |
| 2015-10-06 10:56:58 | Aivar.Annamaa | set | messages: +msg252386 |
| 2015-10-06 10:37:16 | rnovacek | set | messages: +msg252384 |
| 2015-10-06 09:00:15 | Aivar.Annamaa | set | messages: +msg252381 |
| 2015-10-06 08:42:22 | rnovacek | set | messages: +msg252380 |
| 2015-09-24 13:35:33 | rnovacek | set | nosy: +rnovacek messages: +msg251522 |
| 2015-03-09 16:44:39 | Mark.Shannon | set | messages: +msg237675 |
| 2015-03-09 16:15:55 | scummos | set | messages: +msg237672 |
| 2015-03-09 16:09:33 | Aivar.Annamaa | set | messages: +msg237671 |
| 2015-03-09 16:02:08 | Mark.Shannon | set | messages: +msg237670 |
| 2015-03-09 00:39:26 | scummos | set | messages: +msg237585 |
| 2015-03-08 22:44:49 | Mark.Shannon | set | messages: +msg237581 |
| 2015-03-08 22:28:17 | scummos | set | nosy: +scummos messages: +msg237577 |
| 2015-02-02 15:53:31 | python-dev | set | status: open -> closed nosy: +python-dev messages: +msg235266 resolution: fixed stage: resolved |
| 2015-02-02 14:41:41 | Mark.Shannon | set | messages: +msg235261 |
| 2015-02-02 12:23:40 | Mark.Shannon | set | messages: +msg235246 |
| 2015-02-02 12:21:32 | Mark.Shannon | set | nosy: +Mark.Shannon messages: +msg235245 |
| 2014-06-23 14:34:36 | Aivar.Annamaa | set | messages: +msg221360 |
| 2014-04-19 06:14:02 | Aivar.Annamaa | set | messages: +msg216846 |
| 2014-04-19 00:38:25 | benjamin.peterson | set | messages: +msg216821 |
| 2014-04-19 00:30:17 | terry.reedy | set | nosy: +brett.cannon,georg.brandl,ncoghlan,benjamin.peterson |
| 2014-04-18 20:18:46 | flox | set | keywords: +3.4regression nosy: +flox type: behavior |
| 2014-04-18 10:58:03 | Aivar.Annamaa | set | messages: +msg216778 |
| 2014-04-18 10:49:40 | Aivar.Annamaa | create | |