
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2014-03-03 13:54 by miwa, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| test2.py | miwa, 2014-03-03 13:54 | |
| issue20844.py | steven.winfield, 2017-08-08 16:06 | Script used when reproducing the bug in slightly different ways |
| Pull Requests | | |
|---|---|---|
| URL | Status | Linked |
| PR 12616 | merged | methane, 2019-03-29 09:50 |
| PR 12647 | merged | methane, 2019-04-01 09:44 |
| Messages (15) | |||
|---|---|---|---|
| msg212637 -(view) | Author: Musashi Tamura (miwa) | Date: 2014-03-03 13:54 | |
    Microsoft Windows [Version 6.1.7601]
    Copyright (c) 2009 Microsoft Corporation. All rights reserved.

    C:\bug>python
    Python 3.3.5rc2 (v3.3.5rc2:ca5635efe090, Mar 2 2014, 18:18:29) [MSC v.1600 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> exit()

    C:\bug>python test2.py
      File "test2.py", line 1
    SyntaxError: encoding problem: iso-8859-1
| msg212638 -(view) | Author: STINNER Victor (vstinner)* | Date: 2014-03-03 14:03 | |
It's a duplicate of issue #20731.
| msg213012 -(view) | Author: Musashi Tamura (miwa) | Date: 2014-03-10 02:19 | |
It seems that this is not fixed in 3.3.5. Could someone please try to reproduce it?
| msg213014 -(view) | Author: Mark Lawrence (BreamoreBoy)* | Date: 2014-03-10 02:55 | |
Works fine for me.
| msg213189 -(view) | Author: Musashi Tamura (miwa) | Date: 2014-03-12 01:15 | |
Thanks, Mark. Perhaps the problem is text-mode handling. When using a Windows text-mode stream, ftell() may return -1 even if no error occurred.
| msg213196 -(view) | Author: Musashi Tamura (miwa) | Date: 2014-03-12 02:59 | |
When opening a file with LF newlines, ftell() may return zero even when the position is not at the beginning of the file. Maybe files with LF newlines should be opened in binary mode. http://support.microsoft.com/kb/68337
| msg214330 -(view) | Author: Marc Schlaich (schlamar)* | Date: 2014-03-21 07:52 | |
I can reproduce this one. There are a few conditions which need to be met:

- Linux line endings
- File needs to have at least x lines (empty lines are fine). I guess this is the reason why no one could reproduce it. The attached file has 19 lines but probably no one copy/pasted the empty lines. Downloading the file reproduces this in my case. The length of the encoding declaration is relevant to the number of required newlines. `#coding:latin-1` fails with a file of 19 lines, `#coding: latin-1` (whitespace added) requires 20 lines.

More observations:

- Also reproducible if utf8 is used as an alias for utf-8 (`#coding: utf8` + 17 lines), but not reproducible with utf-8
- Python 3.4 is affected, too
- No issues on Python 3.3.2
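The conditions listed above can be turned into a small generator for a test file. This is a minimal sketch, not the attached test2.py: the name `write_repro_file` and its parameters are hypothetical, and the default of 17 bare LF bytes after the declaration follows the byte count given later in this thread for the 33-byte attachment. The file is written in binary mode so that no newline translation happens on Windows.

```python
from pathlib import Path


def write_repro_file(path, declaration=b"#coding:latin-1", blank_newlines=17):
    """Write an encoding declaration followed by bare LF bytes.

    Binary mode matters here: text mode on Windows would turn each LF
    into CRLF, and the file would no longer have Unix line endings.
    """
    # One LF terminates the declaration line, the rest are empty lines.
    data = declaration + b"\n" + b"\n" * blank_newlines
    Path(path).write_bytes(data)
    return data
```

Running the resulting file with an affected interpreter on Windows is the case the reports here describe as failing; on other platforms, or on versions that contain the fix, no error is expected.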
| msg221089 -(view) | Author: Mark Lawrence (BreamoreBoy)* | Date: 2014-06-20 14:15 | |
I can reproduce this with 3.4.1 and 3.5.0.
| msg221134 -(view) | Author: Eryk Sun (eryksun)* | Date: 2014-06-20 23:28 | |
The fix for issue 20731 doesn't address this bug completely because it's possible for ftell to return -1 without an actual error, as test2.py demonstrates.

In text mode, CRLF is translated to LF by the CRT's _read function (Win32 ReadFile). So the buffer that's used by FILE streams is already translated. To get the stream position, ftell first calls _lseek (Win32 SetFilePointer) to get the file pointer. Then it adjusts the file pointer for the unwritten/unread bytes in the buffer. The problem for reading is how to tell whether or not LF in the buffer was translated from CRLF? The chosen 'solution' is to just assume CRLF.

The example file test2.py is 33 bytes. At the time fp_setreadl calls ftell(tok->fp), the file pointer is 33, and Py_UniversalNewlineFgets has read the stream up to '#coding:latin-1\n'. That leaves 17 newline characters buffered. As stated above, ftell assumes CRLF, so it calculates the stream position as 33 - (17 * 2) == -1. That happens to be the value returned for an error, but who's checking? In this case, errno is 0 instead of the documented errno constants EBADF or EINVAL.

Here's an example in 2.7.7, since it uses FILE streams:

    >>> f = open('test2.py')
    >>> f.read(16)
    '#coding:latin-1\n'
    >>> f.tell()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IOError: [Errno 0] Error

Can the file be opened in binary mode in Modules/main.c? Currently it's using `_Py_wfopen(filename, L"r")`. But decoding_fgets calls Py_UniversalNewlineFgets, which expects binary mode anyway.
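The arithmetic in this message can be checked with a short model. This is only a sketch of the calculation as described above, not the CRT's code: the function names are hypothetical, and it assumes the whole file has already been read into the buffer so that the file pointer equals the file size.

```python
def modeled_text_mode_ftell(file_pointer, unread):
    """Model the position calculation described in the message.

    Every LF still unread in the buffer is assumed to have been a CRLF
    on disk, so it counts as two bytes when stepping back from the
    file pointer.
    """
    assumed_on_disk = len(unread) + unread.count(b"\n")
    return file_pointer - assumed_on_disk


def newlines_that_give_minus_one(first_line):
    """Number of bare LFs after first_line for which the model yields -1.

    With n unread LFs the file size is len(first_line) + n, and the model
    returns len(first_line) + n - 2 * n, which is -1 when
    n == len(first_line) + 1.
    """
    return len(first_line) + 1


content = b"#coding:latin-1\n" + b"\n" * 17  # 33 bytes, LF only
unread = content[16:]                        # the 17 buffered newlines
print(modeled_text_mode_ftell(len(content), unread))  # -1
```

Under this model, `#coding:latin-1\n` needs 17 trailing newlines and `#coding: latin-1\n` needs 18, which is consistent with the earlier observation that adding one space to the declaration requires one more line.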
| msg224351 -(view) | Author: Mark Lawrence (BreamoreBoy)* | Date: 2014-07-30 21:58 | |
I've tried to make the title more meaningful; feel free to change it if you can think of something better.
| msg233129 -(view) | Author: Ned Batchelder (nedbat)* | Date: 2014-12-27 13:29 | |
This bug just bit me. Changing "# coding: utf8" to "# coding: utf-8" works around it.
| msg233130 -(view) | Author: Ned Batchelder (nedbat)* | Date: 2014-12-27 13:29 | |
(oops: with Python 3.4.1 on Windows)
| msg299935 -(view) | Author: Steven Winfield (steven.winfield) | Date: 2017-08-08 16:06 | |
I've just been bitten by this on 3.6.2, Windows Server 2008 R2, when running the setup.py script for QuantLib-SWIG: https://github.com/lballabio/QuantLib-SWIG/blob/v1.10.x/Python/setup.py

It seems there is different behaviour depending on whether:

* Unix (LF) or Windows (CRLF) line endings are used
* The file is >4096 bytes or <=4096 bytes
* The module docstring has an initial space

Some of that has been mentioned previously, but I think the 4096-byte limit might be new, which is why I'm posting.

I've attached a script I used to come up with the results below. It contains:

* a -*- coding line (for iso-8859-1 in this case)
* a docstring consisting entirely of lines of x's, of length 78
* Unix line endings

The file's length is exactly 4096 bytes.

Running this, or slightly modified versions of this, with a 3.6.2 interpreter gave the following results:

* In all cases, when Windows line endings were used there was no issue - running the script produced no errors or output.
* With Unix line endings:
  * File length <= 4096, with no leading spaces in the docstring:

          File "issue20844.py", line 1
        SyntaxError: encoding problem: iso-8859-1

  * File length > 4096, with no leading spaces in the docstring:

          File "issue20844.py", line 56
            xxxxx"""
            ^
        SyntaxError: EOF while scanning triple-quoted string literal

  * Any file length, with the first 'x' on line 3 replaced with a space (line 2 if the coding line is ignored):

          File "issue20844.py", line 2
            xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
            ^
        IndentationError: unexpected indent

I had no issues with python 2.7.13.
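A file of the kind described in this message can be generated rather than edited by hand. The sketch below is not the attached issue20844.py: `build_script`, `HEADER` and `QUOTES` are hypothetical names, and the exact layout (docstring quotes on their own lines, a shorter final line of x's used as padding) is an assumption chosen so that the total size lands exactly on the requested byte count.

```python
HEADER = b"# -*- coding: iso-8859-1 -*-\n"
QUOTES = b'"""\n'


def build_script(target_size=4096, width=78):
    """Return LF-only script bytes that are exactly target_size long.

    The script is a coding line followed by a module docstring made of
    lines of x's, each `width` characters long where space allows.
    """
    budget = target_size - len(HEADER) - 2 * len(QUOTES)
    if budget < 0:
        raise ValueError("target_size too small for header and docstring quotes")
    full_lines, remainder = divmod(budget, width + 1)
    body = (b"x" * width + b"\n") * full_lines
    if remainder:
        # Shorter last line so the total lands exactly on target_size.
        body += b"x" * (remainder - 1) + b"\n"
    return HEADER + QUOTES + body + QUOTES
```

Writing `build_script(4096)` and `build_script(4097)` to disk in binary mode gives one file on each side of the 4096-byte boundary reported above; whether they trigger the same errors depends on the interpreter version and platform.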
| msg339288 -(view) | Author: Inada Naoki (methane)* | Date: 2019-04-01 09:35 | |
New changeset 10654c19b5e6efdf3c529ff9bf7bcab89bdca1c1 by Inada Naoki in branch 'master':
bpo-20844: open script file with "rb" mode (GH-12616)
https://github.com/python/cpython/commit/10654c19b5e6efdf3c529ff9bf7bcab89bdca1c1
| msg339290 -(view) | Author: Inada Naoki (methane)* | Date: 2019-04-01 12:03 | |
New changeset 8384670615a90418fc52c3881242b7c10d1f2b13 by Inada Naoki in branch '3.7':
bpo-20844: open script file with "rb" mode (GH-12616)
https://github.com/python/cpython/commit/8384670615a90418fc52c3881242b7c10d1f2b13
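The commits above open the script file in "rb" mode. As a rough Python-level analogy only (it uses Python's own io layer, not the C FILE streams touched by the commits, and `position_after_first_line` is a hypothetical helper), reading an LF-only file in binary mode reports a byte-exact offset after the first line, because no newline translation is involved.

```python
import os
import tempfile


def position_after_first_line(data):
    """Write data to a temporary file, reopen it with "rb", and return
    the first line together with the offset reported right after it."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(data)
        with open(path, "rb") as script:
            line = script.readline()
            return line, script.tell()
    finally:
        os.remove(path)


content = b"#coding:latin-1\n" + b"\n" * 17  # same layout as the 33-byte example
print(position_after_first_line(content))    # (b'#coding:latin-1\n', 16)
```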
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:59 | admin | set | github: 65043 |
| 2019-05-05 18:32:02 | eryksun | link | issue36800 superseder |
| 2019-04-01 12:03:46 | methane | set | status: open -> closed stage: patch review -> resolved resolution: fixed versions: + Python 3.7, Python 3.8, - Python 3.4, Python 3.5, Python 3.6 |
| 2019-04-01 12:03:01 | methane | set | messages: +msg339290 |
| 2019-04-01 09:44:49 | methane | set | pull_requests: +pull_request12579 |
| 2019-04-01 09:35:34 | methane | set | nosy: +methane messages: +msg339288 |
| 2019-03-29 11:15:53 | methane | link | issue27797 superseder |
| 2019-03-29 09:50:42 | methane | set | keywords: +patch stage: patch review pull_requests: +pull_request12552 |
| 2018-02-09 15:54:47 | eryksun | link | issue32809 superseder |
| 2017-08-08 22:10:10 | BreamoreBoy | set | nosy: -BreamoreBoy |
| 2017-08-08 16:06:58 | steven.winfield | set | files: +issue20844.py versions: + Python 3.6 nosy: +steven.winfield messages: +msg299935 |
| 2014-12-27 13:29:55 | nedbat | set | messages: +msg233130 |
| 2014-12-27 13:29:27 | nedbat | set | nosy: +nedbat messages: +msg233129 |
| 2014-07-30 21:58:50 | BreamoreBoy | set | type: behavior title: coding bug remains in 3.3.5rc2 -> SyntaxError: encoding problem: iso-8859-1 on Windows components: + Interpreter Core versions: + Python 3.5, - Python 3.3 nosy: +tim.golden,zach.ware messages: +msg224351 |
| 2014-06-20 23:28:08 | eryksun | set | nosy: +eryksun messages: +msg221134 |
| 2014-06-20 14:15:00 | BreamoreBoy | set | messages: +msg221089 |
| 2014-03-21 07:52:57 | schlamar | set | nosy: +schlamar messages: +msg214330 versions: + Python 3.4 |
| 2014-03-12 02:59:16 | miwa | set | messages: +msg213196 |
| 2014-03-12 01:15:33 | miwa | set | messages: +msg213189 |
| 2014-03-10 02:55:40 | BreamoreBoy | set | nosy: +BreamoreBoy messages: +msg213014 |
| 2014-03-10 02:19:30 | miwa | set | messages: +msg213012 |
| 2014-03-03 14:03:53 | vstinner | set | nosy: +vstinner messages: +msg212638 |
| 2014-03-03 14:02:35 | vstinner | set | nosy: +benjamin.peterson |
| 2014-03-03 13:54:42 | miwa | create | |