Extended ASCII characters in multiline strings cause "SystemError: Negative size passed to PyUnicode_New" when the encoding is not specified #96611

Status: Closed
Assignee: mdboom
Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), topic-unicode, type-bug (An unexpected behavior, bug, or error)
Opened by: @polprog

Bug report

In some cases, when a source file that is not UTF-8 encoded contains a multi-line string, Python raises SystemError: Negative size passed to PyUnicode_New and does not execute any code.

Minimal test case:

print("""
ą""")

This is only a problem if the non-UTF-8 character lies on a new line inside the string, after the line that opens it (at any position in that line).

A similar test case, with the character on the same line as the opening quotes, behaves correctly:

print("""ą""")

It reports an encoding error, which is the expected behavior:

SyntaxError: Non-UTF-8 code starting with '\xb1' in file C:\Users\xxxxx\test.py on line 2, but no encoding declared; see https://python.org/dev/peps/pep-0263/ for details
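
The byte '\xb1' in the message above would match a file saved as ISO-8859-2, where that byte encodes ą. The report does not name the encoding, so treat that as an assumption. Under it, a minimal sketch of why the byte is rejected as UTF-8, and of how a PEP 263 declaration lets the same bytes compile:

```python
raw = b"\xb1"

# Assumption: the file was saved as ISO-8859-2 (Latin-2), where 0xB1 is 'ą'.
decoded = raw.decode("iso-8859-2")

# A lone 0xB1 is a UTF-8 continuation byte, so it cannot start a character.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# With a PEP 263 declaration on the first line, compile() decodes the
# bytes with the declared codec instead of the UTF-8 default.
source = b'# -*- coding: iso-8859-2 -*-\ntext = """\n\xb1"""\n'
namespace = {}
exec(compile(source, "declared.py", "exec"), namespace)

print(repr(decoded), utf8_ok, repr(namespace["text"]))
```

This only illustrates the encoding side of the report. compile() on a bytes object may not go through the same file-reading code path as running a script, so it is not a reproduction of the SystemError itself.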

Since this is an encoding-related error, both files are attached (as .txt, because GitHub does not allow .py attachments).
test.txt - single line (correct behavior)
test_ml.txt - multi line (bug)
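
The attachments are not reproduced here, so the following is only a sketch of how the two cases could be recreated and run. The byte layouts in CASES are assumptions based on the description (a single 0xB1 byte inside a triple-quoted string, with or without a preceding newline), and the names CASES and run_case are hypothetical helpers. The real files may differ in whitespace.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Assumed byte layouts; no encoding declaration, one non-UTF-8 byte (0xB1).
CASES = {
    "test.py": b'print("""\xb1""")\n',        # single line
    "test_ml.py": b'print("""\n\xb1""")\n',   # byte on a new line in the string
}


def run_case(directory, name, data):
    """Write the raw bytes to a file and run it with the current interpreter."""
    path = Path(directory) / name
    path.write_bytes(data)
    proc = subprocess.run(
        [sys.executable, str(path)],
        capture_output=True,
        text=True,
        errors="replace",
    )
    return proc.returncode, proc.stderr


results = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, data in CASES.items():
        results[name] = run_case(tmp, name, data)

for name, (code, stderr) in results.items():
    lines = stderr.strip().splitlines()
    print(f"{name}: exit={code} {lines[-1] if lines else ''}")
```

On an affected interpreter such as the 3.10.6 build listed below, the multi-line case would be expected to end in the SystemError from the title. On other versions both cases may report the SyntaxError instead.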

My environment

  • CPython versions tested on: Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)] on win32
  • Operating system and architecture: Windows 10 Pro 21H2 (19044.1826)
