The Wayback Machine - https://web.archive.org/web/20210618125244/https://github.com/python/peps/blob/master/pep-0623.rst
PEP: 623
Title: Remove wstr from Unicode
Author: Inada Naoki <songofacandy@gmail.com>
BDFL-Delegate: Victor Stinner <vstinner@python.org>
Status: Accepted
Type: Standards Track
Content-Type: text/x-rst
Created: 25-Jun-2020
Python-Version: 3.10

Abstract

PEP 393 deprecated some Unicode APIs, and introduced wchar_t *wstr and Py_ssize_t wstr_length in the Unicode structure to support these deprecated APIs. [1]

This PEP plans to remove wstr and wstr_length, together with the deprecated APIs that use these members, by Python 3.12.

Deprecated APIs that do not use these members are out of scope because they can be removed independently.

Motivation

Memory usage

str is one of the most used types in Python. Even the simplest ASCII strings have a wstr member, which consumes 8 bytes per string on 64-bit systems.
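The 8-byte figure can be checked with a rough sketch. The two ctypes structures below are a simplified model of the compact ASCII string header described in PEP 393, assuming a regular release build with a two-field object header; the class names are hypothetical and the real CPython definitions contain more detail.

```python
import ctypes

# Simplified model of the PEP 393 compact ASCII header (assumption:
# release build, object header is just a refcount and a type pointer).
COMMON_FIELDS = [
    ("ob_refcnt", ctypes.c_ssize_t),
    ("ob_type", ctypes.c_void_p),
    ("length", ctypes.c_ssize_t),
    ("hash", ctypes.c_ssize_t),
    ("state", ctypes.c_uint),
]


class ASCIIHeaderWithWstr(ctypes.Structure):
    # Layout that still carries the legacy wchar_t *wstr member.
    _fields_ = COMMON_FIELDS + [("wstr", ctypes.c_wchar_p)]


class ASCIIHeaderWithoutWstr(ctypes.Structure):
    # Layout after the member is dropped.
    _fields_ = list(COMMON_FIELDS)


saved_bytes = ctypes.sizeof(ASCIIHeaderWithWstr) - ctypes.sizeof(ASCIIHeaderWithoutWstr)
print("bytes saved per string:", saved_bytes)
```

In this model the saving equals the size of one pointer, which is 8 bytes on 64-bit systems.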

Runtime overhead

To support legacy Unicode objects, many Unicode APIs must call PyUnicode_READY().

We can remove this overhead too by dropping support for legacy Unicode objects.

Simplicity

Supporting legacy Unicode objects makes the Unicode implementation more complex. Until we drop legacy Unicode objects, it is very hard to try other Unicode implementations, such as the UTF-8 based implementation in PyPy.

Rationale

Python 4.0 is not scheduled yet

PEP 393 introduced an efficient internal representation of Unicode and removed the border between "narrow" and "wide" builds of Python.

PEP 393 was implemented in Python 3.3, which was released in 2012. The old APIs have been deprecated since then, and their removal was scheduled for Python 4.0.

When PEP 393 was accepted, Python 4.0 was expected to be the version after Python 3.9. But the version after Python 3.9 is Python 3.10, not 4.0. This is why this PEP reschedules the removal.

Python 2 reached EOL

Since Python 2 didn't have the PEP 393 Unicode implementation, the legacy APIs might have helped C extension modules support both Python 2 and 3.

But Python 2 reached EOL in 2020. We can now remove the legacy APIs that were kept for compatibility with Python 2.

Plan

Python 3.9

These macros and functions are marked as deprecated, using the Py_DEPRECATED macro.

  • Py_UNICODE_WSTR_LENGTH()
  • PyUnicode_GET_SIZE()
  • PyUnicode_GetSize()
  • PyUnicode_GET_DATA_SIZE()
  • PyUnicode_AS_UNICODE()
  • PyUnicode_AS_DATA()
  • PyUnicode_AsUnicode()
  • _PyUnicode_AsUnicode()
  • PyUnicode_AsUnicodeAndSize()
  • PyUnicode_FromUnicode()
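For orientation, the Python 3.3 release notes and the C API documentation suggest PyUnicode_GetLength(), PyUnicode_AsWideCharString(), and PyUnicode_FromWideChar() as replacements for several of the functions above. The sketch below calls them from Python through ctypes.pythonapi only so that it runs without compiling an extension. It assumes CPython, the helper name roundtrip is hypothetical, and real extension code would call the same functions directly from C.

```python
import ctypes

api = ctypes.pythonapi  # assumption: running on CPython

# Py_ssize_t PyUnicode_GetLength(PyObject *unicode)
api.PyUnicode_GetLength.argtypes = [ctypes.py_object]
api.PyUnicode_GetLength.restype = ctypes.c_ssize_t

# wchar_t *PyUnicode_AsWideCharString(PyObject *unicode, Py_ssize_t *size)
api.PyUnicode_AsWideCharString.argtypes = [
    ctypes.py_object,
    ctypes.POINTER(ctypes.c_ssize_t),
]
api.PyUnicode_AsWideCharString.restype = ctypes.c_void_p

# PyObject *PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
api.PyUnicode_FromWideChar.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
api.PyUnicode_FromWideChar.restype = ctypes.py_object

# void PyMem_Free(void *p)
api.PyMem_Free.argtypes = [ctypes.c_void_p]
api.PyMem_Free.restype = None


def roundtrip(text):
    """Return (length, copy) using only the non-deprecated APIs."""
    size = ctypes.c_ssize_t()
    # The caller owns the returned buffer and must release it with PyMem_Free().
    buf = api.PyUnicode_AsWideCharString(text, ctypes.byref(size))
    try:
        wide = ctypes.wstring_at(buf, size.value)
    finally:
        api.PyMem_Free(buf)
    # A size of -1 asks PyUnicode_FromWideChar() to compute the length itself.
    copy = api.PyUnicode_FromWideChar(wide, -1)
    return api.PyUnicode_GetLength(text), copy


length, copy = roundtrip("héllo")
print(length, copy)
```

Unlike PyUnicode_AsUnicode(), PyUnicode_AsWideCharString() returns a fresh buffer rather than a pointer cached in the string object, which is why the explicit PyMem_Free() call is needed.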

Python 3.10

  • The following macros and enum members are marked as deprecated. The Py_DEPRECATED(3.10) macro is used where possible; where it cannot be used easily, they are deprecated only in comments and documentation.
    • PyUnicode_WCHAR_KIND
    • PyUnicode_READY()
    • PyUnicode_IS_READY()
    • PyUnicode_IS_COMPACT()
  • PyUnicode_FromUnicode(NULL, size) and PyUnicode_FromStringAndSize(NULL, size) emit DeprecationWarning when size > 0.
  • PyArg_ParseTuple() and PyArg_ParseTupleAndKeywords() emit DeprecationWarning when the u, u#, Z, and Z# formats are used.
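One way to locate callers affected by these warnings is to promote DeprecationWarning to an error while exercising the code. The sketch below is only an illustration: legacy_call is a hypothetical stand-in that raises the warning by hand, where a real extension function would trigger it from C, for example by parsing its arguments with the u format.

```python
import warnings


def legacy_call():
    # Hypothetical stand-in for an extension function that still uses a
    # deprecated API; the warning text here is made up for the example.
    warnings.warn("hypothetical: deprecated format used", DeprecationWarning, stacklevel=2)
    return "ok"


def find_deprecation(func):
    """Return the DeprecationWarning raised by func(), or None if there is none."""
    with warnings.catch_warnings():
        # Inside this block, DeprecationWarning is raised as an exception.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning as exc:
            return exc
    return None


print(find_deprecation(legacy_call))
```

The same effect is available for a whole test run with the interpreter option -W error::DeprecationWarning.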

Python 3.12

  • The following members are removed from the Unicode structures:
    • wstr
    • wstr_length
    • state.compact
    • state.ready
  • The PyUnicodeObject structure is removed.
  • The following macros, functions, and enum members are removed:
    • Py_UNICODE_WSTR_LENGTH()
    • PyUnicode_GET_SIZE()
    • PyUnicode_GetSize()
    • PyUnicode_GET_DATA_SIZE()
    • PyUnicode_AS_UNICODE()
    • PyUnicode_AS_DATA()
    • PyUnicode_AsUnicode()
    • _PyUnicode_AsUnicode()
    • PyUnicode_AsUnicodeAndSize()
    • PyUnicode_FromUnicode()
    • PyUnicode_WCHAR_KIND
    • PyUnicode_READY()
    • PyUnicode_IS_READY()
    • PyUnicode_IS_COMPACT()
  • PyUnicode_FromStringAndSize(NULL, size) raises RuntimeError when size > 0.
  • PyArg_ParseTuple() and PyArg_ParseTupleAndKeywords() raise SystemError when the u, u#, Z, and Z# formats are used, as with any other unsupported format character.

Discussion

References

[1] PEP 393 -- Flexible String Representation (https://www.python.org/dev/peps/pep-0393/)

Copyright

This document has been placed in the public domain.

