
Python Enhancement Proposals

PEP 616 – String methods to remove prefixes and suffixes

Author:
Dennis Sweeney <sweeney.dennis650 at gmail.com>
Sponsor:
Eric V. Smith <eric at trueblade.com>
Status:
Final
Type:
Standards Track
Created:
19-Mar-2020
Python-Version:
3.9
Post-History:
20-Mar-2020

Abstract

This is a proposal to add two new methods, removeprefix() and removesuffix(), to the APIs of Python’s various string objects. These methods would remove a prefix or suffix (respectively) from a string, if present, and would be added to Unicode str objects, binary bytes and bytearray objects, and collections.UserString.

Rationale

There have been repeated issues on Python-Ideas [2][3], Python-Dev [4][5][6][7], the Bug Tracker, and StackOverflow [8], related to user confusion about the existing str.lstrip and str.rstrip methods. These users are typically expecting the behavior of removeprefix and removesuffix, but they are surprised that the parameter for lstrip is interpreted as a set of characters, not a substring. This repeated issue is evidence that these methods are useful. The new methods allow a cleaner redirection of users to the desired behavior.

As another testimonial for the usefulness of these methods, several users on Python-Ideas [2] reported frequently including similar functions in their code for productivity. The implementation often contained subtle mistakes regarding the handling of the empty string, so a well-tested built-in method would be useful.

The existing solutions for creating the desired behavior are to either implement the methods as in the Specification below, or to use regular expressions as in the expression re.sub('^' + re.escape(prefix), '', s), which is less discoverable, requires a module import, and results in less readable code.
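For comparison, the regular-expression workaround can be sketched next to the proposed method. The helper name and the sample strings are assumptions made for illustration only; re.escape is what keeps characters such as "." in the prefix from being treated as regex metacharacters.

```python
import re


def removeprefix_with_regex(s: str, prefix: str) -> str:
    # Hypothetical helper wrapping the regex workaround quoted above.
    # Without re.MULTILINE, '^' matches only at the start of the string,
    # so at most one copy of the prefix is removed.
    return re.sub('^' + re.escape(prefix), '', s)


name = "context.abs"
via_regex = removeprefix_with_regex(name, "context.")
via_method = name.removeprefix("context.")  # requires Python 3.9+

print(via_regex)
print(via_method)
```

Both calls are expected to agree; the method form needs no import and states its intent directly.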

Specification

The builtin str class will gain two new methods which will behave as follows when type(self) is type(prefix) is type(suffix) is str:

    def removeprefix(self: str, prefix: str, /) -> str:
        if self.startswith(prefix):
            return self[len(prefix):]
        else:
            return self[:]

    def removesuffix(self: str, suffix: str, /) -> str:
        # suffix='' should not call self[:-0].
        if suffix and self.endswith(suffix):
            return self[:-len(suffix)]
        else:
            return self[:]

When the arguments are instances of str subclasses, the methods should behave as though those arguments were first coerced to base str objects, and the return value should always be a base str.
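The subclass rule can be checked with a short sketch. The Tagged class is a hypothetical str subclass introduced only for this illustration; the sketch assumes Python 3.9 or later, where the methods exist.

```python
class Tagged(str):
    """Hypothetical str subclass used only for this illustration."""


s = Tagged("test_example")
prefix = Tagged("test_")

# Both the receiver and the argument are subclass instances.
result = s.removeprefix(prefix)
# No match: the value is unchanged, but the type is still the base str.
unchanged = s.removeprefix("nope_")

print(result, type(result) is str)
print(unchanged, type(unchanged) is str)
```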

Methods with the corresponding semantics will be added to the builtin bytes and bytearray objects. If b is either a bytes or bytearray object, then b.removeprefix() and b.removesuffix() will accept any bytes-like object as an argument. The two methods will also be added to collections.UserString, with similar behavior.
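A brief sketch of the binary and UserString variants, assuming Python 3.9 or later. The sample values are made up for illustration; the point is that the argument may be any bytes-like object while the result keeps the receiver's type.

```python
from collections import UserString

data = b"test_payload.bin"

# bytes receiver, bytes argument
no_prefix = data.removeprefix(b"test_")
# bytes receiver, bytearray argument (a bytes-like object)
no_suffix = data.removesuffix(bytearray(b".bin"))
# bytearray receiver, memoryview argument
mutable = bytearray(data).removeprefix(memoryview(b"test_"))

# UserString keeps its own type as well
wrapped = UserString("test_name").removeprefix("test_")

print(no_prefix, no_suffix, mutable, wrapped)
```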

Motivating examples from the Python standard library

The examples below demonstrate how the proposed methods can make code one or more of the following:

  1. Less fragile:

    The code will not depend on the user to count the length of a literal.

  2. More performant:

    The code does not require a call to the Python built-in len function nor to the more expensive str.replace() method.

  3. More descriptive:

    The methods give a higher-level API for code readability as opposed to the traditional method of string slicing.

find_recursionlimit.py

  • Current:
    if test_func_name.startswith("test_"):
        print(test_func_name[5:])
    else:
        print(test_func_name)
  • Improved:
    print(test_func_name.removeprefix("test_"))

deccheck.py

This is an interesting case because the author chose to use the str.replace method in a situation where only a prefix was intended to be removed.

  • Current:
    if funcname.startswith("context."):
        self.funcname = funcname.replace("context.", "")
        self.contextfunc = True
    else:
        self.funcname = funcname
        self.contextfunc = False
  • Improved:
    if funcname.startswith("context."):
        self.funcname = funcname.removeprefix("context.")
        self.contextfunc = True
    else:
        self.funcname = funcname
        self.contextfunc = False
  • Arguably further improved:
    self.contextfunc = funcname.startswith("context.")
    self.funcname = funcname.removeprefix("context.")

cookiejar.py

  • Current:
    def strip_quotes(text):
        if text.startswith('"'):
            text = text[1:]
        if text.endswith('"'):
            text = text[:-1]
        return text
  • Improved:
    def strip_quotes(text):
        return text.removeprefix('"').removesuffix('"')

test_i18n.py

  • Current:
    creationDate = header['POT-Creation-Date']

    # peel off the escaped newline at the end of string
    if creationDate.endswith('\\n'):
        creationDate = creationDate[:-len('\\n')]
  • Improved:
    creationDate = header['POT-Creation-Date'].removesuffix('\\n')

There were many other such examples in the stdlib.

Rejected Ideas

Expand the lstrip and rstrip APIs

Because lstrip takes a string as its argument, it could be viewed as taking an iterable of length-1 strings. The API could, therefore, be generalized to accept any iterable of strings, which would be successively removed as prefixes. While this behavior would be consistent, it would not be obvious for users to have to call 'foobar'.lstrip(('foo',)) for the common use case of a single prefix.

Remove multiple copies of a prefix

This is the behavior that would be consistent with the aforementioned expansion of the lstrip/rstrip API: repeatedly applying the function until the argument is unchanged. This behavior is attainable from the proposed behavior via the following:

    >>> s = 'Foo' * 100 + 'Bar'
    >>> prefix = 'Foo'
    >>> while s.startswith(prefix): s = s.removeprefix(prefix)
    >>> s
    'Bar'

Raising an exception when not found

There was a suggestion that s.removeprefix(pre) should raise an exception if not s.startswith(pre). However, this does not match the behavior and feel of other string methods. A required=False keyword could be added, but this violates the KISS principle.
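Code that does want the raising behavior can layer it on top of the proposed method. The following is only a sketch: the helper name removeprefix_strict and the choice of ValueError are assumptions, not part of the proposal.

```python
def removeprefix_strict(s: str, prefix: str) -> str:
    # Hypothetical helper: raise instead of silently returning s
    # when the prefix is absent. ValueError is an assumed choice.
    if not s.startswith(prefix):
        raise ValueError(f"{s!r} does not start with {prefix!r}")
    return s.removeprefix(prefix)  # requires Python 3.9+


print(removeprefix_strict("test_login", "test_"))

try:
    removeprefix_strict("login", "test_")
except ValueError as exc:
    print("rejected:", exc)
```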

Accepting a tuple of affixes

It could be convenient to write the test_concurrent_futures.py example above as name.removesuffix(('Mixin', 'Tests', 'Test')), so there was a suggestion that the new methods be able to take a tuple of strings as an argument, similar to the startswith() API. Within the tuple, only the first matching affix would be removed. This was rejected on the following grounds:

  • This behavior can be surprising or visually confusing, especially when one prefix is empty or is a substring of another prefix, as in 'FooBar'.removeprefix(('', 'Foo')) == 'FooBar' or 'FooBartext'.removeprefix(('Foo', 'FooBar')) == 'Bartext'.
  • The API for str.replace() only accepts a single pair of replacement strings, but has stood the test of time by refusing the temptation to guess in the face of ambiguous multiple replacements.
  • There may be a compelling use case for such a feature in the future, but generalization before the basic feature sees real-world use would be easy to get permanently wrong.
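The surprising cases named in the first bullet can be reproduced with a small stand-in for the rejected tuple behavior. The function remove_first_prefix is hypothetical and exists only to model "remove the first matching affix"; the real methods accept a single string.

```python
def remove_first_prefix(s: str, prefixes: tuple) -> str:
    # Hypothetical model of the rejected tuple API: the first prefix
    # in the tuple that matches is removed, and the rest are ignored.
    for prefix in prefixes:
        if s.startswith(prefix):
            return s[len(prefix):]
    return s


# The empty prefix always matches first, so nothing is removed.
print(remove_first_prefix('FooBar', ('', 'Foo')))
# 'Foo' matches before the longer 'FooBar' is ever tried.
print(remove_first_prefix('FooBartext', ('Foo', 'FooBar')))
```

In both calls the result depends on the order of the tuple, which is the ambiguity the rejection refers to.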

Alternative Method Names

Several alternative method names have been proposed. Some are listed below, along with commentary for why they should be rejected in favor of removeprefix (the same arguments hold for removesuffix).

  • ltrim, trimprefix, etc.:

    “Trim” does in other languages (e.g. JavaScript, Java, Go, PHP) what strip methods do in Python.

  • lstrip(string=...)

    This would avoid adding a new method, but for different behavior, it’s better to have two different methods than one method with a keyword argument that selects the behavior.

  • remove_prefix:

    All of the other methods of the string API, e.g. str.startswith(), use lowercase rather than lower_case_with_underscores.

  • removeleft, leftremove, or lremove:

    The explicitness of “prefix” is preferred.

  • cutprefix, deleteprefix, withoutprefix, dropprefix, etc.:

    Many of these might have been acceptable, but “remove” is unambiguous and matches how one would describe the “remove the prefix” behavior in English.

  • stripprefix:

    Users may benefit from remembering that “strip” means working with sets of characters, while other methods work with substrings, so re-using “strip” here should be avoided.

How to Teach This

Among the uses for the partition(), startswith(), and split() string methods or the enumerate() or zip() built-in functions, a common theme is that if a beginner finds themselves manually indexing or slicing a string, then they should consider whether there is a higher-level method that better communicates what the code should do rather than merely how the code should do it. The proposed removeprefix() and removesuffix() methods expand the high-level string “toolbox” and further allow for this sort of skepticism toward manual slicing.

The main opportunity for user confusion will be the conflation of lstrip/rstrip with removeprefix/removesuffix. It may therefore be helpful to emphasize (as the documentation will) the following differences between the methods:

  • (l/r)strip:
    • The argument is interpreted as a character set.
    • The characters are repeatedly removed from the appropriate end of the string.
  • remove(prefix/suffix):
    • The argument is interpreted as an unbroken substring.
    • At most one copy of the prefix/suffix is removed.
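The differences listed above can be seen side by side in a short sketch (Python 3.9 or later). The sample strings are chosen for illustration so that the character-set behavior of the strip methods removes more than the intended affix.

```python
s = "test_test_settings"

# lstrip treats "test_" as the character set {t, e, s, _} and keeps
# stripping until it reaches a character outside that set.
print(s.lstrip("test_"))        # → ings
# removeprefix treats "test_" as one substring and removes one copy.
print(s.removeprefix("test_"))  # → test_settings

name = "jazz.gz"
# rstrip strips any trailing '.', 'g' or 'z' characters.
print(name.rstrip(".gz"))        # → ja
# removesuffix removes only the exact trailing ".gz".
print(name.removesuffix(".gz"))  # → jazz
```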

Reference Implementation

See the pull request on GitHub [1].

History of Major revisions

  • Version 3: Remove tuple behavior.
  • Version 2: Changed name to removeprefix/removesuffix; added support for tuples as arguments
  • Version 1: Initial draft with cutprefix/cutsuffix

References

[1]
GitHub pull request with implementation (https://github.com/python/cpython/pull/18939)
[2]
[Python-Ideas] “New explicit methods to trim strings” (https://mail.python.org/archives/list/python-ideas@python.org/thread/RJARZSUKCXRJIP42Z2YBBAEN5XA7KEC3/)
[3]
“Re: [Python-ideas] adding a trim convenience function” (https://mail.python.org/archives/list/python-ideas@python.org/thread/SJ7CKPZSKB5RWT7H3YNXOJUQ7QLD2R3X/#C2W5T7RCFSHU5XI72HG53A6R3J3SN4MV)
[4]
“Re: [Python-Dev] strip behavior provides inconsistent results with certain strings” (https://mail.python.org/archives/list/python-ideas@python.org/thread/XYFQMFPUV6FR2N5BGYWPBVMZ5BE5PJ6C/#XYFQMFPUV6FR2N5BGYWPBVMZ5BE5PJ6C)
[5]
[Python-Dev] “correction of a bug” (https://mail.python.org/archives/list/python-dev@python.org/thread/AOZ7RFQTQLCZCTVNKESZI67PB3PSS72X/#AOZ7RFQTQLCZCTVNKESZI67PB3PSS72X)
[6]
[Python-Dev] “str.lstrip bug?” (https://mail.python.org/archives/list/python-dev@python.org/thread/OJDKRIESKGTQFNLX6KZSGKU57UXNZYAN/#CYZUFFJ2Q5ZZKMJIQBZVZR4NSLK5ZPIH)
[7]
[Python-Dev] “strip behavior provides inconsistent results with certain strings” (https://mail.python.org/archives/list/python-dev@python.org/thread/ZWRGCGANHGVDPP44VQKRIYOYX7LNVDVG/#ZWRGCGANHGVDPP44VQKRIYOYX7LNVDVG)
[8]
Comment listing Bug Tracker and StackOverflow issues (https://mail.python.org/archives/list/python-ideas@python.org/message/GRGAFIII3AX22K3N3KT7RB4DPBY3LPVG/)

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.


Source: https://github.com/python/peps/blob/main/peps/pep-0616.rst

Last modified: 2025-02-01 08:55:40 GMT

