
pandas.DataFrame.replace

DataFrame.replace(to_replace=None, value=<no_default>, *, inplace=False, limit=None, regex=False, method=<no_default>)

Replace values given in to_replace with value.

Values of the Series/DataFrame are replaced with other values dynamically. This differs from updating with .loc or .iloc, which require you to specify a location to update with some value.
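The difference between value-based and location-based updates can be sketched as follows. This is a minimal illustration, not part of the reference itself; the frame and the names by_value and by_location are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 0], "B": [0, 5, 6]})

# Value-based: every 0 becomes -1, wherever it occurs in the frame.
by_value = df.replace(0, -1)

# Location-based: only the cell at row 0, column "A" is changed.
by_location = df.copy()
by_location.loc[0, "A"] = -1
```

Here by_value changes three cells, while by_location changes exactly one.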

Parameters:
to_replace : str, regex, list, dict, Series, int, float, or None

How to find the values that will be replaced.

  • numeric, str or regex:

    • numeric: numeric values equal to to_replace will be replaced with value

    • str: string exactly matching to_replace will be replaced with value

    • regex: regexes matching to_replace will be replaced with value

  • list of str, regex, or numeric:

    • First, if to_replace and value are both lists, they must be the same length.

    • Second, if regex=True then all of the strings in both lists will be interpreted as regexes; otherwise they will match directly. This doesn’t matter much for value since there are only a few possible substitution regexes you can use.

    • str, regex and numeric rules apply as above.

  • dict:

    • Dicts can be used to specify different replacement values for different existing values. For example, {'a': 'b', 'y': 'z'} replaces the value ‘a’ with ‘b’ and ‘y’ with ‘z’. To use a dict in this way, the optional value parameter should not be given.

    • For a DataFrame, a dict can specify that different values should be replaced in different columns. For example, {'a': 1, 'b': 'z'} looks for the value 1 in column ‘a’ and the value ‘z’ in column ‘b’ and replaces these values with whatever is specified in value. The value parameter should not be None in this case. You can treat this as a special case of passing two lists except that you are specifying the column to search in.

    • For a DataFrame, nested dictionaries, e.g., {'a': {'b': np.nan}}, are read as follows: look in column ‘a’ for the value ‘b’ and replace it with NaN. The optional value parameter should not be specified to use a nested dict in this way. You can nest regular expressions as well. Note that column names (the top-level dictionary keys in a nested dictionary) cannot be regular expressions.

  • None:

    • This means that the regex argument must be a string, compiled regular expression, or list, dict, ndarray or Series of such elements. If value is also None then this must be a nested dictionary or Series.

See the examples section for examples of each of these.
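As a small sketch of the nested-dict form combined with a regular expression, the following restricts a pattern to one column. The frame contents are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": ["bat", "foo", "bait"],
                   "B": ["abc", "bar", "xyz"]})

# The top-level key "A" is a literal column name; the inner key is a pattern.
# "bar" in column "B" would also match ^ba.$, but it is left alone because
# the nested dict only targets column "A".
result = df.replace({"A": {r"^ba.$": "new"}}, regex=True)
```

No value argument is passed, since the replacement is already given inside the nested dict.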

value : scalar, dict, list, str, regex, default None

Value to replace any values matching to_replace with. For a DataFrame, a dict of values can be used to specify which value to use for each column (columns not in the dict will not be filled). Regular expressions, strings and lists or dicts of such objects are also allowed.
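A per-column value dict might be used as in the sketch below, where a single scalar is searched for and each listed column gets its own replacement. The column names and numbers are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 1], "B": [0, 2], "C": [0, 3]})

# Replace 0 with 10 in column "A" and with 20 in column "B".
# Column "C" is not listed in the dict, so its 0 is left untouched.
result = df.replace(0, {"A": 10, "B": 20})
```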

inplace : bool, default False

If True, performs operation inplace and returns None.
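The effect of inplace on the calling object can be sketched like this. The example deliberately ignores the return value of the in-place call and only inspects the frame; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 2]})

# Default: a new object is returned and df keeps its original values.
replaced = df.replace(0, 99)
before = df["A"].tolist()

# inplace=True: df itself is modified; the return value is not used here.
df.replace(0, 99, inplace=True)
after = df["A"].tolist()
```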

limit : int, default None

Maximum size gap to forward or backward fill.

Deprecated since version 2.1.0.

regex : bool or same types as to_replace, default False

Whether to interpret to_replace and/or value as regular expressions. Alternatively, this could be a regular expression or a list, dict, or array of regular expressions, in which case to_replace must be None.

method : {‘pad’, ‘ffill’, ‘bfill’}

The method to use for replacement when to_replace is a scalar, list or tuple and value is None.

Deprecated since version 2.1.0.

Returns:
Series/DataFrame

Object after replacement.

Raises:
AssertionError
  • If regex is not a bool and to_replace is not None.

TypeError
  • If to_replace is not a scalar, array-like, dict, or None

  • If to_replace is a dict and value is not a list, dict, ndarray, or Series

  • If to_replace is None and regex is not compilable into a regular expression or is a list, dict, ndarray, or Series.

  • When replacing multiple bool or datetime64 objects and the arguments to to_replace do not match the type of the value being replaced

ValueError
  • If a list or an ndarray is passed to to_replace and value but they are not the same length.
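The length check on paired lists can be observed with a short sketch like the one below. The exact wording of the error message is not assumed, only that a ValueError is raised; error_message is a name introduced for the example.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 2]})

try:
    # Two values to find but three replacements: the lengths disagree.
    df.replace([0, 1], [4, 3, 2])
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None
```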

See also

Series.fillna

Fill NA values.

DataFrame.fillna

Fill NA values.

Series.where

Replace values based on boolean condition.

DataFrame.where

Replace values based on boolean condition.

DataFrame.map

Apply a function to a DataFrame elementwise.

Series.map

Map values of Series according to an input mapping or function.

Series.str.replace

Simple string replacement.

Notes

  • Regex substitution is performed under the hood with re.sub. The rules for substitution for re.sub are the same.

  • Regular expressions will only substitute on strings, meaning you cannot provide, for example, a regular expression matching floating point numbers and expect the columns in your frame that have a numeric dtype to be matched. However, if those floating point numbers are strings, then you can do this.

  • This method has a lot of options. You are encouraged to experiment and play with this method to gain intuition about how it works.

  • When a dict is used as the to_replace value, the key(s) in the dict act as the to_replace part and the value(s) in the dict act as the value parameter.
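The first two notes can be illustrated together. The sketch below uses a made-up frame with the same numbers stored once as floats and once as strings, and a replacement string containing re.sub-style backreferences.

```python
import pandas as pd

df = pd.DataFrame({"num": [1.5, 2.5], "txt": ["1.5", "2.5"]})

# The pattern only applies to strings, so the float column "num" is skipped.
# \1 and \2 are re.sub-style backreferences to the two capture groups,
# so the digits on either side of the dot are swapped in "txt".
result = df.replace(r"^(\d+)\.(\d+)$", r"\2.\1", regex=True)
```

Only the string column changes, even though both columns print the same numbers.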

Examples

Scalar `to_replace` and `value`

>>> s = pd.Series([1, 2, 3, 4, 5])
>>> s.replace(1, 5)
0    5
1    2
2    3
3    4
4    5
dtype: int64

>>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
...                    'B': [5, 6, 7, 8, 9],
...                    'C': ['a', 'b', 'c', 'd', 'e']})
>>> df.replace(0, 5)
   A  B  C
0  5  5  a
1  1  6  b
2  2  7  c
3  3  8  d
4  4  9  e

List-like `to_replace`

>>> df.replace([0, 1, 2, 3], 4)
   A  B  C
0  4  5  a
1  4  6  b
2  4  7  c
3  4  8  d
4  4  9  e

>>> df.replace([0, 1, 2, 3], [4, 3, 2, 1])
   A  B  C
0  4  5  a
1  3  6  b
2  2  7  c
3  1  8  d
4  4  9  e

>>> s.replace([1, 2], method='bfill')
0    3
1    3
2    3
3    4
4    5
dtype: int64

dict-like `to_replace`

>>> df.replace({0: 10, 1: 100})
     A  B  C
0   10  5  a
1  100  6  b
2    2  7  c
3    3  8  d
4    4  9  e

>>> df.replace({'A': 0, 'B': 5}, 100)
     A    B  C
0  100  100  a
1    1    6  b
2    2    7  c
3    3    8  d
4    4    9  e

>>> df.replace({'A': {0: 100, 4: 400}})
     A  B  C
0  100  5  a
1    1  6  b
2    2  7  c
3    3  8  d
4  400  9  e

Regular expression `to_replace`

>>> df = pd.DataFrame({'A': ['bat', 'foo', 'bait'],
...                    'B': ['abc', 'bar', 'xyz']})
>>> df.replace(to_replace=r'^ba.$', value='new', regex=True)
      A    B
0   new  abc
1   foo  new
2  bait  xyz

>>> df.replace({'A': r'^ba.$'}, {'A': 'new'}, regex=True)
      A    B
0   new  abc
1   foo  bar
2  bait  xyz

>>> df.replace(regex=r'^ba.$', value='new')
      A    B
0   new  abc
1   foo  new
2  bait  xyz

>>> df.replace(regex={r'^ba.$': 'new', 'foo': 'xyz'})
      A    B
0   new  abc
1   xyz  new
2  bait  xyz

>>> df.replace(regex=[r'^ba.$', 'foo'], value='new')
      A    B
0   new  abc
1   new  new
2  bait  xyz

Compare the behavior of s.replace({'a': None}) and s.replace('a', None) to understand the peculiarities of the to_replace parameter:

>>> s = pd.Series([10, 'a', 'a', 'b', 'a'])

When one uses a dict as the to_replace value, it is like the value(s) in the dict are equal to the value parameter. s.replace({'a': None}) is equivalent to s.replace(to_replace={'a': None}, value=None, method=None):

>>> s.replace({'a': None})
0      10
1    None
2    None
3       b
4    None
dtype: object

When value is not explicitly passed and to_replace is a scalar, list or tuple, replace uses the method parameter (default ‘pad’) to do the replacement. So this is why the ‘a’ values are being replaced by 10 in rows 1 and 2 and ‘b’ in row 4 in this case.

>>> s.replace('a')
0    10
1    10
2    10
3     b
4     b
dtype: object

Deprecated since version 2.1.0: The ‘method’ parameter and padding behavior are deprecated.

On the other hand, if None is explicitly passed for value, it will be respected:

>>> s.replace('a', None)
0      10
1    None
2    None
3       b
4    None
dtype: object

Changed in version 1.4.0: Previously the explicit None was silently ignored.

When regex=True, value is not None and to_replace is a string, the replacement will be applied in all columns of the DataFrame.

>>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
...                    'B': ['a', 'b', 'c', 'd', 'e'],
...                    'C': ['f', 'g', 'h', 'i', 'j']})
>>> df.replace(to_replace='^[a-g]', value='e', regex=True)
   A  B  C
0  0  e  e
1  1  e  e
2  2  e  h
3  3  e  i
4  4  e  j

If value is not None and to_replace is a dictionary, the dictionary keys will be the DataFrame columns that the replacement will be applied to.

>>> df.replace(to_replace={'B': '^[a-c]', 'C': '^[h-j]'}, value='e', regex=True)
   A  B  C
0  0  e  f
1  1  e  g
2  2  e  e
3  3  d  e
4  4  e  e
