pandas.Series.str.split

Series.str.split(pat=None, *, n=-1, expand=False, regex=None)

Split strings around given separator/delimiter.

Splits each string in the Series/Index from the beginning, at the specified delimiter string.

Parameters:

pat : str or compiled regex, optional

String or regular expression to split on. If not specified, split on whitespace.

n : int, default -1 (all)

Limit the number of splits in the output. None, 0 and -1 are all interpreted as "return all splits".
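
As a small illustration of the limit, the sketch below compares the default with `n=0` and `n=-1`, and then applies a positive limit. The Series `s` is made-up sample data, and `dtype=object` is chosen only to keep the behaviour independent of the default string dtype of the installed pandas version.

```python
import pandas as pd

# Hypothetical sample data, not taken from the pandas documentation.
s = pd.Series(["a b c d"], dtype=object)

# Default, n=0 and n=-1 all mean "no limit" and split on every whitespace run.
all_default = s.str.split().tolist()
all_zero = s.str.split(n=0).tolist()
all_minus_one = s.str.split(n=-1).tolist()

# A positive n caps the number of splits, so each row has at most n + 1 pieces.
one_split = s.str.split(n=1).tolist()
```

Here the three unlimited calls produce the same four pieces, while `n=1` leaves the remainder `"b c d"` unsplit.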

expand : bool, default False

Expand the split strings into separate columns.

If True, return DataFrame/MultiIndex expanding dimensionality.
If False, return Series/Index, containing lists of strings.

regex : bool, default None

Determines if the passed-in pattern is a regular expression:

If True, assumes the passed-in pattern is a regular expression.
If False, treats the pattern as a literal string.
If None and pat length is 1, treats pat as a literal string.
If None and pat length is not 1, treats pat as a regular expression.
Cannot be set to False if pat is a compiled regex.

Added in version 1.4.0.
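
The length-1 rule matters most for characters that are regex metacharacters. The following sketch, using a made-up one-element Series, contrasts the default handling of `"."` with `regex=True`. `dtype=object` is an assumption made here so that matching goes through Python's `re` module regardless of the installed pandas version.

```python
import pandas as pd

# Hypothetical sample data; object dtype is an assumption for this sketch.
s = pd.Series(["a.b"], dtype=object)

# regex=None (default) with a one-character pat: "." is a literal dot.
literal = s.str.split(".").tolist()

# regex=True: "." is the regex wildcard, so every character acts as a separator
# and only empty strings remain between the matches.
wildcard = s.str.split(".", regex=True).tolist()
```

With the default, the string splits into `"a"` and `"b"`; with `regex=True`, all three characters are consumed as separators.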

Returns:

Series, Index, DataFrame or MultiIndex: Type matches caller unless expand=True (see Notes).

Raises:

ValueError

If regex is False and pat is a compiled regex.
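
A minimal sketch of this error path, assuming a made-up Series and a compiled pattern named `compiled` (both illustrative):

```python
import re

import pandas as pd

# Hypothetical sample data; object dtype is an assumption for this sketch.
s = pd.Series(["foojpgbar.jpg"], dtype=object)
compiled = re.compile(r"\.jpg")

# regex=False contradicts a compiled pattern, so pandas raises ValueError.
try:
    s.str.split(compiled, regex=False)
    raised = False
except ValueError:
    raised = True

# Leaving regex at its default accepts the compiled pattern.
parts = s.str.split(compiled).tolist()
```

Catching `ValueError` explicitly can be useful when `pat` comes from user input and may be either a plain string or a compiled pattern.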

See also

Series.str.split: Split strings around given separator/delimiter.
Series.str.rsplit: Splits string around given separator/delimiter, starting from the right.
Series.str.join: Join lists contained as elements in the Series/Index with passed delimiter.
str.split: Standard library version for split.
str.rsplit: Standard library version for rsplit.

Notes

The handling of the n keyword depends on the number of found splits:

If found splits > n, make first n splits only.
If found splits <= n, make all splits.
If for a certain row the number of found splits < n, append None for padding up to n if expand=True.
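
The three cases above can be sketched with two rows of different lengths. The data is made up for illustration, and the padding check uses `pd.isna` because the exact missing-value marker may depend on the dtype in use.

```python
import pandas as pd

# Hypothetical sample data: row 0 has three possible splits, row 1 has one.
s = pd.Series(["a b c d", "a b"], dtype=object)

# Row 0: found splits (3) > n (2), so only the first two splits are made.
# Row 1: found splits (1) <= n (2), so all splits are made.
lists = s.str.split(n=2).tolist()

# With expand=True the shorter row is padded so both rows span n + 1 columns.
frame = s.str.split(n=2, expand=True)
padded_cell_is_missing = bool(pd.isna(frame.iloc[1, 2]))
```

The resulting frame has three columns, and the last cell of the shorter row holds a missing value.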

If using expand=True, Series and Index callers return DataFrame and MultiIndex objects, respectively.
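
The examples in this page use a Series caller. A sketch of the Index case, with a made-up Index named `idx`, might look like this:

```python
import pandas as pd

# Hypothetical sample data; object dtype is an assumption for this sketch.
idx = pd.Index(["a_b", "c_d"], dtype=object)

# expand=False (default): an Index whose elements are lists of strings.
as_lists = idx.str.split("_")

# expand=True: a MultiIndex with one level per split piece.
as_multi = idx.str.split("_", expand=True)
```

Here `as_multi` has two levels, one for the part before the underscore and one for the part after it.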

Use of regex=False with a pat as a compiled regex will raise an error.

Examples

>>> s = pd.Series(
...     [
...         "this is a regular sentence",
...         "https://docs.python.org/3/tutorial/index.html",
...         np.nan
...     ]
... )
>>> s
0                       this is a regular sentence
1    https://docs.python.org/3/tutorial/index.html
2                                              NaN
dtype: object

In the default setting, the string is split by whitespace.

>>> s.str.split()
0                   [this, is, a, regular, sentence]
1    [https://docs.python.org/3/tutorial/index.html]
2                                                NaN
dtype: object

Without the n parameter, the outputs of rsplit and split are identical.

>>> s.str.rsplit()
0                   [this, is, a, regular, sentence]
1    [https://docs.python.org/3/tutorial/index.html]
2                                                NaN
dtype: object

The n parameter can be used to limit the number of splits on the delimiter. The outputs of split and rsplit are different.

>>> s.str.split(n=2)
0                     [this, is, a regular sentence]
1    [https://docs.python.org/3/tutorial/index.html]
2                                                NaN
dtype: object

>>> s.str.rsplit(n=2)
0                     [this is a, regular, sentence]
1    [https://docs.python.org/3/tutorial/index.html]
2                                                NaN
dtype: object

The pat parameter can be used to split by other characters.

>>> s.str.split(pat="/")
0                         [this is a regular sentence]
1    [https:, , docs.python.org, 3, tutorial, index...
2                                                  NaN
dtype: object

When using expand=True, the split elements will expand out into separate columns. If NaN is present, it is propagated throughout the columns during the split.

>>> s.str.split(expand=True)
                                               0     1     2        3         4
0                                           this    is     a  regular  sentence
1  https://docs.python.org/3/tutorial/index.html  None  None     None      None
2                                            NaN   NaN   NaN      NaN       NaN

For slightly more complex use cases like splitting the HTML document name from a URL, a combination of parameter settings can be used.

>>> s.str.rsplit("/", n=1, expand=True)
                                    0           1
0          this is a regular sentence        None
1  https://docs.python.org/3/tutorial  index.html
2                                 NaN         NaN

Remember to escape special characters when explicitly using regular expressions.

>>> s = pd.Series(["foo and bar plus baz"])
>>> s.str.split(r"and|plus", expand=True)
    0   1   2
0 foo bar baz
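
To make the escaping point concrete, here is a sketch that splits on a two-character separator made of regex metacharacters. The sample string and variable names are illustrative, and `dtype=object` is an assumption so that matching follows Python's `re` module.

```python
import re

import pandas as pd

# Hypothetical sample data; object dtype is an assumption for this sketch.
s = pd.Series(["a..b..c"], dtype=object)

# ".." has length 2, so with regex=None it is treated as a regex:
# "any two characters", which consumes most of the string.
unescaped = s.str.split("..").tolist()

# re.escape produces a pattern that matches two literal dots.
escaped = s.str.split(re.escape("..")).tolist()

# Equivalent here: keep the plain string and switch regex matching off.
literal = s.str.split("..", regex=False).tolist()
```

The unescaped call leaves only `"c"` plus empty strings, while both the escaped pattern and `regex=False` give the intended pieces `"a"`, `"b"` and `"c"`.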

Regular expressions can be used to handle URLs or file names. When pat is a string and regex=None (the default), the given pat is compiled as a regex only if len(pat) != 1.

>>> s = pd.Series(['foojpgbar.jpg'])
>>> s.str.split(r".", expand=True)
           0    1
0  foojpgbar  jpg

>>> s.str.split(r"\.jpg", expand=True)
           0 1
0  foojpgbar

When regex=True, pat is interpreted as a regex.

>>> s.str.split(r"\.jpg", regex=True, expand=True)
           0 1
0  foojpgbar

A compiled regex can be passed as pat.

>>> import re
>>> s.str.split(re.compile(r"\.jpg"), expand=True)
           0 1
0  foojpgbar

When regex=False, pat is interpreted as the string itself.

>>> s.str.split(r"\.jpg", regex=False, expand=True)
               0
0  foojpgbar.jpg
