pandas.DataFrame.sort_values
- DataFrame.sort_values(by, *, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key=None)
Sort by the values along either axis.
- Parameters:
- by : str or list of str
Name or list of names to sort by.
If axis is 0 or ‘index’, then by may contain index levels and/or column labels.
If axis is 1 or ‘columns’, then by may contain column levels and/or index labels.
- axis : {0 or ‘index’, 1 or ‘columns’}, default 0
Axis to be sorted.
- ascending : bool or list of bool, default True
Sort ascending vs. descending. Specify a list for multiple sort orders. If this is a list of bools, it must match the length of by.
- inplace : bool, default False
If True, perform operation in-place.
- kind : {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, default ‘quicksort’
Choice of sorting algorithm. See also numpy.sort() for more information. mergesort and stable are the only stable algorithms. For DataFrames, this option is only applied when sorting on a single column or label.
- na_position : {‘first’, ‘last’}, default ‘last’
Puts NaNs at the beginning if first; last puts NaNs at the end.
- ignore_index : bool, default False
If True, the resulting axis will be labeled 0, 1, …, n - 1.
- key : callable, optional
Apply the key function to the values before sorting. This is similar to the key argument in the builtin sorted() function, with the notable difference that this key function should be vectorized. It should expect a Series and return a Series with the same shape as the input. It will be applied to each column in by independently.
- Returns:
- DataFrame or None
DataFrame with sorted values or None if inplace=True.
See also
DataFrame.sort_index
Sort a DataFrame by the index.
Series.sort_values
Similar method for a Series.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({
...     'col1': ['A', 'A', 'B', np.nan, 'D', 'C'],
...     'col2': [2, 1, 9, 8, 7, 4],
...     'col3': [0, 1, 9, 4, 2, 3],
...     'col4': ['a', 'B', 'c', 'D', 'e', 'F']
... })
>>> df
  col1  col2  col3 col4
0    A     2     0    a
1    A     1     1    B
2    B     9     9    c
3  NaN     8     4    D
4    D     7     2    e
5    C     4     3    F
Sort by col1
>>> df.sort_values(by=['col1'])
  col1  col2  col3 col4
0    A     2     0    a
1    A     1     1    B
2    B     9     9    c
5    C     4     3    F
4    D     7     2    e
3  NaN     8     4    D
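The kind parameter only takes effect when sorting on a single column or label, which is the case here. As a minimal sketch, assuming a trimmed-down copy of the example frame (the variable name stable_sorted is illustrative and not part of the pandas API), requesting a stable algorithm should keep the two tied 'A' rows in their original relative order:

```python
import numpy as np
import pandas as pd

# Trimmed-down copy of the example frame; rows 0 and 1 tie on col1.
df = pd.DataFrame({
    'col1': ['A', 'A', 'B', np.nan, 'D', 'C'],
    'col2': [2, 1, 9, 8, 7, 4],
})

# 'stable' (like 'mergesort') preserves the input order of equal keys.
stable_sorted = df.sort_values(by='col1', kind='stable')

# Row 0 stays ahead of row 1; the NaN row is placed last by default.
print(stable_sorted.index.tolist())
```

With the default kind='quicksort', the relative order of tied rows is not guaranteed, so code that relies on tie order may want to pass a stable kind explicitly.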
Sort by multiple columns
>>> df.sort_values(by=['col1', 'col2'])
  col1  col2  col3 col4
1    A     1     1    B
0    A     2     0    a
2    B     9     9    c
5    C     4     3    F
4    D     7     2    e
3  NaN     8     4    D
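The ascending parameter also accepts a list with one entry per name in by. The following is a small sketch of that usage on a reduced copy of the example frame (the variable name mixed is illustrative): col1 is sorted ascending, and ties within col1 are broken by col2 in descending order.

```python
import numpy as np
import pandas as pd

# Reduced copy of the example frame used above.
df = pd.DataFrame({
    'col1': ['A', 'A', 'B', np.nan, 'D', 'C'],
    'col2': [2, 1, 9, 8, 7, 4],
})

# One bool per column in `by`: col1 ascending, col2 descending.
mixed = df.sort_values(by=['col1', 'col2'], ascending=[True, False])

# Within the two 'A' rows, the larger col2 value now comes first.
print(mixed)
```

The list passed to ascending must have the same length as by; a mismatch is expected to raise an error rather than be broadcast.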
Sort Descending
>>> df.sort_values(by='col1', ascending=False)
  col1  col2  col3 col4
4    D     7     2    e
5    C     4     3    F
2    B     9     9    c
0    A     2     0    a
1    A     1     1    B
3  NaN     8     4    D
Putting NAs first
>>> df.sort_values(by='col1', ascending=False, na_position='first')
  col1  col2  col3 col4
3  NaN     8     4    D
4    D     7     2    e
5    C     4     3    F
2    B     9     9    c
0    A     2     0    a
1    A     1     1    B
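Sorting along the other axis is not covered by the examples so far. As a hedged sketch, using a small all-numeric frame invented for illustration (the labels 'x', 'y' and the variable name by_row_x are assumptions), passing axis=1 lets by name an index label, and the columns are reordered according to the values in that row:

```python
import pandas as pd

# Hypothetical frame with homogeneous numeric data and labelled rows.
df = pd.DataFrame(
    {'b': [3, 1], 'a': [1, 2], 'c': [2, 3]},
    index=['x', 'y'],
)

# With axis=1, `by` refers to the row labelled 'x';
# columns are reordered so that row 'x' reads in ascending order.
by_row_x = df.sort_values(by='x', axis=1)

print(by_row_x)
```

This works most predictably when the row being sorted holds mutually comparable values, as in the numeric frame assumed here.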
Sorting with a key function
>>> df.sort_values(by='col4', key=lambda col: col.str.lower())
  col1  col2  col3 col4
0    A     2     0    a
1    A     1     1    B
2    B     9     9    c
3  NaN     8     4    D
4    D     7     2    e
5    C     4     3    F
Natural sort with the key argument, using the natsort <https://github.com/SethMMorton/natsort> package.
>>> df = pd.DataFrame({
...     "time": ['0hr', '128hr', '72hr', '48hr', '96hr'],
...     "value": [10, 20, 30, 40, 50]
... })
>>> df
    time  value
0    0hr     10
1  128hr     20
2   72hr     30
3   48hr     40
4   96hr     50
>>> from natsort import index_natsorted
>>> df.sort_values(
...     by="time",
...     key=lambda x: np.argsort(index_natsorted(df["time"]))
... )
    time  value
0    0hr     10
3   48hr     40
2   72hr     30
4   96hr     50
1  128hr     20
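Because the key function is applied to each column in by independently, a single vectorized callable can normalize several columns at once. The sketch below assumes a small two-column frame invented for illustration (the column names first and second and the variable names are hypothetical) and compares a case-insensitive sort with the default ordering:

```python
import pandas as pd

# Hypothetical frame with mixed-case strings in both sort columns.
df = pd.DataFrame({
    'first': ['b', 'A', 'a', 'B'],
    'second': ['x', 'Z', 'y', 'W'],
})

# The lambda receives each column of `by` as a Series, one at a time,
# and must return a Series of the same shape.
case_insensitive = df.sort_values(
    by=['first', 'second'],
    key=lambda col: col.str.lower(),
)

# Without a key, uppercase letters sort before lowercase ones.
case_sensitive = df.sort_values(by=['first', 'second'])

print(case_insensitive.index.tolist())
print(case_sensitive.index.tolist())
```

The key only affects the ordering; the values stored in the returned DataFrame keep their original case.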
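The ignore_index and inplace parameters are described under Parameters but not exercised by the examples above. As a minimal sketch on a tiny frame made up for illustration (the variable names relabeled and result are assumptions), ignore_index=True relabels the sorted rows 0, 1, …, n - 1, while inplace=True sorts the original object and returns None:

```python
import pandas as pd

# Tiny illustrative frame.
df = pd.DataFrame({'col2': [2, 1, 9]})

# ignore_index=True discards the original labels after sorting.
relabeled = df.sort_values(by='col2', ignore_index=True)
print(relabeled.index.tolist())

# inplace=True mutates df and returns None, so do not reassign the result.
result = df.sort_values(by='col2', inplace=True)
print(result)
print(df.index.tolist())
```

Assigning the return value of an inplace=True call back to the same name would replace the DataFrame with None, so the two styles should not be mixed.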