pandas.DataFrame.nlargest
DataFrame.nlargest(n, columns, keep='first')
Return the first n rows ordered by columns in descending order.
Return the first n rows with the largest values in columns, in descending order. The columns that are not specified are returned as well, but not used for ordering.
This method is equivalent to df.sort_values(columns, ascending=False).head(n), but more performant.
- Parameters:
- n : int
Number of rows to return.
- columns : label or list of labels
Column label(s) to order by.
- keep : {‘first’, ‘last’, ‘all’}, default ‘first’
Where there are duplicate values:
  - first: prioritize the first occurrence(s)
  - last: prioritize the last occurrence(s)
  - all: keep all the ties of the smallest item even if it means selecting more than n items.
- Returns:
- DataFrame
The first n rows ordered by the given columns in descending order.
See also
DataFrame.nsmallest: Return the first n rows ordered by columns in ascending order.
DataFrame.sort_values: Sort DataFrame by the values.
DataFrame.head: Return the first n rows without re-ordering.
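The stated equivalence with sort_values followed by head can be checked with a minimal sketch. The frame and its column names ("score", "label") are made up for illustration, and the values are deliberately distinct so that tie handling does not come into play.

```python
import pandas as pd

# Hypothetical data with distinct values, so no ties are involved.
scores = pd.DataFrame({
    "score": [10, 40, 20, 50, 30],
    "label": ["a", "b", "c", "d", "e"],
})

# Top three rows by "score", computed two ways.
via_nlargest = scores.nlargest(3, "score")
via_sort = scores.sort_values("score", ascending=False).head(3)

# Both approaches should select the same rows in the same order.
print(via_nlargest.equals(via_sort))
```

With tied values the two spellings may pick different rows among the ties, which is what the keep parameter controls for nlargest.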
Notes
This function cannot be used with all column types. For example, when specifying columns with object or category dtypes, TypeError is raised.
Examples
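As a small illustration of the dtype restriction described in the Notes, the sketch below orders by an object-dtype column and catches the resulting error. The frame and column names ("name", "value") are hypothetical, and the exact error message may differ between pandas versions.

```python
import pandas as pd

# Hypothetical frame: "name" is forced to object dtype, "value" is numeric.
items = pd.DataFrame({
    "name": pd.Series(["x", "y", "z"], dtype=object),
    "value": [3, 1, 2],
})

try:
    items.nlargest(2, "name")  # object dtype is not supported for ordering
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")

# Ordering by the numeric column works as usual.
top = items.nlargest(2, "value")
print(top)
```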
>>> df = pd.DataFrame({'population': [59000000, 65000000, 434000,
...                                   434000, 434000, 337000, 11300,
...                                   11300, 11300],
...                    'GDP': [1937894, 2583560, 12011, 4520, 12128,
...                            17036, 182, 38, 311],
...                    'alpha-2': ["IT", "FR", "MT", "MV", "BN",
...                                "IS", "NR", "TV", "AI"]},
...                   index=["Italy", "France", "Malta",
...                          "Maldives", "Brunei", "Iceland",
...                          "Nauru", "Tuvalu", "Anguilla"])
>>> df
          population      GDP alpha-2
Italy       59000000  1937894      IT
France      65000000  2583560      FR
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
Iceland       337000    17036      IS
Nauru          11300      182      NR
Tuvalu         11300       38      TV
Anguilla       11300      311      AI
In the following example, we will use nlargest to select the three rows having the largest values in column “population”.
>>> df.nlargest(3, 'population')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Malta       434000    12011      MT
When using keep='last', ties are resolved in reverse order:
>>> df.nlargest(3, 'population', keep='last')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN
When using keep='all', the number of elements kept can exceed n: if there are duplicate values for the smallest selected element, all the ties are kept:
>>> df.nlargest(3, 'population', keep='all')
          population      GDP alpha-2
France      65000000  2583560      FR
Italy       59000000  1937894      IT
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
However, nlargest does not keep n distinct largest elements:
>>> df.nlargest(5, 'population', keep='all')
          population      GDP alpha-2
France      65000000  2583560      FR
Italy       59000000  1937894      IT
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
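To compare the three keep options side by side, here is a minimal sketch on a tiny frame with a three-way tie. The column name "v" and the index labels are invented for illustration.

```python
import pandas as pd

# Hypothetical frame: rows "b", "c" and "d" tie on the value 3.
ties = pd.DataFrame({"v": [5, 3, 3, 3, 1]},
                    index=["a", "b", "c", "d", "e"])

first = ties.nlargest(2, "v", keep="first")    # first tied row wins
last = ties.nlargest(2, "v", keep="last")      # last tied row wins
everything = ties.nlargest(2, "v", keep="all") # every tied row is kept

print(list(first.index))   # → ['a', 'b']
print(list(last.index))    # → ['a', 'd']
print(len(everything))     # → 4
```

Only keep='all' can return more than n rows, and it does so only when the ties fall on the last selected position.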
To order by the largest values in column “population” and then “GDP”, we can specify multiple columns as in the next example.
>>> df.nlargest(3, ['population', 'GDP'])
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN
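The tie-breaking role of the second column can be isolated in a smaller sketch. The frame below is hypothetical (columns "pop" and "gdp", single-letter index labels); rows "x", "y" and "z" tie on "pop", so "gdp" decides which of them is selected.

```python
import pandas as pd

# Hypothetical frame: three rows share pop == 50 and differ only in gdp.
small = pd.DataFrame({"pop": [100, 50, 50, 50],
                      "gdp": [1, 7, 9, 8]},
                     index=["w", "x", "y", "z"])

# Order by "pop" first, then by "gdp" among rows with equal "pop".
top_two = small.nlargest(2, ["pop", "gdp"])
print(list(top_two.index))  # → ['w', 'y']

# Ordering by "pop" alone falls back to keep='first' among the ties.
top_two_single = small.nlargest(2, "pop")
print(list(top_two_single.index))  # → ['w', 'x']
```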