pandas.DataFrame.nlargest

DataFrame.nlargest(n, columns, keep='first')

Return the first n rows ordered by columns in descending order.

Return the first n rows with the largest values in columns, in descending order. The columns that are not specified are returned as well, but not used for ordering.

This method is equivalent to df.sort_values(columns, ascending=False).head(n), but more performant.
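As a quick check of the equivalence stated above, the sketch below compares the two calls on a small hypothetical frame. The names df, score and label are illustrative only, and the data deliberately contains no repeated values in score, so the comparison does not depend on how each call orders tied rows.

```python
import pandas as pd

# Hypothetical data; 'score' has no repeated values, so tie handling
# does not affect the comparison.
df = pd.DataFrame(
    {"score": [10, 40, 30, 20, 50], "label": ["a", "b", "c", "d", "e"]}
)

# Select the three rows with the largest scores in two different ways.
top3 = df.nlargest(3, "score")
top3_sorted = df.sort_values("score", ascending=False).head(3)

# Compare the selected row labels and the ordered values.
same_rows = list(top3.index) == list(top3_sorted.index)
same_values = top3["score"].tolist() == top3_sorted["score"].tolist()
print(same_rows and same_values)
```

For a frame with only a handful of rows the difference in speed is unlikely to matter; the shorter nlargest call mainly states the intent more directly.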

Parameters:

n : int

Number of rows to return.

columns : label or list of labels

Column label(s) to order by.

keep : {'first', 'last', 'all'}, default 'first'

Where there are duplicate values:

first : prioritize the first occurrence(s)
last : prioritize the last occurrence(s)
all : keep all the ties of the smallest item even if it means selecting more than n items.

Returns:

DataFrame: The first n rows ordered by the given columns in descending order.

See also

DataFrame.nsmallest: Return the first n rows ordered by columns in ascending order.
DataFrame.sort_values: Sort DataFrame by the values.
DataFrame.head: Return the first n rows without re-ordering.

Notes

This function cannot be used with all column types. For example, when specifying columns with object or category dtypes, TypeError is raised.
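To illustrate this restriction, the sketch below builds a small hypothetical frame with one column stored explicitly as object dtype and catches the TypeError raised when that column is used for ordering. The frame and its column names (value, code) are assumptions for illustration, and the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

# Hypothetical data: 'value' is numeric, 'code' is explicitly object dtype.
df = pd.DataFrame(
    {
        "value": [3, 1, 2],
        "code": pd.Series(["c", "a", "b"], dtype=object),
    }
)

try:
    df.nlargest(2, "code")
    raised = False
except TypeError as exc:
    # Ordering by an object column is rejected.
    raised = True
    print(f"TypeError: {exc}")

# Ordering by the numeric column works as usual.
top_by_value = df.nlargest(2, "value")
print(top_by_value)
```

When a ranking over such a column is needed, sort_values followed by head is the usual fallback, since sort_values is not limited to numeric dtypes.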

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({'population': [59000000, 65000000, 434000,
...                                   434000, 434000, 337000, 11300,
...                                   11300, 11300],
...                    'GDP': [1937894, 2583560, 12011, 4520, 12128,
...                            17036, 182, 38, 311],
...                    'alpha-2': ["IT", "FR", "MT", "MV", "BN",
...                                "IS", "NR", "TV", "AI"]},
...                   index=["Italy", "France", "Malta",
...                          "Maldives", "Brunei", "Iceland",
...                          "Nauru", "Tuvalu", "Anguilla"])
>>> df
          population      GDP alpha-2
Italy       59000000  1937894      IT
France      65000000  2583560      FR
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
Iceland       337000    17036      IS
Nauru          11300      182      NR
Tuvalu         11300       38      TV
Anguilla       11300      311      AI

In the following example, we will use nlargest to select the three rows having the largest values in column "population".

>>> df.nlargest(3, 'population')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Malta       434000    12011      MT

When using keep='last', ties are resolved in reverse order:

>>> df.nlargest(3, 'population', keep='last')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN

When using keep='all', the number of elements kept can exceed n: if there are duplicate values for the smallest element, all the ties are kept:

>>> df.nlargest(3, 'population', keep='all')
          population      GDP alpha-2
France      65000000  2583560      FR
Italy       59000000  1937894      IT
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN

However, nlargest does not keep n distinct largest elements:

>>> df.nlargest(5, 'population', keep='all')
          population      GDP alpha-2
France      65000000  2583560      FR
Italy       59000000  1937894      IT
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
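If the goal is instead to keep every row whose value is among the n distinct largest values, one possible approach is to compute the distinct top values with Series.nlargest and then filter with isin. This is a sketch rather than a feature of DataFrame.nlargest itself. The frame below reuses part of the example data, and the names top_values and result are illustrative.

```python
import pandas as pd

# A subset of the example data above.
df = pd.DataFrame(
    {"population": [59000000, 65000000, 434000, 434000, 434000, 337000]},
    index=["Italy", "France", "Malta", "Maldives", "Brunei", "Iceland"],
)

# The 3 distinct largest population values.
top_values = df["population"].drop_duplicates().nlargest(3)

# Keep every row whose population is one of those values, then order
# the rows from largest to smallest.
result = df[df["population"].isin(top_values)].sort_values(
    "population", ascending=False, kind="stable"
)
print(result)
```

Here the result has five rows, because three countries share the third distinct value.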

To order by the largest values in column "population" and then "GDP", we can specify multiple columns, as in the next example.

>>> df.nlargest(3, ['population', 'GDP'])
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN
