pandas.DataFrame.value_counts
- DataFrame.value_counts(subset=None, normalize=False, sort=True, ascending=False, dropna=True)
Return a Series containing the frequency of each distinct row in the DataFrame.
- Parameters:
- subset : label or list of labels, optional
Columns to use when counting unique combinations.
- normalize : bool, default False
Return proportions rather than frequencies.
- sort : bool, default True
Sort by frequencies when True. Sort by DataFrame column values when False.
- ascending : bool, default False
Sort in ascending order.
- dropna : bool, default True
Don’t include counts of rows that contain NA values.
Added in version 1.3.0.
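As a rough illustration of how the parameters above combine, the following sketch counts a single column, once as proportions and once with the rarest value first. The variable names `wing_share` and `rarest_first` are made up for this example and are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame(
    {"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]},
    index=["falcon", "dog", "cat", "ant"],
)

# subset restricts counting to one column; normalize returns proportions
wing_share = df.value_counts(subset="num_wings", normalize=True)

# ascending=True puts the least frequent value first
rarest_first = df.value_counts(subset="num_wings", ascending=True)

print(wing_share)
print(rarest_first)
```

Three of the four rows have `num_wings == 0`, so that value should carry a proportion of 0.75 and appear last in `rarest_first`.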
- Returns:
- Series
See also
Series.value_counts
Equivalent method on Series.
Notes
The returned Series will have a MultiIndex with one level per input column but an Index (non-multi) for a single label. By default, rows that contain any NA values are omitted from the result. By default, the resulting Series will be in descending order so that the first element is the most frequently-occurring row.
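The index behaviour described in these notes can be checked with a short sketch. The names `by_row` and `by_label` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]})

# Counting over both columns: one MultiIndex level per column
by_row = df.value_counts()

# Counting over a single label: a plain (non-multi) Index
by_label = df.value_counts("num_legs")

print(isinstance(by_row.index, pd.MultiIndex))    # expected True
print(isinstance(by_label.index, pd.MultiIndex))  # expected False

# With the default ordering, the first entry is the most frequent row
print(by_row.index[0], by_row.iloc[0])

# A MultiIndex result is looked up with a tuple of column values
print(by_row.loc[(4, 0)])
```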
Examples
>>> df = pd.DataFrame({'num_legs': [2, 4, 4, 6],
...                    'num_wings': [2, 0, 0, 0]},
...                   index=['falcon', 'dog', 'cat', 'ant'])
>>> df
        num_legs  num_wings
falcon         2          2
dog            4          0
cat            4          0
ant            6          0

>>> df.value_counts()
num_legs  num_wings
4         0            2
2         2            1
6         0            1
Name: count, dtype: int64

>>> df.value_counts(sort=False)
num_legs  num_wings
2         2            1
4         0            2
6         0            1
Name: count, dtype: int64

>>> df.value_counts(ascending=True)
num_legs  num_wings
2         2            1
6         0            1
4         0            2
Name: count, dtype: int64

>>> df.value_counts(normalize=True)
num_legs  num_wings
4         0            0.50
2         2            0.25
6         0            0.25
Name: proportion, dtype: float64
With dropna set to False we can also count rows with NA values.
>>> df = pd.DataFrame({'first_name': ['John', 'Anne', 'John', 'Beth'],
...                    'middle_name': ['Smith', pd.NA, pd.NA, 'Louise']})
>>> df
  first_name middle_name
0       John       Smith
1       Anne        <NA>
2       John        <NA>
3       Beth      Louise

>>> df.value_counts()
first_name  middle_name
Beth        Louise         1
John        Smith          1
Name: count, dtype: int64

>>> df.value_counts(dropna=False)
first_name  middle_name
Anne        NaN            1
Beth        Louise         1
John        Smith          1
            NaN            1
Name: count, dtype: int64

>>> df.value_counts("first_name")
first_name
John    2
Anne    1
Beth    1
Name: count, dtype: int64
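One way to build intuition for these results is to compare them with an explicit group-and-count. The sketch below assumes that grouping on every column and taking group sizes yields the same counts; it is an illustration on the `num_legs` / `num_wings` data, not a description of how pandas implements the method.

```python
import pandas as pd

df = pd.DataFrame({"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]})

counts = df.value_counts()

# Group on all columns, count rows per group, then order by frequency
grouped = (
    df.groupby(["num_legs", "num_wings"])
    .size()
    .sort_values(ascending=False)
)

# Compare after aligning both results on their index
same = counts.sort_index().tolist() == grouped.sort_index().tolist()
print(same)
```

The two Series may carry different names, and the order of tied rows is not guaranteed to match, which is why the comparison sorts by index first.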