pandas.qcut
- pandas.qcut(x, q, labels=None, retbins=False, precision=3, duplicates='raise')
Quantile-based discretization function.
Discretize variable into equal-sized buckets based on rank or based on sample quantiles. For example, 1000 values for 10 quantiles would produce a Categorical object indicating quantile membership for each data point.
- Parameters:
- x : 1d ndarray or Series
- q : int or list-like of float
Number of quantiles. 10 for deciles, 4 for quartiles, etc. Alternately, an array of quantiles, e.g. [0, .25, .5, .75, 1.] for quartiles.
- labels : array or False, default None
Used as labels for the resulting bins. Must be of the same length as the resulting bins. If False, return only integer indicators of the bins. If True, raises an error.
- retbins : bool, optional
Whether to return the (bins, labels) or not. Can be useful if bins is given as a scalar.
- precision : int, optional
The precision at which to store and display the bin labels.
- duplicates : {default ‘raise’, ‘drop’}, optional
If bin edges are not unique, raise ValueError or drop non-uniques.
- Returns:
- out : Categorical or Series or array of integers if labels is False
The return type (Categorical or Series) depends on the input: a Series of type category if input is a Series, else Categorical. Bins are represented as categories when categorical data is returned.
- bins : ndarray of floats
Returned only if retbins is True.
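The parameters and return values above can be sketched as follows. This is a minimal illustration rather than part of the official docstring; the sample data and the variable names (`values`, `deciles`, `quartiles`, `edges`) are assumptions made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 1000 distinct values in shuffled order.
rng = np.random.default_rng(0)
values = pd.Series(rng.permutation(1000))

# q as an integer: 10 quantile-based buckets (deciles).
deciles = pd.qcut(values, 10)
counts = deciles.value_counts()

# q as an explicit list of quantiles; retbins=True also returns the bin edges.
quartiles, edges = pd.qcut(values, [0, 0.25, 0.5, 0.75, 1.0], retbins=True)

print(type(deciles).__name__, deciles.dtype)  # Series input gives a category-dtype Series
print(counts.tolist())                        # number of values per decile
print(edges)                                  # the computed quartile edges
```

Because the input here is a Series, the result is a Series of type category; passing a plain ndarray instead would give a Categorical. With distinct values, each decile should hold the same number of points.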
Notes
Out of bounds values will be NA in the resulting Categorical object.
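One way to see out of bounds values with qcut is to pass a list of quantiles that does not span 0 to 1. The sketch below assumes that setup; the data and the names `inner` and `missing` are illustrative only.

```python
import numpy as np
import pandas as pd

x = np.arange(5)  # 0, 1, 2, 3, 4

# Quantiles covering only the middle half of the data (chosen for illustration).
# The 25% and 75% quantiles of x are 1.0 and 3.0, so there is a single bin.
inner = pd.qcut(x, [0.25, 0.75])

# Values below the lowest edge or above the highest edge fall in no bin.
missing = pd.isna(inner)

print(inner)
print(missing)
```

Here 0 and 4 lie outside the computed edges and come back as NA, while 1, 2 and 3 fall in the one remaining bin.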
Examples
>>> pd.qcut(range(5), 4)
[(-0.001, 1.0], (-0.001, 1.0], (1.0, 2.0], (2.0, 3.0], (3.0, 4.0]]
Categories (4, interval[float64, right]): [(-0.001, 1.0] < (1.0, 2.0] ...
>>> pd.qcut(range(5), 3, labels=["good", "medium", "bad"])
[good, good, medium, bad, bad]
Categories (3, object): [good < medium < bad]
>>> pd.qcut(range(5), 4, labels=False)
array([0, 0, 1, 2, 3])
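The duplicates parameter is not exercised by the examples above. The following sketch, using made-up data with many tied values, shows how non-unique bin edges raise by default and how `duplicates='drop'` handles them; the variable names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data with many tied values at the low end.
x = pd.Series([0, 0, 0, 0, 1, 2, 3, 4])

# The 0% and 25% quantiles are both 0, so the bin edges are not unique.
try:
    pd.qcut(x, 4)  # duplicates='raise' is the default
    raised = False
except ValueError:
    raised = True

# duplicates='drop' removes the repeated edge, leaving fewer bins than requested.
binned, edges = pd.qcut(x, 4, duplicates="drop", retbins=True)

print(raised)
print(edges)
print(binned.cat.categories)
```

With 'drop', the result has three bins instead of the four requested, so any labels passed alongside it would need to match the reduced number of bins.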