pandas.unique
- pandas.unique(values)
Return unique values based on a hash table.
Uniques are returned in order of appearance. This does NOT sort.
Significantly faster than numpy.unique for long enough sequences. Includes NA values.
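The ordering and NA behavior described above can be illustrated with a minimal sketch. The input array and the variable names `values`, `pd_result`, and `np_result` are made up for illustration; the comparison with numpy.unique only shows the difference in ordering and makes no claim about speed.

```python
import numpy as np
import pandas as pd

# Hypothetical input: repeated values plus two NaN entries.
values = np.array([3.0, 1.0, np.nan, 3.0, 2.0, np.nan, 1.0])

# pandas.unique keeps values in order of first appearance
# and keeps NaN in the result (once).
pd_result = pd.unique(values)
print(pd_result)

# numpy.unique returns the distinct values in sorted order instead.
np_result = np.unique(values)
print(np_result)
```

If sorted output is needed, the result of pandas.unique can be sorted afterwards as a separate step.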
- Parameters:
- values : 1d array-like
- Returns:
- numpy.ndarray or ExtensionArray
The return can be:
Index : when the input is an Index
Categorical : when the input is a Categorical dtype
ndarray : when the input is a Series/ndarray
See also
Index.unique
Return unique values from an Index.
Series.unique
Return unique values of Series object.
Examples
>>> pd.unique(pd.Series([2, 1, 3, 3]))
array([2, 1, 3])
>>> pd.unique(pd.Series([2] + [1] * 5))
array([2, 1])
>>> pd.unique(pd.Series([pd.Timestamp("20160101"), pd.Timestamp("20160101")]))
array(['2016-01-01T00:00:00.000000000'], dtype='datetime64[ns]')
>>> pd.unique(
...     pd.Series(
...         [
...             pd.Timestamp("20160101", tz="US/Eastern"),
...             pd.Timestamp("20160101", tz="US/Eastern"),
...         ]
...     )
... )
<DatetimeArray>
['2016-01-01 00:00:00-05:00']
Length: 1, dtype: datetime64[ns, US/Eastern]
>>> pd.unique(
...     pd.Index(
...         [
...             pd.Timestamp("20160101", tz="US/Eastern"),
...             pd.Timestamp("20160101", tz="US/Eastern"),
...         ]
...     )
... )
DatetimeIndex(['2016-01-01 00:00:00-05:00'], dtype='datetime64[ns, US/Eastern]', freq=None)
>>> pd.unique(np.array(list("baabc"), dtype="O"))
array(['b', 'a', 'c'], dtype=object)
An unordered Categorical will return categories in the order of appearance.
>>> pd.unique(pd.Series(pd.Categorical(list("baabc"))))
['b', 'a', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> pd.unique(pd.Series(pd.Categorical(list("baabc"), categories=list("abc"))))
['b', 'a', 'c']
Categories (3, object): ['a', 'b', 'c']
An ordered Categorical preserves the category ordering.
>>> pd.unique(
...     pd.Series(
...         pd.Categorical(list("baabc"), categories=list("abc"), ordered=True)
...     )
... )
['b', 'a', 'c']
Categories (3, object): ['a' < 'b' < 'c']
An array of tuples
>>> pd.unique(pd.Series([("a", "b"), ("b", "a"), ("a", "c"), ("b", "a")]).values)
array([('a', 'b'), ('b', 'a'), ('a', 'c')], dtype=object)