Python Pandas Tutorial

Python Pandas - Comparing Categorical Data



Comparing categorical data is an essential task for getting insights and understanding the relationships between different categories of the data. In Python, Pandas provides various ways to perform comparisons using comparison operators (==, !=, >, >=, <, and <=) on categorical data. These comparisons can be made in three main scenarios −

  • Equality comparison (== and !=).

  • All comparisons (==, !=, >, >=, <, and <=).

  • Comparing categorical data to a scalar value.

Note that any non-equality comparison (>, >=, <, or <=) between categorical data with different categories or ordering, or between a categorical Series and a non-categorical list-like object, raises a TypeError. This is because a custom category ordering could be interpreted in two ways: one that takes the ordering into account and one that does not.
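As a minimal sketch of this rule, the snippet below compares an ordered categorical Series with a plain NumPy array. The names s, base, raised, and values_gt are illustrative only. The ordering comparison is expected to raise a TypeError, while converting the categorical data back to its underlying values with np.asarray makes the comparison explicit. Note that the converted comparison uses the raw values, not the category order.

```python
import numpy as np
import pandas as pd

# Ordered categorical Series; the category order is 3 < 2 < 1
s = pd.Series([1, 2, 3]).astype(pd.CategoricalDtype([3, 2, 1], ordered=True))

# A plain (non-categorical) array of the same length
base = np.array([2, 2, 2])

# An ordering comparison against a non-categorical list-like raises TypeError
try:
    s > base
    raised = False
except TypeError as e:
    raised = True
    print("TypeError:", e)

# Being explicit: compare the underlying values instead of the categories
values_gt = np.asarray(s) > base
print(values_gt)
```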

In this tutorial, we will learn how to compare categorical data in the Python Pandas library using the comparison operators ==, !=, >, >=, <, and <=.

Equality Comparisons of Categorical Data

In Pandas, comparing categorical data for equality is possible with a variety of objects such as lists, arrays, or Series objects of the same length as the categorical data.

Example

The following example demonstrates how to perform equality and inequality comparisons between a categorical Series and list-like objects.

import pandas as pd
from pandas.api.types import CategoricalDtype
import numpy as np

# Creating a categorical Series
s = pd.Series([1, 2, 1, 1, 2, 3, 1, 3]).astype(CategoricalDtype([3, 2, 1], ordered=True))

# Creating another categorical Series for comparison
s2 = pd.Series([2, 2, 2, 1, 1, 3, 3, 3]).astype(CategoricalDtype([3, 2, 1], ordered=True))

# Equality comparison
print("Equality comparison (s == s2):")
print(s == s2)

print("\nInequality comparison (s != s2):")
print(s != s2)

# Equality comparison with a NumPy array
print("\nEquality comparison with NumPy array:")
print(s == np.array([1, 2, 3, 1, 2, 3, 2, 1]))

Following is the output of the above code −

Equality comparison (s == s2):
0    False
1     True
2    False
3     True
4    False
5     True
6    False
7     True
dtype: bool

Inequality comparison (s != s2):
0     True
1    False
2     True
3    False
4     True
5    False
6     True
7    False
dtype: bool

Equality comparison with NumPy array:
0     True
1     True
2    False
3     True
4     True
5     True
6    False
7    False
dtype: bool

All Comparisons of Categorical Data

Pandas allows you to perform ordering comparisons (>, >=, <, and <=) between ordered categorical data that share the same categories and ordering.

Example

This example demonstrates how to perform non-equality comparisons (>, >=, <, and <=) on ordered categorical data.

import pandas as pd
from pandas.api.types import CategoricalDtype

# Creating a categorical Series
s = pd.Series([1, 2, 1, 1, 2, 3, 1, 3]).astype(CategoricalDtype([3, 2, 1], ordered=True))

# Creating another categorical Series for comparison
s2 = pd.Series([2, 2, 2, 1, 1, 3, 3, 3]).astype(CategoricalDtype([3, 2, 1], ordered=True))

# Greater than comparison
print("Greater than comparison:\n", s > s2)

# Less than comparison
print("\nLess than comparison:\n", s < s2)

# Greater than or equal to comparison
print("\nGreater than or equal to comparison:\n", s >= s2)

# Less than or equal to comparison
print("\nLess than or equal to comparison:\n", s <= s2)

Following is the output of the above code −

Greater than comparison:
 0     True
1    False
2     True
3    False
4    False
5    False
6     True
7    False
dtype: bool

Less than comparison:
 0    False
1    False
2    False
3    False
4     True
5    False
6    False
7    False
dtype: bool

Greater than or equal to comparison:
 0     True
1     True
2     True
3     True
4    False
5     True
6     True
7     True
dtype: bool

Less than or equal to comparison:
 0    False
1     True
2    False
3     True
4     True
5     True
6    False
7     True
dtype: bool
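Ordering comparisons require ordered categorical data. The following sketch, which uses hypothetical size labels and illustrative variable names (dtype, a, b, a_ord, b_ord), shows that an unordered categorical Series is expected to reject the > operator, and that marking the categories as ordered with cat.as_ordered() makes the comparison possible.

```python
import pandas as pd

# Unordered categorical dtype (hypothetical size labels)
dtype = pd.CategoricalDtype(["small", "medium", "large"])
a = pd.Series(["small", "large", "medium"], dtype=dtype)
b = pd.Series(["medium", "medium", "medium"], dtype=dtype)

# Ordering comparisons on unordered categoricals raise TypeError
try:
    a > b
    unordered_raised = False
except TypeError as e:
    unordered_raised = True
    print("TypeError:", e)

# Treat the existing category order (small < medium < large) as meaningful
a_ord = a.cat.as_ordered()
b_ord = b.cat.as_ordered()

result = a_ord > b_ord
print(result)
```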

Comparing Categorical Data to Scalars

Categorical data can also be compared to scalar values using all comparison operators (==, !=, >, >=, <, and <=). The categorical values are compared to the scalar based on the order of their categories.

Example

The following example demonstrates how the categorical data can be compared to a scalar value.

import pandas as pd

# Creating a categorical Series
s = pd.Series([1, 2, 3]).astype(pd.CategoricalDtype([3, 2, 1], ordered=True))

# Compare to a scalar
print("Comparing categorical data to a scalar:")
print(s > 2)

Following is the output of the above code −

Comparing categorical data to a scalar:
0     True
1    False
2    Falsedtype: bool
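The other operators work against a scalar in the same way. The short sketch below reuses the same data with illustrative variable names (eq, lt, ge). Because the category order is 3 < 2 < 1, the value 3 counts as less than 2 here, which differs from ordinary numeric ordering.

```python
import pandas as pd

# Ordered categorical Series; the category order is 3 < 2 < 1
s = pd.Series([1, 2, 3]).astype(pd.CategoricalDtype([3, 2, 1], ordered=True))

eq = s == 2   # positional equality with the scalar
lt = s < 2    # only 3 sorts below 2 in this category order
ge = s >= 2   # 1 and 2 sort at or above 2

print(eq.tolist())
print(lt.tolist())
print(ge.tolist())
```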

Comparing Categorical Data with Different Categories

When two categorical Series have different categories or orderings, comparing them raises a TypeError.

Example

The following example demonstrates handling the TypeError raised when comparing two categorical Series objects with different categories or orderings.

import pandas as pd
from pandas.api.types import CategoricalDtype

# Creating a categorical Series
s = pd.Series([1, 2, 1, 1, 2, 3, 1, 3]).astype(CategoricalDtype([3, 2, 1], ordered=True))

# Creating another categorical Series for comparison (categories inferred from the data)
s3 = pd.Series([2, 2, 2, 1, 1, 3, 1, 2]).astype(CategoricalDtype(ordered=True))

try:
    print("Attempting to compare differently ordered two Series objects:")
    print(s > s3)
except TypeError as e:
    print("TypeError:", str(e))

Following is the output of the above code −

Attempting to compare differently ordered two Series objects:
TypeError: Categoricals can only be compared if 'categories' are the same.
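One possible way to avoid this error, sketched below under the assumption that both Series contain the same set of values, is to cast one Series to the other's dtype with astype before comparing. The name s3_aligned is illustrative. This changes only the category order of the second Series, not its values.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Ordered categorical Series with the category order 3 < 2 < 1
s = pd.Series([1, 2, 1, 1, 2, 3, 1, 3]).astype(CategoricalDtype([3, 2, 1], ordered=True))

# Same kind of data, but the categories are inferred (order 1 < 2 < 3)
s3 = pd.Series([2, 2, 2, 1, 1, 3, 1, 2]).astype(CategoricalDtype(ordered=True))

# Align s3 to the dtype of s so both share categories and ordering
s3_aligned = s3.astype(s.dtype)

# The ordering comparison now uses the shared order 3 < 2 < 1
result = s > s3_aligned
print(result)
```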