Python Pandas Tutorial

Python Pandas - Histograms



A histogram is a graphical representation of the distribution of a dataset. It helps you visualize the frequency of data within defined intervals, called bins. A histogram looks similar to a bar plot, but the two differ: a histogram represents the distribution of numerical data grouped into ranges (bins), whereas a bar plot represents categorical data, with each bar corresponding to a specific category.
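To make the idea of bins concrete, the following minimal sketch counts values per interval with NumPy's np.histogram, which performs the same kind of binning that a histogram plot draws. The sample values and the choice of five equal-width bins over the range 0 to 10 are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration
values = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 9])

# Split the range [0, 10] into 5 equal-width bins and count the values in each.
# Every bin includes its left edge; only the last bin also includes its right edge.
counts, edges = np.histogram(values, bins=5, range=(0, 10))

print(counts)  # [1 5 3 0 1]
print(edges)   # [ 0.  2.  4.  6.  8. 10.]
```

Each count is the height of one bar in the corresponding histogram, and the edges mark where each bar starts and ends.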

In this tutorial, we will learn how to create and customize histograms using the Pandas library with different examples.

Creating Histograms in Pandas

In Pandas, histograms can be created using the plot.hist() method, which is available on both Series and DataFrame objects. This method returns a matplotlib Axes object containing the histogram plot.

  • DataFrame.plot.hist(): Creates a histogram for one or more columns in a DataFrame.

  • Series.plot.hist(): Creates a histogram for a specific column or Series.
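The examples in this tutorial use DataFrames, so here is a minimal sketch of the Series variant. The seed, the sample size of 100, and the Series name "a" are assumptions for illustration. The Agg backend is selected only so that the sketch runs without a display; remove that line and call plt.show() to view the plot interactively.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
import numpy as np
import pandas as pd

# Fixed seed so the sketch is reproducible
rng = np.random.default_rng(0)
s = pd.Series(rng.random(100), name="a")

# Series.plot.hist() draws one histogram and returns the matplotlib Axes
ax = s.plot.hist(bins=10)

# Each bin is drawn as one rectangle on the Axes
n_bars = len(ax.patches)
print(n_bars)
```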

Syntax

Following is the syntax of the hist() method −

DataFrame.plot.hist(by=None, bins=10, **kwargs)

Where,

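  • by: Column (or sequence of columns) in the DataFrame to group by. One histogram is drawn per group. This parameter is available in pandas 1.4.0 and later.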

  • bins: The number of bins to use for the histogram. The default value is 10.

  • **kwargs: Additional arguments to customize the plot.
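As a rough sketch of how these parameters combine, the snippet below sets bins explicitly and forwards two styling options through **kwargs. The options alpha and edgecolor are matplotlib bar properties chosen here as an illustration; the data, seed, and values are assumptions. The Agg backend is used only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.random((50, 2)), columns=["a", "b"])

# bins sets the number of intervals; alpha and edgecolor are passed
# through **kwargs to the underlying matplotlib call
ax = df.plot.hist(bins=5, alpha=0.5, edgecolor="black")

# Two columns with 5 bins each give 10 rectangles on the same Axes
n_bars = len(ax.patches)
bar_alpha = ax.patches[0].get_alpha()
print(n_bars, bar_alpha)
```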

Example

Here is a basic example of creating a histogram for a DataFrame using the plot.hist() method.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    plt.rcParams["figure.figsize"] = [7, 4]

    # Create a DataFrame with random data
    df = pd.DataFrame(np.random.rand(10, 2), columns=["a", "b"])

    # Plot histogram
    ax = df.plot.hist()
    plt.title("Simple Histogram")
    plt.show()

Following is the output of the above code −

[Figure: Simple Histogram]

Plotting a Stacked Histogram

A stacked histogram displays multiple numerical columns with their bars stacked on top of each other in each bin. This can be done by using the stacked=True parameter.

Example

This example creates a stacked histogram for a DataFrame using the stacked=True parameter.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    plt.rcParams["figure.figsize"] = [7, 4]

    # Create a DataFrame with random data
    df = pd.DataFrame(np.random.rand(10, 2), columns=["a", "b"])

    # Plot the stacked histogram
    df.plot.hist(stacked=True, bins=20, alpha=0.7, title="Stacked Histogram")
    plt.show()

On executing the above code we will get the following output −

[Figure: Stacked Histogram]
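To see what the height of a stack means, the following sketch computes per-bin counts for each column with NumPy and adds them up. It assumes that both columns share the same bin edges, derived from all values together. The seed and the choice of five bins are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame(rng.random((10, 2)), columns=["a", "b"])

# Assumption: one common set of bin edges covering all values in both columns
edges = np.histogram_bin_edges(df.to_numpy().ravel(), bins=5)

# Per-column counts over the shared bins
counts_a, _ = np.histogram(df["a"], bins=edges)
counts_b, _ = np.histogram(df["b"], bins=edges)

# In a stacked histogram, the top of each stack is the sum across columns
stack_tops = counts_a + counts_b
print(stack_tops)
```

The stack heights always add up to the total number of values, here 20 (10 rows in 2 columns).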

Creating Horizontal Histograms

To create a horizontal histogram, you can use the orientation='horizontal' parameter of the plot.hist() method.

Example

This example creates a horizontal histogram for a DataFrame using the orientation='horizontal' parameter.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    plt.rcParams["figure.figsize"] = [7, 4]

    # Create a DataFrame with random data
    df = pd.DataFrame(np.random.rand(10, 2), columns=["a", "b"])

    # Plot the horizontal histogram
    df.plot.hist(orientation='horizontal', bins=20, alpha=0.7, title="Horizontal Histogram")
    plt.show()

Following is the output of the above code −

[Figure: Horizontal Histogram]

Plotting the Cumulative Histogram

Cumulative histograms show the cumulative frequency distribution: each bar counts the values in its own bin plus all values in the bins before it. Plotting the cumulative histogram can be done by setting the cumulative parameter to True.

Example

This example demonstrates plotting a cumulative histogram for a DataFrame using the cumulative=True parameter of the plot.hist() method.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    plt.rcParams["figure.figsize"] = [7, 4]

    # Create a DataFrame with random data
    df = pd.DataFrame(np.random.rand(10, 2), columns=["a", "b"])

    # Plot the cumulative histogram
    df.plot.hist(cumulative=True, bins=20, alpha=0.7, title="Cumulative Histogram")
    plt.show()

On executing the above code we will get the following output −

[Figure: Cumulative Histogram]
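The relationship between ordinary and cumulative bar heights can be sketched with NumPy: the cumulative heights are the running sum of the per-bin counts. The sample values and the four bins over the range 0 to 1 are assumptions for illustration.

```python
import numpy as np

# Hypothetical values between 0 and 1, chosen only for illustration
values = np.array([0.1, 0.2, 0.3, 0.55, 0.6, 0.7, 0.9])

# Ordinary per-bin counts over 4 equal-width bins
counts, edges = np.histogram(values, bins=4, range=(0.0, 1.0))

# Cumulative heights: each bin adds its count to the total so far
cumulative = np.cumsum(counts)

print(counts)      # [2 1 3 1]
print(cumulative)  # [2 3 6 7]
```

The last cumulative bar always equals the total number of values, which is a quick sanity check for a cumulative plot.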

Subplots for Histograms

You can create an individual subplot for the histogram of each column of a DataFrame by calling the DataFrame.hist() method directly.

Example

This example creates subplots for the histograms of DataFrame columns using the DataFrame.hist() method.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    plt.rcParams["figure.figsize"] = [7, 4]

    # Create a DataFrame with random data
    df = pd.DataFrame(np.random.rand(10, 2), columns=["a", "b"])

    # Subplots for each column
    df.hist(color='lightgreen', bins=20)
    plt.suptitle("Histograms into Subplots")
    plt.show()

Following is the output of the above code −

[Figure: Histograms into Subplots]
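Because DataFrame.hist() draws one subplot per column, it hands back a collection of Axes rather than a single one. The sketch below collects the subplot titles to show which column landed on which subplot. The seed and data are assumptions for illustration, and the Agg backend is used only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
df = pd.DataFrame(rng.random((10, 2)), columns=["a", "b"])

# One subplot per numeric column; the result is an array of Axes
axes = df.hist(bins=20)

# Flatten the array and read the title that pandas set on each subplot
titles = [ax.get_title() for ax in np.ravel(axes)]
print(titles)
```

Holding on to the returned Axes makes it possible to customize each subplot separately, for example by setting axis labels per column.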

Grouped Histograms

Grouped histograms allow you to visualize data distribution by specific categories. We can use the by parameter to create histograms grouped by a column.

Example

This example creates a grouped histogram for DataFrame columns using the by parameter.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    plt.rcParams["figure.figsize"] = [7, 4]

    # Create a DataFrame with random data
    x = ['A']*30 + ['B']*70
    y = np.random.randn(100)
    df = pd.DataFrame({'Letter': x, 'Numbers': y})

    # Plot the grouped histogram
    df.plot.hist(by='Letter', bins=20, alpha=0.7, title="Grouped Histogram")
    plt.show()

Following is the output of the above code −

[Figure: Grouped Histograms]
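To check what each group contributes, the per-group counts can be computed without plotting. This is a minimal sketch using groupby and np.histogram. It assumes bin edges shared across groups, which may differ from the edges pandas chooses when drawing with by. The seed is illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "Letter": ["A"] * 30 + ["B"] * 70,
    "Numbers": rng.standard_normal(100),
})

# Assumption: one common set of 20 bins spanning all values
edges = np.histogram_bin_edges(df["Numbers"], bins=20)

# Count the values per bin separately for each letter
per_group = {
    letter: np.histogram(group["Numbers"], bins=edges)[0]
    for letter, group in df.groupby("Letter")
}

for letter, counts in per_group.items():
    print(letter, counts.sum())
```

The totals per group (30 for A and 70 for B) match the group sizes, so the bars in group B's histogram are taller overall than those in group A's.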