Python Pandas Tutorial

Python Pandas - Density Plot



A Density Plot, also known as a Kernel Density Estimate (KDE) plot, is a non-parametric way to estimate the Probability Density Function (PDF) of a random variable.

It is commonly used to visualize the distribution of data. A density plot is closely related to a histogram, but instead of using bars to represent the distribution, it draws a smooth curve that estimates the PDF of a dataset by placing a Gaussian kernel on each data point and averaging the results.
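To make the idea of applying Gaussian kernels concrete, here is a minimal sketch that builds the estimate by hand with NumPy. The helper name gaussian_kde_manual, the grid, and the bandwidth value of 0.8 are illustrative assumptions, not part of the Pandas API; the bandwidth here is an absolute kernel width rather than the relative factor that bw_method accepts.

```python
import numpy as np

def gaussian_kde_manual(data, grid, bandwidth):
    """Average one Gaussian kernel per data point, evaluated on `grid`.

    Hypothetical helper for illustration only; `bandwidth` is the
    standard deviation of each kernel (an assumed, hand-picked value).
    """
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardised distance between every grid point and every observation
    z = (grid[:, None] - data[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Mean over observations, rescaled so the curve integrates to one
    return kernels.mean(axis=1) / bandwidth

data = [1, 2, 2.5, 3, 3.5, 4, 5]
grid = np.linspace(-5, 11, 2001)
density = gaussian_kde_manual(data, grid, bandwidth=0.8)

# Riemann sum approximating the area under the estimated density
area = float(density.sum() * (grid[1] - grid[0]))
print(round(area, 3))
```

The printed area should be very close to 1, which is what distinguishes a density estimate from a raw count histogram.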

In this tutorial, we will learn how to create and customize density plots using the Pandas library, with several examples.

Density Plot in Pandas

In Pandas, you can easily create density plots using the plot.kde() or plot.density() methods, which are available for both Series and DataFrame objects. These methods use Matplotlib internally (the density estimate itself is computed with SciPy, which must be installed) and return either a matplotlib.axes.Axes object or a NumPy array (np.ndarray) of Axes objects.
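The two return types can be checked with a short sketch. It assumes the general Pandas plotting keyword subplots=True, which is not discussed in this tutorial, to request one Axes per column; the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, chosen so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 2)), columns=["Col1", "Col2"])

# All columns drawn on one shared Axes
single_axes = df.plot.kde()

# One Axes per column (subplots is a general plotting keyword, assumed here)
axes_array = df.plot.kde(subplots=True)

print(type(single_axes).__name__)
print(type(axes_array).__name__, axes_array.shape)
plt.close("all")
```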

Syntax

Following is the syntax of the plot.kde() or plot.density() method −

plot.kde(bw_method=None, ind=None, **kwargs)

Where,

  • bw_method: Specifies the method used to calculate the bandwidth of the kernel. This can be 'scott', 'silverman', a scalar constant, or a callable. By default, 'scott' is used.

  • ind: Specifies the evaluation points for the KDE. This can be an integer (the number of equally spaced points) or a NumPy array of custom evaluation points. By default, 1000 equally spaced points are used.

  • **kwargs: Additional keyword arguments for plot customization.
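The three parameters can be combined in a single call, as in the following sketch. The specific values ('silverman', 200 points, a dashed line of width 2) are arbitrary choices for illustration, and the Agg backend is an assumption made so the code runs headlessly.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, chosen so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

series = pd.Series([1, 2, 2.5, 3, 3.5, 4, 5])

# bw_method picks the bandwidth rule, ind the number of evaluation points,
# and the remaining keywords are forwarded to Matplotlib for styling
ax = series.plot.kde(bw_method="silverman", ind=200, linestyle="--", linewidth=2)

line = ax.lines[0]
n_points = len(line.get_xdata())
print(n_points, line.get_linestyle(), line.get_linewidth())
plt.close("all")
```

Inspecting the plotted line is a convenient way to confirm that the curve was evaluated at the requested number of points.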

Density Plots for Series

To create a density plot for a single Pandas Series object, you can use the Series.plot.kde() or Series.plot.density() methods.

Example

Here's an example of how to create a simple density plot for a Pandas Series object. This plot visualizes the data distribution as a smooth curve.

import pandas as pd
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = [7, 4]

# Sample data
series = pd.Series([1, 2, 2.5, 3, 3.5, 4, 5])

# Create density plot
densityplot = series.plot.kde()

# Set the title and display the plot
plt.title("Simple Density Plot")
plt.show()

Following is the output of the above code −

Density Plot for Series

Density Plots for DataFrame

You can also create a density plot for an entire DataFrame or for specific columns of the DataFrame by using the DataFrame.plot.kde() or DataFrame.plot.density() methods.

Example

The following example demonstrates how to generate a density plot for a specific column of a DataFrame.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

plt.rcParams["figure.figsize"] = [7, 4]

# Generate random data
df = pd.DataFrame(np.random.normal(size=(10, 4)), columns=["Col1", "Col2", "Col3", "Col4"])

# Create density plot for a specific column of the DataFrame
df.Col1.plot.kde()

# Set the title and display the plot
plt.title("Density Plot for a DataFrame column")
plt.show()

On executing the above code we will get the following output −

Density Plot for DataFrame

Multiple Density Plots on the Same Axes

You can overlay multiple density plots on the same axes, which is useful for comparing the distributions of multiple columns.
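Overlaying also works for data that does not live in a single DataFrame. The sketch below assumes the general Pandas plotting keywords ax and label, which this tutorial does not cover, to draw two separate Series on one Axes; the group names and distributions are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, chosen so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
group_a = pd.Series(rng.normal(loc=0.0, scale=1.0, size=200))
group_b = pd.Series(rng.normal(loc=2.0, scale=0.5, size=200))

fig, ax = plt.subplots()
# Passing the same Axes to both calls overlays the two curves
group_a.plot.kde(ax=ax, label="Group A")
group_b.plot.kde(ax=ax, label="Group B")
ax.legend()

labels = [line.get_label() for line in ax.lines]
print(labels)
plt.close("all")
```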

Example

The following example demonstrates creating multiple density plots on the same axes using the DataFrame.plot.kde() method.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = [7, 4]

# Create a DataFrame
df = pd.DataFrame(np.random.normal(size=(100, 4)), columns=["Col1", "Col2", "Col3", "Col4"])

# Plot density plots
ax = df.plot.kde()

# Set the title and display the plot
plt.title("Multiple Density Plots")
plt.show()

Following is the output of the above code −

Multiple Density Plot

Adjusting Bandwidth of the Density Plot

The bw_method parameter controls the smoothness of the density plot. Smaller values produce a wiggly curve that may over-fit the data, while larger values produce a very smooth curve that may under-fit it.
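To see what a given bw_method value means numerically, the following sketch calls scipy.stats.gaussian_kde directly, which is the estimator Pandas relies on for these plots. It compares the two named rules with the scalar values 0.3 and 3; the sample and seed are arbitrary assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(42)
sample = rng.normal(size=100)

factors = {}
for bw in ("scott", "silverman", 0.3, 3):
    kde = gaussian_kde(sample, bw_method=bw)
    factors[bw] = float(kde.factor)
    # The factor scales the sample standard deviation to give the kernel width
    kernel_width = kde.factor * sample.std(ddof=1)
    print(f"bw_method={bw!r}: factor={kde.factor:.3f}, kernel width={kernel_width:.3f}")
```

For 100 one-dimensional samples, Scott's rule gives a factor of about 0.398, so a scalar of 0.3 is only slightly narrower than the default, whereas 3 is several times wider.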

Example: Density plot for Small Bandwidth

This example sets the bw_method parameter to a small value (0.3), which produces a less smooth density curve.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = [7, 4]

# Create a DataFrame
df = pd.DataFrame(np.random.normal(size=(100, 4)), columns=["Col1", "Col2", "Col3", "Col4"])

# Small bandwidth
df.plot.kde(bw_method=0.3)
plt.title("Density Plot with Small Bandwidth (DataFrame)")
plt.show()

Following is the output of the above code −

Density Plot for Small Bandwidth

Example: Density plot for Large Bandwidth

This example sets the bw_method parameter to a large value (3), which produces a much smoother density curve.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = [7, 4]

# Create a DataFrame
df = pd.DataFrame(np.random.normal(size=(100, 4)), columns=["Col1", "Col2", "Col3", "Col4"])

# Large bandwidth
df.plot.kde(bw_method=3)
plt.title("Density Plot with Large Bandwidth (DataFrame)")
plt.show()

Following is the output of the above code −

Density Plot for Large Bandwidth

Customizing Evaluation Points

To customize the evaluation points, you can use the ind parameter. This allows you to control the specific points at which the KDE is calculated.
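One way to confirm which points were used is to read the x-coordinates back from the plotted line, as in this sketch. The grid np.linspace(0, 6, 61) is an arbitrary assumption, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, chosen so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

series = pd.Series([1, 2, 2.5, 3, 3.5, 4, 5])

# Default behaviour: Pandas chooses the evaluation grid itself
default_x = np.asarray(series.plot.kde().lines[0].get_xdata())
plt.close("all")

# Custom behaviour: the KDE is evaluated exactly at the supplied points
custom_grid = np.linspace(0, 6, 61)
custom_x = np.asarray(series.plot.kde(ind=custom_grid).lines[0].get_xdata())
plt.close("all")

print(len(default_x), default_x.min(), default_x.max())
print(len(custom_x), custom_x.min(), custom_x.max())
```

A coarse grid makes the curve look angular because Matplotlib joins the evaluated points with straight segments, so it is usually worth supplying enough points across the range of interest.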

Example

The following example demonstrates customizing the evaluation points of the density plot by using the ind parameter of the plot.kde() method.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = [7, 4]

# Generate random data
df = pd.DataFrame(np.random.normal(size=(10, 4)), columns=["Col1", "Col2", "Col3", "Col4"])

# Create density plot
df.plot.kde(ind=[-2, -1, 0, 1, 2, 3])
plt.title("Density Plot with Custom Evaluation Points (DataFrame)")
plt.show()

On executing the above code we will get the following output −

Density Plot with Custom Evaluation