Histograms are a fundamental tool in data visualization, providing a graphical representation of the distribution of data. They are particularly useful for exploring continuous data, such as numerical measurements or sensor readings. This article will guide you through plotting a histogram in Python using Matplotlib, covering the essential steps from data preparation to generating the histogram plot.
What is a Matplotlib Histogram?
A histogram groups numerical data into ranges and shows how many values fall into each range, which makes it a compact graphical summary of how the data is distributed. It is a type of bar plot where the X-axis represents the bin ranges and the Y-axis gives the frequency of the values in each bin.
Creating a Matplotlib Histogram
To create a Matplotlib histogram, first divide the whole range of values into a series of intervals (bins), then count how many values fall into each interval. Bins are consecutive, non-overlapping intervals of a variable. The matplotlib.pyplot.hist() function computes and draws a histogram of x.
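The binning step can be sketched with NumPy alone, before any plotting. The sample values and the choice of five bins over the range 1 to 6 are assumptions made only for this illustration.

```python
import numpy as np

# Hypothetical sample: ten measurements, chosen only for illustration
values = np.array([1.2, 2.5, 2.7, 3.1, 3.3, 3.8, 4.0, 4.4, 5.9, 6.0])

# Split the range 1..6 into 5 consecutive, equal-width intervals
# and count how many values fall into each one
counts, edges = np.histogram(values, bins=5, range=(1, 6))

# Every bin includes its left edge; only the last bin also includes its right edge
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f} to {right:.0f}: {count} value(s)")
```

Each value lands in exactly one bin, so the counts always add up to the number of values that lie inside the chosen range.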
The following table shows the parameters accepted by the matplotlib.pyplot.hist() function:
| Parameter | Description |
|---|---|
| x | Array or sequence of arrays |
| bins | Optional; an integer, a sequence of bin edges, or a string |
| density | Optional; a boolean value |
| range | Optional; the lower and upper range of the bins |
| histtype | Optional; the type of histogram to draw [bar, barstacked, step, stepfilled], default is "bar" |
| align | Optional; controls how the bars are aligned within the bins [left, right, mid] |
| weights | Optional; an array of weights with the same shape as x |
| bottom | Location of the baseline of each bin |
| rwidth | Optional; the relative width of the bars with respect to the bin width |
| color | Optional; a color or a sequence of color specs |
| label | Optional; a string, or a sequence of strings to match multiple datasets |
| log | Optional; sets the histogram axis to a log scale |
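As a rough illustration of how a few of these parameters combine, the sketch below passes bins, range, density, histtype, color and label in one call and inspects the values that hist() returns. The seed, the bin count and the range are arbitrary choices for the example.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
data = rng.normal(size=1000)

fig, ax = plt.subplots()

# density=True rescales the bar heights so that the total bar area is 1;
# range=(-3, 3) ignores samples that fall outside that interval
heights, edges, patches = ax.hist(
    data, bins=12, range=(-3, 3), density=True,
    histtype='stepfilled', color='skyblue', label='sample',
)
ax.legend()

# hist() returns the bar heights and the bin edges, so they can be checked
area = float(np.sum(heights * np.diff(edges)))
print(f"{len(edges) - 1} bins, total area = {area:.3f}")
```

Call plt.show() afterwards to display the figure interactively.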
Plotting Histogram in Python using Matplotlib
Here we will look at different ways of plotting a histogram with Matplotlib in Python:
- Basic Histogram
- Customized Histogram with Density Plot
- Customized Histogram with Watermark
- Multiple Histograms with Subplots
- Stacked Histogram
- 2D Histogram (Hexbin Plot)
Create a Basic Histogram in Matplotlib
Let's create a basic histogram of 1,000 random values drawn from a standard normal distribution.
```python
import matplotlib.pyplot as plt
import numpy as np

# Generate random data for the histogram
data = np.random.randn(1000)

# Plotting a basic histogram
plt.hist(data, bins=30, color='skyblue', edgecolor='black')

# Adding labels and title
plt.xlabel('Values')
plt.ylabel('Frequency')
plt.title('Basic Histogram')

# Display the plot
plt.show()
```
Output:

Customized Histogram in Matplotlib with Density Plot
Let's create a customized histogram with a density plot using Matplotlib and Seaborn in Python. The resulting plot visualizes the distribution of random data with a smooth density estimate.
```python
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

# Generate random data for the histogram
data = np.random.randn(1000)

# Creating a customized histogram with a density plot.
# stat='density' normalizes the bars so the y-axis really shows density
# (the default would show counts, which would not match the axis label).
sns.histplot(data, bins=30, kde=True, stat='density',
             color='lightgreen', edgecolor='red')

# Adding labels and title
plt.xlabel('Values')
plt.ylabel('Density')
plt.title('Customized Histogram with Density Plot')

# Display the plot
plt.show()
```
Output:

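For readers who would rather not depend on Seaborn, a similar picture can be approximated with Matplotlib and NumPy only. The sketch below is not how Seaborn computes its curve; it is a hand-written Gaussian kernel density estimate, and the helper name gaussian_kde_sketch and the rule-of-thumb bandwidth (sample standard deviation times n^(-1/5)) are assumptions made for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

def gaussian_kde_sketch(samples, grid, bandwidth):
    """Hypothetical helper: average of Gaussian bumps centered on each sample."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    return bumps.mean(axis=1)

rng = np.random.default_rng(42)  # fixed seed, chosen arbitrarily
data = rng.normal(size=1000)

# Rule-of-thumb bandwidth (an assumption, not tuned for this data)
bandwidth = data.std(ddof=1) * len(data) ** (-1 / 5)
grid = np.linspace(data.min() - 3 * bandwidth, data.max() + 3 * bandwidth, 400)
density = gaussian_kde_sketch(data, grid, bandwidth)

fig, ax = plt.subplots()
# density=True puts the bars on the same scale as the density curve
ax.hist(data, bins=30, density=True, color='lightgreen', edgecolor='red')
ax.plot(grid, density, color='darkgreen')
ax.set_xlabel('Values')
ax.set_ylabel('Density')
ax.set_title('Histogram with a hand-rolled density estimate')
```

Because both the bars and the curve are normalized to unit area, they can share one y-axis. Call plt.show() to display the figure.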
Customized Histogram with Watermark
Let's create a customized histogram in Matplotlib with additional styling: hidden spines and tick marks, padded tick labels, gridlines, a text watermark, and bars colored by height using a colormap.
```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import colors

# Creating dataset
np.random.seed(23685752)
N_points = 10000
n_bins = 20

# Creating distribution
x = np.random.randn(N_points)

legend = ['distribution']

# Creating figure and axes
fig, axs = plt.subplots(1, 1, figsize=(10, 7), tight_layout=True)

# Remove axes spines
for s in ['top', 'bottom', 'left', 'right']:
    axs.spines[s].set_visible(False)

# Remove x, y ticks
axs.xaxis.set_ticks_position('none')
axs.yaxis.set_ticks_position('none')

# Add padding between axes and labels
axs.xaxis.set_tick_params(pad=5)
axs.yaxis.set_tick_params(pad=10)

# Add x, y gridlines (the flag is passed positionally because the
# old b= keyword is no longer accepted by recent Matplotlib releases)
axs.grid(True, color='grey', linestyle='-.', linewidth=0.5, alpha=0.6)

# Add text watermark
fig.text(0.9, 0.15, 'Jeeteshgavande30', fontsize=12,
         color='red', ha='right', va='bottom', alpha=0.7)

# Creating histogram
N, bins, patches = axs.hist(x, bins=n_bins)

# Setting color of each bar from its scaled height
fracs = (N ** (1 / 5)) / N.max()
norm = colors.Normalize(fracs.min(), fracs.max())
for thisfrac, thispatch in zip(fracs, patches):
    color = plt.cm.viridis(norm(thisfrac))
    thispatch.set_facecolor(color)

# Adding extra features
plt.xlabel("X-axis")
plt.ylabel("y-axis")
plt.legend(legend)
plt.title('Customized histogram')

# Show plot
plt.show()
```
Output:

Multiple Histograms with Subplots
Let's generate two histograms side by side using Matplotlib in Python, each with its own set of random data. This provides a visual comparison of the distributions of data1 and data2.
```python
import matplotlib.pyplot as plt
import numpy as np

# Generate random data for multiple histograms
data1 = np.random.randn(1000)
data2 = np.random.normal(loc=3, scale=1, size=1000)

# Creating subplots with multiple histograms
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 4))

axes[0].hist(data1, bins=30, color='yellow', edgecolor='black')
axes[0].set_title('Histogram 1')

axes[1].hist(data2, bins=30, color='pink', edgecolor='black')
axes[1].set_title('Histogram 2')

# Adding labels
for ax in axes:
    ax.set_xlabel('Values')
    ax.set_ylabel('Frequency')

# Adjusting layout for better spacing
plt.tight_layout()

# Display the figure
plt.show()
```
Output:

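When side-by-side histograms are meant to be compared, it can help to give both panels the same bin edges and the same axis limits, since otherwise each panel picks its own scale. The following is one possible sketch of that idea; the seed is arbitrary and the shared-edge approach is a suggestion rather than part of the example above.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen arbitrarily
data1 = rng.normal(loc=0, scale=1, size=1000)
data2 = rng.normal(loc=3, scale=1, size=1000)

# One set of 30 bins spanning both datasets
edges = np.histogram_bin_edges(np.concatenate([data1, data2]), bins=30)

# sharex/sharey force both panels onto identical axis limits
fig, axes = plt.subplots(1, 2, figsize=(12, 4), sharex=True, sharey=True)

counts1, _, _ = axes[0].hist(data1, bins=edges, color='yellow', edgecolor='black')
counts2, _, _ = axes[1].hist(data2, bins=edges, color='pink', edgecolor='black')

axes[0].set_title('Histogram 1')
axes[1].set_title('Histogram 2')
for ax in axes:
    ax.set_xlabel('Values')
axes[0].set_ylabel('Frequency')

fig.tight_layout()
```

With shared edges, a bar at a given position means the same interval in both panels, so bar heights can be compared directly. Call plt.show() to display the figure.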
Stacked Histogram using Matplotlib
Let's generate a stacked histogram using Matplotlib in Python, representing two datasets drawn from different random distributions. The stacked histogram shows the combined frequency distribution of the two datasets, with each dataset's contribution drawn in its own color.
```python
import matplotlib.pyplot as plt
import numpy as np

# Generate random data for stacked histograms
data1 = np.random.randn(1000)
data2 = np.random.normal(loc=3, scale=1, size=1000)

# Creating a stacked histogram
plt.hist([data1, data2], bins=30, stacked=True,
         color=['cyan', 'purple'], edgecolor='black')

# Adding labels and title
plt.xlabel('Values')
plt.ylabel('Frequency')
plt.title('Stacked Histogram')

# Adding legend
plt.legend(['Dataset 1', 'Dataset 2'])

# Display the plot
plt.show()
```
Output:

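To see what "combined frequency distribution" means in practice, the sketch below compares the top of the stack with a histogram of the two datasets pooled together. It relies on the behavior, as I understand it, that hist() with stacked=True returns cumulative heights for each dataset; the seed is arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)  # fixed seed, chosen arbitrarily
data1 = rng.normal(loc=0, scale=1, size=1000)
data2 = rng.normal(loc=3, scale=1, size=1000)

fig, ax = plt.subplots()
# With stacked=True, tops[0] is dataset 1 alone and tops[1] is the height
# of the full stack (dataset 1 plus dataset 2) in every bin
tops, edges, _ = ax.hist(
    [data1, data2], bins=30, stacked=True,
    color=['cyan', 'purple'], edgecolor='black',
    label=['Dataset 1', 'Dataset 2'],
)
ax.legend()

# Histogram of the pooled data on the same bin edges, for comparison
pooled, _ = np.histogram(np.concatenate([data1, data2]), bins=edges)
print("Stack top matches pooled counts:", np.array_equal(tops[-1], pooled))
```

Passing label= to hist() ties each legend entry to its dataset explicitly, which avoids relying on the order of the drawn artists. Call plt.show() to display the figure.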
Plot 2D Histogram (Hexbin Plot) using Matplotlib
Let's generate a 2D hexbin plot using Matplotlib in Python. It provides a visual representation of a 2D data distribution, where the color of each hexagon conveys the density of data points inside it. The colorbar helps interpret the density of points in different regions of the plot.
```python
import matplotlib.pyplot as plt
import numpy as np

# Generate random 2D data for hexbin plot
x = np.random.randn(1000)
y = 2 * x + np.random.normal(size=1000)

# Creating a 2D histogram (hexbin plot)
plt.hexbin(x, y, gridsize=30, cmap='Blues')

# Adding labels and title
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('2D Histogram (Hexbin Plot)')

# Adding colorbar
plt.colorbar()

# Display the plot
plt.show()
```
Output:

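A hexbin plot uses hexagonal cells; if rectangular cells are preferred, Matplotlib's hist2d() offers a closely related view of the same kind of data. The sketch below is an alternative to the hexbin example, not a replacement for it, and the seed and the 30 by 30 grid are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(3)  # fixed seed, chosen arbitrarily
x = rng.normal(size=1000)
y = 2 * x + rng.normal(size=1000)

fig, ax = plt.subplots()
# hist2d bins the points on a rectangular 30 x 30 grid and returns
# the count matrix, both sets of edges and the drawn image
counts, xedges, yedges, image = ax.hist2d(x, y, bins=30, cmap='Blues')

ax.set_xlabel('X values')
ax.set_ylabel('Y values')
ax.set_title('2D Histogram (rectangular bins)')

# The colorbar maps color to the number of points per cell
fig.colorbar(image, ax=ax, label='Points per bin')
```

The returned count matrix makes it easy to work with the binned data numerically as well as visually. Call plt.show() to display the figure.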
Conclusion
Plotting Matplotlib histograms is a simple and straightforward process. By using the hist() function, we can easily create histograms with different bin widths and bin edges. We can also customize the appearance of histograms to meet our needs.
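As a closing illustration of choosing bin widths and edges, the sketch below passes explicit, unequal bin edges to hist() and also asks NumPy for automatically chosen edges. The exponential sample, the seed and the particular edges are assumptions made only for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(7)  # fixed seed, chosen arbitrarily
data = rng.exponential(scale=2.0, size=500)

# Explicit, unequal bin edges: narrow bins where the data is dense,
# wide bins in the sparse tail
custom_edges = [0, 0.5, 1, 2, 4, 8, 32]

fig, ax = plt.subplots()
# With unequal widths, density=True keeps bars comparable:
# each height is a count per unit width, scaled so the total area is 1
heights, edges, _ = ax.hist(data, bins=custom_edges, density=True,
                            color='skyblue', edgecolor='black')
ax.set_xlabel('Values')
ax.set_ylabel('Density')
ax.set_title('Histogram with custom bin edges')

# Alternatively, let NumPy pick the edges with its 'auto' rule
auto_edges = np.histogram_bin_edges(data, bins='auto')
print(f"custom: {len(edges) - 1} bins, auto: {len(auto_edges) - 1} bins")
```

The automatically chosen edges can be passed straight to hist() through the bins parameter. Call plt.show() to display the figure.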