How do I create plots in pandas?

[Figure: 04_plot_overview.svg]

In [1]: import pandas as pd

In [2]: import matplotlib.pyplot as plt

Data used for this tutorial:
  • For this tutorial, air quality data about \(NO_2\) is used, made available by OpenAQ and retrieved using the py-openaq package. The air_quality_no2.csv data set provides \(NO_2\) values for the measurement stations FR04014, BETR801 and London Westminster in Paris, Antwerp and London, respectively.

    In [3]: air_quality = pd.read_csv("data/air_quality_no2.csv", index_col=0, parse_dates=True)

    In [4]: air_quality.head()
    Out[4]:
                         station_antwerp  station_paris  station_london
    datetime
    2019-05-07 02:00:00              NaN            NaN            23.0
    2019-05-07 03:00:00             50.5           25.0            19.0
    2019-05-07 04:00:00             45.0           27.7            19.0
    2019-05-07 05:00:00              NaN           50.4            16.0
    2019-05-07 06:00:00              NaN           61.9             NaN

    Note

    The index_col and parse_dates parameters of the read_csv function are used to define the first (0th) column as the index of the resulting DataFrame and to convert the dates in that column to Timestamp objects, respectively.

  • I want a quick visual check of the data.

    In [5]: air_quality.plot()
    Out[5]: <Axes: xlabel='datetime'>

    In [6]: plt.show()

    [Figure: 04_airqual_quick.png]

    With a DataFrame, pandas creates by default one line plot for each of the columns with numeric data.

  • I want to plot only the column of the data table with the data from Paris.

    In [7]: air_quality["station_paris"].plot()
    Out[7]: <Axes: xlabel='datetime'>

    In [8]: plt.show()

    [Figure: 04_airqual_paris.png]

    To plot a specific column, use the selection method from the subset data tutorial in combination with the plot() method. Hence, the plot() method works on both Series and DataFrame.

  • I want to visually compare the \(NO_2\) values measured in London versus Paris.

    In [9]: air_quality.plot.scatter(x="station_london", y="station_paris", alpha=0.5)
    Out[9]: <Axes: xlabel='station_london', ylabel='station_paris'>

    In [10]: plt.show()

    [Figure: 04_airqual_scatter.png]
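
The steps in the list above can also be tried without the CSV file. The following is a minimal sketch, not part of the original tutorial: it assumes the five rows shown in the head() output are typed out as in-memory CSV text (the name `sample` and the `csv_text` string are illustrative), and it selects a non-interactive Matplotlib backend so that no window is needed.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd

# The five rows from the head() output, written as CSV text.
# Empty fields stand for the NaN entries.
csv_text = """datetime,station_antwerp,station_paris,station_london
2019-05-07 02:00:00,,,23.0
2019-05-07 03:00:00,50.5,25.0,19.0
2019-05-07 04:00:00,45.0,27.7,19.0
2019-05-07 05:00:00,,50.4,16.0
2019-05-07 06:00:00,,61.9,
"""

# index_col=0 turns the first column into the index;
# parse_dates=True converts that index to timestamps.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# DataFrame.plot() draws one line per numeric column.
ax_all = sample.plot()

# Selecting one column yields a Series; its plot() draws a single line.
# A fresh Axes is passed explicitly so the line does not land on the previous plot.
fig_paris, ax_paris = plt.subplots()
sample["station_paris"].plot(ax=ax_paris)

# Scatter plot of one column against another.
ax_scatter = sample.plot.scatter(x="station_london", y="station_paris", alpha=0.5)
```

Passing `ax=ax_paris` is a deliberate choice here: without the intervening `plt.show()` calls of the interactive session, a Series plot would otherwise be drawn onto the Axes that is currently active.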

Apart from the default line plot when using the plot function, a number of alternatives are available to plot data. Let’s use some standard Python to get an overview of the available plot methods:

In [11]: [
   ....:     method_name
   ....:     for method_name in dir(air_quality.plot)
   ....:     if not method_name.startswith("_")
   ....: ]
Out[11]: ['area', 'bar', 'barh', 'box', 'density', 'hexbin', 'hist', 'kde', 'line', 'pie', 'scatter']

Note

In many development environments, as well as in IPython and Jupyter Notebook, pressing the TAB key gives an overview of the available methods, for example air_quality.plot. + TAB.

One of the options is DataFrame.plot.box(), which draws a boxplot. The box method can be applied to the air quality example data:

In [12]: air_quality.plot.box()
Out[12]: <Axes: >

In [13]: plt.show()

[Figure: 04_airqual_boxplot.png]

To user guide

For an introduction to plots other than the default line plot, see the user guide section about supported plot styles.
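
Each plot method listed above can also be selected through the kind keyword of plot(), so plot.box() and plot(kind="box") request the same kind of plot. The sketch below illustrates this on a small stand-in table; the `sample` DataFrame is an assumption built from the five rows of the head() output, and the non-interactive backend is chosen only so that the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is required
import pandas as pd

nan = float("nan")

# Stand-in for air_quality: the five rows shown in the head() output.
sample = pd.DataFrame(
    {
        "station_antwerp": [nan, 50.5, 45.0, nan, nan],
        "station_paris": [nan, 25.0, 27.7, 50.4, 61.9],
        "station_london": [23.0, 19.0, 19.0, 16.0, nan],
    },
    index=pd.to_datetime(
        [
            "2019-05-07 02:00:00",
            "2019-05-07 03:00:00",
            "2019-05-07 04:00:00",
            "2019-05-07 05:00:00",
            "2019-05-07 06:00:00",
        ]
    ),
)
sample.index.name = "datetime"

# Accessor form and keyword form of the same plot type.
ax_accessor = sample.plot.box()
ax_keyword = sample.plot(kind="box")

# Both draw one box per column, labelled with the column name.
labels_accessor = [tick.get_text() for tick in ax_accessor.get_xticklabels()]
labels_keyword = [tick.get_text() for tick in ax_keyword.get_xticklabels()]
```

The accessor form tends to be easier to discover with TAB completion, while the keyword form is convenient when the plot type is held in a variable.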

  • I want each of the columns in a separate subplot.

    In [14]: axs = air_quality.plot.area(figsize=(12, 4), subplots=True)

    In [15]: plt.show()

    [Figure: 04_airqual_area_subplot.png]

    Separate subplots for each of the data columns are supported by the subplots argument of the plot functions. The built-in options available in each of the pandas plot functions are worth reviewing.
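
With subplots=True, the call returns a collection of Axes rather than a single one, which is why the result above is named axs. The following sketch shows one way to work with that return value; the `sample` DataFrame is an assumed stand-in built from the five rows of the head() output, and adding a title per subplot is only an illustration.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd

nan = float("nan")

# Stand-in for air_quality: the five rows shown in the head() output.
sample = pd.DataFrame(
    {
        "station_antwerp": [nan, 50.5, 45.0, nan, nan],
        "station_paris": [nan, 25.0, 27.7, 50.4, 61.9],
        "station_london": [23.0, 19.0, 19.0, 16.0, nan],
    },
    index=pd.to_datetime(
        [
            "2019-05-07 02:00:00",
            "2019-05-07 03:00:00",
            "2019-05-07 04:00:00",
            "2019-05-07 05:00:00",
            "2019-05-07 06:00:00",
        ]
    ),
)

# One area plot per column; axs holds one Axes for each of them.
axs = sample.plot.area(figsize=(12, 4), subplots=True)

# Each Axes can be customized individually, here with the column name as title.
for ax, column in zip(axs, sample.columns):
    ax.set_title(column)

titles = [ax.get_title() for ax in axs]
plt.close("all")  # release the figure once it is no longer needed
```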

To user guide

Some more formatting options are explained in the user guide section on plot formatting.

  • I want to further customize, extend or save the resulting plot.

    In [16]: fig, axs = plt.subplots(figsize=(12, 4))

    In [17]: air_quality.plot.area(ax=axs)
    Out[17]: <Axes: xlabel='datetime'>

    In [18]: axs.set_ylabel("NO$_2$ concentration")
    Out[18]: Text(0, 0.5, 'NO$_2$ concentration')

    In [19]: fig.savefig("no2_concentrations.png")

    In [20]: plt.show()

    [Figure: 04_airqual_customized.png]

Each of the plot objects created by pandas is a Matplotlib object. As Matplotlib provides plenty of options to customize plots, making the link between pandas and Matplotlib explicit makes all the power of Matplotlib available for the plot. This strategy is applied in the previous example:

fig, axs = plt.subplots(figsize=(12, 4))        # Create an empty Matplotlib Figure and Axes
air_quality.plot.area(ax=axs)                   # Use pandas to put the area plot on the prepared Figure/Axes
axs.set_ylabel("NO$_2$ concentration")          # Do any Matplotlib customization you like
fig.savefig("no2_concentrations.png")           # Save the Figure/Axes using the existing Matplotlib method.
plt.show()                                      # Display the plot
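
The link between the two libraries can be checked directly: the Axes returned by the pandas call is the same object that was passed in through ax. The sketch below demonstrates this under a few assumptions that are not in the original: the `sample` DataFrame stands in for air_quality (five rows from the head() output), and the figure is saved to an in-memory buffer instead of no2_concentrations.png so that no file is written.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd

nan = float("nan")

# Stand-in for air_quality: the five rows shown in the head() output.
sample = pd.DataFrame(
    {
        "station_antwerp": [nan, 50.5, 45.0, nan, nan],
        "station_paris": [nan, 25.0, 27.7, 50.4, 61.9],
        "station_london": [23.0, 19.0, 19.0, 16.0, nan],
    },
    index=pd.to_datetime(
        [
            "2019-05-07 02:00:00",
            "2019-05-07 03:00:00",
            "2019-05-07 04:00:00",
            "2019-05-07 05:00:00",
            "2019-05-07 06:00:00",
        ]
    ),
)

fig, axs = plt.subplots(figsize=(12, 4))

# pandas draws onto the prepared Axes and hands the same object back.
returned = sample.plot.area(ax=axs)

# Any Matplotlib customization applies to that Axes.
axs.set_ylabel("NO$_2$ concentration")

# Save to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()

plt.close(fig)  # release the figure once it has been saved
```

Because `returned` and `axs` refer to the same Axes, it does not matter which of the two names is used for further customization.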

REMEMBER

  • The .plot.* methods are applicable on both Series and DataFrames.

  • By default, each of the columns is plotted as a different element (line, boxplot, …).

  • Any plot created by pandas is a Matplotlib object.

To user guide

A full overview of plotting in pandas is provided in the visualization pages.

