Python Pandas Tutorial

Python Pandas - Pickling



Pickling in Python, also known as serialization, is the process of converting a Python object into a byte stream, which can be saved and later deserialized back into the original object structure. In the context of Pandas, pickling enables the efficient saving and loading of DataFrames and Series objects.
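The round trip described above can be sketched with the standard pickle module directly, without touching the disk. This is a minimal illustration, and the variable names payload and restored are chosen here for the example.

```python
import pickle

import pandas as pd

# A small DataFrame to serialize
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Serialize the DataFrame into an in-memory byte stream
payload = pickle.dumps(df)
print(type(payload))

# Deserialize the byte stream back into a DataFrame
restored = pickle.loads(payload)
print(restored.equals(df))
```

The Pandas methods covered below perform the same two steps while also handling the file and optional compression.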

The Pandas library provides easy-to-use tools for pickling DataFrame and Series objects: the to_pickle() method and the read_pickle() function. Both use Python's pickle module, which implements a binary format for saving data structures to disk and loading them back into Pandas objects.

In this tutorial, we will learn how to use the built-in pickling functions of Pandas to serialize and deserialize Pandas objects, including the available compression options.

Saving a Pandas Object to a Pickle File

To pickle a Pandas object such as a DataFrame or a Series, use the to_pickle() method, which saves the object to a file in pickle format.

Example: Saving a DataFrame to a Pickle File

The following example uses the to_pickle() method to save a Pandas DataFrame object to a pickle file.

    import pandas as pd

    # Create a sample DataFrame
    df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

    # Display the input DataFrame
    print("Original DataFrame:")
    print(df)

    # Save the DataFrame to a pickle file
    df.to_pickle("dataframe.pkl")
    print('\nDataFrame is successfully saved as a pickle file named "dataframe.pkl".')

Running the above program produces the following result:

    Original DataFrame:
       A  B
    0  1  4
    1  2  5
    2  3  6

    DataFrame is successfully saved as a pickle file named "dataframe.pkl".

The generated file, dataframe.pkl, appears in the current working directory.

Example: Saving a Series to a Pickle File

The following example uses the to_pickle() method to save a Pandas Series object to a pickle file.

    import pandas as pd

    # Create a Pandas Series
    s = pd.Series([1, 2, 3, 4])

    # Display the input Series
    print("Original Series:")
    print(s)

    # Save the Series as a pickle file
    s.to_pickle("series_to_pickle_file.pkl")
    print('\nSeries is successfully saved as a pickle file named "series_to_pickle_file.pkl".')

Running the above program produces the following result:

    Original Series:
    0    1
    1    2
    2    3
    3    4
    dtype: int64

    Series is successfully saved as a pickle file named "series_to_pickle_file.pkl".

In this example, the Series object is saved to a file named "series_to_pickle_file.pkl" in the current directory.
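The to_pickle() method also accepts a protocol parameter that selects the pickle protocol version, which can matter when the file must be read by an older Python. The sketch below assumes protocol 4 is wanted and writes to a temporary directory. The file name dataframe_protocol4.pkl is made up for this illustration.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this example
    path = os.path.join(tmp_dir, "dataframe_protocol4.pkl")

    # Write with an explicit pickle protocol instead of the default
    df.to_pickle(path, protocol=4)

    # The protocol is detected automatically when reading
    restored = pd.read_pickle(path)

print(restored.equals(df))
```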

Loading a Pickled Pandas Object

To load a pickled file back into a Pandas object (a Series or a DataFrame), use the read_pickle() function from Pandas. It deserializes the byte stream and recreates the original Pandas object.

Loading pickled data from untrusted sources is risky, because unpickling can execute arbitrary code. Only deserialize pickle files whose source you trust.
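To see why the warning matters, the following harmless sketch shows that the creator of a pickle decides which callable runs at load time. NotReallyData is a hypothetical class written for this demonstration. It only calls print, but a hostile file could name any importable function instead.

```python
import pickle


class NotReallyData:
    """Hypothetical stand-in for a hostile payload; it only calls print."""

    def __reduce__(self):
        # Tells pickle to call print("this ran during unpickling") on load
        return (print, ("this ran during unpickling",))


payload = pickle.dumps(NotReallyData())

# Loading the bytes invokes the callable chosen by whoever created them
result = pickle.loads(payload)

# The caller gets back the return value of print, not a NotReallyData object
print(result is None)
```

Since read_pickle() relies on the same mechanism, the same caution applies to any .pkl file received from someone else.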

Example

This example loads a Pandas Series object from a pickle file using the Pandas read_pickle() function.

    import pandas as pd

    # Create a Pandas Series
    s = pd.Series([1, 2, 3, 4], index=["cat", "dog", "fish", "mouse"])

    # Display the input Series
    print("Original Series:")
    print(s)

    # Save the Series as a pickle file
    s.to_pickle("series_read_pickle_file.pkl")

    # Load the Series from the pickle file
    unpickled_series = pd.read_pickle("series_read_pickle_file.pkl")
    print("\nLoaded Series:")
    print(unpickled_series)

Executing the above code produces the following output:

    Original Series:
    cat      1
    dog      2
    fish     3
    mouse    4
    dtype: int64

    Loaded Series:
    cat      1
    dog      2
    fish     3
    mouse    4
    dtype: int64
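Because read_pickle() recreates the original object, the index and the column dtypes should survive the round trip. The sketch below checks this on a small frame with datetime, categorical, and float columns. The column names and the file name typed_frame.pkl are invented for the example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame mixing several dtypes
df = pd.DataFrame(
    {
        "when": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
        "kind": pd.Categorical(["a", "b", "a"]),
        "value": [1.5, 2.5, 3.5],
    },
    index=["x", "y", "z"],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "typed_frame.pkl")
    df.to_pickle(path)
    restored = pd.read_pickle(path)

# Raises AssertionError if values, index, or dtypes differ
pd.testing.assert_frame_equal(restored, df)
print(restored.dtypes)
```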

Working with Compressed Pickle Files

Pandas pickling functionality also supports reading and writing compressed pickle files. The following compression formats are supported:

  • gzip

  • bz2

  • xz

  • zstd

You can specify the compression format explicitly, or let Pandas infer it from the file extension. To have it inferred, set the compression parameter of to_pickle() and read_pickle() to 'infer', which is also the default value.
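As a rough illustration of the explicit form, the sketch below writes the same DataFrame once without compression and once with each codec that ships with the Python standard library, then compares file sizes. The zstd format is left out because it needs the optional zstandard package. The sample data and file names are assumptions made for this example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data with plenty of repetition
df = pd.DataFrame({"Col_1": range(1000), "Col_2": ["repeated text"] * 1000})

# Compression name mapped to its conventional file suffix
formats = {"gzip": ".gz", "bz2": ".bz2", "xz": ".xz"}

sizes = {}
round_trips = {}
with tempfile.TemporaryDirectory() as tmp_dir:
    # Baseline: no compression at all
    plain_path = os.path.join(tmp_dir, "frame.pkl")
    df.to_pickle(plain_path, compression=None)
    sizes["none"] = os.path.getsize(plain_path)

    for name, suffix in formats.items():
        path = os.path.join(tmp_dir, "frame.pkl" + suffix)
        # Pass the compression name explicitly on both write and read
        df.to_pickle(path, compression=name)
        sizes[name] = os.path.getsize(path)
        round_trips[name] = pd.read_pickle(path, compression=name).equals(df)

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

The exact sizes depend on the data and the Pandas version, so they are printed rather than stated here.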

Example: Saving a Compressed Pickle File

The following example saves a DataFrame to a compressed pickle file using gzip compression.

    import pandas as pd

    # Create a DataFrame
    df = pd.DataFrame({"Col_1": range(5), "Col_2": range(5, 10)})
    print("Original DataFrame:")
    print(df)

    # Save the DataFrame to a pickle file with gzip compression
    df.to_pickle("dataframe_compressed.pkl.gz", compression="gzip")
    print("\nDataFrame is successfully saved as a pickle file with gzip compression.")

The above code produces the following output:

    Original DataFrame:
       Col_1  Col_2
    0      0      5
    1      1      6
    2      2      7
    3      3      8
    4      4      9

    DataFrame is successfully saved as a pickle file with gzip compression.

Example: Loading a Compressed Pickle File

The following example uses the read_pickle() function to read the gzip-compressed pickle file back into a Pandas object. To load a compressed pickle file, you can either pass the compression type explicitly or, as in this example, rely on the default inference from the .gz file extension.

    import pandas as pd

    # Create a DataFrame
    df = pd.DataFrame({"Col_1": range(5), "Col_2": range(5, 10)})
    print("Original DataFrame:")
    print(df)

    # Save the DataFrame to a pickle file with gzip compression
    df.to_pickle("dataframe_compressed.pkl.gz", compression="gzip")

    # Load the compressed file
    compressed_df = pd.read_pickle("dataframe_compressed.pkl.gz")
    print("\nLoaded Compressed DataFrame:")
    print(compressed_df)

The above code produces the following output:

    Original DataFrame:
       Col_1  Col_2
    0      0      5
    1      1      6
    2      2      7
    3      3      8
    4      4      9

    Loaded Compressed DataFrame:
       Col_1  Col_2
    0      0      5
    1      1      6
    2      2      7
    3      3      8
    4      4      9

Inferring Compression Type from a Pickle File Extension

Pandas automatically detects the compression type when the compression parameter is set to 'infer' and the file name ends with the .gz, .bz2, .xz, or .zst extension.
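One way to confirm that the extension really drives the choice is to inspect the first two bytes of the written file, since gzip streams start with the bytes 1f 8b. The sketch below relies on the default compression setting and uses file names invented for the example.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"Col_1": [1, 2, 3], "Col_2": ["a", "b", "c"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    gz_path = os.path.join(tmp_dir, "frame.pkl.gz")
    plain_path = os.path.join(tmp_dir, "frame.pkl")

    # No compression argument: the default "infer" looks at the extension
    df.to_pickle(gz_path)
    df.to_pickle(plain_path)

    # Read the leading bytes of each file
    with open(gz_path, "rb") as handle:
        gz_header = handle.read(2)
    with open(plain_path, "rb") as handle:
        plain_header = handle.read(2)

    # Reading also infers gzip from the .gz suffix
    restored = pd.read_pickle(gz_path)

print(gz_header == b"\x1f\x8b")
print(plain_header == b"\x1f\x8b")
```

Only the .gz file should carry the gzip signature. The plain .pkl file starts with the pickle protocol marker instead.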

Example

This example sets the compression parameter to 'infer' so that the compression type is inferred automatically from the file extension.

    import pandas as pd

    # Create a DataFrame
    df = pd.DataFrame({"Col_1": [1, 2, 3, 4, 5], "Col_2": ["a", "b", "c", "d", "e"]})
    print("Original DataFrame:")
    print(df)

    # Save with inferred compression type
    df.to_pickle("dataframe.pkl.xz", compression="infer")

    # Load with inferred compression type
    df_loaded_inferred = pd.read_pickle("dataframe.pkl.xz", compression="infer")
    print("\nLoaded with inferred compression type:")
    print(df_loaded_inferred)

The above code produces the following output:

    Original DataFrame:
       Col_1 Col_2
    0      1     a
    1      2     b
    2      3     c
    3      4     d
    4      5     e

    Loaded with inferred compression type:
       Col_1 Col_2
    0      1     a
    1      2     b
    2      3     c
    3      4     d
    4      5     e