Python Pandas Tutorial

Python Pandas to_hdf() Method



The to_hdf() method in Python's Pandas library stores a DataFrame or Series in an HDF5 file. HDF5 (Hierarchical Data Format version 5) is a high-performance file format designed for storing large datasets and reading or writing them efficiently. Using this method, you can save Series or DataFrames to disk in an organized, hierarchical layout, optionally with compression.

HDF5 files are widely used for scientific data, where large datasets need to be stored and accessed efficiently. With the to_hdf() method, you can choose the storage format, compression level, file mode, and key name used to save a pandas object. The method relies on the PyTables package (installed as tables), which must be available alongside pandas.

Syntax

The syntax of the to_hdf() method is as follows −

DataFrame.to_hdf(path_or_buf, *, key, mode='a', complevel=None, complib=None, append=False, format=None, index=True, min_itemsize=None, nan_rep=None, dropna=None, data_columns=None, errors='strict', encoding='UTF-8')

When using the to_hdf() method on a Series object, you should call it as Series.to_hdf().
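As a minimal sketch of the Series form, the snippet below writes a small Series and reads it back with pd.read_hdf(). The file name, key, and sample values are assumptions made for illustration, and the PyTables package is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file name and key, chosen only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "series_demo.h5")

ages = pd.Series([25, 30, 35], index=["Kiran", "Priya", "Naveen"], name="Age")

# Series.to_hdf() accepts the same arguments as DataFrame.to_hdf().
ages.to_hdf(path, key="ages", mode="w")

# read_hdf() returns the kind of object that was stored, here a Series.
restored = pd.read_hdf(path, key="ages")
print(type(restored).__name__)
print(restored)
```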

Parameters

The Python Pandas to_hdf() method accepts the below parameters −

  • path_or_buf: File path or HDFStore object where the HDF5 file will be saved.

  • key: Identifier for the group in the HDF5 file.

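  • mode: Specifies the mode used to open the file. 'w' creates a new file and overwrites an existing file with the same name, 'a' (default) opens an existing file for reading and writing and creates it if it does not exist, and 'r+' behaves like 'a' but requires the file to already exist.

  • format: The storage format, either 'fixed' (default; fast to write and read, but neither appendable nor searchable) or 'table' (slower, but supports appending rows and selecting subsets of the data).

  • index: Boolean indicating whether to include the DataFrame's index in the file. Defaults to True.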

  • complevel: Specifies the compression level (0-9). Higher values mean more compression but slower performance.

  • complib: Specifies the compression library to use, such as 'zlib', 'bzip2', 'lzo', or 'blosc'.

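  • append: Boolean indicating whether to append the input data to an existing dataset stored under the same key. Appending requires format='table'. Defaults to False.

  • data_columns: List of columns to create as indexed data columns for on-disk queries, or True to use all columns. Applies only to format='table'.

  • min_itemsize: Integer or dictionary mapping column names to the minimum string size reserved for those columns.

  • nan_rep: String used to represent missing values.

  • dropna: Boolean indicating whether missing values are removed when writing.

  • errors, encoding: Control how text is encoded and how encoding errors are handled. They default to 'strict' and 'UTF-8'.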
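The interaction between the append and format parameters can be sketched as follows. This is an illustrative example rather than part of the original tutorial: the file name, key, and numeric sample data are assumptions, and PyTables is assumed to be installed. Numeric columns are used so that string column widths do not come into play.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file name and key, chosen only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "append_demo.h5")

first = pd.DataFrame({"Id": [1, 2], "Score": [88, 92]})
second = pd.DataFrame({"Id": [3], "Score": [95]})

# The dataset must use format='table' for rows to be appended later.
first.to_hdf(path, key="scores", mode="w", format="table")

# append=True adds rows to the existing 'scores' dataset instead of replacing it.
second.to_hdf(path, key="scores", mode="a", format="table", append=True)

combined = pd.read_hdf(path, key="scores")
print(combined)
```

Each frame keeps its own index labels, so the combined result here contains the label 0 twice.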

Return Value

The to_hdf() method returns None. It writes the DataFrame or Series to the specified HDF5 file.

Example: Saving a DataFrame to an HDF5 File

This example demonstrates how to save a Pandas DataFrame to an HDF5 file using the to_hdf() method.

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Save DataFrame to HDF5 file
df.to_hdf('data.h5', key='dataset')
print("DataFrame saved to 'data.h5' successfully...")

Following is the output of the above code −

DataFrame saved to 'data.h5' successfully...

When you run the above code, the DataFrame is saved in the HDF5 file named 'data.h5' under the 'dataset' key.
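To confirm what was written, the stored object can be loaded again with pd.read_hdf(). The sketch below repeats the save step in a temporary directory so that it runs on its own; the temporary path is an assumption for illustration, and PyTables is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "data.h5")

df = pd.DataFrame({
    "Name": ["Kiran", "Priya", "Naveen"],
    "Age": [25, 30, 35],
    "City": ["New Delhi", "Hyderabad", "Chennai"],
})
df.to_hdf(path, key="dataset", mode="w")

# Load the object stored under the 'dataset' key.
restored = pd.read_hdf(path, key="dataset")
print(restored)
```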

Example: Saving DataFrame to Compressed HDF5

This example demonstrates how to save a DataFrame to an HDF5 file with compression. Here we will specify the "zlib" compression.

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Score': [88, 92, 95]}
df = pd.DataFrame(data)

# Save with compression
df.to_hdf('compressed_data.h5', key='dataset', mode='w', complevel=5, complib='zlib')
print("DataFrame saved with compression to 'compressed_data.h5'")

This saves the DataFrame to 'compressed_data.h5' with zlib compression at level 5 and prints the following message −

DataFrame saved with compression to 'compressed_data.h5'
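The effect of compression is hard to see on a three-row DataFrame. As a rough sketch, the snippet below writes a larger and deliberately repetitive DataFrame twice, once without and once with zlib compression, and compares the file sizes. The data, file names, and row count are assumptions for illustration, and the sizes you see will depend on your data and library versions.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical file names in a temporary directory.
tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "plain.h5")
packed_path = os.path.join(tmp_dir, "packed.h5")

# Highly repetitive data compresses well, which makes the difference visible.
df = pd.DataFrame({"Score": np.zeros(100_000, dtype="int64")})

df.to_hdf(plain_path, key="dataset", mode="w")
df.to_hdf(packed_path, key="dataset", mode="w", complevel=5, complib="zlib")

plain_size = os.path.getsize(plain_path)
packed_size = os.path.getsize(packed_path)
print(f"uncompressed: {plain_size} bytes, compressed: {packed_size} bytes")
```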

Example: Appending Pandas Data to an Existing HDF5 File

This example shows how to add a new DataFrame to an existing HDF5 file under the 'new_dataset' key. With mode='a', the file is opened without discarding its existing contents, so the new data is stored alongside the 'dataset' key in the 'compressed_data.h5' file created in the previous example.

import pandas as pd

# Create a new DataFrame
new_data = {'Name': ['Suman', 'Dev'], 'Score': [45, 76]}
new_df = pd.DataFrame(new_data)

# Add to the existing HDF5 file under a new key
new_df.to_hdf('compressed_data.h5', key='new_dataset', mode='a')
print("DataFrame appended to 'compressed_data.h5'")

The above code adds the new DataFrame to 'compressed_data.h5' under the key 'new_dataset' and prints the following message −

DataFrame appended to 'compressed_data.h5'
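One way to check that both objects now live in the same file is to list its keys with pd.HDFStore. The sketch below rebuilds a comparable two-key file in a temporary directory so that it runs independently; the path is an assumption for illustration, and PyTables is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "compressed_data.h5")

df = pd.DataFrame({"Name": ["Kiran", "Priya", "Naveen"], "Score": [88, 92, 95]})
new_df = pd.DataFrame({"Name": ["Suman", "Dev"], "Score": [45, 76]})

# mode='w' creates the file; mode='a' then adds a second key to it.
df.to_hdf(path, key="dataset", mode="w", complevel=5, complib="zlib")
new_df.to_hdf(path, key="new_dataset", mode="a")

# HDFStore.keys() lists the stored objects; each key is reported with a leading '/'.
with pd.HDFStore(path, mode="r") as store:
    keys = sorted(store.keys())

print(keys)
```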

Example: Saving a DataFrame in Table Format to an HDF5 File

The following example demonstrates saving a DataFrame to an HDF5 file in table format.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'City': ['New Delhi', 'Chennai', 'Hyderabad'], 'Population': [8.4, 9.0, 13.9]})

# Save in table format
df.to_hdf('table_data.h5', key='table', mode='w', format='table')
print("DataFrame saved in table format.")

Output of the above code is as follows −

DataFrame saved in table format.
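A practical benefit of table format is that only part of the stored data needs to be read back. As a sketch, the snippet below selects rows by index and a single column through the where and columns arguments of pd.read_hdf(). The temporary path and the chosen condition are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

# Temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "table_data.h5")

df = pd.DataFrame({
    "City": ["New Delhi", "Chennai", "Hyderabad"],
    "Population": [8.4, 9.0, 13.9],
})
df.to_hdf(path, key="table", mode="w", format="table")

# The index of a table-format dataset can be used in a where condition,
# and columns limits which columns are returned.
subset = pd.read_hdf(path, key="table", where="index >= 1", columns=["Population"])
print(subset)
```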

Example: Specifying Columns for Querying while Saving to HDF5 file

This example demonstrates saving a DataFrame with specific columns marked as data columns. Data columns are indexed on disk, so they can be used in query conditions when the data is read back. This option requires format='table'.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']})

# Save with data columns
df.to_hdf('queryable_data.h5', key='queryable', mode='w', format='table', data_columns=['Name'])
print("DataFrame saved with queryable columns.")

While executing the above code we obtain the following output −

DataFrame saved with queryable columns.
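Once a column has been saved as a data column, it can be filtered on disk when reading. The following sketch repeats the save step in a temporary directory and then selects a single row by name through the where argument of pd.read_hdf(). The path and the query value are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

# Temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "queryable_data.h5")

df = pd.DataFrame({
    "Name": ["Kiran", "Priya", "Naveen"],
    "Age": [25, 30, 35],
    "City": ["New Delhi", "Hyderabad", "Chennai"],
})
df.to_hdf(path, key="queryable", mode="w", format="table", data_columns=["Name"])

# Only rows matching the condition on the 'Name' data column are loaded.
result = pd.read_hdf(path, key="queryable", where='Name == "Priya"')
print(result)
```

Columns that were not listed in data_columns, such as 'Age' here, cannot be used in a where condition.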