Python Pandas Tutorial

Python Pandas read_hdf() Method



The read_hdf() method in Python's Pandas library reads data from HDF5 (Hierarchical Data Format) files into a Pandas object such as a Series or DataFrame. HDF5 is a widely used file format that efficiently stores large datasets, metadata, and heterogeneous data.

The read_hdf() method simplifies loading HDF5 data into Pandas for analysis and manipulation. It also provides options for querying and filtering the stored data as it is read. This method only reads files on the local file system; remote URLs and file-like objects are not supported.

Syntax

The syntax of the read_hdf() method is as follows −

pandas.read_hdf(path_or_buf, key=None, mode='r', errors='strict', where=None, start=None, stop=None, columns=None, iterator=False, chunksize=None, **kwargs)

Parameters

The Python Pandas read_hdf() method accepts the following parameters −

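  • path_or_buf: A string path, path object, or open pandas.HDFStore identifying the HDF5 file to read. Only the local file system is supported.

  • key: The identifier for the dataset or table within the HDF5 file. It can be omitted if the file contains a single Pandas object.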

  • mode: Specifies the mode to open the file. Common values are 'r' (read-only), 'r+' (read/write), and 'a' (append).

  • errors: Specifies how to handle errors while encoding and decoding.

  • where: Conditions to filter data (like SQL WHERE clause).

  • start: Specifies starting row for loading data.

  • stop: Specifies ending row for loading data.

  • columns: Specific columns to load from the HDF5 dataset.

  • iterator: If True, returns an iterator for reading data in chunks.

  • chunksize: Number of rows per chunk when iterating. Setting it returns an iterator even if iterator is left as False.

  • **kwargs: Additional keyword arguments passed to HDFStore.
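
The start and stop parameters are not covered by the examples further down, so here is a minimal sketch of reading a row range. The sample data, the "people" key, and the temporary file name are assumptions made for illustration; the sketch also assumes the PyTables package (tables) is installed, which Pandas needs for HDF5 support.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "Name": ["Kiran", "Priya", "Naveen", "Suman"],
    "Age": [25, 30, 35, 28],
})

# Write to a temporary directory so the sketch does not touch existing files.
path = os.path.join(tempfile.mkdtemp(), "rows.h5")
df.to_hdf(path, key="people", mode="w", format="table")

# start is inclusive and stop is exclusive, like a Python slice,
# so this selects the rows at positions 1 and 2.
subset = pd.read_hdf(path, key="people", start=1, stop=3)
print(subset)
```

Here the rows for Priya and Naveen are returned, and the original index labels (1 and 2) are preserved.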

Return Value

The Pandas read_hdf() method returns a Pandas object containing the data from the HDF5 file.

Example: Reading a Simple HDF5 File

Let's see a basic example that reads an entire HDF5 file using the pandas read_hdf() method.

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Save DataFrame to HDF5 file (mode='w' starts a fresh file holding one object)
df.to_hdf('data.h5', key='dataset', mode='w')

# Reading an HDF5 file; key can be omitted because the file holds a single object
df = pd.read_hdf('data.h5')

print("DataFrame from HDF5 File:")
print(df)

Following is an output of the above code −

DataFrame from HDF5 File:
     Name  Age       City
0   Kiran   25  New Delhi
1   Priya   30  Hyderabad
2  Naveen   35    Chennai

Example: Reading HDF5 Data with Specific Key

This example shows how to read a specific dataset or table from an HDF5 file using the key parameter. It first saves two DataFrames to the "data.h5" file under the "dataset_1" and "dataset_2" keys, then retrieves one of them by its key.

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Save DataFrame to HDF5 file under dataset_1 key
df.to_hdf('data.h5', key='dataset_1')

# Create a new DataFrame
new_data = {'Name': ['Suman', 'Dev'], 'Score': [45, 76]}
new_df = pd.DataFrame(new_data)

# Append to existing HDF5 file under dataset_2 key
new_df.to_hdf('data.h5', key='dataset_2', mode='a')

# Reading specific key from HDF5 file
result = pd.read_hdf('data.h5', key='dataset_1')

print("DataFrame for Key 'dataset_1':")
print(result)

While executing the above code, you will get the following output −

DataFrame for Key 'dataset_1':
     Name  Age       City
0   Kiran   25  New Delhi
1   Priya   30  Hyderabad
2  Naveen   35    Chennai
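
When the keys stored in a file are not known in advance, they can be listed before calling read_hdf(). The following is a minimal sketch that opens the file with pandas.HDFStore and calls its keys() method. The file name and the sample data are assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file in a temporary directory, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "keys_demo.h5")

# Store two small DataFrames under different keys.
pd.DataFrame({"Age": [25, 30]}).to_hdf(path, key="dataset_1", mode="w")
pd.DataFrame({"Score": [45, 76]}).to_hdf(path, key="dataset_2", mode="a")

# Open the store read-only and list the keys it contains.
with pd.HDFStore(path, mode="r") as store:
    keys = store.keys()

print(keys)

# Each listed key can be passed to read_hdf() as-is.
frames = {key: pd.read_hdf(path, key=key) for key in keys}
```

The keys are reported with a leading slash (for example "/dataset_1"), and read_hdf() accepts them in that form.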

Example: Querying Data While Reading HDF5 File

Here is an example that filters data while reading an HDF5 file using the where parameter. The data must be saved with format='table', and the queried columns must be stored as data columns (here via data_columns=True).

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Save DataFrame to HDF5 file under dataset_1 key
df.to_hdf('example_data.h5', format='table', key='dataset_1', data_columns=True)

# Reading HDF5 data while querying
result = pd.read_hdf('example_data.h5', 'dataset_1', where='Age < 32')

print("Filtered DataFrame:")
print(result)

Following is an output of the above code −

Filtered DataFrame:
    Name  Age       City
0  Kiran   25  New Delhi
1  Priya   30  Hyderabad
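
The where and columns parameters can also be combined in a single call. The sketch below is one way to do this, using a compound condition joined with &. The file name, the "people" key, and the age range are assumptions chosen for illustration.

```python
import os
import tempfile

import pandas as pd

# Same sample data as the example above.
df = pd.DataFrame({
    "Name": ["Kiran", "Priya", "Naveen"],
    "Age": [25, 30, 35],
    "City": ["New Delhi", "Hyderabad", "Chennai"],
})

# Hypothetical file in a temporary directory, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "query_demo.h5")

# Table format with data_columns=True makes every column queryable.
df.to_hdf(path, key="people", mode="w", format="table", data_columns=True)

# Keep rows with 30 <= Age < 35 and return only two of the columns.
result = pd.read_hdf(
    path,
    key="people",
    where="(Age >= 30) & (Age < 35)",
    columns=["Name", "Age"],
)
print(result)
```

Only the row for Priya satisfies both conditions, and the City column is left out of the result.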

Example: Reading Specific Columns

Here is another example that loads only specific columns from an HDF5 file using the columns parameter of the read_hdf() method.

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Save DataFrame to HDF5 file under dataset_1 key
df.to_hdf('example_data.h5', format='table', key='dataset_1')

# Reading specific columns from an HDF5 file
df = pd.read_hdf('example_data.h5', key='dataset_1', columns=['Name', 'City'])

print("DataFrame from HDF5 file with Specific Columns:")
print(df)

Upon executing the above code you will get the following output −

DataFrame from HDF5 file with Specific Columns:
     Name       City
0   Kiran  New Delhi
1   Priya  Hyderabad
2  Naveen    Chennai

Example: Reading HDF5 Data in Chunks

You can use the chunksize parameter to read large datasets in smaller chunks. Chunked reading requires the data to be stored in table format. The following example demonstrates this.

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Save DataFrame to HDF5 file under dataset_1 key
df.to_hdf('example_data.h5', format='table', key='dataset_1')

# Reading HDF5 data in chunks
chunk_iterator = pd.read_hdf('example_data.h5', key='dataset_1', chunksize=1)

for chunk in chunk_iterator:
    print("Chunk DataFrame:")
    print(chunk)

Following is an output of the above code −

Chunk DataFrame:
    Name  Age       City
0  Kiran   25  New Delhi
Chunk DataFrame:
    Name  Age       City
1  Priya   30  Hyderabad
Chunk DataFrame:
     Name  Age     City
2  Naveen   35  Chennai
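
Chunked reading is most useful when a result is accumulated across chunks, so the full dataset never has to sit in memory at once. The following is a minimal sketch that computes the mean age chunk by chunk. The file name, the "people" key, and the chunk size of 2 are assumptions chosen for illustration.

```python
import os
import tempfile

import pandas as pd

# Same sample data as the example above.
df = pd.DataFrame({
    "Name": ["Kiran", "Priya", "Naveen"],
    "Age": [25, 30, 35],
    "City": ["New Delhi", "Hyderabad", "Chennai"],
})

# Hypothetical file in a temporary directory, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "chunks_demo.h5")
df.to_hdf(path, key="people", mode="w", format="table")

# Accumulate a running total and row count instead of keeping the chunks.
total_age = 0
row_count = 0
for chunk in pd.read_hdf(path, key="people", chunksize=2):
    total_age += int(chunk["Age"].sum())
    row_count += len(chunk)

mean_age = total_age / row_count
print("Mean age:", mean_age)
```

With three rows and a chunk size of 2, the loop runs twice: once for the first two rows and once for the last row.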