
- Python Pandas - Home
- Python Pandas - Introduction
- Python Pandas - Environment Setup
- Python Pandas - Basics
- Python Pandas - Introduction to Data Structures
- Python Pandas - Index Objects
- Python Pandas - Panel
- Python Pandas - Basic Functionality
- Python Pandas - Indexing & Selecting Data
- Python Pandas - Series
- Python Pandas - Series
- Python Pandas - Slicing a Series Object
- Python Pandas - Attributes of a Series Object
- Python Pandas - Arithmetic Operations on Series Object
- Python Pandas - Converting Series to Other Objects
- Python Pandas - DataFrame
- Python Pandas - DataFrame
- Python Pandas - Accessing DataFrame
- Python Pandas - Slicing a DataFrame Object
- Python Pandas - Modifying DataFrame
- Python Pandas - Removing Rows from a DataFrame
- Python Pandas - Arithmetic Operations on DataFrame
- Python Pandas - IO Tools
- Python Pandas - IO Tools
- Python Pandas - Working with CSV Format
- Python Pandas - Reading & Writing JSON Files
- Python Pandas - Reading Data from an Excel File
- Python Pandas - Writing Data to Excel Files
- Python Pandas - Working with HTML Data
- Python Pandas - Clipboard
- Python Pandas - Working with HDF5 Format
- Python Pandas - Comparison with SQL
- Python Pandas - Data Handling
- Python Pandas - Sorting
- Python Pandas - Reindexing
- Python Pandas - Iteration
- Python Pandas - Concatenation
- Python Pandas - Statistical Functions
- Python Pandas - Descriptive Statistics
- Python Pandas - Working with Text Data
- Python Pandas - Function Application
- Python Pandas - Options & Customization
- Python Pandas - Window Functions
- Python Pandas - Aggregations
- Python Pandas - Merging/Joining
- Python Pandas - MultiIndex
- Python Pandas - Basics of MultiIndex
- Python Pandas - Indexing with MultiIndex
- Python Pandas - Advanced Reindexing with MultiIndex
- Python Pandas - Renaming MultiIndex Labels
- Python Pandas - Sorting a MultiIndex
- Python Pandas - Binary Operations
- Python Pandas - Binary Comparison Operations
- Python Pandas - Boolean Indexing
- Python Pandas - Boolean Masking
- Python Pandas - Data Reshaping & Pivoting
- Python Pandas - Pivoting
- Python Pandas - Stacking & Unstacking
- Python Pandas - Melting
- Python Pandas - Computing Dummy Variables
- Python Pandas - Categorical Data
- Python Pandas - Categorical Data
- Python Pandas - Ordering & Sorting Categorical Data
- Python Pandas - Comparing Categorical Data
- Python Pandas - Handling Missing Data
- Python Pandas - Missing Data
- Python Pandas - Filling Missing Data
- Python Pandas - Interpolation of Missing Values
- Python Pandas - Dropping Missing Data
- Python Pandas - Calculations with Missing Data
- Python Pandas - Handling Duplicates
- Python Pandas - Duplicated Data
- Python Pandas - Counting & Retrieving Unique Elements
- Python Pandas - Duplicated Labels
- Python Pandas - Grouping & Aggregation
- Python Pandas - GroupBy
- Python Pandas - Time-series Data
- Python Pandas - Date Functionality
- Python Pandas - Timedelta
- Python Pandas - Sparse Data Structures
- Python Pandas - Sparse Data
- Python Pandas - Visualization
- Python Pandas - Visualization
- Python Pandas - Additional Concepts
- Python Pandas - Caveats & Gotchas
Python Pandas - Reading Data from an Excel File
The Pandas library provides powerful tool for data manipulation and analysis. Among its many features, it offers the ability to read and write data to Excel files easily.Excel files are widely used to store and organize data in tabular format, making them an excellent choice for analysis. Excel files can have multiple sheets, each containing rows and columns of data. Excel files usually come with extensions like.xls or.xlsx.
Pandas provides easy-to-use methods for working with Excel data directly in Python. One of such method is theread_excel() method, which reads Excel files and loads the data into a Pandas DataFrame.
In this tutorial, we will learn how to read data from Excel files using thepandas.read_excel() method, covering different scenarios like loading a single sheet, specific sheets, and multiple sheets.
Supported Excel File Formats in Pandas
Pandas uses different backends to read various Excel file formats −
TheExcel 2007+ (.xlsx) files can be read using theopenpyxl Python module.
TheExcel 2003 (.xls) files can be read using thexlrd module.
TheBinary Excel (.xlsb) files can be read using thepyxlsb module.
Additionally, all formats can be read using the Calamine engine.
Note: Please make sure that you have installed the required packages (xlrd and openpyxl) in your system. If these packages are not installed, use the following commands to install them −pip install openpyxlorpip install xlrd
Reading an Excel File in Pandas
Thepandas.read_excel() method reads Excel files and loads the data into a Pandas DataFrame. This method supports multiple Excel file formats like,.xls,.xlsx,.xlsm, and more from a local filesystem or a URL.
Example
Here is a simple example of reading a local system Excel file into a DataFrame using thepandas.read_excel() method by specifying the file path.
import pandas as pd# Read an Excel filedf = pd.read_excel('data.xlsx')# Print the DataFrameprint('Output DataFrame:')print(df)Following is the output of the above code −
Output DataFrame:
| Sr.no | Name | Gender | Age | |
|---|---|---|---|---|
| 0 | 1 | Braund | female | 38 |
| 1 | 2 | Cumings | male | 22 |
| 2 | 3 | Heikkin | female | 35 |
| 3 | 4 | Futrelle | female | 26 |
Reading a Specific Sheet from an Excel file
The Excel files may contain multiple sheets with different names. To read a specific sheet into a Pandas DataFrame, you can specify the sheet name or index to thesheet_name parameter of thepandas.read_excel() method.
Example
The following example demonstrates how to read a specific sheet from an Excel file into a Pandas DataFrame usingpandas.read_excel() method. Here we will specify the sheet name to thesheet_name parameter to read that specific sheet.
import pandas as pd# Read a specific sheetdf = pd.read_excel('data.xlsx', sheet_name="Sheet_2")# Print the DataFrameprint('Output DataFrame:')print(df)Following is the output of the above code −
Output DataFrame:
| Name | Value | |
|---|---|---|
| 0 | string1 | 1 |
| 1 | string2 | 3 |
| 2 | Comment | 5 |
Reading Multiple Sheets as a Dictionary of DataFrames
If an Excel file contains multiple sheets and you need to read few of them into the Pandas DataFrame, you can pass a list of sheet names or indices to thesheet_name parameter of thepandas.read_excel() method.
Example
This example uses thepandas.read_excel() method to read the multiple sheets in an Excel file to a dictionary of DataFrames.
import pandas as pd# Read multiple sheetsdf = pd.read_excel('data.xlsx', sheet_name=[0, 1])# Print the DataFrameprint('Output Dict of DataFrames:')print(df)Following is the output of the above code −
Output Dict of DataFrames:{0: Sr.no Name Gender Age0 1 Braund female 381 2 Cumings male 222 3 Heikkin female 353 4 Futrelle female 26, 1: Name Value0 string1 11 string2 32 Comment 5}Reading MultiIndex Data from Excel
You can readMultiIndexed data from an Excel file using thepandas.read_excel() method, which is useful for working with hierarchical data. By specifying the lists of columns forindex_col and lists of rows forheader parameters to handle MultiIndex indices and columns.
Example
This example uses theto_excel() method andpandas.read_excel() method create an excel sheet withMultiIndexed data and read it back to Pandas DataFrame respectively.
import pandas as pd# Create a MultiIndex objectindex = pd.MultiIndex.from_tuples([('A', 'one'), ('A', 'two'), ('B', 'one'), ('B', 'two')])# Create a DataFramedata = [[1, 2], [3, 4], [5, 6], [7, 8]]df = pd.DataFrame(data, index=index, columns=['X', 'Y'])df.to_excel("multiindex_data.xlsx")# Read MultiIndex rows and columnsdf = pd.read_excel("multiindex_data.xlsx", index_col=[0, 1])print('Output DataFrame from Excel File:')print(df)Following is the output of the above code −
Output DataFrame from Excel File:
| X | Y | ||
|---|---|---|---|
| A | one | 1 | 2 |
| two | 3 | 4 | |
| B | one | 5 | 6 |
| two | 7 | 8 |