Python Pandas - Reindexing

Reindexing is a powerful and fundamental operation in Pandas that allows you to align your data with a new set of labels. Whether you're working with rows or columns, reindexing gives you control over how your data aligns with the labels you specify.

This operation is especially useful when working with time series data, aligning datasets from different sources, or simply reorganizing data to match a particular structure.

What is Reindexing?

Reindexing in Pandas refers to the process of conforming your data to match a new set of labels along a specified axis (rows or columns). This process can accomplish several tasks:

Reordering: Reorder the existing data to match a new set of labels.
Inserting Missing Values: If a label in the new set does not exist in the original data, Pandas will insert a missing value (NaN) for that label.
Filling Missing Data: You can specify how to fill in missing values that result from reindexing, using various filling methods.

The reindex() method is the primary tool for performing reindexing in Pandas. It allows you to modify the row and column labels of Pandas data structures.
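As a minimal sketch of the three tasks listed above, the snippet below uses a small hand-written Series so the result is reproducible. The data and variable names are made up for illustration. The fill_value parameter is a standard reindex() argument that this page does not otherwise cover; it is shown here only as one way to fill labels that would otherwise become NaN.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Reordering: every requested label already exists in the original index.
reordered = s.reindex(["c", "a", "b"])

# Inserting missing values: "z" is not in the original index, so it becomes NaN.
with_missing = s.reindex(["a", "z"])

# Filling: fill_value replaces the NaN that would otherwise be inserted for "z".
filled = s.reindex(["a", "z"], fill_value=0)

print(reordered)
print(with_missing)
print(filled)
```

In each case reindex() returns a new Series; the original s is left unchanged.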

Key Methods Used in Reindexing

reindex(): This method is used to align an existing data structure with a new index (or columns). It can reorder and/or insert missing labels.
reindex_like(): This method allows you to reindex one DataFrame or Series to match another. It's useful when you want to ensure two data structures are aligned similarly.
Filling Methods: When reindexing introduces NaN values, you can fill them using methods like ffill, bfill, and nearest.

Example: Reindexing a Pandas Series

The following example demonstrates reindexing a Pandas Series object using the reindex() method. In this case, the "f" label was not present in the original Series, so it appears as NaN in the output reindexed Series.

import pandas as pd
import numpy as np
s = pd.Series(np.random.randn(5), index=["a", "b", "c", "d", "e"])
print("Original Series:\n", s)
s_reindexed = s.reindex(["e", "b", "f", "d"])
print('\nOutput Reindexed Series:\n', s_reindexed)

On executing the above code you will get the following output:

Original Series:
 a    0.148874
b    0.592275
c   -0.903546
d    1.031230
e   -0.254599
dtype: float64

Output Reindexed Series:
 e   -0.254599
b    0.592275
f         NaN
d    1.031230
dtype: float64

Example: Reindexing a DataFrame

Consider the following example of reindexing a DataFrame using the reindex() method. With a DataFrame, you can reindex both the rows (index) and columns. Here, row label 5 and column 'B' do not exist in the original DataFrame, so they are filled with missing values (NaN, or NaT in the datetime column).

import pandas as pd
import numpy as np
N = 5
df = pd.DataFrame({
   'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
   'x': np.linspace(0, stop=N-1, num=N),
   'y': np.random.rand(N),
   'C': np.random.choice(['Low', 'Medium', 'High'], N).tolist(),
   'D': np.random.normal(100, 10, size=(N)).tolist()
})
print("Original DataFrame:\n", df)
# Reindex the DataFrame
df_reindexed = df.reindex(index=[0, 2, 5], columns=['A', 'C', 'B'])
print("\nOutput Reindexed DataFrame:\n", df_reindexed)

Its output is as follows:

Original DataFrame:
            A    x         y       C           D
0 2016-01-01  0.0  0.513990  Medium  118.143385
1 2016-01-02  1.0  0.751248     Low   91.041201
2 2016-01-03  2.0  0.332970  Medium  100.644345
3 2016-01-04  3.0  0.723816    High  108.810386
4 2016-01-05  4.0  0.376326    High  101.346443

Output Reindexed DataFrame:
            A       C   B
0 2016-01-01  Medium NaN
2 2016-01-03  Medium NaN
5        NaT     NaN NaN
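Because the example above uses random data, a smaller deterministic sketch may make the row and column behavior easier to check. The DataFrame below is hypothetical and exists only for illustration; it reindexes the rows alone, the columns alone, and then both at once.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
df = pd.DataFrame({"A": [1, 2, 3], "C": ["Low", "High", "Medium"]})

# Rows only: keep labels 2 and 0, in that order.
rows_only = df.reindex(index=[2, 0])

# Columns only: swap the column order.
cols_only = df.reindex(columns=["C", "A"])

# Both axes: row label 3 and column "B" do not exist, so they are filled with NaN.
both = df.reindex(index=[0, 3], columns=["A", "B"])

print(rows_only)
print(cols_only)
print(both)
```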

Reindex to Align with Other Objects

Sometimes, you may need to reindex one DataFrame to align it with another. The reindex_like() method allows you to do this seamlessly.

Example

The following example demonstrates how to reindex a DataFrame (df1) to match another DataFrame (df2) using the reindex_like() method.

import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(10, 3), columns=['col1', 'col2', 'col3'])
df2 = pd.DataFrame(np.random.randn(7, 3), columns=['col1', 'col2', 'col3'])
df1 = df1.reindex_like(df2)
print(df1)

Its output is as follows:

       col1       col2       col3
0 -2.467652  -1.211687  -0.391761
1 -0.287396   0.522350   0.562512
2 -0.255409  -0.483250   1.866258
3 -1.150467  -0.646493  -0.222462
4  0.152768  -2.056643   1.877233
5 -1.155997   1.528719  -1.343719
6 -1.015606  -1.245936  -0.295275

Note: reindex_like() returns a new DataFrame; here the result is assigned back to df1, so df1 now has the same row and column labels as df2 (7 rows instead of 10). The column names must match: any column of df2 that does not exist in df1 is filled entirely with NaN.
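To illustrate the point about column names, here is a minimal sketch with two small hypothetical DataFrames (left and template are made-up names) whose columns only partly overlap. It also checks that reindex_like() gives the same result as calling reindex() with the other object's index and columns.

```python
import pandas as pd

# Hypothetical frames, chosen only for illustration.
left = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]})
template = pd.DataFrame({"col1": [0, 0], "col3": [0, 0]})

# Align left to template's row and column labels.
aligned = left.reindex_like(template)
print(aligned)

# Spelling out the same alignment with reindex().
explicit = left.reindex(index=template.index, columns=template.columns)
print(aligned.equals(explicit))
```

Only col1 is shared, so it keeps its values for rows 0 and 1; col2 is dropped because template does not have it, and col3 is all NaN because left does not have it.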

Filling While Reindexing

The reindex() method provides an optional parameter named method that controls how the missing values introduced by reindexing are filled. The available methods include:

pad/ffill: Fill values forward.
bfill/backfill: Fill values backward.
nearest: Fill from the nearest index values.
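The three methods listed above can be compared side by side. The sketch below uses a small hypothetical Series with a sorted integer index (the method argument only works when the original index is monotonically increasing or decreasing) and reindexes it to labels that fall between the existing ones.

```python
import pandas as pd

# Hypothetical data with a sorted index at 0, 5 and 10.
s = pd.Series([1.0, 2.0, 3.0], index=[0, 5, 10])
new_index = [0, 2, 4, 9]

# ffill/pad: take the value of the last original label at or before each new label.
ffilled = s.reindex(new_index, method="ffill")

# bfill/backfill: take the value of the next original label at or after each new label.
bfilled = s.reindex(new_index, method="bfill")

# nearest: take the value of the closest original label.
nearest = s.reindex(new_index, method="nearest")

print(pd.DataFrame({"ffill": ffilled, "bfill": bfilled, "nearest": nearest}))
```

For label 2, for example, ffill takes the value at 0, bfill takes the value at 5, and nearest takes the value at 0 because 0 is closer to 2 than 5 is.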

Example

The following example demonstrates the working of the ffill method.

import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(6, 3), columns=['col1', 'col2', 'col3'])
df2 = pd.DataFrame(np.random.randn(2, 3), columns=['col1', 'col2', 'col3'])
# Padding NaNs
print(df2.reindex_like(df1))
# Now fill the NaNs with preceding values
print("Data Frame with Forward Fill:")
print(df2.reindex_like(df1, method='ffill'))

Its output is as follows:

       col1       col2      col3
0  1.311620  -0.707176  0.599863
1 -0.423455  -0.700265  1.133371
2       NaN        NaN       NaN
3       NaN        NaN       NaN
4       NaN        NaN       NaN
5       NaN        NaN       NaN
Data Frame with Forward Fill:
       col1       col2      col3
0  1.311620  -0.707176  0.599863
1 -0.423455  -0.700265  1.133371
2 -0.423455  -0.700265  1.133371
3 -0.423455  -0.700265  1.133371
4 -0.423455  -0.700265  1.133371
5 -0.423455  -0.700265  1.133371

Note: The last four rows are padded.

Limits on Filling While Reindexing

The limit argument provides additional control over filling while reindexing. It specifies the maximum number of consecutive missing labels that may be filled; any further labels in the same gap remain NaN.

Example

The following example shows how to specify a limit on filling:

import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(6, 3), columns=['col1', 'col2', 'col3'])
df2 = pd.DataFrame(np.random.randn(2, 3), columns=['col1', 'col2', 'col3'])
# Padding NaNs
print(df2.reindex_like(df1))
# Now fill the NaNs with preceding values, at most one row per gap
print("Data Frame with Forward Fill limiting to 1:")
print(df2.reindex_like(df1, method='ffill', limit=1))

Its output is as follows:

       col1       col2       col3
0  0.247784   2.128727   0.702576
1 -0.055713  -0.021732  -0.174577
2       NaN        NaN        NaN
3       NaN        NaN        NaN
4       NaN        NaN        NaN
5       NaN        NaN        NaN
Data Frame with Forward Fill limiting to 1:
       col1       col2       col3
0  0.247784   2.128727   0.702576
1 -0.055713  -0.021732  -0.174577
2 -0.055713  -0.021732  -0.174577
3       NaN        NaN        NaN
4       NaN        NaN        NaN
5       NaN        NaN        NaN

Note: The forward fill (ffill) is limited to only one row.
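The same behavior can be reproduced with reindex() directly. This is a minimal sketch on a hypothetical two-element Series, comparing limit=1 with limit=2 so the effect of the argument is easy to see.

```python
import pandas as pd

# Hypothetical data: two known values that will be stretched to six labels.
s = pd.Series([1.0, 2.0], index=[0, 1])

# Forward fill at most one consecutive new label.
limit_one = s.reindex(range(6), method="ffill", limit=1)

# Forward fill at most two consecutive new labels.
limit_two = s.reindex(range(6), method="ffill", limit=2)

print(limit_one)
print(limit_two)
```

With limit=1 only label 2 receives the value carried forward from label 1; with limit=2 labels 2 and 3 are filled, and the remaining labels stay NaN.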
