pandas.DataFrame.reindex

DataFrame.reindex(labels=None, *, index=None, columns=None, axis=None, method=None, copy=None, level=None, fill_value=nan, limit=None, tolerance=None)

Conform DataFrame to new index with optional filling logic.

Places NA/NaN in locations having no value in the previous index. A new object is produced unless the new index is equivalent to the current one and copy=False.

Parameters:

labels : array-like, optional

New labels / index to conform the axis specified by ‘axis’ to.

index : array-like, optional

New labels for the index. Preferably an Index object to avoid duplicating data.

columns : array-like, optional

New labels for the columns. Preferably an Index object to avoid duplicating data.

axis : int or str, optional

Axis to target. Can be either the axis name (‘index’, ‘columns’) or number (0, 1).

method : {None, ‘backfill’/‘bfill’, ‘pad’/‘ffill’, ‘nearest’}

Method to use for filling holes in the reindexed DataFrame. Please note: this is only applicable to DataFrames/Series with a monotonically increasing/decreasing index.

None (default): don’t fill gaps
pad / ffill: Propagate last valid observation forward to next valid.
backfill / bfill: Use next valid observation to fill gap.
nearest: Use nearest valid observations to fill gap.
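
The fill methods listed above can be compared side by side. The following is a minimal sketch, assuming a small made-up frame with a monotonically increasing integer index; the names df, target and the column "value" are illustrative only.

```python
import pandas as pd

# A small frame with a monotonically increasing integer index.
df = pd.DataFrame({"value": [10.0, 20.0, 30.0]}, index=[0, 5, 10])
target = [0, 2, 5, 8, 10, 12]

no_fill = df.reindex(target)                      # gaps stay NaN
padded = df.reindex(target, method="ffill")       # row of the previous label
backfilled = df.reindex(target, method="bfill")   # row of the next label
nearest = df.reindex(target, method="nearest")    # row of the closest label

# Show the four results next to each other.
print(pd.concat(
    {
        "none": no_fill["value"],
        "ffill": padded["value"],
        "bfill": backfilled["value"],
        "nearest": nearest["value"],
    },
    axis=1,
))
```

Label 12 lies beyond the last original label, so bfill has no later row to draw from and leaves it as NaN, while ffill and nearest reuse the row at label 10.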

copy : bool, default True

Return a new object, even if the passed indexes are the same.

Note

The copy keyword will change behavior in pandas 3.0. Copy-on-Write will be enabled by default, which means that all methods with a copy keyword will use a lazy copy mechanism to defer the copy and ignore the copy keyword. The copy keyword will be removed in a future version of pandas.

You can already get the future behavior and improvements through enabling copy on write: pd.options.mode.copy_on_write = True

level : int or name

Broadcast across a level, matching Index values on the passed MultiIndex level.
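
One way to picture level is broadcasting per-group values back onto a full MultiIndex. The following is a minimal sketch, assuming a made-up two-level index; the names idx, means and broadcast are illustrative only.

```python
import pandas as pd

# A frame indexed by a two-level MultiIndex.
idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=["key", "num"])
df = pd.DataFrame({"x": [1.0, 3.0, 5.0, 7.0]}, index=idx)

# One row per outer-level key, indexed by a flat Index.
means = df.groupby(level=0).mean()

# Match the flat index of `means` against level 0 of df.index and
# repeat each group's row for every entry under that key.
broadcast = means.reindex(df.index, level=0)
print(broadcast)
```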

fill_value : scalar, default np.nan

Value to use for missing values. Defaults to NaN, but can be any “compatible” value.

limit : int, default None

Maximum number of consecutive elements to forward or backward fill.
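
The effect of limit is easiest to see next to an unlimited fill. The following is a minimal sketch, assuming a made-up frame with two rows and a target that leaves several consecutive gaps after each original label.

```python
import pandas as pd

df = pd.DataFrame({"value": [1.0, 2.0]}, index=[0, 10])
target = [0, 1, 2, 3, 10, 11, 12, 13]

# Without a limit, every gap after a label is forward filled.
unlimited = df.reindex(target, method="ffill")

# With limit=2, only the first two consecutive gaps after each
# original label are filled; later gaps stay NaN.
limited = df.reindex(target, method="ffill", limit=2)

print(pd.concat({"unlimited": unlimited["value"],
                 "limit=2": limited["value"]}, axis=1))
```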

tolerance : optional

Maximum distance between original and new labels for inexact matches. The values of the index at the matching locations must satisfy the equation abs(index[indexer] - target) <= tolerance.

Tolerance may be a scalar value, which applies the same tolerance to all values, or list-like, which applies variable tolerance per element. List-like includes list, tuple, array, Series, and must be the same size as the index and its dtype must exactly match the index’s type.
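
Both forms of tolerance can be sketched with method='nearest'. The example below uses made-up data, and it assumes the list-like form supplies one tolerance per requested label, which is how the list is sized here.

```python
import pandas as pd

df = pd.DataFrame({"value": [1.0, 2.0, 3.0]}, index=[0, 10, 20])
target = [1, 4, 11, 19]

# Scalar tolerance: accept the nearest label only if it is at most 2 away.
# Label 4 is 4 away from its nearest neighbour (0), so it becomes NaN.
scalar_tol = df.reindex(target, method="nearest", tolerance=2)

# List-like tolerance: a separate maximum distance for each target label.
per_label_tol = df.reindex(target, method="nearest", tolerance=[0, 5, 0, 1])

print(pd.concat({"tolerance=2": scalar_tol["value"],
                 "per-label": per_label_tol["value"]}, axis=1))
```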

Returns:

DataFrame with changed index.

See also

DataFrame.set_index: Set row labels.
DataFrame.reset_index: Remove row labels or move them to new columns.
DataFrame.reindex_like: Change to same indices as other DataFrame.

Examples

DataFrame.reindex supports two calling conventions:

(index=index_labels, columns=column_labels, ...)
(labels, axis={'index', 'columns'}, ...)

We highly recommend using keyword arguments to clarify your intent.

Create a dataframe with some fictional data.

>>> import numpy as np
>>> import pandas as pd
>>> index = ['Firefox', 'Chrome', 'Safari', 'IE10', 'Konqueror']
>>> df = pd.DataFrame({'http_status': [200, 200, 404, 404, 301],
...                    'response_time': [0.04, 0.02, 0.07, 0.08, 1.0]},
...                   index=index)
>>> df
           http_status  response_time
Firefox            200           0.04
Chrome             200           0.02
Safari             404           0.07
IE10               404           0.08
Konqueror          301           1.00

Create a new index and reindex the dataframe. By default, values in the new index that do not have corresponding records in the dataframe are assigned NaN.

>>> new_index = ['Safari', 'Iceweasel', 'Comodo Dragon', 'IE10',
...              'Chrome']
>>> df.reindex(new_index)
               http_status  response_time
Safari               404.0           0.07
Iceweasel              NaN            NaN
Comodo Dragon          NaN            NaN
IE10                 404.0           0.08
Chrome               200.0           0.02

We can fill in the missing values by passing a value to the keyword fill_value. Because the index is not monotonically increasing or decreasing, we cannot use arguments to the keyword method to fill the NaN values.

>>> df.reindex(new_index, fill_value=0)
               http_status  response_time
Safari                 404           0.07
Iceweasel                0           0.00
Comodo Dragon            0           0.00
IE10                   404           0.08
Chrome                 200           0.02

>>> df.reindex(new_index, fill_value='missing')
              http_status response_time
Safari                404          0.07
Iceweasel         missing       missing
Comodo Dragon     missing       missing
IE10                  404          0.08
Chrome                200          0.02
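
The restriction on method for a non-monotonic index can be checked directly. The following is a minimal sketch that rebuilds a reduced version of the browser frame and catches the error; the flag name raised is illustrative only, and the exact error message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame(
    {"http_status": [200, 200, 404, 404, 301]},
    index=["Firefox", "Chrome", "Safari", "IE10", "Konqueror"],
)
new_index = ["Safari", "Iceweasel", "Comodo Dragon", "IE10", "Chrome"]

# The original index is neither increasing nor decreasing, so a
# fill method cannot decide what "previous" or "next" label means.
try:
    df.reindex(new_index, method="ffill")
    raised = False
except ValueError as exc:
    raised = True
    print(f"reindex refused to fill: {exc}")

# fill_value does not depend on label order, so it still works.
filled = df.reindex(new_index, fill_value=0)
```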

We can also reindex the columns.

>>> df.reindex(columns=['http_status', 'user_agent'])
           http_status  user_agent
Firefox            200         NaN
Chrome             200         NaN
Safari             404         NaN
IE10               404         NaN
Konqueror          301         NaN

Or we can use “axis-style” keyword arguments:

>>> df.reindex(['http_status', 'user_agent'], axis="columns")
           http_status  user_agent
Firefox            200         NaN
Chrome             200         NaN
Safari             404         NaN
IE10               404         NaN
Konqueror          301         NaN

To further illustrate the filling functionality in reindex, we will create a dataframe with a monotonically increasing index (for example, a sequence of dates).

>>> date_index = pd.date_range('1/1/2010', periods=6, freq='D')
>>> df2 = pd.DataFrame({"prices": [100, 101, np.nan, 100, 89, 88]},
...                    index=date_index)
>>> df2
            prices
2010-01-01   100.0
2010-01-02   101.0
2010-01-03     NaN
2010-01-04   100.0
2010-01-05    89.0
2010-01-06    88.0

Suppose we decide to expand the dataframe to cover a wider date range.

>>> date_index2 = pd.date_range('12/29/2009', periods=10, freq='D')
>>> df2.reindex(date_index2)
            prices
2009-12-29     NaN
2009-12-30     NaN
2009-12-31     NaN
2010-01-01   100.0
2010-01-02   101.0
2010-01-03     NaN
2010-01-04   100.0
2010-01-05    89.0
2010-01-06    88.0
2010-01-07     NaN

The index entries that did not have a value in the original data frame (for example, ‘2009-12-29’) are by default filled with NaN. If desired, we can fill in the missing values using one of several options.

For example, to propagate the next valid value backward to fill the NaN values, pass bfill as an argument to the method keyword.

>>> df2.reindex(date_index2, method='bfill')
            prices
2009-12-29   100.0
2009-12-30   100.0
2009-12-31   100.0
2010-01-01   100.0
2010-01-02   101.0
2010-01-03     NaN
2010-01-04   100.0
2010-01-05    89.0
2010-01-06    88.0
2010-01-07     NaN

Please note that the NaN value present in the original dataframe (at index value 2010-01-03) will not be filled by any of the value propagation schemes. This is because filling while reindexing does not look at dataframe values, but only compares the original and desired indexes. If you do want to fill in the NaN values present in the original dataframe, use the fillna() method.
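
The difference between filling by label during reindexing and filling by value afterwards can be sketched as follows. The constant 0 passed to fillna() and the use of ffill() are illustrative choices only, not recommendations for price data.

```python
import numpy as np
import pandas as pd

date_index = pd.date_range("1/1/2010", periods=6, freq="D")
df2 = pd.DataFrame({"prices": [100, 101, np.nan, 100, 89, 88]},
                   index=date_index)
date_index2 = pd.date_range("12/29/2009", periods=10, freq="D")

# Filling while reindexing only compares labels: 2010-01-03 keeps the
# NaN stored in the source, and 2010-01-07 has no later label to use.
reindexed = df2.reindex(date_index2, method="bfill")

# Filling afterwards looks at the values themselves.
filled_const = reindexed.fillna(0)   # replace every NaN with a constant
filled_forward = reindexed.ffill()   # carry the previous value forward

print(pd.concat({"reindexed": reindexed["prices"],
                 "fillna(0)": filled_const["prices"],
                 "ffill()": filled_forward["prices"]}, axis=1))
```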

See the user guide for more.
