pandas.DataFrame.interpolate
- DataFrame.interpolate(method='linear', *, axis=0, limit=None, inplace=False, limit_direction=None, limit_area=None, downcast=<no_default>, **kwargs)
Fill NaN values using an interpolation method.
Please note that only method='linear' is supported for DataFrame/Series with a MultiIndex.
- Parameters:
- method : str, default ‘linear’
Interpolation technique to use. One of:
‘linear’: Ignore the index and treat the values as equally spaced. This is the only method supported on MultiIndexes.
‘time’: Works on daily and higher resolution data to interpolate given length of interval.
‘index’, ‘values’: Use the actual numerical values of the index.
‘pad’: Fill in NaNs using existing values.
‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘barycentric’, ‘polynomial’: Passed to scipy.interpolate.interp1d, whereas ‘spline’ is passed to scipy.interpolate.UnivariateSpline. These methods use the numerical values of the index. Both ‘polynomial’ and ‘spline’ require that you also specify an order (int), e.g. df.interpolate(method='polynomial', order=5). Note that the ‘slinear’ method in pandas refers to the SciPy first-order spline instead of the pandas first-order spline.
‘krogh’, ‘piecewise_polynomial’, ‘spline’, ‘pchip’, ‘akima’, ‘cubicspline’: Wrappers around the SciPy interpolation methods of similar names. See Notes.
‘from_derivatives’: Refers to scipy.interpolate.BPoly.from_derivatives.
- axis : {0 or ‘index’, 1 or ‘columns’, None}, default 0
Axis to interpolate along. For Series this parameter is unused and defaults to 0.
- limit : int, optional
Maximum number of consecutive NaNs to fill. Must be greater than 0.
- inplace : bool, default False
Update the data in place if possible.
- limit_direction : {‘forward’, ‘backward’, ‘both’}, optional
Consecutive NaNs will be filled in this direction.
- If ‘limit’ is specified:
If ‘method’ is ‘pad’ or ‘ffill’, ‘limit_direction’ must be ‘forward’.
If ‘method’ is ‘backfill’ or ‘bfill’, ‘limit_direction’ must be ‘backward’.
- If ‘limit’ is not specified:
If ‘method’ is ‘backfill’ or ‘bfill’, the default is ‘backward’;
otherwise the default is ‘forward’.
- Raises ValueError if limit_direction is ‘forward’ or ‘both’ and method is ‘backfill’ or ‘bfill’.
- Raises ValueError if limit_direction is ‘backward’ or ‘both’ and method is ‘pad’ or ‘ffill’.
- limit_area : {None, ‘inside’, ‘outside’}, default None
If limit is specified, consecutive NaNs will be filled with this restriction.
None: No fill restriction.
‘inside’: Only fill NaNs surrounded by valid values (interpolate).
‘outside’: Only fill NaNs outside valid values (extrapolate).
- downcast : optional, ‘infer’ or None, defaults to None
Downcast dtypes if possible.
Deprecated since version 2.1.0.
- **kwargs : optional
Keyword arguments to pass on to the interpolating function.
- Returns:
- Series or DataFrame or None
Returns the same object type as the caller, interpolated at some or all NaN values, or None if inplace=True.
See also
fillna
Fill missing values using different methods.
scipy.interpolate.Akima1DInterpolator
Piecewise cubic polynomials (Akima interpolator).
scipy.interpolate.BPoly.from_derivatives
Piecewise polynomial in the Bernstein basis.
scipy.interpolate.interp1d
Interpolate a 1-D function.
scipy.interpolate.KroghInterpolator
Interpolate polynomial (Krogh interpolator).
scipy.interpolate.PchipInterpolator
PCHIP 1-d monotonic cubic interpolation.
scipy.interpolate.CubicSpline
Cubic spline data interpolator.
Notes
The ‘krogh’, ‘piecewise_polynomial’, ‘spline’, ‘pchip’ and ‘akima’ methods are wrappers around the respective SciPy implementations of similar names. These use the actual numerical values of the index. For more information on their behavior, see the SciPy documentation.
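As a rough illustration of what using "the actual numerical values of the index" means, the sketch below compares ‘linear’ with ‘index’ on an unevenly spaced integer index. It deliberately avoids the SciPy-backed methods so that it runs with pandas and NumPy alone; the Series values and index labels are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the gap between index labels 1 and 10 is much
# larger than the gap between 0 and 1.
s = pd.Series([0.0, np.nan, 10.0], index=[0, 1, 10])

# 'linear' ignores the index and treats the three rows as equally spaced.
linear = s.interpolate(method="linear")

# 'index' weights the fill by the numeric distance between index labels.
by_index = s.interpolate(method="index")

print(linear.tolist())    # [0.0, 5.0, 10.0]
print(by_index.tolist())  # [0.0, 1.0, 10.0]
```

The SciPy-backed methods named above treat the index the same way ‘index’ does here, in that spacing between labels affects the filled values.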
Examples
Filling in NaN in a Series via linear interpolation.
>>> s = pd.Series([0, 1, np.nan, 3])
>>> s
0    0.0
1    1.0
2    NaN
3    3.0
dtype: float64
>>> s.interpolate()
0    0.0
1    1.0
2    2.0
3    3.0
dtype: float64
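For a Series with a DatetimeIndex, the ‘time’ method described under Parameters can give a different answer from the default. The following is a minimal sketch with invented dates and values, showing one missing observation that sits one day into a four-day gap.

```python
import numpy as np
import pandas as pd

# Hypothetical time series: observations on Jan 1 and Jan 5, with a
# missing value on Jan 2.
idx = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-05"])
ts = pd.Series([0.0, np.nan, 4.0], index=idx)

# Default 'linear': rows are treated as equally spaced, so the fill is
# the midpoint of 0.0 and 4.0.
equally_spaced = ts.interpolate()

# 'time': the fill is weighted by elapsed time (1 day out of 4).
time_weighted = ts.interpolate(method="time")

print(equally_spaced.iloc[1])            # 2.0
print(round(time_weighted.iloc[1], 6))   # 1.0
```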
Filling in NaN in a Series via polynomial interpolation or splines: Both ‘polynomial’ and ‘spline’ methods require that you also specify an order (int).
>>> s = pd.Series([0, 2, np.nan, 8])
>>> s.interpolate(method='polynomial', order=2)
0    0.000000
1    2.000000
2    4.666667
3    8.000000
dtype: float64
Fill the DataFrame forward (that is, going down) along each column using linear interpolation.
Note how the last entry in column ‘a’ is interpolated differently, because there is no entry after it to use for interpolation. Note how the first entry in column ‘b’ remains NaN, because there is no entry before it to use for interpolation.
>>> df = pd.DataFrame([(0.0, np.nan, -1.0, 1.0),
...                    (np.nan, 2.0, np.nan, np.nan),
...                    (2.0, 3.0, np.nan, 9.0),
...                    (np.nan, 4.0, -4.0, 16.0)],
...                   columns=list('abcd'))
>>> df
     a    b    c     d
0  0.0  NaN -1.0   1.0
1  NaN  2.0  NaN   NaN
2  2.0  3.0  NaN   9.0
3  NaN  4.0 -4.0  16.0
>>> df.interpolate(method='linear', limit_direction='forward', axis=0)
     a    b    c     d
0  0.0  NaN -1.0   1.0
1  1.0  2.0 -2.0   5.0
2  2.0  3.0 -3.0   9.0
3  2.0  4.0 -4.0  16.0
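The axis parameter can also point across columns. The sketch below, using a small made-up frame with hypothetical column names, contrasts interpolating down each column (axis=0) with interpolating along each row (axis=1).

```python
import numpy as np
import pandas as pd

# Hypothetical wide table: column 'q2' is entirely missing.
wide = pd.DataFrame({"q1": [1.0, 2.0],
                     "q2": [np.nan, np.nan],
                     "q3": [3.0, 6.0]})

# Down each column: 'q2' has no valid values to interpolate between,
# so it stays NaN.
down = wide.interpolate(method="linear", axis=0)

# Along each row: 'q2' is filled from its neighbours 'q1' and 'q3'.
across = wide.interpolate(method="linear", axis=1)

print(down["q2"].isna().all())   # True
print(across["q2"].tolist())     # [2.0, 4.0]
```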
Using polynomial interpolation.
>>> df['d'].interpolate(method='polynomial', order=2)
0     1.0
1     4.0
2     9.0
3    16.0
Name: d, dtype: float64
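The limit, limit_direction and limit_area parameters interact, so a small comparison may help. This is a sketch on an invented Series with a leading NaN, a three-element interior gap and a trailing NaN, using only the default ‘linear’ method.

```python
import numpy as np
import pandas as pd

# Hypothetical data: leading NaN, interior gap of three, trailing NaN.
s = pd.Series([np.nan, 1.0, np.nan, np.nan, np.nan, 5.0, np.nan])

# At most one consecutive NaN is filled, moving forward (the default
# direction for 'linear'); the leading NaN is left alone.
capped = s.interpolate(limit=1)

# Only NaNs surrounded by valid values are filled.
inside = s.interpolate(limit_area="inside")

# Only NaNs outside the valid values are filled, in both directions.
outside = s.interpolate(limit_direction="both", limit_area="outside")

print(capped.tolist())   # [nan, 1.0, 2.0, nan, nan, 5.0, 5.0]
print(inside.tolist())   # [nan, 1.0, 2.0, 3.0, 4.0, 5.0, nan]
print(outside.tolist())  # [1.0, 1.0, nan, nan, nan, 5.0, 5.0]
```

Note that with ‘linear’, values filled outside the valid range repeat the nearest valid value rather than continuing the trend.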