pandas.DataFrame.copy
DataFrame.copy(deep=True)
Make a copy of this object’s indices and data.
When deep=True (default), a new object will be created with a copy of the calling object’s data and indices. Modifications to the data or indices of the copy will not be reflected in the original object (see notes below).

When deep=False, a new object will be created without copying the calling object’s data or index (only references to the data and index are copied). Any changes to the data of the original will be reflected in the shallow copy (and vice versa).

Note
The deep=False behaviour described above will change in pandas 3.0. Copy-on-Write will be enabled by default, which means that the “shallow” copy returned with deep=False will still avoid making an eager copy, but changes to the data of the original will no longer be reflected in the shallow copy (or vice versa). Instead, it uses a lazy (deferred) copy mechanism that copies the data only when the original or the shallow copy is modified.

You can already get the future behavior and improvements by enabling Copy-on-Write:

pd.options.mode.copy_on_write = True
- Parameters:
- deep : bool, default True
Make a deep copy, including a copy of the data and the indices. With deep=False neither the indices nor the data are copied.
- Returns:
- Series or DataFrame
Object type matches caller.
Notes
When deep=True, data is copied but actual Python objects will not be copied recursively, only the reference to the object. This is in contrast to copy.deepcopy in the Standard Library, which recursively copies object data (see examples below).

While Index objects are copied when deep=True, the underlying numpy array is not copied for performance reasons. Since Index is immutable, the underlying data can be safely shared and a copy is not needed.

Since pandas is not thread safe, see the gotchas when copying in a threading environment.

When copy_on_write in pandas config is set to True, the copy_on_write config takes effect even when deep=False. This means that any changes to the copied data would make a new copy of the data upon write (and vice versa). Changes made to either the original or copied variable would not be reflected in the counterpart. See Copy_on_Write for more information.

Examples
>>> s = pd.Series([1, 2], index=["a", "b"])
>>> s
a    1
b    2
dtype: int64

>>> s_copy = s.copy()
>>> s_copy
a    1
b    2
dtype: int64
Shallow copy versus default (deep) copy:
>>> s = pd.Series([1, 2], index=["a", "b"])
>>> deep = s.copy()
>>> shallow = s.copy(deep=False)
Shallow copy shares data and index with original.
>>> s is shallow
False
>>> s.values is shallow.values and s.index is shallow.index
True
Deep copy has own copy of data and index.
>>> s is deep
False
>>> s.values is deep.values or s.index is deep.index
False
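The identity checks above compare Python objects. As a complementary sketch, memory sharing can also be checked at the buffer level. It assumes NumPy is available and that to_numpy() returns a view for an int64 Series; the variable names shallow_shares and deep_shares are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2], index=["a", "b"])
deep = s.copy()
shallow = s.copy(deep=False)

# Before any write, the shallow copy is backed by the same buffer as the
# original (this holds with or without Copy-on-Write, since the copy is lazy).
shallow_shares = np.shares_memory(s.to_numpy(), shallow.to_numpy())

# The deep copy owns a separate buffer from the start.
deep_shares = np.shares_memory(s.to_numpy(), deep.to_numpy())

print(shallow_shares, deep_shares)
```

Comparing buffers rather than object identity avoids depending on whether an accessor hands back the very same array object on each call.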
Updates to the data shared by the shallow copy and the original are reflected in both (NOTE: this will no longer be true for pandas >= 3.0); the deep copy remains unchanged.
>>> s.iloc[0] = 3
>>> shallow.iloc[1] = 4
>>> s
a    3
b    4
dtype: int64
>>> shallow
a    3
b    4
dtype: int64
>>> deep
a    1
b    2
dtype: int64
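Because the outcome of writing through a shallow copy depends on the pandas version and on whether Copy-on-Write is enabled, it can help to probe the installed behaviour directly. The following is a minimal sketch; the helper name shallow_copy_tracks_original is hypothetical and not part of pandas.

```python
import pandas as pd


def shallow_copy_tracks_original() -> bool:
    """Return True if a write to the original shows up in a shallow copy.

    Hypothetical helper for illustration; not a pandas API.
    """
    s = pd.Series([1, 2], index=["a", "b"])
    shallow = s.copy(deep=False)
    s.iloc[0] = 3  # write to the original only
    return bool(shallow.iloc[0] == 3)


# The printed value depends on the environment: the write is expected to be
# visible without Copy-on-Write and isolated when Copy-on-Write is active.
print(pd.__version__, shallow_copy_tracks_original())
```

Code that must behave the same in both modes should not rely on either result; an explicit deep copy is the portable choice.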
Note that when copying an object containing Python objects, a deep copy will copy the data, but will not do so recursively. Updating a nested data object will be reflected in the deep copy.
>>> s = pd.Series([[1, 2], [3, 4]])
>>> deep = s.copy()
>>> s[0][0] = 10
>>> s
0    [10, 2]
1     [3, 4]
dtype: object
>>> deep
0    [10, 2]
1     [3, 4]
dtype: object
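When fully independent nested objects are needed, one possible approach is to deep-copy the elements themselves with the standard library and rebuild the Series. This is only a sketch of one workaround, not a pandas-recommended recipe; the variable name independent is illustrative.

```python
import copy

import pandas as pd

s = pd.Series([[1, 2], [3, 4]])

# Series.copy() copies the object array, but each element still refers to
# the same Python list as in the original.
deep = s.copy()

# Workaround sketch: recursively copy the Python objects with copy.deepcopy,
# then build a new Series around the copied elements.
independent = pd.Series(copy.deepcopy(s.tolist()), index=s.index.copy())

s.iloc[0][0] = 10  # mutate the nested list in place

print(deep.iloc[0])         # same list object as s.iloc[0]
print(independent.iloc[0])  # separate list, unaffected by the mutation
```

The element-wise copy is applied to the plain list from tolist(), so the recursion is handled entirely by the standard library rather than by pandas.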
When Copy-on-Write is set to True, the shallow copy is not modified when the original data is changed:
>>> with pd.option_context("mode.copy_on_write", True):
...     s = pd.Series([1, 2], index=["a", "b"])
...     copy = s.copy(deep=False)
...     s.iloc[0] = 100
...     s
a    100
b      2
dtype: int64
>>> copy
a    1
b    2
dtype: int64
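The examples above all use a Series. As a small illustrative sketch, the same default (deep) copy semantics can be shown on a DataFrame; the column names and values here are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})
df_copy = df.copy()  # deep=True by default

# Changes to the deep copy, whether to values or to the set of columns,
# leave the original DataFrame untouched.
df_copy.loc[0, "a"] = 100
df_copy["c"] = df_copy["a"] * 2

print(df)
print(df_copy)
```

A deep copy behaves this way regardless of the Copy-on-Write setting, which makes it a safe default when the copy will be modified.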