Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

bpo-33671 / shutil.copyfile: use memoryview() with dynamic size on Windows#7681

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
giampaolo merged 16 commits intopython:masterfromgiampaolo:33671-release-memoryview
Jun 19, 2018
Merged
Show file tree
Hide file tree
Changes fromall commits
Commits
Show all changes
16 commits
Select commitHold shift + click to select a range
820fc21
bpo-33671: release sub-memoryview resources immediately
giampaoloJun 12, 2018
7c55be5
rename OSX to macOS
giampaoloJun 13, 2018
db6a9ad
add unit tests for copyfileobj
giampaoloJun 13, 2018
090df6a
Remove _copybinfileobj() implementation
giampaoloJun 13, 2018
df2354f
update doc
giampaoloJun 13, 2018
4e2b9f8
update doc
giampaoloJun 13, 2018
dae995a
update doc
giampaoloJun 13, 2018
9f6cf1f
update doc
giampaoloJun 13, 2018
06d3e7a
update doc
giampaoloJun 13, 2018
8a3f88e
Reintroduce memoryview() variant of copyfileobj()
giampaoloJun 13, 2018
f47c011
speed up test case by mocking file size instead of creating an actua…
giampaoloJun 13, 2018
c3166e1
update doc
giampaoloJun 13, 2018
89cd883
update comments
giampaoloJun 14, 2018
be40ae1
rename var
giampaoloJun 14, 2018
93ebc1f
Use dynamic memoryview() size based on file size
giampaoloJun 14, 2018
9d93ce5
do not attempt to use sendfile() on OSX
giampaoloJun 14, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletionDoc/library/shutil.rst
View file
Open in desktop
Original file line numberDiff line numberDiff line change
Expand Up@@ -407,11 +407,15 @@ efficiently (see :issue:`33671`).
"fast-copy" means that the copying operation occurs within the kernel, avoiding
the use of userspace buffers in Python as in "``outfd.write(infd.read())``".

OnOSX `fcopyfile`_ is used to copy the file content (not metadata).
OnmacOS `fcopyfile`_ is used to copy the file content (not metadata).

On Linux, Solaris and other POSIX platforms where :func:`os.sendfile` supports
copies between 2 regular file descriptors :func:`os.sendfile` is used.

On Windows :func:`shutil.copyfile` uses a bigger default buffer size (1 MiB
instead of 16 KiB) and a :func:`memoryview`-based variant of
:func:`shutil.copyfileobj` is used.

If the fast-copy operation fails and no data was written in the destination
file then shutil will silently fallback on using less efficient
:func:`copyfileobj` function internally.
Expand Down
28 changes: 17 additions & 11 deletionsDoc/whatsnew/3.8.rst
View file
Open in desktop
Original file line numberDiff line numberDiff line change
Expand Up@@ -95,20 +95,18 @@ Optimizations

* :func:`shutil.copyfile`, :func:`shutil.copy`, :func:`shutil.copy2`,
:func:`shutil.copytree` and :func:`shutil.move` use platform-specific
"fast-copy" syscalls on Linux,OSX and Solaris in order to copy the file more
efficiently.
"fast-copy" syscalls on Linux,macOS and Solaris in order to copy the file
moreefficiently.
"fast-copy" means that the copying operation occurs within the kernel,
avoiding the use of userspace buffers in Python as in
"``outfd.write(infd.read())``".
All other platforms not using such technique will rely on a faster
:func:`shutil.copyfile` implementation using :func:`memoryview`,
:class:`bytearray` and
:meth:`BufferedIOBase.readinto() <io.BufferedIOBase.readinto>`.
Finally, :func:`shutil.copyfile` default buffer size on Windows was increased
from 16KB to 1MB.
The speedup for copying a 512MB file within the same partition is about +26%
on Linux, +50% on OSX and +38% on Windows. Also, much less CPU cycles are
consumed.
On Windows :func:`shutil.copyfile` uses a bigger default buffer size (1 MiB
instead of 16 KiB) and a :func:`memoryview`-based variant of
:func:`shutil.copyfileobj` is used.
The speedup for copying a 512 MiB file within the same partition is about
+26% on Linux, +50% on macOS and +40% on Windows. Also, much less CPU cycles
are consumed.
See :ref:`shutil-platform-dependent-efficient-copy-operations` section.
(Contributed by Giampaolo Rodola' in :issue:`25427`.)

* The default protocol in the :mod:`pickle` module is now Protocol 4,
Expand DownExpand Up@@ -171,6 +169,14 @@ Changes in the Python API
* The :class:`cProfile.Profile` class can now be used as a context
manager. (Contributed by Scott Sanderson in :issue:`29235`.)

* :func:`shutil.copyfile`, :func:`shutil.copy`, :func:`shutil.copy2`,
:func:`shutil.copytree` and :func:`shutil.move` use platform-specific
"fast-copy" syscalls (see
:ref:`shutil-platform-dependent-efficient-copy-operations` section).

* :func:`shutil.copyfile` default buffer size on Windows was changed from
16 KiB to 1 MiB.

CPython bytecode changes
------------------------

Expand Down
72 changes: 39 additions & 33 deletionsLib/shutil.py
View file
Open in desktop
Original file line numberDiff line numberDiff line change
Expand Up@@ -43,15 +43,16 @@
except ImportError:
getgrnam = None

_WINDOWS = os.name == 'nt'
posix = nt = None
if os.name == 'posix':
import posix
elifos.name == 'nt':
elif_WINDOWS:
import nt

COPY_BUFSIZE = 1024 * 1024 ifos.name == 'nt' else 16 * 1024
COPY_BUFSIZE = 1024 * 1024 if_WINDOWS else 16 * 1024
_HAS_SENDFILE = posix and hasattr(os, "sendfile")
_HAS_FCOPYFILE = posix and hasattr(posix, "_fcopyfile") #OSX
_HAS_FCOPYFILE = posix and hasattr(posix, "_fcopyfile") #macOS

__all__ = ["copyfileobj", "copyfile", "copymode", "copystat", "copy", "copy2",
"copytree", "move", "rmtree", "Error", "SpecialFileError",
Expand DownExpand Up@@ -88,9 +89,9 @@ class _GiveupOnFastCopy(Exception):
file copy when fast-copy functions fail to do so.
"""

def_fastcopy_osx(fsrc, fdst, flags):
def_fastcopy_fcopyfile(fsrc, fdst, flags):
"""Copy a regular file content or metadata by using high-performance
fcopyfile(3) syscall (OSX).
fcopyfile(3) syscall (macOS).
"""
try:
infd = fsrc.fileno()
Expand DownExpand Up@@ -168,8 +169,11 @@ def _fastcopy_sendfile(fsrc, fdst):
break # EOF
offset += sent

def _copybinfileobj(fsrc, fdst, length=COPY_BUFSIZE):
"""Copy 2 regular file objects open in binary mode."""
def _copyfileobj_readinto(fsrc, fdst, length=COPY_BUFSIZE):
"""readinto()/memoryview() based variant of copyfileobj().
*fsrc* must support readinto() method and both files must be
open in binary mode.
"""
# Localize variable access to minimize overhead.
fsrc_readinto = fsrc.readinto
fdst_write = fdst.write
Expand All@@ -179,28 +183,21 @@ def _copybinfileobj(fsrc, fdst, length=COPY_BUFSIZE):
if not n:
break
elif n < length:
fdst_write(mv[:n])
with mv[:n] as smv:
fdst.write(smv)
else:
fdst_write(mv)

def _is_binary_files_pair(fsrc, fdst):
return hasattr(fsrc, 'readinto') and \
isinstance(fsrc, io.BytesIO) or 'b' in getattr(fsrc, 'mode', '') and \
isinstance(fdst, io.BytesIO) or 'b' in getattr(fdst, 'mode', '')

def copyfileobj(fsrc, fdst, length=COPY_BUFSIZE):
"""copy data from file-like object fsrc to file-like object fdst"""
if _is_binary_files_pair(fsrc, fdst):
_copybinfileobj(fsrc, fdst, length=length)
else:
# Localize variable access to minimize overhead.
fsrc_read = fsrc.read
fdst_write = fdst.write
while 1:
buf = fsrc_read(length)
if not buf:
break
fdst_write(buf)
# Localize variable access to minimize overhead.
fsrc_read = fsrc.read
fdst_write = fdst.write
while True:
buf = fsrc_read(length)
if not buf:
break
fdst_write(buf)

def _samefile(src, dst):
# Macintosh, Unix.
Expand All@@ -215,7 +212,7 @@ def _samefile(src, dst):
os.path.normcase(os.path.abspath(dst)))

def copyfile(src, dst, *, follow_symlinks=True):
"""Copy data from src to dst.
"""Copy data from src to dst in the most efficient way possible.

If follow_symlinks is not set and src is a symbolic link, a new
symlink will be created instead of copying the file it points to.
Expand All@@ -224,7 +221,8 @@ def copyfile(src, dst, *, follow_symlinks=True):
if _samefile(src, dst):
raise SameFileError("{!r} and {!r} are the same file".format(src, dst))

for fn in [src, dst]:
file_size = 0
for i, fn in enumerate([src, dst]):
try:
st = os.stat(fn)
except OSError:
Expand All@@ -234,26 +232,34 @@ def copyfile(src, dst, *, follow_symlinks=True):
# XXX What about other special files? (sockets, devices...)
if stat.S_ISFIFO(st.st_mode):
raise SpecialFileError("`%s` is a named pipe" % fn)
if _WINDOWS and i == 0:
file_size = st.st_size

if not follow_symlinks and os.path.islink(src):
os.symlink(os.readlink(src), dst)
else:
with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
if _HAS_SENDFILE:
# macOS
if _HAS_FCOPYFILE:
try:
_fastcopy_sendfile(fsrc, fdst)
_fastcopy_fcopyfile(fsrc, fdst, posix._COPYFILE_DATA)
return dst
except _GiveupOnFastCopy:
pass

if _HAS_FCOPYFILE:
# Linux / Solaris
elif _HAS_SENDFILE:
try:
_fastcopy_osx(fsrc, fdst, posix._COPYFILE_DATA)
_fastcopy_sendfile(fsrc, fdst)
return dst
except _GiveupOnFastCopy:
pass
# Windows, see:
# https://github.com/python/cpython/pull/7160#discussion_r195405230
elif _WINDOWS and file_size > 0:
_copyfileobj_readinto(fsrc, fdst, min(file_size, COPY_BUFSIZE))
return dst

_copybinfileobj(fsrc, fdst)
copyfileobj(fsrc, fdst)

return dst

Expand DownExpand Up@@ -1147,7 +1153,7 @@ def disk_usage(path):
used = (st.f_blocks - st.f_bfree) * st.f_frsize
return _ntuple_diskusage(total, used, free)

elifos.name == 'nt':
elif_WINDOWS:

__all__.append('disk_usage')
_ntuple_diskusage = collections.namedtuple('usage', 'total used free')
Expand Down
88 changes: 81 additions & 7 deletionsLib/test/test_shutil.py
View file
Open in desktop
Original file line numberDiff line numberDiff line change
Expand Up@@ -33,7 +33,7 @@
from test.support import TESTFN, FakePath

TESTFN2 = TESTFN + "2"
OSX = sys.platform.startswith("darwin")
MACOS = sys.platform.startswith("darwin")
try:
import grp
import pwd
Expand DownExpand Up@@ -1808,7 +1808,7 @@ def _open(filename, mode='r'):

self.assertRaises(OSError, shutil.copyfile, 'srcfile', 'destfile')

@unittest.skipIf(OSX, "skipped onOSX")
@unittest.skipIf(MACOS, "skipped onmacOS")
def test_w_dest_open_fails(self):

srcfile = self.Faux()
Expand All@@ -1828,7 +1828,7 @@ def _open(filename, mode='r'):
self.assertEqual(srcfile._exited_with[1].args,
('Cannot open "destfile"',))

@unittest.skipIf(OSX, "skipped onOSX")
@unittest.skipIf(MACOS, "skipped onmacOS")
def test_w_dest_close_fails(self):

srcfile = self.Faux()
Expand All@@ -1851,7 +1851,7 @@ def _open(filename, mode='r'):
self.assertEqual(srcfile._exited_with[1].args,
('Cannot close',))

@unittest.skipIf(OSX, "skipped onOSX")
@unittest.skipIf(MACOS, "skipped onmacOS")
def test_w_source_close_fails(self):

srcfile = self.Faux(True)
Expand DownExpand Up@@ -1892,6 +1892,80 @@ def test_move_dir_caseinsensitive(self):
os.rmdir(dst_dir)


class TestCopyFileObj(unittest.TestCase):
FILESIZE = 2 * 1024 * 1024

@classmethod
def setUpClass(cls):
write_test_file(TESTFN, cls.FILESIZE)

@classmethod
def tearDownClass(cls):
support.unlink(TESTFN)
support.unlink(TESTFN2)

def tearDown(self):
support.unlink(TESTFN2)

@contextlib.contextmanager
def get_files(self):
with open(TESTFN, "rb") as src:
with open(TESTFN2, "wb") as dst:
yield (src, dst)

def assert_files_eq(self, src, dst):
with open(src, 'rb') as fsrc:
with open(dst, 'rb') as fdst:
self.assertEqual(fsrc.read(), fdst.read())

def test_content(self):
with self.get_files() as (src, dst):
shutil.copyfileobj(src, dst)
self.assert_files_eq(TESTFN, TESTFN2)

def test_file_not_closed(self):
with self.get_files() as (src, dst):
shutil.copyfileobj(src, dst)
assert not src.closed
assert not dst.closed

def test_file_offset(self):
with self.get_files() as (src, dst):
shutil.copyfileobj(src, dst)
self.assertEqual(src.tell(), self.FILESIZE)
self.assertEqual(dst.tell(), self.FILESIZE)

@unittest.skipIf(os.name != 'nt', "Windows only")
def test_win_impl(self):
# Make sure alternate Windows implementation is called.
with unittest.mock.patch("shutil._copyfileobj_readinto") as m:
shutil.copyfile(TESTFN, TESTFN2)
assert m.called

# File size is 2 MiB but max buf size should be 1 MiB.
self.assertEqual(m.call_args[0][2], 1 * 1024 * 1024)

# If file size < 1 MiB memoryview() length must be equal to
# the actual file size.
with tempfile.NamedTemporaryFile(delete=False) as f:
f.write(b'foo')
fname = f.name
self.addCleanup(support.unlink, fname)
with unittest.mock.patch("shutil._copyfileobj_readinto") as m:
shutil.copyfile(fname, TESTFN2)
self.assertEqual(m.call_args[0][2], 3)

# Empty files should not rely on readinto() variant.
with tempfile.NamedTemporaryFile(delete=False) as f:
pass
fname = f.name
self.addCleanup(support.unlink, fname)
with unittest.mock.patch("shutil._copyfileobj_readinto") as m:
shutil.copyfile(fname, TESTFN2)
assert not m.called
self.assert_files_eq(fname, TESTFN2)


class _ZeroCopyFileTest(object):
"""Tests common to all zero-copy APIs."""
FILESIZE = (10 * 1024 * 1024) # 10 MiB
Expand DownExpand Up@@ -2111,12 +2185,12 @@ def test_file2file_not_supported(self):
shutil._HAS_SENDFILE = True


@unittest.skipIf(notOSX, 'OSX only')
classTestZeroCopyOSX(_ZeroCopyFileTest, unittest.TestCase):
@unittest.skipIf(notMACOS, 'macOS only')
classTestZeroCopyMACOS(_ZeroCopyFileTest, unittest.TestCase):
PATCHPOINT = "posix._fcopyfile"

def zerocopy_fun(self, src, dst):
return shutil._fastcopy_osx(src, dst, posix._COPYFILE_DATA)
return shutil._fastcopy_fcopyfile(src, dst, posix._COPYFILE_DATA)


class TermsizeTests(unittest.TestCase):
Expand Down
View file
Open in desktop
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,10 @@
:func:`shutil.copyfile`, :func:`shutil.copy`, :func:`shutil.copy2`,
:func:`shutil.copytree` and :func:`shutil.move` use platform-specific
fast-copy syscalls on Linux, Solaris and OSX in order to copy the file
more efficiently. All other platforms not using such technique will rely on a
faster :func:`shutil.copyfile` implementation using :func:`memoryview`,
:class:`bytearray` and
:meth:`BufferedIOBase.readinto() <io.BufferedIOBase.readinto>`.
Finally, :func:`shutil.copyfile` default buffer size on Windows was increased
from 16KB to 1MB. The speedup for copying a 512MB file is about +26% on Linux,
+50% on OSX and +38% on Windows. Also, much less CPU cycles are consumed
fast-copy syscalls on Linux, Solaris and macOS in order to copy the file
more efficiently.
On Windows :func:`shutil.copyfile` uses a bigger default buffer size (1 MiB
instead of 16 KiB) and a :func:`memoryview`-based variant of
:func:`shutil.copyfileobj` is used.
The speedup for copying a 512MiB file is about +26% on Linux, +50% on macOS and
+40% on Windows. Also, much less CPU cycles are consumed.
(Contributed by Giampaolo Rodola' in :issue:`25427`.)
4 changes: 2 additions & 2 deletionsModules/clinic/posixmodule.c.h
View file
Open in desktop
Original file line numberDiff line numberDiff line change
Expand Up@@ -3859,7 +3859,7 @@ PyDoc_STRVAR(os__fcopyfile__doc__,
"_fcopyfile($module, infd, outfd, flags, /)\n"
"--\n"
"\n"
"Efficiently copy content or metadata of 2 regular file descriptors (OSX).");
"Efficiently copy content or metadata of 2 regular file descriptors (macOS).");

#define OS__FCOPYFILE_METHODDEF \
{"_fcopyfile", (PyCFunction)os__fcopyfile, METH_FASTCALL, os__fcopyfile__doc__},
Expand DownExpand Up@@ -6627,4 +6627,4 @@ os_getrandom(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObject
#ifndef OS_GETRANDOM_METHODDEF
#define OS_GETRANDOM_METHODDEF
#endif /* !defined(OS_GETRANDOM_METHODDEF) */
/*[clinic end generated code: output=b5d1ec71bc6f0651 input=a9049054013a1b77]*/
/*[clinic end generated code: output=47fb6a3e88cba6d9 input=a9049054013a1b77]*/
4 changes: 2 additions & 2 deletionsModules/posixmodule.c
View file
Open in desktop
Original file line numberDiff line numberDiff line change
Expand Up@@ -8755,12 +8755,12 @@ os._fcopyfile
flags: int
/

Efficiently copy content or metadata of 2 regular file descriptors (OSX).
Efficiently copy content or metadata of 2 regular file descriptors (macOS).
[clinic start generated code]*/

static PyObject *
os__fcopyfile_impl(PyObject *module, int infd, int outfd, int flags)
/*[clinic end generated code: output=8e8885c721ec38e3 input=aeb9456804eec879]*/
/*[clinic end generated code: output=8e8885c721ec38e3 input=69e0770e600cb44f]*/
{
int ret;

Expand Down

[8]ページ先頭

©2009-2026 Movatter.jp