Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-120754: Disable buffering in Path.read_bytes#122111

Merged
hauntsaninja merged 2 commits intopython:mainfrom
cmaloney:cmaloney/read_bytes_nobuffer
Aug 16, 2024
Merged

GH-120754: Disable buffering in Path.read_bytes#122111
hauntsaninja merged 2 commits intopython:mainfrom
cmaloney:cmaloney/read_bytes_nobuffer

Conversation

@cmaloney
Copy link
Contributor

@cmaloneycmaloney commentedJul 22, 2024
edited
Loading

Path.read_bytes() is used to read a whole file.buffering= /BufferedIO is focused around making small, possibly interleaved, read and write efficient which doesn't add value for this case. This makesseek +isatty system call no longer called in this case, as well as removing the construction of the BufferedIO, allocation of a buffer, etc.

Benchmark

Run on my Mac

Benchmark code
importpyperffrompathlibimportPathdefread_all(all_paths):forpinall_paths:p.read_bytes()defread_file(path_obj):path_obj.read_bytes()all_rst=list(Path("Doc").glob("**/*.rst"))all_py=list(Path(".").glob("**/*.py"))assertall_rst,"Should have found rst files"assertall_py,"Should have found python source files"runner=pyperf.Runner()runner.bench_func("read_file_small",read_file,Path("Doc/howto/clinic.rst"))runner.bench_func("read_file_large",read_file,Path("Doc/c-api/typeobj.rst"))

Before:

.....................read_file_small:Mean+-stddev:6.80us+-0.07us.....................read_file_large:Mean+-stddev:10.8us+-0.2us

After:

.....................read_file_small:Mean+-stddev:5.67us+-0.05us.....................read_file_large:Mean+-stddev:9.77us+-0.52us

`Path.read_bytes()` is used to read a whole file. buffering /BufferedIO is focused around making small, possibly interleaved,read/write efficient which doesn't add value in this case.On my Mac, running the benchmark:```pythonimport pyperffrom pathlib import Pathdef read_all(all_paths):    for p in all_paths:        p.read_bytes()def read_file(path_obj):    path_obj.read_bytes()all_rst = list(Path("Doc").glob("**/*.rst"))all_py = list(Path(".").glob("**/*.py"))assert all_rst, "Should have found rst files"assert all_py, "Should have found python source files"runner = pyperf.Runner()runner.bench_func("read_file_small", read_file, Path("Doc/howto/clinic.rst"))runner.bench_func("read_file_large", read_file, Path("Doc/c-api/typeobj.rst"))```before:```python.....................read_file_small: Mean +- std dev: 6.80 us +- 0.07 us.....................read_file_large: Mean +- std dev: 10.8 us +- 0.2 us````after:```python.....................read_file_small: Mean +- std dev: 5.67 us +- 0.05 us.....................read_file_large: Mean +- std dev: 9.77 us +- 0.52 us```
@hauntsaninjahauntsaninja merged commit35d8ac7 intopython:mainAug 16, 2024
@cmaloneycmaloney deleted the cmaloney/read_bytes_nobuffer branchAugust 18, 2024 03:35
jeremyhylton pushed a commit to jeremyhylton/cpython that referenced this pull requestAug 19, 2024
`Path.read_bytes()` is used to read a whole file. buffering /BufferedIO is focused around making small, possibly interleaved,read/write efficient which doesn't add value in this case.On my Mac, running the benchmark:```pythonimport pyperffrom pathlib import Pathdef read_all(all_paths):    for p in all_paths:        p.read_bytes()def read_file(path_obj):    path_obj.read_bytes()all_rst = list(Path("Doc").glob("**/*.rst"))all_py = list(Path(".").glob("**/*.py"))assert all_rst, "Should have found rst files"assert all_py, "Should have found python source files"runner = pyperf.Runner()runner.bench_func("read_file_small", read_file, Path("Doc/howto/clinic.rst"))runner.bench_func("read_file_large", read_file, Path("Doc/c-api/typeobj.rst"))```before:```python.....................read_file_small: Mean +- std dev: 6.80 us +- 0.07 us.....................read_file_large: Mean +- std dev: 10.8 us +- 0.2 us````after:```python.....................read_file_small: Mean +- std dev: 5.67 us +- 0.05 us.....................read_file_large: Mean +- std dev: 9.77 us +- 0.52 us```
blhsing pushed a commit to blhsing/cpython that referenced this pull requestAug 22, 2024
`Path.read_bytes()` is used to read a whole file. buffering /BufferedIO is focused around making small, possibly interleaved,read/write efficient which doesn't add value in this case.On my Mac, running the benchmark:```pythonimport pyperffrom pathlib import Pathdef read_all(all_paths):    for p in all_paths:        p.read_bytes()def read_file(path_obj):    path_obj.read_bytes()all_rst = list(Path("Doc").glob("**/*.rst"))all_py = list(Path(".").glob("**/*.py"))assert all_rst, "Should have found rst files"assert all_py, "Should have found python source files"runner = pyperf.Runner()runner.bench_func("read_file_small", read_file, Path("Doc/howto/clinic.rst"))runner.bench_func("read_file_large", read_file, Path("Doc/c-api/typeobj.rst"))```before:```python.....................read_file_small: Mean +- std dev: 6.80 us +- 0.07 us.....................read_file_large: Mean +- std dev: 10.8 us +- 0.2 us````after:```python.....................read_file_small: Mean +- std dev: 5.67 us +- 0.05 us.....................read_file_large: Mean +- std dev: 9.77 us +- 0.52 us```
cmaloney added a commit to cmaloney/cpython that referenced this pull requestAug 28, 2024
NOTE: This needs a full buildbot test pass before merge, see:python#121143 (comment).1. Added `statx` to set of allowed syscall forms (Should make Raspian bot pass).2. Check that the `fd` returned from `open` is passed to all future calls. This helps ensure things like the `stat` call uses the file descriptor rather than the `filename` to avoid TOCTOU isuses.3. Update the `Path().read_bytes()` test case to additionally validate the reduction in`isatty`/`ioctl` + `seek` calls frompython#1221114. Better diagnostic assertion messagess from@gpshead, so when the test fails have first information immediately available. Makes remote CI debugging much simpler.
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@hauntsaninjahauntsaninjahauntsaninja approved these changes

@barneygalebarneygaleAwaiting requested review from barneygalebarneygale is a code owner

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

2 participants

@cmaloney@hauntsaninja

Comments


[8]ページ先頭

©2009-2026 Movatter.jp