[Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

Nick Coghlan ncoghlan at gmail.com
Sun Jun 29 06:59:19 CEST 2014


On 29 June 2014 05:48, Ben Hoyt <benhoyt at gmail.com> wrote:
>>> But the underlying system calls -- ``FindFirstFile`` /
>>> ``FindNextFile`` on Windows and ``readdir`` on Linux and OS X --
>>
>> What about FreeBSD, OpenBSD, NetBSD, Solaris, etc. They don't provide readdir?
>
> I guess it'd be better to say "Windows" and "Unix-based OSs"
> throughout the PEP? Because all of these (including Mac OS X) are
> Unix-based.

*nix and POSIX-based are the two conventions I use.

>> Crazy idea: would it be possible to "convert" a DirEntry object to a
>> pathlib.Path object without losing the cache? I guess that
>> pathlib.Path expects a full stat_result object.
>
> The main problem is that pathlib.Path objects explicitly don't cache
> stat info (and Guido doesn't want them to, for good reason I think).
> There's a thread on python-dev about this earlier. I'll add it to a
> "Rejected ideas" section.

The key problem with caches on pathlib.Path objects is that you could
end up with two separate path objects that referred to the same
filesystem location but returned different answers about the
filesystem state because their caches might be stale. DirEntry is
different, as the content is generally *assumed* to be stale
(referring to when the directory was scanned, rather than the current
filesystem state). DirEntry.lstat() on POSIX systems will be an
exception to that general rule (referring to the time of first lookup,
rather than when the directory was scanned, so the answer from lstat()
may be inconsistent with other data stored directly on the DirEntry
object), but one we can probably live with.

More generally, as part of the pathlib PEP review, we figured out that
a *per-object* cache of filesystem state would be an inherently bad
idea, but a string-based *process global* cache might make sense for
modules like walkdir (not part of the stdlib - it's an iterator
pipeline based approach to file tree scanning I wrote a while back,
that currently suffers badly from the performance impact of repeated
stat calls at different stages of the pipeline). We realised this was
getting into a space where application and library specific concerns
are likely to start affecting the caching design, though, so the
current status of standard library level stat caching is "it's not
clear if there's an available approach that would be sufficiently
general purpose to be appropriate for inclusion in the standard
library".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


More information about the Python-Dev mailing list
