Description
WinPython includes a ton of tiny files. Of the 77,898 files in my WinPython install, the median size is 2,940 bytes and 92% of the files are smaller than 32KB.
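For anyone who wants to reproduce numbers like these on their own install, here is a minimal sketch. The function names (`file_sizes`, `summarize`) and the example path are made up for illustration and are not part of WinPython.

```python
import os
import statistics


def file_sizes(root):
    """Return the size in bytes of every regular file under root."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append(os.path.getsize(path))
            except OSError:
                # Skip files that vanish or cannot be read while walking.
                continue
    return sizes


def summarize(sizes, threshold=32 * 1024):
    """Return (file count, median size, fraction of files smaller than threshold)."""
    if not sizes:
        return 0, 0, 0.0
    small = sum(1 for s in sizes if s < threshold)
    return len(sizes), statistics.median(sizes), small / len(sizes)


# Example call (the path is hypothetical):
# count, median, small_fraction = summarize(file_sizes(r"E:\WinPython"))
```

The threshold defaults to 32KB to match the figure quoted above; pass a different value to test other cluster sizes.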
Since WinPython's portability is a major selling point, some people (myself included) will be using it on thumb drives. The default exFAT cluster size for modern drives is either 32KB (drives up to 32GB) or 128KB (from 32GB up to the exFAT maximum size, 256TB). Every file occupies at least one whole cluster, so a single-byte file takes 128KB on disk. Because WinPython has so many small files, an install that holds only 2.5GB of data takes nearly 12GB on disk. That's a massive amount of wasted space.
(This would also apply to any disk with a large cluster size, but NTFS keeps a 4KB cluster size by default all the way up to 16TB volumes, so it's not worth worrying about there.)
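The cluster overhead is easy to estimate: each file's size is rounded up to a whole number of clusters. The sketch below assumes that simple model and ignores directory entries and other filesystem metadata; the function names are illustrative only. As a sanity check on the scale, 77,898 files at one 128KB cluster each already come to about 10.2GB before counting any file larger than a single cluster.

```python
def allocated_size(size, cluster):
    """Bytes occupied on disk: size rounded up to a whole number of clusters.

    Simplifying assumption: a zero-byte file is counted as using no clusters.
    """
    return -(-size // cluster) * cluster  # ceiling division, then scale back up


def total_on_disk(sizes, cluster):
    """Sum the allocated size of every file for a given cluster size."""
    return sum(allocated_size(s, cluster) for s in sizes)


def compare_cluster_sizes(sizes, clusters=(4 * 1024, 32 * 1024, 128 * 1024)):
    """Map each candidate cluster size to the estimated total bytes on disk."""
    return {cluster: total_on_disk(sizes, cluster) for cluster in clusters}
```

Feeding it the list of real file sizes from an install shows how much space a smaller cluster size would recover on a given drive.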
In addition, many thumb drives perform much worse with large collections of small files. Thankfully that's improved a good bit over the last 5 years and is now mostly noticeable on writes, which gives WinPython a painfully slow install process but usually OK performance after install.
In other languages, shared libraries, JARs, etc. avoid this type of problem. I don't know enough about the status of egg / wheel / zipimport etc. to know whether something like that would be a reasonable option.
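For reference, the standard library can already import pure-Python modules straight from a zip archive placed on `sys.path`. The sketch below only demonstrates that mechanism; the archive and module names are made up, and it says nothing about whether WinPython's packages would tolerate being bundled this way. One known limit: compiled extension modules (`.pyd`) cannot be imported from inside a zip archive.

```python
import importlib
import os
import sys
import zipfile


def build_demo_archive(directory):
    """Write a one-module zip archive into directory and return its path."""
    archive = os.path.join(directory, "bundle_demo.zip")  # hypothetical name
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(
            "hello_bundle.py",
            "def greet():\n    return 'hello from a zip'\n",
        )
    return archive


def import_from_archive(archive, module_name):
    """Temporarily put the archive on sys.path and import a module from it."""
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(archive)
```

Many small `.py` files stored this way take up one file's worth of clusters instead of one cluster each, which is the same trick a JAR uses.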
At the very least, it's worth putting a notice in your documentation: "If you're actually using our portable Python on a portable drive, you may want to format your drive with a non-default cluster size to save tons of space."