
DwarFS

The Deduplicating Warp-speed Advanced Read-only File System.

A fast high compression read-only file system for Linux, Windows, and macOS.


Overview

[Windows screen capture]

[Linux screen capture]

DwarFS is a read-only file system with a focus on achieving very high compression ratios, in particular for very redundant data.

This probably doesn't sound very exciting, because if it's redundant, it should compress well. However, I found that other read-only, compressed file systems don't do a very good job at making use of this redundancy. See here for a comparison with other compressed file systems.

DwarFS also doesn't compromise on speed, and for my use cases I've found it to perform on par with or better than SquashFS. For my primary use case, DwarFS compression is an order of magnitude better than SquashFS compression, it's 6 times faster to build the file system, it's typically faster to access files on DwarFS, and it uses fewer CPU resources.

To give you an idea of what DwarFS is capable of, here's a quick comparison of DwarFS and SquashFS on a set of video files with a total size of 39 GiB. The twist is that each unique video file has two sibling files with a different set of audio streams (this is an actual use case). So there's redundancy in both the video and audio data, but as the streams are interleaved and identical blocks are typically very far apart, it's challenging to make use of that redundancy for compression. SquashFS essentially fails to compress the source data at all, whereas DwarFS is able to reduce the size by almost a factor of 3, which is close to the theoretical maximum:

$ du -hs dwarfs-video-test
39G     dwarfs-video-test
$ ls -lh dwarfs-video-test.*fs
-rw-r--r-- 1 mhx users 14G Jul  2 13:01 dwarfs-video-test.dwarfs
-rw-r--r-- 1 mhx users 39G Jul 12 09:41 dwarfs-video-test.squashfs

Furthermore, when mounting the SquashFS image and performing a random-read throughput test using fio-3.34, both squashfuse and squashfuse_ll top out at around 230 MiB/s:

$ fio --readonly --rw=randread --name=randread --bs=64k --direct=1 \
      --opendir=mnt --numjobs=4 --ioengine=libaio --iodepth=32 \
      --group_reporting --runtime=60 --time_based
[...]
   READ: bw=230MiB/s (241MB/s), 230MiB/s-230MiB/s (241MB/s-241MB/s), io=13.5GiB (14.5GB), run=60004-60004msec

In comparison, DwarFS manages to sustain random read rates of 20 GiB/s:

  READ: bw=20.2GiB/s (21.7GB/s), 20.2GiB/s-20.2GiB/s (21.7GB/s-21.7GB/s), io=1212GiB (1301GB), run=60001-60001msec

Distinct features of DwarFS are:

  • Clustering of files by similarity using a similarity hash function. This makes it easier to exploit the redundancy across file boundaries.

  • Segmentation analysis across file system blocks in order to reduce the size of the uncompressed file system. This saves memory when using the compressed file system and thus potentially allows for higher cache hit rates as more data can be kept in the cache.

  • Categorization framework to categorize files or even fragments of files and then process individual categories differently. For example, this allows you to not waste time trying to compress incompressible files or to compress PCM audio data using FLAC compression (see the example right after this list).

  • Highly multi-threaded implementation. Both the file system creation tool as well as the FUSE driver are able to make good use of the many cores of your system.
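As a quick illustration of the categorization feature, this is how it is enabled on the mkdwarfs command line (the input and output paths are placeholders; the same option appears again in the astrophotography example further down):

$ mkdwarfs -i /path/to/input -o output.dwarfs --categorize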

History

I started working on DwarFS in 2013 and my main use case and major motivation was that I had several hundred different versions of Perl that were taking up something around 30 gigabytes of disk space, and I was unwilling to spend more than 10% of my hard drive keeping them around for when I happened to need them.

Up until then, I had been using Cromfs for squeezing them into a manageable size. However, I was getting more and more annoyed by the time it took to build the filesystem image and, to make things worse, more often than not it was crashing after about an hour or so.

I had obviously also looked into SquashFS, but never got anywhere close to the compression rates of Cromfs.

This alone wouldn't have been enough to get me into writing DwarFS, but at around the same time, I was pretty obsessed with the recent developments and features of newer C++ standards and really wanted a C++ hobby project to work on. Also, I've wanted to do something with FUSE for quite some time. Last but not least, I had been thinking about the problem of compressed file systems for a bit and had some ideas that I definitely wanted to try.

The majority of the code was written in 2013, then I did a couple of cleanups, bugfixes and refactors every once in a while, but I never really got it to a state where I would feel happy releasing it. It was too awkward to build with its dependency on Facebook's (quite awesome) folly library and it didn't have any documentation.

Digging out the project again this year, things didn't look as grim as they used to. Folly now builds with CMake and so I just pulled it in as a submodule. Most other dependencies can be satisfied from packages that should be widely available. And I've written some rudimentary docs as well.

Building and Installing

Note to Package Maintainers

DwarFS should usually build fine with minimal changes out of the box. If it doesn't, please file an issue. I've set up CI jobs using Docker images for Ubuntu (22.04 and 24.04), Fedora Rawhide and Arch that can help with determining an up-to-date set of dependencies. Note that building from the release tarball requires fewer dependencies than building from the git repository; notably, the ronn tool as well as Python and the mistletoe Python module are not required when building from the release tarball.

There are some things to be aware of:

  • There's a tendency to try and unbundle the folly and fbthrift libraries that are included as submodules and are built along with DwarFS. While I agree with the sentiment, it's unfortunately a bad idea. Besides the fact that folly does not make any claims about ABI stability (i.e. you can't just dynamically link a binary built against one version of folly against another version), it's not even possible to safely link against a folly library built with different compile options. Even subtle differences, such as the C++ standard version, can cause run-time errors. See this issue for details. Currently, it is not even possible to use external versions of folly/fbthrift as DwarFS is building minimal subsets of both libraries; these are bundled in the dwarfs_common library and they are strictly used internally, i.e. none of the folly or fbthrift headers are required to build against DwarFS' libraries.

  • Similar issues can arise when using a system-installed version of GoogleTest. GoogleTest itself recommends that it be downloaded as part of the build. However, you can use the system-installed version by passing -DPREFER_SYSTEM_GTEST=ON to the cmake call. Use at your own risk (see the example after this list).

  • For other bundled libraries (namely fmt, parallel-hashmap, range-v3), the system-installed version is used as long as it meets the minimum required version. Otherwise, the preferred version is fetched during the build.
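For example, to opt into a system-wide GoogleTest at your own risk, the flag mentioned above is simply added to the cmake invocation shown in the Building section:

$ cmake .. -GNinja -DWITH_TESTS=ON -DPREFER_SYSTEM_GTEST=ON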

Prebuilt Binaries

Each release has pre-built, statically linked binaries for Linux-x86_64, Linux-aarch64 and Windows-AMD64 available for download. These should run without any dependencies and can be useful especially on older distributions where you can't easily build the tools from source.

Universal Binaries

In addition to the binary tarballs, there's a universal binary available for each architecture. These universal binaries contain all tools (mkdwarfs, dwarfsck, dwarfsextract and the dwarfs FUSE driver) in a single executable. These executables are compressed using upx, so they are much smaller than the individual tools combined. However, it also means the binaries need to be decompressed each time they are run, which can have a significant overhead. If that is an issue, you can either stick to the "classic" individual binaries or you can decompress the universal binary, e.g.:

upx -d dwarfs-universal-0.7.0-Linux-aarch64

The universal binaries can be run through symbolic links named after the proper tool, e.g.:

$ ln -s dwarfs-universal-0.7.0-Linux-aarch64 mkdwarfs
$ ./mkdwarfs --help

This also works on Windows if the file system supports symbolic links:

> mklink mkdwarfs.exe dwarfs-universal-0.7.0-Windows-AMD64.exe
> .\mkdwarfs.exe --help

Alternatively, you can select the tool by passing --tool=<name> as the first argument on the command line:

> .\dwarfs-universal-0.7.0-Windows-AMD64.exe --tool=mkdwarfs --help

Note that just like the dwarfs.exe Windows binary, the universal Windows binary depends on the winfsp-x64.dll from the WinFsp project. However, for the universal binary, the DLL is loaded lazily, so you can still use all other tools without the DLL. See the Windows Support section for more details.

Dependencies

DwarFS uses CMake as a build tool.

It uses both Boost and Folly, though the latter is included as a submodule since very few distributions actually offer packages for it. Folly itself has a number of dependencies, so please check here for an up-to-date list.

It also uses Facebook Thrift, in particular the frozen library, for storing metadata in a highly space-efficient, memory-mappable and well-defined format. It's also included as a submodule, and we only build the compiler and a very reduced library that contains just enough for DwarFS to work.

Other than that, DwarFS really only depends on FUSE3 and on a set of compression libraries that Folly already depends on (namely lz4, zstd and liblzma).

The dependency on googletest will be automatically resolved if you build with tests.

A good starting point for apt-based systems is probably:

$ apt install \
    gcc \
    g++ \
    clang \
    git \
    ccache \
    ninja-build \
    cmake \
    make \
    bison \
    flex \
    fuse3 \
    pkg-config \
    binutils-dev \
    libacl1-dev \
    libarchive-dev \
    libbenchmark-dev \
    libboost-chrono-dev \
    libboost-context-dev \
    libboost-filesystem-dev \
    libboost-iostreams-dev \
    libboost-program-options-dev \
    libboost-regex-dev \
    libboost-system-dev \
    libboost-thread-dev \
    libbrotli-dev \
    libevent-dev \
    libhowardhinnant-date-dev \
    libjemalloc-dev \
    libdouble-conversion-dev \
    libiberty-dev \
    liblz4-dev \
    liblzma-dev \
    libzstd-dev \
    libxxhash-dev \
    libmagic-dev \
    libparallel-hashmap-dev \
    librange-v3-dev \
    libssl-dev \
    libunwind-dev \
    libdwarf-dev \
    libelf-dev \
    libfmt-dev \
    libfuse3-dev \
    libgoogle-glog-dev \
    libutfcpp-dev \
    libflac++-dev \
    nlohmann-json3-dev

Note that when building with gcc, the optimization level will be set to -O2 instead of the CMake default of -O3 for release builds. At least with versions up to gcc-10, the -O3 build is up to 70% slower than a build with -O2.

Building

First, unpack the release archive:

$ tar xvf dwarfs-x.y.z.tar.xz
$ cd dwarfs-x.y.z

Alternatively, you can also clone the git repository, but be aware that this has more dependencies and the build will likely take longer, because the release archive ships with most of the auto-generated files that will have to be generated when building from the repository:

$ git clone --recurse-submodules https://github.com/mhx/dwarfs
$ cd dwarfs

Once all dependencies have been installed, you can build DwarFS using:

$ mkdir build
$ cd build
$ cmake .. -GNinja -DWITH_TESTS=ON
$ ninja

You can then run tests with:

$ ctest -j

All binaries use jemalloc as a memory allocator by default, as it typically uses much less system memory compared to the glibc or tcmalloc allocators. To disable the use of jemalloc, pass -DUSE_JEMALLOC=0 on the cmake command line.
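For example, combining this with the build flags shown above:

$ cmake .. -GNinja -DWITH_TESTS=ON -DUSE_JEMALLOC=0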

It is also possible to build/install the DwarFS libraries, tools, and FUSE driver independently. This is mostly interesting when packaging DwarFS. Note that the tools and FUSE driver require the libraries to be either built or already installed. To build just the libraries, use:

$ cmake .. -GNinja -DWITH_TESTS=ON -DWITH_LIBDWARFS=ON -DWITH_TOOLS=OFF -DWITH_FUSE_DRIVER=OFF

Once the libraries are tested and installed, you can build the tools (i.e. mkdwarfs, dwarfsck, dwarfsextract) using:

$ cmake .. -GNinja -DWITH_TESTS=ON -DWITH_LIBDWARFS=OFF -DWITH_TOOLS=ON -DWITH_FUSE_DRIVER=OFF

To build the FUSE driver, use:

$ cmake .. -GNinja -DWITH_TESTS=ON -DWITH_LIBDWARFS=OFF -DWITH_TOOLS=OFF -DWITH_FUSE_DRIVER=ON

Installing

Installing is as easy as:

$ sudo ninja install

Though you don't have to install the tools to play with them.
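For a quick test drive, you can run the tools straight from the build directory, e.g. (assuming the binaries end up at the top level of the build tree):

$ ./mkdwarfs --help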

Static Builds

Attempting to build statically linked binaries is highly discouraged and not officially supported. That being said, here's how to set up an environment where you might be able to build static binaries.

This has been tested with ubuntu-22.04-live-server-amd64.iso. First, install all the packages listed as dependencies above. Also install:

$ apt install ccache ninja libacl1-dev

ccache and ninja are optional, but help with a speedy compile.

Depending on your distribution, you'll need to build and install static versions of some libraries, e.g. libarchive and libmagic for Ubuntu:

$ wget https://github.com/libarchive/libarchive/releases/download/v3.6.2/libarchive-3.6.2.tar.xz
$ tar xf libarchive-3.6.2.tar.xz && cd libarchive-3.6.2
$ ./configure --prefix=/opt/static-libs --without-iconv --without-xml2 --without-expat
$ make && sudo make install

$ wget ftp://ftp.astron.com/pub/file/file-5.44.tar.gz
$ tar xf file-5.44.tar.gz && cd file-5.44
$ ./configure --prefix=/opt/static-libs --enable-static=yes --enable-shared=no
$ make && make install

That's it! Now you can try building static binaries for DwarFS:

$ git clone --recurse-submodules https://github.com/mhx/dwarfs
$ cd dwarfs && mkdir build && cd build
$ cmake .. -GNinja -DWITH_TESTS=ON -DSTATIC_BUILD_DO_NOT_USE=ON \
           -DSTATIC_BUILD_EXTRA_PREFIX=/opt/static-libs
$ ninja
$ ninja test

Usage

Please check out the manual pages for mkdwarfs, dwarfs, dwarfsck and dwarfsextract. You can also access the manual pages using the --man option to each binary, e.g.:

$ mkdwarfs --man

The dwarfs manual page also shows an example for setting up DwarFS with overlayfs in order to create a writable file system mount on top of a read-only DwarFS image.
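A minimal sketch of such a setup on Linux (all paths are placeholders; the dwarfs manual page has the authoritative example, which may use different options):

$ mkdir -p image rw work merged
$ dwarfs image.dwarfs image
$ sudo mount -t overlay overlay -o lowerdir=image,upperdir=rw,workdir=work merged

Writes then go to the rw directory, while the DwarFS image underneath stays untouched.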

A description of the DwarFS filesystem format can be found in dwarfs-format.

A high-level overview of the internal operation of mkdwarfs is shown in this sequence diagram.

Using the Libraries

Using the DwarFS libraries should be pretty straightforward if you're using CMake to build your project. For a quick start, have a look at the example code that uses the libraries to print information about a DwarFS image (like dwarfsck) or extract it (like dwarfsextract).

There are five individual libraries:

  • dwarfs_common contains the common code required by all the other libraries. The interfaces are defined in dwarfs/.

  • dwarfs_reader contains all code required to read data from a DwarFS image. The interfaces are defined in dwarfs/reader/.

  • dwarfs_extractor contains the code required to extract a DwarFS image using libarchive. The interfaces are defined in dwarfs/utility/filesystem_extractor.h.

  • dwarfs_writer contains the code required to create DwarFS images. The interfaces are defined in dwarfs/writer/.

  • dwarfs_rewrite contains the code to re-write DwarFS images. The interfaces are defined in dwarfs/utility/rewrite_filesystem.h.

The headers in internal subfolders are only accessible at build time and won't be installed. The same goes for the tool subfolder.

The reader and extractor APIs should be fairly stable. The writer APIs are likely going to change. Note, however, that there are no guarantees on API stability before this project reaches version 1.0.0.

Windows Support

Support for the Windows operating system is currently experimental. Having worked pretty much exclusively in a Unix world for the past two decades, my experience with Windows development is rather limited and I'd expect there to definitely be bugs and rough edges in the Windows code.

The Windows version of the DwarFS filesystem driver relies on the awesome WinFsp project and its winfsp-x64.dll must be discoverable by the dwarfs.exe driver.

The different tools should behave pretty much the same whether you're using them on Linux or Windows. The file system images can be copied between Linux and Windows, and images created on one OS should work fine on the other.

There are a few things worth pointing out, though:

  • DwarFS supports both hardlinks and symlinks on Windows, just as it does on Linux. However, creating hardlinks and symlinks seems to require admin privileges on Windows, so if you want to e.g. extract a DwarFS image that contains links of some sort, you might run into errors if you don't have the right privileges.

  • Due to a problem in WinFsp, symlinks cannot currently point outside of the mounted file system. Furthermore, due to another problem in WinFsp, symlinks with a drive letter will appear with a mangled target path.

  • The DwarFS driver on Windows correctly reports hardlink counts via its API, but currently these counts are not correctly propagated to the Windows file system layer. This is presumably due to a problem in WinFsp.

  • When mounting a DwarFS image on Windows, the mount point must not exist. This is different from Linux, where the mount point must actually exist. Also, it's possible to mount a DwarFS image as a drive letter, e.g.

    dwarfs.exe image.dwarfs Z:

  • Filter rules for mkdwarfs always require Unix path separators, regardless of whether it's running on Windows or Linux.
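To illustrate the last point, here's a hypothetical invocation on Windows; note the forward slashes in the rule. The -F option name and the exact rule syntax are assumptions on my part, so please check the mkdwarfs manual page for the authoritative syntax:

> mkdwarfs.exe -i C:\data\project -o project.dwarfs -F "- src/generated/"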

Building on Windows

Building on Windows is not too complicated thanks to vcpkg. Apart from the usual build tools, you'll need to install WinFsp.

WinFsp is expected to be installed in C:\Program Files (x86)\WinFsp; if it's not, you'll need to set WINFSP_PATH when running CMake via cmake/win.bat.
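For instance, assuming WinFsp lives in a non-default location (the path below is just an example), you'd set the variable in the shell before running cmake/win.bat later on:

> set WINFSP_PATH=D:\WinFsp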

Now you need to clone vcpkg and dwarfs:

> cd %HOMEPATH%
> mkdir git
> cd git
> git clone https://github.com/Microsoft/vcpkg.git
> git clone https://github.com/mhx/dwarfs

Then, bootstrap vcpkg:

> .\vcpkg\bootstrap-vcpkg.bat

And build DwarFS:

> cd dwarfs
> mkdir build
> cd build
> ..\cmake\win.bat
> ninja

Once that's done, you should be able to run the tests. Set CTEST_PARALLEL_LEVEL according to the number of CPU cores in your machine.

> set CTEST_PARALLEL_LEVEL=10
> ninja test

macOS Support

The DwarFS libraries and tools (mkdwarfs, dwarfsck, dwarfsextract) are now available from Homebrew:

$ brew install dwarfs
$ brew test dwarfs

The macOS version of the DwarFS filesystem driver relies on the awesome macFUSE project and is available from gromgit's homebrew-fuse tap:

$ brew tap gromgit/homebrew-fuse
$ brew install dwarfs-fuse-mac

Use Cases

Astrophotography

Astrophotography can generate huge amounts of raw image data. During a single night, it's not unlikely to end up with a few dozen gigabytes of data. With most dedicated astrophotography cameras, this data ends up in the form of FITS images. These are usually uncompressed, don't compress very well with standard compression algorithms, and while there are certain compressed FITS formats, these aren't widely supported.

One of the compression formats (simply called "Rice") compresses reasonably well and is really fast. However, its implementation for compressed FITS has a few drawbacks. The most severe drawbacks are that compression isn't quite as good as it could be for color sensors and for sensors with less than 16 bits of resolution.

DwarFS supports the ricepp (Rice++) compression, which builds on the basic idea of Rice compression, but makes a few enhancements: it compresses color and low bit depth images significantly better and always searches for the optimum solution during compression instead of relying on a heuristic.

Let's look at an example using 129 images (darks, flats and lights) taken with an ASI1600MM camera. Each image is 32 MiB, so a total of 4 GiB of data. Compressing these with the standard fpack tool takes about 16.6 seconds and yields a total output size of 2.2 GiB:

$ time fpack */*.fit */*/*.fit
user   14.992
system  1.592
total  16.616
$ find . -name '*.fz' -print0 | xargs -0 cat | wc -c
2369943360

However, this leaves you with *.fz files that not every application can actually read.

Using DwarFS, here's what we get:

$ mkdwarfs -i ASI1600 -o asi1600-20.dwarfs -S 20 --categorize
I 08:47:47.459077 scanning "ASI1600"
I 08:47:47.491492 assigning directory and link inodes...
I 08:47:47.491560 waiting for background scanners...
I 08:47:47.675241 scanning CPU time: 1.051s
I 08:47:47.675271 finalizing file inodes...
I 08:47:47.675330 saved 0 B / 3.941 GiB in 0/258 duplicate files
I 08:47:47.675360 assigning device inodes...
I 08:47:47.675371 assigning pipe/socket inodes...
I 08:47:47.675381 building metadata...
I 08:47:47.675393 building blocks...
I 08:47:47.675398 saving names and symlinks...
I 08:47:47.675514 updating name and link indices...
I 08:47:47.675796 waiting for segmenting/blockifying to finish...
I 08:47:50.274285 total ordering CPU time: 616.3us
I 08:47:50.274329 total segmenting CPU time: 1.132s
I 08:47:50.279476 saving chunks...
I 08:47:50.279622 saving directories...
I 08:47:50.279674 saving shared files table...
I 08:47:50.280745 saving names table... [1.047ms]
I 08:47:50.280768 saving symlinks table... [743ns]
I 08:47:50.282031 waiting for compression to finish...
I 08:47:50.823924 compressed 3.941 GiB to 1.201 GiB (ratio=0.304825)
I 08:47:50.824280 compression CPU time: 17.92s
I 08:47:50.824316 filesystem created without errors [3.366s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
5 dirs, 0/0 soft/hard links, 258/258 files, 0 other
original size: 3.941 GiB, hashed: 315.4 KiB (18 files, 0 B/s)
scanned: 3.941 GiB (258 files, 117.1 GiB/s), categorizing: 0 B/s
saved by deduplication: 0 B (0 files), saved by segmenting: 0 B
filesystem: 3.941 GiB in 4037 blocks (4550 chunks, 516/516 fragments, 258 inodes)
compressed filesystem: 4037 blocks/1.201 GiB written

In less than 3.4 seconds, it compresses the data down to 1.2 GiB, almost half the size of the fpack output.

In addition to saving a lot of disk space, this can also be useful when your data is stored on a NAS. Here's a comparison of the same set of data accessed over a 1 Gb/s network connection, first using the uncompressed raw data:

find /mnt/ASI1600 -name '*.fit' -print0 | xargs -0 -P4 -n1 cat | dd of=/dev/null status=progress
4229012160 bytes (4.2 GB, 3.9 GiB) copied, 36.0455 s, 117 MB/s

And next using a DwarFS image on the same share:

$ dwarfs /mnt/asi1600-20.dwarfs mnt
$ find mnt -name '*.fit' -print0 | xargs -0 -P4 -n1 cat | dd of=/dev/null status=progress
4229012160 bytes (4.2 GB, 3.9 GiB) copied, 14.3681 s, 294 MB/s

That's roughly 2.5 times faster. You can very likely see similar results with slow external hard drives.

Dealing with Bit Rot

Currently, DwarFS has no built-in ability to add recovery information to a file system image. However, for archival purposes, it's a good idea to have such recovery information in order to be able to repair a damaged image.

This is fortunately relatively straightforward using something like par2cmdline:

$ par2create -n1 asi1600-20.dwarfs

This will create two additional files that you can place alongside the image (or on different storage), as you'll only need them if DwarFS has detected an issue with the file system image. If there's an issue, you can run

$ par2repair asi1600-20.dwarfs

which will very likely be able to recover the image if less than 5% (that's the default used by par2create) of the image is damaged.
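If you want more headroom than the 5% default, par2create also lets you pick the redundancy level explicitly; the 10% below is just an illustration:

$ par2create -r10 -n1 asi1600-20.dwarfs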

Extended Attributes

Preserving Extended Attributes in DwarFS Images

Extended attributes are not currently supported. Any extended attributes stored in the source file system will not currently be preserved when building a DwarFS image using mkdwarfs.

Extended Attributes exposed by the FUSE Driver

That being said, the root inode of a mounted DwarFS image currently exposes one or two extended attributes on Linux:

$ attr -l mnt
Attribute "dwarfs.driver.pid" has a 4 byte value for mnt
Attribute "dwarfs.driver.perfmon" has a 4849 byte value for mnt

The dwarfs.driver.pid attribute simply contains the PID of the DwarFS FUSE driver. The dwarfs.driver.perfmon attribute contains the current results of the performance monitor.
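Both can be read with the same attr tool used above, e.g. (output omitted here):

$ attr -qg dwarfs.driver.pid mnt
$ attr -qg dwarfs.driver.perfmon mnt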

Furthermore, each regular file exposes an attribute dwarfs.inodeinfo with information about the underlying inode:

$ attr -l "05 Disappear.caf"
Attribute "dwarfs.inodeinfo" has a 448 byte value for 05 Disappear.caf

The attribute contains a JSON object with information about the underlying inode:

$ attr -qg dwarfs.inodeinfo "05 Disappear.caf"
{
  "chunks": [
    {
      "block": 2,
      "category": "pcmaudio/metadata",
      "offset": 270976,
      "size": 4096
    },
    {
      "block": 414,
      "category": "pcmaudio/waveform",
      "offset": 37594368,
      "size": 29514492
    },
    {
      "block": 419,
      "category": "pcmaudio/waveform",
      "offset": 0,
      "size": 29385468
    }
  ],
  "gid": 100,
  "mode": 33188,
  "modestring": "----rw-r--r--",
  "uid": 1000
}

This is useful, for example, to check how a particular file is spread across multiple blocks or which categories have been assigned to the file.

Comparison

The SquashFS, xz, lrzip, zpaq and wimlib tests were all done on an 8 core Intel(R) Xeon(R) E-2286M CPU @ 2.40GHz with 64 GiB of RAM.

The Cromfs tests were done with an older version of DwarFS on a 6 core Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz with 64 GiB of RAM.

The EROFS tests were done using DwarFS v0.9.8 and EROFS v1.7.1 on an Intel(R) Core(TM) i9-13900K with 64 GiB of RAM.

The systems were mostly idle during all of the tests.

With SquashFS

The source directory contained 1139 different Perl installations from 284 distinct releases, a total of 47.65 GiB of data in 1,927,501 files and 330,733 directories. The source directory was freshly unpacked from a tar archive to an XFS partition on a 970 EVO Plus 2TB NVME drive, so most of its contents were likely cached.

I'm using the same compression type and compression level for SquashFS that is the default setting for DwarFS:

$ time mksquashfs install perl-install.squashfs -comp zstd -Xcompression-level 22
Parallel mksquashfs: Using 16 processors
Creating 4.0 filesystem on perl-install-zstd.squashfs, block size 131072.
[=========================================================/] 2107401/2107401 100%
Exportable Squashfs 4.0 filesystem, zstd compressed, data block size 131072
        compressed data, compressed metadata, compressed fragments,
        compressed xattrs, compressed ids
        duplicates are removed
Filesystem size 4637597.63 Kbytes (4528.90 Mbytes)
        9.29% of uncompressed filesystem size (49922299.04 Kbytes)
Inode table size 19100802 bytes (18653.13 Kbytes)
        26.06% of uncompressed inode table size (73307702 bytes)
Directory table size 19128340 bytes (18680.02 Kbytes)
        46.28% of uncompressed directory table size (41335540 bytes)
Number of duplicate files found 1780387
Number of inodes 2255794
Number of files 1925061
Number of fragments 28713
Number of symbolic links  0
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 330733
Number of ids (unique uids + gids) 2
Number of uids 1
        mhx (1000)
Number of gids 1
        users (100)

real    32m54.713s
user    501m46.382s
sys     0m58.528s

For DwarFS, I'm sticking to the defaults:

$ time mkdwarfs -i install -o perl-install.dwarfs
I 11:33:33.310931 scanning install
I 11:33:39.026712 waiting for background scanners...
I 11:33:50.681305 assigning directory and link inodes...
I 11:33:50.888441 finding duplicate files...
I 11:34:01.120800 saved 28.2 GiB / 47.65 GiB in 1782826/1927501 duplicate files
I 11:34:01.122608 waiting for inode scanners...
I 11:34:12.839065 assigning device inodes...
I 11:34:12.875520 assigning pipe/socket inodes...
I 11:34:12.910431 building metadata...
I 11:34:12.910524 building blocks...
I 11:34:12.910594 saving names and links...
I 11:34:12.910691 bloom filter size: 32 KiB
I 11:34:12.910760 ordering 144675 inodes using nilsimsa similarity...
I 11:34:12.915555 nilsimsa: depth=20000 (1000), limit=255
I 11:34:13.052525 updating name and link indices...
I 11:34:13.276233 pre-sorted index (660176 name, 366179 path lookups) [360.6ms]
I 11:35:44.039375 144675 inodes ordered [91.13s]
I 11:35:44.041427 waiting for segmenting/blockifying to finish...
I 11:37:38.823902 bloom filter reject rate: 96.017% (TPR=0.244%, lookups=4740563665)
I 11:37:38.823963 segmentation matches: good=454708, bad=6819, total=464247
I 11:37:38.824005 segmentation collisions: L1=0.008%, L2=0.000% [2233254 hashes]
I 11:37:38.824038 saving chunks...
I 11:37:38.860939 saving directories...
I 11:37:41.318747 waiting for compression to finish...
I 11:38:56.046809 compressed 47.65 GiB to 430.9 MiB (ratio=0.00883101)
I 11:38:56.304922 filesystem created without errors [323s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
330733 dirs, 0/2440 soft/hard links, 1927501/1927501 files, 0 other
original size: 47.65 GiB, dedupe: 28.2 GiB (1782826 files), segment: 15.19 GiB
filesystem: 4.261 GiB in 273 blocks (319178 chunks, 144675/144675 inodes)
compressed filesystem: 273 blocks/430.9 MiB written [depth: 20000]
█████████████████████████████████████████████████████████████████████████████▏100% |

real    5m23.030s
user    78m7.554s
sys     1m47.968s

So in this comparison, mkdwarfs is more than 6 times faster than mksquashfs, both in terms of CPU time and wall clock time.

$ ll perl-install.*fs
-rw-r--r-- 1 mhx users  447230618 Mar  3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users 4748902400 Mar  3 20:10 perl-install.squashfs

In terms of compression ratio, the DwarFS file system is more than 10 times smaller than the SquashFS file system. With DwarFS, the content has been compressed down to less than 0.9% (!) of its original size. This compression ratio only considers the data stored in the individual files, not the actual disk space used. On the original XFS file system, according to du, the source folder uses 52 GiB, so the DwarFS image actually only uses 0.8% of the original space.

Here's another comparison using lzma compression instead of zstd:

$ time mksquashfs install perl-install-lzma.squashfs -comp lzma

real    13m42.825s
user    205m40.851s
sys     3m29.088s

$ time mkdwarfs -i install -o perl-install-lzma.dwarfs -l9

real    3m43.937s
user    49m45.295s
sys     1m44.550s

$ ll perl-install-lzma.*fs
-rw-r--r-- 1 mhx users  315482627 Mar  3 21:23 perl-install-lzma.dwarfs
-rw-r--r-- 1 mhx users 3838406656 Mar  3 20:50 perl-install-lzma.squashfs

It's immediately obvious that the runs are significantly faster and the resulting images are significantly smaller. Still, mkdwarfs is about 4 times faster and produces an image that's 12 times smaller than the SquashFS image. The DwarFS image is only 0.6% of the original file size.

So, why not use lzma instead of zstd by default? The reason is that lzma is about an order of magnitude slower to decompress than zstd. If you're only accessing data on your compressed filesystem occasionally, this might not be a big deal, but if you use it extensively, zstd will result in better performance.

The comparisons above are not completely fair. mksquashfs by default uses a block size of 128KiB, whereas mkdwarfs uses 16MiB blocks by default, or even 64MiB blocks with -l9. When using identical block sizes for both file systems, the difference, quite expectedly, becomes a lot less dramatic:

$ time mksquashfs install perl-install-lzma-1M.squashfs -comp lzma -b 1M

real    15m43.319s
user    139m24.533s
sys     0m45.132s

$ time mkdwarfs -i install -o perl-install-lzma-1M.dwarfs -l9 -S20 -B3

real    4m25.973s
user    52m15.100s
sys     7m41.889s

$ ll perl-install*.*fs
-rw-r--r-- 1 mhx users  935953866 Mar 13 12:12 perl-install-lzma-1M.dwarfs
-rw-r--r-- 1 mhx users 3407474688 Mar  3 21:54 perl-install-lzma-1M.squashfs

Even this is still not entirely fair, as it uses a feature (-B3) that allows DwarFS to reference file chunks from up to two previous filesystem blocks.

But the point is that this is really where SquashFS tops out, as it doesn't support larger block sizes or back-referencing. And as you'll see below, the larger blocks that DwarFS is using by default don't necessarily negatively impact performance.

DwarFS also features an option to recompress an existing file system with a different compression algorithm. This can be useful as it allows relatively fast experimentation with different algorithms and options without requiring a full rebuild of the file system. For example, recompressing the above file system with the best possible compression (-l 9):

$ time mkdwarfs --recompress -i perl-install.dwarfs -o perl-lzma-re.dwarfs -l9
I 20:28:03.246534 filesystem rewritten without errors [148.3s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
filesystem: 4.261 GiB in 273 blocks (0 chunks, 0 inodes)
compressed filesystem: 273/273 blocks/372.7 MiB written
████████████████████████████████████████████████████████████████████▏100% \

real    2m28.279s
user    37m8.825s
sys     0m43.256s

$ ll perl-*.dwarfs
-rw-r--r-- 1 mhx users 447230618 Mar  3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users 390845518 Mar  4 20:28 perl-lzma-re.dwarfs
-rw-r--r-- 1 mhx users 315482627 Mar  3 21:23 perl-install-lzma.dwarfs

Note that while the recompressed filesystem is smaller than the original image, it is still a lot bigger than the filesystem we previously built with -l9. The reason is that the recompressed image still uses the same block size, and the block size cannot be changed by recompressing.

In terms of how fast the file system is when using it, a quick test I've done is to freshly mount the filesystem created above and run each of the 1139 perl executables to print their version.

$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      1.810 s ±  0.013 s    [User: 1.847 s, System: 0.623 s]
  Range (min … max):    1.788 s …  1.825 s    10 runs
Benchmark #2: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      1.333 s ±  0.009 s    [User: 1.993 s, System: 0.656 s]
  Range (min … max):    1.321 s …  1.354 s    10 runs
Benchmark #3: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P15 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      1.181 s ±  0.018 s    [User: 2.086 s, System: 0.712 s]
  Range (min … max):    1.165 s …  1.214 s    10 runs
Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      1.149 s ±  0.015 s    [User: 2.128 s, System: 0.781 s]
  Range (min … max):    1.136 s …  1.186 s    10 runs

These timings are for initial runs on a freshly mounted file system, running 5, 10, 15 and 20 processes in parallel. 1.1 seconds means that it takes only about 1 millisecond per Perl binary.

Following are timings for subsequent runs, both on DwarFS (at mnt) and the original XFS (at install). DwarFS is around 15% slower here:

$ hyperfine -P procs 10 20 -D 10 -w1 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'" "ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     347.0 ms ±   7.2 ms    [User: 1.755 s, System: 0.452 s]
  Range (min … max):   341.3 ms … 365.2 ms    10 runs
Benchmark #2: ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     302.5 ms ±   3.3 ms    [User: 1.656 s, System: 0.377 s]
  Range (min … max):   297.1 ms … 308.7 ms    10 runs
Benchmark #3: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     342.2 ms ±   4.1 ms    [User: 1.766 s, System: 0.451 s]
  Range (min … max):   336.0 ms … 349.7 ms    10 runs
Benchmark #4: ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     302.0 ms ±   3.0 ms    [User: 1.659 s, System: 0.374 s]
  Range (min … max):   297.0 ms … 305.4 ms    10 runs
Summary
  'ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'' ran
    1.00 ± 0.01 times faster than 'ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''
    1.13 ± 0.02 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null''
    1.15 ± 0.03 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''

Using the lzma-compressed file system, the metrics for initial runs look considerably worse (about an order of magnitude):

$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install-lzma.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     10.660 s ±  0.057 s    [User: 1.952 s, System: 0.729 s]
  Range (min … max):   10.615 s … 10.811 s    10 runs
Benchmark #2: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      9.092 s ±  0.021 s    [User: 1.979 s, System: 0.680 s]
  Range (min … max):    9.059 s …  9.126 s    10 runs
Benchmark #3: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P15 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      9.012 s ±  0.188 s    [User: 2.077 s, System: 0.702 s]
  Range (min … max):    8.839 s …  9.277 s    10 runs
Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      9.004 s ±  0.298 s    [User: 2.134 s, System: 0.736 s]
  Range (min … max):    8.611 s …  9.555 s    10 runs

So you might want to consider using zstd instead of lzma if you'd like to optimize for file system performance. It's also the default compression used by mkdwarfs.

Now here's a comparison with the SquashFS filesystem:

$ hyperfine -c 'sudo umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'sudo umount mnt; sudo mount -t squashfs perl-install.squashfs mnt; sleep 1' -n squashfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
Benchmark #1: dwarfs-zstd
  Time (mean ± σ):      1.151 s ±  0.015 s    [User: 2.147 s, System: 0.769 s]
  Range (min … max):    1.118 s …  1.174 s    10 runs
Benchmark #2: squashfs-zstd
  Time (mean ± σ):      6.733 s ±  0.007 s    [User: 3.188 s, System: 17.015 s]
  Range (min … max):    6.721 s …  6.743 s    10 runs
Summary
  'dwarfs-zstd' ran
    5.85 ± 0.08 times faster than 'squashfs-zstd'

So, DwarFS is almost six times faster than SquashFS. But what's more, SquashFS also uses significantly more CPU power. However, the numbers shown above for DwarFS obviously don't include the time spent in the dwarfs process, so I repeated the test outside of hyperfine:

$ time dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4 -f

real    0m4.569s
user    0m2.154s
sys     0m1.846s

So, in total, DwarFS was using 5.7 seconds of CPU time, whereas SquashFS was using 20.2 seconds, almost four times as much. Ignore the 'real' time; it is only how long it took me to unmount the file system again after mounting it.

Another real-life test was to build and test a Perl module with 624 different Perl versions in the compressed file system. The module I've used, Tie::Hash::Indexed, has an XS component that requires a C compiler to build. So this really accesses a lot of different stuff in the file system:

  • The perl executables and their shared libraries

  • The Perl modules used for writing the Makefile

  • Perl's C header files used for building the module

  • More Perl modules used for running the tests

I wrote a little script to be able to run multiple builds in parallel:

#!/bin/bash
set -eu
perl=$1
dir=$(echo "$perl" | cut -d/ --output-delimiter=- -f5,6)
rsync -a Tie-Hash-Indexed/ $dir/
cd $dir
$1 Makefile.PL >/dev/null 2>&1
make test >/dev/null 2>&1
cd ..
rm -rf $dir
echo $perl

The following command will run up to 16 builds in parallel on the 8 core Xeon CPU, including debug, optimized and threaded versions of all Perl releases between 5.10.0 and 5.33.3, a total of 624 perl installations:

$ time ls -1 /tmp/perl/install/*/perl-5.??.?/bin/perl5* | sort -t / -k 8 | xargs -d $'\n' -P 16 -n 1 ./build.sh

Tests were done with a cleanly mounted file system to make sure the caches were empty. ccache was primed to make sure all compiler runs could be satisfied from the cache. With SquashFS, the timing was:

real    0m52.385s
user    8m10.333s
sys     4m10.056s

And with DwarFS:

real    0m50.469s
user    9m22.597s
sys     1m18.469s

So, frankly, not much of a difference, with DwarFS being just a bit faster. The dwarfs process itself used:

real    0m56.686s
user    0m18.857s
sys     0m21.058s

So again, DwarFS used less raw CPU power overall, but in terms of wall clock time, the difference is really marginal.

With SquashFS & xz

This test uses slightly less pathological input data: the root filesystem of a recent Raspberry Pi OS release. This file system also contains device inodes, so in order to preserve those, we pass --with-devices to mkdwarfs:

$ time sudo mkdwarfs -i raspbian -o raspbian.dwarfs --with-devices
I 21:30:29.812562 scanning raspbian
I 21:30:29.908984 waiting for background scanners...
I 21:30:30.217446 assigning directory and link inodes...
I 21:30:30.221941 finding duplicate files...
I 21:30:30.288099 saved 31.05 MiB / 1007 MiB in 1617/34582 duplicate files
I 21:30:30.288143 waiting for inode scanners...
I 21:30:31.393710 assigning device inodes...
I 21:30:31.394481 assigning pipe/socket inodes...
I 21:30:31.395196 building metadata...
I 21:30:31.395230 building blocks...
I 21:30:31.395291 saving names and links...
I 21:30:31.395374 ordering 32965 inodes using nilsimsa similarity...
I 21:30:31.396254 nilsimsa: depth=20000 (1000), limit=255
I 21:30:31.407967 pre-sorted index (46431 name, 2206 path lookups) [11.66ms]
I 21:30:31.410089 updating name and link indices...
I 21:30:38.178505 32965 inodes ordered [6.783s]
I 21:30:38.179417 waiting for segmenting/blockifying to finish...
I 21:31:06.248304 saving chunks...
I 21:31:06.251998 saving directories...
I 21:31:06.402559 waiting for compression to finish...
I 21:31:16.425563 compressed 1007 MiB to 287 MiB (ratio=0.285036)
I 21:31:16.464772 filesystem created without errors [46.65s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
4435 dirs, 5908/0 soft/hard links, 34582/34582 files, 7 other
original size: 1007 MiB, dedupe: 31.05 MiB (1617 files), segment: 47.23 MiB
filesystem: 928.4 MiB in 59 blocks (38890 chunks, 32965/32965 inodes)
compressed filesystem: 59 blocks/287 MiB written [depth: 20000]
████████████████████████████████████████████████████████████████████▏100% |

real    0m46.711s
user    10m39.038s
sys     0m8.123s

Again, SquashFS uses the same compression options:

$ time sudo mksquashfs raspbian raspbian.squashfs -comp zstd -Xcompression-level 22
Parallel mksquashfs: Using 16 processors
Creating 4.0 filesystem on raspbian.squashfs, block size 131072.
[===============================================================\] 39232/39232 100%
Exportable Squashfs 4.0 filesystem, zstd compressed, data block size 131072
        compressed data, compressed metadata, compressed fragments,
        compressed xattrs, compressed ids
        duplicates are removed
Filesystem size 371934.50 Kbytes (363.22 Mbytes)
        35.98% of uncompressed filesystem size (1033650.60 Kbytes)
Inode table size 399913 bytes (390.54 Kbytes)
        26.53% of uncompressed inode table size (1507581 bytes)
Directory table size 408749 bytes (399.17 Kbytes)
        42.31% of uncompressed directory table size (966174 bytes)
Number of duplicate files found 1618
Number of inodes 44932
Number of files 34582
Number of fragments 3290
Number of symbolic links  5908
Number of device nodes 7
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 4435
Number of ids (unique uids + gids) 18
Number of uids 5
        root (0)
        mhx (1000)
        unknown (103)
        shutdown (6)
        unknown (106)
Number of gids 15
        root (0)
        unknown (109)
        unknown (42)
        unknown (1000)
        users (100)
        unknown (43)
        tty (5)
        unknown (108)
        unknown (111)
        unknown (110)
        unknown (50)
        mail (12)
        nobody (65534)
        adm (4)
        mem (8)

real    0m50.124s
user    9m41.708s
sys     0m1.727s

The difference in speed is almost negligible. SquashFS is just a bit slower here. In terms of compression, the difference also isn't huge:

$ ls -lh raspbian.* *.xz
-rw-r--r-- 1 mhx  users 297M Mar  4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz
-rw-r--r-- 1 root root  287M Mar  4 21:31 raspbian.dwarfs
-rw-r--r-- 1 root root  364M Mar  4 21:33 raspbian.squashfs

Interestingly, xz actually can't compress the whole original image better than DwarFS.

We can even try to increase the DwarFS compression level:

$ time sudo mkdwarfs -i raspbian -o raspbian-9.dwarfs --with-devices -l9

real    0m54.161s
user    8m40.109s
sys     0m7.101s

Now that actually gets the DwarFS image size well below that of the xz archive:

$ ls -lh raspbian-9.dwarfs *.xz
-rw-r--r-- 1 root root  244M Mar  4 21:36 raspbian-9.dwarfs
-rw-r--r-- 1 mhx  users 297M Mar  4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz

Even if you actually build a tarball and compress that (instead of compressing the EXT4 file system itself), xz isn't quite able to match the DwarFS image size:

$ time sudo tar cf - raspbian | xz -9 -vT 0 >raspbian.tar.xz
  100 %     246.9 MiB / 1,037.2 MiB = 0.238    13 MiB/s       1:18

real    1m18.226s
user    6m35.381s
sys     0m2.205s

$ ls -lh raspbian.tar.xz
-rw-r--r-- 1 mhx users 247M Mar  4 21:40 raspbian.tar.xz

DwarFS also comes with the dwarfsextract tool that allows extraction of a filesystem image without the FUSE driver. So here's a comparison of the extraction speed:

$ time sudo tar xf raspbian.tar.xz -C out1

real    0m12.846s
user    0m12.313s
sys     0m1.616s

$ time sudo dwarfsextract -i raspbian-9.dwarfs -o out2

real    0m3.825s
user    0m13.234s
sys     0m1.382s

So, dwarfsextract is almost 4 times faster thanks to using multiple worker threads for decompression. It's writing about 300 MiB/s in this example.

Another nice feature of dwarfsextract is that it allows you to directly output data in an archive format, so you could create a tarball from your image without extracting the files to disk:

$ dwarfsextract -i raspbian-9.dwarfs -f ustar | xz -9 -T0 >raspbian2.tar.xz

This has the interesting side-effect that the resulting tarball will likely be smaller than the one built straight from the directory:

$ ls -lh raspbian*.tar.xz
-rw-r--r-- 1 mhx users 247M Mar  4 21:40 raspbian.tar.xz
-rw-r--r-- 1 mhx users 240M Mar  4 23:52 raspbian2.tar.xz

That's because dwarfsextract writes files in inode order, and by default inodes are ordered by similarity for the best possible compression.

With lrzip

lrzip is a compression utility targeted especially at compressing large files. From its description, it looks like it does something very similar to DwarFS, i.e. it looks for duplicate segments before passing the de-duplicated data on to an lzma compressor.

When I first read about lrzip, I was pretty certain it would easily beat DwarFS. So let's take a look. lrzip operates on a single file, so it's necessary to first build a tarball:

$ time tar cf perl-install.tar install

real    2m9.568s
user    0m3.757s
sys     0m26.623s

Now we can run lrzip:

$ time lrzip -vL9 -o perl-install.tar.lrzip perl-install.tar
The following options are in effect for this COMPRESSION.
Threading is ENABLED. Number of CPUs detected: 16
Detected 67106172928 bytes ram
Compression level 9
Nice Value: 19
Show Progress
Verbose
Output Filename Specified: perl-install.tar.lrzip
Temporary Directory set as: ./
Compression mode is: LZMA. LZO Compressibility testing enabled
Heuristically Computed Compression Window: 426 = 42600MB
File size: 52615639040
Will take 2 passes
Beginning rzip pre-processing phase
Beginning rzip pre-processing phase
perl-install.tar - Compression Ratio: 100.378. Average Compression Speed: 14.536MB/s.
Total time: 00:57:32.47

real    57m32.472s
user    81m44.104s
sys     4m50.221s

That definitely took a while. This is about an order of magnitude slower than mkdwarfs and it barely makes use of the 8 cores.

$ ll -h perl-install.tar.lrzip
-rw-r--r-- 1 mhx users 500M Mar  6 21:16 perl-install.tar.lrzip

This is a surprisingly disappointing result. The archive is 65% larger than a DwarFS image at -l9 that takes less than 4 minutes to build. Also, you can't just access the files in the .lrzip without fully unpacking the archive first.

That being said, it is better than just using xz on the tarball:

$ time xz -T0 -v9 -c perl-install.tar >perl-install.tar.xz
perl-install.tar (1/1)
  100 %      4,317.0 MiB / 49.0 GiB = 0.086    24 MiB/s      34:55

real    34m55.450s
user    543m50.810s
sys     0m26.533s

$ ll perl-install.tar.xz -h
-rw-r--r-- 1 mhx users 4.3G Mar  6 22:59 perl-install.tar.xz

With zpaq

zpaq is a journaling backup utility and archiver. Again, it appears to share some of the ideas in DwarFS, like segmentation analysis, but it also adds some features on top that make it useful for incremental backups. However, it's also not usable as a file system, so data needs to be extracted before it can be used.

Anyway, how does it fare in terms of speed and compression performance?

$ time zpaq a perl-install.zpaq install -m5

After a few million lines of output that (I think) cannot be turned off:

2258234 +added, 0 -removed.

0.000000 + (51161.953159 -> 8932.000297 -> 490.227707) = 490.227707 MB
2828.082 seconds (all OK)

real    47m8.104s
user    714m44.286s
sys     3m6.751s

So, it's an order of magnitude slower than mkdwarfs and uses 14 times as much CPU resources as mkdwarfs -l9. The resulting archive is pretty close in size to the default configuration DwarFS image, but it's more than 50% bigger than the image produced by mkdwarfs -l9.

$ ll perl-install*.*
-rw-r--r-- 1 mhx users 490227707 Mar  7 01:38 perl-install.zpaq
-rw-r--r-- 1 mhx users 315482627 Mar  3 21:23 perl-install-l9.dwarfs
-rw-r--r-- 1 mhx users 447230618 Mar  3 20:28 perl-install.dwarfs

What's really surprising is how slow it is to extract the zpaq archive again:

$ time zpaq x perl-install.zpaq
2798.097 seconds (all OK)

real    46m38.117s
user    711m18.734s
sys     3m47.876s

That's 700 times slower than extracting the DwarFS image.

With zpaqfranz

zpaqfranz is a derivative of zpaq. Much to my delight, it doesn't generate millions of lines of output. It claims to be multi-threaded and de-duplicating, so definitely worth taking a look. Like zpaq, it supports incremental backups.

We'll use a different input to compare zpaqfranz and DwarFS: the source code of 670 different releases of the "wine" emulator. That's 73 gigabytes of data in total, spread across slightly more than 3 million files. It's obviously highly redundant and should thus be a good data set to compare the tools. For reference, a .tar.xz of the directory is still 7 GiB in size and a SquashFS image of the data gets down to around 1.6 GiB. An "optimized" .tar.xz, where the input files were ordered by similarity, compresses down to 399 MiB, almost 20 times better than without ordering.

Now it's time to try zpaqfranz. The input data is stored on a fast SSD and a large fraction of it is already in the file system cache from previous runs, so disk I/O is not a bottleneck.

$ time ./zpaqfranz a winesrc.zpaq winesrc
zpaqfranz v58.8k-JIT-L(2023-08-05)
Creating winesrc.zpaq at offset 0 + 0
Add 2024-01-11 07:25:22 3.117.413     69.632.090.852 (  64.85 GB) 16T (362.904 dirs)
3.480.317 +added, 0 -removed.

0 + (69.632.090.852 -> 8.347.553.798 -> 617.600.892) = 617.600.892 @ 58.38 MB/s

1137.441 seconds (000:18:57) (all OK)

real    18m58.632s
user    11m51.052s
sys     1m3.389s

That is considerably faster than the original zpaq, and uses about 60 times less CPU resources. The output file is 589 MiB, so slightly larger than both the "optimized" .tar.xz and the zpaq output.

How does mkdwarfs do?

$ time mkdwarfs -i winesrc -o winesrc.dwarfs -l9
[...]
I 07:55:20.546636 compressed 64.85 GiB to 93.2 MiB (ratio=0.00140344)
I 07:55:20.826699 compression CPU time: 6.726m
I 07:55:20.827338 filesystem created without errors [2.283m]
[...]

real    2m17.100s
user    9m53.633s
sys     2m29.236s

It uses pretty much the same amount of CPU resources, but finishes more than 8 times faster. The DwarFS output file is more than 6 times smaller.

You can actually squeeze a bit more redundancy out of the original data by tweaking the similarity ordering and switching from lzma to brotli compression, albeit at a somewhat slower compression speed:

mkdwarfs -i winesrc -o winesrc.dwarfs -l9 -C brotli:quality=11:lgwin=26 --order=nilsimsa:max-cluster-size=200k
[...]
I 08:21:01.138075 compressed 64.85 GiB to 73.52 MiB (ratio=0.00110716)
I 08:21:01.485737 compression CPU time: 36.58m
I 08:21:01.486313 filesystem created without errors [5.501m]
[...]

real    5m30.178s
user    40m59.193s
sys     2m36.234s

That's almost a 1000x reduction in size.

Let's also look at decompression speed:

$ time zpaqfranz x winesrc.zpaq
zpaqfranz v58.8k-JIT-L(2023-08-05)
/home/mhx/winesrc.zpaq:
1 versions, 3.480.317 files, 617.600.892 bytes (588.99 MB)
Extract 69.632.090.852 bytes (64.85 GB) in 3.117.413 files (362.904 folders) / 16 T
        99.18% 00:00:00  (  64.32 GB)=>(  64.85 GB)  548.83 MB/sec
125.636 seconds (000:02:05) (all OK)

real    2m6.968s
user    1m36.177s
sys     1m10.980s

$ time dwarfsextract -i winesrc.dwarfs

real    1m49.182s
user    0m34.667s
sys     1m28.733s

Decompression time is pretty much in the same ballpark, with just slightly shorter times for the DwarFS image.

With wimlib

wimlib is a really interesting project that is a lot more mature than DwarFS. While DwarFS at its core has a library component that could potentially be ported to other operating systems, wimlib already is available on many platforms. It also seems to have quite a rich set of features, so it's definitely worth taking a look at.

I first tried wimcapture on the perl dataset:

$ time wimcapture --unix-data --solid --solid-chunk-size=16M install perl-install.wim
Scanning "install"
47 GiB scanned (1927501 files, 330733 directories)
Using LZMS compression with 16 threads
Archiving file data: 19 GiB of 19 GiB (100%) done

real    15m23.310s
user    174m29.274s
sys     0m42.921s

$ ll perl-install.*
-rw-r--r-- 1 mhx users  447230618 Mar  3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users  315482627 Mar  3 21:23 perl-install-l9.dwarfs
-rw-r--r-- 1 mhx users 4748902400 Mar  3 20:10 perl-install.squashfs
-rw-r--r-- 1 mhx users 1016981520 Mar  6 21:12 perl-install.wim

So, wimlib is definitely much better than SquashFS, in terms of both compression ratio and speed. DwarFS is, however, about 3 times faster to create the file system, and the DwarFS file system is less than half the size. When switching to LZMA compression, the DwarFS file system is more than 3 times smaller (wimlib uses LZMS compression by default).

What's a bit surprising is that mounting a wim file takes quite a bit of time:

$ time wimmount perl-install.wim mnt
[WARNING] Mounting a WIM file containing solid-compressed data; file access may be slow.

real    0m2.038s
user    0m1.764s
sys     0m0.242s

Mounting the DwarFS image takes almost no time in comparison:

$ time git/github/dwarfs/build-clang-11/dwarfs perl-install-default.dwarfs mnt
I 00:23:39.238182 dwarfs (v0.4.0, fuse version 35)

real    0m0.003s
user    0m0.003s
sys     0m0.000s

That's just because it immediately forks into the background by default and initializes the file system in the background. However, even when running it in the foreground, initializing the file system takes only about 60 milliseconds:

$ dwarfs perl-install.dwarfs mnt -f
I 00:25:03.186005 dwarfs (v0.4.0, fuse version 35)
I 00:25:03.248061 file system initialized [60.95ms]

If you actually build the DwarFS file system with uncompressed metadata, mounting is basically instantaneous:

$ dwarfs perl-install-meta.dwarfs mnt -f
I 00:27:52.667026 dwarfs (v0.4.0, fuse version 35)
I 00:27:52.671066 file system initialized [2.879ms]

I've tried running the benchmark where all 1139 `perl` executables print their version with the wimlib image, but after about 10 minutes, it still hadn't finished the first run (with the DwarFS image, one run took slightly more than 2 seconds). I then tried the following instead:

$ ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P1 sh -c 'time $0 -v >/dev/null' 2>&1 | grep ^real
real    0m0.802s
real    0m0.652s
real    0m1.677s
real    0m1.973s
real    0m1.435s
real    0m1.879s
real    0m2.003s
real    0m1.695s
real    0m2.343s
real    0m1.899s
real    0m1.809s
real    0m1.790s
real    0m2.115s

Judging from that, it would have probably taken about half an hour for a single run, which makes at least the `--solid` wim image pretty much unusable for actually working with the file system.

The `--solid` option was suggested to me because it resembles the way that DwarFS actually organizes data internally. However, judging by the warning when mounting a solid image, it's probably not ideal when using the image as a mounted file system. So I tried again without `--solid`:

$ time wimcapture --unix-data install perl-install-nonsolid.wim
Scanning "install"
47 GiB scanned (1927501 files, 330733 directories)
Using LZX compression with 16 threads
Archiving file data: 19 GiB of 19 GiB (100%) done
real    8m39.034s
user    64m58.575s
sys     0m32.003s

This is still more than 3 minutes slower than `mkdwarfs`. However, it yields an image that's almost 10 times the size of the DwarFS image and comparable in size to the SquashFS image:

$ ll perl-install-nonsolid.wim -h
-rw-r--r-- 1 mhx users 4.6G Mar  6 23:24 perl-install-nonsolid.wim

This still takes surprisingly long to mount:

$ time wimmount perl-install-nonsolid.wim mnt
real    0m1.603s
user    0m1.327s
sys     0m0.275s

However, it's really usable as a file system, even though it's about 4-5 times slower than the DwarFS image:

$ hyperfine -c 'umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'umount mnt; wimmount perl-install-nonsolid.wim mnt; sleep 1' -n wimlib "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
Benchmark #1: dwarfs
  Time (mean ± σ):      1.149 s ±  0.019 s    [User: 2.147 s, System: 0.739 s]
  Range (min … max):    1.122 s …  1.187 s    10 runs
Benchmark #2: wimlib
  Time (mean ± σ):      7.542 s ±  0.069 s    [User: 2.787 s, System: 0.694 s]
  Range (min … max):    7.490 s …  7.732 s    10 runs
Summary
  'dwarfs' ran
    6.56 ± 0.12 times faster than 'wimlib'

With Cromfs

I used Cromfs in the past for compressed file systems and remember that it did a pretty good job in terms of compression ratio. But it was never fast. However, I didn't quite remember just how slow it was until I tried to set up a test.

Here's a run on the Perl dataset, with the block size set to 16 MiB to match the default of DwarFS, and with additional options suggested to speed up compression:

$ time mkcromfs -f 16777216 -qq -e -r100000 install perl-install.cromfs
Writing perl-install.cromfs...
mkcromfs: Automatically enabling --24bitblocknums because it seems possible for this filesystem.
Root pseudo file is 108 bytes
Inotab spans 0x7f3a18259000..0x7f3a1bfffb9c
Root inode spans 0x7f3a205d2948..0x7f3a205d294c
Beginning task for Files and directories: Finding identical blocks
2163608 reuse opportunities found. 561362 unique blocks. Block table will be 79.4% smaller than without the index search.
Beginning task for Files and directories: Blockifying
Blockifying:  0.04% (140017/2724970) idx(siz=80423,del=0) rawin(20.97 MB)rawout(20.97 MB)diff(1956 bytes)
Termination signalled, cleaning up temporaries
real    29m9.634s
user    201m37.816s
sys     2m15.005s

So, it processed 21 MiB out of 48 GiB in half an hour, using almost twice as many CPU resources as DwarFS needed for the whole file system. At this point I decided it's likely not worth waiting (presumably) another month (!) for `mkcromfs` to finish. I double-checked that I didn't accidentally build a debugging version; `mkcromfs` was definitely built with `-O3`.

I then tried once more with a smaller version of the Perl dataset. This only has 20 versions (instead of 1139) of Perl, and obviously a lot less redundancy:

$ time mkcromfs -f 16777216 -qq -e -r100000 install-small perl-install.cromfs
Writing perl-install.cromfs...
mkcromfs: Automatically enabling --16bitblocknums because it seems possible for this filesystem.
Root pseudo file is 108 bytes
Inotab spans 0x7f00e0774000..0x7f00e08410a8
Root inode spans 0x7f00b40048f8..0x7f00b40048fc
Beginning task for Files and directories: Finding identical blocks
25362 reuse opportunities found. 9815 unique blocks. Block table will be 72.1% smaller than without the index search.
Beginning task for Files and directories: Blockifying
Compressing raw rootdir inode (28 bytes)z=982370,del=2) rawin(641.56 MB)rawout(252.72 MB)diff(388.84 MB) compressed into 35 bytes
INOTAB pseudo file is 839.85 kB
Inotab inode spans 0x7f00bc036ed8..0x7f00bc036ef4
Beginning task for INOTAB: Finding identical blocks
0 reuse opportunities found. 13 unique blocks. Block table will be 0.0% smaller than without the index search.
Beginning task for INOTAB: Blockifying
mkcromfs: Automatically enabling --packedblocks because it is possible for this filesystem.
Compressing raw inotab inode (52 bytes) compressed into 58 bytes
Compressing 9828 block records (4 bytes each, total 39312 bytes) compressed into 15890 bytes
Compressing and writing 16 fblocks...
16 fblocks were written: 35.31 MB = 13.90 % of 254.01 MB
Filesystem size: 35.33 MB = 5.50 % of original 642.22 MB
End
real    27m38.833s
user    277m36.208s
sys     11m36.945s

And repeating the same task with `mkdwarfs`:

$ time mkdwarfs -i install-small -o perl-install-small.dwarfs21:13:38.131724 scanning install-small21:13:38.320139 waiting for background scanners...21:13:38.727024 assigning directory and link inodes...21:13:38.731807 finding duplicate files...21:13:38.832524 saved 267.8 MiB / 611.8 MiB in 22842/26401 duplicate files21:13:38.832598 waiting for inode scanners...21:13:39.619963 assigning device inodes...21:13:39.620855 assigning pipe/socket inodes...21:13:39.621356 building metadata...21:13:39.621453 building blocks...21:13:39.621472 saving names and links...21:13:39.621655 ordering 3559 inodes using nilsimsa similarity...21:13:39.622031 nilsimsa: depth=20000, limit=25521:13:39.629206 updating name and link indices...21:13:39.630142 pre-sorted index (3360 name, 2127 path lookups) [8.014ms]21:13:39.752051 3559 inodes ordered [130.3ms]21:13:39.752101 waiting for segmenting/blockifying to finish...21:13:53.250951 saving chunks...21:13:53.251581 saving directories...21:13:53.303862 waiting for compression to finish...21:14:11.073273 compressed 611.8 MiB to 24.01 MiB (ratio=0.0392411)21:14:11.091099 filesystem created without errors [32.96s]⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯waiting for block compression to finish3334 dirs, 0/0 soft/hard links, 26401/26401 files, 0 otheroriginal size: 611.8 MiB, dedupe: 267.8 MiB (22842 files), segment: 121.5 MiBfilesystem: 222.5 MiB in 14 blocks (7177 chunks, 3559/3559 inodes)compressed filesystem: 14 blocks/24.01 MiB written██████████████████████████████████████████████████████████████████████▏100% \real    0m33.007suser    3m43.324ssys     0m4.015s

So, `mkdwarfs` is about 50 times faster than `mkcromfs` and uses 75 times less CPU resources. At the same time, the DwarFS file system is 30% smaller:

$ ls -l perl-install-small.*fs
-rw-r--r-- 1 mhx users 35328512 Dec  8 14:25 perl-install-small.cromfs
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs

I noticed that the blockifying step that took ages for the full dataset with `mkcromfs` ran substantially faster (in terms of MiB/second) on the smaller dataset, which makes me wonder if there's some quadratic complexity behaviour that's slowing down `mkcromfs`.

In order to be completely fair, I also ran `mkdwarfs` with `-l 9` to enable LZMA compression (which is what `mkcromfs` uses by default):

$ time mkdwarfs -i install-small -o perl-install-small-l9.dwarfs -l 921:16:21.874975 scanning install-small21:16:22.092201 waiting for background scanners...21:16:22.489470 assigning directory and link inodes...21:16:22.495216 finding duplicate files...21:16:22.611221 saved 267.8 MiB / 611.8 MiB in 22842/26401 duplicate files21:16:22.611314 waiting for inode scanners...21:16:23.394332 assigning device inodes...21:16:23.395184 assigning pipe/socket inodes...21:16:23.395616 building metadata...21:16:23.395676 building blocks...21:16:23.395685 saving names and links...21:16:23.395830 ordering 3559 inodes using nilsimsa similarity...21:16:23.396097 nilsimsa: depth=50000, limit=25521:16:23.401042 updating name and link indices...21:16:23.403127 pre-sorted index (3360 name, 2127 path lookups) [6.936ms]21:16:23.524914 3559 inodes ordered [129ms]21:16:23.525006 waiting for segmenting/blockifying to finish...21:16:33.865023 saving chunks...21:16:33.865883 saving directories...21:16:33.900140 waiting for compression to finish...21:17:10.505779 compressed 611.8 MiB to 17.44 MiB (ratio=0.0284969)21:17:10.526171 filesystem created without errors [48.65s]⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯waiting for block compression to finish3334 dirs, 0/0 soft/hard links, 26401/26401 files, 0 otheroriginal size: 611.8 MiB, dedupe: 267.8 MiB (22842 files), segment: 122.2 MiBfilesystem: 221.8 MiB in 4 blocks (7304 chunks, 3559/3559 inodes)compressed filesystem: 4 blocks/17.44 MiB written██████████████████████████████████████████████████████████████████████▏100% /real    0m48.683suser    2m24.905ssys     0m3.292s
$ ls -l perl-install-small*.*fs
-rw-r--r-- 1 mhx users 18282075 Dec 10 21:17 perl-install-small-l9.dwarfs
-rw-r--r-- 1 mhx users 35328512 Dec  8 14:25 perl-install-small.cromfs
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs

It takes about 15 seconds longer to build the DwarFS file system with LZMA compression (still 35 times faster than Cromfs), but reduces the size even further, to almost half the size of the Cromfs file system.

I would have added some benchmarks with the Cromfs FUSE driver, but sadly it crashed right upon trying to list the directory after mounting.

With EROFS

EROFS is a read-only compressed file system that has been added to the Linux kernel recently. Its goals are different from those of DwarFS, though. It is designed to be lightweight (which DwarFS is definitely not) and to run on constrained hardware like embedded devices or smartphones. It is not designed to provide maximum compression. It currently supports LZ4 and LZMA compression.

Running it on the full Perl dataset using the options given in the README for "well-compressed images":

$ time mkfs.erofs -C1048576 -Eztailpacking,fragments,all-fragments,dedupe -zlzma,9 perl-install-lzma9.erofs perl-install
mkfs.erofs 1.7.1-gd93a18c9
<W> erofs: It may take a longer time since MicroLZMA is still single-threaded for now.
Build completed.
------
Filesystem UUID: 538ce164-5f9d-4a6a-9808-5915f17ced30
Filesystem total blocks: 599854 (of 4096-byte blocks)
Filesystem total inodes: 2255795
Filesystem total metadata blocks: 74253
Filesystem total deduplicated bytes (of source files): 29625028195
user    2:35:08.03
system  1:12.65
total   2:39:25.35
$ ll -h perl-install-lzma9.erofs
-rw-r--r-- 1 mhx mhx 2.3G Apr 15 16:23 perl-install-lzma9.erofs

That's definitely slower than SquashFS, but also significantly smaller.

For a fair comparison, let's use the same 1 MiB block size with DwarFS, but also tweak the options for best compression:

$ time mkdwarfs -i perl-install -o perl-install-1M.dwarfs -l9 -S20 -B64 --order=nilsimsa:max-cluster-size=150000
[...]
330733 dirs, 0/2440 soft/hard links, 1927501/1927501 files, 0 other
original size: 47.49 GiB, hashed: 43.47 GiB (1920025 files, 1.451 GiB/s)
scanned: 19.45 GiB (144675 files, 159.3 MiB/s), categorizing: 0 B/s
saved by deduplication: 28.03 GiB (1780386 files), saved by segmenting: 15.4 GiB
filesystem: 4.053 GiB in 4151 blocks (937069 chunks, 144674/144674 fragments, 144675 inodes)
compressed filesystem: 4151 blocks/806.2 MiB written
[...]
user    24:27.47
system  4:20.74
total   3:26.79

That's significantly smaller and, almost more importantly, 46 times faster than `mkfs.erofs`.

Actually using the file system images, here's how DwarFS performs:

$ dwarfs perl-install-1M.dwarfs mnt -oworkers=8
$ find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
50392172594 bytes (50 GB, 47 GiB) copied, 19 s, 2.7 GB/s
0+1662649 records in
0+1662649 records out
51161953159 bytes (51 GB, 48 GiB) copied, 19.4813 s, 2.6 GB/s

Reading every single file from 16 parallel processes took less than 20 seconds. The FUSE driver consumed 143 seconds of CPU time.
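The driver CPU time quoted here and below isn't visible in the `dd` output. One way to capture it, sketched below under that assumption, is to run the driver in the foreground under `time` and unmount once the test is done; the option values simply mirror the test above.

$ time dwarfs perl-install-1M.dwarfs mnt -f -oworkers=8            # terminal 1: runs until unmounted
$ find mnt -type f -print0 | xargs -0 -P16 -n64 cat > /dev/null    # terminal 2: read test
$ fusermount -u mnt                                                # terminal 2: driver exits, `time` reports its CPU usage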

Here's the same for EROFS:

$ erofsfuse perl-install-lzma9.erofs mnt
$ find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
2594306810 bytes (2.6 GB, 2.4 GiB) copied, 300 s, 8.6 MB/s^C
0+133296 records in
0+133296 records out
2595212832 bytes (2.6 GB, 2.4 GiB) copied, 300.336 s, 8.6 MB/s

Note that I've stopped this after 5 minutes. The DwarFS FUSE driver delivered about 300 times higher throughput compared to EROFS. The EROFS FUSE driver consumed 50 minutes (!) of CPU time for only about 5% of the data, i.e. more than 400 times the CPU time consumed by the DwarFS FUSE driver.

I've tried two more EROFS configurations on the same set of data. The first one uses more or less just the defaults:

$ time mkfs.erofs -zlz4hc,12 perl-install-lz4hc.erofs perl-install
mkfs.erofs 1.7.1-gd93a18c9
Build completed.
------
Filesystem UUID: b75142ed-6cf3-46a4-84f3-12693f7759a0
Filesystem total blocks: 5847130 (of 4096-byte blocks)
Filesystem total inodes: 2255794
Filesystem total metadata blocks: 419699
Filesystem total deduplicated bytes (of source files): 0
user    3:38:23.36
system  1:10.84
total   3:41:37.33

The second one additionally enables the `-Ededupe` option:

$ time mkfs.erofs -zlz4hc,12 -Ededupe perl-install-lz4hc-dedupe.erofs perl-install
mkfs.erofs 1.7.1-gd93a18c9
Build completed.
------
Filesystem UUID: 0ccf581e-ad3b-4d08-8b10-5b7e15f8e3cd
Filesystem total blocks: 1510091 (of 4096-byte blocks)
Filesystem total inodes: 2255794
Filesystem total metadata blocks: 435599
Filesystem total deduplicated bytes (of source files): 19220717568
user    4:19:57.61
system  1:21.62
total   4:23:55.85

I don't know why these are even slower than the first, seemingly more complex, set of options. As was to be expected, the resulting images were significantly bigger:

$ ll -h perl-install*.erofs
-rw-r--r-- 1 mhx mhx 5.8G Apr 16 02:46 perl-install-lz4hc-dedupe.erofs
-rw-r--r-- 1 mhx mhx  23G Apr 15 22:34 perl-install-lz4hc.erofs
-rw-r--r-- 1 mhx mhx 2.3G Apr 15 16:23 perl-install-lzma9.erofs

The good news is that these perform much better and even outperform DwarFS, albeit by a small margin:

$ erofsfuse perl-install-lz4hc.erofs mnt
$ find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
49920168315 bytes (50 GB, 46 GiB) copied, 16 s, 3.1 GB/s
0+1493031 records in
0+1493031 records out
51161953159 bytes (51 GB, 48 GiB) copied, 16.4329 s, 3.1 GB/s

The deduplicated version is even a tiny bit faster:

$ erofsfuse perl-install-lz4hc-dedupe.erofs mnt
find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
50808037121 bytes (51 GB, 47 GiB) copied, 16 s, 3.2 GB/s
0+1499949 records in
0+1499949 records out
51161953159 bytes (51 GB, 48 GiB) copied, 16.1184 s, 3.2 GB/s

The EROFS kernel driver wasn't any faster than the FUSE driver.

The FUSE driver used about 27 seconds of CPU time in both cases, substantially less than before and 5 times less than DwarFS.

DwarFS can get close to the throughput of EROFS by using `zstd` instead of `lzma` compression:

$ dwarfs perl-install-1M-zstd.dwarfs mnt -oworkers=8
find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
49224202357 bytes (49 GB, 46 GiB) copied, 16 s, 3.1 GB/s
0+1529018 records in
0+1529018 records out
51161953159 bytes (51 GB, 48 GiB) copied, 16.6716 s, 3.1 GB/s
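The `zstd` image used above isn't built anywhere in this document. A command along the following lines should produce something comparable; this is only a sketch, and the zstd level is an assumption rather than the exact setting that was used:

$ mkdwarfs -i perl-install -o perl-install-1M-zstd.dwarfs -S20 -B64 \
      -C zstd:level=19 --order=nilsimsa:max-cluster-size=150000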

With fuse-archive

I came across fuse-archive while looking for FUSE drivers to mount archives, and it seems to be the most versatile of the alternatives (and the one that actually compiles out of the box).

An interesting test case straight from fuse-archive's README is in the Performance section: an archive with a single huge file full of zeroes. Let's make the example a bit more extreme and use a 1 GiB file instead of just 256 MiB:

$ mkdir zerotest
$ truncate --size=1G zerotest/zeroes

Now, we build several different archives and a DwarFS image:

$ time mkdwarfs -i zerotest -o zerotest.dwarfs -W16 --log-level=warn --progress=nonereal    0m7.604suser    0m7.521ssys     0m0.083s$ time zip -9 zerotest.zip zerotest/zeroes  adding: zerotest/zeroes (deflated 100%)real    0m4.923suser    0m4.840ssys     0m0.080s$ time 7z a -bb0 -bd zerotest.7z zerotest/zeroes7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,16 CPUs Intel(R) Xeon(R) E-2286M  CPU @ 2.40GHz (906ED),ASM,AES-NI)Scanning the drive:1 file, 1073741824 bytes (1024 MiB)Creating archive: zerotest.7zItems to compress: 1Files read from disk: 1Archive size: 157819 bytes (155 KiB)Everything is Okreal    0m5.535suser    0m48.281ssys     0m1.116s$ time tar --zstd -cf zerotest.tar.zstd zerotest/zeroesreal    0m0.449suser    0m0.510ssys     0m0.610s

Turns out that `tar --zstd` easily wins the compression speed test. Looking at the file sizes actually blew my mind just a bit:

$ ll zerotest.* --sort=size
-rw-r--r-- 1 mhx users 1042231 Jul  1 15:24 zerotest.zip
-rw-r--r-- 1 mhx users  157819 Jul  1 15:26 zerotest.7z
-rw-r--r-- 1 mhx users   33762 Jul  1 15:28 zerotest.tar.zstd
-rw-r--r-- 1 mhx users     848 Jul  1 15:23 zerotest.dwarfs

I definitely didn't expect the DwarFS image to be that small. Dropping the section index would actually save another 100 bytes. So, if you want to archive lots of zeroes, DwarFS is your friend.

Anyway, let's look at how fast and efficiently the zeroes can be read from the different archives, starting with the `zip` archive.
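Each archive first has to be mounted with fuse-archive, which takes the archive followed by the mountpoint; a minimal sketch for the `zip` case (the other archives are mounted the same way):

$ fuse-archive zerotest.zip mnt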

$ time dd if=mnt/zerotest/zeroes of=/dev/null status=progress
1020117504 bytes (1.0 GB, 973 MiB) copied, 2 s, 510 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.10309 s, 511 MB/s
real    0m2.104s
user    0m0.264s
sys     0m0.486s

CPU time used by the FUSE driver was 1.8 seconds and mount time was in the milliseconds.

Now, the `7z` archive:

$ time dd if=mnt/zerotest/zeroes of=/dev/null status=progress
594759168 bytes (595 MB, 567 MiB) copied, 1 s, 595 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.76904 s, 607 MB/s
real    0m1.772s
user    0m0.229s
sys     0m0.572s

CPU time used by the FUSE driver was 2.9 seconds and mount time was just over 1 second.

Now, the `.tar.zstd` archive:

$ time dd if=mnt/zerotest/zeroes of=/dev/null status=progress
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.799409 s, 1.3 GB/s
real    0m0.801s
user    0m0.262s
sys     0m0.537s

CPU time used by the FUSE driver was 0.53 seconds and mount time was 0.13 seconds.

Last but not least, let's look at DwarFS:

$ time dd if=mnt/zeroes of=/dev/null status=progress
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.753 s, 1.4 GB/s
real    0m0.757s
user    0m0.220s
sys     0m0.534s

CPU time used by the FUSE driver was 0.17 seconds and mount time was less than a millisecond.

If we increase the block size for the `dd` command, we can get even higher throughput. For fuse-archive with the `.tar.zstd` archive:

$ time dd if=mnt/zerotest/zeroes of=/dev/null status=progress bs=16384
65536+0 records in
65536+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.318682 s, 3.4 GB/s
real    0m0.323s
user    0m0.005s
sys     0m0.154s

And for DwarFS:

$ time dd if=mnt/zeroes of=/dev/null status=progress bs=16384
65536+0 records in
65536+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.172226 s, 6.2 GB/s
real    0m0.176s
user    0m0.020s
sys     0m0.141s

This is all nice, but what about a more real-life use case? Let's take the 1.82.0 Boost release archives:

$ ll --sort=size boost_1_82_0.*
-rw-r--r-- 1 mhx users 208188085 Apr 10 14:25 boost_1_82_0.zip
-rw-r--r-- 1 mhx users 142580547 Apr 10 14:23 boost_1_82_0.tar.gz
-rw-r--r-- 1 mhx users 121325129 Apr 10 14:23 boost_1_82_0.tar.bz2
-rw-r--r-- 1 mhx users 105901369 Jun 28 12:47 boost_1_82_0.dwarfs
-rw-r--r-- 1 mhx users 103710551 Apr 10 14:25 boost_1_82_0.7z
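The DwarFS image in that listing was built from the unpacked release tree. The exact options used aren't recorded here, but a sketch of how such an image can be created might look like this (`-l9` is just an assumption):

$ tar xf boost_1_82_0.tar.bz2
$ mkdwarfs -i boost_1_82_0 -o boost_1_82_0.dwarfs -l9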

Here are the timings for mounting each archive and then using `tar` to build another archive from the mountpoint, just counting the number of bytes in that archive, e.g.:

$ time tar cf - mnt | wc -c
803614720
real    0m4.602s
user    0m0.156s
sys     0m1.123s

Here are the results in terms of wallclock time and FUSE driver CPU time:

| Archive    | Mount Time | `tar` Wallclock Time | FUSE Driver CPU Time |
|------------|-----------:|---------------------:|---------------------:|
| `.zip`     |     0.458s |               5.073s |               4.418s |
| `.tar.gz`  |     1.391s |               3.483s |               3.943s |
| `.tar.bz2` |    15.663s |              17.942s |              32.040s |
| `.7z`      |     0.321s |              32.554s |              31.625s |
| `.dwarfs`  |     0.013s |               2.974s |               1.984s |

DwarFS easily wins all categories while still compressing the data almost as well as `7z`.

What about accessing files more randomly?

$ find mnt -type f -print0 | xargs -0 -P32 -n32 cat | dd of=/dev/null status=progress

It turns out that fuse-archive grinds to a halt in this case, so I had to run the test on a subset (the `boost` subdirectory) of the data. The `.tar.bz2` and `.7z` archives were so slow to read that I stopped them after a few minutes.

| Archive    | Throughput | Wallclock Time | FUSE Driver CPU Time |
|------------|-----------:|---------------:|---------------------:|
| `.zip`     |   1.8 MB/s |        83.245s |              83.669s |
| `.tar.gz`  |   1.2 MB/s |       121.377s |             122.711s |
| `.tar.bz2` |   0.2 MB/s |              - |                    - |
| `.7z`      |   0.3 MB/s |              - |                    - |
| `.dwarfs`  | 598.0 MB/s |         0.249s |               1.099s |

Performance Monitoring

Both the FUSE driver and `dwarfsextract` have support for simple performance monitoring by default. You can build binaries without this feature (`-DENABLE_PERFMON=OFF`), but the impact should be negligible even if performance monitoring is enabled at run-time.
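`ENABLE_PERFMON` is a CMake option, so leaving the feature out is just a matter of passing it at configure time; a minimal sketch of such a build, with all other options omitted:

$ cmake -B build -S . -DENABLE_PERFMON=OFF
$ cmake --build build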

To enable the performance monitor, you pass a list of components for which you want to collect latency metrics, e.g.:

$ dwarfs test.dwarfs mnt -f -operfmon=fuse

When the driver exits, you will see output like this:

[fuse.op_read]
      samples: 45145
      overall: 3.214s
  avg latency: 71.2us
  p50 latency: 131.1us
  p90 latency: 131.1us
  p99 latency: 262.1us
[fuse.op_readdir]
      samples: 2
      overall: 51.31ms
  avg latency: 25.65ms
  p50 latency: 32.77us
  p90 latency: 67.11ms
  p99 latency: 67.11ms
[fuse.op_lookup]
      samples: 16
      overall: 19.98ms
  avg latency: 1.249ms
  p50 latency: 2.097ms
  p90 latency: 4.194ms
  p99 latency: 4.194ms
[fuse.op_init]
      samples: 1
      overall: 199.4us
  avg latency: 199.4us
  p50 latency: 262.1us
  p90 latency: 262.1us
  p99 latency: 262.1us
[fuse.op_open]
      samples: 16
      overall: 122.2us
  avg latency: 7.641us
  p50 latency: 4.096us
  p90 latency: 32.77us
  p99 latency: 32.77us
[fuse.op_getattr]
      samples: 1
      overall: 5.786us
  avg latency: 5.786us
  p50 latency: 8.192us
  p90 latency: 8.192us
  p99 latency: 8.192us

The metrics should be self-explanatory. However, note that the percentile metrics are logarithmically quantized in order to use as few resources as possible. As a result, you will only see values that look an awful lot like powers of two; the p50 latency of 131.1us above, for example, is simply the 2^17 ns bucket.

Currently, the supported components are `fuse` for the FUSE operations, `filesystem_v2` for the DwarFS file system component, and `inode_reader_v2` for the component that handles all `read()` system calls.

The FUSE driver also exposes the performance monitor metrics via an extended attribute.
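The exact attribute names aren't listed here, but they should be discoverable with `getfattr` on the mounted file system; a sketch, assuming the metrics live in the `user.dwarfs` namespace:

$ getfattr -d -m 'user.dwarfs' mnt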

Other Obscure Features

Setting Worker Thread CPU Affinity

This only works on Linux and usually only makes sense if you have CPUs with different types of cores (e.g. "performance" vs. "efficiency" cores) and are really trying to squeeze the last ounce of speed out of DwarFS.

By setting the environment variable `DWARFS_WORKER_GROUP_AFFINITY`, you can set the CPU affinity of different worker thread groups, e.g.:

export DWARFS_WORKER_GROUP_AFFINITY=blockify=3:compress=6,7

This will set the affinity of the `blockify` worker group to CPU 3 and the affinity of the `compress` worker group to CPUs 6 and 7.

You can use this feature for all tools that use one or more worker thread groups. For example, the FUSE driver `dwarfs` and `dwarfsextract` use a worker group `blkcache` that the block cache (i.e. block decompression and lookup) runs on. `mkdwarfs` uses a whole array of different worker groups, namely `compress` for compression, `scanner` for scanning, `ordering` for input ordering, and `blockify` for segmenting. `blockify` is what you would typically want to run on your "performance" cores, as in the sketch below.
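As a usage sketch, pinning the `blockify` group of `mkdwarfs` to (hypothetical) performance cores 0-7 while keeping compression on cores 8-15 might look like this; the CPU numbers are placeholders for whatever your topology actually looks like:

$ export DWARFS_WORKER_GROUP_AFFINITY=blockify=0,1,2,3,4,5,6,7:compress=8,9,10,11,12,13,14,15
$ mkdwarfs -i perl-install -o perl-install.dwarfs -l9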

Stargazers over Time

Stargazers over Time

