CrashMonkey: tools for testing file-system reliability (OSDI 18)


Bounded Black-Box Crash Testing

Bounded black-box crash testing (B3) is a new approach to testing file-system crash consistency. B3 is a black-box testing approach that requires no modification to file-system code. B3 exhaustively generates and tests workloads in a bounded space. We implement B3 by building two tools: CrashMonkey and Ace. The OSDI'18 paper "Finding Crash-Consistency Bugs with Bounded Black-Box Crash Testing" has a detailed discussion of B3, CrashMonkey, and Ace.
[Paper PDF] [Slides] [Bibtex] [Talk video]

CrashMonkey and Ace have found several long-standing bugs in widely-used file systems like btrfs and F2FS. The tools work out-of-the-box with any POSIX file system: no modification required to the file system.

CrashMonkey

CrashMonkey is a file-system agnostic record-replay-and-test framework. Unlike existing tools like dm-log-writes, which require a manual checker script, CrashMonkey automatically tests for data and metadata consistency of persisted files. CrashMonkey needs only one input to run: the workload to be tested. We have described the rules for writing a workload for CrashMonkey here. More details about the internals of CrashMonkey can be found here.

Automatic Crash Explorer (Ace)

Ace is an automatic workload generator that exhaustively generates sequences of file-system operations (workloads), given certain bounds. Ace consists of a workload synthesizer that generates workloads in a high-level language which we call J-lang. A CrashMonkey adapter that we built converts these workloads into C++ test files that CrashMonkey can work with. We also have an XFSTest adapter that allows us to convert workloads into bash scripts to be used with xfstest. More details on the current bounds imposed by Ace and guidelines on workload generation can be found here.
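To make "exhaustive generation within bounds" concrete, here is a minimal Python sketch. It is not Ace's implementation: the operation list and the function name `enumerate_workloads` are assumptions made for illustration, and Ace additionally bounds files, arguments, and persistence points, which this sketch ignores.

```python
from itertools import product

# Hypothetical operation vocabulary, chosen for illustration only.
# Ace's real vocabulary and bounds are configured in its own scripts.
OPERATIONS = ["link", "fallocate"]


def enumerate_workloads(operations, length):
    """Yield every sequence of exactly `length` operations."""
    for sequence in product(operations, repeat=length):
        yield sequence


if __name__ == "__main__":
    for length in (1, 2):
        workloads = list(enumerate_workloads(OPERATIONS, length))
        print(f"seq-{length}: {len(workloads)} workloads")
        for workload in workloads:
            print("  " + " -> ".join(workload))
```

With n operations, the number of sequences of length k is n^k, which is why the bounds on the operation set and sequence length matter for keeping the space testable.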

CrashMonkey and Ace can be used out of the box on any Linux file system that implements the POSIX API. Our tools have been tested to work with ext2, ext3, ext4, xfs, F2FS, and btrfs, across Linux kernel versions 3.12, 3.13, 3.16, 4.1, 4.4, 4.15, 4.16, 5.5 and 5.6.

Results

We have tested four Linux file systems (ext4, xfs, btrfs, F2FS) and two verified file systems: FSCQ and the Yxv6 file system.

Ace and CrashMonkey have found 8 previously undiscovered bugs in btrfs, 2 in F2FS, and 1 bug in FSCQ.

FSCQ bug. The FSCQ bug would result in fdatasync not persisting data correctly to the file system. The bug was in the unverified part of the file system, in the Haskell-C bindings, due to an optimization that was not fully tested. The authors have acknowledged and fixed the bug.

Table Of Contents

  1. Setup
  2. Push Button Testing for Seq-1 Workloads
  3. Tutorial on Workload Generation and Testing
  4. Demo
  5. List of Bugs Reproduced by CrashMonkey and Ace
  6. List of New Bugs Found by CrashMonkey and Ace
  7. Research That Uses Our Tools
  8. Contact Info

Advanced Documentation

  1. VM Setup and Deployment
  2. CrashMonkey
  3. Ace

Setup

Here is a checklist of dependencies to get CrashMonkey and Ace up and running on your system.

  • You need a Linux machine to get started. We recommend spinning up an Ubuntu 16.04 VM with one of the supported kernel versions mentioned above. 20GB of disk space and 2-4GB of RAM are recommended, especially if you plan on running large tests.

  • If you want to install kernel 4.16, we have a script to help you.

  • Install dependencies.

    apt-get install git make gcc g++ libattr1-dev btrfs-tools f2fs-tools xfsprogs libelf-dev linux-headers-$(uname -r) python3 python3-pip

    python3 -m pip install progress progressbar

    Ensure your glibc version is 2.23 or above (Check using ldd --version)

  • Clone the repository.

    git clone https://github.com/utsaslab/crashmonkey.git

  • Compile CrashMonkey's test harness, kernel modules and the test suite of seq-1 workloads (workloads with 1 file-system operation). The initial compile should take a minute or less.

    cd crashmonkey; make -j4

  • Create a directory for the test harness to mount devices to.

    mkdir /mnt/snapshot
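Part of the checklist above can be verified with a short script. The following is a sketch and not part of CrashMonkey: the function names are hypothetical, and it checks only the two items that are easy to test from Python, the glibc version (2.23 or above) and the presence of /mnt/snapshot.

```python
import os
import platform

MIN_GLIBC = (2, 23)          # minimum glibc version from the setup checklist
MOUNT_DIR = "/mnt/snapshot"  # mount directory created in the last setup step


def parse_version(text):
    """Turn a dotted version string such as '2.23' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_ok(version_text, minimum=MIN_GLIBC):
    """Return True when version_text is at least the required version."""
    return parse_version(version_text) >= minimum


def preflight():
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    libc_name, libc_version = platform.libc_ver()
    if libc_name != "glibc" or not glibc_ok(libc_version):
        found = f"{libc_name or 'unknown'} {libc_version}".strip()
        problems.append(f"glibc >= 2.23 required, found {found}")
    if not os.path.isdir(MOUNT_DIR):
        problems.append(f"{MOUNT_DIR} does not exist")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Versions are compared as integer tuples rather than strings, so that 2.9 is correctly treated as older than 2.23.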

Push Button Testing for Seq-1 Workloads

This repository contains a pre-generated suite of 328 seq-1 workloads (workloads with 1 file-system operation) here. Once you have set up CrashMonkey on your machine (or VM), you can simply run:

python xfsMonkey.py -f /dev/sda -d /dev/cow_ram0 -t btrfs -e 102400 -u build/tests/seq1/ > outfile

Sit back and relax. This takes about 12 minutes to complete on a single machine. It runs all 328 seq-1 tests on a btrfs file system 100MB in size. The bug reports can be found in the folder diff_results. The workloads are named j-lang<1-328>; if any of them results in a bug, you will see a bug report with the same name as the workload, describing the difference between the expected and actual state.
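If you run the same command against several file systems or test directories, it can help to build the argument list programmatically. This is a minimal sketch: the function `build_xfsmonkey_command` and its parameter names are hypothetical, and only the flags and values that appear in the command above (-f, -d, -t, -e, -u) are used. The precise meaning of each flag is not restated here; the text above pairs -t btrfs and -e 102400 with a 100MB btrfs file system.

```python
import shlex


def build_xfsmonkey_command(test_dir, fs_type="btrfs",
                            f_value="/dev/sda",
                            d_value="/dev/cow_ram0",
                            e_value=102400):
    """Return the xfsMonkey invocation as an argument list.

    Defaults mirror the command shown in this README; the parameter
    names are illustrative and not part of xfsMonkey itself.
    """
    return [
        "python", "xfsMonkey.py",
        "-f", f_value,
        "-d", d_value,
        "-t", fs_type,
        "-e", str(e_value),
        "-u", test_dir,
    ]


if __name__ == "__main__":
    command = build_xfsmonkey_command("build/tests/seq1/")
    # Print a copy-pasteable shell line (shlex.join needs Python 3.8+).
    print(shlex.join(command))
```

The returned list can be passed to subprocess.run from the repository root, with stdout redirected to a file to mirror the `> outfile` redirection.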

Tutorial

This tutorial walks you through the workflow of workload generation to testing, using a small bounded space of seq-1 workloads. Generating and running the tests in this tutorial will take less than 2 minutes.

  1. Select Bounds: Let us generate workloads of sequence length 1, and test only two file-system operations, link and fallocate. Our reduced file set consists of just two files.

    FileOptions = ['foo']
    SecondFileOptions = ['A/bar']

    The link and fallocate system calls pick file arguments from the above list. Additionally, fallocate allows several modes including ZERO_RANGE, PUNCH_HOLE etc. We pick one of the modes to bound the space here.

    FallocOptions = ['FALLOC_FL_ZERO_RANGE|FALLOC_FL_KEEP_SIZE']

    The fallocate system call also requires offset and length parameters, which are chosen to be one of the following.

    WriteOptions = ['append', 'overlap_unaligned_start', 'overlap_extend']

    All these options are configurable in the ace script.

  2. Generate Workloads: To generate workloads conforming to the above-mentioned bounds, run the following commands in the ace directory:

    cd ace
    python ace.py -l 1 -n False -d True

    The -l flag sets the length of the sequence to 1, -n indicates whether we want to include a nested directory in the file set, and -d indicates the demo workload set, which appropriately sets the above-mentioned bounds on system calls and their parameters.

    This generates about 9 workloads in about a second. You will find the generated workloads at code/tests/seq1_demo. Additionally, you can find the J-lang equivalent of these test files at code/tests/seq1_demo/j-lang-files.

  3. Compile Workloads: To compile the test workloads into .so files that CrashMonkey can run:

    1. Copy the generated test files into the generated_workloads directory.
    cd ..
    cp code/tests/seq1_demo/j-lang*.cpp code/tests/generated_workloads/
    2. Compile the new tests.
    make gentests

    This will compile all the new tests and place the .so files at build/tests/generated_workloads.

  4. Run: Now it's time to test all these workloads using CrashMonkey. Run the xfsMonkey script, which simply invokes CrashMonkey in a loop, testing one workload at a time.

    For example, let's run the generated tests on the btrfs file system, on a 100MB image.

    python xfsMonkey.py -f /dev/sda -d /dev/cow_ram0 -t btrfs -e 102400 -u build/tests/generated_workloads/ > outfile
  5. Bug Reports: The generated bug reports can be found at diff_results. If the test file "x" triggered a bug, you will find a bug report with the same name in this directory.

    For example, j-lang1.cpp will result in a crash-consistency bug on btrfs, as of kernel 4.16 (Bug #7). The corresponding bug report will be as follows.

    automated check_test:                failed: 1

    DIFF: Content Mismatch /A/foo

    Actual (/mnt/snapshot/A/foo):
    ---File Stat Atrributes---
    Inode     : 258
    TotalSize : 0
    BlockSize : 4096
    #Blocks   : 0
    #HardLinks: 1

    Expected (/mnt/cow_ram_snapshot2_0/A/foo):
    ---File Stat Atrributes---
    Inode     : 258
    TotalSize : 0
    BlockSize : 4096
    #Blocks   : 0
    #HardLinks: 2

    Similarly, j-lang4.cpp results in the incorrect block count bug (Bug #8) on btrfs as of kernel 4.16. The corresponding bug report is as shown below.

    automated check_test:                failed: 1

    DIFF: Content Mismatch /A/foo

    Actual (/mnt/snapshot/A/foo):
    ---File Stat Atrributes---
    Inode     : 257
    TotalSize : 32768
    BlockSize : 4096
    #Blocks   : 64
    #HardLinks: 1

    Expected (/mnt/cow_ram_snapshot2_0/A/foo):
    ---File Stat Atrributes---
    Inode     : 257
    TotalSize : 32768
    BlockSize : 4096
    #Blocks   : 128
    #HardLinks: 1
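When many reports accumulate in diff_results, it can be useful to pull out only the fields that differ. The following sketch is not part of CrashMonkey. It assumes a report laid out with one stat field per line under "Actual (...)" and "Expected (...)" headers, as in the Bug #7 example, and the function names are hypothetical. The sample text reproduces that report, including the tool's own spelling in the "Atrributes" header line.

```python
import re

# Sample taken from the Bug #7 report above, one field per line (assumed layout).
SAMPLE_REPORT = """\
DIFF: Content Mismatch /A/foo

Actual (/mnt/snapshot/A/foo):
---File Stat Atrributes---
Inode     : 258
TotalSize : 0
BlockSize : 4096
#Blocks   : 0
#HardLinks: 1

Expected (/mnt/cow_ram_snapshot2_0/A/foo):
---File Stat Atrributes---
Inode     : 258
TotalSize : 0
BlockSize : 4096
#Blocks   : 0
#HardLinks: 2
"""

# Matches lines such as "Inode     : 258" or "#HardLinks: 1".
FIELD = re.compile(r"^\s*(#?\w+)\s*:\s*(\d+)\s*$")


def parse_report(text):
    """Return {'actual': {...}, 'expected': {...}} of integer stat fields."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Actual ("):
            current = sections.setdefault("actual", {})
        elif stripped.startswith("Expected ("):
            current = sections.setdefault("expected", {})
        elif current is not None:
            match = FIELD.match(line)
            if match:
                current[match.group(1)] = int(match.group(2))
    return sections


def mismatched_fields(text):
    """Map each differing field to an (actual, expected) pair."""
    sections = parse_report(text)
    actual = sections.get("actual", {})
    expected = sections.get("expected", {})
    return {key: (actual.get(key), value)
            for key, value in expected.items()
            if actual.get(key) != value}


if __name__ == "__main__":
    print(mismatched_fields(SAMPLE_REPORT))
```

For the Bug #7 report, the only differing field is #HardLinks (1 actual, 2 expected), which matches the missing hard link after the crash.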

Demo

All these steps have been assembled for you in the script here. The link to the demo video is here. Try out the demo by running ./demo.sh btrfs

Research that uses our tools

  1. Barrier-Enabled IO Stack for Flash Storage. Youjip Won, Hanyang University; Jaemin Jung, Texas A&M University; Gyeongyeol Choi, Joontaek Oh, and Seongbae Son, Hanyang University; Jooyoung Hwang and Sangyeun Cho, Samsung Electronics. Proceedings of the 16th USENIX Conference on File and Storage Technologies (FAST 18).
  2. Bringing Order to Chaos: Barrier-Enabled I/O Stack for Flash Storage. Youjip Won, Joontaek Oh, Jaemin Jung, Gyeongyeol Choi, Seongbae Son, Jooyoung Hwang, and Sangyeun Cho. ACM Transactions on Storage (TOS) 14, no. 3 (2018): 24.
  3. LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism. Jongyul Kim, Insu Jang, Waleed Reda, Jaeseong Im, Marco Canini, Dejan Kostic, Youngjin Kwon, Simon Peter, Emmett Witchel. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles (SOSP 21).
  4. A file system for safely interacting with untrusted USB flash drives. Ke Zhong, University of Pennsylvania; Zhihao Jiang and Ke Ma, Shanghai Jiao Tong University; Sebastian Angel, University of Pennsylvania. HotStorage 2020.

Contact Info

Please contact us at vijay@cs.utexas.edu with any questions. Drop us a note if you are using or plan to use CrashMonkey or Ace to test your file system.
