
User-Space Block Device (USBD) Framework (written in Go)


tarndt/usbd


A User-Space Block Device (USBD) Framework (written in Go)

License: MPL 2.0

Introduction

Many people are familiar with user-space filesystems; on Linux these are popularly implemented using FUSE. This library is an attempt to provide a similar facility for user-space block devices, specifically those written in Go. USBD takes advantage of the seldom-used NBD interface provided by an in-tree Linux kernel module to allow a daemon running in user-space to export a block device. This package can be used to write a software block device (in Go) and export it via /dev/nbdX, where X is the next available device on the system. After doing so, such a device can be formatted with the filesystem of the user's choice (ex. ext4, btrfs, xfs, zfs, etc.) and mounted in the usual ways (ex. mount, /etc/fstab, etc.).

A daemon is created using the USBD framework that acts as an NBD server. Overhead between the user-space daemon and the kernel is minimized by using the AF_UNIX (also known as AF_LOCAL) socket type to communicate between local processes efficiently. The USBD framework allows a Go programmer to define their own block devices by implementing a type that conforms to the following simple interface:

    type Device interface {
        Size() int64
        BlockSize() int64
        ReadAt(buf []byte, pos int64) (count int, err error)
        WriteAt(buf []byte, pos int64) (count int, err error)
        Trim(pos int64, count int) error
        Flush() error
        Close() error
    }
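To make the interface concrete, here is a minimal sketch of a type that satisfies it using a plain byte slice. It is not the repository's ramdisk implementation: the names memDevice, newMemDevice and errOutOfRange are hypothetical, the interface is redeclared locally only so the example compiles on its own, and zeroing on Trim is just one reasonable choice.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Device mirrors the interface shown above. It is redeclared here only so the
// sketch is self-contained; a real device would satisfy usbdlib.Device.
type Device interface {
	Size() int64
	BlockSize() int64
	ReadAt(buf []byte, pos int64) (count int, err error)
	WriteAt(buf []byte, pos int64) (count int, err error)
	Trim(pos int64, count int) error
	Flush() error
	Close() error
}

var errOutOfRange = errors.New("memDevice: access out of range")

// memDevice is a hypothetical device backed by a byte slice.
type memDevice struct {
	mu        sync.RWMutex
	data      []byte
	blockSize int64
}

func newMemDevice(size, blockSize int64) *memDevice {
	return &memDevice{data: make([]byte, size), blockSize: blockSize}
}

func (d *memDevice) Size() int64      { return int64(len(d.data)) }
func (d *memDevice) BlockSize() int64 { return d.blockSize }

// inRange reports whether n bytes starting at pos fit inside the device.
func (d *memDevice) inRange(pos int64, n int) bool {
	size := int64(len(d.data))
	return pos >= 0 && n >= 0 && int64(n) <= size && pos <= size-int64(n)
}

func (d *memDevice) ReadAt(buf []byte, pos int64) (int, error) {
	d.mu.RLock()
	defer d.mu.RUnlock()
	if !d.inRange(pos, len(buf)) {
		return 0, errOutOfRange
	}
	return copy(buf, d.data[pos:]), nil
}

func (d *memDevice) WriteAt(buf []byte, pos int64) (int, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if !d.inRange(pos, len(buf)) {
		return 0, errOutOfRange
	}
	return copy(d.data[pos:], buf), nil
}

// Trim zeroes the discarded range so later reads return zeros.
func (d *memDevice) Trim(pos int64, count int) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	if !d.inRange(pos, count) {
		return errOutOfRange
	}
	for i := pos; i < pos+int64(count); i++ {
		d.data[i] = 0
	}
	return nil
}

// Flush and Close have nothing to do for a purely in-memory device.
func (d *memDevice) Flush() error { return nil }
func (d *memDevice) Close() error { return nil }

func main() {
	var dev Device = newMemDevice(1<<20, 4096) // 1 MiB device, 4 KiB blocks
	if _, err := dev.WriteAt([]byte("hello"), 4096); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	buf := make([]byte, 5)
	if _, err := dev.ReadAt(buf, 4096); err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("%s\n", buf) // prints "hello"
}
```

The read/write lock lets concurrent reads proceed while writes and trims are exclusive, on the assumption that a device may be called from more than one goroutine.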

Typically a daemon will call NewNbdHandler to instantiate a new usbdlib.NbdStream, which is configured to communicate with the kernel NBD module. The stream is passed to usbdlib.ReqProcessor by calling NbdStream.ProcessRequests, which acts as a proxy between the kernel (which is handling NBD requests) and the user's usbdlib.Device implementation. A command-line server, usbdsrvd, is included to allow easy testing of the USBD engine and reference devices. In its implementation the above initialization procedure can be observed. This server will make a chosen device available as /dev/nbdX, after which it can be formatted and mounted or otherwise used as any other block device.

Project Structure

    cmd
    ├── usbdsrvd
    │   │   └── An example daemon that uses usbdlib to serve this repository's user-space
    │   │       device implementations.
    │   └── conf
    │       └── usbdsrvd configuration definitions and logic.
    pkg
    ├── devices
    │   ├── filedisk
    │   │   └── An example USBD implementation that is backed by a simple file (filesystem).
    │   ├── dedupdisk
    │   │   └── An example, but potentially useful, USBD implementation that uses files to
    │   │       back a logical device, but with the added use of hashing and a Pebble database
    │   │       for block level deduplication. This is an older prototype and file/ramdisk are
    │   │       better examples for those looking to implement their own devices.
    │   ├── ramdisk
    │   │   └── An example USBD implementation that is memory backed (ramdrive).
    │   └── objstore
    │       └── A USBD implementation, which is practically useful as it creates a logical
    │           device which is cached locally but actually backed by a remote objectstore
    │           of the user's choice. (Ex. S3/minio, Swift, BackBlaze, Azure/Google Object Storage).
    │           However, due to its complexity it is not a great example for learning how to build
    │           a user-space block device.
    ├── usbdlib
    │   └── The core USBD interface definitions and engine to export/serve implementations
    └── util
        └── Utilities for implementors of user-space block devices to reuse

Note the objstore and dedupdisk implementations are non-trivial compared to the others and use the stow ObjectStorage abstraction library and Pebble DB respectively.

Current state

As of writing, this package is actively maintained. Issue reports are welcome. Pull requests are welcome but will require consent to a contributor agreement. This project is currently beta quality. There are no known defects or risks to usage, but it has not yet been widely used. Experience reports are helpful and encouraged!

Basic Reference Devices (ramdisk and filedisk)

There are ramdisk and filedisk device implementations that are excellent for understanding how to implement a simple USBD device. Starting usbdsrvd with defaults (no arguments) will result in a 1 GB ramdisk backed device being exposed as the next available NBD device, typically /dev/nbd0. Unlike these straightforward examples, there are also some advanced device implementations which may be of more practical value.

Building

This project is completely Linux-centric (since it uses NBD) and requires the NBD kernel headers to be installed and available. To build, the Go toolchain must be installed; after that, building is a simple matter of running go build in the usbdsrvd directory to generate an executable.

Testing

usbd has both automated unit testing and manual testing approaches.

Unit Testing

A suite of basic unit tests is provided for implementors of devices, and all of the included devices use it:

    $ cd usbd
    $ go test ./...
    ok  github.com/tarndt/usbd/pkg/devices/dedupdisk/dedupdisk_test  6.166s
    ok  github.com/tarndt/usbd/pkg/devices/filedisk/filedisk_test    8.011s
    ok  github.com/tarndt/usbd/pkg/devices/objstore                  65.617s
    ok  github.com/tarndt/usbd/pkg/devices/ramdisk                   2.351s
    ok  github.com/tarndt/usbd/pkg/usbdlib                           (cached)
    ok  github.com/tarndt/usbd/pkg/usbdlib/usbdlib_test              0.197s
    ok  github.com/tarndt/usbd/pkg/util                              (cached)

Executing these tests is just a matter of running go test. As seen above, you can run all unit tests in the project repository by running go test ./... in the root directory. Some tests that interact with the Linux kernel require super-user privileges (aka root). Should you run tests with verbosity (go test -v), you will see some tests are skipped during normal execution. Additionally, while running tests with the race detector (-race) has proven fruitful for discovering data races, please be warned that in this mode tests often take an order of magnitude longer to run and may require you to increase the test timeout (ex. -timeout=10m). The provided test suites do support a short mode as well (go test -short). Sometimes it's useful to compile a package's unit tests to its own executable, and this can be done in typical Go fashion with go test -c (or go test -c -race to also enable the data race detector).

Manual Testing

Manual testing has been performed using standard block device tools (ex. hdparm) and by exporting an instance of the USBD reference implementations, creating a VirtualBox VM using the exported NBD device, and installing Windows XP/10 and Ubuntu on it.

Advanced Devices

In addition to the basic reference devices, as mentioned in the project structure above, there are two notable non-trivial device implementations.

ObjectStore (objstore)

The ObjectStore (objstore) driver creates a logical device which is cached locally on disk but actually backed by a remote object storage of the user's choice (ex. S3, Swift, BackBlaze B2 and Azure's/Google's Object Storage). Besides the original implementors and providers of each object storage technology/service, there are other projects and organizations that provide compatible alternatives for them; for example, minio allows easy self-hosted/Kubernetes-hosted S3, Ceph emulates both S3 and Swift with its rados gateway, and many other cloud providers offer S3-compatible SaaS such as DigitalOcean's Spaces and wasabi.

The above OS install procedure has been used to manually verify correct operation of the objectstore implementation. Key to this implementation performing well is the use of the included S2 compression and fast (enough) network access to the backing objectstore server. For security reasons it's highly recommended to use the provided AES encryption functionality.

Due to fixes that have not yet made it upstream, the ObjectStore implementation uses a fork of the stow ObjectStorage abstraction library, which enables this project to easily support many popular object stores. The command line argument -objstore-cfg=<yourJSON> allows ObjectStore type specific configuration parameters to be passed. To see examples of these parameters, refer to the stow ObjectStore configuration definitions, for example, the S3 config or Azure config options.

An easy way to test this is to set up a local minio (S3 compatible) objectstore server:
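As a small illustration, the JSON for -objstore-cfg can be generated rather than typed by hand. The sketch below uses the three S3 keys that appear in the default value printed by usbdsrvd's help output (access_key_id, endpoint, secret_key); the helper name objstoreCfg is hypothetical, and other object store kinds take different keys that are not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// objstoreCfg builds a JSON document for -objstore-cfg using the S3 keys
// seen in the usbdsrvd default configuration.
func objstoreCfg(endpoint, accessKeyID, secretKey string) (string, error) {
	cfg := map[string]string{
		"access_key_id": accessKeyID,
		"endpoint":      endpoint,
		"secret_key":    secretKey,
	}
	raw, err := json.Marshal(cfg) // map keys are emitted in sorted order
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	cfg, err := objstoreCfg("http://127.0.0.1:9000", "minioadmin", "minioadmin")
	if err != nil {
		fmt.Println("building config failed:", err)
		return
	}
	// Single quotes keep the shell from interpreting the JSON.
	fmt.Printf("-objstore-kind=s3 -objstore-cfg='%s'\n", cfg)
}
```

With the values above, the generated JSON matches the default minio configuration, so it is a convenient starting point for substituting a real endpoint and credentials.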

    cd /tmp
    wget https://dl.min.io/server/minio/release/linux-amd64/minio
    chmod u+x ./minio
    mkdir ./minio_data
    ./minio server ./minio_data

The minio server instance will use the default minio credentials, which are in turn the defaults used by usbdsrvd when an objectstore config is not provided.

Deduplication (dedupdisk)

The Deduplication (dedupdisk) device implementation uses files to back a logical device, but with the added use of content-hashing and a Pebble database for block-level deduplication. Work is in progress to allow the blocks that must be stored to also be compressed.

To verify the dedup implementation, after the initial installation test the disk was "quick" (no zeroing) re-formatted and another fresh install was performed; file analysis verified that deduplication was taking place by checking that the disk files did not grow meaningfully. Read performance up to 3.2 GB/s and write performance (while writing duplicate data) as high as 2.1 GB/s have been achieved with SSDs hosting the backing PebbleDB database and block-file.
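The idea of content-hash deduplication can be sketched in a few lines. The example below is an illustration only, not the dedupdisk code: it keeps everything in Go maps instead of Pebble and backing files, uses SHA-256 as an assumed hash, and the names dedupStore, WriteBlock and ReadBlock are hypothetical. It also never reclaims blocks that are no longer referenced.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

const blockSize = 4096

// dedupStore maps logical block numbers to content hashes and stores each
// distinct block body only once.
type dedupStore struct {
	blocks  map[[sha256.Size]byte][]byte // unique block contents keyed by hash
	logical map[int64][sha256.Size]byte  // logical block number -> content hash
}

func newDedupStore() *dedupStore {
	return &dedupStore{
		blocks:  make(map[[sha256.Size]byte][]byte),
		logical: make(map[int64][sha256.Size]byte),
	}
}

// WriteBlock records data at logical block lbn, storing the body only if no
// block with identical content has been seen before.
func (s *dedupStore) WriteBlock(lbn int64, data []byte) error {
	if len(data) != blockSize {
		return fmt.Errorf("block must be %d bytes, got %d", blockSize, len(data))
	}
	sum := sha256.Sum256(data)
	if _, seen := s.blocks[sum]; !seen {
		s.blocks[sum] = append([]byte(nil), data...) // keep a private copy
	}
	s.logical[lbn] = sum
	return nil
}

// ReadBlock returns a copy of the block at lbn; unwritten blocks read as zeros.
func (s *dedupStore) ReadBlock(lbn int64) []byte {
	out := make([]byte, blockSize)
	if sum, ok := s.logical[lbn]; ok {
		copy(out, s.blocks[sum])
	}
	return out
}

// UniqueBlocks reports how many distinct block bodies are stored.
func (s *dedupStore) UniqueBlocks() int { return len(s.blocks) }

func main() {
	store := newDedupStore()
	block := make([]byte, blockSize)
	for i := range block {
		block[i] = 0xAB
	}
	for lbn := int64(0); lbn < 3; lbn++ {
		if err := store.WriteBlock(lbn, block); err != nil {
			fmt.Println("write failed:", err)
			return
		}
	}
	// prints "logical blocks: 3, unique blocks stored: 1"
	fmt.Printf("logical blocks: %d, unique blocks stored: %d\n", len(store.logical), store.UniqueBlocks())
}
```

Writing the same content to three logical blocks stores a single body, which is the effect the re-install test for dedupdisk relies on.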

Running

While this library is intended to be used by other daemons, the included usbdsrvd (main.go) will host instances of the sample device implementations and may be useful in its own right. Starting usbdsrvd with defaults (no arguments) will result in a 1 GB ramdisk backed device being exposed as the next available NBD device, typically /dev/nbd0. If the NBD kernel module is not loaded, usbdsrvd will attempt to load it. The maximum number of NBD devices a system can have is determined at kernel module load time, so if the default is too few devices you may need to increase it with -nbd-max-devs if using the usbdsrvd daemon, or via an argument to NewNbdHandler if interfacing programmatically.

    Usage: ./usbdsrvd [optional: options see below...] [optional: NBD device to use ex. /dev/nbd0, if absent the first free device is used.]
    Arguments starting with <driver name>-X are only applicable if dev-type=X is being set.
    Example:
    1 GiB device backed with by memory: ./usbdsrvd
    8 GiB device backed by a file and exported specifically on /dev/nbd5: ./usbdsrvd -dev-type=file -store-dir=/tmp -store-name=testfilevol -store-size=8GiB /dev/nbd5
    12 GiB device backed by file deduplicated using PebbleDB: ./usbdsrvd -dev-type=file -store-dir=/tmp -store-name=testdedupvol -store-size=12GiB
    20 GiB device backed by a locally running S3/minio objectstore: ./usbdsrvd -dev-type=objstore -store-dir=/tmp -store-name=testobjvol -store-size=20GiB
      -help
        Display help and exit
      -dev-type string
        Type of device to back block device with: 'mem', 'file', 'dedup', 'objstore'. (default "mem")
      -store-dir string
        Location to create new backing disk files in (default "./")
      -store-name string
        File base name to use for new backing disk files (default "test-lun")
      -store-size value
        Amount of storage capcity to use for new backing files (ex. 100 MiB, 20 GiB) (default 1.0 GiB)
      -nbd-max-devs uint
        If the NBD kernel module is loaded by this daemon how many NBD devices should it create (default 32)
      -dedup-memcache string
        Amount of memory to the dedup store ID cache (ex. 100 MiB, 20 GiB) (default "512 MiB")
      -objstore-kind string
        Type of remote objectstore: 's3', 'b2', 'local', 'azure', 'swift', 'google', 'oracle' or 'sftp' (default "s3")
      -objstore-cfg string
        JSON configuration (default assumes local minio [kind "s3"] with default settings) (default "{\"access_key_id\":\"minioadmin\",\"endpoint\":\"http://127.0.0.1:9000\",\"secret_key\":\"minioadmin\"}")
      -objstore-objsize value
        Size of remote objects (ex. 32 MiB, 1 GiB) (default 64 MiB)
      -objstore-diskcache value
        Amount of disk for caching remote objects (0 implies local fullbacking/caching or ex. 100 MiB, 20 GiB)
      -objstore-persistcache
        Should the local cache be persistent or deleted on device shutdown
      -objstore-aeskey string
        If AES is enabled; AES key to use to encrypt remote objects (if absent a key is generated and saved to ./key.aes, otherwise use: key:<value>, file:<path>, env:<varname>
      -objstore-aesmode string
        AES encryption mode to use to encrypt remote objects: "aes-cfb", "aes-ctr", "aes-ofb" or "identity" for no encryption. "aes-ctr" is recommended. (default "aes-ctr")
      -objstore-compress string
        Compression algorithm to use for remote objects: "s2", "gzip" or "identity" for no compression) (default "s2")
      -objstore-concurflush uint
        Maximum number of dirty local objects to concurrently upload to the remote objectstore (0 implies use heuristic)
      -objstore-flushevery duration
        Frequency in which dirty local objects are uploaded to the remote objectstore (0 disables autoflush) (default 10s)
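The -objstore-objsize option suggests that the logical device is divided into fixed-size remote objects. Under that assumption (the driver's actual layout is not described here), the following sketch shows how a byte range on the device could be split into per-object pieces; span and splitRange are hypothetical names.

```go
package main

import "fmt"

// span describes the part of a request that falls inside one remote object.
type span struct {
	Object int64 // index of the object within the device
	Offset int64 // byte offset inside that object
	Length int64 // number of bytes of the request in that object
}

// splitRange breaks the device byte range [pos, pos+length) into spans that
// each stay within a single object of objSize bytes.
func splitRange(pos, length, objSize int64) []span {
	var spans []span
	for length > 0 {
		off := pos % objSize
		n := objSize - off // bytes remaining in the current object
		if n > length {
			n = length
		}
		spans = append(spans, span{Object: pos / objSize, Offset: off, Length: n})
		pos += n
		length -= n
	}
	return spans
}

func main() {
	const objSize = 64 << 20 // 64 MiB, the documented default object size
	// An 8 KiB request that starts 4 KiB before the end of object 0.
	for _, s := range splitRange(objSize-4096, 8192, objSize) {
		fmt.Printf("%+v\n", s)
	}
	// prints:
	// {Object:0 Offset:67104768 Length:4096}
	// {Object:1 Offset:0 Length:4096}
}
```

A request that crosses an object boundary touches two objects, which is one reason the object size trades off upload granularity against the number of remote requests.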

Usage Tutorial: objstore

Let's create a 10 GB volume using a local minio (S3 compatible) server. Please note that many of the operations below require super-user privileges (aka root).

Terminal Session 1: Setup minio server

    cd /tmp
    wget https://dl.min.io/server/minio/release/linux-amd64/minio
    chmod u+x ./minio
    mkdir ./minio_data
    ./minio server ./minio_data

Note the data in /tmp/minio_data would normally live on a remote service.

You should see minio is running:

    WARNING: Detected default credentials 'minioadmin:minioadmin', we recommend that you change these values with 'MINIO_ROOT_USER' and 'MINIO_ROOT_PASSWORD' environment variables
    ......
    WARNING: Console endpoint is listening on a dynamic port (44599), please use --console-address ":PORT" to choose a static port.

Terminal Session 2: Start a usbd server

    cd /tmp
    mkdir ./usbdsrv_cache
    git clone https://github.com/tarndt/usbd.git
    cd usbd/cmd/usbdsrvd
    go build
    sudo ./usbdsrvd -dev-type=objstore -store-dir=/tmp/usbdsrv_cache -store-name=tutorial-vol -store-size=10GiB

You should see the block device is being exported:

    Generated AES-256 key and stored it in "/tmp/usbdsrv_cache/key.aes"
    USBD Server (./usbdsrvd) started.
    USBD Server (./usbdsrvd) using config: Exporting 10 GiB volume "tutorial-vol" as next available NBD device with local storage at "/tmp/usbdsrv_cache" using driver locally-cached objectstore using a s3 remote object store (secret_key=<REDACTED>, access_key_id="minioadmin", endpoint="http://127.0.0.1:9000"), with 64 MiB objects, using up to as much as total device size of persistent local storage for cache, s2 compression, aes-ctr encryption, flushing to remote story every 10s using 16 workers
    USBD Server (./usbdsrvd) is processing requests for "/dev/nbd0"

Take note of the device the volume is being exported on; the following instructions assume "/dev/nbd0" as shown above. Later, you can shut down this service using Ctrl+C or kill (but make sure you umount the device first!).

Terminal Session 3: Using it!

You can now format the device and start using it:

    sudo mkfs.ext4 /dev/nbd0
    cd /tmp
    mkdir ./tutorial-vol
    sudo mount /dev/nbd0 /tmp/tutorial-vol
    sudo chmod a+r,a+w /tmp/tutorial-vol/

The device should show as mounted, and files written to /tmp/tutorial-vol are now backed by the objectstore (in this case the local minio).

    $ mount | grep nbd
    /dev/nbd0 on /tmp/tutorial-vol type ext4 (rw,relatime)
    $ cd /tmp/tutorial-vol/
    $ echo "test file" > test.txt
    $ cat test.txt
    test file

Feel free to inspect usbdsrvd's cache and/or minio's data directories to observe the presence of data files.

To clean up, ensure you are not in /tmp/tutorial-vol (or you will get a busy error) and umount.

    cd /tmp
    sudo umount /tmp/tutorial-vol

You can now return to session 2 and Ctrl+C usbdsrvd, and then finally return to session 1 and Ctrl+C minio. To avoid data loss, it is key that the correct shutdown order is followed:

  1. Cease usage of the mount point
  2. Unmount the block device
  3. Shut down the daemon
  4. Shut down the objectstore

Future

Work is in progress on a patch to the Linux kernel NBD driver to allow dynamic creation of NBD devices, in addition to the pre-allocation at module load provided for today. This will allow facilities such as udev and the mknod system call or command to provision more NBD devices than initially created.

Work is in progress on a CSI storage driver that uses the objectstore implementation provided in this repository. This will allow Kubernetes to use the provided object-storage backed device to provide persistent volumes. If you are interested in contributing, please contact the author!
