borgbackup/backupdata

create lots of data for backup scalability testing

I made this to test scalability for borgbackup, but maybe you find it useful for testing other stuff, too.

mkdata.py

Realistically testing deduplicating backup software with a lot of data isn't easy if you do not have a lot of such data.

If you need to create such data, you can't just duplicate existing data (the backup tool would just deduplicate it and not create a lot of output data). Also, just fetching data from /dev/urandom is rather slow (and the data is not at all "realistic", because it is too random).

The solution is to start from a set of real files (maybe 1-2 GB in size), but to modify each copy slightly (and repeatedly, so there are not even long duplicate chunks inside the files) by inserting some bytes derived from a counter.
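To make that concrete, here is a minimal sketch of the approach in Python. It is not the actual mkdata.py code; the insertion distance STEP, the 8-byte marker format, and the function name mutate() are assumptions. The point is that each copy gets a few counter-derived bytes spliced in at regular intervals, so no long stretch of output repeats across copies.

    import struct

    STEP = 65536  # assumed distance between insertions, not taken from mkdata.py

    def mutate(data: bytes, counter: int) -> bytes:
        """Return a slightly modified copy of `data` for copy number `counter`."""
        marker = struct.pack("<Q", counter)    # 8 bytes derived from the counter
        out = bytearray()
        for offset in range(0, len(data), STEP):
            out += data[offset:offset + STEP]  # copy the next chunk unchanged
            out += marker                      # insert the counter bytes, again and again
        return bytes(out)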

Please note that due to this, all output files are "corrupt" copies, only intended as test data, and expected to be thrown away after the test. The input files are not modified on disk.

This tool expects some data in the SRC directory; it could look like this, for example (test data is not included, please use your own data):

    234M  testdata/bin     # linux executable binaries
    245M  testdata/jpg     # photos
    101M  testdata/ogg     # music
    4.0K  testdata/sparse  # 1x 1GB empty sparse file, name must be "sparse"
    259M  testdata/src_txt # source code, lots of text files
    151M  testdata/tgz     # 1x tar.gz file
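The 1 GB empty sparse file can be prepared with any method that creates a file of that size without writing data, for example `truncate -s 1G testdata/sparse` on Linux, or a small Python snippet like the following (a convenience helper, not part of mkdata.py):

    import os

    os.makedirs("testdata", exist_ok=True)
    with open("testdata/sparse", "wb") as f:
        f.truncate(1024 ** 3)  # 1 GB logical size, no data blocks actually written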

Make sure all the SRC data fits into memory, as it is read into RAM and kept there for better performance.

The tool creates N (modified) copies of this data set in directories named 0 .. N inside the DST directory.
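A hedged sketch of that overall flow, assuming the mutate() helper from the earlier sketch (the real mkdata.py may organise this differently; the special handling of the sparse file is left out here and shown after the next paragraph):

    import os

    def load_src(src):
        """Read every file below `src` into RAM, keyed by path relative to `src`."""
        files = {}
        for root, _dirs, names in os.walk(src):
            for name in names:
                path = os.path.join(root, name)
                with open(path, "rb") as f:
                    files[os.path.relpath(path, src)] = f.read()
        return files

    def make_copies(src, dst, n):
        """Write `n` mutated copies of the in-memory data set into numbered DST subdirectories."""
        files = load_src(src)
        for i in range(n):
            for rel, data in files.items():
                out_path = os.path.join(dst, str(i), rel)
                os.makedirs(os.path.dirname(out_path), exist_ok=True)
                with open(out_path, "wb") as out:
                    out.write(mutate(data, i))  # mutate() as sketched above

Because the whole source data set lives in one in-memory dict, the disk is only touched for writing the mutated output during the copy loop.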

The copies of the empty "sparse" file will also be created as empty sparse files and they won't be modified. This can be used to test extreme deduplication (or handling of sparse input files) by the tested backup tool.
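For that special case, here is a sketch of how an unmodified empty sparse copy could be produced (again an illustration, not mkdata.py itself):

    import os

    def write_sparse_copy(src_path, out_path):
        """Create an empty sparse copy with the same logical size as the input file."""
        size = os.path.getsize(src_path)  # logical size, e.g. 1 GB for testdata/sparse
        os.makedirs(os.path.dirname(out_path) or ".", exist_ok=True)
        with open(out_path, "wb") as f:
            f.truncate(size)              # extends the file without writing data, so it stays sparse

In the copy loop sketched above, this would be used instead of mutate() whenever the file name is "sparse", matching the naming requirement in the directory listing.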
