TestFileCreate - A small Linux app that creates test files in a single directory or in a tree of directories. Files can be identical or individually filled with random printable characters or binary data. Number of printable characters can be selected from the ASCII set (max 95). File size is selectable or random. Has a resource calculator.


Jim-JMCD/TestFilesCreate


A small Linux app that creates test files, filled with random data, in a single directory or a directory tree.

  • Includes a calculator for creating data in directory trees. [ Option -C ]
  • Creates anything from a single file to directory trees of files. Minimum file size is 2 bytes.
  • Files can be identical or individually filled with random content.
  • File contents can be either printable or binary. All contents are generated from /dev/urandom.
  • The selected pool of printable characters determines the complexity of the file contents.
  • File sizes can be identical or randomly chosen within a given range.
  • Runs either interactively or unattended in batch mode.

See the "Comparative benchmark testing of data compression and deduplication" section for how TestFilesCreate can be used as a standardised benchmark for comparing data storage reduction techniques.


TestFilesCreate is a portable Linux x64 executable created from the bash script TFile_Create (private GitHub repository) using shc.

Dependency

This requires a bash environment to run. An executable created with the shc utility always requires bash on x64.

More: GitHub shc


CALCULATOR mode (interactive, tree only)

TestFilesCreate -C

User inputs: tree depth, width, number of files per directory, and file size

Output: a summary, plus tables for the current tree and for smaller trees. Each table lists the data size and number of files for each tree.
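
The arithmetic behind such a calculation can be sketched in a few lines of shell. This is not the tool's own code: the function name tree_totals is hypothetical, and the formula (width^1 + ... + width^(depth-1) subdirectories, with files in every directory including the top-level one) is inferred from the first run in the EXAMPLES section, which reports 30 directories and 1550 files for -d 3 -w 5 -n 50.

```shell
# tree_totals DEPTH WIDTH FILES_PER_DIR FILE_SIZE_BYTES
# Prints "<subdirectories> <files> <total bytes>" on one line.
# Hypothetical helper: the formula is inferred from the README's example output.
tree_totals() {
    depth=$1 width=$2 per_dir=$3 size=$4
    dirs=0      # subdirectories below the top-level output directory
    level=1     # number of directories at the current level
    d=1
    while [ "$d" -lt "$depth" ]; do
        level=$((level * width))
        dirs=$((dirs + level))
        d=$((d + 1))
    done
    # Assumption: every directory, including the top-level one, holds files.
    files=$(( (dirs + 1) * per_dir ))
    echo "$dirs $files $((files * size))"
}

# -d 3 -w 5 -n 50 -f 15M, with 15M taken as 15 MiB
tree_totals 3 5 50 $((15 * 1024 * 1024))
# prints: 30 1550 24379392000
```

24379392000 bytes is about 22.71 GiB, which matches the "Storage used" line in that example.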


TEST DATA CREATION mode

DEFAULTS

  • File contents are binary. Use -P or -D to create compressible printable data.
  • If all files are the same size, their contents are also identical. Use the -r option to randomise file contents.
  • A time-stamped output directory is created in the user's current directory. Use the -o option to create it in another directory instead.
  • Interactive mode: after showing a summary, the tool asks for confirmation before proceeding. See the examples.

For maximum permitted values, see the LIMITATIONS section.

OPTIONS :

Directory Layout: all mandatory

For the following, n is a number; the minimum is 1.

  • -n n Number of files in each directory.
  • -d n Depth. How many directories deep.
  • -w n Width. How many directories wide.

Create single directory: -d 1 (-w, if set, is ignored)

Create tree of directories:

  • Minimum depth is -d 2
  • Minimum width is -w 1, and -w is mandatory
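
How depth and width combine can be shown with a small recursive sketch. It is an illustration only: the function name make_tree and the directory names d1, d2, ... are hypothetical and are not taken from TestFilesCreate.

```shell
# make_tree ROOT DEPTH WIDTH
# Creates ROOT and, below it, DEPTH-1 further levels with WIDTH
# subdirectories in every directory. Names (d1, d2, ...) are made up.
make_tree() {
    local root="$1" depth="$2" width="$3" i=1
    mkdir -p "$root" || return 1
    # Depth 1 is the single-directory case: nothing below the root.
    if [ "$depth" -le 1 ]; then
        return 0
    fi
    while [ "$i" -le "$width" ]; do
        make_tree "$root/d$i" "$((depth - 1))" "$width" || return 1
        i=$((i + 1))
    done
}

# Usage: make_tree ./demo_tree 3 5
```

With depth 3 and width 5 this creates 5 + 25 = 30 subdirectories, the same directory count the first run in the EXAMPLES section reports.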

File Size: a file size is mandatory

Fixed File Size

  • File sizes must carry a unit suffix: B, K, M or G. The minimum is 2B (2 bytes). Examples: 2KiB = 2K, 3MiB = 3M, 4GiB = 4G.
  • -f Fixed file size, default [usage: -f 2K]

NOTE: By default, fixed-size files of printable data are ALL IDENTICAL. To avoid this, use:

  • -r Random content is generated individually for every file.

Random File Size

  • All files individually filled with random content.
  • -s Smallest file size.
  • -l Largest file size.
  • If -s is omitted, the random range starts at 2B (2 bytes) when the largest file size is < 1G, and at 1M (1MiB) when the largest file size is >= 1G.
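
The LIMITATIONS section says the tool depends on 'shuf' and on 'head -c'. One way those two commands could combine to produce a randomly sized binary file is sketched below. The function name random_file is hypothetical, sizes are given in plain bytes, and this is not necessarily how TestFilesCreate does it.

```shell
# random_file PATH MIN_BYTES MAX_BYTES
# Writes a file of random binary data whose size lies in [MIN_BYTES, MAX_BYTES].
# Hypothetical helper; a sketch, not the tool's implementation.
random_file() {
    path=$1 min=$2 max=$3
    # shuf -i picks one integer uniformly from the inclusive range.
    size=$(shuf -i "${min}-${max}" -n 1) || return 1
    # head -c copies exactly that many bytes from /dev/urandom.
    head -c "$size" /dev/urandom > "$path"
}

# Usage: random_file ./sample.bin 2 2048
```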

File Contents: default is random binary

  • -P n Where n is a number in the range 1 to 95. Selects the pool of printable characters from the ASCII set.

    • n = 1 files only contain the uppercase 'A'
    • n = 2 to 26 files only contain lowercase Latin alphabet characters
    • n > 26 files contain printable ASCII characters. Max n = 95
  • -D n Where n is a number in the range 1 to 10. Selects the pool of digit characters from the ASCII set.

    • n = 1 files only contain zeros '0'
    • n > 1 files contain digits. Max n = 10
  • -r Random content for fixed file sizes.
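
The effect of a restricted character pool can be reproduced by filtering /dev/urandom. The sketch below assumes the pool is passed as a 'tr' character set, for example '0-4' for the five-digit set 01234 shown in the EXAMPLES section. The function name printable_fill is hypothetical and the tool's own method may differ.

```shell
# printable_fill PATH BYTES POOL
# POOL is a tr character set, e.g. '0-4' for the digit set 01234.
# Hypothetical helper; a sketch of pool-restricted random content.
printable_fill() {
    path=$1 bytes=$2 pool=$3
    # tr -dc deletes every byte that is NOT in the pool;
    # head -c stops the stream after BYTES bytes.
    LC_ALL=C tr -dc "$pool" < /dev/urandom | head -c "$bytes" > "$path"
}

# Usage: printable_fill ./digits.txt 1000 '0-4'
```

A smaller pool means fewer distinct symbols in the data, which is why a lower -P or -D value gives more compressible files.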

INPUT, OUTPUT and LOGGING

  • -b Batch/quiet run with no user checks. Default is interactive with user input.
  • -o Output to an existing directory.
  • Defaults to current working directory
  • Creates a new time stamped directory for content (tfc_YYMMDD_hhmm_ss).
  • No logging. In batch mode user has to redirect output to a file
  • Progress indicated by time stamping every ten directories filled with files.
  • The 'script' command can be used in interactive mode to record all activity.

LIMITATIONS

Data creation bails out before any data is created if:

  • The number of directories to be created exceeds 100 million
  • The number of files to be created exceeds 100 million
  • The 'shuf' command is not available
  • The -c option is not available for the 'head' command

For x64 Linux only. To-do list: AArch64/ARM64 version.

If the 'seq' command is not available, the character pool will not be displayed in the initial summary. The seq command is not required for file creation.
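
Before a long batch run, the same prerequisites can be checked by hand. The following is a minimal sketch with hypothetical function names (have_cmd, head_has_c, preflight). It mirrors the conditions listed above but is not the tool's own check.

```shell
# have_cmd NAME: succeeds if NAME is on PATH or is a shell builtin.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# head_has_c: succeeds if this system's head accepts -c (byte count).
head_has_c() {
    [ "$(printf 'abc' | head -c 2 2>/dev/null)" = "ab" ]
}

# preflight: fails on a missing hard requirement, warns about seq.
preflight() {
    have_cmd shuf || { echo "shuf not found" >&2; return 1; }
    head_has_c    || { echo "head does not support -c" >&2; return 1; }
    have_cmd seq  || echo "seq not found: pool preview unavailable" >&2
    return 0
}

# Usage: preflight && echo "prerequisites look fine"
```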

Binary data is generated from /dev/urandom. This data will not compress well. When such binary data is stored or transmitted, it may render data deduplication and compression ineffective.

FILE CONTENT VALIDATION

Validate contents, all files:

od -N <bytes> -Ax -t x1z <file name>
  • Where <bytes> is the number to check from beginning of file.
  • Non-printable characters will appear as "dots"

Validate printable character distribution:

od -a <file name> | cut -b 9- | tr " " \\n | egrep -v "^$" | sort | uniq -c

OR

sed 's/\(.\)/\1\n/g' <file name> | sort | uniq -c

Output

  • Column 1 : Character count
  • Column 2 : Character being counted. NOTE : This column should only contain a single character. If it contains more than one character, the file contents are binary data.

Validate printable character pool count:

Example: confirms that a complexity of 17, given by -P 17, contains a pool of 17 different characters.

od -a <file name> | cut -b 9- | tr " " \\n | egrep -v "^$" | sort | uniq -c | wc -l

OR

sed 's/\(.\)/\1\n/g' <file name> | sort | uniq -c | wc -l
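
For scripted checks, the pool count can be wrapped in a small function. The sketch below uses fold instead of the od or sed pipelines. That is a substitution made here for brevity, not something the tool ships. The function names pool_count and check_pool are hypothetical.

```shell
# pool_count FILE: number of distinct characters in FILE (newlines ignored).
pool_count() {
    # fold -b -w 1 puts every byte on its own line, grep drops empty lines,
    # sort -u keeps one line per distinct character, wc -l counts them.
    LC_ALL=C fold -b -w 1 "$1" | grep -v '^$' | LC_ALL=C sort -u | wc -l | tr -d ' '
}

# check_pool FILE N: succeeds when FILE uses exactly N distinct characters.
check_pool() {
    [ "$(pool_count "$1")" -eq "$2" ]
}

# Usage: check_pool <file name> 17 && echo "pool size confirmed"
```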

EXAMPLES

TestFilesCreate -P 28 -d 3 -w 5 -f 15M -n 50

DIRECTORY TREE each directory contains 5 directories and 50 files
The tree is 3 levels deep
Output: /home/ted/test/tfc_240930-1759-37
All files with identical contents
Files created are all 15M
Storage used...... 22.71G (max potential)
File Contents..... Random selection from the 28 char set: !"#$%&'()*+,-./0123456789:;<
Total data directories........30
Total data files............1550
Do you want to proceed? (y/n)

TestFilesCreate -D 5 -d 1 -f 600K -n 1000 -r -o /home/ted/test

SINGLE DIRECTORY containing 1000 files
Output: /home/ted/test/tfc_240930-1802-53
Random data created individually for all files
Files created are all 600K
Storage used...... 585.94M (max potential)
File Contents..... Random selection from the 5 digit set: 01234
Total data directories.........1
Total data files............1000
Do you want to proceed? (y/n)

Comparative benchmark testing of data compression and deduplication

TestFilesCreate can be used as a standardised benchmark for comparing data storage reduction techniques.

In these examples the Data Complexity is set by the -P option. A data complexity of 10 = -P 10 and a data complexity of 12 = -P 12.

For more information on the creation of the charts, see the TestFilesCreate datasheet.
