copytester: A utility to run a corpus of inputs against a running server
honggfuzz-pg.patch: patch to allow using the honggfuzz fuzzer to test COPY FROM
gencopyfuzz: a program to generate more input files
corpus: prepared corpus of different inputs, generated manually and by honggfuzz
Using copytester
Usage:
copytester <inputdir> <connstring>
copytester connects to a running PostgreSQL server and issues a COPY FROM command to load each file into a temporary table. It prints out any errors, and the final contents of the table. This is useful for comparing the behavior of two PostgreSQL versions, or a patched server against an unpatched one.
There are a bunch of input files included in the 'corpus' directory, and you can use the included 'gencopyfuzz' program or honggfuzz to generate more.
For example, you can run copytester against two servers and check whether they produce the same result:
copytester corpus "dbname=postgres port=5432" > results-A.txt
copytester corpus "dbname=postgres port=5433" > results-B.txt
diff -u results-A.txt results-B.txt
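The two-server comparison above generalizes to any number of servers. The following is a minimal sketch, not part of the repository: the function name `compare_ports`, the `COPYTESTER` variable, and the `results-<port>.txt` file names are assumptions made for illustration. It treats the first port as the baseline and diffs every other result against it.

```shell
# Hypothetical helper, not shipped with copytester.
# COPYTESTER may be set to the path of the copytester binary.
COPYTESTER=${COPYTESTER:-copytester}

# compare_ports <inputdir> <port>...
# Runs copytester against each port and diffs each result against the
# result of the first port. Returns non-zero if any result differs.
compare_ports() {
    corpus=$1; shift
    baseline=
    status=0
    for port in "$@"; do
        out="results-$port.txt"
        "$COPYTESTER" "$corpus" "dbname=postgres port=$port" > "$out"
        if [ -z "$baseline" ]; then
            baseline=$out          # first server is the reference
        elif ! diff -u "$baseline" "$out"; then
            status=1               # remember the mismatch, keep going
        fi
    done
    return $status
}
```

With servers listening on ports 5432, 5433 and 5434, this could be invoked as `compare_ports corpus 5432 5433 5434`. Any unified diff output points at an input file whose handling differs between servers.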
Using gencopyfuzz
make
./gencopyfuzz corpus
This generates files named 'gencopyfuzz-[00000-77777]' in the corpus directory.
Using honggfuzz
Apply honggfuzz-pg.patch to PostgreSQL sources:
cd <PostgreSQL source directory>
patch -p1 < honggfuzz-pg.patch
Create a test cluster following the instructions in startfuzz.sh
Run honggfuzz:
./startfuzz.sh
The 'corpus' directory in the git repository contains test inputs that were generated with this method. If you just want to run the existing tests against a running server, you don't need to run honggfuzz yourself.
Corpus
The 'corpus' directory contains test input files for COPY FROM. A few of them were created by hand, the rest were generated by honggfuzz. Run 'gencopyfuzz corpus' to generate another set of inputs.
The existing corpus was generated with UTF-8 as the client and server encoding. To test other encodings and encoding conversions, you may want to edit the dictionary in gencopyfuzz.c, and also run honggfuzz yourself with different settings.
Tips
By default, copytester sends the input file to the server one byte at a time. That's highly inefficient, but useful for finding bugs in the server's handling of look-ahead and buffer boundaries. You can adjust the RAW_BUF_SIZE constant if you don't want that.
Similarly, it can be very useful to reduce the server's input buffer size, by changing the RAW_BUF_SIZE constant in src/include/commands/copyfromparse_internal.h in the PostgreSQL source tree.
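Changing either RAW_BUF_SIZE constant means editing a source file and rebuilding. The following is a rough sketch of scripting that edit. It assumes the constant is written as a one-line `#define RAW_BUF_SIZE <integer>`, which this README does not confirm, and the helper name `set_raw_buf_size` is made up for illustration. The helper refuses to touch a file in which it cannot find such a line.

```shell
# Hypothetical helper, not shipped with copytester.
# set_raw_buf_size <file> <size>
# Assumes <file> contains a line of the form "#define RAW_BUF_SIZE <integer>".
set_raw_buf_size() {
    file=$1
    size=$2
    # Bail out if the expected definition is not present.
    grep -Eq '^#define[[:space:]]+RAW_BUF_SIZE[[:space:]]+[0-9]+' "$file" || {
        echo "RAW_BUF_SIZE definition not found in $file" >&2
        return 1
    }
    # Replace only the number; -i.bak leaves the original in <file>.bak.
    sed -i.bak -E \
        "s/^(#define[[:space:]]+RAW_BUF_SIZE[[:space:]]+)[0-9]+/\\1$size/" \
        "$file"
}
```

After running it on the file that defines the constant, rebuild the affected program. This README does not say which values are safe, so check the surrounding code before picking a very small size.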
Most of the corpus has been generated by fuzzing with UTF-8. But it might still be useful to run it with other encodings. Many cases will fail with invalid encoding errors, but some inputs happen to be valid in other encodings, too.
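One way to try other client encodings is to add libpq's standard `client_encoding` parameter to the connection string. The following is a sketch under assumptions: the function name `run_with_encodings`, the `COPYTESTER` variable, and the `results-<encoding>.txt` file names are illustrative. This README also does not say whether copytester sets the client encoding itself, which would override the parameter.

```shell
# Hypothetical helper, not shipped with copytester.
# COPYTESTER may be set to the path of the copytester binary.
COPYTESTER=${COPYTESTER:-copytester}

# run_with_encodings <inputdir> <connstring> <encoding>...
# Runs copytester once per client encoding and writes one result file each.
run_with_encodings() {
    corpus=$1
    base_conn=$2
    shift 2
    for enc in "$@"; do
        # client_encoding is a regular libpq connection keyword.
        "$COPYTESTER" "$corpus" "$base_conn client_encoding=$enc" \
            > "results-$enc.txt"
    done
}
```

For example, `run_with_encodings corpus "dbname=postgres port=5432" LATIN1 EUC_JP` would produce results-LATIN1.txt and results-EUC_JP.txt. The server encoding is a property of the database, so testing a different server encoding needs a database created with that encoding.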