Greybox fuzzer
This document describes principles and capabilities of the implemented fuzzing module.
For what?
Any symbolic execution engine has its drawbacks and limitations:
- Generation of some complex non-trivial objects often is impossible
- Computational costs for solving the system of restrictions
- State space explosion problem
Based on these problems, a fuzzing module without symbolic execution is required for full-fledged testing of programs. Also, modern research shows that the most effective is thehybrid mode, when symbolic and concrete executions help each other. The most effective way of fuzzing without symbolic execution is greybox fuzzing, which is proposed in this PR.
How it works?
You can read about how greybox fuzzing workshere.
Fuzzing process can be divided into 2 stages: exploration and exploitation. In the exploration phase, seeds are generated to test the target function, in the exploitation phase, mutations are applied to the best seeds. Why we need exploitation phase? Consider an example of a simple Java program:
public int test(int[] c) { if (c[0] == 0) { if (c[1] == 1) { if (c[2] == 2) { //Bug here return 42; } } } return -1;}
The probability that we initially generate an array that finds a bug is 0%. It is much more likely to generate an array whose first element is 0, and then through mutations to get data that will find a bug.
Exploration phase
To implement the exploration phase, junit-quickcheck was used, which contains built-in configurable generators for many java library types. Initially, the library generated objects that were converted to UtModel using UtModelConstructor. But this approach has shown its inoperability due to the impossibility of converting complex objects containing a large number of fields into UtModel (java.lang.Thread for example). Therefore, it was decided to rewrite the library (packageorg.utbot.quickcheck) to generate UtModels instead of objects. Also in theorg.utbot.engine.greyboxfuzzer.generator.GeneratorConfigurator class it is possible to configure generators by limiting the range of generated values and the size of collections.
In addition to generators for many types from the standard library, a generator for user-defined classes has been implemented, which has the following capabilities:
- Generation of an object using a constructor
- Generation of an object using static methods
- Producing of an object from a static field
- Setting random values in the fields of the generated object
- Generation of classes that implement the interface
The biggest difficulty in generating user classes is the processing of type parameters. For this, the libraries javaruntype (70 kB) and generics-resolver (77 kB) were used. In the implemented module, it is possible to replace type parameters with values suitable for bound, for example, in the function
public <T extends Number> int test(T a) { return 0;}
The T parameter can be replaced by random inheritor of the Number class.
The algorithm for object generation works as follows:
- An instance of the class is generated, which contains the target function (this instance)
- For each parameter of the target function, a
ParameterTypeContext is built containing the necessary information for resolving type parameters - A generator is searched in the GeneratorRepository, if it is a library type for which a generator exists, then it is taken, if not --- then a UserClassGenerator is generated
- UserClassGenerator chooses how to generate an object: with a probability of 2/3 a constructor, otherwise a static method or field
- When generating an object using a constructor, the constructor with the min number of parameters is selected with the highest probability
- Search for static methods and fields of object generation is performed using the Soot library
- The components responsible for generating custom class instances are in the
org.utbot.engine.greyboxfuzzer.generator.userclasses.generator package
- The result of generation is UtAssembleModel
The result of the exploration phase is a sorted set of seeds. Seeds are ranked according to the new coverage it opens.
Exploitation phase
This phase is designed to mutate seeds from the exploration phase.
Exploitation phase has the following features:
- Calling a random class method of the generated object
- Setting a random field to a random value
- A set of built-in mutations for arrays
During the work of this phase, the ranking of seeds is also carried out, the preference is given to "more successful".
Experiments
Experiments are carried out on projects from the SBST competition
TODO()
Current status
The exploration phase is currently being tested and work in progress with exploitation phase.
Greybox fuzzer
This document describes principles and capabilities of the implemented fuzzing module.
For what?
Any symbolic execution engine has its drawbacks and limitations:
Based on these problems, a fuzzing module without symbolic execution is required for full-fledged testing of programs. Also, modern research shows that the most effective is thehybrid mode, when symbolic and concrete executions help each other. The most effective way of fuzzing without symbolic execution is greybox fuzzing, which is proposed in this PR.
How it works?
You can read about how greybox fuzzing workshere.
Fuzzing process can be divided into 2 stages: exploration and exploitation. In the exploration phase, seeds are generated to test the target function, in the exploitation phase, mutations are applied to the best seeds. Why we need exploitation phase? Consider an example of a simple Java program:
The probability that we initially generate an array that finds a bug is 0%. It is much more likely to generate an array whose first element is 0, and then through mutations to get data that will find a bug.
Exploration phase
To implement the exploration phase, junit-quickcheck was used, which contains built-in configurable generators for many java library types. Initially, the library generated objects that were converted to UtModel using UtModelConstructor. But this approach has shown its inoperability due to the impossibility of converting complex objects containing a large number of fields into UtModel (
java.lang.Threadfor example). Therefore, it was decided to rewrite the library (packageorg.utbot.quickcheck) to generate UtModels instead of objects. Also in theorg.utbot.engine.greyboxfuzzer.generator.GeneratorConfiguratorclass it is possible to configure generators by limiting the range of generated values and the size of collections.In addition to generators for many types from the standard library, a generator for user-defined classes has been implemented, which has the following capabilities:
The biggest difficulty in generating user classes is the processing of type parameters. For this, the libraries javaruntype (70 kB) and generics-resolver (77 kB) were used. In the implemented module, it is possible to replace type parameters with values suitable for bound, for example, in the function
The T parameter can be replaced by random inheritor of the Number class.
The algorithm for object generation works as follows:
ParameterTypeContextis built containing the necessary information for resolving type parametersorg.utbot.engine.greyboxfuzzer.generator.userclasses.generatorpackageThe result of the exploration phase is a sorted set of seeds. Seeds are ranked according to the new coverage it opens.
Exploitation phase
This phase is designed to mutate seeds from the exploration phase.
Exploitation phase has the following features:
During the work of this phase, the ranking of seeds is also carried out, the preference is given to "more successful".
Experiments
Experiments are carried out on projects from the SBST competition
TODO()
Current status
The exploration phase is currently being tested and work in progress with exploitation phase.