- Notifications
You must be signed in to change notification settings - Fork0
uw-unsat/bpf-jit-verif
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
This repository has been moved tohttps://github.com/uw-unsat/serval-bpf .Please visit that repository if you want the latest version of the codeand guide.
This repository contains a tool for formally verifyingthe Linux BPF JITs that buildson our verification frameworkServal.
We used this tool to help develop the RV32 BPF JIT(inarch/riscv/net/bpf_jit_comp32.c).
We also used this tool to find the bugs in the RV64 BPF JIT(1e692f09e091),and also to review the patches that add support for far jumps and branching:
- [PATCH bpf-next 2/8] riscv, bpf: add support for far branching
- [PATCH bpf-next 3/8] riscv, bpf: add support for far jumps and exits
After installing Racket, install a good version of Serval with
git clone --recursive'https://github.com/uw-unsat/bpf-jit-verif.git'cd bpf-jit-verifraco pkg install --auto ./serval
arch/riscv/net/bpf_jit_comp.c contains the C implementation ofthe BPF JIT for rv64 from Linux.
arch/riscv/net/bpf_jit_comp32.c contains a C implementation ofthe BPF JIT for rv32. It is generated from the Racket implementationunderracket/rv32/.
racket/lib/bpf-common.rkt contains the BPF JIT correctnessspecification and other common BPF functionality.
racket/lib/riscv-common.rkt provides features specificto the JIT for the RISC-V ISA.
racket/lib/lemmas.lean contains the axiomatization of bitvectoroperations in Lean.
racket/rv64/bpf_jit_riscv64.rkt is a manual translationof the C implementation into Racket for verification.
racket/rv64/spec.rkt contains the specification of the BPF JITfor rv64.
racket/rv64/synthesis.rkt contains the synthesis infrastructureand sketch for the BPF JIT for rv64.
racket/rv32/bpf_jit_comp32.rkt is a Racket implementation of theBPF JIT for rv32. It is used withracket/rv32/bpf_jit_comp32.c.tmplto generate the final C implementation.
racket/rv32/spec.rkt contains the specification of the BPF JITfor rv32.
racket/rv32/synthesis.rkt contains the synthesis infrastructureand sketch for the BPF JIT for rv32.
racket/test/ contains verification and synthesis test cases fordifferent classes of instructions.
racotest --jobs 8 racket/testruns all of the verification test cases in parallelusing 8 jobs. You can also runindividual files for a specific class of instructions,e.g.,
racotest racket/test/rv64/verify-alu32-x.rktAs an example, let's inject a bug fixed in commit1e692f09e091.Specifically, remove the zero extension forBPF_ALU|BPF_ADD|BPF_Xinracket/rv64/bpf_jit_riscv64.rkt:
[(list (or 'BPF_ALU 'BPF_ALU64) 'BPF_ADD 'BPF_X) (emit (if is64 (rv_add rd rd rs) (rv_addw rd rd rs)) ctx) (when (! is64) (emit_zext_32 rd ctx))]Now we have a buggy JIT implementation:
[(list (or 'BPF_ALU 'BPF_ALU64) 'BPF_ADD 'BPF_X) (emit (if is64 (rv_add rd rd rs) (rv_addw rd rd rs)) ctx)]Run the verification:
racotest racket/test/rv64/verify-alu32-x.rktVerification will fail with a counterexample:
Running test "VERIFY (BPF_ALU BPF_ADD BPF_X)"--------------------riscv64-alu32-x tests > VERIFY (BPF_ALU BPF_ADD BPF_X)FAILUREname: check-unsat?location: [...]/bpf-common.rkt:218:4params: '((model [x?$0 #f] [r0$0 (bv #x0000000080000000 64)] [r1$0 (bv #x0000000000000000 64)] [insn$0 (bv #x00800000 32)] [offsets$0 (fv (bitvector 32)~>(bitvector 32))] [target-pc-base$0 (bv #x0000000000000000 64)] [off$0 (bv #x8000 16)]))--------------------The counterexample produced by the tool givesBPF register values that will trigger the bug:it chooses r0 to be 0x0000000080000000and r1 to be 0x0000000000000000. This demonstratesthe bug because the RISC-V instructions sign extendthe result of the 32-bit addition, where the BPF instructionperforms zero extension.
The verification works by proving equivalencebetween a BPF instruction and the RISC-V instructionsgenerated by the JIT for that instruction.
As a concrete example, consider verifying theBPF_ALU|BPF_ADD|BPF_Xinstruction. The verification starts by constructing asymbolic BPFinstruction where the destination and source register fields cantake on any legit value:
BPF_ALU32_REG(BPF_ADD, {{rd}}, {{rs}})Next, the verifier symbolically evaluates the JIT on the BPFinstruction, producing a set of possible sequences of symbolicRISC-V instructions:
addw {{rv_reg(rd)}} {{rv_reg(rd)}} {{rv_reg(rs)}}slli {{rv_reg(rd)}} {{rv_reg(rd)}} 32srli {{rv_reg(rd)}} {{rv_reg(rd)}} 32Here,rv_reg denotes the RISC-V register associatedwith a particular BPF register.
Next, the tool proves that every possible sequence of generatedRISC-V instructions has the equivalent behavior as the original BPFinstruction, for all possible choices of registersrd andrs,and for all initial values of all RISC-V general-purpose registers.To do so, it leverages automated verifiers provided by theServalverification framework, as follows.
The verifier starts with a symbolic BPF state and symbolic RISC-Vstate, calledbpf-state andriscv-state, assuming that the twoinitial states match, e.g.,riscv-state[rv_reg(reg)] == bpf-state[reg]for allreg. Next, it runs the Serval BPF and RISC-V verifierson the BPF instruction overbpf-state and every possible sequenceof generated RISC-V instructions overriscv-state, respectively.It then proves that the final BPFand RISC-V states still match.
Support for more complex instructions, such as jumps and branches,requires additional care. For the details, you can see the JITcorrectness definition inlib/bpf-common.rkt:verify-jit-refinement.This complexity comes from having to prove that the generatedinstructions preserve a correspondence between the BPF programcounter and the RISC-V program counter.
To help develop and optimize the RV32 JIT, we used Rosette'sprogram synthesis feature to synthesize per-instructioncompilers for a subset of BPF instructions.The synthesis process takes as input the BPF and RISC-V interpretersfrom Serval, the BPF JIT correctness specification, and a programsketch which describes the structure of the code to search for.Synthesis then exhaustively searches increasingly large candidatesequences in the search space until it finds one that satisfiesthe JIT correctness specification.
You can try this feature out by running the following:
racotest racket/test/rv32/synthesize-alu64-x.rktNote that this test will attempt to use theBoolectorSMT solver first; if you do not have it installed itwill fall back to Rosette's provided Z3, which may takesignificantly longer (more than 10x slower on my laptop).It will produce output similar to the following:
riscv32-alu64-x synthesisRunning test "SYNTHESIZE (BPF_ALU64 BPF_SUB BPF_X)"Using solver #<boolector>Synthesizing for op '(BPF_ALU64 BPF_SUB BPF_X) with size 0Synthesizing for op '(BPF_ALU64 BPF_SUB BPF_X) with size 1Synthesizing for op '(BPF_ALU64 BPF_SUB BPF_X) with size 2Synthesizing for op '(BPF_ALU64 BPF_SUB BPF_X) with size 3Synthesizing for op '(BPF_ALU64 BPF_SUB BPF_X) with size 4Solution found for '(BPF_ALU64 BPF_SUB BPF_X):void emit_op(u8 rd, u8 rs, s32 imm, struct rv_jit_context *ctx) { emit(rv_sub(RV_REG_T1, hi(rd), hi(rs)), ctx); emit(rv_sltu(RV_REG_T0, lo(rd), lo(rs)), ctx); emit(rv_sub(hi(rd), RV_REG_T1, RV_REG_T0), ctx); emit(rv_sub(lo(rd), lo(rd), lo(rs)), ctx);}cpu time: 978 real time: 66937 gc time: 49In this example, synthesis was able to find a sequence offour RV32 instructions that correctly emulate the behaviorof aBPF_ALU64 BPF_SUB BPF_X instruction. This solutionis printed out as a C functionemit_op.You can see how this is integrated into the final JITinarch/riscv/net/bpf_jit_comp32.c,in theemit_rv32_alu_r64 function, in theBPF_SUB case. Note that the instructions in the JIT maybe different than the solution the solver on your machinehappens to find.
You can play around with synthesizing other instructionsby looking in the otherracket/test/rv32/synthesize-*.rkt files,and uncommenting cases for other instructions in those files.
Not every instruction sequence in the final JIT was found using synthesis.Many synthesis queries either take extremely long or do notproduce any results: especially those that require very longinstruction sequences (e.g., 64-bit shifts), or those thatuse non-linear arithmetic (multiplication, division, etc.)For these instructions, we manually write an implementationand verify the correctness using the other techniques describedin this guide.
The RV32 JIT is split in two parts: the Racket implementationinracket/rv32/bpf_jit_comp32.rkt contains the code requiredfor generating RV32 instruction sequences for a given BPF instruction,andracket/rv32/bpf_jit_comp32.c.tmpl is a template containingglue C code required to have the JIT interface compile in C andinterface with the rest of the kernel. The templatecontains holes that are filled in by extracting the corresponding Racketsource code to C.
The final C code inarch/riscv/net/bpf_jit_comp32.c can be generatedfrom these two components via:
make arch/riscv/net/bpf_jit_comp32.c
This file can be copied to the Linux kernel source tree for buildingand testing.
The test cases underracket/test/ show which instructionsthe JIT is currently verified for. For those instructions,it proves that the JIT is correct for all possible initialregister values, for all jump offsets, for all immediate values, etc.
There are several properties of the JIT thatare currently not specified or verified:
- Memory instructions (e.g.
BPF_LD,BPF_ST, etc.) - JIT prologue and epilogue
- 64-bit immediate load (
BPF_LD | BPF_IMM | BPF_DW) - Call instructions (e.g.,
BPF_CALL,BPF_TAIL_CALL) build_body: the verification proves the JIT correctfor individual BPF instructions at a time.- The loop in
bpf_int_jit_compile: verification assumesthectx->offsetmapping has already been correctlyconstructed by previous iterations. - The verified JIT is a manual Rosette port of the C version:mismatches in this translation can mean there are bugsin the C version not present in the verified one.
- The verification assumes that the BPF program being compiledpassed the kernel's BPF verifier: e.g., it assumes nodivide-by-zero or out-of-range shifts.
- Verification makes assumptions on the total size of thecompiled BPF program for jumps: it assumes that the number of BPFinstructions is less than 16,777,216; and that the total numberof generated RISC-V instructions is less than 134,217,728.These bounds can be increased but will increase overallverification time for jumps.
About
Resources
Uh oh!
There was an error while loading.Please reload this page.
Stars
Watchers
Forks
Packages0
Contributors2
Uh oh!
There was an error while loading.Please reload this page.