dillonhuff/SFGen
This repo is a simple synthesis tool that lets you write functional units like dividers and square roots as Python functions using a pre-built bit vector library, and then compile them into Verilog when you are ready to synthesize them.
Run the following commands to install and run the unit tests (note that you will need icarus verilog for some of the unit tests):

    git clone https://github.com/dillonhuff/SFGen.git
    cd SFGen
    pytest

Look in the file ./examples/cube.py. You should see a function cube(x) that takes in one argument and returns the cube of the argument:

    from sfgen.bit_vector import *

    def cube(x):
        out = x * x * x
        return out

A simple Python testbench for this function is shown in test/test_cube.py:

    from sfgen.bit_vector import *
    from examples.cube import *

    def test_cube():
        width = 32
        a = bv_from_int(width, 7)
        correct = bv_from_int(width, 7*7*7)
        print('a =', a)
        print('correct =', correct)
        print('cube(a) =', cube(a))
        assert(cube(a) == correct)

We can run it like so:

    pytest test_main.py test/test_cube.py

With all tests passing we are ready to generate Verilog for our design. To generate Verilog from cube we use a synthesis script located in examples/synthesize_cube.py. The code looks like so:

    import os
    import os.path
    import sys

    dir_path = os.path.dirname(os.path.realpath(__file__))
    sys.path.append(os.path.abspath(os.path.join(dir_path, os.pardir)))

    from sfgen.verilog_backend import *

    constraints = ScheduleConstraints()
    synthesize_verilog('examples/cube', 'cube', [l.ArrayType(32)], constraints)

To run this script and generate verilog use the command:

    python ./examples/synthesize_cube.py

You should now see a new file called cube_32.v that contains an implementation of the cube function as a combinational circuit using 32 bit multipliers.

    module builtin_assign_32(in, out);
      input [31:0] in;
      output [31:0] out;
      assign out = in;
    endmodule

    module builtin_mult_32_32(in1, in0, out);
      input [31:0] in0;
      input [31:0] in1;
      output [31:0] out;
      assign out = in0 * in1;
    endmodule

    module cube_32(x, out);
      input [31:0] x;
      output [31:0] out;
      wire [31:0] fs_0;
      wire [31:0] fs_1;
      wire [31:0] fresh_wire_0;
      wire [31:0] fresh_wire_2;
      wire [31:0] fresh_wire_4;
      builtin_assign_32 fresh_assign_1(.in(fresh_wire_0), .out(fs_0));
      builtin_mult_32_32 mult_32_0(.in0(x), .in1(x), .out(fresh_wire_0));
      builtin_assign_32 fresh_assign_3(.in(fresh_wire_2), .out(fs_1));
      builtin_mult_32_32 mult_32_1(.in0(fs_0), .in1(x), .out(fresh_wire_2));
      builtin_assign_32 fresh_assign_5(.in(fresh_wire_4), .out(out));
      builtin_assign_32 assign_32_2(.in(fs_1), .out(fresh_wire_4));
    endmodule

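
One detail worth noting in the generated module: builtin_mult_32_32 takes 32 bit inputs and drives a 32 bit output, so each product keeps only its low 32 bits. The following plain-Python sketch models that behaviour. It assumes an unsigned interpretation of the bits, and the names mult_32 and cube_32_model are hypothetical helpers for illustration, not part of SFGen.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1  # keeps the low 32 bits of a value


def mult_32(in0, in1):
    """Model of builtin_mult_32_32: multiply, then keep the low 32 bits."""
    return (in0 * in1) & MASK


def cube_32_model(x):
    """Hypothetical model of cube_32: two chained 32 bit multipliers."""
    fs_0 = mult_32(x, x)
    fs_1 = mult_32(fs_0, x)
    return fs_1


print(cube_32_model(7))        # 343
print(cube_32_model(2 ** 11))  # 0, because 2**33 does not fit in 32 bits
```

Inputs whose cube exceeds 32 bits therefore wrap around, which is worth keeping in mind when choosing test values.
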
The implementation of cube above uses 2 multipliers, but what if we only want to use one multiplier? We can add a resource constraint that forces the synthesis program to do both operations on the same multiplier by splitting the operations up over two cycles.
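
To make the idea of splitting operations over cycles concrete, here is a minimal sketch of greedy resource-constrained scheduling. It is not SFGen's scheduler: the function schedule, its arguments, and the operation names mul0 and mul1 are all assumptions made for illustration. It also assumes that an operation may share a cycle with the operation that produces its input, and that every limit is at least 1.

```python
def schedule(ops, deps, resource_of, limits):
    """Greedy resource-constrained scheduling (illustrative only).

    ops         -- operation names in dependency (topological) order
    deps        -- dict: op -> list of ops whose results it consumes
    resource_of -- dict: op -> resource name, e.g. 'mult_32'
    limits      -- dict: resource name -> max units per cycle (>= 1)
    Returns a dict: op -> cycle number.
    """
    cycle_of = {}
    used = {}  # (cycle, resource) -> units already claimed
    for op in ops:
        # Earliest cycle allowed by data dependencies; sharing a cycle
        # with the producer models combinational chaining.
        cycle = max((cycle_of[d] for d in deps.get(op, [])), default=0)
        res = resource_of[op]
        limit = limits.get(res)  # None means the resource is unconstrained
        while limit is not None and used.get((cycle, res), 0) >= limit:
            cycle += 1  # all units busy in this cycle, try the next one
        used[(cycle, res)] = used.get((cycle, res), 0) + 1
        cycle_of[op] = cycle
    return cycle_of


ops = ['mul0', 'mul1']          # x*x, then (x*x)*x
deps = {'mul1': ['mul0']}
resource_of = {'mul0': 'mult_32', 'mul1': 'mult_32'}

print(schedule(ops, deps, resource_of, {}))              # {'mul0': 0, 'mul1': 0}
print(schedule(ops, deps, resource_of, {'mult_32': 1}))  # {'mul0': 0, 'mul1': 1}
```

With no limit both multiplications land in cycle 0, which corresponds to the combinational circuit; with one multiplier the second multiplication is pushed to cycle 1.
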
You can see how to add this constraint in the synthesis script in examples/synthesize_cube_one_mult.py. The script is the same as the previous one, with one added line after the creation of the constraints variable:

    constraints.set_resource_count('mult_32', 1)

This line tells the compiler that only one multiplier can be used to implement the circuit. We run this new synthesis script like so:

    python ./examples/synthesize_cube_one_mult.py

The new Verilog is a sequential circuit with only one multiplier, plus a stage counter and multiplexers to control the data input to the multiplier:

    module builtin_counter_1(rst, clk, out);
      input [0:0] clk;
      input [0:0] rst;
      output [0:0] out;
      reg [0:0] stage_num;
      always @(posedge clk) begin
        if (rst) begin
          stage_num <= 0;
        end else if (stage_num == 1) begin
          stage_num <= 0;
        end else begin
          stage_num <= stage_num + 1;
        end
      end
      assign out = stage_num;
    endmodule

    module builtin_fifo_0_32(in, clk, out);
      input [31:0] in;
      input [0:0] clk;
      output [31:0] out;
      assign out = in;
    endmodule

    module builtin_fifo_1_32(in, clk, out);
      input [31:0] in;
      input [0:0] clk;
      output [31:0] out;
      reg [31:0] delay_reg_0;
      always @(posedge clk) begin
        delay_reg_0 <= in;
      end
      assign out = delay_reg_0;
    endmodule

    module builtin_mux_2_32(in1, sel, in0, out);
      input [31:0] in0;
      input [31:0] in1;
      input [0:0] sel;
      output [31:0] out;
      reg [31:0] out_reg;
      always @(*) begin
        case(sel)
          1'b0: out_reg = in0;
          1'b1: out_reg = in1;
        endcase
      end
      assign out = out_reg;
    endmodule

    module builtin_assign_32(in, out);
      input [31:0] in;
      output [31:0] out;
      assign out = in;
    endmodule

    module builtin_mult_32_32(in1, in0, out);
      input [31:0] in0;
      input [31:0] in1;
      output [31:0] out;
      assign out = in0 * in1;
    endmodule

    module builtin_constant_32_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx(out);
      output [31:0] out;
      assign out = 32'bxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx;
    endmodule

    module cube_32(x, en, clk, out);
      input [31:0] x;
      output [31:0] out;
      wire [31:0] fs_0;
      wire [31:0] fs_1;
      wire [0:0] global_stage_counter;
      input [0:0] clk;
      input [0:0] en;
      wire [31:0] fresh_wire_0;
      wire [31:0] fresh_wire_2;
      wire [31:0] fresh_wire_4;
      wire [31:0] fresh_wire_6;
      wire [31:0] fresh_wire_8;
      wire [31:0] fresh_wire_10;
      wire [31:0] fresh_wire_12;
      wire [31:0] undefined_value_16;
      wire [31:0] fresh_wire_17;
      wire [31:0] fresh_wire_19;
      wire [31:0] fresh_wire_21;
      builtin_counter_1 stage_counter(.clk(clk), .rst(en), .out(global_stage_counter));
      builtin_fifo_0_32 fifo_1(.in(x), .out(fresh_wire_0), .clk(clk));
      builtin_fifo_1_32 fifo_3(.in(fs_0), .out(fresh_wire_2), .clk(clk));
      builtin_mux_2_32 in_mux_5(.sel(global_stage_counter), .in0(fresh_wire_0), .in1(fresh_wire_2), .out(fresh_wire_4));
      builtin_fifo_0_32 fifo_7(.in(x), .out(fresh_wire_6), .clk(clk));
      builtin_fifo_1_32 fifo_9(.in(x), .out(fresh_wire_8), .clk(clk));
      builtin_mux_2_32 in_mux_11(.sel(global_stage_counter), .in0(fresh_wire_6), .in1(fresh_wire_8), .out(fresh_wire_10));
      builtin_assign_32 fresh_assign_13(.in(fresh_wire_12), .out(fs_0));
      builtin_assign_32 fresh_assign_14(.in(fresh_wire_12), .out(fs_1));
      builtin_mult_32_32 mult_32_0(.in0(fresh_wire_4), .in1(fresh_wire_10), .out(fresh_wire_12));
      builtin_constant_32_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx x_const_15(.out(undefined_value_16));
      builtin_fifo_0_32 fifo_18(.in(fs_1), .out(fresh_wire_17), .clk(clk));
      builtin_mux_2_32 in_mux_20(.sel(global_stage_counter), .in0(undefined_value_16), .in1(fresh_wire_17), .out(fresh_wire_19));
      builtin_assign_32 fresh_assign_22(.in(fresh_wire_21), .out(out));
      builtin_assign_32 assign_32_1(.in(fresh_wire_19), .out(fresh_wire_21));
    endmodule

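
The timing of the sequential module can be easier to follow as a small behavioural model. The sketch below is derived from reading the generated Verilog above and is not part of SFGen; the class CubeOneMultModel and its method names are hypothetical. It assumes unsigned values, initialises the delay registers to 0 (in Verilog they start undefined), and uses None to stand for the undefined 32'bx output in stage 0.

```python
MASK = (1 << 32) - 1  # keeps the low 32 bits, like the 32 bit multiplier


class CubeOneMultModel:
    """Hypothetical cycle-level model of the sequential cube_32 module."""

    def __init__(self):
        self.stage = 0      # state of builtin_counter_1
        self.delay_fs0 = 0  # fifo_3: multiplier output from the last cycle
        self.delay_x = 0    # fifo_9: value of x from the last cycle

    def outputs(self, x):
        """Combinational values in the current stage: (product, out)."""
        if self.stage == 0:
            in0, in1 = x, x  # both muxes select the undelayed x
        else:
            in0, in1 = self.delay_fs0, self.delay_x
        product = (in0 * in1) & MASK
        out = product if self.stage == 1 else None  # None models 32'bx
        return product, out

    def clock(self, x, en=0):
        """Rising clock edge: update the registers and the stage counter."""
        product, _ = self.outputs(x)
        self.delay_fs0 = product
        self.delay_x = x
        # en drives the counter's rst input in the generated Verilog
        self.stage = 0 if (en or self.stage == 1) else self.stage + 1


dut = CubeOneMultModel()
dut.clock(7, en=1)        # hold en high for one edge to reset the counter
print(dut.outputs(7)[1])  # None: the output is undefined in stage 0
dut.clock(7)              # registers capture x*x and x
print(dut.outputs(7)[1])  # 343
```

In this model the result is valid only in stage 1, one clock edge after x was presented in stage 0.
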
Often real functional units like reciprocal dividers or the CORDIC algorithm need to read from a table of values that is pre-computed at design time. This tool supports pre-computed tables through a special higher-order function, lookup_in_table.
For an example, consider the function foo in examples/table_lookup.py:

    from sfgen.bit_vector import *

    def table_func(a):
        return a - bv_from_int(a.width(), 1)

    def foo(a):
        res = lookup_in_table(a, table_func)
        return res

foo calls the ordinary function table_func, which subtracts 1 from its argument, but instead of calling it directly it calls table_func on a through the lookup_in_table function. This is a cue to the compiler to pre-compute all possible values of table_func and implement it as a table in Verilog.
If we run the synthesis script for foo with a 4 bit wide argument, located in examples/synthesize_table_lookup.py, like so:

    python ./examples/synthesize_table_lookup.py

then we get verilog like this:

    module builtin_assign_4(in, out);
      input [3:0] in;
      output [3:0] out;
      assign out = in;
    endmodule

    module builtin_table_lookup_table_func_4_4(in, out);
      input [3:0] in;
      output [3:0] out;
      reg [3:0] out_reg;
      always @(*) begin
        case(in)
          4'b0000: out_reg = 4'b1111;
          4'b0001: out_reg = 4'b0000;
          4'b0010: out_reg = 4'b0001;
          4'b0011: out_reg = 4'b0010;
          4'b0100: out_reg = 4'b0011;
          4'b0101: out_reg = 4'b0100;
          4'b0110: out_reg = 4'b0101;
          4'b0111: out_reg = 4'b0110;
          4'b1000: out_reg = 4'b0111;
          4'b1001: out_reg = 4'b1000;
          4'b1010: out_reg = 4'b1001;
          4'b1011: out_reg = 4'b1010;
          4'b1100: out_reg = 4'b1011;
          4'b1101: out_reg = 4'b1100;
          4'b1110: out_reg = 4'b1101;
          4'b1111: out_reg = 4'b1110;
        endcase
      end
      assign out = out_reg;
    endmodule

    module foo_4(a, res);
      input [3:0] a;
      output [3:0] res;
      wire [3:0] fs_1;
      wire [3:0] fresh_wire_0;
      wire [3:0] fresh_wire_2;
      builtin_assign_4 fresh_assign_1(.in(fresh_wire_0), .out(fs_1));
      builtin_table_lookup_table_func_4_4 builtin_table_lookup_table_func_0(.in(a), .out(fresh_wire_0));
      builtin_assign_4 fresh_assign_3(.in(fresh_wire_2), .out(res));
      builtin_assign_4 assign_4_1(.in(fs_1), .out(fresh_wire_2));
    endmodule

The function table_func has been pre-compiled into a giant case statement that can be synthesized as an SRAM. Be warned that large tables may take a long time to calculate!
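
To see why table size matters, here is a rough sketch of how such a table could be pre-computed: evaluate the function on every possible input and print one case item per entry. This is not SFGen's implementation. It uses plain Python integers instead of the sfgen bit vector type, and build_table, case_lines, and the two-argument table_func are hypothetical stand-ins.

```python
def table_func(a, width):
    """Plain-integer stand-in for table_func: subtract 1, wrap to width bits."""
    return (a - 1) % (1 << width)


def build_table(func, width):
    """Evaluate func on every width-bit input, giving 2**width entries."""
    return [func(a, width) for a in range(1 << width)]


def case_lines(table, width):
    """Render the table as Verilog case items like those shown above."""
    return [
        "{w}'b{i:0{w}b}: out_reg = {w}'b{v:0{w}b};".format(w=width, i=i, v=v)
        for i, v in enumerate(table)
    ]


table = build_table(table_func, 4)
for line in case_lines(table, 4)[:2]:
    print(line)
# 4'b0000: out_reg = 4'b1111;
# 4'b0001: out_reg = 4'b0000;
```

The number of entries is 2**width, so each extra input bit doubles both the work of building the table and the size of the case statement.
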
More examples:
- examples/huang_divider.py - A lookup table based Taylor series divider for signed integers.
- examples/divider.py - A Newton-Raphson divider for signed integers that uses a lookup table and one iteration of refinement.

Supported Python features:
- Operations on bit vectors from the pre-made bit vector library in sfgen/bit_vector
- Function calls
- Lookup in pre-computed tables
- Conditional assignment statements
I am working on adding support for structs, if-else statements, and fixed bound loops.

Dependencies:
- Python 3
- pytest
- Icarus Verilog for the unit test suite