A Python HLS Tool For Generating Special Functions (Divide, Square Root, etc.)

dillonhuff/SFGen

A Python3 HLS Tool for Writing Special Functions

This repo is a simple synthesis tool that lets you write functional units like dividers and square roots as Python functions using a pre-built bit vector library, and then compile them into Verilog when you are ready to synthesize them.

Getting started

Run the following commands to install and run the unit tests (note that you will need Icarus Verilog for some of the unit tests):

    git clone https://github.com/dillonhuff/SFGen.git
    cd SFGen
    pytest

Example: Cubing a Number

Look in the file ./examples/cube.py. You should see a function cube(x) that takes one argument and returns the cube of that argument:

    from sfgen.bit_vector import *

    def cube(x):
        out = x * x * x
        return out

A simple Python testbench for this function is shown in test/test_cube.py:

    from sfgen.bit_vector import *
    from examples.cube import *

    def test_cube():
        width = 32
        a = bv_from_int(width, 7)
        correct = bv_from_int(width, 7 * 7 * 7)
        print('a       =', a)
        print('correct =', correct)
        print('cube(a) =', cube(a))
        assert(cube(a) == correct)

We can run it like so:

pytest test_main.py test/test_cube.py

Synthesizing the Python Function

With all tests passing we are ready to generate Verilog for our design. To generate Verilog from cube we use a synthesis script located in examples/synthesize_cube.py. The code looks like so:

    import os
    import os.path
    import sys

    dir_path = os.path.dirname(os.path.realpath(__file__))
    sys.path.append(os.path.abspath(os.path.join(dir_path, os.pardir)))

    from sfgen.verilog_backend import *

    constraints = ScheduleConstraints()
    synthesize_verilog('examples/cube', 'cube', [l.ArrayType(32)], constraints)

To run this script and generate Verilog, use the command:

python ./examples/synthesize_cube.py

You should now see a new file called cube_32.v that contains an implementation of the cube function as a combinational circuit using 32-bit multipliers.

    module builtin_assign_32(in, out);
      input [31:0] in;
      output [31:0] out;
      assign out = in;
    endmodule

    module builtin_mult_32_32(in1, in0, out);
      input [31:0] in0;
      input [31:0] in1;
      output [31:0] out;
      assign out = in0 * in1;
    endmodule

    module cube_32(x, out);
      input [31:0] x;
      output [31:0] out;

      wire [31:0] fs_0;
      wire [31:0] fs_1;
      wire [31:0] fresh_wire_0;
      wire [31:0] fresh_wire_2;
      wire [31:0] fresh_wire_4;

      builtin_assign_32 fresh_assign_1(.in(fresh_wire_0), .out(fs_0));
      builtin_mult_32_32 mult_32_0(.in0(x), .in1(x), .out(fresh_wire_0));
      builtin_assign_32 fresh_assign_3(.in(fresh_wire_2), .out(fs_1));
      builtin_mult_32_32 mult_32_1(.in0(fs_0), .in1(x), .out(fresh_wire_2));
      builtin_assign_32 fresh_assign_5(.in(fresh_wire_4), .out(out));
      builtin_assign_32 assign_32_2(.in(fs_1), .out(fresh_wire_4));
    endmodule
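Because every port of builtin_mult_32_32 is 32 bits wide, each product is truncated to its low 32 bits. The following is a minimal software model of the combinational circuit in plain Python. It does not use SFGen, and the names MASK, mult_32, and cube_32 are chosen here only for illustration:

```python
# Software model of the combinational cube_32 circuit (a sketch, not SFGen code).
WIDTH = 32
MASK = (1 << WIDTH) - 1


def mult_32(in0, in1):
    """Model builtin_mult_32_32: the product is truncated to 32 bits."""
    return (in0 * in1) & MASK


def cube_32(x):
    """Model cube_32: two chained multipliers with no clock."""
    fs_0 = mult_32(x, x)     # corresponds to instance mult_32_0
    fs_1 = mult_32(fs_0, x)  # corresponds to instance mult_32_1
    return fs_1


print(cube_32(7))        # 343
print(cube_32(2 ** 11))  # 0, because 2**33 does not fit in 32 bits
```

A model like this can be handy for checking which inputs overflow before choosing a bit width.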

Adding a Resource Constraint

The implementation of cube above uses two multipliers, but what if we only want to use one? We can add a resource constraint that forces the synthesis program to do both operations on the same multiplier by splitting the operations up over two cycles.

You can see how to do this in the synthesis script in examples/synthesize_cube_one_mult.py. The script is the same as the previous one with one added line after the creation of the constraints variable:
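The effect of such a constraint can be illustrated with a small greedy scheduler. This is only a sketch of the general idea of resource-constrained scheduling, not SFGen's actual algorithm. The function schedule and its arguments are hypothetical names, and it assumes that dependent operations may share a cycle (chaining) as long as a free unit is available:

```python
def schedule(ops, deps, unit_of, limits):
    """Greedy resource-constrained scheduling (illustrative sketch).

    ops     -- operation names in dependency (topological) order
    deps    -- maps an operation to the operations it reads from
    unit_of -- maps an operation to the kind of unit it needs
    limits  -- maps a unit kind to how many copies may exist (>= 1)
    Returns a dict from operation name to cycle number.
    """
    cycle_of = {}
    used = {}  # (cycle, unit kind) -> units already claimed in that cycle
    for op in ops:
        # Chaining is allowed, so an op may share a cycle with its inputs.
        cycle = max((cycle_of[d] for d in deps.get(op, [])), default=0)
        kind = unit_of[op]
        limit = limits.get(kind)  # None means unconstrained
        if limit is not None and limit < 1:
            raise ValueError("resource limits must be at least 1")
        # Push the op to a later cycle until a unit of this kind is free.
        while limit is not None and used.get((cycle, kind), 0) >= limit:
            cycle += 1
        used[(cycle, kind)] = used.get((cycle, kind), 0) + 1
        cycle_of[op] = cycle
    return cycle_of


# The two multiplications in cube: the second reads the first one's result.
ops = ["mult_0", "mult_1"]
deps = {"mult_1": ["mult_0"]}
unit_of = {"mult_0": "mult_32", "mult_1": "mult_32"}

print(schedule(ops, deps, unit_of, {}))              # {'mult_0': 0, 'mult_1': 0}
print(schedule(ops, deps, unit_of, {"mult_32": 1}))  # {'mult_0': 0, 'mult_1': 1}
```

With no limit both multiplications land in cycle 0, which corresponds to the combinational circuit. With a limit of one multiplier the second multiplication moves to cycle 1.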

constraints.set_resource_count('mult_32',1)

This line tells the compiler that only one multiplier can be used to implement the circuit. We run this new synthesis script like so:

python ./examples/synthesize_cube_one_mult.py

The new Verilog is a sequential circuit with only one multiplier, plus a stage counter and multiplexers that control the data inputs to the multiplier:

    module builtin_counter_1(rst, clk, out);
      input [0:0] clk;
      input [0:0] rst;
      output [0:0] out;
      reg [0:0] stage_num;

      always @(posedge clk) begin
        if (rst) begin
          stage_num <= 0;
        end else if (stage_num == 1) begin
          stage_num <= 0;
        end else begin
          stage_num <= stage_num + 1;
        end
      end

      assign out = stage_num;
    endmodule

    module builtin_fifo_0_32(in, clk, out);
      input [31:0] in;
      input [0:0] clk;
      output [31:0] out;
      assign out = in;
    endmodule

    module builtin_fifo_1_32(in, clk, out);
      input [31:0] in;
      input [0:0] clk;
      output [31:0] out;
      reg [31:0] delay_reg_0;

      always @(posedge clk) begin
        delay_reg_0 <= in;
      end

      assign out = delay_reg_0;
    endmodule

    module builtin_mux_2_32(in1, sel, in0, out);
      input [31:0] in0;
      input [31:0] in1;
      input [0:0] sel;
      output [31:0] out;
      reg [31:0] out_reg;

      always @(*) begin
        case(sel)
          1'b0: out_reg = in0;
          1'b1: out_reg = in1;
        endcase
      end

      assign out = out_reg;
    endmodule

    module builtin_assign_32(in, out);
      input [31:0] in;
      output [31:0] out;
      assign out = in;
    endmodule

    module builtin_mult_32_32(in1, in0, out);
      input [31:0] in0;
      input [31:0] in1;
      output [31:0] out;
      assign out = in0 * in1;
    endmodule

    module builtin_constant_32_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx(out);
      output [31:0] out;
      assign out = 32'bxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx;
    endmodule

    module cube_32(x, en, clk, out);
      input [31:0] x;
      output [31:0] out;

      wire [31:0] fs_0;
      wire [31:0] fs_1;
      wire [0:0] global_stage_counter;
      input [0:0] clk;
      input [0:0] en;
      wire [31:0] fresh_wire_0;
      wire [31:0] fresh_wire_2;
      wire [31:0] fresh_wire_4;
      wire [31:0] fresh_wire_6;
      wire [31:0] fresh_wire_8;
      wire [31:0] fresh_wire_10;
      wire [31:0] fresh_wire_12;
      wire [31:0] undefined_value_16;
      wire [31:0] fresh_wire_17;
      wire [31:0] fresh_wire_19;
      wire [31:0] fresh_wire_21;

      builtin_counter_1 stage_counter(.clk(clk), .rst(en), .out(global_stage_counter));
      builtin_fifo_0_32 fifo_1(.in(x), .out(fresh_wire_0), .clk(clk));
      builtin_fifo_1_32 fifo_3(.in(fs_0), .out(fresh_wire_2), .clk(clk));
      builtin_mux_2_32 in_mux_5(.sel(global_stage_counter), .in0(fresh_wire_0), .in1(fresh_wire_2), .out(fresh_wire_4));
      builtin_fifo_0_32 fifo_7(.in(x), .out(fresh_wire_6), .clk(clk));
      builtin_fifo_1_32 fifo_9(.in(x), .out(fresh_wire_8), .clk(clk));
      builtin_mux_2_32 in_mux_11(.sel(global_stage_counter), .in0(fresh_wire_6), .in1(fresh_wire_8), .out(fresh_wire_10));
      builtin_assign_32 fresh_assign_13(.in(fresh_wire_12), .out(fs_0));
      builtin_assign_32 fresh_assign_14(.in(fresh_wire_12), .out(fs_1));
      builtin_mult_32_32 mult_32_0(.in0(fresh_wire_4), .in1(fresh_wire_10), .out(fresh_wire_12));
      builtin_constant_32_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx x_const_15(.out(undefined_value_16));
      builtin_fifo_0_32 fifo_18(.in(fs_1), .out(fresh_wire_17), .clk(clk));
      builtin_mux_2_32 in_mux_20(.sel(global_stage_counter), .in0(undefined_value_16), .in1(fresh_wire_17), .out(fresh_wire_19));
      builtin_assign_32 fresh_assign_22(.in(fresh_wire_21), .out(out));
      builtin_assign_32 assign_32_1(.in(fresh_wire_19), .out(fresh_wire_21));
    endmodule
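To see how the two stages share the multiplier, here is a rough cycle-level model of the generated circuit in plain Python. It is a reading of the Verilog above, not code produced by SFGen. The class name CubeOneMult is made up for this sketch, and None stands in for Verilog's undefined value x:

```python
MASK = (1 << 32) - 1


class CubeOneMult:
    """Cycle-level sketch of the one-multiplier cube_32 circuit."""

    def __init__(self):
        self.stage = 0        # stage_counter
        self.fs_0_reg = None  # delay register in fifo_3 (holds fs_0)
        self.x_reg = None     # delay register in fifo_9 (holds x)

    def outputs(self, x):
        """Combinational logic: return (product, out) for the current stage."""
        if self.stage == 0:
            in0, in1 = x, x                        # both muxes select x
        else:
            in0, in1 = self.fs_0_reg, self.x_reg   # muxes select delayed values
        if in0 is None or in1 is None:
            product = None
        else:
            product = (in0 * in1) & MASK           # the single shared multiplier
        out = product if self.stage == 1 else None  # stage 0 drives undefined
        return product, out

    def clock(self, x, en=0):
        """Advance one rising clock edge; en drives the counter's reset."""
        product, _ = self.outputs(x)
        self.fs_0_reg = product
        self.x_reg = x
        self.stage = 0 if (en or self.stage == 1) else self.stage + 1


dut = CubeOneMult()
dut.clock(7, en=1)        # reset the stage counter to stage 0
print(dut.outputs(7)[1])  # None: the result is not ready in stage 0
dut.clock(7)              # x*x is captured, counter moves to stage 1
print(dut.outputs(7)[1])  # 343
```

The model shows that the output is only meaningful during stage 1, so a consumer of this circuit needs to sample out on the second cycle.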

Example: Creating a Pre-Computed Table

Often real functional units like reciprocal dividers or the CORDIC algorithm need to read from a table of values that is pre-computed at design time. This tool supports pre-computed tables through a special higher-order function, lookup_in_table.

For an example, consider the function foo in examples/table_lookup.py:

    from sfgen.bit_vector import *

    def table_func(a):
        return a - bv_from_int(a.width(), 1)

    def foo(a):
        res = lookup_in_table(a, table_func)
        return res

foo calls the ordinary function table_func, which subtracts 1 from its argument, but instead of calling it directly it calls table_func on a through the lookup_in_table function. This is a cue to the compiler to pre-compute all possible values of table_func and implement it as a table in Verilog.

If we run the synthesis script for foo using a 4-bit-wide argument, located in examples/synthesize_table_lookup.py, like so:

python ./examples/synthesize_table_lookup.py

then we get Verilog like this:

    module builtin_assign_4(in, out);
      input [3:0] in;
      output [3:0] out;
      assign out = in;
    endmodule

    module builtin_table_lookup_table_func_4_4(in, out);
      input [3:0] in;
      output [3:0] out;
      reg [3:0] out_reg;

      always @(*) begin
        case(in)
          4'b0000: out_reg = 4'b1111;
          4'b0001: out_reg = 4'b0000;
          4'b0010: out_reg = 4'b0001;
          4'b0011: out_reg = 4'b0010;
          4'b0100: out_reg = 4'b0011;
          4'b0101: out_reg = 4'b0100;
          4'b0110: out_reg = 4'b0101;
          4'b0111: out_reg = 4'b0110;
          4'b1000: out_reg = 4'b0111;
          4'b1001: out_reg = 4'b1000;
          4'b1010: out_reg = 4'b1001;
          4'b1011: out_reg = 4'b1010;
          4'b1100: out_reg = 4'b1011;
          4'b1101: out_reg = 4'b1100;
          4'b1110: out_reg = 4'b1101;
          4'b1111: out_reg = 4'b1110;
        endcase
      end

      assign out = out_reg;
    endmodule

    module foo_4(a, res);
      input [3:0] a;
      output [3:0] res;

      wire [3:0] fs_1;
      wire [3:0] fresh_wire_0;
      wire [3:0] fresh_wire_2;

      builtin_assign_4 fresh_assign_1(.in(fresh_wire_0), .out(fs_1));
      builtin_table_lookup_table_func_4_4 builtin_table_lookup_table_func_0(.in(a), .out(fresh_wire_0));
      builtin_assign_4 fresh_assign_3(.in(fresh_wire_2), .out(res));
      builtin_assign_4 assign_4_1(.in(fs_1), .out(fresh_wire_2));
    endmodule

The function table_func has been pre-computed into a giant case statement that can be synthesized as an SRAM. Be warned that large tables may take a long time to calculate, since a table over an n-bit argument has 2^n entries!
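The pre-computation step can be pictured with a short plain-Python sketch that evaluates a function on every input of a given width and prints the matching case entries. This is not how SFGen implements lookup_in_table internally. The helper emit_case_table is a hypothetical name, and table_func here works on plain integers rather than SFGen bit vectors:

```python
def table_func(a, width):
    """Same behaviour as table_func in the example: subtract 1, wrapping."""
    return (a - 1) % (1 << width)


def emit_case_table(func, width):
    """Evaluate func on every width-bit input and format a Verilog case body."""
    lines = ["case(in)"]
    for value in range(1 << width):  # 2**width entries in total
        result = func(value, width)
        lines.append(
            f"  {width}'b{value:0{width}b}: out_reg = {width}'b{result:0{width}b};"
        )
    lines.append("endcase")
    return "\n".join(lines)


print(emit_case_table(table_func, 4))
```

For a width of 4 this prints the same 16 entries as the generated module above. Each extra input bit doubles the number of evaluations.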

More Complicated Examples

The Supported Subset of Python

  • Operations on bit vectors from the pre-made bit vector library in sfgen/bit_vector
  • Function calls
  • Lookup in pre-computed tables
  • Conditional assignment statements

I am working on adding support for structs, if-else statements, and fixed-bound loops.

Dependencies

  • Python 3
  • pytest
  • Icarus Verilog for the unit test suite
