RISCV Vector Kernel C/LLVM-IR generator

cbalint13/rvv-kernels

This is a C/LLVM-IR kernel generator that addresses RVV ISA versions unsupported by LLVM or other toolchains.

Benchmark

| XuanTie TH1520        | SpacemiT K1 X60     |
|-----------------------|---------------------|
| INT8-v0.7.1-BENCHMARK | INT8-v1.0-BENCHMARK |
| FP16-v0.7.1-BENCHMARK | FP16-v1.0-BENCHMARK |
| FP32-v0.7.1-BENCHMARK | FP32-v1.0-BENCHMARK |

Usage

  • Prepare a docker image with rv64 cross compiler
    $ git clone https://github.com/cbalint13/rvv-kernels
    $ cd rvv-kernels
    $ docker build --file Dockerfile.ML.fedora --tag th1520-rvv .
  • Generate a kernel
    $ docker run -it --rm -v "$PWD":/opt/src th1520-rvv bash
    [root@b8032fd28a75 src]# ./make.sh 32 4 int8 v0.7.1 cbalint@192.168.1.45
    (x) Naive kernel:
      HEX = b0 28 00 00 b0 66 00 00 b0 a4 00 00 b0 e2 00 00
      O[] = 00010416 00026288 00042160 00058032
    (x) MACC operations: elems[32] x lanes[4] = 256 Ops
    (x) RVV kernel:
      HEX = b0 28 00 00 b0 66 00 00 b0 a4 00 00 b0 e2 00 00
      O[] = 00010416 00026288 00042160 00058032
    RVV bench: 25.600 GOPS in 2.215818 secs
    RVV speed: 11.553 GOPS/sec
    [root@b8032fd28a75 src]# ls -l dot_int8_kernel.*
    -rw-r--r-- 1 1000 1000 3867 Mar 13 18:03 dot_int8_kernel.c
    -rw-r--r-- 1 1000 1000 5034 Mar 13 18:03 dot_int8_kernel.ir
  • Optional benchmark logs & graph
    [root@b8032fd28a75 src]# ./script/0-explore.sh
    [root@b8032fd28a75 src]# ls -l benchmark-int8.log
    -rw-r--r-- 1 1000 1000 5731 Mar 13 17:38 benchmark-int8.log
    [root@b8032fd28a75 src]# ./script/1-plotgraph.py --logs benchmark-int8.log --title 'RVV v0.7.1 int8 kernels benchmark (TH1520)'
    [root@b8032fd28a75 src]# ls -l benchmark-int8.log.png
    -rw-r--r-- 1 1000 1000 58380 Mar 13 18:47 benchmark-int8.log.png
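
For orientation, here is a minimal scalar sketch of what a naive reference kernel for the `32 4 int8` configuration might look like. It is not the code that make.sh produces: the data layout (one shared int8 vector multiplied against `lanes` rows, accumulated in int32) and every name below are assumptions made for illustration. The operation count follows the "elems[32] x lanes[4] = 256 Ops" line if each multiply-accumulate is counted as two operations, and `le32` shows how the HEX dump maps to the O[] values, assuming little-endian 32-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical naive reference: out[l] is the dot product of the shared
 * vector a[0..elems) with row l of b (row-major, lanes x elems),
 * accumulated in 32-bit integers. Layout is an assumption. */
void dot_int8_naive(const int8_t *a, const int8_t *b,
                    int32_t *out, size_t elems, size_t lanes)
{
    for (size_t l = 0; l < lanes; ++l) {
        int32_t acc = 0;
        for (size_t e = 0; e < elems; ++e)
            acc += (int32_t)a[e] * (int32_t)b[l * elems + e];
        out[l] = acc;
    }
}

/* One multiply plus one accumulate per element and lane. */
unsigned macc_ops(unsigned elems, unsigned lanes)
{
    return 2u * elems * lanes;
}

/* Read the i-th little-endian 32-bit word from a byte dump. */
uint32_t le32(const uint8_t *bytes, size_t i)
{
    const uint8_t *p = bytes + 4 * i;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Comparing the naive output against the RVV output word by word, as the log above does, is a cheap way to validate a generated kernel before benchmarking it.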

Notes

  • This generator emits C / LLVM-IR kernels with encoded instructions, which makes it RVV version agnostic
  • T-Head 1520 (C906, also others) implements the older RVV v0.7.1 ISA, now unsupported by upstream LLVM
  • The TH1520 ASIC implementation of vsetvli is slow; see the comments on a dynamic kernel: trials/riscv-asm.c
  • The vsetvli slowness forces moving away from the SVE (scalable vector) concept in order to avoid frequent vsetvli calls
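
To illustrate the "encoded instruction" idea, the sketch below packs a vsetvli into a raw 32-bit word and formats it as a `.word` directive that any assembler accepts, whatever RVV version it knows about. This is not taken from the generator: the function names are hypothetical, and the bit layouts (vtype as vlmul[2:0], vsew[5:3], vta, vma for v1.0 and as vlmul[1:0], vsew[4:2] for v0.7.1) reflect one reading of the two spec versions and should be checked against them before use.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* vtype immediate, RVV v1.0 layout (assumed):
 * vlmul in bits 2:0, vsew in bits 5:3, vta in bit 6, vma in bit 7. */
uint32_t vtype_v10(unsigned vsew, unsigned vlmul, unsigned vta, unsigned vma)
{
    return (vlmul & 7u) | ((vsew & 7u) << 3) |
           ((vta & 1u) << 6) | ((vma & 1u) << 7);
}

/* vtype immediate, RVV v0.7.1 layout (assumed):
 * vlmul in bits 1:0, vsew in bits 4:2, vediv left at zero. */
uint32_t vtype_v071(unsigned vsew, unsigned vlmul)
{
    return (vlmul & 3u) | ((vsew & 7u) << 2);
}

/* vsetvli rd, rs1, vtypei:
 * bit 31 = 0, zimm[10:0], rs1, funct3 = 111, rd, opcode = 1010111. */
uint32_t encode_vsetvli(unsigned rd, unsigned rs1, uint32_t vtypei)
{
    return ((vtypei & 0x7ffu) << 20) | ((rs1 & 31u) << 15) |
           (7u << 12) | ((rd & 31u) << 7) | 0x57u;
}

/* Format the instruction as an assembler directive, e.g. for inline asm. */
int format_word(char *buf, size_t n, uint32_t insn)
{
    return snprintf(buf, n, ".word 0x%08" PRIx32, insn);
}
```

Here vsew is 0 for e8, 1 for e16 and 2 for e32, and vlmul is 0 for m1, 1 for m2, 2 for m4 and 3 for m8. Because only the vtype packing differs between the two versions in this sketch, the same emitter can target either one by swapping a single helper.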

The trials/riscv-asm.c sample kernel would cope with the SVE concept of runtime dynamism, but for the reasons tested and mentioned there, on T-Head's particular C906 RVV ASIC implementation the context-switching vsetvli severely drags down overall performance, so vsetvli calls should be minimized for this particular target.
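
The difference between the two styles can be sketched with a scalar model. The code below is not the kernel from trials/riscv-asm.c; it is a hypothetical stand-in in which `model_vsetvli` simply grants min(avl, vlmax) elements (a simplification of the real instruction) and counts how often it is executed. The runtime-dynamic loop reconfigures on every trip, while the fixed-shape variant configures once and then runs chunks of a known size.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts how many times the modeled vsetvli is executed. */
unsigned long vsetvli_calls;

/* Simplified scalar model of vsetvli: grants min(avl, vlmax) elements. */
size_t model_vsetvli(size_t avl, size_t vlmax)
{
    ++vsetvli_calls;
    return avl < vlmax ? avl : vlmax;
}

/* Runtime-dynamic (strip-mined) dot product: one vsetvli per loop trip. */
int32_t dot_int8_dynamic(const int8_t *a, const int8_t *b,
                         size_t n, size_t vlmax)
{
    int32_t acc = 0;
    while (n > 0) {
        size_t vl = model_vsetvli(n, vlmax);
        for (size_t i = 0; i < vl; ++i)
            acc += (int32_t)a[i] * (int32_t)b[i];
        a += vl;
        b += vl;
        n -= vl;
    }
    return acc;
}

/* Fixed-shape variant: configure once, then process whole chunks of vl
 * elements. Assumes n is a multiple of vl; any tail is ignored. */
int32_t dot_int8_fixed(const int8_t *a, const int8_t *b,
                       size_t n, size_t vl)
{
    int32_t acc = 0;
    (void)model_vsetvli(vl, vl); /* the only configuration step */
    for (size_t off = 0; off + vl <= n; off += vl)
        for (size_t i = 0; i < vl; ++i)
            acc += (int32_t)a[off + i] * (int32_t)b[off + i];
    return acc;
}
```

The trade-off is that the fixed-shape variant only handles sizes known at generation time, which is acceptable when the kernel is generated per shape anyway.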

For RVV 0.7.1 there is a limit on how and which vector registers can be used in the context of MUL (the register group multiplier, LMUL), so the maximum vector fill width of 64 x int8 reduced into 2 lanes is not possible: it would require the e8/m4 MUL mode, which leaves room for only 4 vregs (v0, v8, v16, v24), an insufficient number of registers. The maximum usable int8 element width is 32 for RVV 0.7.1.
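
The register budget above can be checked with a little arithmetic. The sketch below rests on two assumptions that the text does not state: a vector register length (VLEN) of 128 bits, and a widening multiply whose destination group is twice the size of the source group. The function names are hypothetical. Under those assumptions, 64 x int8 needs an m4 source group and an m8 destination, leaving 4 groups (v0, v8, v16, v24), whereas 32 x int8 needs m2 and m4, leaving 8.

```c
/* Registers per group needed to hold `elems` elements of `sew_bits` bits
 * each, given a vector register length of `vlen_bits` bits (rounded up). */
unsigned lmul_needed(unsigned vlen_bits, unsigned elems, unsigned sew_bits)
{
    return (elems * sew_bits + vlen_bits - 1u) / vlen_bits;
}

/* Number of aligned register groups among the 32 architectural vector
 * registers for a given group size (1, 2, 4 or 8). */
unsigned reg_groups(unsigned lmul)
{
    return 32u / lmul;
}
```

For example, `reg_groups(2 * lmul_needed(128, 64, 8))` gives the number of destination groups available for the 64-element case under the widening assumption.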

The generated kernel calls vsetvli once and unrolls the computation across the vector registers.

Changelog

  • 16 Dec 2024 full int8/fp16/fp32 benchmarks for RVV v1.0 & v0.7.1
  • 06 Jun 2024 release fp16 & fp32 for RVV 0.7.1 version
  • 13 Mar 2024 initial release, for now int8 with RVV 0.7.1 version
