OpenCV RISC V

RISC-V is an Instruction Set Architecture (ISA) that is gaining popularity as an alternative to traditional ISAs such as x86/x86_64 and ARM/AArch64. It is covered by an open-source license, which allows for royalty-free usage by both hardware and software providers.
Besides the base integer instruction set, RISC-V processors can implement various architecture extensions, for example F for single-precision floating point support or M for integer multiplication and division. There is also an extension for vector operations (aka SIMD - single instruction, multiple data) - V (RVV), which is beneficial for high-performance computing applications like image processing, machine learning and deep learning. This extension can be leveraged by OpenCV to achieve significant performance improvement across many algorithms. V-extension analogs on other platforms are SSE/AVX for x86_64 and NEON/SVE for ARM/AArch64.
The major difference between the V extension and other popular SIMD extensions is its non-fixed vector length: while SSE instructions operate on 128-bit registers and AVX2 on 256-bit registers, instructions in the V extension can operate on whatever register width the actual hardware provides. This kind of SIMD specification is also called Scalable SIMD. A similar approach is used by SVE (Scalable Vector Extension) on ARM platforms.
In this document we will focus mainly on RVV extension usage in general and in OpenCV specifically.
Links:
- Wikipedia / RISC-V
- RISC-V / about
- RISC-V / Vector Extension spec v1.0
- RISC-V / Vector Extension spec v0.7.1
- ARM - What is SVE
Historically, the first V specification version implemented in hardware was v0.7.1. It differs from the finalized v1.0 in several ways. Below is a list of devices that are known to support the RVV extension and have recently been used for OpenCV optimization testing:
- RVV 1.0
  - CanMV K230
  - Banana Pi BPI-F3
  - Muse Pi
  - LicheePi 3A
- RVV 0.7.1
  - LicheePi 4A
In order to use the V extension, one should use a Linux system which includes a kernel built with RVV support. Often this is the kernel provided by the SoC/core manufacturer or a mainline kernel with corresponding patches. To check V extension support, run the following command and check that the isa line contains the letter v after the rv64 base specifier:
cat /proc/cpuinfo
Example output:
...
isa : rv64imafdcvu
...             ^ here
While it is possible to write vectorized code using RVV assembly, C/C++ libraries and applications often use vector intrinsics - a set of types and functions built into the compiler that correspond to machine instructions.
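The isa-line check described above can also be scripted. The following is only a sketch: the function name `isa_has_vector` is hypothetical, and it assumes that the ISA string starts with rv32 or rv64 and that multi-letter extensions, when present, are separated by underscores, so only the single-letter part before the first underscore is searched for v.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the single-letter part
# of a RISC-V ISA string contains "v"; returns 1 when it does not and
# 2 when the argument does not look like an ISA string.
isa_has_vector() {
    case "$1" in
        rv32*|rv64*) ;;
        *) return 2 ;;
    esac
    letters=${1#rv??}         # drop the "rv32"/"rv64" base specifier
    letters=${letters%%_*}    # ignore multi-letter extensions after "_"
    case "$letters" in
        *v*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Take the first "isa" line of /proc/cpuinfo, if there is one.
isa=$(sed -n 's/^isa[[:space:]]*:[[:space:]]*//p' /proc/cpuinfo 2>/dev/null | head -n 1) || isa=""
if isa_has_vector "$isa"; then
    echo "V extension reported: $isa"
else
    echo "V extension not reported (isa='$isa')"
fi
```

On a non-RISC-V host there is no isa line, so the script simply reports that the extension was not found.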
Usually software for RISC-V is built on regular Linux or Windows platforms using a cross-compilation process. Cross-compiling toolchains include the compiler and other libraries and tools required for development. The following toolchains are known to include intrinsics for the RVV extension:
- Mainline compilers (RVV 1.0)
  - GCC 13-14 (https://github.com/riscv-collab/riscv-gnu-toolchain) - uses recent intrinsics specification, supports v1.0 of vector extension
  - LLVM/Clang 17-20 (https://github.com/llvm/llvm-project) - uses recent intrinsics specification, supports v1.0 of vector extension
- XuanTie compiler (RVV 0.7.1 and RVV 1.0)
  - xuantie-gnu-toolchain 2.x is based on GCC 10 (https://github.com/XUANTIE-RV/xuantie-gnu-toolchain, see "Releases")
To build LLVM/Clang from source, see https://llvm.org/docs/GettingStarted.html#getting-the-source-code-and-building-llvm
git clone --depth 1 https://github.com/llvm/llvm-project
cd llvm-project
cmake -S llvm -B build -G Ninja
cmake --build build --target install
Note: use the -DCMAKE_INSTALL_PREFIX CMake option to change the install location
# get 'riscv-collab' repository
git clone --depth=1 https://github.com/riscv-collab/riscv-gnu-toolchain
cd riscv-gnu-toolchain
git submodule init && git submodule update
# update 'gcc' subfolder
GCC_TAG=releases/gcc-14.2.0
git -C gcc remote update
git -C gcc fetch origin ${GCC_TAG}
git -C gcc checkout ${GCC_TAG}
# build
./configure --enable-linux
CPUNUM=4
make -j${CPUNUM} linux
make -j${CPUNUM} build-qemu
Note: use the --prefix configure option to change the install location
Note: you can also update the qemu subfolder to a specific version, or omit the build-qemu command if you don't need it
# get repository
git clone https://github.com/XUANTIE-RV/xuantie-gnu-toolchain
cd xuantie-gnu-toolchain
# build
./configure --enable-linux
make -j8 linux
make -j8 build-qemu
Note: use the --prefix configure option to change the install location
Note: omit QEMU build if you don't need it
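After a toolchain is built, it can be useful to confirm that the compiler really enables the vector extension. The sketch below is one possible check, not part of the toolchains themselves: the helper name `compiler_has_rvv` is hypothetical, the default -march=rv64gcv value is an assumption that may need adjusting for a particular compiler (extra flags passed after the compiler name override it), and Clang would additionally need a --target option. It relies on the `__riscv_vector` macro, which RISC-V compilers predefine when V is enabled.

```shell
#!/bin/sh
# Hypothetical helper: checks whether a compiler predefines __riscv_vector
# when asked to target the V extension.
# Usage: compiler_has_rvv <compiler> [extra flags...]
# Exit status: 0 = RVV enabled, 1 = not enabled, 2 = compiler not found.
compiler_has_rvv() {
    cc=$1
    shift
    command -v "$cc" >/dev/null 2>&1 || return 2
    # -dM -E prints the predefined macros of an empty C translation unit
    "$cc" -march=rv64gcv "$@" -dM -E -x c /dev/null 2>/dev/null \
        | grep -q '__riscv_vector'
}

# Compiler name is an assumption; override it through the CC variable.
CC=${CC:-riscv64-unknown-linux-gnu-gcc}
rc=0
compiler_has_rvv "$CC" || rc=$?
case $rc in
    0) echo "$CC: RVV is available" ;;
    2) echo "$CC: compiler not found" ;;
    *) echo "$CC: RVV does not seem to be available" ;;
esac
```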
In order to test applications without hardware, one can use emulation software, for example QEMU. Often the QEMU emulator is included in the toolchain package or can be built together with the compiler. There are two operating modes for QEMU: full system emulation and user-mode emulation. In the first case, the user has to describe and prepare a full virtual system with disks and other peripherals, install an operating system and then work with it like with a standalone machine - boot, install software, interact. In the second case, the user can run their RISC-V application straight away and QEMU will proxy system calls to the host OS (Linux) - this mode is best suited for application development and debugging, so we will review it in more detail.
QEMU user-mode applications have names like qemu-arm, qemu-aarch64, qemu-riscv64 - they do not contain the word system. In order to launch an application using QEMU, one should pass its command line to the QEMU program like this:
qemu-riscv64 <qemu options> ./my-app <app arguments>
The following QEMU options are the most important for running a RISC-V application:
- -cpu <model> - select CPU model and features, for example QEMU provided by T-Head allows selecting a specific core to emulate: c906 and c906fdv (with f, d and v extensions), c908 and c908v, c910 and c910v. Generic RISC-V emulation also allows extension selection: -cpu rv64,v=true,vext_spec=v1.0 will enable RVV v1.0.
- -L <path> - set the path where the LD interpreter will be rooted. Usually this folder is part of the toolchain distribution, so this parameter might look like this: -L <toolchain root>/sysroot.
Other options might be useful for debugging and fine tuning:
- -help - show all options and their descriptions
- -cpu help - show all supported CPU models
- -E <var>=<value> - set environment variables
- -g <port> - wait for GDB connection on selected port
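To show how the options above fit together, here is a small sketch that assembles a user-mode command line and prints it instead of running it. The helper name `qemu_cmdline`, the QEMU_GDB_PORT variable and the /opt/riscv path are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints a qemu-riscv64 user-mode command line.
# Usage: qemu_cmdline <sysroot> <cpu> <app> [app arguments...]
# If QEMU_GDB_PORT is set, "-g <port>" is added so QEMU waits for GDB.
qemu_cmdline() {
    sysroot=$1
    cpu=$2
    shift 2
    line="qemu-riscv64 -L $sysroot -cpu $cpu"
    if [ -n "${QEMU_GDB_PORT:-}" ]; then
        line="$line -g $QEMU_GDB_PORT"
    fi
    echo "$line $*"
}

# Generic RISC-V CPU with RVV 1.0 enabled; the sysroot path is an assumption.
qemu_cmdline /opt/riscv/sysroot rv64,v=true,vext_spec=v1.0 ./my-app --help
```

Printing the command first makes it easy to inspect the CPU model and sysroot before actually launching the emulator.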
Usually the most convenient way is to debug a user application remotely, because either the target system does not have a debugger, or the one it has in its packages does not match the compiler used for the build (e.g. does not support RVV). The remote debugging process with GDB is as follows:
1. Build your application with debugging information enabled: use the -g -Og compiler options, or the -DCMAKE_BUILD_TYPE=Debug cmake option in the case of OpenCV.
2. On the remote machine, run your application with gdbserver (the port can be chosen arbitrarily, e.g. 1234):
   gdbserver :<port> ./my-app <args>
   The program will start and pause immediately, waiting for a remote connection.
3. On the host machine, run your application using the GDB from the toolchain:
   <toolchain root>/bin/riscv64-unknown-linux-gnu-gdb ./my-app <args>
   The program will be loaded and GDB will wait for further instructions.
4. Set up the remote connection on the host machine using the GDB command target remote <address>:<port>, where address is your remote machine's IP address or hostname and port is the same as in step 2.
5. Debug the application from the host as usual, e.g. enter the continue command to continue execution until it crashes and examine the program state afterwards.
A similar procedure can be used with QEMU emulation - set the -g <port> option to start the server (step 2) and connect using target remote :<port> on the GDB side (step 4).
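The host-side part of this procedure (loading the program and connecting to the server) can be combined into a single GDB invocation using GDB's -ex option, which executes a command at startup. The sketch below only prints such a command line; the helper name `gdb_remote_cmdline`, the /opt/riscv toolchain root and the IP address are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the host-side GDB command line that loads the
# program and connects to the remote server in one go.
# Usage: gdb_remote_cmdline <toolchain root> <address> <port> <app>
# An empty address gives "target remote :<port>", which suits a local QEMU.
gdb_remote_cmdline() {
    gdb="$1/bin/riscv64-unknown-linux-gnu-gdb"
    echo "$gdb -ex 'target remote $2:$3' $4"
}

# Remote board running gdbserver on port 1234
gdb_remote_cmdline /opt/riscv 192.168.1.10 1234 ./my-app
# Local QEMU started with "-g 1234"
gdb_remote_cmdline /opt/riscv "" 1234 ./my-app
```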
OpenCV has supported the RISC-V platform since 2020, and this support grows and improves each year. Major contributions have been made by T-Head (平头哥半导体有限公司) (intrin_rvv071.hpp) and by the Chinese Academy of Sciences (intrin_rvv.hpp, intrin_rvv_scalable.hpp).
Note: in the latest OpenCV versions the intrin_rvv.hpp implementation has been removed
Most CPU optimizations in OpenCV are achieved through the use of Universal Intrinsics, which act as wrappers over platform-specific SIMD compiler intrinsics. Currently, OpenCV supports implementations for SSE/AVX/NEON/RVV/VSX/MSA/LSX/WASM intrinsics.
Historically, the Universal Intrinsics have undergone three generations:
- Fixed size intrinsics: types indicate element size and count, e.g. v_int8x16 - vectors with 16 8-bit elements (128-bit registers).
- Wide intrinsics: types indicate element size only, element count is selected at compile time, e.g. v_int8 - vector with VTraits<v_int8>::nlanes 8-bit elements, where nlanes can be 16, 32 or 64 depending on vector register size (128, 256, 512 bits).
- Scalable intrinsics: types indicate element size, element count is selected at run time depending on the platform where the code is executed, e.g. v_int8 - for 8-bit vectors with VTraits<v_int8>::vlanes() elements.
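The relation between register width and lane count used in the list above is a plain division of the register width by the element width. A minimal sketch of that arithmetic (the function name `lanes` is hypothetical and unrelated to the OpenCV API):

```shell
#!/bin/sh
# Hypothetical helper: lanes = register width in bits / element width in bits
lanes() {
    echo $(( $1 / $2 ))
}

for bits in 128 256 512; do
    echo "$bits-bit register, 8-bit elements: $(lanes "$bits" 8) lanes"
done
```

With scalable intrinsics the same division applies, but the register width is only known at run time, which is why the lane count becomes a function call rather than a compile-time constant.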
The scalable intrinsics implementation matches well with the RVV sizeless vectors paradigm. During the summers of 2023 and 2024, as part of the OpenCV Summer of Code project, the library was refactored by @hanliutong using a semi-automated approach to support scalable intrinsics in most areas.
There are 2 implementations of universal intrinsics for RISC-V RVV in OpenCV (files located in modules/core/include/opencv2/core/hal):
- intrin_rvv_scalable.hpp - uses the latest RVV intrinsics, limited to RVV v1.0. Can be built with recent versions of GCC and LLVM toolchains. First real implementation of Scalable Universal Intrinsics. Can be run on the T-Head QEMU (c908v), mainline QEMU with RVV 1.0 enabled or HW supporting RVV 1.0.
- intrin_rvv071.hpp - soon after the OpenCV 4.9.0 release this implementation was reworked by T-Head to support the modern intrinsics dialect, toolchains (2.6.x - 2.8.x) and both 0.7.1 and 1.0 RVV versions (see PR#24841). It uses a fixed vector length (128 bit) by setting the compiler option -mrvv-vector-bits=128. At the time of writing this implementation might show worse efficiency than the previous one, but further improvements in this area should be possible.
We will describe both build variants, one for each Universal Intrinsics implementation listed above.
We assume the following directory structure:
<root>/
- opencv/ - OpenCV repository
- opencv_extra/ - OpenCV Extra repository with testdata (not required for build)
- build/ - build location (empty)
Prerequisites are:
- cmake
- ninja-build (Makefiles generator can be used as well)
- python3 (?)
- selected RISC-V toolchain installed somewhere (TOOLCHAIN_ROOT)
We will use static builds (BUILD_SHARED_LIBS option) for deployment convenience, but dynamic builds can be used as well. We also disable OpenCL during builds (WITH_OPENCL option) to avoid loading an experimental OpenCL runtime during OpenCV test execution and to reduce testing time and irrelevant failures; this option is not necessary for regular use.
The main difference between the build variants is the <toolchain>.cmake file being used and some specific options.
For the intrin_rvv_scalable.hpp variant, use mainline GCC or LLVM toolchains.
Build with GCC:
cd build
PATH=${TOOLCHAIN_ROOT}/bin:${PATH} \
cmake -GNinja \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=OFF \
    -DWITH_OPENCL=OFF \
    -DCMAKE_TOOLCHAIN_FILE=../opencv/platforms/linux/riscv64-gcc.toolchain.cmake \
    -DRISCV_RVV_SCALABLE=ON \
    ../opencv
ninja
Build with LLVM (also requires GCC for standard libraries):
cd build
cmake -GNinja \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=OFF \
    -DWITH_OPENCL=OFF \
    -DCMAKE_TOOLCHAIN_FILE=../opencv/platforms/linux/riscv64-clang.toolchain.cmake \
    -DRISCV_CLANG_BUILD_ROOT=${LLVM_TOOLCHAIN_ROOT} \
    -DRISCV_GCC_INSTALL_ROOT=${GCC_TOOLCHAIN_ROOT} \
    -DRISCV_RVV_SCALABLE=ON \
    ../opencv
ninja
Run OpenCV core test using regular QEMU:
cd build
OPENCV_TEST_DATA_PATH=../opencv_extra/testdata \
${QEMU_DIR}/bin/qemu-riscv64 \
    -L ${TOOLCHAIN_ROOT}/sysroot \
    -cpu rv64,v=true,vext_spec=v1.0 \
    ./bin/opencv_test_core
Note: OpenCV uses flexible CPU feature detection during the configuration process, so if the compiler does not support RVV, the optimizations will be turned off and the build will proceed. To avoid this behavior and fail the build when the compiler does not support RVV, the following option should be added: -DCPU_BASELINE_REQUIRE=RVV
Note: a CMake option can be used to change the default CPU option used for RVV detection: -DCMAKE_CXX_FLAGS="-march=rv64gcv1p0". It can be useful if you want to enable extra RISC-V features or your compiler requires a specific feature description syntax.
OpenCV > 4.9.0
For the intrin_rvv071.hpp variant, use the T-Head 2.x toolchain.
Build:
cd build
PATH=${TOOLCHAIN_ROOT}/bin:${PATH} \
cmake -GNinja \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=OFF \
    -DWITH_OPENCL=OFF \
    -DCMAKE_TOOLCHAIN_FILE=../opencv/platforms/linux/riscv64-071-gcc.toolchain.cmake \
    -DCORE=C910V \
    ../opencv
ninja
Run OpenCV core test using T-Head QEMU (select CPU model corresponding to the build CPU option):
cd build
OPENCV_TEST_DATA_PATH=../opencv_extra/testdata \
${QEMU_DIR}/bin/qemu-riscv64 \
    -L ${TOOLCHAIN_ROOT}/sysroot \
    -cpu c910v \
    ./bin/opencv_test_core
Note: see all supported CPU models accepted by the -DCORE= option in platforms/linux/riscv64-071-gcc.toolchain.cmake. The RVV version and its availability depend on the selected CPU model.
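The two build variants described above differ only in the toolchain file and one extra option, which can be summarized in a small sketch. The helper name `opencv_rvv_cmake_args` and the variant labels scalable and rvv071 are hypothetical; the option values are the ones used in the build commands above, and the script only prints the resulting cmake command lines.

```shell
#!/bin/sh
# Hypothetical helper: prints the variant-specific CMake options.
# Usage: opencv_rvv_cmake_args <scalable|rvv071>
opencv_rvv_cmake_args() {
    platforms=../opencv/platforms/linux
    case "$1" in
        scalable)
            echo "-DCMAKE_TOOLCHAIN_FILE=$platforms/riscv64-gcc.toolchain.cmake -DRISCV_RVV_SCALABLE=ON" ;;
        rvv071)
            echo "-DCMAKE_TOOLCHAIN_FILE=$platforms/riscv64-071-gcc.toolchain.cmake -DCORE=C910V" ;;
        *)
            echo "unknown variant: $1" >&2
            return 1 ;;
    esac
}

# Options shared by both variants, as used throughout this document
common="-GNinja -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF -DWITH_OPENCL=OFF"
for variant in scalable rvv071; do
    echo "cmake $common $(opencv_rvv_cmake_args "$variant") ../opencv"
done
```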
© Copyright 2019-2025, OpenCV team