gemv

Here are 5 public repositories matching this topic...

DefTruth/CUDA-Learn-Notes

Star 2.9k

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

cuda cuda-kernels cutlass cudnn cuda-toolkit gemm cuda-programming gemv hgemm flash-attention flash-mla

Updated Mar 19, 2025
Cuda

Bruce-Lee-LY/cuda_hgemv

Star 57

Several optimization methods for half-precision general matrix-vector multiplication (HGEMV) using CUDA cores.

gpu cuda cublas nvidia gemm gemv matrix-multiply tensor-core hgemm cuda-core hgemv

Updated Sep 8, 2024
Cuda

yzhaiustc/Optimizing-SGEMV-on-NVIDIA-GPUs

Star 9

An implementation of SGEMV with performance comparable to cuBLAS.

cuda blas gemv

Updated May 21, 2021
Cuda

nsomatilda/Matilda

Star 3

Matilda is a library for repeatedly multiplying a constant matrix by a variable vector.

realtime multithreading simd low-latency avx2 adaptive-optics matrix-vector-multiplication avx-512 gemv

Updated May 23, 2024
C++

yzhaiustc/Optimizing-DGEMV-on-Intel-CPUs

Star 3

Highly optimized DGEMV on CPU, with both serial and parallel performance better than MKL and OpenBLAS.

openmp simd blas avx512 mkl gemv

Updated May 24, 2021
C
