OpenSource GPU, in Verilog, loosely based on RISC-V ISA

hughperkins/VeriGPU

Build an opensource GPU, targeting ASIC tape-out, for machine learning ("ML"). Hopefully, can get it to work with the PyTorch deep learning framework.

Vision

Create an opensource GPU for machine learning.

I don't actually intend to tape this out myself, but I intend to do what I can to verify that tape-out would work, that timings would be met, and so on.

Intend to implement a HIP API that is compatible with the PyTorch machine learning framework. Open to provision of other APIs, such as SYCL or NVIDIA® CUDA™.

Internal GPU Core ISA loosely compliant with RISC-V ISA. Where RISC-V conflicts with designing for a GPU setting, we break with RISC-V.
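To make "loosely compliant with RISC-V" a little more concrete, here is a minimal sketch of how a standard RV32I I-type instruction word splits into fields. It shows the base RISC-V layout only. The function name `decode_itype` is hypothetical, and the actual VeriGPU core may deviate from this encoding wherever it breaks with RISC-V.

```python
def decode_itype(instr: int) -> dict:
    """Split a 32-bit RV32I I-type instruction word into its fields.

    Standard RISC-V layout (an assumption for VeriGPU, which is only
    loosely compliant): imm[11:0] | rs1 | funct3 | rd | opcode
    """
    opcode = instr & 0x7F          # bits 6..0
    rd = (instr >> 7) & 0x1F       # bits 11..7
    funct3 = (instr >> 12) & 0x7   # bits 14..12
    rs1 = (instr >> 15) & 0x1F     # bits 19..15
    imm = (instr >> 20) & 0xFFF    # bits 31..20
    if imm & 0x800:                # sign-extend the 12-bit immediate
        imm -= 0x1000
    return {"opcode": opcode, "rd": rd, "funct3": funct3, "rs1": rs1, "imm": imm}


# 0x00500093 encodes `addi x1, x0, 5` in standard RV32I.
fields = decode_itype(0x00500093)
print(fields)
```

The same bit slicing maps directly onto wire selects in a hardware decoder, which is one reason a fixed-width, regular encoding is attractive for a small core.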

Intend to keep the cores very focused on ML. For example, brain floating point ("BF16") throughout, to keep core die area low. This should keep the per-core cost low. Similarly, intend to implement only the few float operations critical to ML, such as exp, log, tanh, sqrt.
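As a rough illustration of what BF16 means in practice: a BF16 value is the upper 16 bits of an IEEE 754 float32 (1 sign bit, 8 exponent bits, 7 mantissa bits). The sketch below converts between the two in software, using round-to-nearest-even. The helper names are hypothetical, and the rounding mode is an assumption; nothing here is taken from the VeriGPU hardware design.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Return the 16-bit BF16 pattern for x (round-to-nearest-even).

    Assumes x is representable as a float32; struct.pack raises
    OverflowError for finite values outside the float32 range.
    """
    bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: truncate, but force a mantissa bit so the result stays NaN.
    if (bits32 & 0x7F800000) == 0x7F800000 and (bits32 & 0x007FFFFF):
        return (bits32 >> 16) | 0x0040
    # Add 0x7FFF plus the lowest kept bit, so exact ties round to even.
    rounding_bias = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(bits16: int) -> float:
    """Widen a BF16 bit pattern back to a Python float (exact)."""
    return struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))[0]


print(hex(float_to_bf16_bits(1.0)))                      # 0x3f80
print(bf16_bits_to_float(float_to_bf16_bits(3.14159)))   # 3.140625
```

BF16 keeps the full float32 exponent range and gives up mantissa precision, leaving about 2 to 3 significant decimal digits. The narrower mantissa is what shrinks multipliers and adders, and with them the core die area.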

Architecture

Big Picture:

[Figure: Big Picture]

GPU Die Architecture:

[Figure: GPU Die Architecture]

Single Core:

[Figure: Single Core]

Single-source compilation and runtime

[Figure: End-to-end Architecture]

Simulation

Single-source C++

Single-source C++:

[Figure: Single-source C++]

Compile the GPU and runtime:

[Figure: Compile GPU and runtime]

Compile the single-source C++, and run:

[Figure: Run single-source example]

Planning

What direction are we thinking of going in? What works already? See:

Tech details

Our assembly language implementation and progress. Design of GPU memory, registers, and so on. See:

Verification

If we want to tape-out, we need solid verification. Read more at:

Metrics

We want the GPU to run quickly, and to use minimal die area. Read how we measure timings and area at:

