- Version: 0.2-1
- License: BSL-1.0
- Project home: https://github.com/fml-fam/fmlr
- Bug reports: https://github.com/fml-fam/fmlr/issues
High-level, R-like interface for fmlr. The package name is a play on the German word 'Fimmel' (fml, fimmel, ...).
The goal of the package is to give a more traditional, R-like interface around fmlr functions and methods. It's basically just a shallow S4 wrapper. The canonical use case would be something like:
- build your matrix in R
- convert to an fmlr object (both routes are sketched below):
    - the easy way: use `fmlmat()`
    - the harder, more robust way:
        - convert the R matrix to an fmlr object via `as_cpumat()`, `as_gpumat()`, or `as_mpimat()` (may require a copy)
        - convert to a craze `fmlmat` object via `as_fmlmat()` (no copy)
- call the desired linear algebra function(s)
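For example, here is a minimal sketch of both conversion routes. It assumes `as_cpumat()` is called from the fmlr namespace and that `fmlmat()`/`as_fmlmat()` come from craze; swap in `as_gpumat()` or `as_mpimat()` for other backends.

```r
library(craze)

x = matrix(as.double(1:9), 3)

# the easy way: wrap the R matrix directly
x_easy = fmlmat(x)

# the harder, more robust way: build the fmlr object yourself, then wrap it
x_fml = fmlr::as_cpumat(x)   # may require a copy
x_hard = as_fmlmat(x_fml)    # no copy

# either object can now be used with the usual linear algebra operators
x_easy %*% x_easy
```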
You will need an installation of fmlr. See the fmlr installation guide for more details.
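The fmlr installation guide is the authoritative reference; as a rough sketch, and assuming a CPU-only fmlr build is also available from the HPCRAN, installing it might look like:

```r
# sketch only -- see the fmlr installation guide, especially for GPU (CUDA) or MPI builds
install.packages("fmlr", repos=c("https://hpcran.org", "https://cran.rstudio.com"))
```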
You can install the stable version of craze from the HPCRAN using the usual `install.packages()`:
install.packages("craze",repos=c("https://hpcran.org","https://cran.rstudio.com"))
The development version of craze is maintained on GitHub:
```r
remotes::install_github("fml-fam/craze")
```
Multiplying CPU data:
```r
library(craze)

x = matrix(as.double(1:9), 3)
x_cpu = fmlmat(x)
x_cpu
```

```
## # cpumat 3x3 type=d
## 1.0000 4.0000 7.0000 
## 2.0000 5.0000 8.0000 
## 3.0000 6.0000 9.0000 
```

```r
x_cpu %*% x_cpu
```

```
## # cpumat 3x3 type=d
## 30.0000 66.0000 102.0000 
## 36.0000 81.0000 126.0000 
## 42.0000 96.0000 150.0000 
```

and GPU data:
```r
x_gpu = fmlmat(x, backend="gpu")
x_gpu
```

```
## # gpumat 3x3 type=d
## 1.0000 4.0000 7.0000 
## 2.0000 5.0000 8.0000 
## 3.0000 6.0000 9.0000 
```

```r
x_gpu %*% x_gpu
```

```
## # gpumat 3x3 type=d
## 30.0000 66.0000 102.0000 
## 36.0000 81.0000 126.0000 
## 42.0000 96.0000 150.0000 
```

Throughout, I'm using:
- R
    - R version 3.6.2
    - float version 0.2-4
    - fml version 0.2-1
    - craze version 0.1-0
- CPU
    - AMD Ryzen 7 1700X Eight-Core Processor
    - OpenBLAS with 8 threads
- GPU
    - NVIDIA GeForce GTX 1070 Ti
    - cuBLAS
Let's take a look at a quick matrix multiplication benchmark. First, we need to set up the test matrices:
```r
library(craze)

n = 5000
x = matrix(runif(n*n), n, n)

x_flt = fl(x)
x_cpu = fmlmat(x)
x_gpu = fmlmat(x, type="float", backend="gpu")
```
We're using float for the GPU data because my graphics card doesn't have full double-precision hardware. Dropping to single precision should by itself give a roughly 2x run-time advantage over a double-precision test like the R one, so the fairer comparison for the GPU is against the float test.
First we'll time the R matrix product (double precision, CPU):

```r
system.time(x %*% x)
```

```
##    user  system elapsed 
##  10.241   0.330   1.345 
```

As I recall, R's matrix multiplication does some pre-scanning for bad numerical values. None of the implementations that follow do this, so there is some overhead in R's version that may or may not be of value to you.
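As a hypothetical illustration (not from the craze docs) of what that pre-scan guards against: base R checks for non-finite values before handing the data to the BLAS, so they are guaranteed to propagate correctly through the product.

```r
x_bad = matrix(c(1, 2, NaN, 4), 2)

# base R notices the NaN up front; every result entry it touches comes back NaN
x_bad %*% x_bad
```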
Here's the float test (single precision, CPU):

```r
system.time(x_flt %*% x_flt)
```

```
##    user  system elapsed 
##   4.898   0.212   0.640 
```

This is a little more than twice as fast, which makes sense.
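As a quick sanity check on "a little more than twice", using the elapsed times above:

```r
1.345 / 0.640   # ~2.1: double-precision R time over single-precision float time
```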
Here's the same test using fmlr as the backend (double precision, CPU):

```r
system.time(x_cpu %*% x_cpu)
```

```
##    user  system elapsed 
##  10.285   0.317   1.327 
```

Even with the overhead of the R version, the run times are essentially the same. This is expected, since most of the work is actually in computing the product, and both are calling out to the same dgemm() function in OpenBLAS. The float version above is calling sgemm() from OpenBLAS.
The GPU numbers are pretty different though:
```r
system.time(x_gpu %*% x_gpu)
```

```
##    user  system elapsed 
##   0.002   0.001   0.002 
```

This is more than 300x faster than the CPU float version. There's a reason I chose matrix multiplication for this benchmark 😉
| Benchmark | Precision | Wall-clock time (s) | Relative performance |
|---|---|---|---|
| R matrix `x` | double | 1.345 | 672.5 |
| float matrix `x_flt` | single | 0.640 | 320.0 |
| fmlr CPU matrix `x_cpu` | double | 1.327 | 663.5 |
| fmlr GPU matrix `x_gpu` | single | 0.002 | 1.0 |
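The relative performance column is presumably just each wall-clock time divided by the fastest (GPU) time; recomputing it from the elapsed times above:

```r
times = c(R=1.345, float=0.640, fmlr_cpu=1.327, fmlr_gpu=0.002)
times / min(times)   # 672.5, 320.0, 663.5, 1.0
```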