craze

High-level, R-like interface for fmlr. The package name is a play on the German word 'fimmel' (fml, fimmel, ...).

The goal of the package is to give a more traditional, R-like interface around fmlr functions and methods. It's basically just a shallow S4 wrapper. The canonical use case would be something like:

  • build your matrix in R
  • convert to an fmlr object:
    • the easy way: use fmlmat()
    • the harder, more robust way (see the sketch after this list):
      • convert the R matrix to an fmlr object via as_cpumat(), as_gpumat(), or as_mpimat() (may require a copy)
      • convert to a craze fmlmat object via as_fmlmat() (no copy)
  • call the desired linear algebra function(s)
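
For concreteness, here's a minimal sketch of both conversion paths. It assumes the functions named in the list above (fmlmat(), as_cpumat(), as_fmlmat()) are all available after loading craze; treat it as a sketch, not canonical usage.

library(craze)
x = matrix(as.double(1:9), 3)

## the easy way: one call
x_easy = fmlmat(x)

## the harder, more robust way: explicit backend conversion (may copy),
## then wrap in a craze fmlmat object (no copy)
x_fml = as_fmlmat(as_cpumat(x))

x_fml %*% x_fml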

Installation

You will need an installation of fmlr. See the fmlr installation guide for more details.

You can install the stable version from the HPCRAN using the usual install.packages():

install.packages("craze", repos=c("https://hpcran.org", "https://cran.rstudio.com"))

The development version of craze is maintained on GitHub:

remotes::install_github("fml-fam/craze")

Example

Multiplying CPU data:

library(craze)
x = matrix(as.double(1:9), 3)
x_cpu = fmlmat(x)
x_cpu
## # cpumat 3x3 type=d
## 1.0000 4.0000 7.0000 
## 2.0000 5.0000 8.0000 
## 3.0000 6.0000 9.0000
x_cpu %*% x_cpu
## # cpumat 3x3 type=d
## 30.0000 66.0000 102.0000 
## 36.0000 81.0000 126.0000 
## 42.0000 96.0000 150.0000

and GPU data:

x_gpu = fmlmat(x, backend="gpu")
x_gpu
## # gpumat 3x3 type=d 
## 1.0000 4.0000 7.0000 
## 2.0000 5.0000 8.0000 
## 3.0000 6.0000 9.0000
x_gpu %*% x_gpu
## # gpumat 3x3 type=d 
## 30.0000 66.0000 102.0000 
## 36.0000 81.0000 126.0000 
## 42.0000 96.0000 150.0000

Benchmark

Throughout, I'm using:

  • R
    • R version 3.6.2
    • float version 0.2-4
    • fml version 0.2-1
    • craze version 0.1-0
  • CPU
    • AMD Ryzen 7 1700X Eight-Core Processor
    • OpenBLAS with 8 threads
  • GPU
    • NVIDIA GeForce GTX 1070 Ti
    • cuBLAS

Let's take a look at a quick matrix multiplication benchmark. First, we need to set up the test matrices:

library(craze)
n = 5000
x = matrix(runif(n*n), n, n)
x_flt = fl(x)
x_cpu = fmlmat(x)
x_gpu = fmlmat(x, type="float", backend="gpu")

We're using float for the GPU data because my graphics card doesn't have the full complement of double precision cores. That choice gives the GPU roughly a 2x run-time advantage over a double precision test like the R one, so its numbers are more fairly compared against the float test.
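
The precision change is also visible in memory. As a quick check with base R's object.size() on the matrices built above (sizes are approximate):

## doubles use 8 bytes per element, floats 4, so the single precision
## copy of a 5000x5000 matrix should be about half the size
object.size(x)      # roughly 200 MB (double precision)
object.size(x_flt)  # roughly 100 MB (single precision, via the float package)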

First we'll time the R matrix product (double precision, CPU):

system.time(x %*% x)
##   user  system elapsed 
## 10.241   0.330   1.345

As I recall, R's matrix multiplication does some pre-scanning of the inputs for bad numerical values (NaN/Inf). None of the implementations that follow do this, so the R version carries some extra overhead that may or may not be of value to you.
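
If you're curious about that overhead, base R exposes it as an option: the matprod option controls whether inputs are pre-scanned before dispatching to the BLAS. A quick sketch:

## with matprod = "default", R pre-scans for NaN/Inf and falls back to a
## slower internal routine if it finds any; "blas" skips the check and
## hands the product straight to the BLAS (dgemm here)
getOption("matprod")         # "default"
options(matprod = "blas")
system.time(x %*% x)         # the same product, minus the pre-scan
options(matprod = "default") # restore the default behavior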

Here's the float test (single precision, CPU):

system.time(x_flt %*% x_flt)
##  user  system elapsed 
## 4.898   0.212   0.640

This is a little more than twice as fast (1.345 / 0.640 ≈ 2.1), which makes sense.

Here's the same test using fmlr as the backend (double precision, CPU):

system.time(x_cpu %*% x_cpu)
##   user  system elapsed 
## 10.285   0.317   1.327

Even with the overhead of the R version, the run times are essentially the same. This is expected, since most of the work is actually in computing the product, and both are calling out to the same dgemm() function in OpenBLAS. The float version above calls sgemm() from OpenBLAS instead.
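
Since both ultimately call the same dgemm(), the results should agree to floating point tolerance. A hypothetical sanity check, assuming the craze object can be pulled back into a base R matrix with an as.matrix() method (check the package docs for the actual accessor):

## compare R's product against fmlr's; expect TRUE
r_result   = x %*% x
fml_result = as.matrix(x_cpu %*% x_cpu)  # as.matrix() here is an assumption
all.equal(r_result, fml_result)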

The GPU numbers are pretty different though:

system.time(x_gpu %*% x_gpu)
##  user  system elapsed 
## 0.002   0.001   0.002

This is more than 300x faster than the CPU float version. There's a reason I chose matrix multiplication for this benchmark 😉

Benchmark         Object   Precision   Wall-clock time (s)   Relative performance
R matrix          x        double      1.345                 672.5
float matrix      x_flt    single      0.640                 320.0
fmlr CPU matrix   x_cpu    double      1.327                 663.5
fmlr GPU matrix   x_gpu    single      0.002                 1.0
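
For the record, the relative performance column is just each wall-clock time divided by the fastest (GPU) time:

## relative performance = elapsed / min(elapsed)
elapsed = c(R = 1.345, float = 0.640, fml_cpu = 1.327, fml_gpu = 0.002)
round(elapsed / min(elapsed), 1)
##       R   float fml_cpu fml_gpu 
##   672.5   320.0   663.5     1.0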
