R Package: Parallel Distance Matrix Computation using Multiple Threads
alexeckert/parallelDist
The parallelDist package provides a fast parallelized alternative to R's native 'dist' function to calculate distance matrices for continuous, binary, and multi-dimensional input matrices. It offers a broad variety of predefined distance functions from the 'stats', 'proxy' and 'dtw' R packages, as well as support for user-defined distance functions written in C++. For ease of use, the 'parDist' function extends the signature of the 'dist' function and uses the same parameter naming conventions as the distance methods of existing R packages. The package currently supports 41 different distance methods.
The package is mainly implemented in C++ and leverages the 'Rcpp' and 'RcppParallel' packages to parallelize the distance computations with the help of the TinyThread library. Furthermore, the Armadillo linear algebra library is used via 'RcppArmadillo' for optimized matrix operations in the distance calculations. The curiously recurring template pattern (CRTP) is applied to avoid virtual functions, which improves the performance of the Dynamic Time Warping calculations while keeping the implementation flexible enough to support different step patterns and normalization methods.
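The CRTP idea mentioned above can be sketched as follows. This is a simplified illustration, not the package's actual code: the names `DtwBase`, `Symmetric1`, `Symmetric2` and `step`, as well as the use of `std::vector<double>` for univariate series, are assumptions made for this example. The two recurrences follow the usual definitions of the symmetric1 and symmetric2 step patterns.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// CRTP base (hypothetical name): the DTW recursion lives here, and the step
// pattern is supplied by Derived through a statically resolved call, so no
// virtual dispatch happens inside the inner loop.
template <typename Derived>
class DtwBase {
public:
    double distance(const std::vector<double>& a,
                    const std::vector<double>& b) const {
        assert(!a.empty() && !b.empty());  // this sketch expects non-empty series
        const std::size_t n = a.size(), m = b.size();
        const double inf = std::numeric_limits<double>::infinity();
        // (n + 1) x (m + 1) cumulative cost matrix; the border stays infinite.
        std::vector<std::vector<double>> D(n + 1, std::vector<double>(m + 1, inf));
        D[0][0] = 0.0;
        const Derived& self = static_cast<const Derived&>(*this);
        for (std::size_t i = 1; i <= n; ++i) {
            for (std::size_t j = 1; j <= m; ++j) {
                const double cost = std::abs(a[i - 1] - b[j - 1]);
                D[i][j] = self.step(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1], cost);
            }
        }
        return D[n][m];
    }
};

// symmetric1: every admissible step adds the local cost once.
struct Symmetric1 : DtwBase<Symmetric1> {
    double step(double up, double left, double diag, double cost) const {
        return cost + std::min({up, left, diag});
    }
};

// symmetric2: the diagonal step adds the local cost twice.
struct Symmetric2 : DtwBase<Symmetric2> {
    double step(double up, double left, double diag, double cost) const {
        return std::min({up + cost, left + cost, diag + 2.0 * cost});
    }
};
```

Adding another step pattern in this sketch only requires a new struct with its own `step` method. The compiler can inline that call into the loop, which is the benefit over a virtual function.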
Usage examples and performance benchmarks can be found in the included vignette.
Details about the 41 supported distance methods and their parameters are described on the help page of the 'parDist' function. The help page can be displayed with the following command:
?parDist
Since version 0.2.0, parallelDist supports fast parallel distance matrix computations for user-defined distance functions written in C++.
A user-defined function needs to have the following signature (also see the Armadillo documentation):
double customDist(const arma::mat &A, const arma::mat &B)
Defining and compiling the function, as well as creating an external pointer to the user-defined function, can easily be achieved with the cppXPtr function of the 'RcppXPtrUtils' package. The following code shows a full example of defining and using a user-defined Euclidean distance function:
# parallelDist provides the parDist function
library(parallelDist)
# RcppArmadillo is used as dependency
library(RcppArmadillo)
# RcppXPtrUtils is used for simple handling of C++ external pointers
library(RcppXPtrUtils)
# compile user-defined function and return pointer (RcppArmadillo is used as dependency)
euclideanFuncPtr <- cppXPtr(
  "double customDist(const arma::mat &A, const arma::mat &B) { return sqrt(arma::accu(arma::square(A - B))); }",
  depends = c("RcppArmadillo"))
# distance matrix for user-defined euclidean distance function
# (note that method is set to "custom")
parDist(matrix(1:16, ncol = 2), method = "custom", func = euclideanFuncPtr)
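To make the mechanism more concrete, the following standalone C++ sketch shows the general idea of calling a user-supplied function through a pointer with a fixed signature for every pair of rows. It is not the package's implementation: `Row` (a `std::vector<double>` standing in for `arma::mat`), `DistFunc`, `pairwiseDistances` and `exampleRows` are hypothetical names introduced for this illustration, and the loop runs sequentially rather than in parallel.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumption: a plain vector stands in for one matrix row (arma::mat in the package).
using Row = std::vector<double>;
// Pointer type mirroring the shape of the custom signature shown above.
using DistFunc = double (*)(const Row&, const Row&);

// User-supplied distance: Euclidean, as in the R example.
double euclideanDist(const Row& a, const Row& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) {
        const double d = a[k] - b[k];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Calls func once per pair of rows and collects the lower triangle
// column by column: (2,1), (3,1), ..., (n,1), (3,2), ...
std::vector<double> pairwiseDistances(const std::vector<Row>& rows, DistFunc func) {
    std::vector<double> out;
    for (std::size_t j = 0; j + 1 < rows.size(); ++j) {
        for (std::size_t i = j + 1; i < rows.size(); ++i) {
            out.push_back(func(rows[i], rows[j]));
        }
    }
    return out;
}

// Rows of matrix(1:16, ncol = 2): (1, 9), (2, 10), ..., (8, 16).
std::vector<Row> exampleRows() {
    std::vector<Row> rows;
    for (int i = 1; i <= 8; ++i) {
        rows.push_back({static_cast<double>(i), static_cast<double>(i + 8)});
    }
    return rows;
}
```

Calling `pairwiseDistances(exampleRows(), &euclideanDist)` yields 28 values. Neighbouring rows differ by 1 in each column, so their Euclidean distance is sqrt(2). Because every pair is computed independently of the others, this kind of loop is a natural candidate for splitting across threads.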
More information can be found in the vignette and the help pages.
parallelDist is available on CRAN and can be installed with the following command:
install.packages("parallelDist")
The current version from GitHub can be installed using the 'devtools' package:
library(devtools)
install_github("alexeckert/parallelDist")
To build the package from source, the following system dependencies are required:
- LAPACK and BLAS libraries: Required for linear algebra operations
On Ubuntu/Debian systems:
sudo apt-get install liblapack-dev libblas-dev
On RHEL/CentOS/Fedora systems:
sudo yum install lapack-devel blas-devel
# or on newer systems:
sudo dnf install lapack-devel blas-devel
On macOS with Homebrew:
brew install lapack openblas
Author: Alexander Eckert
License: GPL (>= 2)