HPAC/tccg
The Tensor Contraction Code Generator (TCCG) generates high-performance (parallel and) vectorized C code for tensor contractions.
From a computational perspective, tensors can be interpreted as higher-dimensional matrices or simply as multidimensional arrays; likewise, tensor contractions are a generalization of matrix-matrix multiplication to higher dimensions. For instance, A[i,k], B[k,j] and C[i,j] denote two-dimensional tensors (i.e., matrices), and C[i,j] = A[i,k] * B[k,j] represents a tensor contraction where the sum over 'k' as well as the loops over 'i' and 'j' are implicit. Further examples of tensor contractions are: C[i0,j0,j1] = A[i0,k0] * B[j1,k0,j0]; C[i0,j0,j1,i1] = A[i0,k0,i1] * B[j1,k0,j0]; C[i0,j0,j1,i1] = A[k0,i0,k1,i1] * B[k1,j1,k0,j0]; ...
Current version: v0.1.2
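To make the implicit-summation notation concrete, here is a minimal reference implementation (plain Python, not TCCG-generated code) of the contraction C[i,j] = A[i,k] * B[k,j], with the sum over the contracted index k written out explicitly:

```python
# Reference (unoptimized) implementation of the contraction
# C[i,j] = A[i,k] * B[k,j], i.e., ordinary matrix-matrix multiplication.

def contract_ik_kj(A, B):
    I, K = len(A), len(A[0])
    J = len(B[0])
    C = [[0.0] * J for _ in range(I)]
    for i in range(I):          # free index i (appears in C)
        for j in range(J):      # free index j (appears in C)
            for k in range(K):  # contracted index k (summed over)
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1.0, 2.0],
     [3.0, 4.0]]
B = [[5.0, 6.0],
     [7.0, 8.0]]
print(contract_ik_kj(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```

The higher-dimensional contractions listed above follow the same pattern, only with more free and contracted indices.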
- TCCG generates high-performance vectorized C code
- TCCG generates code based on three different approaches:
- GEMM-like Tensor-Tensor Multiplication (GETT): This novel approach to tensor contractions is at the core of our latest publication (see below).
- Transpose-Transpose-GEMM-Transpose (TTGT)
- Loops-over-GEMM (LoG)
- Shared-memory parallelism
- TTGT, LoG, GETT
- Support for single- and double-precision
- Auto-Fine-Tuning:
- Automatically explores a search space of promising implementation candidates
- The fastest candidate will be selected and returned automatically
- A performance model guides the search
- The search space can be limited by the user (via the --maxImplementations=N command line argument)
- Support for multiple instruction sets:
- AVX2: GETT, TTGT, LoG
- AVX512: GETT, TTGT, LoG (experimental)
- CUDA: TTGT, LoG
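As a rough illustration of the TTGT (Transpose-Transpose-GEMM-Transpose) idea — a hand-written Python sketch, not TCCG's actual generated code — the contraction C[i0,j0,j1] = A[i0,k0] * B[j1,k0,j0] can be reduced to a single matrix-matrix multiplication by first transposing B so that its contracted index comes first and its free indices are flattened:

```python
# TTGT sketch for C[i0,j0,j1] = A[i0,k0] * B[j1,k0,j0]:
#   1) transpose B into Bt[k0][j0*J1 + j1] (explicit transpose step),
#   2) perform one matrix-matrix multiplication C_mat = A * Bt (GEMM),
#   3) unflatten C_mat into the three-dimensional result C[i0][j0][j1].

def ttgt(A, B, I0, J0, J1, K0):
    # Step 1: explicit transposition B[j1][k0][j0] -> Bt[k0][j0*J1 + j1]
    Bt = [[0.0] * (J0 * J1) for _ in range(K0)]
    for j1 in range(J1):
        for k0 in range(K0):
            for j0 in range(J0):
                Bt[k0][j0 * J1 + j1] = B[j1][k0][j0]
    # Step 2: plain GEMM, C_mat[i0][c] = sum_k0 A[i0][k0] * Bt[k0][c]
    Cmat = [[sum(A[i0][k0] * Bt[k0][c] for k0 in range(K0))
             for c in range(J0 * J1)] for i0 in range(I0)]
    # Step 3: unflatten C_mat back into C[i0][j0][j1]
    return [[[Cmat[i0][j0 * J1 + j1] for j1 in range(J1)]
             for j0 in range(J0)] for i0 in range(I0)]
```

The transposes in steps 1 and 3 are exactly the overhead that GETT avoids by packing sub-tensors directly into the caches (see below).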
GETT's advantages are manifold:
- GETT-based code is fully vectorized and exploits the cache hierarchy.
- Sub-tensors are packed into the caches as needed. Thus, GETT avoids the explicit transposition overhead incurred by TTGT.
- The stride-one index is preserved while packing the sub-tensors into a specified level of the cache hierarchy.
- No additional workspace is required (except for small buffers which fit into the caches).
- The arithmetic intensity is retained for any given tensor contraction.
While GETT exhibits excellent performance across a wide range of tensor contractions, its performance for bandwidth-bound tensor contractions is especially outstanding.
For further information, please see our paper (referenced below).
In order to use TCCG, a working C compiler, some BLAS library (e.g., Intel's MKL), and the High-Performance Tensor Transposition (HPTT) library are required:
- Intel's ICC (>= v15.0, recommended) or g++ (>= v4.8, experimental)
- Some BLAS library (e.g., BLIS, ATLAS)
- High-Performance Tensor Transposition (HPTT) library
- Python (tested with v2.7.5 and v2.7.9)
- Tensor Contraction Library (TCL) (OPTIONAL)
Clone the repository into a desired directory and change to that location:
git clone https://github.com/HPAC/tccg.git
cd tccg
Install TCCG:
python setup.py install --user
Export the TCCG_ROOT environment variable (add to your .bashrc):
export TCCG_ROOT=$(pwd)
Set up your BLAS library within $TCCG_ROOT/config.cfg (default: mkl).
You might have to add the installed location to your PATH environment variable:
export PATH=$PATH:~/.local/bin
Please run tccg --help to get an overview of TCCG's parameters.
Here is an exemplary input file to TCCG:
C[a,b,i,j] = A[i,m,a] * B[m,j,b]
a = 24
b = 24
i = 24
j = 24
m = 24
TCCG command line arguments:
tccg --arch=avx2 --numThreads=1 --floatType=s example.tccg
Further examples (.tccg files) can be generated via:
python benchmark/benchmark.py
TCCG provides a benchmark for tensor contractions.
python benchmark.py
This will generate the input files (.tccg) for TCCG for each of the test cases within the benchmark. The tensor contractions within the benchmark are collected from four different publications to cover a broad range of use cases (see paper, Sec. 7.1); this being said, we don't claim that this benchmark is exhaustive in any sense. If you think that the benchmark is missing certain tensor contractions or sizes, please feel free to contribute to the benchmark.
Since this benchmark may evolve over time and to make comparisons easier, please refer to the current version of the benchmark.
Benchmark version: v0.1
The product of the sizes corresponding to the free indices of each input tensor needs to be a multiple of 24. This limitation will be lifted in a future version of GETT.
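This constraint can be checked ahead of time. Below is a small hypothetical helper (not part of TCCG) that verifies it for the example contraction C[a,b,i,j] = A[i,m,a] * B[m,j,b] shown above, where an input tensor's free indices are those that also appear in the output:

```python
# Hypothetical helper (not part of TCCG): check GETT's current requirement
# that the product of the free-index sizes of each input tensor is a
# multiple of 24. The free indices of an input tensor are those indices
# that also appear in the output tensor C (i.e., the non-contracted ones).

def free_index_product_ok(output_indices, input_indices, sizes, multiple=24):
    product = 1
    for idx in input_indices:
        if idx in output_indices:  # idx is free, not contracted
            product *= sizes[idx]
    return product % multiple == 0

# Example from above: C[a,b,i,j] = A[i,m,a] * B[m,j,b], all sizes 24.
sizes = {'a': 24, 'b': 24, 'i': 24, 'j': 24, 'm': 24}
print(free_index_product_ok('abij', 'ima', sizes))  # True: A's free product is 24*24 = 576
print(free_index_product_ok('abij', 'mjb', sizes))  # True: B's free product is 24*24 = 576
```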
In case you want to refer to TCCG as part of a research paper, please cite the following article:
@article{tccg2016a,
  author        = {Paul Springer and Paolo Bientinesi},
  title         = {{Design of a high-performance GEMM-like Tensor-Tensor Multiplication}},
  archivePrefix = "arXiv",
  eprint        = {1607.00145},
  primaryClass  = "quant-ph",
  journal       = {CoRR},
  year          = {2016},
  issue_date    = {July 2016},
  url           = {http://arxiv.org/abs/1607.00145}
}
V0.2.0:
- GETT is now also parallelized
- This branch now uses the High-Performance Tensor Transposition (HPTT) library, which significantly reduces the compile time
We are happy about any feedback or feature requests. Please contact springer@aices.rwth-aachen.de.
We also welcome any contributions to the code base or the benchmark.