The NAS Parallel Benchmarks for evaluating C++ parallel programming frameworks on shared-memory architectures
GMAP/NPB-CPP
The NPB's Fortran codes were carefully ported to C++ and are fully compliant with the NPB3.4.1 version (NPB official webpage). Our paper contains abundant information on how the porting was conducted and discusses the resulting performance we obtained with NPB-CPP on different machines (Intel Xeon, AMD Epyc, and IBM Power8) and compilers (GCC, ICC, and Clang). Results showed that we achieved similar performance with NPB-CPP compared to the original NPB. You can use our papers, along with the official reports, as a guide to assess performance using the NPB suite.
🔉News: A new parallel implementation is now available using the Parallel STL (PSTL). 📅24/Jan/2025
[DOI] J. Löff, D. Griebler, G. Mencagli et al., The NAS Parallel Benchmarks for evaluating C++ parallel programming frameworks on shared-memory architectures, Future Generation Computer Systems (FGCS) (2021)
[DOI] J. Löff, R. B. Hoffmann, A. S. Bianchessi, L. Mallmann, D. Griebler, W. Binder, NPB-PSTL: C++ STL Algorithms with Parallel Execution Policies in NAS Parallel Benchmarks, in: 2025 33rd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), 2025, Turin, Italy.
This repository provides parallel codes for the NAS Parallel Benchmarks (NPB) written with different C++ parallel programming APIs. You can also contribute to this project by opening issues and pull requests.
The conventions we used in our porting can be found here
===================================================================
NAS Parallel Benchmarks in C++ using OpenMP, FastFlow, Intel TBB, and C++ parallel STL algorithms.

This project was conducted in the Parallel Applications Modelling Group (GMAP) at PUCRS - Brazil.

GMAP Research Group leaders:
    Luiz Gustavo Leão Fernandes (PUCRS)
    Dalvan Griebler (PUCRS)

Code contributors:
    Dalvan Griebler (PUCRS)
    Gabriell Araujo (PUCRS)
    Júnior Löff (PUCRS)

In case of questions or problems, please send an e-mail to us:
    dalvan.griebler@acad.pucrs.br
    gabriell.araujo@edu.pucrs.br
    junior.loff@edu.pucrs.br

We would like to thank the following researchers for the fruitful discussions:
    Gabriele Mencagli (UNIPI)
    Massimo Torquati (UNIPI)
    Marco Danelutto (UNIPI)
===================================================================
NPB-SER - This directory contains the sequential version.
NPB-OMP - This directory contains the parallel version implemented with OpenMP (based on the original NPB version).
NPB-TBB - This directory contains the parallel version implemented with Threading Building Blocks.
NPB-FF - This directory contains the parallel version implemented with FastFlow.
NPB-PSTL - This directory contains the parallel version implemented with C++ parallel STL algorithms.
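To give a feel for how a sequential version and an OpenMP version of the same computation can differ, here is a minimal sketch. It is a hypothetical loop written for illustration only, not code taken from this repository; the function names `sum_sequential` and `sum_openmp` are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Sequential reference: accumulate every element in order.
long long sum_sequential(const std::vector<int>& values) {
    long long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}

// OpenMP variant of the same loop (illustrative, not from NPB-CPP).
// The reduction clause gives each thread a private partial sum and
// combines the partial sums when the loop ends.
long long sum_openmp(const std::vector<int>& values) {
    long long total = 0;
    const long long n = static_cast<long long>(values.size());
    #pragma omp parallel for reduction(+ : total)
    for (long long i = 0; i < n; ++i) {
        total += values[static_cast<std::size_t>(i)];
    }
    return total;
}
```

When compiled without OpenMP support the pragma is ignored and both functions run sequentially, so the two results should match in either case.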
Each directory is independent and contains its own implemented version of the kernels and pseudo-applications:
Kernels:
EP - Embarrassingly Parallel, floating-point operation capacity
MG - Multi-Grid, non-local memory accesses, short- and long-distance communication
CG - Conjugate Gradient, irregular memory accesses and communication
FT - discrete 3D fast Fourier Transform, intensive long-distance communication
IS - Integer Sort, integer computation and communication
Pseudo-applications:
BT - Block Tri-diagonal solver
SP - Scalar Penta-diagonal solver
LU - Lower-Upper Gauss-Seidel solver
Tip: The pseudo-applications' performance is bounded by the sequential partial differential equation (PDE) solver.
Warning: our tests were made with GCC-9 and ICC-19
Enter the directory of the desired version and execute:
$ make _BENCHMARK CLASS=_WORKLOAD
_BENCHMARKs are:
EP, CG, MG, IS, FT, BT, SP and LU
_WORKLOADs are:
Class S: small, for quick test purposes
Class W: workstation size (a '90s workstation; now likely too small)
Classes A, B, C: standard test problems; ~4X size increase going from one class to the next
Classes D, E, F: large test problems; ~16X size increase from each of the previous classes
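The growth factors quoted for the workload classes compound quickly. The sketch below works through that arithmetic, taking class A as the unit and assuming the ~16X factor applies to each step C to D, D to E, and E to F. The helper name `relative_size` is hypothetical, and the values are rough estimates derived only from the factors above, not measured problem sizes.

```cpp
// Approximate problem size of a workload class relative to class A.
// Uses only the quoted growth factors: ~4x per step for A -> B -> C,
// and ~16x per step for C -> D -> E -> F (assumed to apply to each step).
// Classes S and W have no stated factor, so they return -1.0.
double relative_size(char workload_class) {
    switch (workload_class) {
        case 'A': return 1.0;
        case 'B': return 4.0;                        // 4 x A
        case 'C': return 4.0 * 4.0;                  // 4 x B
        case 'D': return 16.0 * 16.0;                // 16 x C
        case 'E': return 16.0 * 16.0 * 16.0;         // 16 x D
        case 'F': return 16.0 * 16.0 * 16.0 * 16.0;  // 16 x E
        default:  return -1.0;                       // no factor given
    }
}
```

Under these assumptions, class C is about 16 times class A, while class F is about 65536 times class A, which is why the larger classes are reserved for big machines.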
Command example:
$ make ep CLASS=A
Binaries are generated inside the bin folder
Command example:
$ ./bin/ep.A
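The two command examples suggest a naming pattern: building benchmark `ep` with `CLASS=A` yields `bin/ep.A`. A script that drives several builds could generate both strings as sketched below. The helpers `make_command` and `binary_path` are hypothetical names, and the pattern is inferred from the single example above rather than from the makefiles.

```cpp
#include <string>

// Build command for one benchmark and workload class,
// e.g. ("ep", 'A') gives "make ep CLASS=A".
std::string make_command(const std::string& benchmark, char workload_class) {
    return "make " + benchmark + " CLASS=" + workload_class;
}

// Path of the resulting binary, assuming the bin/<benchmark>.<CLASS>
// pattern seen in the example (an inference, not a documented rule).
std::string binary_path(const std::string& benchmark, char workload_class) {
    return "bin/" + benchmark + "." + workload_class;
}
```

Both strings are relative to the version directory (for example NPB-OMP), since each directory is built independently.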
Each folder contains a default compiler configuration that can be modified in the config/make.def file. You must use this file if you want to modify the target compiler, flags, or link settings that will be used to compile the applications.
The repository already includes an additional directory, libs, with the FastFlow and Intel TBB libraries.
For TBB, you need to compile the library and load its environment variables. To do so, enter libs/tbb-2020.1 and execute the following command:
$ make
This command will generate a folder inside libs/tbb-2020.1/build. Finally, you can load the TBB environment variables with the tbbvars.sh script, for example, by executing the following command in your terminal:
$ source libs/tbb-2020.1/build/linux_intel64_gcc_cc7.5.0_libc2.27_kernel4.15.0_release/tbbvars.sh
The degree of parallelism can be set using the *RUNTIME*_NUM_THREADS environment variable.
Command example:
$ export OMP_NUM_THREADS=32
or, in the same way, TBB_NUM_THREADS and FF_NUM_THREADS.
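As an illustration of how a program might consume such a variable, the sketch below reads `<PREFIX>_NUM_THREADS` from the environment and falls back to a default when the variable is missing or not a positive integer. The helper `threads_from_env` is hypothetical; this is not how NPB-CPP itself reads the setting, and OpenMP runtimes handle `OMP_NUM_THREADS` on their own.

```cpp
#include <cstdlib>
#include <string>

// Read <runtime_prefix>_NUM_THREADS (e.g. "OMP", "TBB", "FF") from the
// environment. Returns `fallback` if the variable is unset, empty,
// not a plain decimal integer, or not strictly positive.
unsigned threads_from_env(const std::string& runtime_prefix, unsigned fallback) {
    const std::string name = runtime_prefix + "_NUM_THREADS";
    const char* raw = std::getenv(name.c_str());
    if (raw == nullptr) {
        return fallback;
    }
    char* end = nullptr;
    const long value = std::strtol(raw, &end, 10);
    if (end == raw || *end != '\0' || value <= 0) {
        return fallback;
    }
    return static_cast<unsigned>(value);
}
```

With `export TBB_NUM_THREADS=32` in the shell, `threads_from_env("TBB", 1)` would return 32; with the variable unset it would return the fallback.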