The NAS Parallel Benchmarks for evaluating C++ parallel programming frameworks on shared-memory architectures

GMAP/NPB-CPP

The NPB's Fortran codes were carefully ported to C++ and are fully compliant with the NPB3.4.1 version (NPB official webpage). Our paper contains abundant information on how the porting was conducted and discusses the performance we obtained with NPB-CPP on different machines (Intel Xeon, AMD Epyc, and IBM Power8) and compilers (GCC, ICC, and Clang). Results showed that NPB-CPP achieves performance similar to the original NPB. You can use our papers, along with the official reports, as a guide to assess performance using the NPB suite.

🔉News: A new parallel implementation is now available using the Parallel STL (PSTL). 📅24/Jan/2025

How to cite our works

[DOI] J. Löff, D. Griebler, G. Mencagli et al., The NAS Parallel Benchmarks for evaluating C++ parallel programming frameworks on shared-memory architectures, Future Generation Computer Systems (FGCS) (2021)

[DOI] J. Löff; R. B. Hoffmann; A. S. Bianchessi; L. Mallmann; D. Griebler; W. Binder. NPB-PSTL: C++ STL Algorithms with Parallel Execution Policies in NAS Parallel Benchmarks. In: 2025 33rd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), 2025, Turin, Italy.

This repository provides parallel codes for the NAS Parallel Benchmarks (NPB) written with different C++ parallel programming APIs. You can also contribute to this project by opening issues and pull requests.

The conventions we used in our porting can be found here.

===================================================================
  NAS Parallel Benchmarks in C++ using OpenMP, FastFlow, Intel TBB,
  and C++ parallel STL algorithms.

  This project was conducted in the Parallel Applications
  Modelling Group (GMAP) at PUCRS - Brazil.

  GMAP Research Group leaders:
    Luiz Gustavo Leão Fernandes (PUCRS)
    Dalvan Griebler (PUCRS)

  Code contributors:
    Dalvan Griebler (PUCRS)
    Gabriell Araujo (PUCRS)
    Júnior Löff (PUCRS)

  In case of questions or problems, please send an e-mail to us:
    dalvan.griebler@acad.pucrs.br
    gabriell.araujo@edu.pucrs.br
    junior.loff@edu.pucrs.br

  We would like to thank the following researchers for the
  fruitful discussions:
    Gabriele Mencagli (UNIPI)
    Massimo Torquati (UNIPI)
    Marco Danelutto (UNIPI)
===================================================================

Folders inside the project:

NPB-SER - This directory contains the sequential version.

NPB-OMP - This directory contains the parallel version implemented with OpenMP (based on the original NPB version).

NPB-TBB - This directory contains the parallel version implemented with Threading Building Blocks.

NPB-FF - This directory contains the parallel version implemented with FastFlow.

NPB-PSTL - This directory contains the parallel version implemented with C++ parallel STL algorithms.
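To give a feel for what a shared-memory parallel version looks like, here is a minimal OpenMP sketch. It is not taken from NPB-CPP; `sum_of_squares` is a hypothetical helper that only illustrates the parallel-loop-with-reduction pattern that OpenMP codes commonly use.

```cpp
#include <vector>

// Hypothetical helper (not part of NPB-CPP): sums the squares of the
// elements of `v`. When compiled with OpenMP enabled (e.g. -fopenmp on GCC),
// the iterations are split across threads and the per-thread partial sums
// are combined by the reduction clause. Without OpenMP the pragma is ignored
// and the loop runs sequentially, giving the same result up to
// floating-point rounding.
double sum_of_squares(const std::vector<double>& v) {
    double sum = 0.0;
    const long n = static_cast<long>(v.size());
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += v[i] * v[i];
    }
    return sum;
}
```

Calling `sum_of_squares({1.0, 2.0, 3.0})` sums 1 + 4 + 9 regardless of how many threads run the loop.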

The Five Kernels and Three Pseudo-applications

Each directory is independent and contains its own implemented version of the kernels and pseudo-applications:

Kernels

EP - Embarrassingly Parallel, floating-point operation capacity

MG - Multi-Grid, non-local memory accesses, short- and long-distance communication

CG - Conjugate Gradient, irregular memory accesses and communication

FT - discrete 3D fast Fourier Transform, intensive long-distance communication

IS - Integer Sort, integer computation and communication

Pseudo-applications

BT - Block Tri-diagonal solver

SP - Scalar Penta-diagonal solver

LU - Lower-Upper Gauss-Seidel solver

Tip: The performance of the pseudo-applications is bounded by the sequential partial differential equation (PDE) solver.

Software Requirements

Warning: our tests were run with GCC-9 and ICC-19.

How to Compile

Enter the directory of the desired version and run:

$ make _BENCHMARK CLASS=_WORKLOAD

_BENCHMARKs are:

EP, CG, MG, IS, FT, BT, SP and LU

_WORKLOADs are:

Class S: small for quick test purposes

Class W: workstation size (a 90's workstation; now likely too small)

Classes A, B, C: standard test problems; ~4X size increase going from one class to the next

Classes D, E, F: large test problems; ~16X size increase from each of the previous Classes
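The growth between classes compounds quickly. The sketch below works out the rough size of each class relative to class A, using only the approximate ratios quoted above (about 4X per step from A to C, about 16X per step from C to F). The function name `approx_size_relative_to_A` is hypothetical, and the values are estimates rather than exact problem sizes.

```cpp
#include <stdexcept>

// Rough problem-size factor of a workload class relative to class A.
// Assumes exactly 4x per step for A -> B -> C and exactly 16x per step for
// C -> D -> E -> F; the real ratios are only approximately these values.
// Classes S and W are rejected because no ratio is quoted for them.
long approx_size_relative_to_A(char cls) {
    switch (cls) {
        case 'A': return 1;
        case 'B': return 4;                      // 4 * A
        case 'C': return 4L * 4;                 // 4 * B
        case 'D': return 4L * 4 * 16;            // 16 * C
        case 'E': return 4L * 4 * 16 * 16;       // 16 * D
        case 'F': return 4L * 4 * 16 * 16 * 16;  // 16 * E
        default:
            throw std::invalid_argument("no size ratio quoted for this class");
    }
}
```

Under these assumptions class D is roughly 256 times the size of class A, which is worth keeping in mind before picking a class for a machine with limited memory.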

Command example:

$ make ep CLASS=A

How to Execute

Binaries are generated inside the bin folder

Command example:

$ ./bin/ep.A
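If you script many runs, the two command examples above can be generated from a benchmark name and a class. The sketch below assumes that every binary follows the `bin/<benchmark>.<CLASS>` pattern seen in the `ep.A` example, with the benchmark name in lowercase; this is an extrapolation from that single example. The helper names are hypothetical.

```cpp
#include <cctype>
#include <string>

// Lowercases an ASCII benchmark name such as "EP" -> "ep".
std::string to_lower(std::string s) {
    for (char& c : s) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return s;
}

// Builds the compile command, e.g. ("EP", 'A') -> "make ep CLASS=A".
std::string make_command(const std::string& benchmark, char cls) {
    return "make " + to_lower(benchmark) + " CLASS=" + std::string(1, cls);
}

// Builds the assumed binary path, e.g. ("EP", 'A') -> "./bin/ep.A".
// Assumption: the naming pattern of the ep.A example holds for every
// benchmark and class.
std::string binary_path(const std::string& benchmark, char cls) {
    return "./bin/" + to_lower(benchmark) + "." + std::string(1, cls);
}
```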

Compiler and Parallel Configurations

Each folder contains a default compiler configuration that can be modified in the config/make.def file. You must use this file if you want to modify the target compiler, flags, or linked libraries that will be used to compile the applications.

Parallel Execution

Using and configuring the parallel programming frameworks

The repository already has an additional directory, libs, with the FastFlow and Intel TBB libraries.

For TBB, you need to compile the library and load its environment variables. To do so, enter libs/tbb-2020.1 and execute the following command:

$ make

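This command will generate a folder inside libs/tbb-2020.1/build. Finally, you can load the TBB variables with the tbbvars.sh script, for example by executing the following command in your terminal (the name of the generated subfolder depends on your system and compiler version, so adjust the path accordingly):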

$ source libs/tbb-2020.1/build/linux_intel64_gcc_cc7.5.0_libc2.27_kernel4.15.0_release/tbbvars.sh

Setting the degree of parallelism (NUM_THREADS)

The degree of parallelism can be set using the *RUNTIME*_NUM_THREADS environment variable.

Command example:

$ export OMP_NUM_THREADS=32

or TBB_NUM_THREADS and FF_NUM_THREADS
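For reference, a program can read a variable such as TBB_NUM_THREADS or FF_NUM_THREADS as sketched below. This is a minimal illustration and not the repository's actual code: `threads_from_env` is a hypothetical helper, and falling back to a default on a missing or invalid value is an assumption. OMP_NUM_THREADS is normally read by the OpenMP runtime itself, so no such code is needed for it.

```cpp
#include <climits>
#include <cstdlib>

// Hypothetical helper (not part of NPB-CPP): returns the thread count stored
// in the environment variable `name`, or `fallback` when the variable is
// unset, empty, not a whole number, or outside the range [1, INT_MAX].
int threads_from_env(const char* name, int fallback) {
    const char* value = std::getenv(name);
    if (value == nullptr) {
        return fallback;
    }
    char* end = nullptr;
    const long parsed = std::strtol(value, &end, 10);
    const bool no_digits = (end == value);
    const bool trailing_garbage = (*end != '\0');
    if (no_digits || trailing_garbage || parsed < 1 || parsed > INT_MAX) {
        return fallback;
    }
    return static_cast<int>(parsed);
}
```

After `export TBB_NUM_THREADS=32`, a call such as `threads_from_env("TBB_NUM_THREADS", 1)` would yield 32, and 1 if the variable were unset.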

