Simple wrapper for PAPI most common used functions
UDC-GAC/papi_wrapper
Simple wrapper for PAPI's most commonly used functions: set up events, start, stop and print counters within a region of interest. Within that region you can also define subregions in order to count them separately. The library thus simplifies setting up low-level PAPI features such as domain, granularity or overflow.
PAPI's low-level API allows programmers to program different hardware counters while executing a program. However, programming those counters may add a lot of complex code to the original program, and configuring PAPI properly can be tedious. For these reasons we have created a set of macros that simplify the problem and configure PAPI at compilation time, using either flags or files.
This interface works for multithreaded programs using OpenMP; in fact, this was the main reason for developing it. See the Usage and Options sections for further details.
You only need to rewrite your code as follows (the `<papi.h>` header files are already included in `papi_wrapper`):

    #include <papi_wrapper.h>
    ...
    pw_init_instruments;  /* initialize counters */
    pw_start_instruments; /* starts automatically PAPI counters */
    /* region of interest (ROI) to measure */
    pw_stop_instruments;  /* stop counting */
    pw_print_instruments; /* print results */

This way, you only need to add the header `#include <papi_wrapper.h>` to the source code and compile with `-I/source/to/papi-wrapper papi_wrapper.c -lpapi`. Another way to use this wrapper library, using subregions, could be:

    #include <papi_wrapper.h>
    ...
    pw_init_start_instruments_sub(2); /* initialize counters */
    /* region of interest (ROI) to measure */
    pw_begin_subregion(0);
    /* subregion 0 */
    pw_end_subregion(0);
    pw_begin_subregion(1);
    /* subregion 1 */
    pw_end_subregion(1);
    pw_stop_instruments; /* stop counting */
    pw_print_subregions; /* print results */

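To make the per-subregion bookkeeping more concrete, here is a minimal sketch that does not use PAPI at all: a simulated counter stands in for a hardware event, and each subregion accumulates the difference between its begin and end readings. All `demo_*` names are hypothetical and are not part of `papi_wrapper`. That repeated begin/end pairs on the same subregion add up is an assumption of this sketch, not a documented guarantee of the library.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_MAX_SUBREGIONS 8 /* hypothetical limit, for this sketch only */

/* Simulated hardware counter; a real run would read a PAPI event set. */
static long long demo_counter = 0;

static long long demo_start[DEMO_MAX_SUBREGIONS]; /* reading at begin */
static long long demo_total[DEMO_MAX_SUBREGIONS]; /* accumulated deltas */
static int demo_nsub = 0;

/* Stand-in for real work: advances the simulated counter by `events`. */
static void demo_work(long long events) { demo_counter += events; }

/* Reserve `n` subregions and reset their totals. */
static void demo_init_sub(int n) {
    assert(n > 0 && n <= DEMO_MAX_SUBREGIONS);
    demo_nsub = n;
    for (int i = 0; i < n; ++i) {
        demo_start[i] = 0;
        demo_total[i] = 0;
    }
}

/* Remember the counter value when subregion `id` begins. */
static void demo_begin_subregion(int id) {
    assert(id >= 0 && id < demo_nsub);
    demo_start[id] = demo_counter;
}

/* Add the events seen since the matching begin to subregion `id`. */
static void demo_end_subregion(int id) {
    assert(id >= 0 && id < demo_nsub);
    demo_total[id] += demo_counter - demo_start[id];
}

static long long demo_subregion_total(int id) {
    assert(id >= 0 && id < demo_nsub);
    return demo_total[id];
}

/* Print one line per subregion. */
static void demo_print_subregions(void) {
    for (int i = 0; i < demo_nsub; ++i)
        printf("subregion %d: %lld\n", i, demo_total[i]);
}
```

Calling `demo_init_sub(2)`, wrapping two pieces of `demo_work` in begin/end pairs for ids 0 and 1, and then calling `demo_print_subregions()` mirrors the shape of the subregion example above. Work done outside any begin/end pair is not attributed to a subregion.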
TL;DR: macros available in `papi_wrapper`:
- `pw_set_thread_report`: set thread to measure when single thread.
- `pw_init_instruments`: init PAPI and flush caches.
- `pw_init_start_instruments`: wrapper for `pw_init_instruments` and `pw_start_instruments`.
- `pw_init_start_instruments_sub(n)`: init library and set number of regions to measure.
- `pw_start_instruments`: start counting.
- `pw_stop_instruments`: stop counting.
- `pw_start_instruments_loop(n)`: to use within a parallel loop, e.g. `#pragma omp parallel for`.
- `pw_stop_instruments_loop(n)`: to use within a parallel loop, e.g. `#pragma omp parallel for`.
- `pw_begin_subregion(n)`: start measuring region `n`.
- `pw_end_subregion(n)`: stop measuring region `n`.
- `pw_print_instruments`: print counters.
- `pw_print_subregions`: print counters by subregion measured.
For more examples, refer to the `tests` subdirectory. They can be executed with `CTest`.
- PAPI >=5.x
- GCC C compiler >=8.x
- PAPI library
- Doxygen >=1.8
In order to differentiate PAPI macros from PAPI_wrapper macros, all PAPI_wrapper options begin with the mnemonic prefix `PW_`.
High-level configuration parameters:
- `-DPW_THREAD_MONITOR` - default value `0`. Indicates the master thread if `PW_MULTITHREAD` is also enabled.
- `-DPW_MULTITHREAD` - disabled by default. If not defined, only `PW_THREAD_MONITOR` will count events (only one thread). This option is not compatible with uncore events; it basically makes the PAPI library crash. Needs to be compiled with `-fopenmp`.
- `-DPW_VERBOSE` - disabled by default. More text in the output and errors.
- `-DPW_CSV` - disabled by default. Prints in CSV format using a comma (`-DPW_CSV_SEPARATOR=","`) as divider. The first row contains the thread number and the names of the hardware counters used; each following row contains a thread and its counter values.
- `-DPW_FILE` - print output to the file specified by `-DPW_FILENAME=<file>` (defaults to `/tmp/__tmp_papi_wrapper.output`), instead of standard output.
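As an illustration of the CSV layout described for `PW_CSV`, the following sketch builds a header row and one data row per thread. The `demo_csv_*` helpers are hypothetical and are not taken from `papi_wrapper`; the `thread` label used for the first column is also an assumption, since the exact header text is not stated here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Falls back to a comma when no separator is given at compile time. */
#ifndef PW_CSV_SEPARATOR
#define PW_CSV_SEPARATOR ","
#endif

/* Hypothetical helper: writes "thread<sep>name0<sep>name1..." into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
static int demo_csv_header(char *buf, size_t size,
                           const char *const *names, int n) {
    assert(buf != NULL && names != NULL);
    int len = snprintf(buf, size, "thread");
    for (int i = 0; i < n && len >= 0 && (size_t)len < size; ++i) {
        int w = snprintf(buf + len, size - (size_t)len, "%s%s",
                         PW_CSV_SEPARATOR, names[i]);
        len = (w < 0) ? -1 : len + w;
    }
    return (len >= 0 && (size_t)len < size) ? len : -1;
}

/* Hypothetical helper: writes "<thread><sep>v0<sep>v1..." into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
static int demo_csv_row(char *buf, size_t size, int thread,
                        const long long *values, int n) {
    assert(buf != NULL && values != NULL);
    int len = snprintf(buf, size, "%d", thread);
    for (int i = 0; i < n && len >= 0 && (size_t)len < size; ++i) {
        int w = snprintf(buf + len, size - (size_t)len, "%s%lld",
                         PW_CSV_SEPARATOR, values[i]);
        len = (w < 0) ? -1 : len + w;
    }
    return (len >= 0 && (size_t)len < size) ? len : -1;
}
```

With two counters named `PAPI_TOT_CYC` and `PAPI_L1_DCM`, the header produced by this sketch is `thread,PAPI_TOT_CYC,PAPI_L1_DCM`, followed by one row such as `0,1200,34` per thread. Compiling with a different `-DPW_CSV_SEPARATOR` changes the divider in both rows.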
Low-level configuration parameters (refer to PAPI for further information):
- `-DPW_GRN=<granularity>` - default value `PAPI_GRN_MIN`.
- `-DPW_DOM=<domain>` - default value `PAPI_DOM_ALL`.
- `-DPW_SAMPLING` - disabled by default. Enables sampling for all the events specified in `PW_FLIST` with thresholds specified in `PW_FSAMPLE`.
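Options passed with `-D` are ordinary preprocessor definitions, so one common way to give them defaults is an `#ifndef` guard per option. The sketch below shows that pattern for a few of the options listed above, using the documented default values. It is not copied from the `papi_wrapper` sources, and the `demo_*` helpers are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Documented defaults; each can be overridden with -D<name>=<value>. */
#ifndef PW_THREAD_MONITOR
#define PW_THREAD_MONITOR 0
#endif
#ifndef PW_FILENAME
#define PW_FILENAME "/tmp/__tmp_papi_wrapper.output"
#endif

/* Hypothetical helper: 1 if built with -DPW_MULTITHREAD, else 0. */
static int demo_is_multithread(void) {
#ifdef PW_MULTITHREAD
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if built with -DPW_FILE, else 0. */
static int demo_writes_to_file(void) {
#ifdef PW_FILE
    return 1;
#else
    return 0;
#endif
}

/* Print the configuration this translation unit was compiled with. */
static void demo_print_config(void) {
    assert(PW_THREAD_MONITOR >= 0);
    printf("monitored thread: %d\n", PW_THREAD_MONITOR);
    printf("multithread: %s\n", demo_is_multithread() ? "yes" : "no");
    printf("output: %s\n", demo_writes_to_file() ? PW_FILENAME : "stdout");
}
```

Built without any `-D` flags, this sketch reports thread 0, no multithreading, and standard output. In this sketch `PW_FILENAME` must expand to a string literal, so an override would need shell quoting such as `-DPW_FILENAME='"results.txt"'`; whether the real library expects the same quoting is not confirmed here.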
Configuration files (see their format in circleci/circleci-docs/tree/teesloane-patch-5
PAPI wrapper may be precompiled and linked to your executable or compiled directly with your sources. PAPI wrapper basically initializes a PAPI event set for each counter specified in `PW_FLIST`. If multithreading is enabled, all threads count events individually and simultaneously, but one counter at a time. Multiplexing is an experimental feature in PAPI that should be avoided, since in PAPI's discussions there is some skepticism regarding its reliability.
When talking about the overhead of a library, we can consider its cost in execution time or in memory. In either case, the overheads are:
- Costs of the PAPI library: initializing the library, starting counters, stopping them. For further details refer to the `papi_cost` utility.
- Costs of PAPI wrapper: iterating over the list of events to measure. Each counter is measured individually, so there is no concurrency at all when it comes to measuring different events. There is only concurrency when measuring different threads: they are all measured at the same time.
List of known issues when testing:
- Do not mix uncore and non-uncore events in the list: undefined behavior.
- Uncore events must explicitly have the `:cpu=X` flag on them.
- With major version 1.0.0, PAPI wrapper introduces subregions, which basically permit measuring different regions of code simultaneously and individually. Nonetheless, if the region or subregion measured is an order of magnitude smaller than the cost of the PAPI library itself (refer to Overhead), the result will have too much noise.
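The noise warning above can be turned into a quick sanity check on a measurement. The sketch below flags a region whose measured cycle count is at least ten times smaller than the PAPI cost. The function name is hypothetical, and the cost value is assumed to come from a separate measurement, for example one taken with the `papi_cost` utility.

```c
#include <assert.h>

/* Hypothetical check: returns 1 when the region is at least an order of
 * magnitude (10x) smaller than the cost of the PAPI calls themselves,
 * so the reading is likely dominated by measurement overhead. */
static int demo_is_too_noisy(long long region_cycles,
                             long long papi_cost_cycles) {
    assert(region_cycles >= 0 && papi_cost_cycles >= 0);
    /* Divide instead of multiplying to avoid overflow on large counts. */
    return region_cycles <= papi_cost_cycles / 10;
}
```

For instance, with a PAPI cost of 1000 cycles, a region of 50 cycles is flagged while a region of 100000 cycles is not. The factor of ten is only one reading of "an order of magnitude"; a stricter threshold may be appropriate for fine-grained subregions.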
Refer to the Releases webpage. Versioning follows Semantic Versioning 2.0.0.
Maintainer:
- Marcos Horro (marcos.horro (at) udc.gal)
Authors:
- Marcos Horro
- Dr. Gabriel Rodríguez
This version is based on PolyBench, under GPLv2 license.
MIT License.