PAPI

Overview

The Performance Application Programming Interface (PAPI) offers a universal interface and methodology for gathering performance counter information from diverse hardware and software components. This includes major CPUs, GPUs, accelerators, interconnects, I/O systems, power interfaces, and even virtual cloud environments. Collaborations with industry leaders such as AMD, Cray/HPE, IBM, Intel, NVIDIA, and others ensure that PAPI integrates with new architectures as they are introduced or approach release. As the PAPI component architecture expands, third-party performance tools that interface with PAPI gain the ability to measure data from these emerging architectures.

In 2024, PAPI released version 7.2.0b1 as a beta release. This release introduced a new component, rocp_sdk, that supports AMD GPUs/APUs through the ROCprofiler-SDK interface, which is still under development and testing. The release also included general improvements to the PAPI code that enhance both design and functionality, such as new PAPI preset events for AMD Zen5 and Intel Ice Lake, as well as various bug fixes.

Latest Releases

PAPI 7.2.0b2
2025-03-05
PAPI 7.2.0b2 Release

PAPI 7.2.0b2 is now available as a beta release. This release introduces improvements to the rocp_sdk component, which supports AMD GPUs/APUs through the ROCprofiler-SDK interface, currently still under development and testing. The release also includes general improvements to the PAPI code, enhancing both design and functionality, as well as various bug fixes.

Additional major changes:

  • AMD ROCprofiler-SDK component (rocp_sdk): Support for sampling (device profiling) mode, and multiple devices
    • Tested on Instinct MI50, MI210, MI250x, and MI300a
    • Sampling functionality has been tested successfully with ROCm-6.3.2. Earlier versions might lead to unexpected behavior.
  • CUDA component:
    • Added support for heterogeneous systems
    • Added support for "device" qualifier to reduce papi_native_avail output length
  • Updated libpfm4 to latest commit 762ca94010d9a8f21f0440c0b5807e9a2e849420
  • AMD power: Added support for family 25 (19h) processors in the RAPL component
  • Intel power: Added support for RaptorLake in the RAPL component
  • IBM POWER10: Added preset events
  • Updated papi_events.csv to remove deprecated preset events
  • Improvements in the Counter Analysis Toolkit (CAT)
  • Added tests for the lmsensors component
  • Allow user to optionally disable perf_event, perf_events_uncore, and cpu
  • Sysdetect: allow users to disable component
  • papi_mem_info: added support for ARM Neoverse V2
  • Testing: run_tests.sh tests only active components

Acknowledgements

This release is the result of efforts from many people. The PAPI team would like to express special thanks to Vince Weaver, Stephane Eranian (for libpfm4), William Cohen, Steve Kaufmann, Peinan Zhang, Rashawn Knapp and Phil Mucci.

Download papi-7.2.0b2.tar.gz

To verify the integrity of the download, run md5sum papi-7.2.0b2.tar.gz and compare the resulting MD5 hash with:

f75e93d9b3abdeb7736a713f9db76b73
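
On systems without the md5sum utility, the same check can be sketched in Python. This is a minimal illustration, not part of the PAPI distribution: the helper names md5_of_file and verify_download are hypothetical, and the tarball is assumed to sit in the current directory.

```python
import hashlib
from pathlib import Path

# MD5 hash published on this page for papi-7.2.0b2.tar.gz.
EXPECTED_MD5 = "f75e93d9b3abdeb7736a713f9db76b73"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected hash."""
    return md5_of_file(path).lower() == expected.strip().lower()


if __name__ == "__main__":
    # Assumption: the tarball was saved under its original name in the cwd.
    tarball = Path("papi-7.2.0b2.tar.gz")
    if tarball.exists():
        print("OK" if verify_download(tarball) else "MD5 mismatch")
    else:
        print(f"{tarball} not found")
```

A matching MD5 hash indicates that the file was not corrupted in transit; it is not a defense against deliberate tampering.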


Related Links

Papers

Jagode, H., A. Danalis, G. Congiu, D. Barry, A. Castaldo, and J. Dongarra, “Advancements of PAPI for the exascale generation,” The International Journal of High Performance Computing Applications, December 2024. DOI: 10.1177/10943420241303884
Barry, D., H. Jagode, A. Danalis, and J. Dongarra, “Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,” 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, Florida, IEEE, August 2023. DOI: 10.1109/IPDPSW59300.2023.00070 (1.81 MB)
Danalis, A., and H. Jagode, “Performance Application Programming Interface,” Accelerated Computing with HIP: Sun, Baruah and Kaeli, December 2022.
Dongarra, J., H. Jagode, A. Danalis, D. Barry, and V. Weaver, “Performance Application Programming Interface for Extreme-Scale Environments (PAPI-EX) (Poster),” Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, 20 2020. (2.53 MB)
Barry, D., A. Danalis, and H. Jagode, “Effortless Monitoring of Arithmetic Intensity with PAPI's Counter Analysis Toolkit,” 13th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Springer International Publishing, September 2020. (738.47 KB)
Jagode, H., A. Danalis, and D. Genet, “Roadmap for Refactoring Classic PAPI to PAPI++: Part II: Formulation of Roadmap Based on Survey Results,” PAPI++ Working Notes, no. 2, ICL-UT-20-09: Innovative Computing Laboratory, University of Tennessee, July 2020. (763.75 KB)
Jagode, H., A. Danalis, and J. Dongarra, “Exa-PAPI: The Exascale Performance API with Modern C++,” Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020. (556.78 KB)
Jagode, H., and A. Danalis, “PULSE: PAPI Unifying Layer for Software-Defined Events (Poster),” Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, February 2020. (1.86 MB)
Winkler, F., “Redesigning PAPI's High-Level API,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-03: University of Tennessee, February 2020. (356.41 KB)
Jagode, H., A. Danalis, and J. Dongarra, “Formulation of Requirements for New PAPI++ Software Package: Part I: Survey Results,” PAPI++ Working Notes, no. 1, ICL-UT-20-02: Innovative Computing Laboratory, University of Tennessee Knoxville, January 2020. (1.49 MB)
Jagode, H., A. Danalis, H. Anzt, and J. Dongarra, “PAPI Software-Defined Events for in-Depth Performance Analysis,” The International Journal of High Performance Computing Applications, vol. 33, issue 6, pp. 1113-1127, November 2019. (442.39 KB)
Davis, J., T. Gao, S. Chandrasekaran, H. Jagode, A. Danalis, P. Balaji, J. Dongarra, and M. Taufer, “Characterization of Power Usage and Performance in Data-Intensive Applications using MapReduce over MPI,” 2019 International Conference on Parallel Computing (ParCo2019), Prague, Czech Republic, September 2019.
Jagode, H., A. Danalis, and J. Dongarra, “What it Takes to keep PAPI Instrumental for the HPC Community,” 1st Workshop on Sustainable Scientific Software (CW3S19), Collegeville, Minnesota, July 2019. (50.57 KB)
Danalis, A., H. Jagode, T. Herault, P. Luszczek, and J. Dongarra, “Software-Defined Events through PAPI,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019. DOI: 10.1109/IPDPSW.2019.00069 (446.41 KB)
Danalis, A., H. Jagode, H. Hanumantharayappa, S. Ragate, and J. Dongarra, “Counter Inspection Toolkit: Making Sense out of Hardware Performance Events,” 11th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Cham, Switzerland: Springer, February 2019. DOI: 10.1007/978-3-030-11987-4_2 (216.39 KB)
Haidar, A., H. Jagode, P. Vaccaro, A. YarKhan, S. Tomov, and J. Dongarra, “Investigating Power Capping toward Energy-Efficient Scientific Applications,” Concurrency Computation: Practice and Experience, vol. 2018, issue e4485, pp. 1-14, April 2018. DOI: 10.1002/cpe.4485 (1.2 MB)
Parker, S., J. Mellor-Crummey, D. H. Ahn, H. Jagode, H. Brunst, S. Shende, A. D. Malony, D. DelSignore, R. Tschuter, R. Castain, et al., “Performance Analysis and Debugging Tools at Scale,” Exascale Scientific Applications: Scalability and Performance Portability: Chapman & Hall / CRC Press, pp. 17-50, November 2017. DOI: 10.1201/b21930
Haidar, A., H. Jagode, A. YarKhan, P. Vaccaro, S. Tomov, and J. Dongarra, “Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, Waltham, MA, IEEE, September 2017. DOI: 10.1109/HPEC.2017.8091085 (908.84 KB)
Jagode, H., A. YarKhan, A. Danalis, and J. Dongarra, “Power Management and Event Verification in PAPI,” Tools for High Performance Computing 2015: Proceedings of the 9th International Workshop on Parallel Tools for High Performance Computing, September 2015, Dresden, Germany, Springer International Publishing, pp. 41-51, 2016. DOI: 10.1007/978-3-319-39589-0_4 (565.14 KB)
McCraw, H., J. Ralph, A. Danalis, and J. Dongarra, “Power Monitoring with PAPI for Extreme Scale Architectures and Dataflow-based Programming Models,” 2014 IEEE International Conference on Cluster Computing, no. ICL-UT-14-04, Madrid, Spain, IEEE, September 2014. DOI: 10.1109/CLUSTER.2014.6968672 (3.45 MB)
Nelson, J., “Analyzing PAPI Performance on Virtual Machines,” VMWare Technical Journal, vol. Winter 2013, January 2014.
Nelson, J., “Analyzing PAPI Performance on Virtual Machines,” ICL Technical Report, no. ICL-UT-13-02, August 2013. (437.37 KB)
McCraw, H., D. Terpstra, J. Dongarra, K. Davis, and R. Musselman, “Beyond the CPU: Hardware Performance Counter Monitoring on Blue Gene/Q,” International Supercomputing Conference 2013 (ISC'13), Leipzig, Germany, Springer, June 2013. (624.58 KB)
Weaver, V., D. Terpstra, and S. Moore, “Non-Determinism and Overcount on Modern Hardware Performance Counter Implementations,” 2013 IEEE International Symposium on Performance Analysis of Systems and Software, Austin, TX, IEEE, April 2013. (307.24 KB)
Weaver, V., D. Terpstra, H. McCraw, M. Johnson, K. Kasichayanula, J. Ralph, J. Nelson, P. Mucci, T. Mohan, and S. Moore, “PAPI 5: Measuring Power, Energy, and the Cloud,” Austin, TX, 2013 IEEE International Symposium on Performance Analysis of Systems and Software, April 2013. (78.39 KB)
Weaver, V. M., M. Johnson, K. Kasichayanula, J. Ralph, P. Luszczek, D. Terpstra, and S. Moore, “Measuring Energy and Power with PAPI,” International Workshop on Power-Aware Systems and Architectures, Pittsburgh, PA, September 2012. DOI: 10.1109/ICPPW.2012.39 (146.79 KB)
Johnson, M., H. McCraw, S. Moore, P. Mucci, J. Nelson, D. Terpstra, V. M. Weaver, and T. Mohan, “PAPI-V: Performance Monitoring for Virtual Machines,” CloudTech-HPC 2012, Pittsburgh, PA, September 2012. DOI: 10.1109/ICPPW.2012.29 (2.69 MB)
Kasichayanula, K., D. Terpstra, P. Luszczek, S. Tomov, S. Moore, and G. D. Peterson, “Power Aware Computing on GPUs,” SAAHPC '12 (Best Paper Award), Argonne, IL, July 2012. (658.06 KB)
Malony, A. D., S. Biersdorff, S. Shende, H. Jagode, S. Tomov, G. Juckeland, R. Dietrich, D. Poole, and C. Lamb, “Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs,” International Conference on Parallel Processing (ICPP'11), Taipei, Taiwan, ACM, September 2011. DOI: 10.1109/ICPP.2011.71 (1.41 MB)
Kasichayanula, K., H. You, S. Moore, S. Tomov, H. Jagode, and M. Johnson, “Power-aware Computing on GPGPUs,” Gatlinburg, TN, Fall Creek Falls Conference, Poster, September 2011. (2.89 MB)
Moore, S., and J. Ralph, “User-Defined Events for Hardware Performance Monitoring,” Procedia Computer Science, vol. 4: Elsevier, pp. 2096-2104, May 2011. DOI: 10.1016/j.procs.2011.04.229 (361.76 KB)
Weaver, V. M., and J. Dongarra, “Can Hardware Performance Counters Produce Expected, Deterministic Results?,” 3rd Workshop on Functionality of Hardware Performance Monitoring, Atlanta, GA, December 2010. (392.71 KB)
Terpstra, D., H. Jagode, H. You, and J. Dongarra, “Collecting Performance Data with PAPI-C,” Tools for High Performance Computing 2009, 3rd Parallel Tools Workshop, Dresden, Germany, Springer Berlin / Heidelberg, pp. 157-173, May 2010. DOI: 10.1007/978-3-642-11261-4_11 (4.45 MB)
Mucci, P., D. Ahlin, J. Danielsson, P. Ekman, and L. Malinowski, “PerfMiner: Cluster-Wide Collection, Storage and Presentation of Application Level Hardware Performance Data,” European Conference on Parallel Processing (Euro-Par 2005), Monte de Caparica, Portugal, Springer, September 2005. DOI: 10.1007/11549468_1 (205.45 KB)
Moore, S., D. Cronk, F. Wolf, A. Purkayastha, P. J. Teller, R. Araiza, G. Aguilera, and J. Nava, “Performance Profiling and Analysis of DoD Applications using PAPI and TAU,” Proceedings of DoD HPCMP UGC 2005, Nashville, TN, IEEE, June 2005. (322.56 KB)
Andersson, U., and P. Mucci, “Analysis and Optimization of Yee_Bench using Hardware Performance Counters,” Proceedings of Parallel Computing 2005 (ParCo), Malaga, Spain, January 2005. (72.27 KB)
Dongarra, J., S. Moore, P. Mucci, K. Seymour, and H. You, “Accurate Cache and TLB Characterization Using Hardware Counters,” International Conference on Computational Science (ICCS 2004), Krakow, Poland, Springer, June 2004. DOI: 10.1007/978-3-540-24688-6_57 (167.1 KB)
Yi, Q., K. Kennedy, H. You, K. Seymour, and J. Dongarra, “Automatic Blocking of QR and LU Factorizations for Locality,” 2nd ACM SIGPLAN Workshop on Memory System Performance (MSP 2004), Washington, DC, ACM, June 2004. DOI: 10.1145/1065895.1065898 (212.77 KB)
Mucci, P., J. Dongarra, R. Kufrin, S. Moore, F. Song, and F. Wolf, “Automating the Large-Scale Collection and Analysis of Performance,” 5th LCI International Conference on Linux Clusters: The HPC Revolution, Austin, Texas, May 2004. (511.6 KB)
Wolf, F., and B. Mohr, “Hardware-Counter Based Automatic Performance Analysis of Parallel Programs,” Advances in Parallel Computing, vol. 13, Dresden, Germany, Elsevier, pp. 753-760, January 2004, 2003. DOI: 10.1016/S0927-5452(04)80092-3
Dongarra, J., A. D. Malony, S. Moore, P. Mucci, and S. Shende, “Performance Instrumentation and Measurement for Terascale Systems,” ICCS 2003 Terascale Workshop, Melbourne, Australia, Springer, Berlin, Heidelberg, June 2003. DOI: 10.1007/3-540-44864-0_6 (5.36 MB)
Dongarra, J., K. London, S. Moore, P. Mucci, D. Terpstra, H. You, and M. Zhou, “Experiences and Lessons Learned with a Portable Interface to Hardware Performance Counters,” PADTAD Workshop, IPDPS 2003, Nice, France, IEEE, April 2003. (432.57 KB)
Moore, S., “A Comparison of Counting and Sampling Modes of Using Performance Monitoring Hardware,” International Conference on Computational Science (ICCS 2002), Amsterdam, Netherlands, Springer, April 2002. DOI: 10.1007/3-540-46080-2_95 (122 KB)
Moore, S., D. Cronk, K. London, and J. Dongarra, “Review of Performance Analysis Tools for MPI Parallel Programs,” European Parallel Virtual Machine / Message Passing Interface Users’ Group Meeting, Lecture Notes in Computer Science 2131, Greece, Springer Verlag, Berlin, pp. 241-248, September 2001. DOI: 10.1007/3-540-45417-9_34 (39.61 KB)
London, K., J. Dongarra, S. Moore, P. Mucci, K. Seymour, and T. Spencer, “End-user Tools for Application Performance Analysis, Using Hardware Counters,” International Conference on Parallel and Distributed Computing Systems, Dallas, TX, August 2001. (306.54 KB)
London, K., S. Moore, P. Mucci, K. Seymour, and R. Luczak, “The PAPI Cross-Platform Interface to Hardware Performance Counters,” Department of Defense Users' Group Conference Proceedings, Biloxi, Mississippi, June 2001. (328.56 KB)
Dongarra, J., K. London, S. Moore, P. Mucci, and D. Terpstra, “Using PAPI for Hardware Performance Monitoring on Linux Systems,” Conference on Linux Clusters: The HPC Revolution, Urbana, Illinois, Linux Clusters Institute, June 2001. (422.35 KB)

Presentations

Dongarra, J., H. Jagode, A. Danalis, D. Barry, and V. Weaver, “Performance Application Programming Interface for Extreme-Scale Environments (PAPI-EX) (Poster),” Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, 20 2020. (2.53 MB)
Jagode, H., A. Danalis, and J. Dongarra, “Exa-PAPI: The Exascale Performance API with Modern C++,” Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020. (556.78 KB)
Jagode, H., and A. Danalis, “PULSE: PAPI Unifying Layer for Software-Defined Events (Poster),” Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, February 2020. (1.86 MB)
Danalis, A., H. Jagode, and J. Dongarra, “PAPI's new Software-Defined Events for in-depth Performance Analysis,” Dresden, Germany, 13th Parallel Tools Workshop, September 2019. (3.14 MB)
Danalis, A., H. Jagode, and J. Dongarra, “Does your tool support PAPI SDEs yet?,” Tahoe City, CA, 13th Scalable Tools Workshop, July 2019. (3.09 MB)
Jagode, H., A. Danalis, and J. Dongarra, “What it Takes to keep PAPI Instrumental for the HPC Community,” Collegeville, MN, The 2019 Collegeville Workshop on Sustainable Scientific Software (CW3S19), July 2019. (3.29 MB)
Danalis, A., H. Jagode, and J. Dongarra, “Is your scheduling good? How would you know?,” Bordeaux, France, 14th Scheduling for Large Scale Systems Workshop, June 2019. (2.5 MB)
Danalis, A., H. Jagode, D. Barry, and J. Dongarra, “Understanding Native Event Semantics,” Knoxville, TN, 9th JLESC Workshop, April 2019. (2.33 MB)
Jagode, H., A. Danalis, and J. Dongarra, “PAPI's New Software-Defined Events for In-Depth Performance Analysis,” Lyon, France, CCDSC 2018: Workshop on Clusters, Clouds, and Data for Scientific Computing, September 2018.
Danalis, A., H. Jagode, and J. Dongarra, “Software-Defined Events through PAPI for In-Depth Analysis of Application Performance,” Basel, Switzerland, 5th Platform for Advanced Scientific Computing Conference (PASC18), July 2018.
Danalis, A., H. Jagode, and J. Dongarra, “PAPI: Counting outside the Box,” Barcelona, Spain, 8th JLESC Meeting, April 2018.
Weaver, V., D. Terpstra, H. McCraw, M. Johnson, K. Kasichayanula, J. Ralph, J. Nelson, P. Mucci, T. Mohan, and S. Moore, “PAPI 5: Measuring Power, Energy, and the Cloud,” Austin, TX, 2013 IEEE International Symposium on Performance Analysis of Systems and Software, April 2013. (78.39 KB)
Kasichayanula, K., H. You, S. Moore, S. Tomov, H. Jagode, and M. Johnson, “Power-aware Computing on GPGPUs,” Gatlinburg, TN, Fall Creek Falls Conference, Poster, September 2011. (2.89 MB)

ICL Team Members

Daniel Barry
Graduate Research Assistant
Treece Burgess
Research Associate II
Giuseppe Congiu
Visiting Scholar
Anthony Danalis
Research Assistant Professor
Heike Jagode
Research Associate Professor
Dong Jun Woun
Graduate Research Assistant

Sponsored By

Exascale Computing Project
National Science Foundation
The United States Department of Energy
