Performance monitoring and benchmarking suite

RRZE-HPC/likwid


LIKWID is an easy-to-install, easy-to-use tool suite of command-line applications and a library for performance-oriented programmers. It works for Intel, AMD, ARMv8 and POWER9 processors on the Linux operating system. There is additional support for Nvidia and AMD GPUs. ARMv7 and POWER8/9 are also supported, but we currently have no test machine on which to test them properly.

LIKWID Playlist (YouTube)

It consists of:

  • likwid-topology: print thread, cache and NUMA topology
  • likwid-perfctr: configure and read out hardware performance counters on Intel, AMD, ARM and POWER processors and Nvidia GPUs
  • likwid-powermeter: read out RAPL Energy information and get info about Turbo mode steps
  • likwid-pin: pin your threaded application (pthread, Intel and gcc OpenMP) to dedicated processors
  • likwid-bench: Micro benchmarking platform for CPU architectures
  • likwid-features: Print and manipulate CPU features like hardware prefetchers (x86 only)
  • likwid-genTopoCfg: Dumps topology information to a file
  • likwid-mpirun: Wrapper to start MPI and Hybrid MPI/OpenMP applications (Supports Intel MPI, OpenMPI, MPICH and SLURM)
  • likwid-perfscope: Frontend to the timeline mode of likwid-perfctr, plots live graphs of performance metrics using gnuplot
  • likwid-memsweeper: Sweep memory of NUMA domains and evict cachelines from the last level cache
  • likwid-setFrequencies: Tool to control the CPU and Uncore frequencies (x86 only)
  • likwid-sysFeatures: Tool to query and change system settings like frequencies, powercaps and prefetchers (experimental)

For further information, please take a look at the Wiki or contact us via the Matrix chat LIKWID General.


Supported architectures

Intel

  • Intel Atom
  • Intel Pentium M
  • Intel Core2
  • Intel Nehalem
  • Intel NehalemEX
  • Intel Westmere
  • Intel WestmereEX
  • Intel Xeon Phi (KNC)
  • Intel Silvermont & Airmont
  • Intel Goldmont
  • Intel SandyBridge
  • Intel SandyBridge EP/EN
  • Intel IvyBridge
  • Intel IvyBridge EP/EN/EX
  • Intel Xeon Phi (KNL, KNM)
  • Intel Haswell
  • Intel Haswell EP/EN/EX
  • Intel Broadwell
  • Intel Broadwell D
  • Intel Broadwell EP
  • Intel Skylake
  • Intel Kabylake
  • Intel Coffeelake
  • Intel Skylake SP
  • Intel Cascadelake SP
  • Intel Icelake
  • Intel Icelake SP
  • Intel Tigerlake (experimental)
  • Intel SapphireRapids
  • Intel EmeraldRapids

AMD

  • AMD K8
  • AMD K10
  • AMD Interlagos
  • AMD Kabini
  • AMD Zen
  • AMD Zen2
  • AMD Zen3
  • AMD Zen4

ARM

  • ARMv7
  • ARMv8
  • Special support for Marvell Thunder X2
  • Fujitsu A64FX
  • ARM Neoverse N1 (AWS Graviton 2)
  • ARM Neoverse V1
  • HiSilicon TSV110
  • Apple M1 (only with Linux)

POWER (experimental)

  • IBM POWER8
  • IBM POWER9

Nvidia GPUs

AMD GPUs


Download, Build and Install

You can get the releases of LIKWID at: http://ftp.fau.de/pub/likwid/

For build and installation hints, see the INSTALL file or check the build instructions page in the wiki: https://github.com/RRZE-HPC/likwid/wiki/Build

For quick install:

    VERSION=stable
    wget http://ftp.fau.de/pub/likwid/likwid-$VERSION.tar.gz
    tar -xaf likwid-$VERSION.tar.gz
    cd likwid-*
    vi config.mk # configure build, e.g. change installation prefix and architecture flags
    make
    sudo make install # sudo required to install the access daemon with proper permissions

For ARM builds, the COMPILER flag in config.mk needs to be changed to GCCARMv8 or ARMCLANG (experimental).
For POWER builds, the COMPILER flag in config.mk needs to be changed to GCCPOWER or XLC (experimental).
For Nvidia GPU support, set NVIDIA_INTERFACE in config.mk to true and adjust build-time variables if needed.
For AMD GPU support, set ROCM_INTERFACE in config.mk to true and adjust build-time variables if needed.
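
The options named above can also be changed without opening an editor. The following is a minimal sketch, assuming each option sits on its own line of config.mk in the form `NAME = value`; the helper name `set_option` and the three-line sample text are hypothetical and not part of LIKWID.

```python
import re

def set_option(config_text: str, name: str, value: str) -> str:
    """Return config_text with the assignment to `name` set to `value`.

    Assumes one 'NAME = value' (or 'NAME ?= value') assignment per line.
    Anything after the old value on that line, such as a trailing comment,
    is dropped.
    """
    pattern = re.compile(rf"^({re.escape(name)}[ \t]*\??=[ \t]*).*$", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + value, config_text)
    if count == 0:
        raise KeyError(f"{name} not found in config text")
    return new_text

# Hypothetical excerpt; the real config.mk contains many more options.
sample = "COMPILER = GCC\nNVIDIA_INTERFACE = false\nROCM_INTERFACE = false\n"
updated = set_option(sample, "COMPILER", "GCCARMv8")
updated = set_option(updated, "NVIDIA_INTERFACE", "true")
print(updated)
```

To apply this to a real checkout, read config.mk into a string, call `set_option` for each setting, and write the result back before running make.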


Usage examples

likwid-topology
    --------------------------------------------------------------------------------
    CPU name:           Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
    CPU type:           Intel Skylake processor
    CPU stepping:       3
    ********************************************************************************
    Hardware Thread Topology
    ********************************************************************************
    Sockets:            1
    Cores per socket:   4
    Threads per core:   2
    --------------------------------------------------------------------------------
    HWThread        Thread        Core        Die        Socket        Available
    0               0             0           0          0             *
    1               0             1           0          0             *
    2               0             2           0          0             *
    3               0             3           0          0             *
    4               1             0           0          0             *
    5               1             1           0          0             *
    6               1             2           0          0             *
    7               1             3           0          0             *
    --------------------------------------------------------------------------------
    Socket 0:           ( 0 4 1 5 2 6 3 7 )
    --------------------------------------------------------------------------------
    ********************************************************************************
    Cache Topology
    ********************************************************************************
    Level:              1
    Size:               32 kB
    Cache groups:       ( 0 4 ) ( 1 5 ) ( 2 6 ) ( 3 7 )
    --------------------------------------------------------------------------------
    Level:              2
    Size:               256 kB
    Cache groups:       ( 0 4 ) ( 1 5 ) ( 2 6 ) ( 3 7 )
    --------------------------------------------------------------------------------
    Level:              3
    Size:               8 MB
    Cache groups:       ( 0 4 1 5 2 6 3 7 )
    --------------------------------------------------------------------------------
    ********************************************************************************
    NUMA Topology
    ********************************************************************************
    NUMA domains:       1
    --------------------------------------------------------------------------------
    Domain:             0
    Processors:         ( 0 4 1 5 2 6 3 7 )
    Distances:          10
    Free memory:        318.203 MB
    Total memory:       7626.23 MB
    --------------------------------------------------------------------------------
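
In the HWThread table of the likwid-topology output, hardware threads 0-3 are the first SMT thread of each core and 4-7 are their siblings. A minimal sketch of picking one hardware thread per physical core from such a table, for example to build a list for `likwid-pin -c`, could look like this. The function name `first_thread_per_core` is hypothetical, and the parser assumes the six-column layout of this particular output.

```python
def first_thread_per_core(table: str) -> list[int]:
    """Return the HWThread IDs whose 'Thread' column is 0 (one per core)."""
    selected = []
    for line in table.splitlines():
        fields = line.split()
        # Data rows have six columns: HWThread Thread Core Die Socket Available
        if len(fields) == 6 and fields[0].isdigit():
            hwthread, smt_thread = int(fields[0]), int(fields[1])
            if smt_thread == 0:
                selected.append(hwthread)
    return selected

table = """\
HWThread        Thread        Core        Die        Socket        Available
0               0             0           0          0             *
1               0             1           0          0             *
2               0             2           0          0             *
3               0             3           0          0             *
4               1             0           0          0             *
5               1             1           0          0             *
6               1             2           0          0             *
7               1             3           0          0             *
"""
cores = first_thread_per_core(table)
print(",".join(map(str, cores)))  # prints 0,1,2,3
```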
likwid-perfctr
    $ likwid-perfctr -C 0 -g L2 hostname
    --------------------------------------------------------------------------------
    CPU name:   Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
    CPU type:   Intel Skylake processor
    CPU clock:  4.01 GHz
    --------------------------------------------------------------------------------
    mytesthost
    --------------------------------------------------------------------------------
    Group 1: L2
    +-----------------------+---------+------------+
    |         Event         | Counter | HWThread 0 |
    +-----------------------+---------+------------+
    |   INSTR_RETIRED_ANY   |  FIXC0  |     321342 |
    | CPU_CLK_UNHALTED_CORE |  FIXC1  |     450498 |
    |  CPU_CLK_UNHALTED_REF |  FIXC2  |    1118900 |
    |    L1D_REPLACEMENT    |   PMC0  |       6670 |
    |      L1D_M_EVICT      |   PMC1  |       1840 |
    | ICACHE_64B_IFTAG_MISS |   PMC2  |       9293 |
    +-----------------------+---------+------------+

    +--------------------------------+------------+
    |             Metric             | HWThread 0 |
    +--------------------------------+------------+
    |       Runtime (RDTSC) [s]      |     0.0022 |
    |      Runtime unhalted [s]      |     0.0001 |
    |           Clock [MHz]          |  1613.6392 |
    |               CPI              |     1.4019 |
    |  L2D load bandwidth [MBytes/s] |   197.8326 |
    |  L2D load data volume [GBytes] |     0.0004 |
    | L2D evict bandwidth [MBytes/s] |    54.5745 |
    | L2D evict data volume [GBytes] |     0.0001 |
    |     L2 bandwidth [MBytes/s]    |   528.0381 |
    |     L2 data volume [GBytes]    |     0.0011 |
    +--------------------------------+------------+
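
The derived metrics follow from the raw event counts. The sketch below recomputes a few of them from the numbers in the likwid-perfctr output. It assumes 64-byte cache lines and that each L1D_REPLACEMENT, L1D_M_EVICT and ICACHE_64B_IFTAG_MISS event moves one cache line. These assumptions reproduce the rounded values shown, but they are an illustration, not the authoritative definition of the L2 group.

```python
CACHE_LINE_BYTES = 64  # assumption: cache line size on this x86 CPU

# Raw counts copied from the event table of the example run.
counts = {
    "INSTR_RETIRED_ANY": 321342,
    "CPU_CLK_UNHALTED_CORE": 450498,
    "L1D_REPLACEMENT": 6670,
    "L1D_M_EVICT": 1840,
    "ICACHE_64B_IFTAG_MISS": 9293,
}

# Cycles per instruction: unhalted core cycles divided by retired instructions.
cpi = counts["CPU_CLK_UNHALTED_CORE"] / counts["INSTR_RETIRED_ANY"]

# Data volumes in GBytes: one cache line per counted event.
load_gbytes = counts["L1D_REPLACEMENT"] * CACHE_LINE_BYTES * 1e-9
evict_gbytes = counts["L1D_M_EVICT"] * CACHE_LINE_BYTES * 1e-9
total_gbytes = (
    counts["L1D_REPLACEMENT"]
    + counts["L1D_M_EVICT"]
    + counts["ICACHE_64B_IFTAG_MISS"]
) * CACHE_LINE_BYTES * 1e-9

print(f"CPI                            {cpi:.4f}")
print(f"L2D load data volume [GBytes]  {load_gbytes:.4f}")
print(f"L2D evict data volume [GBytes] {evict_gbytes:.4f}")
print(f"L2 data volume [GBytes]        {total_gbytes:.4f}")
```

The bandwidth rows are the same volumes divided by the measured runtime. They cannot be reproduced exactly from the table because the runtime is printed with only four decimal places.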

likwid-pin
    $ likwid-pin -c 0,1,2 ./a.out
    [pthread wrapper]
    [pthread wrapper] MAIN -> 0
    [pthread wrapper] PIN_MASK: 0->1  1->2
    [pthread wrapper] SKIP MASK: 0x0
    threadid 140566548539136 -> hwthread 1 - OK
    threadid 140566540146432 -> hwthread 2 - OK
    Number of Threads requested = 3
    Thread 0 running on processor 0 ....
    Thread 1 running on processor 1 ....
    Thread 2 running on processor 2 ....
    [...]
likwid-bench
    $ likwid-bench -t triad_avx -W N:2GB:3
    Warning: Sanitizing vector length to a multiple of the loop stride 16 and thread count 3 from 62500000 elements (500000000 bytes) to 62499984 elements (499999872 bytes)
    Allocate: Process running on hwthread 0 (Domain N) - Vector length 62499984/499999872 Offset 0 Alignment 512
    Allocate: Process running on hwthread 0 (Domain N) - Vector length 62499984/499999872 Offset 0 Alignment 512
    Allocate: Process running on hwthread 0 (Domain N) - Vector length 62499984/499999872 Offset 0 Alignment 512
    Allocate: Process running on hwthread 0 (Domain N) - Vector length 62499984/499999872 Offset 0 Alignment 512
    Initialization: Each thread in domain initializes its own stream chunks
    --------------------------------------------------------------------------------
    LIKWID MICRO BENCHMARK
    Test: triad_avx
    --------------------------------------------------------------------------------
    Using 1 work groups
    Using 3 threads
    --------------------------------------------------------------------------------
    Running without Marker API. Activate Marker API with -m on commandline.
    --------------------------------------------------------------------------------
    Group: 0 Thread 1 Global Thread 1 running on hwthread 4 - Vector length 20833328 Offset 20833328
    Group: 0 Thread 0 Global Thread 0 running on hwthread 0 - Vector length 20833328 Offset 0
    Group: 0 Thread 2 Global Thread 2 running on hwthread 1 - Vector length 20833328 Offset 41666656
    --------------------------------------------------------------------------------
    Cycles:                 22977763263
    CPU Clock:              4007946861
    Cycle Clock:            4007946861
    Time:                   5.733051e+00 sec
    Iterations:             96
    Iterations per thread:  32
    Inner loop executions:  1302083
    Size (Byte):            1999999488
    Size per thread:        666666496
    Number of Flops:        3999998976
    MFlops/s:               697.71
    Data volume (Byte):     63999983616
    MByte/s:                11163.34
    Cycles per update:      11.488885
    Cycles per cacheline:   91.911077
    Loads per update:       3
    Stores per update:      1
    Load bytes per element: 24
    Store bytes per elem.:  8
    Load/store ratio:       3.00
    Instructions:           2374999408
    UOPs:                   3749999040
    --------------------------------------------------------------------------------
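
The summary figures of the likwid-bench run are tied together by simple arithmetic. The sketch below re-derives them from the printed numbers. It assumes that triad_avx touches four streams of 8-byte elements per update (three loads and one store, as listed) and performs two floating-point operations per update, and that 2GB means 2 * 10^9 bytes. These assumptions match the output shown but are not taken from the benchmark source.

```python
requested_bytes = 2_000_000_000      # -W N:2GB:3, assuming 1 GB = 10**9 bytes
streams, elem_bytes = 4, 8           # 3 loads + 1 store of 8-byte elements
stride, threads = 16, 3              # from the sanitizing warning
iterations_per_thread = 32
cycles, seconds = 22977763263, 5.733051

# Vector length per stream, rounded down to a multiple of stride * threads.
elements = requested_bytes // (streams * elem_bytes)   # 62500000
elements -= elements % (stride * threads)              # 62499984

# Every element of the vector is updated once per iteration.
updates = elements * iterations_per_thread
flops = 2 * updates                                    # multiply + add
data_bytes = streams * elem_bytes * updates

print(f"Number of Flops:    {flops}")
print(f"Data volume (Byte): {data_bytes}")
print(f"MFlops/s:           {flops / seconds / 1e6:.2f}")
print(f"MByte/s:            {data_bytes / seconds / 1e6:.2f}")
print(f"Cycles per update:  {cycles / updates:.6f}")
```

With 8-byte elements, a 64-byte cache line holds eight updates, which is why "Cycles per cacheline" is eight times "Cycles per update".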
likwid-mpirun
    $ likwid-mpirun -mpi slurm -np 4 -t 2 ./a.out
    MPI started
    Process with rank 0 running on Node f0846.nhr.fau.de core 0
    Process with rank 2 running on Node f0859.nhr.fau.de core 0
    Process with rank 3 running on Node f0859.nhr.fau.de core 36
    Process with rank 1 running on Node f0846.nhr.fau.de core 36
    Enter OpenMP parallel region
    Start OpenMP threads
    Rank 0 Thread 0 running on Node f0846.nhr.fau.de core 0
    Rank 0 Thread 1 running on Node f0846.nhr.fau.de core 1
    Rank 1 Thread 0 running on Node f0846.nhr.fau.de core 36
    Rank 1 Thread 1 running on Node f0846.nhr.fau.de core 37
    Rank 2 Thread 0 running on Node f0859.nhr.fau.de core 0
    Rank 2 Thread 1 running on Node f0859.nhr.fau.de core 1
    Rank 3 Thread 0 running on Node f0859.nhr.fau.de core 36
    Rank 3 Thread 1 running on Node f0859.nhr.fau.de core 37
likwid-powermeter
    $ likwid-powermeter
    --------------------------------------------------------------------------------
    CPU name:   Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
    CPU type:   Intel Skylake processor
    CPU clock:  4.01 GHz
    --------------------------------------------------------------------------------
    --------------------------------------------------------------------------------
    Runtime: 2.00019 s
    Measure for socket 0 on CPU 0
    Domain PKG:
    Energy consumed: 7.47705 Joules
    Power consumed: 3.73817 Watt
    Domain PP0:
    Energy consumed: 5.42047 Joules
    Power consumed: 2.70998 Watt
    Domain PP1:
    Energy consumed: 0.0872803 Joules
    Power consumed: 0.043636 Watt
    Domain DRAM:
    Energy consumed: 1.02612 Joules
    Power consumed: 0.513013 Watt
    Domain PLATFORM:
    Energy consumed: 0 Joules
    Power consumed: 0 Watt
    --------------------------------------------------------------------------------
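
The reported power of each domain is its consumed energy divided by the measurement runtime. A quick check with the values from the likwid-powermeter output (the variable names are illustrative only):

```python
runtime_s = 2.00019

# Energy per RAPL domain in Joules, copied from the example output.
energy_joules = {
    "PKG": 7.47705,
    "PP0": 5.42047,
    "PP1": 0.0872803,
    "DRAM": 1.02612,
}

# Average power over the measurement interval: P = E / t
power_watts = {domain: joules / runtime_s for domain, joules in energy_joules.items()}

for domain, watts in power_watts.items():
    print(f"Domain {domain}: {watts:.4f} Watt")
```

The results agree with the printed "Power consumed" lines up to the rounding of the printed energy and runtime values.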
likwid-features
    $ likwid-features -c 0 -l
    Feature               HWThread 0
    HW_PREFETCHER         on
    CL_PREFETCHER         on
    DCU_PREFETCHER        on
    IP_PREFETCHER         on
    FAST_STRINGS          on
    THERMAL_CONTROL       on
    PERF_MON              on
    FERR_MULTIPLEX        off
    BRANCH_TRACE_STORAGE  on
    XTPR_MESSAGE          off
    PEBS                  on
    SPEEDSTEP             on
    MONITOR               on
    SPEEDSTEP_LOCK        off
    CPUID_MAX_VAL         off
    XD_BIT                on
    DYN_ACCEL             off
    TURBO_MODE            on
    TM2                   off

Documentation

For detailed documentation on the usage of the tools, have a look at the HTML documentation built with Doxygen. Call

    make docs

or, after installation, look at the man pages.

There is also a wiki on the GitHub page: https://github.com/rrze-likwid/likwid/wiki

If you have problems or suggestions, please let me know on the LIKWID mailing list: http://groups.google.com/group/likwid-users

or, if it is a bug, open an issue at: https://github.com/rrze-likwid/likwid/issues

You can also chat with us through the Matrix chat LIKWID General.


Extras


Survey

We opened a survey on the user mailing list to get a feeling for who uses LIKWID and how. Moreover, we would be interested to hear if you are missing a feature or what annoys you when using LIKWID. Link to the survey: https://groups.google.com/forum/#!topic/likwid-users/F7TDho3k7ps


Funding

LIKWID development was funded by BMBF Germany under the FEPA project, grant 01IH13009. Since 2017, development has been further funded by BMBF Germany under the SeASiTe project, grant 01IH16012A. In 2022, the EE-HPC project is funded by BMBF Germany in the GreenHPC grant.
