
ROCm Parallel Primitives


ROCm/rocPRIM


Note

The published rocPRIM documentation is available here in an organized, easy-to-read format, with search and a table of contents. The documentation source files reside in the docs folder of this repository. As with all ROCm projects, the documentation is open source. For more information on contributing to the documentation, see Contribute to ROCm documentation.

rocPRIM is a header-only library that provides HIP parallel primitives. You can use this library to develop performant GPU-accelerated code on AMD ROCm platforms.

Requirements

  • Git
  • CMake (3.16 or later)
  • AMD ROCm platform (1.8.2 or later)
  • C++17
  • Python 3.6 or higher (HIP on Windows only, required only for install script)
  • Visual Studio 2019 with Clang support (HIP on Windows only)
  • Strawberry Perl (HIP on Windows only)

Optional:

  • GoogleTest
    • Required only for tests. Building tests is off by default (see the BUILD_TEST option).
    • This is automatically downloaded and built by the CMake script.
  • Google Benchmark
    • Required only for benchmarks. Building benchmarks is off by default.
    • This is automatically downloaded and built by the CMake script.

Build and install

You can build and install rocPRIM on Linux or Windows.

  • Linux:

    git clone https://github.com/ROCm/rocPRIM.git

    # Go to rocPRIM directory, create and go to the build directory.
    cd rocPRIM; mkdir build; cd build

    # Configure rocPRIM, setup options for your system.
    # Build options:
    #   ONLY_INSTALL - OFF by default. If this flag is on, the build ignores the BUILD_* flags.
    #   BUILD_TEST - OFF by default,
    #   BUILD_EXAMPLE - OFF by default,
    #   BUILD_BENCHMARK - OFF by default.
    #   BENCHMARK_CONFIG_TUNING - OFF by default. The purpose of this flag is to find the best kernel config parameters.
    #     When ON, the compilation time can increase significantly.
    #   AMDGPU_TARGETS - list of AMD architectures, default: gfx803;gfx900;gfx906;gfx908.
    #     You can make compilation faster if you want to test/benchmark only on one architecture,
    #     for example, add -DAMDGPU_TARGETS=gfx906 to 'cmake' parameters.
    #   AMDGPU_TEST_TARGETS - list of AMD architectures, default: "" (default system device)
    #     If you want to detect failures on a per GFX IP basis, setting it to some set of ips will create
    #     separate tests with the ip name embedded into the test name. Building for all, but selecting
    #     tests only of a specific architecture is possible, e.g.: ctest -R "gfx803|gfx900"
    #
    # ! IMPORTANT !
    # Set C++ compiler to HIP-clang. You can do it by adding 'CXX=<path-to-compiler>'
    # before 'cmake' or setting cmake option 'CMAKE_CXX_COMPILER' to path to the compiler.
    # Using HIP-clang:
    [CXX=hipcc] cmake -DBUILD_BENCHMARK=ON ../.

    # Build
    make -j4

    # Optionally, run tests if they're enabled.
    ctest --output-on-failure

    # Install
    [sudo] make install
  • Windows:

    We've added initial support for HIP on Windows; to install, use the provided rmake.py Python script:

    git clone https://github.com/ROCm/rocPRIM.git
    cd rocPRIM

    # the -i option will install rocPRIM to C:\hipSDK by default
    python rmake.py -i

    # the -c option will build all clients including unit tests
    python rmake.py -c

Using rocPRIM

Include the <rocprim/rocprim.hpp> header:

    #include <rocprim/rocprim.hpp>

We recommend including rocPRIM into a CMake project by using the package configuration files. The rocPRIM package name is rocprim.

    # "/opt/rocm" - default install prefix
    find_package(rocprim REQUIRED CONFIG PATHS "/opt/rocm/rocprim")

    ...

    # Includes only rocPRIM headers, HIP libraries have
    # to be linked manually by user
    target_link_libraries(<your_target> roc::rocprim)

    # Include rocPRIM headers and required HIP dependencies
    #   - If using HIP language support (USE_HIPCXX=ON):
    target_link_libraries(<your_target> hip::host)
    #   - Otherwise:
    target_link_libraries(<your_target> hip::device)

For more information on hip::host and hip::device, please see the ROCm documentation.

Running unit tests

Unit tests are implemented in terms of GoogleTest. Collections of tests are wrapped and invoked from CTest.

    # Go to rocPRIM build directory
    cd rocPRIM; cd build

    # List available tests
    ctest --show-only

    # To run all tests
    ctest

    # Run specific test(s)
    ctest -R <regex>

    # To run the Google Test manually
    ./test/rocprim/test_<unit-test-name>

Using multiple GPUs concurrently for testing

This feature requires using CMake 3.16+ for building and testing.

Prior versions of CMake can't assign IDs to tests when running in parallel. Assigning tests to distinct devices could only be done at the cost of extreme complexity.

Unit tests can make use of the CTest resource allocation feature, which you can use to distribute tests across multiple GPUs in an intelligent manner. This feature can accelerate testing when multiple GPUs of the same family are in a system. It can also test multiple product families from one invocation without having to use the HIP_VISIBLE_DEVICES environment variable. The feature relies on the presence of a resource specifications file.

Trying to use `RESOURCE_GROUPS` and `--resource-spec-file` with CMake and CTest for versions prior to 3.16 silently omits the feature. No warnings are issued about unknown properties or command-line arguments. Make sure that the `cmake` and `ctest` versions you invoke are sufficiently recent.

Auto resource specification generation

You can independently call the utility script located in the repository using the following code:

    # Go to rocPRIM build directory
    cd rocPRIM; cd build

    # Invoke directly or use CMake script mode via cmake -P
    ../cmake/GenerateResourceSpec.cmake

    # Assuming you have 2 compatible GPUs in the system
    ctest --resource-spec-file ./resources.json --parallel 2

Manual

Assuming you have two GPUs from the gfx900 family and they are the first devices enumerated by the system, you can use -D AMDGPU_TEST_TARGETS=gfx900 during configuration to specify that only one family will be tested. Leaving this variable empty (default) results in targeting the default device in the system. To let CMake know there are two GPUs that should be targeted, you have to provide a JSON file to CTest via the --resource-spec-file <path_to_file> flag. For example:

    {
      "version": {
        "major": 1,
        "minor": 0
      },
      "local": [
        {
          "gfx900": [
            {
              "id": "0"
            },
            {
              "id": "1"
            }
          ]
        }
      ]
    }

Invoking CTest as ctest --resource-spec-file <path_to_file> --parallel 2 allows two tests to run concurrently, distributed between the two GPUs.
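
If you prefer to generate the file rather than write it by hand, the layout above can be produced with a few lines of standard C++. The following is a minimal sketch, not part of rocPRIM: the function name make_resource_spec is hypothetical, and it assumes all devices belong to a single family and are numbered from 0.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not part of rocPRIM): builds a CTest resource
// specification for `device_count` GPUs of one family, for example "gfx900".
// Assumes the devices are enumerated consecutively starting at ID 0.
std::string make_resource_spec(const std::string& family, std::size_t device_count)
{
    assert(device_count > 0 && "at least one device is required");

    std::ostringstream out;
    out << "{\n"
        << "  \"version\": { \"major\": 1, \"minor\": 0 },\n"
        << "  \"local\": [\n"
        << "    {\n"
        << "      \"" << family << "\": [\n";
    for(std::size_t id = 0; id < device_count; ++id)
    {
        // Resource IDs are written as JSON strings, as in the example above.
        out << "        { \"id\": \"" << id << "\" }"
            << (id + 1 < device_count ? "," : "") << "\n";
    }
    out << "      ]\n"
        << "    }\n"
        << "  ]\n"
        << "}\n";
    return out.str();
}
```

Writing the result of make_resource_spec("gfx900", 2) to a file and passing that file to ctest via --resource-spec-file should be equivalent to the hand-written example.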

Using custom seeds for the tests

Modify the rocPRIM/test/rocprim/test_seed.hpp file.

    //(1)
    static constexpr int random_seeds_count = 10;

    //(2)
    static constexpr unsigned int seeds [] = {0, 2, 10, 1000};

    //(3)
    static constexpr size_t seed_size = sizeof(seeds) / sizeof(seeds[0]);

(1) Defines a constant that sets how many passes over the tests will be done with runtime-generated seeds. Modify at will.

(2) Defines the user-generated seeds. Each of the array elements will be used as seed for all tests. Modify at will. If you don't want any static seeds, leave the array empty.

    static constexpr unsigned int seeds [] = {};

(3) Never modify this line.
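
To make the interplay of the three definitions concrete, here is a minimal sketch of how a test driver could combine them. It is an illustration under assumptions, not the rocPRIM test harness: the helper collect_seeds is hypothetical, and the order (runtime-generated seeds first, then the static ones) is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Values copied from the test_seed.hpp snippet above.
static constexpr int          random_seeds_count = 10;
static constexpr unsigned int seeds[]            = {0, 2, 10, 1000};
static constexpr std::size_t  seed_size          = sizeof(seeds) / sizeof(seeds[0]);

// Hypothetical helper (not part of rocPRIM): returns every seed that one
// test would be run with. The ordering is an assumption for illustration.
std::vector<unsigned int> collect_seeds()
{
    std::vector<unsigned int> result;
    result.reserve(random_seeds_count + seed_size);

    // (1) random_seeds_count passes with seeds generated at run time.
    std::random_device entropy;
    for(int i = 0; i < random_seeds_count; ++i)
    {
        result.push_back(entropy());
    }

    // (2) one additional pass per user-provided static seed.
    for(std::size_t i = 0; i < seed_size; ++i)
    {
        result.push_back(seeds[i]);
    }
    return result;
}
```

With the values shown, each test would run 14 times under this scheme: 10 passes with fresh seeds and 4 reproducible passes. Because seed_size is derived from the array in (3), adding or removing static seeds needs no other change.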

Running benchmarks

    # Go to rocPRIM build directory
    cd rocPRIM; cd build

    # To run benchmark for warp functions:
    # Further options can be found using --help
    # [] Fields are optional
    ./benchmark/benchmark_warp_<function_name> [--size <size>] [--trials <trials>]

    # To run benchmark for block functions:
    # Further options can be found using --help
    # [] Fields are optional
    ./benchmark/benchmark_block_<function_name> [--size <size>] [--trials <trials>]

    # To run benchmark for device functions:
    # Further options can be found using --help
    # [] Fields are optional
    ./benchmark/benchmark_device_<function_name> [--size <size>] [--trials <trials>]

Performance configuration

Most device-specific primitives provided by rocPRIM can be tuned for other AMD devices, and for different types and operations, by passing compile-time configuration structures as a template parameter. The main "knobs" are usually the size of the block and the number of items processed by a single thread.

rocPRIM has built-in default configurations for each of its primitives. These are used automatically, based on the input types and the target architecture determined from the stream used.
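
The idea of a compile-time configuration structure can be sketched in plain C++. The types below are hypothetical and are not rocPRIM's configuration types; they only illustrate how the two knobs (block size and items per thread) can travel as a template parameter and determine how much work one block covers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical configuration type for illustration only; not a rocPRIM type.
template<unsigned int BlockSize, unsigned int ItemsPerThread>
struct sketch_config
{
    static_assert(BlockSize > 0 && ItemsPerThread > 0, "both knobs must be positive");

    static constexpr unsigned int block_size       = BlockSize;
    static constexpr unsigned int items_per_thread = ItemsPerThread;
    // Number of input items a single block processes.
    static constexpr unsigned int items_per_block  = BlockSize * ItemsPerThread;
};

// Hypothetical front end: the number of blocks needed to cover `size` items
// follows from the configuration passed as a template parameter.
template<class Config>
constexpr std::size_t blocks_needed(std::size_t size)
{
    return (size + Config::items_per_block - 1) / Config::items_per_block;
}
```

For example, blocks_needed<sketch_config<256, 4>>(10000) evaluates to 10, because each block covers 1024 items. Changing either template argument changes the launch shape without touching the calling code, which is the property that makes this style of tuning convenient.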

hipCUB

hipCUB is a thin wrapper library on top of rocPRIM or CUB. You can use it to port projects that use the CUB library to the HIP layer and run them on AMD hardware. In the ROCm environment, hipCUB uses the rocPRIM library as a backend; on CUDA platforms, it uses CUB as a backend.

Building the documentation locally

Requirements

Doxygen

The build system uses Doxygen version 1.9.4. You can try using a newer version, but that might cause issues.

After you have downloaded Doxygen version 1.9.4:

    # Add doxygen to your PATH
    echo 'export PATH=<doxygen 1.9.4 path>/bin:$PATH' >> ~/.bashrc

    # Apply the updated .bashrc
    source ~/.bashrc

    # Confirm that you are using version 1.9.4
    doxygen --version

Python

The build system uses Python version 3.10. You can try using a newer version, but that might cause issues.

You can install Python 3.10 alongside your other Python versions using pyenv:

    # Install Python 3.10
    pyenv install 3.10

    # Create a Python 3.10 virtual environment
    pyenv virtualenv 3.10 venv_rocprim

    # Activate the virtual environment
    pyenv activate venv_rocprim

Building

After cloning this repository and changing into its directory:

    # Install Python dependencies
    python3 -m pip install -r docs/sphinx/requirements.txt

    # Build the documentation
    python3 -m sphinx -T -E -b html -d docs/_build/doctrees -D language=en docs docs/_build/html

You can then open docs/_build/html/index.html in your browser to view the documentation.

Build documentation via CMake

Install rocm-cmake

    # Change directory to rocPRIM
    cd rocPRIM

    # Install documentation dependencies
    python3 -m pip install -r docs/sphinx/requirements.txt

    # Set C++ compiler
    # This example uses hipcc and assumes it is at the path /usr/bin
    export CXX=hipcc
    export PATH=/usr/bin:$PATH

    # Configure the project
    cmake -S . -B ./build -D BUILD_DOCS=ON

    # Build the documentation
    cmake --build ./build --target doc

    # To serve the HTML docs locally
    cd ./build/docs/html
    python3 -m http.server

Support

You can report bugs and feature requests through our GitHub issue tracker.

Contributions and license

Contributions of any kind are most welcome! Contribution instructions are in CONTRIBUTING.

Licensing information is in LICENSE.

