Unified Memory Framework

Introduction

The Unified Memory Framework (UMF) is a library for constructing allocators and memory pools. It also contains broadly useful abstractions and utilities for memory management. UMF allows users to manage multiple memory pools characterized by different attributes, allowing certain allocation types to be isolated from others and allocated using different hardware resources as required.

Usage

For a quick introduction to UMF usage, please see the examples documentation, which includes the code of the basic example. There are also more advanced examples that allocate USM memory from the Level Zero device using the Level Zero API and UMF Level Zero memory provider, and from the CUDA device using the CUDA API and UMF CUDA memory provider.

UMF's experimental CTL API is showcased in the CTL example, which explores provider and pool statistics, and in the custom CTL example, which wires CTL support into a custom memory provider. These examples rely on experimental headers which may change in future releases.

Build

Requirements

Required packages:

libhwloc-dev >= 2.3.0 (Linux) / hwloc >= 2.3.0 (Windows)
C compiler
CMake >= 3.14.0

For development and contributions:

clang-format-15.0 (can be installed with python -m pip install clang-format==15.0.7)
cmake-format-0.6 (can be installed with python -m pip install cmake-format==0.6.13)
black (can be installed with python -m pip install black==24.3.0)

Note: All development dependencies are defined in third_party/requirements.txt and can be installed with, for example: pip install -r third_party/requirements.txt.

For building tests and multithreaded benchmarks:

C++ compiler with C++17 support

For Level Zero memory provider tests:

Level Zero headers and libraries
compatible GPU with installed driver

Linux

Executables and binaries will be in build/bin. The {build_config} can be either Debug or Release.

cmake -B build -DCMAKE_BUILD_TYPE={build_config}
cmake --build build -j$(nproc)

Windows

Generate a Visual Studio project. Executables and binaries will be in build/bin/{build_config}. The {build_config} can be either Debug or Release.

cmake -B build -G "Visual Studio 15 2017 Win64"
cmake --build build --config {build_config} -j$Env:NUMBER_OF_PROCESSORS

Benchmark

UMF comes with a single-threaded micro benchmark based on ubench. In order to build the benchmark, the UMF_BUILD_BENCHMARKS CMake configuration flag has to be turned ON.

UMF also provides multithreaded benchmarks that can be enabled by setting both the UMF_BUILD_BENCHMARKS and UMF_BUILD_BENCHMARKS_MT CMake configuration flags to ON. Multithreaded benchmarks require a C++ compiler with C++17 support.

The Scalable Pool requirements can be found in the relevant 'Memory pool managers' section below.

Sanitizers

List of sanitizers available on Linux:

AddressSanitizer
UndefinedBehaviorSanitizer
ThreadSanitizer
- Is mutually exclusive with other sanitizers.
MemorySanitizer
- Requires linking against MSan-instrumented libraries to prevent false positive reports. See the MemorySanitizer documentation for more information.

List of sanitizers available on Windows:

AddressSanitizer

The listed sanitizers can be enabled with the appropriate CMake options.
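As a minimal sketch of how the sanitizer rules above could be checked before configuring, the following hypothetical helper (umf_sanitizer_flags is not part of UMF) maps short sanitizer names to the UMF_USE_* options listed in the CMake options table and rejects ThreadSanitizer combined with any other sanitizer. The short names asan, ubsan, tsan and msan are assumptions made for this illustration.

```shell
# Hypothetical helper, not shipped with UMF: translate short sanitizer
# names into the UMF_USE_* CMake options.
# Usage: umf_sanitizer_flags asan ubsan
umf_sanitizer_flags() {
    flags=""
    count=0
    has_tsan=0
    for name in "$@"; do
        case "$name" in
            asan)  flag="-DUMF_USE_ASAN=ON" ;;
            ubsan) flag="-DUMF_USE_UBSAN=ON" ;;
            tsan)  flag="-DUMF_USE_TSAN=ON"; has_tsan=1 ;;
            msan)  flag="-DUMF_USE_MSAN=ON" ;;
            *)     echo "unknown sanitizer: $name" >&2; return 2 ;;
        esac
        # Append the flag, separated by a space once the list is non-empty.
        flags="${flags:+$flags }$flag"
        count=$((count + 1))
    done
    # ThreadSanitizer is mutually exclusive with the other sanitizers.
    if [ "$has_tsan" -eq 1 ] && [ "$count" -gt 1 ]; then
        echo "tsan is mutually exclusive with other sanitizers" >&2
        return 1
    fi
    echo "$flags"
}
```

The output can then be passed to the configure step, for example: cmake -B build -DCMAKE_BUILD_TYPE=Debug $(umf_sanitizer_flags asan ubsan).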

Fuzz testing

To enable fuzz testing, the UMF_BUILD_FUZZTESTS CMake configuration flag must be set to ON. Note that this feature is supported only on Linux and requires Clang. Additionally, ensure that CMAKE_PREFIX_PATH includes the directory containing the libraries necessary for fuzzing (e.g., Clang's libclang_rt.fuzzer_no_main-x86_64.a).

Example:

cmake -B build -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -DUMF_BUILD_FUZZTESTS=ON -DCMAKE_PREFIX_PATH=/path/to/fuzzer/libs

CMake standard options

List of options provided by CMake:

Name	Description	Values	Default
UMF_BUILD_SHARED_LIBRARY	Build UMF as shared library	ON/OFF	OFF
UMF_BUILD_LEVEL_ZERO_PROVIDER	Build Level Zero memory provider	ON/OFF	ON
UMF_BUILD_CUDA_PROVIDER	Build CUDA memory provider	ON/OFF	ON
UMF_BUILD_LIBUMF_POOL_JEMALLOC	Build the libumf_pool_jemalloc static library	ON/OFF	OFF
UMF_BUILD_TESTS	Build UMF tests	ON/OFF	ON
UMF_BUILD_GPU_TESTS	Build UMF GPU tests	ON/OFF	OFF
UMF_BUILD_BENCHMARKS	Build UMF benchmarks	ON/OFF	OFF
UMF_BUILD_EXAMPLES	Build UMF examples	ON/OFF	ON
UMF_BUILD_FUZZTESTS	Build UMF fuzz tests (supported only on Linux with Clang)	ON/OFF	OFF
UMF_BUILD_GPU_EXAMPLES	Build UMF GPU examples	ON/OFF	OFF
UMF_DEVELOPER_MODE	Enable additional developer checks and logs	ON/OFF	OFF
UMF_FORMAT_CODE_STYLE	Add clang, cmake, and black -format-check and -format-apply targets to make	ON/OFF	OFF
UMF_TESTS_FAIL_ON_SKIP	Treat skips in tests as fail	ON/OFF	OFF
UMF_USE_ASAN	Enable AddressSanitizer checks	ON/OFF	OFF
UMF_USE_UBSAN	Enable UndefinedBehaviorSanitizer checks	ON/OFF	OFF
UMF_USE_TSAN	Enable ThreadSanitizer checks	ON/OFF	OFF
UMF_USE_MSAN	Enable MemorySanitizer checks	ON/OFF	OFF
UMF_USE_VALGRIND	Enable Valgrind instrumentation	ON/OFF	OFF
UMF_USE_COVERAGE	Build with coverage enabled (Linux only)	ON/OFF	OFF
UMF_LINK_HWLOC_STATICALLY	Link UMF with HWLOC library statically (proxy library will be disabled on Windows+Debug build)	ON/OFF	OFF

Architecture: memory pools and providers

A UMF memory pool is a combination of a pool allocator and a memory provider. A memory provider is responsible for coarse-grained memory allocations and management of memory pages, while the pool allocator controls memory pooling and handles fine-grained memory allocations.

A pool allocator can leverage existing allocators (e.g. jemalloc or tbbmalloc) or be written from scratch.

UMF comes with predefined pool allocators (see include/umf/pools) and providers (see include/umf/providers). UMF can also work with user-defined pools and providers that implement a specific interface (see include/umf/memory_pool_ops.h and include/umf/memory_provider_ops.h).

More detailed documentation is available here: https://oneapi-src.github.io/unified-memory-framework/

Memory providers

Fixed memory provider

A memory provider that can provide memory from a given pre-allocated buffer.

OS memory provider

A memory provider that provides memory from an operating system.

The OS memory provider supports two types of memory mappings (set by the visibility parameter):

private memory mapping (UMF_MEM_MAP_PRIVATE)
shared memory mapping (UMF_MEM_MAP_SHARED, currently supported on Linux only)

The IPC API requires the UMF_MEM_MAP_SHARED memory visibility mode (UMF_RESULT_ERROR_INVALID_ARGUMENT is returned otherwise).

The IPC API uses file descriptor duplication, which requires the pidfd_getfd(2) system call to obtain a duplicate of another process's file descriptor. This system call is supported since Linux 5.6. The required permission ("restricted ptrace") is governed by the PTRACE_MODE_ATTACH_REALCREDS check (see ptrace(2)). To allow file descriptor duplication in a binary that opens an IPC handle, you can call prctl(PR_SET_PTRACER, ...) in the producer binary that gets the IPC handle. Alternatively, you can change the ptrace_scope globally in the system, e.g.:

sudo bash -c "echo 0 > /proc/sys/kernel/yama/ptrace_scope"
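Since pidfd_getfd(2) is only available from Linux 5.6 onwards, it can be useful to check the running kernel before relying on the IPC API. The following is a minimal sketch under the assumption that the kernel release string has the usual MAJOR.MINOR prefix reported by uname -r; the function name kernel_supports_pidfd_getfd is hypothetical and not part of UMF.

```shell
# Hypothetical helper: succeed if the kernel release (argument 1, or
# `uname -r` when omitted) is at least 5.6, the first Linux version
# providing pidfd_getfd(2).
kernel_supports_pidfd_getfd() {
    release="${1:-$(uname -r)}"
    # Expect at least "MAJOR.MINOR"; anything else is reported as status 2.
    case "$release" in
        *.*) ;;
        *) return 2 ;;
    esac
    major="${release%%.*}"       # text before the first dot
    rest="${release#*.}"         # text after the first dot
    minor="${rest%%[!0-9]*}"     # leading digits of the remainder
    case "$major" in ''|*[!0-9]*) return 2 ;; esac
    case "$minor" in '') return 2 ;; esac
    [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 6 ]; }
}
```

For example, kernel_supports_pidfd_getfd 5.15.0-91-generic succeeds, while kernel_supports_pidfd_getfd 5.5.19 fails. On systems with the Yama security module enabled, the current ptrace_scope value can be inspected with cat /proc/sys/kernel/yama/ptrace_scope.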

Two mechanisms are available for the shared memory mapping:

a named shared memory object (used if the shm_name parameter is not NULL) or
an anonymous file descriptor (used if the shm_name parameter is NULL)

The shm_name parameter should be a null-terminated string of up to NAME_MAX (i.e., 255) characters, none of which are slashes.
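The constraints on the name can be sketched as a small pre-flight check. This is an illustration only: is_valid_shm_name is a hypothetical helper, and rejecting an empty name is an assumption that the text above does not state.

```shell
# Hypothetical helper: check a candidate shared memory object name
# against the documented limits (at most 255 characters, no slashes).
is_valid_shm_name() {
    name="$1"
    # Assumption for this sketch: an empty name is treated as invalid.
    [ -n "$name" ] || return 1
    # NAME_MAX is taken as 255, as stated in the text.
    [ "${#name}" -le 255 ] || return 1
    case "$name" in
        */*) return 1 ;;   # slashes are not allowed
    esac
    return 0
}
```

For example, is_valid_shm_name umf_shm_1 succeeds, while is_valid_shm_name "bad/name" fails.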

An anonymous file descriptor for the shared memory mapping will be created using:

memfd_secret() syscall - (if it is implemented and) if the UMF_MEM_FD_FUNC environment variable does not contain the "memfd_create" string or
memfd_create() syscall - otherwise (and if it is implemented).
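The selection rule above can be written down as a short decision function. This is a sketch of the documented rule rather than UMF's actual implementation; select_memfd_func is a hypothetical name, and whether memfd_secret() is implemented is passed in as a plain yes/no argument instead of being probed.

```shell
# Hypothetical helper mirroring the documented selection rule.
# $1: value of UMF_MEM_FD_FUNC (may be empty)
# $2: "yes" if memfd_secret() is implemented on this system
select_memfd_func() {
    case "$1" in
        # The variable contains "memfd_create": memfd_secret() is skipped.
        *memfd_create*) echo memfd_create ;;
        *)
            if [ "$2" = yes ]; then
                echo memfd_secret
            else
                echo memfd_create
            fi
            ;;
    esac
}
```

A call such as select_memfd_func "$UMF_MEM_FD_FUNC" yes prints the syscall name that the rule would pick for the current environment.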

Requirements

The IPC API on Linux requires the PTRACE_MODE_ATTACH_REALCREDS permission (see ptrace(2)) to duplicate another process's file descriptor (see above).

Packages required for tests (currently Linux only):

libnuma-dev

Level Zero memory provider

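A memory provider that provides memory from a Level Zero (L0) device.

As described for the OS memory provider above, the IPC API on Linux duplicates another process's file descriptor. One way to permit this is to change the ptrace_scope globally in the system, e.g.:

sudo bash -c "echo 0 > /proc/sys/kernel/yama/ptrace_scope"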

Requirements

Linux or Windows OS
The UMF_BUILD_LEVEL_ZERO_PROVIDER option turned ON (by default)
The IPC API on Linux requires the PTRACE_MODE_ATTACH_REALCREDS permission (see ptrace(2)) to duplicate another process's file descriptor (see above).

Additionally, required for tests:

The UMF_BUILD_GPU_TESTS option turned ON
System with Level Zero compatible GPU
Required packages:
- liblevel-zero-dev (Linux) or level-zero-sdk (Windows)

DevDax memory provider (Linux only)

A memory provider that provides memory from a device DAX (a character device file like /dev/daxX.Y). It can be used when large memory mappings are needed.

Requirements

Linux OS
A character device file /dev/daxX.Y created in the OS.

File memory provider (currently Linux only)

A memory provider that provides memory by mapping a regular, extendable file.

The IPC API requires the UMF_MEM_MAP_SHARED memory visibility mode (UMF_RESULT_ERROR_INVALID_ARGUMENT is returned otherwise).

The memory visibility mode parameter must be set to UMF_MEM_MAP_SHARED in case of FSDAX.

Requirements

Linux OS
The path of the file to be mapped can be at most PATH_MAX (4096) characters long.

CUDA memory provider

A memory provider that provides memory from a CUDA device.

Requirements

Linux or Windows OS
The UMF_BUILD_CUDA_PROVIDER option turned ON (by default)

Additionally, required for tests:

The UMF_BUILD_GPU_TESTS option turned ON
System with CUDA compatible GPU
Required packages:
- nvidia-cuda-dev (Linux) or cuda-sdk (Windows)

Memory pool managers

Proxy pool (part of libumf)

This memory pool is distributed as part of libumf. It forwards all requests to the underlying memory provider. Currently the umfPoolRealloc, umfPoolCalloc and umfPoolMallocUsableSize functions are not supported by the proxy pool.

Disjoint pool (part of libumf)

The Disjoint pool is designed to keep internal metadata separate from user data. This separation is particularly useful when user data needs to be placed in memory with relatively high latency, such as GPU memory or disk storage.

Jemalloc pool

Jemalloc pool is a jemalloc-based memory pool manager built as a separate static library: libjemalloc_pool.a on Linux and jemalloc_pool.lib on Windows. The UMF_BUILD_LIBUMF_POOL_JEMALLOC option has to be turned ON to build this library.

jemalloc is required to build the jemalloc pool.

On Linux, jemalloc is built from the (fetched) sources with the following non-default options enabled:

--with-jemalloc-prefix=je_ - adds the je_ prefix to all public APIs,
--disable-cxx - disables C++ integration, which causes the implementations of the new and delete operators to be omitted,
--disable-initial-exec-tls - disables the initial-exec TLS model for jemalloc's internal thread-local storage (on those platforms that support explicit settings), which can allow jemalloc to be dynamically loaded after program startup (e.g. using dlopen()).

The default jemalloc package is required on Windows.

Requirements

The UMF_BUILD_LIBUMF_POOL_JEMALLOC option turned ON
jemalloc is required:

on Linux and macOS: jemalloc is fetched and built from sources (a custom build),
on Windows: the default jemalloc package is required

Scalable Pool (part of libumf)

Scalable Pool is a oneTBB-based memory pool manager. It is distributed as part of libumf. To use this pool, TBB must be installed in the system.

Requirements

Packages required for using this pool and executing tests/benchmarks (not required for build):

libtbb-dev (libtbbmalloc.so.2) on Linux or tbb (tbbmalloc.dll) on Windows

Memspaces (Linux-only)

Note: The memspace, memtarget and mempolicy APIs are experimental and may change in future releases.

TODO: Add general information about memspaces.

Host all memspace

Memspace backed by all available NUMA nodes discovered on the platform. Can be retrieved using umfMemspaceHostAllGet.

Highest capacity memspace

Memspace backed by all available NUMA nodes discovered on the platform, sorted by capacity. Can be retrieved using umfMemspaceHighestCapacityGet.

Highest bandwidth memspace

Memspace backed by an aggregated list of NUMA nodes identified as highest bandwidth after selecting each available NUMA node as the initiator. Querying the bandwidth value requires HMAT support on the platform. Calling umfMemspaceHighestBandwidthGet() will return NULL if it's not supported.

Lowest latency memspace

Memspace backed by an aggregated list of NUMA nodes identified as lowest latency after selecting each available NUMA node as the initiator. Querying the latency value requires HMAT support on the platform. Calling umfMemspaceLowestLatencyGet() will return NULL if it's not supported.

Proxy library

UMF provides the UMF proxy library (umf_proxy) that makes it possible to override the default allocator in other programs on both Linux and Windows.

To enable this feature, the UMF_BUILD_SHARED_LIBRARY option needs to be turned ON.

Linux

On Linux, this can be done without any code changes using the LD_PRELOAD environment variable:

LD_PRELOAD=/usr/lib/libumf_proxy.so myprogram

The memory used by the proxy memory allocator is mmap'ed:

with the MAP_PRIVATE flag by default or
with the MAP_SHARED flag if the UMF_PROXY environment variable contains one of the two following strings: page.disposition=shared-shm or page.disposition=shared-fd. These two options differ in the mechanism used during IPC:
- page.disposition=shared-shm - IPC uses the named shared memory. An SHM name is generated using the umf_proxy_lib_shm_pid_$PID pattern, where $PID is the PID of the process. It creates the /dev/shm/umf_proxy_lib_shm_pid_$PID file.
- page.disposition=shared-fd - the IPC API uses file descriptor duplication, which requires the pidfd_getfd(2) system call to obtain a duplicate of another process's file descriptor. This system call is supported since Linux 5.6. The required permission ("restricted ptrace") is governed by the PTRACE_MODE_ATTACH_REALCREDS check (see ptrace(2)). To allow file descriptor duplication in a binary that opens an IPC handle, you can call prctl(PR_SET_PTRACER, ...) in the producer binary that gets the IPC handle. Alternatively, you can change the ptrace_scope globally in the system, e.g.: sudo bash -c "echo 0 > /proc/sys/kernel/yama/ptrace_scope".
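To make the effect of the UMF_PROXY setting concrete, the following sketch classifies a given value according to the rules above. It is an illustration, not the proxy library's actual parser; proxy_page_disposition is a hypothetical name, and checking shared-shm before shared-fd when both appear is an assumption of this sketch.

```shell
# Hypothetical helper: report which mapping the proxy library would use
# for a given UMF_PROXY value, following the documented rules.
# Prints "shared-shm", "shared-fd" or "private".
proxy_page_disposition() {
    case "$1" in
        *page.disposition=shared-shm*) echo shared-shm ;;  # MAP_SHARED, named shared memory
        *page.disposition=shared-fd*)  echo shared-fd ;;   # MAP_SHARED, fd duplication
        *)                             echo private ;;     # MAP_PRIVATE (default)
    esac
}
```

For example, proxy_page_disposition "$UMF_PROXY" can be run before launching a program with LD_PRELOAD to confirm which IPC mechanism the current environment selects.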

Size threshold

The size threshold feature (Linux only) causes all allocations smaller than the given threshold value to go to the default system allocator instead of the proxy library. It can be enabled by adding the size.threshold=<value> string to the UMF_PROXY environment variable (with ';' as a separator), for example: UMF_PROXY="page.disposition=shared-shm;size.threshold=64".

Remark: changing the size of an allocation (using realloc()) does not change the allocator (realloc(malloc(threshold - 1), threshold + 1) still belongs to the default system allocator and realloc(malloc(threshold + 1), threshold - 1) still belongs to the proxy library pool allocator).
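The threshold rule can be sketched as two small helpers: one that extracts the size.threshold value from a UMF_PROXY string, and one that reports which allocator serves a new allocation of a given size. Both function names are hypothetical and the parsing is a simplified illustration, not the proxy library's implementation.

```shell
# Hypothetical helper: print the size.threshold value found in a
# UMF_PROXY string, or fail if the option is absent.
proxy_size_threshold() {
    case "$1" in
        *size.threshold=*)
            value="${1##*size.threshold=}"   # drop everything up to the option
            value="${value%%;*}"             # drop any following ';'-separated options
            echo "$value"
            ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: allocations smaller than the threshold go to the
# system allocator, all others to the proxy library.
# $1: allocation size, $2: threshold
allocator_for_size() {
    if [ "$1" -lt "$2" ]; then
        echo system
    else
        echo proxy
    fi
}
```

With UMF_PROXY="page.disposition=shared-shm;size.threshold=64", proxy_size_threshold "$UMF_PROXY" prints 64, so a 63-byte allocation is served by the system allocator and a 64-byte one by the proxy library. As the remark above notes, this decision is made when the memory is first allocated and is not revisited by realloc().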

Windows

On Windows, this requires:

explicitly linking your program dynamically with the umf_proxy.dll library
(C++ code only) including proxy_lib_new_delete.h in a single(!) source file in your project to also override the new/delete operations.

Contributions

All contributions to the UMF project are most welcome! Before submitting an issue or a Pull Request, please read the Contribution Guide.

Logging

To enable logging in UMF source files, please follow the guide in the web documentation.

CMake integration

Integration of UMF into another project via CMake's FetchContent is possible with:

include(FetchContent)
FetchContent_Declare(
    unified-memory-framework
    GIT_REPOSITORY https://github.com/oneapi-src/unified-memory-framework.git
    GIT_TAG main # This will pull the latest (potentially unstable) changes from the main branch
)
FetchContent_MakeAvailable(unified-memory-framework)

add_executable(some_example some_example.cpp)
target_include_directories(some_example PRIVATE ${unified-memory-framework_SOURCE_DIR}/include)
target_link_libraries(some_example PRIVATE umf::umf umf::headers)

Notices

The contents of this repository may have been developed with support from one or more Intel-operated generative artificial intelligence solutions.

