NVIDIA CUDA Toolkit Documentation
CUDA Driver API - v13.0.97 - Last updated October 2, 2025

6.14. Virtual Memory Management

This section describes the virtual memory management functions of the low-level CUDA driver application programming interface.

Functions

CUresult cuMemAddressFree ( CUdeviceptr ptr, size_t size )
Free an address range reservation.
CUresult cuMemAddressReserve ( CUdeviceptr* ptr, size_t size, size_t alignment, CUdeviceptr addr, unsigned long long flags )
Allocate an address range reservation.
CUresult cuMemCreate ( CUmemGenericAllocationHandle* handle, size_t size, const CUmemAllocationProp* prop, unsigned long long flags )
Create a CUDA memory handle representing a memory allocation of a given size described by the given properties.
CUresult cuMemExportToShareableHandle ( void* shareableHandle, CUmemGenericAllocationHandle handle, CUmemAllocationHandleType handleType, unsigned long long flags )
Exports an allocation to a requested shareable handle type.
CUresult cuMemGetAccess ( unsigned long long* flags, const CUmemLocation* location, CUdeviceptr ptr )
Get the access flags set for the given location and ptr.
CUresult cuMemGetAllocationGranularity ( size_t* granularity, const CUmemAllocationProp* prop, CUmemAllocationGranularity_flags option )
Calculates either the minimal or recommended granularity.
CUresult cuMemGetAllocationPropertiesFromHandle ( CUmemAllocationProp* prop, CUmemGenericAllocationHandle handle )
Retrieve the contents of the property structure defining properties for this handle.
CUresult cuMemImportFromShareableHandle ( CUmemGenericAllocationHandle* handle, void* osHandle, CUmemAllocationHandleType shHandleType )
Imports an allocation from a requested shareable handle type.
CUresult cuMemMap ( CUdeviceptr ptr, size_t size, size_t offset, CUmemGenericAllocationHandle handle, unsigned long long flags )
Maps an allocation handle to a reserved virtual address range.
CUresult cuMemMapArrayAsync ( CUarrayMapInfo* mapInfoList, unsigned int count, CUstream hStream )
Maps or unmaps subregions of sparse CUDA arrays and sparse CUDA mipmapped arrays.
CUresult cuMemRelease ( CUmemGenericAllocationHandle handle )
Release a memory handle representing a memory allocation which was previously allocated through cuMemCreate.
CUresult cuMemRetainAllocationHandle ( CUmemGenericAllocationHandle* handle, void* addr )
Given an address addr, returns the allocation handle of the backing memory allocation.
CUresult cuMemSetAccess ( CUdeviceptr ptr, size_t size, const CUmemAccessDesc* desc, size_t count )
Set the access flags for each location specified in desc for the given virtual address range.
CUresult cuMemUnmap ( CUdeviceptr ptr, size_t size )
Unmap the backing memory of a given address range.

Functions

CUresult cuMemAddressFree ( CUdeviceptr ptr, size_t size )
Free an address range reservation.
Parameters
ptr
- Starting address of the virtual address range to free
size
- Size of the virtual address region to free
Description

Frees a virtual address range reserved by cuMemAddressReserve. The size must match the size passed to cuMemAddressReserve, and ptr must match the address that cuMemAddressReserve returned.

See also:

cuMemAddressReserve

CUresult cuMemAddressReserve ( CUdeviceptr* ptr, size_t size, size_t alignment, CUdeviceptr addr, unsigned long long flags )
Allocate an address range reservation.
Parameters
ptr
- Resulting pointer to start of virtual address range allocated
size
- Size of the reserved virtual address range requested
alignment
- Alignment of the reserved virtual address range requested
addr
- Hint address for the start of the address range
flags
- Currently unused, must be zero
Description

Reserves a virtual address range based on the given parameters, returning the starting address of the range in ptr. This API requires a system that supports UVA. The size and address parameters must be a multiple of the host page size, and the alignment must be a power of two, or zero for default alignment. If addr is 0, the driver chooses the address at which to place the start of the reservation; if it is non-zero, the driver treats it as a hint about where to place the reservation.
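
The constraints above can be checked before calling the driver. The following is a minimal sketch in plain C; `reserve_args_ok` and its `page_size` parameter are hypothetical names introduced here for illustration and are not part of the CUDA API. It assumes the caller has already obtained the host page size (for example from sysconf(_SC_PAGESIZE) on Linux).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CUDA API): returns true when the arguments
 * satisfy the constraints documented for cuMemAddressReserve.
 * page_size is the host page size supplied by the caller. */
static bool reserve_args_ok(size_t size, size_t alignment,
                            uintptr_t addr, size_t page_size)
{
    if (page_size == 0)
        return false;                 /* guard against division by zero */

    /* size and the hint address must be multiples of the host page size;
     * addr == 0 (driver picks the address) passes this test trivially */
    if (size % page_size != 0 || addr % page_size != 0)
        return false;

    /* alignment is either 0 (default) or a power of two */
    return alignment == 0 || (alignment & (alignment - 1)) == 0;
}
```

A check like this only mirrors the documented preconditions; the driver remains the authority on whether a reservation succeeds.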

See also:

cuMemAddressFree

CUresult cuMemCreate ( CUmemGenericAllocationHandle* handle, size_t size, const CUmemAllocationProp* prop, unsigned long long flags )
Create a CUDA memory handle representing a memory allocation of a given size described by the given properties.
Parameters
handle
- Value of handle returned. All operations on this allocation are to be performed using this handle.
size
- Size of the allocation requested
prop
- Properties of the allocation to create.
flags
- flags for future use, must be zero now.
Description

This creates a memory allocation on the target device specified through the prop structure. The created allocation will not have any device or host mappings. The generic memory handle for the allocation can be mapped into the address space of the calling process via cuMemMap. This handle cannot be transmitted directly to other processes (see cuMemExportToShareableHandle). On Windows, the caller must also pass an LPSECURITYATTRIBUTE in prop to be associated with this handle, which limits or allows access to this handle for a recipient process (see CUmemAllocationProp::win32HandleMetaData for more). The size of this allocation must be a multiple of the value given via cuMemGetAllocationGranularity with the CU_MEM_ALLOC_GRANULARITY_MINIMUM flag.

To create a CPU allocation that doesn't target any specific NUMA nodes, applications must set CUmemAllocationProp::CUmemLocation::type to CU_MEM_LOCATION_TYPE_HOST. CUmemAllocationProp::CUmemLocation::id is ignored for HOST allocations. HOST allocations are not IPC capable and CUmemAllocationProp::requestedHandleTypes must be 0; any other value will result in CUDA_ERROR_INVALID_VALUE.

To create a CPU allocation targeting a specific host NUMA node, applications must set CUmemAllocationProp::CUmemLocation::type to CU_MEM_LOCATION_TYPE_HOST_NUMA, and CUmemAllocationProp::CUmemLocation::id must specify the NUMA ID of the CPU. On systems where NUMA is not available, CUmemAllocationProp::CUmemLocation::id must be set to 0. Specifying CU_MEM_LOCATION_TYPE_HOST_NUMA_CURRENT as the CUmemLocation::type will result in CUDA_ERROR_INVALID_VALUE.

Applications that intend to use CU_MEM_HANDLE_TYPE_FABRIC based memory sharing must ensure that: (1) the `nvidia-caps-imex-channels` character device is created by the driver and is listed under /proc/devices, and (2) at least one IMEX channel file is accessible by the user launching the application.

When exporter and importer CUDA processes have been granted access to the same IMEX channel, they can securely share memory.

The IMEX channel security model works on a per-user basis: all processes under a user can share memory if the user has access to a valid IMEX channel. When multi-user isolation is desired, a separate IMEX channel is required for each user.

These channel files exist in /dev/nvidia-caps-imex-channels/channel* and can be created using standard OS native calls like mknod on Linux. For example, to create channel0 with the major number from /proc/devices, users can execute the following command: `mknod /dev/nvidia-caps-imex-channels/channel0 c <major number> 0`

If CUmemAllocationProp::allocFlags::usage contains the CU_MEM_CREATE_USAGE_TILE_POOL flag, then the memory allocation is intended only to be used as a backing tile pool for sparse CUDA arrays and sparse CUDA mipmapped arrays (see cuMemMapArrayAsync).

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuMemRelease, cuMemExportToShareableHandle, cuMemImportFromShareableHandle

CUresult cuMemExportToShareableHandle ( void* shareableHandle, CUmemGenericAllocationHandle handle, CUmemAllocationHandleType handleType, unsigned long long flags )
Exports an allocation to a requested shareable handle type.
Parameters
shareableHandle
- Pointer to the location in which to store the requested handle type
handle
- CUDA handle for the memory allocation
handleType
- Type of shareable handle requested (defines type and size of the shareableHandle output parameter)
flags
- Reserved, must be zero
Description

Given a CUDA memory handle, create a shareable memory allocation handle that can be used to share the memory with other processes. The recipient process can convert the shareable handle back into a CUDA memory handle using cuMemImportFromShareableHandle and map it with cuMemMap. What this handle is and how it can be transferred are defined by the requested handle type in handleType.

Once all shareable handles are closed and the allocation is released, the allocated memory referenced will be released back to the OS and uses of the CUDA handle afterward will lead to undefined behavior.

This API can also be used in conjunction with other APIs (e.g. Vulkan, OpenGL) that support importing memory from the shareable type.

See also:

cuMemImportFromShareableHandle

CUresult cuMemGetAccess ( unsigned long long* flags, const CUmemLocation* location, CUdeviceptr ptr )
Get the access flags set for the given location and ptr.
Parameters
flags
- Flags set for this location
location
- Location for which to check the flags
ptr
- Address for which to check the access flags
Description

See also:

cuMemSetAccess

CUresult cuMemGetAllocationGranularity ( size_t* granularity, const CUmemAllocationProp* prop, CUmemAllocationGranularity_flags option )
Calculates either the minimal or recommended granularity.
Parameters
granularity
- Returned granularity.
prop
- Property for which to determine the granularity
option
- Determines which granularity to return
Description

Calculates either the minimal or recommended granularity for a given allocation specification and returns it in granularity. This granularity can be used as a multiple for alignment, size, or address mapping.
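
Because sizes passed to cuMemCreate and cuMemMap must be multiples of this granularity, callers commonly round a requested size up before allocating. The following is a minimal sketch of that rounding step; `round_up_to_granularity` is a hypothetical helper name, not a CUDA function, and the granularity value is assumed to have been obtained from cuMemGetAllocationGranularity beforehand.

```c
#include <stddef.h>

/* Hypothetical helper (not a CUDA API): rounds size up to the next
 * multiple of granularity. It uses the remainder rather than bit masks,
 * so it does not assume the granularity is a power of two.
 * Returns 0 if granularity is 0 or the result would overflow size_t. */
static size_t round_up_to_granularity(size_t size, size_t granularity)
{
    if (granularity == 0)
        return 0;

    size_t rem = size % granularity;
    if (rem == 0)
        return size;                  /* already a multiple */

    size_t pad = granularity - rem;
    if (size > (size_t)-1 - pad)
        return 0;                     /* size + pad would overflow */

    return size + pad;
}
```

For example, with an illustrative granularity of 2 MiB (the actual value depends on the device and allocation properties), a 1-byte request rounds up to 2 MiB and a request of 2 MiB plus 1 byte rounds up to 4 MiB.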

See also:

cuMemCreate, cuMemMap

CUresult cuMemGetAllocationPropertiesFromHandle ( CUmemAllocationProp* prop, CUmemGenericAllocationHandle handle )
Retrieve the contents of the property structure defining properties for this handle.
Parameters
prop
- Pointer to a properties structure which will hold the information about this handle
handle
- Handle on which to perform the query

CUresult cuMemImportFromShareableHandle ( CUmemGenericAllocationHandle* handle, void* osHandle, CUmemAllocationHandleType shHandleType )
Imports an allocation from a requested shareable handle type.
Parameters
handle
- CUDA Memory handle for the memory allocation.
osHandle
- Shareable Handle representing the memory allocation that is to be imported.
shHandleType
- Handle type of the exported handle (CUmemAllocationHandleType).
Description

If the current process cannot support the memory described by this shareable handle, this API will fail with CUDA_ERROR_NOT_SUPPORTED.

If shHandleType is CU_MEM_HANDLE_TYPE_FABRIC and the importer process has not been granted access to the same IMEX channel as the exporter process, this API will fail with CUDA_ERROR_NOT_PERMITTED.

Note:

Importing shareable handles exported from some graphics APIs (Vulkan, OpenGL, etc.) created on devices under an SLI group may not be supported, and thus this API will return CUDA_ERROR_NOT_SUPPORTED. There is no guarantee that the contents of handle will be the same CUDA memory handle for the same given OS shareable handle, or the same underlying allocation.

See also:

cuMemExportToShareableHandle, cuMemMap, cuMemRelease

CUresult cuMemMap ( CUdeviceptr ptr, size_t size, size_t offset, CUmemGenericAllocationHandle handle, unsigned long long flags )
Maps an allocation handle to a reserved virtual address range.
Parameters
ptr
- Address where memory will be mapped.
size
- Size of the memory mapping.
offset
- Offset into the memory represented by handle from which to start mapping. Note: currently must be zero.
handle
- Handle to a shareable memory allocation
flags
- flags for future use, must be zero now.
Description

Maps size bytes of the memory represented by handle, starting at byte offset, to the address range [ptr, ptr + size). This range must be an address reservation previously reserved with cuMemAddressReserve, and offset + size must not exceed the size of the memory allocation. ptr, size, and offset must each be a multiple of the value given via cuMemGetAllocationGranularity with the CU_MEM_ALLOC_GRANULARITY_MINIMUM flag. If handle represents a multicast object, ptr, size and offset must be aligned to the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_MINIMUM_GRANULARITY. For best performance, however, it is recommended that ptr, size and offset be aligned to the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_RECOMMENDED_GRANULARITY.

When handle represents a multicast object, this call may return CUDA_ERROR_ILLEGAL_STATE if the system configuration is in an illegal state. In such cases, to continue using multicast, verify that the system configuration is in a valid state and all required driver daemons are running properly.

Note that calling cuMemMap does not make the address accessible; the caller needs to update the accessibility of a contiguous mapped VA range by calling cuMemSetAccess.

Once a recipient process obtains a shareable memory handle from cuMemImportFromShareableHandle, the process must use cuMemMap to map the memory into its address ranges before setting accessibility with cuMemSetAccess.

cuMemMap can only create mappings on VA range reservations that are not currently mapped.
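
The alignment and bounds rules for cuMemMap can be expressed as a small precondition check. This is a sketch only: `map_args_ok`, `alloc_size`, and `granularity` are hypothetical names introduced for illustration, and the check assumes that a mapping covering the allocation exactly (offset + size equal to the allocation size) is acceptable, which matches the common pattern of mapping a whole allocation at offset 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CUDA API): validates cuMemMap-style arguments
 * against the minimum granularity and the size of the backing allocation.
 * Assumption: offset + size == alloc_size is treated as valid. */
static bool map_args_ok(uintptr_t ptr, size_t size, size_t offset,
                        size_t alloc_size, size_t granularity)
{
    if (granularity == 0)
        return false;                 /* guard against division by zero */

    /* ptr, size and offset must each be a multiple of the granularity */
    if (ptr % granularity != 0 || size % granularity != 0 ||
        offset % granularity != 0)
        return false;

    /* offset + size must stay inside the allocation; written this way
     * so the addition cannot overflow */
    if (offset > alloc_size || size > alloc_size - offset)
        return false;

    return true;
}
```

The check is kept general in offset even though the parameter description says offset currently must be zero; a stricter variant could simply reject any non-zero offset.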

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuMemUnmap, cuMemSetAccess, cuMemCreate, cuMemAddressReserve, cuMemImportFromShareableHandle

CUresult cuMemMapArrayAsync ( CUarrayMapInfo* mapInfoList, unsigned int count, CUstream hStream )
Maps or unmaps subregions of sparse CUDA arrays and sparse CUDA mipmapped arrays.
Parameters
mapInfoList
- List of CUarrayMapInfo
count
- Count of CUarrayMapInfo in mapInfoList
hStream
- Stream identifier for the stream to use for map or unmap operations
Description

Performs map or unmap operations on subregions of sparse CUDA arrays and sparse CUDA mipmapped arrays. Each operation is specified by a CUarrayMapInfo entry in the mapInfoList array of size count. The structure CUarrayMapInfo is defined as follows:

    typedef struct CUarrayMapInfo_st {
        CUresourcetype resourceType;
        union {
            CUmipmappedArray mipmap;
            CUarray array;
        } resource;
        CUarraySparseSubresourceType subresourceType;
        union {
            struct {
                unsigned int level;
                unsigned int layer;
                unsigned int offsetX;
                unsigned int offsetY;
                unsigned int offsetZ;
                unsigned int extentWidth;
                unsigned int extentHeight;
                unsigned int extentDepth;
            } sparseLevel;
            struct {
                unsigned int layer;
                unsigned long long offset;
                unsigned long long size;
            } miptail;
        } subresource;
        CUmemOperationType memOperationType;
        CUmemHandleType memHandleType;
        union {
            CUmemGenericAllocationHandle memHandle;
        } memHandle;
        unsigned long long offset;
        unsigned int deviceBitMask;
        unsigned int flags;
        unsigned int reserved[2];
    } CUarrayMapInfo;

where CUarrayMapInfo::resourceType specifies the type of resource to be operated on. If CUarrayMapInfo::resourceType is set to CUresourcetype::CU_RESOURCE_TYPE_ARRAY then CUarrayMapInfo::resource::array must be set to a valid sparse CUDA array handle. The CUDA array must be either a 2D, 2D layered or 3D CUDA array and must have been allocated using cuArrayCreate or cuArray3DCreate with the flag CUDA_ARRAY3D_SPARSE or CUDA_ARRAY3D_DEFERRED_MAPPING. For CUDA arrays obtained using cuMipmappedArrayGetLevel, CUDA_ERROR_INVALID_VALUE will be returned.

If CUarrayMapInfo::resourceType is set to CUresourcetype::CU_RESOURCE_TYPE_MIPMAPPED_ARRAY then CUarrayMapInfo::resource::mipmap must be set to a valid sparse CUDA mipmapped array handle. The CUDA mipmapped array must be either a 2D, 2D layered or 3D CUDA mipmapped array and must have been allocated using cuMipmappedArrayCreate with the flag CUDA_ARRAY3D_SPARSE or CUDA_ARRAY3D_DEFERRED_MAPPING.

CUarrayMapInfo::subresourceType specifies the type of subresource within the resource. CUarraySparseSubresourceType_enum is defined as:

    typedef enum CUarraySparseSubresourceType_enum {
        CU_ARRAY_SPARSE_SUBRESOURCE_TYPE_SPARSE_LEVEL = 0,
        CU_ARRAY_SPARSE_SUBRESOURCE_TYPE_MIPTAIL = 1
    } CUarraySparseSubresourceType;

where CUarraySparseSubresourceType::CU_ARRAY_SPARSE_SUBRESOURCE_TYPE_SPARSE_LEVEL indicates a sparse-miplevel which spans at least one tile in every dimension. The remaining miplevels which are too small to span at least one tile in any dimension constitute the mip tail region as indicated by CUarraySparseSubresourceType::CU_ARRAY_SPARSE_SUBRESOURCE_TYPE_MIPTAIL subresource type.

If CUarrayMapInfo::subresourceType is set to CUarraySparseSubresourceType::CU_ARRAY_SPARSE_SUBRESOURCE_TYPE_SPARSE_LEVEL then the CUarrayMapInfo::subresource::sparseLevel struct must contain valid array subregion offsets and extents. The CUarrayMapInfo::subresource::sparseLevel::offsetX, CUarrayMapInfo::subresource::sparseLevel::offsetY and CUarrayMapInfo::subresource::sparseLevel::offsetZ must specify valid X, Y and Z offsets respectively. The CUarrayMapInfo::subresource::sparseLevel::extentWidth, CUarrayMapInfo::subresource::sparseLevel::extentHeight and CUarrayMapInfo::subresource::sparseLevel::extentDepth must specify valid width, height and depth extents respectively. These offsets and extents must be aligned to the corresponding tile dimension. For CUDA mipmapped arrays, CUarrayMapInfo::subresource::sparseLevel::level must specify a valid mip level index. Otherwise, it must be zero. For layered CUDA arrays and layered CUDA mipmapped arrays, CUarrayMapInfo::subresource::sparseLevel::layer must specify a valid layer index. Otherwise, it must be zero. CUarrayMapInfo::subresource::sparseLevel::offsetZ must be zero and CUarrayMapInfo::subresource::sparseLevel::extentDepth must be set to 1 for 2D and 2D layered CUDA arrays and CUDA mipmapped arrays. Tile extents can be obtained by calling cuArrayGetSparseProperties and cuMipmappedArrayGetSparseProperties.
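
The tile-alignment rules for a sparse level can be sketched as a standalone check. The two structs below are hypothetical stand-ins that mirror only the relevant fields of CUarrayMapInfo::subresource::sparseLevel and of the tile extent reported by the sparse-properties queries; `sparse_region_ok` is likewise an illustrative name, not a CUDA function.

```c
#include <stdbool.h>

/* Hypothetical mirror of the offset/extent fields of sparseLevel. */
struct sparse_region {
    unsigned int offsetX, offsetY, offsetZ;
    unsigned int extentWidth, extentHeight, extentDepth;
};

/* Hypothetical mirror of a tile extent (width, height, depth). */
struct tile_extent {
    unsigned int width, height, depth;
};

/* Returns true when the region follows the rules stated above:
 * offsets and extents aligned to the tile dimensions, and for 2D or
 * 2D layered resources offsetZ == 0 and extentDepth == 1. */
static bool sparse_region_ok(const struct sparse_region *r,
                             const struct tile_extent *t, bool is_3d)
{
    if (t->width == 0 || t->height == 0 || t->depth == 0)
        return false;                 /* guard against division by zero */

    if (r->offsetX % t->width != 0 || r->extentWidth % t->width != 0)
        return false;
    if (r->offsetY % t->height != 0 || r->extentHeight % t->height != 0)
        return false;

    if (!is_3d)
        return r->offsetZ == 0 && r->extentDepth == 1;

    return r->offsetZ % t->depth == 0 && r->extentDepth % t->depth == 0;
}
```

In real code the tile extent would come from cuArrayGetSparseProperties or cuMipmappedArrayGetSparseProperties rather than being filled in by hand.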

If CUarrayMapInfo::subresourceType is set to CUarraySparseSubresourceType::CU_ARRAY_SPARSE_SUBRESOURCE_TYPE_MIPTAIL then the CUarrayMapInfo::subresource::miptail struct must contain a valid mip tail offset in CUarrayMapInfo::subresource::miptail::offset and size in CUarrayMapInfo::subresource::miptail::size. Both the mip tail offset and the mip tail size must be aligned to the tile size. For layered CUDA mipmapped arrays which don't have the flag CU_ARRAY_SPARSE_PROPERTIES_SINGLE_MIPTAIL set in CUDA_ARRAY_SPARSE_PROPERTIES::flags as returned by cuMipmappedArrayGetSparseProperties, CUarrayMapInfo::subresource::miptail::layer must specify a valid layer index. Otherwise, it must be zero.

If CUarrayMapInfo::resource::array or CUarrayMapInfo::resource::mipmap was created with the CUDA_ARRAY3D_DEFERRED_MAPPING flag set, CUarrayMapInfo::subresourceType and the contents of CUarrayMapInfo::subresource will be ignored.

CUarrayMapInfo::memOperationType specifies the type of operation. CUmemOperationType is defined as:

    typedef enum CUmemOperationType_enum {
        CU_MEM_OPERATION_TYPE_MAP = 1,
        CU_MEM_OPERATION_TYPE_UNMAP = 2
    } CUmemOperationType;

If CUarrayMapInfo::memOperationType is set to CUmemOperationType::CU_MEM_OPERATION_TYPE_MAP then the subresource will be mapped onto the tile pool memory specified by CUarrayMapInfo::memHandle at offset CUarrayMapInfo::offset. The tile pool allocation has to be created by specifying the CU_MEM_CREATE_USAGE_TILE_POOL flag when calling cuMemCreate. Also, CUarrayMapInfo::memHandleType must be set to CUmemHandleType::CU_MEM_HANDLE_TYPE_GENERIC.

If CUarrayMapInfo::memOperationType is set to CUmemOperationType::CU_MEM_OPERATION_TYPE_UNMAP then an unmapping operation is performed. CUarrayMapInfo::memHandle must be NULL.

CUarrayMapInfo::deviceBitMask specifies the list of devices that must map or unmap physical memory. Currently, this mask must have exactly one bit set, and the corresponding device must match the device associated with the stream. If CUarrayMapInfo::memOperationType is set to CUmemOperationType::CU_MEM_OPERATION_TYPE_MAP, the device must also match the device associated with the tile pool memory allocation as specified by CUarrayMapInfo::memHandle.

CUarrayMapInfo::flags and CUarrayMapInfo::reserved[] are unused and must be set to zero.

See also:

cuMipmappedArrayCreate, cuArrayCreate, cuArray3DCreate, cuMemCreate, cuArrayGetSparseProperties, cuMipmappedArrayGetSparseProperties

CUresult cuMemRelease ( CUmemGenericAllocationHandle handle )
Release a memory handle representing a memory allocation which was previously allocated through cuMemCreate.
Parameters
handle
- Value of handle which was returned previously by cuMemCreate.
Description

Frees the memory that was allocated on a device through cuMemCreate.

The memory allocation will be freed when all outstanding mappings to the memory are unmapped and when all outstanding references to the handle (including its shareable counterparts) are also released. The generic memory handle can be freed when there are still outstanding mappings made with this handle. Each time a recipient process imports a shareable handle, it needs to pair it with cuMemRelease for the handle to be freed. If handle is not a valid handle, the behavior is undefined.
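
The lifetime rule above can be pictured with a toy model. This is purely illustrative: the struct and function below are hypothetical, and the sketch says nothing about how the driver actually tracks references. It only encodes the stated condition that backing memory is reclaimed once there are neither outstanding handle references nor outstanding mappings.

```c
#include <stdbool.h>

/* Toy model only (not driver code): counts the two kinds of references
 * that keep an allocation alive according to the description above. */
struct alloc_lifetime {
    unsigned int handle_refs;  /* handles not yet passed to cuMemRelease */
    unsigned int mappings;     /* mappings not yet passed to cuMemUnmap  */
};

/* The backing memory is freed only when both counts have reached zero. */
static bool backing_memory_freed(const struct alloc_lifetime *a)
{
    return a->handle_refs == 0 && a->mappings == 0;
}
```

Starting from one handle reference and one mapping, dropping the handle reference (the cuMemRelease step) leaves the memory alive in this model; it is reclaimed only once the mapping count also reaches zero (the cuMemUnmap step).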

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuMemCreate

CUresult cuMemRetainAllocationHandle ( CUmemGenericAllocationHandle* handle, void* addr )
Given an address addr, returns the allocation handle of the backing memory allocation.
Parameters
handle
- CUDA Memory handle for the backing memory allocation.
addr
- Memory address to query, that has been mapped previously.
Description

The handle is guaranteed to be the same handle value used to map the memory. If the address requested is not mapped, the function will fail. The returned handle must be released with a corresponding number of calls to cuMemRelease.

Note:

The address addr can be any address in a range previously mapped by cuMemMap, and not necessarily the start address.

See also:

cuMemCreate, cuMemRelease, cuMemMap

CUresult cuMemSetAccess ( CUdeviceptr ptr, size_t size, const CUmemAccessDesc* desc, size_t count )
Set the access flags for each location specified in desc for the given virtual address range.
Parameters
ptr
- Starting address for the virtual address range
size
- Length of the virtual address range
desc
- Array of CUmemAccessDesc that describe how to change the mapping for each location specified
count
- Number of CUmemAccessDesc in desc
Description

Given the virtual address range via ptr and size, and the locations in the array given by desc and count, set the access flags for the target locations. The range must be a fully mapped address range containing all allocations created by cuMemMap / cuMemCreate. Users cannot specify CU_MEM_LOCATION_TYPE_HOST_NUMA accessibility for allocations created with other location types. Note: When CUmemAccessDesc::CUmemLocation::type is CU_MEM_LOCATION_TYPE_HOST_NUMA, CUmemAccessDesc::CUmemLocation::id is ignored. When setting the access flags for a virtual address range mapping a multicast object, ptr and size must be aligned to the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_MINIMUM_GRANULARITY. For best performance, however, it is recommended that ptr and size be aligned to the value returned by cuMulticastGetGranularity with the flag CU_MULTICAST_RECOMMENDED_GRANULARITY.

Note:
  • Note that this function may also return error codes from previous, asynchronous launches.

  • This function exhibits synchronous behavior for most use cases.

See also:

cuMemSetAccess, cuMemCreate, cuMemMap

CUresult cuMemUnmap ( CUdeviceptr ptr, size_t size )
Unmap the backing memory of a given address range.
Parameters
ptr
- Starting address for the virtual address range to unmap
size
- Size of the virtual address range to unmap
Description

The range must be the entire contiguous address range that was mapped to. In other words, cuMemUnmap cannot unmap a sub-range of an address range mapped by cuMemCreate / cuMemMap. Any backing memory allocations will be freed if there are no existing mappings and there are no unreleased memory handles.

When cuMemUnmap returns successfully, the address range is converted to an address reservation and can be used for future calls to cuMemMap. Any new mapping to this virtual address will need to have access granted through cuMemSetAccess, as all mappings start with no accessibility set up.

Note:
  • Note that this function may also return error codes from previous, asynchronous launches.

  • This function exhibits synchronous behavior for most use cases.

See also:

cuMemCreate, cuMemAddressReserve


