This section describes the interactions between the CUDA Driver API and the CUDA Runtime API
Primary Contexts
There exists a one to one relationship between CUDA devices in the CUDA Runtime API andCUcontext s in the CUDA Driver API within a process. The specific context which the CUDA Runtime API uses for a device is called the device's primary context. From the perspective of the CUDA Runtime API, a device and its primary context are synonymous.
Initialization and Tear-Down
CUDA Runtime API calls operate on the CUDA Driver APICUcontext which is current to to the calling host thread.
The functioncudaInitDevice() ensures that the primary context is initialized for the requested device but does not make it current to the calling thread.
The functioncudaSetDevice() initializes the primary context for the specified device and makes it current to the calling thread by callingcuCtxSetCurrent().
The CUDA Runtime API will automatically initialize the primary context for a device at the first CUDA Runtime API call which requires an active context. If noCUcontext is current to the calling thread when a CUDA Runtime API call which requires an active context is made, then the primary context for a device will be selected, made current to the calling thread, and initialized.
The context which the CUDA Runtime API initializes will be initialized using the parameters specified by the CUDA Runtime API functionscudaSetDeviceFlags(),cudaD3D9SetDirect3DDevice(),cudaD3D10SetDirect3DDevice(),cudaD3D11SetDirect3DDevice(),cudaGLSetGLDevice(), andcudaVDPAUSetVDPAUDevice(). Note that these functions will fail withcudaErrorSetOnActiveProcess if they are called when the primary context for the specified device has already been initialized, except forcudaSetDeviceFlags() which will simply overwrite the previous settings.
Primary contexts will remain active until they are explicitly deinitialized usingcudaDeviceReset(). The functioncudaDeviceReset() will deinitialize the primary context for the calling thread's current device immediately. The context will remain current to all of the threads that it was current to. The next CUDA Runtime API call on any thread which requires an active context will trigger the reinitialization of that device's primary context.
Note that primary contexts are shared resources. It is recommended that the primary context not be reset except just before exit or to recover from an unspecified launch failure.
Context Interoperability
Note that the use of multipleCUcontext s per device within a single process will substantially degrade performance and is strongly discouraged. Instead, it is highly recommended that the implicit one-to-one device-to-context mapping for the process provided by the CUDA Runtime API be used.
If a non-primaryCUcontext created by the CUDA Driver API is current to a thread then the CUDA Runtime API calls to that thread will operate on thatCUcontext, with some exceptions listed below. Interoperability between data types is discussed in the following sections.
The functioncudaPointerGetAttributes() will return the errorcudaErrorIncompatibleDriverContext if the pointer being queried was allocated by a non-primary context. The functioncudaDeviceEnablePeerAccess() and the rest of the peer access API may not be called when a non-primaryCUcontext is current. To use the pointer query and peer access APIs with a context created using the CUDA Driver API, it is necessary that the CUDA Driver API be used to access these features.
All CUDA Runtime API state (e.g, global variables' addresses and values) travels with its underlyingCUcontext. In particular, if aCUcontext is moved from one thread to another then all CUDA Runtime API state will move to that thread as well.
Please note that attaching to legacy contexts (those with a version of 3010 as returned bycuCtxGetApiVersion()) is not possible. The CUDA Runtime will returncudaErrorIncompatibleDriverContext in such cases.
Interactions between CUstream and cudaStream_t
The typesCUstream andcudaStream_t are identical and may be used interchangeably.
Interactions between CUevent and cudaEvent_t
The typesCUevent andcudaEvent_t are identical and may be used interchangeably.
Interactions between CUarray and cudaArray_t
The typesCUarray and struct cudaArray * represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use aCUarray in a CUDA Runtime API function which takes a struct cudaArray *, it is necessary to explicitly cast theCUarray to a struct cudaArray *.
In order to use a struct cudaArray * in a CUDA Driver API function which takes aCUarray, it is necessary to explicitly cast the struct cudaArray * to aCUarray .
Interactions between CUgraphicsResource and cudaGraphicsResource_t
The typesCUgraphicsResource andcudaGraphicsResource_t represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use aCUgraphicsResource in a CUDA Runtime API function which takes acudaGraphicsResource_t, it is necessary to explicitly cast theCUgraphicsResource to acudaGraphicsResource_t.
In order to use acudaGraphicsResource_t in a CUDA Driver API function which takes aCUgraphicsResource, it is necessary to explicitly cast thecudaGraphicsResource_t to aCUgraphicsResource.
Interactions between CUtexObject and cudaTextureObject_t
The typesCUtexObject andcudaTextureObject_t represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use aCUtexObject in a CUDA Runtime API function which takes acudaTextureObject_t, it is necessary to explicitly cast theCUtexObject to acudaTextureObject_t.
In order to use acudaTextureObject_t in a CUDA Driver API function which takes aCUtexObject, it is necessary to explicitly cast thecudaTextureObject_t to aCUtexObject.
Interactions between CUsurfObject and cudaSurfaceObject_t
The typesCUsurfObject andcudaSurfaceObject_t represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use aCUsurfObject in a CUDA Runtime API function which takes acudaSurfaceObject_t, it is necessary to explicitly cast theCUsurfObject to acudaSurfaceObject_t.
In order to use acudaSurfaceObject_t in a CUDA Driver API function which takes aCUsurfObject, it is necessary to explicitly cast thecudaSurfaceObject_t to aCUsurfObject.
Interactions between CUfunction and cudaFunction_t
The typesCUfunction andcudaFunction_t represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use acudaFunction_t in a CUDA Driver API function which takes aCUfunction, it is necessary to explicitly cast thecudaFunction_t to aCUfunction.
Interactions between CUkernel and cudaKernel_t
The typesCUkernel andcudaKernel_t represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use acudaKernel_t in a CUDA Driver API function which takes aCUkernel, it is necessary to explicitly cast thecudaKernel_t to aCUkernel.
Returns infunctionPtr the device entry function corresponding to the symbolsymbolPtr.
Returns inkernelPtr the device kernel corresponding to the entry functionentryFuncAddr.
Note that it is possible that there are multiple symbols belonging to different translation units with the sameentryFuncAddr registered with this CUDA Runtime and so the order which the translation units are loaded and registered with the CUDA Runtime can lead to differing return pointers inkernelPtr . Suggested methods of ensuring uniqueness are to limit visibility of __global__ device functions by using static or hidden visibility attribute in the respective translation units.
See also:
cudaGetKernel (C++ API)