Allow access to CUDA pointers for interoperability with other libraries #16513


Merged: opencv-pushbot merged 1 commit into opencv:master from pwuertz:cuda_py_interop on Mar 5, 2020

Conversation

@pwuertz (Contributor) commented Feb 5, 2020 (edited by alalek):

This is a proposal for adding CV_WRAP-compatible cudaPtr() getter methods to GpuMat and Stream, required for enabling interoperability between OpenCV and other CUDA-capable Python libraries like Numba, CuPy, PyTorch, etc.

Here is an example of sharing a GpuMat with CuPy:

```python
import numpy as np
import cv2 as cv
import cupy as cp

# Create GPU array with OpenCV
data_gpu_cv = cv.cuda_GpuMat()
data_gpu_cv.upload(np.eye(64, dtype=np.float32))

# Modify the same GPU array with CuPy
data_gpu_cp = cp.asarray(CudaArrayInterface(data_gpu_cv))
data_gpu_cp *= 42.0

# Download and verify
assert np.allclose(data_gpu_cp.get(), np.eye(64) * 42.0)
```

In this example, CudaArrayInterface is an (incomplete) adapter class that implements the CUDA array interface used by other frameworks:

```python
class CudaArrayInterface:
    def __init__(self, gpu_mat):
        w, h = gpu_mat.size()
        type_map = {
            cv.CV_8U: "u1", cv.CV_8S: "i1",
            cv.CV_16U: "u2", cv.CV_16S: "i2",
            cv.CV_32S: "i4", cv.CV_32F: "f4", cv.CV_64F: "f8",
        }
        self.__cuda_array_interface__ = {
            "version": 2,
            "shape": (h, w),
            "data": (gpu_mat.cudaPtr(), False),
            "typestr": type_map[gpu_mat.type()],
            "strides": (gpu_mat.step, gpu_mat.elemSize()),
        }
```

If possible, I'd like to implement __cuda_array_interface__ within the GpuMat Python binding in a future PR (not sure how to define a Python property using the wrapper generator though).
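A hypothetical sketch (not part of this PR) of what that end state would enable: cp.asarray and numba.cuda.as_cuda_array already accept any object exposing __cuda_array_interface__, so the adapter above would no longer be needed.

```python
# Hypothetical usage once GpuMat itself carries __cuda_array_interface__
gpu = cv.cuda_GpuMat()
gpu.upload(np.zeros((4, 4), dtype=np.float32))
view = cp.asarray(gpu)   # would work directly, without the adapter class
view += 1.0              # CuPy modifies OpenCV's device buffer
```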


force_builders=Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu-cuda:18.04

@leofang commented:

Hi @pwuertz, thanks for joining the discussion on Numba. I don't have comments on your effort here (yet), but when this PR (and any subsequent ones) is merged, it'd be nice if you could follow numba/numba#5104 and add OpenCV to the list 🙂 Thank you.

@alalek (Member) commented:

> CudaArrayInterface

I believe it should own the upstream gpu_mat object in its fields (extending the lifetime of data_gpu_cv in the example).
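A minimal sketch (not from the PR) of that suggestion: the adapter keeps a reference to the GpuMat so the device allocation stays alive at least as long as the adapter itself (typestr is hard-coded here, assuming a single-channel CV_32F matrix for brevity):

```python
class CudaArrayInterface:
    def __init__(self, gpu_mat):
        # Hold the upstream GpuMat so its device memory is not freed
        # while this adapter is still referenced.
        self._gpu_mat = gpu_mat
        w, h = gpu_mat.size()
        self.__cuda_array_interface__ = {
            "version": 2,
            "shape": (h, w),
            "data": (gpu_mat.cudaPtr(), False),
            "typestr": "f4",  # assumption: CV_32F, single channel
            "strides": (gpu_mat.step, gpu_mat.elemSize()),
        }
```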

@pwuertz (Contributor, Author) commented:

@alalek Please note that class CudaArrayInterface is not part of this PR. The PR is meant to provide the minimum requirement for any kind of interoperability in Python, which is access to the CUDA pointers.

For out-of-the-box interoperability with other libraries like Numba and CuPy, I was planning to implement the CUDA array interface (CAI) in a follow-up PR. As described earlier, I'm having trouble figuring out some OpenCV Python binding generator details though.

Also note that under the current CAI specification (version 2), the responsibility for lifetime and synchronization resides with the user (similar to using cv::Mat constructors with data pointers). If this changes in some future version, I'd of course be willing to update the CAI version on the OpenCV side.
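To make that user responsibility concrete, a hedged sketch (assuming the adapter from above and an explicitly created OpenCV stream): finish OpenCV's asynchronous work before another library touches the buffer, and keep the GpuMat referenced while any view of it exists.

```python
stream_cv = cv.cuda_Stream()
data_gpu_cv = cv.cuda_GpuMat()
data_gpu_cv.upload(np.eye(64, dtype=np.float32), stream_cv)  # asynchronous upload

stream_cv.waitForCompletion()  # user-managed synchronization point
data_gpu_cp = cp.asarray(CudaArrayInterface(data_gpu_cv))    # now safe to consume
# data_gpu_cv must stay referenced for as long as data_gpu_cp is in use
```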


```cpp
operator bool_type() const;

//! return Pointer to CUDA stream
CV_WRAP void* cudaPtr() const;
```
@asmorkalov (Contributor) commented on this hunk:

There is StreamAccessor to do it: https://docs.opencv.org/master/d6/df1/structcv_1_1cuda_1_1StreamAccessor.html. I think it's better to wrap the accessor rather than expose private fields.

@alalek (Member) replied:

> StreamAccessor

Not sure that this can help to reach the final goal:

> to implement __cuda_array_interface__ within the GpuMat python binding

```cpp
inline
void* GpuMat::cudaPtr() const
{
    return data;
}
```
@asmorkalov (Contributor) commented on this hunk:

There is a CV_PROP_RW macro that allows exposing object properties to Python and other languages. You do not need your own method for it.

@alalek (Member) replied:

I believe the current approach is fine. We should not allow changing this pointer through CV_PROP_RW.

@pwuertz (Contributor, Author) commented:

@asmorkalov

  • I tried adding uchar* data as CV_PROP, but it doesn't compile; CV_PROP apparently doesn't support pointer types.
  • Like @alalek, I figured that changing the data pointer via the public interface shouldn't be allowed (neither from C++ nor from Python), so having/promoting a getter method for it seems like a good thing.
  • The GpuMat::data pointer looks identical to Mat::data (name, uint8_t*, doc), yet it represents something completely different. I think void* cudaPtr() is a good name for what this pointer represents: an opaque handle to the CUDA array, not a data pointer for reading/writing uint8_t (see the sketch below).
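As an illustration of that "opaque handle" point (not part of the PR), a hedged sketch that hands the raw address straight to CuPy, assuming a single-channel CV_32F GpuMat named gpu_mat:

```python
import cupy as cp

w, h = gpu_mat.size()                      # GpuMat.size() returns (width, height)
nbytes = gpu_mat.step * h                  # pitched allocation size in bytes
mem = cp.cuda.UnownedMemory(gpu_mat.cudaPtr(), nbytes, owner=gpu_mat)
view = cp.ndarray((h, w), dtype=cp.float32,
                  memptr=cp.cuda.MemoryPointer(mem, 0),
                  strides=(gpu_mat.step, gpu_mat.elemSize()))
```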

@pwuertz (Contributor, Author) commented:

@alalek It's true that __cuda_array_interface__ currently does not specify stream handling, but stream interoperability really is low-hanging fruit. With access to the CUDA pointers, I did the following:

```python
# allocated: dataX_cpu, dataX_gpu_cv, stream_cv
# (proof of principle: numba views dataX_gpu_nb, stream_nb)
t1 = time.time()
data1_gpu_cv.upload(arr=data1_cpu, stream=stream_cv)         # OpenCV upload
kernel_nb(data1_gpu_nb, out=data2_gpu_nb, stream=stream_nb)  # Numba operation
data2_gpu_cv.download(dst=data2_cpu, stream=stream_cv)       # OpenCV download
t2 = time.time()  # CPU time: 0.001 s
stream_nb.synchronize()
t3 = time.time()  # GPU time: 0.042 s
# verified data2_cpu
```

So you can freely mix OpenCV and Numba operations on a single, fully async stream.
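A hedged sketch (not from the PR) of how those Numba-side views could be built from the OpenCV objects, assuming the CudaArrayInterface adapter from the first comment; numba.cuda.as_cuda_array accepts any object implementing the interface, and recent Numba releases provide cuda.external_stream for wrapping a foreign stream handle:

```python
from numba import cuda

data1_gpu_nb = cuda.as_cuda_array(CudaArrayInterface(data1_gpu_cv))
data2_gpu_nb = cuda.as_cuda_array(CudaArrayInterface(data2_gpu_cv))
stream_nb = cuda.external_stream(stream_cv.cudaPtr())  # wrap OpenCV's raw cudaStream_t
```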

@asmorkalov Oh sorry, I hadn't noticed the StreamAccessor class before. I'll try adding wrapper definitions to it. This means that cudaStream_t needs some kind of globally defined conversion rule too?

@pwuertz (Contributor, Author) commented:

@asmorkalov StreamAccessor uses the cudaStream_t typedef and thus has a hard dependency on the CUDA SDK. I assume this is the reason for keeping it separate from Stream, which provides a public interface without a CUDA dependency.

Even with HAVE_CUDA defined, neither the misc/python headers nor cv2.cpp are able to include <cuda_runtime.h>.

How should we proceed? Using void* for transporting CUDA stream pointers appears to be the least intrusive solution.
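On the Python side, that raw handle is already enough for stream interop; a hedged sketch using CuPy's ExternalStream (assuming a cv.cuda_Stream named stream_cv and a CuPy array data_gpu_cp):

```python
# Wrap OpenCV's stream handle; CuPy does not take ownership of it.
stream_cp = cp.cuda.ExternalStream(stream_cv.cudaPtr())
with stream_cp:
    data_gpu_cp *= 2.0  # CuPy kernel launched on OpenCV's stream
```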

@alalek (Member) left a review:

Looks good to me 👍


Reviewers

@asmorkalov approved these changes
@alalek approved these changes

Assignees

@asmorkalov

Milestone

4.3.0

5 participants

@pwuertz @leofang @alalek @asmorkalov @opencv-pushbot
