GPU Support via OpenCL #1377


Open
Pencilcaseman wants to merge 23 commits into rust-ndarray:master from Pencilcaseman:master

Conversation

Pencilcaseman

This is very much a work-in-progress, but I wanted to know how this approach fits with the rest of the codebase. I realise it's not the most "rusty" implementation, but I think it could be abstracted away quite nicely.

There is a fair amount of unnecessary/unclean code, but that can be cleaned up pretty quickly. The changes made here only allow for binary operations (+ - * /) on contiguous `ArrayBase` references (see the example below). With more work, it could be expanded to support almost everything the CPU "backend" supports.

To enable OpenCL support, you need to enable the `opencl` feature.
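For illustration, enabling a cargo feature looks like this; the version number and source are hypothetical, since this branch is unmerged and the `opencl` feature only exists on the PR branch:

```toml
[dependencies]
# "opencl" is the feature name from the PR description; the version is a placeholder
ndarray = { version = "0.15", features = ["opencl"] }
```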

Here is an example:

```rust
// Unfortunately, OpenCL requires some initialisation before it can be used.
// There are currently no checks on this, but they can be easily added.
ndarray::configure();

// Note that the result of `move_to_device` is a `Result<_, OpenCLErrorCode>`,
// so errors can be handled correctly
let x = ndarray::Array2::<f32>::from_shape_fn((3, 4), |(r, c)| (c + r * 4) as f32)
    .move_to_device(ndarray::Device::OpenCL)
    .unwrap_or_else(|_| panic!("Something went wrong"));

let y = ndarray::Array2::<f32>::from_shape_fn((3, 4), |(r, c)| 12.0 - (c + r * 4) as f32)
    .move_to_device(ndarray::Device::OpenCL)
    .unwrap_or_else(|_| panic!("Something went wrong"));

// Only this form of binary operation is supported currently,
// i.e. reference <op> reference.
// This operation takes place in a JIT-compiled kernel on the GPU
let z = &x + &y;

// You can only print something if it's in Host memory. This could be changed
// to automatically copy/move to host if necessary
println!("Result:\n{:>2}", z.move_to_device(ndarray::Device::Host).unwrap());
// [[12, 12, 12, 12],
//  [12, 12, 12, 12],
//  [12, 12, 12, 12]]
```

A very similar approach can be taken to get CUDA support working. It might even be possible to merge them into a single GPU backend trait, for example, which would simplify the implementations quite a bit. It'd require a few substantial changes internally, though (I think. Maybe not?).

Anyway, let me know what you think!

@bluss (Member)

That is interesting. Thanks for the neat proof of concept.

Now I don't have the bandwidth to look at this in depth, unfortunately. Look at the repo: we're working on version 0.16, which is only three years late or so. :) So this is brief and off the cuff. I'll ask some questions; that doesn't mean I'm making requirements or demands, they are just things to think about.

Why not have a separate owned type for arrays that are on a non-cpu device? Users probably don't want a situation where they can't tell if adding two arrays together is going to cause a crash or not.
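The concern above can be made concrete with a toy sketch (all types here are hypothetical stand-ins, not ndarray API): if device arrays get their own owned type, mixing a host array and a device array fails to compile instead of crashing at runtime.

```rust
use std::ops::Add;

struct HostArray(Vec<f32>);   // stand-in for a host-resident ndarray::Array
struct DeviceArray(Vec<f32>); // stand-in for a GPU-resident array (data mocked on host)

impl HostArray {
    fn move_to_device(self) -> DeviceArray {
        DeviceArray(self.0) // a real impl would copy the buffer to GPU memory
    }
}

// Addition is only defined between two device arrays...
impl Add for &DeviceArray {
    type Output = DeviceArray;
    fn add(self, rhs: &DeviceArray) -> DeviceArray {
        // a real impl would launch a kernel; here we just add on the host
        DeviceArray(self.0.iter().zip(&rhs.0).map(|(a, b)| a + b).collect())
    }
}

fn main() {
    let x = HostArray(vec![1.0, 2.0]).move_to_device();
    let y = HostArray(vec![3.0, 4.0]).move_to_device();
    let z = &x + &y; // fine: both arrays are on the device
    // let w = &x + &HostArray(vec![0.0]); // would not compile: no such impl
    assert_eq!(z.0, vec![4.0, 6.0]);
}
```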

How do array views work in this model? Most interesting functionality in ndarray happens through array views.

I do agree (if you think so) that too much type information is not good either; it just gets in the way. The hard part is finding a balance.

```rust
let typename = match crate::opencl::rust_type_to_c_name::<A>() {
    Some(x) => x,
    None => panic!("The Rust type {} is not supported by the \
                    OpenCL backend", std::any::type_name::<A>())
```
@bluss (Member)
Enforce at type level?
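One way to read this suggestion is to replace the runtime `panic!` with a trait bound, so the Rust-type-to-C-name mapping is checked at compile time. A minimal sketch (the trait and function names are hypothetical, not ndarray or PR API):

```rust
// Each OpenCL-compatible element type declares its C name as an associated const.
trait ClPrimitive {
    const C_NAME: &'static str;
}

impl ClPrimitive for f32 { const C_NAME: &'static str = "float"; }
impl ClPrimitive for f64 { const C_NAME: &'static str = "double"; }
impl ClPrimitive for i32 { const C_NAME: &'static str = "int"; }

// Kernel generation can now require the bound instead of panicking on a lookup miss.
fn kernel_source<A: ClPrimitive>() -> String {
    format!(
        "__kernel void add(__global const {t} *a, __global const {t} *b, __global {t} *out)",
        t = A::C_NAME
    )
}

fn main() {
    assert!(kernel_source::<f32>().contains("float"));
    // kernel_source::<String>() would be a compile error: String does not impl ClPrimitive
}
```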

```rust
unsafe {
    let elements = self.len();
    let self_ptr = self.as_ptr() as *mut std::ffi::c_void;
    let other_ptr = rhs.as_ptr() as *mut std::ffi::c_void;
```
@bluss (Member)

I'm assuming this operation can be invalid here because they are only known to be on the same device by a debug assertion, not a hard assertion?
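The point is that a `debug_assert!` vanishes in release builds, so the raw-pointer path could run on arrays living on different devices. A hard check (or a typed error) keeps the unsafe block's precondition enforced in every build; a minimal sketch with illustrative names:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Device { Host, OpenCl }

// Hard assertion, active in release builds too, returned as an error
// instead of triggering UB inside the unsafe pointer path.
fn binary_op(a_dev: Device, b_dev: Device) -> Result<(), String> {
    if a_dev != b_dev {
        return Err(format!("device mismatch: {:?} vs {:?}", a_dev, b_dev));
    }
    // only here is it sound to hand both raw pointers to the same backend
    Ok(())
}

fn main() {
    assert!(binary_op(Device::OpenCl, Device::OpenCl).is_ok());
    assert!(binary_op(Device::Host, Device::OpenCl).is_err());
}
```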

```rust
let dim = self.dim;
let strides = self.strides;
let data = self.data.move_to_device(device)?;
let ptr = std::ptr::NonNull::new(data.as_ptr() as *mut A).unwrap();
```
@bluss (Member)

The new ptr is invalid if move is Host to Host and the ptr before move was not to the start of the allocation. This is probably just a feature not yet implemented here.
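The observation can be sketched on the host alone: when an array's element pointer sits at an offset into the allocation (as it does for many views), the pointer after a move must be rebuilt as new-base-plus-offset, not taken as the start of the new buffer. This is a toy model, not the PR's code:

```rust
fn main() {
    let data: Vec<f32> = (0..8).map(|i| i as f32).collect();

    // suppose the array's ptr points 2 elements into the allocation
    let offset_bytes = unsafe { data.as_ptr().add(2) } as usize - data.as_ptr() as usize;

    // "move" the buffer (a clone standing in for a device/host copy)
    let moved = data.clone();

    // wrong:  moved.as_ptr()                      (points at element 0, not element 2)
    // right:  new base pointer plus the saved byte offset
    let new_ptr = unsafe { (moved.as_ptr() as *const u8).add(offset_bytes) } as *const f32;
    assert_eq!(unsafe { *new_ptr }, 2.0);
}
```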

@Pencilcaseman (Author)

Thanks for the review, @bluss. I didn't check the validity of everything particularly rigorously, so I'm sure there are a few potential issues.

I've had a bit of a think, and I reckon it'd be possible to kill a few birds with one (admittedly quite large) stone.

I've read through most of the open issues, and it seems like a lot of them relate to the maintainability of the library (I'd be keen to help out with this if you're still looking for someone to maintain things) and cleaning up or refactoring the code.

If there were a `Backend` trait which exposed things like `OwnedRepr` creation (each backend will have its own `OwnedRepr` type), contiguous operations, strided operations, linear algebra operations, etc., for a given device, then it would be possible to abstract the calculations/operations away from the `ArrayBase` struct and into those backends.

The backends could share reasonably generic code, cutting down on the heavy macro expansions and making the code easier to maintain. The `Array` and `View` types would need to take the backend as a generic type parameter, but that should allow any backend to have full `View` functionality, since it'd all be handled by the backend provider.

Obviously, things like custom mapping functions and hand-written iterators won't work on anything other than the CPU since the data isn't available in host memory, but everything else should work fine.
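The shape of the proposal can be sketched in a few lines; everything here is hypothetical (trait, methods, and the `f32`-only storage are simplifications, not ndarray API):

```rust
// The trait owns the storage type and the contiguous operations for one device.
trait Backend {
    type Repr; // each backend brings its own owned representation
    fn from_vec(v: Vec<f32>) -> Self::Repr;
    fn add_contiguous(a: &Self::Repr, b: &Self::Repr) -> Self::Repr;
}

struct CpuBackend;

impl Backend for CpuBackend {
    type Repr = Vec<f32>;
    fn from_vec(v: Vec<f32>) -> Vec<f32> { v }
    fn add_contiguous(a: &Vec<f32>, b: &Vec<f32>) -> Vec<f32> {
        a.iter().zip(b).map(|(x, y)| x + y).collect()
    }
}

// Array code is generic over the backend; an OpenCL backend would slot in
// with a device-buffer Repr and kernel-backed operations.
struct Array<B: Backend> { data: B::Repr }

impl<B: Backend> Array<B> {
    fn add(&self, rhs: &Self) -> Self {
        Array { data: B::add_contiguous(&self.data, &rhs.data) }
    }
}

fn main() {
    let a = Array::<CpuBackend> { data: CpuBackend::from_vec(vec![1.0, 2.0]) };
    let b = Array::<CpuBackend> { data: CpuBackend::from_vec(vec![11.0, 10.0]) };
    assert_eq!(a.add(&b).data, vec![12.0, 12.0]);
}
```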

That has the added benefit of allowing anyone to write their own backend for `ndarray` for whatever use case they have, instead of weaving the functionality into the existing codebase.

Anyway, maybe I'm getting a bit ahead of myself, since that'd be a fairly substantial refactor of the codebase. If that's something you think would be beneficial, then I'd be happy to start looking into it. As I mentioned earlier, I'd also be happy to help maintain the project if you need another maintainer :)

Thanks again for the review!


@bluss (Member)

About maintainership, that's welcome; feel free to join the new discussion board #1272 (comment)

Reviewers: @bluss left review comments

