Intent to adapt Array API in Python-Blosc2#968

FrancescAlted started this conversation inAnnouncements

Hi! The Python-Blosc2 project is trying to adapt to the Array API specs. We have started work here: Blosc/python-blosc2@main...array-api

The Array API has been on our radar at least since 2023, when I attended Aaron's talk at the SciPy conference, but I probably forgot the details and naively thought that just adapting to the naming convention was going to be enough. During the recent EuroSciPy conference, it became clear that adapting to the Array API is much more than that; thanks to @seberg and @lucascolley for making that clearer and for their help in our initial effort! If anyone has suggestions, or is willing to help us in this attempt, please shout!

Replies: 4 comments 15 replies

Comment options

Hi,
We have been working on implementing the standard for Blosc2 and are just about there for the ndarray_object tests. I have one small question about one of the tests (see this test), which executes when value is an array object of shape ():

    ph.assert_0d_equals(
        "__setitem__",
        x_repr="value",
        x_val=value,
        out_repr=f"modified x[{idx}]",
        out_val=res[idx]
    )

which is equivalent to assert x_val == out_val. For our case this fails since res[idx] outputs a (true) scalar, whereas value is an array object; assert res[idx] == value[()] is true, however. As I understand it, both assertions are true for NumPy only due to overloading the == operator, and I am not sure if that is desirable in general. Would it not be better to modify the test to

    ph.assert_0d_equals(
        "__setitem__",
        x_repr="value",
        x_val=value[()],  # <-----------------------------------
        out_repr=f"modified x[{idx}]",
        out_val=res[idx]
    )

?
Thanks
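For readers following along, here is a minimal, library-free sketch of the mismatch described in this comment. `Toy0D` is a hypothetical stand-in for a 0-d array object that does not overload `==`; it is not the Blosc2 class, and the real test helper does more than a bare comparison.

```python
class Toy0D:
    """Hypothetical 0-d array wrapper (illustration only, not blosc2.NDArray)."""

    shape = ()

    def __init__(self, value):
        self._value = value

    def __getitem__(self, idx):
        # Only the empty-tuple index is meaningful for a 0-d container.
        if idx != ():
            raise IndexError("0-d container only supports x[()]")
        return self._value


value = Toy0D(3)   # what the test passes in as `value`
scalar = 3         # what `res[idx]` returns in this toy: a true scalar

# Without an overloaded ==, Python falls back to identity comparison.
naive_equal = (scalar == value)
# Unwrapping the 0-d object first compares scalar with scalar.
unwrapped_equal = (scalar == value[()])

print(naive_equal, unwrapped_equal)
```

In this toy, only the unwrapped comparison succeeds, which is the behaviour the proposed `value[()]` change would rely on.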

9 replies
@lucascolley
If all the functions in the array namespace accept NumPy arrays as input, this does seem like a minor issue. I guess there are some things which will just be wrong though, like x[idx].__array_namespace__().

Would it be an option to return a subclass of np.ndarray for decompressed slices, and implement == (and things like __array_namespace__) accordingly?

@lshaw8317
That is also quite a good idea, thanks very much. I have opened a ticket: Blosc/python-blosc2#464 (comment). I will try to work on it in the medium term and focus on the rest of the array-api suite for the moment.

@ev-br
ev-br, Sep 9, 2025
Collaborator

Please don't hesitate to reach out, here or in the array-api-tests repo, about specific issues with the test suite!

@seberg
> If all the functions in the array namespace accept NumPy arrays as input, this does seem like a minor issue. I guess there are some things which will just be wrong though, like x[idx].__array_namespace__().

The problem with this is mainly that we don't allow "promotion", i.e. there is no way to mix different array types in __array_namespace__().
So a subclass that does nothing but change the namespace can solve this, although I can't say I am a big fan of an unnecessary subclass that might confuse things.

(The question is whether you want users to mix NumPy and blosc arrays, because if you do, users may run into this issue whether the subclass exists or not. And the only fix is to allow array-namespace to figure things out. That isn't hard, Array API just didn't want to add the complexity.)

EDIT: If you are happy with the subclass, fine, it's a very simple one. But it leaves me a bit unsatisfied because I think it is an inelegant solution for what is an Array API issue/limitation.
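The "no promotion" point above can be illustrated with a toy namespace lookup. This is only a sketch under stated assumptions: `array_namespace`, `HostArray`, `CompressedArray` and `SliceView` are hypothetical names invented for this illustration, not NumPy or Blosc2 classes, and the lookup simply refuses to choose between two different namespaces.

```python
import types

# Two stand-in namespaces (hypothetical; real ones would be modules).
HOST_NS = types.SimpleNamespace(name="host")
COMPRESSED_NS = types.SimpleNamespace(name="compressed")


class HostArray:
    """Stand-in for an in-memory array type."""

    def __array_namespace__(self, api_version=None):
        return HOST_NS


class CompressedArray:
    """Stand-in for a compressed container type."""

    def __array_namespace__(self, api_version=None):
        return COMPRESSED_NS


class SliceView(HostArray):
    """Subclass that does nothing but change the reported namespace."""

    def __array_namespace__(self, api_version=None):
        return COMPRESSED_NS


def array_namespace(*arrays):
    """Return the single namespace shared by all inputs, or raise."""
    found = {id(ns): ns for ns in (a.__array_namespace__() for a in arrays)}
    if len(found) != 1:
        # No promotion rule exists to pick a winner between namespaces.
        raise TypeError("inputs do not share exactly one array namespace")
    return next(iter(found.values()))


# A slice reported through the subclass mixes cleanly with the container...
shared = array_namespace(CompressedArray(), SliceView())

# ...whereas a plain host array mixed with the container is rejected.
try:
    array_namespace(CompressedArray(), HostArray())
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

As noted in the comment above, the subclass only hides the problem for slices produced by the library; a user who mixes in a plain host array would still hit the error in this sketch.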

@FrancescAlted
Yeah, we have discussed that internally, and I am not a big fan of the subclass either, precisely because I find it inelegant too, but also because it adds another class that the user has to know about. For the time being we will just skip the tests that are related to this. Perhaps in the future we can come up with a solution that satisfies most of us.

Comment options

Hi, I have another question. I suppose it is perhaps related to the test suite, but it's not really an issue per se. When executing the tests for tensordot I frequently get a Deadline Exceeded error. I have tried to reproduce it in a separate script (which does something similar) and the time taken is much less (80 ms rather than 1000 ms). The costly (in terms of time) part of the test seems to be the assert_equal statement (which eventually calls all(equal(...))), but even then, the call to all(equal(...)) below still takes only 60 ms. Is this strange, or something to do with Hypothesis?

    import blosc2
    import numpy as np

    # `profile` is assumed to be injected by a profiler (e.g. kernprof from line_profiler)
    @profile
    def run_test():
        shape1 = (4, 1, 1, 4, 1)
        shape2 = (4, 1, 1, 4, 2)
        axes = ((), ())
        x1 = blosc2.zeros(shape=shape1, dtype=np.uint8)
        x2 = blosc2.zeros(shape=shape2, dtype=np.uint8)
        result = blosc2.tensordot(x1, x2, axes=axes)
        newres = blosc2.asarray(np.tensordot(x1, x2, axes=axes))
        blosc2.all(blosc2.equal(result, newres))
        return result
1 reply
@ev-br
ev-br, Sep 16, 2025
Collaborator

At a guess, that's Hypothesis generating a too-large array to test. Note that assertions in array-api-tests typically loop over array elements in Python, so there's overhead.

Try passing --disable-deadline to the pytest invocation.
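To get a feel for how many elements a per-element check would have to visit, the result shape of a tensordot call can be worked out from the input shapes alone. The helper below is a hypothetical, standard-library-only sketch (`tensordot_result_shape` is not part of array-api-tests or Blosc2); it assumes `axes` is given as a pair of sequences, as in the reproduction script above.

```python
import math


def tensordot_result_shape(shape1, shape2, axes):
    """Shape of tensordot(x1, x2, axes=axes) for axes given as two sequences."""
    axes1 = [a % len(shape1) for a in axes[0]]  # normalise negative axes
    axes2 = [a % len(shape2) for a in axes[1]]
    if len(axes1) != len(axes2):
        raise ValueError("both axis sequences must have the same length")
    for a1, a2 in zip(axes1, axes2):
        if shape1[a1] != shape2[a2]:
            raise ValueError("contracted dimensions must match")
    # Non-contracted axes of x1 come first, then those of x2.
    keep1 = [n for i, n in enumerate(shape1) if i not in axes1]
    keep2 = [n for i, n in enumerate(shape2) if i not in axes2]
    return tuple(keep1 + keep2)


# Shapes from the reproduction script: nothing is contracted (outer product).
result_shape = tensordot_result_shape((4, 1, 1, 4, 1), (4, 1, 1, 4, 2), ((), ()))
n_elements = math.prod(result_shape)
print(result_shape, n_elements)
```

With empty axes nothing is contracted, so the output has 16 × 32 = 512 elements for these shapes; shapes drawn by Hypothesis in a given run may of course be larger or smaller.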

Comment options

Another thing that I'd like to bring to the table is the use of an optional shape= in constructors. Some constructors of the API take a shape (e.g. ones(), zeros(), full() and the like), but others do not (e.g. arange(), linspace()), so an additional .reshape() operation is needed. As you may guess, doing such a reshape in chunked containers like Blosc2 requires a copy and can be expensive; this is why we support an optional shape= in all the Blosc2 constructors.

It would be cool if the Array API could adopt the same convention and allow an optional shape= in those constructors too. Thoughts?
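A toy illustration of why a reshape can force data movement in a chunked container: the same elements end up grouped into different chunks once the logical shape changes. This is a sketch only; `chunk_members` is a hypothetical helper, and the chunk shapes `(4,)` and `(3, 2)` are assumptions chosen for the example, not what Blosc2 would pick.

```python
import itertools


def chunk_members(shape, chunk_shape):
    """Map each chunk index to the flat (row-major) element indices it holds."""
    members = {}
    for idx in itertools.product(*(range(n) for n in shape)):
        # Row-major flat position of this element in the logical array.
        flat = 0
        for i, n in zip(idx, shape):
            flat = flat * n + i
        chunk = tuple(i // c for i, c in zip(idx, chunk_shape))
        members.setdefault(chunk, []).append(flat)
    return members


# 12 elements stored as a 1-D array with chunks of 4 elements.
members_1d = chunk_members((12,), (4,))
# The same 12 elements viewed as 3x4, stored with (assumed) 3x2 chunks.
members_2d = chunk_members((3, 4), (3, 2))

print(members_1d)
print(members_2d)
```

In the 1-D layout each chunk holds a contiguous run of elements, while in the 2-D layout each chunk interleaves pieces of several rows, so the stored chunks cannot simply be relabelled.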

5 replies
@lucascolley
I imagine this is just a question of whether current adoption is widespread enough to justify requiring it from all compliant implementations.

@kgryte
At the moment, libraries such as NumPy, JAX, Torch, and Dask don't support a shape kwarg in arange and linspace, so it would likely be an uphill battle standardizing. In searching the NumPy issue tracker, I wasn't able to find a feature request for adding shape to either arange or linspace, so it is not clear there is user demand for this atm.

@lucascolley
Could blosc evaluate xp.reshape(xp.arange(...), shape) lazily as equivalent to xp.arange(..., shape=shape), @FrancescAlted?

@FrancescAlted
Ok, that looks like a good idea, and performance-wise, the difference is not that much:

    In [1]: import blosc2

    In [2]: %time a = blosc2.lazyexpr("arange(100_000_000)")
    CPU times: user 27.7 ms, sys: 22.7 ms, total: 50.4 ms
    Wall time: 45.5 ms

    In [3]: %timeit blosc2.reshape(a, (10_000, 10_000))
    700 ms ± 2.72 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

    In [4]: %timeit blosc2.arange(100_000_000, shape=(10_000, 10_000))
    686 ms ± 2.73 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

    In [5]: %timeit blosc2.arange(100_000_000).reshape((10_000, 10_000))
    829 ms ± 1.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
@lucascolley
Great! I guess this is why the standard includes reshape as a function rather than an array object method.
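The lazy-evaluation idea discussed in this thread can be sketched with a toy, pure-Python model. Everything here is hypothetical (`LazyArange`, `arange`, `reshape` are illustration-only names and say nothing about how Blosc2 implements lazy expressions): a function-style `reshape` applied to a not-yet-computed `arange` only rewrites the target shape, so the data is generated once, directly in its final layout.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyArange:
    """Deferred arange(stop) that remembers the shape it should be built in."""

    stop: int
    shape: tuple

    def materialize(self):
        """Build nested lists directly in the target shape (row-major order)."""
        def build(offset, dims):
            if not dims:
                return offset
            stride = math.prod(dims[1:])
            return [build(offset + i * stride, dims[1:]) for i in range(dims[0])]
        return build(0, self.shape)


def arange(stop, shape=None):
    """Constructor with an optional shape=, as proposed in the thread."""
    shape = (stop,) if shape is None else tuple(shape)
    if math.prod(shape) != stop:
        raise ValueError("shape does not match the number of elements")
    return LazyArange(stop, shape)


def reshape(x, shape):
    """Function-style reshape: on a lazy operand it only rewrites metadata."""
    return arange(x.stop, shape)


fused = reshape(arange(6), (2, 3))      # nothing computed yet
direct = arange(6, shape=(2, 3))
print(fused == direct, fused.materialize())
```

In this model the two spellings produce the same deferred object, which is one way to read the observation that `reshape` as a free function leaves room for such fusion.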

Comment options

Ok, so with the Blosc2 3.9 series we did a big push towards Array API compatibility: https://github.com/Blosc/python-blosc2/releases

Hopefully we can get some funding for making more progress in the future, but what we have now is a pretty good basic compliance. Thanks for your help so far!

6 participants
@FrancescAlted @seberg @ev-br @kgryte @lucascolley @lshaw8317