ENH: Create helper for conversion to arrays #24214
Conversation
mhvk commented Jul 19, 2023
I like this idea. Following the broadcasting suggestion, it may be nice to try to be reasonably compatible with what

p.s. Might it be possible to factor out the
mdhaber commented Jul 19, 2023 • edited
That would be nice. It seems like a common enough follow-up step that it would be a convenient addition to this convenience function.
We do this frequently in SciPy now, so not insane. Seems unusual for NumPy, but I wouldn't complain.
Thanks for this!
seberg commented Jul 20, 2023
Yeah, agree that the ufunc machinery should share as much as possible, although I was hoping to mainly consolidate Python paths first. The reason is that, looking at the array-wrap implementation in ufuncs, they are messy and buggy (there is an issue), so fixing that seems like a messy change better done on its own. I think a method for output array prep could make sense; adding it to the initial call seems like it would overload meaning (we don't know the common dtype is right, for one thing).
Right, that is how it is implemented. Just
Ah, right! Yeah, maybe make it its own issue; it seems to me we should when integrating that into
rgommers commented Jul 20, 2023
@seberg from your description it is not entirely clear to me whether you mean for this to be a public or private function? With the namespace we discussed somewhere else,

Since it was mentioned above: we recently came across the need for
seberg commented Jul 20, 2023
The first issue I want to solve would mean private is fine, but yes, I suspect we should just make it public, and if we are unsure about some methods/attributes or so, give it an
mhvk commented Jul 20, 2023
Agreed that the output makes more sense as a separate method, rather than on input, especially since in most cases something has to be generated. Also, 👍 to making

p.s. A better name than

p.p.s. You'll not be surprised to hear that I think that by default subclasses should be passed through... But it may be good to think a bit more generally about what to do with array mimics - the benefit of getting
mdhaber commented Aug 12, 2023 • edited
I just ran into this little gem, which might have been avoided with use of a helper like this:

    np.float16([2]) < np.float32(2.0009766)    # array([False])
    np.float16([2]) < np.float32([2.0009766])  # array([ True])
seberg commented Aug 12, 2023
You might think that helps, but it probably wouldn't :). This is NEP 50 stuff.
mdhaber commented Aug 12, 2023 • edited
If before the comparison the float16 is promoted to the result type, or if the float32 is broadcast to the result shape, the comparison is done correctly. So I think the helper would help with those things.
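To illustrate the point, here is a minimal sketch (plain `numpy` calls only, not any proposed helper API) of promoting both operands to their common dtype before comparing:

```python
import numpy as np

# Promote both operands to their common dtype before comparing,
# roughly what such a helper could do internally.
a = np.float16([2])
b = np.float32(2.0009766)

dt = np.result_type(a.dtype, b.dtype)   # float32 wins over float16
a_cast = a.astype(dt)
b_cast = np.asarray(b, dtype=dt)

result = a_cast < b_cast                # compared in float32
print(result)                           # [ True]
```

Passing the dtypes (rather than the values) to `result_type` is what makes this version-independent; with values, legacy value-based promotion could pick a different dtype.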
seberg commented Aug 12, 2023
@mdhaber no,
mdhaber commented Aug 12, 2023
I always pass the dtypes into
seberg commented Aug 12, 2023
Right, if you pass the dtypes you get the right thing, but that is the same as adopting NEP 50.
seberg commented Nov 3, 2023
I was trying to get back to this a bit, but having difficulty formulating quite what we need/want here. I would prefer not to go the feature-creep route for starters. Things that I would like:
There are some options, which would probably fit as kwargs:
Another API I worry about more: it would be nice to provide a clean way to preserve Python scalars better. This is to allow later normal promotion to use NEP 50 rules. Maybe one solution is to make it:

    conv = array_converter(*arrays)
    dt = conv.common_dtype()  # always applicable, may do initial conversion to arrays
    conv.shape          # could do this to get the broadcast shape, but could also defer.
    conv.was_pyscalar   # maybe. Or an enum with `0-D, scalar, python scalar`.
    arrays = conv.arrays(subok=True, allow_scalars=True, broadcast=True, dtype=dt)
    # Wrap result (if wanted)
    return conv.wrap(result)

Any thoughts? Implementing any of this isn't super hard, but getting more clarity on where to start without feature creep is, for me. I think I like that last thought as API (initialization takes no kwargs, but fetching the arrays is a function and not an attribute/property). But it isn't quite clear to me, especially how to do scalar preservation.
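For concreteness, here is a pure-Python mock of the API sketched above. This is hypothetical: the names `common_dtype`, `shape`, `was_pyscalar`, and `arrays` come from the sketch itself, and the real helper would be implemented in C and differ in details:

```python
import numpy as np

class ArrayConverterSketch:
    """Hypothetical pure-Python mock of the `array_converter` sketch above.

    Only an illustration of the proposed API, not NumPy's implementation.
    """

    def __init__(self, *objects):
        self._objects = objects
        # Python scalars only (NumPy scalars subclass Python ones, exclude them).
        self.was_pyscalar = tuple(
            isinstance(o, (bool, int, float, complex))
            and not isinstance(o, np.generic)
            for o in objects)
        self._arrays = [np.asarray(o) for o in objects]

    def common_dtype(self):
        return np.result_type(*self._arrays)

    @property
    def shape(self):
        return np.broadcast_shapes(*(a.shape for a in self._arrays))

    def arrays(self, dtype=None, broadcast=False):
        arrs = self._arrays
        if dtype is not None:
            arrs = [a.astype(dtype, copy=False) for a in arrs]
        if broadcast:
            arrs = np.broadcast_arrays(*arrs)
        return tuple(arrs)

conv = ArrayConverterSketch(np.float16([2]), np.float32([3.5, 4.5]))
dt = conv.common_dtype()                      # float32 (array-array promotion)
a, b = conv.arrays(dtype=dt, broadcast=True)  # both float32, shape (2,)
```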
    stop = asanyarray(stop) * 1.0
    conv = _array_converter(start, stop)
    start, stop = conv.as_arrays(subok=True)
    dt = conv.result_type(ensure_inexact=True)
Iterated a bit; this is still a bit rough (I would like to consolidate __array_wrap__ a bit at least).
This does define conv.wrap() as the one true way to call __array_wrap__, but for NEP 50, the interesting part is really this part.
(I have also moved the cast to float into the subtract via dtype=, which isn't strictly necessary.)
The point is that this means np.linspace(np.float32(3), 10) returns float32. (Yes, still needs tests, but thought I would note this, as this pattern would hopefully be useful in a few places that are currently subtly broken in the same way.)
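As an aside, the effect of `ensure_inexact=True` can be emulated with public NumPy calls. This is only a sketch of the intended semantics (the `result_type_ensure_inexact` name is made up), not the actual implementation:

```python
import numpy as np

def result_type_ensure_inexact(*arrays):
    # Made-up helper emulating `conv.result_type(ensure_inexact=True)`:
    # take the common dtype and, if it is not already inexact
    # (float/complex), promote it with float64 -- the effect the old
    # `asanyarray(stop) * 1.0` idiom achieved by multiplying.
    dt = np.result_type(*arrays)
    if not np.issubdtype(dt, np.inexact):
        dt = np.promote_types(dt, np.float64)
    return dt

dt_int = result_type_ensure_inexact(np.arange(3))       # integer -> float64
dt_f32 = result_type_ensure_inexact(np.float32([1.0]))  # stays float32
```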
mhvk left a comment
@seberg - now had a better look. I think this is really nice and it should be both relatively straightforward to extend and to use some of it inside the ufuncs.
One suggestion: it might be nice if conv can be indexed like a tuple. That should then return the "anyarray" version (in which case the default subok=False for as_arrays() makes sense; without this, I feel subok=True would be a more logical default).
If not doing that, I would suggest replacing ary, = conv.as_arrays() with ary = conv.as_arrays()[0] for clarity (the trailing comma before the equal sign is easy to miss...).
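The two spellings are equivalent; a tiny illustration of why the explicit index reads more clearly:

```python
# Both spellings extract the single element of a length-1 sequence;
# the trailing comma before `=` is easy to miss when reading.
values = ([1, 2, 3],)

ary, = values     # tuple unpacking of a length-1 sequence
assert ary == [1, 2, 3]

ary = values[0]   # explicit indexing reads more clearly
assert ary == [1, 2, 3]
```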
    Py_ssize_t narrs_ssize_t = (args == NULL) ? 0 : PyTuple_GET_SIZE(args);
    int narrs = (int)narrs_ssize_t;
    /* Limit to 64 for now to use in a few places. */
    if (narrs_ssize_t > 64) {
More logical to use NPY_MAXARGS?
    item->DType = NPY_DTYPE(PyArray_DESCR(item->array));
    Py_INCREF(item->DType);
    if (!npy_mark_tmp_array_if_pyscalar(
I think I'd swap the branches to avoid the `!`
        Py_INCREF(item->descr);
    }
    else {
        /* just clear the flag again */
Well, swap only if this is really not needed! This would seem to preclude using it later, is that intended? Or is ->was_scalar exactly the same (i.e., python scalars only)?
I also leave descr empty. So yes, it is just the same. Just wanted to go through that, because that is where other paths go through.
        res_item = item->object;
    }
    else {
        res_item = (PyObject *)item->array;
Probably premature optimization, but it would seem logical not to even create the array from a python scalar unless it is actually needed.
yes, but that requires splitting up the creation, so I indeed thought it premature.
This is one of the things that we could do in the ufunc if and only if we delete the "promotion state" (to allow setting it to warn for NumPy 2).
    creation_helper_dealloc(PyArrayCreationHelperObject *self)
    {
        creation_item *item = self->items;
        for (int i = 0; i < self->narrs; i++) {
you need an item++ here too! (or do self->items[i])
Ouch, luckily, probably the reason the sanitizer job failed...
    creation_helper_as_arrays(PyArrayCreationHelperObject *self,
            PyObject *const *args, Py_ssize_t len_args, PyObject *kwnames)
    {
        npy_bool subok = NPY_FALSE;
I think this will replace more asanyarray than asarray, so maybe make the default NPY_TRUE? In general, subclasses should be OK...
OK, let me change it. In practice it isn't a big change. For a bit I was thinking "why asanyarray at all?!"
I am not actually sure it is more useful for most places, but since conv.wrap() is (normally) a no-op for those now, I also don't really care to prefer it.
The point for asanyarray() is allowing functions to work due to composition of other functions/methods working out, when simple wrapping might not propagate info well.
In practice, I do wonder how often that is actually well advised (compared to using __array_function__ and dealing with it manually). The point being: if the class is not trivial enough for just wrapping, maybe it should use __array_function__. OTOH, I can see a small amount of pragmatism in it, so fine with preferring it.
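The `subok` distinction being discussed here is the same one `asarray` vs `asanyarray` makes; a quick demonstration with a trivial subclass:

```python
import numpy as np

# `asarray` strips array subclasses, `asanyarray` passes them through --
# the same distinction `subok=False/True` makes for the helper.
class MyArray(np.ndarray):
    pass

a = np.zeros(3).view(MyArray)
print(type(np.asarray(a)).__name__)     # ndarray (subclass stripped)
print(type(np.asanyarray(a)).__name__)  # MyArray (subclass preserved)
```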
numpy/lib/_arraysetops_impl.py Outdated
| """ | ||
| conv=_array_converter(ary) | ||
| ary,=conv.as_arrays(subok=True) |
Maybe ary = conv.as_arrays(subok=True)[0].ravel().
It does feel one should just be able to index conv (effectively making it a bit like a subclass of tuple).
Then it would be ary = conv[0].ravel().
    # matrices have to be transposed first, because they collapse dimensions!
    out_arr = transpose(buff, buff_permute)
    return res.__array_wrap__(out_arr)
    res = transpose(buff, buff_permute)
For subclasses other than matrix, this would be an API change if .transpose() is overridden. That seems rather unlikely (and wrong-headed), so I'm fine with making the change regardless...
seberg left a comment
I have addressed all the comments and changed the default for subok (I was torn on that initially, but I made my peace with it).
I allow indexing now (which also allows a, b = conv unpacking), although calling the method is actually a tiny bit faster (the reason is that unpacking checks conv[length] to trigger the index error).
I have a start on reorganizing the ufuncs to share some infrastructure for array-wrap at least. But I decided that shouldn't stop us here, since it mostly means moving the find_wrap function out to share it, and the ufunc changes are not quite utterly trivial.
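The unpacking cost mentioned above comes from Python's fallback iteration protocol: an object with `__getitem__` but no `__iter__` is iterated by indexing 0, 1, 2, ... until `IndexError`. A toy class makes the extra call visible:

```python
class Indexable:
    """Toy object supporting indexing only (no __iter__)."""

    def __init__(self, items):
        self._items = items
        self.getitem_calls = 0

    def __getitem__(self, i):
        self.getitem_calls += 1
        return self._items[i]   # raises IndexError past the end

obj = Indexable(("a", "b"))
a, b = obj                  # unpacking falls back to __getitem__
print(obj.getitem_calls)    # 3: indices 0 and 1 succeed, 2 raises IndexError
```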
    }
    else {
        /*
         * `__array_wrap__` expects an array argument, so we ensure that.
Well, I think that scalars remain common in some code paths (for the time being at least). But rephrased the comment.
mhvk left a comment
Looks mostly great now! Just a few remaining comments.
One comment: with the indexing giving any array, I think it is actually OK to have subok=False as default for as_arrays(). Though also completely happy to keep it as you have it now - in principle, subclasses should behave like regular ones, so it should be fine to keep them (and it is nice if the default actually corresponds to the least amount of work, which is the case for subok=True since you are not viewing as a regular array).
        }
    }
    Py_INCREF(self);
Are you sure you need the INCREF here? PyObject_NewVar returns a new reference already (which you DECREF in the fail path).
Right, thanks! Must have been some remnant of some debugging confusion...
        return NULL;
    }
    /* When we only have scalars, the default is to retur an array: */
retur -> return
mhvk commented Nov 28, 2023
p.s. On the
seberg left a comment
Yeah, about SUBCLASS_SAFE_FUNCTIONS: it would be great if we could formalize this in some form and document the pattern. But the best would be to generalize it to non-subclasses somehow, which might be just calling func._implementation() rather than ndarray.__array_function__.

with the indexing giving any array, I think it is actually OK to have subok=False as default for

I agree more with the latter part, unless we had asanyarray also: align default and short-hand. Plus, if you want a real nudge toward subok, having the opposite default is no good.
The main reason for not doing it was maybe always matrices (and probably historic). But we should hopefully worry less and less about those.
Also added a start for documentation.
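A sketch of the `func._implementation()` escape hatch mentioned above. Note that `_implementation` is a private NumPy detail of the override machinery and may change:

```python
import numpy as np

# `__array_function__` lets array-likes intercept NumPy functions; the
# private `_implementation` attribute on a dispatched function bypasses
# that dispatch (an internal detail, not stable API).
class Intercepting(np.ndarray):
    def __array_function__(self, func, types, args, kwargs):
        return "intercepted"

a = np.arange(3).view(Intercepting)
dispatched = np.ravel(a)              # goes through __array_function__
plain = np.ravel._implementation(a)   # skips the dispatch entirely
print(dispatched)                     # intercepted
```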
        }
    }
    Py_INCREF(self);
Right, thanks! Must have been some remnant of some debugging confusion...
ngoldbaum left a comment
I looked this over and spotted a few minor things but no showstoppers.
numpy/_core/src/multiarray/ctors.c Outdated
     * This means it was:
     * - Recognized as a scalar by a/the dtype. This can be DType specific,
     *   for example a tuple may be a scalar, but only for structured dtypes.
     * - Anything not recognized a scalar (mapped to a DType) but also not
    - * - Anything not recognized a scalar (mapped to a DType) but also not
    + * - Anything not recognized as an instance of a DType's scalar type but also not

I got tripped up on the original wording - take it or leave it, but if you leave it, it should be "recognized as a scalar".
numpy/_core/_add_newdocs.py Outdated
    Helper to convert one or more objects to arrays. Integrates machinery
    to deal with the ``result_type`` and ``__array_wrap__``.
    The reason for this is that e.g. ``result_type`` needs to covert to arrays
    - The reason for this is that e.g. ``result_type`` needs to covert to arrays
    + The reason for this is that e.g. ``result_type`` needs to convert to arrays
mhvk left a comment
I think what's left is only typos and some suggestions for the new docstrings!
numpy/_core/_add_newdocs.py Outdated
| wrap(arr, /, to_scalar=None) | ||
| Call ``__array_wrap__`` on the input ``arr``. When ``arr`` is already | ||
| the correct subclass, this may be a no-op. |
I think one should be explicit about what is done: "this may be a no-op" -> "no wrapping is done". Or maybe rewrite the note:

    Call ``__array_wrap__`` on ``arr`` if ``arr`` is not the same subclass
    as the input the ``__array_wrap__`` method was retrieved from.

numpy/_core/_add_newdocs.py Outdated
    add_newdoc('numpy._core._multiarray_umath', '_array_converter', ('scalar_input',
        """
        A tuple which is ``True`` if the original inputs were scalars and
        ``False`` otherwise. Scalars are all objects which are coerced to a
How about

    A tuple which indicates for each input whether it was a scalar that
    was coerced to a 0-D array (and was not already an array or something
    converted via a protocol like ``__array__()``).

numpy/_core/_add_newdocs.py Outdated
    ``False`` otherwise. Scalars are all objects which are coerced to a
    0-D array but are not already arrays or converted via one of the protocols.

    .. note::
My sense would be to delete this note: it probably adds more confusion than it solves...
numpy/_core/_add_newdocs.py Outdated
| ---------- | ||
| arr : ndarray | ||
| The object to be wrapped. Ideally, this is always an ndarray | ||
| or subclass, although NumPy scalars are accepted as of now by |
How about

    The object to be wrapped. Normally an ndarray or subclass,
    although for backward compatibility NumPy scalars are also accepted
    (these will be converted to a NumPy array before being passed on to
    the ``__array_wrap__`` method).

numpy/_core/_add_newdocs.py Outdated
    When ``True`` will convert a 0-d array to a scalar via ``result[()]``
    (with a fast-path for non-subclasses). If ``False`` the result should
    be an array-like (as ``__array_wrap__`` is free to return a non-array).
    By default (``None``), a scalar is returned if all inputs are scalar.
are scalar -> were scalar
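The `result[()]` idiom this docstring refers to is just empty-tuple indexing of a 0-d array, which yields a NumPy scalar:

```python
import numpy as np

res = np.array(3.0)    # 0-d float64 array
scalar = res[()]       # empty-tuple indexing extracts a NumPy scalar
print(type(scalar).__name__)   # float64
```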
    PyObject *obj = self->items[i].object;
    if (PyArray_CheckExact(obj)) {
        if (wrap == NULL || priority < 0) {
            Py_INCREF(Py_None);
Strictly, you don't need to set the wrap here, given the if (wrap == NULL) clause at the end. (Same for scalar.)
mhvk commented Nov 30, 2023

p.s. Probably time for some squashing of commits too...
seberg commented Nov 30, 2023

I don't want to jinx this because of it, but the
mhvk commented Nov 30, 2023

I'm a bit confused: why would it not work for

That said, you are right that in principle it would be better to give
ngoldbaum commented Nov 30, 2023

see #25278, which was reported with impressive timing for this PR
mhvk commented Dec 19, 2023

I'm still not quite sold on
…r`` (#25409)

This reorganizes how array-wrap is called. It might very mildly change the semantics for reductions I think (and for negative priorities). Overall, it now passes a new return_scalar=False/True when calling __array_wrap__ and deprecates any array-wrap which does not accept arr, context, return_scalar.

I have not integrated it yet, but half the reason for the reorganization is to integrate it/reuse it in the array_converter helper PR (gh-24214), which stumbled over trying to make the scalar handling sane.

Forcing downstream to add return_scalar=False to the signature is a bit annoying, but e.g. our memory maps currently try to guess at it, which seems bad. I am hoping this can be part of making the scalar vs. array return more sane.

But, maybe mainly, I hope it consolidates things (together with gh-24124 mainly, as if ufuncs were the only complex place we used this, it wouldn't matter much).

---

* API: Reorganize `__array_wrap__` and add `return_scalar=False` - this also deprecates any `__array_wrap__` which does not accept `context` and `return_scalar`.
* BUG: Fix niche bug in rounding.
* MAINT: Adjust __array_wrap__ in code and tests (also deprecation test)
* MAINT: Use/move the simplest C-wrapping also
* DOC: Update doc and add release note
* STY: Make linter happy in old tests
* MAINT: Silence GCC warning (value cannot really be used uninitialized)
* MAINT: Small style fixes
* BUG: Fix reference leak in ufunc array-wrapping - this probably doesn't fix the 32bit issue unfortunately, only the windows ones...
* BUG: Fix leak for result arrays in all ufunc calls
* Ensure we try passing context and address related smaller review comments
* Ensure we don't try `context=None` and expand code comment
* Rely on return_scalar always being right (and style nit)
* Remove outdated comments as per review
* Let's just undo force-wrap for now for reductions (it's a change...)
* ENH: Chain the original error when the deprecation warning is raised - doing this due to gh-25635 since it is super confusing with the bad retrying...
* BUG,MAINT: Address Marten's review comments
This helper allows dealing with result_type and array wrapping without explicitly holding on to the original objects and especially potentially converting them many times.

This doesn't quite fix all places, but it does most (most prominently, linalg also does some quite awkward wrapping).
seberg commented Jan 20, 2024

@mhvk and @ngoldbaum, I merged main (i.e. the array-wrap helper) and applied all the remaining doc suggestions (thanks, they were great!). Then squashed it down to three commits. I think this should be pretty much ready now, since IIRC it was relatively settled. But happy to do another round if things come up, of course.
mhvk left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This looks good! Looking at it again, I mainly wonder about methods and arguments that are defined but not actually used. Arguably, we should not have those without a clear use case... But as long as this is private, perhaps it is OK?
    its memory, in which case you can traverse ``a.base`` for a memory handler.
    """)

    add_newdoc('numpy._core._multiarray_umath', '_array_converter',
I wondered a bit about whether this needs to be private, as it seems generally useful. But I guess this allows possible changes before making it public?
numpy/_core/fromnumeric.py Outdated
    result = wrap(result)
    return result

    conv = _array_converter(obj)
    # As this already tried the method, subok is maybe quite reasonable here:
Maybe add "# but this follows what was done before. TODO: revisit this." Or something else that makes clearer that it would be good to make this change (even if best not to do so in this PR!).
    @@ -0,0 +1,471 @@
    /*
     * This file defines an _array_converter object used internally to NumPy
"used internallyin Numpyto [deal ...]"
    item->DType = NPY_DTYPE(PyArray_DESCR(item->array));
    Py_INCREF(item->DType);
    if (npy_mark_tmp_array_if_pyscalar(
Could this be if (item->scalar_input && npy_mark...)? If so, maybe combine with l. 93-95,

    if (item->scalar_input) {
        if (npy_mark...) {
            ...
        }
        else {
            ...
        }
    }
    else {
        /* non-scalar input */
        self->flags &= ~(NPY_CH_ALL_PYSCALARS | NPY_CH_ALL_SCALARS);
    }
Yes, that is correct, and I adopted it, but not as suggested here. The pretty-code annoyance is that item->descr needs to be dealt with, so whatever you do, you have some duplication.
But I agree, it isn't super nice to split this from l. 93, and maybe easier to grok if it is clear that npy_mark_tmp_array_if_pyscalar does indeed require scalar_input.
numpy/_core/_add_newdocs.py Outdated
| float, or complex. | ||
| """) | ||
| add_newdoc('numpy._core._multiarray_umath','_array_converter', ('scalar_input', |
I think scalar_input is not used anywhere in the code. Do you have cases in mind where it would be useful? It sort-of feels like it should be, but if there is no use, should it be exposed? (Fine to leave in, just asking.)
Also, right now it is a property. Might it make sense to make it a method (with no arguments), so that one could extend it if needed? (E.g., to ask not just for scalar, but python scalar inputs.)
But perhaps all this just argues for keeping _array_converter private for now, so we can see what is best!
The linter says this line is one character too long
I didn't think we cared, but fair, pushed the fix (included it in the previous commit).
        return NULL;
    }
    /* When we only have scalars, the default is to return an array: */
I was again a bit confused about this, maybe it helps to say "We follow the default setting for as_arrays of CONVERT_IF_NO_ARRAY"
    ----------
    subok : True or False, optional
        Whether array subclasses are preserved.
    pyscalars : {"convert", "preserve", "convert_if_no_array"}, optional
This argument is not used anywhere, so a bit of the same question as for scalar_input above. In particular, I wonder if it wouldn't be handier to pass this on input (perhaps still allowing to override it here). But without a use case, probably not worth overthinking it now!
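For reference, an illustrative pure-Python model of what the three pyscalars policies mean. The helper name convert_inputs is made up; the real logic is in C:

```python
import numpy as np

def convert_inputs(values, pyscalars="convert_if_no_array"):
    """Toy model of the pyscalars policies; not the actual implementation."""
    is_py = [isinstance(v, (bool, int, float, complex))
             and not isinstance(v, np.generic) for v in values]
    if pyscalars == "convert_if_no_array" and not all(is_py):
        # Mixed input: a real array is present, so Python scalars
        # stay as-is and result_type still sees them as "weak".
        pyscalars = "preserve"
    if pyscalars == "preserve":
        return [v if py else np.asarray(v) for v, py in zip(values, is_py)]
    # "convert", or "convert_if_no_array" with scalar-only input:
    # everything becomes a (possibly 0-d) array.
    return [np.asarray(v) for v in values]
```

So with the default policy, scalar-only input returns arrays, while mixed input keeps the Python scalars intact.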
seberg commented Jan 21, 2024
Addressed the smaller comments, thanks. The API issues are not very clear to me. I really like this type of feature, but it is quite fuzzy to me how it should look and which details are useful (enough). That is why I opted for private for now: I am not comfortable with saying that this API seems good (enough) unless others give very clear +1's. Generally, I would like this or something similar to be public, because I think it can be generally useful, and also specifically because I think it is more useful with NEP 50 if downstream wants to be pedantic. From a maintenance-burden perspective I am not super worried even if we deprecate it again soonish; we would just be name-squatting. If we keep things private, I somewhat like just keeping those odd features. I am not sure that we will use them (in fact I doubt it in the near future), but I also like the idea of "advertising" that this is possible. If we lean towards public API, I fully agree to ditch them all until it is very clear we or downstream use them. Those odd features shouldn't stop us from making this public! |
mhvk left a comment
OK, code looks good now! On the API, I'm fine with keeping it private for now and seeing what ends up being useful and what not. I'll approve now, but won't merge to give @ngoldbaum a chance to look.
p.s. Note that the linter failed.
ngoldbaum commented Jan 21, 2024
I'll look at this tomorrow.
seberg commented Jan 22, 2024
FWIW, I rebased my try-0d-preservation branch, and ended up using those features (not that I am sure alternatives wouldn't be better). It actually seems surprisingly close to doing something reasonable.
ngoldbaum left a comment
I looked over the C code again and didn't spot any issues. Please push a fixup for the linter then this should be good to go.
ngoldbaum commented Jan 22, 2024
I figure it's better to keep the linter happy so people don't get surprise linter corrections on future PRs. Thanks!
Had to give this a start to be able to experiment. It does solve at least some problems (e.g. the np.select() use-case, and it looks like it would fix and/or simplify linalg code).
The main thing is that we often convert multiple arrays, and then want something like:
- a, b, c = np.asarray(a), np.asarray(b), np.asarray(c) and then dtype = np.result_type(a, b, c), often with an additional 0.0. That is a pretty common pattern; NEP 50 makes it tricky if you wish to support weak scalars, because you need to call result_type with the scalar information intact (before the asarray, or manually undo it). (For example, this happens in np.select().)
- __array_wrap__ for subclass support, although it seems consistently used for single-input functions (outside ufuncs).
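The promotion point above can be shown in a minimal example (a 1-element array stands in for the converted value, to keep the result independent of legacy 0-d promotion special cases):

```python
import numpy as np

a = np.linspace(0, 1, 3, dtype=np.float32)

# Scalar passed directly: result_type treats 2.0 as a "weak" scalar,
# so the float32 dtype of `a` wins.
dt_weak = np.result_type(a, 2.0)

# After conversion, the value is an ordinary float64 array and
# promotion gives float64 -- the weak-scalar information is gone.
dt_converted = np.result_type(a, np.asarray([2.0]))
```

This is exactly why a helper that calls result_type before (or alongside) the asarray step is attractive.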
result_type()might error and I am not sure we always need it.So, is it insane to create a rich object (rather than a complicated return value and many kwargs):
(I can see
a, b, info = helper(a, b)as a nice pattern; OTOH it makes including casting kwarg heavy since you somewhat need to include it in the first step.)Further things might include:
arrays.to_common_dtype(ensure_inexact=True))