ENH: add quantile function with weights support #494


Draft
cakedev0 wants to merge 27 commits into data-apis:main from cakedev0:quantile

Conversation

@cakedev0 (Contributor) commented Oct 21, 2025 (edited)

Resolves #340

Specs:

  • no weights: ND support, propagate NaNs, supported methods: "linear", "inverted_cdf", "averaged_inverted_cdf"
    • adding support for omitting NaNs shouldn't be too hard (if you assume NaNs are sorted at the end, which is what everyone does), but I don't think it's needed for now.
  • with weights: only 1D or 2D support, propagate or omit NaNs, supported methods: "inverted_cdf", "averaged_inverted_cdf"
  • delegates to all major supported backends when possible (numpy, cupy, torch and jax), except dask, as it was raising errors.
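To make that surface concrete, here is a hypothetical usage sketch; the exact signature (including the axis parameter) is my assumption from the description above, not the final API:

import numpy as np
import array_api_extra as xpx  # assumed import alias

a = np.array([[1.0, 2.0, np.nan], [4.0, 5.0, 6.0]])
w = np.array([[1.0, 2.0, 1.0], [1.0, 1.0, 2.0]])

# non-weighted: ND support, NaNs propagate
xpx.quantile(a, 0.5, axis=1, method="linear")

# weighted (1D/2D only), omitting NaNs on request
xpx.quantile(a, 0.5, axis=1, weights=w, method="averaged_inverted_cdf", nan_policy="omit")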

Comparison with NumPy:

These specs match NumPy's quantile for nan_policy="propagate" and nanquantile for nan_policy="omit". Differences are:

  • using nan_policy="propagate"|"omit" (inspired by SciPy) instead of two functions like NumPy (quantile and nanquantile). Why? Less code/doc duplication, and clearer behavior IMO.
  • weighted case:
    • support for the "averaged_inverted_cdf" method (only "inverted_cdf" is supported by NumPy): we need this in scikit-learn.
    • only up to 2D: we don't need more for now.
  • non-weighted case:
    • fewer methods. Why? We don't need more. I've only found one call with an explicit method in scikit-learn, and it uses a deprecated method.
    • no support for nan_policy="omit": there are a few calls to nanpercentile in scikit-learn. Would be easy to implement, following what has been done in SciPy.
    • implementation: we sort instead of relying on partitioning like NumPy does. A partitioning-based implementation would be quite a bit more complex and would not benefit performance in most cases, as currently xpx.partition relies on sorting when not delegating. This could be worked on later.

Comparison with SciPy:

The main difference is the broadcasting/reduction behavior. We aligned with NumPy for this one.

Also, SciPy doesn't support weights.

Implementation for the non-weighted case is more or less copy-pasted from SciPy, except that I've only kept support for 3 methods.

Future use of xpx.quantile in scikit-learn

These specs should be enough to replace sklearn.utils.stats._weighted_percentile and to replace most uses of np.quantile/percentile when rewriting numpy-based functions to support the Array API standard.

Note that the implementation for the weighted case differs a bit from sklearn's _weighted_percentile: it's mostly the same approach (sort weights based on a and compute the cumulative sum), but they differ in the way edge cases are handled (mostly null weights). I believe my implementation is easier to read and to get right, and equivalent in terms of performance (dominated by argsort for big inputs).
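For reference, a minimal NumPy sketch of that approach for the "inverted_cdf" method (my own simplification: it ignores NaNs, zero-weight edge cases, and broadcasting):

import numpy as np

def weighted_quantile_1d(a, q, weights):
    # sort weights by a, accumulate, and find where the weighted CDF crosses q
    order = np.argsort(a)
    a_sorted = a[order]
    cdf = np.cumsum(weights[order])
    # first sample whose cumulative weight reaches q * total_weight ("inverted_cdf")
    idx = np.searchsorted(cdf, q * cdf[-1], side="left")
    return a_sorted[idx]

# e.g. weighted_quantile_1d(np.array([3., 1., 2.]), 0.5, np.array([1., 1., 1.])) -> 2.0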


Still TODO:

  • decide whether to support dask. I don't think we should: it's very slow and doesn't match the purpose of dask as mentioned here. Supporting dask would also slow down the CI quite a bit.
  • read scikit-learn's tests for _weighted_percentile to see if there is something I missed in my tests

@cakedev0 marked this pull request as draft October 21, 2025 10:05
@lucascolley changed the title from "FEA: Add quantile function - method="linear", no weights" to "ENH: add quantile function - method="linear", no weights" Oct 21, 2025
@lucascolley added the enhancement (New feature or request) and new function labels Oct 21, 2025
@lucascolley mentioned this pull request Oct 21, 2025
@lucyleeow

From scipy/scipy#23832:

There is a slight difference between NumPy gufunc and scipy.stats rules that has to do with prepending 1s to the shapes when the arguments have different dimensionality. Specifically, in scipy.stats, axis identifies the core dimension after 1s are prepended to match the dimensionalities.

I am assuming all delegate packages do what NumPy does in terms of broadcasting? If that is the case, we should probably follow suit, unless we think SciPy has better rules?

@cakedev0 changed the title from "ENH: add quantile function - method="linear", no weights" to "ENH: add quantile function with weights support" Oct 23, 2025
    raise ValueError(msg)
else:
    if method not in {"inverted_cdf", "averaged_inverted_cdf"}:
        msg = f"`method` '{method}' not supported with weights."


only methods x / y support weights?


methods= {"linear","inverted_cdf","averaged_inverted_cdf"}
ifmethodnotinmethods:
msg=f"`method` must be one of{methods}"


Sort methods to get a deterministic output?

@cakedev0 (Contributor, Author)


I think it's deterministic already. But do you mean declaring methods in the sorted order?

Like this: methods = {"averaged_inverted_cdf", "inverted_cdf", "linear"}


I thought this was not deterministic - but maybe that was in older Python versions or on different OSes, or maybe I just misremembered? In any case - sorry about the noise.
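For what it's worth, set iteration order for strings can vary between interpreter runs under hash randomization, so sorting when formatting the message would keep it stable. A minimal sketch of that variant (method is an example value here):

method = "weibull"  # example value
methods = {"linear", "inverted_cdf", "averaged_inverted_cdf"}
if method not in methods:
    # sorted() gives a stable, run-independent ordering in the error message
    msg = f"`method` must be one of {sorted(methods)}"
    raise ValueError(msg)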

raise ValueError(msg)
nan_policies = {"propagate", "omit"}
if nan_policy not in nan_policies:
    msg = f"`nan_policy` must be one of {nan_policies}"


ditto

@lucascolley (Member)

@cakedev0 would you be able to fill out the PR description a little more to justify the API decisions, primarily explaining how the implementation compares to what is available in NumPy and SciPy, respectively?

Is there a link to a PR which would use this in sklearn, or are we not at that stage yet?

@cakedev0 (Contributor, Author)

Added some comparisons with numpy/scipy in the PR description.

Is there a link to a PR which would use this in sklearn, or are we not at that stage yet?

We are not at that stage yet, but I would be happy to open an "experiment PR" where I replace most calls to np.percentile/np.quantile/_weighted_percentile with calls to xpx.quantile to check that the tests pass. This would help a lot with building confidence in this PR (in terms of API decisions and implementation correctness).


@lucascolley (Member)

Thanks, yeah, that sounds like a good idea.

@cakedev0 (Contributor, Author)

Updates from working on the "experiment PR" (you can see the diff here: cakedev0/scikit-learn#3):

  • all tests pass 🎉 (including tests for _weighted_percentile)
  • supporting nan_policy="omit" is important/needed. It unblocks 3 preprocessors (QuantileTransformer, RobustScaler, SplineTransformer), which is more or less half of the scikit-learn items needing an xpx.quantile on the list "Make more of the 'tools' of scikit-learn Array API compatible" (scikit-learn/scikit-learn#26024) - so I'll work on supporting it.
  • supporting more methods would be a bit useful for QuantileTransformer (but I think it's fine to not support all the methods for all the backends...). Question: do we need to raise an error for a method that's implemented by NumPy but not by xpx when the backend is NumPy?
  • it would be convenient to have a "default" method that defaults to "averaged_inverted_cdf" for the weighted case and "linear" for the non-weighted case (see the sketch below). Unsure whether this is good practice in terms of interface, though?
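A minimal sketch of what that hypothetical "default" dispatch could look like (names are illustrative, not the actual implementation):

def quantile(a, q, *, method="default", weights=None):
    # hypothetical: resolve "default" based on whether weights are given
    if method == "default":
        method = "linear" if weights is None else "averaged_inverted_cdf"
    ...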

@cakedev0 (Contributor, Author)

(putting this in draft mode while I'm working on supporting nan_policy="omit" for the non-weighted case)

@cakedev0 marked this pull request as draft October 27, 2025 19:43
@lucyleeow

would you be able to fill out the PR description a little more to justify API decisions,

FYI @cakedev0, there is currently a discussion about NaN API in numpy: https://mail.python.org/archives/list/numpy-discussion@python.org/thread/P4EEL5P2PNDASWLI24CPHOW2KG2LZ3YY/

supporting more methods would be a bit useful for QuantileTransformer

Can you expand?


@cakedev0 (Contributor, Author) commented Oct 28, 2025 (edited)

supporting more methods would be a bit useful for QuantileTransformer

Sorry, not QuantileTransformer but KBinsDiscretizer; it supports various quantile methods: https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/preprocessing/_discretization.py#L59-L66

(but I guess it's fine not to support all the methods in the array API version)

@mdhaber (Contributor) commented Nov 4, 2025 (edited)

Perhaps I can't be entirely objective, but I think there's also an objective argument against a split between xpx.quantile and stats.quantile.

It sounds like the pain points are the broadcasting behavior and weights. Let's see if I can address those.

Main difference is the broadcasting/reduction behavior. We aligned on numpy for this one.

NumPy is self-inconsistent on this one. NEP 5 specifies what generalized universal functions should look like. I don't know the history - maybe quantile/percentile existed first - but they could (in principle) follow the NEP and don't. That behavior makes it impossible to take quantiles at different probabilities for different slices, which has real-world uses (e.g. BCa bootstrap). The behavior of stats.quantile permits that and permits the Cartesian product calculation.

Now, this is not to say that either of the NumPy precedents should be the final answer for the array API or array-api-extra. I think we should do what's right for the ecosystem. Consistency with other array API functions (does anything else "broadcast" by taking a Cartesian product? Imagine if this were the standard for other array operations like addition!) and functionality (do we really want to preclude the possibility of different probabilities corresponding with each slice?) need to be considered. If sklearn only needs the Cartesian product behavior and doesn't like the standard broadcasting rules, a lightweight wrapper can easily translate.
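For example, per-slice probabilities fall out of ordinary broadcasting (a sketch assuming a SciPy version that ships stats.quantile):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.random((3, 100))              # three slices of data
p = np.array([[0.1], [0.5], [0.9]])   # a different probability for each slice

# row i of x is paired with p[i]; the core axis is reduced away
stats.quantile(x, p, axis=-1).shape   # (3,)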

Also, SciPy doesn't support weights.

weights were requested in scipy/scipy#22894. I was willing to help with it, but there was no response to scipy/scipy#22794 (comment). A PR for weights appeared, but it didn't address the concerns in scipy/scipy#22894. I didn't say anything to block it, but apparently nobody else is interested, so neither has progressed. weights are not very hard to implement; it's the theoretical/interface question that is tricky.

If I can get some feedback about scipy/scipy#22644 - which is about quantile and predates the weights request - and get that merged, I would be more proactive about adding weights support to SciPy's quantile. That PR actually has a similar interface question about how to accept method-specific options, which we might have if weights were not supported by the continuous methods.


@lorentzenchr

After getting weights into numpy quantile/percentile, I was too exhausted to continue pushing it elsewhere. At least now, there is one reference: numpy. I would love to see it adopted elsewhere. Quantiles are just very important in many fields.

@mdhaber (Contributor)

I drafted it for stats.quantile last night, and I studied existing uses of weights in scipy.stats. I'll submit a PR today or tomorrow.

The same code works for all H&F methods. I was a little concerned about non-integer weights before, but I don't think there's a real problem. For integer weights, it does the same thing as repeated samples, so we can document that these are frequency weights¹. It's just that for most methods, the interpolation actually does depend on the total amount of data / total weight.

Details:

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(328983495438219)
p = np.linspace(0, 1, 300)
res = np.quantile([0, 1], p, method='linear')
res2 = np.quantile([0, 0, 1, 1], p, method='linear')
res3 = np.quantile([0, 0, 0, 1, 1, 1], p, method='linear')
plt.plot(p, res, p, res2, p, res3)

[Plot: the three 'linear' quantile curves, showing the interpolation changing with the total amount of data]

Footnotes

  1. If they can be interpreted as some other kind of weight, too, great. But almost all uses of weights in stats turn out to be frequency weights.


@mdhaber (Contributor) commented Nov 6, 2025 (edited)

PR is up at scipy/scipy#23941. Here's a wrapper that hamstrings stats.quantile to get NumPy's Cartesian product behavior:

import numpy as np
from scipy import stats

def quantile_cartesian_product(x, p, method='linear', axis=0):
    p_shape = p.shape
    p = np.ravel(p)
    x = np.moveaxis(x, axis, -1)
    res = stats.quantile(x, p, method=method, axis=-1, keepdims=True)
    if len(p_shape):
        res = np.moveaxis(res, -1, 0)
        res = np.reshape(res, p_shape + res.shape[1:])
    else:
        res = np.squeeze(res, axis=-1)
    return res

# test against np.quantile's Cartesian product behavior
rng = np.random.default_rng(234892398458345982)
x = rng.random((6, 7, 8))
for axis in [-1, 0, 1]:
    for p_shape in [(), (1,), (2,), (2, 3)]:
        p = rng.random(p_shape)
        kwargs = dict(axis=1, method='weibull')
        res = quantile_cartesian_product(x, p, **kwargs)
        ref = np.quantile(x, p, **kwargs)
        np.testing.assert_allclose(res, ref)

@lucascolley (Member)

where does this stand now that scipy/scipy#23941 has merged?

@cakedev0 (Contributor, Author)

Personally, I don't really agree with some of the decisions that were made in scipy.stats, but I was unable to comment on the PR due to my wrist issues.

Also, I'm busier lately, and I don't want to spend time porting the quantile function from SciPy here: it's really big now and relies on a lot of internal SciPy helpers, so it would be a lot of work.

Being very new to the ecosystem, I'm not sure my opinion is well informed, but here are my disagreements with the choices that were made in SciPy:

About the broadcasting behavior: I agree that the behavior of scipy.stats.quantile is indeed the most flexible.
But I don't think it's natural. The doc says that it behaves the same as other functions in scipy.stats, but I disagree. Other functions I could find with broadcasting behavior in scipy.stats all work this way:
f(x, y, axis) -> broadcast(x, y); apply f along axis (and reduce it away if keepdims=False).
In those, x and y are of the same nature, so it's natural to broadcast them together. And then axis is reduced away. It's equivalent to the behavior you'd get for f(x, y, axis): return (x + y).sum(axis=axis). Very natural.
For quantile, first, it's not very natural to broadcast "data" with "parameters", and secondly, the reduction behavior is much more complex. One thing I really don't like is that the output dimension is not determined by the input dimensions; for instance, both quantile([0, 1], [0.5]) and quantile([0, 1], 0.5) have dimension 0, but quantile([0, 1], [0.5, 0.9]) has dimension 1.
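Restating that as code (shapes as described above; assumes a SciPy version with stats.quantile):

import numpy as np
from scipy import stats

x = np.asarray([0.0, 1.0])
stats.quantile(x, 0.5).shape                     # ()   -- scalar p
stats.quantile(x, np.asarray([0.5])).shape       # ()   -- length-1 p is also reduced
stats.quantile(x, np.asarray([0.5, 0.9])).shape  # (2,) -- longer p keeps a dimension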

So for me, it's not clear which option is the best. IMO, here are the pros and cons of the SciPy approach:
+++ more powerful; it unlocks things that would otherwise require a Python loop. But how much would this loop cost? It's generally very OK to loop over 100 features, for instance.
-- more complex behavior for users; it changes what they are used to for this function (all other array libraries align with NumPy, I think).
- for array-api-extra: more complex implementation, more complex delegation code, fewer cases where delegation is possible.

About the weights: personally, I don't understand the interest of frequency weights when you have a valid interpretation only for integers. Those should be called counts, not weights, and they are just an optimization, not a feature, because f(x, weights=counts) returns the same result as f(xp.repeat(x, counts)); it just computes it in a more efficient way (see the check below).
So I don't think supporting weights as frequency weights for all methods is a good idea; it would be better to implement a counts argument if the optimization is needed.
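A quick check of that equivalence with NumPy's frequency-weight support (assumes NumPy >= 2.0, where weights is only accepted with method="inverted_cdf"):

import numpy as np

x = np.array([1.0, 2.0, 3.0])
counts = np.array([2, 1, 3])

# integer weights behave exactly like repeated samples
a = np.quantile(x, 0.5, method="inverted_cdf", weights=counts)
b = np.quantile(np.repeat(x, counts), 0.5, method="inverted_cdf")
assert a == b  # both are 2.0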

For methods"inverted_cdf" and"averaged_inverted_cdf", there are clear interpretations for real-valued weights: the one in the numpy doc, or the link with the argmin of the weighted pinball loss (which is why scikit-learn needs weighted quantile in many cases I think).

@lucascolley (Member)

I don't have a horse in the race w.r.t. the nicest user interface, so I will leave that discussion to those already involved.

What I will say is that from an ecosystem perspective, now that SciPy has committed to the public API, I would in principle be happy with a situation where we:

  • offer a certain API in array-api-extra which has some limitations but is more lightweight and familiar for users of preceding tools
  • direct users to SciPy for use-cases not covered by that implementation, or for situations where things like the SciPy broadcasting may result in code that is easier to understand

The downside of having two functions with the same name providing different semantics is real. But if the consensus from the technical discussion is that there are real tradeoffs such that each set of semantics has situations in which it is preferable, that might be acceptable.


@mdhaber (Contributor) commented Dec 8, 2025 (edited)

Naturalness is subjective, so we valued consistency with the rest of scipy.stats¹. Unfortunately, this meant lack of consistency with np.quantile, because np.quantile was an anomaly and didn't respect decisions made by NumPy previously (NEP 5).

One thing I really don't like is that the output dimension is not determined by the input dimensions,

Of course they are. But like every other stats function, they also depend on keepdims, which controls whether the axis is reduced away. The default of keepdims is False in all other functions because it can always be False. We tried to keep in line with that by making the default False where possible. (It has to be True if the second argument has length > 1 along axis.)

Now, I don't necessarily like the existence of keepdims as a parameter throughout stats. But it is what it is.

But I don't think it's natural...

Every other independent two-sample statistic in SciPy (e.g. ttest_ind, mannwhitneyu) broadcasts the arrays exactly as quantile does. You can tell because they use the same function to do the broadcasting. So quantile does f(x, y, axis) -> broadcast(x, y); apply f along axis (and reduce it away if keepdims=False), just as described.

If you mean "they don't broadcast the arrays exactly likenp.broadcast_arrays", then you're right. We make one modification to thebroadcast_arrays rule because we have to generalize to the case in which the length alongaxis is not equal.

After that, the only difference is that the length of the output core dimension of quantile is that of the second argument, while the length of the output core dimension of other statistics is 1.

About the weights: personally, I don't understand the interest of frequency weights when you have a valid interpretation only for integers.

About a dozen functions in SciPy accept frequency weights via the weights parameter, so we stuck with that name instead of inventing a new one like counts. And also because sometimes the implementations of frequency and other kinds of weights are exactly the same: for inverted_cdf and averaged_inverted_cdf, the weighted statistic is calculated the same way as in NumPy and as proposed here. So the weights are not limited to integers, and you can interpret them however you wish. There is no limitation here; we just added the opportunity to use frequency weights with other methods.

Footnotes

  1. which I find quite natural, thinking about arrays as arrays. I do not introduce a notion of some arrays being "data" and others being "parameters"; from that perspective, I can see how array broadcasting in general would be unintuitive.

@cakedev0 (Contributor, Author) commented Dec 9, 2025 (edited)

About NEP 5: can you be more precise about how SciPy's quantile respects it while NumPy's quantile does not?

I've read it a couple of times, and honestly I don't fully understand. One of the main examples is np.inner1d, which is now np.inner. And np.inner does do the Cartesian product; see the documentation: out.shape = (*a.shape[:-1], *b.shape[:-1]).

So to your question:

does anything else "broadcast" by taking a Cartesian product? Imagine if this were the standard for other array operations like addition!

The answer is not as obvious as you make it sound. If the Cartesian product is valid for np.inner (whose "NEP5-style signature" is (i),(i)->()), why disqualify it for np.quantile (whose "NEP5-style signature" would be (i),(j)->(j), I guess)?

Edit:

  • np.inner and np.dot do the Cartesian product
  • np.vecdot and np.matmul broadcast

Both exist. So maybe it's fine for both versions of quantile to exist in the array-API space? 😄
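A quick shape check of the two behaviors (np.vecdot needs NumPy >= 2.0):

import numpy as np

a = np.ones((3, 5, 4))
b = np.ones((5, 4))

np.inner(a, b).shape   # (3, 5, 5): Cartesian product over the non-contracted axes
np.vecdot(a, b).shape  # (3, 5):    broadcasting over the non-contracted axes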

@cakedev0 (Contributor, Author)

The default of keepdims is False in all other functions because it can always be False. We tried to keep in line with that by making the default False where possible. (It has to be True if the second argument has length > 1 along axis.)

To me, this shows that quantile is different from all other functions in scipy.stats. So the arguments based on consistency with the rest of scipy.stats do not convince me.

Also, if the default of keepdims depends on the length along axis, for me that's no different from saying that, in the default case, the output ndim isn't determined by the input ndims alone.

@cakedev0 (Contributor, Author) commented Dec 9, 2025 (edited)

@lucascolley

I'm not objective, because I invested time in code that will be completely thrown away if the scipy.stats choices are adopted here.

I argued for my choices in detail, or at least clarified that other choices come with trade-offs too.

Now, I'll let people who have no horse in the race decide, and:

  • I'll be happy to finish this PR if the final decision is to continue in the current direction.
  • Otherwise, feel free to close this PR. It'll be easier to start from scratch and copy things from SciPy, even for tests (many of my tests are based on NumPy).

@mdhaber (Contributor) commented Dec 9, 2025 (edited)

I've read it a couple of times, and honestly I don't fully understand. One of the main examples is np.inner1d, which is now np.inner.

"which is nownp.inner..." - are you sure? Research (further below) suggests thatnp.inner existed beforenp.inner1d, andinner1d was added to meet the need for the broadcasting (rather than Cartesian product) behavior. It seemsinner1d was never part of the mainnumpy namespace, and it disappeared quietly. (That said, sklearn noticed:numpy/numpy#10815.)

And np.inner does do the Cartesian product; see the documentation: out.shape = (*a.shape[:-1], *b.shape[:-1]).

Which is different from what NEP 5 says that inner1d did:

For example, where a is of shape (3,5,N) and b is of shape (5,N), this will return an output of shape (3,5)

import numpy as np

N = 10
a = np.ones((3, 5, N))
b = np.ones((5, N))
np.inner(a, b).shape  # (3, 5, 5)

I'd suggest that matmul and vecdot correct the behavior of dot and inner. For instance, from matmul:

The matmul function implements the semantics of the @ operator introduced in Python 3.5 following PEP 465.

Which states:

For inputs with more than 2 dimensions, we treat the last two dimensions as being the dimensions of the matrices to multiply, and ‘broadcast’ across the other dimensions.

np.vecdot was added because broadcasting, rather than the Cartesian product, is the behavior offered by the Array API standard's vecdot ("shape determined according to Broadcasting along the non-contracted axes").

I say "correct" because presumably the discussions that led to the inclusion of the@ operator in Python (PEP 465) and the array API were more careful than whatever led tonp.inner, which has had the Cartesian product behavior since at leastNumPy 1.3. Or so I infer fromthis mailing list discussion, which seems to have been the precursor to NEP 5. Since then, there have been discussions started by users looking for the broadcasting behavior and not getting what they wanted frominner, e.g.here andhere.

Cartesian product is OG NumPy behavior. NEP 5 broadcasting superseded it. Let's not go backwards.


@cakedev0 (Contributor, Author)

... Well, that is very convincing. Immense thanks for the detailed explanation; I definitely learned a lot (and sorry for the time you had to invest; I hope I'll be able to make good use of the understanding I gained here in future contributions).

which is now np.inner

Indeed no. Sorry, I assumed that based on the very similar names, my bad!

@cakedev0 (Contributor, Author)

I still think that this behavior isn't right:

both quantile([0, 1], [0.5]) and quantile([0, 1], 0.5) have dimension 0, but quantile([0, 1], [0.5, 0.9]) has dimension 1.

But this is more of a detail, and I do understand why it's like this. I'll try to see if I can propose something I would find better, but I'm not sure I'll be able to.

Maybe no keepdims? The signature is (i),(j)->(j); it doesn't reduce away a dimension, so maybe it's fair not to have a keepdims parameter? I could not find a gufunc with such a signature in numpy... For matmul, passing keepdims is forbidden, but the error message says it's because it's 2d.

You said you don't like keepdims, so maybe you'll like this idea 😉

@mdhaber (Contributor) commented Dec 10, 2025 (edited)

You're welcome to argue against the existence of keepdims (and axis, and nan_policy, for that matter; I could argue that all of them do more harm to the ecosystem than good).

The signature is (i),(j)->(j), it doesn't reduce away a dimension...

OK, let's think through the implications of that signature with x of shape (i,) and p of shape (). At first glance, you might think that this (the simplest, most common use of quantile!) would raise an error because p is missing a core dimension. Actually, NEP 5 does allow for this, though - the first rule is "While an input array has a smaller dimensionality than the corresponding number of core dimensions, 1's are prepended to its shape." So we end up with p of shape (1,), therefore j=1, and the output would have shape (1,), not (). I don't think this is what you wanted!

But quantile, in the simplest, most common case, is a reducing statistic, which would make the signature (i),()->(). A similar function is expectile, which @lorentzenchr added in gh-17039. It doesn't have a keepdims parameter, and it also reduces. Incidentally, it reduces whether the second argument alpha is an array of shape (1,), a 0-d array, or a float¹.

import numpy as np
from scipy.stats import expectile

expectile(np.arange(10), np.asarray([0.5]))  # np.float64(4.5)
expectile(np.arange(10), 0.5)                # np.float64(4.5)

The documented expectation is that alpha is a float, so I'm not really giving this as evidence that the behavior with 1d alpha is right or best, just that it's natural for quantile, like expectile, to reduce (if the fact that np.quantile is a reducing statistic is not enough).

So really, we have this situation where we want to support (i),()->() in the simplest case and (i),(j)->(j) in others. Why do we ever want to treat the signature as (i),(j)->(j)? After all, assuming we're working along the last axis, I'm pretty sure that the output is almost always going to be the same for compatible array shapes². The reason is that treating the signature as (i),()->() is much less efficient in some cases, because all dimensions of the second argument - including the last - are treated as loop dimensions, so any preprocessing you do on the first argument (e.g. partition or sort) is repeated for every element of the second argument. It's much more efficient to share that preprocessing on the first argument for all elements of the second argument along the last axis.
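A rough illustration of that efficiency point, using a simplified inverted_cdf-style lookup of my own (not the actual stats.quantile internals):

import numpy as np

rng = np.random.default_rng(0)
x = rng.random(100_000)
p = np.linspace(0.01, 0.99, 99)

# core signature (i),(j)->(j): sort x once, answer all 99 probabilities together
xs = np.sort(x)
k = np.minimum((p * x.size).astype(int), x.size - 1)
res_shared = xs[k]

# core signature (i),()->(): each element of p is a loop iteration,
# so the O(n log n) sort is repeated 99 times
res_loop = np.array([np.sort(x)[min(int(q * x.size), x.size - 1)] for q in p])

assert np.allclose(res_shared, res_loop)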

So you want to get the best of both worlds - the efficiency benefits of (i),(j)->(j) while respecting the fact that the quantile operation is naturally (i),()->(). You can invent your own way out of the conundrum, but what we did in stats was always treat the signature as (i),(j)->(j) but define keepdims=False to make it behave more like (i),(j)->() when j=1. Depending on how we defined the default behavior of keepdims, I suppose we could have made arrays with shapes (i,) and (1,) produce an array of shape (1,), but we didn't. I'm not sure if changing that is worth a deprecation cycle or a separate function, since you can get whatever you want by specifying keepdims explicitly.

Footnotes

  1. If you like, you can think of this signature as (i),(j)->() with the requirement that j must equal 1. It is not outrageous to impose a requirement on the value of i or j (e.g. a gufunc for the cross product... preferably one that doesn't have FOUR axis arguments like np.cross).

  2. For example, consider inputs of shape (i,) and (j,). Whether the signature is (i),()->() or (i),(j)->(j), the output shape will be (j,); it's just that in the former case, j is the length of a batch (or "loop", in the NEP 5 terminology) dimension, not a core dimension.
