Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

API: Introducecopy argument fornp.asarray [Array API]#25168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
ngoldbaum merged 5 commits intonumpy:mainfrommtsokol:asarray-copy-arg
Feb 29, 2024

Conversation

@mtsokol
Copy link
Member

@mtsokolmtsokol commentedNov 17, 2023
edited
Loading

Tracking issue:#25076

Hi@seberg,

Here's a PR with an initial work on introducingcopy keyword tonp.asarray.

I started looking at the internals that we discussed during the triage meeting, but first, I wanted to try with the simplest approach: The implementation ofnp.asarray is mostly the same asnp.array - it just skips a few parameters. One of them iscopy which is hardcoded toFalse innp.asarray.

Here I addedcopy with a default valueFalse, so it's a backward compatible change. Forcopy=True it behaves the same way asnp.array(..., copy=True). To sum up - thecopy mechanism didn't change here, only thecopy keyword is added for compatibility.

WDYT?

Copy link
Member

@sebergseberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Not sure what you are looking for. If we need to add thecopy kwarg toasarray, this is the right way.

The first interesting part is vetting the "no copy" logic (I don't remember if it is bad, but I definitely don't trust it without someone auditing it once more).
The other interesting part is places wherecopy=False already exists.

Fornp.array() that is likely fine (a brief search may be nice), some peoplewill be usingnp.array(..., copy=False) or actuallynp.array(x, None, False) (or a dtype)1.
The hope for those use-cases is that most of them will actually like the new meaning, and if they don't they will fail gracefully (it would be cute to mentioncopy=None andasarray() in the error message).

The trickier thing is probablyarr.astype(original, copy=False). The problem is that this is very muchvery common code and such code will definitely expect the copy to be necessary often.
Also there, you can kust replace it withcopy=None, but I am worried about the amount of churn we might create, sincecopy=False was always a best practice!
(It may be that while a best-practice in many places, it isn't nearly as common in end-user code, but I suspect it is fairly common.)

Footnotes

  1. That may look crazy, but the the use of kwargs used to come with a large overhead andasarray was a Python wrapper using kwargs making it very slow for just returning the original object.

@mtsokolmtsokol marked this pull request as ready for reviewNovember 17, 2023 13:34
@ngoldbaum
Copy link
Member

It looks like there are a number of places pandas usesarr.astype(..., copy=False) (althoughlots of other code does it too). Maybe running the pandas test suite against a numpy with this PR applied will be enough to discover whether that causes breakages?

mtsokol reacted with thumbs up emoji

@ngoldbaum
Copy link
Member

You can usehttps://pandas-coverage-12d2130077bc.herokuapp.com to find which pandas tests hit a particular line of code in the pandas codebase.

@mtsokolmtsokolforce-pushed theasarray-copy-arg branch 2 times, most recently from311af11 to609f35fCompareDecember 8, 2023 13:30
@seberg
Copy link
Member

It seems to me that there is a worm somewhere in all of this. What isastype() supposed to do? The documentation says thatFalse allows a copy if needed. That is also true for the docs in the array-api.
But doesn't this change it so thatFalse enforces something stricter at least sometimes?

So, is the meaning ofcopy=False just different for different functions? That solves a lot of problems, although it is also a bit confusing...

@mtsokol
Copy link
MemberAuthor

mtsokol commentedDec 8, 2023
edited
Loading

@seberg To sum up:
The new meanings forcopy keyword fornp.array,np.asarray andndarray.astype are:

  • True - make a copy
  • None - make a copy if needed
  • False - don't make a copy, or throw an exception that explains the new meaning of False and points tocopy=None

Fornp.array the default stays asTrue, onlynp.array(..., copy=False) is impacted. Fornp.asarray the default changed toNone (with the same meaning internally asFalse before) so if nocopy keyword was provided then the behavior doesn't change (same forastype), andnp.asarray(..., copy=False) is impacted.

(Also I reflected these changes in masked arrays and legacy matrices, let me know if I should revert any of public API changes)

I updated the whole codebase to adhere to these new meanings (if there are no caveats I can address it in downstream libraries).

What isastype() supposed to do?

Hmm I see that Array API definesnp.astype withTrue/False only. There are a few options:

  1. astype API definition should also includeNone forcopy to be consistent withasarray/array, and we stay with the current PRs implementation.
  2. astype keeps operating only withTrue/False as before, withFalse's original meaning (soastype's False ->asarray's None) (I think it's a bit confusing but also there are other functions with differentcopy meanings, likenp.meshgrid)
    WDYT?

@seberg
Copy link
Member

I am trying to avoid to think about it... I think it is strange to diverge the meaning, but breakingastype(copy=False) is a huge amount of churn that needs a bit clearer motivation.

@charrischarris changed the titleAPI: Introducecopy argument fornp.asarrayAPI: Introducecopy argument fornp.asarrayDec 8, 2023
@mtsokolmtsokolforce-pushed theasarray-copy-arg branch 2 times, most recently from96ce46d toa5785c7CompareDecember 12, 2023 09:27
@mtsokolmtsokolforce-pushed theasarray-copy-arg branch 3 times, most recently from9a4b7a2 to92cb661CompareDecember 15, 2023 12:56
@mtsokol
Copy link
MemberAuthor

mtsokol commentedDec 15, 2023
edited
Loading

Hi@seberg,

I updated the PR according to our discussion during the triage call anddata-apis/array-api#721.

The new semanticsNone/True/False are now present only innp.array andnp.asarray, and default values didn't change.np.astype andndarray.astypecopy behavior also stayed the same.

In the C-level code there is an updated definition ofNPY_COPYMODE to adhere to the new semantics and a newNPY_ASTYPECOPYMODE enum forastype to keep the same behavior.

If it's Ok, I would updatescipy and other libraries accordingly.

@seberg
Copy link
Member

I donot see it as my decision to give the OK on changing how copy behaves inasarray,array, and friends. If there isn't a consensus, it should be in the NEP to propose to change it.

But, I do not like the change. The reason why I can live with the redefinition ofcopy=False innp.array(..., copy=False) is that it can be (and often is) spelled asnp.asarray(...) (this is assuming downstream churn seems acceptable, I have no idea how bad it is). So, useasarray, notcopy=None.
That is:copy=None is acceptable maybe, but partially because we at least normally do not use it explicitly (not sure aboutmeshgrid, which defaults tocopy=True).

mtsokol reacted with thumbs up emoji

@seberg
Copy link
Member

In fact, we could just split out thearray(..., copy=False) ->asarray refactor to make this PR smaller.

mtsokol reacted with thumbs up emoji

@mtsokolmtsokolforce-pushed theasarray-copy-arg branch 2 times, most recently fromd226c87 to0ede769CompareDecember 18, 2023 10:31
@mtsokolmtsokol changed the titleAPI: Introducecopy argument fornp.asarrayAPI: Introducecopy argument fornp.asarray [Array API]Dec 19, 2023
@mtsokolmtsokol changed the titleAPI: Introducecopy argument fornp.asarray [Array API]API: Introducecopy argument fornp.asarray [Array API]Dec 19, 2023
@mtsokolmtsokol added this to the2.0.0 release milestoneDec 20, 2023
@mattip
Copy link
Member

The interesting part is to try to pass it where we callarray (probably only when copy is indicated).

In the triage meeting we decided to add the kwarg in a try, catch a kwarg error, and emit a deprecation error if the user's__array__ method does not allow thecopy kwargs. We should make sure it is documented.

seberg and rgommers reacted with thumbs up emoji

@mtsokol
Copy link
MemberAuthor

The interesting part is to try to pass it where we call__array__ (probably only when copy is indicated).

@seberg The only relevant place that I found where__array__ gets called was insidenp.array implementation. I modified it in such a way thatcopy=... argument fromnp.array(...) is passed to__array__. If__array__ doesn't support it a warning is raised and it gets called again withoutcopy.

Classes in tests/codebase that implement__array__ function now havedtype andcopy arguments to adhere to the protocol - tomorrow I will update the documentation to cover this change.

@mtsokol
Copy link
MemberAuthor

Ok, I updated docs so that__array__ part mentionscopy argument.

Copy link
Member

@ngoldbaumngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Sorry for not following up quickly after you responded to my last review. I think there's still one refcounting issue. I also noticed a separate more minor issue in the docs but I think is worth addressing to reduce confusion from users.

@seberg if you have a chance, it would be nice if you could confirm you're ok with the semantics in this version of the PR just so I can make sure we're all on the same page.

@seberg
Copy link
Member

I am happy with the addition. Note that it will be quite noisy unless you do not pass copy=None and take it this very slow.

Code wise happy if you are. matching error exact is likely nice. Passing positionally for simplicity woukd be fine to me or using vectorcall would be best (much quicker probably and probably also simpler).

@ngoldbaum
Copy link
Member

Merging this. Thanks@mtsokol! Also thanks for getting all the NEP 56 PRs across the line! 🎆 🥳

If you are reading this because this PR broke your nightly tests, take a look at the new docs, the NEP, and the scipy and scikit-learn PRs. If there is some sort of breakage in your code that is not straightforward to deal with in a backward compatible way, please open an issue.

@ngoldbaumngoldbaum merged commitc0617ac intonumpy:mainFeb 29, 2024
@mtsokolmtsokol deleted the asarray-copy-arg branchFebruary 29, 2024 16:45
@rgommers
Copy link
Member

Thanks for the hard work on this! Still a few things to iron out I think, xrefgh-25916.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@ngoldbaumngoldbaumngoldbaum left review comments

@sebergsebergAwaiting requested review from seberg

Assignees

@mtsokolmtsokol

Projects

None yet

Milestone

2.0.0 release

Development

Successfully merging this pull request may close these issues.

5 participants

@mtsokol@ngoldbaum@seberg@rgommers@mattip

[8]ページ先頭

©2009-2025 Movatter.jp