Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Implement IAlternateEqualityComparer support for Dictionary, HashSet, ConcurrentDictionary#102907

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

@stephentoub
Copy link
Member

This addsIAlternateEqualityComparer<TAlternate, T>, implementsIAlternateEqualityComparer<ReadOnlySpan<char>, string> on all of theStringComparer singletons, and adds associatedAlternateLookup types toDictionary<TKey, TValue>,HashSet<T>, andConcurrentDictionary<TKey, TValue>.

Contributes to#27229. This doesn't completely close that issue as this PR doesn't add support toFrozenDictionary<TKey, TValue> orFrozenSet<T>; those will require some more thought due to how they're implemented, where the core of the search is provided by a virtual method that would likely need a corresponding generic virtual method in this scheme, and that has performance implications.

This also doesn't yet implementIAlternateEqualityComparer<TAlternate, T> onEqualityComparer<string>.Default. I'd prototyped doing so, but have a few open issues and have separated it out into#102906. The user-facing ramification if we don't get to that in .NET 9 is just that to be able to use the new lookups the dictionary/set will need to have a StringComparer (or equivalent) comparer explicitly passed into the collection's constructor.

I still need to do some perf validation. In particular, I tweaked the string hash code routines to accomodate span inputs (the existing logic took advantage of the \0 guarantee at the end of a string), and I need to see how much this impacts things and whether it'll instead be worth duplicating the functions to have one for string and one for span.

I also added a few APIs that weren't explicitly approved in the API review meeting, but I believe that was an oversight:

  • I added a Dictionary property to the lookups for {Concurrent}Dictionary and a Set property to the lookup for HashSet. This is so that you can just store the lookup struct and access the original object through it. If you look back at the history of the issue and API review discussions, we'd planned to add these, but then I neglected to copy/paste them over for the final review. We could still debate the actual name used.
  • We neglected to add indexers to the dictionary lookups, but they're important, in particular the setter, in order to support update behavior, which otherwise isn't exposed.
  • For the {Try}Remove overloads on the dictionaries that took an out TValue, I added an out TKey actualKey parameter.
  • (We might also want to consider GetOrAdd/AddOrUpdate methods for ConcurrentDictionary, but I didn't add those.)

There's more code duplication than I'd like between the routines on the main class and the routines on the lookups; we may want to revisit that.

Our choice around exposing the GetAlternateLookup methods as extensions for Dictionary/HashSet does make them more awkward to use, as you need to provide all of the generic parameters, not just the TAlternateKey one. This isn't the case for ConcurrentDictionary, where we did it as an instance method, and thus you don't need to provide TKey/TValue again. This is a visible inconsistency as to how they're consumed.

BrennanConroy, PaulusParssinen, En3Tho, Hixon10, and kg reacted with hooray emoji
… ConcurrentDictionaryThis adds `IAlternateEqualityComparer<TAlternate, T>`, implements `IAlternateEqualityComparer<ReadOnlySpan<char>, string>` on all of the `StringComparer` singletons, and adds associated `AlternateLookup` types to `Dictionary<TKey, TValue>`, `HashSet<T>`, and `ConcurrentDictionary<TKey, TValue>`.
@ghost
Copy link

Note regarding thenew-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime
See info inarea-owners.md if you want to be subscribed.

@bencyoung-Fignum
Copy link

Out of interest, could this extend to having the comparers implementing IAlternateEqualityComparer<ReadOnlySpan, string> (or some new UTF8 ref struct wrapper) as well to allow for UTF-8 insertion within the same dictionary - as long as the hashes can be made consistent?

@stephentoub
Copy link
MemberAuthor

Out of interest, could this extend to having the comparers implementing IAlternateEqualityComparer `<ReadOnlySpan, string> (or some new UTF8 ref struct wrapper) as well to allow for UTF-8 insertion within the same dictionary - as long as the hashes can be made consistent?

Conceptually, yes. And you could certainly write your own comparer with any number of IAlternateEqualityComparer implementations for different TAlternate types (ReadOnlySpan<char>,ReadOnlySpan<byte>, etc.) We're not planning to do so on theStringComparers this release, and it's an open question whether we'd do so in the future. There are corner-cases such as what string isnew byte[] { 0xFF } equal to? Is it equal toEncoding.UTF8.GetString(new byte[] { 0xFF })? And if so, is it also equal toEncoding.UTF8.GetString(new byte[] { 0xFE }), since those produce the same strings? This brings policy into these core types, so while technically we can do it, it's a more complicated discussion.

@bencyoung-Fignum
Copy link

Out of interest, could this extend to having the comparers implementing IAlternateEqualityComparer `<ReadOnlySpan, string> (or some new UTF8 ref struct wrapper) as well to allow for UTF-8 insertion within the same dictionary - as long as the hashes can be made consistent?

Conceptually, yes. And you could certainly write your own comparer with any number of IAlternateEqualityComparer implementations for different TAlternate types (ReadOnlySpan<char>,ReadOnlySpan<byte>, etc.) We're not planning to do so on theStringComparers this release, and it's an open question whether we'd do so in the future. There are corner-cases such as what string isnew byte[] { 0xFF } equal to? Is it equal toEncoding.UTF8.GetString(new byte[] { 0xFF })? And if so, is it also equal toEncoding.UTF8.GetString(new byte[] { 0xFE }), since those produce the same strings? This brings policy into these core types, so while technically we can do it, it's a more complicated discussion.

Agree yes it's not always obvious, but I'm glad it's flexible to support multiple implementations! Might only make sense with a validated UTF8 ref struct type. Thanks for the reply

@stephentoubstephentoubforce-pushed thealternateequalitycomparer branch fromede4ca4 to9558a8dCompareJune 5, 2024 14:47
@stephentoub
Copy link
MemberAuthor

I still need to do some perf validation. In particular, I tweaked the string hash code routines to accomodate span inputs (the existing logic took advantage of the \0 guarantee at the end of a string), and I need to see how much this impacts things and whether it'll instead be worth duplicating the functions to have one for string and one for span.

Unfortunately I couldn't quite get rid of the regression here for strings when using/updating a shared implementation, so I created a duplicate implementation of the two functions for span that returns the same hash value as for string but doesn't rely on the null terminator. I kept the slower non-ASCII fallback for both, but improved it to a) use a larger stack allocated buffer and b) not rehash all of the ASCII values leading up to the first non-ASCII value.

Should be good to go now.

@stephentoubstephentoub merged commit5c403bb intodotnet:mainJun 5, 2024
@stephentoubstephentoub deleted the alternateequalitycomparer branchJune 5, 2024 21:57
@TonyValenti
Copy link

Where can I read more about IAlternateEqualityComparer and its uses?

@github-actionsgithub-actionsbot locked and limited conversation to collaboratorsJul 7, 2024
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.

Reviewers

@EgorBoEgorBoEgorBo left review comments

@tannergoodingtannergoodingtannergooding approved these changes

+1 more reviewer

@saulsaulsaul left review comments

Reviewers whose approvals may not affect merge requirements

Assignees

@stephentoubstephentoub

Projects

None yet

Milestone

9.0.0

Development

Successfully merging this pull request may close these issues.

6 participants

@stephentoub@bencyoung-Fignum@TonyValenti@saul@EgorBo@tannergooding

[8]ページ先頭

©2009-2025 Movatter.jp