Light up String.Manipulation APIs with Vector512 codepath #93043

Merged

@khushal1996 (Member) commented on Oct 5, 2023 (edited):

Optimizing the following String APIs

  1. String.Split --> Optimizing MakeSeparatorListVectorized
  2. String.Replace(char oldChar, char newChar) --> Optimizing a single iteration. Although we measured perf on this API, the numbers reflect optimizing only a single iteration, not all of them.

PERF on ICX


The tables below show the comparison output produced by ResultsComparer in the performance repo.

Base = No changes
Diff = With the PR changes
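
The ratio column in each table can be recomputed from the two medians. The snippet below is a minimal sketch, assuming the ratio is diff/base for rows listed as slower and base/diff for rows listed as faster, rounded to two decimals. The function name `ratio` is hypothetical, and the sketch does not reproduce the statistical filtering the comparison tool applies.

```python
def ratio(base_ns: float, diff_ns: float):
    """Return (direction, ratio); the ratio is always >= 1 (illustrative helper)."""
    if diff_ns > base_ns:
        # The diff build took longer: report diff/base.
        return "Slower", round(diff_ns / base_ns, 2)
    # The diff build was at least as fast: report base/diff.
    return "Faster", round(base_ns / diff_ns, 2)


# Medians taken from the first Split table (Vector256 vs Vector512).
print(ratio(25083.88, 27315.47))  # -> ('Slower', 1.09)
print(ratio(100.27, 69.98))       # -> ('Faster', 1.43)
```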

1. Split


A Vector128 code path already exists for this API. We are adding a similar Vector256 and Vector512 code path.
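
As a rough illustration of what a wider code path means here, the sketch below emulates a chunked separator scan in plain Python. It is not the .NET implementation: `separator_indices` and its parameters are hypothetical names, and the lane counts (8, 16, 32) assume 16-bit characters in 128-, 256-, and 512-bit vectors. Each chunk is compared against every separator, the matches are OR-ed into one bitmask, and the set bits are turned into indices.

```python
def separator_indices(text: str, separators: str, lanes: int) -> list:
    """Scalar emulation of a chunked separator scan (illustrative only)."""
    found = []
    i = 0
    # Full chunks: one "vector" of `lanes` characters per iteration.
    while i + lanes <= len(text):
        chunk = text[i:i + lanes]
        mask = 0
        # One equality comparison per separator, OR-ed into a lane bitmask.
        for sep in separators:
            for lane, ch in enumerate(chunk):
                if ch == sep:
                    mask |= 1 << lane
        # Walk the set bits from the lowest lane to the highest.
        while mask:
            lane = (mask & -mask).bit_length() - 1
            found.append(i + lane)
            mask &= mask - 1  # clear the lowest set bit
        i += lanes
    # Tail shorter than one chunk: plain scalar loop.
    for j in range(i, len(text)):
        if text[j] in separators:
            found.append(j)
    return found


sample = "a,b;c d,e" * 5
for lanes in (8, 16, 32):
    print(lanes, separator_indices(sample, ",; ", lanes)[:4])
```

A wider chunk means fewer loop iterations for the same input, but every iteration still performs one comparison per separator, which is where the cost of repeated equality checks shows up.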

base = Diff Vector256 code path vs diff = Diff Vector512 code path

| Slower | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.09 | 25083.88 | 27315.47 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.07 | 216.23 | 231.68 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.06 | 527.11 | 561.25 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.06 | 21021.31 | 22223.51 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.04 | 292.20 | 304.67 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.04 | 5308.70 | 5499.70 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.02 | 663.91 | 678.31 | |

| Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.43 | 100.27 | 69.98 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.37 | 99.78 | 72.70 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.06 | 47539.44 | 44884.05 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.03 | 2787.30 | 2701.91 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.03 | 38497.32 | 37374.38 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.02 | 1089.85 | 1073.03 | |

base = Base Vector128 code path vs diff = Diff Vector256 code path

| Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.77 | 176.76 | 99.78 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.76 | 176.33 | 100.27 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.47 | 56625.27 | 38497.32 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.43 | 67789.59 | 47539.44 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.27 | 1151.90 | 908.21 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.20 | 6348.06 | 5308.70 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.19 | 5407.90 | 4549.70 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.19 | 1293.97 | 1089.85 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.18 | 2753.15 | 2337.50 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.16 | 24393.60 | 21021.31 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.16 | 609.65 | 527.11 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.16 | 28984.70 | 25083.88 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.15 | 3213.50 | 2787.30 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.11 | 239.76 | 216.23 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.10 | 320.35 | 292.20 | |
| System.Tests.Perf_String.Split(s: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgblu3at20nfab | 1.09 | 721.48 | 663.91 | |

This is one of the cases where AVX-512 does not perform as well, because of the cost of using multiple Vector512.Equals() calls. I ran a couple of iterations using the Stopwatch method; the results are below.

[Image: Stopwatch timing results, not preserved in this capture]

As you can see, for each iteration, Vector512 is almost the same as Vector256. Let me know if you have any suggestions for further optimizing the Vector512 code path. We have to decide whether this can be merged, since a Vector128 code path already exists for both APIs. That said, the Vector256 and Vector512 code paths provide a significant speedup over the Vector128 code path.

2. Replace_Char


A Vector128 code path already exists for this API. We are just adding a single iteration of Vector512 or Vector256.
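
The PR text does not show the code, so the following is only one plausible reading of "a single iteration", emulated in scalar Python rather than taken from the .NET source. It assumes 16-bit characters (32, 16, or 8 per vector) and uses the hypothetical names `replace_char` and `step`: at most one wide step is taken with the widest vector that fits, and a narrower loop handles whatever remains.

```python
def replace_char(text: str, old: str, new: str) -> str:
    """Scalar emulation of one wide step followed by a narrower loop."""
    out = list(text)
    i = 0

    def step(width: int) -> None:
        # Emulates compare-and-select over one vector of `width` chars.
        nonlocal i
        for k in range(i, i + width):
            if out[k] == old:
                out[k] = new
        i += width

    # A single wide iteration: the widest vector that fits, taken at most once.
    for width in (32, 16):
        if len(out) - i >= width:
            step(width)
            break
    # The narrower path loops over what is left.
    while len(out) - i >= 8:
        step(8)
    # Scalar tail for fewer than 8 remaining chars.
    while i < len(out):
        step(1)
    return "".join(out)


print(replace_char("This is a very nice sentence", "i", "I"))
```

Under this reading, short inputs benefit the most from the wide step, since one iteration can cover most or all of the string, while long inputs still spend most of their time in the narrower loop.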

base = Diff Vector256 code path vs diff = Diff Vector512 code path

| Slower | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.12 | 24.74 | 27.63 | |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.06 | 4003.77 | 4246.22 | |
| System.Tests.Perf_String.Replace_Char(text: "This is a very nice sentence", oldC | 1.03 | 18.05 | 18.56 | |

| Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.25 | 173.57 | 139.21 | |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.15 | 1486.21 | 1292.79 | |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.15 | 722.97 | 630.87 | |
| System.Tests.Perf_String.Replace_Char(text: "This is a very nice sentence", oldC | 1.09 | 3.90 | 3.57 | |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.09 | 2216.98 | 2029.28 | |

base = Base Vector128 code path vs diff = Diff Vector256 code path

| Slower | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.12 | 24.74 | 27.63 | |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.06 | 4003.77 | 4246.22 | |
| System.Tests.Perf_String.Replace_Char(text: "This is a very nice sentence", oldC | 1.03 | 18.05 | 18.56 | |

| Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.25 | 173.57 | 139.21 | |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.15 | 1486.21 | 1292.79 | |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.15 | 722.97 | 630.87 | |
| System.Tests.Perf_String.Replace_Char(text: "This is a very nice sentence", oldC | 1.09 | 3.90 | 3.57 | |
| System.Tests.Perf_String.Replace_Char(text: "yfesgj0sg1ijslnjsb3uofdz3tbzf6ysgbl | 1.09 | 2216.98 | 2029.28 | |

@ghost added the community-contribution and needs-area-label labels on Oct 5, 2023

@khushal1996 marked this pull request as ready for review on October 9, 2023 18:41

@khushal1996 (Member, Author) commented:

@tannergooding Just sending out a reminder to review this PR.

@adamsitnik added the area-System.Runtime label and removed the needs-area-label label on Nov 3, 2023

@ghost commented:

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

@adamsitnik added the tenet-performance label on Nov 3, 2023

@khushal1996 (Member, Author) commented:

@tannergooding just sending out a reminder for this pending review.

@tannergooding (Member) commented:

CC @stephentoub, @GrabYourPitchforks, @adamsitnik

Could one of you give this a secondary review?

@khushal1996 (Member, Author) commented:

> CC @stephentoub, @GrabYourPitchforks, @adamsitnik
>
> Could one of you give this a secondary review?

@stephentoub @GrabYourPitchforks @adamsitnik sending a reminder for review. Can you please review this PR?

@stephentoub (Member) left a review comment:

Thanks

@khushal1996 (Member, Author) commented:

@tannergooding @kunalspathak can you please help with merging this PR? It has been approved for quite some time now.

@kunalspathak (Contributor) left a review comment:

LGTM

@kunalspathak (Contributor) commented:

CI seems red, so I will kick off another round to make sure the failures are not related to this PR.

/azp run runtime

@kunalspathak merged commit 14127ea into dotnet:main on Dec 22, 2023

@github-actions bot locked and limited conversation to collaborators on Jan 22, 2024

Reviewers: @stephentoub (approved), @tannergooding (approved), @kunalspathak (approved)

Assignees: @tannergooding

Labels: area-System.Runtime, community-contribution, tenet-performance

Participants (5): @khushal1996, @tannergooding, @kunalspathak, @stephentoub, @adamsitnik
