A whitespace-ignoring diff makes the change clearer. This was the most expensive script analysis rule when run in warm mode, and it was also easy to fix :-)
It also shows the performance improvements in .NET 5.

PR Checklist

PR has a meaningful title
- Use the present tense and imperative mood when describing your changes
Summarized changes
Change is not breaking
Make sure all .cs, .ps1 and .psm1 files have the correct copyright header
Make sure you've added a new test if existing tests do not effectively test the code changed and/or updated documentation
This PR is ready to merge and is not Work in Progress.
- If the PR is work in progress, please add the prefix WIP: to the beginning of the title and remove the prefix when the PR is ready.

Christoph Bergmeister added 4 commits

April 22, 2020 20:33

Improve performance by not using regex (2% improvement)

9a6db4c

Merge branch 'master' of https://github.com/bergmeister/psscriptanalyzer into perf/regex

095c2f1

replace regex

eab6312

fix index and simplify

172ebdd

bergmeister changed the title ~~Performance: Eliminate Regex overhead in AvoidTrailingWhitespace -> Speedup of 5% (PowerShell 5.1) or 2.5 % (PowerShell 7)~~ Performance: Eliminate Regex overhead in AvoidTrailingWhitespace -> Speedup of 5% (PowerShell 5.1) or 2.5 % (PowerShell 7.1-preview.2)

Apr 27, 2020

bergmeister added the Area - Performance and Area - Rules labels

Apr 27, 2020

bergmeister requested review from JamesWTruher and rjmholt

April 27, 2020 18:57

tidy

9456321

rjmholt approved these changes

Apr 27, 2020

rjmholt left a comment

If we use regexes anywhere else in the codebase, we could probably save some performance by just making the regex static and constructing it with RegexOptions.Compiled
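The suggestion above could look roughly like the following. This is only a sketch: the class name `LineSplitter`, the field name `s_newline`, and the method `SplitLines` are hypothetical and not taken from the PSScriptAnalyzer codebase. The pattern is the `\r?\n` one that this PR removes from AvoidTrailingWhitespace.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineSplitter
{
    // Hypothetical helper: the Regex object is created once per process and
    // reused by every call, instead of being parsed again on each invocation.
    // RegexOptions.Compiled trades a higher one-off construction cost for
    // faster matching afterwards.
    private static readonly Regex s_newline =
        new Regex(@"\r?\n", RegexOptions.Compiled);

    // Splits on "\n" or "\r\n"; a lone "\r" is not treated as a line break.
    public static string[] SplitLines(string text) => s_newline.Split(text);
}
```

Whether this pays off depends on how often the regex is used, since compiling has its own start-up cost, so it would need measuring in the same way as the change in this PR.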

Rules/AvoidTrailingWhitespace.cs Outdated

		));
		continue;
		}
		if (line[line.Length - 1] != ' ' &&

rjmholt (Contributor) commented on Apr 27, 2020

Would this be better as char.IsWhiteSpace(line[line.Length - 1])?

bergmeister (Collaborator, Author) commented on Apr 27, 2020 (edited)

@rjmholt because of

- readability
- performance
- covering the variety of Unicode chars? From the docs here, it would probably be good, but what about the UnicodeCategory.LineSeparator char? I don't have enough Unicode experience to make a judgement call here, to be honest, on whether this list includes too much or not.

rjmholt (Contributor) commented on Apr 28, 2020

My thinking here is actually just that PowerShell uses that API to see whitespace.

Given how we split the string already, it's possibly dangerous to go by unicode whitespace, but possibly not...

I suspect that really this won't make much difference; leaving non-ASCII whitespace at the ends of lines isn't something I can imagine being an issue for anyone really.

bergmeister (Collaborator, Author) commented on Apr 28, 2020

OK, so that sounds more like a tendency to use IsWhiteSpace? I'd be OK with that. You are right that the impact is probably quite low, especially since this rule is not enabled by default for VS Code users.
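To make the trade-off in this thread concrete, here is a small sketch comparing the two checks. The class and method names are hypothetical, and the ASCII-only variant assumes that the truncated condition in the outdated snippet tested for space and tab only, which the excerpt does not confirm.

```csharp
using System;

public static class TrailingWhitespaceCheck
{
    // Assumed shape of the original check: only ' ' and '\t' count.
    public static bool EndsWithSpaceOrTab(string line) =>
        line.Length > 0 &&
        (line[line.Length - 1] == ' ' || line[line.Length - 1] == '\t');

    // Unicode-aware variant: char.IsWhiteSpace also accepts characters such
    // as NO-BREAK SPACE (U+00A0) and LINE SEPARATOR (U+2028).
    public static bool EndsWithWhiteSpace(string line) =>
        line.Length > 0 && char.IsWhiteSpace(line[line.Length - 1]);
}
```

For `"foo "` both methods return true. For `"foo\u00A0"` and `"foo\u2028"` only `EndsWithWhiteSpace` returns true. Because the lines are split on `\r\n`, `\r` and `\n` only, a U+2028 character can still sit at the end of a line, which is the edge case the LineSeparator question refers to.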

Rules/AvoidTrailingWhitespace.cs

 		var diagnosticRecords = new List<DiagnosticRecord>();

-		string[] lines = Regex.Split(ast.Extent.Text, @"\r?\n");
+		string[] lines = ast.Extent.Text.Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);

rjmholt (Contributor) commented on Apr 27, 2020

This makes me wonder: if we're just trying to find the extents of trailing whitespace, there's no need to split the string at all; we should just read through ourselves without allocating all these strings... But too much burden for this PR!

bergmeister (Collaborator, Author) commented on Apr 27, 2020 (edited)

Hmm, yeah, I hear what you say. I guess for perf what counts is the 80-20 rule :-) Technically speaking, string.IndexOf would probably be the fastest way of finding the indices where \s\r or \s\n occurs...
I'm aware of lots of other small micro-optimisations that one can make and even tried some, but they didn't have a measurable outcome. Therefore I am focussed on just fixing what gives at least a measurable return.
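For reference, the split-free approach discussed in this thread might be sketched as below. This is not what the PR implements: `TrailingWhitespaceScanner` and `Find` are hypothetical names, and the sketch returns plain character offsets rather than the DiagnosticRecord objects the real rule produces.

```csharp
using System;
using System.Collections.Generic;

public static class TrailingWhitespaceScanner
{
    // Walks the text once and returns (start, length) for every run of
    // whitespace that directly precedes a line break or the end of the text.
    // No per-line strings are allocated.
    public static List<(int Start, int Length)> Find(string text)
    {
        var results = new List<(int Start, int Length)>();
        int runStart = -1; // start of the current whitespace run, -1 if none

        for (int i = 0; i <= text.Length; i++)
        {
            // Treat the end of the text like a final line break.
            char c = i == text.Length ? '\n' : text[i];

            if (c == '\r' || c == '\n')
            {
                if (runStart >= 0)
                {
                    results.Add((runStart, i - runStart));
                }
                runStart = -1;
            }
            else if (char.IsWhiteSpace(c))
            {
                if (runStart < 0) runStart = i;
            }
            else
            {
                runStart = -1; // non-whitespace character ends the run
            }
        }

        return results;
    }
}
```

Calling `Find("a  \r\nb\n\tc \n")` yields two runs, `(1, 2)` and `(9, 1)`. The leading tab before `c` is not reported because a non-whitespace character follows it on the same line. As noted in the thread, whether this is worth the extra complexity depends on whether it gives a measurable gain.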

bergmeister and others added 2 commits

April 27, 2020 22:05

Apply suggestions from code review

c35ec92

Co-Authored-By: Robert Holt <rjmholt@gmail.com>

Use IsWhiteSpace

aedbc13

bergmeister merged commit 6fa29cb into PowerShell:master

Apr 28, 2020

Labels

Area - Performance Area - Rules

Performance: Eliminate Regex overhead in AvoidTrailingWhitespace -> Speedup of 5% (PowerShell 5.1) or 2.5 % (PowerShell 7.1-preview.2) #1465
