Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Deprecated: Improve $.trim performance for strings with lots of whitespace#5068

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
mgol merged 1 commit intojquery:3.x-stablefromvlsi:faster_trim
Jul 20, 2022

Conversation

vlsi
Copy link

The old implementation took O(N^2) time to trim the string when multiple adjacent spaces
were present.

In my case, the issue impacts JIRA which takes ~20 seconds to display an issue where most of the time is spent injQuery.trim.

For instance, consider the string "A B" (10 spaces between A and B).
Then old regexp /[\s]+$/ would take 10 steps until it realizes
the regexp does not match at position 1.
Then it would try to match the regexp at position 2, and it would take 9 steps.
Then 8 steps for position 3, ... so it would take 10*10/2 steps until
it figures out the regexp does not match.

The new approach is to require "non-whitespace" char before the whitespace run,
so it spends just one step for each space in the string.

ExE-Boss reacted with thumbs up emojimrk-andreev reacted with hooray emojimrk-andreev, danny0405, and luizbills reacted with rocket emoji
@vlsivlsiforce-pushed thefaster_trim branch 2 times, most recently from61fb761 to66018e1CompareJune 27, 2022 18:01
@vlsivlsi changed the titlePerformance: improve .trim performance for large strings contains lots of whitespacesPerformance: improve .trim performance for large strings containing lots of whitespacesJun 28, 2022
@vlsivlsi changed the titlePerformance: improve .trim performance for large strings containing lots of whitespacesPerformance: improve .trim performance for large strings containing lots of whitespaceJun 28, 2022
@mgol
Copy link
Member

Thanks for the PR. This is adding 63 bytes to the gzipped minified build, we consider that a significant number. As I suggested in the other thread, can you try removing the branch that leverages nativetrim when available to decrease the size impact? Are there any issues with this strategy?

To verify the size increase, please runnpm run build locally on the3.x-stable branch and then on your PR rebased over that branch; you should see at the end what's the size of your current build compared to3.x-stable - the number 68 I posted above.

@vlsi
Copy link
Author

can you try removing the branch that leverages native trim when available to decrease the size impact?

Do you mean I should completely remove native trim approach?

@mgol
Copy link
Member

Do you mean I should completely remove native trim approach?

Yes, that was the idea. Are there any issues with this approach? As I understand, your regex changes should resolve the performance issues that you reported.

You can also try the version without a nativetrim on a separate branch, just to measure what size impact would this version have while still being able to go back to the version you currently proposed if needed.

@vlsi
Copy link
Author

vlsi commentedJul 13, 2022
edited
Loading

Native trim should be faster than any regexp, and I think the recent browsers should not require the regexp.
Frankly speaking, I think going for native trim is worth doing as it would make the webpages faster for the end-user.

Just in case:

#baseline   raw     gz Sizes289641  85451 dist/jquery.js 89653  30945 dist/jquery.min.js# with regexp only (without native trim)   raw     gz Sizes290048  85627 dist/jquery.js 89691  30959 dist/jquery.min.js  +407   +176 dist/jquery.js   +38    +14 dist/jquery.min.js# regexp and native trim   raw     gz Sizes290297  85690 dist/jquery.js 89812  31008 dist/jquery.min.js  +656   +239 dist/jquery.js  +159    +63 dist/jquery.min.js# regexp, single trim function that checks useDefaultTrim always   raw     gz Sizes290226  85687 dist/jquery.js 89781  31006 dist/jquery.min.js  +585   +236 dist/jquery.js  +128    +61 dist/jquery.min.js

@vlsi
Copy link
Author

vlsi commentedJul 13, 2022
edited
Loading

Here's a sample benchmark:https://jsbench.me/oml5j9kjfu/1

Note that even the optimized regex still needs to walk over the entire string while the native implementation can walk the string backwards.

Results on my machine are:

trim 1000 char text like__________ __________ __________ __________ __________ __________ __________ __________ __________ __________ __________ __________ ...

letn=1;letwordLen=10;letnumWords=100;letspaceAround=1;letmid=('_'.repeat(wordLen)+' ').repeat(numWords)+'_';letsrc=' '.repeat(spaceAround)+mid+' '.repeat(spaceAround);letltrim=/^[\s\uFEFF\xA0]+/;letrtrim=/([^\s\uFEFF\xA0])[\s\uFEFF\xA0]+$/;
native: 16M ops/sec ~ 0.06 ms/trimregex: 700K ops/sec ~ 1.4 ms/trim (~23 times slower)

trim' '.repeat(1000) + '_' + ' '.repeat(1000)

native: 750K ops/sec ~ 1.3 ms/trimregex: 680K ops/sec ~ 1.47 ms/trim (13% slower)

trim' _ '

native: 85M ops/sec ~ 13 us/trimregex: 5.5M ops/sec ~ 180 us/trim (~13 times slower)

Frankly speaking, I think the results show that performance improvement is worth the added 63 bytes.
So I believe regex-based implementation could be a not-so-bad workaround for stale browsers, while native trim would save users's CPU and battery (which is more important than 63-byte transfer).

WDYT?

Copy link
Member

@mgolmgol left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks for the test cases.

I added some comments that should contribute to making this patch much smaller.

It's a bit tricky as this is a deprecated util that should generally not be used at all... The use case presented atjquery/jquery-migrate#461 (comment) says something about a slow JIRA instance. I wonder: would updating jQuery with this patch help here, considering that JIRA would have to update the included jQuery?

That said, I'd be leaning towards accepting this change with my suggestions as trimming is a pretty common operation and I imagine$.trim must be used a lot, especially in older code. My suggestions should make the size increase significantly down.

Thoughts,@timmywil@dmethvin@gibson042?

@mgolmgol added Deprecated 3.x-only Needs info Discuss in MeetingReserved for Issues and PRs that anyone would like to discuss in the weekly meeting. and removed Needs info labelsJul 13, 2022
@vlsi
Copy link
Author

Here's profiling of JIRA issue I run into. I'm running Chrome 103.0.5060.114 on Apple M1 Max. There are twojquery.trim executions, and each of them takes 5 seconds.

JIRA freeze

@vlsi
Copy link
Author

I've merged functions into one:

   raw     gz Sizes290221  85685 dist/jquery.js 89781  31006 dist/jquery.min.js  +580   +234 dist/jquery.js  +128    +61 dist/jquery.min.js

@vlsi
Copy link
Author

I've createddeprecated/support, and the size is

   raw     gz Sizes290171  85668 dist/jquery.js 89767  30997 dist/jquery.min.js  +530   +217 dist/jquery.js  +114    +52 dist/jquery.min.js

@vlsi
Copy link
Author

I wonder: would updating jQuery with this patch help here, considering that JIRA would have to update the included jQuery?

Of course, it would definitely help.
If I remember well, then the optimized regex improves the performance from 5 seconds to 5 milliseconds. Unfortunately, the input data has sensitive information, and I would like to avoid spending time distilling the test case since the optimization is kind of obvious.

@vlsivlsiforce-pushed thefaster_trim branch 5 times, most recently from81d37f7 to0621c3eCompareJuly 13, 2022 11:27
@vlsi
Copy link
Author

   raw     gz Compared to 3.x-stable @ 0f6c3d9efc5f7e844bdcf8ef44f9327f414bea77  +102    +28 dist/jquery.js   +53    +29 dist/jquery.min.js

Copy link
Member

@mgolmgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Just a few minor comments, otherwise looks good! I'm also happy where this landed size-wise.

@gibson042
Copy link
Member

gibson042 commentedJul 14, 2022
edited
Loading

Adding 29 bytes for performance that only matters with pathological input seems like too much to me, but I would be comfortable with stillimproving performance by orders of magnitude for the earlier 6-byte increase:

@@ -15,7 +15,7 @@ define( [  // Support: Android <=4.0 only // Make sure we trim BOM and NBSP-var rtrim = /^[\s\uFEFF\xA0]+|[\s\uFEFF\xA0]+$/g;+var rtrim = /^[\s\uFEFF\xA0]+|([^\s\uFEFF\xA0])[\s\uFEFF\xA0]+$/g;  // Bind a function to a context, optionally partially applying any // arguments.@@ -82,6 +82,6 @@ jQuery.isNumeric = function( obj ) { jQuery.trim = function( text ) {        return text == null ?                "" :-               ( text + "" ).replace( rtrim, "" );+               ( text + "" ).replace( rtrim, "$1" ); }; } );
vlsi and Krinkle reacted with thumbs up emoji

@vlsivlsiforce-pushed thefaster_trim branch 3 times, most recently from6051fc7 to159f9c1CompareJuly 14, 2022 16:46

"use strict";

support.trim = "" === "\uFEFF\xA0".trim();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

To be clear, I am arguing against the introduction ofsupport.trim at +29 bytes (and really, even at +10).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

To me personally the biggest issue with size increases is they creep in, adding up to a meaningful increase over time. In this case, this danger does not exist as this is deprecated code that's already removed in4.0.0-pre. That makes +29 bytes more acceptable to me.

BTW, we could still make it smaller by dropping\xA0 - this is actually trimmed properly by Android 4.0,\uFEFF is not, I just checked. That lowers the size increase to +24 bytes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Native implementation is still can be an order of magnitude faster, andtrim is used a lot in the wild.
So it looks wrong to default to regexp implementation which is needed for a single stale browser only.

I agree that keeping the dependencies small is important, however, I would say that battery life might be even more important than 29 bytes of extra transfer.

Code size is easy to compare as long as npm prints it. However, it would be nice to have benchmarks that would capture energy consumed by "real-life websites".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Are you claiming that a real-life website, let alone a significant number of them, are spending so much time injQuery.trim that an improvement from O(n²) to O(n) (with real effect of 240x for input with hundreds of medial whitespace characters and 23000x for input with tens of thousands, taking the OP 20 seconds down to about 1 ms) is not sufficient to address the overall experience? I'm not denying that nativeString.prototype.trim is astronomically faster still, but how would such sites evenget the input data fast enough for that further improvement to make a material difference?

That said, I'm not going to panic over 18 bytes in a legacy branch. If@mgol is in favor, that's fine with me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A real site, can and should, of course, call the nativetrim() method directly if its product requirements allows that, and could do so, at least for the handful of rare cases where this is known to make a significant difference.

While sites may be embedding a particular jQuery version of a variety of different reasons (e.g. blocked on migration of code, or offer jQuery API to other plugins, or browser support forone of the legacy browsers, such as the case for Wikipedia where the JS payload supports IE11 and Android 5 but not IE9 or Android 4). It's quite likely a consumer could even adopt String-trim unconditionally in its code, if all its clients support ES5, even if jQuery itself maintains wider support.

@timmywiltimmywil added this to the3.6.1 milestoneJul 18, 2022
@timmywil
Copy link
Member

Thanks for your contribution@vlsi! We discussed this in the meeting and we'd prefer to go with@gibson042's shorter solution and not addsupport.trim to 3.x this late in the game. If you can make that change, we'd be happy to get this into 3.6.1.

mgol reacted with thumbs up emoji

@vlsi
Copy link
Author

vlsi commentedJul 19, 2022
edited
Loading

For reference, I've captured the problematic text (~900KiB).

Old regexp takes ~5 seconds.
The updated one takes 3ms.

For trivial cases (e.g.trim('<div>')), the updated regexp executes with a rate of ~11M/sec (the native executes at a rate of 130M/sec).
In other words, 10K trims would consume 1ms of CPU time.

timmywil, esabol, luizbills, Schweinepriester, and ExE-Boss reacted with hooray emoji

Copy link
Member

@mgolmgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thank you! Looks good to me now.

@mgol
Copy link
Member

@gibson042 since your "Request changes" review is pending - can you have a look and approve if you have no remarks? This now implements your +6 bytes proposal, as I understand.

@mgolmgol requested a review fromgibson042July 19, 2022 17:38
@mgolmgol removed the Needs info labelJul 19, 2022
… of whitespace runsRegex imp implementation takes O(N^2) time to trim the string when multipleadjacent spaces were present.The new expression require that the "whitespace run" starts from a non-whitespaceto avoid O(N^2) behavior when the engine would try matching "\s+$" at each space position.
@mgolmgol removed Needs review Discuss in MeetingReserved for Issues and PRs that anyone would like to discuss in the weekly meeting. labelsJul 20, 2022
@mgolmgol changed the titlePerformance: improve .trim performance for large strings containing lots of whitespaceDeprecated: Improve $.trim performance for strings with lots of whitespaceJul 20, 2022
@mgolmgol merged commit6994010 intojquery:3.x-stableJul 20, 2022
@mgol
Copy link
Member

Landed, thanks! This change will be included in the upcoming3.6.1 release.

esabol and ChaseBianchiSlalom reacted with hooray emoji

vlsi added a commit to vlsi/jquery-migrate that referenced this pull requestJul 20, 2022
…s of whitespacesThe old implementation took O(N^2) time to trim the string when multiple adjacent spaceswere present.For instance, consider the string "A          B" (10 spaces between A and B).Then old regexp /[\s]+$/ would take 10 steps until it realizesthe regexp does not match at position 1.Then it would try to match the regexp at position 2, and it would take 9 steps.Then 8 steps for position 3, ... so it would take 10*10/2 steps untilit figures out the regexp does not match.Seejquery/jquery#5068The new approach is to require "non-whitespace" char before the whitespace run,so it spends just one step for each space in the string.
mgol pushed a commit to jquery/jquery-migrate that referenced this pull requestJul 20, 2022
…spaceRegex imp implementation takes `O(N^2)` time to trim the string whenmultiple adjacent spaces were present.The new expression require that the "whitespace run" starts froma non-whitespace to avoid `O(N^2)` behavior when the engine wouldtry matching `\s+$` at each space position.Closesgh-461Refjquery/jquery#5068
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@KrinkleKrinkleKrinkle approved these changes

@timmywiltimmywiltimmywil approved these changes

@gibson042gibson042gibson042 approved these changes

@mgolmgolmgol approved these changes

@dmethvindmethvinAwaiting requested review from dmethvin

Assignees
No one assigned
Milestone
3.6.1
Development

Successfully merging this pull request may close these issues.

5 participants
@vlsi@mgol@Krinkle@gibson042@timmywil

[8]ページ先頭

©2009-2025 Movatter.jp