/perl5 (Public)

OP_SUBSTR_LEFT - a specialised OP_SUBSTR variant #22785


Merged

Conversation

richardleach
Contributor

This commit adds OP_SUBSTR_NIBBLE and associated machinery for fast handling of the constructions:

    substr EXPR,0,LENGTH,''

and

    substr EXPR,0,LENGTH

Where EXPR is a scalar lexical, the OFFSET is zero, and either there is no REPLACEMENT or it is the empty string. LENGTH can be anything that OP_SUBSTR supports. These constraints allow for a very stripped back and optimised version of pp_substr.
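As a rough Perl-level illustration of the two constructions (a sketch only: the variable names are invented here, and whether any given call is actually compiled to the specialised op depends on the constraints described above):

```perl
use strict;
use warnings;

my $buf = "abcdef";

# Three-argument form: returns the leading 3 characters; $buf is unchanged.
my $peek = substr $buf, 0, 3;

# Four-argument form with an empty replacement: returns the same
# leading characters and also removes them from $buf.
my $head = substr $buf, 0, 3, '';

# LENGTH may be negative, meaning "leave that many characters
# at the end of the string".
my $rest = "abcdef";
my $most = substr $rest, 0, -2, '';

print "$peek $head $buf $most $rest\n";   # abc abc def abcd ef
```

Both forms return the same prefix; only the four-argument form modifies the scalar.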

The primary motivation was for situations where a scalar, containing some network packets or other binary data structure, is being parsed piecemeal. Nibbling away at the scalar can be useful when you don't know how exactly it will be parsed and unpacked until you get started. It also means that you don't need to worry about correctly updating a separate offset variable.
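A minimal sketch of that piecemeal parsing style follows. The record layout (one type byte, a two-byte big-endian length, then the payload) is hypothetical and invented purely for this example; it is not taken from the PR.

```perl
use strict;
use warnings;

# Hypothetical record layout, invented for this example:
#   1 byte type, 2 byte big-endian payload length, then the payload.
my $packet = pack('C n a*', 7, 5, 'hello') . pack('C n a*', 9, 2, 'ok');

my @records;
while (length $packet) {
    # Nibble the fixed-size header off the front, then decode it.
    my ($type, $len) = unpack 'C n', substr($packet, 0, 3, '');

    # The header says how much payload to take next.
    my $payload = substr $packet, 0, $len, '';

    push @records, [ $type, $payload ];
}

print "type=$_->[0] payload=$_->[1]\n" for @records;
```

Because each call consumes what it returns, the next field always starts at offset zero and no separate position variable has to be kept in sync.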

This operator also turns out to be an efficient way to (destructively) break an expression up into fixed size chunks. For example, given:

    my $x = ''; my $str = "A"x100_000_000;

This code:

    $x = substr($str, 0, 5, "") while ($str);

is twice as fast as doing:

    for ($pos = 0; $pos < length($str); $pos += 5) {
        $x = substr($str, $pos, 5);
    }

Compared with blead, `$y = substr($x, 0, 5)` runs 40% faster and `$y = substr($x, 0, 5, '')` runs 45% faster.
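One caveat when adapting the destructive loop to arbitrary data: `while ($str)` tests truthiness, so it would stop early if the remaining string were exactly "0". The sketch below uses `length` instead and checks that the destructive and offset-based loops yield the same chunks. The helper names are made up for this example, it assumes a positive chunk size, and it makes no claim about timings.

```perl
use strict;
use warnings;

# Destructively split a copy of $str into chunks of at most $size characters.
sub chunks_from_front {
    my ($str, $size) = @_;
    my @chunks;
    # length() rather than truthiness, so a trailing "0" is not dropped.
    push @chunks, substr($str, 0, $size, '') while length $str;
    return @chunks;
}

# Non-destructive equivalent using a separate offset variable.
sub chunks_by_offset {
    my ($str, $size) = @_;
    my @chunks;
    for (my $pos = 0; $pos < length($str); $pos += $size) {
        push @chunks, substr($str, $pos, $size);
    }
    return @chunks;
}

my @a = chunks_from_front('abcdefghij0', 5);
my @b = chunks_by_offset('abcdefghij0', 5);
print join('|', @a), "\n";   # abcde|fghij|0
print join('|', @b), "\n";   # abcde|fghij|0
```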


  • This set of changes requires a perldelta entry, and I will add one shortly.

@leonerd left a comment
Contributor

A couple of small comments but overall nothing troubling-looking here.

I wonder a bit about the name though. I've usually seen the word "nibble" to mean a half-byte; i.e. a 4-bit value. I wondered if that is what is going on here at first. If there are other candidate names to call it, perhaps something else would be better? Not a huge problem though.

@richardleach
Contributor, Author

How about substr_chop, since a big part of its design is being a fast route for getting to Perl_sv_chop?

Food related alternatives: substr_peck? substr_graze? substr_smidgen? substr_julienne? substr_shuck? substr_heel, like the end of a loaf? Non-foodie: substr_shave?

@richardleach
Contributor, Author

Not sure what's going on with the ABRT test failures. Don't get them locally. `./perl -Ilib t/perf/benchmarks.t` seemingly only uses 8MB of memory, so running out of memory doesn't seem to be the cause.

@richardleach
Contributor, Author

> Not sure what's going on with the ABRT test failures.

Looks like an op_private flags assertion. I'll dig into it soon.

@richardleach changed the title from "OP_SUBSTR_NIBBLE - a specialised OP_SUBSTR variant" to "OP_SUBSTR_CHOP - a specialised OP_SUBSTR variant" on Nov 28, 2024
@richardleach
Contributor, Author

I'm rebasing and renaming it to SUBSTR_CHOP.

@leonerd
Contributor

Doesn't perl's chop() function eat from the other end though?

@richardleach
Contributor, Author

The Perl chop takes from the end, but Perl_sv_chop takes from the front. (I don't know who we have to thank for that amazing piece of naming.) The pp_ function for this op calls Perl_sv_chop.
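At the Perl level the difference between the two ends looks roughly like this (a sketch for illustration; Perl_sv_chop itself is a C-level API and is not called directly here):

```perl
use strict;
use warnings;

my $from_end = "abcdef";
my $last = chop $from_end;                  # removes and returns the last character

my $from_front = "abcdef";
my $first = substr $from_front, 0, 1, '';   # removes and returns the first character

print "$last $from_end $first $from_front\n";   # f abcde a bcdef
```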

@jkeenan
Contributor

@richardleach , merge conflicts ^^

@leonerd
Contributor

> The Perl chop takes from the end, but Perl_sv_chop takes from the front. (I don't know who we have to thank for that amazing piece of naming.) The pp_ function for this op calls Perl_sv_chop.

Oh wow. Huh. In that case, might as well call this one SUBSTR_CHOP indeed then.

Otherwise my thoughts were going to be something like SUBSTR_PREFIX but that isn't much more descriptive.

@Grinnz
Contributor

Consider ltrim, with inspiration from PHP and Redis (or lstrip a la Ruby/Python but that sounds more whitespace-specific). Though it is also unrelated to builtin::trim, I think it's a bit more descriptive at least

@richardleach
Contributor, Author

> Consider ltrim, with inspiration from PHP and Redis (or lstrip a la Ruby/Python but that sounds more whitespace-specific). Though it is also unrelated to builtin::trim, I think it's a bit more descriptive at least

Hmmm, I'm not sure about this. It seems only more descriptive to someone who already is familiar with ltrim, otherwise it's likely to lead to confusion with builtin::trim or even reducing the other end of the string. There might be some confusion around _CHOP, but at least the connection to sv_chop is there.

@tonycoz
Contributor

Maybe OP_SUBSTR_LEFT borrowing from perl's BASIC antecedents.

@richardleach
Contributor, Author

> Maybe OP_SUBSTR_LEFT borrowing from perl's BASIC antecedents.

Ok, that seems straightforward enough without colliding with Perlspace. Will rename.

Variants are named to match the style of macros in op.h
@richardleach changed the title from "OP_SUBSTR_CHOP - a specialised OP_SUBSTR variant" to "OP_SUBSTR_LEFT - a specialised OP_SUBSTR variant" on Dec 12, 2024
@richardleach
Contributor, Author

OP renamed to OP_SUBSTR_LEFT and, I think, review comments all addressed.

@bulk88
Contributor

BINOPs like PP

    if (index($str, 'ZZZZZZ') == -1) { }

XS have no concept of "G_BOOL" context. There is definitely a need to deliver bool context from the runloop to the XS.

@iabyn
Contributor

iabyn commented Dec 23, 2024 via email

On Thu, Dec 19, 2024 at 02:35:50AM -0800, bulk88 wrote:
> BINOPs like PP
>
>     if (index($str, 'ZZZZZZ') == -1) { }
>
> XS have no concept of "G_BOOL" context. There is definitely a need to deliver bool context from the runloop to the XS.

What has this got to do with the proposed OP_SUBSTR_LEFT op?
-- "You may not work around any technical limitations in the software" -- Windows Vista license

This commit adds OP_SUBSTR_LEFT and associated machinery for fast handling of the constructions:

    substr EXPR,0,LENGTH,''

and

    substr EXPR,0,LENGTH

Where EXPR is a scalar lexical, the OFFSET is zero, and either there is no REPLACEMENT or it is the empty string. LENGTH can be anything that OP_SUBSTR supports. These constraints allow for a very stripped back and optimised version of pp_substr.

The primary motivation was for situations where a scalar, containing some network packets or other binary data structure, is being parsed piecemeal. Nibbling away at the scalar can be useful when you don't know how exactly it will be parsed and unpacked until you get started. It also means that you don't need to worry about correctly updating a separate offset variable.

This operator also turns out to be an efficient way to (destructively) break an expression up into fixed size chunks. For example, given:

    my $x = ''; my $str = "A"x100_000_000;

This code:

    $x = substr($str, 0, 5, "") while ($str);

is twice as fast as doing:

    for ($pos = 0; $pos < length($str); $pos += 5) {
        $x = substr($str, $pos, 5);
    }

Compared with blead, `$y = substr($x, 0, 5)` runs 40% faster and `$y = substr($x, 0, 5, '')` runs 45% faster.

@richardleach merged commit 6197847 into Perl:blead on Jan 13, 2025
@richardleach deleted the hydahy/op_substr_nibble2 branch on January 13, 2025 23:23
Reviewers

@leonerd left review comments

@tonycoz approved these changes

7 participants
@richardleach @leonerd @jkeenan @Grinnz @tonycoz @bulk88 @iabyn