#20411 fix Yaml parsing for very long quoted strings #21523

Closed

Conversation

@RichardBradley
Contributor

| Q             | A
| ------------- | ---
| Branch?       | 2.7
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #20411
| License       | MIT
| Doc PR        | no

This is a second fix for the issue discussed in #20411. My first PR (#21279) didn't fix the bug in all cases, sorry.

If a YAML string has too many spaces in the value, it can trigger a `PREG_BACKTRACK_LIMIT_ERROR` error in the Yaml parser.

There should be no behavioural change other than the bug fix.

I have included a test which fails before this fix and passes after this fix.

I have also added checks that detect other PCRE internal errors and throw a more descriptive exception. Before this patch, the YAML engine would often give incorrect results, rather than throwing, on a PCRE `PREG_BACKTRACK_LIMIT_ERROR` error.


$isRef = $mergeNode = false;
- if (preg_match('#^\-((?P<leadspaces>\s+)(?P<value>.+?))?\s*$#u', $this->currentLine, $values)) {
+ if (self::preg_match('#^\-((?P<leadspaces>\s+)(?P<value>.+))?$#u', rtrim($this->currentLine), $values)) {
Contributor, Author

See comment on line 127 below

$this->refs[$isRef] = end($data);
}
}
- } elseif (preg_match('#^(?P<key>'.Inline::REGEX_QUOTED_STRING.'|[^\'"\[\{].*?) *\:(\s+(?P<value>.+?))?\s*$#u', $this->currentLine, $values) && (false === strpos($values['key'], ' #') || in_array($values['key'][0], array('"', "'")))) {
+ } elseif (self::preg_match('#^(?P<key>'.Inline::REGEX_QUOTED_STRING.'|[^\'"\[\{].*?) *\:(\s+(?P<value>.+))?$#u', rtrim($this->currentLine), $values)
Contributor, Author

Here, as well as wrapping "preg_match", I have fixed the regex to avoid large numbers of pcre backtracks by moving the trailing whitespace trimming behaviour out of the regex pattern and into a "rtrim" in the argument list.

This may potentially be less performant in some cases (but I expect more performant in most cases, actually), but I have not measured this.

It demonstrably fixes a bug, as can be seen by the unit test I have added to this commit, which fails without this change and passes with it.

Contributor, Author

@RichardBradley, Feb 3, 2017 (edited)

The specific problem with the regex here was the ".+?" followed by a "\s*$", which leads to a great deal of backtracking behaviour in long strings. I could not find a simpler fix than the one I propose here (the possessive quantifier fix used in #21279 would not work here).
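
To illustrate the change described above, here is a minimal sketch comparing the two patterns on a few short lines. The patterns are simplified (the `Inline::REGEX_QUOTED_STRING` alternative for quoted keys is left out), and `OLD_PATTERN`, `NEW_PATTERN`, `match_old` and `match_new` are hypothetical names for this example, not part of the Yaml component.

```php
<?php

// Simplified versions of the key/value patterns from the diff (assumption:
// the quoted-key alternative is omitted to keep the example self-contained).
const OLD_PATTERN = '#^(?P<key>[^\'"\[\{].*?) *\:(\s+(?P<value>.+?))?\s*$#u';
const NEW_PATTERN = '#^(?P<key>[^\'"\[\{].*?) *\:(\s+(?P<value>.+))?$#u';

// Old approach: trailing whitespace is consumed inside the regex.
function match_old(string $line): ?array
{
    if (1 !== preg_match(OLD_PATTERN, $line, $m)) {
        return null;
    }

    return ['key' => $m['key'], 'value' => $m['value'] ?? null];
}

// New approach: trailing whitespace is removed with rtrim() before matching.
function match_new(string $line): ?array
{
    if (1 !== preg_match(NEW_PATTERN, rtrim($line), $m)) {
        return null;
    }

    return ['key' => $m['key'], 'value' => $m['value'] ?? null];
}
```

On sample lines such as `"key: some value   "` or `"key:"`, both helpers capture the same key and value. The difference is in how the match proceeds: the lazy `.+?` has to re-test `\s*$` after every character it consumes, while the greedy `.+` can run straight to the end of the already trimmed line.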

} else {
if (isset($values['leadspaces'])
- && preg_match('#^(?P<key>'.Inline::REGEX_QUOTED_STRING.'|[^\'"\{\[].*?) *\:(\s+(?P<value>.+?))?\s*$#u', $values['value'], $matches)
+ && self::preg_match('#^(?P<key>'.Inline::REGEX_QUOTED_STRING.'|[^\'"\{\[].*?) *\:(\s+(?P<value>.+))?$#u', rtrim($values['value']), $matches)
Contributor, Author

see comment on line 127

$error = 'Error.';
}

throw new ParseException($error);
Contributor, Author

This should not be reachable, but should hopefully mean that any further undetected "backtrack_limit" bugs, or any similar bugs added in the future, will result in an exception rather than an incorrect result.
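
A wrapper of the kind discussed in this PR might look roughly like the sketch below. It is not the code from the commit: `checked_preg_match` and `PcreError` are hypothetical names, the error messages are illustrative, and the PR itself throws `ParseException`. The only built-ins used are `preg_match()`, `preg_last_error()` and the `PREG_*_ERROR` constants.

```php
<?php

// Hypothetical exception type for this sketch; the PR throws ParseException.
final class PcreError extends \RuntimeException
{
}

/**
 * Takes the same arguments as preg_match(), but turns a PCRE internal
 * error (a return value of false) into an exception, so callers cannot
 * mistake it for "no match".
 */
function checked_preg_match(string $pattern, string $subject, ?array &$matches = null, int $flags = 0, int $offset = 0): int
{
    $result = preg_match($pattern, $subject, $matches, $flags, $offset);

    if (false === $result) {
        switch (preg_last_error()) {
            case PREG_INTERNAL_ERROR:
                $error = 'Internal PCRE error.';
                break;
            case PREG_BACKTRACK_LIMIT_ERROR:
                $error = 'pcre.backtrack_limit reached.';
                break;
            case PREG_RECURSION_LIMIT_ERROR:
                $error = 'pcre.recursion_limit reached.';
                break;
            case PREG_BAD_UTF8_ERROR:
                $error = 'Malformed UTF-8 data.';
                break;
            default:
                $error = 'Error.';
        }

        throw new PcreError($error);
    }

    return $result;
}
```

With a helper like this, a call site can keep treating `0` as "no match" and `1` as "match", while a backtrack-limit failure surfaces as an exception instead of silently falling through to the next branch.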

* @throws ParseException on a PCRE internal error
* @see preg_last_error()
*/
static function preg_match($pattern, $subject, &$matches = null, $flags = 0, $offset = 0)
Contributor

Should probably be called `pregMatch` if you want to call it like this. Perhaps `match` would be a better name.

Contributor, Author

I deliberately named it "preg_match" so that it was a drop-in replacement for the builtin `preg_match`. Is that disallowed?

*
* This avoids us needing to check for "false" every time PCRE is used
* in the YAML engine
*
Member

must be `@internal` as we clearly don't want to support BC on it

Contributor, Author

ok, will do

$error = 'Error.';
}

throw new ParseException($error);
Member

this misses the location of the error in the ParseException

Contributor, Author

@RichardBradley, Feb 3, 2017 (edited)

The location is not always available, unless I make two wrappers -- one static, for Inline and other callers, and one non-static, for use during the parse. Do you think that's worthwhile?

Member

You could pass the needed context to the new method. Might not look nice, but will do what it should.

public function parse($value, $exceptionOnInvalidType = false, $objectSupport = false, $objectForMap = false)
{
- if (!preg_match('//u', $value)) {
+ if (!self::preg_match('//u', $value)) {
Member

I would not replace this, as it would throw an exception saying `Malformed UTF-8 data.` instead of the expected one

Contributor, Author

Good point. I'll add a test to cover this
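
The point raised in this thread is that `preg_match('//u', $value)` is used here as a UTF-8 validity check, where a `false` result is an expected outcome rather than an engine failure. A minimal sketch of that idiom follows; `is_valid_utf8` is a hypothetical helper name, not something defined in the Yaml component.

```php
<?php

/**
 * Returns true when $value is valid UTF-8.
 *
 * The empty pattern always matches, so with the "u" modifier preg_match()
 * returns 1 for well-formed input and false (with preg_last_error() set to
 * PREG_BAD_UTF8_ERROR) for malformed input.
 */
function is_valid_utf8(string $value): bool
{
    return 1 === preg_match('//u', $value);
}
```

Routing this particular call through an error-throwing wrapper would replace the caller's own, more specific error handling with the wrapper's generic PCRE message, which is why the plain `preg_match` call was kept at this site.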

@fabpot
Member

fabbot failures must be fixed.

@xabbuh
Member

@RichardBradley Do you have time to finish here? :)

@RichardBradley
Contributor, Author

Sorry, I have been busy. I should be able to look over the next couple of days, yes.

@RichardBradley
Contributor, Author

I have pushed an update which addresses the review comments above and fixes the "fabbot" style checks.

- } elseif (preg_match('#^(?P<key>'.Inline::REGEX_QUOTED_STRING.'|[^\'"\[\{].*?) *\:(\s+(?P<value>.+?))?\s*$#u', $this->currentLine, $values) && (false === strpos($values['key'], ' #') || in_array($values['key'][0], array('"', "'")))) {
+ } elseif (self::preg_match('#^(?P<key>'.Inline::REGEX_QUOTED_STRING.'|[^\'"\[\{].*?) *\:(\s+(?P<value>.+))?$#u', rtrim($this->currentLine), $values)
+ && (false === strpos($values['key'], ' #')
+ || in_array($values['key'][0], array('"', "'")))) {
Member

The way the `if` condition is wrapped now IMO doesn't make it more readable. Maybe reformat it to something like this:

} elseif (self::preg_match('#^(?P<key>'.Inline::REGEX_QUOTED_STRING.'|[^\'"\[\{].*?) *\:(\s+(?P<value>.+))?$#u', rtrim($this->currentLine), $values)
    && (false === strpos($values['key'], ' #') || in_array($values['key'][0], array('"', "'")))) {

Contributor, Author

I have pushed a new version with your preferred indentation

}

- throw new ParseException($error, $this->getRealCurrentLineNb() + 1, $this->currentLine);
+ throw new ParseException('Unable to parse', $this->getRealCurrentLineNb() + 1, $this->currentLine);
Member

Will this ever be reached now?

Contributor, Author

Yes, this line is covered by 3 tests:

  • ParserTest::testUnindentedCollectionException
  • ParserTest::testShortcutKeyUnindentedCollectionException
  • ParserTest::testScalarInSequence

@RichardBradley
Contributor, Author

I have pushed a new version which I believe addresses all the review issues raised.

Member

@xabbuh left a comment

👍

Status: Reviewed

@fabpot
Member

Thank you @RichardBradley.

fabpot added a commit that referenced this pull request Mar 17, 2017

…ardBradley)

This PR was squashed before being merged into the 2.7 branch (closes #21523).

Discussion
----------

#20411 fix Yaml parsing for very long quoted strings

| Q             | A
| ------------- | ---
| Branch?       | 2.7
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #20411
| License       | MIT
| Doc PR        | no

This is a second fix for the issue discussed in #20411. My first PR (#21279) didn't fix the bug in all cases, sorry.

If a YAML string has too many spaces in the value, it can trigger a `PREG_BACKTRACK_LIMIT_ERROR` error in the Yaml parser.

There should be no behavioural change other than the bug fix

I have included a test which fails before this fix and passes after this fix.

I have also added checks that detect other PCRE internal errors and throw a more descriptive exception. Before this patch, the YAML engine would often give incorrect results, rather than throwing, on a PCRE `PREG_BACKTRACK_LIMIT_ERROR` error.

Commits
-------

c9a1c09 #20411 fix Yaml parsing for very long quoted strings

@fabpot closed this Mar 17, 2017
@fabpot mentioned this pull request Apr 4, 2017
This was referenced Apr 5, 2017

Reviewers

@stof left review comments

@xabbuh approved these changes

@linaori left review comments

Milestone

2.7

7 participants

@RichardBradley @fabpot @xabbuh @stof @linaori @nicolas-grekas @carsonbot
