Able to load big xml files with DomCrawler #16873


Closed
zorn-v wants to merge 2 commits into symfony:2.8 from zorn-v:dom-crawler-load-big-xml

@zorn-v

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets |
| License       | MIT
| Doc PR        |

Member

Does this option have any drawbacks when parsing non-huge documents?

Author

The manual says that it only relaxes any hardcoded limit from the parser.
https://secure.php.net/manual/en/libxml.constants.php

It is only available with libxml >= 2.7.0, but I don't know whether older versions are widespread.
For example, CentOS 6 ships 2.7.6.

Contributor

Maybe we need something like:
LIBXML_NONET | (defined('LIBXML_PARSEHUGE') ? LIBXML_PARSEHUGE : 0)

Author

This constant is defined in the PHP extension and has been available since PHP 5.2.12 and PHP 5.3.2, both of which are older than the minimum PHP requirement for DomCrawler.

Member

@zorn-v is it always defined, or does it depend on the libxml version being used? Distributions generally compile PHP against the system libxml rather than the version bundled with PHP, meaning that it may change.

Author

You are right. In PHP 5.3.9, ext\libxml\libxml.c has:

    #if LIBXML_VERSION >= 20703
    REGISTER_LONG_CONSTANT("LIBXML_PARSEHUGE", XML_PARSE_HUGE, CONST_CS | CONST_PERSISTENT);
    #endif

So the minimum libxml version is actually 2.7.3, not 2.7.0.
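
To see which case applies on a given machine, a quick check along these lines may help. It is only a sketch: it relies on PHP's built-in `LIBXML_VERSION` and `LIBXML_DOTTED_VERSION` constants, and the variable names are made up for illustration.

```php
<?php
// LIBXML_VERSION is an integer built from the libxml version PHP was
// compiled against, e.g. 20703 for libxml 2.7.3.
$newEnough = LIBXML_VERSION >= 20703;

// The constant is only registered when the #if guard was true at build time.
$hasParseHuge = defined('LIBXML_PARSEHUGE');

printf(
    "libxml %s: LIBXML_PARSEHUGE is %s\n",
    LIBXML_DOTTED_VERSION,
    $hasParseHuge ? 'defined' : 'not defined'
);
```

If the guard quoted above is the only condition, `$newEnough` and `$hasParseHuge` should agree.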

Author

I can find only one distribution with libxml2 < 2.7.3: CentOS 5. But it ships PHP 5.1.6 (in the standard repo).
Even Debian 6 has 2.7.8.

I think there is no point in that check, but I added it just in case.

@rvanlaak
Contributor

Libxml introduced `XML_MAX_TEXT_LENGTH` in 2.9.0, which results in a limit of ~10 MB per node. Libxml's `XML_MAX_TEXT_LENGTH` cannot be changed via PHP; it is compiled into the library.
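
A rough way to observe this limit is to parse a document with one oversized text node, with and without the flag. This is a sketch only: `tryLoad()` is a hypothetical helper, the 11 MB size is an arbitrary value chosen to sit just above the limit described here, and the result of the default parse depends on the libxml version PHP was built against.

```php
<?php
// Build a document whose single text node is about 11 MB, just over the
// ~10 MB per-node limit.
$xml = '<root>' . str_repeat('a', 11 * 1024 * 1024) . '</root>';

// Hypothetical helper: returns true when the document parses.
function tryLoad($xml, $options)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $ok = $dom->loadXML($xml, $options);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

$default = tryLoad($xml, LIBXML_NONET);
$huge = defined('LIBXML_PARSEHUGE')
    ? tryLoad($xml, LIBXML_NONET | LIBXML_PARSEHUGE)
    : null; // flag unavailable on this build

var_dump($default, $huge);
```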

We recently updated the library from 2.8.0 to 2.9.2 and can no longer parse big XML files with the Serializer because of this. So I think this should also be configurable for the Serializer component. Our problem is that the decoding happens before the controller, in `FOS\RestBundle\EventListener\BodyListener:114`, but we only want to enable this setting for a couple of specific requests.

Please also note http://symfony.com/blog/security-release-symfony-2-0-17-released regarding the `LIBXML_PARSEHUGE` constant; it should definitely not be there by default.
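
One way to read this request is an explicit opt-in. The following is a minimal sketch under that assumption, not the actual DomCrawler or Serializer code: `loadXmlDocument()` and its `$allowHuge` parameter are hypothetical names.

```php
<?php
// Hypothetical helper, not the actual DomCrawler or Serializer code.
function loadXmlDocument($xml, $allowHuge = false)
{
    $options = LIBXML_NONET;

    // Only lift the parser limits on request, and only if this PHP build
    // actually registers the flag.
    if ($allowHuge && defined('LIBXML_PARSEHUGE')) {
        $options |= LIBXML_PARSEHUGE;
    }

    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument('1.0', 'UTF-8');
    $loaded = $dom->loadXML($xml, $options);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $loaded ? $dom : null;
}

// Default call keeps the limits; the second call opts in for a trusted source.
$small = loadXmlDocument('<feed><item>ok</item></feed>');
$big = loadXmlDocument('<feed><item>ok</item></feed>', true);
```

Keeping the default at `false` means untrusted input is still parsed with the library limits in place, and only callers that know their source pay the cost of lifting them.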

@fabpot
Member

Thank you @zorn-v.

fabpot added a commit that referenced this pull request Jan 25, 2016

This PR was submitted for the 2.8 branch but it was merged into the 2.3 branch instead (closes #16873).

Discussion
----------

Able to load big xml files with DomCrawler

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets |
| License       | MIT
| Doc PR        |

Commits
-------

3dae825 Able to load big xml files with DomCrawler

@fabpot closed this Jan 25, 2016
@rvanlaak
Contributor

I think this should also be applied to the `XmlEncoder`. How can we make sure the variable is defined before the RestBundle's `BodyListener` without putting it in `app.php`?

@fabpot mentioned this pull request Feb 3, 2016
This was referenced Feb 28, 2016
fabpot added a commit that referenced this pull request Mar 2, 2016

…-grekas)

This PR was merged into the 2.3 branch.

Discussion
----------

[DomCrawler] Dont use LIBXML_PARSEHUGE by default

| Q             | A
| ------------- | ---
| Branch        | 2.3
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | no
| Fixed tickets | #16873, #17956
| License       | MIT
| Doc PR        | -

Because of http://symfony.com/blog/security-release-symfony-2-0-17-released

Commits
-------

fda32f8 [DomCrawler] Dont use LIBXML_PARSEHUGE by default
7 participants: @zorn-v, @rvanlaak, @fabpot, @stof, @dosten, @xabbuh, @carsonbot