Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork9.7k
[DomCrawler] return empty string onCrawler::text() andCrawler::html() instead of an exception#28581
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
[DomCrawler] return empty string onCrawler::text() andCrawler::html() instead of an exception#28581
Uh oh!
There was an error while loading.Please reload this page.
Conversation
respinoza commentedSep 24, 2018
I am not sure if the BC is allowed or if we should add an argument to control the behavior and deprecate (but this kinda defeats this change as goal seems to make this more developer friendly) |
null ontext andhtml methods when empty in…null ontext andhtml methods when empty instead of an exceptionnicolas-grekas commentedSep 24, 2018 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
We shouldn't break BC, so a constructor argument to enable this behavior would be needed. |
19c9818 tofc5effdComparerespinoza commentedSep 24, 2018
Here is a better proposal@nicolas-grekas to avoid BC as you suggested (way better), not sure about the new argument name. Now, for the |
stof commentedSep 24, 2018
We still have a bunch of other methods which fail for an empty node list. And btw, making return types nullable will not save you for doing a check to handle that case. Instead of checking whether the crawler is empty or no, you will have to check whether the value is |
stof commentedSep 24, 2018
and regarding BC, this is still changing the contract of the class. Some code receiving a Crawler as argument will not control this boolean, and so would have to be updated to support the new signature as well. So this approach is still a BC break. |
respinoza commentedSep 24, 2018
@stof looking at the original issue, it could also be empty, let's say an empty string, in this way it keeps the signature. |
stof commentedSep 24, 2018
Well, returning the empty string is possible for these 2 methods. But then, we still cannot remove the exception from other methods (the ones returning an object for instance) |
571d6bc toe0d0842Comparerespinoza commentedSep 24, 2018
A test fails but seems to be a travis problem with composer? |
nicolas-grekas left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think I'm fine with this PR (just one minor comment)
->text() & ->html() not throwing is good for DX (but throwing for the other methods is OK also)
Uh oh!
There was an error while loading.Please reload this page.
nicolas-grekas commentedSep 25, 2018
@respinoza how would you leverage this new argument? Do you create your crawlers manually? |
jakzal commentedSep 25, 2018 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
With |
respinoza commentedSep 25, 2018
@jakzal the |
respinoza commentedSep 25, 2018
@nicolas-grekas we will have to pass it to the BrowserKit Client constructor and then pass it around inhttps://github.com/symfony/browser-kit/blob/68f233bc3a0fb6b64f48ff6318038087b49a1639/Client.php#L516` Currently there is no way to inject a DomCrawler instance. This can also be solved by improving the DomCrawler/BrowserKit as follows
|
nicolas-grekas commentedSep 25, 2018
so let's allow it via a new constructor arg to BrowserKit's Client? |
nicolas-grekas commentedSep 25, 2018
hum, looks like a crawlerfactory would be needed, so now I get your proposal. |
respinoza commentedSep 25, 2018
@nicolas-grekas having a factory will be way better. I will get on that. |
stof commentedSep 25, 2018
I'm not sure passing a Crawler factory just to control this setting (which looks transitional) makes sense. Injecting a Crawler indeed does not make sense, as the Crawler holds some content, so it does not make sense to inject one that the Client would not know what is already in it. |
respinoza commentedSep 25, 2018
@stof the current code in BrowserKit creates a new instance or null if the class doesn't exist. The factory will be called instead, behavior is the same (new instance on method call), we will just show to return a configured instance. |
stof commentedSep 25, 2018
Well, the question is whether we need to create a concept of factory when theonly meaningful configuration to be customized by the extension point is this boolean. Currently, this PR makes that setting a permanent option. It does not deprecate the old behavior. Is it intended ? Variable signatures based on an option are harder to use (as any code not creating the crawler could not rely on whether an exception could be thrown or no for empty crawlers, and so should keep handling the case throwing exceptions too, defeating this PR).
Thus, if the goal is to deprecate the old behavior, I think a constructor argument is the wrong way to opt-in for the new behavior: the code needing to deal with the new behavior (and so knowing whether it can opt-in or no) may not be the same than the code instantiating the crawler (and when using BrowserKit, that would mean the code instantiating the client, to configure it to configure the crawler creation...). |
respinoza commentedSep 25, 2018
I don't believe we want to get rid of the old behavior so there won't be any deprecation. I agree with your point on the variable signature. We could always have two new methods and get rid of the parameter. (No idea how to make those methods) |
stof commentedSep 26, 2018
@jakzal creating a decorator will be a pain:
At that point, it is much better to have this text extraction utility in your own code rather than in a crawler decorator. And regarding the discussion about BC, a configurable signature is a bigger BC break than a variable signature based on a new optional argument. A variable signature creates a BC break for child class (and the migration path is clear, with the possibility to trigger a deprecation). A configurable signature creates a BC break for consuming code (and with no way to warn them) due to the signature change (as they might not be the one controlling the contrustor-based opt-in). |
nicolas-grekas commentedSep 26, 2018
Option 1. is possible, using func_get_arg() to prevent any BC break. We already did so in other places. |
a6858eb tob3505efComparerespinoza commentedSep 29, 2018
I added the argument to the function through Also again not sure about the parameter name (naming things is hard), ideas welcome 😅 |
null ontext andhtml methods when empty instead of an exceptionCrawler::text() andCrawler::html() instead of an exceptionro0NL commentedSep 29, 2018
@respinoza what about |
amer-flix commentedSep 29, 2018
Passing true for every method is also a hectic approach. It would have been better if we could somehow transfer this approach from the Crawler constructor. |
jakzal commentedSep 29, 2018
A default value, rather than a boolean flag, would be more flexible. |
respinoza commentedSep 29, 2018
I would say no to a default value as we don't want to change the signature (return single), we could check it is a string but in practice is someone going to have another value than empty string? Maybe the idea of having new methods to handle this might be better. Any more ideas? |
nicolas-grekas commentedSep 29, 2018
Yes, two :)
|
b3505ef to3c399cfComparerespinoza commentedDec 16, 2018
@nicolas-grekas sorry for the delay, changes have been done and hopefully it can be included for 4.3 |
…`html()` instead of throwing an exception when node is empty.
3c399cf to6a4ce38Comparenicolas-grekas commentedDec 17, 2018
Thank you@respinoza. |
…and `Crawler::html()` instead of an exception (respinoza)This PR was merged into the 4.3-dev branch.Discussion----------[DomCrawler] return empty string on `Crawler::text()` and `Crawler::html()` instead of an exception…stead of an exception| Q | A| ------------- | ---| Branch? | master| Bug fix? | no| New feature? | yes| BC breaks? | yes| Deprecations? | no| Tests pass? | yes| Fixed tickets |#28313| License | MIT| Doc PR | symfony/symfony-docs#... <!-- required for new features -->This changes the behavior when using text and html methods from the Crawler by returning null instead of throwing an exception.Commits-------6a4ce38 [DomCrawler] Added ability to return a default value in `text()` and `html()` instead of throwing an exception when node is empty.
…ereguiluz)This PR was merged into the master branch.Discussion----------Documented the default values of text() and html()Documentssymfony/symfony#28581See alsohttps://symfony.com/blog/new-in-symfony-4-3-domcrawler-improvementsCommits-------851956d Documented the default values of text() and html()
elgandoz commentedNov 26, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
This changes the behavior when using text and html methods from the Crawler by returning null instead of throwing an exception.