Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork9.7k
[Serializer][CSV] Add context options to handle BOM#33896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Uh oh!
There was an error while loading.Please reload this page.
src/Symfony/Component/Serializer/Tests/Encoder/CsvEncoderTest.php OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
nicolas-grekas left a comment• edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
When decoding a value, we should always deal with the BOM to me, per the robustness principle.
Then, does it really make sense to add an option to prefix the BOM in the output, when the alternative is just a concatention away? I'm not sure...
Uh oh!
There was an error while loading.Please reload this page.
stof commentedOct 8, 2019
I agree with@nicolas-grekas that we should always handle BOM in the input. |
malarzm commentedOct 8, 2019
@nicolas-grekas so you're proposing we always add BOM to output and handle it in input (if present)? |
nicolas-grekas commentedOct 8, 2019
I'm proposing to never add the BOM in the output, and always handle it in the input. |
malarzm commentedOct 8, 2019
I'm 👎 for never adding BOM in the output as then I'm as a consumer responsible for checking whether data is serialized to CSV and if so, add BOM to it. |
stof commentedOct 8, 2019
@malarzm as@nicolas-grekas said, you can still concatenate the BOM before the serialized data. |
malarzm commentedOct 8, 2019
@nicolas-grekas@stof I'm not sure when exactly you want to manually concatenate BOM, may I ask for an example how do you see it? |
ebe7fd3 tof6b1405Comparemalarzm commentedOct 8, 2019
@nicolas-grekas I've removed the ByteOrderMask class keeping only UTF-8 BOM in the encoder itself, the BOM is always stripped while decoding and, for now, I've left a bool switch to include BOM in encoded string as I believe this is encoder's responsibility. |
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
93602d0 to35d8e61Compare
fabpot left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Small typo
| ----- | ||
| * deprecated the`XmlEncoder::TYPE_CASE_ATTRIBUTES` constant, use`XmlEncoder::TYPE_CAST_ATTRIBUTES` instead | ||
| * added option to output an UTF-8 BOM in CSV encoder via`CsvEncoder::OUTPUT_UTF8_BOM_KEY` context option |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
a UTF-8
| if ($outputBom) { | ||
| if (!preg_match('//u',$value)) { | ||
| thrownewUnexpectedValueException('You are trying to add an UTF-8 BOM to a non UTF-8 text.'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
a UTF-8
malarzm commentedOct 13, 2019
@fabpot thanks for noticing, fixed :) |
fabpot commentedOct 13, 2019
Thank you@malarzm. |
…alarzm)This PR was merged into the 4.4 branch.Discussion----------[Serializer][CSV] Add context options to handle BOM| Q | A| ------------- | ---| Branch? | 4.4| Bug fix? | no| New feature? | yes| Deprecations? | no| Tickets |Fix#33684| License | MIT| Doc PR |symfony/symfony-docs#12461This allows BOM handling in en/decoded CSV files. To keep current behaviour intact both skipping BOM at the beginning of the CSV and outputting BOM are an opt-in feature.Personally I'd propose to make `SKIP_INPUT_BOM` default to `false` in 5.0 so the BOM is transparent and people that for some reasons expect BOM characters to be present in the parsed text explicitly opt-out of trimming it.Commits-------3eb3668 Add context options to handle BOM
This PR was merged into the 4.4 branch.Discussion----------List CSV encoder's context optionsI needed to add new option `output_utf8_bom` introduced insymfony/symfony#33896 and since context options weren't documented I figured I'll list them all. Format and text is based on XML's part of the docs.Commits-------b600aa0 List CSV encoder's context options
Uh oh!
There was an error while loading.Please reload this page.
This allows BOM handling in en/decoded CSV files. To keep current behaviour intact both skipping BOM at the beginning of the CSV and outputting BOM are an opt-in feature.
Personally I'd propose to make
SKIP_INPUT_BOMdefault tofalsein 5.0 so the BOM is transparent and people that for some reasons expect BOM characters to be present in the parsed text explicitly opt-out of trimming it.