Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

[Serializer] Add support for auto generated custom normalizers#52905

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
Nyholm wants to merge13 commits intosymfony:7.4
base:7.4
Choose a base branch
Loading
fromNyholm:better-serializer

Conversation

Nyholm
Copy link
Member

@NyholmNyholm commentedDec 5, 2023
edited by OskarStark
Loading

QA
Branch?7.3
Bug fix?no
New feature?yes
Deprecations?no
Issues
LicenseMIT

This is an opt-in performance boost for the serializer component. The trick is to move work from runtime to build-time by generating custom normalizers.


The Serializer is slow because every time we serialize or deserialize an object we are trying to figure out what this object really is. By marking the classes we are interested in serializeing, we can do all this "figure out" at build time. This saves LOADs of time.

The more complex the value objects are, the more time we will save. On a simple "hello word" object, we reduce the time with about 80%.

This feature is not complete. We do not yet support the Serializer attributes likeIgnore andSerializerName. But since this is opt-in AND experimental I think it is safe to review, try and maybe even merge this PR. Because many large PRs on this component have stalled the passed few years.

Usage

# config/package/serializer.yamlframework:serializer:auto_normalizer:paths:App\Model:'src/Model'App\Entity:'src/Entity'
App\Model value objects
# src/Model/Foo/Blog.phpnamespaceApp\Model\Foo;useSymfony\Component\Serializer\Attribute\Serializable;#[Serializable]class Blog{publicstring$name;}
# src/Model/User.phpnamespaceApp\Model;useApp\Model\Foo\Blog;useSymfony\Component\Serializer\Attribute\Serializable;#[Serializable]class User{publicstring$name;publicBlog$blog;}
#[AsCommand('app:debug')]class DebugCommandextends Command{publicfunction__construct(privateSerializerInterface$serializer    ){parent::__construct();    }protectedfunctionexecute(InputInterface$input,OutputInterface$output):int    {$blog =newBlog();$blog->name ='My Blog';$user =newUser();$user->name ='Tobias';$user->blog =$blog;$start =microtime(true);for($i =0;$i <200000;$i++) {$string =$this->serializer->serialize($user,'json');        }$end =microtime(true);echo'Time:'.round(($end -$start) *1000).' ms'.\PHP_EOL;$output->writeln($string);return Command::SUCCESS;    }}

Output:

➜  example-app git:(7.1) ✗ sf app:debug                                 Time: 2557 ms{"name":"Tobias","blog":{"name":"My Blog"}}➜  example-app git:(better-serializer) ✗ sf app:debugTime: 310 ms{"name":"Tobias","blog":{"name":"My Blog"}}

Generated classes

The tests have a lot of example classes and expected outputs. Seehttps://github.com/Nyholm/symfony/tree/better-serializer/src/Symfony/Component/Serializer/Tests/Fixtures/CustomNormalizer

See generated classes for PR example
namespaceSymfony\Serializer\Normalizer;useApp\Model\Foo\Blog;useSymfony\Component\Serializer\Normalizer\NormalizerInterface;useSymfony\Component\Serializer\Normalizer\DenormalizerInterface;class App_Model_Foo_Blogimplements NormalizerInterface, DenormalizerInterface{publicfunctiongetSupportedTypes(?string$format):array    {return [Blog::class =>true];    }publicfunctionsupportsNormalization(mixed$data, ?string$format =NULL,array$context = []):bool    {return$datainstanceof Blog;    }publicfunctionsupportsDenormalization(mixed$data,string$type, ?string$format =NULL,array$context = []):bool    {return$type === Blog::class;    }/**     * @param Blog $object     */publicfunctionnormalize(mixed$object, ?string$format =NULL,array$context = []):array|string|int|float|bool|\ArrayObject|null    {return ['name' =>$object->name,        ];    }publicfunctiondenormalize(mixed$data,string$type, ?string$format =NULL,array$context = []):mixed    {$output =newBlog();if (array_key_exists('name',$data)) {$output->name =$data['name'];        }return$output;    }}
namespaceSymfony\Serializer\Normalizer;useApp\Model\User;useSymfony\Component\Serializer\Normalizer\NormalizerInterface;useSymfony\Component\Serializer\Normalizer\DenormalizerInterface;useSymfony\Component\Serializer\Normalizer\NormalizerAwareInterface;useSymfony\Component\Serializer\Exception\DenormalizingUnionFailedException;useSymfony\Component\Serializer\Normalizer\DenormalizerAwareInterface;class App_Model_Userimplements NormalizerInterface, DenormalizerInterface, NormalizerAwareInterface, DenormalizerAwareInterface{privatenull|NormalizerInterface$normalizer =NULL;privatenull|DenormalizerInterface$denormalizer =NULL;publicfunctiongetSupportedTypes(?string$format):array    {return [User::class =>true];    }publicfunctionsupportsNormalization(mixed$data, ?string$format =NULL,array$context = []):bool    {return$datainstanceof User;    }publicfunctionsupportsDenormalization(mixed$data,string$type, ?string$format =NULL,array$context = []):bool    {return$type === User::class;    }/**     * @param User $object     */publicfunctionnormalize(mixed$object, ?string$format =NULL,array$context = []):array|string|int|float|bool|\ArrayObject|null    {return ['name' =>$object->name,'blog' =>$this->normalizeChild($object->blog,$format,$context,false),        ];    }publicfunctionsetNormalizer(NormalizerInterface$normalizer):void    {$this->normalizer =$normalizer;    }privatefunctionnormalizeChild(mixed$object, ?string$format,array$context,bool$canBeIterable):mixed    {if (is_scalar($object) ||null ===$object) {return$object;        }if ($canBeIterable ===true &&is_iterable($object)) {returnarray_map(fn($item) =>$this->normalizeChild($item,$format,$context,true),$object);        }return$this->normalizer->normalize($object,$format,$context);            }publicfunctiondenormalize(mixed$data,string$type, ?string$format =NULL,array$context = []):mixed    {$output =newUser();if (array_key_exists('name',$data)) {$output->name =$data['name'];        }if (array_key_exists('blog',$data)) {$setter0 =$this->denormalizeChild($data['blog'], \App\Model\Foo\Blog::class,$format,$context,false);$output->blog =$setter0;        }return$output;    }publicfunctionsetDenormalizer(DenormalizerInterface$denormalizer):void    {$this->denormalizer =$denormalizer;    }privatefunctiondenormalizeChild(mixed$data,string$type, ?string$format,array$context,bool$canBeIterable):mixed    {if (is_scalar($data) ||null ===$data) {return$data;        }if ($canBeIterable ===true &&is_iterable($data)) {returnarray_map(fn($item) =>$this->denormalizeChild($item,$type,$format,$context,true),$data);        }return$this->denormalizer->denormalize($data,$type,$format,$context);            }}

OskarStark, yceruto, Kocal, Zuruuh, ging-dev, Jibbarth, welcoMattic, javiereguiluz, theofidry, Guikingone, and 8 more reacted with thumbs up emojitucksaun, GaryPEGEOT, Zuruuh, ging-dev, Jibbarth, and Aweptimum reacted with hooray emojiJibbarth, makraz, and Korbeil reacted with rocket emoji
@yceruto
Copy link
Member

Thanks Tobias for this proposal! It's nice to see some performance improvements for this component.

I did just a quick check of the code and I wonder if the code generator building blocks can be a separate component or considerhttps://github.com/nette/php-generator. It looks decoupled from this feature.

@yceruto
Copy link
Member

Another alternative for PHP code generationnikic/php-parserhttps://github.com/nikic/PHP-Parser/blob/HEAD/doc/component/AST_builders.markdown we already use this library in the Translation component.

valtzu reacted with thumbs up emoji

@Nyholm
Copy link
MemberAuthor

Nyholm commentedDec 5, 2023
edited
Loading

That are some good points.
Im not in love with Nette because of its ugly generated PHP code, awkward API, it is opinionated and lack of extensibility. This is of course 100% my opinion and it is from a few years back.

Nikic is way too complex and verbose.

I also looked into re-use the code generator from Config component. But that one is also very opinionated and not generic. I've also heard@mtarld is using a code generator for upcoming PR... I really think a group of contributors should look over Symfony's use of code generators, but I dont think that is a blocker for this PR. (The CodeGenerator here is marked as internal)

yceruto and dkarlovi reacted with thumbs up emoji

@RobertMe
Copy link
Contributor

Just a quick glance over the description and example, and I noticed this in the generated code:

if (array_key_exists('name',$data)) {$output->name =$data['name'];        }

To my understandingisset() checks have always been (way) faster thanarray_key_exists(). Obviously there is the difference innull handling, but over the years I've seen quite a lot of code where this is optimized intoisset(...) || array_key_exists(...) where normally the (much) quickerisset() check is enough. Might this be an optimization which could be applied here as well?

Great idea though, and looking forward for this 😄

Nyholm reacted with heart emoji

@mtarld
Copy link
Contributor

That are some good points. Im not in love with Nette because of its ugly generated PHP code, awkward API, it is opinionated and lack of extensibility. This is of course 100% my opinion and it is from a few years back.

Nikic is way too complex and verbose.

I also looked into re-use the code generator from Config component. But that one is also very opinionated and not generic. I've also heard@mtarld is using a code generator for upcoming PR... I really think a group of contributors should look over Symfony's use of code generators, but I dont think that is a blocker for this PR. (The CodeGenerator here is marked as internal)

I finally usednikic/php-parser in#51718, and I think that's the way to go.
IMHO we can rely on such a stable library. While it is indeed verbose, it'll lighten the codebase a lot, allowing to get rid of builders.
Plus, having an AST instead of a string feels more secure to me, as it avoids typos and can be compatible to any PHP version.

yceruto, welcoMattic, javiereguiluz, agustingomes, dkarlovi, and chalasr reacted with thumbs up emoji

@staabm
Copy link
Contributor

To my understanding isset() checks have always been (way) faster than array_key_exists(). Obviously there is the difference in null handling, but over the years I've seen quite a lot of code where this is optimized into isset(...) || array_key_exists(...) where normally the (much) quicker isset() check is enough. Might this be an optimization which could be applied here as well?

Thats only correct for older PHP versions. In recent versions the opposite is true.

RobertMe reacted with thumbs up emoji

@RobertMe
Copy link
Contributor

Thats only correct for older PHP versions. In recent versions the opposite is true.

You'll learn something new every day 😄. Ran a quick PHPBench (for loop with 10M iterations doingisset(...) || array_key_exists(...) vsarray_key_exists(...) and split over "key exists" and "key doesn't exist", so 4 benches in total). Andarray_key_exists with a non existing key is even quickest of all. Then comesisset || array_key_exists with an existing key quickly followed byarray_key_exists with an existing key, whileisset || array_key_exists for a non existing key is over 50% slower compared to all of the others.

staabm and ging-dev reacted with laugh emoji

@fabpot
Copy link
Member

Usingnikic/php-parser is the way to go IMHO, which is what the maker bundle uses IIRC.

welcoMattic, Korbeil, dkarlovi, vasilvestre, soyuka, Zuruuh, chalasr, ging-dev, Nyholm, and yceruto reacted with thumbs up emoji

@Korbeil
Copy link
Contributor

Hey, we spoke a bit together during SymfonyCon hackday. I am working on JanePHP components & AutoMapper. We do a lot of code generation and I love seeing such feature coming to Symfony Serializer ! I didn't saw your PR before some tweet I saw today that's why I didn't understand well all you were talking with Mathias.
I'll also recommend usingnikic/php-parser since it allows you to validate part of the input and will make this code safer ! Also you're not obligated to work only with AST, the library comes with a builder that makes some tasks easier, you can see some examples there:https://github.com/Korbeil/jane-v8/blob/main/src/Component/JsonSchemaGenerator/Generator/ModelGenerator.php#L56-L66 (this is part of the new JanePHP version that I'm working on 😉).

In your implementation I would like to see some sort of hash corresponding to the model your generated the normalizer from, the idea is to know that you generated the normalizer with a specific version of that model and if the model changes, you will generate a new version of it. You can find this logic here if you want an example:https://github.com/jolicode/automapper/blob/9c43e0d3710f202097d30656667041c3cbf8382c/src/MapperMetadata.php#L138-L155.

I'm not sure I am a fan of having the model that holds information about what is going to be generated having also the role to generate stuff, but it fits well in your pull request. Also, do you plan on replacing stuff like ArrayDenormalizer with this PR or a later PR ? I would love being able to have all the normalizer code being generated and not only some part.

dkarlovi reacted with hooray emojiqboot reacted with heart emoji

@dbu
Copy link
Contributor

dbu commentedDec 13, 2023
edited
Loading

this has crazy performance impacts.

me and some collegues wrote something like this a while ago for JMS serializer. if you need more inspiration:https://github.com/liip/serializer

we generate code to convert between object and array and back. we did not use any code generation tool buttwig :-D (but using a code generator is probably the better way to do this). for our complex models, we generate 50k or more lines of PHP that is all dead simple, and thanks to opcache it is quite ok to have such large code files.

depending on event system during the on-the-fly conversion, our generated code will not do the same. it should be possible to trigger the events from the generated code, but we did not use events so we did not make the effort.

Nyholm reacted with heart emoji

Copy link
Contributor

@soyukasoyuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nice it reminds me of the work by@GuilhemN#22051. I'm definitely in favor ofnikic/php-parser for code generation.

Nyholm and Korbeil reacted with thumbs up emoji
@@ -17,8 +17,6 @@
* Infers the constructor argument type.
*
* @author Dmitrii Poddubnyi <dpoddubny@gmail.com>
*
* @internal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm not a huge fan, you did this because it's used in another component? Just ignore the error for internal use between components it's fine no?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yes. It is because I used it in another component.

Do you know the reason why it is internal in the first place?

@joelwurtz
Copy link
Contributor

Really nice, would love to see that in symfony, i think it's simple enough for a first implementation and other features can be added in subsequent PR.

More optimizations can be done also to increase performance (like directly inject the child normalizer when needed, to have done a similar thing having this direct injection normally provide between 10 to 20% performance in some cases, like a property that is an array of objects) but all of this can be added further.

Also ATM there is now 3 different pull request with a similar goal with the Mapper component and the JsonEncoder component, i don't know which way the symfony team want to go but IMO there is a path were all of this PR can use the same implementation and code base which will drasticly reduce code maintenance and burden to the core team by doing something like :

  • create an internal implementation to generate a mapper that read data from an array / object and write it to an array / object, if the implementation use nikic/php-parser it's really easy to change the way the property is read / write on the fly during generation

With this implementation you will be able to achieve the same thing as those 3 PR :

  • Normalization Generation -> object to array
  • Denormalization Generation -> array to object
  • Mapper Generation -> object to object

For the JsonEncode PR this may require some more works to be able to use only specific part of the implementation but i think this is totally doable

Having this will also allow each of this component to share new features (supporting groups, excluding properties, renaming properties, circular loop, etc ...)

Korbeil and mtarld reacted with thumbs up emojiwelcoMattic, qboot, and Nyholm reacted with heart emoji

@dbu
Copy link
Contributor

dbu commentedDec 14, 2023

* create an internal implementation to generate a mapper that read data from an array / object and write it to an array / object, if the implementation use nikic/php-parser it's really easy to change the way the property is read / write on the fly during generation

this is also the approach we took for our liip serializer-generator:https://github.com/liip/metadata-parser

because we wanted to build a drop-in replacement for JMS, we specifically support JMS annotations, but such a metadata parser could be really generic and provide metadata from the code or any kind of attributes/annotations/metadata mapping files in yaml/xml like doctrine does etc.

Copy link
MemberAuthor

@NyholmNyholm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Also, do you plan on replacing stuff like ArrayDenormalizer with this PR or a later PR ?

That is a later PR =)

me and some collegues wrote something like this a while ago for JMS serializer. if you need more inspiration:https://github.com/liip/serializer

Yeah. I think I've seen a conference talk about that.


PR is updated. We now use nikic/PHP-Parser

OskarStark reacted with thumbs up emoji
@@ -17,8 +17,6 @@
* Infers the constructor argument type.
*
* @author Dmitrii Poddubnyi <dpoddubny@gmail.com>
*
* @internal
Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yes. It is because I used it in another component.

Do you know the reason why it is internal in the first place?

@Nyholm
Copy link
MemberAuthor

Nyholm commentedJan 15, 2024
edited
Loading

Tests are green.
The Psalm error is a false negative.

@xabbuhxabbuh modified the milestones:7.1,7.2May 15, 2024
@andreybolonin
Copy link
Contributor

andreybolonin commentedJun 4, 2024
edited
Loading

Any chances to merge into 7.2?

@Korbeil
Copy link
Contributor

Korbeil commentedJun 4, 2024
edited
Loading

Any chances to merge into 7.2?

I do think it will overlap with coming ObjectMapper (see#51741). First implementation will use Reflection but it's planned (see#54476) to make a second implementation in order to generated mappers that will act as normalizers.

@tsantos84
Copy link
Contributor

Hi@Nyholm, I've tried to developmy own serializer component based on generated code, exactly the same way you're proposing to Symfony Serializer component. In that moment (5 years back), I made a lot of performance tests with the generated code and without any surprise, the benchmark showed that this is the way to go. If I can contribute with your PR, I'd say to take a look on thecode generator class to see how I generated the normalizer code. Each decision on the generated code was implemented with the light of Blackfire to minimize any bottleneck. You'll notice that to increase even more the performance, some Symfony Serializer logic should be refactored to allow them to be executed by the generated code, like ignored attributes, normalized name, class inheritance and etc. All of this feature has an overhead when used through reflection at runtime and can be improved on the generated code.

This tool helped me a lot when I wrote the component.

Nyholm and andreybolonin reacted with heart emoji

@dunglas
Copy link
Member

How would this interact with#51718?

@Nyholm
Copy link
MemberAuthor

How would this interact with#51718?

#51718 is for specific (but very common) scenarios. Ie DTOs to/form JSON (and maybe XML). It also skips the normalizer step.

This PR is for the generic use case, to/form all kind of data and does not change the architecture of the serialization process. However, it does not give as impressive results (but almost). Me and@mtarld believes they can/should co-exists.

mtarld reacted with thumbs up emoji

@Korbeil
Copy link
Contributor

I made a quick schema to summarize how each PR interact with the Serializer concepts:

image

ping#51741
ping#51718

@fabpotfabpot modified the milestones:7.2,7.3Nov 20, 2024
@joelwurtz
Copy link
Contributor

ObjectMapper was merged recently and@soyuka tell me that it would be nice to have generated code to improve performance of such mapper.

I do believe it's possible to have the same implementation that does :

  • Normalizer
  • Denormalizer
  • ObjectMapper
  • (Maybe JsonEncoder ?)

As the only difference in this is the way to write / read data

The main problem is : where to put this implementation, as other components will have to require this implementation and this is something that should be fully internal (no public extensions points, they are provided by other components as their are specific to the wanted behavior)

I'm not sure what your status are on this actually@Nyholm ?

soyuka reacted with thumbs up emoji

@soyuka
Copy link
Contributor

soyuka commentedMar 30, 2025
edited
Loading

The main problem is : where to put this implementation, as other components will have to require this implementation and this is something that should be fully internal (no public extensions points, they are provided by other components as their are specific to the wanted behavior)

I see a few ways:

  • create a new "internal" component that's used to generate code (could re-use some of the JsonStream component for example)
  • create a public component (just like[AstGenerator] [WIP] New Component, add normalizer generator #17516)
  • each component does its own thing and some code will be duplicated (actually we need to see if there's really something reusable across components)

Anyways this probably needs an RFC issue to prevent polluting this topic

mtarld reacted with thumbs up emoji

@fabpotfabpot modified the milestones:7.3,7.4May 26, 2025
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@OskarStarkOskarStarkOskarStark left review comments

@soyukasoyukasoyuka left review comments

@dunglasdunglasAwaiting requested review from dunglasdunglas is a code owner

Assignees
No one assigned
Projects
None yet
Milestone
7.4
Development

Successfully merging this pull request may close these issues.

17 participants
@Nyholm@yceruto@RobertMe@mtarld@staabm@fabpot@Korbeil@dbu@joelwurtz@andreybolonin@tsantos84@dunglas@soyuka@OskarStark@javiereguiluz@xabbuh@carsonbot

[8]ページ先頭

©2009-2025 Movatter.jp