[HttpFoundation] Add StreamedJsonResponse for efficient JSON streaming #47709

Merged

@alexander-schranz (Contributor) commented Sep 27, 2022 (edited)

| Q             | A
| ------------- | ---
| Branch?       | 6.2
| Bug fix?      | no
| New feature?  | yes
| Deprecations? | no
| Tickets       | Fix #...
| License       | MIT
| Doc PR        | symfony/symfony-docs#17301

When large data sets are streamed via a JSON API, it can be difficult to keep resource usage low. To address this, I experimented with a different way of streaming data for JSON responses. It uses a combination of a structured array and generators, which gave much better results.

More can be read about it here: https://github.com/alexander-schranz/efficient-json-streaming-with-symfony-doctrine.

I thought it could be a great addition to Symfony itself, making this kind of response easier to build and APIs more performant.

Usage

First Version (replaced)
    class ArticleListAction
    {
        public function __invoke(EntityManagerInterface $entityManager): Response
        {
            $articles = $this->findArticles($entityManager);

            return new StreamedJsonResponse(
                // json structure with replacers identifiers
                [
                    '_embedded' => [
                        'articles' => '__articles__',
                    ],
                ],
                // array of generator replacer identifier used as key
                [
                    '__articles__' => $articles,
                ]
            );
        }

        private function findArticles(EntityManagerInterface $entityManager): \Generator
        {
            $queryBuilder = $entityManager->createQueryBuilder();
            $queryBuilder->from(Article::class, 'article');
            $queryBuilder->select('article.id')
                ->addSelect('article.title')
                ->addSelect('article.description');

            return $queryBuilder->getQuery()->toIterable();
        }
    }

Updated Version (thx to @ro0NL for the idea):

    class ArticleListAction
    {
        public function __invoke(EntityManagerInterface $entityManager): Response
        {
            $articles = $this->findArticles($entityManager);

            return new StreamedJsonResponse(
                // json structure with generators in it which are streamed
                [
                    '_embedded' => [
                        'articles' => $articles, // a generator which is streamed
                    ],
                ],
            );
        }

        private function findArticles(EntityManagerInterface $entityManager): \Generator
        {
            $queryBuilder = $entityManager->createQueryBuilder();
            $queryBuilder->from(Article::class, 'article');
            $queryBuilder->select('article.id')
                ->addSelect('article.title')
                ->addSelect('article.description');

            return $queryBuilder->getQuery()->toIterable();
        }
    }

As proposed by @OskarStark, here is the full content of the blog post "Efficient JSON Streaming with Symfony and Doctrine":

Efficient JSON Streaming with Symfony and Doctrine

After reading a tweet pointing out that we provide only a few items (max. 100)
over our JSON APIs while serving 4k images on our websites, I started thinking
about why this is the case.

To see the main difference, we first need to know how images are streamed.
Web servers today mostly use the sendfile feature, which is very efficient
because it streams a file chunk by chunk and doesn't need to load the whole
file into memory.

So I asked myself how we can achieve the same mechanism for our
JSON APIs, with a little experiment.

As an example we will have a look at a basic entity which has the
following fields defined:

  • id: int
  • title: string
  • description: text

The response of our API should look like the following:

    {
      "_embedded": {
        "articles": [
          {
            "id": 1,
            "title": "Article 1",
            "description": "Description 1\nMore description text ..."
          },
          ...
        ]
      }
    }

Normally to provide this API we would do something like this:

    <?php

    namespace App\Controller;

    use App\Entity\Article;
    use Doctrine\ORM\EntityManagerInterface;
    use Symfony\Component\HttpFoundation\JsonResponse;
    use Symfony\Component\HttpFoundation\Response;

    class ArticleListAction
    {
        public function __invoke(EntityManagerInterface $entityManager): Response
        {
            $articles = $this->findArticles($entityManager);

            return JsonResponse::fromJsonString(json_encode([
                'embedded' => [
                    'articles' => $articles,
                ],
                'total' => 100_000,
            ], JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE));
        }

        // normally this method would live in a repository
        private function findArticles(EntityManagerInterface $entityManager): iterable
        {
            $queryBuilder = $entityManager->createQueryBuilder();
            $queryBuilder->from(Article::class, 'article');
            $queryBuilder->select('article.id')
                ->addSelect('article.title')
                ->addSelect('article.description');

            return $queryBuilder->getQuery()->getResult();
        }
    }

In most cases we will add some pagination to the endpoint so our responses are not too big.

Making the API more efficient

But there is also a way to stream this response efficiently.

First of all we need to adjust how we load the articles. This can be done by replacing
getResult with the more efficient toIterable:

    -        return $queryBuilder->getQuery()->getResult();
    +        return $queryBuilder->getQuery()->toIterable();
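
The effect of this change can be illustrated without Doctrine. The following is a minimal sketch in plain PHP; `loadAllArticles()` and `iterateArticles()` are hypothetical stand-ins for `getResult()` and `toIterable()` and are not part of Doctrine. The first builds the whole list before returning, the second hands out one row at a time.

```php
<?php

// Hypothetical stand-in for getResult(): builds the full list in memory first.
function loadAllArticles(int $total): array
{
    $articles = [];
    for ($id = 1; $id <= $total; ++$id) {
        $articles[] = ['id' => $id, 'title' => 'Article ' . $id];
    }

    return $articles;
}

// Hypothetical stand-in for toIterable(): yields one row at a time.
function iterateArticles(int $total): \Generator
{
    for ($id = 1; $id <= $total; ++$id) {
        yield ['id' => $id, 'title' => 'Article ' . $id];
    }
}

// Both variants can be consumed with the same foreach loop.
$count = 0;
foreach (iterateArticles(1000) as $article) {
    ++$count; // the generator only produces the current row
}
```

Because the consuming loop is identical, the calling code does not need to change when the data source is switched to a generator.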

Still, the whole JSON needs to be in memory to send it. So we also need to refactor
how we create our response. We will replace our JsonResponse with the
StreamedResponse object.

    return new StreamedResponse(function () use ($articles) {
        // stream json
    }, 200, ['Content-Type' => 'application/json']);

But the JSON format is not the best format for streaming, so we need to add some hacks
to make it streamable.

First we will define the basic structure of our JSON this way:

    $jsonStructure = json_encode([
        'embedded' => [
            'articles' => ['__REPLACES_ARTICLES__'],
        ],
        'total' => 100_000,
    ], JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE);

Instead of the $articles we use a placeholder, which lets us split the string into
a $before and an $after variable:

    [$before, $after] = explode('"__REPLACES_ARTICLES__"', $jsonStructure, 2);

Now we first send the $before part:

    echo $before . PHP_EOL;

Then we stream the articles one by one. Here we need to keep the comma in mind, which
we need to add after every article except the last one:

    foreach ($articles as $count => $article) {
        if ($count !== 0) {
            echo ',' . PHP_EOL; // if not first element we need a separator
        }

        echo json_encode($article, JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE);
    }

Also we will add an additional flush after every 500 elements:

    if ($count % 500 === 0 && $count !== 100_000) { // flush response after every 500
        flush();
    }

After that we will also send the $after part:

    echo PHP_EOL;
    echo $after;

The result

So at the end the whole action looks like the following:

    <?php

    namespace App\Controller;

    use App\Entity\Article;
    use Doctrine\ORM\EntityManagerInterface;
    use Symfony\Component\HttpFoundation\Response;
    use Symfony\Component\HttpFoundation\StreamedResponse;

    class ArticleListAction
    {
        public function __invoke(EntityManagerInterface $entityManager): Response
        {
            $articles = $this->findArticles($entityManager);

            return new StreamedResponse(function () use ($articles) {
                // defining our json structure but replaces the articles with a placeholder
                $jsonStructure = json_encode([
                    'embedded' => [
                        'articles' => ['__REPLACES_ARTICLES__'],
                    ],
                    'total' => 100_000,
                ], JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE);

                // split by placeholder
                [$before, $after] = explode('"__REPLACES_ARTICLES__"', $jsonStructure, 2);

                // send first before part of the json
                echo $before . PHP_EOL;

                // stream article one by one as own json
                foreach ($articles as $count => $article) {
                    if ($count !== 0) {
                        echo ',' . PHP_EOL; // if not first element we need a separator
                    }

                    if ($count % 500 === 0 && $count !== 100_000) { // flush response after every 500
                        flush();
                    }

                    echo json_encode($article, JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE);
                }

                // send the after part of the json as last
                echo PHP_EOL;
                echo $after;
            }, 200, ['Content-Type' => 'application/json']);
        }

        private function findArticles(EntityManagerInterface $entityManager): iterable
        {
            $queryBuilder = $entityManager->createQueryBuilder();
            $queryBuilder->from(Article::class, 'article');
            $queryBuilder->select('article.id')
                ->addSelect('article.title')
                ->addSelect('article.description');

            return $queryBuilder->getQuery()->toIterable();
        }
    }

The metrics for 100000 Articles (nginx + php-fpm 7.4 - Macbook Pro 2013):

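|                    | Old Implementation | New Implementation |
| ------------------ | ------------------ | ------------------ |
| Memory Usage       | 49.53 MB           | 2.10 MB            |
| Memory Usage Peak  | 59.21 MB           | 2.10 MB            |
| Time to first Byte | 478ms              | 28ms               |
| Time               | 2.335 s            | 0.584 s            |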

This way we not only reduced the memory usage on our server,
we also made the response faster. The memory usage was
measured here with memory_get_usage and memory_get_peak_usage.
The "Time to first Byte" is the value reported by the browser, and
the response times were measured with curl.
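
How exactly the two functions were called is not shown here, so the following is only a sketch of one way such figures could be collected. `measureMemory()` is a hypothetical helper, and the workload inside the closure is a placeholder for the controller action.

```php
<?php

// Hypothetical helper: runs $task and reports memory figures in megabytes.
function measureMemory(callable $task): array
{
    $task();

    return [
        'usage_mb' => round(memory_get_usage() / 1024 / 1024, 2),
        'peak_mb' => round(memory_get_peak_usage() / 1024 / 1024, 2),
    ];
}

$metrics = measureMemory(static function (): void {
    // Placeholder workload; in the experiment this would be the action itself.
    $rows = range(1, 10000);
    unset($rows);
});
```

The peak value is the more telling one for streaming, since it shows whether the whole result was ever held in memory at once.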

Updated 2022-10-02 - (symfony serve + php-fpm 8.1 - Macbook Pro 2021)

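|                           | Old Implementation | New Implementation |
| ------------------------- | ------------------ | ------------------ |
| Memory Usage              | 64.21 MB           | 2.10 MB            |
| Memory Usage Peak         | 73.89 MB           | 2.10 MB            |
| Time to first Byte        | 0.203 s            | 0.049 s            |
| Updated Time (2022-10-02) | 0.233 s            | 0.232 s            |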

While there is not much difference in time for a single response,
the real gain is the lower memory usage, which kicks in when
you have a lot of simultaneous requests. On my machine that is >150 simultaneous
requests, which is a high value; on a normal server it will be a lot lower.

While 150 simultaneous requests crash the old implementation,
the new implementation still works with 220 simultaneous requests, which
means about 46% more requests are possible.

Reading Data in JavaScript

As we stream the data, our JavaScript on the other end should work
the same way, so the data needs to be read in a streamed way.

Here I'm just following the Fetch API example "Processing a text file line by line".

So if we look at our script.js, we split the object
line by line and append it to our table. This is definitely not
how JSON should be read and parsed. It is only shown as an example
of how the response could be read from a stream.
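
The line-by-line idea can be sketched as follows, written in PHP rather than JavaScript to stay in the language used above. `readArticlesLineByLine()` is a hypothetical helper, and it assumes exactly the line layout that the action above produces: an opening line, one article per line (comma-terminated except the last), and a closing line. It requires PHP 8.0 or newer for `str_starts_with()` and `str_ends_with()`.

```php
<?php

// Hypothetical reader: yields one decoded article per line of the streamed body.
function readArticlesLineByLine(iterable $lines): \Generator
{
    foreach ($lines as $line) {
        // Drop surrounding whitespace and the separator that ends the line.
        $line = rtrim(trim($line), ',');

        // Skip the opening and closing lines of the surrounding structure.
        if (!str_starts_with($line, '{') || !str_ends_with($line, '}')) {
            continue;
        }

        yield json_decode($line, true, 512, JSON_THROW_ON_ERROR);
    }
}

$body = <<<'JSON'
{"embedded":{"articles":[
{"id":1,"title":"Article 1"},
{"id":2,"title":"Article 2"}
]},"total":2}
JSON;

$articles = iterator_to_array(readArticlesLineByLine(explode("\n", $body)), false);
```

Like the JavaScript version, this depends on the line layout and is not a general JSON stream parser.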

Conclusion

The implementation looks a little hacky. For maintainability it could
be moved into its own factory which creates this kind of response.

Example:

    return StreamedResponseFactory::create(
        [
            'embedded' => [
                'articles' => ['__REPLACES_ARTICLES__'],
            ],
            'total' => 100_000,
        ],
        ['__REPLACES_ARTICLES__' => $articles]
    );
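
What such a factory might do internally can be sketched in plain PHP. `StreamedJsonCallbackFactory` is a hypothetical name; instead of a response object it returns the callback that a StreamedResponse would run, so the sketch has no Symfony dependency. It assumes that every placeholder occurs exactly once in the structure and that the replacers are listed in the order in which their placeholders appear.

```php
<?php

// Hypothetical factory: returns the callback that a StreamedResponse would run.
final class StreamedJsonCallbackFactory
{
    private const FLAGS = JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE;

    public static function create(array $structure, array $replacers): \Closure
    {
        return static function () use ($structure, $replacers): void {
            $remaining = json_encode($structure, self::FLAGS);

            foreach ($replacers as $placeholder => $items) {
                // Split around the quoted placeholder and send the part before it.
                [$before, $remaining] = explode(json_encode((string) $placeholder, self::FLAGS), $remaining, 2);
                echo $before;

                // Stream the items one by one, comma-separated.
                $first = true;
                foreach ($items as $item) {
                    if (!$first) {
                        echo ',';
                    }
                    echo json_encode($item, self::FLAGS);
                    $first = false;
                }
            }

            // Send whatever follows the last placeholder.
            echo $remaining;
        };
    }
}

$callback = StreamedJsonCallbackFactory::create(
    ['embedded' => ['articles' => ['__REPLACES_ARTICLES__']], 'total' => 2],
    ['__REPLACES_ARTICLES__' => new \ArrayIterator([['id' => 1], ['id' => 2]])]
);

ob_start();
$callback();
$json = ob_get_clean();
```

Tracking the first item with a flag instead of the iteration key keeps the separator logic independent of the keys the iterable yields, and an empty iterable still results in a valid empty list.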

The JavaScript part is definitely not ready for production,
and if it is used you should probably create your own content type, e.g.
application/json+stream, so that you parse the JSON this way
only when you know it really is in this line-by-line format.
There may be better libraries like JSONStream
to read the data, but at the current state I have not tested them. Let me know
if somebody has experience with that and has solutions for it.

At least what I think everybody should do when providing lists
is to use toIterable when possible when loading
data via Doctrine, and to select specific fields instead
of using the ORM, to avoid the hydration process to objects.

Let me know what you think about this experiment and how you currently are
providing your JSON data.

The whole experiment can be checked out and tested yourself via this repository.

Join the discussion about this on Twitter.

Update 2022-09-27

Added a StreamedJsonResponse class and
trying to contribute this implementation to the Symfony core.

#47709

Update 2022-10-02

Updated some statistics with a new machine and Apache Benchmark tests for concurrent requests.

Kocal, GromNaN, Korbeil, welcoMattic, maidmaid, gilles-g, and powernic reacted with heart emoji
rvanlaak and NiklasBr reacted with rocket emoji
MatTheCat, ro0NL, and maxhelias reacted with eyes emoji
@carsonbot added this to the 6.2 milestone Sep 27, 2022
@carsonbot changed the title from "Add StreamedJsonResponse for efficient JSON streaming" to "[HttpFoundation] Add StreamedJsonResponse for efficient JSON streaming" Sep 27, 2022
@alexander-schranz (Contributor, Author)

The error in the tests of Stopwatch is unrelated to the pull request.

@OskarStark changed the title from "[HttpFoundation] Add StreamedJsonResponse for efficient JSON streaming" to "[HttpFoundation] Add `StreamedJsonResponse` for efficient JSON streaming" Sep 29, 2022
@ro0NL (Contributor)

would it be reasonable to consider a "compute json inline" approach, rather than end-users taking care of unique identifiers

    $lazyJson = ['key' => fn() => yield from $heavy];

@stof (Member)

@ro0NL this would force to re-implement the whole json encoding in userland

@ro0NL (Contributor)

we could array walk the structure first, thus keeping the unique placeholders an implementation detail.

@stof (Member)

@ro0NL if you do that, you are not streaming json anymore, defeating the whole purpose of this PR.

@ro0NL (Contributor) commented Sep 29, 2022 (edited)

the idea is to split the generators from the structure, preserving remaining logic. But this is an extra step yes, thus less ideal perhaps.

@alexander-schranz (Contributor, Author)

@ro0NL interesting input. As I think the structure array is mostly small it could be possible. But we would need to have a look at what difference this would make in the performance.

I hacked something together using array_walk_recursive: https://3v4l.org/tndhO. Will have a deeper look at it in the evening or in the next days.

@stof (Member)

@alexander-schranz be careful when implementing this. is_callable would turn some strings into placeholders instead of outputting them.

@alexander-schranz (Contributor, Author)

@stof great hint, I think $item instanceof Closure should then do the job?

ro0NL reacted with thumbs up emoji

@stof (Member)

now that we have first class callables, I would say yes. You can convert any callable to a closure using this feature.
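
The difference between the two checks discussed above can be shown with a few lines of plain PHP. This is only an illustration of the language behavior, not code from the pull request; the variable names are made up, and the first-class callable syntax requires PHP 8.1 or newer.

```php
<?php

// A plain string that happens to name a built-in function.
$title = 'strlen';
// An actual closure.
$lazy = static fn (): array => [1, 2, 3];

$stringIsCallable = is_callable($title);        // the string would be mistaken for a callable
$stringIsClosure = $title instanceof \Closure;  // a string is never a Closure
$lazyIsClosure = $lazy instanceof \Closure;

// First-class callable syntax turns a callable into a Closure.
$strlen = strlen(...);
$convertedIsClosure = $strlen instanceof \Closure;
```

With the `instanceof` check, a data value such as the string 'strlen' is output as is, while callers who want lazy evaluation can still pass any callable by converting it first.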

@alexander-schranz (Contributor, Author)

Okay, I don't need to check for closures or callables. I just need to check for \Generators, because the closures are already called. This is very important: for example, when the database connection is not available, the exception needs to be thrown in the controller and should not be thrown after status code 200 has already been returned:

    return new StreamedJsonResponse(
        [
            '_embedded' => [
                'articles' => $this->findArticles('Article'), // returns a \Generator which will generate a list of data
            ],
        ],
    );

The difference between the old and new implementation is not big: it just takes about 0.0000128s to do the array_walk_recursive and the replacement. It also did not cause any visible change in memory usage. The tested arrays are really small, but I think that will mostly be the case in this kind of response.

I also updated the example repository to use the new class under /symfony-articles.json: https://github.com/alexander-schranz/efficient-json-streaming-with-symfony-doctrine if somebody wants to experiment with it.
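
The array_walk_recursive step described above might look roughly like the following sketch. `extractGenerators()` and the placeholder naming are hypothetical and this is not the code from the pull request; it only shows how generators can be swapped for placeholder strings so that the remaining structure can be passed to json_encode.

```php
<?php

// Hypothetical helper: swaps every \Generator found in $structure for a unique
// placeholder string and collects the generators in the order they are found.
function extractGenerators(array $structure): array
{
    $generators = [];

    array_walk_recursive($structure, static function (&$item) use (&$generators): void {
        if ($item instanceof \Generator) {
            $placeholder = '__generator_' . count($generators) . '__';
            $generators[$placeholder] = $item;
            $item = $placeholder;
        }
    });

    return [$structure, $generators];
}

$articles = (static function (): \Generator {
    yield ['id' => 1];
})();

[$structure, $generators] = extractGenerators([
    '_embedded' => ['articles' => $articles],
    'total' => 1,
]);
```

A real implementation would also have to deal with data strings that happen to equal a placeholder, which this sketch ignores.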

GromNaN reacted with thumbs up emoji

@alexander-schranz force-pushed the feature/streamed-json-response branch 2 times, most recently from 2480746 to 7d8700f October 24, 2022 19:08
@alexander-schranz force-pushed the feature/streamed-json-response branch from 3453946 to 3de6fc7 October 24, 2022 20:50
@OskarStark (Contributor)

I propose to add the content from the README of your prototype application to the PR header 👍🏻

OskarStark reacted with thumbs up emoji

@alexander-schranz (Contributor, Author)

@OskarStark added.

Think PR is blocked until 6.3 branch is created?

@OskarStark (Contributor)

> @OskarStark added.

thanks

> Think PR is blocked until 6.3 branch is created?

Yes

@dunglas (Member)

For the record, @mtarld, @soyuka and I are working on a new component that will be an alternative to json_encode/json_decode and to the Symfony Serializer, and that will natively support JSON streaming (for encoding and decoding). Maybe it will be possible to use this component in this PR.

jseparovic1 reacted with rocket emoji

@alexander-schranz (Contributor, Author)

@dunglas that sounds very interesting. I think I would stay with the current implementation for now. It gives a very low-resource solution without requiring the HttpFoundation package to depend on any kind of serializer. A serializer/normalizer can still be used inside the Generator already, and with the current implementation of this class that will also be very low on resource usage, as it doesn't try to serialize all objects at once, just one after the other, and so doesn't need to keep more than one object in memory as long as the ORM loading allows that.

@chalasr (Member)

Shall we move forward on this one?

@alexander-schranz force-pushed the feature/streamed-json-response branch from 626eafe to a3ee766 December 29, 2022 13:35
@alexander-schranz (Contributor, Author)

@chalasr rebased. Not sure what is open or required to get this merged :)

@chalasr force-pushed the feature/streamed-json-response branch from a3ee766 to ecc5355 December 29, 2022 13:44
@chalasr (Member)

Let's iterate, thanks @alexander-schranz!

alexander-schranz reacted with hooray emoji

@chalasr merged commit f43cd26 into symfony:6.3 Dec 29, 2022
@alexander-schranz deleted the feature/streamed-json-response branch December 29, 2022 13:48
@alexander-schranz (Contributor, Author)

🎉 Thank you all for the great feedback and ideas. I think we got a great solution out of it, with a better DX than I could have thought of when I created the pull request.

@chalasr that sounds great :)

chalasr and GromNaN reacted with heart emoji

@fabpot mentioned this pull request May 1, 2023

nicolas-grekas added a commit that referenced this pull request May 16, 2023

    …medJsonResponse (alexander-schranz)

    This PR was merged into the 6.3 branch.

    Discussion
    ----------

    [HttpFoundation] Fix problem with empty generator in StreamedJsonResponse

    | Q             | A
    | ------------- | ---
    | Branch?       | 6.3 (Feature `StreamedJsonResponse`: #47709)
    | Bug fix?      | yes
    | New feature?  | no
    | Deprecations? | no
    | Tickets       | Fix - was reported to me on Slack by `@norkunas`
    | License       | MIT
    | Doc PR        | symfony/symfony-docs#...

    Currently when the Generator is empty the return is invalid JSON which should not happen. So adding a testcase and a fix to the problem with the empty generator.

    Commits
    -------

    39bb6b6 Fix problem with empty generator in StreamedJsonResponse

javiereguiluz added a commit to symfony/symfony-docs that referenced this pull request Jun 6, 2023

    …onse` (alexander-schranz)

    This PR was squashed before being merged into the 6.3 branch.

    Discussion
    ----------

    [HttpFoundation] Add documentation for `StreamedJsonResponse`

    Docs for: symfony/symfony#47709

    # TODO

    - [x] Example of Flush Handling

    Commits
    -------

    8a285e3 [HttpFoundation] Add documentation for `StreamedJsonResponse`

fabpot added a commit that referenced this pull request Oct 1, 2023

    …medJsonResponse (Jeroeny)

    This PR was merged into the 6.4 branch.

    Discussion
    ----------

    [HttpFoundation] Support root-level Generator in StreamedJsonResponse

    | Q             | A
    | ------------- | ---
    | Branch?       | 6.4
    | Bug fix?      | no
    | New feature?  | yes
    | Deprecations? | no
    | License       | MIT

    Currently the `StreamedJsonResponse` only supports streaming nested Generators within an array data structure.
    However if a response is a list of items (for example database entities) on the root level, this isn't usable.
    I think both usecases can be supported with the change in this PR.

    The root level generator doesn't account for additional nested generators yet. I could add that by doing `is_array($item)` and the call the recursive placeholder logic.

    Link to first PR that introduced StreamedJsonResponse: #47709

    ~~Also something I noticed is I only got intermediate output, when adding a `flush()` call after each item has been echo'd (with a `sleep(1)` after each item to see it output the parts individually).~~ Edit: I see the class' PhpDoc describes this and it's probably expected to be done in userland implementations.

    Commits
    -------

    05e582f support root-level Generator in StreamedJsonResponse

symfony-splitter pushed a commit to symfony/http-foundation that referenced this pull request Oct 1, 2023

    …medJsonResponse (Jeroeny)

    This PR was merged into the 6.4 branch.

    Discussion
    ----------

    [HttpFoundation] Support root-level Generator in StreamedJsonResponse

    | Q             | A
    | ------------- | ---
    | Branch?       | 6.4
    | Bug fix?      | no
    | New feature?  | yes
    | Deprecations? | no
    | License       | MIT

    Currently the `StreamedJsonResponse` only supports streaming nested Generators within an array data structure.
    However if a response is a list of items (for example database entities) on the root level, this isn't usable.
    I think both usecases can be supported with the change in this PR.

    The root level generator doesn't account for additional nested generators yet. I could add that by doing `is_array($item)` and the call the recursive placeholder logic.

    Link to first PR that introduced StreamedJsonResponse: symfony/symfony#47709

    ~~Also something I noticed is I only got intermediate output, when adding a `flush()` call after each item has been echo'd (with a `sleep(1)` after each item to see it output the parts individually).~~ Edit: I see the class' PhpDoc describes this and it's probably expected to be done in userland implementations.

    Commits
    -------

    05e582f1a3 support root-level Generator in StreamedJsonResponse
Reviewers

@nicolas-grekas left review comments
@jderusse left review comments
@OskarStark left review comments
@derrabus left review comments
@GromNaN approved these changes
@chalasr approved these changes
@dunglas: awaiting requested review
@stof: awaiting requested review
@ro0NL left review comments
@HeahDude left review comments
@ibousfiha left review comments

Milestone

6.3

15 participants

@alexander-schranz @ro0NL @stof @fabpot @chalasr @nicolas-grekas @welcoMattic @OskarStark @dunglas @GromNaN @jderusse @derrabus @HeahDude @ibousfiha @carsonbot
