This repository was archived by the owner on Jan 28, 2025. It is now read-only.

Incremental static regeneration #1028

Merged
dphang merged 61 commits into master from feature/incremental-static-regeneration on May 19, 2021

Conversation

@kirkness (Collaborator) commented Apr 28, 2021 (edited)

Adds Incremental Static Regeneration!

Related issues: #804, #995

  • Add unit tests for regeneration logic
  • Test assumptions around how rollup.js bundles dynamic imports, as per @dphang's comment
  • Add e2e tests for ISR fallbacks
  • Set up deployment of infra via serverless
  • Add e2e tests
  • Make the code for ISR agnostic to the provider, as per @dphang's comments
  • Only pass what's needed to the queue, as per comment from @jvarho
  • Find a workaround for the max 300 SQS calls per second (using Lambda async invoke; see the sketch after this list)
  • Set up deployment of infra via CDK
  • Create a demo site for reviewers - https://d3kjdl6gtxbc5l.cloudfront.net/about
  • Update cache logic to handle changes to the revalidate property, as per @jvarho's comment
  • Load test the SQS queue at more than 300 requests in one second to test this assumption from @evangow - the assumption was correct, it's max 300 API calls per second; will find a workaround
  • Tidy up code by moving duplicated code (for example the page generation code) into functions/separate files
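For context, a minimal sketch of the async-invoke workaround named in the checklist above, assuming the AWS SDK v3 for JavaScript; the function name and payload shape are illustrative, not the PR's actual wiring:

import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({});

// InvocationType "Event" queues the invocation asynchronously and returns
// immediately, sidestepping the FIFO SQS limit of 300 API calls per second.
async function invokeRegenerationAsync(payload: object): Promise<void> {
  await lambda.send(
    new InvokeCommand({
      FunctionName: "regeneration-lambda", // illustrative name
      InvocationType: "Event",
      Payload: Buffer.from(JSON.stringify(payload)),
    })
  );
}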

The implementation looks like:

  • At "origin response" we check whether the response page is an ISR page (has anExpires header, or the original build page had a revalidate property)
  • We check whether theExpires header is in the future and if so apply a cache-control header such that the object is stored in the edge until its ready for revalidation.
  • If the Expires header is in the past then we trigger the page to be regenerated, whilst still returning the existing page with no-cache
  • To trigger a page rebuild we send a message to a FIFO SQS Queue, deduplicated using the identifier of the S3 object we are replacing so that we at most trigger one rebuild per page. FIFO SQS queues do come with a 300/second rate limit on all API Requests, which might be hit on very active sites, in which case we cache the page for an extra second to back-off requests.
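A minimal sketch of that origin-response flow, assuming Lambda@Edge types from aws-lambda; triggerRegeneration stands in for the queueing step (sketched further below) and is not the PR's actual function:

import type { CloudFrontResponseEvent, CloudFrontResponseResult } from "aws-lambda";

// Stand-in for the queueing step (the FIFO SQS send, sketched further below).
async function triggerRegeneration(s3Key: string): Promise<void> {}

function secondsUntil(expires: Date): number {
  return Math.max(0, Math.floor((expires.getTime() - Date.now()) / 1000));
}

export async function handleOriginResponse(
  event: CloudFrontResponseEvent
): Promise<CloudFrontResponseResult> {
  const { request, response } = event.Records[0].cf;
  const expiresHeader = response.headers["expires"]?.[0]?.value;
  if (!expiresHeader) return response; // not an ISR page

  const expires = new Date(expiresHeader);
  if (expires.getTime() > Date.now()) {
    // Fresh: cache at the edge until the page is due for revalidation.
    response.headers["cache-control"] = [
      { key: "Cache-Control", value: `public, s-maxage=${secondsUntil(expires)}` },
    ];
  } else {
    // Stale: serve the existing page uncached and kick off a rebuild.
    response.headers["cache-control"] = [
      { key: "Cache-Control", value: "public, max-age=0, s-maxage=0, must-revalidate" },
    ];
    await triggerRegeneration(request.uri);
  }
  return response;
}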

You can check out a demo of this feature here; check the headers and the response times to try to get your head around its behaviour. The code for this page looks like:

export async function getStaticProps() {
  return {
    revalidate: 60,
    props: {
      date: new Date().toLocaleString()
    }
  };
}

const AboutPage = (props: { date: string }) => (
  // ...
  <p>This is the about page, and the time is about {props.date}</p>
  // ...
);

export default AboutPage;

retryStrategy: await buildS3RetryStrategy()
});

const { req, res } = lambdaAtEdgeCompat(
Collaborator:

This looks to be just regular Lambda, so this should not be needed, right? You can just create a Node req, res in that case?

Collaborator (Author):

Correct! Good catch.

Collaborator (Author):

On second thought, the reason this is here is that the original CloudFront event is actually passed through the queue to this lambda (the cloudFrontEventRequest variable), so although this is a standard lambda we are still serving the original CloudFront event. The ideal change would be to standardize the request object throughout, perhaps using some of the principles in the serverless-http package. However, that might sit slightly outside the scope of this PR. Keen to know your thoughts, @dphang!

Collaborator:

Yea, makes sense for now. I think it would be good to have a generic event since in the future this can be used for regular Lambda traffic (and maybe other providers)

@evangow

I didn't see any handler for rate limit errors on the SQS queue in your code.

Thought about ISR quite a bit before. See my comment here: #804 (comment)

After I posted the comment above, I spent a while thinking about the edge cases I listed and the SQS queue.

The downside of using SQS to deduplicate requests is that you have to use a FIFO queue, which only supports 300 messages per second (300 send, receive, or delete operations per second) without batching.

The example I had in my head when thinking about this issue was Reddit. They could benefit a great deal from ISR on /r/popular (as an example), because the content can be stale for 60 seconds and it wouldn't really matter.

The second that the page expires though, there would be 10,000s (100Ks? 1Ms?) of requests to regenerate /r/popular.

The SQS queue would throw a rate limit error on any of those requests past the first 300 and all of the other pages requesting revalidation would also run into a rate limit error.

To handle the rate limit, you could have all initial requests go to a non-FIFO SQS queue, then have that queue send the messages to the FIFO queue.

You would also be able to batch the requests from the initial queue and send them over to the FIFO SQS queue. By batching the requests, you can process 3,000 requests/second (300 message limit × 10 messages/batch).

If the FIFO queue returns a rate limit error, then you can just leave those in the initial queue and retry later.

In other words:
Queue 1) Non-FIFO. Receives all regeneration requests. Batches them 10 at a time to send to Queue 2. Retries if Queue 2 returns a rate limit error.
Queue 2) FIFO. Deduplicates messages received from Queue 1 and calls the page regeneration handler. (A sketch of this forwarding step follows this list.)
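A rough sketch of that Queue 1 → Queue 2 forwarding, assuming the AWS SDK v3; the queue URL and the message body carrying an s3Key are assumptions about shape, not the PR's actual wiring:

import { SQSClient, SendMessageBatchCommand } from "@aws-sdk/client-sqs";
import type { SQSEvent } from "aws-lambda";

const sqs = new SQSClient({});
const FIFO_QUEUE_URL = process.env.FIFO_QUEUE_URL!; // illustrative

// Triggered by the non-FIFO queue; forwards up to 10 messages per
// SendMessageBatch call, so 300 API calls/second can move ~3,000 messages.
export async function forwardToFifo(event: SQSEvent): Promise<void> {
  await sqs.send(
    new SendMessageBatchCommand({
      QueueUrl: FIFO_QUEUE_URL,
      Entries: event.Records.slice(0, 10).map((record, i) => ({
        Id: String(i),
        MessageBody: record.body,
        MessageGroupId: "page-regeneration",
        // Deduplicate on the page being rebuilt (assumed message shape).
        MessageDeduplicationId: JSON.parse(record.body).s3Key,
      })),
    })
  );
  // If SQS throttles, this handler throws and the messages remain on
  // Queue 1 to be retried, as described above.
}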

Ultimately, I think adding a FIFO queue makes things too complicated for deduplicating requests due to the rate limit.

I would just regenerate the page via an async lambda, and if it gets regenerated a whole bunch of times because a page is popular, you're just spending some additional money to handle those requests.

Those pages are probably using SSR right now anyway, where every request already regenerates the page, so ISR is already going to be a money saver.

If you just use an async lambda, you'll save a little bit less money but you can avoid the complexity that SQS would involve to do it The Right Way.

I could probably be convinced either way is good, but I think just 1 FIFO queue might be an issue.

Unrelated: s-maxage = initialRevalidateSeconds - createdAgo. Isn't s-maxage supposed to be how long the page should be cached before revalidation? In other words, s-maxage = Date.now() + initialRevalidateSeconds or something? I'm not super familiar with the various expiration headers, so ignore me if I'm just wrong here.

@kirkness (Collaborator, Author)

Hey @evangow, cheers for the message. I've not investigated this in depth yet; however, I have implemented the queue with FIFO + message deduplication, so that the receiving lambda will be invoked at most once each time a page needs to be regenerated. What I don't know is whether the queue's rate limit is depleted even when a message is discarded as a duplicate.
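For illustration, a sketch of that deduplication, assuming the AWS SDK v3 (the queue URL and message shape are assumptions): using the S3 key of the page being replaced as the MessageDeduplicationId means SQS drops repeat sends within its five-minute deduplication window, so at most one rebuild fires per page.

import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";

const sqs = new SQSClient({});

async function queueRegeneration(s3Key: string, queueUrl: string): Promise<void> {
  await sqs.send(
    new SendMessageCommand({
      QueueUrl: queueUrl, // must be a .fifo queue
      MessageBody: JSON.stringify({ s3Key }),
      MessageGroupId: s3Key,
      // Duplicate sends for the same page within the 5-minute window are dropped.
      MessageDeduplicationId: s3Key,
    })
  );
}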

@evangow

I didn't see anything in the docs about that.

If it's the case that only the 1st instance counts against the API usage limits, then that would be awesome.

I assumed the docs to mean that any sqs.send() call would count against your usage even if the message is deduplicated.

@jvarho (Collaborator) commented Apr 29, 2021 (edited)

Nice 👍

I also started working on this yesterday, but your efforts are clearly much further along.

However, I don't think your cache-control logic in default-handler will work. It isn't sufficient to just check the prerender manifest for revalidate times, since 1) dynamic pages won't be there and 2) revalidate is returned from getStaticProps and can differ from one render to the next. Instead, I think you need to store and check when the page will actually expire.

Here's my series up to the point where actual revalidation logic would be needed: master...jvarho:isr-wip
Feel free to use (or not) as you like. (Edit: fixed an issue in the topmost commit's tests.)

@kirkness (Collaborator, Author)

> Nice 👍
>
> I also started working on this yesterday, but your efforts are clearly much further along.
>
> However, I don't think your cache-control logic in default-handler will work. It isn't sufficient to just check the prerender manifest for revalidate times, since 1) dynamic pages won't be there and 2) revalidate is returned from getStaticProps and can differ from one render to the next. Instead, I think you need to store and check when the page will actually expire.
>
> Here's my series up to the point where actual revalidation logic would be needed: master...jvarho:isr-wip
> Feel free to use (or not) as you like. (Edit: fixed an issue in the topmost commit's tests.)

Thanks, @jvarho! I hadn't considered these cases, and I think your approach of using the Expires header (based on what I can see in the AWS docs) would work a treat! I'll have a go at implementing it and see if I can push a demo of it working (will add tests eventually as well).

@jvarho (Collaborator)

> Thanks, @jvarho! I hadn't considered these cases, and I think your approach of using the Expires header (based on what I can see in the AWS docs) would work a treat! I'll have a go at implementing it and see if I can push a demo of it working (will add tests eventually as well).

The way I used it, the Expires header is only used to store the time in S3. It is the Cache-Control I set here that actually matters for CloudFront: master...jvarho:isr-wip#diff-f4709477772e5badfaaf543f6104de4b4ab3bf8300a056c088f84c184d800288R828

The Expires header could – and probably should – be removed at that point.

(The reason to use Cache-Control is of course the distinction between Cloudfront and browser caching.)
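A sketch of that split, assuming the AWS SDK v3 (bucket, key, and helper name are illustrative, not jvarho's actual code): the regeneration step stores the expiry on the S3 object itself, while Cache-Control is what CloudFront honours.

import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({});

async function storeRegeneratedPage(html: string, revalidateSeconds: number): Promise<void> {
  await s3.send(
    new PutObjectCommand({
      Bucket: "my-static-site",        // illustrative
      Key: "static-pages/about.html",  // illustrative
      Body: html,
      ContentType: "text/html",
      // Read back by the origin-response handler to decide staleness.
      Expires: new Date(Date.now() + revalidateSeconds * 1000),
      // What CloudFront actually uses for edge caching.
      CacheControl: `public, max-age=0, s-maxage=${revalidateSeconds}, must-revalidate`,
    })
  );
}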

@kirkness (Collaborator, Author)

> Thanks, @jvarho! I hadn't considered these cases, and I think your approach of using the Expires header (based on what I can see in the AWS docs) would work a treat! I'll have a go at implementing it and see if I can push a demo of it working (will add tests eventually as well).
>
> The way I used it, the Expires header is only used to store the time in S3. It is the Cache-Control I set here that actually matters for CloudFront: master...jvarho:isr-wip#diff-f4709477772e5badfaaf543f6104de4b4ab3bf8300a056c088f84c184d800288R828
>
> The Expires header could – and probably should – be removed at that point.
>
> (The reason to use Cache-Control is of course the distinction between CloudFront and browser caching.)

Ok, so you set the Expires header when creating the file in S3, and then use that to define the s-maxage header, at which point in the Origin Response the Expires header can be removed?

@jvarho (Collaborator)

> Ok, so you set the Expires header when creating the file in S3, and then use that to define the s-maxage header, at which point in the Origin Response the Expires header can be removed?

Exactly 👍

@kirkness (Collaborator, Author)

> Ok, so you set the Expires header when creating the file in S3, and then use that to define the s-maxage header, at which point in the Origin Response the Expires header can be removed?
>
> Exactly 👍

Added this in 0ee2f55 and fc8bd38.

@adamdotdevin

(reaction GIF: tumblr_ljh0puClWT1qfkt17)


@jvarho (Collaborator)

> Ok, so you set the Expires header when creating the file in S3, and then use that to define the s-maxage header, at which point in the Origin Response the Expires header can be removed?
>
> Exactly 👍
>
> Added this in 0ee2f55 and fc8bd38.

That looks correct for the regeneration path. However, you still need something like my changes in jvarho@bd988a4 and jvarho@4423a47 to get prerendered and fallback-rendered pages to expire, do you not? (I think you could just cherry-pick those two commits.)

@kirkness (Collaborator, Author)

> Ok, so you set the Expires header when creating the file in S3, and then use that to define the s-maxage header, at which point in the Origin Response the Expires header can be removed?
>
> Exactly 👍
>
> Added this in 0ee2f55 and fc8bd38.
>
> That looks correct for the regeneration path. However, you still need something like my changes in jvarho@bd988a4 and jvarho@4423a47 to get prerendered and fallback-rendered pages to expire, do you not? (I think you could just cherry-pick those two commits.)

You are probably right, I've not looked at those behaviours yet, but if I can just cherry-pick then that's awesome!

@kirkness (Collaborator, Author)

> Like I wrote elsewhere, dynamic pages do not have initialRevalidateSeconds in the manifest. The only way to know that they should be revalidated is to store that information in S3 – i.e. the Expires header you now use.

Update: 8a0d5ff

Frustratingly, there is no simple way of setting the Expires header per file when uploading the initial build via CDK (it is obviously possible via the serverless component). With the CDK construct, we can only set headers for an entire directory. Because of this, I've updated the default-handler (logic moved into getStaticRegenerationResponse) to check either for an Expires header or for an initialRevalidateSeconds. This should result in new pages being regenerated when needed and initial pages being regenerated; however, the narrow case you've just mentioned, where you might convert an ISR page to a static page, will not be handled. Unless you can think of another edge case that may not be fulfilled, just this one feels acceptable until we've figured out a method of setting Expires per file in CDK.
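A minimal sketch of that dual check (hedged: the PR's actual getStaticRegenerationResponse likely differs in detail, and the option names here are illustrative):

interface RegenerationCheckOptions {
  expiresHeader?: string;            // Expires of the S3 object, set on regeneration
  lastModifiedHeader?: string;       // Last-Modified of the S3 object
  initialRevalidateSeconds?: number; // from the prerender manifest, for initial builds
}

function getStaticRegenerationResponse(opts: RegenerationCheckOptions) {
  // Prefer the per-object Expires header written on regeneration...
  if (opts.expiresHeader) {
    const secondsRemaining = Math.max(
      0,
      Math.floor((new Date(opts.expiresHeader).getTime() - Date.now()) / 1000)
    );
    return { isStale: secondsRemaining === 0, secondsRemaining };
  }
  // ...falling back to the build-time revalidate value measured against
  // Last-Modified, since CDK cannot set Expires per file at the initial upload.
  if (opts.initialRevalidateSeconds !== undefined && opts.lastModifiedHeader) {
    const age = Math.floor(
      (Date.now() - new Date(opts.lastModifiedHeader).getTime()) / 1000
    );
    const secondsRemaining = Math.max(0, opts.initialRevalidateSeconds - age);
    return { isStale: secondsRemaining === 0, secondsRemaining };
  }
  return null; // not an ISR page
}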


@kirkness (Collaborator, Author)

> @kirkness updated the policy, I think you need to add sqs.SendMessage permission to the default policy here?
>
> Also 1/3 of the new ISR tests passed, not sure if they are flaky but AFAIK there are no more permissions issues

Sweet, thanks for checking, odd that it worked locally! I'll add this today along with making sure these tests aren't too flakey!

@kirkness (Collaborator, Author)

> @kirkness updated the policy, I think you need to add sqs.SendMessage permission to the default policy here?
>
> Also 1/3 of the new ISR tests passed, not sure if they are flaky but AFAIK there are no more permissions issues

That permission appears to already exist on the default lambda's policy:

{
  Effect: "Allow",
  Resource: queue.arn,
  Action: ["sqs:SendMessage"]
}

Perhaps the policy isn't getting updated?

@dphang (Collaborator)

> @kirkness updated the policy, I think you need to add sqs.SendMessage permission to the default policy here?
>
> Also 1/3 of the new ISR tests passed, not sure if they are flaky but AFAIK there are no more permissions issues
>
> That permission appears to already exist on the default lambda's policy:
>
> {
>   Effect: "Allow",
>   Resource: queue.arn,
>   Action: ["sqs:SendMessage"]
> }
>
> Perhaps the policy isn't getting updated?

Yup, thanks! I think the policy might not be getting updated after it's already deployed... but I think it should be ok to ask users to update it.

@kirkness (Collaborator, Author) commented May 18, 2021 (edited)

@dphang - the e2e's are killing me slowly 😂... It looks like the issue is caused by the shared libs not reliably being built, due to yarn failing to install (which I understand is a known issue). I've gone through each of the tests locally and each app/test suite passes.

I'd be fairly keen to work on this issue in a follow-up PR. My suspicion is that the issue is related to the postinstall script running yarn install in parallel (perhaps trying to create the same file/dir at the same time). So perhaps shifting to using workspaces would be beneficial? Has this been looked at before?

Edit: Looks like @jvarho's PR will help!

@dphang (Collaborator) commented May 18, 2021 (edited)

> @dphang - the e2e's are killing me slowly 😂... It looks like the issue is caused by the shared libs not reliably being built, due to yarn failing to install (which I understand is a known issue). I've gone through each of the tests locally and each app/test suite passes.
>
> I'd be fairly keen to work on this issue in a follow-up PR. My suspicion is that the issue is related to the postinstall script running yarn install in parallel (perhaps trying to create the same file/dir at the same time). So perhaps shifting to using workspaces would be beneficial? Has this been looked at before?
>
> Edit: Looks like @jvarho's PR will help!

Yep, hopefully it helps.

I do see that the code storage limit is reached (75 GB); I'll add a script to clean up old Lambdas, which can run in the finalize step. Hopefully we can get this checked in today, just wanting to make sure that the e2e tests are relatively stable at this point.
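A hedged sketch of such a cleanup, assuming the AWS SDK v3 (pagination elided; versions still referenced by an alias would need to be skipped): list a function's numbered versions and delete everything but $LATEST.

import {
  LambdaClient,
  ListVersionsByFunctionCommand,
  DeleteFunctionCommand,
} from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({});

async function deleteOldVersions(functionName: string): Promise<void> {
  const { Versions = [] } = await lambda.send(
    new ListVersionsByFunctionCommand({ FunctionName: functionName })
  );
  for (const version of Versions) {
    if (version.Version && version.Version !== "$LATEST") {
      await lambda.send(
        new DeleteFunctionCommand({
          FunctionName: functionName,
          Qualifier: version.Version, // deletes only this numbered version
        })
      );
    }
  }
}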

@raffij (Contributor)

Looking at this I learnt about versionFunctions: false in the serverless.yml, which is useful for our deploys, but possibly not for this instance.

"lambda:CreateEventSourceMapping",
"iam:UpdateAssumeRolePolicy",
"iam:DeleteRolePolicy",
"sqs:*"
Collaborator:

For better security, we should update this to just the SQS permissions needed, but that's not blocking for this PR.
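For example, the sqs:* wildcard might be narrowed to something like the following; the exact set of actions the deployment needs is an assumption here, not verified against the component:

{
  Effect: "Allow",
  Resource: "arn:aws:sqs:*:*:*", // ideally scoped further to the app's queues
  Action: [
    "sqs:CreateQueue",
    "sqs:DeleteQueue",
    "sqs:GetQueueAttributes",
    "sqs:SetQueueAttributes",
    "sqs:TagQueue"
  ]
}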

@dphang (Collaborator)

I think so far it looks good to me, really great work on this! If you have any improvements, like updating the README on how to use this feature (e.g. all the SQS permissions needed), improving tests, or cleaning up more of the code, please feel free to add them in future PRs.


@dphang merged commit d5bbdc6 into master on May 19, 2021
@delete-merged-branch bot deleted the feature/incremental-static-regeneration branch on May 19, 2021 03:41
@amman20

Awesome work, guys! When are we expecting this to be available in the latest stable version?

@leerob

Nice work, everyone! 👏


Reviewers

@jvarho left review comments

@dphang approved these changes


9 participants
@kirkness, @evangow, @jvarho, @adamdotdevin, @mtannerfors, @dphang, @raffij, @amman20, @leerob
