feat: optimize frame layout for tail-call-only functions #11608


Draft
pnodet wants to merge 1 commit into bytecodealliance:main from pnodet:pnodet-11

Conversation

@pnodet
Contributor

Reduce frame size from 16 to 8 bytes for functions that only make tail calls (FunctionCalls::TailOnly). This optimization:

- Uses single register operations (str/ldr fp) instead of pair operations (stp/ldp fp, lr)
- Applies when no other frame requirements exist (no frame pointers, stack args, etc.)
- Is instruction-based: functions containing only return_call instructions get optimized
- Maintains ABI compatibility and includes comprehensive test coverage
@pnodet
Contributor, Author

@cfallin What do you think of something like this? I have only looked into aarch64 for the moment, since other ISAs such as x64 and s390x look quite different and more complex to implement.

@cfallin
Member

Unfortunately I don't think this is going to work: the stack pointer has to be 16-byte aligned, and aarch64 will actually trap if memory accesses occur with a misaligned SP.
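To make the alignment arithmetic concrete, here is a minimal Rust sketch. The constant and helper names are hypothetical illustrations and are not taken from Cranelift; the only fact assumed is the 16-byte SP alignment requirement stated above.

```rust
/// Required stack pointer alignment on AArch64, in bytes.
/// (Hypothetical constant name, for illustration only.)
const SP_ALIGN: u64 = 16;

/// Returns true if subtracting `frame_size` from an already aligned SP
/// leaves SP aligned to `SP_ALIGN`.
fn sp_stays_aligned(frame_size: u64) -> bool {
    frame_size % SP_ALIGN == 0
}

/// Rounds a frame size up to the next multiple of `SP_ALIGN`.
fn align_up(frame_size: u64) -> u64 {
    (frame_size + SP_ALIGN - 1) & !(SP_ALIGN - 1)
}
```

Under this sketch, `sp_stays_aligned(8)` is false while `sp_stays_aligned(0)` and `sp_stays_aligned(16)` are true, and `align_up(8)` gives 16. An 8-byte setup area therefore either misaligns SP or gets padded back to 16 bytes, so it saves no stack space.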

Furthermore, the savings I would expect are not "only push FP, not LR", but "don't push anything at all if the frame is zero-size". This should be the case for tail-calling functions with no stack storage (spillslots, stackslots or clobbers) and no outgoing argument space.

@pnodet
Contributor, Author

Don't debuggers rely on frame pointers for stack traces? Could setting the frame size to 0 hurt debugging/unwinding?

@bjorn3
Contributor

Debuggers and profilers should already handle missing stack frames for leaf functions. Besides, debuggers generally use .eh_frame for stack unwinding, falling back to frame pointers only when .eh_frame is not available.

@cfallin
Member

Right -- we already omit frame pointers for functions that are truly leaf functions (no calls at all, with no frame storage); this is a common optimization.

In Wasmtime, where we use our own stack-walking logic and unwinder and want simplicity/robustness, we configure Cranelift never to omit frame pointers; so this optimization largely applies to other uses of Cranelift, like bjorn3's cg_clif.

@pnodet
Contributor, Author

Then could it be safe to have something like this?

        // Compute linkage frame size.
        let setup_area_size = if flags.preserve_frame_pointers()
            // The function arguments that are passed on the stack are addressed
            // relative to the Frame Pointer.
            || flags.unwind_info()
            || incoming_args_size > 0
            || clobber_size > 0
            || fixed_frame_storage_size > 0
        {
            16 // FP, LR
        } else {
            match function_calls {
                FunctionCalls::Regular => 16,
                FunctionCalls::None => 0,
-               FunctionCalls::TailOnly => 8,
+               FunctionCalls::TailOnly => 0,
            }
        };

@cfallin
Member

I think you'll want to check the tail args and outgoing args size as well (the other parameters to compute_frame_layout) -- basically, if any part of the frame needs to exist, then we need to do the FP setup even if we only have tail calls.
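One way to read this suggestion is sketched below as a standalone Rust function. This is not Cranelift's actual code: the `FrameInputs` struct is hypothetical, and the fields `tail_args_size` and `outgoing_args_size` are assumed names for the "tail args and outgoing args size" parameters mentioned above. The remaining names mirror the snippet earlier in the thread.

```rust
/// Kinds of calls a function makes (mirrors the enum named in this thread).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FunctionCalls {
    None,
    TailOnly,
    Regular,
}

/// Hypothetical bundle of the inputs discussed in this thread.
#[derive(Clone, Copy, Debug)]
struct FrameInputs {
    preserve_frame_pointers: bool,
    unwind_info: bool,
    incoming_args_size: u32,
    tail_args_size: u32,     // assumed parameter name
    clobber_size: u32,
    fixed_frame_storage_size: u32,
    outgoing_args_size: u32, // assumed parameter name
    function_calls: FunctionCalls,
}

/// Size in bytes of the FP/LR setup area: 16 if any part of the frame
/// must exist or the function makes regular calls, otherwise 0.
fn setup_area_size(f: &FrameInputs) -> u32 {
    let frame_needed = f.preserve_frame_pointers
        || f.unwind_info
        || f.incoming_args_size > 0
        || f.tail_args_size > 0
        || f.clobber_size > 0
        || f.fixed_frame_storage_size > 0
        || f.outgoing_args_size > 0;

    if frame_needed {
        return 16; // FP, LR
    }
    match f.function_calls {
        FunctionCalls::Regular => 16,
        // No calls, or only tail calls, and no frame contents: skip setup.
        FunctionCalls::None | FunctionCalls::TailOnly => 0,
    }
}
```

With every size at zero and both flags off, a `TailOnly` function gets a setup area of 0 in this sketch. Setting any single size (for example `outgoing_args_size: 16`) brings it back to 16, which matches the "if any part of the frame needs to exist" rule.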

@github-actions bot added the labels cranelift (Issues related to the Cranelift code generator) and cranelift:area:aarch64 (Issues related to AArch64 backend) on Sep 4, 2025