Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-113655: Lower the C recursion limit for some platforms#124264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
vstinner merged 1 commit intopython:mainfromthesamesam:stack-size-endian
Sep 23, 2024

Conversation

@thesamesam
Copy link
Contributor

@thesamesamthesamesam commentedSep 20, 2024
edited
Loading

Lower the C recursion limit for HPPA, PPC64 and SPARC, as they use
relatively large stack frames that cause e.g.test_descr to hit
a stack overflow. According to quick testing, it seems that values
around 8000 are max for HPPA and PPC64 (ELFv1 ABI) and 7000 for SPARC64.
To keep things safe, let's use 5000 for PPC64 and 4000 for SPARC.

Co-authored-by: Sam Jamessam@gentoo.org

// higher stack memory usage than a release build: use a lower limit.
# definePy_C_RECURSION_LIMIT 500
#elif defined(__s390x__)
#elif defined(__BIG_ENDIAN__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Please add a comment explaining why big endian has a lower limit, withgh-113655 reference.

@vstinner
Copy link
Member

We currently reduce the C recursion limit just for s390x (which is big-endian), but in Gentoo, we hit the same issues on ppc64be, hppa, and sparc.
At least for ppc64, it seems like the ABI used on ppc64be means you end up with bigger stack frames.

That's strange that all big endian archs allocates more memory on the stack. What can explain that only big endian archs are affected?

@mgorny
Copy link
Contributor

I think it's a coincidence, and the logic is more related to the fact that big endian architectures aren't tested much. Perhaps the more appropriate logic would be to have a "more conservative" value in the default branch, and make it higher for platforms that we know are safe.

@mgorny
Copy link
Contributor

Ok, so I've played a bit with godbolt and got the following numbers for a dummy function with no args or variables:

  • ARM (32-bit): 8 bytes (BE/LE)
  • AArch64: 16 bytes (BE/LE)
  • x86: 8 bytes (+ 4 bytes in 32-bit programs)
  • LoongArch: 16 bytes
  • MIPS: 24/32 bytes (depending on ABI, BE/LE)
  • ppc64 (BE): 128 bytes
  • ppc64le: 112 bytes
  • ppc32: 16 bytes
  • riscv: 16 bytes (32/64-bit)
  • s390x: 160 bytes
  • s390: 96 bytes
  • SPARC: 96 bytes
  • SPARC64: 176 bytes

So, looks like ppc64/s390/SPARC are the platforms with unusually large stack frames. However, I don't see any obvious relation between byte order and stack frame size here. ppc64le's is only a little smaller than ppc64 BE, so I have no clue why we're hitting the crash with one but not the other — but maybe it just happened to be on the border. On all other biendian arches, stack frame seems to be of the same size on both variants.


I've used the following code:

voidb() {}voida() {b();}

Tested with various compilers, compiler options (-mbig-endian,-mlittle-endian,-m32…). Enabled "Stack Usage" via "Add new…"

@vstinner
Copy link
Member

So, looks like ppc64/s390/SPARC are the platforms with unusually large stack frames

Would it be possible to only use lower limit on these 3 archs? Rather than testing "big endian".

@mgorny
Copy link
Contributor

Yeah, I suppose so.@thesamesam also mentioned HPPA, but godbolt doesn't cover that.

// higher stack memory usage than a release build: use a lower limit.
# definePy_C_RECURSION_LIMIT 500
#elif defined(__s390x__)
#elif defined(__BIG_ENDIAN__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
#elif defined(__BIG_ENDIAN__)
#elif defined(__hppa__) || defined(__powerpc64__) || defined(__s390__) || defined(__sparc__)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Or we could try to use different values for different arches, given that ppc64 could probably use a higher value. Probably worth double-checking SPARC64, as it has even bigger frames than s390x.

@thesamesam
Copy link
ContributorAuthor

That's strange that all big endian archs allocates more memory on the stack. What can explain that only big endian archs are affected?

I saw this question just before I went to bed and my assumption (which I wanted to prove/disprove today) was that it was an unfortunate artefact of ABI design and that if we ended up looking at the ABIs for newer (and therefore LE) arches, we'd end up seeing some notable differences.

But mgorny (thank you!) beat me to it and I think his analysis is right. I'll look a bit more then update this PR. Thanks again both.

@mgorny
Copy link
Contributor

Ok, now for some real hardware testing of whentest_descr starts crashing. The highest recursion limits I was able to use (with regular build frommain, gcc, default optimization level, GNU/Linux):

  • amd64/x86/aarch64/ppc32: 32k
  • ppc64 (be): 9k (8k with PGO + forced-O2)
  • ppc64le: 11k
  • s390x: 12k (7k with PGO +-O2)
  • sparc64: 7k (also with PGO +-O2)
  • pa-risc: over 10k (host too slow to test more)
  • alpha: over 10k (host too slow to test more)

It looks like amd64/x86/etc. are hitting some other issue that looks like a 16-bit signed counter. The values I've gotten for PPC64 suggests that LE was on the edge already. Surprisingly, my value for s390x is much higher than the one currently used — FWICS it was lowered due to buildbots failing.

Honestly, I don't know what values to put here.

@mgorny
Copy link
Contributor

@vstinner, with Sam's permission, I have pushed a commit that lowers the limits on PPC64 and SPARC instead.

Lower the C recursion limit for HPPA, PPC64 and SPARC, as they userelatively large stack frames that cause e.g. `test_descr` to hita stack overflow.  According to quick testing, it seems that valuesaround 8000 are max for HPPA and PPC64 (ELFv1 ABI) and 7000 for SPARC64.To keep things safe, let's use 5000 for PPC64 and 4000 for SPARC.Co-authored-by: Sam James <sam@gentoo.org>
@thesamesam
Copy link
ContributorAuthor

Thanks! Happy now.

@vstinner Can you take another look please?

@vstinner
Copy link
Member

Merged, thanks. The final commit is way better than the first iteration only checking for big endian ;-)

thesamesam reacted with laugh emoji

@thesamesamthesamesam deleted the stack-size-endian branchSeptember 23, 2024 07:04
@thesamesamthesamesam changed the titleGH-113655: Lower the C recursion limit for all big-endian platformsGH-113655: Lower the C recursion limit for some platformsSep 23, 2024
@mgorny
Copy link
Contributor

Thanks a lot!

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@ericsnowcurrentlyericsnowcurrentlyAwaiting requested review from ericsnowcurrentlyericsnowcurrently is a code owner

@vstinnervstinnerAwaiting requested review from vstinner

1 more reviewer

@mgornymgornymgorny left review comments

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Labels

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@thesamesam@vstinner@mgorny

[8]ページ先頭

©2009-2026 Movatter.jp