Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

[asan] Fixunknown-crash being reported for multi-byte errors, and incorrect memory access addresses being reported#144480

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
wxwern wants to merge3 commits intollvm:main
base:main
Choose a base branch
Loading
fromwxwern:fix-unknown-crash-desc-for-multi-byte-err

Conversation

wxwern
Copy link

@wxwernwxwern commentedJun 17, 2025
edited
Loading

This comprises of a fix for two intertwined bugs in ASan. The two changes would need to be simultaneously merged to not break any functionality.


unknown-crash reported for multi-byte errors

Given that a reported error by ASan spans multiple bytes, ASan may flag the error as anunknown-crash instead of the appropriate error name.

This error can be reproduced via a partial buffer overflow (on GCC, not Clang*), which reportsunknown-crash instead ofstack-buffer-overflow for the below:

https://godbolt.org/z/abrjrvnzj

# minimal reprod (should occur on gcc-7 - gcc-15, x86_64)## gcc -fsanitize=address reprod.cstruct X {    char bytes[16];};__attribute__((noinline)) struct X out_of_bounds() {    volatile char bytes[16];    struct X* x_ptr = (struct X*)(bytes + 2);    return *x_ptr;}int main() {    struct X x = out_of_bounds();    return x.bytes[0];}

This is due to a flawed heuristic inasan_errors.cpp, which won't always locate the appropriate shadow byte that would indicate a corresponding error. This can happen for any reported errors which span either:

  • exactly 8 bytes, or
  • 16 and more bytes.

Reproducibility on Clang

The above example doesn't reproduce the issue on Clang due to another bug* masking this one. Specifically:

  • GCC-compiled binaries report the starting address and size of the failing read attempt to ASan.

  • Clang-compiled binaries use__asan_memcpy, which directly highlights the first byte access that overflows the buffer to ASan. This thus coincidentally allows the heuristic to always work. This appears to be an incorrect interpretation.

In order to replicate this bug on Clang (so that we can do tests), another bug in ASan must first be fixed, as below:


Incorrect reported address inACCESS_MEMORY_RANGE

ACCESS_MEMORY_RANGE defined inasan_interceptors_memintrinsics.h reports the poisoned address (__bad) instead of the memory access start address (__offset) toReportGenericError. (link).

We can determine that the latter (reporting__offset) should be the intended interpretation, as most error descriptions are decided by treating the givenaddr as a start address. For example, see:PrintAccessAndVarIntersection inasan_descriptions.cpp - it usesaddr andaccess_size to determine whether a variable access overflows/underflows/etc. (link).

GCC also uses the latter interpretation, as mentioned above.

Existing tests previously assumed and check for the former incorrect interpretation. Corrections are made to update their output checks.

Performing this fix will result in theunknown-crash bug visible in GCC-compiled binaries to surface on Clang ones as well.

@github-actionsGitHub Actions
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using@ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by theLLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on theLLVM Discord or on theforums.

@wxwernwxwern changed the title[libasan] Fixunknown-crash reported for multi-byte errors[asan] Fixunknown-crash reported for multi-byte errorsJun 17, 2025
@wxwernwxwernforce-pushed thefix-unknown-crash-desc-for-multi-byte-err branch from0a9f98e to4a46b7aCompareJune 17, 2025 09:06
@wxwernwxwern marked this pull request as ready for reviewJune 18, 2025 02:23
@llvmbot
Copy link
Member

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Wern (wxwern)

Changes

Given that a reported error by asan spans multiple bytes, asan may flag the error as anunknown-crash instead of the appropriate error name.

This error can be reproduced via a partial buffer overflow (on gcc), which reportsunknown-crash instead ofstack-buffer-overflow for the below:

https://godbolt.org/z/abrjrvnzj

# minimal reprod (should occur on gcc-7 - gcc-15)## gcc -fsanitize=address reprod.cstruct X {    char bytes[16];};__attribute__((noinline)) struct X out_of_bounds() {    volatile char bytes[16];    struct X* x_ptr = (struct X*)(bytes + 2);    return *x_ptr;}int main() {    struct X x = out_of_bounds();    return x.bytes[0];}

This is due to a flawed heuristic inasan_errors.cpp, which won't always locate the appropriate shadow byte that would indicate a corresponding error. This can happen for any reported errors which span either:

  • exactly 8 bytes, or
  • 16 and more bytes.

The above example doesn't reproduce the issue on clang as it reports errors via different pathways:

  • gcc-compiled binaries report the starting address and size of the failing read attempt to asan.

  • clang-compiled binaries highlight the first byte access that overflows the buffer to asan.

    Note: out-of-scope, but this is also possibly misleading, as it still reports the full size of the read attempt, paired with an address that's not the start of the read.

This behavior appears to be identical for all past versions tested. I'm not aware of a way to replicate this specific issue with clang, though it might have impacted error reporting in other areas.

This patch resolves this issue via a linear scan of applicable shadow bytes (instead of the original heuristic, which, at best, only increments the shadow byte address by 1 for these scenarios).


Full diff:https://github.com/llvm/llvm-project/pull/144480.diff

1 Files Affected:

  • (modified) compiler-rt/lib/asan/asan_errors.cpp (+5-2)
diff --git a/compiler-rt/lib/asan/asan_errors.cpp b/compiler-rt/lib/asan/asan_errors.cppindex 2a207cd06ccac..9e109c0895589 100644--- a/compiler-rt/lib/asan/asan_errors.cpp+++ b/compiler-rt/lib/asan/asan_errors.cpp@@ -437,8 +437,11 @@ ErrorGeneric::ErrorGeneric(u32 tid, uptr pc_, uptr bp_, uptr sp_, uptr addr,     bug_descr = "unknown-crash";     if (AddrIsInMem(addr)) {       u8 *shadow_addr = (u8 *)MemToShadow(addr);-      // If we are accessing 16 bytes, look at the second shadow byte.-      if (*shadow_addr == 0 && access_size > ASAN_SHADOW_GRANULARITY)+      u8 *shadow_addr_upper_bound =+          shadow_addr + (1 + ((access_size - 1) / ASAN_SHADOW_GRANULARITY));+      // If the read could span multiple shadow bytes,+      // do a sequential scan and look for the first bad shadow byte.+      while (*shadow_addr == 0 && shadow_addr < shadow_addr_upper_bound)         shadow_addr++;       // If we are in the partial right redzone, look at the next shadow byte.       if (*shadow_addr > 0 && *shadow_addr < 128) shadow_addr++;

@wxwernwxwernforce-pushed thefix-unknown-crash-desc-for-multi-byte-err branch from4a46b7a to69f24b6CompareJune 18, 2025 02:26
@vitalybukavitalybuka self-requested a reviewJune 18, 2025 02:37
@vitalybuka
Copy link
Collaborator

We need a test for that

@vitalybuka
Copy link
Collaborator

The above example doesn't reproduce the issue on clang as it reports errors via different pathways:

You can probably trigger that path through ACCESS_MEMORY_RANGE and INTERCEPTORs?

@wxwernwxwernforce-pushed thefix-unknown-crash-desc-for-multi-byte-err branch from69f24b6 tof39a0feCompareJune 18, 2025 03:03
@wxwern
Copy link
Author

wxwern commentedJun 18, 2025
edited
Loading

We need a test for that

The above example doesn't reproduce the issue on clang as it reports errors via different pathways:

You can probably trigger that path through ACCESS_MEMORY_RANGE and INTERCEPTORs?

Thanks, will look into it. I've not written tests yet as I haven't found a way to reproduce this viaclang, will do once I can find a reproducible example.

@wxwern

This comment was marked as outdated.

@wxwernwxwernforce-pushed thefix-unknown-crash-desc-for-multi-byte-err branch 2 times, most recently from63f33c6 tobed1f80CompareJune 26, 2025 08:36
@wxwernwxwern changed the title[asan] Fixunknown-crash reported for multi-byte errors[asan] Fixunknown-crash reported for multi-byte errors and incorrect addressesJun 26, 2025
@wxwernwxwernforce-pushed thefix-unknown-crash-desc-for-multi-byte-err branch 3 times, most recently fromfec87fc tob4754b8CompareJune 26, 2025 09:12
@wxwernwxwern requested a review fromvitalybukaJune 26, 2025 09:14
@wxwernwxwern changed the title[asan] Fixunknown-crash reported for multi-byte errors and incorrect addresses[asan] Fixunknown-crash being reported for multi-byte errors, and incorrect memory access addresses being reportedJun 26, 2025
@wxwern
Copy link
Author

@vitalybuka I've made some updates, and the PR description has been updated with more details and findings. Please do let me know if they're alright, thanks!

ACCESS_MEMORY_RANGE defined in asan_interceptors_memintrinsics.hreports the poisoned address (__bad), instead of the start address(__offset) during a memory access to ReportGenericError.We can determine that the latter (__offset) is the intendedinterpretation, as most error descriptions are decided by treatingthe given address as a start address (for example, see:PrintAccessAndVarIntersection in asan_descriptions.cpp, whichdecides whether a variable underflows or overflows dependingon the given addr and access_size).GCC also uses the latter interpretation. For instance, in bufferoverflows, it appears to do its own processing, and will reportthe start address of an overflowing read to ASan. This is incontrast to Clang, which uses __asan_memcpy directly.This patch fixes the above issue.Existing tests previously assumed and check for the former incorrectbehaviour. The error descriptions in those tests have thus beencorrected.
Given that a reported error by ASan spans multiple bytes, ASan mayflag the error as an 'unknown-crash' instead of the appropriate errorname.This error can be reproduced via a partial buffer overflow (any GCC,or after performing the patch in the previous commit to Clang).They'll report 'unknown-crash' instead of 'stack-buffer-overflow'for the below:    # minimal reprod    #https://godbolt.org/z/abrjrvnzj    #    # gcc -fsanitize=address reprod.c    struct X {        char bytes[16];    };    __attribute__((noinline)) struct X out_of_bounds() {        volatile char bytes[16];        struct X* x_ptr = (struct X*)(bytes + 2);        return *x_ptr;    }    int main() {        struct X x = out_of_bounds();        return x.bytes[0];    }This is due to a flawed heuristic in asan_errors.cpp, which won'talways locate the appropriate shadow byte that would indicate acorresponding error. This can happen for any reported errors whichspan either: exactly 8 bytes, or 16 and more bytes.This bug was previously hidden from Clang (but has always been presentin GCC) until the previous commit's fix on address reporting.Specifically, ACCESS_MEMORY_RANGE in ASan previously reports the firstpoisoned byte (instead of the start address, like in GCC). This maskedthe above bug from occuring, as it coincidentally guarantees theheuristic will always work, with slightly inaccurate reports.This patch resolves this issue via a linear scan of applicableshadow bytes (instead of the original heuristic, which, at best, onlyincrements the shadow byte address by 1 for these scenarios).
@wxwernwxwernforce-pushed thefix-unknown-crash-desc-for-multi-byte-err branch fromb4754b8 to57dc11aCompareJuly 3, 2025 07:08
@wxwern
Copy link
Author

Ping

@wxwern
Copy link
Author

Ping

1 similar comment
@wxwern
Copy link
Author

Ping

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@vitalybukavitalybukaAwaiting requested review from vitalybuka

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@wxwern@llvmbot@vitalybuka

[8]ページ先頭

©2009-2025 Movatter.jp