@mcspr This is a proof of concept PR for an alternative solution to#9193 and alsocloses#9194. Could you please verify? I would appreciate any input and feedback. Thanks!

Description

This PR adds more granular control over how the crash dump is delivered, and some more features:

custom_panic_callback(panic_file, panic_line, panic_func, panic_what): Allows the user to additionally listen for panics (useful to record them in flash memory)
ets_uart_putc1() →crash_ets_uart_putc1(): renamed function that prints to the Serial port. This allows the function to be overridden, possibly applying transformations on the text (useful if transmitting over a Bus-like UART, such as RS-485), remove clutter or insert additional comments (such as uptime), or completely inhibit the dump (useful is uart is connected to a peripheral device)
NO_CUT_HERE: Introduce a macro that allows disabling the "CUT HERE" message and the divider lines. Can be used to minify the number of transmitted/recorded characters over the wire or storage. Can be used with the-D flag.
ets_println() instead ofets_putc('\n'): This makes the new lines use the "cooked" interface instead of the "raw" one, which is also used on the staticets_printf_P() function, which allows:
- uniform line endings (always "\r\n", instead of just the ones used withinprintf_P(...)
- making the line endings also silent when thecrash_ets_uart_putc1() is overriden
rst_info.exccause used instead of localexccause for the6 exception (divide by zero): unless I'm missing something, this allows the custom_crash_handler at the end of the postmortem function to also receive a correct exception number. Useful for projects such as EspCrashSave and EspCrashSaveSpiffs

Additional notes

The naming of some functions might not be the optimal value, if you would like to suggest a better one please either modify this PR or submit a comment.

DRSDavidSoft added3 commits

August 30, 2024 03:26

expose more crash reporting apis

6765e0d

change tab to spaces

6b3e716

rename symbols to correct ones

2716f3c

Copy link

Collaborator

mcspr commentedAug 30, 2024

custom_panic_callback

Probably just one callback would be ok? We do have custom 'USER' reasons which are private now, but obviously can be moved to a public header. Seerst_reason_sw. Suppose, this can be extended with something likeesp_panic_info() returning a struct of those pointers?

NO_CUT_HERE

Any reason not to fully hide the wholeinternal printf with a similar flag? Or, at least hide the part of postmortem & funcs calling postmortem that print stuff, structuring it a bit different in the main func especially.
(or simply replacing such internal printf with an empty macro on demand?)

rst_info.exccause used instead of local exccause for the 6 exception (divide by zero): unless I'm missing something, this allows the custom_crash_handler at the end of the postmortem function to also receive a correct exception number. Useful for projects such as EspCrashSave and EspCrashSaveSpiffs

see#7496
This helps out exception decoder by pretending the hw does have instructions that would've triggered the exception. If only to be consistent with the output... I think the only issue is exc code would be different between rst info struct kept in memory between reboots and the one passed to the callback?

Copy link

ContributorAuthor

DRSDavidSoft commentedAug 30, 2024

Thank you for the feedback! I agree with your points. Let me elaborate a bit on the original intentions behind these changes while also addressing your suggestions.

`custom_panic_callback`

The idea behind introducing a custom panic callback was to provide users with more flexibility in handling panics, especially for scenarios where recording the crash information in flash memory is necessary. I agree that consolidating this into a single callback would be best as it simplifies the interface and reduces the addition of unnecessary callbacks.

I suppose the reason for adding this to the PR was to avoid modification of the existingcustom_crash_callback's signature.

Probably just one callback would be ok? ... Suppose, this can be extended with something likeesp_panic_info() returning a struct of those pointers

That's a great idea! I very much prefer to use something likeesp_panic_info() instead of what I came with (custom_panic_callback).

Moving therst_reason_sw to a public header and extending it with something like anesp_panic_info() function that returns a structured set of pointers would be an excellent approach. Do you have any specific ideas on what that struct should contain, or how the implementation might be extended best?

`NO_CUT_HERE`

The purpose of theNO_CUT_HERE macro was to provide users with an option to minimize the output, especially when the crash dump is being transmitted over limited bandwidth or recorded in constrained storage environments. I can see the benefit of hiding more of the internalprintf calls, particularly those within the postmortem functions.

Any reason not to fully hide the wholeinternal printf with a similar flag, ... or simply replacing such internal printf with an empty macro on demand?

The second idea is exactly what I tried to pursue by introducing the newcrash_ets_uart_putc1() func. Since it can be overridden by the user, it canalso be used to point to a no-op function, or to simply not print anything to the serial. I also agree with using a macro instead, however I have found that the linkage approach works better. Alternatively, replacing the higher-level internalprintf with a macro as needed could offer a more straightforward solution.

As to why thecut_here() function specifically, it's currently the only part that produces static content, and can be forgoed when saving or transmitting the crash dump. I just needed a way to signal to prevent it from being generated. I agree this is not the best approach and leads to clutter in code, but it's currently the best I could come up with 😅 I would appreciate other solutions as well!

Structuring the main function differently to conditionally include or exclude these outputs is also a good approach. Would you prefer the more granular control of individualprintf calls, or a broader flag that suppresses all related output?

`rst_info.exccause`

Your point about usingrst_info.exccause for the divide by zero exception is well taken. The intention here was to ensure consistency across the crash reports, especially for tools like EspCrashSave and EspCrashSaveSpiffs that rely on accurate exception information.

I think the only issue is exc code would be different between rst info struct kept in memory between reboots and the one passed to the callback

As for my personal opinion, I can accept that the consequent exception code wouldn't be the correct one (6), that is due to ESP8266's hardware design. The more important factor is that during the crash instance, the code is correctlyreported (as is the current case when considering print to serial), and the code is also correctlysaved, so that further in the line it can also becorrectly reported over the network, or something else.

However, you raise a valid concern about the discrepancy between therst_info struct retained between reboots and the one passed to the callback. This could potentially lead to confusion in diagnosing issues.

Currently, the exception is correctlyonly displayed when a crash happens. We don't care about the next reboot, because we have lost the context of the actual crash. This PR aims to also correctly save the exception code. This means that later, when we access the crash information, we will also see the correct code. This can either be another print to the uart (triggered by user), or in the case I'm interested, transmit it over the network on the next reboot. Which means the code that is available in next reboots will be ignored in favor of the saved one, received from the crash callback.

I'm open to suggestions on how to handle this more effectively. Sadly, it appears that getting the same exception code across both instances or providing additional context in the callback to clarify the source of the exception code are both unlikely to implement.

Once again, I appreciate your insights and would love to hear any further thoughts you have on these topics. Your expertise in this area is invaluable, and I'm keen to ensure that the final implementation meets both the flexibility and reliability requirements of our users.

Merge branch 'esp8266:master' into pr-postmortem-extension

2a6b758

DRSDavidSoft mentioned this pull request

Aug 17, 2025

Feature Request: Don't print stack trace to Serial on crash#9194

Open

5 tasks

mcspr added a commit to mcspr/esp8266-Arduino that referenced this pull request

Aug 27, 2025

Postmortem - export user reset reason & extra callbacks

6d5627c

ref.esp8266#9193,esp8266#9194,esp8266#9196- persistent & public REASON_USER_... for user callback rst_info reason- public postmortem callback functions, don't overly hide the implementation- separate reasons for panic, assert, abort and c++ exceptions- put postmortem func one level down in the function chain, allow to  override both the function itself or individual internals- stack offset for -fexceptions, skip unwind data to the actual point of origin

mcspr mentioned this pull request

Aug 27, 2025

Postmortem - allow more control over crash dump#9271

Open

Labels

None yet

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix #9194 - Allow more control over crash dump#9196

Are you sure you want to change the base?

Fix #9194 - Allow more control over crash dump#9196

Conversation

DRSDavidSoft commentedAug 30, 2024•
edited
Loading