- Notifications
You must be signed in to change notification settings - Fork14.5k
[lldb][windows] force the console to use a UTF-8 codepage#149493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
[lldb][windows] force the console to use a UTF-8 codepage#149493
Conversation
@llvm/pr-subscribers-lldb Author: Charles Zablit (charles-zablit) ChangesThis patch sets the codepage of the parent Windows console to This fixes a rendering issue where the characters defined in This solution is based on thisSO thread andthis patch downstream. rdar://156064500 Full diff:https://github.com/llvm/llvm-project/pull/149493.diff 2 Files Affected:
diff --git a/lldb/source/Plugins/Platform/Windows/PlatformWindows.cpp b/lldb/source/Plugins/Platform/Windows/PlatformWindows.cppindex c0c26cc5f1954..d3e981de81313 100644--- a/lldb/source/Plugins/Platform/Windows/PlatformWindows.cpp+++ b/lldb/source/Plugins/Platform/Windows/PlatformWindows.cpp@@ -41,6 +41,10 @@ LLDB_PLUGIN_DEFINE(PlatformWindows) static uint32_t g_initialize_count = 0;+#if defined(_WIN32)+std::optional<UINT> g_prev_console_cp = std::nullopt;+#endif+ PlatformSP PlatformWindows::CreateInstance(bool force, const lldb_private::ArchSpec *arch) { // The only time we create an instance is when we are creating a remote@@ -98,6 +102,7 @@ void PlatformWindows::Initialize() { default_platform_sp->SetSystemArchitecture(HostInfo::GetArchitecture()); Platform::SetHostPlatform(default_platform_sp); #endif+ SetConsoleCodePage(); PluginManager::RegisterPlugin( PlatformWindows::GetPluginNameStatic(false), PlatformWindows::GetPluginDescriptionStatic(false),@@ -108,6 +113,7 @@ void PlatformWindows::Initialize() { void PlatformWindows::Terminate() { if (g_initialize_count > 0) { if (--g_initialize_count == 0) {+ ResetConsoleCodePage(); PluginManager::UnregisterPlugin(PlatformWindows::CreateInstance); } }@@ -808,3 +814,17 @@ extern "C" { return Status(); }++void PlatformWindows::SetConsoleCodePage() {+ #if defined(_WIN32)+ g_prev_console_cp = GetConsoleOutputCP();+ SetConsoleOutputCP(CP_UTF8);+ #endif+}++void PlatformWindows::ResetConsoleCodePage() {+ #if defined(_WIN32)+ if (g_prev_console_cp)+ SetConsoleOutputCP(*g_prev_console_cp);+ #endif+}diff --git a/lldb/source/Plugins/Platform/Windows/PlatformWindows.h b/lldb/source/Plugins/Platform/Windows/PlatformWindows.hindex 771133f341e90..d14aa52e5e1c8 100644--- a/lldb/source/Plugins/Platform/Windows/PlatformWindows.h+++ b/lldb/source/Plugins/Platform/Windows/PlatformWindows.h@@ -80,6 +80,14 @@ class PlatformWindows : public RemoteAwarePlatform { size_t GetSoftwareBreakpointTrapOpcode(Target &target, BreakpointSite *bp_site) override;+ /// Set the current console's code page to UTF-8 and store the previous+ /// codepage in \a g_prev_console_cp.+ static void SetConsoleCodePage();++ /// Reset the current console's code page to the value stored+ /// in \a g_prev_console_cp if any.+ static void ResetConsoleCodePage();+ std::vector<ArchSpec> m_supported_architectures; private: |
github-actionsbot commentedJul 18, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
✅ With the latest revision this PR passed the C/C++ code formatter. |
Windows Terminal is the default on Windows 10 at least. I think buildbot launches things in conhost, but if utf-8 there was a problem for tests, we would have seen it before now. |
charles-zablit commentedJul 18, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
From my understanding, theoriginal issue in jq is that they did not reset the codepage after the program had exited. My patch does reset it if lldb gracefully exits. The Another temporary fix for this while we figure out a long term solution could be to force the use of theANSI characters on windows. |
charles-zablit commentedJul 18, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Yes that makes sense, Windows Terminal doesn't default to utf-8 either. I was thinking of something else. What if we added this new code, and it tried to set utf-8 and a test relied on that. However, this is not a problem because we already test this annotation feature via conhost and it has no problems. So even if the calls do nothing, it doesn't matter. In other words: tests on Windows aren't scraping the output of the terminal, they'll be reading strings internally and not care about the code page. Which is a good thing. |
charles-zablit commentedJul 18, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
From my understanding of the thread you linked, there are 3 ways to approach this:
|
I added |
Then setting the code page is probably the best idea, requiring the least amount of effort. As far as I know, |
Thanks for clarifying, I corrected the 4th option. 2b8c692 does not look like such big of a change, is |
A
I agree, that would probably be of a similar size here (i.e. check for the file handle and call the windows impl if needed). |
This patch sets the codepage of the parent Windows console to
utf-8
and resets it back to the original codepage oncelldb
exits.This fixes a rendering issue where the characters defined in
DiagnosticsRendering.cpp
("╰"
for instance) are not rendered properly on Windows out of the box, because the default codepage is notutf-8
.This solution is based on thisSO thread andthis patch downstream.
rdar://156064500