Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Remove global spinlock for EH stacktrace#103076

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

janvorli
Copy link
Member

The global spinlock that was used to ensure that stack trace and the associated dynamic methods array were updated and read atomically. However, for the new EH, it has shown to cause a high contention in case many threads were handling exceptions at the same time.

This change replaces the two arrays by one object member in the exception class. It contains reference to either thebyte[] of the stack trace (when there are no dynamic methods on the stack trace) or anobject[] where the first element contains the stack tracebyte[] reference and the following elements contain what used to be in the dynamic array. That allows atomic updates and reads of the stack trace and dynamic method keepalive references without a need of a lock.

It improves the performance of exceptions being handled on multiple threads at the same time 3 fold. For example, a testing app running on my Windows x64 machine with 10 CPU cores that throws an exception on 10 threads in parallel 1000000 times over 10 stack frames took 30s of each thread's time before and 9 seconds after the change.

The original code was quite convoluted, it was difficult to reason about and it had some races in it that were hidden behind the global lock. So I have decided to rewrite the whole thing from scratch.

The way it ensures that it is race free is that whenever it updates the exception stack trace and the one that's on the exception was created by a different thread, it creates a deep copy of both the stack trace and the keepalive array. When making the copy, it also handles a case when a frame that needs a keepalive entry is on the stack trace part, but the keepalive array extracted from the exception is stale (the other thread needed to resize the keepalive array, but not the stack trace). In that case, the stack trace is trimmed at first such entry found.

Since the case when multiple threads are throwing the same exception and so they are modifying its stack trace in parallel is pathological anyways, I believe the extra work spent on creating the clones of the arrays is a good tradeoff for ensuring easy to reason about thread safety.

I have also merged theStackTraceInfo::AppendElement and theStackTraceInfo::SaveStackTrace into one, as they were always being called in pair for a single frame. Before, the first one needed to store the frame and the second would extract it from that storage and append it to the stack trace.

Finally, since with the previous iteration of this change, a bug in building the stack trace was found, I have added a coreclr test to verify stack trace for an exception matches the expectations.

samsosa reacted with hooray emojidanmoseley reacted with rocket emojiEn3Tho reacted with eyes emoji
The global spinlock that was used to ensure that stack trace and theassociated dynamic methods array were updated and read atomically.However, for the new EH, it has shown to cause a high contention in casemany threads were handling exceptions at the same time.This change replaces the two arrays by one object member in theexception class. It contains reference to either the byte[] of thestack trace (when there are no dynamic methods on the stack trace) or anobject[] where the first element contains the stack trace byte[]reference and the following elements contain what used to be in thedynamic array. That allows atomic updates and reads of the stack traceand dynamic method keepalive references without a need of a lock.The original code was quite convoluted, it was difficult to reason aboutand it had some races in it that were hidden behind the global lock. SoI have decided to rewrite the whole thing from scratch.The way it ensures that it is race free is that whenever it updates theexception stack trace and the one that's on the exception was created bya different thread, it creates a deep copy of both the stack trace andthe keepalive array. When making the copy, it also handles a case whena frame that needs a keepalive entry is on the stack trace part, but thekeepalive array extracted from the exception is stale (the other threadneeded to resize the keepalive array, but not the stack trace). In thatcase, the stack trace is trimmed at first such entry found.Since the case when multiple threads are throwing the same exception andso they are modifying its stack trace in parallel is pathologicalanyways, I believe the extra work spent on creating the clones of the arraysis a good tradeoff for ensuring easy to reason about thread safety.I have also removed a dead code path from theStackTraceInfo::SaveStackTrace.Finally, since with the previous iteration of this change, a bug inbuilding the stack trace was found, I have added a coreclr test toverify stack trace for an exception matches the expectations.
@janvorlijanvorli added this to the9.0.0 milestoneJun 5, 2024
@janvorlijanvorli requested a review fromjkotasJune 5, 2024 15:09
@janvorlijanvorli self-assigned thisJun 5, 2024
@janvorli
Copy link
MemberAuthor

There are some test errors, I am investigating them.

// Given an exception object, this method will extract the stacktrace and dynamic method array and set them up for return to the caller.
FCIMPL3(VOID, ExceptionNative::GetStackTracesDeepCopy, Object* pExceptionObjectUnsafe, Object **pStackTraceUnsafe, Object **pDynamicMethodsUnsafe);
FCIMPL2(VOID, ExceptionNative::GetStackTracesDeepCopy, Object* pExceptionObjectUnsafe, Object **pStackTraceUnsafe);
Copy link
Member

@jkotasjkotasJun 6, 2024
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
FCIMPL2(VOID, ExceptionNative::GetStackTracesDeepCopy, Object*pExceptionObjectUnsafe, Object **pStackTraceUnsafe);
FCIMPL1(Object *, ExceptionNative::FreezeStackTrace, Object*pStackTrace)

This can return the value now that there is just a single value to return

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Updated the suggestion toFreezeStackTrace since the method does not actually perform a deep copy in typical case anymore.

}
if (keepaliveObject == NULL)
{
// Trim the stack trace at a point where a dynamic or collectible method is found without a corresponding keepalive object.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

How is this possible?

If I understand the code correctly, the thread safety should be guaranteed by writing thesize field last and reading it first. Is that right? (Whatever it is, it would be nice to have it documented in a comment somewhere.)

I do not think it should be possible to get methods items without corresponding keep alive items here. If it was possible, I suspect there may be situations where the method is not kept alive and thepMethod->IsLCGMethod() check above can crash.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Here is the case when it happens:

  • Thread A owns the stack trace and is adding a new entry that requires a keep alive object.
  • Thread B wants to update the stack trace too, so it fetches the stack trace array S1 and keep alive array K1 from the exception
  • Thread A finds the keep alive array is full, so it creates a larger clone of the previous one, let's call it K2
  • Thread A adds the new entry to keep alive array K2
  • Thread A adds the new entry to stack trace array S1
  • Thread A writes K2 to the exception
  • Now thread B creates clones of S1 and K1 (when it read it from the exception, the K2 was not there yet). But S1 already has the new entry added by thread A, which is not covered by K1. Since K1 won't change anymore, we solve this by trimming the S1 at the element where we've found a frame that needs keep alive object and was not covered by K1

However, I have realized you are right that in such case, calling IsLCGMethod on the MethodTable above could crash. For example, if the thread A's exception was collected together with its keep alive array K2 that can be the only thing holding the method alive and if that would happen before the thread B called the IsLCGMethod.
So I need to figure out some other way to handle the situation described above.

@jkotas
Copy link
Member

Since the case when multiple threads are throwing the same exception and so they are modifying its stack trace in parallel is pathological anyways, I believe the extra work spent on creating the clones of the arrays is a good tradeoff for ensuring easy to reason about thread safety.

I agree that multiple threads throwing the same exception is a pathological case that can be slow. One exception being caught on one thread and rethrown on different threads using ExceptionDispatchInfo is not that uncommon in async code.

* Missing calls to IsOverflow at few places* Added a flag on StackTraceElement to indicate that the element needs a  keepalive entry. It removes the need to call IsLCGMethod / Collectible  check on the method table stored in the element and eliminates a  possible problem with the method being collected in one place.* Returned missing call to StackFrameInfo::Init to the x86 code path* Removed obsolete comment and code line
* Add keep alive items count to the stack trace header.* Implement the concept of frozen stack traces to eliminate copies in  the ExceptionDispatchInfo storing / restoring exceptions.
In the StackTraceArray::Allocate
@janvorlijanvorliforce-pushed theremove-eh-stacktrace-global-lock-new branch from69d1e71 tod056127CompareJune 10, 2024 19:32
Also rename GetStackTracesDeepCopy to GetFrozenStackTrace and move thereturn argument to return value.
Comment on lines 161 to 162

// There can be no GC after setting the frozenStackTrace until the Object is returned.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
// There can be no GC after setting the frozenStackTrace until the Object is returned.

I am not sure what this comment is trying to say.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ah, I've renamed the frozenStackTrace to result and forgotten to update the comment. It is trying to say that the gc.result is not protected after the HELPER_METHOD_FRAME_END

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ok, that's writing manually managed code 101.

(We will want to convert this method to QCALL in near future to get rid of the HELPER_METHOD_FRAME.)

Plus an unused method removal and a little naming / contract cleanup
m_array = (I1ARRAYREF) AllocatePrimitiveArray(ELEMENT_TYPE_I1, static_cast<DWORD>(src.Capacity()));

Volatile<size_t> size = src.Size();
uint32_t size = src.Size();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We should use volatile read for the Size. Otherwise, the C++ compiler is free to duplicate the read and use onesize for value for the memcpy and return a different value from the method.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I have thought you were suggesting before that I get rid of the Volatile here, so I am bit confused.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I would:

  • Get rid of the volatile local variable. It does not make sense to tell the compiler to issue read/write barriers when reading/writing stack. It is deoptimization.
  • Change the read in Size() method to be volatile read, so that it is not reordered. This volatile read is counterpart of volatile write inStackTraceArray::Append (it is not actual volatile write, but MemoryBarrier + regular write that is equivalent in this situation). In general, volatile reads and volatile writes in lock-free algorithm have to come in pairs on producing/consuming sides.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ok, I have added a commit with that change. I've modified the GetSize / SetSize and added such treatment to the keep alive count accessors too. And removed the MemoryBarrier from the StackTraceArray::Append where it is no longer needed after this change.

Copy link
Member

@jkotasjkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM otherwise. Thank you!

Also remove the memory barrier from the StackTraceArray::Append since itis not needed after that change.
@janvorli
Copy link
MemberAuthor

This PR needs to wait for merging until SOS with corresponding change is released.

@janvorlijanvorli added the NO-MERGEThe PR is not ready for merge yet (see discussion for detailed reasons) labelJun 11, 2024
I have also realized that when we need to trim, the keepAlive array isalways fully populated, so we don't need to check for cases where therewould be NULL in an entry of the array.
@janvorli
Copy link
MemberAuthor

/azp run runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelinesAzure Pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@janvorli
Copy link
MemberAuthor

/azp run runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelinesAzure Pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@janvorli
Copy link
MemberAuthor

/azp run runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelinesAzure Pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@janvorlijanvorli merged commitdd2120b intodotnet:mainJul 8, 2024
108 of 112 checks passed
@janvorlijanvorli deleted the remove-eh-stacktrace-global-lock-new branchJuly 8, 2024 19:52
@janvorlijanvorli removed the NO-MERGEThe PR is not ready for merge yet (see discussion for detailed reasons) labelJul 24, 2024
@github-actionsgithub-actionsbot locked and limited conversation to collaboratorsAug 24, 2024
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.
Reviewers

@jkotasjkotasjkotas approved these changes

@mikem8361mikem8361mikem8361 left review comments

Assignees

@janvorlijanvorli

Projects
None yet
Milestone
9.0.0
Development

Successfully merging this pull request may close these issues.

3 participants
@janvorli@jkotas@mikem8361

[8]ページ先頭

©2009-2025 Movatter.jp