janvorli added this to the 9.0.0 milestone

janvorli requested a review from jkotas

June 5, 2024 15:09

janvorli self-assigned this

Failure in GC\API\NoGCRegion\Callback_Svr\Callback_Svr.cmd #100149

janvorli added 2 commits

June 5, 2024 17:33

Fix MUSL build

96da965

Fix x86 build

fb8cc6b

This was referenced Jun 5, 2024

Closed

[x86] stress failure in RayTracer.GetNaturalColor with DOTNET_JitStress=2 #102590

Closed

janvorli (Member, Author) commented Jun 5, 2024

There are some test errors, I am investigating them.

jkotas reviewed

src/coreclr/vm/excep.cpp (Outdated, resolved)

src/coreclr/vm/object.cpp (Outdated, resolved)

jkotas reviewed

Jun 6, 2024

src/coreclr/vm/clrex.h (resolved)

src/coreclr/vm/clrex.h (Outdated, resolved)

src/coreclr/vm/comutilnative.cpp (Outdated, resolved)

src/coreclr/vm/comutilnative.cpp Outdated

		// Given an exception object, this method will extract the stacktrace and dynamic method array and set them up for return to the caller.
-		FCIMPL3(VOID, ExceptionNative::GetStackTracesDeepCopy, Object* pExceptionObjectUnsafe, Object **pStackTraceUnsafe, Object **pDynamicMethodsUnsafe);
+		FCIMPL2(VOID, ExceptionNative::GetStackTracesDeepCopy, Object* pExceptionObjectUnsafe, Object **pStackTraceUnsafe);

jkotas (Member), Jun 6, 2024 (edited)

Suggested change

-	FCIMPL2(VOID, ExceptionNative::GetStackTracesDeepCopy, Object* pExceptionObjectUnsafe, Object **pStackTraceUnsafe);
+	FCIMPL1(Object*, ExceptionNative::FreezeStackTrace, Object* pStackTrace)

This can return the value now that there is just a single value to return

jkotas (Member), Jun 10, 2024

Updated the suggestion to FreezeStackTrace since the method does not actually perform a deep copy in the typical case anymore.

src/coreclr/vm/object.cpp (Outdated, resolved)

src/coreclr/vm/object.cpp Outdated

		}
		if (keepaliveObject == NULL)
		{
		// Trim the stack trace at a point where a dynamic or collectible method is found without a corresponding keepalive object.

jkotas (Member), Jun 6, 2024

How is this possible?

If I understand the code correctly, the thread safety should be guaranteed by writing the size field last and reading it first. Is that right? (Whatever it is, it would be nice to have it documented in a comment somewhere.)

I do not think it should be possible to get method items without corresponding keep alive items here. If it was possible, I suspect there may be situations where the method is not kept alive and the pMethod->IsLCGMethod() check above can crash.

janvorli (Member, Author), Jun 6, 2024

Here is the case when it happens:

* Thread A owns the stack trace and is adding a new entry that requires a keep alive object.
* Thread B wants to update the stack trace too, so it fetches the stack trace array S1 and keep alive array K1 from the exception.
* Thread A finds the keep alive array is full, so it creates a larger clone of the previous one, let's call it K2.
* Thread A adds the new entry to keep alive array K2.
* Thread A adds the new entry to stack trace array S1.
* Thread A writes K2 to the exception.

Now thread B creates clones of S1 and K1 (when it read them from the exception, K2 was not there yet). But S1 already has the new entry added by thread A, which is not covered by K1. Since K1 won't change anymore, we solve this by trimming S1 at the first element that needs a keep alive object and is not covered by K1.

However, I have realized you are right that in such a case, calling IsLCGMethod on the MethodTable above could crash. For example, thread A's exception could be collected together with its keep alive array K2, which may be the only thing holding the method alive, before thread B calls IsLCGMethod.
So I need to figure out some other way to handle the situation described above.
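
The interleaving described above can be replayed as a small single-threaded model. This is only a sketch and not the runtime code: `Frame`, `TrimToCovered`, and `ReplayRace` are hypothetical names, plain vectors stand in for the managed arrays, and the `needsKeepAlive` flag is an assumption standing in for the dynamic / collectible check.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one stack trace entry (not the runtime's StackTraceElement).
struct Frame {
    int methodId;
    bool needsKeepAlive;  // assumed flag: the method is dynamic or collectible
};

// Copies frames until one needs a keep alive entry that the snapshot does not have.
std::vector<Frame> TrimToCovered(const std::vector<Frame>& trace, std::size_t keepAliveCount) {
    std::vector<Frame> result;
    std::size_t used = 0;
    for (const Frame& frame : trace) {
        if (frame.needsKeepAlive) {
            if (used == keepAliveCount) break;  // uncovered frame: trim here
            ++used;
        }
        result.push_back(frame);
    }
    return result;
}

// Replays the interleaving on one thread so the outcome is deterministic.
std::vector<Frame> ReplayRace() {
    std::vector<Frame> s1 = {{1, false}, {2, true}};  // shared stack trace array S1
    std::vector<int> k1 = {2};                        // keep alive array K1 (full)

    // Thread B fetches S1 and K1 from the exception; it remembers K1's length.
    const std::size_t keepAliveSeenByB = k1.size();

    // Thread A: K1 is full, so clone it into a larger K2 and add the new entry there.
    std::vector<int> k2 = k1;
    k2.push_back(3);
    // Thread A adds the new frame to the shared S1, then publishes K2 (not seen by B).
    s1.push_back({3, true});
    assert(k2.size() == 2);

    // Thread B clones S1 and K1: frame 3 has no keep alive entry in K1, so it is trimmed.
    return TrimToCovered(s1, keepAliveSeenByB);
}
```

In this replay thread B ends up with the first two frames only; the frame thread A added after B took its snapshot is dropped because K1 has no entry for it.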

jkotas (Member) commented Jun 6, 2024

> Since the case when multiple threads are throwing the same exception and so they are modifying its stack trace in parallel is pathological anyways, I believe the extra work spent on creating the clones of the arrays is a good tradeoff for ensuring easy to reason about thread safety.

I agree that multiple threads throwing the same exception is a pathological case that can be slow. One exception being caught on one thread and rethrown on different threads using ExceptionDispatchInfo is not that uncommon in async code.

jkotas reviewed

Jun 6, 2024

src/coreclr/vm/excep.cpp (Outdated, resolved)

janvorli added 3 commits

June 6, 2024 18:50

Fix several issues

2d63781

* Missing calls to IsOverflow at few places
* Added a flag on StackTraceElement to indicate that the element needs a keepalive entry. It removes the need to call IsLCGMethod / Collectible check on the method table stored in the element and eliminates a possible problem with the method being collected in one place.
* Returned missing call to StackFrameInfo::Init to the x86 code path
* Removed obsolete comment and code line

Few changes based on feedback

8eccdbc

* Add keep alive items count to the stack trace header.
* Implement the concept of frozen stack traces to eliminate copies in the ExceptionDispatchInfo storing / restoring exceptions.
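
One way to picture the frozen stack trace idea from this commit message is as copy-on-append sharing. The sketch below is an assumption about the general shape, not the code in this PR: `TraceBuffer` and `ExceptionModel` are hypothetical, and `std::shared_ptr` stands in for GC references.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical model of a stack trace buffer that can be marked frozen.
struct TraceBuffer {
    std::vector<int> frames;
    bool frozen = false;
};

class ExceptionModel {
public:
    // Appends a frame, cloning first if the current buffer is frozen (shared).
    void Append(int frame) {
        if (!buffer_ || buffer_->frozen) {
            auto fresh = std::make_shared<TraceBuffer>();
            if (buffer_) fresh->frames = buffer_->frames;
            buffer_ = std::move(fresh);
        }
        buffer_->frames.push_back(frame);
    }

    // Capture: mark the buffer frozen and hand it out without copying.
    std::shared_ptr<TraceBuffer> Freeze() {
        if (!buffer_) buffer_ = std::make_shared<TraceBuffer>();
        buffer_->frozen = true;
        return buffer_;
    }

    // Restore: adopt a frozen buffer without copying.
    void Restore(std::shared_ptr<TraceBuffer> snapshot) {
        assert(snapshot && snapshot->frozen);
        buffer_ = std::move(snapshot);
    }

    const TraceBuffer* Buffer() const { return buffer_.get(); }

private:
    std::shared_ptr<TraceBuffer> buffer_;
};
```

In this model, capturing and restoring share the same buffer, and a copy is made only when a later throw appends to a frozen trace.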

Rename keepalive to keepAlive

f774c60

jkotas reviewed

src/coreclr/vm/object.h (Outdated, resolved)

Handle possible array size overflow

d056127

In the StackTraceArray::Allocate
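
As a rough illustration of what an array size overflow check in an allocation path can look like, here is a minimal sketch. It is not the code from this commit: `TraceHeader`, `TraceElement`, `TryComputeTraceBytes`, and the signed 32-bit cap are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical layouts; the real header and element types are not shown in this thread.
struct TraceHeader {
    std::uint32_t size;
    std::uint32_t keepAliveCount;
};
struct TraceElement {
    std::uintptr_t ip;
    std::uintptr_t sp;
    std::uintptr_t method;
    std::uint32_t flags;
};

// Computes header + count * element bytes, or returns false if that would exceed the cap.
bool TryComputeTraceBytes(std::size_t elementCount, std::uint32_t& bytes) {
    // Assumed cap for this sketch: the byte count must fit in a signed 32-bit length.
    constexpr std::size_t kMaxBytes =
        static_cast<std::size_t>(std::numeric_limits<std::int32_t>::max());
    // Divide instead of multiplying so the check itself cannot overflow.
    if (elementCount > (kMaxBytes - sizeof(TraceHeader)) / sizeof(TraceElement)) {
        return false;
    }
    bytes = static_cast<std::uint32_t>(sizeof(TraceHeader) + elementCount * sizeof(TraceElement));
    return true;
}
```

The point of the sketch is the order of operations: the element count is validated by division before any multiplication happens, so a huge count is rejected instead of wrapping around.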

janvorli force-pushed the remove-eh-stacktrace-global-lock-new branch from 69d1e71 to d056127

June 10, 2024 19:32

Fix typo

0fa22ad

jkotas reviewed

src/coreclr/System.Private.CoreLib/src/System/Exception.CoreCLR.cs (Outdated, resolved)

janvorli added 2 commits

June 10, 2024 23:20

Change the size / keepAlive fields in stack trace to uint32_t

dfe4e47

Plus a build break fix

Remove SaveStackTracesFromDeepCopy

8355c30

Also rename GetStackTracesDeepCopy to GetFrozenStackTrace and move the return argument to return value.

jkotas reviewed

src/coreclr/vm/comutilnative.cpp (Outdated, resolved)

jkotas reviewed

src/coreclr/vm/clrex.h (Outdated, resolved)

jkotas reviewed

src/coreclr/vm/comutilnative.cpp Outdated

Comment on lines 161 to 162


		// There can be no GC after setting the frozenStackTrace until the Object is returned.

jkotas (Member), Jun 10, 2024

Suggested change


	// There can be no GC after setting the frozenStackTrace until the Object is returned.

I am not sure what this comment is trying to say.

janvorli (Member, Author), Jun 10, 2024

Ah, I've renamed the frozenStackTrace to result and forgotten to update the comment. It is trying to say that the gc.result is not protected after the HELPER_METHOD_FRAME_END

jkotas (Member), Jun 10, 2024

Ok, that's writing manually managed code 101.

(We will want to convert this method to QCALL in near future to get rid of the HELPER_METHOD_FRAME.)

jkotas reviewed

src/coreclr/vm/comutilnative.cpp (Outdated, resolved)

jkotas reviewed

src/coreclr/vm/excep.cpp (Outdated, resolved)

jkotas reviewed

src/coreclr/vm/object.cpp (Outdated, resolved)

Move the race handling into GetStackTrace only

12db638

Plus an unused method removal and a little naming / contract cleanup

jkotas reviewed

src/coreclr/vm/object.cpp (Outdated, resolved)

jkotas reviewed

src/coreclr/vm/object.cpp

		m_array = (I1ARRAYREF) AllocatePrimitiveArray(ELEMENT_TYPE_I1, static_cast<DWORD>(src.Capacity()));

-		Volatile<size_t> size = src.Size();
+		uint32_t size = src.Size();

jkotas (Member), Jun 11, 2024

We should use volatile read for the Size. Otherwise, the C++ compiler is free to duplicate the read and use one size value for the memcpy and return a different value from the method.

janvorli (Member, Author), Jun 11, 2024

I thought you had suggested earlier that I get rid of the Volatile here, so I am a bit confused.

jkotas (Member), Jun 11, 2024

I would:

* Get rid of the volatile local variable. It does not make sense to tell the compiler to issue read/write barriers when reading/writing stack. It is a deoptimization.
* Change the read in Size() method to be volatile read, so that it is not reordered. This volatile read is the counterpart of the volatile write in StackTraceArray::Append (it is not an actual volatile write, but MemoryBarrier + regular write that is equivalent in this situation). In general, volatile reads and volatile writes in a lock-free algorithm have to come in pairs on the producing/consuming sides.
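
The pairing described above, where the producer writes the size last and the consumer reads it first, can be sketched with standard C++ atomics. This is a hedged illustration only: `TraceModel` is hypothetical, and `std::atomic` with release / acquire ordering is used here as a stand-in for the runtime's own volatile load / store helpers. It assumes a single appending thread.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical fixed-capacity trace with one writer and any number of readers.
class TraceModel {
public:
    static constexpr std::uint32_t kCapacity = 8;

    // Producer side: fill the slot first, publish the new size last.
    bool Append(int frame) {
        std::uint32_t n = size_.load(std::memory_order_relaxed);  // only the writer changes size_
        if (n == kCapacity) return false;
        frames_[n] = frame;
        size_.store(n + 1, std::memory_order_release);  // pairs with the acquire load below
        return true;
    }

    // Consumer side: read the size exactly once, first, into a plain local,
    // then use that same value for both the copy and the return value.
    std::uint32_t CopyTo(int* dest) const {
        const std::uint32_t n = size_.load(std::memory_order_acquire);
        std::memcpy(dest, frames_, n * sizeof(int));
        return n;
    }

private:
    int frames_[kCapacity] = {};
    std::atomic<std::uint32_t> size_{0};
};
```

Note that the local `n` is an ordinary variable: the ordering requirement sits on the shared field, not on the stack copy, which matches the first bullet above.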

janvorli (Member, Author), Jun 11, 2024

Ok, I have added a commit with that change. I've modified the GetSize / SetSize and added such treatment to the keep alive count accessors too. And removed the MemoryBarrier from the StackTraceArray::Append where it is no longer needed after this change.

jkotas approved these changes

jkotas (Member) left a comment

LGTM otherwise. Thank you!

jkotas reviewed

src/coreclr/inc/sospriv.idl (resolved)

Add VolatileLoad/Store around the size / keep alive count

4259620

Also remove the memory barrier from the StackTraceArray::Append since it is not needed after that change.

janvorli (Member, Author) commented Jun 11, 2024

This PR needs to wait for merging until SOS with corresponding change is released.

janvorli added the NO-MERGE label (The PR is not ready for merge yet; see discussion for detailed reasons)