- Notifications
You must be signed in to change notification settings - Fork6.2k
8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror"#5935
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
…ll) failed: NULL mirror"
👋 Welcome back coleenp! A progress list of the required criteria for merging this PR into |
mlbridgebot commentedOct 13, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Webrevs
|
dholmes-ora left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Hi Coleen,
Given the array_klasses() is accessed lock-free (using acquire/release semantics) I don't see how adding this lock prevents a partially restored array from being observed? The lock prevents concurrent restoration operations, and prevents restoration concurrently with creation (if that is even possible), but readers are still lock-free AFAICS.
Thanks,
David
| if (array_klasses() !=NULL) { | ||
| // To get a consistent list of classes we need MultiArray_lock to ensure | ||
| // array classes aren't observed while they are being created. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
s/created/restored
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
fixed.
coleenp commentedOct 13, 2021
If you look at everywhere array classes are created, there's a MultiArray_lock protecting them until the mirror is created so that it seems atomic when added to the ClassLoaderData::_klasses and the mirror is created and added to the klass. The JVMTI code locks MultiArray_lock to iterate through the classes (with a similar comment, actually copied). |
iklam left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
LGTM
dholmes-ora commentedOct 14, 2021
If the JVMTI code already acquires the lock to iterate through the classes then I can see why taking the lock in restore_unshareable_info might fix the observed problem. But that begs the question as to why JVMTI, as an observer, needs to take the lock in the first place, when these objects should not be observable until after they have been safely published? If the objects are not being safely published then other observers may also need to take out this lock. The real bug here seems to be that instanceKlasses can be found and inspected before InstanceKlass::restore_unshareable_info has been applied to that instanceKlass! Looking at Klass::restore_unshareable_info it has this: And that to me seems a bug as now the class can be found through the CLD before it is ready to be observed. You fixed the current problem where JVMTI stumbles over a null mirror related to an array class, but what other inappropriately set data in the instanceKlass might it also look at? |
iklam commentedOct 14, 2021
Yes, I agree that's fishy. But I think the bug may not be the call to add_class. If you look at ClassFileParser::fill_instance_klass(), it also does this very early, before many fields of the class are initialized: whereas Klass::restore_unshareable_info always uses publicize==true. Could that be an issue? Also, when a class is parsed, the mirror is created almost at the very end. However, for archived classes, the mirror is created when only the Klass portion are initialized. Maybe we should delay the mirror creation to a later time? |
dholmes-ora commentedOct 14, 2021
The definition of publicize is just:
and I can't see how that would change the visibility in terms of the JVMTI code.
I think there should be a lot of similarity between the process for regular loading and the loading from the archive. But ultimately both should have the same property: no publishing of the klass until after it is fully "initialized". But neither process seems to ensure that, which is puzzling ... unless I'm not looking high enough up for some locking in the process in both cases. Thinking further on the fix ... without knowing exactly how the assertion failure arises, it seems to me that the fix cannot be complete. If the JVMTI code is racing with the execution of restore_unshareable_info and the setting of the mirror, then even with the lock added the JVMTI code could still get there first and find the mirror not set. |
dholmes-ora commentedOct 14, 2021
AFAICS |
dholmes-ora commentedOct 14, 2021
Maybe it is the assertion that is wrong? If we are actually adding incomplete classes to the CLD such that they are visible to traversal, then the code should not be assuming they are complete and a NULL mirror may be perfectly acceptable? |
coleenp commentedOct 14, 2021
Thanks Ioi for the review and for answering David's questions. Let me try to answer further. In ClassLoaderData::loaded_classes_do, the InstanceKlass is added to the CLD but it's not marked as 'loaded' until the mirror is created, so it won't be followed. The array case has no such initialization state field, so both adding to CLD and updating the mirror must be as-if atomic. This is accomplished by locking the MultiArray_lock. In the creation case, the ArrayKlass is added to the CLD after the mirror is created, still with the lock. In the restore_unshareable_info, the class is added to the CLD first because it shares code with InstanceKlass (it's in Klass::restore_unshareable_info). We could add some 'if' statements in Klass::restore_unshareable_info to fix the order of the ArrayKlass case, but taking out the lock will still be needed to be consistent with the creation case, and is sufficient here also. |
coleenp commentedOct 14, 2021
Also, I would not like to take out the null check for this case. Having code find a null mirror further down will likely cause problems in whatever the do_klass() callback wants (in this case, likely wants the mirror). |
dholmes-ora commentedOct 14, 2021
I will state again: If the JVMTI code is racing with the execution of restore_unshareable_info and the setting of the mirror, then even with the lock added the JVMTI code could still get there first and find the mirror not set. The added lock simply ensures that the JVMTI code and the restore_unshareable_info code cannot execute concurrently - it does not guarantee that the JVMTI code can't execute until after restore_unshareable_info. |
iklam commentedOct 14, 2021
For an InstanceKlass k to be iterated on by ClassLoaderData::loaded_classes_do(), k->is_loaded() must be true. The class enters the "loaded" state when SystemDictionay::add_to_hierarchy(k) is called, which happens after k->restore_unshareable_info() has completed. So the current implementation is safe w.r.t. InstanceKlasses. For ArrayKlasses, they would bekind of safe after this PR, as the JVMTI code will not see an ArrayKlasses that's in the middle of restore_unshareable_info(). However, it will still be possible for the JVMTI code to see the |
dholmes-ora commentedOct 14, 2021
We hit an assertion failure because execution of loaded_klasses_do by JVMTI encountered a shared array class that had a NULL mirror. So two things need to be known:
Either we must guarantee that the mirror is set before the array class is visible, or the two actions must be atomic with respect to the execution of loaded_klasses_do. The latter implies that any locking must cover both actions. |
coleenp commentedOct 15, 2021
Yes:
The guarantee that the two actions are atomic wrt to loaded_classes_do are because the MultiArray_lock is held while both 1 and 2 are executed. The lock keeps 1 and 2 together. The lock keeps Jvmti from iterating through the CLDG while the ArrayKlass is in an inconsistent state. @iklam you're right - we'd see the ObjArrayKlass first since the InstanceKlass will be added to the CLD before its marked as is_loaded(). I don't know what the result of that would be. You could open a new bug for that to investigate that and try to write a test case (using JVMTI that would show if it got confused). |
dholmes-ora commentedOct 15, 2021
@coleenp ,@iklam thanks for your patience on this. My confusion/concern was around whether Thanks, |
openjdkbot commentedOct 15, 2021 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
@coleenp This change now passes allautomated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the fileCONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can usepull request commands such as/summary,/contributor and/issue to adjust it as needed. At the time when this comment was updated there had been 26 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the/integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
iklam commentedOct 15, 2021
I filedhttps://bugs.openjdk.java.net/browse/JDK-8275318 "loaded_classes_do may see ArrayKlass before InstanceKlass is loaded" |
coleenp commentedOct 15, 2021
Thank you David and Ioi. |
Going to push as commit172aed1.
Your commit was automatically rebased without conflicts. |
Uh oh!
There was an error while loading.Please reload this page.
The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored.
Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps.
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5935/head:pull/5935$ git checkout pull/5935Update a local copy of the PR:
$ git checkout pull/5935$ git pull https://git.openjdk.java.net/jdk pull/5935/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 5935View PR using the GUI difftool:
$ git pr show -t 5935Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5935.diff