US20140075163A1 - Load-monitor mwait - Google Patents

Load-monitor mwait
Info

Publication number
US20140075163A1
Authority
US
United States
Prior art keywords
instruction
load
execution
monitor
thread
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/607,175
Inventor
Paul N. Loewenstein
Mark A. Luttrell
Paul J. Jordan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-09-07
Filing date
2012-09-07
Publication date
2014-03-13
Application filed by Individual
Priority to US13/607,175 (US20140075163A1)
Assigned to ORACLE INTERNATIONAL CORPORATION: assignment of assignors' interest (see document for details). Assignors: JORDAN, PAUL J.; LOEWENSTEIN, PAUL N.; LUTTRELL, MARK A.
Publication of US20140075163A1
Priority to US14/967,954 (US9940132B2)
Current legal status: Abandoned

Abstract

Techniques are disclosed relating to suspending execution of a processor thread while monitoring for a write to a specified memory location. An execution subsystem may be configured to perform a load instruction that causes the processor to retrieve data from a specified memory location and atomically begin monitoring for a write to the specified location. The load instruction may be a load-monitor instruction. The execution subsystem may be further configured to perform a wait instruction that causes the processor to suspend execution of a processor thread during at least a portion of an interval specified by the wait instruction and to resume execution of the processor thread at the end of the interval. The wait instruction may be a monitor-wait instruction. The processor may be further configured to resume execution of the processor thread in response to detecting a write to a memory location specified by a previous monitor instruction.
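The behavior summarized in the abstract can be approximated in software. The following is a minimal sketch, not the patented hardware: it models a single memory location with a lock and a condition variable, and the names `MonitoredCell`, `load_monitor`, `store`, `mwait`, and `wait_until_nonzero` are hypothetical and do not come from the patent. The point it illustrates is why the load and the arming of the monitor happen in one atomic step: a write that lands between the load and the wait is still recorded, so the waiting thread is not left asleep.

```python
import threading


class MonitoredCell:
    """Software model of one memory location with a write monitor (hypothetical)."""

    def __init__(self, value=0):
        self._cond = threading.Condition()
        self._value = value
        self._armed = False    # a monitor is pending for this location
        self._written = False  # a write was seen since the monitor was armed

    def load_monitor(self):
        """Return the current value and arm the monitor in one atomic step."""
        with self._cond:
            self._armed = True
            self._written = False
            return self._value

    def store(self, value):
        """Write the location and wake any thread suspended in mwait()."""
        with self._cond:
            self._value = value
            if self._armed:
                self._written = True
                self._cond.notify_all()

    def mwait(self, interval):
        """Suspend for at most `interval` seconds or until a monitored write.

        Returns "write" or "timeout" so the caller can tell why it resumed.
        """
        with self._cond:
            # Returns at once if a write was already recorded after arming.
            woke = self._cond.wait_for(lambda: self._written, timeout=interval)
            self._armed = False
            return "write" if woke else "timeout"


def wait_until_nonzero(cell, interval=0.05):
    """Typical wait loop: load and arm, test the value, then sleep until a write."""
    while True:
        value = cell.load_monitor()
        if value != 0:
            return value
        cell.mwait(interval)
```

If the load and the arming were two separate steps in this model, a `store` arriving between them would go unrecorded and the waiter would sleep for the full interval; holding the lock across both is the software stand-in for the atomicity the abstract describes.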

Claims (20)

What is claimed is:
1. An apparatus, comprising:
an execution subsystem configured to perform a load instruction that causes the apparatus to retrieve data from a specified memory location and atomically begin monitoring for a write to the specified memory location.
2. The apparatus of claim 1, wherein the load instruction is a load-monitor instruction of a particular instruction set architecture and the execution subsystem is a load/store unit.
3. The apparatus of claim 1, wherein the apparatus is configured to:
suspend execution of a thread based on a wait instruction; and
resume execution of the thread in response to detecting a write to the specified memory location.
4. The apparatus of claim 1, wherein the apparatus is configured to begin monitoring for a write to the specified memory location before completion of the load instruction.
5. The apparatus of claim 1, further comprising:
a monitor unit; and
a cache comprising a plurality of cache lines;
wherein, to monitor for a write to the specified memory location, the monitor unit is configured to monitor a state of a cache line associated with the specified memory location.
6. The apparatus of claim 1, further comprising:
a monitor unit; and
an address bus;
wherein, to monitor for a write to the specified memory location, the monitor unit is configured to snoop the address bus.
7. The apparatus of claim 1, wherein the execution subsystem comprises:
a monitor unit; and
a load buffer;
wherein the execution subsystem is configured to speculatively perform the load instruction;
wherein the load buffer is configured to store the speculatively performed load; and
wherein, to atomically begin monitoring, the monitor unit is configured to begin monitoring for writes to the specified location while the speculatively performed load is stored in the load buffer.
8. An apparatus, comprising:
an execution subsystem configured to perform a wait instruction that causes the apparatus to suspend execution of a thread during at least a portion of an interval specified by the wait instruction;
wherein the apparatus is configured to resume execution of the thread upon an expiration of the interval.
9. The apparatus of claim 8,
wherein the wait instruction is a monitor-wait instruction of a particular instruction set architecture; and
wherein, to specify the interval, the monitor-wait instruction specifies a register configured to store the interval.
10. The apparatus of claim 8, wherein the wait instruction comprises a field that specifies the interval as an immediate value.
11. The apparatus of claim 8, wherein the apparatus is further configured to resume execution of the thread in response to detecting a write to a memory location specified by a previous monitor instruction.
12. The apparatus of claim 11, wherein the apparatus is further configured to indicate, after resuming the thread, whether the thread was resumed based on the interval or based on detecting the write.
13. The apparatus of claim 8, wherein execution of the wait instruction causes the apparatus to suspend execution of the thread in response to the interval being longer than a threshold interval.
14. The apparatus of claim 8, wherein the apparatus is also configured to resume execution of the thread in response to:
a trap request; or
a change in a processing state of the thread.
15. The apparatus of claim 8,
wherein the execution subsystem, in response to receiving another instance of the wait instruction, is configured to perform a no-operation in response to one of a set of criteria being satisfied;
wherein the set of criteria is selected from the group consisting of: no monitor instruction is pending, a write to a memory location specified by a most recent monitor instruction is detected, and a trap occurs between the most recent monitor instruction and the other instance of the wait instruction.
16. A method, comprising:
an execution unit in a processor performing a load instruction, wherein the performing includes:
the execution unit causing data specified by the load instruction to be retrieved from a memory location; and
monitoring for a store to the memory location;
wherein the causing data specified by the load instruction to be retrieved and beginning the monitoring are performed atomically.
17. The method of claim 16, further comprising:
the processor performing a wait instruction that specifies a suspension interval;
the processor suspending execution of a processor thread; and
the processor resuming execution of the thread in response to detecting an end of the suspension interval.
18. The method of claim 16, wherein the load instruction is an atomic load-monitor instruction, and wherein the atomic load-monitor instruction is a most recently executed load-monitor instruction prior to the wait instruction.
19. The method of claim 16, wherein the monitoring includes monitoring for a store by another processor thread to the memory location.
20. The method of claim 16, wherein the monitoring includes monitoring a state of a cache line associated with the memory location.
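The conditions recited in claims 12 through 15 can be read as a small decision procedure. The sketch below is one possible reading, not the claimed implementation. The names `WaitAction`, `WakeReason`, `decide_wait`, and `wake_reason` are hypothetical. Three choices in it are assumptions rather than anything the claims state: claim 13 is read as comparing the specified interval against a threshold interval, an interval at or below the threshold is treated as a no-operation, and the order in which wake reasons are checked is arbitrary.

```python
from enum import Enum


class WaitAction(Enum):
    NOP = "nop"
    SUSPEND = "suspend"


class WakeReason(Enum):
    WRITE = "write"
    TRAP = "trap"
    STATE_CHANGE = "state_change"
    INTERVAL = "interval"


def decide_wait(monitor_pending, write_detected, trap_since_monitor,
                interval, threshold=0):
    """Decide whether a wait instruction suspends the thread or does nothing."""
    # Claim 15: any one of these criteria turns the wait into a no-operation.
    if not monitor_pending or write_detected or trap_since_monitor:
        return WaitAction.NOP
    # Claim 13 (as read here): suspend only if the interval exceeds a threshold.
    # Treating the shorter case as a no-operation is an assumption.
    if interval > threshold:
        return WaitAction.SUSPEND
    return WaitAction.NOP


def wake_reason(write_detected, trap_requested, state_changed, elapsed, interval):
    """Return why a suspended thread resumes, or None if it stays suspended."""
    # The check order is an assumption; the claims do not rank the causes.
    if write_detected:
        return WakeReason.WRITE         # claim 11
    if trap_requested:
        return WakeReason.TRAP          # claim 14
    if state_changed:
        return WakeReason.STATE_CHANGE  # claim 14
    if elapsed >= interval:
        return WakeReason.INTERVAL      # claim 8
    return None
```

Returning the reason as a value mirrors claim 12, where the apparatus indicates after resumption whether the interval or a detected write caused the thread to resume.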
US13/607,175 | priority 2012-09-07 | filed 2012-09-07 | Load-monitor mwait | Abandoned | US20140075163A1 (en)

Priority Applications (2)

Application Number | Publication | Priority Date | Filing Date | Title
US13/607,175 | US20140075163A1 (en) | 2012-09-07 | 2012-09-07 | Load-monitor mwait
US14/967,954 | US9940132B2 (en) | 2012-09-07 | 2015-12-14 | Load-monitor mwait

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
US13/607,175 | US20140075163A1 (en) | 2012-09-07 | 2012-09-07 | Load-monitor mwait

Related Child Applications (1)

Application Number | Relation | Publication | Priority Date | Filing Date | Title
US14/967,954 | Continuation | US9940132B2 (en) | 2012-09-07 | 2015-12-14 | Load-monitor mwait

Publications (1)

Publication Number | Publication Date
US20140075163A1 | 2014-03-13

Family

ID=50234602

Family Applications (2)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US13/607,175 | Abandoned | US20140075163A1 (en) | 2012-09-07 | 2012-09-07 | Load-monitor mwait
US14/967,954 | Active (2033-03-19) | US9940132B2 (en) | 2012-09-07 | 2015-12-14 | Load-monitor mwait

Family Applications After (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US14/967,954 | Active (2033-03-19) | US9940132B2 (en) | 2012-09-07 | 2015-12-14 | Load-monitor mwait

Country Status (1)

Country | Link
US (2) | US20140075163A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20100332538A1 * | 2009-06-30 | 2010-12-30 | Microsoft Corporation | Hardware accelerated transactional memory system with open nested transactions
US20110154079A1 * | 2009-12-18 | 2011-06-23 | Dixon Martin G | Instruction For Enabling A Procesor Wait State

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US5630095A * | 1993-08-03 | 1997-05-13 | Motorola Inc. | Method for use with a data coherency protocol allowing multiple snoop queries to a single snoop transaction and system therefor
US6484254B1 * | 1999-12-30 | 2002-11-19 | Intel Corporation | Method, apparatus, and system for maintaining processor ordering by checking load addresses of unretired load instructions against snooping store addresses
US7213093B2 | 2003-06-27 | 2007-05-01 | Intel Corporation | Queued locks using monitor-memory wait
US7810083B2 * | 2004-12-30 | 2010-10-05 | Intel Corporation | Mechanism to emulate user-level multithreading on an OS-sequestered sequencer
US8607235B2 * | 2004-12-30 | 2013-12-10 | Intel Corporation | Mechanism to schedule threads on OS-sequestered sequencers without operating system intervention
US9081687B2 * | 2007-12-28 | 2015-07-14 | Intel Corporation | Method and apparatus for MONITOR and MWAIT in a distributed cache architecture

Also Published As

Publication number | Publication date
US9940132B2 | 2018-04-10
US20160098274A1 | 2016-04-07

Legal Events

Code | Title | Description
AS | Assignment | Owner name: ORACLE INTERNATIONAL CORPORATION, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: LOEWENSTEIN, PAUL N.; LUTTRELL, MARK A.; JORDAN, PAUL J.; REEL/FRAME: 028919/0598. Effective date: 2012-09-04
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

