US20060015547A1 - Efficient circuits for out-of-order microprocessors - Google Patents


Publication number: US20060015547A1
Authority: US (United States)
Prior art keywords: instruction, instructions, register, circuit, memory
Legal status: Abandoned (an assumption, not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: US11/077,565
Inventors: Bradley Kuszmaul, Dana Henry-Kuszmaul
Current assignee: Yale University (the listed assignees may be inaccurate)
Original assignee: Yale University
Priority date: 1998-03-12 (an assumption, not a legal conclusion)
Filing date: 2005-03-07
Publication date: 2006-01-19
Application filed by: Yale University


Abstract

The poor scalability of existing superscalar processors has been of great concern to the computer engineering community. In particular, the critical-path delays of many components in existing implementations grow quadratically with the issue width and the window size. This patent presents a novel way to reimplement these components and reduce the growth of their critical-path delays. It then describes an entire processor microarchitecture, called the Ultrascalar processor, whose critical-path delay grows more slowly than that of existing superscalars.

Most of our scalable designs are based on a single circuit, a cyclic segmented parallel prefix (CSPP). We observe that processor components typically operate on a wrap-around sequence of instructions, computing some associative property of that sequence. For example, to assign an ALU to the oldest requesting instruction, each instruction in the sequence must be told whether any preceding instruction is requesting an ALU. Similarly, to read an argument register, an instruction must communicate with the most recent preceding instruction that wrote that register.

A CSPP circuit can implement such functions by computing, for each instruction within a wrap-around instruction sequence, the cumulative result of applying some associative operator to all the preceding instructions. A CSPP circuit has a critical-path gate delay logarithmic in the length of the instruction sequence. Depending on its associative operation and its layout, a CSPP circuit can have a critical-path wire delay sublinear in the length of the instruction sequence.
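The function that a cyclic segmented parallel prefix computes can be illustrated with a small software model. The sketch below is a sequential reference model only, not the logarithmic-depth circuit the abstract describes. It assumes that a segment bit marks the oldest instruction in the wrap-around window and that each output combines the inputs from that oldest position up to, but not including, the current position. The names `cspp` and `grant_oldest_requester` are hypothetical and do not come from the patent.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def cspp(values: Sequence[T], segment: Sequence[bool],
         op: Callable[[T, T], T]) -> List[T]:
    """Sequential reference model of a cyclic segmented parallel prefix.

    out[i] = values[k] op ... op values[i-1], indices taken modulo n, where
    k is the nearest position cyclically before i whose segment bit is set.
    This models only the function computed, not the log-depth circuit.
    """
    n = len(values)
    if n == 0 or not any(segment):
        raise ValueError("need at least one element and one segment bit")
    out = []
    for i in range(n):
        # Walk backwards from i-1 until a segment start is reached.
        chain = []
        j = (i - 1) % n
        while True:
            chain.append(values[j])
            if segment[j]:
                break
            j = (j - 1) % n
        # Combine oldest-first so non-commutative operators also work.
        acc = chain[-1]
        for v in reversed(chain[:-1]):
            acc = op(acc, v)
        out.append(acc)
    return out


def grant_oldest_requester(requests: Sequence[bool], oldest: int) -> List[bool]:
    """Grant a single ALU to the oldest requesting instruction (assumed policy)."""
    n = len(requests)
    segment = [i == oldest for i in range(n)]
    # earlier[i] is True if any instruction from the oldest up to i-1 requests.
    earlier = cspp(requests, segment, lambda a, b: a or b)
    # The oldest instruction has no predecessors, so it ignores its prefix input.
    return [requests[i] and (i == oldest or not earlier[i]) for i in range(n)]


if __name__ == "__main__":
    # Window of four slots; slot 2 holds the oldest instruction.
    print(grant_oldest_requester([True, False, False, True], oldest=2))
```

In this model the window order is slot 2, 3, 0, 1, so slot 3 is the oldest requester and is the only one granted. A hardware CSPP would produce the same outputs using a tree of operator nodes instead of the backward walk used here.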


Claims (3)

1-2. (canceled)
3. An ultrascalar processor comprising:
a fetch stage;
a rename stage for processing output from the fetch stage;
an analyze stage for processing output from the rename stage;
a schedule stage for processing output from the analyze stage;
an execute stage for processing output from the schedule stage; and
a broadcast stage for processing output from the execute stage to the analyze stage,
wherein the scheduler is implemented using a cyclic segmented parallel prefix circuit.
4. An ultrascalar processor comprising:
a fetch stage;
a rename stage for processing output from the fetch stage;
an analyze stage for processing output from the rename stage;
a schedule stage for processing output from the analyze stage;
an execute stage for processing output from the schedule stage; and
a broadcast stage for processing output from the execute stage to the analyze stage,
wherein a parallel-prefix summation circuit computes program counts in the fetch stage.
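
The parallel-prefix summation named in claim 4 can be sketched in software. The example below assumes that "program counts" means the program counter value of each fetched instruction, that instructions may have different byte lengths, and that the fetched group is straight-line code with no taken branches. None of these assumptions is stated in the claim. The doubling scan shown is a generic logarithmic-step prefix sum, not necessarily the circuit the inventors describe, and the names `prefix_sums` and `program_counters` are hypothetical.

```python
from typing import List, Sequence


def prefix_sums(xs: Sequence[int]) -> List[int]:
    """Inclusive prefix sums computed in about log2(n) doubling steps.

    Each step models one level of a parallel-prefix adder network: every
    position adds the partial sum held `step` positions to its left.
    """
    sums = list(xs)
    step = 1
    while step < len(sums):
        sums = [sums[i] + (sums[i - step] if i >= step else 0)
                for i in range(len(sums))]
        step *= 2
    return sums


def program_counters(start_pc: int, lengths: Sequence[int]) -> List[int]:
    """Program counter of each instruction, given the byte length of each one."""
    if not lengths:
        return []
    inclusive = prefix_sums(lengths)
    # Instruction i starts after the combined length of instructions 0..i-1.
    return [start_pc] + [start_pc + s for s in inclusive[:-1]]


if __name__ == "__main__":
    pcs = program_counters(0x1000, [4, 4, 2, 6])
    print([hex(pc) for pc in pcs])
```

Because every position is updated independently within a step, the same structure maps onto hardware with a number of adder levels that grows logarithmically with the number of instructions fetched.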
Application US11/077,565 | Priority date 1998-03-12 | Filing date 2005-03-07 | Efficient circuits for out-of-order microprocessors | Abandoned | US20060015547A1 (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
US11/077,565 | US20060015547A1 (en) | 1998-03-12 | 2005-03-07 | Efficient circuits for out-of-order microprocessors

Applications Claiming Priority (5)

Application Number | Publication | Priority Date | Filing Date | Title
US7766998P | | 1998-03-12 | 1998-03-12 |
US10831898P | | 1998-11-13 | 1998-11-13 |
US09/267,827 | US6609189B1 (en) | 1998-03-12 | 1999-03-12 | Cycle segmented prefix circuits
US10/608,621 | US20040034678A1 (en) | 1998-03-12 | 2003-06-27 | Efficient circuits for out-of-order microprocessors
US11/077,565 | US20060015547A1 (en) | 1998-03-12 | 2005-03-07 | Efficient circuits for out-of-order microprocessors

Related Parent Applications (1)

Application Number | Relation | Publication | Priority Date | Filing Date | Title
US10/608,621 | Continuation | US20040034678A1 (en) | 1998-03-12 | 2003-06-27 | Efficient circuits for out-of-order microprocessors

Publications (1)

Publication Number | Publication Date
US20060015547A1 (en) | 2006-01-19

Family

Family ID: 27738994

Family Applications (3)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US09/267,827 | Expired - Fee Related | US6609189B1 (en) | 1998-03-12 | 1999-03-12 | Cycle segmented prefix circuits
US10/608,621 | Abandoned | US20040034678A1 (en) | 1998-03-12 | 2003-06-27 | Efficient circuits for out-of-order microprocessors
US11/077,565 | Abandoned | US20060015547A1 (en) | 1998-03-12 | 2005-03-07 | Efficient circuits for out-of-order microprocessors


Country Status (1)

Country | Link
US (3) | US6609189B1 (en)







Also Published As

Publication number | Publication date
US6609189B1 (en) | 2003-08-19
US20040034678A1 (en) | 2004-02-19


Legal Events

Date | Code | Title | Description
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

