US20080071851A1 - Instruction and logic for performing a dot-product operation - Google Patents

Instruction and logic for performing a dot-product operation
Info

Publication number
US20080071851A1
Authority
US
United States
Prior art keywords
packed
data
product
dot
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/524,852
Inventor
Ronen Zohar
Mark Seconi
Rajesh Parthasarathy
Srinivas Chennupaty
Mark Buxton
Chuck DeSylva
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2006-09-20
Filing date
2006-09-20
Publication date
2008-03-20
Application filed by Individual
Priority to US11/524,852 (US20080071851A1)
Assigned to INTEL CORPORATION. Assignment of assignors interest (see document for details). Assignors: DESYLVA, CHUCK; BUXTON, MARK; CHENNUPATY, SRINIVAS; PARTHASARATHY, SRINIVAS; SECONI, MARK; ZOHAR, RONEN
Priority to CN201710964492.XA (CN107741842B)
Priority to JP2007244076A (JP4697639B2)
Priority to CN2007101806477A (CN101187861B)
Priority to KR1020117020282A (KR101300431B1)
Priority to CN2011104607310A (CN102622203A)
Priority to PCT/US2007/079098 (WO2008036859A1)
Priority to KR1020097005675A (KR101105527B1)
Priority to DE112007002101T (DE112007002101T5)
Priority to CN201510348092.7A (CN105022605B)
Priority to RU2009114818/08A (RU2421796C2)
Priority to CN201010535666.9A (CN102004628B)
Publication of US20080071851A1
Priority to US13/844,366 (US20130290392A1)
Priority to US14/042,681 (US20140032624A1)
Priority to US14/042,696 (US20140032881A1)
Priority to US15/640,395 (US20170364476A1)
Legal status: Abandoned (current)

Abstract

Method, apparatus, and program means for performing a dot-product operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources store to a storage location a result value equal to a dot-product of at least two operands.
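
As a rough illustration of what the abstract means by a result "equal to a dot-product of at least two operands", the following Python sketch multiplies corresponding elements of two equal-length operands and sums the products. The function name `dot_product`, the four-element operands, and the use of Python floats are assumptions made for illustration only; the abstract does not state the element count, the data type, or how the execution resources compute the result.

```python
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products of two equal-length operands.

    Illustrative software model only; it is not the claimed hardware logic.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of elements")
    total = 0.0
    for x, y in zip(a, b):
        total += x * y  # multiply matching elements, accumulate the products
    return total


# Hypothetical four-element packed operands.
src1 = [1.0, 2.0, 3.0, 4.0]
src2 = [5.0, 6.0, 7.0, 8.0]

# 1*5 + 2*6 + 3*7 + 4*8 = 70.0
result = dot_product(src1, src2)
print(result)
```

In this model the single returned value plays the role of the result stored to the storage location; a single instruction would produce it without the explicit loop.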

Description

Claims (39)

US11/524,852 | Priority date 2006-09-20 | Filing date 2006-09-20 | Instruction and logic for performing a dot-product operation | Abandoned | US20080071851A1 (en)

Priority Applications (16)

Application Number | Publication | Priority Date | Filing Date | Title
US11/524,852 | US20080071851A1 (en) | 2006-09-20 | 2006-09-20 | Instruction and logic for performing a dot-product operation
CN201010535666.9A | CN102004628B (en) | 2006-09-20 | 2007-09-20 | Instruction and logic for performing a dot-product operation
DE112007002101T | DE112007002101T5 (en) | 2006-09-20 | 2007-09-20 | Instruction and logic for performing a dot product operation
RU2009114818/08A | RU2421796C2 (en) | 2006-09-20 | 2007-09-20 | Instruction and logical circuit to carry out dot product operation
CN2007101806477A | CN101187861B (en) | 2006-09-20 | 2007-09-20 | Instructions and logic to perform the dot product operation
KR1020117020282A | KR101300431B1 (en) | 2006-09-20 | 2007-09-20 | Instruction and logic for performing a dot-product operation
CN2011104607310A | CN102622203A (en) | 2006-09-20 | 2007-09-20 | Instruction and logic for performing a dot-product operation
PCT/US2007/079098 | WO2008036859A1 (en) | 2006-09-20 | 2007-09-20 | Instruction and logic for performing a dot-product operation
KR1020097005675A | KR101105527B1 (en) | 2006-09-20 | 2007-09-20 | Instruction and logic for performing a dot-product operation
CN201710964492.XA | CN107741842B (en) | 2006-09-20 | 2007-09-20 | Instructions and logic for performing dot product operations
CN201510348092.7A | CN105022605B (en) | 2006-09-20 | 2007-09-20 | Instruction for executing dot-product operation and logic
JP2007244076A | JP4697639B2 (en) | 2006-09-20 | 2007-09-20 | Instructions and logic for performing dot product operations
US13/844,366 | US20130290392A1 (en) | 2006-09-20 | 2013-03-15 | Instruction and logic for performing a dot-product operation
US14/042,681 | US20140032624A1 (en) | 2006-09-20 | 2013-09-30 | Instruction and logic for performing a dot-product operation
US14/042,696 | US20140032881A1 (en) | 2006-09-20 | 2013-09-30 | Instruction and logic for performing a dot-product operation
US15/640,395 | US20170364476A1 (en) | 2006-09-20 | 2017-06-30 | Instruction and logic for performing a dot-product operation

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
US11/524,852 | US20080071851A1 (en) | 2006-09-20 | 2006-09-20 | Instruction and logic for performing a dot-product operation

Related Child Applications (1)

Application Number | Relation | Publication | Priority Date | Filing Date | Title
US13/844,366 | Continuation | US20130290392A1 (en) | 2006-09-20 | 2013-03-15 | Instruction and logic for performing a dot-product operation

Publications (1)

Publication Number | Publication Date
US20080071851A1 (en) | 2008-03-20

Family

ID=39189946

Family Applications (5)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US11/524,852 | Abandoned | US20080071851A1 (en) | 2006-09-20 | 2006-09-20 | Instruction and logic for performing a dot-product operation
US13/844,366 | Abandoned | US20130290392A1 (en) | 2006-09-20 | 2013-03-15 | Instruction and logic for performing a dot-product operation
US14/042,696 | Abandoned | US20140032881A1 (en) | 2006-09-20 | 2013-09-30 | Instruction and logic for performing a dot-product operation
US14/042,681 | Abandoned | US20140032624A1 (en) | 2006-09-20 | 2013-09-30 | Instruction and logic for performing a dot-product operation
US15/640,395 | Abandoned | US20170364476A1 (en) | 2006-09-20 | 2017-06-30 | Instruction and logic for performing a dot-product operation

Family Applications After (4)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US13/844,366 | Abandoned | US20130290392A1 (en) | 2006-09-20 | 2013-03-15 | Instruction and logic for performing a dot-product operation
US14/042,696 | Abandoned | US20140032881A1 (en) | 2006-09-20 | 2013-09-30 | Instruction and logic for performing a dot-product operation
US14/042,681 | Abandoned | US20140032624A1 (en) | 2006-09-20 | 2013-09-30 | Instruction and logic for performing a dot-product operation
US15/640,395 | Abandoned | US20170364476A1 (en) | 2006-09-20 | 2017-06-30 | Instruction and logic for performing a dot-product operation

Country Status (7)

Country | Link
US (5) | US20080071851A1 (en)
JP (1) | JP4697639B2 (en)
KR (2) | KR101300431B1 (en)
CN (5) | CN105022605B (en)
DE (1) | DE112007002101T5 (en)
RU (1) | RU2421796C2 (en)
WO (1) | WO2008036859A1 (en)

Cited By (74)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20080114826A1 (en) * | 2006-10-31 | 2008-05-15 | Eric Oliver Mejdrich | Single Precision Vector Dot Product with "Word" Vector Write Mask
US20080114824A1 (en) * | 2006-10-31 | 2008-05-15 | Eric Oliver Mejdrich | Single Precision Vector Permute Immediate with "Word" Vector Write Mask
US20080170688A1 (en) * | 2007-01-15 | 2008-07-17 | Hitachi-Lg Data Storage Korea, Inc. | Method of recording and reproducing data on and from optical disc
US20090249026A1 (en) * | 2008-03-28 | 2009-10-01 | Mikhail Smelyanskiy | Vector instructions to enable efficient synchronization and parallel reduction operations
CN102520906A (en) * | 2011-12-13 | 2012-06-27 | 中国科学院自动化研究所 | Vector dot product accumulating network supporting reconfigurable fixed floating point and configurable vector length
WO2013101018A1 (en) * | 2011-12-29 | 2013-07-04 | Intel Corporation | Dot product processors, methods, systems, and instructions
US8577948B2 (en) | 2010-09-20 | 2013-11-05 | Intel Corporation | Split path multiply accumulate unit
US20130339689A1 (en) * | 2011-12-29 | 2013-12-19 | Srikanth T. Srinivasan | Later stage read port reduction
US20140068227A1 (en) * | 2011-12-22 | 2014-03-06 | Bret L. Toll | Systems, apparatuses, and methods for extracting a writemask from a register
US8688957B2 (en) | 2010-12-21 | 2014-04-01 | Intel Corporation | Mechanism for conflict detection using SIMD
US20140207838A1 (en) * | 2011-12-22 | 2014-07-24 | Klaus Danne | Method, apparatus and system for execution of a vector calculation instruction
US9360920B2 (en) | 2011-11-21 | 2016-06-07 | Intel Corporation | Reducing power consumption in a fused multiply-add (FMA) unit of a processor
US20160224344A1 (en) * | 2015-02-02 | 2016-08-04 | Optimum Semiconductor Technologies, Inc. | Vector processor configured to operate on variable length vectors using digital signal processing instructions
US9411592B2 (en) | 2012-12-29 | 2016-08-09 | Intel Corporation | Vector address conflict resolution with vector population count functionality
US9411584B2 (en) | 2012-12-29 | 2016-08-09 | Intel Corporation | Methods, apparatus, instructions, and logic to provide vector address conflict detection functionality
US9495165B2 (en) | 2009-12-17 | 2016-11-15 | Intel Corporation | Method and apparatus for performing a shift and exclusive or operation in a single instruction
US9898286B2 (en) | 2015-05-05 | 2018-02-20 | Intel Corporation | Packed finite impulse response (FIR) filter processors, methods, systems, and instructions
US10049082B2 (en) * | 2016-09-15 | 2018-08-14 | Altera Corporation | Dot product based processing elements
GB2560159A (en) * | 2017-02-23 | 2018-09-05 | Advanced Risc Mach Ltd | Widening arithmetic in a data processing apparatus
WO2018174925A1 (en) * | 2017-03-20 | 2018-09-27 | Intel Corporation | Systems, methods, and apparatuses for dot production operations
US10152401B2 (en) | 2012-02-02 | 2018-12-11 | Intel Corporation | Instruction and logic to test transactional execution status
US20190042541A1 (en) * | 2017-12-29 | 2019-02-07 | Intel Corporation | Systems, methods, and apparatuses for dot product operations
US20190079764A1 (en) * | 2017-09-08 | 2019-03-14 | Oracle International Corporation | Efficient direct convolution using simd instructions
US20190103857A1 (en) * | 2017-09-29 | 2019-04-04 | Zoran Zivkovic | Apparatus and method for performing horizontal filter operations
US20190227797A1 (en) * | 2018-01-24 | 2019-07-25 | Intel Corporation | Apparatus and method for vector multiply and accumulate of packed words
US20190242704A1 (en) * | 2018-02-06 | 2019-08-08 | Stmicroelectronics S.R.L. | Tilt event detection device, system and method
US10423411B2 (en) | 2015-09-26 | 2019-09-24 | Intel Corporation | Data element comparison processors, methods, systems, and instructions
US10642614B2 (en) * | 2018-09-29 | 2020-05-05 | Intel Corporation | Reconfigurable multi-precision integer dot-product hardware accelerator for machine-learning applications
US20200210517A1 (en) * | 2018-12-27 | 2020-07-02 | Intel Corporation | Systems and methods to accelerate multiplication of sparse matrices
WO2020190809A1 (en) * | 2019-03-15 | 2020-09-24 | Intel Corporation | Architecture for block sparse operations on a systolic array
US10866786B2 (en) | 2018-09-27 | 2020-12-15 | Intel Corporation | Systems and methods for performing instructions to transpose rectangular tiles
US10896043B2 (en) | 2018-09-28 | 2021-01-19 | Intel Corporation | Systems for performing instructions for fast element unpacking into 2-dimensional registers
US10922077B2 (en) | 2018-12-29 | 2021-02-16 | Intel Corporation | Apparatuses, methods, and systems for stencil configuration and computation instructions
US10929143B2 (en) | 2018-09-28 | 2021-02-23 | Intel Corporation | Method and apparatus for efficient matrix alignment in a systolic array
US10929503B2 (en) | 2018-12-21 | 2021-02-23 | Intel Corporation | Apparatus and method for a masked multiply instruction to support neural network pruning operations
CN112394987A (en) * | 2019-08-13 | 2021-02-23 | 上海寒武纪信息科技有限公司 | Short shaping to half precision floating point instruction processing device, method and related product
US10942985B2 (en) | 2018-12-29 | 2021-03-09 | Intel Corporation | Apparatuses, methods, and systems for fast fourier transform configuration and computation instructions
US10963246B2 (en) | 2018-11-09 | 2021-03-30 | Intel Corporation | Systems and methods for performing 16-bit floating-point matrix dot product instructions
US10963256B2 (en) | 2018-09-28 | 2021-03-30 | Intel Corporation | Systems and methods for performing instructions to transform matrices into row-interleaved format
US10970076B2 (en) | 2018-09-14 | 2021-04-06 | Intel Corporation | Systems and methods for performing instructions specifying ternary tile logic operations
US10990396B2 (en) | 2018-09-27 | 2021-04-27 | Intel Corporation | Systems for performing instructions to quickly convert and use tiles as 1D vectors
US10990397B2 (en) | 2019-03-30 | 2021-04-27 | Intel Corporation | Apparatuses, methods, and systems for transpose instructions of a matrix operations accelerator
US11016731B2 (en) | 2019-03-29 | 2021-05-25 | Intel Corporation | Using Fuzzy-Jbit location of floating-point multiply-accumulate results
US11023235B2 (en) | 2017-12-29 | 2021-06-01 | Intel Corporation | Systems and methods to zero a tile register pair
US11036504B2 (en) | 2018-11-09 | 2021-06-15 | Intel Corporation | Systems and methods for performing 16-bit floating-point vector dot product instructions
US11048508B2 (en) | 2016-07-02 | 2021-06-29 | Intel Corporation | Interruptible and restartable matrix multiplication instructions, processors, methods, and systems
US11093579B2 (en) | 2018-09-05 | 2021-08-17 | Intel Corporation | FP16-S7E8 mixed precision for deep learning and other algorithms
US11093247B2 (en) | 2017-12-29 | 2021-08-17 | Intel Corporation | Systems and methods to load a tile register pair
US11175891B2 (en) | 2019-03-30 | 2021-11-16 | Intel Corporation | Systems and methods to perform floating-point addition with selected rounding
US11249761B2 (en) | 2018-09-27 | 2022-02-15 | Intel Corporation | Systems and methods for performing matrix compress and decompress instructions
US11263291B2 (en) * | 2020-06-26 | 2022-03-01 | Intel Corporation | Systems and methods for combining low-mantissa units to achieve and exceed FP64 emulation of matrix multiplication
US11269630B2 (en) | 2019-03-29 | 2022-03-08 | Intel Corporation | Interleaved pipeline of floating-point adders
US11275588B2 (en) | 2017-07-01 | 2022-03-15 | Intel Corporation | Context save with variable save state size
US11294671B2 (en) | 2018-12-26 | 2022-04-05 | Intel Corporation | Systems and methods for performing duplicate detection instructions on 2D data
US11334647B2 (en) | 2019-06-29 | 2022-05-17 | Intel Corporation | Apparatuses, methods, and systems for enhanced matrix multiplier architecture
US11361496B2 (en) | 2019-03-15 | 2022-06-14 | Intel Corporation | Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format
US11403097B2 (en) | 2019-06-26 | 2022-08-02 | Intel Corporation | Systems and methods to skip inconsequential matrix operations
US11416260B2 (en) | 2018-03-30 | 2022-08-16 | Intel Corporation | Systems and methods for implementing chained tile operations
US11579883B2 (en) | 2018-09-14 | 2023-02-14 | Intel Corporation | Systems and methods for performing horizontal tile operations
US11714875B2 (en) | 2019-12-28 | 2023-08-01 | Intel Corporation | Apparatuses, methods, and systems for instructions of a matrix operations accelerator
US11720355B2 (en) | 2017-04-28 | 2023-08-08 | Intel Corporation | Instructions and logic to perform floating point and integer operations for machine learning
US11789729B2 (en) | 2017-12-29 | 2023-10-17 | Intel Corporation | Systems and methods for computing dot products of nibbles in two tile operands
US11809869B2 (en) | 2017-12-29 | 2023-11-07 | Intel Corporation | Systems and methods to store a tile register pair to memory
US11816483B2 (en) | 2017-12-29 | 2023-11-14 | Intel Corporation | Systems, methods, and apparatuses for matrix operations
US11886875B2 (en) | 2018-12-26 | 2024-01-30 | Intel Corporation | Systems and methods for performing nibble-sized operations on matrix elements
US11934342B2 (en) | 2019-03-15 | 2024-03-19 | Intel Corporation | Assistance for hardware prefetch in cache access
US11941395B2 (en) | 2020-09-26 | 2024-03-26 | Intel Corporation | Apparatuses, methods, and systems for instructions for 16-bit floating-point matrix dot product instructions
US11972230B2 (en) | 2020-06-27 | 2024-04-30 | Intel Corporation | Matrix transpose and multiply
US12001887B2 (en) | 2020-12-24 | 2024-06-04 | Intel Corporation | Apparatuses, methods, and systems for instructions for aligning tiles of a matrix operations accelerator
US12001385B2 (en) | 2020-12-24 | 2024-06-04 | Intel Corporation | Apparatuses, methods, and systems for instructions for loading a tile of a matrix operations accelerator
US12056059B2 (en) | 2019-03-15 | 2024-08-06 | Intel Corporation | Systems and methods for cache optimization
US12112167B2 (en) | 2020-06-27 | 2024-10-08 | Intel Corporation | Matrix data scatter and gather between rows and irregularly spaced memory locations
US12175252B2 (en) | 2017-04-24 | 2024-12-24 | Intel Corporation | Concurrent multi-datatype execution within a processing resource
US12361600B2 (en) | 2019-11-15 | 2025-07-15 | Intel Corporation | Systolic arithmetic on sparse data

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20080071851A1 (en) * | 2006-09-20 | 2008-03-20 | Ronen Zohar | Instruction and logic for performing a dot-product operation
US8515052B2 (en) | 2007-12-17 | 2013-08-20 | Wai Wu | Parallel signal processing system and method
CN102184521B (en) * | 2011-03-24 | 2013-03-06 | 苏州迪吉特电子科技有限公司 | High-performance image processing system and image processing method
US9804844B2 (en) * | 2011-09-26 | 2017-10-31 | Intel Corporation | Instruction and logic to provide stride-based vector load-op functionality with mask duplication
BR112014004603A2 (en) * | 2011-09-26 | 2017-06-13 | Intel Corp | instruction and logic for providing step-masking functionality and vector loads and stores
US20130311753A1 (en) * | 2012-05-19 | 2013-11-21 | Venu Kandadai | Method and device (universal multifunction accelerator) for accelerating computations by parallel computations of middle stratum operations
WO2014004222A1 (en) * | 2012-06-29 | 2014-01-03 | Intel Corporation | Instruction and logic to test transactional execution status
JP6378515B2 (en) * | 2014-03-26 | 2018-08-22 | 株式会社メガチップス | VLIW processor
US20170046153A1 (en) * | 2015-08-14 | 2017-02-16 | Qualcomm Incorporated | Simd multiply and horizontal reduce operations
US10007519B2 (en) * | 2015-12-22 | 2018-06-26 | Intel IP Corporation | Instructions and logic for vector bit field compression and expansion
US20170185402A1 (en) * | 2015-12-23 | 2017-06-29 | Intel Corporation | Instructions and logic for bit field address and insertion
US9875084B2 (en) * | 2016-04-28 | 2018-01-23 | Vivante Corporation | Calculating trigonometric functions using a four input dot product circuit
CN106874796B (en) * | 2017-02-16 | 2021-03-30 | 中云信安(深圳)科技有限公司 | Safety detection and fault-tolerant method for instruction stream in system operation
CN110312993B (en) * | 2017-02-23 | 2024-04-19 | Arm有限公司 | Vector element-by-element operations in data processing devices
CN106951211B (en) * | 2017-03-27 | 2019-10-18 | 南京大学 | A Reconfigurable Fixed-Floating-Point Universal Multiplier
CN107220702B (en) * | 2017-06-21 | 2020-11-24 | 北京图森智途科技有限公司 | A computer vision processing method and device for low computing power processing equipment
GB2563878B (en) * | 2017-06-28 | 2019-11-20 | Advanced Risc Mach Ltd | Register-based matrix multiplication
CN109062607B (en) * | 2017-10-30 | 2021-09-21 | 上海寒武纪信息科技有限公司 | Machine learning processor and method for executing vector minimum instruction using the processor
CN109871236B (en) * | 2017-12-01 | 2025-05-06 | 超威半导体公司 | Stream processor with low-power parallel matrix multiplication pipeline
US10657442B2 (en) * | 2018-04-19 | 2020-05-19 | International Business Machines Corporation | Deep learning accelerator architecture with chunking GEMM
US11990137B2 (en) | 2018-09-13 | 2024-05-21 | Shanghai Cambricon Information Technology Co., Ltd. | Image retouching method and terminal device
US10768895B2 (en) * | 2018-11-08 | 2020-09-08 | Movidius Limited | Dot product calculators and methods of operating the same
US11416580B2 (en) * | 2019-11-13 | 2022-08-16 | Intel Corporation | Dot product multiplier mechanism
US11157238B2 (en) * | 2019-11-15 | 2021-10-26 | Intel Corporation | Use of a single instruction set architecture (ISA) instruction for vector normalization
KR102474054B1 (en) * | 2020-06-22 | 2022-12-06 | 주식회사 퓨리오사에이아이 | Neural network processor

Citations (39)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US1020060A (en) * | 1910-08-19 | 1912-03-12 | Otis Elevator Co | Conveyer.
US1467622A (en) * | 1922-04-20 | 1923-09-11 | Crawford E Mcmurphy | Nest box
US4771379A (en) * | 1985-10-23 | 1988-09-13 | Mitsubishi Denki Kabushiki Kaisha | Digital signal processor with parallel multipliers
US4949250A (en) * | 1988-03-18 | 1990-08-14 | Digital Equipment Corporation | Method and apparatus for executing instructions for a vector processing system
US5111422A (en) * | 1989-09-20 | 1992-05-05 | Deutsche Itt Industries Gmbh | Circuit arrangement for calculating product sums
US5311459A (en) * | 1992-09-17 | 1994-05-10 | Eastman Kodak Company | Selectively configurable integrated circuit device for performing multiple digital signal processing functions
US5422799A (en) * | 1994-09-15 | 1995-06-06 | Morrison, Sr.; Donald J. | Indicating flashlight
US5506865A (en) * | 1992-11-24 | 1996-04-09 | Qualcomm Incorporated | Pilot carrier dot product circuit
US5669010A (en) * | 1992-05-18 | 1997-09-16 | Silicon Engines | Cascaded two-stage computational SIMD engine having multi-port memory and multiple arithmetic units
US5793661A (en) * | 1995-12-26 | 1998-08-11 | Intel Corporation | Method and apparatus for performing multiply and accumulate operations on packed data
US5859789A (en) * | 1995-07-18 | 1999-01-12 | Sgs-Thomson Microelectronics Limited | Arithmetic unit
US5983257A (en) * | 1995-12-26 | 1999-11-09 | Intel Corporation | System for signal processing using multiply-add operations
US5987490A (en) * | 1997-11-14 | 1999-11-16 | Lucent Technologies Inc. | Mac processor with efficient Viterbi ACS operation and automatic traceback store
US5996066A (en) * | 1996-10-10 | 1999-11-30 | Sun Microsystems, Inc. | Partitioned multiply and add/subtract instruction for CPU with integrated graphics functions
US6115812A (en) * | 1998-04-01 | 2000-09-05 | Intel Corporation | Method and apparatus for efficient vertical SIMD computations
US6128726A (en) * | 1996-06-04 | 2000-10-03 | Sigma Designs, Inc. | Accurate high speed digital signal processor
JP2001256199A (en) * | 2000-03-13 | 2001-09-21 | Hitachi Ltd | Data processor and data processing system
US6353843B1 (en) * | 1999-10-08 | 2002-03-05 | Sony Corporation Of Japan | High performance universal multiplier circuit
US6385634B1 (en) * | 1995-08-31 | 2002-05-07 | Intel Corporation | Method for performing multiply-add operations on packed data
US20020078011A1 (en) * | 2000-05-05 | 2002-06-20 | Lee Ruby B. | Method and system for performing permutations with bit permutation instructions
US6557022B1 (en) * | 2000-02-26 | 2003-04-29 | Qualcomm, Incorporated | Digital signal processor with coupled multiply-accumulate units
US20030084083A1 (en) * | 2001-07-31 | 2003-05-01 | Hull James M. | Method and apparatus for performing integer multiply operations using primitive multi-media operations that operate on smaller operands
US6571328B2 (en) * | 2000-04-07 | 2003-05-27 | Nintendo Co., Ltd. | Method and apparatus for obtaining a scalar value directly from a vector register
US6574651B1 (en) * | 1999-10-01 | 2003-06-03 | Hitachi, Ltd. | Method and apparatus for arithmetic operation on vectored data
US20030151608A1 (en) * | 2002-01-17 | 2003-08-14 | Chung Chris Yoochang | Programmable 3D graphics pipeline for multimedia applications
US6675286B1 (en) * | 2000-04-27 | 2004-01-06 | University Of Washington | Multimedia instruction set for wide data paths
US6687724B1 (en) * | 1999-05-07 | 2004-02-03 | Sony Corporation | Information processor
US20040078549A1 (en) * | 2002-06-03 | 2004-04-22 | Tetsuya Tanaka | Processor executing SIMD instructions
US6728874B1 (en) * | 2000-10-10 | 2004-04-27 | Koninklijke Philips Electronics N.V. | System and method for processing vectorized data
US20040230632A1 (en) * | 2002-09-24 | 2004-11-18 | Interdigital Technology Corporation | Computationally efficient mathematical engine
US20040263519A1 (en) * | 2003-06-30 | 2004-12-30 | Microsoft Corporation | System and method for parallel execution of data generation tasks
US20040267857A1 (en) * | 2003-06-30 | 2004-12-30 | Abel James C. | SIMD integer multiply high with round and shift
US20050071415A1 (en) * | 2003-09-30 | 2005-03-31 | Broadcom Corporation | Methods for performing multiply-accumulate operations on operands representing complex numbers
US20050071413A1 (en) * | 2003-05-09 | 2005-03-31 | Schulte Michael J. | Processor reduction unit for accumulation of multiple operands with or without saturation
US6922716B2 (en) * | 2001-07-13 | 2005-07-26 | Motorola, Inc. | Method and apparatus for vector processing
US7062526B1 (en) * | 2000-02-18 | 2006-06-13 | Texas Instruments Incorporated | Microprocessor with rounding multiply instructions
US7072929B2 (en) * | 2000-11-01 | 2006-07-04 | Pts Corporation | Methods and apparatus for efficient complex long multiplication and covariance matrix implementation
US20060149804A1 (en) * | 2004-11-30 | 2006-07-06 | International Business Machines Corporation | Multiply-sum dot product instruction with mask and splat
US20130290392A1 (en) * | 2006-09-20 | 2013-10-31 | Ronen Zohar | Instruction and logic for performing a dot-product operation

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US5119484A (en) * | 1987-02-24 | 1992-06-02 | Digital Equipment Corporation | Selections between alternate control word and current instruction generated control word for alu in respond to alu output and current instruction
JPH05242065A (en) * | 1992-02-28 | 1993-09-21 | Hitachi Ltd | Information processor and its system
CN103064651B (en) * | 1995-08-31 | 2016-01-27 | 英特尔公司 | For performing the device of grouping multiplying in integrated data
JP3790307B2 (en) | 1996-10-16 | 2006-06-28 | 株式会社ルネサステクノロジ | Data processor and data processing system
US6230253B1 (en) * | 1998-03-31 | 2001-05-08 | Intel Corporation | Executing partial-width packed data instructions
US6484255B1 (en) * | 1999-09-20 | 2002-11-19 | Intel Corporation | Selective writing of data elements from packed data based upon a mask using predication
US6774903B1 (en) * | 2000-11-06 | 2004-08-10 | Ati International Srl | Palette anti-sparkle enhancement
US20040054877A1 (en) * | 2001-10-29 | 2004-03-18 | Macy William W. | Method and apparatus for shuffling data
US7539714B2 (en) * | 2003-06-30 | 2009-05-26 | Intel Corporation | Method, apparatus, and instruction for performing a sign operation that multiplies
GB2409063B (en) * | 2003-12-09 | 2006-07-12 | Advanced Risc Mach Ltd | Vector by scalar operations
KR20060044102A (en) * | 2004-11-11 | 2006-05-16 | 삼성전자주식회사 | Multipliers and Multiple Multiplication Methods Including Multiple Identical Subproduct Calculation Modules
US8074051B2 (en) * | 2004-04-07 | 2011-12-06 | Aspen Acquisition Corporation | Multithreaded processor with multiple concurrent pipelines per thread
US7475222B2 (en) * | 2004-04-07 | 2009-01-06 | Sandbridge Technologies, Inc. | Multi-threaded processor having compound instruction and operation formats

Patent Citations (42)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US1020060A (en) * | 1910-08-19 | 1912-03-12 | Otis Elevator Co | Conveyer.
US1467622A (en) * | 1922-04-20 | 1923-09-11 | Crawford E Mcmurphy | Nest box
US4771379A (en) * | 1985-10-23 | 1988-09-13 | Mitsubishi Denki Kabushiki Kaisha | Digital signal processor with parallel multipliers
US4949250A (en) * | 1988-03-18 | 1990-08-14 | Digital Equipment Corporation | Method and apparatus for executing instructions for a vector processing system
US5111422A (en) * | 1989-09-20 | 1992-05-05 | Deutsche Itt Industries Gmbh | Circuit arrangement for calculating product sums
US5669010A (en) * | 1992-05-18 | 1997-09-16 | Silicon Engines | Cascaded two-stage computational SIMD engine having multi-port memory and multiple arithmetic units
US5311459A (en) * | 1992-09-17 | 1994-05-10 | Eastman Kodak Company | Selectively configurable integrated circuit device for performing multiple digital signal processing functions
US5506865A (en) * | 1992-11-24 | 1996-04-09 | Qualcomm Incorporated | Pilot carrier dot product circuit
US5422799A (en) * | 1994-09-15 | 1995-06-06 | Morrison, Sr.; Donald J. | Indicating flashlight
US5859789A (en) * | 1995-07-18 | 1999-01-12 | Sgs-Thomson Microelectronics Limited | Arithmetic unit
US6385634B1 (en) * | 1995-08-31 | 2002-05-07 | Intel Corporation | Method for performing multiply-add operations on packed data
US5983257A (en) * | 1995-12-26 | 1999-11-09 | Intel Corporation | System for signal processing using multiply-add operations
US5793661A (en) * | 1995-12-26 | 1998-08-11 | Intel Corporation | Method and apparatus for performing multiply and accumulate operations on packed data
US6128726A (en) * | 1996-06-04 | 2000-10-03 | Sigma Designs, Inc. | Accurate high speed digital signal processor
US5996066A (en) * | 1996-10-10 | 1999-11-30 | Sun Microsystems, Inc. | Partitioned multiply and add/subtract instruction for CPU with integrated graphics functions
US5987490A (en) * | 1997-11-14 | 1999-11-16 | Lucent Technologies Inc. | Mac processor with efficient Viterbi ACS operation and automatic traceback store
US6115812A (en) * | 1998-04-01 | 2000-09-05 | Intel Corporation | Method and apparatus for efficient vertical SIMD computations
US6687724B1 (en) * | 1999-05-07 | 2004-02-03 | Sony Corporation | Information processor
US6574651B1 (en) * | 1999-10-01 | 2003-06-03 | Hitachi, Ltd. | Method and apparatus for arithmetic operation on vectored data
US6353843B1 (en) * | 1999-10-08 | 2002-03-05 | Sony Corporation Of Japan | High performance universal multiplier circuit
US7062526B1 (en) * | 2000-02-18 | 2006-06-13 | Texas Instruments Incorporated | Microprocessor with rounding multiply instructions
US6557022B1 (en) * | 2000-02-26 | 2003-04-29 | Qualcomm, Incorporated | Digital signal processor with coupled multiply-accumulate units
JP2001256199A (en) * | 2000-03-13 | 2001-09-21 | Hitachi Ltd | Data processor and data processing system
US7028066B2 (en) * | 2000-03-13 | 2006-04-11 | Renesas Technology Corp. | Vector SIMD processor
US6571328B2 (en) * | 2000-04-07 | 2003-05-27 | Nintendo Co., Ltd. | Method and apparatus for obtaining a scalar value directly from a vector register
US6675286B1 (en) * | 2000-04-27 | 2004-01-06 | University Of Washington | Multimedia instruction set for wide data paths
US20020078011A1 (en) * | 2000-05-05 | 2002-06-20 | Lee Ruby B. | Method and system for performing permutations with bit permutation instructions
US6728874B1 (en) * | 2000-10-10 | 2004-04-27 | Koninklijke Philips Electronics N.V. | System and method for processing vectorized data
US7072929B2 (en) * | 2000-11-01 | 2006-07-04 | Pts Corporation | Methods and apparatus for efficient complex long multiplication and covariance matrix implementation
US6922716B2 (en) * | 2001-07-13 | 2005-07-26 | Motorola, Inc. | Method and apparatus for vector processing
US20030084083A1 (en) * | 2001-07-31 | 2003-05-01 | Hull James M. | Method and apparatus for performing integer multiply operations using primitive multi-media operations that operate on smaller operands
US20030151608A1 (en) * | 2002-01-17 | 2003-08-14 | Chung Chris Yoochang | Programmable 3D graphics pipeline for multimedia applications
US20040078549A1 (en) * | 2002-06-03 | 2004-04-22 | Tetsuya Tanaka | Processor executing SIMD instructions
US20040230632A1 (en) * | 2002-09-24 | 2004-11-18 | Interdigital Technology Corporation | Computationally efficient mathematical engine
US20050071413A1 (en) * | 2003-05-09 | 2005-03-31 | Schulte Michael J. | Processor reduction unit for accumulation of multiple operands with or without saturation
US20040267857A1 (en) * | 2003-06-30 | 2004-12-30 | Abel James C. | SIMD integer multiply high with round and shift
US20040263519A1 (en) * | 2003-06-30 | 2004-12-30 | Microsoft Corporation | System and method for parallel execution of data generation tasks
US20050071415A1 (en) * | 2003-09-30 | 2005-03-31 | Broadcom Corporation | Methods for performing multiply-accumulate operations on operands representing complex numbers
US20060149804A1 (en) * | 2004-11-30 | 2006-07-06 | International Business Machines Corporation | Multiply-sum dot product instruction with mask and splat
US20130290392A1 (en) * | 2006-09-20 | 2013-10-31 | Ronen Zohar | Instruction and logic for performing a dot-product operation
US20140032624A1 (en) * | 2006-09-20 | 2014-01-30 | Ronen Zohar | Instruction and logic for performing a dot-product operation
US20140032881A1 (en) * | 2006-09-20 | 2014-01-30 | Ronen Zohar | Instruction and logic for performing a dot-product operation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
May et al. (May) (The PowerPC Architecture: A Specification for a New Family of RISC Processors); May 1994; Morgan Kaufmann Publishers, Inc. Second Edition; 3 total pages*

Cited By (193)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US8332452B2 (en) * | 2006-10-31 | 2012-12-11 | International Business Machines Corporation | Single precision vector dot product with “word” vector write mask
US20080114824A1 (en) * | 2006-10-31 | 2008-05-15 | Eric Oliver Mejdrich | Single Precision Vector Permute Immediate with "Word" Vector Write Mask
US9495724B2 (en) | 2006-10-31 | 2016-11-15 | International Business Machines Corporation | Single precision vector permute immediate with “word” vector write mask
US20080114826A1 (en) * | 2006-10-31 | 2008-05-15 | Eric Oliver Mejdrich | Single Precision Vector Dot Product with "Word" Vector Write Mask
US20080170688A1 (en) * | 2007-01-15 | 2008-07-17 | Hitachi-Lg Data Storage Korea, Inc. | Method of recording and reproducing data on and from optical disc
US9513905B2 (en) | 2008-03-28 | 2016-12-06 | Intel Corporation | Vector instructions to enable efficient synchronization and parallel reduction operations
US9678750B2 (en) | 2008-03-28 | 2017-06-13 | Intel Corporation | Vector instructions to enable efficient synchronization and parallel reduction operations
US20090249026A1 (en) * | 2008-03-28 | 2009-10-01 | Mikhail Smelyanskiy | Vector instructions to enable efficient synchronization and parallel reduction operations
US10684855B2 (en) | 2009-12-17 | 2020-06-16 | Intel Corporation | Method and apparatus for performing a shift and exclusive or operation in a single instruction
US9747105B2 (en) | 2009-12-17 | 2017-08-29 | Intel Corporation | Method and apparatus for performing a shift and exclusive or operation in a single instruction
US9495165B2 (en) | 2009-12-17 | 2016-11-15 | Intel Corporation | Method and apparatus for performing a shift and exclusive or operation in a single instruction
US9501281B2 (en) | 2009-12-17 | 2016-11-22 | Intel Corporation | Method and apparatus for performing a shift and exclusive or operation in a single instruction
US9495166B2 (en) | 2009-12-17 | 2016-11-15 | Intel Corporation | Method and apparatus for performing a shift and exclusive or operation in a single instruction
US8577948B2 (en) | 2010-09-20 | 2013-11-05 | Intel Corporation | Split path multiply accumulate unit
US8688957B2 (en) | 2010-12-21 | 2014-04-01 | Intel Corporation | Mechanism for conflict detection using SIMD
US9360920B2 (en) | 2011-11-21 | 2016-06-07 | Intel Corporation | Reducing power consumption in a fused multiply-add (FMA) unit of a processor
CN102520906A (en)*2011-12-132012-06-27中国科学院自动化研究所Vector dot product accumulating network supporting reconfigurable fixed floating point and configurable vector length
CN104303141A (en)*2011-12-222015-01-21英特尔公司Systems, apparatuses, and methods for extracting a writemask from a register
US20140207838A1 (en)*2011-12-222014-07-24Klaus DanneMethod, apparatus and system for execution of a vector calculation instruction
US20140068227A1 (en)*2011-12-222014-03-06Bret L. TollSystems, apparatuses, and methods for extracting a writemask from a register
EP2798457A4 (en)*2011-12-292016-07-27Intel Corp PROCESSORS, METHODS, SYSTEMS, AND SCALAR PRODUCT INSTRUCTIONS
TWI512612B (en)*2011-12-292015-12-11Intel CorpDot product processors, methods, systems and instructions
US20130339689A1 (en)*2011-12-292013-12-19Srikanth T. SrinivasanLater stage read port reduction
WO2013101018A1 (en)*2011-12-292013-07-04Intel CorporationDot product processors, methods, systems, and instructions
US10210065B2 (en)2012-02-022019-02-19Intel CorporationInstruction and logic to test transactional execution status
US10261879B2 (en)2012-02-022019-04-16Intel CorporationInstruction and logic to test transactional execution status
US10248524B2 (en)2012-02-022019-04-02Intel CorporationInstruction and logic to test transactional execution status
US10223227B2 (en)2012-02-022019-03-05Intel CorporationInstruction and logic to test transactional execution status
US10152401B2 (en)2012-02-022018-12-11Intel CorporationInstruction and logic to test transactional execution status
US10210066B2 (en)2012-02-022019-02-19Intel CorporationInstruction and logic to test transactional execution status
US9411592B2 (en)2012-12-292016-08-09Intel CorporationVector address conflict resolution with vector population count functionality
US9411584B2 (en)2012-12-292016-08-09Intel CorporationMethods, apparatus, instructions, and logic to provide vector address conflict detection functionality
US10824586B2 (en)2015-02-022020-11-03Optimum Semiconductor Technologies Inc.Vector processor configured to operate on variable length vectors using one or more complex arithmetic instructions
US10922267B2 (en)2015-02-022021-02-16Optimum Semiconductor Technologies Inc.Vector processor to operate on variable length vectors using graphics processing instructions
US10846259B2 (en)2015-02-022020-11-24Optimum Semiconductor Technologies Inc.Vector processor to operate on variable length vectors with out-of-order execution
US11544214B2 (en)2015-02-022023-01-03Optimum Semiconductor Technologies, Inc.Monolithic vector processor configured to operate on variable length vectors using a vector length register
US10733140B2 (en)2015-02-022020-08-04Optimum Semiconductor Technologies Inc.Vector processor configured to operate on variable length vectors using instructions that change element widths
US10339095B2 (en)*2015-02-022019-07-02Optimum Semiconductor Technologies Inc.Vector processor configured to operate on variable length vectors using digital signal processing instructions
US20160224344A1 (en)*2015-02-022016-08-04Optimum Semiconductor Technologies, Inc.Vector processor configured to operate on variable length vectors using digital signal processing instructions
US9898286B2 (en)2015-05-052018-02-20Intel CorporationPacked finite impulse response (FIR) filter processors, methods, systems, and instructions
US11113053B2 (en)2015-09-262021-09-07Intel CorporationData element comparison processors, methods, systems, and instructions
US10423411B2 (en)2015-09-262019-09-24Intel CorporationData element comparison processors, methods, systems, and instructions
US12050912B2 (en)2016-07-022024-07-30Intel CorporationInterruptible and restartable matrix multiplication instructions, processors, methods, and systems
US11048508B2 (en)2016-07-022021-06-29Intel CorporationInterruptible and restartable matrix multiplication instructions, processors, methods, and systems
US12204898B2 (en)2016-07-022025-01-21Intel CorporationInterruptible and restartable matrix multiplication instructions, processors, methods, and systems
US11698787B2 (en)2016-07-022023-07-11Intel CorporationInterruptible and restartable matrix multiplication instructions, processors, methods, and systems
US10339201B1 (en)2016-09-152019-07-02Altera CorporationDot product based processing elements
US10049082B2 (en)*2016-09-152018-08-14Altera CorporationDot product based processing elements
GB2560159B (en)*2017-02-232019-12-25Advanced Risc Mach LtdWidening arithmetic in a data processing apparatus
US11567763B2 (en)2017-02-232023-01-31Arm LimitedWidening arithmetic in a data processing apparatus
GB2560159A (en)*2017-02-232018-09-05Advanced Risc Mach LtdWidening arithmetic in a data processing apparatus
US11360770B2 (en)2017-03-202022-06-14Intel CorporationSystems, methods, and apparatuses for zeroing a matrix
US12182571B2 (en)2017-03-202024-12-31Intel CorporationSystems, methods, and apparatuses for tile load, multiplication and accumulation
US12147804B2 (en)2017-03-202024-11-19Intel CorporationSystems, methods, and apparatuses for tile matrix multiplication and accumulation
US11263008B2 (en)2017-03-202022-03-01Intel CorporationSystems, methods, and apparatuses for tile broadcast
US11288069B2 (en)2017-03-202022-03-29Intel CorporationSystems, methods, and apparatuses for tile store
US10877756B2 (en)2017-03-202020-12-29Intel CorporationSystems, methods, and apparatuses for tile diagonal
US20220058021A1 (en)*2017-03-202022-02-24Intel CorporationSystems, methods, and apparatuses for dot production operations
US11714642B2 (en)2017-03-202023-08-01Intel CorporationSystems, methods, and apparatuses for tile store
US11288068B2 (en)2017-03-202022-03-29Intel CorporationSystems, methods, and apparatus for matrix move
WO2018174925A1 (en)*2017-03-202018-09-27Intel CorporationSystems, methods, and apparatuses for dot production operations
US11977886B2 (en)2017-03-202024-05-07Intel CorporationSystems, methods, and apparatuses for tile store
US12282773B2 (en)2017-03-202025-04-22Intel CorporationSystems, methods, and apparatus for tile configuration
US11200055B2 (en)2017-03-202021-12-14Intel CorporationSystems, methods, and apparatuses for matrix add, subtract, and multiply
US12039332B2 (en)2017-03-202024-07-16Intel CorporationSystems, methods, and apparatus for matrix move
US12260213B2 (en)2017-03-202025-03-25Intel CorporationSystems, methods, and apparatuses for matrix add, subtract, and multiply
US11567765B2 (en)2017-03-202023-01-31Intel CorporationSystems, methods, and apparatuses for tile load
US11163565B2 (en)*2017-03-202021-11-02Intel CorporationSystems, methods, and apparatuses for dot production operations
CN114461276A (en)*2017-03-202022-05-10英特尔公司Systems, methods, and apparatus for dot product operations
CN114816530A (en)*2017-03-202022-07-29英特尔公司 System, method and apparatus for dot product operation
US12124847B2 (en)2017-03-202024-10-22Intel CorporationSystems, methods, and apparatuses for tile transpose
US12314717B2 (en)*2017-03-202025-05-27Intel CorporationSystems, methods, and apparatuses for dot production operations
US12106100B2 (en)2017-03-202024-10-01Intel CorporationSystems, methods, and apparatuses for matrix operations
CN110337635A (en)*2017-03-202019-10-15英特尔公司Systems, methods, and apparatus for dot-product operations
US11080048B2 (en)2017-03-202021-08-03Intel CorporationSystems, methods, and apparatus for tile configuration
US11086623B2 (en)2017-03-202021-08-10Intel CorporationSystems, methods, and apparatuses for tile matrix multiplication and accumulation
US11847452B2 (en)2017-03-202023-12-19Intel CorporationSystems, methods, and apparatus for tile configuration
US12175252B2 (en)2017-04-242024-12-24Intel CorporationConcurrent multi-datatype execution within a processing resource
US12411695B2 (en)2017-04-242025-09-09Intel CorporationMulticore processor with each core having independent floating point datapath and integer datapath
US12217053B2 (en)2017-04-282025-02-04Intel CorporationInstructions and logic to perform floating point and integer operations for machine learning
US12141578B2 (en)2017-04-282024-11-12Intel CorporationInstructions and logic to perform floating point and integer operations for machine learning
US12039331B2 (en)2017-04-282024-07-16Intel CorporationInstructions and logic to perform floating point and integer operations for machine learning
US11720355B2 (en)2017-04-282023-08-08Intel CorporationInstructions and logic to perform floating point and integer operations for machine learning
US11275588B2 (en)2017-07-012022-03-15Intel CorporationContext save with variable save state size
US20190079764A1 (en)*2017-09-082019-03-14Oracle International CorporationEfficient direct convolution using simd instructions
CN111213125A (en)*2017-09-082020-05-29甲骨文国际公司 Efficient direct convolution using SIMD instructions
US11803377B2 (en)*2017-09-082023-10-31Oracle International CorporationEfficient direct convolution using SIMD instructions
US20190103857A1 (en)*2017-09-292019-04-04Zoran ZivkovicApparatus and method for performing horizontal filter operations
US10749502B2 (en)*2017-09-292020-08-18Intel CorporationApparatus and method for performing horizontal filter operations
US11023235B2 (en)2017-12-292021-06-01Intel CorporationSystems and methods to zero a tile register pair
US12293186B2 (en)2017-12-292025-05-06Intel CorporationSystems and methods to store a tile register pair to memory
US11816483B2 (en)2017-12-292023-11-14Intel CorporationSystems, methods, and apparatuses for matrix operations
US12182568B2 (en)2017-12-292024-12-31Intel CorporationSystems and methods for computing dot products of nibbles in two tile operands
US11809869B2 (en)2017-12-292023-11-07Intel CorporationSystems and methods to store a tile register pair to memory
US11093247B2 (en)2017-12-292021-08-17Intel CorporationSystems and methods to load a tile register pair
US11789729B2 (en)2017-12-292023-10-17Intel CorporationSystems and methods for computing dot products of nibbles in two tile operands
US20190042541A1 (en)*2017-12-292019-02-07Intel CorporationSystems, methods, and apparatuses for dot product operations
US11609762B2 (en)2017-12-292023-03-21Intel CorporationSystems and methods to load a tile register pair
US11669326B2 (en)*2017-12-292023-06-06Intel CorporationSystems, methods, and apparatuses for dot product operations
US12236242B2 (en)2017-12-292025-02-25Intel CorporationSystems and methods to load a tile register pair
US12282525B2 (en)2017-12-292025-04-22Intel CorporationSystems, methods, and apparatuses for matrix operations
US11645077B2 (en)2017-12-292023-05-09Intel CorporationSystems and methods to zero a tile register pair
US11409525B2 (en)*2018-01-242022-08-09Intel CorporationApparatus and method for vector multiply and accumulate of packed words
US20190227797A1 (en)*2018-01-242019-07-25Intel CorporationApparatus and method for vector multiply and accumulate of packed words
US10921122B2 (en)*2018-02-062021-02-16Stmicroelectronics S.R.L.Tilt event detection device, system and method
US20190242704A1 (en)*2018-02-062019-08-08Stmicroelectronics S.R.L.Tilt event detection device, system and method
US11416260B2 (en)2018-03-302022-08-16Intel CorporationSystems and methods for implementing chained tile operations
US11093579B2 (en)2018-09-052021-08-17Intel CorporationFP16-S7E8 mixed precision for deep learning and other algorithms
US11579883B2 (en)2018-09-142023-02-14Intel CorporationSystems and methods for performing horizontal tile operations
US10970076B2 (en)2018-09-142021-04-06Intel CorporationSystems and methods for performing instructions specifying ternary tile logic operations
US11714648B2 (en)2018-09-272023-08-01Intel CorporationSystems for performing instructions to quickly convert and use tiles as 1D vectors
US11579880B2 (en)2018-09-272023-02-14Intel CorporationSystems for performing instructions to quickly convert and use tiles as 1D vectors
US11403071B2 (en)2018-09-272022-08-02Intel CorporationSystems and methods for performing instructions to transpose rectangular tiles
US10990396B2 (en)2018-09-272021-04-27Intel CorporationSystems for performing instructions to quickly convert and use tiles as 1D vectors
US11249761B2 (en)2018-09-272022-02-15Intel CorporationSystems and methods for performing matrix compress and decompress instructions
US12265826B2 (en)2018-09-272025-04-01Intel CorporationSystems for performing instructions to quickly convert and use tiles as 1D vectors
US10866786B2 (en)2018-09-272020-12-15Intel CorporationSystems and methods for performing instructions to transpose rectangular tiles
US12175246B2 (en)2018-09-272024-12-24Intel CorporationSystems and methods for performing matrix compress and decompress instructions
US11954489B2 (en)2018-09-272024-04-09Intel CorporationSystems for performing instructions to quickly convert and use tiles as 1D vectors
US11748103B2 (en)2018-09-272023-09-05Intel CorporationSystems and methods for performing matrix compress and decompress instructions
US10929143B2 (en)2018-09-282021-02-23Intel CorporationMethod and apparatus for efficient matrix alignment in a systolic array
US11954490B2 (en)2018-09-282024-04-09Intel CorporationSystems and methods for performing instructions to transform matrices into row-interleaved format
US10896043B2 (en)2018-09-282021-01-19Intel CorporationSystems for performing instructions for fast element unpacking into 2-dimensional registers
US11507376B2 (en)2018-09-282022-11-22Intel CorporationSystems for performing instructions for fast element unpacking into 2-dimensional registers
US11675590B2 (en)2018-09-282023-06-13Intel CorporationSystems and methods for performing instructions to transform matrices into row-interleaved format
US10963256B2 (en)2018-09-282021-03-30Intel CorporationSystems and methods for performing instructions to transform matrices into row-interleaved format
US11392381B2 (en)2018-09-282022-07-19Intel CorporationSystems and methods for performing instructions to transform matrices into row-interleaved format
US10642614B2 (en)*2018-09-292020-05-05Intel CorporationReconfigurable multi-precision integer dot-product hardware accelerator for machine-learning applications
US11366663B2 (en)2018-11-092022-06-21Intel CorporationSystems and methods for performing 16-bit floating-point vector dot product instructions
US11263009B2 (en)2018-11-092022-03-01Intel CorporationSystems and methods for performing 16-bit floating-point vector dot product instructions
US11036504B2 (en)2018-11-092021-06-15Intel CorporationSystems and methods for performing 16-bit floating-point vector dot product instructions
US12008367B2 (en)2018-11-092024-06-11Intel CorporationSystems and methods for performing 16-bit floating-point vector dot product instructions
US11893389B2 (en)2018-11-092024-02-06Intel CorporationSystems and methods for performing 16-bit floating-point matrix dot product instructions
US11614936B2 (en)2018-11-092023-03-28Intel CorporationSystems and methods for performing 16-bit floating-point matrix dot product instructions
US10963246B2 (en)2018-11-092021-03-30Intel CorporationSystems and methods for performing 16-bit floating-point matrix dot product instructions
US12307250B2 (en)2018-11-092025-05-20Intel CorporationSystems and methods for performing 16-bit floating-point matrix dot product instructions
US10929503B2 (en)2018-12-212021-02-23Intel CorporationApparatus and method for a masked multiply instruction to support neural network pruning operations
US11294671B2 (en)2018-12-262022-04-05Intel CorporationSystems and methods for performing duplicate detection instructions on 2D data
US11886875B2 (en)2018-12-262024-01-30Intel CorporationSystems and methods for performing nibble-sized operations on matrix elements
US11847185B2 (en)2018-12-272023-12-19Intel CorporationSystems and methods of instructions to accelerate multiplication of sparse matrices using bitmasks that identify non-zero elements
US12287843B2 (en)2018-12-272025-04-29Intel CorporationSystems and methods of instructions to accelerate multiplication of sparse matrices using bitmasks that identify non-zero elements
US20200210517A1 (en)*2018-12-272020-07-02Intel CorporationSystems and methods to accelerate multiplication of sparse matrices
US10942985B2 (en)2018-12-292021-03-09Intel CorporationApparatuses, methods, and systems for fast fourier transform configuration and computation instructions
US10922077B2 (en)2018-12-292021-02-16Intel CorporationApparatuses, methods, and systems for stencil configuration and computation instructions
US11954063B2 (en)2019-03-152024-04-09Intel CorporationGraphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format
US11709793B2 (en)2019-03-152023-07-25Intel CorporationGraphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format
US11676239B2 (en)2019-03-152023-06-13Intel CorporationSparse optimizations for a matrix accelerator architecture
US12007935B2 (en)2019-03-152024-06-11Intel CorporationGraphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format
US12013808B2 (en)2019-03-152024-06-18Intel CorporationMulti-tile architecture for graphics operations
US12293431B2 (en)2019-03-152025-05-06Intel CorporationSparse optimizations for a matrix accelerator architecture
WO2020190809A1 (en)*2019-03-152020-09-24Intel CorporationArchitecture for block sparse operations on a systolic array
US12210477B2 (en)2019-03-152025-01-28Intel CorporationSystems and methods for improving cache efficiency and utilization
US12056059B2 (en)2019-03-152024-08-06Intel CorporationSystems and methods for cache optimization
US12066975B2 (en)2019-03-152024-08-20Intel CorporationCache structure and utilization
US12079155B2 (en)2019-03-152024-09-03Intel CorporationGraphics processor operation scheduling for deterministic latency
US12093210B2 (en)2019-03-152024-09-17Intel CorporationCompression techniques
US12099461B2 (en)2019-03-152024-09-24Intel CorporationMulti-tile memory management
US11361496B2 (en)2019-03-152022-06-14Intel CorporationGraphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format
US11954062B2 (en)2019-03-152024-04-09Intel CorporationDynamic memory reconfiguration
US11995029B2 (en)2019-03-152024-05-28Intel CorporationMulti-tile memory management for detecting cross tile access providing multi-tile inference scaling and providing page migration
US12124383B2 (en)2019-03-152024-10-22Intel CorporationSystems and methods for cache optimization
US12321310B2 (en)2019-03-152025-06-03Intel CorporationImplicit fence for write messages
US12141094B2 (en)2019-03-152024-11-12Intel CorporationSystolic disaggregation within a matrix accelerator architecture
US12386779B2 (en)2019-03-152025-08-12Intel CorporationDynamic memory reconfiguration
US12153541B2 (en)2019-03-152024-11-26Intel CorporationCache structure and utilization
US12242414B2 (en)2019-03-152025-03-04Intel CorporationData initialization techniques
US11934342B2 (en)2019-03-152024-03-19Intel CorporationAssistance for hardware prefetch in cache access
US12182035B2 (en)2019-03-152024-12-31Intel CorporationSystems and methods for cache optimization
US12182062B1 (en)2019-03-152024-12-31Intel CorporationMulti-tile memory management
US11899614B2 (en)2019-03-152024-02-13Intel CorporationInstruction based control of memory attributes
US11113784B2 (en)2019-03-152021-09-07Intel CorporationSparse optimizations for a matrix accelerator architecture
US12198222B2 (en)2019-03-152025-01-14Intel CorporationArchitecture for block sparse operations on a systolic array
US11842423B2 (en)2019-03-152023-12-12Intel CorporationDot product operations on sparse matrix elements
US12204487B2 (en)2019-03-152025-01-21Intel CorporationGraphics processor data access and sharing
US11016731B2 (en)2019-03-292021-05-25Intel CorporationUsing Fuzzy-Jbit location of floating-point multiply-accumulate results
US11269630B2 (en)2019-03-292022-03-08Intel CorporationInterleaved pipeline of floating-point adders
US11175891B2 (en)2019-03-302021-11-16Intel CorporationSystems and methods to perform floating-point addition with selected rounding
US10990397B2 (en)2019-03-302021-04-27Intel CorporationApparatuses, methods, and systems for transpose instructions of a matrix operations accelerator
US11403097B2 (en)2019-06-262022-08-02Intel CorporationSystems and methods to skip inconsequential matrix operations
US11900114B2 (en)2019-06-262024-02-13Intel CorporationSystems and methods to skip inconsequential matrix operations
US11334647B2 (en)2019-06-292022-05-17Intel CorporationApparatuses, methods, and systems for enhanced matrix multiplier architecture
CN112394987A (en)*2019-08-132021-02-23上海寒武纪信息科技有限公司Short shaping to half precision floating point instruction processing device, method and related product
US12361600B2 (en)2019-11-152025-07-15Intel CorporationSystolic arithmetic on sparse data
US11714875B2 (en)2019-12-282023-08-01Intel CorporationApparatuses, methods, and systems for instructions of a matrix operations accelerator
US12204605B2 (en)2019-12-282025-01-21Intel CorporationApparatuses, methods, and systems for instructions of a matrix operations accelerator
US11263291B2 (en)*2020-06-262022-03-01Intel CorporationSystems and methods for combining low-mantissa units to achieve and exceed FP64 emulation of matrix multiplication
US11669586B2 (en)2020-06-262023-06-06Intel CorporationSystems and methods for combining low-mantissa units to achieve and exceed FP64 emulation of matrix multiplication
US12112167B2 (en)2020-06-272024-10-08Intel CorporationMatrix data scatter and gather between rows and irregularly spaced memory locations
US11972230B2 (en)2020-06-272024-04-30Intel CorporationMatrix transpose and multiply
US12405770B2 (en)2020-06-272025-09-02Intel CorporationMatrix transpose and multiply
US11941395B2 (en)2020-09-262024-03-26Intel CorporationApparatuses, methods, and systems for instructions for 16-bit floating-point matrix dot product instructions
US12001887B2 (en)2020-12-242024-06-04Intel CorporationApparatuses, methods, and systems for instructions for aligning tiles of a matrix operations accelerator
US12001385B2 (en)2020-12-242024-06-04Intel CorporationApparatuses, methods, and systems for instructions for loading a tile of a matrix operations accelerator

Also Published As

Publication number | Publication date
KR101300431B1 (en) | 2013-08-27
CN107741842B (en) | 2021-08-06
CN102004628B (en) | 2015-07-22
US20170364476A1 (en) | 2017-12-21
RU2421796C2 (en) | 2011-06-20
JP4697639B2 (en) | 2011-06-08
JP2008077663A (en) | 2008-04-03
CN102004628A (en) | 2011-04-06
CN105022605A (en) | 2015-11-04
US20140032881A1 (en) | 2014-01-30
RU2009114818A (en) | 2010-10-27
US20130290392A1 (en) | 2013-10-31
CN105022605B (en) | 2018-10-26
CN107741842A (en) | 2018-02-27
US20140032624A1 (en) | 2014-01-30
DE112007002101T5 (en) | 2009-07-09
KR20090042329A (en) | 2009-04-29
KR20110112453A (en) | 2011-10-12
CN101187861A (en) | 2008-05-28
WO2008036859A1 (en) | 2008-03-27
CN102622203A (en) | 2012-08-01
KR101105527B1 (en) | 2012-01-13
CN101187861B (en) | 2012-02-29

Similar Documents

Publication | Publication Date | Title
US20220107809A1 (en) |  | Instruction and logic for processing text strings
US20170364476A1 (en) |  | Instruction and logic for performing a dot-product operation
US10684855B2 (en) |  | Method and apparatus for performing a shift and exclusive or operation in a single instruction
US20140280271A1 (en) |  | Instruction and logic for processing text strings

Legal Events

Date | Code | Title | Description
AS | Assignment

Owner name: INTEL CORPORATION, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: ZOHAR, RONEN; SECONI, MARK; PARTHASARATHY, SRINIVAS; AND OTHERS; REEL/FRAME: 019198/0919; SIGNING DATES FROM 20061027 TO 20061205

STCB | Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

