Massively parallel processor array

From Wikipedia, the free encyclopedia

Type of integrated circuit

A massively parallel processor array, also known as a multi purpose processor array (MPPA), is a type of integrated circuit which has a massively parallel array of hundreds or thousands of CPUs and RAM memories. These processors pass work to one another through a reconfigurable interconnect of channels. By harnessing a large number of processors working in parallel, an MPPA chip can accomplish more demanding tasks than conventional chips. MPPAs are based on a software parallel programming model for developing high-performance embedded system applications.

Architecture

MPPA is a MIMD (Multiple Instruction streams, Multiple Data) architecture, with distributed memory accessed locally, not shared globally. Each processor is strictly encapsulated, accessing only its own code and memory. Point-to-point communication between processors is directly realized in the configurable interconnect.^[1]

The MPPA's massive parallelism and its distributed memory MIMD architecture distinguish it from multicore and manycore architectures, which have fewer processors and an SMP or other shared memory architecture, mainly intended for general-purpose computing. It is also distinguished from GPGPUs with SIMD architectures, used for HPC applications.^[2]
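The encapsulation described above can be illustrated in ordinary software. The following is only a minimal sketch, not the programming interface of any actual MPPA: the `Processor` class, the `DONE` marker, and the two example programs are hypothetical names, Python threads stand in for hardware processors, and a bounded `queue.Queue` stands in for one point-to-point channel. Each processor touches only its own `memory`, and data moves solely through the channel.

```python
import queue
import threading

DONE = object()  # hypothetical end-of-stream marker, not part of any MPPA API


class Processor(threading.Thread):
    """One array element: private code and memory, plus channel endpoints."""

    def __init__(self, program, inbox=None, outbox=None):
        super().__init__()
        self.program = program  # this processor's own code
        self.memory = {}        # local memory, used only by this processor
        self.inbox = inbox      # channel from an upstream neighbour, or None
        self.outbox = outbox    # channel to a downstream neighbour, or None

    def run(self):
        self.program(self)


def producer(p):
    # Sends five words downstream; put() blocks while the channel is full.
    for value in range(5):
        p.outbox.put(value)
    p.outbox.put(DONE)


def accumulator(p):
    # Sums incoming words into local memory; get() blocks until a word arrives.
    p.memory["total"] = 0
    while True:
        value = p.inbox.get()
        if value is DONE:
            break
        p.memory["total"] += value


channel = queue.Queue(maxsize=1)  # a single point-to-point channel
a = Processor(producer, outbox=channel)
b = Processor(accumulator, inbox=channel)
a.start()
b.start()
a.join()
b.join()
print(b.memory["total"])  # 10
```

The blocking `put` and `get` calls are what synchronize the two processors; no lock or shared variable is needed, which is the contrast with a shared-memory design.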

Programming

An MPPA application is developed by expressing it as a hierarchical block diagram or workflow, whose basic objects run in parallel, each on its own processor. Likewise, large data objects may be broken up and distributed into local memories with parallel access. Objects communicate over a parallel structure of dedicated channels. The objective is to maximize aggregate throughput while minimizing local latency, optimizing performance and efficiency. An MPPA's model of computation is similar to a Kahn process network or communicating sequential processes (CSP).^[3]
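A block diagram of this kind can be sketched as a small Kahn-style process network. This is an illustrative sketch under stated assumptions rather than an actual MPPA toolchain: the block names (`source`, `scale`, `offset`, `sink`), the `END` token, and `run_network` are hypothetical, threads stand in for processors, and bounded queues stand in for dedicated channels.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream token


def source(out, samples):
    # Emits the input stream, then signals the end of the stream.
    for s in samples:
        out.put(s)
    out.put(END)


def scale(inp, out, gain):
    # Multiplies every sample by a fixed gain.
    while (x := inp.get()) is not END:
        out.put(gain * x)
    out.put(END)


def offset(inp, out, bias):
    # Adds a fixed bias to every sample.
    while (x := inp.get()) is not END:
        out.put(x + bias)
    out.put(END)


def sink(inp, results):
    # Collects the output stream in arrival order.
    while (x := inp.get()) is not END:
        results.append(x)


def run_network(samples, gain=3, bias=1, depth=2):
    """Wire source -> scale -> offset -> sink and run all blocks in parallel."""
    c1, c2, c3 = (queue.Queue(maxsize=depth) for _ in range(3))
    results = []
    blocks = [
        threading.Thread(target=source, args=(c1, samples)),
        threading.Thread(target=scale, args=(c1, c2, gain)),
        threading.Thread(target=offset, args=(c2, c3, bias)),
        threading.Thread(target=sink, args=(c3, results)),
    ]
    for block in blocks:
        block.start()
    for block in blocks:
        block.join()
    return results


print(run_network([1, 2, 3]))  # [4, 7, 10]
```

Because every block reads its input with a blocking `get` and never polls, the output does not depend on thread scheduling or on the channel `depth`; this determinism is the property that makes the Kahn process network model attractive. In a hierarchical design, `scale` and `offset` could be wrapped together as one composite block with a single input and a single output channel.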

Applications

MPPAs are used in high-performance embedded systems and hardware acceleration of desktop computer and server applications, such as video compression,^[4]^[5] image processing,^[6] medical imaging, network processing, software-defined radio and other compute-intensive streaming media applications, which otherwise would use FPGA, DSP and/or ASIC chips.

Examples

MPPAs developed by companies include designs from Ambric, PicoChip, Intel,^[7] IntellaSys, GreenArrays, ASOCS, Tilera, Kalray, Coherent Logix, Tabula, and Adapteva. The Aspex (Ericsson) Linedancer differs in that it was a massively wide SIMD array rather than an MPPA. Strictly speaking, it could qualify as associative processing, because each of its 4096 3,000-gate cores had its own content-addressable memory.^[8]^[9]^[10]

Fabricated MPPAs developed in universities include: 36-core^[11] and 167-core^[12] Asynchronous Array of Simple Processors (AsAP) arrays from the University of California, Davis, 16-core RAW^[13] from MIT, and 16-core^[14] and 24-core^[15] arrays from Fudan University.

The Chinese Sunway project developed its own 260-core SW26010 manycore chip for the TaihuLight supercomputer, which was the world's fastest supercomputer from June 2016 to June 2018.^[16]^[17]

Anton 3 processors, designed by D. E. Shaw Research for molecular dynamics simulations, contain arrays of 576 processors arranged in a 12×24 tiled grid of pairs of cores; a routed network links these tiles together and extends off-chip to other nodes in a full system.^[18]^[19]

References

[1] Mike Butts (September–October 2007). "Synchronization through Communication in a Massively Parallel Processor Array". IEEE Micro. 27 (5). IEEE Computer Society: 32. Bibcode:2007IMicr..27e..32A. doi:10.1109/MM.2007.4378781.
[2] Mike Butts. "Multicore and Massively Parallel Platforms and Moore's Law Scalability". Proceedings of the Embedded Systems Conference - Silicon Valley, April 2008.
[3] Mike Butts; Brad Budlong; Paul Wasson; Ed White (April 2008). Reconfigurable Work Farms on a Massively Parallel Processor Array. 2008 16th International Symposium on Field-Programmable Custom Computing Machines. IEEE Computer Society. doi:10.1109/FCCM.2008.6.
[4] Laurent Bonetto (May 16, 2008). "Massively parallel processing arrays (MPPAs) for embedded HD video and imaging (Part 1)". Video/Imaging DesignLine. EE Times.
[5] Laurent Bonetto (July 18, 2008). "Massively parallel processing arrays (MPPAs) for embedded HD video and imaging (Part 2)". Video/Imaging DesignLine. EE Times.
[6] Paul Chen (March 18, 2008). "Multimode sensor processing using Massively Parallel Processor Arrays (MPPAs)". Programmable Logic DesignLine. EE Times.
[7] Vangal, Sriram R.; Howard, Jason; Ruhl, Gregory; Dighe, Saurabh; Wilson, Howard; Tschanz, James; Finan, David; et al. (2008). "An 80-tile sub-100-w teraflops processor in 65-nm cmos". IEEE Journal of Solid-State Circuits. 43 (1): 29–41. Bibcode:2008IJSSC..43...29V. doi:10.1109/JSSC.2007.910957.
[8] Krikelis, A. (1990). "Artificial Neural Network on a Massively Parallel Associative Architecture". International Neural Network Conference. p. 673. doi:10.1007/978-94-009-0643-3_39. ISBN 978-0-7923-0831-7.
[9] "Effective Monte Carlo simulation on System-V massively parallel associative string processing architecture" (PDF). Archived from the original (PDF) on 2021-06-06.
[10] "A Programmable Processor with 4096 Processing Units for Media Applications".
[11] Yu, Zhiyi; Meeuwsen, Michael; Apperson, Ryan; Sattari, Omar; Lai, Michael; Webb, Jeremy; Work, Eric; Mohsenin, Tinoosh; Singh, Mandeep; Baas, Bevan (2006). An asynchronous array of simple processors for DSP applications. IEEE International Solid-State Circuits Conference (ISSCC’06). Vol. 49. pp. 428–429. doi:10.1109/ISSCC.2006.1696225.
[12] Truong, Dean; Cheng, Wayne; Mohsenin, Tinoosh; Yu, Zhiyi; Jacobson, Toney; Landge, Gouri; Meeuwsen, Michael; et al. (2008). A 167-processor 65 nm computational platform with per-processor dynamic supply voltage and dynamic clock frequency scaling. Symposium on VLSI Circuits. pp. 22–23. doi:10.1109/VLSIC.2008.4585936.
[13] Michael Bedford Taylor; Jason Kim; Jason Miller; David Wentzlaff; Fae Ghodrat; Ben Greenwald; Henry Hoffmann; Paul Johnson; Walter Lee; Arvind Saraf; Nathan Shnidman; Volker Strumpen; Saman Amarasinghe; Anant Agarwal (February 2003). "A 16-issue multiple-program-counter microprocessor with point-to-point scalar operand network". Proceedings of the IEEE International Solid-State Circuits Conference. doi:10.1109/ISSCC.2003.1234253.
[14] Yu, Zhiyi; You, Kaidi; Xiao, Ruijin; Quan, Heng; Ou, Peng; Ying, Yan; Yang, Haofan; Zeng, Xiaoyang (2012). "An 800MHz 320mW 16-core processor with message-passing and shared-memory inter-core communication mechanisms". 2012 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC). IEEE. pp. 64–66. doi:10.1109/ISSCC.2012.6176931.
[15] Ou, Peng; Zhang, Jiajie; Quan, Heng; Li, Yi; He, Maofei; Yu, Zheng; Yu, Xueqiu; et al. (2013). "A 65nm 39GOPS/W 24-core processor with 11 Tb/s/W packet-controlled circuit-switched double-layer network-on-chip and heterogeneous execution array". 2013 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC). IEEE. pp. 56–57. doi:10.1109/ISSCC.2013.6487635.
[16] Dongarra, Jack (June 20, 2016). "Report on the Sunway TaihuLight System" (PDF). www.netlib.org. Retrieved June 20, 2016.
[17] Fu, Haohuan; Liao, Junfeng; Yang, Jinzhe; et al. (2016). "The Sunway TaihuLight Supercomputer: System and Applications". Sci. China Inf. Sci. 59 (7) 072001. doi:10.1007/s11432-016-5588-7.
[18] Shaw, David E.; Adams, Peter J.; Azaria, Asaph; Bank, Joseph A.; Batson, Brannon; Bell, Alistair; Bergdorf, Michael; Bhatt, Jhanvi; Butts, J. Adam; Correia, Timothy; Dirks, Robert M.; Dror, Ron O.; Eastwood, Michael P.; Edwards, Bruce; Even, Amos (2021-11-14). "Anton 3". Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. St. Louis Missouri: ACM. pp. 1–11. doi:10.1145/3458817.3487397. ISBN 978-1-4503-8442-1. S2CID 239036976.
[19] Adams, Peter J.; Batson, Brannon; Bell, Alistair; Bhatt, Jhanvi; Butts, J. Adam; Correia, Timothy; Edwards, Bruce; Feldmann, Peter; Fenton, Christopher H.; Forte, Anthony; Gagliardo, Joseph; Gill, Gennette; Gorlatova, Maria; Greskamp, Brian; Grossman, J.P. (2021-08-22). "The ΛNTON 3 ASIC: A Fire-Breathing Monster for Molecular Dynamics Simulations". 2021 IEEE Hot Chips 33 Symposium (HCS). Palo Alto, CA, USA: IEEE. pp. 1–22. doi:10.1109/HCS52781.2021.9567084. ISBN 978-1-6654-1397-8. S2CID 239039245.