Power10 SCM

| General information | |
|---|---|
| Launched | 2021 |
| Designed by | IBM, OpenPower partners |
| Common manufacturer | |
| Performance | |
| Max. CPU clock rate | +3.5 GHz to +4 GHz |
| Physical specifications | |
| Cores | |
| Package | |
| Socket | |
| Cache | |
| L1 cache | 48+32 KB per core |
| L2 cache | 2 MB per core |
| L3 cache | 120 MB per chip |
| Architecture and classification | |
| Technology node | 7 nm |
| Microarchitecture | P10 |
| Instruction set | Power ISA (Power ISA v.3.1) |
| History | |
| Predecessor | POWER9 |
| Successor | Power11 |
Power10 is a superscalar, multithreading, multi-core microprocessor family, based on the open source Power ISA, announced in August 2020 and available from September 2021. The processor is designed to have 15 cores available. The main features of Power10 are higher performance per watt and better memory and I/O architectures, with a focus on artificial intelligence (AI) workloads. Each Power10 core has doubled up on most functional units compared to its predecessor POWER9. Power10 is available in a range of IBM models and is supported by operating systems including Linux 5.9 and PowerVM. The branding is unusual in that its name is not capitalized like POWER9 and all other previous POWER processors.
The Power10 superscalar, multithreading, multi-core microprocessor family is based on the open source Power ISA. It was announced in August 2020 at the Hot Chips conference. Systems with Power10 CPUs were generally available from September 2021 in the IBM Power10 Enterprise E1080 server. The processor is designed to have 15 cores available, but a spare core will be included during manufacture to cost-effectively allow for yield issues. The main features of Power10 are higher performance per watt and better memory and I/O architectures, with a focus on artificial intelligence (AI) workloads.[1]
Power10-based processors are manufactured by Samsung using a 7 nm process with 18 layers of metal and 18 billion transistors on a 602 mm² silicon die.[2][3][4][5]
Each Power10 core has doubled up on most functional units compared to its predecessor POWER9. The core is eight-way multithreaded (SMT8) and has 48 KB instruction and 32 KB data L1 caches, a 2 MB L2 cache and a very large translation lookaside buffer (TLB) with 4096 entries.[4] Latency cycles to the different cache stages and the TLB have been reduced significantly. Each core has eight execution slices, each with one floating-point unit (FPU), arithmetic logic unit (ALU), branch predictor, load–store unit and SIMD engine, able to be fed 128-bit (64+64) instructions from the new prefix/fuse instructions of the Power ISA v.3.1. Each execution slice can handle 20 instructions, backed up by a shared 512-entry instruction table, and feeds a 128-entry (64 single-threaded) load queue and an 80-entry (40 single-threaded) store queue. Better branch prediction features have doubled the accuracy. A core has four matrix math assist (MMA) engines,[6] for better handling of SIMD code, especially for matrix multiplication instructions, where AI inference workloads have a 20-fold performance increase.[7]
The processor has two "hemispheres" with eight cores each, sharing a 64 MB L3 cache, for a total of 16 cores and 128 MB of L3 cache. Due to yield issues, at least one core is always disabled, reducing L3 cache by 8 MB to a usable total of 15 cores and 120 MB L3 cache. Each chip also has eight crypto accelerators offloading common algorithms such as AES and SHA-3.
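The core and cache totals above follow from simple arithmetic, sketched below. The 8 MB L3 share per core is inferred from the figures in the text (64 MB across eight cores); the constant and function names are hypothetical and chosen only for illustration.

```python
from typing import Tuple

# Figures taken from the paragraph above; names are illustrative only.
HEMISPHERES = 2
CORES_PER_HEMISPHERE = 8
L3_PER_HEMISPHERE_MB = 64


def usable_resources(disabled_cores: int = 1) -> Tuple[int, int]:
    """Return (usable cores, usable L3 in MB) for one chip.

    Assumes each disabled core removes its own equal share of L3,
    which matches the 8 MB reduction described in the text.
    """
    total_cores = HEMISPHERES * CORES_PER_HEMISPHERE
    l3_per_core_mb = L3_PER_HEMISPHERE_MB // CORES_PER_HEMISPHERE
    usable_cores = total_cores - disabled_cores
    return usable_cores, usable_cores * l3_per_core_mb


if __name__ == "__main__":
    cores, l3_mb = usable_resources()
    print(f"{cores} usable cores, {l3_mb} MB usable L3")
```

With no cores disabled the same function gives the full-die figures of 16 cores and 128 MB.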
Increased clock gating and a reworked microarchitecture at every stage, together with the fuse/prefix instructions enabling more work with fewer work units, and a smarter cache with lower memory latencies and effective address tagging reducing cache misses, enable the Power10 core to consume half the power of POWER9. Combined with improvements of up to 30% in the compute facilities, this makes the whole processor perform 2.6× better per watt than its predecessor. In the case of mounting two cores on the same module, it is up to 3 times as fast in the same power budget.
As each core can act like eight logical processors, the 15-core processor looks like 120 cores to the operating system. On a dual-chip module, that becomes 240 simultaneous threads per socket.
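The thread counts above can be reproduced with a minimal sketch. It assumes 15 usable cores per chip and eight-way SMT, as stated in the text; the function and constant names are hypothetical.

```python
# Illustrative only: names are not taken from any IBM tool or API.
SMT_WAYS = 8                # SMT8: eight hardware threads per core
USABLE_CORES_PER_CHIP = 15  # one of the 16 cores is disabled


def hardware_threads(chips_per_socket: int) -> int:
    """Logical processors the operating system sees for one socket."""
    return chips_per_socket * USABLE_CORES_PER_CHIP * SMT_WAYS


if __name__ == "__main__":
    print("SCM:", hardware_threads(1))  # single-chip module
    print("DCM:", hardware_threads(2))  # dual-chip module
```

On a running system, Python's `os.cpu_count()` reports the number of logical CPUs visible to the operating system, which is the figure this calculation models.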
The chips have completely reworked memory and I/O architectures, using the open Coherent Accelerator Processor Interface (OpenCAPI) and Open Memory Interface (OMI). Using serial memory communications to off-chip controllers reduces signaling lanes to and from the chip, increases the bandwidth and allows the processor to be flexible in its memory technology.[5]
Power10 supports a wide range of memory types, including DDR3 through DDR5, GDDR, HBM, or Persistent Storage Memory. These configurations can be changed by the customer to best fit the use case intended for the system.
Power10 enables encrypting of data with no performance penalty at every stage from RAM, across accelerators and cluster nodes to data at rest.
Power10 comes with the PowerAXON facility, which provides chip-to-chip and system-to-system links and the OpenCAPI bus for accelerators, I/O and other high-performance cache-coherent peripherals. It manages the communications between nodes in a 16-socket single-chip module (SCM) cluster or a 4-socket dual-chip module (DCM) cluster. It also manages the memory semantics for clustering of systems, enabling load/store access from the core to up to 2 PB of RAM across the entire Power10 cluster. IBM calls this feature Memory Inception.
Both OMI and PowerAXON can handle 1 TB/s communications off the chip.
Power10 includes PCIe 5. The SCM has 32 and the DCM has 64 PCIe 5 lanes. The decision to remove NVLink support from Power10 was made because PCIe 5.0's bandwidth capabilities render NVLink support obsolete for the use cases that Power10 was designed for.[4] Support for NVLink on-chip was previously a unique selling point for POWER8 and POWER9.
The Power10 chip is available in two variants, defined by firmware in the packaging. Even though the chips are physically identical and the difference is set in firmware, it cannot be changed by the user or by IBM after manufacturing.[8]
The Power10 comes in three flip-chip plastic land grid array (FC-PLGA) packages: one single-chip module (SCM) and two dual-chip modules (DCM and eSCM).
Power10 is available in a range of IBM computers.
The IBM Power E1080, codename Denali, is the top-end Power10 computer by IBM. It consists of one to four Central Electronics Complex (CEC) nodes, each taking up 5U of space. Each node has four Power10 SCMs, configurable with 10, 12, or 15 SMT8 cores per processor, and up to 16 TB of OMI-DDR4 RAM. The Power E1080 natively runs PowerVM, which hosts AIX, IBM i and little-endian Linux.[12] An E1080 system also needs a 2U-high System Control Unit for monitoring and configuration.
The Power E1080 also supports up to sixteen I/O expansion drawers, four per CEC node. Each expansion drawer is connected to the respective CEC node by two PCIe fanout modules, and has twelve FHFL PCIe slots. Four of these slots are PCIe 3.0 x16, while the remaining eight are PCIe 3.0 x8. A maximum specification configuration allows the Power E1080 to support 192 single-slot PCIe cards across a 16-socket system.[13]
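The slot total quoted above can be checked with a short calculation. This is a sketch using only the figures in the paragraph; the names are hypothetical and do not correspond to any IBM configuration tool.

```python
from typing import Dict

# Figures from the paragraph above; names are illustrative only.
DRAWERS_PER_CEC_NODE = 4
X16_SLOTS_PER_DRAWER = 4
X8_SLOTS_PER_DRAWER = 8


def expansion_slots(cec_nodes: int) -> Dict[str, int]:
    """Count PCIe 3.0 expansion slots for a system with `cec_nodes` nodes."""
    drawers = cec_nodes * DRAWERS_PER_CEC_NODE
    x16 = drawers * X16_SLOTS_PER_DRAWER
    x8 = drawers * X8_SLOTS_PER_DRAWER
    return {"drawers": drawers, "x16": x16, "x8": x8, "total": x16 + x8}


if __name__ == "__main__":
    # Maximum configuration: four CEC nodes (16 sockets).
    print(expansion_slots(4))
```

For the maximum four-node configuration this yields sixteen drawers and 192 slots, matching the total in the text.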
The S-models can run Linux, IBM i and AIX. The L-models are made for Linux, but are allowed to run AIX and IBM i on up to 25% of available CPU cores.[10]
The following operating systems support Power10:
Power10 is unusual in that its name is not capitalized like POWER9 and all other previous POWER processors are. This change is one part in IBM's rebranding of their Power Systems offering, which beginning with Power10 is now just "Power". Power10 also has a logo.[19]