POWER3

From Wikipedia, the free encyclopedia
1998 family of microprocessors by IBM
POWER3
General information
Launched: 1998
Designed by: IBM
Architecture and classification
Instruction set: PowerPC
History
Predecessor: POWER2
Successor: POWER4
Dual 375 MHz IBM POWER3-II processors on the CPU module of a RS/6000 44P 270.

The POWER3 is a microprocessor, designed and exclusively manufactured by IBM, that implemented the 64-bit version of the PowerPC instruction set architecture (ISA), including all of the optional instructions of the ISA (at the time), such as instructions present in the POWER2 version of the POWER ISA but not in the PowerPC ISA. It was introduced on 5 October 1998, debuting in the RS/6000 43P Model 260, a high-end graphics workstation.[1] The POWER3 was originally supposed to be called the PowerPC 630 but was renamed, probably to differentiate the server-oriented POWER processors it replaced from the more consumer-oriented 32-bit PowerPCs. The POWER3 was the successor of the P2SC derivative of the POWER2 and completed IBM's long-delayed transition from POWER to PowerPC, which was originally scheduled to conclude in 1995. The POWER3 was used in IBM RS/6000 servers and workstations at 200 MHz. It competed with the Digital Equipment Corporation (DEC) Alpha 21264 and the Hewlett-Packard (HP) PA-8500.

Description

The logic schema of the POWER3 processor

The POWER3 was based on the PowerPC 620, an earlier 64-bit PowerPC implementation that was late, under-performing and commercially unsuccessful. Like the PowerPC 620, the POWER3 has three fixed-point units, but the single floating-point unit (FPU) was replaced with two floating-point fused multiply–add units, and an extra load/store unit was added (for a total of two) to improve floating-point performance. The POWER3 is a superscalar design that executes instructions out of order. It has a seven-stage integer pipeline, a minimal eight-stage load/store pipeline and a ten-stage floating-point pipeline.

The front end consists of two stages: fetch and decode. During the first stage, eight instructions are fetched from a 32 KB instruction cache and placed in a 12-entry instruction buffer. During the second stage, four instructions are taken from the instruction buffer, decoded, and issued to instruction queues. Restrictions on instruction issue are few: of the two integer instruction queues, one can accept only a single instruction, while the other can accept up to four, as can the floating-point instruction queue. If the queues do not have enough unused entries, instructions cannot be issued. The front end has a short pipeline, resulting in a small three-cycle branch misprediction penalty.
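
The fetch and issue widths above can be illustrated with a toy occupancy model of the instruction buffer. This is only a sketch: the function name `simulate_front_end`, the issue-before-fetch ordering within a cycle, and the assumptions that fetch never misses and the instruction queues never fill are illustrative choices, not details taken from IBM documentation.

```python
FETCH_WIDTH = 8      # instructions fetched per cycle from the instruction cache
BUFFER_ENTRIES = 12  # capacity of the instruction buffer
ISSUE_WIDTH = 4      # instructions decoded and issued per cycle


def simulate_front_end(cycles):
    """Return a list of (issued, buffered_at_end_of_cycle) tuples.

    Assumptions (hypothetical, for illustration only): fetch always hits
    the cache, fetch is limited only by free buffer entries, the
    instruction queues always have room, and instructions fetched in one
    cycle become available for decode in the next cycle.
    """
    buffered = 0
    history = []
    for _ in range(cycles):
        # Stage two: decode and issue from what is already buffered.
        issued = min(ISSUE_WIDTH, buffered)
        buffered -= issued
        # Stage one: fetch into whatever space is free.
        fetched = min(FETCH_WIDTH, BUFFER_ENTRIES - buffered)
        buffered += fetched
        history.append((issued, buffered))
    return history


history = simulate_front_end(5)
```

Under these assumptions the buffer fills within two cycles and issue then proceeds at the full width of four instructions per cycle, since fetch bandwidth exceeds issue bandwidth.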

In stage three, instructions in the instruction queues that are ready for execution have their operands read from the register files. The general-purpose register file contains 48 registers, of which 32 are general-purpose registers and 16 are rename registers for register renaming. To reduce the number of ports required to provide data and receive results, the general-purpose register file is duplicated so that there are two copies, the first supporting the three integer execution units and the second supporting the two load/store units. This scheme was similar to that of a contemporary microprocessor, the DEC Alpha 21264, but was simpler, as it did not require an extra clock cycle to synchronize the two copies because of the POWER3's longer cycle time. The floating-point register file contains 56 registers, of which 32 are floating-point registers and 24 are rename registers. Compared to the PowerPC 620, there were more rename registers, which allowed more instructions to be executed out of order, improving performance.
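
To show why the number of rename registers bounds how many uncommitted results can be in flight, here is a minimal free-list sketch. The class `RenameFile` and its methods are hypothetical names, and the allocation policy is a generic textbook scheme rather than a description of IBM's actual renaming logic; only the register counts come from the text.

```python
class RenameFile:
    """Toy rename-register allocator (illustrative, not IBM's mechanism)."""

    def __init__(self, architected, rename):
        self.architected = architected
        # Indices above the architected range stand in for rename registers.
        self.free = list(range(architected, architected + rename))

    def allocate(self):
        """Return a free rename register index, or None if none are free."""
        return self.free.pop(0) if self.free else None

    def release(self, index):
        """Return a rename register to the free list once its result commits."""
        self.free.append(index)


gpr = RenameFile(architected=32, rename=16)  # 48 registers in total
fpr = RenameFile(architected=32, rename=24)  # 56 registers in total

in_flight = [gpr.allocate() for _ in range(16)]
stalled = gpr.allocate()  # a 17th uncommitted integer result finds no free register
```

In this model a 17th integer result cannot be renamed until an earlier one commits, which is the sense in which more rename registers allow more instructions to execute out of order.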

Execution begins in stage four. The instruction queues dispatch up to eight instructions to the execution units. Integer instructions are executed in three integer execution units (termed "fixed-point units" by IBM). Two of the units are identical and execute all integer instructions except for multiply and divide. All instructions executed by them have a one-cycle latency. The third unit executes multiply and divide instructions. These instructions are not pipelined and have multi-cycle latencies. 64-bit multiply has a nine-cycle latency and 64-bit divide has a 37-cycle latency.
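
Because multiply and divide are not pipelined, back-to-back operations on the third unit cannot overlap, so their latencies simply add. The sketch below works through that arithmetic; the function name `unpipelined_cycles` is hypothetical, and the model assumes no other stalls.

```python
# Latencies in cycles for the third fixed-point unit, as given in the text.
LATENCY = {"mul64": 9, "div64": 37}


def unpipelined_cycles(ops):
    """Total cycles for back-to-back operations on a non-pipelined unit.

    Assumes each operation must finish before the next one starts and
    that nothing else stalls the unit (a simplification).
    """
    return sum(LATENCY[op] for op in ops)


cycles = unpipelined_cycles(["mul64", "mul64", "div64"])  # 9 + 9 + 37
```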

Floating-point instructions are executed in two floating-point units (FPUs). The FPUs are capable of fused multiply–add, where multiplication and addition are performed simultaneously. Such instructions, along with individual add and multiply, have a four-cycle latency. Divide and square-root instructions are executed in the same FPUs, but are assisted by specialized hardware. Single-precision (32-bit) divide and square-root instructions have a 14-cycle latency, whereas double-precision (64-bit) divide and square-root instructions have an 18-cycle and a 22-cycle latency, respectively.
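
A rough upper bound on floating-point throughput follows from the two FPUs, counting each fused multiply–add as two floating-point operations. The sketch below assumes that both units can start one fused multiply–add every cycle, which the text does not state explicitly; `peak_mflops` is a hypothetical helper, and the clock frequencies are the 200 MHz and 450 MHz figures quoted elsewhere in this article.

```python
FPUS = 2           # floating-point units, from the text
FLOPS_PER_FMA = 2  # one multiply plus one add per fused multiply-add


def peak_mflops(clock_mhz):
    """Theoretical peak in MFLOPS, assuming one FMA per FPU per cycle."""
    return FPUS * FLOPS_PER_FMA * clock_mhz


peak_power3 = peak_mflops(200)     # POWER3 at 200 MHz
peak_power3_ii = peak_mflops(450)  # POWER3-II at 450 MHz
```

These are theoretical ceilings under the stated assumption, not measured results.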

After execution is completed, the instructions are held in buffers before being committed and made visible to software. Execution finishes in stage five for integer instructions and stage eight for floating-point. Committing occurs during stage six for integers, stage nine for floating-point. Writeback occurs in the stage after commit. The POWER3 can retire up to four instructions per cycle.
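
The stage numbers in this paragraph can be checked against the pipeline depths given earlier (seven stages for integer, ten for floating-point). The helper `back_end_stages` is a hypothetical name used only for this consistency check.

```python
def back_end_stages(finish_stage):
    """Derive commit and writeback stages from the stage where execution finishes.

    Per the text, commit is the stage after execution finishes and
    writeback is the stage after commit.
    """
    commit = finish_stage + 1
    writeback = commit + 1
    return {"finish": finish_stage, "commit": commit, "writeback": writeback}


integer = back_end_stages(5)         # integer execution finishes in stage five
floating_point = back_end_stages(8)  # floating-point finishes in stage eight
```

Writeback lands in stage seven for integer instructions and stage ten for floating-point instructions, matching the stated pipeline lengths.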

Compared with the PowerPC 620, the data cache was optimized for technical and scientific applications. Its capacity was doubled to 64 KB to improve the cache-hit rate; the cache was dual-ported, implemented by interleaving eight banks, to enable two loads or two stores to be performed in one cycle in certain cases; and the line size was increased to 128 bytes. The L2 cache bus was doubled in width to 256 bits to compensate for the larger cache line size and to retain a four-cycle latency for cache refills.
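
The cache figures above imply a few derived quantities, worked through in the sketch below. The bank-interleaving granularity is not given in the text, so `INTERLEAVE_BYTES = 8` is purely an assumption for illustration, as are the helper names `bank_of` and `conflict`; the refill calculation also assumes one bus transfer per cycle.

```python
CACHE_BYTES = 64 * 1024  # data cache capacity
LINE_BYTES = 128         # cache line size
BANKS = 8                # interleaved banks
L2_BUS_BITS = 256        # L2 cache bus width

lines = CACHE_BYTES // LINE_BYTES                 # number of cache lines
beats_per_refill = LINE_BYTES * 8 // L2_BUS_BITS  # bus transfers per line refill

INTERLEAVE_BYTES = 8  # ASSUMPTION: interleave granularity is not stated in the text


def bank_of(address):
    """Bank selected by an address under the assumed interleaving."""
    return (address // INTERLEAVE_BYTES) % BANKS


def conflict(addr_a, addr_b):
    """True if two same-cycle accesses would target the same bank."""
    return bank_of(addr_a) == bank_of(addr_b)
```

A 128-byte line moved over a 256-bit bus takes four transfers, which is consistent with the four-cycle refill figure. Under the assumed interleaving, two accesses can proceed in the same cycle only when `conflict` is false, which is one reading of "in certain cases".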

The POWER3 contained 15 million transistors on a 270 mm² die. It was fabricated in IBM's CMOS-6S2 process, a complementary metal–oxide–semiconductor process that is a hybrid of 0.25 μm feature sizes and 0.35 μm metal layers. The process features five layers of aluminium. It was packaged in the same 1,088-column ceramic column grid array as the P2SC, but with a different pinout.

POWER3-II


The POWER3-II was an improved POWER3 that increased the clock frequency to 450 MHz. It contained 23 million transistors and measured 170 mm². It was fabricated in the IBM CMOS7S process, a 0.22 μm CMOS process with six levels of copper interconnect. It was succeeded by the POWER4 in 2001.
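
The effect of the process change can be seen by comparing transistor density between the two chips, using the transistor counts and die areas quoted in this article. The helper `density` is a hypothetical name, and the figures are whole-die averages only.

```python
def density(transistors, area_mm2):
    """Average transistors per square millimetre of die area."""
    return transistors / area_mm2


power3 = density(15_000_000, 270)     # POWER3: 15 million on 270 mm²
power3_ii = density(23_000_000, 170)  # POWER3-II: 23 million on 170 mm²
ratio = power3_ii / power3            # roughly 2.4 times denser
```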

Notes

  1. New IBM POWER3 chip.

Retrieved from "https://en.wikipedia.org/w/index.php?title=POWER3&oldid=1130573697"