AMD FireStream

From Wikipedia, the free encyclopedia

Brand name by AMD

AMD FireStream was AMD's brand name for their Radeon-based product line targeting stream processing and/or GPGPU in supercomputers. Originally developed by ATI Technologies around the Radeon X1900 XTX in 2006, the product line was previously branded as both ATI FireSTREAM and AMD Stream Processor.^[1] The AMD FireStream can also be used as a floating-point co-processor for offloading CPU calculations, which is part of the Torrenza initiative. The FireStream line was discontinued in 2012, when GPGPU workloads were entirely folded into the AMD FirePro line.

Overview

The FireStream line is a series of add-on expansion cards released from 2006 to 2010, based on standard Radeon GPUs but designed to serve as a general-purpose co-processor rather than rendering and outputting 3D graphics. Like the FireGL/FirePro line, they were given more memory and memory bandwidth, but the FireStream cards do not necessarily have video output ports. All support 32-bit single-precision floating point, and all but the first release support 64-bit double-precision. The line was partnered with new APIs to provide higher performance than existing OpenGL and Direct3D shader APIs could provide, beginning with Close to Metal, followed by OpenCL and the Stream Computing SDK, and eventually integrated into the APP SDK.

For highly parallel floating-point math workloads, the cards can speed up large computations by more than 10 times; Folding@home, the earliest and one of the most visible users of GPGPU, obtained 20-40 times the CPU performance.^[2] Each pixel and vertex shader, or unified shader in later models, can perform arbitrary floating-point calculations.

History

Following the release of the Radeon R520 and GeForce G70 GPU cores with programmable shaders, the large floating-point throughput drew attention from academic and commercial groups, who experimented with using them for non-graphics work. The interest led ATI (and Nvidia) to create GPGPU products, able to calculate general-purpose mathematical formulas in a massively parallel way, to process heavy calculations traditionally done on CPUs and specialized floating-point math co-processors. GPGPUs were projected to have immediate performance gains of a factor of 10 or more compared to contemporary multi-socket CPU-only calculation.

With the development of the high-performance X1900 XTX nearly finished, ATI based its first Stream Processor design on it, announcing it as the upcoming ATI FireSTREAM together with the new Close to Metal API at SIGGRAPH 2006.^[3] The core itself was mostly unchanged, except for doubling the onboard memory and bandwidth, similar to the FireGL V7350; new driver and software support made up most of the difference. Folding@home began using the X1900 for general computation, using a pre-release of version 6.5 of the ATI Catalyst driver, and reported a 20-40x improvement on the GPU over the CPU.^[2] The first product was released in late 2006, rebranded as AMD Stream Processor after the merger with AMD.^[4]

The brand became AMD FireStream with the second generation of stream processors in 2007, based on the RV670 chip with new unified shaders and double-precision support.^[5] Asynchronous DMA also improved performance by allowing a larger memory pool without the CPU's help. One model was released, the 9170, for the initial price of $1999. Plans included the development of a stream processor on an MXM module by 2008, for laptop computing,^[6] but it was never released.

The third generation quickly followed in 2008 with dramatic performance improvements from the RV770 core; the 9250 had nearly double the performance of the 9170 and became the first single-chip teraflop processor, even though its price dropped to under $1000.^[7] A faster sibling, the 9270, was released shortly after for $1999.

In 2010 the final generation of FireStreams came out, the 9350 and 9370 cards, based on the Cypress chip featured in the HD 5800. This generation again doubled the performance relative to the previous one, to 2 teraflops in the 9350 and 2.6 teraflops in the 9370,^[8] and was the first built from the ground up for OpenCL. This generation was also the only one to feature fully passive cooling; active cooling was unavailable.

The Northern and Southern Islands generations were skipped, and in 2012, AMD announced that the new FirePro W (workstation) and S (server) series based on the new Graphics Core Next architecture would take the place of FireStream cards.^[9]

Models

FireStream 9170 supports Direct3D 10.1, OpenGL 3.3 and APP Stream
FireStream 92x0 supports Direct3D 10.1, OpenGL 3.3 and OpenCL 1.0
FireStream 93x0 supports Direct3D 11, OpenGL 4.3 and OpenCL 1.2 with the latest driver updates

Model (Codename)	Launch	Architecture (Fab)	Bus interface	Stream processors	Core clock (MHz)	Memory clock (MHz)	Memory size (MB)	Memory type	Memory bus width (bit)	Memory bandwidth (GB/s)	Single precision^[a] (GFLOPS)	Double precision^[a] (GFLOPS)	TDP (W)
Stream Processor (R580)	2006	R500 80 nm		240	600		1024	GDDR3	256	83.2	375^[10]	N/A	165
FireStream 9170 (RV670)^[11]^[12]	November 8, 2007	TeraScale 1 55 nm	PCIe 2.0 x16	320	800	800	2048	GDDR3	256	51.2	512	102.4	105
FireStream 9250 (RV770)^[13]^[14]	June 16, 2008	TeraScale 1 55 nm	PCIe 2.0 x16	800	625	993	1024	GDDR3	256	63.6	1000	200	150
FireStream 9270 (RV770)^[15]^[16]	November 13, 2008	TeraScale 1 55 nm	PCIe 2.0 x16	800	750	850	2048	GDDR5	256	108.8	1200	240	160
FireStream 9350 (Cypress XT)^[17]	June 23, 2010	TeraScale 2 40 nm	PCIe 2.1 x16	1440	700	1000	2048	GDDR5	256	128	2016	403.2	150
FireStream 9370 (Cypress XT)^[18]	June 23, 2010	TeraScale 2 40 nm	PCIe 2.1 x16	1600	825	1150	4096	GDDR5	256	147.2	2640	528	225

^[a] Precision performance is calculated from the base (or boost) core clock speed, based on an FMA operation.
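
The footnote above implies a simple rule for the peak figures in the table: each stream processor completes one FMA, counted as two floating-point operations, per core clock. The following Python sketch reproduces the FireStream rows under that assumption. The function names are illustrative only, and two details are inferred from the table rather than stated in the text: the double-precision figures are all one fifth of the single-precision ones, and the bandwidth figures match 2 transfers per listed memory clock for the GDDR3 cards and 4 for the GDDR5 cards.

```python
# Sketch: reproduce the table's peak figures from its own columns.
# Assumptions (inferred from the table, not stated by the source):
#   - one FMA = 2 floating-point operations per stream processor per clock
#   - double precision runs at 1/5 of the single-precision rate
#   - GDDR3 moves 2 transfers per listed memory clock, GDDR5 moves 4

TRANSFERS_PER_CLOCK = {"GDDR3": 2, "GDDR5": 4}


def peak_single_gflops(stream_processors, core_mhz):
    """Peak single-precision GFLOPS, counting each FMA as 2 operations."""
    return stream_processors * 2 * core_mhz / 1000.0


def peak_double_gflops(stream_processors, core_mhz):
    """Peak double-precision GFLOPS, assuming a 1/5 rate (see above)."""
    return peak_single_gflops(stream_processors, core_mhz) / 5.0


def memory_bandwidth_gbs(bus_width_bits, memory_mhz, memory_type):
    """Peak memory bandwidth in GB/s from bus width and memory clock."""
    bytes_per_transfer = bus_width_bits / 8
    transfers = TRANSFERS_PER_CLOCK[memory_type]
    return bytes_per_transfer * memory_mhz * transfers / 1000.0


if __name__ == "__main__":
    # (model, stream processors, core MHz, memory MHz, memory type),
    # copied from the FireStream rows of the table; all use a 256-bit bus.
    cards = [
        ("FireStream 9170", 320, 800, 800, "GDDR3"),
        ("FireStream 9250", 800, 625, 993, "GDDR3"),
        ("FireStream 9270", 800, 750, 850, "GDDR5"),
        ("FireStream 9350", 1440, 700, 1000, "GDDR5"),
        ("FireStream 9370", 1600, 825, 1150, "GDDR5"),
    ]
    for model, sps, core, mem, mem_type in cards:
        print(
            model,
            peak_single_gflops(sps, core),
            peak_double_gflops(sps, core),
            round(memory_bandwidth_gbs(256, mem, mem_type), 1),
        )
```

The 2006 Stream Processor row is left out of the sketch: its memory clock is not listed, and its 375 GFLOPS figure does not follow from this formula (240 × 2 × 0.6 GHz would give 288), so it was presumably counted differently.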

Software

Software Development Kit

After abandoning their short-lived Close to Metal API, AMD focused on OpenCL. AMD first released its Stream Computing SDK (v1.0) in December 2007 under the AMD EULA, to be run on Windows XP.^[19] The SDK includes "Brook+", an AMD hardware-optimized version of the Brook language developed by Stanford University, itself a variant of ANSI C (the C language), open-sourced and optimized for stream computing. The AMD Core Math Library (ACML) and AMD Performance Library (APL), with optimizations for the AMD FireStream, and the COBRA video library (later renamed "Accelerated Video Transcoding" or AVT) for video transcoding acceleration are also included. Another important part of the SDK, the Compute Abstraction Layer (CAL), is a software development layer aimed at low-level access, through the CTM hardware interface, to the GPU architecture, for performance-tuning software written in various high-level programming languages.

In August 2011, AMD released version 2.5 of the ATI APP Software Development Kit,^[19] which includes support for OpenCL 1.1, a parallel computing language developed by the Khronos Group. The concept of compute shaders, officially called DirectCompute, in Microsoft's DirectX 11 API is already included in graphics drivers with DirectX 11 support.

AMD APP SDK

Main article: AMD APP SDK

Benchmarks

According to an AMD-demonstrated system^[20] with two dual-core AMD Opteron processors and two Radeon R600 GPU cores running on Microsoft Windows XP Professional, 1 teraflop (TFLOP) can be achieved by a universal multiply-add (MADD) calculation. By comparison, an Intel Core 2 Quad Q9650 3.0 GHz processor at the time could achieve 48 GFLOPS.^[21]

In a 2007 demonstration, Kaspersky SafeStream anti-virus scanning that had been optimized for AMD stream processors ran 21 times faster with RV670-based acceleration than with the search running entirely on an Opteron.^[22]

Limitations

Recursive functions are not supported in Brook+ because all function calls are inlined at compile time. Using CAL, functions (recursive or otherwise) are supported to 32 levels.^[23]
Only bilinear texture filtering is supported; mipmapped textures and anisotropic filtering are not supported.
Functions cannot have a variable number of arguments. The same problem occurs for recursive functions.
Conversion of floating-point numbers to integers on GPUs is done differently than on x86 CPUs; it is not fully IEEE-754 compliant.
Doing "global synchronization" on the GPU is not very efficient, which forces the GPU to divide the kernel and do synchronization on the CPU. Given the variable number of multiprocessors and other factors, there may not be a perfect solution to this problem.
The bus bandwidth and latency between the CPU and the GPU may become a bottleneck.
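
The last limitation can be made concrete with a back-of-the-envelope estimate. The Python sketch below compares the time needed to move data across the bus with the time needed to compute on it, for a hypothetical SAXPY-style kernel (y = a*x + y over n single-precision values). The 8 GB/s bus figure is the nominal per-direction peak of a PCIe 2.0 x16 link, and the 1000 GFLOPS figure is the FireStream 9250's peak from the models table; both are upper bounds rather than measured values, and the workload and function names are assumptions chosen for illustration.

```python
# Sketch: bus transfer time vs. compute time for a hypothetical kernel.
# Assumptions (not from the source): a SAXPY-style workload, a nominal
# 8 GB/s PCIe 2.0 x16 link, and the card running at its peak GFLOPS.


def transfer_seconds(num_bytes, bus_gb_per_s):
    """Time to move num_bytes over a bus with the given peak rate."""
    return num_bytes / (bus_gb_per_s * 1e9)


def compute_seconds(num_flops, peak_gflops):
    """Time to execute num_flops at the given peak rate."""
    return num_flops / (peak_gflops * 1e9)


def saxpy_estimate(n, bus_gb_per_s=8.0, peak_gflops=1000.0):
    """Return (transfer time, compute time, ratio) for y = a*x + y."""
    bytes_moved = 3 * n * 4  # copy x and y in, copy y out; 4 bytes each
    flops = 2 * n            # one multiply and one add per element
    t_bus = transfer_seconds(bytes_moved, bus_gb_per_s)
    t_gpu = compute_seconds(flops, peak_gflops)
    return t_bus, t_gpu, t_bus / t_gpu


if __name__ == "__main__":
    t_bus, t_gpu, ratio = saxpy_estimate(10_000_000)
    print(f"bus: {t_bus:.6f} s, compute: {t_gpu:.6f} s, ratio: {ratio:.0f}")
```

Under these assumptions the transfer takes about 750 times longer than the arithmetic, independent of n, which is why workloads that perform few operations per byte transferred gain little from being offloaded.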

References

[1] AMD Press Release.
[2] Gasior, Geoff (October 16, 2006). "A closer look at Folding@home on the GPU". The Tech Report. Retrieved 2016-05-26.
[3] ATI SIGGRAPH 2006 Presentation (PDF) (Report). ATI Technologies. Archived from the original (PDF) on 2016-12-21. Retrieved 2016-05-26.
[4] Valich, Theo (November 16, 2006). "ATI FireSTREAM AMD Stream board revealed". The Inquirer. Archived from the original on August 21, 2009. Retrieved 2016-05-26.
[5] "AMD Delivers First Stream Processor with Double Precision Floating Point Technology". AMD. November 8, 2007. Archived from the original on 2017-06-19. Retrieved 2016-05-26.
[6] AMD WW HPC 2007 presentation (PDF) (Report). p. 37.
[7] "AMD Stream Processor First to Break 1 Teraflop Barrier". AMD. June 16, 2008. Archived from the original on 2017-06-19. Retrieved 2016-05-26.
[8] "Newest AMD FireStream(TM) GPU Compute Accelerators Deliver Almost 2x Single and Double Precision Peak Performance and Performance Per Watt Over Last Generation". AMD. June 23, 2010. Archived from the original on 2017-06-19. Retrieved 2016-05-26.
[9] Smith, Ryan (14 August 2012). "The AMD Firepro W9000 W8000 Review Part 1". Anandtech.com. Archived from the original on August 18, 2012. Retrieved 28 June 2016.
[10] "Beyond3D - ATI R580: Radeon X1900 XTX & Crossfire". Beyond3D.
[11] "AMD Delivers First Stream Processor with Double Precision Floating Point Technology". AMD. November 8, 2007. Retrieved 2016-05-26.
[12] "AMD FireStream 9170 Specs". TechPowerUp.
[13] AMD FireStream 9250 - Product page. Archived May 13, 2010, at the Wayback Machine.
[14] "AMD FireStream 9250 Specs". TechPowerUp.
[15] AMD FireStream 9270 - Product page. Archived February 16, 2010, at the Wayback Machine.
[16] "AMD FireStream 9270 Specs". TechPowerUp.
[17] "AMD FireStream 9350 Specs". TechPowerUp.
[18] "AMD FireStream 9370 Specs". TechPowerUp.
[19] AMD APP SDK download page, archived 2012-09-03 at the Wayback Machine, and Stream Computing SDK EULA, archived March 6, 2009, at the Wayback Machine. Retrieved December 29, 2007.
[20] HardOCP report. Archived 2016-03-04 at the Wayback Machine. Retrieved July 17, 2007.
[21] Intel microprocessor export compliance metrics.
[22] Valich, Theo (September 12, 2007). "GPGPU drastically accelerates anti-virus software". The Inquirer. Archived from the original on September 23, 2009. Retrieved 2016-05-26.
[23] AMD Intermediate Language Reference Guide, August 2008.
