Physics processing unit

From Wikipedia, the free encyclopedia

Type of dedicated microprocessor

A physics processing unit (PPU) is a dedicated microprocessor designed to handle physics calculations, especially in the physics engine of video games. It is an example of hardware acceleration.

Calculations that a PPU might handle include rigid body dynamics, soft body dynamics, collision detection, fluid dynamics, hair and clothing simulation, finite element analysis, and fracturing of objects.
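As an illustration of the simplest kind of collision detection listed above, the following is a minimal sketch of a sphere-sphere overlap test. The `Sphere` type and the `spheres_overlap` function are hypothetical names chosen for this example; they are not taken from any PPU SDK.

```python
from dataclasses import dataclass


@dataclass
class Sphere:
    # Centre coordinates and radius, all in the same arbitrary length unit.
    x: float
    y: float
    z: float
    radius: float


def spheres_overlap(a: Sphere, b: Sphere) -> bool:
    """Return True when two spheres touch or intersect."""
    dx, dy, dz = a.x - b.x, a.y - b.y, a.z - b.z
    reach = a.radius + b.radius
    # Compare squared distances so that no square root is needed per test.
    return dx * dx + dy * dy + dz * dz <= reach * reach
```

A test of this form involves only a handful of multiplications and additions per pair, which is one reason such work lends itself to wide, parallel floating-point hardware.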

The idea is for a specialized processor to offload time-consuming tasks from a computer's CPU, much as a GPU performs graphics operations in the main CPU's place. The term was coined by Ageia to describe its PhysX chip. Several other technologies in the CPU-GPU spectrum have some features in common with it, although Ageia's product was the only complete one designed, marketed, supported, and placed within a system exclusively as a PPU.

History

An early academic PPU research project^[1]^[2] named SPARTA (Simulation of Physics on A Real-Time Architecture) was carried out at Penn State^[3] and the University of Georgia. It was a simple FPGA-based PPU limited to two dimensions. The project was extended into a considerably more advanced ASIC-based system named HELLAS.

February 2006 saw the release of the first dedicated PPU, PhysX, from Ageia (later merged into Nvidia). The unit is most effective in accelerating particle systems, with only a small performance improvement measured for rigid body physics.^[4] The Ageia PPU is documented in depth in the company's US patent application #20050075849.^[5] Nvidia no longer produces dedicated PPUs, although hardware-accelerated physics processing is now supported on some of its graphics processing units.

Figures (academic PPU research projects): example SPARTA animation; SPARTA printed circuit board; HELLAS die photo.

AGEIA PhysX

The first processor to be advertised as a PPU was the PhysX chip, introduced by a fabless semiconductor company called AGEIA. Games wishing to take advantage of the PhysX PPU must use AGEIA's PhysX SDK (formerly known as the NovodeX SDK).

It consists of a general-purpose RISC core controlling an array of custom SIMD floating-point VLIW processors working in local banked memories, with a switch fabric to manage transfers between them. There is no cache hierarchy like that of a CPU or GPU.

PhysX cards were available from three companies, much as video cards are manufactured: ASUS, BFG Technologies,^[6] and ELSA Technologies were the primary manufacturers. PCs with the cards already installed were available from system builders such as Alienware, Dell, and Falcon Northwest.^[7]

In February 2008, Nvidia bought Ageia Technologies. Nvidia eventually cut off the ability to process PhysX on the AGEIA PPU and on NVIDIA GPUs in systems with active ATI/AMD GPUs, so it seemed that PhysX would remain exclusive to Nvidia. In March 2008, however, Nvidia announced that it would make PhysX an open standard for everyone,^[8] so that the main graphics-processor manufacturers would have PhysX support in their next-generation graphics cards. Nvidia also announced that PhysX would become available on some of its already-released graphics cards through a driver update.

See physics engine for a discussion of academic research PPU projects.

PhysX P1 (PPU) hardware specifications

ASUS and BFG Technologies bought licenses to manufacture alternate versions of AGEIA's PPU, the PhysX P1 with 128 MB GDDR3:

- Multi-core device based on the MIPS architecture with integrated physics acceleration hardware and memory subsystem with "tons of cores"^[9]^[10]
  - 125 million transistors^[11]
  - 182 mm² die size
  - Fabrication process: 130 nm
  - Peak power consumption: 30 W
- Memory: 128 MB GDDR3 RAM with 128-bit interface
- 32-bit PCI 3.0 (ASUS also made a PCI Express version card)
- Sphere collision tests: 530 million per second (maximum capability)
- Convex collision tests: 530,000 per second (maximum capability)
- Peak instruction bandwidth: 20 billion per second

Havok FX

The Havok SDK is a major competitor to the PhysX SDK, used in more than 150 games, including major titles such as Half-Life 2, Halo 3, and Dead Rising.^[12]

To compete with the PhysX PPU, an edition known as Havok FX was to take advantage of multi-GPU technology from ATI (AMD CrossFire) and NVIDIA (SLI), using existing cards to accelerate certain physics calculations.^[13]

Havok divides the physics simulation into effect and gameplay physics, with effect physics being offloaded (if possible) to the GPU as Shader Model 3.0 instructions and gameplay physics being processed on the CPU as normal. The important distinction between the two is that effect physics do not affect gameplay (dust or small debris from an explosion, for example); the vast majority of physics operations are still performed in software. This approach differs significantly from the PhysX SDK, which moves all calculations to the PhysX card if it is present.
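The split described above can be sketched as a simple routing decision. This is only an illustration of the idea, not Havok's API: `PhysicsTask`, `affects_gameplay`, and `partition_tasks` are hypothetical names, and the queues stand in for whatever mechanism actually submits work to the CPU or GPU.

```python
from dataclasses import dataclass


@dataclass
class PhysicsTask:
    name: str
    affects_gameplay: bool  # hypothetical flag: True for gameplay physics


def partition_tasks(tasks, gpu_available):
    """Route effect physics to the GPU when possible, everything else to the CPU."""
    cpu_queue, gpu_queue = [], []
    for task in tasks:
        if not task.affects_gameplay and gpu_available:
            gpu_queue.append(task)  # effect physics: cosmetic only
        else:
            cpu_queue.append(task)  # gameplay physics, or no GPU to offload to
    return cpu_queue, gpu_queue


tasks = [
    PhysicsTask("player_vs_crate", affects_gameplay=True),
    PhysicsTask("explosion_debris", affects_gameplay=False),
    PhysicsTask("dust_cloud", affects_gameplay=False),
]
cpu_queue, gpu_queue = partition_tasks(tasks, gpu_available=True)
```

With a GPU present, only the debris and dust tasks in this example are offloaded; without one, every task falls back to the CPU queue.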

Since Havok's acquisition by Intel, Havok FX appears to have been shelved or cancelled.^[14]

PPU vs. GPUs

The drive toward GPGPU has made GPUs more suitable for the job of a PPU. DX10 added integer data types, a unified shader architecture, and a geometry shader stage, which allow a broader range of algorithms to be implemented. Modern GPUs support compute shaders, which run across an indexed space and do not require any graphical resources, just general-purpose data buffers. Nvidia CUDA provides a little more in the way of inter-thread communication and a scratchpad-style workspace associated with the threads.
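The compute-shader model mentioned above can be illustrated with a small sketch in which a kernel is invoked once per index over plain data buffers. The sequential `dispatch` loop is only a stand-in for the parallel launch a GPU would perform, and all names here (`integrate_kernel`, `dispatch`, the gravity constant) are assumptions made for this example rather than part of any graphics API.

```python
def integrate_kernel(i, positions, velocities, dt, gravity=-9.81):
    """Update one particle; each invocation touches only index i."""
    velocities[i] += gravity * dt        # semi-implicit Euler: velocity first
    positions[i] += velocities[i] * dt   # then position, using the new velocity


def dispatch(kernel, count, *args):
    """Run the kernel over the index space 0..count-1.

    A GPU would execute these invocations in parallel; this loop is sequential.
    """
    for i in range(count):
        kernel(i, *args)


# Plain buffers of heights and vertical velocities, with no graphical resources.
positions = [10.0, 20.0, 30.0]
velocities = [0.0, 0.0, 0.0]
dispatch(integrate_kernel, len(positions), positions, velocities, 0.1)
```

Because each invocation reads and writes only its own index, the invocations are independent of one another, which is the property that lets such a kernel be spread across many GPU threads.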

Nonetheless, GPUs are built around a larger number of longer-latency, slower threads, are designed around texture and framebuffer data paths, and have poor branching performance; this distinguishes them from PPUs and Cell, making them less well optimized for taking over game world simulation tasks.

The Codeplay Sieve compiler supports the PPU, indicating that the Ageia PhysX chip would be suitable for GPGPU-type tasks. However, Ageia seems unlikely to pursue this market.

PS2 – VU0

Although very different from the PhysX, one could argue that the PlayStation 2's VU0 is an early, limited implementation of a PPU. Conversely, one could describe a PPU to a PS2 programmer as an evolved replacement for VU0. VU0's feature set and placement within the system are geared toward accelerating game update tasks including physics and AI; it can offload such calculations, working off its own instruction stream while the CPU is operating on something else. Being a DSP, however, it is much more dependent on the CPU to do useful work in a game engine, and would not be capable of implementing a full physics API, so it cannot be classed as a PPU. VU0 is also capable of providing additional vertex processing power, though this is more a property of the pathways in the system than of the unit itself.

This usage is similar to Havok FX or GPU physics in that an auxiliary unit's general purpose floating point power is used to complement the CPU in either graphics or physics roles.

References

1. S. Yardi, B. Bishop, T. Kelliher, "HELLAS: A Specialised Architecture for Interactive Deformable Object Modeling", ACM Southeast Conference, Melbourne, FL, March 10–12, 2006, pp. 56–61.
2. B. Bishop, T. Kelliher, "Specialized Hardware for Deformable Object Modeling", IEEE Transactions on Circuits and Systems for Video Technology, 13(11):1074–1079, Nov. 2003.
3. "SPARTA Homepage". Cse.psu.edu. Archived from the original on 2010-07-30. Retrieved 2010-08-16.
4. "Exclusive: ASUS Debuts AGEIA PhysX Hardware". AnandTech. Retrieved 2010-08-16.
5. "United States Patent Application: 0050086040". Appft1.uspto.gov. Archived from the original on 2020-02-10. Retrieved 2010-08-16.
6. ":::News Release:::". Archived from the original on 2006-04-26. Retrieved 2011-06-08.
7. "BFG Tech ad for the PhysX". Maximum PC. Future US. May 2006. p. 6. ISSN 1522-4279. Retrieved 2009-09-16.
8. "Nvidia offers PhysX support to AMD / ATI". Archived 2008-03-13 at the Wayback Machine.
9. "PhysX FAQ". NVIDIA Corporation. 28 November 2018.
10. Nicholas Blachford (2006). "Lets Get Physical: Inside The PhysX Physics Processor".
11. "ASUS's AGEIA PhysX P1 Card". Legit Reviews. http://www.legitreviews.com/article/346/2/
12. "Games using Havok". Archived from the original on 2012-04-15. Retrieved 2007-02-19.
13. "Havok FX product information". Archived 2007-03-02 at the Wayback Machine.
14. Shilov, Anton (2007-11-19). "GPU Physics Dead for Now, Says AMD's Developer Relations Chief". Xbit Laboratories. Archived from the original on 2011-12-01. Retrieved 2007-11-26.
15. https://www.digipart.com/part/UA6528

External links

AGEIA Official Website (no longer available)
AGEIA Physx Processor Website (no longer available)
Projects using PhysX SDK (no longer available)
BFG AGEIA PhysX Card Review
Planet PhysX News & Information Page (no longer available)
PC Hardware: AGEIA PhysX Interview (no longer available)
PC Perspective: AGEIA PhysX Physics Processing Unit Preview (no longer available)
Havok FX physics engine (middleware library) SDK (no longer available)
NVIDIA CUDA Toolkit and SDK
PhysX Toolkit and SDK
