
Volta (microarchitecture)

From Wikipedia, the free encyclopedia
GPU microarchitecture by Nvidia
Nvidia Volta
Release date: December 7, 2017
Codename: Volta
Fabrication process: TSMC 12 nm (FinFET)
Cards
Enthusiast
  • Tesla V100
  • Tesla V100S PCIe
  • Titan V
  • Titan V CEO Edition
  • Quadro GV100
History
Predecessor: Pascal
Variant: Turing (consumer, professional)
Successor: Ampere (consumer, professional)
Support status
Limited support until October 2025
Security updates until October 2028[1]
Painting of Alessandro Volta, eponym of architecture

Volta is the codename, but not the trademark,[2] for a GPU microarchitecture developed by Nvidia, succeeding Pascal. It was first announced on a roadmap in March 2013,[3] although the first product was not announced until May 2017.[4] The architecture is named after the 18th–19th century Italian chemist and physicist Alessandro Volta. It was Nvidia's first architecture to feature Tensor Cores, specially designed cores that deliver higher deep learning performance than regular CUDA cores.[5] The architecture is produced with TSMC's 12 nm FinFET process. The Ampere microarchitecture is the successor to Volta.

The first graphics card to use it was the datacenter Tesla V100, e.g. as part of the Nvidia DGX-1 system.[4] It has also been used in the Quadro GV100 and Titan V. There were no mainstream GeForce graphics cards based on Volta.

After two USPTO proceedings,[6][7] Nvidia lost the Volta trademark application in the field of artificial intelligence on July 3, 2023. The owner of the Volta trademark[8] remains Volta Robots, a company specializing in AI and vision algorithms for robots and unmanned vehicles.

Details


Architectural improvements of the Volta architecture include the following:

  • CUDA Compute Capability 7.0
    • concurrent execution of integer and floating point operations
  • TSMC's 12 nm FinFET process,[9] allowing 21.1 billion transistors.[10]
  • High Bandwidth Memory 2 (HBM2)[9][11]
  • NVLink 2.0: a high-bandwidth bus between the CPU and GPU, and between multiple GPUs. It allows much higher transfer speeds than those achievable with PCI Express and is estimated to provide 25 Gbit/s per lane.[12] (Disabled for Titan V)
  • Tensor cores: a tensor core is a unit that multiplies two 4×4 FP16 matrices and then adds a third FP16 or FP32 matrix to the result using fused multiply–add operations, producing an FP32 result that can optionally be demoted to FP16.[13] Tensor cores are intended to speed up the training of neural networks.[13] Volta's Tensor cores are first generation, while Ampere has third-generation Tensor cores.[14][15]
  • PureVideo Feature Set I hardware video decoding
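
The tensor core operation described in the list above (D = A × B + C on 4×4 matrices, FP16 inputs, FP32 result) can be sketched in plain Python. This is only an illustrative numerical model, not Nvidia's implementation: the function names are hypothetical, FP16 and FP32 rounding are emulated with the standard `struct` module, and the order in which the hardware accumulates products is an assumption.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def tensor_core_mma(a, b, c, demote_to_fp16=False):
    """Emulate D = A x B + C for 4x4 matrices (hypothetical helper).

    A and B are rounded to FP16 before multiplying; the running sum is
    rounded to FP32 after every addition. The accumulation order is an
    assumption made for this sketch.
    """
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])
            for k in range(n):
                product = to_fp16(a[i][k]) * to_fp16(b[k][j])
                acc = to_fp32(acc + product)
            d[i][j] = to_fp16(acc) if demote_to_fp16 else acc
    return d


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    b = [[0.5 * (i + j) for j in range(4)] for i in range(4)]
    c = [[1.0] * 4 for _ in range(4)]
    for row in tensor_core_mma(identity, b, c):
        print(row)
```

Keeping the accumulator in FP32 while the inputs stay in FP16 is what makes the unit "mixed precision": the inputs are cheap to store and move, while the sum retains more precision than an FP16 accumulator would.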

Comparison of Compute Capability: GP100 vs GV100 vs GA100[16]

GPU features | Nvidia Tesla P100 | Nvidia Tesla V100 | Nvidia A100
GPU codename | GP100 | GV100 | GA100
GPU architecture | Nvidia Pascal | Nvidia Volta | Nvidia Ampere
Compute capability | 6.0 | 7.0 | 8.0
Threads / warp | 32 | 32 | 32
Max warps / SM | 64 | 64 | 64
Max threads / SM | 2048 | 2048 | 2048
Max thread blocks / SM | 32 | 32 | 32
Max 32-bit registers / SM | 65536 | 65536 | 65536
Max registers / block | 65536 | 65536 | 65536
Max registers / thread | 255 | 255 | 255
Max thread block size | 1024 | 1024 | 1024
FP32 cores / SM | 64 | 64 | 64
Ratio of SM registers to FP32 cores | 1024 | 1024 | 1024
Shared Memory Size / SM | 64 KB | Configurable up to 96 KB | Configurable up to 164 KB
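
Several rows of the table above follow from one another by simple arithmetic. The sketch below checks those relationships and estimates how many threads can be resident on one SM for a given register budget. The helper names are hypothetical, and the occupancy estimate is deliberately simplified: it considers only the register file and rounds down to whole warps, whereas real hardware also has allocation granularity, shared memory, and block-count limits.

```python
# Per-SM limits copied from the table above (identical for GP100, GV100, GA100).
THREADS_PER_WARP = 32
MAX_THREADS_PER_SM = 2048
REGISTERS_PER_SM = 65536
FP32_CORES_PER_SM = 64
MAX_REGISTERS_PER_THREAD = 255


def max_warps_per_sm():
    """Max warps / SM = max threads / SM divided by threads / warp."""
    return MAX_THREADS_PER_SM // THREADS_PER_WARP


def registers_per_fp32_core():
    """Ratio of SM registers to FP32 cores."""
    return REGISTERS_PER_SM // FP32_CORES_PER_SM


def resident_threads(registers_per_thread):
    """Upper bound on resident threads per SM, limited only by registers.

    Simplifying assumption: registers are handed out per thread and the
    result is rounded down to a whole number of warps.
    """
    if not 1 <= registers_per_thread <= MAX_REGISTERS_PER_THREAD:
        raise ValueError("registers_per_thread must be between 1 and 255")
    by_registers = REGISTERS_PER_SM // registers_per_thread
    by_registers -= by_registers % THREADS_PER_WARP
    return min(MAX_THREADS_PER_SM, by_registers)


if __name__ == "__main__":
    print(max_warps_per_sm(), registers_per_fp32_core())
    for regs in (32, 64, 128, 255):
        print(regs, resident_threads(regs))
```

Under these assumptions a kernel can keep all 2048 threads resident only if each thread uses at most 32 registers (65536 / 2048).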

Comparison of Precision Support Matrix[17][18]

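Supported CUDA Core Precisions

GPU | FP16 | FP32 | FP64 | INT1 | INT4 | INT8 | TF32 | BF16
Nvidia Tesla P4 | No | Yes | Yes | No | No | Yes | No | No
Nvidia P100 | Yes | Yes | Yes | No | No | No | No | No
Nvidia Volta | Yes | Yes | Yes | No | No | Yes | No | No
Nvidia Turing | Yes | Yes | Yes | No | No | No | No | No
Nvidia A100 | Yes | Yes | Yes | No | No | Yes | No | Yes

Supported Tensor Core Precisions

GPU | FP16 | FP32 | FP64 | INT1 | INT4 | INT8 | TF32 | BF16
Nvidia Tesla P4 | No | No | No | No | No | No | No | No
Nvidia P100 | No | No | No | No | No | No | No | No
Nvidia Volta | Yes | No | No | No | No | No | No | No
Nvidia Turing | Yes | No | No | Yes | Yes | Yes | No | No
Nvidia A100 | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes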


Comparison of Decode Performance

Concurrent streams | H.264 decode (1080p30) | H.265 (HEVC) decode (1080p30) | VP9 decode (1080p30)
V100 | 16 | 22 | 22
A100 | 75 | 157 | 108

Products


Volta has been announced as the GPU microarchitecture within the Xavier generation of Tegra SoCs, focusing on self-driving cars.[19][20]

At its annual GPU Technology Conference keynote on May 10, 2017, Nvidia officially announced the Volta microarchitecture along with the Tesla V100.[4] The Volta GV100 GPU is built on a 12 nm process and uses HBM2 memory with 900 GB/s of bandwidth.[21]
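
The 900 GB/s figure is consistent with the memory bus width and per-pin data rate that this article lists for the V100 (4096-bit, 1.75 Gbit/s) and for the Titan V (3072-bit, 1700 MT/s). A minimal sketch of that arithmetic follows; the function name is hypothetical and the result is a theoretical peak, not a measured value.

```python
def peak_bandwidth_gb_per_s(bus_width_bits, data_rate_gbit_per_s):
    """Theoretical peak bandwidth: bus width times per-pin rate, in bytes."""
    return bus_width_bits * data_rate_gbit_per_s / 8


if __name__ == "__main__":
    # V100: 4096-bit bus at 1.75 Gbit/s per pin, marketed as 900 GB/s.
    print(peak_bandwidth_gb_per_s(4096, 1.75))
    # Titan V: 3072-bit bus at 1.7 Gbit/s per pin.
    print(peak_bandwidth_gb_per_s(3072, 1.7))
```

The V100 inputs give 896 GB/s, which the sources round to 900 GB/s.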

Nvidia officially announced the Nvidia TITAN V on December 7, 2017.[22][23]

Nvidia officially announced the Quadro GV100 on March 27, 2018.[24]

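Nvidia Titan V[25]
  • Launch: December 7, 2017
  • Code name: GV100-400-A1
  • Fab: TSMC 12 nm
  • Transistors: 21.1 billion
  • Die size: 815 mm²
  • Bus interface: PCIe 3.0 ×16
  • Core config (CUDA core[c]): 5120:320:96
  • Core config (Tensor core[d]): 640
  • SM count[a]: 80
  • Graphics Processing Clusters[b]: 6
  • L2 cache size: 4.5 MiB
  • Clock speeds: base core 1200 MHz, boost 1455 MHz, memory 1700 MT/s
  • Fillrate: pixel 139.7 GP/s, texture 465.6 GT/s
  • Memory: size 12 GiB, bandwidth 652.8 GB/s, bus type HBM2, bus width 3072 bit
  • Processing power in GFLOPS (boost): single precision 12288 (14899), double precision 6144 (7450), half precision 24576 (29798)
  • TDP: 250 W
  • NVLink support: No
  • Launch price (MSRP): $2,999

Nvidia Quadro GV100[26]
  • Launch: March 27, 2018
  • Code name: GV100
  • Core config (CUDA core[c]): 5120:320:128
  • L2 cache size: 6 MiB
  • Clock speeds: base core 1132 MHz, boost 1628 MHz, memory 1696 MT/s
  • Fillrate: pixel 208.4 GP/s, texture 521 GT/s
  • Memory: size 32 GiB, bandwidth 868.4 GB/s, bus width 4096 bit
  • Processing power in GFLOPS (boost): single precision 11592 (16671), double precision 5796 (8335), half precision 23183 (33341)
  • NVLink support: Yes
  • Launch price (MSRP): $8,999

Nvidia Titan V CEO Edition[27][28]
  • Launch: June 21, 2018
  • Clock speeds: base core 1200 MHz, boost 1455 MHz, memory 1700 MT/s
  • Fillrate: pixel 186.2 GP/s, texture 465.6 GT/s
  • Memory: bandwidth 870.4 GB/s
  • Processing power in GFLOPS (boost): single precision 12288 (14899), double precision 6144 (7450), half precision 24576 (29798)
  • Launch price (MSRP): N/A
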
  a. One Streaming Multiprocessor encompasses 64 CUDA cores and 4 TMUs.
  b. One Graphics Processing Cluster encompasses fourteen Streaming Multiprocessors.
  c. CUDA cores : Texture mapping units : Render output units
  d. A Tensor core is a mixed-precision FPU specifically designed for matrix arithmetic.
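
The processing power and fillrate figures for these cards can be reproduced from the core configuration and clock speeds. The sketch below assumes the usual convention that one fused multiply–add counts as two floating-point operations per CUDA core per clock; that convention is not stated in this article, and the function names are hypothetical.

```python
def peak_gflops(cuda_cores, clock_mhz, flops_per_core_per_clock=2):
    """Peak single-precision GFLOPS, counting one FMA as two operations."""
    return cuda_cores * clock_mhz * flops_per_core_per_clock / 1000


def fillrate_g_per_s(units, clock_mhz):
    """Pixel fillrate (units = ROPs) or texture fillrate (units = TMUs)."""
    return units * clock_mhz / 1000


if __name__ == "__main__":
    # Titan V: core config 5120:320:96, 1200 MHz base, 1455 MHz boost.
    print(peak_gflops(5120, 1200), peak_gflops(5120, 1455))
    print(fillrate_g_per_s(96, 1455), fillrate_g_per_s(320, 1455))
    # Footnote check: 80 SMs with 64 CUDA cores and 4 TMUs each.
    print(80 * 64, 80 * 4)
```

In the listed figures, double precision is half the single-precision value and half precision is twice it, which matches a 1:2 and 2:1 throughput ratio relative to FP32.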

Application


Volta GPUs are also used in the Summit and Sierra supercomputers for GPGPU compute.[29][30] The Volta GPUs connect to the POWER9 CPUs via NVLink 2.0, which supports cache coherency and is therefore expected to improve GPGPU performance.[31][12][32]

V100 accelerator and DGX V100


Comparison of accelerators used in DGX:[33][34][35]

Model | P100 | V100 16GB | V100 32GB | A100 40GB | A100 80GB | H100 | H200 | B100 | B200
Architecture | Pascal | Volta | Volta | Ampere | Ampere | Hopper | Hopper | Blackwell | Blackwell
Socket | SXM/SXM2 | SXM2 | SXM3 | SXM4 | SXM4 | SXM5 | SXM5 | SXM6 | SXM6
FP32 CUDA cores | 3584 | 5120 | 5120 | 6912 | 6912 | 16896 | 16896 | N/A | N/A
FP64 cores (excl. tensor) | 1792 | 2560 | 2560 | 3456 | 3456 | 4608 | 4608 | N/A | N/A
Mixed INT32/FP32 cores | N/A | N/A | N/A | 6912 | 6912 | 16896 | 16896 | N/A | N/A
INT32 cores | N/A | 5120 | 5120 | N/A | N/A | N/A | N/A | N/A | N/A
Boost clock | 1480 MHz | 1530 MHz | 1530 MHz | 1410 MHz | 1410 MHz | 1980 MHz | 1980 MHz | N/A | N/A
Memory clock | 1.4 Gbit/s HBM2 | 1.75 Gbit/s HBM2 | 1.75 Gbit/s HBM2 | 2.4 Gbit/s HBM2 | 3.2 Gbit/s HBM2e | 5.2 Gbit/s HBM3 | 6.3 Gbit/s HBM3e | 8 Gbit/s HBM3e | 8 Gbit/s HBM3e
Memory bus width | 4096-bit | 4096-bit | 4096-bit | 5120-bit | 5120-bit | 5120-bit | 6144-bit | 8192-bit | 8192-bit
Memory bandwidth | 720 GB/sec | 900 GB/sec | 900 GB/sec | 1.52 TB/sec | 1.52 TB/sec | 3.35 TB/sec | 4.8 TB/sec | 8 TB/sec | 8 TB/sec
VRAM | 16 GB HBM2 | 16 GB HBM2 | 32 GB HBM2 | 40 GB HBM2 | 80 GB HBM2e | 80 GB HBM3 | 141 GB HBM3e | 192 GB HBM3e | 192 GB HBM3e
Single precision (FP32) | 10.6 TFLOPS | 15.7 TFLOPS | 15.7 TFLOPS | 19.5 TFLOPS | 19.5 TFLOPS | 67 TFLOPS | 67 TFLOPS | N/A | N/A
Double precision (FP64) | 5.3 TFLOPS | 7.8 TFLOPS | 7.8 TFLOPS | 9.7 TFLOPS | 9.7 TFLOPS | 34 TFLOPS | 34 TFLOPS | N/A | N/A
INT8 (non-tensor) | N/A | 62 TOPS | 62 TOPS | N/A | N/A | N/A | N/A | N/A | N/A
INT8 dense tensor | N/A | N/A | N/A | 624 TOPS | 624 TOPS | 1.98 POPS | 1.98 POPS | 3.5 POPS | 4.5 POPS
INT32 | N/A | 15.7 TOPS | 15.7 TOPS | 19.5 TOPS | 19.5 TOPS | N/A | N/A | N/A | N/A
FP4 dense tensor | N/A | N/A | N/A | N/A | N/A | N/A | N/A | 7 PFLOPS | 9 PFLOPS
FP16 | 21.2 TFLOPS | 31.4 TFLOPS | 31.4 TFLOPS | 78 TFLOPS | 78 TFLOPS | N/A | N/A | N/A | N/A
FP16 dense tensor | N/A | 125 TFLOPS | 125 TFLOPS | 312 TFLOPS | 312 TFLOPS | 990 TFLOPS | 990 TFLOPS | 1.98 PFLOPS | 2.25 PFLOPS
bfloat16 dense tensor | N/A | N/A | N/A | 312 TFLOPS | 312 TFLOPS | 990 TFLOPS | 990 TFLOPS | 1.98 PFLOPS | 2.25 PFLOPS
TensorFloat-32 (TF32) dense tensor | N/A | N/A | N/A | 156 TFLOPS | 156 TFLOPS | 495 TFLOPS | 495 TFLOPS | 989 TFLOPS | 1.2 PFLOPS
FP64 dense tensor | N/A | N/A | N/A | 19.5 TFLOPS | 19.5 TFLOPS | 67 TFLOPS | 67 TFLOPS | 30 TFLOPS | 40 TFLOPS
Interconnect (NVLink) | 160 GB/sec | 300 GB/sec | 300 GB/sec | 600 GB/sec | 600 GB/sec | 900 GB/sec | 900 GB/sec | 1.8 TB/sec | 1.8 TB/sec
GPU | GP100 | GV100 | GV100 | GA100 | GA100 | GH100 | GH100 | GB100 | GB100
L1 Cache | 1344 KB (24 KB × 56) | 10240 KB (128 KB × 80) | 10240 KB (128 KB × 80) | 20736 KB (192 KB × 108) | 20736 KB (192 KB × 108) | 25344 KB (192 KB × 132) | 25344 KB (192 KB × 132) | N/A | N/A
L2 Cache | 4096 KB | 6144 KB | 6144 KB | 40960 KB | 40960 KB | 51200 KB | 51200 KB | N/A | N/A
TDP | 300 W | 300 W | 350 W | 400 W | 400 W | 700 W | 1000 W | 700 W | 1000 W
Die size | 610 mm² | 815 mm² | 815 mm² | 826 mm² | 826 mm² | 814 mm² | 814 mm² | N/A | N/A
Transistor count | 15.3 B | 21.1 B | 21.1 B | 54.2 B | 54.2 B | 80 B | 80 B | 208 B | 208 B
Process | TSMC 16FF+ | TSMC 12FFN | TSMC 12FFN | TSMC N7 | TSMC N7 | TSMC 4N | TSMC 4N | TSMC 4NP | TSMC 4NP
Launched | Q2 2016 | Q3 2017 |  | Q1 2020 |  | Q3 2022 | Q3 2023 | Q4 2024 | 
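
Two of the V100 figures in this comparison can be cross-checked against numbers given elsewhere in the article. The sketch below assumes that the V100 has the same 640 tensor cores listed for the GV100-based Titan V, and that each tensor core completes one 4×4×4 matrix multiply–accumulate (64 fused multiply–adds, counted as 128 floating-point operations) per clock. The function names are hypothetical.

```python
def tensor_tflops(tensor_cores, clock_mhz, m=4, n=4, k=4):
    """Peak dense tensor TFLOPS, assuming one m x n x k MMA per core per clock."""
    fma_per_clock = m * n * k          # 64 fused multiply-adds for 4x4x4
    flops_per_clock = 2 * fma_per_clock  # each FMA counts as two operations
    return tensor_cores * flops_per_clock * clock_mhz / 1e6


def total_l1_kb(per_sm_kb, sm_count):
    """Total L1 cache as per-SM capacity times the number of SMs."""
    return per_sm_kb * sm_count


if __name__ == "__main__":
    # V100: 640 tensor cores (assumed) at the 1530 MHz boost clock.
    print(tensor_tflops(640, 1530))
    # V100: 128 KB of L1 per SM across 80 SMs.
    print(total_l1_kb(128, 80))
```

With these assumptions the tensor estimate comes out at about 125.3 TFLOPS, in line with the 125 TFLOPS listed for FP16 dense tensor throughput.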


References

  1. Kampman, Jeffrey (2025-07-31). "Nvidia confirms end of Game Ready driver support for Maxwell and Pascal GPUs — affected products will get optimized drivers through October 2025". Tom's Hardware. Retrieved 2025-08-21.
  2. "Nvidia Volta Trademark Status". United States Patent and Trademark Office. 14 August 2023. Retrieved 14 August 2023.
  3. Gasior, Geoff (19 March 2013). "Nvidia's Volta GPU to feature on-chip DRAM". The Tech Report. Archived from the original on 1 May 2019. Retrieved 14 March 2017.
  4. Smith, Ryan (2017-05-10). "The NVIDIA GPU Tech Conference 2017 Keynote Live Blog". Archived from the original on May 10, 2017. Retrieved 2018-11-03.
  5. "NVIDIA Volta AI Architecture | NVIDIA". NVIDIA. Retrieved 2018-04-11.
  6. "Volta trademark Cancellation Proceeding". United States Patent and Trademark Office.
  7. "Volta trademark Exparte Appeal Proceeding". United States Patent and Trademark Office.
  8. "Volta Trademark status". United States Patent and Trademark Office.
  9. Killian, Zak (14 March 2017). "Report: TSMC set to fabricate Volta and Centriq on 12-nm process". The Tech Report. Retrieved 14 March 2017.
  10. Durant, Luke; Giroux, Olivier; Harris, Mark; Stam, Nick (May 10, 2017). "Inside Volta: The World's Most Advanced Data Center GPU". Nvidia developer blog.
  11. Gasior, Geoff (March 19, 2013). "Nvidia's Volta GPU to feature on-chip DRAM". The Tech Report. Archived from the original on May 1, 2019. Retrieved March 14, 2017.
  12. Shah, Agam (22 August 2016). "Nvidia's NVLink 2.0 will first appear in Power9 servers next year". PC World. Retrieved 14 March 2017.
  13. Harris, Mark (May 11, 2017). "CUDA 9 Features Revealed: Volta, Cooperative Groups and More". Retrieved August 12, 2017.
  14. "NVIDIA Ampere Architecture In-Depth". 14 May 2020.
  15. "NVIDIA A100 Tensor Core GPU Architecture" (PDF). Retrieved 2023-12-15.
  16. "NVIDIA A100 Tensor Core GPU Architecture: Unprecedented Acceleration at Every Scale" (PDF). Nvidia. Retrieved September 18, 2020.
  17. "NVIDIA Tensor Cores: Versatility for HPC & AI". NVIDIA.
  18. "Abstract". docs.nvidia.com.
  19. Cutress, Ian; Tallis, Billy (4 January 2016). "CES 2017: Nvidia Keynote Liveblog". AnandTech. Archived from the original on January 5, 2017. Retrieved 9 January 2017.
  20. "NVIDIA DRIVE Xavier, World's Most Powerful SoC, Brings Dramatic New AI Capabilities | NVIDIA Blog". The Official NVIDIA Blog. 2018-01-07. Retrieved 2018-11-03.
  21. Smith, Ryan (10 May 2017). "Nvidia Volta Unveiled". AnandTech. Archived from the original on May 11, 2017. Retrieved 2 June 2017.
  22. "NVIDIA TITAN V Transforms the PC into AI Supercomputer".
  23. "Introducing NVIDIA TITAN V: The World's Most Powerful PC Graphics Card".
  24. "NVIDIA Reinvents the Workstation with Real-Time Ray Tracing".
  25. "Introducing NVIDIA TITAN V: The World's Most Powerful PC Graphics Card". NVIDIA. Retrieved 2017-12-08.
  26. "NVIDIA Quadro GV100". Retrieved 2018-03-27.
  27. Smith, Ryan. "NVIDIA Unveils & Gives Away New Limited Edition 32GB Titan V "CEO Edition"". Archived from the original on June 21, 2018. Retrieved 2018-07-06.
  28. "NVIDIA TITAN V CEO Edition". TechPowerUp. Retrieved 2018-07-07.
  29. Shankland, Steven (14 September 2015). "IBM, Nvidia land $325M supercomputer deal". CNET. Retrieved 29 December 2015.
  30. Noyes, Katherine (16 March 2015). "IBM, Nvidia rev HPC engines in next-gen supercomputer push". PC World. Retrieved 29 December 2015.
  31. Smith, Ryan (17 November 2014). "Nvidia Volta, IBM Power9 Land Contracts for New US Government Supercomputers". Anandtech. Archived from the original on November 19, 2014. Retrieved 14 March 2017.
  32. Lilly, Paul (January 25, 2017). "NVIDIA 12nm FinFET Volta GPU Architecture Reportedly Replacing Pascal In 2017". HotHardware.
  33. Smith, Ryan (March 22, 2022). "NVIDIA Hopper GPU Architecture and H100 Accelerator Announced: Working Smarter and Harder". AnandTech.
  34. Smith, Ryan (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech.
  35. "NVIDIA Tesla V100 tested: near unbelievable GPU power". TweakTown. September 17, 2017.
