Volta (microarchitecture)

From Wikipedia, the free encyclopedia

GPU microarchitecture by Nvidia

Nvidia Volta
Cards
Release date	December 7, 2017
Codename	Volta
Fabrication process	TSMC 12 nm (FinFET)
Enthusiast	Tesla V100, Tesla V100S PCIe, Titan V, Titan V CEO Edition, Quadro GV100
History
Predecessor	Pascal
Variant	Turing (consumer, professional)
Successor	Ampere (consumer, professional)
Support status
Limited support until October 2025; security updates until October 2028^[1]

Painting of Alessandro Volta, eponym of architecture

Volta is the codename, but not the trademark,^[2] for a GPU microarchitecture developed by Nvidia, succeeding Pascal. It was first announced on a roadmap in March 2013,^[3] although the first product was not announced until May 2017.^[4] The architecture is named after the 18th–19th century Italian chemist and physicist Alessandro Volta. It was Nvidia's first architecture to feature Tensor Cores, specially designed cores that deliver higher deep learning performance than regular CUDA cores.^[5] The architecture is produced with TSMC's 12 nm FinFET process. The Ampere microarchitecture is the successor to Volta.

The first graphics card to use it was the datacenter Tesla V100, e.g. as part of the Nvidia DGX-1 system.^[4] It has also been used in the Quadro GV100 and Titan V. There were no mainstream GeForce graphics cards based on Volta.

After two USPTO proceedings,^[6]^[7] Nvidia lost the Volta trademark application in the field of artificial intelligence on July 3, 2023. The owner of the Volta trademark^[8] remains Volta Robots, a company specializing in AI and vision algorithms for robots and unmanned vehicles.

Details

Architectural improvements of the Volta architecture include the following:

CUDA Compute Capability 7.0
- concurrent execution of integer and floating point operations
TSMC's 12 nm FinFET process,^[9] allowing 21.1 billion transistors.^[10]
High Bandwidth Memory 2 (HBM2)^[9]^[11]
NVLink 2.0: a high-bandwidth bus between the CPU and GPU, and between multiple GPUs. It allows much higher transfer speeds than those achievable with PCI Express and is estimated to provide 25 Gbit/s per lane.^[12] (Disabled for Titan V)
Tensor cores: a tensor core is a unit that multiplies two 4×4 FP16 matrices, then adds a third FP16 or FP32 matrix to the result using fused multiply–add operations, producing an FP32 result that can optionally be demoted to FP16.^[13] Tensor cores are intended to speed up the training of neural networks.^[13] Volta's Tensor cores are first generation, while Ampere has third-generation Tensor cores.^[14]^[15]
PureVideo Feature Set I hardware video decoding
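
The tensor core operation described in the list above (D = A × B + C on 4×4 matrices, with FP16 inputs and FP32 accumulation) can be modelled in a few lines of Python. This is only an illustrative sketch, not Nvidia's implementation: the function names are hypothetical, rounding is applied after every accumulation step (the hardware's internal rounding order is not stated in this article), and the standard-library struct module is used to emulate FP16 and FP32 rounding.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (FP32) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def tensor_core_mma(a, b, c, demote_to_fp16=False):
    """Model D = A x B + C for 4x4 matrices given as nested lists.

    A and B are rounded to FP16, the running sum is kept in FP32,
    and the result is optionally demoted to FP16.
    Assumption: one rounding per accumulation step.
    """
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])
            for k in range(n):
                # The product of two FP16 values is exact in a Python float.
                prod = to_fp16(a[i][k]) * to_fp16(b[k][j])
                acc = to_fp32(acc + prod)
            d[i][j] = to_fp16(acc) if demote_to_fp16 else acc
    return d


# Example: 2048 + 1 is representable in FP32 but not in FP16.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
b = [[0.0] * 4 for _ in range(4)]
c = [[0.0] * 4 for _ in range(4)]
b[0][0] = 2048.0
c[0][0] = 1.0

d_fp32 = tensor_core_mma(identity, b, c)
d_fp16 = tensor_core_mma(identity, b, c, demote_to_fp16=True)
print(d_fp32[0][0])  # 2049.0
print(d_fp16[0][0])  # 2048.0
```

The example shows why the accumulator is wider than the inputs: the sum 2049 survives in FP32 but rounds back to 2048 when the result is demoted to FP16.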

Comparison of Compute Capability: GP100 vs GV100 vs GA100^[16]

GPU features	Nvidia Tesla P100	Nvidia Tesla V100	Nvidia A100
GPU codename	GP100	GV100	GA100
GPU architecture	Nvidia Pascal	Nvidia Volta	Nvidia Ampere
Compute capability	6.0	7.0	8.0
Threads / warp	32	32	32
Max warps / SM	64	64	64
Max threads / SM	2048	2048	2048
Max thread blocks / SM	32	32	32
Max 32-bit registers / SM	65536	65536	65536
Max registers / block	65536	65536	65536
Max registers / thread	255	255	255
Max thread block size	1024	1024	1024
FP32 cores / SM	64	64	64
Ratio of SM registers to FP32 cores	1024	1024	1024
Shared Memory Size / SM	64 KB	Configurable up to 96 KB	Configurable up to 164 KB
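
Several rows of the table above are related by simple arithmetic. The following Python sketch, using only the table's values, shows how the thread limit and the register ratio follow from the other rows, and gives a rough estimate of how register use per thread limits the number of resident warps. The function name is hypothetical, and the estimate is a simplification: it ignores register allocation granularity, shared memory, and thread block limits.

```python
# Values taken from the compute capability table (identical for
# GP100, GV100 and GA100).
THREADS_PER_WARP = 32
MAX_WARPS_PER_SM = 64
REGISTERS_PER_SM = 65536
FP32_CORES_PER_SM = 64
MAX_REGISTERS_PER_THREAD = 255

# Derived rows of the table.
max_threads_per_sm = THREADS_PER_WARP * MAX_WARPS_PER_SM          # 2048
registers_per_fp32_core = REGISTERS_PER_SM // FP32_CORES_PER_SM   # 1024


def resident_warps(registers_per_thread: int) -> int:
    """Estimate resident warps per SM when limited only by registers.

    Simplified model (an assumption): real hardware allocates registers
    in fixed-size chunks and applies further limits.
    """
    if not 1 <= registers_per_thread <= MAX_REGISTERS_PER_THREAD:
        raise ValueError("registers_per_thread must be in 1..255")
    registers_per_warp = registers_per_thread * THREADS_PER_WARP
    return min(MAX_WARPS_PER_SM, REGISTERS_PER_SM // registers_per_warp)


print(max_threads_per_sm)       # 2048
print(registers_per_fp32_core)  # 1024
print(resident_warps(32))       # 64
print(resident_warps(64))       # 32
print(resident_warps(255))      # 8
```

Under this model, a kernel can keep all 64 warps resident only if each thread uses at most 32 registers (65536 / 2048).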

Comparison of Precision Support Matrix^[17]^[18]

	Supported CUDA Core Precisions								Supported Tensor Core Precisions
	FP16	FP32	FP64	INT1	INT4	INT8	TF32	BF16	FP16	FP32	FP64	INT1	INT4	INT8	TF32	BF16
Nvidia Tesla P4	No	Yes	Yes	No	No	Yes	No	No	No	No	No	No	No	No	No	No
Nvidia P100	Yes	Yes	Yes	No	No	No	No	No	No	No	No	No	No	No	No	No
Nvidia Volta	Yes	Yes	Yes	No	No	Yes	No	No	Yes	No	No	No	No	No	No	No
Nvidia Turing	Yes	Yes	Yes	No	No	No	No	No	Yes	No	No	Yes	Yes	Yes	No	No
Nvidia A100	Yes	Yes	Yes	No	No	Yes	No	Yes	Yes	No	Yes	Yes	Yes	Yes	Yes	Yes

Legend:

FPnn: floating point with nn bits
INTn: integer with n bits
INT1: binary
TF32: TensorFloat-32
BF16: bfloat16

Comparison of Decode Performance

Concurrent streams	H.264 decode (1080p30)	H.265 (HEVC) decode (1080p30)	VP9 decode (1080p30)
V100	16	22	22
A100	75	157	108

Products

Volta was announced as the GPU microarchitecture within the Xavier generation of Tegra SoCs, which focus on self-driving cars.^[19]^[20]

At Nvidia's annual GPU Technology Conference keynote on May 10, 2017, Nvidia officially announced the Volta microarchitecture along with the Tesla V100.^[4] The Volta GV100 GPU is built on a 12 nm process and uses HBM2 memory with 900 GB/s of bandwidth.^[21]

Nvidia officially announced the Nvidia TITAN V on December 7, 2017.^[22]^[23]

Nvidia officially announced the Quadro GV100 on March 27, 2018.^[24]

Model	Launch	Code name(s)	Fabrication process	Transistors (billion)	Die size (mm²)	Bus interface	Core config (CUDA cores : TMUs : ROPs)^[c]	Tensor cores^[d]	SM count^[a]	Graphics Processing Clusters^[b]	L2 cache size (MiB)	Base core clock (MHz)	Boost clock (MHz)	Memory clock (MT/s)	Pixel fillrate (GP/s)	Texture fillrate (GT/s)	Memory size (GiB)	Memory bandwidth (GB/s)	Memory bus type	Memory bus width (bit)	Single precision GFLOPS (boost)	Double precision GFLOPS (boost)	Half precision GFLOPS (boost)	TDP (W)	NVLink support	Launch price (USD)
Nvidia Titan V^[25]	December 7, 2017	GV100-400-A1	TSMC 12 nm	21.1	815	PCIe 3.0 ×16	5120:320:96	640	80	6	4.5	1200	1455	1700	139.7	465.6	12	652.8	HBM2	3072	12288 (14899)	6144 (7450)	24576 (29798)	250	No	$2,999
Nvidia Quadro GV100^[26]	March 27, 2018	GV100					5120:320:128				6	1132	1628	1696	208.4	521	32	868.4		4096	11592 (16671)	5796 (8335)	23183 (33341)		Yes	$8,999
Nvidia Titan V CEO Edition^[27]^[28]	June 21, 2018	GV100					5120:320:128				6	1200	1455	1700	186.2	465.6	32	870.4		4096	12288 (14899)	6144 (7450)	24576 (29798)		Yes	N/A

^[a] One Streaming Multiprocessor encompasses 64 CUDA cores and 4 TMUs.
^[b] One Graphics Processing Cluster encompasses fourteen Streaming Multiprocessors.
^[c] CUDA cores : Texture mapping units : Render output units
^[d] A Tensor core is a mixed-precision FPU specifically designed for matrix arithmetic.
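
The derived columns of the product table (processing power, memory bandwidth, fillrate) appear to follow from the core counts, clocks, and bus widths. A minimal Python sketch of that arithmetic is given below. It assumes two floating-point operations (one fused multiply-add) per CUDA core per clock, and one pixel per ROP or one texel per TMU per clock; the function names are illustrative and do not come from any Nvidia tool.

```python
def peak_gflops(cuda_cores: int, clock_mhz: float,
                flops_per_core_per_clock: int = 2) -> float:
    """Peak single-precision GFLOPS, assuming one FMA per core per clock."""
    return cuda_cores * flops_per_core_per_clock * clock_mhz / 1000


def memory_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: float) -> float:
    """Memory bandwidth in GB/s from bus width and transfer rate."""
    return bus_width_bits / 8 * transfer_rate_mts / 1000


def fillrate(units: int, clock_mhz: float) -> float:
    """Fillrate in G/s, assuming one pixel (ROP) or texel (TMU) per clock."""
    return units * clock_mhz / 1000


# Titan V row: 5120 CUDA cores, 320 TMUs, 96 ROPs, 1200/1455 MHz,
# 3072-bit bus at 1700 MT/s.
print(round(peak_gflops(5120, 1200)))               # 12288
print(round(peak_gflops(5120, 1455)))               # 14899
print(round(memory_bandwidth_gbs(3072, 1700), 1))   # 652.8
print(round(fillrate(96, 1455), 1))                 # 139.7
print(round(fillrate(320, 1455), 1))                # 465.6
```

In the table, the double-precision figures are half of the single-precision ones and the half-precision figures are twice as large, so the same function reproduces them with flops_per_core_per_clock set to 1 or 4 respectively.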

Application

Volta is used in the Summit and Sierra supercomputers for GPGPU compute.^[29]^[30] The Volta GPUs connect to the POWER9 CPUs via NVLink 2.0, which was expected to support cache coherency and therefore improve GPGPU performance.^[31]^[12]^[32]

V100 accelerator and DGX V100

Comparison of accelerators used in DGX:^[33]^[34]^[35]

Model	Architecture	Socket	FP32 CUDA cores	FP64 cores (excl. tensor)	Mixed INT32/FP32 cores	INT32 cores	Boost clock	Memory clock	Memory bus width	Memory bandwidth	VRAM	Single precision (FP32)	Double precision (FP64)	INT8 (non-tensor)	INT8 dense tensor	INT32	FP4 dense tensor	FP16	FP16 dense tensor	bfloat16 dense tensor	TensorFloat-32 (TF32) dense tensor	FP64 dense tensor	Interconnect (NVLink)	GPU	L1 Cache	L2 Cache	TDP	Die size	Transistor count	Process	Launched
P100	Pascal	SXM/SXM2	3584	1792	N/A	N/A	1480 MHz	1.4 Gbit/s HBM2	4096-bit	720 GB/sec	16 GB HBM2	10.6 TFLOPS	5.3 TFLOPS	N/A	N/A	N/A	N/A	21.2 TFLOPS	N/A	N/A	N/A	N/A	160 GB/sec	GP100	1344 KB (24 KB × 56)	4096 KB	300 W	610 mm²	15.3 B	TSMC 16FF+	Q2 2016
V100 16GB	Volta	SXM2	5120	2560	N/A	5120	1530 MHz	1.75 Gbit/s HBM2	4096-bit	900 GB/sec	16 GB HBM2	15.7 TFLOPS	7.8 TFLOPS	62 TOPS	N/A	15.7 TOPS	N/A	31.4 TFLOPS	125 TFLOPS	N/A	N/A	N/A	300 GB/sec	GV100	10240 KB (128 KB × 80)	6144 KB	300 W	815 mm²	21.1 B	TSMC 12FFN	Q3 2017
V100 32GB	Volta	SXM3	5120	2560	N/A	5120	1530 MHz	1.75 Gbit/s HBM2	4096-bit	900 GB/sec	32 GB HBM2	15.7 TFLOPS	7.8 TFLOPS	62 TOPS	N/A	15.7 TOPS	N/A	31.4 TFLOPS	125 TFLOPS	N/A	N/A	N/A	300 GB/sec	GV100	10240 KB (128 KB × 80)	6144 KB	350 W	815 mm²	21.1 B	TSMC 12FFN	Q3 2017
A100 40GB	Ampere	SXM4	6912	3456	6912	N/A	1410 MHz	2.4 Gbit/s HBM2	5120-bit	1.52 TB/sec	40 GB HBM2	19.5 TFLOPS	9.7 TFLOPS	N/A	624 TOPS	19.5 TOPS	N/A	78 TFLOPS	312 TFLOPS	312 TFLOPS	156 TFLOPS	19.5 TFLOPS	600 GB/sec	GA100	20736 KB (192 KB × 108)	40960 KB	400 W	826 mm²	54.2 B	TSMC N7	Q1 2020
A100 80GB	Ampere	SXM4	6912	3456	6912	N/A	1410 MHz	3.2 Gbit/s HBM2e	5120-bit	1.52 TB/sec	80 GB HBM2e	19.5 TFLOPS	9.7 TFLOPS	N/A	624 TOPS	19.5 TOPS	N/A	78 TFLOPS	312 TFLOPS	312 TFLOPS	156 TFLOPS	19.5 TFLOPS	600 GB/sec	GA100	20736 KB (192 KB × 108)	40960 KB	400 W	826 mm²	54.2 B	TSMC N7	Q1 2020
H100	Hopper	SXM5	16896	4608	16896	N/A	1980 MHz	5.2 Gbit/s HBM3	5120-bit	3.35 TB/sec	80 GB HBM3	67 TFLOPS	34 TFLOPS	N/A	1.98 POPS	N/A	N/A	N/A	990 TFLOPS	990 TFLOPS	495 TFLOPS	67 TFLOPS	900 GB/sec	GH100	25344 KB (192 KB × 132)	51200 KB	700 W	814 mm²	80 B	TSMC 4N	Q3 2022
H200	Hopper	SXM5	16896	4608	16896	N/A	1980 MHz	6.3 Gbit/s HBM3e	6144-bit	4.8 TB/sec	141 GB HBM3e	67 TFLOPS	34 TFLOPS	N/A	1.98 POPS	N/A	N/A	N/A	990 TFLOPS	990 TFLOPS	495 TFLOPS	67 TFLOPS	900 GB/sec	GH100	25344 KB (192 KB × 132)	51200 KB	1000 W	814 mm²	80 B	TSMC 4N	Q3 2023
B100	Blackwell	SXM6	N/A	N/A	N/A	N/A	N/A	8 Gbit/s HBM3e	8192-bit	8 TB/sec	192 GB HBM3e	N/A	N/A	N/A	3.5 POPS	N/A	7 PFLOPS	N/A	1.98 PFLOPS	1.98 PFLOPS	989 TFLOPS	30 TFLOPS	1.8 TB/sec	GB100	N/A	N/A	700 W	N/A	208 B	TSMC 4NP	Q4 2024
B200	Blackwell	SXM6	N/A	N/A	N/A	N/A	N/A	8 Gbit/s HBM3e	8192-bit	8 TB/sec	192 GB HBM3e	N/A	N/A	N/A	4.5 POPS	N/A	9 PFLOPS	N/A	2.25 PFLOPS	2.25 PFLOPS	1.2 PFLOPS	40 TFLOPS	1.8 TB/sec	GB100	N/A	N/A	1000 W	N/A	208 B	TSMC 4NP	Q4 2024
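
Two of the V100 figures in the table above can be reproduced with a short calculation. The Python sketch below is a back-of-the-envelope check under stated assumptions that do not appear in the table itself: 640 Tensor cores (the count listed for GV100 in the Products section), 64 fused multiply-adds per Tensor core per clock (one 4×4×4 matrix operation), and six NVLink 2.0 links with eight lanes per direction each. Only the 25 Gbit/s per-lane rate and the 1530 MHz boost clock are taken from this article. The function names are illustrative.

```python
def tensor_tflops(tensor_cores: int, clock_mhz: float,
                  fma_per_core_per_clock: int = 64) -> float:
    """Peak dense FP16 tensor throughput in TFLOPS (2 FLOPs per FMA)."""
    return tensor_cores * fma_per_core_per_clock * 2 * clock_mhz / 1e6


def nvlink_gbytes_per_s(gbit_per_lane: float, lanes_per_direction: int,
                        links: int) -> float:
    """Aggregate bidirectional NVLink bandwidth in GB/s."""
    per_direction_gbytes = gbit_per_lane * lanes_per_direction / 8
    return per_direction_gbytes * 2 * links


# Assumed V100 configuration: 640 Tensor cores, 6 links, 8 lanes per direction.
v100_tensor = tensor_tflops(640, 1530)
v100_nvlink = nvlink_gbytes_per_s(25, 8, 6)

print(round(v100_tensor, 1))  # 125.3
print(v100_nvlink)            # 300.0
```

Both results agree with the table's rounded values of 125 TFLOPS and 300 GB/sec.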

References

^[1] Kampman, Jeffrey (2025-07-31). "Nvidia confirms end of Game Ready driver support for Maxwell and Pascal GPUs — affected products will get optimized drivers through October 2025". Tom's Hardware. Retrieved 2025-08-21.
^[2] "Nvidia Volta Trademark Status". United States Patent and Trademark Office. 14 August 2023. Retrieved 14 August 2023.
^[3] Gasior, Geoff (19 March 2013). "Nvidia's Volta GPU to feature on-chip DRAM". The Tech Report. Archived from the original on 1 May 2019. Retrieved 14 March 2017.
^[4] Smith, Ryan (2017-05-10). "The NVIDIA GPU Tech Conference 2017 Keynote Live Blog". Archived from the original on May 10, 2017. Retrieved 2018-11-03.
^[5] "NVIDIA Volta AI Architecture | NVIDIA". NVIDIA. Retrieved 2018-04-11.
^[6] "Volta trademark Cancellation Proceeding". United States Patent and Trademark Office.
^[7] "Volta trademark Exparte Appeal Proceeding". United States Patent and Trademark Office.
^[8] "Volta Trademark status". United States Patent and Trademark Office.
^[9] Killian, Zak (14 March 2017). "Report: TSMC set to fabricate Volta and Centriq on 12-nm process". The Tech Report. Retrieved 14 March 2017.
^[10] Durant, Luke; Giroux, Olivier; Harris, Mark; Stam, Nick (May 10, 2017). "Inside Volta: The World's Most Advanced Data Center GPU". Nvidia developer blog.
^[11] Gasior, Geoff (March 19, 2013). "Nvidia's Volta GPU to feature on-chip DRAM". The Tech Report. Archived from the original on May 1, 2019. Retrieved March 14, 2017.
^[12] Shah, Agam (22 August 2016). "Nvidia's NVLink 2.0 will first appear in Power9 servers next year". PC World. Retrieved 14 March 2017.
^[13] Harris, Mark (May 11, 2017). "CUDA 9 Features Revealed: Volta, Cooperative Groups and More". Retrieved August 12, 2017.
^[14] "NVIDIA Ampere Architecture In-Depth". 14 May 2020.
^[15] "NVIDIA A100 Tensor Core GPU Architecture" (PDF). Retrieved 2023-12-15.
^[16] "NVIDIA A100 Tensor Core GPU Architecture: Unprecedented Acceleration at Every Scale" (PDF). Nvidia. Retrieved September 18, 2020.
^[17] "NVIDIA Tensor Cores: Versatility for HPC & AI". NVIDIA.
^[18] "Abstract". docs.nvidia.com.
^[19] Cutress, Ian; Tallis, Billy (4 January 2016). "CES 2017: Nvidia Keynote Liveblog". AnandTech. Archived from the original on January 5, 2017. Retrieved 9 January 2017.
^[20] "NVIDIA DRIVE Xavier, World's Most Powerful SoC, Brings Dramatic New AI Capabilities | NVIDIA Blog". The Official NVIDIA Blog. 2018-01-07. Retrieved 2018-11-03.
^[21] Smith, Ryan (10 May 2017). "Nvidia Volta Unveiled". AnandTech. Archived from the original on May 11, 2017. Retrieved 2 June 2017.
^[22] "NVIDIA TITAN V Transforms the PC into AI Supercomputer".
^[23] "Introducing NVIDIA TITAN V: The World's Most Powerful PC Graphics Card".
^[24] "NVIDIA Reinvents the Workstation with Real-Time Ray Tracing".
^[25] "Introducing NVIDIA TITAN V: The World's Most Powerful PC Graphics Card". NVIDIA. Retrieved 2017-12-08.
^[26] "NVIDIA Quadro GV100". Retrieved 2018-03-27.
^[27] Smith, Ryan. "NVIDIA Unveils & Gives Away New Limited Edition 32GB Titan V 'CEO Edition'". Archived from the original on June 21, 2018. Retrieved 2018-07-06.
^[28] "NVIDIA TITAN V CEO Edition". TechPowerUp. Retrieved 2018-07-07.
^[29] Shankland, Steven (14 September 2015). "IBM, Nvidia land $325M supercomputer deal". CNET. Retrieved 29 December 2015.
^[30] Noyes, Katherine (16 March 2015). "IBM, Nvidia rev HPC engines in next-gen supercomputer push". PC World. Retrieved 29 December 2015.
^[31] Smith, Ryan (17 November 2014). "Nvidia Volta, IBM Power9 Land Contracts for New US Government Supercomputers". AnandTech. Archived from the original on November 19, 2014. Retrieved 14 March 2017.
^[32] Lilly, Paul (January 25, 2017). "NVIDIA 12nm FinFET Volta GPU Architecture Reportedly Replacing Pascal In 2017". HotHardware.
^[33] Smith, Ryan (March 22, 2022). "NVIDIA Hopper GPU Architecture and H100 Accelerator Announced: Working Smarter and Harder". AnandTech. Archived from the original on September 23, 2023.
^[34] Smith, Ryan (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech. Archived from the original on July 29, 2024.
^[35] Garreffa, Anthony (September 17, 2017). "NVIDIA Tesla V100 Tested: Near Unbelievable GPU Power". TweakTown.com. Retrieved December 30, 2025.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Volta_(microarchitecture)&oldid=1337647228"
