
Deep Learning Super Sampling

From Wikipedia, the free encyclopedia
(Redirected from Tensor Core)
Image upscaling technology by Nvidia

Deep Learning Super Sampling (DLSS) is a suite of real-time deep learning image enhancement and upscaling technologies developed by Nvidia that are available in a number of video games. The goal of these technologies is to allow the majority of the graphics pipeline to run at a lower resolution for increased performance, and then to infer a higher-resolution image that approximates the same level of detail as if the image had been rendered at that higher resolution. This allows for higher graphical settings and/or frame rates for a given output resolution, depending on user preference.[1]

All generations of DLSS are available on all RTX-branded cards from Nvidia in supported titles. However, the Frame Generation feature is only supported on 40 series GPUs or newer, and Multi Frame Generation is only available on 50 series GPUs.[2][3]

History


Nvidia advertised DLSS as a key feature of the GeForce 20 series cards when they launched in September 2018.[4] At that time, the results were limited to a few video games, namely Battlefield V[5] and Metro Exodus, because the algorithm had to be trained specifically on each game to which it was applied, and the results were usually not as good as simple resolution upscaling.[6][7] In 2019, the video game Control shipped with real-time ray tracing and an improved version of DLSS that did not use the Tensor Cores.[8][9]

In April 2020, Nvidia advertised and shipped an improved version of DLSS named DLSS 2.0 with driver version 445.75. DLSS 2.0 was available for a few existing games, including Control and Wolfenstein: Youngblood, and was later added to many newly released games and game engines such as Unreal Engine and Unity.[10][11] This time Nvidia said that it used the Tensor Cores again, and that the AI did not need to be trained specifically on each game.[4][12] Despite sharing the DLSS branding, the two iterations of DLSS differ significantly and are not backwards-compatible.[13][14]

In January 2025, Nvidia stated that over 540 games and apps support DLSS, and that over 80% of Nvidia RTX users activate DLSS.[15]

Release history

Release | Release date | Highlights
1.0 | February 2019 | Predominantly spatial image upscaler; required training specific to each game integration; included in Battlefield V and Metro Exodus, among others[5]
"1.9" (unofficial name) | August 2019 | DLSS 1.0 adapted to run on CUDA shader cores instead of Tensor Cores; used for Control[8][4][16]
2.0 | April 2020 | An AI-accelerated form of TAAU using Tensor Cores, trained generically[17]
3.0 | September 2022 | Augments DLSS 2.0 with an optical-flow frame generation algorithm (only available on RTX 40-series GPUs) to generate frames in between rendered frames[2][18]
3.5 | September 2023 | Adds Ray Reconstruction, replacing multiple denoising algorithms with a single AI model trained on five times more data than DLSS 3[19][18]
4.0 | January 2025 | Adds Multi Frame Generation and a new AI model based on the transformer architecture, improving frame stability, reducing memory usage, and increasing lighting detail[3][20]

Quality presets


When using DLSS, depending on the game, users have access to various quality presets in addition to the option to set the internally rendered, upscaled resolution manually:

Standard DLSS presets

Quality preset[a] | Scale factor[b] | Render scale[c]
DLAA | 1.00x | 100%
Ultra Quality (unused)[21] | 1.32x | 76.0%
Quality | 1.50x | 66.7%
Balanced | 1.72x | 58.0%
Performance | 2.00x | 50.0%
Ultra Performance (since v2.1; only recommended for resolutions of 8K and above[21]) | 3.00x | 33.3%
Auto | Rendered resolution adjusts dynamically in real time to achieve user-defined fps targets (e.g., 144 fps on a 144 Hz monitor)[22]

  a. ^ The algorithm does not necessarily need to be implemented using these presets; it is possible for the implementer to define custom input and output resolutions.
  b. ^ The linear scale factor used for upsampling the input resolution to the output resolution. For example, a scene rendered at 540p with a 2.00x scale factor would have an output resolution of 1080p.
  c. ^ The linear render scale, relative to the output resolution, at which the technology renders scenes internally before upsampling. For example, a 1080p scene with a 50% render scale would have an internal resolution of 540p.
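The relationship between scale factor, render scale, and resolution described in the notes above is simple arithmetic; a minimal sketch (the helper function is illustrative, not part of any Nvidia API):

```python
# Hypothetical helper showing the scale-factor arithmetic from the table:
# internal render resolution = output resolution / linear scale factor.

def render_resolution(output_w, output_h, scale_factor):
    """Return the internal (render) resolution for a given output
    resolution and linear scale factor (e.g. 2.0 for Performance mode)."""
    return round(output_w / scale_factor), round(output_h / scale_factor)

# Performance mode (2.00x) at 4K renders internally at 1080p:
print(render_resolution(3840, 2160, 2.0))  # (1920, 1080)
```

The render scale column is simply the reciprocal of the scale factor expressed as a percentage (e.g. 1/2.00 = 50.0%).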

Implementation


DLSS 1.0


The first iteration of DLSS is a predominantly spatial image upscaler with two stages, both relying on convolutional auto-encoder neural networks.[23] The first stage is an image enhancement network which uses the current frame and motion vectors to perform edge enhancement and spatial anti-aliasing. The second stage is an image upscaling step which uses the single raw, low-resolution frame to upscale the image to the desired output resolution. Using just a single frame for upscaling means the neural network itself must generate a large amount of new information to produce the high-resolution output, which can result in slight hallucinations, such as leaves that differ in style from the source content.[13]

The neural networks are trained on a per-game basis by generating a "perfect frame" using traditional supersampling to 64 samples per pixel, as well as the motion vectors for each frame. The data collected must be as comprehensive as possible, including as many levels, times of day, graphical settings, resolutions, etc. as possible. This data is also augmented using common augmentations such as rotations, colour changes, and random noise to help generalize the training data. Training is performed on Nvidia's Saturn V supercomputer.[14][24]
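The kinds of augmentation mentioned above can be sketched with a few lines of NumPy; this is a generic illustration of rotation, colour change, and noise injection, not Nvidia's actual training pipeline:

```python
import numpy as np

def augment(frame, rng):
    """Apply a random 90-degree rotation, brightness shift, and Gaussian
    noise to an H x W x 3 float image with values in [0, 1]."""
    frame = np.rot90(frame, k=rng.integers(0, 4))      # random rotation
    frame = frame * rng.uniform(0.8, 1.2)              # colour/brightness change
    frame = frame + rng.normal(0, 0.01, frame.shape)   # random noise
    return np.clip(frame, 0.0, 1.0)

rng = np.random.default_rng(0)
img = np.ones((4, 4, 3)) * 0.5
out = augment(img, rng)
```

Each training image would be passed through such transformations so the network sees many variants of the same content.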

This first iteration received a mixed response, with many criticizing the often soft appearance and artifacts in certain situations,[25][6][5] likely a side effect of the limited data from using only a single frame as input to the neural networks, which could not be trained to perform optimally in all scenarios and edge cases.[13][14] Nvidia also demonstrated the ability of the auto-encoder networks to learn to recreate depth-of-field and motion blur,[14] although this functionality has never been included in a publicly released product.[citation needed]

DLSS 2.0


DLSS 2.0 is a temporal anti-aliasing upsampling (TAAU) implementation, using data from previous frames extensively through sub-pixel jittering to resolve fine detail and reduce aliasing. The data DLSS 2.0 collects includes the raw low-resolution input, motion vectors, depth buffers, and exposure/brightness information.[13] It can also be used as a simpler TAA implementation in which the image is rendered at 100% resolution rather than being upsampled by DLSS; Nvidia brands this as DLAA (Deep Learning Anti-Aliasing).[26]
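The sub-pixel jitter mentioned above can be illustrated with a low-discrepancy sequence. A Halton sequence is common practice in TAAU implementations generally; Nvidia's exact jitter sequence is not public, so this is an assumption-labelled sketch:

```python
def halton(index, base):
    """Return the Halton low-discrepancy value in [0, 1) for a 1-based index."""
    result, f = 0.0, 1.0
    while index > 0:
        f /= base
        result += f * (index % base)
        index //= base
    return result

def jitter_offset(frame_index):
    """Sub-pixel (x, y) camera offset in [-0.5, 0.5) for a given frame,
    cycling a 16-sample Halton(2, 3) pattern (an illustrative choice)."""
    i = frame_index % 16 + 1
    return halton(i, 2) - 0.5, halton(i, 3) - 0.5
```

Offsetting the projection by a different sub-pixel amount each frame means successive frames sample different positions within each pixel, which the accumulation step can then resolve into extra detail.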

TAA(U) is used in many modern video games and game engines;[27] however, all previous implementations have used some form of manually written heuristics to prevent temporal artifacts such as ghosting and flickering. One example is neighborhood clamping, which forcefully prevents samples collected in previous frames from deviating too much from nearby pixels in newer frames. This helps to identify and fix many temporal artifacts, but deliberately removing fine detail in this way is analogous to applying a blur filter, and thus the final image can appear blurry when using this method.[13]
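The neighborhood-clamping heuristic described above can be sketched directly: the reprojected history colour is clamped to the min/max of the current frame's 3x3 neighbourhood, discarding history that deviates too far. This is a generic textbook version, not any specific engine's code:

```python
import numpy as np

def clamp_history(current, history):
    """current: H x W x 3 colour buffer for this frame; history: reprojected
    colour from the previous frame. Returns the clamped history
    (the one-pixel border is left unclamped for brevity)."""
    clamped = history.copy()
    for y in range(1, current.shape[0] - 1):
        for x in range(1, current.shape[1] - 1):
            hood = current[y-1:y+2, x-1:x+2].reshape(-1, 3)
            clamped[y, x] = np.clip(history[y, x],
                                    hood.min(axis=0), hood.max(axis=0))
    return clamped
```

A history sample much brighter than anything nearby in the current frame (a classic ghosting trail) is pulled back into the neighbourhood's colour range, which removes the artifact but also flattens legitimate fine detail, hence the blur the article describes.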

DLSS 2.0 uses a convolutional auto-encoder neural network[25] trained to identify and fix temporal artifacts, instead of the manually programmed heuristics mentioned above. Because of this, DLSS 2.0 can generally resolve detail better than other TAA and TAAU implementations, while also removing most temporal artifacts. This is why DLSS 2.0 can sometimes produce a sharper image than rendering at higher, or even native, resolutions using traditional TAA. However, no temporal solution is perfect, and artifacts (ghosting in particular) are still visible in some scenarios when using DLSS 2.0.

Because temporal artifacts occur in most art styles and environments in broadly the same way, the neural network that powers DLSS 2.0 does not need to be retrained for different games. Nvidia does, however, frequently ship new minor revisions of DLSS 2.0 with new titles,[28] which could suggest that minor training optimizations are performed as games are released, although Nvidia does not provide changelogs for these minor revisions to confirm this. The main advancements compared to DLSS 1.0 include significantly improved detail retention, a generalized neural network that does not need to be re-trained per game, and roughly half the overhead (~1–2 ms vs ~2–4 ms).[13]

Forms of TAAU such as DLSS 2.0 are not upscalers in the same sense as techniques such as ESRGAN or DLSS 1.0, which attempt to create new information from a low-resolution source; instead, TAAU works to recover data from previous frames rather than creating new data. In practice, this means low-resolution textures in games will still appear low-resolution when using current TAAU techniques. This is why Nvidia recommends that game developers use higher-resolution textures than they normally would for a given rendering resolution, by applying a mip-map bias when DLSS 2.0 is enabled.[13]
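The mip-map bias mentioned above follows from the ratio of render resolution to output resolution; the usual formula in TAAU practice is the base-2 log of that ratio (the exact value Nvidia recommends is documented in its developer material, so treat this as the common convention rather than a confirmed DLSS constant):

```python
import math

def mip_bias(render_width, output_width):
    """Texture LOD bias so that mip selection matches the output
    resolution rather than the lower internal render resolution."""
    return math.log2(render_width / output_width)

# Performance mode (50% render scale): bias the mip level by -1,
# i.e. sample one mip level sharper than the render resolution implies.
print(mip_bias(1920, 3840))  # -1.0
```

A negative bias makes the sampler pick sharper mip levels, so textures retain the detail appropriate for the final upscaled image.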

DLSS 3.0


DLSS 3.0 augments DLSS 2.0 with motion interpolation. The DLSS Frame Generation algorithm takes two rendered frames from the rendering pipeline and generates a new frame that smoothly transitions between them, so for every frame rendered, one additional frame is generated.[2] DLSS 3.0 makes use of a new-generation Optical Flow Accelerator (OFA) included in Ada Lovelace-generation RTX GPUs. The new OFA is faster and more accurate than the OFA available in the previous Turing and Ampere RTX GPUs,[29] which makes DLSS 3.0 exclusive to the RTX 40 series. At release, DLSS 3.0 did not work with VR displays.[citation needed]
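As a toy illustration of the motion interpolation described above, each pixel of the generated frame can be taken halfway along its motion vector between the two rendered frames. This is a gross simplification of DLSS Frame Generation, which combines a hardware optical-flow field with an AI model; the function below only shows the basic idea:

```python
import numpy as np

def interpolate_frame(frame_a, frame_b, flow):
    """frame_a, frame_b: H x W greyscale frames; flow: H x W x 2 per-pixel
    motion (dy, dx) from frame_a to frame_b. Returns a midpoint frame by
    sampling each source half a motion step away (nearest-neighbour)."""
    h, w = frame_a.shape
    mid = np.zeros_like(frame_a)
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y, x]
            ya = int(np.clip(y - dy / 2, 0, h - 1))
            xa = int(np.clip(x - dx / 2, 0, w - 1))
            yb = int(np.clip(y + dy / 2, 0, h - 1))
            xb = int(np.clip(x + dx / 2, 0, w - 1))
            mid[y, x] = 0.5 * (frame_a[ya, xa] + frame_b[yb, xb])
    return mid
```

With zero motion this degenerates to a simple average of the two frames; with real motion, sampling along the flow vectors keeps moving objects in the right place in the generated frame instead of producing a double image.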

DLSS 3.5


DLSS 3.5 adds Ray Reconstruction, replacing multiple denoising algorithms with a single AI model trained on five times more data than DLSS 3. Ray Reconstruction is available on all RTX GPUs and first targeted games with path tracing (also called "full ray tracing"), including Cyberpunk 2077's Phantom Liberty DLC, Portal with RTX, and Alan Wake 2.[19][18]

DLSS 4.0


The fourth generation of DLSS was unveiled alongside the GeForce RTX 50 series. DLSS 4 upscaling uses a new vision transformer-based model for enhanced image quality, with reduced ghosting and greater image stability in motion compared to the previous convolutional neural network (CNN) model.[30] DLSS 4 allows a greater number of frames to be generated and interpolated from a single traditionally rendered frame. This form of frame generation, called Multi Frame Generation, is exclusive to the GeForce RTX 50 series, while the GeForce RTX 40 series is limited to one interpolated frame per traditionally rendered frame. According to Nvidia, this technique increases performance by up to 800% while retaining low latency with Nvidia Reflex.[31] Nvidia claims that the DLSS 4 Frame Generation model uses 30% less video memory, citing Warhammer 40,000: Darktide using 400 MB less memory at 4K resolution with Frame Generation enabled.[32] Nvidia claims that 75 games will integrate DLSS 4 Multi Frame Generation at launch, including Alan Wake 2, Cyberpunk 2077, Indiana Jones and the Great Circle, and Star Wars Outlaws.[33]

Feature | GeForce RTX 20 series | GeForce RTX 30 series | GeForce RTX 40 series | GeForce RTX 50 series
Transformer model | Yes | Yes | Yes | Yes
2× Frame Generation | No | No | Yes | Yes
3–4× Frame Generation | No | No | No | Yes

Manually upgrading DLSS support


Users can manually replace the DLLs in games to support a newer version of DLSS. DLSS Swapper, an open-source utility, can do this automatically for all installed games.[34] Replacing DLL files cannot add DLSS support or features to games that do not already implement them, though some mods can add frame generation support.[35]

Anti-aliasing


DLSS requires and applies its own anti-aliasing method. Thus, depending on the game and quality setting used, DLSS may improve image quality even over native-resolution rendering.[36] It operates on similar principles to TAA: like TAA, it uses information from past frames to produce the current frame. Unlike TAA, DLSS does not sample every pixel in every frame; instead, it samples different pixels in different frames and uses pixels sampled in past frames to fill in the unsampled pixels in the current frame. DLSS uses machine learning to combine samples in the current frame and past frames, and it can be thought of as an advanced TAA implementation made possible by the available Tensor Cores.[13] Nvidia also offers Deep Learning Anti-Aliasing (DLAA), which provides the same AI-driven anti-aliasing that DLSS uses, but without any upscaling or downscaling functionality.[26]

Architecture


With the exception of the shader-core version implemented in Control, DLSS is only available on GeForce RTX 20, GeForce RTX 30, GeForce RTX 40, GeForce RTX 50, and Quadro RTX series video cards, using dedicated AI accelerators called Tensor Cores.[25][failed verification] Tensor Cores have been available since the Nvidia Volta GPU microarchitecture, which was first used in the Tesla V100 line of products.[37] They perform fused multiply-add (FMA) operations, which are used extensively in neural network calculations to apply a large series of multiplications on weights, followed by the addition of a bias. Tensor Cores can operate on FP16, INT8, INT4, and INT1 data types. Each core performs 1024 bits of FMA operations per clock: 1024 INT1, 256 INT4, 128 INT8, or 64 FP16 operations per clock per Tensor Core; most Turing GPUs have a few hundred Tensor Cores.[38] Tensor Cores use CUDA warp-level primitives across 32 parallel threads to take advantage of their parallel architecture;[39] a warp is a set of 32 threads configured to execute the same instruction. Since Windows 10 version 1903, Microsoft Windows has provided DirectML as part of DirectX to support Tensor Cores.
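The per-clock throughput figures above follow directly from the 1024-bit FMA width, as a quick calculation shows:

```python
# Ops per clock per Tensor Core = FMA width in bits / bits per element,
# matching the Turing figures cited above.

def ops_per_clock(bits_per_element, fma_width_bits=1024):
    return fma_width_bits // bits_per_element

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4), ("INT1", 1)]:
    print(name, ops_per_clock(bits))
# FP16 64, INT8 128, INT4 256, INT1 1024
```

Narrower data types therefore trade numerical precision for proportionally higher arithmetic throughput on the same hardware.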

Reception


Particularly with early versions of DLSS, users reported blurry frames. Andrew Edelsten, an Nvidia employee, addressed the problem in a 2019 blog post, promising that the company was working on improving the technology and clarifying that the DLSS AI algorithm was trained mainly on 4K image material. DLSS produces notably blurrier images at lower resolutions such as Full HD because the algorithm has far less image information available to construct an appropriate image than at higher resolutions like 4K.[40]

The use of DLSS Frame Generation can increase input latency[41] and introduce visual artifacts.[42] Critics have also argued that by implementing DLSS, game developers lose the incentive to optimize their games to run smoothly at native resolution on modern PC hardware. For example, for Alan Wake 2 at 4K resolution with the highest graphics settings and ray tracing enabled, DLSS in Performance mode is recommended to achieve 60 fps even on graphics cards such as the Nvidia GeForce RTX 4080.[43]

The transformer-based AI upscaling model introduced with DLSS 4 received praise for its improved image quality, with increased stability, reduced ghosting, better anti-aliasing, and a higher level of detail, as well as for its backward compatibility and greater training scalability for future improvements.[44][45]


References

  1. ^ "Nvidia RTX DLSS: Everything you need to know". Digital Trends. 2020-02-14. Retrieved 2020-04-05. "Deep learning super sampling uses artificial intelligence and machine learning to produce an image that looks like a higher-resolution image, without the rendering overhead. Nvidia's algorithm learns from tens of thousands of rendered sequences of images that were created using a supercomputer. That trains the algorithm to be able to produce similarly beautiful images, but without requiring the graphics card to work as hard to do it."
  2. ^ a b c "Introducing NVIDIA DLSS 3". NVIDIA. Retrieved 2022-09-20.
  3. ^ a b "NVIDIA DLSS 4 Introduces Multi Frame Generation & Enhancements For All DLSS Technologies". NVIDIA. Retrieved 2025-01-14.
  4. ^ a b c "Nvidia DLSS in 2020: stunning results". TechSpot. 2020-02-26. Retrieved 2020-04-05.
  5. ^ a b c "Battlefield V DLSS Tested: Overpromised, Underdelivered". TechSpot. 2019-02-19. Retrieved 2020-04-06. "Of course, this is to be expected. DLSS was never going to provide the same image quality as native 4K while providing a 37% performance uplift. That would be black magic. But the quality difference comparing the two is almost laughable, in how far away DLSS is from the native presentation in these stressful areas."
  6. ^ a b "AMD Thinks NVIDIA DLSS is not Good Enough; Calls TAA & SMAA Better Alternatives". techquila.co.in. 2019-02-15. Retrieved 2020-04-06. "Recently, two big titles received NVIDIA DLSS support, namely Metro Exodus and Battlefield V. Both these games come with NVIDIA's DXR (DirectX Raytracing) implementation that at the moment is only supported by the GeForce RTX cards. DLSS makes these games playable at higher resolutions with much better frame rates, although there is a notable decrease in image sharpness. Now, AMD has taken a jab at DLSS, saying that traditional AA methods like SMAA and TAA 'offer superior combinations of image quality and performance.'"
  7. ^ "Nvidia Very Quietly Made DLSS A Hell Of A Lot Better". Kotaku. 2020-02-22. Archived from the original on February 21, 2020. Retrieved 2020-04-06. "The benefit for most people is that, generally, DLSS comes with a sizeable FPS improvement. How much varies from game to game. In Metro Exodus, the FPS jump was barely there and certainly not worth the bizarre hit to image quality."
  8. ^ a b "Remedy's Control vs DLSS 2.0 – AI upscaling reaches the next level". Eurogamer. 2020-04-04. Retrieved 2020-04-05. "Of course, this isn't the first DLSS implementation we've seen in Control. The game shipped with a decent enough rendition of the technology that didn't actually use machine learning Tensor core component of the Nvidia Turing architecture, relying on the standard CUDA cores instead"
  9. ^ "NVIDIA DLSS 2.0 Update Will Fix The GeForce RTX Cards' Big Mistake". techquila.co.in. 2020-03-24. Retrieved 2020-04-06. "As promised, NVIDIA has updated the DLSS network in a new GeForce update that provides better, sharper image quality while still retaining higher framerates in raytraced games. While the feature wasn't used as well in its first iteration, NVIDIA is now confident that they have successfully fixed all the issues it had before"
  10. ^ "NVIDIA DLSS Plugin and Reflex Now Available for Unreal Engine". NVIDIA Developer Blog. 2021-02-11. Retrieved 2022-02-07.
  11. ^ "NVIDIA DLSS Natively Supported in Unity 2021.2". NVIDIA Developer Blog. 2021-04-14. Retrieved 2022-02-07.
  12. ^ "HW News - Crysis Remastered Ray Tracing, NVIDIA DLSS 2, Ryzen 3100 Rumors". 2020-04-19. Archived from the original on 2020-09-26. Retrieved 2020-04-19. "The original DLSS required training the AI network for each new game. DLSS 2.0 trains using non-game-specific content, delivering a generalized network that works across games. This means faster game integrations, and ultimately more DLSS games."
  13. ^ a b c d e f g h Edward Liu, NVIDIA. "DLSS 2.0 - Image Reconstruction for Real-time Rendering with Deep Learning".
  14. ^ a b c d "Truly Next-Gen: Adding Deep Learning to Games & Graphics (Presented by NVIDIA)". GDC Vault. Retrieved 2022-02-07.
  15. ^ "DLSS enabled by over 80% of GeForce RTX gaming GPU owners, claims Nvidia". PCGamesN. 2025-01-16. Retrieved 2025-01-31.
  16. ^ Edelsten, Andrew (30 August 2019). "NVIDIA DLSS: Control and Beyond". Nvidia. Retrieved 11 August 2020. "Leveraging this AI research, we developed a new image processing algorithm that approximated our AI research model and fit within our performance budget. This image processing approach to DLSS is integrated into Control, and it delivers up to 75% faster frame rates."
  17. ^ "NVIDIA DLSS 2.0 Review with Control – Is This Magic?". techquila.co.in. 2020-04-05. Retrieved 2020-04-06.
  18. ^ a b c "Nvidia's new DLSS 3.5 works on all RTX GPUs to improve the quality of ray tracing". The Verge. 22 August 2023. Retrieved 6 September 2023.
  19. ^ a b "Nvidia announces DLSS 3.5 with ray reconstruction, boosting RT quality with an AI-trained denoiser". Eurogamer. 23 August 2023. Retrieved 6 September 2023.
  20. ^ Khan, Sarfraz (2025-01-14). "NVIDIA Confirms Updated DLSS Frame Generation On RTX 40 GPUs, Leads to Lower VRAM Usage & Faster Performance". Wccftech. Retrieved 2025-01-14.
  21. ^ a b "NVIDIA preparing Ultra Quality mode for DLSS, 2.2.9.0 version spotted". VideoCardz.com. Retrieved 2021-07-06.
  22. ^ "DLSS 3 explained: How Nvidia's AI-infused RTX tech turbocharges PC gaming". PCWorld. Retrieved 2024-06-08.
  23. ^ "DLSS: What Does It Mean for Game Developers?". NVIDIA Developer Blog. 2018-09-19. Retrieved 2022-02-07.
  24. ^ "NVIDIA DLSS: Your Questions, Answered". Nvidia. 2019-02-15. Retrieved 2020-04-19. "The DLSS team first extracts many aliased frames from the target game, and then for each one we generate a matching 'perfect frame' using either super-sampling or accumulation rendering. These paired frames are fed to NVIDIA's supercomputer. The supercomputer trains the DLSS model to recognize aliased inputs and generate high-quality anti-aliased images that match the 'perfect frame' as closely as possible. We then repeat the process, but this time we train the model to generate additional pixels rather than applying AA. This has the effect of increasing the resolution of the input. Combining both techniques enables the GPU to render the full monitor resolution at higher frame rates."
  25. ^ a b c "NVIDIA DLSS 2.0: A Big Leap In AI Rendering". Nvidia. 2020-03-23. Retrieved 2020-04-07.
  26. ^ a b "What is Nvidia DLAA? An Anti-Aliasing Explainer". Digital Trends. 2021-09-28. Retrieved 2022-02-10.
  27. ^ Temporal AA small Cloud Front
  28. ^ "NVIDIA DLSS DLL (2.3.7) Download". TechPowerUp. Retrieved 2022-02-10.
  29. ^ "NVIDIA Optical Flow SDK". NVIDIA Developer. 2018-11-29. Retrieved 2022-09-20.
  30. ^ Leadbetter, Richard (January 7, 2025). "Hands-on with DLSS 4 on Nvidia's new GeForce RTX 5080". Eurogamer. Retrieved January 7, 2025.
  31. ^ "NVIDIA Blackwell GeForce RTX 50 Series Opens New World of AI Computer Graphics". NVIDIA Newsroom. Retrieved 2025-01-07.
  32. ^ Lin, Henry; Burnes, Andrew (January 6, 2025). "Nvidia DLSS 4 Introduces Multi Frame Generation & Enhancements For All DLSS Technologies". Nvidia. Retrieved January 7, 2025.
  33. ^ Mujtaba, Hassan (January 6, 2025). "Nvidia DLSS 4 Delivers An Insane 8x Performance Boost Versus DLSS 3 With Multi Frame Generation Technology, Enhanced Upscaling For RTX 20 & Above". Wccftech. Retrieved January 7, 2025.
  34. ^ Edser, Andy (2024-08-30). "This open source tool updates DLSS to the latest version in all your games at once and no matter the launcher". PC Gamer. Retrieved 2025-01-28.
  35. ^ Nasir, Hassam (2025-01-27). "DLSS Swapper now updates FSR, XeSS, and DLSS, too — Supports all major upscaling/frame gen technologies". Tom's Hardware. Retrieved 2025-01-28.
  36. ^ Smith, Matthew S. (2023-12-28). "What Is DLSS and Why Does it Matter for Gaming?". IGN. Retrieved 2024-06-13.
  37. ^ "On Tensors, Tensorflow, And Nvidia's Latest 'Tensor Cores'". Tom's Hardware. 2017-04-11. Retrieved 2020-04-08.
  38. ^ "Tensor Core DL Performance Guide" (PDF). Nvidia. Archived (PDF) from the original on 2020-11-11.
  39. ^ "Using CUDA Warp-Level Primitives". Nvidia. 2018-01-15. Retrieved 2020-04-08. "NVIDIA GPUs execute groups of threads known as warps in SIMT (Single Instruction, Multiple Thread) fashion."
  40. ^ "NVIDIA DLSS: Your Questions, Answered". Nvidia. Retrieved 2024-07-09.
  41. ^ "When a high frame rate can lose you the game". Digital Trends. 2023-11-21. Retrieved 2024-07-09.
  42. ^ "Nvidia DLSS 3 Revisit: We Try It Out in 9 Games". TechSpot. 2023-03-08. Retrieved 2024-07-09.
  43. ^ "Alan Wake 2 on PC is an embarrassment of riches". Digital Trends. 2023-10-26. Retrieved 2024-07-09.
  44. ^ "NVIDIA DLSS 4 Transformer Review - Better Image Quality for Everyone". TechPowerUp. Archived from the original on 2025-01-28. Retrieved 2025-01-31.
  45. ^ Leadbetter, Richard (2025-01-07). "Hands-on with DLSS 4 on Nvidia's new GeForce RTX 5080". Eurogamer. Retrieved 2025-01-31.
