In computer graphics, a shader is a programmable operation applied to data as it moves through the rendering pipeline.[1][2] Shaders can act on data such as vertices and primitives (to generate or morph geometry) and fragments (to calculate the values in a rendered image).[2]
Shaders can execute a wide variety of operations and can run on different types of hardware. In modern real-time computer graphics, shaders are run on graphics processing units (GPUs), dedicated hardware that provides highly parallel execution of programs. As rendering an image is embarrassingly parallel, fragment (pixel) shaders scale well on SIMD hardware. Historically, the drive for faster rendering has produced highly parallel processors which can in turn be used for other SIMD-amenable algorithms.[3] Shaders executing in a compute pipeline are commonly called compute shaders.
The term "shader" was first introduced to the public by Pixar with version 3.0 of their RenderMan Interface Specification, originally published in May 1988.[4]
As graphics processing units evolved, major graphics software libraries such as OpenGL and Direct3D began to support shaders. The first shader-capable GPUs only supported pixel shading, but vertex shaders were quickly introduced once developers realized the power of shaders. The first video card with a programmable pixel shader was the Nvidia GeForce 3 (NV20), released in 2001.[5] Geometry shaders were introduced with Direct3D 10 and OpenGL 3.2. Eventually, graphics hardware evolved toward a unified shader model.
The traditional use of shaders is to operate on data in the graphics pipeline to control the rendering of an image. Graphics shaders can be classified according to their position in the pipeline, the data being manipulated, and the graphics API being used.
Fragment shaders, also known as pixel shaders, compute color and other attributes of each "fragment": a unit of rendering work affecting at most a single output pixel. The simplest kinds of pixel shaders output one screen pixel as a color value; more complex shaders with multiple inputs/outputs are also possible.[6] Pixel shaders range from simply always outputting the same color, to applying a lighting value, to doing bump mapping, shadows, specular highlights, translucency and other phenomena. They can alter the depth of the fragment (for Z-buffering), or output more than one color if multiple render targets are active. In 3D graphics, a pixel shader alone cannot produce some kinds of complex effects because it operates only on a single fragment, without knowledge of a scene's geometry (i.e. vertex data). However, pixel shaders do have knowledge of the screen coordinate being drawn, and can sample the screen and nearby pixels if the contents of the entire screen are passed as a texture to the shader. This technique can enable a wide variety of two-dimensional postprocessing effects such as blur, or edge detection/enhancement for cartoon/cel shaders. Pixel shaders may also be applied in intermediate stages to any two-dimensional images (sprites or textures) in the pipeline, whereas vertex shaders always require a 3D scene. For instance, a pixel shader is the only kind of shader that can act as a postprocessor or filter for a video stream after it has been rasterized.
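As a concrete illustration, a minimal fragment shader might look like the following GLSL sketch; the uTexture and uBrightness uniforms and the vUV input are illustrative names that would be supplied by the application and the vertex stage, not part of any standard interface. It performs the basic per-fragment work described above: sampling a texture and computing an output color.

```glsl
#version 330 core

in vec2 vUV;                    // texture coordinate interpolated from the vertex stage
out vec4 fragColor;             // color written for this fragment

uniform sampler2D uTexture;     // texture bound by the application (assumed name)
uniform float uBrightness;      // simple per-frame parameter (assumed name)

void main()
{
    vec4 texel = texture(uTexture, vUV);                 // sample the bound texture
    fragColor  = vec4(texel.rgb * uBrightness, texel.a); // scale the color, keep alpha
}
```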
Vertex shaders are run once for each 3D vertex given to the graphics processor. The purpose is to transform each vertex's 3D position in virtual space to the 2D coordinate at which it appears on the screen (as well as a depth value for the Z-buffer).[7] Vertex shaders can manipulate properties such as position, color and texture coordinates, but cannot create new vertices. The output of the vertex shader goes to the next stage in the pipeline, which is either a geometry shader if present, or the rasterizer. Vertex shaders can enable powerful control over the details of position, movement, lighting, and color in any scene involving 3D models.
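A correspondingly minimal vertex shader, again as a GLSL sketch with assumed attribute locations and an assumed uModelViewProjection uniform, transforms each vertex from object space to clip space and forwards a texture coordinate to later stages:

```glsl
#version 330 core

layout(location = 0) in vec3 aPosition;   // object-space vertex position
layout(location = 1) in vec2 aUV;         // per-vertex texture coordinate

out vec2 vUV;                             // passed on to the fragment stage

uniform mat4 uModelViewProjection;        // combined transform set by the application (assumed name)

void main()
{
    vUV = aUV;
    // Transform to clip space; the fixed-function hardware then performs the
    // perspective divide and viewport mapping to produce screen coordinates.
    gl_Position = uModelViewProjection * vec4(aPosition, 1.0);
}
```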
Geometry shaders were introduced in Direct3D 10 and OpenGL 3.2; they were previously available in OpenGL 2.0+ through extensions.[8] This type of shader can generate new graphics primitives, such as points, lines, and triangles, from the primitives that were sent to the beginning of the graphics pipeline.[9]
Geometry shader programs are executed after vertex shaders. They take as input a whole primitive, possibly with adjacency information. For example, when operating on triangles, the three vertices are the geometry shader's input. The shader can then emit zero or more primitives, which are rasterized and their fragments ultimately passed to a pixel shader.
Typical uses of a geometry shader include point sprite generation, geometry tessellation, shadow volume extrusion, and single-pass rendering to a cube map. A typical real-world example of the benefits of geometry shaders would be automatic mesh complexity modification: a series of line strips representing control points for a curve is passed to the geometry shader, and depending on the complexity required, the shader can automatically generate extra lines, each of which provides a better approximation of the curve.
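The point sprite case can be sketched in GLSL as follows; the uHalfSize uniform and the expansion directly in clip space are simplifying assumptions made for brevity. Each incoming point primitive is expanded into a four-vertex triangle strip, showing how a geometry shader emits new primitives:

```glsl
#version 330 core

// Expand each incoming point into a screen-aligned quad (a two-triangle strip).
layout(points) in;
layout(triangle_strip, max_vertices = 4) out;

uniform float uHalfSize;        // half the sprite size in clip-space units (assumed name)

out vec2 gUV;                   // texture coordinate for the fragment stage

void main()
{
    vec4 center = gl_in[0].gl_Position;   // the single vertex of the input point

    gUV = vec2(0.0, 0.0);
    gl_Position = center + vec4(-uHalfSize, -uHalfSize, 0.0, 0.0);
    EmitVertex();

    gUV = vec2(1.0, 0.0);
    gl_Position = center + vec4( uHalfSize, -uHalfSize, 0.0, 0.0);
    EmitVertex();

    gUV = vec2(0.0, 1.0);
    gl_Position = center + vec4(-uHalfSize,  uHalfSize, 0.0, 0.0);
    EmitVertex();

    gUV = vec2(1.0, 1.0);
    gl_Position = center + vec4( uHalfSize,  uHalfSize, 0.0, 0.0);
    EmitVertex();

    EndPrimitive();               // finish the strip; a geometry shader may also emit nothing at all
}
```

The max_vertices declaration caps how much geometry a single invocation may emit, which is one reason heavy geometry amplification is usually better served by tessellation or mesh shaders.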
As of OpenGL 4.0 and Direct3D 11, a new shader class called a tessellation shader has been added. It adds two new shader stages to the traditional model: tessellation control shaders (also known as hull shaders) and tessellation evaluation shaders (also known as domain shaders), which together allow simpler meshes to be subdivided into finer meshes at run-time according to a mathematical function. The function can be related to a variety of variables, most notably the distance from the viewing camera, to allow active level-of-detail scaling. This allows objects close to the camera to have fine detail, while objects further away can have coarser meshes yet seem comparable in quality. It can also drastically reduce the required mesh bandwidth by allowing meshes to be refined once inside the shader units instead of downsampling very complex ones from memory. Some algorithms can upsample any arbitrary mesh, while others allow for "hinting" in meshes to dictate the most characteristic vertices and edges.
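A distance-based level-of-detail scheme of this kind might be sketched as a GLSL tessellation control (hull) shader like the one below; the uCameraPosition uniform, the assumption that control points arrive in world space, and the 64.0 / distance heuristic are all illustrative choices rather than a standard formula:

```glsl
#version 400 core

// Tessellation control ("hull") shader: choose how finely to subdivide each
// triangle patch based on its distance from the camera.
layout(vertices = 3) out;

uniform vec3 uCameraPosition;   // supplied by the application (assumed name)

void main()
{
    // Pass the control point through unchanged; the evaluation (domain)
    // shader positions the vertices produced by the tessellator.
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    if (gl_InvocationID == 0) {
        // Crude patch-to-camera distance using the first control point,
        // assuming positions are in world space.
        float dist  = distance(uCameraPosition, gl_in[0].gl_Position.xyz);
        float level = clamp(64.0 / dist, 1.0, 64.0);  // nearer patches get more subdivisions

        gl_TessLevelInner[0] = level;
        gl_TessLevelOuter[0] = level;
        gl_TessLevelOuter[1] = level;
        gl_TessLevelOuter[2] = level;
    }
}
```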
Circa 2017, the AMD Vega microarchitecture added support for a new shader stage, primitive shaders, somewhat akin to compute shaders with access to the data necessary to process geometry.[10][11]
Nvidia introduced mesh and task shaders with its Turing microarchitecture in 2018; these are also modelled after compute shaders.[12][13] Turing was the first GPU microarchitecture to support mesh shading through the DirectX 12 Ultimate API, several months before the Ampere RTX 30 series was released.[14]
In 2020, AMD and Nvidia released the RDNA 2 and Ampere microarchitectures, which both support mesh shading through DirectX 12 Ultimate.[15] Mesh shaders allow the GPU to handle more complex algorithms, offloading more work from the CPU to the GPU, and in algorithm-intensive rendering can increase the frame rate or the number of triangles in a scene by an order of magnitude.[16] Intel announced that Intel Arc Alchemist GPUs shipping in Q1 2022 would support mesh shaders.[17]
Ray tracing shaders are supported by Microsoft via DirectX Raytracing, by the Khronos Group via Vulkan, GLSL, and SPIR-V,[18] and by Apple via Metal. NVIDIA and AMD call ray tracing shaders "ray tracing cores". Unlike a unified shader, one ray tracing shader can contain multiple ALUs.[19]
Compute shaders are not limited to graphics applications but use the same execution resources for GPGPU. They may be used in graphics pipelines, e.g. for additional stages in animation or lighting algorithms (such as tiled forward rendering). Some rendering APIs allow compute shaders to easily share data resources with the graphics pipeline.
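As a minimal sketch of what a compute shader looks like, the following GLSL program converts an image to grayscale; the binding points and the rgba8 format are assumptions about how the application binds its images:

```glsl
#version 430

// One work group covers a 16x16 tile of the image.
layout(local_size_x = 16, local_size_y = 16) in;

layout(rgba8, binding = 0) uniform readonly  image2D uInput;   // source image (assumed binding)
layout(rgba8, binding = 1) uniform writeonly image2D uOutput;  // destination image (assumed binding)

void main()
{
    ivec2 texel = ivec2(gl_GlobalInvocationID.xy);

    // Skip invocations outside the image when its size is not a multiple
    // of the work-group size.
    if (any(greaterThanEqual(texel, imageSize(uInput))))
        return;

    vec4 c = imageLoad(uInput, texel);
    float luma = dot(c.rgb, vec3(0.2126, 0.7152, 0.0722));   // Rec. 709 luma weights
    imageStore(uOutput, texel, vec4(vec3(luma), c.a));
}
```

The application would dispatch one work group per 16x16 tile, for example with glDispatchCompute((width + 15) / 16, (height + 15) / 16, 1) in OpenGL.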
Tensor shaders may be integrated in NPUs or GPUs. Tensor shaders are supported by Microsoft via DirectML, by the Khronos Group via OpenVX, by Apple via Core ML, by Google via TensorFlow, and by the Linux Foundation via ONNX.[20] NVIDIA and AMD call tensor shaders "tensor cores". Unlike a unified shader, one tensor shader can contain multiple ALUs.[21]
Compute kernels are routines compiled for high-throughput accelerators (such as graphics processing units (GPUs), digital signal processors (DSPs), or field-programmable gate arrays (FPGAs)), separate from but used by a main program (typically running on a central processing unit). They may be specified in a separate programming language such as "OpenCL C", as "compute shaders" written in a shading language, or embedded directly in application code written in a high-level language. They are sometimes called compute shaders, sharing execution units with vertex shaders and pixel shaders on GPUs, but are not limited to execution on one class of device or graphics API.[22][23] Compute kernels roughly correspond to inner loops when implementing algorithms in traditional languages (except there is no implied sequential operation), or to code passed to internal iterators. Microsoft supports this as DirectCompute.
This programming paradigm maps well to vector processors: there is an assumption that each invocation of a kernel within a batch is independent, allowing for data-parallel execution. However, atomic operations may be used for synchronization between elements in scenarios with interdependent work. Individual invocations are given indices (in one or more dimensions) from which arbitrary addressing of buffer data may be performed (including scatter-gather operations), so long as the non-overlapping assumption is respected.
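The indexing and atomic-synchronization pattern described above can be illustrated with a GLSL compute shader operating on storage buffers; the buffer layouts, the uCount uniform, and the 64-bin histogram are illustrative assumptions rather than a fixed interface:

```glsl
#version 430

// Each invocation handles one input element, identified by its global index.
layout(local_size_x = 256) in;

layout(std430, binding = 0) readonly buffer InputValues { float values[]; };
layout(std430, binding = 1) buffer Histogram            { uint  bins[64]; };

uniform uint uCount;            // number of valid elements, set by the host (assumed name)

void main()
{
    uint i = gl_GlobalInvocationID.x;   // 1-D index of this invocation within the dispatch
    if (i >= uCount)
        return;

    // Map a value assumed to lie in [0, 1) to one of 64 bins.
    uint bin = min(uint(values[i] * 64.0), 63u);

    // Scatter: several invocations may increment the same bin, so an atomic
    // add is used instead of relying on any ordering between invocations.
    atomicAdd(bins[bin], 1u);
}
```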
The Vulkan API provides the intermediate SPIR-V representation to describe both graphical shaders and compute kernels in a language-independent and machine-independent manner. The intention is to facilitate language evolution and provide a more natural ability to leverage GPU compute capabilities, in line with hardware developments such as Unified Memory Architecture and Heterogeneous System Architecture. This allows closer cooperation between a CPU and GPU.
Much work has been done on kernel generation with LLMs as a means of optimizing code. KernelBench,[24] created by the Scaling Intelligence Lab at Stanford, provides a framework to evaluate the ability of LLMs to generate efficient GPU kernels. Cognition has created Kevin 32-B,[25] which generates efficient CUDA kernels and is currently the highest-performing model on KernelBench.
Several programming languages exist specifically for writing shaders; which one is used can depend on the target environment. The shading language for OpenGL is GLSL, and Direct3D uses HLSL. The Metal framework, used by Apple devices, has its own shading language, the Metal Shading Language.
Increasingly in modern graphics APIs, shaders are compiled into SPIR-V, an intermediate language, before they are distributed to the end user. This standard allows more flexible choice of shading language, regardless of target platform.[26] First supported by Vulkan and OpenGL, SPIR-V is also being adopted by Direct3D.[27]
Modern video game development platforms such as Unity, Unreal Engine and Godot increasingly include node-based editors that can create shaders without the need for written code; the user is instead presented with a directed graph of connected nodes that allows them to route various textures, maps, and mathematical functions into output values such as the diffuse color, the specular color and intensity, roughness/metalness, height, normal, and so on. The graph is then compiled into a shader.