Uploaded by basisspace

Borderless Per Face Texture Mapping

This presentation discusses texture waste in modern games: up to 30% of texture memory is wasted, which costs both memory and load time. It introduces a real-time implementation of Ptex, a texturing system that removes the need for UV unwrapping by giving each quad its own texture space, improving production efficiency and visual fidelity. It then outlines Ptex's benefits and its performance impact, including reduced memory use and load times.

1. Eliminating Texture Waste: Borderless Ptex
John McDonald, NVIDIA Corporation
NVIDIA Corporation © 2013

2. Memory Consumption
  • Modern games consume a lot of memory.
  • The largest class of memory usage is textures.
  • But lots of texture is wasted!
  • Waste costs both memory and increased load times.
[Chart labels: Back/Front, Gbuffer, Textures, VB/IB, Simulation]

3-4. Wasted?!
  • Two sources of texture waste:
      • Unmapped texture storage (major)
      • Duplicated texels to help alleviate visible seams (minor). This cannot eliminate seams.
[Figure: unwrapped face texture with the unmapped regions labeled "Waste". Image: http://www.boogotti.com/root/images/face/dffuse_texture.jpg]

5. How much waste are we talking?
  • Nearly 60% of memory usage in a modern game* is texture usage.
  • And up to 30% of that is waste.
  • That's 18% of your total application footprint.

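The 18% figure is simply the product of the two percentages above. A minimal Python sketch of that arithmetic (the function name is illustrative and not from the talk):

```python
def wasted_footprint_pct(texture_pct: float, waste_pct: float) -> float:
    """Percentage of the total memory footprint that is wasted texture memory."""
    # waste_pct is a share of texture memory, so scale it by the texture share.
    return texture_pct * waste_pct / 100.0


# Figures from the slide: ~60% of memory is textures, up to 30% of that is waste.
print(wasted_footprint_pct(60, 30))
```

The same function can be fed a title's own measurements, since the 60% and 30% inputs come from a small survey and will vary per game.
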
6. Memory Waste
  • 18% of your memory is useless.
  • 18% of your load time is wasted.

7. Enter Ptex (a quick recap)
  • The soul of Ptex:
      • Model with quads instead of triangles. You're doing this for your next-gen engine anyways, right?
      • Every quad gets its own entire texture UV space.
      • UV orientation is implicit in the surface definition.
      • No explicit UV parameterization.
      • Resolution of each face is independent of its neighbors.

8. Ptex (cont'd)
  • Invented by Brent Burley at Walt Disney Animation Studios.
  • Used in every animated film at Disney since 2007:
      • 6 features and all shorts, plus everything in production now and for the foreseeable future.
      • Used on ~100% of surfaces.
  • Rapid adoption in DCC tools.
  • Widespread usage throughout the film industry.

9. Ptex benefits
  • No UV unwraps.
  • Allow artists to work at any resolution they want.
      • Perform an offline pass on assets to decide what to ship for each platform based on capabilities.
      • Ship a texture pack later for tail revenue.
  • Reduce your load times. And your memory footprint. Improve your visual fidelity.
  • Reduce the cost of production's long pole: art.

10. Demo
  • Demo is running on a Titan.
      • Sorry, it's what we have at the show. I've run on 430-680; perf scales linearly with Texture/FB.
  • Could run on any Dx11 capable GPU.
  • Could also run on Dx10 capable GPUs with small adaptations.
  • OpenGL 4, no vendor-specific extensions.

11. Roadmap: Realtime Ptex v1
[Pipeline diagram. Boxes: Load Model, Bucket and Sort, Generate Mipmaps, Fill Borders, Pack Texture Arrays, Reorder Index Buffer, Pack Patch Constants, Render. Phase labels: Preprocess, Draw Time.]
  • Color key: red = vertex and index data; green = patch constant information; blue = texel data; orange = adjacency information.

12. Roadmap: Realtime Ptex v2
[Pipeline diagram. Boxes: Load Model, Pack Texture Arrays, Pack Patch Constants, Render. Phase labels: Preprocess, Draw Time.]
  • Color key: red = vertex and index data; green = patch constant information; blue = texel data; orange = adjacency information.

13. Realtime Ptex v2
  • Instead of copying texels into a border region, just go look at them.
  • Use clamp-to-border addressing (border color), with a border color of (0,0,0,0).
      • This makes those lookups fast.
      • Also lets you know how close to the edge you are.
  • We'll need to transform our UVs into their UV space.
  • And accumulate the results.
  • Waste factor? 0*.

14. Example Model
[Figure: example model with its buffers. VB: … IB:]

15. Load Model
  • Vertex data
      • Any geometry arranged as a quad-based mesh
      • Example: Wavefront OBJ
  • Patch texture
      • Power-of-two texture images
  • Adjacency information
      • 4 neighbors of each quad patch
  • Easily load texture and adjacency with the OSS library available from http://ptex.us/

16. Texture Arrays
  • Like 3D / volume textures, except:
      • No filtering between 2D slices
      • Only X and Y decrease with mipmap level (Z doesn't)
      • Z is indexed by integer index, not [0,1]
      • E.g. (0.5, 0.5, 4) would be (0.5, 0.5) from the 5th slice
  • API support
      • Direct3D 10+: Texture2DArray
      • OpenGL 3.0+: GL_TEXTURE_2D_ARRAY

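The two properties that distinguish a texture array from a volume texture can be modelled on the CPU. The Python sketch below is illustrative only: the function names are made up, and the round-to-nearest-then-clamp slice selection is an assumption modelled on how GLSL picks an array layer.

```python
import math


def array_mip_dims(width, height, layers, level):
    """Size of one mip level of a 2D texture array."""
    # X and Y halve per level (never below 1); the layer count does not change.
    return max(1, width >> level), max(1, height >> level), layers


def resolve_slice(coord, layers):
    """Split a (u, v, slice) coordinate into a filtered UV and an integer slice."""
    u, v, z = coord
    # Assumption: round to nearest, then clamp to the valid layer range.
    index = min(max(math.floor(z + 0.5), 0), layers - 1)
    return (u, v), index


print(array_mip_dims(512, 512, 8, 3))
print(resolve_slice((0.5, 0.5, 4), 8))
```

Slice index 4 is the fifth slice because indexing starts at 0, which matches the example on the slide.
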
17. Arrays of Texture Arrays
  • Both GLSL and HLSL* support arrays of texture arrays.
  • This allows for stupidly powerful abuse of texturing.
      Texture2DArray albedo[32];          // D3D
      uniform sampler2DArray albedo[32];  // OpenGL
  * HLSL support requires a little codegen, but it's entirely a compile-time exercise with no runtime impact.

18. Pack Texture Arrays
  • One Texture2DArray per top-mipmap level
  • Store with complete mipmap chain
  • Don't forget to set border color to black (with 0 alpha).

19. Packed Arrays
[Figure: three texture arrays. Texture Array (TA) 0 holds Slice 0, Slice 1 and Slice 2; TA 1 holds Slice 0; TA 2 holds Slice 0.]

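The packing step amounts to bucketing faces by their top-level resolution and handing out slice numbers within each bucket. A minimal Python sketch, assuming each face texture is described only by its (width, height); the function name and the largest-first ordering are choices made for this illustration, not something the talk specifies:

```python
from collections import defaultdict


def pack_texture_arrays(face_resolutions):
    """Assign each face an (array index, slice) pair, one array per resolution."""
    buckets = defaultdict(list)
    for face_id, res in enumerate(face_resolutions):
        buckets[res].append(face_id)

    placement = {}
    array_sizes = []
    # Largest resolution first; the ordering is arbitrary for this sketch.
    for array_index, res in enumerate(sorted(buckets, reverse=True)):
        for slice_index, face_id in enumerate(buckets[res]):
            placement[face_id] = (array_index, slice_index)
        array_sizes.append((res, len(buckets[res])))
    return placement, array_sizes


faces = [(256, 256), (256, 256), (128, 128), (256, 256), (64, 64)]
placement, array_sizes = pack_texture_arrays(faces)
print(placement)
print(array_sizes)
```

With these five faces the result has the same shape as the figure: one array with three slices and two arrays with one slice each.
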
20. Pack Patch Constants
  • Create a constant buffer indexed by PrimitiveID. Each entry contains:
      • Your array index and slice in the Texture2DArrays
      • Your four neighbors across the edges
      • Each neighbor's UV orientation
  • (Again, can be prepared at baking time.)
  • If rendering too many primitives to fit into a constant buffer, you can use Structured Buffers / SSBO for storage.
      struct PTexParameters {
          ushort usNgbrIndex[4];
          ushort usNgbrXform[4];
          ushort usTexIndex;
          ushort usTexSlice;
      };
      uniform ptxDiffuseUBO {
          PTexParameters ptxDiffuse[PRIMS];
      };

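On the CPU side, filling that buffer means serializing ten 16-bit values per face. The Python sketch below packs them tightly in little-endian order; the dictionary keys are hypothetical names for the struct fields, and the tight 20-byte layout is an assumption. A real uniform or structured buffer has its own layout rules (array padding in particular), so the byte layout must be matched to whatever the shader actually declares.

```python
import struct

# Ten unsigned 16-bit values per face, mirroring the PTexParameters fields:
# 4 neighbor indices, 4 neighbor transforms, texture array index, slice.
PTEX_PARAMS = struct.Struct("<4H4H2H")


def pack_patch_constants(faces):
    """Serialize per-face parameters in PrimitiveID order."""
    blob = bytearray()
    for f in faces:
        blob += PTEX_PARAMS.pack(
            *f["ngbr_index"], *f["ngbr_xform"], f["tex_index"], f["tex_slice"]
        )
    return bytes(blob)


face = {
    "ngbr_index": [1, 2, 3, 4],   # faces across edges 0..3
    "ngbr_xform": [0, 5, 10, 15], # which UV transform applies per edge
    "tex_index": 0,               # which Texture2DArray
    "tex_slice": 2,               # which slice inside it
}
print(len(pack_patch_constants([face])))
```
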
21. Rendering time (CPU)
  • Bind Texture2DArrays (if you're in GL, consider Bindless)
  • Select shader
  • Set up constants

22. Rendering Time (DS)
  • In the domain shader, we need to generate our UVs.
  • Use SV_DomainLocation.
  • Exact mapping is dependent on the DCC tool used to generate the mesh.
[Figure caption: Incorrect surface orientation]

23. Rendering Time (PS)
  • Conceptually, a Ptex lookup is:
      • Sample our surface (use SV_PrimitiveID to determine our data).
      • For each neighbor:
          • Transform our UV into their UV space
          • Perform a lookup in that surface with transformed UVs
      • Accumulate the result, correct for base-level differences and return

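The steps above can be sketched as a CPU-side reference in Python. This is not the shader from the talk: `sample` stands in for a filtered texture fetch that returns (0, 0, 0, 0) outside the face, the neighbor list and its transform callables are hypothetical, and the zero-weight guard is an addition for this sketch.

```python
def ptex_lookup(sample, face, uv, neighbors):
    """Accumulate our own sample plus each neighbor's, then renormalize.

    sample(face, uv) -> (r, g, b, a); alpha acts as the filter weight.
    neighbors: list of (neighbor_face, transform), where transform maps
    our UV into the neighbor's UV space.
    """
    total = list(sample(face, uv))
    for neighbor_face, transform in neighbors:
        contribution = sample(neighbor_face, transform(uv))
        total = [t + c for t, c in zip(total, contribution)]

    weight = total[3]
    if weight == 0.0:
        # Not covered by any face; guard added for this sketch only.
        return (0.0, 0.0, 0.0, 0.0)
    # Dividing by the accumulated alpha corrects for base-level differences.
    return tuple(t / weight for t in total)


# Stub sampler: a red face 0 and a green face 1 with mismatched edge weights.
def fake_sample(face, uv):
    return {0: (0.75, 0.0, 0.0, 0.75), 1: (0.0, 0.4375, 0.0, 0.4375)}[face]


print(ptex_lookup(fake_sample, 0, (1.0, 0.5), [(1, lambda uv: (uv[0] - 1.0, uv[1]))]))
```

All neighbor fetches here are independent of each other, which is what keeps the extra lookups cheap on a GPU.
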
24. Mapping Space
  • There are 16 cases that map our UV space to our neighbors, as shown.
[Figure: the 16 mapping cases]

25. Transforming Space
  • Conveniently, these map to simple 3x2 texture transforms.

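One way to derive the 16 transforms is sketched below in Python. It assumes the Ptex edge numbering (0 = bottom, 1 = right, 2 = top, 3 = left, each edge traversed counter-clockwise) and consistently wound faces; the talk's own tables may use a different convention, and the function names are illustrative. Each transform is a rotation by a multiple of 90 degrees plus a translation, stored as three rows of two so that `[u', v'] = [u, v, 1] · M`.

```python
# Edge endpoints in a face's own UV square, counter-clockwise (assumed convention).
EDGE_START = [(0, 0), (1, 0), (1, 1), (0, 1)]
EDGE_END = [(1, 0), (1, 1), (0, 1), (0, 0)]
# 2x2 rotations by 0, 90, 180 and 270 degrees counter-clockwise.
ROT = [((1, 0), (0, 1)), ((0, -1), (1, 0)), ((-1, 0), (0, -1)), ((0, 1), (-1, 0))]


def neighbor_transform(our_edge, their_edge):
    """3x2 matrix mapping our UVs into the UV space of the neighbor."""
    # The shared edge runs in opposite directions in the two faces.
    r = ROT[(their_edge + 2 - our_edge) % 4]
    sx, sy = EDGE_START[our_edge]
    ex, ey = EDGE_END[their_edge]
    # T(p) = R (p - s) + e, so the translation part is e - R s.
    tx = ex - (r[0][0] * sx + r[0][1] * sy)
    ty = ey - (r[1][0] * sx + r[1][1] * sy)
    return ((r[0][0], r[1][0]), (r[0][1], r[1][1]), (tx, ty))


def apply_transform(m, u, v):
    return (u * m[0][0] + v * m[1][0] + m[2][0],
            u * m[0][1] + v * m[1][1] + m[2][1])


# Our right edge meets the neighbor's left edge: a pure translation, u' = u - 1.
print(neighbor_transform(1, 3))
# Bottom to bottom: a 180 degree rotation, (u, v) -> (1 - u, -v).
print(neighbor_transform(0, 0))
```

Since only 16 matrices exist, they can be baked into a constant table and selected by a small per-edge index such as the one stored in usNgbrXform.
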
26. All your base
  • Base level differences, wah?
  • When a 512x512 neighbors a 256x256, their base levels are different.
  • This is an issue because samples are constant-sized in texel (integer) space, not UV (float) space.
[Figure caption: Bad seaming]

27. Renormalization
  • With unused alpha channel, code is simply:
      return result / result.a;
  • If you need alpha, see appendix.
[Figure captions: Bad seaming; Fixed!]

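A one-dimensional model makes the base-level problem concrete. The Python sketch below assumes plain bilinear filtering at the base mip level, fully opaque texels and a border alpha of 0; the function names are illustrative. Near a shared edge the two alphas sum to 1 only when both faces have the same resolution, which is why the accumulated result has to be divided by its alpha.

```python
import math


def sample_alpha_1d(u, texels):
    """Bilinear-filtered alpha at u for a row of opaque texels with a zero border."""
    x = u * texels - 0.5           # continuous texel coordinate
    i0 = math.floor(x)
    f = x - i0

    def alpha(i):
        return 1.0 if 0 <= i < texels else 0.0   # border color has alpha 0

    return (1.0 - f) * alpha(i0) + f * alpha(i0 + 1)


def edge_weights(u, our_texels, their_texels):
    """Alpha from our face and from the neighbor across our u = 1 edge."""
    ours = sample_alpha_1d(u, our_texels)
    theirs = sample_alpha_1d(u - 1.0, their_texels)
    return ours, theirs


# Same resolution, exactly on the edge: each side contributes half.
print(edge_weights(1.0, 16, 16))
# 16 texels next to 4 texels, slightly inside our face: the weights no longer sum to 1.
ours, theirs = edge_weights(1.0 - 1.0 / 64.0, 16, 4)
print(ours, theirs, ours + theirs)
```

In the mismatched case the ramp from 1 to 0 is one texel wide on each side, but a texel of the coarser face covers four times as much UV space, so the two ramps do not cancel.
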
28. 0% Waste?
  • Okay, not quite 0.
  • Need a global set of textures that match the Ptex resolutions used: "Standard Candles".
      • But they are one-channel, and can be massively compressed (4 bits per pixel).
      • <5 megs of overhead, regardless of texture footprint.
      • For actual games, more like 1K of overhead.
      • Could be eliminated, but at the cost of some shader complexity.
  • Not needed for:
      • Textures without alpha
      • Textures used for normal maps
      • Textures less than 32 bytes per pixel

29. A brief interlude on the expense of retrieving texels from textured surfaces
  • Texture lookups by themselves are not expensive.
  • There are fundamentally two types of lookups:
      • Independent reads
      • Dependent reads
  • Independent reads can be pipelined.
      • The first lookup "costs" ~150 clocks.
      • The second costs ~5 clocks.
  • Dependent reads must wait for previous results.
      • The first lookup costs ~150 clocks.
      • The second costs ~150 clocks.
  • Try to have no more than 2-3 "levels" of dependent reads in a single shader.

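Plugging the slide's ballpark numbers into a crude latency model shows why independent neighbor lookups are affordable. This Python sketch is only an illustration: the function name is made up, the 150 and 5 clock figures are the rough values quoted above, and a real GPU hides much of this latency by switching to other work.

```python
def estimated_latency_clocks(reads, dependent, first=150, pipelined=5):
    """Rough latency of a chain of texture reads, using the slide's figures."""
    if reads <= 0:
        return 0
    if dependent:
        # Each read must wait for the previous result.
        return first * reads
    # Independent reads overlap: pay the full latency once, then a small increment.
    return first + pipelined * (reads - 1)


print(estimated_latency_clocks(2, dependent=False))
print(estimated_latency_clocks(2, dependent=True))
# One face plus four neighbors, all independent of each other.
print(estimated_latency_clocks(5, dependent=False))
```
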
30. Performance Impact
  • In this demo, Ptex costs < 30% versus no texturing at all.
  • Costs < 20% compared to repeat texturing.
  • ~15% versus a UV-unwrapped mesh.

31. Putting it all together
[Figure: face F and its four neighbors, labeled U, D, R and L]
  • F.(u, v) = ( 0.5, 0.5 )
  • R.(u, v) = ( 0.5, -0.5 )
  • U.(u, v) = ( 0.5, 1.5 )
  • L.(u, v) = ( 1.5, 0.5 )
  • D.(u, v) = ( 0.5, -0.5 )
  • In this situation, texture lookups in R, U, L and D will return the border color (0, 0, 0, 0).
  • F lookup will return alpha of 1, so the weight will be exactly 1.

32. Putting it all together
[Figure: face F and its four neighbors, labeled U, D, R and L]
  • F.(u, v) = ( 1.0, 0.5 )
  • R.(u, v) = ( 0.5, 0.0 )
  • U.(u, v) = ( 0.0, 1.5 )
  • L.(u, v) = ( 2.0, 0.5 )
  • D.(u, v) = ( 0.0, -0.5 )
  • In this situation, texture lookups in U, L and D will return the border color (0, 0, 0, 0).
  • If R and F are the same resolution, they will each return an alpha of 0.5.
  • If R and F are not the same resolution, alpha will not be 1.0, so renormalization will be necessary.

33. Questions?
  • jmcdonald at nvidia dot com
  • Demo thanks: Johnny Costello and Timothy Lottes!

34. In the demo
  • Ptex
  • AA
  • Vignetting
  • Lighting
  • Spectral Simulation (7 data points)
  • Volumetric Caustics (128 taps per pixel)



Editor's Notes

  • #6 * Based on a survey of memory usage from 5 currently shipping AAA titles.
  • #12 Used border texels. Cons: load-time preparation; border depended on maximum aniso; changing texture quality required a restart (redo expensive border prep); high levels of texture waste for game-resolution assets when high aniso is used.
  • #15 Each texture already has complete mipmap chain
  • #17 Introduced in Direct3D 10
  • #21 gl_PrimitiveID
  • #22 Standard Model rendering stuff
  • #25 Explain bottom->bottom.
  • #26 Derive bottom->bottom again
  • #27 Go through 4x4 adjoining 16x16 image
  • #30 Note that cost here is really just an increase in latency, which itself is only significant if latency is what’s preventing you from running faster. TLDR; Profile, Profile, Profile.
  • #31 Completely unoptimized
