Uploaded by basisspace

Borderless Per Face Texture Mapping

The document discusses texture waste in modern games: up to 30% of texture memory can be wasted, which costs both memory and load time. It introduces Ptex, a texture system that eliminates the need for UV unwrapping by giving each quad its own texture space, improving production efficiency and visual fidelity. The document outlines Ptex's benefits and its performance impact in a real-time demo, where it removes nearly all texture waste.

1. Eliminating Texture Waste: Borderless Ptex
  • John McDonald, NVIDIA Corporation
NVIDIA Corporation © 2013
2. Memory Consumption
  • Modern games consume a lot of memory.
  • The largest class of memory usage is textures.
  • But lots of texture is wasted!
  • Waste costs both memory and increased load times.
  • [Chart of memory usage by category: Back/Front, Gbuffer, Textures, VB/IB, Simulation]
3-4. Wasted?!
  • Two sources of texture waste:
      • Unmapped texture storage (major)
      • Duplicated texels to help alleviate visible seams (minor)
          • This cannot eliminate seams.
  • [Image: UV-unwrapped face texture, http://www.boogotti.com/root/images/face/dffuse_texture.jpg; slide 3 overlays "Waste" labels on the unmapped regions, slide 4 shows the same image without them]
5. How much waste are we talking?
  • Nearly 60% of memory usage in a modern game* is texture usage.
  • And up to 30% of that is waste.
  • That’s 18% of your total application footprint.
6. Memory Waste
  • 18% of your memory is useless.
  • 18% of your load time is wasted.
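The 18% figure is the product of the two percentages quoted on the previous slide. A minimal sketch of that arithmetic follows; the function name and the 2048 MB budget are illustrative assumptions, not figures from the talk, and equating wasted memory with wasted load time assumes load time scales with the bytes loaded.

```python
def wasted_share(texture_share: float, waste_within_textures: float) -> float:
    """Fraction of the total footprint that is wasted texture storage."""
    return texture_share * waste_within_textures


share = wasted_share(0.60, 0.30)   # 60% textures, of which up to 30% is waste
budget_mb = 2048                   # hypothetical memory budget, for illustration only
wasted_mb = budget_mb * share

print(f"{share:.0%} of the footprint, roughly {wasted_mb:.0f} MB of {budget_mb} MB")
```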
7. Enter Ptex (a quick recap)
  • The soul of Ptex:
      • Model with quads instead of triangles.
          • You’re doing this for your next-gen engine anyways, right?
      • Every quad gets its own entire texture UV space.
      • UV orientation is implicit in the surface definition.
          • No explicit UV parameterization.
      • Resolution of each face is independent of its neighbors.
8. Ptex (cont’d)
  • Invented by Brent Burley at Walt Disney Animation Studios.
  • Used in every animated film at Disney since 2007.
      • 6 features and all shorts, plus everything in production now and for the foreseeable future.
      • Used on ~100% of surfaces.
  • Rapid adoption in DCC tools.
  • Widespread usage throughout the film industry.
9. Ptex benefits
  • No UV unwraps.
  • Allow artists to work at any resolution they want.
      • Perform an offline pass on assets to decide what to ship for each platform based on capabilities.
      • Ship a texture pack later for tail revenue.
  • Reduce your load times. And your memory footprint. Improve your visual fidelity.
  • Reduce the cost of production’s long pole: art.
10. Demo
  • Demo is running on a Titan.
      • Sorry, it’s what we have at the show. I’ve run on 430-680; perf scales linearly with Texture/FB.
  • Could run on any Dx11 capable GPU.
  • Could also run on Dx10 capable GPUs with small adaptations.
  • OpenGL 4, no vendor-specific extensions.
11. Roadmap: Realtime Ptex v1
  • [Pipeline diagram] Phases: Preprocess, Draw Time.
  • Stages shown: Load Model, Bucket and Sort, Generate Mipmaps, Fill Borders, Pack Texture Arrays, Reorder Index Buffer, Pack Patch Constants, Render.
  • Color legend:
      • Red: vertex and index data
      • Green: patch constant information
      • Blue: texel data
      • Orange: adjacency information
12. Roadmap: Realtime Ptex v2
  • [Pipeline diagram] Phases: Preprocess, Draw Time.
  • Stages shown: Load Model, Pack Texture Arrays, Pack Patch Constants, Render.
  • Color legend:
      • Red: vertex and index data
      • Green: patch constant information
      • Blue: texel data
      • Orange: adjacency information
13. Realtime Ptex v2
  • Instead of copying texels into a border region, just go look at them.
  • Use clamp-to-border addressing, with a border color of (0,0,0,0).
      • This makes those lookups fast.
      • Also lets you know how close to the edge you are.
  • We’ll need to transform our UVs into their UV space.
  • And accumulate the results.
  • Waste factor? 0*.
14. Example Model
  • VB: …
  • IB:
15. Load Model
  • Vertex data
      • Any geometry arranged as a quad-based mesh
      • Example: Wavefront OBJ
  • Patch texture
      • Power-of-two texture images
  • Adjacency information
      • 4 neighbors of each quad patch
  • Easily load texture and adjacency with the OSS library available from http://ptex.us/
16. Texture Arrays
  • Like 3D / volume textures, except:
      • No filtering between 2D slices
      • Only X and Y decrease with mipmap level (Z doesn’t)
      • Z indexed by integer index, not [0,1]
          • E.g. (0.5, 0.5, 4) would be (0.5, 0.5) from the 5th slice
  • API support
      • Direct3D 10+: Texture2DArray
      • OpenGL 3.0+: GL_TEXTURE_2D_ARRAY
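The two ways a texture array differs from a volume texture can be sketched on the CPU. This is an illustrative model only: `array_mip_chain` and `slice_for` are hypothetical helper names, and rounding the third coordinate to the nearest slice and clamping it is an assumption about typical API behavior rather than something stated on the slide.

```python
def array_mip_chain(width: int, height: int, layers: int):
    """Return (width, height, layers) for every mip level of a 2D texture array."""
    chain = [(width, height, layers)]
    while width > 1 or height > 1:
        width = max(1, width // 2)
        height = max(1, height // 2)
        chain.append((width, height, layers))  # the slice count never shrinks
    return chain


def slice_for(coord, layers: int):
    """Split (u, v, z) into the 2D coordinate and an integer slice index."""
    u, v, z = coord
    layer = min(max(int(round(z)), 0), layers - 1)  # no blending between slices
    return (u, v), layer


print(array_mip_chain(4, 4, 3))
print(slice_for((0.5, 0.5, 4), layers=8))
```

The second call selects index 4, that is, the 5th slice, and leaves (0.5, 0.5) untouched for ordinary 2D filtering inside that slice.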
17. Arrays of Texture Arrays
  • Both GLSL and HLSL* support arrays of texture arrays.
  • This allows for stupidly powerful abuse of texturing.
      Texture2DArray albedo[32]; // D3D
      uniform sampler2DArray albedo[32]; // OpenGL
  • * HLSL support requires a little codegen, but it’s entirely a compile-time exercise, no runtime impact.
18. Pack Texture Arrays
  • One Texture2DArray per top-mipmap level.
  • Store with a complete mipmap chain.
  • Don’t forget to set border color to black (with 0 alpha).
19. Packed Arrays
  • [Diagram] Texture Array (TA) 0: Slice 0, Slice 1, Slice 2
  • TA 1: Slice 0
  • TA 2: Slice 0
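The packing rule above (one texture array per top-mip resolution, one slice per face) can be sketched as a simple bucketing pass. This is a minimal illustration rather than the demo's code: `pack_texture_arrays` is a hypothetical name, and the face resolutions are made-up inputs chosen to reproduce the three arrays in the diagram.

```python
def pack_texture_arrays(face_resolutions):
    """Assign each face an (array_index, slice_index) pair.

    Faces whose top mip level has the same size share one texture array.
    """
    array_of_resolution = {}   # resolution -> array index
    slices_used = []           # array index -> number of slices allocated
    placement = []
    for resolution in face_resolutions:
        if resolution not in array_of_resolution:
            array_of_resolution[resolution] = len(slices_used)
            slices_used.append(0)
        array_index = array_of_resolution[resolution]
        placement.append((array_index, slices_used[array_index]))
        slices_used[array_index] += 1
    return placement, slices_used


# Hypothetical per-face resolutions, in face order.
faces = [(512, 512), (512, 512), (256, 256), (512, 512), (128, 128)]
placement, slices_used = pack_texture_arrays(faces)
print(placement)
print(slices_used)
```

The resulting pair per face is what the patch constants later store as the array index and slice.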
20. Pack Patch Constants
  • Create a constant buffer indexed by PrimitiveID. Each entry contains:
      • Your array index and slice in the Texture2DArrays
      • Your four neighbors across the edges
      • Each neighbor’s UV orientation
  • (Again, can be prepared at baking time)
  • If rendering too many primitives to fit into a constant buffer, you can use Structured Buffers / SSBO for storage.
      struct PTexParameters {
          ushort usNgbrIndex[4];
          ushort usNgbrXform[4];
          ushort usTexIndex;
          ushort usTexSlice;
      };
      uniform ptxDiffuseUBO {
          PTexParameters ptxDiffuse[PRIMS];
      };
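On the CPU side, one way to bake such a table is to pack ten 16-bit fields per primitive in the order of `PTexParameters`. The sketch below makes several assumptions that are not in the slides: the tightly packed little-endian layout (real uniform-block layout rules may pad or widen these fields), the `NO_NEIGHBOR` sentinel for boundary edges, and the reading of `usNgbrXform` as the neighbor's shared edge number.

```python
import struct

NO_NEIGHBOR = 0xFFFF  # hypothetical sentinel for an edge on the mesh boundary

# usNgbrIndex[4], usNgbrXform[4], usTexIndex, usTexSlice as unsigned 16-bit values.
PTEX_PARAMS = struct.Struct("<4H4H2H")


def pack_patch_constants(patches):
    """Pack one record per primitive; record i starts at i * PTEX_PARAMS.size."""
    blob = bytearray()
    for patch in patches:
        blob += PTEX_PARAMS.pack(*patch["ngbr_index"], *patch["ngbr_xform"],
                                 patch["tex_index"], patch["tex_slice"])
    return bytes(blob)


# Two quads side by side: patch 0's edge 1 touches patch 1's edge 3.
patches = [
    {"ngbr_index": [NO_NEIGHBOR, 1, NO_NEIGHBOR, NO_NEIGHBOR],
     "ngbr_xform": [0, 3, 0, 0], "tex_index": 0, "tex_slice": 0},
    {"ngbr_index": [NO_NEIGHBOR, NO_NEIGHBOR, NO_NEIGHBOR, 0],
     "ngbr_xform": [0, 0, 0, 1], "tex_index": 0, "tex_slice": 1},
]
blob = pack_patch_constants(patches)
print(PTEX_PARAMS.size, len(blob))
```

Because the records are fixed-size, the shader-side lookup is a plain index by primitive ID, which is what makes a constant buffer or structured buffer a natural fit.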
21. Rendering Time (CPU)
  • Bind Texture2DArrays
      • (If you’re in GL, consider Bindless)
  • Select shader
  • Set up constants
22. Rendering Time (DS)
  • In the domain shader, we need to generate our UVs.
  • Use SV_DomainLocation.
  • Exact mapping is dependent on the DCC tool used to generate the mesh.
  • [Image caption: Incorrect surface orientation]
23. Rendering Time (PS)
  • Conceptually, a Ptex lookup is:
      • Sample our surface (use SV_PrimitiveID to determine our data).
      • For each neighbor:
          • Transform our UV into their UV space.
          • Perform a lookup in that surface with the transformed UVs.
      • Accumulate the result, correct for base-level differences, and return.
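The conceptual lookup above can be emulated on the CPU to see the control flow. This is a sketch under stated assumptions, not the demo's pixel shader: `ptex_lookup`, `sample` and `to_neighbor_uv` are hypothetical names, the sampler is a stand-in that returns a flat color with alpha 1 inside [0, 1]² and the transparent border color outside, and the two-face strip exists only to exercise the loop.

```python
def ptex_lookup(face, uv, neighbors, sample, to_neighbor_uv):
    """Sum our own sample and every neighbor's sample, then renormalize by alpha."""
    total = list(sample(face, uv))
    for edge, neighbor in enumerate(neighbors[face]):
        if neighbor is None:                      # mesh boundary, nothing to fetch
            continue
        contribution = sample(neighbor, to_neighbor_uv(edge, uv))
        total = [t + c for t, c in zip(total, contribution)]
    alpha = total[3]
    return tuple(t / alpha for t in total) if alpha > 0 else tuple(total)


# Stand-in data: face 0 is red, face 1 is blue, face 1 sits to the right of face 0.
colors = {0: (1.0, 0.0, 0.0), 1: (0.0, 0.0, 1.0)}
neighbors = {0: [None, 1, None, None], 1: [None, None, None, 0]}


def sample(face, uv):
    u, v = uv
    if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
        return (*colors[face], 1.0)
    return (0.0, 0.0, 0.0, 0.0)                   # border color


def to_neighbor_uv(edge, uv):
    u, v = uv
    return (u - 1.0, v) if edge == 1 else (u + 1.0, v)


print(ptex_lookup(0, (0.5, 0.5), neighbors, sample, to_neighbor_uv))
print(ptex_lookup(0, (1.0, 0.5), neighbors, sample, to_neighbor_uv))
print(ptex_lookup(1, (0.0, 0.5), neighbors, sample, to_neighbor_uv))
```

The last two calls evaluate the same point on the shared edge from either face and produce the same blended color, which is the property that hides the seam.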
24. Mapping Space
  • There are 16 cases that map our UV space to our neighbors, as shown.
25. Transforming Space
  • Conveniently, these map to simple 3x2 texture transforms.
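The 16 cases (4 of our edges times 4 of the neighbor's edges) can be generated rather than written by hand. The sketch below is one possible derivation, not the talk's table: it assumes edges numbered 0 = bottom, 1 = right, 2 = top, 3 = left on a counter-clockwise quad, and that neighboring faces share that winding, so the shared edge runs in opposite directions in the two faces.

```python
# Per edge: (start corner, direction along the edge, inward normal), all in UV space.
EDGES = [
    ((0, 0), (1, 0), (0, 1)),    # 0: bottom, v = 0
    ((1, 0), (0, 1), (-1, 0)),   # 1: right,  u = 1
    ((1, 1), (-1, 0), (0, -1)),  # 2: top,    v = 1
    ((0, 1), (0, -1), (1, 0)),   # 3: left,   u = 0
]


def map_uv(my_edge, their_edge, uv):
    """Express our (u, v) in the UV space of the neighbor across my_edge."""
    (sx, sy), (dx, dy), (nx, ny) = EDGES[my_edge]
    along = (uv[0] - sx) * dx + (uv[1] - sy) * dy   # position along our edge
    depth = (uv[0] - sx) * nx + (uv[1] - sy) * ny   # distance into our face
    (tx, ty), (ex, ey), (qx, qy) = EDGES[their_edge]
    # The edge is traversed backwards in the neighbor, and our interior lies
    # outside the neighbor, hence (1 - along) and -depth.
    return (tx + (1 - along) * ex - depth * qx,
            ty + (1 - along) * ey - depth * qy)


def transform_3x2(my_edge, their_edge):
    """Rows: image of the u axis, image of the v axis, image of the origin."""
    ox, oy = map_uv(my_edge, their_edge, (0, 0))
    ux, uy = map_uv(my_edge, their_edge, (1, 0))
    vx, vy = map_uv(my_edge, their_edge, (0, 1))
    return ((ux - ox, uy - oy), (vx - ox, vy - oy), (ox, oy))


TRANSFORMS = {(a, b): transform_3x2(a, b) for a in range(4) for b in range(4)}
print(TRANSFORMS[(0, 0)])   # bottom -> bottom
print(TRANSFORMS[(1, 3)])   # right -> left
```

Under these assumptions bottom-to-bottom is a 180 degree rotation about the edge midpoint, (u, v) to (1 - u, -v), and right-to-left is a pure translation, (u, v) to (u - 1, v). In every case the face center lands outside the neighbor's [0, 1]² square, so that fetch returns the border color.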
26. All your base
  • Base level differences, wah?
  • When a 512x512 neighbors a 256x256, their base levels are different.
  • This is an issue because samples are constant-sized in texel (integer) space, not UV (float) space.
  • [Image caption: Bad seaming]
27. Renormalization
  • With unused alpha channel, code is simply:
      return result / result.a;
  • If you need alpha, see appendix.
  • [Image captions: Bad seaming; Fixed!]
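The one-line fix divides the accumulated color by the accumulated alpha, which acts as the total filter weight. A minimal CPU sketch follows; `renormalize` is a hypothetical name, and the weights 0.75 and 0.375 are illustrative values for a sample near a seam between faces of different resolution, not measurements from the demo.

```python
def renormalize(samples):
    """Sum weighted RGBA samples and divide by the accumulated alpha."""
    total = [sum(channel) for channel in zip(*samples)]
    alpha = total[3]
    if alpha == 0:
        return (0.0, 0.0, 0.0, 0.0)
    return tuple(c / alpha for c in total)


# Each fetch returns color * weight, with the weight carried in alpha.
own = (0.75, 0.0, 0.0, 0.75)         # red face, weight 0.75
neighbor = (0.0, 0.0, 0.375, 0.375)  # blue face, weight 0.375

result = renormalize([own, neighbor])
print(result)
```

Without the division the summed weight here is 1.125, so the pixel would come out too bright along the seam; after it, the two colors are mixed 2:1 and alpha is exactly 1.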
28. 0% Waste?
  • Okay, not quite 0.
  • Need a global set of textures that match the Ptex resolutions used.
      • “Standard Candles”
      • But they are one-channel, and can be massively compressed (4 bits per pixel).
      • <5 megs of overhead, regardless of texture footprint.
      • For actual games, more like 1K of overhead.
  • Could be eliminated, but at the cost of some shader complexity.
  • Not needed for:
      • Textures without alpha
      • Textures used for normal maps
      • Textures less than 32 bytes per pixel
29. A brief interlude on the expense of retrieving texels from textured surfaces
  • Texture lookups by themselves are not expensive.
  • There are fundamentally two types of lookups:
      • Independent reads
      • Dependent reads
  • Independent reads can be pipelined.
      • The first lookup “costs” ~150 clocks.
      • The second costs ~5 clocks.
  • Dependent reads must wait for previous results.
      • The first lookup costs ~150 clocks.
      • The second costs ~150 clocks.
  • Try to have no more than 2-3 “levels” of dependent reads in a single shader.
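The two cost patterns can be compared with a toy latency model. It uses the approximate clock counts from the slide; extending the "second read" cost to every further read is an assumption, and the function names are hypothetical.

```python
FIRST_READ_CLOCKS = 150     # approximate latency of an unpipelined read
PIPELINED_READ_CLOCKS = 5   # approximate cost of each extra independent read


def independent_cost(reads: int) -> int:
    """Reads whose addresses are all known up front overlap in the pipeline."""
    if reads == 0:
        return 0
    return FIRST_READ_CLOCKS + PIPELINED_READ_CLOCKS * (reads - 1)


def dependent_cost(levels: int) -> int:
    """Each read needs the previous result, so the full latency is paid every time."""
    return FIRST_READ_CLOCKS * levels


for n in (1, 2, 3, 5):
    print(n, independent_cost(n), dependent_cost(n))
```

In this model five independent reads cost about as much as a single read plus a little, while a chain of three dependent reads already costs three full latencies, which is the reason for the 2-3 level guideline.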
30. Performance Impact
  • In this demo, Ptex costs < 30% versus no texturing at all.
  • Costs < 20% compared to repeat texturing.
  • ~15% versus a UV-unwrapped mesh.
31. Putting it all together
  • [Diagram: face F surrounded by neighbors U, D, R and L]
  • F.(u, v) = ( 0.5, 0.5 )
  • R.(u, v) = ( 0.5, -0.5 )
  • U.(u, v) = ( 0.5, 1.5 )
  • L.(u, v) = ( 1.5, 0.5 )
  • D.(u, v) = ( 0.5, -0.5 )
  • In this situation, texture lookups in R, U, L and D will return the border color (0, 0, 0, 0).
  • F lookup will return alpha of 1, so the weight will be exactly 1.
32. Putting it all together
  • [Diagram: face F surrounded by neighbors U, D, R and L]
  • F.(u, v) = ( 1.0, 0.5 )
  • R.(u, v) = ( 0.5, 0.0 )
  • U.(u, v) = ( 0.0, 1.5 )
  • L.(u, v) = ( 2.0, 0.5 )
  • D.(u, v) = ( 0.0, -0.5 )
  • In this situation, texture lookups in U, L and D will return the border color (0, 0, 0, 0).
  • If R and F are the same resolution, they will each return an alpha of 0.5.
  • If R and F are not the same resolution, alpha will not be 1.0, so renormalization will be necessary.
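The alpha values in these two worked examples follow from bilinear filtering against a transparent border. The sketch below models that in one dimension at the base mip level only; `border_alpha` is a hypothetical helper, every texel is assumed to have alpha 1, and real hardware also selects mip levels, which this model ignores.

```python
import math


def border_alpha(u: float, resolution: int) -> float:
    """Alpha of a 1D bilinear fetch: texels have alpha 1, the border has alpha 0."""
    x = u * resolution - 0.5                 # sample position in texel space
    i0 = math.floor(x)
    frac = x - i0

    def inside(i: int) -> float:
        return 1.0 if 0 <= i < resolution else 0.0

    return (1.0 - frac) * inside(i0) + frac * inside(i0 + 1)


print(border_alpha(0.5, 4))    # face center: full weight
print(border_alpha(1.0, 4))    # exactly on the edge: half weight
print(border_alpha(1.5, 4))    # well outside: border color only

# A point slightly inside F's edge, seen from F (512 texels) and from a neighbor.
d = 1 / 2048
print(border_alpha(1 - d, 512) + border_alpha(-d, 512))   # same resolution
print(border_alpha(1 - d, 512) + border_alpha(-d, 256))   # mismatched resolution
```

With equal resolutions the two weights sum to 1. With a 512 face next to a 256 face the sum drifts away from 1 near the edge, because the filter footprint is a fixed number of texels rather than a fixed UV distance, and that is the case renormalization corrects.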
33. Questions?
  • jmcdonald at nvidia dot com
  • Demo thanks: Johnny Costello and Timothy Lottes!
34. In the demo
  • Ptex
  • AA
  • Vignetting
  • Lighting
  • Spectral Simulation (7 data points)
  • Volumetric Caustics (128 taps per pixel)

Editor's Notes

  • #6 * Based on a survey of memory usage from 5 currently shipping AAA titles.
  • #12 Used border texels. Cons: load-time preparation; border depended on maximum aniso; changing texture quality required a restart (redo expensive border prep); high levels of texture waste for game-resolution assets when high aniso is used.
  • #15 Each texture already has complete mipmap chain
  • #17 Introduced in Direct3D 10
  • #21 gl_PrimitiveID
  • #22 Standard Model rendering stuff
  • #25 Explain bottom->bottom.
  • #26 Derive bottom->bottom again
  • #27 Go through 4x4 adjoining 16x16 image
  • #30 Note that cost here is really just an increase in latency, which itself is only significant if latency is what’s preventing you from running faster. TLDR; Profile, Profile, Profile.
  • #31 Completely unoptimized
