include::meta/VK_NV_shading_rate_image.txt[]

*Last Modified Date*::
    2018-09-13
*Contributors*::
  - Pat Brown, NVIDIA
  - Carsten Rohde, NVIDIA
  - Jeff Bolz, NVIDIA
  - Daniel Koch, NVIDIA
  - Mathias Schott, NVIDIA
  - Matthew Netsch, Qualcomm Technologies, Inc.

This extension allows applications to use a variable shading rate when
processing fragments of rasterized primitives.
By default, Vulkan will spawn one fragment shader invocation for each pixel
covered by a primitive.
With this extension, applications can bind a _shading rate image_ that can
be used to vary the number of fragment shader invocations across the
framebuffer.
Some portions of the screen may be configured to spawn up to 16 fragment
shader invocations for each pixel, while other portions may use a single
fragment shader invocation for a 4x4 block of pixels.
This can be useful for use cases like eye tracking, where the portion of the
framebuffer that the user is looking at directly can be processed at high
frequency, while distant corners of the image can be processed at lower
frequency.
Each texel in the shading rate image represents a fixed-size rectangle in
the framebuffer, covering 16x16 pixels in the initial implementation of this
extension.
When rasterizing a primitive covering one of these rectangles, the Vulkan
implementation reads a texel in the bound shading rate image and looks up
the fetched value in a palette to determine a base shading rate.
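
The texel fetch and palette lookup described above can be modelled on the
CPU roughly as follows.
This is a minimal sketch, not the behavior of any particular implementation:
the `RateImage` type, the `base_rate_for_pixel` function, and the fixed
16x16 texel size are assumptions made for illustration, and the palette is
treated as a plain array of integers.

```c
#include <stdint.h>

/* Assumption: each texel covers 16x16 pixels, as in the initial
   implementation described above. */
enum { TEXEL_WIDTH = 16, TEXEL_HEIGHT = 16 };

/* Hypothetical CPU-side model of a shading rate image: one 8-bit palette
   index per texel, stored row by row. */
typedef struct {
    uint32_t width;         /* image width in texels */
    uint32_t height;        /* image height in texels */
    const uint8_t *texels;  /* width * height palette indices */
} RateImage;

/* Return the palette entry that selects the base shading rate for the
   framebuffer pixel (px, py).
   Preconditions of this sketch: the pixel lies inside the area covered by
   the image, and every stored index is a valid index into the palette. */
static uint32_t base_rate_for_pixel(const RateImage *img,
                                    const uint32_t *palette,
                                    uint32_t px, uint32_t py)
{
    uint32_t tx = px / TEXEL_WIDTH;   /* texel column covering the pixel */
    uint32_t ty = py / TEXEL_HEIGHT;  /* texel row covering the pixel */
    uint32_t index = img->texels[ty * img->width + tx];
    return palette[index];
}
```

In this model, every pixel inside the same 16x16 rectangle fetches the same
texel and therefore starts from the same base shading rate.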

In addition to the API support controlling rasterization, this extension
also adds Vulkan support for the `SPV_NV_shading_rate` extension to SPIR-V.
That extension provides two fragment shader variable decorations that allow
fragment shaders to determine the shading rate used for processing the
fragment:

  * code:FragmentSizeNV, which indicates the width and height of the set of
    pixels processed by the fragment shader.
  * code:InvocationsPerPixelNV, which indicates the maximum number of
    fragment shader invocations that could be spawned for the pixel(s)
    covered by the fragment.

When using SPIR-V in conjunction with the OpenGL Shading Language (GLSL),
the fragment shader capabilities are provided by the
`GL_NV_shading_rate_image` language extension and correspond to the built-in
variables code:gl_FragmentSizeNV and code:gl_InvocationsPerPixelNV,
respectively.

=== New Object Types

None.

=== New Enum Constants

  * Extending elink:VkStructureType:
  ** ename:VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_SHADING_RATE_IMAGE_STATE_CREATE_INFO_NV
  ** ename:VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADING_RATE_IMAGE_FEATURES_NV
  ** ename:VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADING_RATE_IMAGE_PROPERTIES_NV
  ** ename:VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_COARSE_SAMPLE_ORDER_STATE_CREATE_INFO_NV

  * Extending elink:VkImageLayout:
  ** ename:VK_IMAGE_LAYOUT_SHADING_RATE_OPTIMAL_NV

  * Extending elink:VkDynamicState:
  ** ename:VK_DYNAMIC_STATE_VIEWPORT_SHADING_RATE_PALETTE_NV

  * Extending elink:VkAccessFlagBits:
  ** ename:VK_ACCESS_SHADING_RATE_IMAGE_READ_BIT_NV

  * Extending elink:VkImageUsageFlagBits:
  ** ename:VK_IMAGE_USAGE_SHADING_RATE_IMAGE_BIT_NV

  * Extending elink:VkPipelineStageFlagBits:
  ** ename:VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV

=== New Enums

  * elink:VkShadingRatePaletteEntryNV, containing the following constants:

  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_NO_INVOCATIONS_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_16_INVOCATIONS_PER_PIXEL_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_8_INVOCATIONS_PER_PIXEL_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_4_INVOCATIONS_PER_PIXEL_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_2_INVOCATIONS_PER_PIXEL_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_PIXEL_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X1_PIXELS_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_1X2_PIXELS_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X2_PIXELS_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_4X2_PIXELS_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X4_PIXELS_NV
  ** ename:VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_4X4_PIXELS_NV
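
To make the meaning of these entries concrete, the following sketch decodes
each one into a fragment size and an upper bound on invocations per pixel,
reading both directly from the constant names.
The `Rate` enum, the `RateInfo` struct, and both functions are hypothetical
names local to this example; the enum mirrors the list above in order but
does not claim to match the numeric values in the Vulkan headers.

```c
/* Hypothetical local mirror of the palette entries listed above. */
typedef enum {
    RATE_NO_INVOCATIONS,
    RATE_16_INVOCATIONS_PER_PIXEL,
    RATE_8_INVOCATIONS_PER_PIXEL,
    RATE_4_INVOCATIONS_PER_PIXEL,
    RATE_2_INVOCATIONS_PER_PIXEL,
    RATE_1_INVOCATION_PER_PIXEL,
    RATE_1_INVOCATION_PER_2X1_PIXELS,
    RATE_1_INVOCATION_PER_1X2_PIXELS,
    RATE_1_INVOCATION_PER_2X2_PIXELS,
    RATE_1_INVOCATION_PER_4X2_PIXELS,
    RATE_1_INVOCATION_PER_2X4_PIXELS,
    RATE_1_INVOCATION_PER_4X4_PIXELS
} Rate;

typedef struct {
    int width;        /* fragment width in pixels (0 if nothing is shaded) */
    int height;       /* fragment height in pixels (0 if nothing is shaded) */
    int invocations;  /* upper bound on invocations per fragment pixel */
} RateInfo;

/* Decode an entry into fragment size and invocation count. */
static RateInfo rate_info(Rate rate)
{
    switch (rate) {
    case RATE_16_INVOCATIONS_PER_PIXEL:    return (RateInfo){1, 1, 16};
    case RATE_8_INVOCATIONS_PER_PIXEL:     return (RateInfo){1, 1, 8};
    case RATE_4_INVOCATIONS_PER_PIXEL:     return (RateInfo){1, 1, 4};
    case RATE_2_INVOCATIONS_PER_PIXEL:     return (RateInfo){1, 1, 2};
    case RATE_1_INVOCATION_PER_PIXEL:      return (RateInfo){1, 1, 1};
    case RATE_1_INVOCATION_PER_2X1_PIXELS: return (RateInfo){2, 1, 1};
    case RATE_1_INVOCATION_PER_1X2_PIXELS: return (RateInfo){1, 2, 1};
    case RATE_1_INVOCATION_PER_2X2_PIXELS: return (RateInfo){2, 2, 1};
    case RATE_1_INVOCATION_PER_4X2_PIXELS: return (RateInfo){4, 2, 1};
    case RATE_1_INVOCATION_PER_2X4_PIXELS: return (RateInfo){2, 4, 1};
    case RATE_1_INVOCATION_PER_4X4_PIXELS: return (RateInfo){4, 4, 1};
    case RATE_NO_INVOCATIONS:
    default:                               return (RateInfo){0, 0, 0};
    }
}

/* Upper bound on fragment shader invocations for a fully covered region.
   Assumes the region size is a multiple of the fragment size. */
static int max_invocations_in_region(Rate rate, int region_w, int region_h)
{
    RateInfo info = rate_info(rate);
    if (info.width == 0)
        return 0;  /* no invocations are spawned at all */
    return (region_w / info.width) * (region_h / info.height)
           * info.invocations;
}
```

Under these assumptions, a fully covered 16x16 pixel region needs at most 16
invocations at the 4x4 rate and at most 4096 at the rate of 16 invocations
per pixel, which is the range of cost that a shading rate image can select
between.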

=== New Structures

  * slink:VkShadingRatePaletteNV
  * slink:VkPipelineViewportShadingRateImageStateCreateInfoNV
  * slink:VkPhysicalDeviceShadingRateImageFeaturesNV
  * slink:VkPhysicalDeviceShadingRateImagePropertiesNV

=== New Functions

  * flink:vkCmdBindShadingRateImageNV
  * flink:vkCmdSetViewportShadingRatePaletteNV

=== Issues

(1) When using shading rates that specify "`coarse`" fragments covering
    multiple pixels, we will generate a combined coverage mask that combines
    the coverage masks of all pixels covered by the fragment.
    By default, these masks are combined in an implementation-dependent
    order.
    Should we provide a mechanism allowing applications to query or specify
    an exact order?

*RESOLVED*: Yes, this feature is useful for cases where most of the fragment
shader can be evaluated once for an entire coarse fragment, but where some
per-pixel computations are also required.
For example, a per-pixel alpha test may want to kill all the samples for
some pixels in a coarse fragment.
This sort of test can be implemented using an output sample mask, but such a
shader would need to know which bit in the mask corresponds to each sample
in the coarse fragment.
We are including a mechanism to allow applications to specify the orders of
coverage samples for each shading rate and sample count, either as static
pipeline state or dynamically via a command buffer.
This portion of the extension has its own feature bit.

We will not be providing a query to determine the implementation-dependent
default ordering.
The thinking here is that if an application cares enough about the coarse
fragment sample ordering to perform such a query, it could instead just set
its own order, also using custom per-pixel sample locations if required.
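
The per-pixel alpha test mentioned above can be sketched as follows, once
the application has chosen its own sample order.
The `SampleLocation` struct and both functions are hypothetical names used
only in this example; the sketch assumes that bit `i` of the coarse
fragment's sample mask refers to entry `i` of the application-specified
order, and that a mask holds at most 32 bits.

```c
#include <stdint.h>

/* Hypothetical description of one entry in an application-specified order:
   the pixel within the coarse fragment, and the sample within that pixel. */
typedef struct {
    uint32_t pixel_x;
    uint32_t pixel_y;
    uint32_t sample;
} SampleLocation;

/* Collect the mask bits that belong to pixel (px, py) of the fragment. */
static uint32_t pixel_mask(const SampleLocation *order, uint32_t count,
                           uint32_t px, uint32_t py)
{
    uint32_t mask = 0;
    for (uint32_t i = 0; i < count && i < 32; ++i) {
        if (order[i].pixel_x == px && order[i].pixel_y == py)
            mask |= UINT32_C(1) << i;
    }
    return mask;
}

/* Clear every sample of one pixel from a coverage mask, as a per-pixel
   alpha test that rejects that pixel would do. */
static uint32_t kill_pixel(uint32_t coverage, const SampleLocation *order,
                           uint32_t count, uint32_t px, uint32_t py)
{
    return coverage & ~pixel_mask(order, count, px, py);
}
```

For a 2x1 fragment with two samples per pixel and the order (0,0,s0),
(1,0,s0), (0,0,s1), (1,0,s1), the right-hand pixel owns bits 1 and 3, so
killing it turns a full coverage mask of `0xF` into `0x5`.
Without a known order, the shader could not tell which bits to clear.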

(2) For the pipeline stage
    ename:VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV, should we specify a
    precise location in the pipeline the shading rate image is accessed
    (after geometry shading, but before the early fragment tests) or leave
    it under-specified in case there are other implementations that access
    the image in a different pipeline location?

*RESOLVED*: We are specifying the pipeline stage to be after the final stage
used for vertex processing
(ename:VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT) and before the first stage
used for fragment processing
(ename:VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT), which seems to be the
natural place to access the shading rate image.

(3) How do centroid-sampled variables work with fragments larger than one
    pixel?

*RESOLVED*: For single-pixel fragments, fragment shader inputs decorated
with code:Centroid are sampled at an implementation-dependent location in
the intersection of the area of the primitive being rasterized and the area
of the pixel that corresponds to the fragment.
With multi-pixel fragments, we follow a similar pattern, using the
intersection of the primitive and the *set* of pixels corresponding to the
fragment.

One important thing to keep in mind when using such "`coarse`" shading rates
is that fragment attributes are sampled at the center of the fragment by
default, regardless of the set of pixels/samples covered by the fragment.
For fragments with a size of 4x4 pixels, this center location will be more
than two pixels (1.5 * sqrt(2)) away from the center of the pixels at the
corners of the fragment.
When rendering a primitive that covers only a small part of a coarse
fragment, sampling a color outside the primitive can produce overly bright
or dark color values if the color values have a large gradient.
To deal with this, an application can use centroid sampling on attributes
where "`extrapolation`" artifacts can lead to overly bright or dark pixels.
Note that this same problem also exists for multisampling with single-pixel
fragments, but is less severe because it only affects certain samples of a
pixel and such bright/dark samples may be averaged with other samples that
don't have a similar problem.
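
The distance quoted above for 4x4 fragments can be checked with a few lines
of arithmetic.
The function below is a hypothetical helper written for this example; it
works with squared distances so that no math library is needed, and it
assumes pixel centers sit at half-integer coordinates within the fragment.

```c
/* Squared distance, in pixels, from the center of a w x h fragment to the
   center of one of its corner pixels.
   The fragment center is at (w/2, h/2) and the corner pixel center is at
   (0.5, 0.5), both measured from the fragment's corner. */
static double corner_distance_squared(int w, int h)
{
    double dx = w / 2.0 - 0.5;
    double dy = h / 2.0 - 0.5;
    return dx * dx + dy * dy;
}
```

For a 4x4 fragment both offsets are 1.5 pixels, giving a squared distance of
4.5, which is (1.5 * sqrt(2)) squared and exceeds the squared two-pixel
distance of 4.
For a single-pixel fragment the result is 0, since the fragment center and
the pixel center coincide.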

=== Version History

  * Revision 2, 2018-09-13 (Pat Brown)
    - Miscellaneous edits preparing the specification for publication.

  * Revision 1, 2018-08-08 (Pat Brown)
    - Internal revisions