Adreno Hardware Tutorial 3: Tile Based Rendering - Study Notes


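Purpose

Deepen my understanding of TBR, to help with optimization work later on.

Make the content searchable as text instead of scrubbing through the video, so it is easier to index and review afterwards.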


Qualcomm Developer Network:
Tutorial: Tile Based Rendering on Adreno GPUs


Tile-based Rendering - TBR

  • Tile-based rendering (TBR) involves dividing up a “frame” into multiple parts or “tiles” (also called “bins”) and rendering each tile separately

  • It is also called “Indirect Rendering Mode”

We’ll be looking at tile based rendering on the Adreno GPU, which is also sometimes referred to as indirect mode or binning. In this presentation we’ll be using the Adreno Profiler and a sample from the Adreno SDK; the sample is a toon shading sample. The Adreno SDK and the Profiler are both available for free download on the Qualcomm Developer Network website.


Operation

  • Adreno GPUs contain fast local memory (GMEM) that can store Z, stencil and color values
  • The final frame buffer is divided into smaller portions or “tiles” or “bins”
    • Number of bins = Final framebuffer Size / GMEM Size
  • Each “tile” or “bin” is resolved into the final frame buffer one at a time

Adreno GPUs contain fast on-chip memory, or GMEM, that stores Z, stencil and color values. Tile based rendering is done in this fast memory, and the result is then copied to the final frame buffer. To make that possible, the frame buffer is divided up into smaller tiles or bins. This is done to make the GPU more power efficient.

(jave.lin: I had instead been reading GMEM as “Global Memory”.)
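The operation described above (split the frame buffer into tiles, render each tile in fast local memory, then resolve it into the final frame buffer one tile at a time) can be sketched in a few lines of Python. This is only a CPU-side illustration of the idea, not how the Adreno driver or hardware is implemented; `tile_rects`, `render_tiled` and the `shade` callback are hypothetical names made up for this sketch.

```python
def tile_rects(width, height, tile_w, tile_h):
    """Yield (x, y, w, h) rectangles that cover the frame buffer, row by row."""
    for y in range(0, height, tile_h):
        for x in range(0, width, tile_w):
            # Tiles on the right and bottom edges may be smaller.
            yield x, y, min(tile_w, width - x), min(tile_h, height - y)


def render_tiled(width, height, tile_w, tile_h, shade):
    """Render one tile at a time into a small buffer, then resolve it."""
    framebuffer = [[None] * width for _ in range(height)]
    for x0, y0, w, h in tile_rects(width, height, tile_w, tile_h):
        # Stand-in for GMEM: only one tile's worth of pixels lives here.
        local = [[shade(x0 + x, y0 + y) for x in range(w)] for y in range(h)]
        # Resolve: copy the finished tile into the final frame buffer.
        for y in range(h):
            framebuffer[y0 + y][x0:x0 + w] = local[y]
    return framebuffer


if __name__ == "__main__":
    # A 10x7 frame with 4x3 tiles needs a 3x3 grid of tiles.
    print(len(list(tile_rects(10, 7, 4, 3))))
```

The point of the sketch is that the working buffer never holds more than one tile, while the final image is identical to rendering the whole frame at once.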


The larger the GMEM on the device, the fewer tiles or bins you need. For example, on a Snapdragon 805 my toon shading sample created 15 bins, whereas on a mid-range Snapdragon device it created 51. I will now run the sample on the device.
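As a rough worked example of the formula "Number of bins = final framebuffer size / GMEM size", the sketch below rounds the quotient up to a whole number of bins. The resolution, the per-pixel byte sizes and the GMEM sizes are hypothetical values chosen for illustration; they are not the figures for the Snapdragon 805 or any other real device, and a real driver also applies its own bin-dimension and alignment rules, so actual bin counts will differ.

```python
import math


def framebuffer_bytes(width, height, color_bytes=4, depth_stencil_bytes=4):
    """Bytes needed per frame, assuming 32-bit color and 32-bit depth/stencil."""
    return width * height * (color_bytes + depth_stencil_bytes)


def bin_count(width, height, gmem_bytes):
    """Frame buffer size divided by GMEM size, rounded up to whole bins."""
    return math.ceil(framebuffer_bytes(width, height) / gmem_bytes)


if __name__ == "__main__":
    # Hypothetical GMEM sizes: halving GMEM roughly doubles the bin count.
    for gmem_kib in (1024, 512):
        print(gmem_kib, "KiB ->", bin_count(1920, 1080, gmem_kib * 1024), "bins")
```

Under these assumptions a 1920x1080 target needs 16 bins with 1 MiB of GMEM and 32 bins with 512 KiB, which matches the trend described above: less GMEM means more bins.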


I will demonstrate using the Adreno Profiler and one of the Adreno SDK samples. This sample is a simple toon shader, and by default it renders in tile based mode.


I will capture the sample in the profiler. Notice the upper part of the profiler screen after I click the bins button, where a snapshot of the sample is seen divided up into bins or tiles. Notice the lower part of the profiler screen, where my GPU busy metrics and system memory stalls metrics have very small green bars.


I will now run another version of the same sample, except with tile based rendering disabled. Notice the upper part of the profiler screen after I click the bins button, where a screenshot of the sample is seen with no bins or tiles.


Notice the lower part of the profiler screen, where my GPU busy metrics and system memory stalls metrics have longer, red bars compared with what we saw in the previous capture.



Why Tile Based?

  • Reduces data traffic to system memory

  • This helps to minimize power consumption

  • Reduces cost of transparency / anti-aliasing because it happens in high speed GMEM

For this particular sample, tile based rendering provided a net benefit while also improving power usage of the CPU. It also reduces data traffic to system memory, and it reduces the cost of transparency and anti-aliasing because they happen in high-speed GMEM.
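To make the transparency point concrete, here is a deliberately simplified counting model. It assumes that in direct mode every blended layer reads the destination pixel from system memory and writes it back, while in tiled mode those reads and writes stay in GMEM and each pixel reaches system memory only once, at resolve time. The function name and all numbers are hypothetical, and the model ignores the geometry traffic of the binning pass, caches, compression and everything else a real GPU does, so treat it as an intuition aid rather than a measurement.

```python
def system_memory_accesses(pixels, blended_layers, tiled):
    """Toy count of frame buffer accesses that reach system memory."""
    if tiled:
        # Blending reads and writes stay in on-chip memory; only the
        # final resolve writes each pixel to system memory once.
        return pixels
    # Direct mode: every blended layer reads the destination pixel
    # from system memory and writes the result back.
    return pixels * blended_layers * 2


if __name__ == "__main__":
    pixels = 1280 * 720  # hypothetical render target
    for layers in (1, 4):
        direct = system_memory_accesses(pixels, layers, tiled=False)
        tiled = system_memory_accesses(pixels, layers, tiled=True)
        print(f"{layers} layer(s): direct={direct}, tiled={tiled}")
```

In this model the tiled cost stays flat as more transparent layers are added, while the direct cost grows with every layer.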


To learn about the Adreno GPU SDK, visit: http://developer.qualcomm.com/adreno

To learn about the Adreno GPU Profiler, visit:
https://developer.qualcomm.com/adreno… (this page is now missing: page not found)


Reposted from blog.csdn.net/linjf520/article/details/126272964