The magic of Android GraphicBuffer: direct texture

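I have been studying Android GraphicBuffer for a while now. So what are its actual advantages?
I searched online and am reposting two good articles that address this question.
The repost "EGL Extensions in Android" describes it as follows:
Uploading a texture in OpenGL ES (glTexImage2D(), glTexSubImage2D()) is extremely time-consuming: at a 1080×1920 screen size, uploading one full-screen texture takes 20 to 60 ms. At that cost SurfaceFlinger could not possibly run at 60 fps. Android therefore uses the image native buffer, which lets a graphic buffer be used directly as a texture (a direct texture).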

[Repost] using direct textures on android
16 dec 2011
I’ve been working at Mozilla on Firefox Mobile for a few months now. One of the goals of the new native UI is to have liquid smooth scrolling and panning at all times. Unsurprisingly, we do this by drawing into an OpenGL texture and moving it around on the screen. This is pretty fast until you run out of content in the texture and need to update it. Gecko runs in a separate thread and can draw to a buffer there without blocking us, but uploading that data into the texture is where problems arise. Right now we use just one very large texture (usually 2048x2048), and glTexSubImage2D can take anywhere from 25ms to 60ms. Given that our target is 60fps, we have about 16ms to draw a frame. This means we’re guaranteed to miss at least one frame every time we upload, but likely more than that. What we need is a way of uploading texture data asynchronously (and preferably quicker). This is where direct textures can help.
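The frame-budget arithmetic above can be sketched in a few lines. This is a simplified model, not anything from Firefox: the function names are made up for illustration, and it assumes the render thread is blocked for the entire upload and that only the last frame slot touched by the stall still produces a frame.

```cpp
#include <cassert>
#include <cmath>

// Time available per frame, in milliseconds, at a given refresh rate.
double frameBudgetMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Frames dropped when the render thread is blocked for `blockedMs`
// (hypothetical model): the stall covers ceil(blockedMs / budget) frame
// slots, and only the last of them still produces a frame.
int missedFrames(double blockedMs, double fps) {
    const double slots = std::ceil(blockedMs / frameBudgetMs(fps));
    return slots > 1.0 ? static_cast<int>(slots) - 1 : 0;
}
```

Under this model, at 60 fps a 25 ms upload drops one frame and a 60 ms upload drops three, which matches the "at least one frame, but likely more" estimate.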

If you haven’t read Dianne Hackborn’s recent posts on the Android graphics stack, you’re missing out (part 1, part 2). The window compositing system she describes (called SurfaceFlinger) is particularly interesting because it is close to the problem we have in Firefox. One of the pieces Android uses to draw windows is the gralloc module. As you may have guessed, gralloc is short for ‘graphics alloc’. You can see the short and simple API for it here. Android has a wrapper class that encapsulates access to this called GraphicBuffer. It has an even nicer API, found here. Usage is very straightforward. Simply create the GraphicBuffer with whatever size and pixel format you need, lock it, write your bits, and unlock. One of the major wins here is that you can use the GraphicBuffer instance from any thread. So not only does this reduce a copy of your image, but it also means you can upload it without blocking the rendering loop!

To get it on the screen using OpenGL, you can create an EGLImageKHR from the GraphicBuffer and bind it to a texture:

#define EGL_NATIVE_BUFFER_ANDROID 0x3140
#define EGL_IMAGE_PRESERVED_KHR   0x30D2

GraphicBuffer* buffer = new GraphicBuffer(1024, 1024, PIXEL_FORMAT_RGB_565,
                                          GraphicBuffer::USAGE_SW_WRITE_OFTEN |
                                          GraphicBuffer::USAGE_HW_TEXTURE);

unsigned char* bits = NULL;
buffer->lock(GraphicBuffer::USAGE_SW_WRITE_OFTEN, (void**)&bits);

// Write bitmap data into 'bits' here

buffer->unlock();

// Create the EGLImageKHR from the native buffer
EGLint eglImgAttrs[] = { EGL_IMAGE_PRESERVED_KHR, EGL_TRUE, EGL_NONE, EGL_NONE };
EGLImageKHR img = eglCreateImageKHR(eglGetDisplay(EGL_DEFAULT_DISPLAY), EGL_NO_CONTEXT,
                                    EGL_NATIVE_BUFFER_ANDROID,
                                    (EGLClientBuffer)buffer->getNativeBuffer(),
                                    eglImgAttrs);

// Create GL texture, bind to GL_TEXTURE_2D, etc.

// Attach the EGLImage to whatever texture is bound to GL_TEXTURE_2D
glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, img);

The resulting texture can be used as a regular one, with one caveat. Whenever you manipulate pixel data, the changes will be reflected on the screen immediately after unlock. You probably want to double buffer in order to avoid problems here.
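The double-buffering idea can be sketched without any Android types. In the sketch below, `Pixels` and `DoubleBuffer` are hypothetical stand-ins: a real implementation would hold two GraphicBuffer instances (each with its own EGLImage) and re-attach the texture on swap, which is not shown here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the memory of a locked GraphicBuffer (hypothetical type).
using Pixels = std::vector<unsigned char>;

class DoubleBuffer {
public:
    explicit DoubleBuffer(std::size_t bytes)
        : buffers_{Pixels(bytes, 0), Pixels(bytes, 0)} {}

    // Buffer currently attached to the on-screen texture; never written to.
    const Pixels& front() const { return buffers_[front_]; }

    // Buffer the producer may write to without affecting the screen.
    Pixels& back() { return buffers_[1 - front_]; }

    // Called once the back buffer is complete: it becomes the visible one.
    void swap() { front_ = 1 - front_; }

private:
    std::array<Pixels, 2> buffers_;
    std::size_t front_ = 0;
};
```

Writes go to `back()` while `front()` stays untouched, so a half-written image never reaches the screen; `swap()` is the only point at which new pixels become visible.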

If you’ve ever used the Android NDK, it won’t be surprising that GraphicBuffer (or anything similar) doesn’t exist there. In order to use any of this in your app you’ll need to resort to dlopen hacks. It’s a pretty depressing situation. Google uses this all over the OS, but doesn’t seem to think that apps need a high performance API. But wait, it gets worse. Even after jumping through these hoops, some gralloc drivers don’t allow regular apps to play ball. So far, testing indicates that this is the case on Adreno and Mali GPUs. Thankfully, PowerVR and Tegra allow it, which covers a fair number of devices.

With any luck, I’ll land the patches that use this in Firefox Mobile today. The result should be a much smoother panning and zooming experience on devices where gralloc is allowed to work.

[Repost] EGL Extensions in Android

Original: http://tangzm.com/blog/?p=167

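Google has made a number of extensions to EGL in Android so that the whole display and rendering software stack runs more efficiently. While analyzing and modifying the SurfaceFlinger code, we often run into code related to these EGL extensions, such as the Android native fence, KHR image, and so on. Skipping them does not greatly hurt one's understanding of SurfaceFlinger itself, but tripping over these "pebbles" every time you read the code is never comfortable. So I spent a little time digging up some Khronos documents and, together with the SurfaceFlinger source code, gained a basic understanding of these EGL extensions.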

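The BufferQueue producer/consumer model

Before discussing the extensions, let us briefly review how Android's BufferQueue works.

In Android (after 3.0), from applications at the top down to SurfaceFlinger at the bottom, all drawing is done with OpenGL ES. For every drawer (producer, content generator), the steps are roughly the same.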

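(1) Obtain a Surface (usually through SurfaceControl).

(2) With this Surface as the argument, obtain the EGL draw surface and read surface. Usually these two are the same object.

(3) Choose an EGL config, then call eglMakeCurrent() to set up the current drawing context.

(4) Start drawing content, usually with glDrawArrays() or glDrawElements().

(5) Call eglSwapBuffers() to swap the back buffer and the front buffer.

(6) Go back to step 4 and draw the next frame.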

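As we know, the end result of all drawing is just a block of pixel memory holding RGB or YUV values; this memory is the backend of the Surface. To let content producers and consumers work in parallel, Android backs every Surface with triple buffering (MTK uses quadruple buffering). That way, drawing does not disturb what is currently on screen, and after finishing one frame the drawer can usually obtain a fresh buffer for the next frame immediately, without waiting.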

[Figure: BufferQueue buffer states]
In the figure, each buffer in the BufferQueue is marked with one of four states:

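(1) FREE: the buffer is not in use and holds no content.

(2) DEQUEUED: the buffer is being drawn into by the content producer.

(3) QUEUED: the buffer has been drawn and placed in the BufferQueue, waiting to be displayed (or processed further).

(4) ACQUIRED: the buffer has been taken by the consumer for display (or further processing).

The state transitions always follow FREE => DEQUEUED => QUEUED => ACQUIRED => FREE.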

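When the producer calls eglMakeCurrent(), it finds a FREE item in the BufferQueue, marks it DEQUEUED, and makes it the current drawing target. When drawing is done and the producer calls eglSwapBuffers(), the current DEQUEUED buffer is set to QUEUED, marking it as ready to be displayed (or processed further); at the same time another FREE buffer item is looked up in the BufferQueue and set to DEQUEUED so that drawing can continue.

At the other end, the BufferQueue consumer (usually SurfaceFlinger, but possibly some ImageProcessor) looks for an item marked QUEUED in the BufferQueue and displays it or processes it further. The relationships among the producer and consumer classes are shown in the figure below: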

[Figure: producer and consumer class relationships]
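The four-state cycle described above can be modeled as a small state machine. This is only a single-threaded toy: `TinyBufferQueue` is a hypothetical class, not the real BufferQueue, and it ignores fences, Binder, and locking. The method names merely mirror the operations mentioned in the text.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>

enum class BufferState { FREE, DEQUEUED, QUEUED, ACQUIRED };

class TinyBufferQueue {
public:
    static constexpr int kSlots = 3;    // triple buffering, as described above
    static constexpr int kNoSlot = -1;

    // Producer side: FREE -> DEQUEUED. Returns kNoSlot if nothing is free.
    int dequeueBuffer() {
        for (int i = 0; i < kSlots; ++i) {
            if (states_[i] == BufferState::FREE) {
                states_[i] = BufferState::DEQUEUED;
                return i;
            }
        }
        return kNoSlot;
    }

    // Producer side: DEQUEUED -> QUEUED. Returns false on an illegal transition.
    bool queueBuffer(int slot) {
        if (!inState(slot, BufferState::DEQUEUED)) return false;
        states_[slot] = BufferState::QUEUED;
        pending_.push_back(slot);
        return true;
    }

    // Consumer side: oldest QUEUED -> ACQUIRED. Returns kNoSlot if none pending.
    int acquireBuffer() {
        if (pending_.empty()) return kNoSlot;
        const int slot = pending_.front();
        pending_.pop_front();
        states_[slot] = BufferState::ACQUIRED;
        return slot;
    }

    // Consumer side: ACQUIRED -> FREE. Returns false on an illegal transition.
    bool releaseBuffer(int slot) {
        if (!inState(slot, BufferState::ACQUIRED)) return false;
        states_[slot] = BufferState::FREE;
        return true;
    }

    BufferState state(int slot) const { return states_.at(slot); }

private:
    bool inState(int slot, BufferState s) const {
        return slot >= 0 && slot < kSlots && states_[slot] == s;
    }

    std::array<BufferState, kSlots> states_{};  // value-initialized: all FREE
    std::deque<int> pending_;                   // QUEUED slots in FIFO order
};
```

With three slots, the producer can dequeue a second buffer while the first is still QUEUED or ACQUIRED, which is what lets it start the next frame without waiting for the consumer.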

EGL_ANDROID_image_native_buffer

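Having covered the BufferQueue mechanism, we can get to the main topic. The first extension Android introduced is EGL_ANDROID_image_native_buffer. Uploading a texture in OpenGL ES (glTexImage2D(), glTexSubImage2D()) is extremely time-consuming: at a 1080×1920 screen size, uploading one full-screen texture takes 20 to 60 ms. At that cost SurfaceFlinger could not possibly run at 60 fps. Android therefore uses the image native buffer, which lets a graphic buffer be used directly as a texture (a direct texture).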

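Using the image native buffer on Android is quite convenient:

First, create an EGLImageKHR from the GraphicBuffer.

Then bind that EGL image to a texture with glEGLImageTargetTexture2DOES().

If the texture needs to be referenced in OpenGL ES, the sampler type in the GLSL must be changed from sampler2D to samplerExternalOES.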
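A minimal sketch of what such a shader might look like, held as a C++ string ready to pass to glShaderSource(). The uniform and varying names are made up for illustration. To my understanding, samplerExternalOES comes from the GL_OES_EGL_image_external extension and applies to textures bound to the GL_TEXTURE_EXTERNAL_OES target, so the `#extension` line is required.

```cpp
#include <cassert>
#include <string>

// GLSL ES 1.00 fragment shader sampling from an external (EGLImage-backed)
// texture. uTexture and vTexCoord are hypothetical names.
const char* const kExternalFragmentShader = R"glsl(
#extension GL_OES_EGL_image_external : require
precision mediump float;
uniform samplerExternalOES uTexture;
varying vec2 vTexCoord;
void main() {
    gl_FragColor = texture2D(uTexture, vTexCoord);
}
)glsl";
```

Apart from the extension directive and the sampler type, the shader is identical to one that samples an ordinary 2D texture.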

EGL_ANDROID_native_fence_sync

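Before discussing the Android native fence sync, let us consider how the CPU and GPU synchronize and coordinate. As we know, OpenGL ES API calls are really a series of commands; when the user calls these APIs, the commands are buffered by the OpenGL library (or perhaps by the driver underneath). Calling glFlush() manually forces all commands in the current context to be sent to the GPU for execution, but glFlush() returns before those commands have finished. glFinish() sends all commands to the GPU and returns only after they have all finished executing.

glFinish() looks like a workable way to synchronize CPU and GPU work. But its drawback is obvious: while the GPU is working, the calling thread just sits waiting, and if no other thread has work to do (to be scheduled on the CPU), that CPU time is simply wasted. In addition, if some other thread is the one waiting for the GPU task to finish, it has to be notified manually through a Condition after glFinish(). So Khronos added EGL_KHR_fence_sync (fence sync objects) on top of this. eglCreateSyncKHR() creates a fence object of type EGLSyncKHR at the current position in the current GL context (after all previously issued commands). The current thread, or any other thread, can then wait on this fence sync object with eglClientWaitSyncKHR(), just like waiting on a Condition. As with a Condition, the wait function also accepts a timeout parameter that specifies when to give up. The figure below shows how an EGL fence synchronizes the GPU with multiple threads.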

[Figure: an EGL fence synchronizing the GPU and multiple threads]
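The comparison with a Condition can be made concrete with a CPU-only analogue. `FakeFence` and `submitFakeGpuWork` below are hypothetical: no EGL is involved, a sleeping thread plays the GPU, and the `wait()` method only imitates the signaled-or-timeout behavior described for eglClientWaitSyncKHR().

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical CPU-only stand-in for an EGLSyncKHR fence object.
class FakeFence {
public:
    // Called by the "GPU" once every command issued before the fence is done.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until the fence is signaled or the timeout expires.
    // Returns true if the fence was signaled, false on timeout.
    bool wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return signaled_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Runs a fake GPU job on another thread and signals `fence` when it finishes.
std::thread submitFakeGpuWork(FakeFence& fence, std::chrono::milliseconds duration) {
    return std::thread([&fence, duration] {
        std::this_thread::sleep_for(duration);
        fence.signal();
    });
}
```

Unlike glFinish(), the submitting thread is free to do other work after `submitFakeGpuWork()` returns, and any number of threads can call `wait()` on the same fence.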
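The Android native fence goes one step beyond the KHR fence: you can obtain an fd (file descriptor) from a fence object, or create a sync object from an fd. With this, Android extends the scope of synchronization from multiple threads to multiple processes. This matters enormously for Android, because the producer and consumer of a BufferQueue are usually not in the same process; with the Android native fence, buffer usage can be synchronized across processes.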
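Why a file descriptor helps can be illustrated with an ordinary POSIX pipe standing in for the fence fd. This is only an analogy under stated assumptions: a real native fence fd is produced by the graphics driver, not by pipe(), and all names below are hypothetical. The point is that an fd can be handed to another process and waited on with poll(), timeout included.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Stand-in for a native fence: a pipe whose read end becomes readable when
// the writer "signals". A real fence fd comes from the driver, not pipe().
struct FdFence {
    int waitFd = -1;    // the end that would be handed to the other process
    int signalFd = -1;  // stays with whoever completes the work
};

bool createFdFence(FdFence& out) {
    int fds[2];
    if (pipe(fds) != 0) return false;
    out.waitFd = fds[0];
    out.signalFd = fds[1];
    return true;
}

// Marks the fence as signaled by making waitFd readable.
bool signalFdFence(const FdFence& fence) {
    const char byte = 1;
    return write(fence.signalFd, &byte, 1) == 1;
}

// Waits up to timeoutMs for the fence; true if signaled, false on timeout or error.
bool waitFdFence(int waitFd, int timeoutMs) {
    pollfd pfd{};
    pfd.fd = waitFd;
    pfd.events = POLLIN;
    return poll(&pfd, 1, timeoutMs) == 1 && (pfd.revents & POLLIN) != 0;
}

void destroyFdFence(FdFence& fence) {
    close(fence.waitFd);
    close(fence.signalFd);
    fence.waitFd = fence.signalFd = -1;
}
```

Because the byte is never read back, the pipe stays readable once signaled, so late waiters return immediately, much as a signaled fence stays signaled.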

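Someone may ask: why go to all that trouble? Don't BufferQueue items have states? Couldn't we just use the state to tell how a graphic buffer is being used? In fact, to let the CPU and GPU work in parallel better, Android does not wait for the operations on the graphic buffer (on the GPU side) to finish completely before setting the BufferQueue item's state. That is, when the producer calls eglSwapBuffers(), the item states are rewritten immediately (DEQUEUED => QUEUED, FREE => DEQUEUED) rather than after a glFinish() that waits for all operations to end. This lets both producer and consumer move on to their next step as soon as possible. But it raises a question: how does the consumer know that the graphic buffer has "really" been written, and how does the producer know that the consumer has really finished with the graphic buffer and released it? The answer, of course, is the Android native fence.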

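On one side, the producer inserts an acquire fence after eglSwapBuffers(); by waiting on this fence the consumer knows the graphic buffer has really been written, and then consumes it (displays it, and so on). On the other side, when the consumer has finished processing it inserts a release fence; the producer waits on this fence in dequeueBuffer(), and only then really enters a state in which it can draw.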
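The two-fence handshake can be simulated on the CPU with futures playing the role of fences. Everything here is hypothetical (`Slot`, `queueFrame`, `consumeFrame`, `waitUntilReusable`): a background thread stands in for the GPU, and an `int` stands in for the graphic buffer.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <utility>

// One buffer slot with the two fences described above (hypothetical model).
struct Slot {
    int pixels = 0;                         // stand-in for the graphic buffer contents
    std::shared_future<void> acquireFence;  // signals when the producer's writes land
    std::shared_future<void> releaseFence;  // signals when the consumer is done reading
};

// Producer: hand the slot over right away; the "GPU" finishes the write later.
std::thread queueFrame(Slot& slot, int value) {
    std::promise<void> written;
    slot.acquireFence = written.get_future().share();
    return std::thread([&slot, value, written = std::move(written)]() mutable {
        std::this_thread::sleep_for(std::chrono::milliseconds(20));  // still rendering
        slot.pixels = value;
        written.set_value();  // acquire fence signals
    });
}

// Consumer: wait on the acquire fence, read the buffer, then signal release.
int consumeFrame(Slot& slot) {
    std::promise<void> released;
    slot.releaseFence = released.get_future().share();
    slot.acquireFence.wait();  // the buffer has now really been written
    const int seen = slot.pixels;
    released.set_value();      // release fence signals
    return seen;
}

// Producer again: only redraw into the slot after the release fence signals.
void waitUntilReusable(const Slot& slot) {
    slot.releaseFence.wait();
}
```

Note that `queueFrame()` returns before the pixels are written, mirroring how the slot state changes immediately at eglSwapBuffers(); the fences, not the state, carry the "really done" information.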

Let us look in the code at exactly where these fences are created and waited on.

Acquire Fence

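The EGL driver (library) passes in the acquire fence when calling queueBuffer (see code).
Android calls through Binder into BufferQueue::queueBuffer() in SurfaceFlinger, passing the fence as an argument. BufferQueue records the fence in the slot (see code).
When the SurfaceTexture (that is, the ConsumerBase) calls acquireBuffer, the acquire fence is stored in the slot (see code).
If this Surface is going to be displayed by SurfaceFlinger, there are two cases. If composition is done with GLES, Layer::onDraw() waits on the SurfaceTexture's fence (see code). If composition is done by HWC, SurfaceFlinger passes the fence to HWC (see code). Although we cannot see the HWC implementation, a correctly implemented HWC waits internally for all fences to be signaled before compositing.
If this Surface is used as a texture and drawn in another GL context, SurfaceTexture cannot know when the texture will really be used, so in SurfaceTexture::updateTexImage() Android waits for the fence to be triggered before completing the update (see code).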

Release Fence

Conversely, the release fence tells the producer when the consumer has finished using the graphic buffer.

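If the Surface is used by SurfaceFlinger for layer composition and display, SurfaceFlinger sets a release fence for each Surface in SurfaceFlinger::postComposition(). This release fence is generated by HWC or by the framebuffer's GLES path; both HWC and the FB GLES path should ensure that the release fence is generated at the appropriate point.
If the Surface is used as a texture, the graphic buffer meets one of two fates: either the old graphic buffer is replaced by a new one in SurfaceTexture::updateTexImage(), or the graphic buffer is detached from the current GL context in SurfaceTexture::detachFromContext(). In either case SurfaceTexture calls syncForReleaseLocked to insert a release fence.
Finally, in BufferQueue::dequeueBuffer(), BufferQueue uses eglClientWaitSyncKHR() to wait for the release fence to be signaled; only after that can the producer really dequeue a graphic buffer and start drawing.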

EGL_ANDROID_framebuffer_target

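Besides HWC, SurfaceFlinger also composites Surfaces with GLES (the framebuffer path). To distinguish an ordinary EGL context (whose rendering result is used as HWC input or as a GLES texture) from an EGL context dedicated to the framebuffer (whose rendering result is output to the framebuffer), Android uses the EGL_FRAMEBUFFER_TARGET_ANDROID config attribute to mark an EGL context that needs a framebuffer. For details, see the Khronos specification.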

EGL_ANDROID_recordable

See the Khronos specification.

EGL_ANDROID_blob_cache

See the Khronos specification.