Minimize Android GLSurfaceView lag

Great question.

Quick bit of background for anyone else reading this:

The goal here is to minimize the display latency, i.e. the time between when the app renders a frame and when the display panel lights up the pixels. If you’re just throwing content at the screen, it doesn’t matter, because the user can’t tell the difference. If you’re responding to touch input, though, every frame of latency makes your app feel just a bit less responsive.

The problem is similar to A/V sync, where you need audio associated with a frame to come out the speaker as the video frame is being displayed on screen. In that case, the overall latency doesn’t matter so long as it’s consistently equal on both audio and video outputs. A/V sync runs into very similar problems, though, because you’ll lose sync if SurfaceFlinger stalls and your video is consistently being displayed one frame later.

SurfaceFlinger runs at elevated priority, and does relatively little work, so it isn’t likely to miss a beat on its own… but it can happen. Also, it is compositing frames from multiple sources, some of which use fences to signal asynchronous completion. If an on-time video frame is composed with OpenGL output, and the GLES rendering hasn’t completed when the deadline hits, the whole composition will be postponed to the next VSYNC.

The desire to minimize latency was strong enough that the Android KitKat (4.4) release introduced the “DispSync” feature in SurfaceFlinger, which shaves half a frame of latency off the usual two-frame delay. (This is briefly mentioned in the graphics architecture doc, but it’s not in widespread use.)

So that’s the situation. In the past this was less of an issue for video, because 30fps video updates every other frame. Hiccups work themselves out naturally because we’re not trying to keep the queue full. We’re starting to see 48Hz and 60Hz video though, so this matters more.

The question is, how do we detect if the frames we send to SurfaceFlinger are being displayed as soon as possible, or are spending an extra frame waiting behind a buffer we sent previously?

The first part of the answer is: you can’t. There is no status query or callback on SurfaceFlinger that will tell you what its state is. In theory you could query the BufferQueue itself, but that won’t necessarily tell you what you need to know.

The problem with queries and callbacks is that they can’t tell you what the state is, only what the state was. By the time the app receives the information and acts on it, the situation may be completely different. The app will be running at normal priority, so it’s subject to delays.

For A/V sync it’s slightly more complicated, because the app can’t know the display characteristics. For example, some displays have “smart panels” that have memory built into them. (If what’s on the screen doesn’t update often, you can save a lot of power by not having the panel scan the pixels across the memory bus 60x per second.) These can add an additional frame of latency that must be accounted for.

The solution Android is moving toward for A/V sync is to have the app tell SurfaceFlinger when it wants the frame to be displayed. If SurfaceFlinger misses the deadline, it drops the frame. This was added experimentally in 4.4, though it’s not really intended to be used until the next release (it should work well enough in “L preview”, though I don’t know if that includes all of the pieces required to use it fully).

The way an app uses this is to call the eglPresentationTimeANDROID() extension before eglSwapBuffers(). The argument to the function is the desired presentation time, in nanoseconds, using the same timebase as Choreographer (specifically, Linux CLOCK_MONOTONIC). So for each frame, you take the timestamp you got from the Choreographer, add the desired number of frames multiplied by the approximate refresh period (the duration of one frame, which you can derive from the refresh rate reported by the Display object; see MiscUtils#getDisplayRefreshNsec()), and pass it to EGL. When you swap buffers, the desired presentation time is passed along with the buffer.
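To make the arithmetic concrete, here is a minimal sketch in plain Java. The class and method names (PresentationTime, refreshPeriodNs, desiredPresentationNs) are made up for illustration and are not part of Android or Grafika. The Android calls you would wrap around it are shown only in comments, so treat the snippet as a sketch rather than a drop-in implementation.

```java
/** Hypothetical helper: computes the value to pass to eglPresentationTimeANDROID(). */
final class PresentationTime {
    private static final double NANOS_PER_SECOND = 1_000_000_000.0;

    /** Converts a refresh rate in Hz (e.g. from Display.getRefreshRate()) to a frame period in ns. */
    static long refreshPeriodNs(float refreshRateHz) {
        if (refreshRateHz <= 0f) {
            throw new IllegalArgumentException("refresh rate must be positive");
        }
        return Math.round(NANOS_PER_SECOND / refreshRateHz);
    }

    /**
     * frameTimeNs is the Choreographer timestamp (CLOCK_MONOTONIC, nanoseconds).
     * framesAhead is the latency you want to request, in whole frames.
     */
    static long desiredPresentationNs(long frameTimeNs, int framesAhead, long refreshPeriodNs) {
        return frameTimeNs + framesAhead * refreshPeriodNs;
    }
}

// On a device this would typically run inside Choreographer.FrameCallback.doFrame(frameTimeNanos):
//
//   long when = PresentationTime.desiredPresentationNs(frameTimeNanos, 2, periodNs);
//   EGLExt.eglPresentationTimeANDROID(eglDisplay, eglSurface, when);
//   EGL14.eglSwapBuffers(eglDisplay, eglSurface);
```

Passing framesAhead = 0 yields the Choreographer timestamp itself, i.e. a time that is already slightly in the past by the time the buffer is queued.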

Recall that SurfaceFlinger wakes up once per VSYNC, looks at the collection of pending buffers, and delivers a set to the display hardware via Hardware Composer. If you request display at time T, and SurfaceFlinger believes that a frame passed to the display hardware will display at time T-1 or earlier, the frame will be held (and the previous frame re-shown). If the frame will appear at time T, it will be sent to the display. If the frame will appear at time T+1 or later (i.e. it will miss its deadline), and there’s another frame behind it in the queue that is scheduled for a later time (e.g. the frame intended for time T+1), then the frame intended for time T will be dropped.
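The three cases above can be written down as a small decision function. This is only a toy model of the behavior as I’ve described it, not SurfaceFlinger’s actual code; the names (FrameScheduleModel, Action, decide) are invented, times are expressed as VSYNC indices rather than nanoseconds, and the "late but nothing queued behind it" case returning SHOW is my assumption.

```java
/** Toy model of the hold / show / drop decision described above. */
final class FrameScheduleModel {
    enum Action { HOLD, SHOW, DROP }

    /**
     * desiredVsync:     the VSYNC index (T) the app asked for.
     * nextDisplayVsync: the VSYNC index at which a frame latched now would appear.
     * laterFrameQueued: true if another frame with a later desired time is behind this one.
     */
    static Action decide(long desiredVsync, long nextDisplayVsync, boolean laterFrameQueued) {
        if (nextDisplayVsync < desiredVsync) {
            return Action.HOLD;   // too early: keep showing the previous frame
        }
        if (nextDisplayVsync == desiredVsync) {
            return Action.SHOW;   // on time
        }
        // The frame has missed its deadline. Drop it only if something newer can replace it;
        // otherwise (assumption) it is shown late.
        return laterFrameQueued ? Action.DROP : Action.SHOW;
    }
}
```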

The solution doesn’t perfectly suit your problem. For A/V sync, you need constant latency, not minimum latency. If you look at Grafika’s “scheduled swap” activity you can find some code that uses eglPresentationTimeANDROID() in a way similar to what a video player would do. (In its current state it’s little more than a “tone generator” for creating systrace output, but the basic pieces are there.) The strategy there is to render a few frames ahead, so SurfaceFlinger never runs dry, but that’s exactly wrong for your app.

The presentation-time mechanism does, however, provide a way to drop frames rather than letting them back up. If you happen to know that there are two frames of latency between the time reported by Choreographer and the time when your frame can be displayed, you can use this feature to ensure that frames will be dropped rather than queued if they are too far in the past. The Grafika activity allows you to set the frame rate and requested latency, and then view the results in systrace.

It would be helpful for an app to know how many frames of latency SurfaceFlinger actually has, but there isn’t a query for that. (This is somewhat awkward to deal with anyway, as “smart panels” can change modes, thereby changing the display latency; but unless you’re working on A/V sync, all you really care about is minimizing the SurfaceFlinger latency.) It’s reasonably safe to assume two frames on 4.3+. If it’s not two frames, you may have suboptimal performance, but the net effect will be no worse than you would get if you didn’t set the presentation time at all.

You could try setting the desired presentation time equal to the Choreographer timestamp; a timestamp in the recent past means “show ASAP”. This ensures minimum latency, but can backfire on smoothness. SurfaceFlinger has the two-frame delay because it gives everything in the system enough time to get work done. If your workload is uneven, you’ll wobble between single-frame and double-frame latency, and the output will look janky at the transitions. (This was a concern for DispSync, which reduces the total time to 1.5 frames.)

I don’t remember when the eglPresentationTimeANDROID() function was added, but on older releases it should be a no-op.

Bottom line: for ‘L’, and to some extent 4.4, you should be able to get the behavior you want using the EGL extension with two frames of latency. On earlier releases there’s no help from the system. If you want to make sure there isn’t a buffer in your way, you can deliberately drop a frame every so often to let the buffer queue drain.

Update: one way to avoid queueing up frames is to call eglSwapInterval(0). If you were sending output directly to a display, the call would disable synchronization with VSYNC, un-capping the application’s frame rate. When rendering through SurfaceFlinger, this puts the BufferQueue into “async mode”, which causes it to drop frames if they’re submitted faster than the system can display them.
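On a device the call itself is a one-liner, EGL14.eglSwapInterval(eglDisplay, 0), made after the context and surface are current. To illustrate the difference in queueing behavior, here is a toy model in plain Java. ToyBufferQueue and its methods are invented for this example and greatly simplify the real BufferQueue (which has a fixed number of buffers and blocks the producer in the normal mode); it only shows "replace the pending frame" versus "line up behind it".

```java
import java.util.ArrayDeque;

/** Toy model: async mode keeps only the newest pending frame, normal mode keeps them all in order. */
final class ToyBufferQueue {
    private final ArrayDeque<Integer> pending = new ArrayDeque<>();
    private final boolean asyncMode;
    private int dropped = 0;

    ToyBufferQueue(boolean asyncMode) {
        this.asyncMode = asyncMode;
    }

    /** Producer side: called once per eglSwapBuffers() in this model. */
    void queueFrame(int frameId) {
        if (asyncMode && !pending.isEmpty()) {
            dropped += pending.size();  // older, not-yet-displayed frames are discarded
            pending.clear();
        }
        pending.addLast(frameId);
    }

    /** Consumer side: called once per VSYNC; returns null if nothing is pending. */
    Integer acquireFrame() {
        return pending.pollFirst();
    }

    int droppedCount() {
        return dropped;
    }
}
```

If the app queues frames 1, 2, and 3 between two VSYNCs, the async-mode queue hands frame 3 to the consumer and counts two drops, while the normal-mode queue hands over frame 1 and leaves 2 and 3 waiting, which is exactly the extra latency you are trying to avoid.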

Note you’re still triple-buffered: one buffer is being displayed, one is held by SurfaceFlinger to be displayed on the next flip, and one is being drawn into by the application.
