Wednesday, January 8, 2014

freedreno update: new year edition

Time for another freedreno update.  hw binning support, and fun with gallium HUD.


The big news is that hw binning pass support (for a3xx) is working. This is a pre-pass over all the draws which generates a visibility stream (i.e. basically which vertices apply to which tiles), used to speed up the tile rendering step by filtering out non-visible vertices for a given tile.

tl;dr: games or anything with a healthy vertex load (i.e. not window managers) are showing a 35-45% fps boost.
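
To make the idea of the binning pre-pass a bit more concrete, here is a minimal conceptual sketch in Python. It is not how the a3xx hardware or the freedreno driver implements it: the real visibility stream is produced by the GPU in its own format, whereas this toy version works per triangle, uses a conservative bounding-box test, and stores plain Python lists. The function name `bin_triangles` and its parameters are made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


def bin_triangles(triangles: List[Triangle], width: int, height: int,
                  tile_w: int, tile_h: int) -> List[List[int]]:
    """Return, for each tile (row-major order), the indices of the triangles
    whose screen-space bounding box overlaps that tile.

    Hypothetical illustration only: a bounding-box test is conservative, so a
    triangle may be listed for a tile it does not actually touch.
    """
    tiles_x = (width + tile_w - 1) // tile_w   # round up to cover the edge
    tiles_y = (height + tile_h - 1) // tile_h
    bins: List[List[int]] = [[] for _ in range(tiles_x * tiles_y)]

    for index, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]

        # Triangles entirely off screen are not visible in any tile.
        if max(xs) < 0 or max(ys) < 0 or min(xs) >= width or min(ys) >= height:
            continue

        # Clamp the bounding box to the tile grid.
        tx0 = max(int(min(xs)) // tile_w, 0)
        tx1 = min(int(max(xs)) // tile_w, tiles_x - 1)
        ty0 = max(int(min(ys)) // tile_h, 0)
        ty1 = min(int(max(ys)) // tile_h, tiles_y - 1)

        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[ty * tiles_x + tx].append(index)

    return bins


# A 64x64 render target split into four 32x32 tiles.
example = [
    ((1, 1), (10, 1), (1, 10)),      # top-left tile only
    ((40, 40), (60, 40), (40, 60)),  # bottom-right tile only
    ((20, 5), (50, 5), (20, 20)),    # spans both top tiles
]
visibility = bin_triangles(example, 64, 64, 32, 32)
```

When rendering a given tile, only the triangles listed in that tile's bin need to be processed, which is where the savings come from for scenes with a lot of geometry.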

Currently it is not enabled by default; I'd like it to get more testing first.  For now, use the FD_MESA_DEBUG environment variable to enable it, i.e.:

  FD_MESA_DEBUG=binning supertuxkart

Also, since I was looking for a way to correlate fps with various other statistics (in particular batches per second vs frames per second), I started playing with the gallium performance monitor HUD (heads-up-display).  With the addition of a few driver custom queries, I had what I needed:

The driver custom queries:
  • draw-calls
  • batches - number of batches per second, sum of batches-sysmem plus batches-gmem
  • batches-gmem - number of GMEM batches per second, where a batch is a set of tiles rendered in GMEM: for each tile, an (optional) system mem -> gmem (restore), plus N draws, plus gmem -> system mem (resolve)
  • batches-sysmem - draws to system memory (GMEM bypass) per second
  • restores - number of GMEM batches that required restore per second
So the screenshot above was generated with:

 export GALLIUM_HUD=cpu0+cpu1+cpu2+cpu3,fps+batches-sysmem+batches-gmem+restores,draw-calls
 export FD_MESA_DEBUG=binning
 supertuxkart -s 1280x720 --demo-mode 1

The binning and query support are on mesa master.
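
For anyone wondering how the custom queries listed above relate to each other, here is a rough bookkeeping sketch. It is only a model of the counters as described in this post, not the driver's actual code; the `BatchStats` class and its `flush` method are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class BatchStats:
    """Hypothetical counters mirroring the driver custom queries."""
    draw_calls: int = 0
    batches_sysmem: int = 0
    batches_gmem: int = 0
    restores: int = 0

    @property
    def batches(self) -> int:
        # batches is the sum of batches-sysmem plus batches-gmem
        return self.batches_sysmem + self.batches_gmem

    def flush(self, num_draws: int, use_gmem: bool,
              needs_restore: bool = False) -> None:
        """Account for one batch of num_draws draws."""
        self.draw_calls += num_draws
        if use_gmem:
            self.batches_gmem += 1
            if needs_restore:
                # Only GMEM batches can require a restore.
                self.restores += 1
        else:
            # GMEM bypass: draws go straight to system memory.
            self.batches_sysmem += 1


def per_second(count: int, elapsed_seconds: float) -> float:
    """Turn a raw count over an interval into a per-second rate."""
    return count / elapsed_seconds


stats = BatchStats()
stats.flush(num_draws=10, use_gmem=True, needs_restore=True)
stats.flush(num_draws=5, use_gmem=True)
stats.flush(num_draws=2, use_gmem=False)
```

In this model a high `restores` count relative to `batches_gmem` would mean that most GMEM batches have to pull their previous contents back in from system memory first.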


  1. Nice work. Any chance of these or any other improvements for a2xx GPUs too?
    I have a phone with Adreno 225 which I plan on hacking on but my GPU knowledge is limited.

    1. The query stuff (for gallium HUD) should work on a2xx (although the HUD might be a bit too intrusive for the lower performance a2xx stuff). Binning would need some a2xx work, although I suspect that once someone digs into how binning works on a2xx, there should be a lot of similarities. Ie, it looks to have similar VSC_PIPE registers, so I think the visibility stream stuff probably works in the same way.

      But that all said, I'm not finding too much time to do a2xx specific stuff these days. When I do eventually get some time to spend on a2xx, the kernel is probably first priority, since with kgsl/fbdev we can't pageflip and the way cross-process synchronization works is quite a hack. But things would probably move along quicker on a2xx if someone was sending patches ;-)

  2. First of all, thank you for your great work.

    Does freedreno support the X video extension? Is any video acceleration possible?

    1. Currently, no. Actually at the moment on a3xx we don't accelerate *anything* on the xserver side. When I get some time, I'll either revisit XA or try to get glamor working. Either of which should, I suppose, make Xv possible.

      That all said, Xv is probably less and less useful. Xbmc renders with GL, works quite fine with very small load on the GPU for 720p.. but 1080p is too much for the CPU for sw decode. AFAIU more recent totem is also using GL rendering.

      I haven't tried to get hw video decode working.. that is not really part of the GPU. But from android we have kernel and userspace src code (and dsp firmware blob). I guess it should be a pretty easy project if someone wanted to figure out how to adapt that to work somehow on linux.
