Graphics on Replicant 11

This page documents the current progress and future plans for graphics acceleration on Replicant 11. The original plan can be found at TasksFunding. Since then, through more in-depth research and hands-on experience, the plan has diverged in several places.

The full effort of porting Replicant to Android 11 can be tracked at: PortingToAndroid11.

Background information, as well as details on the software components and acronyms used in this document, can be found at GraphicsResearch.

Graphics stack tasks

status origin short description notes estimated man-hours actual man-hours
done original plan Set up the development environment. Required: i9305 phones, LXC Trisquel container (systemd-nspawn fails because of the old systemd on Trisquel 8), larger SSDs, 1.8V serial-USB adapters (BS101P FT232RL) plus makeshift resistor banks. 24 24
ongoing new Update development environment for Replicant 11. Android 11 has higher resource consumption during builds.
Hacks and workarounds had to be found to be able to build on our machines.
Build servers will be set up to get faster builds.
0 8
ongoing original plan Read graphics related AOSP documentation. Never-ending task that, besides actual documentation, involves scouring through source-code, bug trackers, mailing lists and IRC logs. 16 48
ongoing new Ask for help. Bothering free-software developers1 that have experience with or contribute to graphics sub-systems has been the most fruitful way to clear most roadblocks. 0 0
done original plan Use Mesa's llvmpipe backend instead of softpipe. Merge requests on Mesa: !1402 and !1403. There was no need to update LLVM version. 40 28
todo new Find out why we are getting away without using the libEGL patch. Android 11 no longer needs EGL?
Take a look at frameworks/native/opengl/libs/EGL/Loader.cpp
0 0
ongoing original plan Implement the missing pixel formats in drm/exynos. Joonas created a patch that allows selecting the BGR format for Exynos FIMD through a boot time module parameter. It uses the VIDCON0 register, which can be set to either VIDCON0_PNRMODE_RGB or VIDCON0_PNRMODE_BGR. The two formats cannot co-exist. TODO: use a string in the module parameter and send upstream.
GNUtoo proposed another approach with runtime checks in .atomic_check(...), along the lines of: if (using RGB && asked for BGR) return NOT_POSSIBLE. If application A starts using RGB and application B then asks for BGR, the kernel refuses, as if the BGR format had been removed from the list at runtime. Once no application uses the display engine any more, it is as if the format were re-added to the list. There seems to be no practical use for this extra flexibility, because once booted both Android and GNU/Linux stick to their preferred pixel format. However, at #dri-devel, emersion told us that listing all formats and then failing at the atomic check is preferable upstream.
72 1
todo new Get entire stack to use RGB555 pixel format. Had a huge performance boost on Replicant 6. 0 0
abandoned original plan Proper way to use DRM-Master and DRM-Auth with gbm_gralloc and drm_hwcomposer. DRM-Auth is no longer needed for gbm_gralloc because, in December 2019, DRM_AUTH was dropped from the PRIME_TO/FROM_HANDLE ioctls. drm_hwcomposer and gbm_gralloc can now share the display/KMS node with no need for DRM-Auth. drm_hwcomposer, which uses KMS ioctls, must attach to the node first in order to become DRM Master. gbm_gralloc should attach after it.
Before DRM_AUTH had been dropped, we had tried:
1. Auth hack (both on /dev/dri/card0)
2. vGEM (gbm gralloc on /dev/dri/card1) - gbm gralloc cannot take advantage of exynos hardware planes; memory may not be properly allocated.
3. Allow dumb buffers on render node (gbm gralloc on /dev/dri/renderD128) - Dumb buffers are used for scanout. Should not be created on a render node.
40 8
done new Start gbm_gralloc service after drm_hwcomposer. Android init is quite primitive, but Joonas accomplished this by disabling the gralloc service. It seems that gralloc is later started automatically when needed. 0 2
ongoing new Use hardware planes for better composer performance. Enabling HW planes with drm_hwcomposer was straightforward but led to severe graphics corruption.
Disabling devfreq fixed the corruption. Tentative explanation: display controller frequency gets too low for timely DMA transfer of overlays. Reported upstream. Devfreq on Exynos is known to be a little bit broken.
TODO:
* Lock display controller frequencies through sysfs and re-enable devfreq, or try to remove low freq OPP steps.
* Make sure that drm_hwcomposer is using all 4 available HW planes on exynos (1 primary, 3 overlay). Joonas used dumpsys and added prints to validateDisplay, finding at most 3 planes in use: avail_planes is how many HW planes we have and layers_.size() is how many are in the composition.
* Add support for rotation.
* Debug drm-hwcomposer-intermittent-alpha.mp4 (Dim Layer sent by SurfaceFlinger).
* Enable the cursor plane.
0 17
ongoing new Use Skia instead of HWUI to render the Canvas. Unlike on Replicant 6, none of the usual system props (e.g. ro.kernel.qemu=1, ro.config.avoid_gfx_accel=1) would yield the expected performance.
Got there by forcing hardwareAccelerated=false on all apps.
TODO:
* Turn this dirty hack into a system property that can be toggled on the device tree.
* Test ro.kernel.qemu.gles=0
0 22
ongoing original plan Create test scenarios and check if the graphics stack works as expected. Stock apps work.
Check Tested apps below for the current status with apps that require advanced graphics features.
TODO: compliance tests.
40 14
ongoing original plan Make the graphics stack work with vGEM driver besides drm/exynos. vGEM seems to be the proper DRI node for Mesa's kms_swrast driver.
We are currently using a simple hack that forces kms_swrast to use drm/exynos instead. It should rather use vGEM.
40 4
tentative new Combine kmsro with kms_swrast on vGEM render node? Is kms_swrast working on top of the vGEM render node able to share PRIME buffers with the display node (Exynos)? If not, would adding kmsro to the mix help?
Architectural idea: kmsro + kms_swrast on vGEM render node -> PRIME -> drm_hwcomposer and gbm_gralloc on display node (Exynos)
Advantages:
- no need to copy buffers between kms_swrast and Exynos (PRIME takes care of that);
- can take advantage of HW planes.
0 0
ongoing original plan Document the design decisions. Done on this wiki page plus GraphicsResearch and the presentation at ContributorsMeetingJuly2019. 16 64
ongoing new Try out the Android Go low RAM switches. Check their impact on graphics rendering performance and overall system usability. 0 1
todo new Test gbm_gralloc with camera. So far we've only been testing gbm_gralloc with Lima and Exynos. However, the Gralloc HAL will be used to allocate buffers that will be shared with other devices as well, such as the camera. 0 0
todo new Fix screen recorder. Seems to fail with some EGL issue. 0 1
total sum: 288 240
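
The runtime check proposed for the drm/exynos pixel formats (listing both channel orders and refusing a conflicting one at atomic check time) can be modelled outside the kernel. The following is a minimal sketch, not the actual driver code: the enum, the struct and all function names are hypothetical, and the use of -EINVAL as the refusal code is an assumption based on common kernel convention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two channel orders selectable via VIDCON0. */
enum panel_order { ORDER_RGB, ORDER_BGR };

/* Hypothetical bookkeeping: the order in use and how many users hold it. */
struct fimd_state {
    enum panel_order active;
    unsigned int users;
};

/* Mimics an atomic_check step: accept the request only if no current
 * user holds the opposite channel order. */
int fimd_check_order(const struct fimd_state *st, enum panel_order wanted)
{
    assert(st != NULL);
    if (st->users > 0 && st->active != wanted)
        return -EINVAL; /* the two orders cannot co-exist */
    return 0;
}

/* Record a request that passed the check. */
void fimd_acquire(struct fimd_state *st, enum panel_order wanted)
{
    assert(st != NULL);
    st->active = wanted;
    st->users++;
}

/* Drop one user; once the count reaches zero either order is accepted again. */
void fimd_release(struct fimd_state *st)
{
    assert(st != NULL);
    if (st->users > 0)
        st->users--;
}
```

With this model, a BGR request is refused while an RGB user exists and is accepted again after the last RGB user has released the display, which matches the "removed from the list, then re-added" behaviour described in the notes.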

1 A big thanks to Joonas Kylmälä, Paul Kocialkowski, Denis Carikli, Andrés Domínguez, Mauro Rossi, Erico Nunes, Tomeu Vizoso, Daniel Stone, Emil Velikov, Andrzej Hajda, Marek Szyprowski and LiquidAcid.
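
For the hardware planes TODO about locking the display controller frequency through sysfs, one option is to raise the devfreq floor so that the governor can no longer pick the lowest OPP steps. The sketch below assumes the standard devfreq sysfs layout, where each device directory exposes a min_freq attribute. The function name is hypothetical, and the directory of the relevant Exynos device is not given on this page, so it is left as a parameter.

```c
#include <assert.h>
#include <stdio.h>

/* Write a minimum frequency to the min_freq attribute of a devfreq
 * device directory. freq is expressed in the same unit the device uses
 * in its available_frequencies file. Returns 0 on success, -1 on failure.
 * Hypothetical helper: the directory path is supplied by the caller. */
int devfreq_set_floor(const char *devfreq_dir, unsigned long freq)
{
    char path[512];
    FILE *f;
    int n;

    assert(devfreq_dir != NULL);
    n = snprintf(path, sizeof(path), "%s/min_freq", devfreq_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1; /* path would not fit in the buffer */

    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    if (fprintf(f, "%lu\n", freq) < 0) {
        fclose(f);
        return -1;
    }
    /* fclose flushes the buffered write, so its result matters too. */
    return fclose(f) == 0 ? 0 : -1;
}
```

On a device this would be called with a directory under /sys/class/devfreq/ and one of the values listed in that directory's available_frequencies file. Whether a raised floor is enough to avoid the corruption with devfreq re-enabled still has to be tested.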

SwiftShader tasks

status origin short description notes estimated man-hours actual man-hours
done original plan Find a way to use SwiftShader instead of Mesa. Joonas got there with ranchu composer (from Android Emulator) and the default gralloc, plus a patch to support UDIV/SDIV emulation in the kernel. 40 0
done new Use LLVM as backend instead of SubZero. Found a SwiftShader revision that uses LLVM and is still compatible with Android 9 frameworks/native. No noticeable performance difference. 0 6
done new Do UDIV/SDIV emulation on JIT compiled shader code instead of kernel patch. Avoids performance penalty of interrupt handling. It seems that SwiftShader does not send the processor model (microarchitecture) to LLVM, leaving it without a way to decide whether the processor has hardware division.
Fixed by the update to Replicant 10: SwiftShader now uses LLVM 7 instead of LLVM 3.
0 30
todo new Use drm_hwcomposer instead of ranchu. Advantages: uses hardware planes and DRM nodes instead of direct framebuffer. Joonas was close. 0 0
todo new Use mainline SwiftShader. Brings in a Vulkan software renderer for Replicant. Difficult due to incompatibilities with frameworks/native.
Check if this is fixed with Replicant 11.
0 0
total sum: 40 36
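
To illustrate what UDIV/SDIV emulation in generated code amounts to, here is a minimal sketch of integer division done purely with shifts, compares and subtractions. It is not the code that SwiftShader or LLVM emit (toolchains typically call runtime helpers such as __aeabi_uidiv instead), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 32-bit division by shift-and-subtract (restoring division).
 * The running remainder stays below 2^31 before each shift, so the
 * left shift cannot overflow. */
uint32_t soft_udiv(uint32_t num, uint32_t den)
{
    uint32_t quot = 0;
    uint32_t rem = 0;
    int bit;

    assert(den != 0);
    for (bit = 31; bit >= 0; bit--) {
        /* Bring down the next bit of the numerator. */
        rem = (rem << 1) | ((num >> bit) & 1u);
        if (rem >= den) {
            rem -= den;
            quot |= (uint32_t)1 << bit;
        }
    }
    return quot;
}

/* Signed division built on the unsigned routine, truncating toward zero.
 * Magnitudes are computed in unsigned arithmetic so that INT32_MIN does
 * not overflow; the final casts rely on two's complement conversion. */
int32_t soft_sdiv(int32_t num, int32_t den)
{
    uint32_t un = num < 0 ? 0u - (uint32_t)num : (uint32_t)num;
    uint32_t ud = den < 0 ? 0u - (uint32_t)den : (uint32_t)den;
    uint32_t uq = soft_udiv(un, ud);

    return ((num < 0) != (den < 0)) ? (int32_t)(0u - uq) : (int32_t)uq;
}
```

Running such a routine inline costs a few dozen simple instructions per division, whereas trapping an undefined instruction into the kernel adds exception entry and exit on top of the same work, which is the penalty the task above set out to avoid.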

llvmpipe optimization tasks

status origin short description notes estimated man-hours actual man-hours
todo original plan Set up a testing and benchmarking environment. Profiling: turn on profiling switch on Mesa + simpleperf?
Benchmarks: android-fps-count, 0xBenchmark, GearsES2
Conformance: dEQP, Android CTS, piglit, freedreno/tests-*, glmark2
40 1
todo original plan Disable expensive OpenGL operations. 24 0
todo original plan Recap matrix operations and study ARM NEON. 48 0
todo original plan Profile apps to find the most used GLES operations. 32 0
todo original plan Use Ne10 library or Neon Intrinsics for the most used GLES operations. Optimizations have to be done on LLVM and not on llvmpipe. llvmpipe only outputs LLVM IR. LLVM already has autovectorization for ARM NEON, try it. 80 0
todo original plan Fix bugs, re-write the code where needed, get it stable. 80 0
total sum: 304 1
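
As a starting point for the NEON task, the kind of code an autovectorizer can handle is a tight loop over contiguous floats with no aliasing between input and output. The sketch below is a generic example of such a loop (transforming vertices by a 4x4 matrix), not code taken from llvmpipe or Mesa. The function name and the column-major layout are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Transform count 4-component vectors by a column-major 4x4 matrix.
 * The restrict qualifiers tell the compiler that in and out do not
 * overlap, which removes one common obstacle to vectorization. */
void transform_vec4(const float m[16], const float *restrict in,
                    float *restrict out, size_t count)
{
    size_t i;
    int r;

    assert(m != NULL && in != NULL && out != NULL);
    for (i = 0; i < count; i++) {
        const float *v = in + 4 * i;
        float *o = out + 4 * i;

        /* Each output row is a dot product of one matrix row with v. */
        for (r = 0; r < 4; r++) {
            o[r] = m[r] * v[0] + m[4 + r] * v[1]
                 + m[8 + r] * v[2] + m[12 + r] * v[3];
        }
    }
}
```

Whether a given compiler actually emits NEON instructions for this loop depends on the optimization level and target flags, so the generated assembly should be inspected before drawing conclusions. The same check applies to the LLVM IR that llvmpipe produces.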

Lima driver tasks

status origin short description notes estimated man-hours actual man-hours
done original plan Rebase Lima's Linux kernel DRM driver on top of forkbomb's Midas on Mainline kernel. Done by others. Lima DRM driver was accepted into mainline Linux, which also has forkbomb's patches and is now used on Replicant 11. 80 0
done original plan Replace mainline Mesa with Lima's Mesa (with their driver). Done by others. Lima is now on mainline Mesa. Lima wiki 16 0
done new Lima DRM driver bringup on Exynos. Lima development is done on AllWinner devices.
We expected some issues getting it to work on Exynos, although there were encouraging reports by ChronoMonochrome, hexdump0815 and Viciouss (manifest, xda).
Joonas added Lima to Replicant 10 and faced no major bringup issues.
0 1
done new Fully test proper architecture. drm_hwcomposer and gbm_gralloc on card0 (Exynos) -> PRIME -> Mesa on renderD129 (Lima)
Advantages:
- no need to copy buffers between Lima and Exynos (PRIME takes care of that);
- can take advantage of HW planes.
0 1
done new Fix graphics corruption with hardware planes. Corruption happened when compositing GL planes with non-GL planes. E.g. on Shader Editor, run a shader and open a menu.
Unlike with llvmpipe, disabling devfreq didn't solve it.
It was caused by gbm_gralloc running on Lima's render node, which cannot do contiguous memory allocation.
0 1
todo new Fix video playback. Joonas reported that playing the Big Buck Bunny video fails at the os_get_total_physical_memory call in Mesa, which is made from lima_screen.c. 0 0
todo new Advertise GLES 2. Shader Editor can only detect GLES 1. 0 0
todo original plan Build and test thoroughly with synthetic and real applications. Use conformance tests to figure out the current GLES implementation status. 40 0
abandoned original plan Create a fallback mechanism that uses the software renderer for GLES functions not yet implemented in Lima. There is no sane way to switch between different GLES drivers at the function level. Abandoned in favour of the tasks below. 100 1
done new Lima as SurfaceFlinger backend. This is the default (SurfaceFlinger using the default GLES implementation). No problems found. 0 0
done new Lima as HWUI (SkiaGL) backend. This is the default (SkiaGL using the default GLES implementation). No problems found. 0 0
todo new Lima on a per-app basis. Lima will at most support GLES2. Therefore it may not work with certain apps depending on their GLES usage. We can re-work the per process libagl/llvmpipe patch into a patch that switches between Lima and a software renderer (llvmpipe or SwiftShader). 0 0
total sum: 236 4
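
For the video playback task, it may help to check on the device whether a total physical memory query works at all outside of Mesa. The following is a minimal sketch of such a query on Linux using sysconf. It is an assumption that Mesa's os_get_total_physical_memory behaves similarly on Android, and the function name below is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Query the total physical memory in bytes through sysconf.
 * Returns false when the system does not report usable values. */
bool total_physical_memory(uint64_t *size)
{
    long pages, page_size;

    assert(size != NULL);
    pages = sysconf(_SC_PHYS_PAGES);
    page_size = sysconf(_SC_PAGE_SIZE);
    if (pages <= 0 || page_size <= 0)
        return false;

    /* Widen before multiplying so the product cannot overflow a 32-bit long. */
    *size = (uint64_t)pages * (uint64_t)page_size;
    return true;
}
```

If a small test binary built around this function fails inside the media process environment but works from a shell, the problem is more likely in the process sandbox than in Lima itself. This is only a debugging suggestion, not a confirmed diagnosis.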

2D optimization tasks

status origin short description notes estimated man-hours actual man-hours
todo new Investigate the possibility of using Pixman or Exynos G2D as RenderEngine for SurfaceFlinger. There are interesting reports of people using G2D to hardware-accelerate X11 EXA 0 1
todo new Accelerate Skia with G2D. Rework old patches (Hillenbrand 2013 and raymanfx 2016) to make them work on current Skia. 0 1
total sum: 0 2

Tested apps

app Device Replicant 6 Replicant 11 notes
libagl LLVMpipe Lima LLVMpipe SwiftShader
Fennec F-Droid1 GT-I9300 crashes slow fast Needs GLES 2.0
LibreOffice Viewer2 GT-I9300 crashes slow cannot test
(missing storage)
Red Reader3 GT-I9300 crashes4 usable cannot test
(no network)
Shader Editor5 GT-I9300 crashes 7 fps 30 fps (HW planes off)
40 fps (HW planes on)
Freezes when changing resolution.
FPS measured on default shader
with 1/1 resolution.
Marine Compass6 GT-I9300 bad render bad render crashes Only uses GLES 1.0
Gears7 GT-I9300 crashes no render crashes
GL TRON8 GT-I9300 4 fps 2 fps 23 fps Has a nice FPS counter.
Tor-browser 10.0.59 GT-I9250 Untested
Crash10 [11]
No support for GT-I9250 yet
Tor-browser 10.0.59 GT-I9300 Untested Works10 [12]
a bit slow
No support for network yet
Replica Island GT-I9300 Untested Untested Fast enough Retest with specific versions

1 https://f-droid.org/en/packages/org.mozilla.fennec_fdroid

2 https://f-droid.org/en/packages/org.documentfoundation.libreoffice

3 https://f-droid.org/en/packages/org.quantumbadger.redreader

4 https://github.com/QuantumBadger/RedReader/issues/279

5 https://f-droid.org/en/packages/de.markusfisch.android.shadereditor

6 https://f-droid.org/en/packages/net.pierrox.mcompass

7 https://f-droid.org/en/packages/com.jeffboody.GearsES2eclair

8 https://f-droid.org/en/packages/com.glTron

9 https://dist.torproject.org/torbrowser/10.0.5/tor-browser-10.0.5-android-armv7-multi.apk

10 Tested on replicant-6.0 0004 RC3:

adb root ; adb shell
# setprop ro.kernel.qemu 1
# setprop ro.kernel.qemu.gles 0
# killall surfaceflinger

The device was using the default graphics settings (llvmpipe).

11 It crashed when clicking on the URL bar with the following error:

12-04 00:35:36.364  4302  4322 I Gecko   : Can't find symbol 'eglGetNativeClientBufferANDROID'.
12-04 00:35:36.364  4302  4322 I Gecko   : Can't find symbol 'eglQuerySurfacePointerANGLE'.
12-04 00:35:36.364  4302  4322 I Gecko   : Can't find symbol 'eglDupNativeFenceFDANDROID'.
12-04 00:35:36.364  4302  4322 I Gecko   : Can't find symbol 'eglQueryDisplayAttribEXT'.
12-04 00:35:36.364  4302  4322 I Gecko   : Can't find symbol 'eglQueryDeviceAttribEXT'.
12-04 00:35:36.364  4302  4322 I Gecko   : Can't find symbol 'eglStreamConsumerGLTextureExternalAttribsNV'.
12-04 00:35:36.364  4302  4322 I Gecko   : Can't find symbol 'eglCreateStreamProducerD3DTextureANGLE'.
12-04 00:35:36.365  4302  4322 I Gecko   : Can't find symbol 'eglStreamPostD3DTextureANGLE'.
12-04 00:35:36.365  4302  4322 I Gecko   : Can't find symbol 'eglSwapBuffersWithDamageEXT'.

12 To compare with the GT-I9250, # getprop | grep qemu returns nothing.

Updated by dl lud over 3 years ago · 114 revisions
