h1. Graphics on Replicant 10

This page documents the current progress and future plans for graphics acceleration on Replicant 10. The original plan can be found at [[TasksFunding#Graphics-acceleration]]. Since then, through more in-depth research and hands-on experience, several things have diverged from it.

The full effort of porting Replicant to Android 10 can be tracked at [[PortingToAndroid10]].

Background information, as well as details on the software components and acronyms used in this document, can be found at [[GraphicsResearch]].

h2. Graphics stack tasks

|_. status |_. origin |_. short description |_. notes |_. estimated man-hours |_. actual man-hours |
| done | original plan | Set up the development environment. | Required: i9305 phones, LXC Trisquel container (systemd-nspawn fails due to old systemd on Trisquel 8), larger SSDs, 1.8V serial-USB adapters (BS101P FT232RL) plus makeshift resistor banks. |>. 24 |>. 24 |
| ongoing | new | Update development environment for Replicant 10. | Android 10 has higher resource consumption during builds. "Hacks and workarounds":https://redmine.replicant.us/projects/replicant/wiki/PortingToAndroid10#Java-heap-space had to be found to be able to build on our machines. Build servers will be set up to get faster builds. |>. 0 |>. 8 |
| ongoing | original plan | Read graphics related AOSP documentation. | Never-ending task that, besides actual documentation, involves scouring through source code, bug trackers, mailing lists and IRC logs. |>. 16 |>. 48 |
| ongoing | new | Ask for help. | Bothering free-software developers[1] that have experience with or contribute to graphics sub-systems has been the most fruitful way to clear most roadblocks. |>. 0 |>. 0 |
| done | original plan | Use Mesa's llvmpipe backend instead of softpipe. | Merge requests on Mesa: "!1402":https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1402 and "!1403":https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1403. 
There was no need to update the LLVM version. |>. 40 |>. 28 |
| todo | new | Find out why we are getting away without using the "libEGL patch.":https://patchwork.freedesktop.org/patch/131741/?series=17611&rev=1 | Android 10 no longer needs EGL? Take a look at @frameworks/native/opengl/libs/EGL/Loader.cpp@ |>. 0 |>. 0 |
| ongoing | original plan | Implement the missing pixel formats in drm/exynos. | Joonas created "a patch":https://git.replicant.us/replicant-next/kernel_replicant_linux/commit/?h=replicant-10&id=5af868ab366794c70c60be395c6d0a16831d2689 that allows selecting the BGR format for Exynos FIMD through a "boot time module parameter":https://git.replicant.us/replicant-next/kernel_replicant_linux/commit/?h=replicant-10&id=248962150104df3520fbb532b98c65ceb8efd7dc. It uses the VIDCON0 register, which can be set to either VIDCON0_PNRMODE_RGB or VIDCON0_PNRMODE_BGR. The two formats cannot co-exist. TODO: use a string in the module parameter and send upstream.
GNUtoo proposed another approach with runtime checks like @.atomic_check(...) { if (using RGB && asked for BGR) return NOT_POSSIBLE; }@ If application A starts using RGB and application B then asks for BGR, the kernel refuses, as if the BGR format had been removed from "the list":https://git.replicant.us/replicant-next/kernel_replicant_linux/commit/?h=replicant-10&id=6f59aa92223f0f2df2ded4e60acb8c2365b0e76f at runtime. Once no application is using the display engine any more, it is as if the format were re-added to the list. There seems to be no practical use for this extra flexibility, as once booted both Android and GNU/Linux will stick to their preferred pixel format. However, at #dri-devel, emersion told us that "listing all formats and then failing at the atomic check is preferable upstream":https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=emersion%3Bdllud%3BGNUtoo&date=2020-11-17. |>. 72 |>. 1 |
| todo | new | Get the entire stack to use the RGB555 pixel format. 
| This gave a huge performance boost on Replicant 6. |>. 0 |>. 0 |
| abandoned | original plan | Proper way to use DRM-Master and DRM-Auth with gbm_gralloc and drm_hwcomposer. | DRM-Auth is no longer needed for gbm_gralloc because, in December 2019, "DRM_AUTH was dropped from PRIME_TO/FROM_HANDLE ioctls":https://git.replicant.us/replicant-next/kernel_replicant_linux/commit/?h=replicant-10&id=30a958526d2cc6df38347336a602479d048d92e7. drm_hwcomposer and gbm_gralloc can now share the display/kms node with no need for DRM Auth. drm_hwcomposer, which uses KMS ioctls, must attach to the node first, in order to become DRM Master. gbm_gralloc should attach after it.
Before DRM_AUTH was dropped, we had tried:
1. "Auth hack":https://git.replicant.us/replicant-next/kernel_replicant_linux/commit/?h=history/lineage-16.0_i9300&id=e0f327c553befec2d367ac9a5ebef29b4291187a (both on @/dev/dri/card0@)
2. vGEM (gbm gralloc on @/dev/dri/card1@) - gbm gralloc cannot take advantage of exynos hardware planes; memory may not be properly allocated.
3. "Allow dumb buffers on render node":https://patchwork.kernel.org/patch/10541649/ (gbm gralloc on @/dev/dri/renderD128@) - Dumb buffers are used for scanout and should not be created on a render node. |>. 40 |>. 8 |
| done | new | Start gbm_gralloc service after drm_hwcomposer. | "Android init":https://blog.ngzhian.com/android-startup-tour-init.html is quite primitive, but Joonas accomplished this by "disabling the gralloc service":https://git.replicant.us/replicant-next/device_samsung_midas_common/commit/?h=replicant-10&id=62d61abe032f700cd73a6eefddd498d8e8bc29be. It seems that gralloc is later started automatically when needed. |>. 0 |>. 2 |
| ongoing | new | Use hardware planes for better composer performance. | Enabling HW planes with drm_hwcomposer was straightforward but led to severe graphics corruption. "Disabling devfreq":https://git.replicant.us/replicant-next/kernel_replicant_linux/commit/?id=90ceab148872e9515522c5150df16722822cf9d5 fixed the corruption. Tentative explanation: the display controller frequency gets too low for timely DMA transfer of overlays. "Reported upstream.":https://www.spinics.net/lists/linux-samsung-soc/msg66752.html
TODO:
* Lock display controller frequencies through sysfs and re-enable devfreq, or try to "remove low freq OPP steps":https://github.com/ReplicantOS-midas/android_kernel_samsung_smdk4412/commit/ce08237e39e2d74c1a25b3eadde4ac0f1043f270.
* Make sure that drm_hwcomposer is using all 4 available HW planes on exynos (1 primary, 3 overlay). Joonas used "dumpsys":https://developer.android.com/studio/command-line/dumpsys and added prints to "@validateDisplay@":https://source.android.com/devices/graphics/implement-hwc#display_comp_ops, finding at most 3 planes in use, where @avail_planes@ is the number of HW planes available and @layers_.size()@ is the number of layers in the composition.
* Add support for rotation.
* Debug "intermittent alpha blending":https://share.zalkeen.net/d/l/drm-hwcomposer-intermittent-alpha.mp4 (Dim Layer sent by SurfaceFlinger).
* Enable the cursor plane. |>. 0 |>. 17 |
| ongoing | new | Use Skia instead of HWUI to render the Canvas. | Unlike on Replicant 6, none of the usual system properties (e.g. @ro.kernel.qemu=1@, @ro.config.avoid_gfx_accel=1@) yielded the expected performance. We got there by forcing "@hardwareAccelerated=false@":https://git.replicant.us/replicant-next/frameworks_base/commit/?h=replicant-9&id=da9bab529acdb0bf5a0cbf3ac1962ca57fb6b5d7 on all apps. TODO: turn this dirty hack into a system property that can be toggled in the device tree. |>. 0 |>. 22 |
| ongoing | original plan | Create test scenarios and check if the graphics stack works as expected. | Stock apps work. Check [[GraphicsReplicant10#Tested-apps|Tested apps]] below for the current status with apps that require advanced graphics features. 
TODO: compliance tests. |>. 40 |>. 14 |
| ongoing | original plan | Make the graphics stack work with vGEM driver besides drm/exynos. | vGEM seems to be the proper DRI node for Mesa's "kms_swrast driver":https://memcpy.io/kms_swrast-a-hardware-backed-graphics-driver.html. We are currently using a "simple hack":https://git.replicant.us/replicant-next/external_mesa3d/commit/?id=e0a5e6c056c8281712bf6366acfe876215992613 that makes kms_swrast use drm/exynos instead. We should rather use "vGEM":https://git.replicant.us/replicant-next/kernel_replicant_linux/commit/?id=29fab049bea6a17e5f85ccb147df0e0046bbd487. |>. 40 |>. 4 |
| tentative | new | Combine "kmsro":https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/gallium/winsys/kmsro/drm/kmsro_drm_winsys.c with kms_swrast on vGEM render node? | Is kms_swrast working on top of the vGEM render node able to share PRIME buffers with the display node (Exynos)? If not, would adding kmsro to the mix help?
Architectural idea: @kmsro + kms_swrast on vGEM render node -> PRIME -> drm_hwcomposer and gbm_gralloc on display node (Exynos)@
Advantages:
- no need to copy buffers between kms_swrast and Exynos (PRIME takes care of that);
- can take advantage of HW planes. |>. 0 |>. 0 |
| ongoing | original plan | Document the design decisions. | Done at this wiki page plus [[GraphicsResearch]] and the presentation at [[ContributorsMeetingJuly2019#Presentations]]. |>. 16 |>. 64 |
| ongoing | new | Try out the "Android Go low RAM switches":https://redmine.replicant.us/projects/replicant/wiki/PortingToAndroid9#Links. | Check their impact on graphics rendering performance and overall system usability. |>. 0 |>. 1 |
| todo | new | Test gbm_gralloc with camera. | So far we've only been testing gbm_gralloc with Lima and Exynos. However, the Gralloc HAL will be used to allocate buffers that will be shared with other devices as well, such as the camera. |>. 0 |>. 0 |
| todo | new | Fix screen recorder. | Seems to fail with some EGL issue. |>. 0 |>. 
1 |
|\4>. total sum: |>. 288 |>. 240 |

fn1. A big thanks to Joonas Kylmälä, Paul Kocialkowski, Denis Carikli, Andrés Domínguez, Mauro Rossi, Erico Nunes, Tomeu Vizoso, Daniel Stone, Emil Velikov, Andrzej Hajda, Marek Szyprowski and LiquidAcid.

h2. SwiftShader tasks

|_. status |_. origin |_. short description |_. notes |_. estimated man-hours |_. actual man-hours |
| done | original plan | Find a way to use SwiftShader instead of Mesa. | Joonas got there [[PortingToAndroid9#Using-SwiftShader-instead-of-Mesa3D-llvmpipe-for-software-rendering|with ranchu composer (from Android Emulator) and the default gralloc]], plus a "patch to support UDIV/SDIV emulation in the kernel":https://git.replicant.us/replicant-next/kernel_replicant_linux/commit/?h=udiv-emulation&id=c5954294d3935774ef25375c2783bd3afa60e421. |>. 40 |>. 0 |
| done | new | Use LLVM as backend instead of SubZero. | Found a "SwiftShader revision":https://swiftshader.googlesource.com/SwiftShader/+/fde88d96a58b92beab76035393b3acd849445160 that uses LLVM and is still compatible with Android 9 frameworks/native. No noticeable performance difference. |>. 0 |>. 6 |
| done | new | Do UDIV/SDIV emulation in JIT-compiled shader code instead of with a kernel patch. Avoids the performance penalty of interrupt handling. | It seems that "SwiftShader does not send the processor model (microarchitecture) to LLVM":https://swiftshader.googlesource.com/SwiftShader/+/fde88d96a58b92beab76035393b3acd849445160/src/Reactor/LLVMReactor.cpp#596, leaving it without a way to decide "whether the processor has hardware division":https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/divide-and-conquer. Fixed upon update to Replicant 10: SwiftShader now uses LLVM 7 instead of 3. |>. 0 |>. 30 |
| todo | new | Use drm_hwcomposer instead of ranchu. Advantages: uses hardware planes and DRM nodes instead of direct framebuffer. 
| "Joonas was close":https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2018-08-03. |>. 0 |>. 0 |
| todo | new | Use mainline SwiftShader. Brings in a Vulkan software renderer for Replicant. | Difficult due to incompatibilities with "frameworks/native":https://android.googlesource.com/platform/frameworks/native/+/408eda0002ed37a2d30a3dddea6466dfc5c288e7. Check if it is fixed with Replicant 10. |>. 0 |>. 0 |
|\4>. total sum: |>. 40 |>. 36 |

h2. llvmpipe optimization tasks

|_. status |_. origin |_. short description |_. notes |_. estimated man-hours |_. actual man-hours |
| todo | original plan | Set up a testing and benchmarking environment. | Profiling: turn on profiling switch on Mesa + simpleperf?
Benchmarks: "android-fps-count":https://github.com/romannurik/env/blob/master/bin/android-fps-count, "0xBenchmark":https://f-droid.org/wiki/page/org.zeroxlab.zeroxbenchmark, "GearsES2":https://f-droid.org/wiki/page/com.jeffboody.GearsES2eclair
Conformance: "dEQP":https://android.googlesource.com/platform/external/deqp, "Android CTS":https://source.android.com/compatibility/cts/, "piglit":https://gitlab.freedesktop.org/mesa/piglit/tree/master, "freedreno/tests-*":https://github.com/freedreno/freedreno, "glmark2":https://github.com/glmark2/glmark2 |>. 40 |>. 1 |
| todo | original plan | Disable expensive OpenGL operations. | |>. 24 |>. 0 |
| todo | original plan | Recap matrix operations and study ARM NEON. | |>. 48 |>. 0 |
| todo | original plan | Profile apps to find the most used GLES operations. | |>. 32 |>. 0 |
| todo | original plan | Use "Ne10 library":https://github.com/projectNe10/Ne10 or "Neon Intrinsics":https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics for the most used GLES operations. | Optimizations have to be done in LLVM and not in llvmpipe, since llvmpipe only outputs LLVM IR. LLVM already has autovectorization for ARM NEON; try it. |>. 80 |>. 
0 |
| todo | original plan | Fix bugs, re-write the code where needed, get it stable. | |>. 80 |>. 0 |
|\4>. total sum: |>. 304 |>. 1 |

h2. Lima driver tasks

|_. status |_. origin |_. short description |_. notes |_. estimated man-hours |_. actual man-hours |
| done | original plan | Rebase "Lima's Linux kernel DRM driver":https://gitlab.freedesktop.org/lima/linux on top of "forkbomb's Midas on Mainline kernel":https://blog.forkwhiletrue.me/pages/midas-mainline/. | Done by others. Lima DRM driver was accepted into mainline Linux, which also has forkbomb's patches and is now used on Replicant 10. |>. 80 |>. 0 |
| done | original plan | Replace mainline Mesa with "Lima's Mesa":https://gitlab.freedesktop.org/lima/mesa (with their driver). | Done by others. Lima is now on mainline Mesa. "Lima wiki":https://gitlab.freedesktop.org/lima/web/-/wikis/home |>. 16 |>. 0 |
| done | new | Lima DRM driver bringup on Exynos. | Lima development is done on Allwinner devices. We expected "some issues to get it working on Exynos":https://forum.odroid.com/viewtopic.php?p=267259#p264856, although there were encouraging reports by "ChronoMonochrome":https://github.com/CustomROMs/android_local_manifests_i9300/issues/1#issuecomment-524375690, "hexdump0815":https://github.com/hexdump0815/linux-mainline-and-mali-on-odroid-u3 and Viciouss ("manifest":https://github.com/Viciouss/manifest_aosp_n80xx, "xda":https://forum.xda-developers.com/galaxy-note-10-1/general/mainline-n8000-progress-t3964980). Joonas added Lima to Replicant 10 and faced no major bringup issues. |>. 0 |>. 1 |
| done | new | Fully test the proper architecture. | @drm_hwcomposer and gbm_gralloc on card0 (Exynos) -> PRIME -> Mesa on renderD129 (Lima)@
Advantages:
- no need to copy buffers between Lima and Exynos (PRIME takes care of that);
- can take advantage of HW planes. |>. 0 |>. 1 |
| done | new | Fix graphics corruption with hardware planes. | Corruption happened when compositing GL planes with non-GL planes. E.g. 
on "Shader Editor":https://f-droid.org/en/packages/de.markusfisch.android.shadereditor/, run a shader and open a menu. Disabling devfreq didn't solve it (as it did with llvmpipe). It was due to having "gbm_gralloc working on Lima's render node":https://git.replicant.us/replicant-next/device_samsung_i9305/commit/?h=replicant-10&id=488fa591dc5ad38a7da118f60964af174482dcf4, which cannot do contiguous memory allocation. |>. 0 |>. 1 |
| todo | new | Fix video playback. | Joonas reported that the Big Buck Bunny video fails at the @os_get_total_physical_memory@ call from Mesa, which is called from @lima_screen.c@. |>. 0 |>. 0 |
| todo | new | Advertise GLES 2. "Shader Editor":https://f-droid.org/en/packages/de.markusfisch.android.shadereditor/ can only detect GLES 1. | |>. 0 |>. 0 |
| todo | original plan | Build and test thoroughly with "synthetic":https://source.android.com/devices/graphics/testing and real applications. | Use conformance tests to figure out the "current GLES implementation status":https://gitlab.freedesktop.org/mesa/mesa/-/issues/949#note_566527. |>. 40 |>. 0 |
| abandoned | original plan | Create a fallback mechanism that uses the software renderer for GLES functions not yet implemented in Lima. | There is no sane way to switch between different GLES drivers at the function level. Abandoned in favour of the tasks below. |>. 100 |>. 1 |
| done | new | Lima as SurfaceFlinger backend. | This is the default (SurfaceFlinger using the default GLES implementation). No problems found. |>. 0 |>. 0 |
| done | new | Lima as HWUI (SkiaGL) backend. | This is the default (SkiaGL using the default GLES implementation). No problems found. |>. 0 |>. 0 |
| todo | new | Lima on a per-app basis. | Lima will at most support GLES2. Therefore it may not work with certain apps depending on their GLES usage. 
We can re-work the "per process libagl/llvmpipe patch":https://lists.osuosl.org/pipermail/replicant/2019-August/002054.html into a patch that switches between Lima and a software renderer (llvmpipe or SwiftShader). |>. 0 |>. 0 |
|\4>. total sum: |>. 236 |>. 4 |

h2. 2D optimization tasks

|_. status |_. origin |_. short description |_. notes |_. estimated man-hours |_. actual man-hours |
| todo | new | Investigate the possibility of using "Pixman":http://pixman.org or "Exynos G2D":http://linux-exynos.org/wiki/G2D as RenderEngine for SurfaceFlinger. | There are interesting reports of people using "G2D to hardware-accelerate X11 EXA":https://forum.odroid.com/viewtopic.php?f=81&t=21035 |>. 0 |>. 1 |
| todo | new | Accelerate Skia with G2D. | Rework old patches ("Hillenbrand 2013":https://github.com/CyanogenMod/android_external_skia/commit/647876b665f2cf011e75adc6ff2238d467c47635 and "raymanfx 2016":https://review.lineageos.org/c/LineageOS/android_external_skia/+/61162 ) to make them work on current Skia. |>. 0 |>. 1 |
|\4>. total sum: |>. 0 |>. 2 |

h2. Tested apps

|_/2. app |_\2. Replicant 6 |_\3. Replicant 10 |_/2. notes |
|_. libagl |_. LLVMpipe |_. Lima |_. LLVMpipe |_. SwiftShader |
| "Fennec F-Droid":https://f-droid.org/en/packages/org.mozilla.fennec_fdroid | crashes | slow | fast | | | Needs GLES 2.0 |
| "LibreOffice Viewer":https://f-droid.org/en/packages/org.documentfoundation.libreoffice | crashes | slow |\3. cannot test (missing storage) | | |
| "Red Reader":https://f-droid.org/en/packages/org.quantumbadger.redreader | "crashes":https://github.com/QuantumBadger/RedReader/issues/279 | usable |\3. cannot test (no network) | | |
| "Shader Editor":https://f-droid.org/en/packages/de.markusfisch.android.shadereditor | crashes | 7 fps | 30 fps (HW planes off)
40 fps (HW planes on)
Freezes when changing resolution. | | | FPS measured on default shader with 1/1 resolution. 
|
| "Marine Compass":https://f-droid.org/en/packages/net.pierrox.mcompass | bad render | bad render | crashes | | | Only uses GLES 1.0 |
| "Gears":https://f-droid.org/en/packages/com.jeffboody.GearsES2eclair | crashes | no render | crashes | | | |
| "GL TRON":https://f-droid.org/en/packages/com.glTron | 4 fps | 2 fps | 23 fps | | | Has a nice FPS counter. |
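
Note on the drm/exynos pixel format task in the graphics stack table: the runtime-check approach proposed by GNUtoo (refuse a format in the atomic check while another client is using the other one) can be modelled outside the kernel. The sketch below is only a toy model in Python, not driver code. All names in it (@DisplayEngineModel@, @FormatConflict@, @commit@, @release@) are hypothetical and chosen for illustration; it assumes a single global RGB/BGR setting, as with VIDCON0.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

RGB = "RGB"
BGR = "BGR"


class FormatConflict(Exception):
    """Raised when a request clashes with the pixel order already in use."""


@dataclass
class DisplayEngineModel:
    """Toy model of a display engine with one global RGB/BGR setting.

    Hypothetical names throughout; this only mirrors the idea of
    advertising both formats and refusing conflicts at check time.
    """
    active_format: Optional[str] = None
    users: Dict[str, str] = field(default_factory=dict)

    def atomic_check(self, client: str, fmt: str) -> None:
        """Refuse fmt if another client already uses the other format."""
        if fmt not in (RGB, BGR):
            raise ValueError(f"unknown format: {fmt}")
        others = [c for c in self.users if c != client]
        if others and self.active_format != fmt:
            raise FormatConflict(
                f"{client} asked for {fmt} while {self.active_format} is in use"
            )

    def commit(self, client: str, fmt: str) -> None:
        """Run the check, then record the client and the global format."""
        self.atomic_check(client, fmt)
        self.users[client] = fmt
        self.active_format = fmt

    def release(self, client: str) -> None:
        """Drop a client; with no users left, both formats are usable again."""
        self.users.pop(client, None)
        if not self.users:
            self.active_format = None


engine = DisplayEngineModel()
engine.commit("A", RGB)          # application A starts using RGB
try:
    engine.commit("B", BGR)      # application B asks for BGR
    refused = False
except FormatConflict:
    refused = True               # the request is refused, as described
engine.release("A")              # no more users of the display engine
engine.commit("B", BGR)          # BGR is accepted again
```

In this model both formats stay advertised at all times and only the check fails, which matches the behaviour that was reported as preferable upstream in the task notes.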