Always measure - fun with benchmarks
In one of the previous posts, we explored the possibilities shaders give us, and at the end we tried to animate them. While doing that, I started to wonder whether this solution was viable for production. In other words, is it fast enough? There is only one way to find out. Let’s have some fun with benchmarks.
Finding weaknesses
I reduced the original code to a minimal working example:
@Composable
fun ShaderPerformance1() = Column {
    val shader = remember {
        RuntimeShader(SHADER_ANIM_COLOR)
            .apply { setFloatUniform("iDuration", DURATION) }
    }
    val brush = remember { ShaderBrush(shader) }
    val time by timeAnimation()
    shader.setFloatUniform("iTime", time)
    Text(
        text = prefText,
        style = TextStyle(brush = brush),
        modifier = Modifier
            .onSizeChanged {
                shader.setFloatUniform(
                    "iResolution",
                    it.width.toFloat(),
                    it.height.toFloat()
                )
            }
            // The alpha value changes slightly on every frame, which forces a redraw.
            .alpha(1 - (time + 1) / 1000 / DURATION),
    )
}
What stands out here is the alpha manipulation used to force a redraw. I wanted to understand what causes a text draw to be invalidated. This is a highly complex topic and I didn’t want to jump to the wrong conclusions, so I went looking for answers on the Kotlin Slack. I got help from Halil Ozercan, who works on exactly that at Google. He was super helpful and provided his own solution to the problem. I tweaked it so that it looks similar to my original code.
@Composable
fun ShaderPerformance2() {
    class AnimShaderBrush(val time: Float = -1f) : ShaderBrush() {
        private var internalShader: RuntimeShader? = null
        private var previousSize: Size? = null
        override fun createShader(size: Size): Shader {
            // Reuse the cached shader unless the size has changed.
            val shader = if (internalShader == null || previousSize != size) {
                RuntimeShader(SHADER_ANIM_COLOR).apply {
                    setFloatUniform("iResolution", size.width, size.height)
                    setFloatUniform("iDuration", DURATION)
                }
            } else {
                internalShader!!
            }
            shader.setFloatUniform("iTime", time)
            internalShader = shader
            previousSize = size
            return shader
        }
        // Returns a new brush that shares the cached shader but carries a new time.
        fun setTime(newTime: Float): AnimShaderBrush {
            return AnimShaderBrush(newTime).apply {
                this@apply.internalShader = this@AnimShaderBrush.internalShader
                this@apply.previousSize = this@AnimShaderBrush.previousSize
            }
        }
        override fun equals(other: Any?): Boolean {
            if (other !is AnimShaderBrush) return false
            if (other.internalShader != this.internalShader) return false
            if (other.previousSize != this.previousSize) return false
            if (other.time != this.time) return false
            return true
        }
        // Kept consistent with equals().
        override fun hashCode(): Int {
            var result = time.hashCode()
            result = 31 * result + (internalShader?.hashCode() ?: 0)
            result = 31 * result + (previousSize?.hashCode() ?: 0)
            return result
        }
    }
    var brush by remember { mutableStateOf(AnimShaderBrush()) }
    val time by timeAnimation()
    LaunchedEffect(time) {
        brush = brush.setTime(time)
    }
    Text(
        text = prefText,
        style = TextStyle(brush = brush),
    )
}
The key to this solution is to invalidate the Text composable by invalidating its ShaderBrush: every time change produces a brush that is not equal to the previous one. This should be very efficient thanks to internal caching mechanisms.
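To make the equality trick easier to see, here is a minimal plain-Kotlin sketch without any Compose dependency. TimedBrush and RedrawCounter are hypothetical names that I made up for illustration, and the sketch assumes that the consumer redraws only when it receives a value that is not equal to the previous one. It is not how Text is implemented internally.

```kotlin
// Hypothetical stand-in for the brush: plain Kotlin, no Compose dependency.
class TimedBrush(val time: Float, private val cachedShader: Any) {
    // A copy shares the expensive cached object and only changes the time.
    fun withTime(newTime: Float) = TimedBrush(newTime, cachedShader)

    override fun equals(other: Any?): Boolean =
        other is TimedBrush && other.time == time && other.cachedShader === cachedShader

    override fun hashCode(): Int =
        31 * time.hashCode() + System.identityHashCode(cachedShader)
}

// Hypothetical consumer that "redraws" only when the submitted value
// is not equal to the one it saw last.
class RedrawCounter {
    private var last: TimedBrush? = null
    var redraws = 0
        private set

    fun submit(brush: TimedBrush) {
        if (brush != last) {
            redraws++
            last = brush
        }
    }
}

fun main() {
    val counter = RedrawCounter()
    val first = TimedBrush(0f, Any())
    counter.submit(first)                // first value, counts as a redraw
    counter.submit(first.withTime(0f))   // equal to the previous one, skipped
    counter.submit(first.withTime(16f))  // different time, counts as a redraw
    println(counter.redraws) // prints 2
}
```

The point of the design is that the expensive object is created once and shared, while the cheap wrapper is replaced on every tick.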
That being said, it still looks a bit hacky, and Text will most likely get internal changes that make animations like this work out of the box. In the meantime, I wanted to try a more efficient way to do it.
@Composable
fun ShaderPerformance3() = Column {
    data class Info(val layout: TextLayoutResult, val width: Float, val height: Float)
    val shader = remember {
        RuntimeShader(SHADER_ANIM_COLOR)
            .apply { setFloatUniform("iDuration", DURATION) }
    }
    val brush = remember { ShaderBrush(shader) }
    val time by timeAnimation()
    val textMeasurer = rememberTextMeasurer()
    val info = remember(prefText) {
        textMeasurer.measure(
            text = AnnotatedString(prefText),
            style = TextStyle(brush = brush)
        ).let { textLayout ->
            val lines = (0 until textLayout.lineCount)
            val start = lines.minOf { textLayout.getLineLeft(it) }
            val end = lines.maxOf { textLayout.getLineRight(it) }
            val top = textLayout.getLineTop(lines.first)
            val bottom = textLayout.getLineBottom(lines.last)
            val width = abs(end - start)
            val height = bottom - top
            shader.setFloatUniform("iResolution", width, height)
            Info(textLayout, width, height)
        }
    }
    val wdp = with(LocalDensity.current) { info.width.toDp() }
    val hdp = with(LocalDensity.current) { info.height.toDp() }
    Canvas(
        Modifier
            .size(wdp, hdp)
    ) {
        shader.setFloatUniform("iTime", time)
        drawText(info.layout, brush)
    }
}
Here we use TextMeasurer and draw the text directly on a Canvas. We measure the text, read its width and height, pass that size to the modifier, and then render the text on the canvas. This is certainly not the optimal solution, but the idea itself should be powerful enough.
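The size calculation from the measured lines can be sketched in isolation. This is a plain-Kotlin approximation: LineBounds and textBox are hypothetical names standing in for the values read from the text layout result, and the numbers in main are made up.

```kotlin
import kotlin.math.abs

// Hypothetical stand-in for one line of a measured text layout.
data class LineBounds(val left: Float, val right: Float, val top: Float, val bottom: Float)

// Returns (width, height) of the box that encloses all lines:
// the leftmost and rightmost edges give the width,
// the top of the first line and the bottom of the last line give the height.
fun textBox(lines: List<LineBounds>): Pair<Float, Float> {
    require(lines.isNotEmpty()) { "at least one line is needed" }
    val start = lines.minOf { it.left }
    val end = lines.maxOf { it.right }
    val width = abs(end - start)
    val height = lines.last().bottom - lines.first().top
    return width to height
}

fun main() {
    val lines = listOf(
        LineBounds(left = 0f, right = 120f, top = 0f, bottom = 20f),
        LineBounds(left = 10f, right = 200f, top = 20f, bottom = 40f),
    )
    println(textBox(lines)) // prints (200.0, 40.0)
}
```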
Benchmark setup
To test animation efficiency, we will use the Macrobenchmark library. I made a few test runs and discovered that, even though the library does its best to minimise fluctuations, it could not give me a precise result. So I ended up creating tests like this:
companion object {
    @Parameterized.Parameters(name = "anim{1} loop{0}")
    @JvmStatic
    fun initParameters() = buildList {
        repeat(100) {
            add(arrayOf(it, "1"))
            add(arrayOf(it, "2"))
            add(arrayOf(it, "3"))
        }
    }
}
@Test
fun animation() = benchmarkRule.measureRepeated(
    packageName = "com.mikolajkakol.myapplication",
    metrics = listOf(FrameTimingMetric()),
    compilationMode = CompilationMode.Full(),
    iterations = 10,
    startupMode = StartupMode.HOT,
    setupBlock = {
        startActivityAndWait()
        device.findObject(By.text("Shader performance $id"))?.click()
    }
) {
    Thread.sleep(100)
}
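To see what that parameter list expands to, here is a small plain-Kotlin sketch with no JUnit involved. The helper names parameters and displayName are mine, and displayName only imitates the "anim{1} loop{0}" name pattern; it is not the JUnit implementation.

```kotlin
// Builds the same (loop, variant) pairs as the parameter list above.
fun parameters(loops: Int, variants: List<String>): List<Array<Any>> = buildList {
    repeat(loops) { loop ->
        variants.forEach { variant -> add(arrayOf<Any>(loop, variant)) }
    }
}

// Imitates the "anim{1} loop{0}" pattern: {0} is the loop index, {1} the variant.
fun displayName(params: Array<Any>) = "anim${params[1]} loop${params[0]}"

fun main() {
    val all = parameters(loops = 100, variants = listOf("1", "2", "3"))
    println(all.size)                       // prints 300
    println(all.take(4).map(::displayName)) // prints [anim1 loop0, anim2 loop0, anim3 loop0, anim1 loop1]
}
```

The three variants are interleaved within every loop, so each variant is measured once before any of them is measured again.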
I run 10 iterations and repeat that 100 times. Also, rendering a single text is super fast, so I decided to stack the text 30 times to slow rendering down.
NavHost(navController = navController, startDestination = "list") {
    composable("list") { Composables(navController) }
    composable("shaderPerf1") {
        repeat(30) {
            ShaderPerformance1()
        }
    }
    composable("shaderPerf2") {
        repeat(30) {
            ShaderPerformance2()
        }
    }
    composable("shaderPerf3") {
        repeat(30) {
            ShaderPerformance3()
        }
    }
}
Is it the best approach? Hard to tell. Anyway, let’s see the results! (I ran it on a Pixel 6.)
ShaderPerformanceTest_animation[anim1 loop0]
frameDurationCpuMs P50 7,6, P90 9,4, P95 9,8, P99 10,4
frameOverrunMs P50 -7,7, P90 -6,0, P95 -5,6, P99 -4,6
Traces: Iteration 0 1 2 3 4 5 6 7 8 9
ShaderPerformanceTest_animation[anim2 loop0]
frameDurationCpuMs P50 8,6, P90 10,1, P95 10,6, P99 11,8
frameOverrunMs P50 -5,4, P90 1,6, P95 1,8, P99 2,2
Traces: Iteration 0 1 2 3 4 5 6 7 8 9
ShaderPerformanceTest_animation[anim3 loop0]
frameDurationCpuMs P50 6,1, P90 8,4, P95 9,6, P99 10,8
frameOverrunMs P50 -9,2, P90 -6,8, P95 -6,2, P99 -4,9
Traces: Iteration 0 1 2 3 4 5 6 7 8 9
ShaderPerformanceTest_animation[anim1 loop1]
frameDurationCpuMs P50 9,0, P90 10,4, P95 11,3, P99 12,1
frameOverrunMs P50 -4,7, P90 1,7, P95 1,8, P99 2,3
Traces: Iteration 0 1 2 3 4 5 6 7 8 9
ShaderPerformanceTest_animation[anim2 loop1]
frameDurationCpuMs P50 7,6, P90 9,2, P95 9,9, P99 11,6
frameOverrunMs P50 -7,7, P90 -6,1, P95 -5,4, P99 -3,6
Traces: Iteration 0 1 2 3 4 5 6 7 8 9
ShaderPerformanceTest_animation[anim3 loop1]
frameDurationCpuMs P50 6,5, P90 8,5, P95 9,1, P99 10,3
frameOverrunMs P50 -8,9, P90 -7,0, P95 -6,3, P99 -5,5
Traces: Iteration 0 1 2 3 4 5 6 7 8 9
ShaderPerformanceTest_animation[anim1 loop2]
frameDurationCpuMs P50 7,6, P90 9,5, P95 9,8, P99 10,6
frameOverrunMs P50 -7,8, P90 -5,9, P95 -5,6, P99 -4,5
Traces: Iteration 0 1 2 3 4 5 6 7 8 9
Well, that tells us absolutely nothing.
Let’s do some graphs!
To visualize the results, we will use Lets-Plot for Kotlin. At first, we made a line graph that shows the iteration on the X-axis and the mean time for each percentile on the Y-axis.
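Before anything can be plotted, the raw output has to be turned into numbers. The sketch below is one possible way to do that in plain Kotlin and is not the code used for the graphs. parseMetricLine and meanPerPercentile are hypothetical helpers, and the parser assumes every value is printed with a decimal comma, as in the output above.

```kotlin
// Parses one metric line such as
// "frameDurationCpuMs P50 7,6, P90 9,4, P95 9,8, P99 10,4".
// Assumption: every value has a decimal comma (locale-dependent output).
fun parseMetricLine(line: String): Map<String, Double> =
    Regex("""(P\d+)\s+(-?\d+),(\d+)""").findAll(line).associate { match ->
        val (name, whole, fraction) = match.destructured
        name to "$whole.$fraction".toDouble()
    }

// Averages each percentile over several runs of the same variant.
fun meanPerPercentile(runs: List<Map<String, Double>>): Map<String, Double> =
    runs.flatMap { it.entries }
        .groupBy({ it.key }, { it.value })
        .mapValues { (_, values) -> values.average() }

fun main() {
    // The two anim3 frameDurationCpuMs lines from the output above.
    val anim3 = listOf(
        "frameDurationCpuMs P50 6,1, P90 8,4, P95 9,6, P99 10,8",
        "frameDurationCpuMs P50 6,5, P90 8,5, P95 9,1, P99 10,3",
    ).map(::parseMetricLine)

    println(anim3.first()) // prints {P50=6.1, P90=8.4, P95=9.6, P99=10.8}
    println(meanPerPercentile(anim3))
}
```

The same parser handles the negative values of frameOverrunMs, where a negative number means the frame finished before its deadline.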
We are now getting somewhere. It seems that the first solution is the slowest and the last one the fastest, but is it? Thankfully, our colleague Jadwiga found an even better representation for this kind of data.
Let me present the violin plot.
Now we have a clear indication of which is fastest. Intriguingly, the third method has the most fluctuations, possibly because of GC runs, but why? Hard to tell. Anyway!
It was great fun, and I learned a lot about using Macrobenchmark. If I were asked for a recommendation, I would say go with Halil’s method: it is super easy and the most reliable. We also learned that Text in Compose is very fast. Considering that we had to stack the text 30 times to see meaningful numbers, we might conclude that we should measure only when we have an actual performance problem, and that we should measure real-life examples.
Additional information:
- https://www.youtube.com/watch?v=BrdjsKq8lgQ&ab_channel=CodewiththeItalians an extensive talk with Zach Klippenstein and Halil Ozercan about how text is laid out
- https://gist.github.com/halilozercan/a6fb2b9977b386f9ad6ca6ce7cf3c72f original solution by Halil
- Whole project https://github.com/SPEEDNETpl/fun-with-fonts/blob/main/macrobenchmark/src/main/java/com/mikolajkakol/macrobenchmark/ShaderPerformanceTest.kt