A deep dive into the subtleties of testing Kotlin coroutines
Even if you are fluent in Kotlin coroutines, you might still find it difficult to test them. Concurrency is just inherently hard to reason about, especially if you’re aiming for 100% deterministic behavior, which is necessary in testing.
Version 1.6 of kotlinx.coroutines introduced a lot of changes to the coroutine testing environment. runTest is not a simple rename of runBlockingTest; fundamental changes have been introduced to how tests work. This won’t be another article about the changes between 1.5 and 1.6, though; there has been a plethora of these lately. Instead, we will focus on a few not-so-obvious TestDispatcher characteristics that might have you pulling your hair out in frustration.
So, in this article you will find some theory:
- What is the difference between a task being scheduled and executed?
- What are the main characteristics of StandardTestDispatcher and UnconfinedTestDispatcher and how are they different?
- How can you control the virtual time of a TestDispatcher (with visualization)?
And subtle TestDispatcher traps explained in code:
- How never-ending coroutines will prevent a test from finishing.
- How UnconfinedTestDispatcher might mess up your tests in an unexpected way.
- How you might need an extra 1 ms when using advanceTimeBy.
Please note that this article is based on the state of affairs in Kotlin 1.7.0. Most of the APIs used here are still under @ExperimentalCoroutinesApi opt-in, so they are subject to change in the future. Also, we will use Turbine and mockk in these examples to simplify test cases.
TestDispatcher 101
Before we get to the practical examples, let’s make sure we understand what TestDispatcher is and why it’s the very heart of coroutine testing.
TestDispatcher is nothing more than a special case of CoroutineDispatcher (so a cousin of Dispatchers.Main, Dispatchers.IO, etc.). Like any other dispatcher, it is attached to a CoroutineScope and its job is to orchestrate the execution of the coroutines launched in that scope. The difference is that, unlike with any regular dispatcher, we get to control this orchestration via TestCoroutineScheduler, its virtual time and the manual execution of scheduled tasks.
Scheduling and execution of coroutines
One thing that really helped me understand how TestDispatchers work was realizing the difference between a task being scheduled and executed. In production code, this distinction isn’t that important because everything happens on a real clock and in the vast majority of situations scheduled tasks are executed right away. With TestDispatchers, the difference becomes apparent. Basically, even if a task is scheduled to run in this exact millisecond, it does not mean that it has already run and that its results are visible to anyone waiting for them. It needs to be executed by the dispatcher first. A quick look at the TestCoroutineScheduler source reveals a lot:
public class TestCoroutineScheduler {
    /** This heap stores the knowledge about which dispatchers are interested in which moments of virtual time. */
    private val events = ThreadSafeHeap<TestDispatchEvent<Any>>()

    /** Establishes that [currentTime] can't exceed the time of the earliest event in [events]. */
    private val lock = SynchronizedObject()

    /** This counter establishes some order of the events that happen at the same virtual time. */
    private val count = atomic(0L)
    // ...
}
These 3 properties tell us the following:
- events: Dispatchers use the scheduler to register events, i.e. specific moments of time they are interested in.
- count: Even if there are multiple events scheduled at the same virtual time, there is a mechanism to ensure their deterministic order (although it is not used by UnconfinedTestDispatcher).
- lock: It’s guaranteed that if the virtual clock is moved past the time scheduled for a task, that task is executed.
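To make the scheduled-versus-executed distinction tangible, here is a minimal toy model of such a scheduler. It is only a sketch: ToyScheduler, schedule and toyDemo are hypothetical names made up for this illustration, and the code is not how kotlinx.coroutines implements TestCoroutineScheduler. It only mimics the two ideas described above: a queue of events ordered by virtual time, and a counter that breaks ties between events registered for the same moment.

```kotlin
import java.util.PriorityQueue

// Toy model for illustration only. ToyScheduler and its members are hypothetical
// names, not the kotlinx.coroutines implementation.
class ToyScheduler {
    private class Event(val time: Long, val seq: Long, val task: () -> Unit)

    // Orders events by virtual time first, then by registration order.
    private val events = PriorityQueue<Event>(compareBy<Event>({ it.time }, { it.seq }))
    private var count = 0L

    var currentTime = 0L
        private set

    /** Scheduling only registers the task; nothing is executed here. */
    fun schedule(delayMillis: Long, task: () -> Unit) {
        events.add(Event(currentTime + delayMillis, count++, task))
    }

    /** Executes everything that is due at the current virtual time. */
    fun runCurrent() {
        while (true) {
            val next = events.peek() ?: return
            if (next.time > currentTime) return
            events.poll().task()
        }
    }

    /** Moves the clock, executing only the tasks due strictly before the target time. */
    fun advanceTimeBy(delayMillis: Long) {
        val target = currentTime + delayMillis
        while (true) {
            val next = events.peek()
            if (next == null || next.time >= target) break
            currentTime = next.time
            events.poll().task()
        }
        currentTime = target
    }
}

fun toyDemo(): List<String> {
    val log = mutableListOf<String>()
    val scheduler = ToyScheduler()
    scheduler.schedule(1000) { log += "late" }
    scheduler.schedule(0) { log += "first" }
    scheduler.schedule(0) { log += "second" }
    check(log.isEmpty())                          // scheduled, but not executed yet
    scheduler.runCurrent()
    check(log == listOf("first", "second"))       // same time, registration order kept
    scheduler.advanceTimeBy(1000)
    check(log == listOf("first", "second"))       // the clock is at 1000, "late" is still pending
    scheduler.runCurrent()
    return log
}
```

In this model, moving the clock and executing what is due at the new time are two separate steps, which is the same split the real test dispatchers expose.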
This will all become easier to understand with the timeline walkthrough below.
StandardTestDispatcher vs UnconfinedTestDispatcher
There are only two test dispatchers. StandardTestDispatcher is the default provided whenever we use runTest. It has strict guarantees about the execution order of the tasks. However, the execution is not eager, i.e. we need to use runCurrent to trigger it at the current moment of virtual time or advance the time manually with advanceTimeBy or advanceUntilIdle.
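A minimal sketch of that laziness is shown below. It assumes kotlinx-coroutines-test 1.6 or newer and uses kotlin.test assertions instead of the assertion library used elsewhere in this article; the class and test names are made up for the example.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest
import kotlin.test.Test
import kotlin.test.assertFalse
import kotlin.test.assertTrue

@OptIn(ExperimentalCoroutinesApi::class)
class StandardDispatcherLazinessTest {
    @Test
    fun `launched coroutine is scheduled but not executed until runCurrent`() = runTest {
        var executed = false
        // runTest uses StandardTestDispatcher by default, so this is only queued.
        launch { executed = true }
        assertFalse(executed)

        // Execute everything scheduled for the current virtual time.
        runCurrent()
        assertTrue(executed)
    }
}
```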
UnconfinedTestDispatcher is eager: it does not require poking with the runCurrent stick. Coroutines launched on it start executing right away and run up to their first suspension point. However, its downside is that it will not guarantee the order of several coroutines queued inside of it. It basically works like Dispatchers.Unconfined, but on virtual time. This can cause a lot of confusion, as will be explained below.
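The eager behavior can be sketched like this. The example closely follows the one in the UnconfinedTestDispatcher documentation; the class, test and variable names are my own, and kotlin.test assertions are an assumption of this sketch.

```kotlin
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.UnconfinedTestDispatcher
import kotlinx.coroutines.test.runTest
import kotlin.test.Test
import kotlin.test.assertFalse
import kotlin.test.assertTrue

@OptIn(ExperimentalCoroutinesApi::class)
class UnconfinedDispatcherEagernessTest {
    @Test
    fun `launch body is entered eagerly`() = runTest(UnconfinedTestDispatcher()) {
        var entered = false
        var completed = false
        val gate = CompletableDeferred<Unit>()

        launch {
            entered = true
            gate.await()      // first suspension point
            completed = true
        }

        assertTrue(entered)     // the body already ran up to gate.await()
        assertFalse(completed)  // ...and is now suspended there

        gate.complete(Unit)     // resuming continues the coroutine right away
        assertTrue(completed)
    }
}
```

No runCurrent call is needed anywhere, which is exactly the convenience (and the trap) of this dispatcher.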
Moving the virtual clock hand
With StandardTestDispatcher, we can precisely control the execution of scheduled coroutines via one of 3 methods:
- runCurrent: executes any tasks scheduled at the current moment of virtual time.
- advanceTimeBy: advances the virtual time by the given number of milliseconds and executes any tasks scheduled in the meantime.
- advanceUntilIdle: works similarly to advanceTimeBy, but instead of advancing virtual time by a specific number of milliseconds, it advances it until there are no more tasks scheduled in the queue.
Now let’s visualize this virtual timeline. Say, we have 4 tasks scheduled:
A: scheduled at 0 ms (immediately).
B: scheduled at 1000 ms.
C: scheduled at 1000 ms, but registered after B.
D: scheduled at 2000 ms.
Let’s see how each method will affect the timeline.
runCurrent() will not move the virtual time, but will execute task A, which is scheduled at the current time (0 ms).
advanceTimeBy(1000) will move the clock by 1000 ms and will execute task A, which was scheduled in the meantime. It will not, however, execute B and C yet.
To execute them, we need to explicitly call runCurrent() after advanceTimeBy(). This will be shown in code in one of the examples below. Please note that since B was registered before C, StandardTestDispatcher will maintain that order when executing them. UnconfinedTestDispatcher would not guarantee that.
And finally, we can call advanceUntilIdle(), which will advance the time to the 2000 ms mark, i.e. until all the currently scheduled tasks are executed.
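The same timeline can be written down as a test. This is a sketch under a few assumptions: tasks A to D are modeled as four launched coroutines that append their name to a list after an optional delay, the test runs on the default StandardTestDispatcher, and kotlin.test is used for assertions. The names are made up for the example.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.advanceTimeBy
import kotlinx.coroutines.test.advanceUntilIdle
import kotlinx.coroutines.test.currentTime
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest
import kotlin.test.Test
import kotlin.test.assertEquals

@OptIn(ExperimentalCoroutinesApi::class)
class VirtualTimelineTest {
    @Test
    fun `walks through the timeline of tasks A to D`() = runTest {
        val log = mutableListOf<String>()
        launch { log += "A" }
        launch { delay(1000); log += "B" }
        launch { delay(1000); log += "C" }  // same moment as B, registered later
        launch { delay(2000); log += "D" }

        // Everything is scheduled, nothing has been executed yet.
        assertEquals(emptyList<String>(), log)

        // Moves the clock to 1000 ms and runs what was due before that moment.
        advanceTimeBy(1000)
        assertEquals(1000L, currentTime)
        assertEquals(listOf("A"), log)

        // B and C are due exactly now, so they need an explicit runCurrent().
        runCurrent()
        assertEquals(listOf("A", "B", "C"), log)

        // Runs the remaining task and leaves the clock where it was scheduled.
        advanceUntilIdle()
        assertEquals(2000L, currentTime)
        assertEquals(listOf("A", "B", "C", "D"), log)
    }
}
```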
Now that we have the basics out of the way, let’s take a look at a few subtleties that have made quite a few grown men scream at their computers.
How TestDispatchers might bite your ass
Never-ending coroutines will prevent test finish
Let’s consider a MediaPlayer class that launches sound playback.
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.launch

interface SoundPlayer {
    suspend fun playSound()
}

class MediaPlayer(
    private val scope: CoroutineScope,
    private val soundPlayer: SoundPlayer
) {
    // Playback failures are published here for the clients to observe.
    val playerErrors = MutableSharedFlow<Exception>()

    fun play() {
        scope.launch {
            try {
                soundPlayer.playSound()
            } catch (e: Exception) {
                playerErrors.emit(e)
            }
        }
    }
}
Sometimes, the coroutine that is supposed to play the sound throws an exception, possibly due to a corrupted file. MediaPlayer will notify its clients about these errors and we can test that behavior.
@OptIn(ExperimentalCoroutinesApi::class)
class NeverEndingCoroutineTest {
    @Test
    fun `should emit error when playSound throws`() = runTest {
        val exception = Exception("Oopsie")
        val soundPlayer = mockk<SoundPlayer>()
        coEvery { soundPlayer.playSound() } throws exception
        val sut = MediaPlayer(this, soundPlayer)
        sut.playerErrors.test {
            sut.play()
            awaitItem() shouldBe exception
        }
    }
}
So far, so good. Now let’s assume that apart from delivering these errors to the UI layer, we also want to report them to Crashlytics and we have a special IssueReporter class to handle it. We can just observe the playerErrors Flow and report them.
interface IssueReporter {
    fun reportIssue(e: Exception)
}

interface SoundPlayer {
    suspend fun playSound()
}

class MediaPlayer(
    private val scope: CoroutineScope,
    private val soundPlayer: SoundPlayer,
    private val issueReporter: IssueReporter
) {
    val playerErrors = MutableSharedFlow<Exception>()

    init {
        scope.launch {
            playerErrors.collect {
                issueReporter.reportIssue(it)
            }
        }
    }

    fun play() {
        scope.launch {
            try {
                soundPlayer.playSound()
            } catch (e: Exception) {
                playerErrors.emit(e)
            }
        }
    }
}

@OptIn(ExperimentalCoroutinesApi::class)
class NeverEndingCoroutineTest {
    @Test
    fun `should emit error when playSound throws`() = runTest {
        val exception = Exception("Oopsie")
        val soundPlayer = mockk<SoundPlayer>()
        val issueReporter = mockk<IssueReporter>()
        justRun { issueReporter.reportIssue(any()) }
        coEvery { soundPlayer.playSound() } throws exception
        val sut = MediaPlayer(this, soundPlayer, issueReporter)
        sut.playerErrors.test {
            sut.play()
            awaitItem() shouldBe exception
        }
    }
}
This test will run for one minute and then time out with:
Caused by: kotlinx.coroutines.test.UncompletedCoroutinesError: After waiting for 60000 ms, the test coroutine is not completing, there were active child jobs: ["coroutine#4":StandaloneCoroutine{Active}@36790bec]
Why is that? Because we leaked a coroutine job running on the test scope: obviously, the one running playerErrors.collect. runTest makes sure that if there are still active child coroutines, it will block until all of them are finished (or until the timeout, which is 60 s by default and can be changed with runTest(dispatchTimeoutMs = x)).
This is actually quite handy: we leaked a resource, tests told us about it and now we need to clean it up. So let’s add a dispose() method that will take care of it by canceling the scope. This should fix it, right?
class MediaPlayer(...) {
    ...
    fun dispose() {
        scope.cancel()
    }
}

class NeverEndingCoroutineTest {
    @Test
    fun `should emit error when playSound throws`() = runTest {
        ...
        sut.playerErrors.test {
            sut.play()
            awaitItem() shouldBe exception
        }
        sut.dispose()
    }
}
Nope. Well, yes, the leak is fixed, but the test still fails:
TestScopeImpl was cancelled kotlinx.coroutines.JobCancellationException: TestScopeImpl was cancelled; job=TestScope[test ended]
Test scopes don’t like being canceled. What we can do is make sure the scope’s child jobs are canceled so that it can die peacefully. If we replace sut.dispose() with this.coroutineContext.cancelChildren(), the test will pass.
Personally, I don’t find this line of code very self-explanatory, so I like to wrap it into a function for the reader to know what’s happening.
private fun TestScope.cancelNeverEndingCoroutines() = this.coroutineContext.cancelChildren()
Note that this behavior is a bit controversial, so it’s possible it will change in the future. E.g. https://github.com/Kotlin/kotlinx.coroutines/issues/1531.
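For readers who want to reproduce this without mockk or Turbine, here is a stripped-down sketch of the same situation. It assumes a plain MutableSharedFlow with a collector launched directly in the test scope; the class, test and variable names are made up for the example, and kotlin.test is used for the assertion.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.cancelChildren
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest
import kotlin.test.Test
import kotlin.test.assertEquals

@OptIn(ExperimentalCoroutinesApi::class)
class CancelChildrenTest {
    @Test
    fun `never-ending collector is cancelled before the test ends`() = runTest {
        val errors = MutableSharedFlow<Exception>()
        val reported = mutableListOf<String?>()

        // Collecting a SharedFlow never completes on its own.
        launch { errors.collect { reported += it.message } }
        runCurrent() // let the collector subscribe before anything is emitted

        errors.emit(IllegalStateException("Oopsie"))
        runCurrent()
        assertEquals(listOf<String?>("Oopsie"), reported)

        // Without this line runTest would keep waiting for the collector.
        coroutineContext.cancelChildren()
    }
}
```

Removing the last line should bring back the UncompletedCoroutinesError after the dispatch timeout.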
UnconfinedTestDispatcher will mess up conflated StateFlow emissions
This actually shouldn’t come as a surprise because it’s documented. However, the nondeterministic nature of UnconfinedTestDispatcher (and the original Dispatchers.Unconfined too, fwiw) can be pretty subtle. It’s sometimes useful because it saves us all the runCurrent calls, but from time to time it can blow up in our faces.
Let’s use the MediaPlayer example again.
interface SoundPlayer {
    suspend fun playSound()
}

sealed class PlayerState {
    object Stopped : PlayerState()
    object Playing : PlayerState()
}

class MediaPlayer(
    private val scope: CoroutineScope,
    private val soundPlayer: SoundPlayer
) {
    val playerState = MutableStateFlow<PlayerState>(PlayerState.Stopped)

    fun play() {
        scope.launch {
            playerState.emit(PlayerState.Playing)
            soundPlayer.playSound()
            playerState.emit(PlayerState.Stopped)
        }
    }
}
If run in a production app, this piece of code will properly emit Playing, then play the sound, and then turn back to Stopped. Things get a bit weird when we test that and mock playSound to be a 0 ms coroutine.
@Test
fun `should change state to Playing after play() called`() = runTest(UnconfinedTestDispatcher()) {
    val soundPlayer = mockk<SoundPlayer>()
    coJustRun { soundPlayer.playSound() }
    val sut = MediaPlayer(this, soundPlayer)
    sut.playerState.test {
        awaitItem() shouldBe PlayerState.Stopped
        sut.play()
        awaitItem() shouldBe PlayerState.Playing
        cancelAndIgnoreRemainingEvents()
    }
}
We expect the player state to switch to Playing. But this test will time out after 60 seconds: the second awaitItem() will suspend forever. Why is that?
For two reasons combined:
- StateFlow is allowed to conflate emissions, e.g. if it accepts two identical values (like two Stopped one after another), it will only emit once.
- UnconfinedTestDispatcher, citing the docs, “does not provide guarantees about the execution order when several coroutines are queued in this dispatcher”.
Now, we have launched a coroutine from a scope with an unconfined dispatcher and called 3 suspend functions inside of it.
scope.launch {
    playerState.emit(PlayerState.Playing)
    soundPlayer.playSound()
    playerState.emit(PlayerState.Stopped)
}
Even though a lot happened between sut.play() and awaitItem(), our collector (Turbine’s sut.playerState.test) missed the whole show. By the time it got a chance to run, the state was back to Stopped, which is equal to the last value it had seen, so there was nothing new to deliver. It still only sees PlayerState.Stopped.
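The conflation half of the problem can be isolated in a small sketch. It deliberately runs on the default StandardTestDispatcher and does not reproduce the UnconfinedTestDispatcher ordering issue; it only shows that a collector which does not get to run between two updates observes nothing if the final value equals the one it already has. Strings stand in for PlayerState, the names are made up, and kotlin.test is used for the assertion.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest
import kotlin.test.Test
import kotlin.test.assertEquals

@OptIn(ExperimentalCoroutinesApi::class)
class StateFlowConflationTest {
    @Test
    fun `collector misses a value that was overwritten before it could run`() = runTest {
        val state = MutableStateFlow("Stopped")
        val seen = mutableListOf<String>()

        val collector = launch { state.collect { seen += it } }
        runCurrent() // the collector receives the initial "Stopped"

        // Two updates in a row, with no chance for the collector to run in between.
        state.value = "Playing"
        state.value = "Stopped"
        runCurrent()

        // "Playing" was overwritten, and "Stopped" equals the last value seen.
        assertEquals(listOf("Stopped"), seen)

        collector.cancel()
    }
}
```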
It’s easily fixed by replacing UnconfinedTestDispatcher with StandardTestDispatcher, which guarantees that the second awaitItem() will wait for playerState.emit(PlayerState.Playing) and only after that will it resume.
This example is here to show that problems with UnconfinedTestDispatcher are not always super obvious like expecting events in an A-B-C-D order and receiving A-C-B-D. Combined with the rest of the coroutines machinery, like StateFlow’s implicit conflation, it can get really obscure.
TestScope.advanceTimeBy will not execute coroutines scheduled exactly at the current virtual time, but only those scheduled earlier
This is a confusing difference between the old 1.5 TestCoroutineScope and the new 1.6 TestScope. The new one moves the virtual time and runs the tasks scheduled before the target moment, but it will not execute the tasks on the TestCoroutineScheduler that are scheduled exactly at that moment. The old one would additionally call runCurrent. This is of course documented:
In contrast with TestCoroutineScope.advanceTimeBy, this function does not run the tasks scheduled at the moment currentTime + delayTimeMillis.
In practice, this new behavior means that after advanceTimeBy, we have to call runCurrent explicitly or just advance the time a tiny bit more (by 1 ms).
Let’s see an example.
interface SoundPlayer {
    suspend fun playSound()
}

class MediaPlayer(
    private val scope: CoroutineScope,
    private val soundPlayer: SoundPlayer
) {
    var playbackCounter = 0

    fun play() {
        scope.launch {
            soundPlayer.playSound()
            playbackCounter++
        }
    }
}
With our old friend MediaPlayer, let’s assume that we want to make sure we won’t increase playbackCounter until the playback has actually finished. The test below will let us do just that:
@Test
fun `should not increase the playback counter until the sound has finished playing`() = runTest {
    val soundPlayer = mockk<SoundPlayer>()
    coEvery { soundPlayer.playSound() } coAnswers { delay(1000) }
    val sut = MediaPlayer(this, soundPlayer)
    sut.play()
    advanceTimeBy(500)
    sut.playbackCounter shouldBe 0
}
We make the mock sound player run for 1000 ms so that after 500 ms we can check that the counter is still 0. Now, let’s prepare another test to make sure that after it’s finished playing, it really does change to 1.
@Test
fun `should increase the playback counter after the sound has finished playing`() = runTest {
    val soundPlayer = mockk<SoundPlayer>()
    coEvery { soundPlayer.playSound() } coAnswers { delay(1000) }
    val sut = MediaPlayer(this, soundPlayer)
    sut.play()
    advanceTimeBy(1000)
    sut.playbackCounter shouldBe 1
}
Unfortunately, the assertion for this test fails. It would pass with the deprecated runBlockingTest, but nowadays, we need to be explicit about the execution of scheduled tasks. We can fix the test by adding runCurrent() just after advanceTimeBy(1000) or (which is a bit less elegant, I think) by replacing it with advanceTimeBy(1001).
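Both fixes can be tried out without mockk in a small sketch like the one below. A plain counter incremented after delay(1000) stands in for MediaPlayer, the class and test names are made up, and kotlin.test is used for assertions.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.advanceTimeBy
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest
import kotlin.test.Test
import kotlin.test.assertEquals

@OptIn(ExperimentalCoroutinesApi::class)
class AdvanceTimeByTest {
    @Test
    fun `advanceTimeBy followed by runCurrent`() = runTest {
        var counter = 0
        launch { delay(1000); counter++ }

        advanceTimeBy(1000)
        assertEquals(0, counter) // the increment is due exactly now, so it has not run

        runCurrent()
        assertEquals(1, counter)
    }

    @Test
    fun `advancing one millisecond further`() = runTest {
        var counter = 0
        launch { delay(1000); counter++ }

        advanceTimeBy(1001) // 1000 ms is now strictly in the past
        assertEquals(1, counter)
    }
}
```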
Summary
There are a few other confusing behaviors in the coroutine testing framework, like Dispatchers.setMain implicitly providing a default test dispatcher for all the subsequent tests (although I couldn’t imagine a convincing broken test for that). If you have found a fragile coroutine test scenario yourself, let me know in the comments.
However, overall, the changes in 1.6 are a huge step forward and make testing concurrency more predictable. For the cases where they don’t, I hope these examples will help someone pull a little less hair from their head.
Additional resources:
- https://developer.android.com/kotlin/coroutines/coroutines-best-practices
- https://github.com/Kotlin/kotlinx.coroutines/blob/master/kotlinx-coroutines-test/MIGRATION.md
Also, thanks Artur Klamborowski for your help with this article.
This article was originally published on proandroiddev.com on June 29, 2022