Like many other passionate software engineers, I have a side project. It’s a time tracker app for Android that I started in 2012. It’s not a perfect app, but after almost 10 years it still has a very good rating in the Play Store (currently 4.7 stars), has a quite decent architecture that lets me continuously add features or adapt to new Android releases without breaking things, is covered by unit and UI tests, and has almost no crashes.
I said almost no crashes. A few are still there, and I never gave them the attention they required. What do you do when a beloved app you use on a daily basis crashes once every few weeks? I suppose you restart it and live with the situation. That’s what I decided my users should do, too.
But lately I was so annoyed by those crash reports that I started to invest more brainpower into finding out what’s causing the issues, and here’s where the story starts.
Foreground services
According to the official Android documentation, a foreground service performs operations that are noticeable to the user. It shows a status bar notification, so that users are actively aware that your app is performing a task in the foreground and is consuming system resources. While this notification is shown, the Android system will be gentle to the application “owning” the notification, giving it the required priority and trying not to stop it, because it’s probably doing some important work.
Well, what does all of this exactly mean? I’ll answer all questions in the context of my app, Swipetimes Time Tracker.
While tracking project time you may also record a GPS track. Let’s say you’re driving to a client, and at the end of the month you need the exact mileage to put it on the invoice. You get that value only if tracking doesn’t stop while driving. To achieve this, the app uses a foreground service.
How to start a foreground service? Well, instead of
Context#startService(i: Intent)
you’re using
Context#startForegroundService(i: Intent)
Once the service is started, Android will invoke onStartCommand on it. You then have to take care of displaying the foreground notification, and you do that by calling
Service#startForeground(id: Int, notification: Notification)
If you forget it, or if it takes too long before you invoke that method, you will end up with an exception:
Fatal Exception: android.app.RemoteServiceException Context.startForegroundService() did not then call Service.startForeground(): ServiceRecord{1228932 u0 lc.st.free/lc.st.notification.StartStopNotificationService}
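To make the contract concrete, here is a minimal sketch of a service that calls startForeground as the very first thing in onStartCommand. It is not the app’s actual code: the class name TrackingService, the channel id, the notification texts and the start helper are hypothetical, and it assumes a minimum SDK of 26. The manifest declaration and the permissions that newer Android versions require are left out.

```kotlin
import android.app.Notification
import android.app.NotificationChannel
import android.app.NotificationManager
import android.app.Service
import android.content.Context
import android.content.Intent
import android.os.IBinder

// Hypothetical service, for illustration only.
class TrackingService : Service() {

    override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
        // Fulfil the startForegroundService() promise first, before any real work.
        startForeground(NOTIFICATION_ID, buildNotification())
        // ... kick off the actual (asynchronous) work here ...
        return START_STICKY
    }

    // Not a bound service, so there is nothing to return here.
    override fun onBind(intent: Intent?): IBinder? = null

    private fun buildNotification(): Notification {
        // Notification channels are mandatory on API 26+ (assumed minimum SDK).
        val channel = NotificationChannel(
            CHANNEL_ID, "Tracking", NotificationManager.IMPORTANCE_LOW
        )
        getSystemService(NotificationManager::class.java)
            .createNotificationChannel(channel)
        return Notification.Builder(this, CHANNEL_ID)
            .setContentTitle("Tracking in progress")
            .setSmallIcon(android.R.drawable.ic_dialog_info)
            .build()
    }

    companion object {
        private const val CHANNEL_ID = "tracking" // hypothetical channel id
        private const val NOTIFICATION_ID = 1     // must not be 0

        // Hypothetical helper: asks the system to start the service.
        fun start(context: Context) {
            context.startForegroundService(Intent(context, TrackingService::class.java))
        }
    }
}
```

Even with this ordering, the countdown starts when startForegroundService is called, not when onStartCommand runs, so anything that delays onStartCommand eats into the allowed time.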
Swipetimes foreground notification
Application crash
Well, I surely understood the technical requirements and implemented it accordingly:
- Start the foreground service
- Show the notification
- Do the work in the foreground service asynchronously (more or less)
Still, my Crashlytics logs showed very sporadic crashes. I did everything I could to reduce the time between the foreground service start and the notification display, without any noticeable success.
What’s causing the crash?
Photo by mostafa meraji on Unsplash
Locally, I never saw this kind of crash. To narrow down the problem, I had to rely on information from all of my app’s installations out there in the wild. The first thing I did was to massively scale up Firebase-based log events: when does the service start, what happens before, and so on. Without those log events you’re basically lost. As always, Firebase logging helped and I got a clearer picture.
The diagram below shows the app startup process, which is the key to finding the issue.
When the app starts, some of the initialisation steps are done asynchronously. The UI will be displayed only after all those asynchronous jobs finish their work. Until that happens, the main thread is blocked in a waiting state. It’s not good practice to block the main thread, but hey, it’s the startup phase of the application.
One of the initialisation steps (xyz in our case) decides that it needs to start a foreground service.
The foreground service is started. As you can see, starting happens on the same thread the init step is running on, which is different from the main thread.
The main thread is busy waiting for all init processes to finish. The foreground service which has previously been started also needs to run on this thread (as all services run on the main thread of their hosting process), but it can’t because the main thread is waiting for all init steps.
When all initialisation steps finish and the UI is shown, the system will switch its attention to the service and invoke onStartCommand on main. There, the promise of calling startForeground(id: Int, notification: Notification) can be fulfilled. However, in some rare situations it takes too long until onStartCommand is called, and the application will crash.
The service is given an amount of time comparable to the ANR interval to invoke startForeground. If it doesn’t, the system will automatically stop the service and declare the app ANR.
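The sequence described above can be modelled without any Android APIs. The sketch below is a plain JVM simulation, not the app’s code: a blocking queue stands in for the main thread’s message queue, and all names (simulateStartup, mainQueue, initPool and so on) are made up for illustration. It measures how long the queued onStartCommand has to wait while “main” is blocked on the init steps.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicLong

// Returns how long (in ms) the simulated onStartCommand had to wait.
fun simulateStartup(slowInitMillis: Long): Long {
    val mainQueue = LinkedBlockingQueue<Runnable>() // stand-in for main's message queue
    val initPool = Executors.newFixedThreadPool(2)
    val initDone = CountDownLatch(2)
    val serviceRequested = CountDownLatch(1)
    val requestedAt = AtomicLong()
    val startedAt = AtomicLong()

    // Init step "xyz": requests the service start from a background thread.
    initPool.execute {
        requestedAt.set(System.nanoTime())
        // onStartCommand can only be delivered through the main queue.
        mainQueue.put(Runnable { startedAt.set(System.nanoTime()) })
        serviceRequested.countDown()
        initDone.countDown()
    }
    // A slower init step that keeps the main thread waiting.
    initPool.execute {
        serviceRequested.await()
        Thread.sleep(slowInitMillis)
        initDone.countDown()
    }

    initDone.await()       // "main" is blocked until all init steps finish
    mainQueue.take().run() // only now does the queued onStartCommand run

    initPool.shutdown()
    return TimeUnit.NANOSECONDS.toMillis(startedAt.get() - requestedAt.get())
}

fun main() {
    println("onStartCommand was delayed by about ${simulateStartup(300)} ms")
}
```

In this model the delay grows with the slowest init step. On a real device, that delay is what occasionally exceeds the time the system allows.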
Photo by Jeff Kingma on Unsplash
The solution
Well, the solution was quite straightforward. We need to synchronise with the main thread and start the service only once main becomes free again. So, the only required change is to delegate the service start to a handler which is bound to main.
handler.post { context.startForegroundService(...) }
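For completeness, this is roughly how such a handler could be obtained and used. It is a sketch under assumptions: the helper name startForegroundServiceOnMain and the mainHandler property are hypothetical, and a minimum SDK of 26 is assumed.

```kotlin
import android.content.Context
import android.content.Intent
import android.os.Handler
import android.os.Looper

// Handler bound to the main thread's looper.
private val mainHandler = Handler(Looper.getMainLooper())

// Hypothetical helper; safe to call from any thread, e.g. from an async init step.
fun startForegroundServiceOnMain(context: Context, intent: Intent) {
    val appContext = context.applicationContext
    mainHandler.post {
        // Runs only once the main thread is processing its queue again,
        // so the startForeground() countdown begins when main is free.
        appContext.startForegroundService(intent)
    }
}
```

Because the posted block and the later onStartCommand both run on main, the service start is requested at a moment when main is already able to serve it.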
Conclusion
Well, I know this was quite a non-standard issue. Nonetheless, I found it really interesting. There may be more ways of solving it:
- Don’t block the main thread by waiting. Show a splash or progress screen while asynchronous initialisation runs and leave the main thread free. This way, the foreground service requirements can be handled immediately. In my case this would require some bigger changes involving some risks I don’t want to take right now.
- Postpone the service start until the main thread is free. Do this by posting to the main thread’s handler. This is the preferred approach in my case.