Not Just Crashes: Your Observability Stack for the Mobile App
Improve mobile app reliability with full-stack observability, including ANR tracking, latency monitoring, cold start metrics, and in-app telemetry.
Go beyond Crashlytics by adopting latency tracing, ANR root-cause analysis, and in-app telemetry to understand the end-user journey.
If you are a mobile engineer, you have probably felt the same gut punch I have: ship a feature, see the app store rating drop, see reviews say nothing more useful than "app is slow" or "app keeps freezing."
And by the time these complaints pile up, it is too late, and the user experience has already been compromised. That is why ratings are not observability.
Mobile engineering today needs full-stack observability: knowing not just what broke, but where it broke and why, with enough data to act quickly. In this post, I share an observability checklist for mobile apps that goes beyond crash reporting and covers ANR prioritization, latency monitoring, cold start monitoring, and in-app telemetry.
1. Start With ANR (Application Not Responding) Routes
ANRs are the mobile equivalent of a backend outage. The UI thread freezes, the app stops responding, and users quit. Unlike crashes, ANRs do not always produce stack traces that show clearly what went wrong and where.
So why prioritize ANRs?
- Users can deal with bugs here and there, but a frozen app obliterates trust.
- ANRs often reflect deeper architectural debt: blocking calls on the main thread, suboptimal layouts, or runaway GC events.
Pro tip: Do not track ANRs only as a global number; track them per route or screen. One bad screen can drag down the whole stability score, and a global figure will not tell you which screen it is.
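Example (Android with Firebase Performance Monitoring):
import com.google.firebase.perf.FirebasePerformance

// Custom trace named after the screen, so timings can be grouped per route
val trace = FirebasePerformance.getInstance().newTrace("screen_home_load")
trace.start()
// Load data, render UI...
trace.stop()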
Later, you can correlate these traces with ANR data from Sentry or Bugsnag to see whether specific routes are responsible for the freezes.
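To make the per-route idea concrete, here is a minimal sketch of how ANR counts could be normalized by screen views. RouteStabilityTracker and its method names are hypothetical and not part of Firebase, Sentry, or Bugsnag; the sketch assumes you already know which route was on screen when an ANR was reported.

```kotlin
// Hypothetical helper: tallies screen views and ANRs per route.
// Not thread-safe; assumes all calls come from one thread.
class RouteStabilityTracker {
    private val views = mutableMapOf<String, Int>()
    private val anrs = mutableMapOf<String, Int>()

    /** Call when a screen becomes visible. */
    fun recordView(route: String) {
        views[route] = (views[route] ?: 0) + 1
    }

    /** Call when an ANR report is attributed to a screen. */
    fun recordAnr(route: String) {
        anrs[route] = (anrs[route] ?: 0) + 1
    }

    /** ANRs per view for each viewed route, worst first. */
    fun anrRateByRoute(): List<Pair<String, Double>> =
        views.filterValues { it > 0 }
            .map { (route, count) -> route to (anrs[route] ?: 0).toDouble() / count }
            .sortedByDescending { it.second }
}
```

Ranking by rate rather than raw count keeps a rarely visited but badly broken screen from hiding behind a busy home screen.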
2. Debugging ANR With Stack Sampling
The fastest way to debug ANRs is by stack sampling. This is where you capture the call stack of the main thread at regular intervals, and then dump the logs to be analyzed later.
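Example in Android (a watchdog thread that checks the main thread once per second and samples its stack when it stops responding):
import android.os.Handler
import android.os.Looper
import android.util.Log
import java.util.concurrent.atomic.AtomicBoolean
import kotlin.concurrent.thread

val mainHandler = Handler(Looper.getMainLooper())
// The watchdog must run off the main thread: a check posted to the main thread cannot run while it is blocked.
thread(name = "anr-watchdog", isDaemon = true) {
    while (true) {
        val responded = AtomicBoolean(false)
        mainHandler.post { responded.set(true) }
        Thread.sleep(1000)
        if (!responded.get()) {
            // The main thread did not run our task within 1 s: sample its call stack
            val stack = Looper.getMainLooper().thread.stackTrace
            Log.w("ANR", "Potential freeze on main thread:\n" + stack.joinToString("\n"))
        }
    }
}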
On iOS, use a DispatchSourceTimer to sample the main thread backtrace at regular intervals. Libraries like PLCrashReporter can also help with this process.
Pro tip: Analyze ANR samples alongside breadcrumbs of the user's recent actions ("tapped Pay," "opened Settings") so that you can recreate the exact steps that led to the freeze.
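A breadcrumb trail can be as simple as a small ring buffer whose contents are copied into each ANR sample. The sketch below is an illustration under that assumption; BreadcrumbTrail and AnrSample are hypothetical names, not types from any SDK mentioned here (Sentry and Bugsnag ship their own breadcrumb APIs).

```kotlin
// Hypothetical helper: a fixed-size trail of recent user actions.
class BreadcrumbTrail(private val capacity: Int = 20) {
    private val crumbs = ArrayDeque<String>()

    init {
        require(capacity > 0) { "capacity must be positive" }
    }

    /** Record a user action such as "tapped Pay"; the oldest entry is dropped when full. */
    @Synchronized
    fun leave(action: String) {
        if (crumbs.size == capacity) crumbs.removeFirst()
        crumbs.addLast(action)
    }

    /** Copy of the trail, oldest first, safe to attach to a report. */
    @Synchronized
    fun snapshot(): List<String> = crumbs.toList()
}

/** One ANR sample: the main-thread stack plus the actions that led up to it. */
data class AnrSample(val stack: List<String>, val breadcrumbs: List<String>)
```

Bounding the trail keeps memory use constant, and taking a snapshot at sampling time means later user actions cannot alter a report that was already captured.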
3. Build Latency Tracking in Every Flow
Latency isn't just about network speed — it is the time-to-interaction your users perceive. Your users care about how long it takes from when they tap to the app being in a usable state.
You should be tracking latency at the transaction level — "Search query run" or "Checkout flow" — not the individual API calls.
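Example in Android:
import io.sentry.Sentry

val transaction = Sentry.startTransaction("checkout_flow", "navigation")
// Bind the transaction to the scope so events captured during the flow are linked to it
Sentry.configureScope { scope ->
    scope.transaction = transaction
}
try {
    // ... do work
} finally {
    transaction.finish() // always close the transaction, even if the flow throws
}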
Named transactions let you observe performance over time and catch regressions when a release slows down a critical flow.
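If you export transaction durations per release, a regression check can be a simple percentile comparison. The sketch below is one possible approach, not how Sentry computes it: the function names, the nearest-rank p95, and the 10% tolerance are all assumptions chosen for illustration.

```kotlin
/** Nearest-rank percentile (p in 1..100) of a non-empty list of durations in milliseconds. */
fun percentile(samplesMs: List<Long>, p: Int): Long {
    require(samplesMs.isNotEmpty()) { "need at least one sample" }
    require(p in 1..100) { "p must be between 1 and 100" }
    val sorted = samplesMs.sorted()
    // Integer form of ceil(p / 100 * n), which avoids floating-point rounding surprises
    val rank = (p * sorted.size + 99) / 100
    return sorted[rank - 1]
}

/**
 * True when the candidate release's p95 exceeds the baseline p95
 * by more than the given tolerance (0.10 means 10%).
 */
fun isRegression(
    baselineMs: List<Long>,
    candidateMs: List<Long>,
    tolerance: Double = 0.10
): Boolean =
    percentile(candidateMs, 95) > percentile(baselineMs, 95) * (1 + tolerance)
```

Comparing a high percentile rather than the mean keeps the check sensitive to the slow tail that users actually complain about.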
4. Cold Start Time Tracking (With Sentry)
Cold start is the time from launching the app to rendering the first frame. Anything over ~2 seconds feels slow. Sentry tracks cold and warm starts automatically as part of its mobile vitals, so you do not need to write manual timers.
Example (Android with Sentry ≥ 7.4.0):
import android.app.Application
import io.sentry.android.core.SentryAndroid

class MyApplication : Application() {
    override fun onCreate() {
        super.onCreate()
        // Initialize as early as possible so app start is measured from the beginning
        SentryAndroid.init(this) { options ->
            options.setEnablePerformanceV2(true)
            options.setEnableAppStartProfiling(true)
        }
    }
}
When enabled, Sentry records app.start.cold and app.start.warm as part of your transactions, and breaks down all start phases (before runtime, runtime init, UI initialization, and the first frame render).
Pro tip: Sentry allows you to segment regressions by operating system, device, or release.
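Sentry does this segmentation in its own UI. If you also export start-up samples for your own dashboards, the same idea can be sketched in a few lines. ColdStart, medianBySegment, and slowSegments are hypothetical names, and the 2000 ms budget simply reuses the ~2 second rule of thumb above.

```kotlin
/** Hypothetical record of one measured cold start. */
data class ColdStart(val release: String, val osVersion: String, val durationMs: Long)

/** Median cold start per segment; keyOf picks the dimension (release, OS version, ...). */
fun medianBySegment(
    samples: List<ColdStart>,
    keyOf: (ColdStart) -> String
): Map<String, Long> =
    samples.groupBy(keyOf).mapValues { (_, group) ->
        val sorted = group.map { it.durationMs }.sorted()
        sorted[(sorted.size - 1) / 2] // lower median for even-sized groups
    }

/** Segments whose median cold start exceeds the budget. */
fun slowSegments(medians: Map<String, Long>, budgetMs: Long = 2000): Set<String> =
    medians.filterValues { it > budgetMs }.keys
```

Passing the grouping key as a function lets the same code slice by release, OS version, or any other field without duplication.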
5. Developing a Lightweight Application Telemetry Layer
Telemetry turns observability into something actionable. Beyond crash analytics and performance, it captures user paths, feature usage, and error context.
Abstract your SDK calls so that it is easy for you to swap vendors without needing to rewrite your app.
Example in Android:
interface TelemetryClient {
fun trackEvent(name: String, properties: Map<String, Any>? = null)
fun trackError(throwable: Throwable, context: Map<String, Any>? = null)
fun trackMetric(name: String, value: Double)
}
Then implement adapters:
import io.sentry.Sentry

class SentryTelemetry : TelemetryClient {
    override fun trackEvent(name: String, properties: Map<String, Any>?) {
        Sentry.captureMessage("$name: $properties")
    }
    override fun trackError(throwable: Throwable, context: Map<String, Any>?) {
        // Attach the context to this event only, instead of silently dropping it
        Sentry.captureException(throwable) { scope ->
            context?.forEach { (key, value) -> scope.setExtra(key, value.toString()) }
        }
    }
    override fun trackMetric(name: String, value: Double) {
        // store in custom metrics
    }
}
If you abstract your SDK calls this way, replacing Sentry with Datadog, Bugsnag, or any other platform becomes a change to your DI configuration rather than a rewrite.
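As a rough illustration of what the abstraction buys you, the sketch below adds two more adapters behind the same interface: one that records calls in memory and one that fans out to several vendors. InMemoryTelemetry and CompositeTelemetry are hypothetical names; the interface is repeated only so the sketch compiles on its own.

```kotlin
// Same shape as the TelemetryClient interface above, repeated for self-containment.
interface TelemetryClient {
    fun trackEvent(name: String, properties: Map<String, Any>? = null)
    fun trackError(throwable: Throwable, context: Map<String, Any>? = null)
    fun trackMetric(name: String, value: Double)
}

/** Hypothetical in-memory adapter, useful in unit tests and debug builds. */
class InMemoryTelemetry : TelemetryClient {
    val events = mutableListOf<String>()
    val errors = mutableListOf<Throwable>()
    val metrics = mutableMapOf<String, Double>()

    override fun trackEvent(name: String, properties: Map<String, Any>?) {
        events += name
    }
    override fun trackError(throwable: Throwable, context: Map<String, Any>?) {
        errors += throwable
    }
    override fun trackMetric(name: String, value: Double) {
        metrics[name] = value
    }
}

/** Hypothetical fan-out adapter: forwards every call to each delegate in order. */
class CompositeTelemetry(private val delegates: List<TelemetryClient>) : TelemetryClient {
    override fun trackEvent(name: String, properties: Map<String, Any>?) =
        delegates.forEach { it.trackEvent(name, properties) }
    override fun trackError(throwable: Throwable, context: Map<String, Any>?) =
        delegates.forEach { it.trackError(throwable, context) }
    override fun trackMetric(name: String, value: Double) =
        delegates.forEach { it.trackMetric(name, value) }
}
```

A fan-out adapter is one way to trial a second vendor alongside the current one: call sites keep depending on TelemetryClient, and only the wiring changes.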
6. Be Vendor Agnostic From Day One
Build your observability stack so that vendors can be swapped in and out easily. Early on, you will want to try out several vendors and compare their pricing, and an abstraction layer lets you do that without lock-in:
- Use your own TelemetryClient to wrap all SDK calls.
- Create event schemas ("AppLaunch," "ANRDetected") so that changing vendors only requires re-mapping fields.
- Keep your SDK init/config info in a single module.
If you do it this way, you are never locked in, and you can respond to price changes (or feature changes) without pain.
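One way to read the event-schema bullet: the app owns the event names and fields, and a small per-vendor mapping renames them at the edge. The sketch below assumes that design; AppEvent, FieldMapper, and the vendor-side field names used in the example are hypothetical.

```kotlin
/** Hypothetical app-owned event schema; event names mirror the examples in the list above. */
sealed class AppEvent(val name: String) {
    abstract fun fields(): Map<String, Any>

    data class AppLaunch(val coldStartMs: Long) : AppEvent("AppLaunch") {
        override fun fields() = mapOf<String, Any>("coldStartMs" to coldStartMs)
    }

    data class AnrDetected(val route: String, val blockedMs: Long) : AppEvent("ANRDetected") {
        override fun fields() = mapOf<String, Any>("route" to route, "blockedMs" to blockedMs)
    }
}

/** Renames schema fields to a vendor's names; unmapped fields pass through unchanged. */
class FieldMapper(private val renames: Map<String, String>) {
    fun toVendorPayload(event: AppEvent): Map<String, Any> =
        event.fields().mapKeys { (key, _) -> renames[key] ?: key }
}
```

For example, FieldMapper(mapOf("route" to "screen_name")) would send the route under a vendor-specific key while the rest of the app keeps using the schema's own names, so a vendor change touches one mapping table.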
7. Other Tools Like Sentry You Should Consider
Sentry is one of the most popular observability platforms for mobile, but it is not the only one worth considering. If you would like to investigate alternatives, or combine tools, here are some good options:
- Bugsnag (by SmartBear) – Mobile-first crash and ANR analytics, stability scores, and release health tracking.
- Firebase Performance Monitoring – Lightweight and easy for small teams; integrates with the rest of Google's Firebase ecosystem.
- Datadog Mobile RUM – Full end-to-end mobile-to-backend transaction tracing, and strong when dealing with more complex systems.
- Instabug – Combines in-app bug reporting, user feedback, and crash analytics.
- Casualty – Focused on real-time mobile performance metrics and regression alerts.
- New Relic Mobile – Cross-platform telemetry that correlates with backend services.
With a vendor-agnostic approach as discussed in this article, you can start with one tool and change (or add) your vendor without having to rewrite your app's core observability code.
Conclusion
- Treat Android ANRs as P0s: debug them with stack sampling and correlate them back to routes.
- Track latency at the transaction level rather than per API.
- Use Sentry's cold start tracking to get accurate launch performance numbers without manual timers.
- Collect telemetry through a vendor-agnostic layer so that you can change tools at any time.
- Be aware of your alternatives: there is a full observability ecosystem beyond Sentry.