I believe that the debate over testing mobile apps on real devices vs. emulators is one of the oldest and most emotional debates in the mobile space.
In this blog, I would like to try to make "peace" between the two camps.
Before I give my POV on that, let's clarify the exact meaning of "testing mobile apps" so we're all on the same page (I wrote an entire book about it).
Mobile app testing has a wide scope that includes unit tests, functional end-to-end (E2E) tests, performance and UX tests, security and accessibility tests, and compatibility tests. Some innovative apps will also require advanced audio and gesture test cases for chatbots, as well as fingerprint and Face ID authentication.
While we're all familiar with the traditional test pyramid, when we apply it to mobile, the testing strategy often looks different, especially when we try to fit this broad test scope and the scale of the device/OS market into the testing cycle.
That's why organizations try to balance test types, software development life cycle (SDLC) phase, and other considerations to address the twin challenges of quality and velocity.
On one hand, emulators serve a very important goal during the SDLC. They greatly reduce costs for developers and testers. Some would argue that they are faster to set up and execute. They have lower error rates and, in most cases, are already embedded in the developer's environment, which is very convenient from a fast-feedback perspective.
On the other hand, emulators do not run the formal carrier OS builds, do not run on real device hardware, and cannot cover environment-dependent testing such as specific carrier configurations or unique sensors and gestures. Most importantly, they cannot serve as the single Go/No-Go decision platform. End users consume the app from various locations and environments around the world, not on emulators but on real devices, with varying conditions, OS versions, and many competing apps running in the background. Real devices, therefore, should be the formal test bed for your advanced testing activity.
The best practice for mobile app testing should rely on a mix of tests, handled by different personas in the mobile product team and spread between emulators and real devices based on the build phase. In the early sprint phases, when features are still taking shape, it makes a lot of sense to run smoke tests, unit tests and other fast validations against emulators from the developer environment. Later in the build process, when the coverage requirements and the quality insights are greater, launching the full testing scope in parallel against real devices is the right way to go.
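To make this phase-based split concrete, here is a minimal sketch of such a policy. The phase names, suite lists, and function are illustrative assumptions of mine, not any vendor's or team's actual configuration:

```python
# Hypothetical phase-based test-platform policy. All names (phases,
# suites, platforms) are illustrative assumptions, not a real API.
PHASE_PLAN = {
    # Early sprint: fast feedback on emulators from the dev environment.
    "commit": {"platform": "emulator", "suites": ["unit", "smoke"]},
    # Mid-build: broader functional coverage, still emulator-based.
    "nightly": {"platform": "emulator",
                "suites": ["unit", "smoke", "functional"]},
    # Release candidate: the full scope, in parallel, on real devices.
    "release": {"platform": "real-device",
                "suites": ["functional-e2e", "performance", "accessibility",
                           "network-conditions", "sensors-and-gestures"]},
}

def plan_for(phase: str) -> dict:
    """Return the target platform and test suites for a build phase."""
    return PHASE_PLAN[phase]
```

The point of encoding the policy as data is that the same CI pipeline can look up "where should this build run, and what should run on it" instead of hard-coding emulators everywhere.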
Real Devices vs. Emulators - LinkedIn Case Study
I presented last month at Android Summit 2017. The event was awesome, and I enjoyed great sessions and community networking. One of the presentations at this conference was given by Drew Hannay, a Staff Software Engineer at LinkedIn.
LinkedIn (acquired by Microsoft) now serves more than half a billion users across multiple platforms, including web and mobile. According to the company, it is constantly trying to attract new users, mostly young ones (college graduates, estimated at 180M by LinkedIn's university campaign) or users from expanding geographies such as India and China, the latter alone accounting for 140M users.
LinkedIn suffered in the past from poor quality, stability issues and bad reviews. To solve those quality issues while continuing to grow, LinkedIn announced project Voyager more than a year ago: a new release model and SDLC strategy aimed at improving both quality and release velocity for the organization. LinkedIn shifted to an admirable 3×3 release strategy: as Drew presented, thanks to project Voyager, LinkedIn can now push a new version to production three times a day, with no more than three hours from a developer's code commit.
While the improved release cadence allows the LinkedIn product team to adapt to changes quickly, fix bugs faster and innovate, the end-user experience has somehow dropped, and many mobile users are vocal about it in the app stores, stating that they would rather use the desktop browser than the mobile app. The thing about quality, especially in the mobile space, is that it is "a moment in time." With such an aggressive release schedule, the build that was in production three hours ago is already irrelevant from a quality and stability perspective. The product team must therefore know at all times how the app works on real devices in production, rather than assuming it works based on emulator testing alone.
As shown in the above image, LinkedIn runs its entire test scope on 16 emulators in parallel as part of its test plan. There is zero coverage on real devices because, as Drew stated, "our implementation is agnostic across devices."
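For context on what running "on 16 emulators in parallel" typically means mechanically, a runner shards the test suite across the emulator instances. A minimal round-robin sketch, with a function name and structure that are my own assumptions rather than LinkedIn's actual tooling:

```python
# Hypothetical round-robin sharding of a test suite across N parallel
# emulator instances (16 in the setup described in the talk).
def shard_tests(tests, num_shards=16):
    """Distribute test names across num_shards buckets, one per emulator."""
    shards = [[] for _ in range(num_shards)]
    for i, test in enumerate(tests):
        shards[i % num_shards].append(test)
    return shards
```

Sharding like this cuts wall-clock time roughly by the shard count, which is exactly why emulator farms give such fast feedback; the catch is that every shard still exercises the same virtualized hardware.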
If we validate that statement and strategy against the app store reviews, as well as my own personal device experience, the implementation isn't that device-agnostic, and many defects escape to production and impact real-device users.
The experiences above include crashes of the app on Android devices when switching from Wi-Fi to a real carrier network, invitations sent from mobile to various connections not working properly, sync issues between the LinkedIn mobile feed and what's shown in the browser version, installation issues on real devices, and more.
With the majority of traffic to LinkedIn coming from mobile devices, the LinkedIn testing strategy should be tuned accordingly.
Given the stats above and the quality reality LinkedIn users face, the testing strategy needs to change. Future usage growth is expected to come from India, China and other non-U.S. demographics, building the case for testing across different devices, OS versions, form factors and network conditions.
LinkedIn needs to base its mobile testing strategy on real-life personas operating from different geographic locations, under varying conditions, with background apps running and more.
Testing on emulators is essential and should remain part of the strategy, but it cannot be the only platform for testing this app; as seen above, it does not guarantee continuous quality and UX.