The Innards of mabl: The Whys Behind the Technologies Used in mabl
Many have been curious about the technology that sits behind the UI of mabl, an ML-driven automated end-to-end testing service. So, I thought I'd share the details.
We started building mabl, the ML-driven automated end-to-end testing service, in March 2017. Along the way, we've made many decisions about the technology used to build the service. Late in 2017, we began to show the service to alpha customers, and since then many have been curious about the technology that sits behind the UI. So, I thought I'd share the details.
Beginning with the infrastructure layer, we had to decide where to run mabl. Our early engineering team researched this decision. As we compared AWS and GCP, we focused on the core components we knew the application would use, and looked at how each provider offered them.
We knew we wanted to be serverless as much as possible, so we'd be heavy users of containers including a container orchestration service, as well as a compute-on-demand service. In addition, since we were designing an application that would use several different analytics layers, we knew we needed a data pipeline, database/query service, and machine learning capabilities. Ultimately, we decided that because Google offered an off-the-shelf version of Kubernetes called Google Kubernetes Engine, it provided less lock-in to the cloud provider if we ever wanted to switch, unlike AWS, which offered a custom version of Kubernetes.
On the analytics front, we did a bake-off of Dataflow vs. Kinesis and BigQuery vs. DynamoDB, and pitted the ML services against one another. In all of these bake-offs, GCP's analytics services were a much better fit for our use case than AWS's, for reasons which we've documented. We've now been on GCP for over a year, and we couldn't be happier. We use many of their services, including Cloud Functions, Cloud Storage, Key Management Service, Stackdriver, Firebase, App Engine, and, of course, the core services that drove our decision.
Moving from back-end to front-end, we evaluated several different front-end frameworks to support mabl, which we knew would be a single-page app. This collection included Angular, Angular 2, React, Ember, Backbone, and the newest trendy sensation, Vue.
For this research project, the engineer on our team with the most front-end experience led the analysis. The variables included ecosystem support, how quickly we could get up and running, how quickly new features were moving from beta to GA, and whether the project had a major contributor behind it, e.g. Facebook with React or Google with Angular. The shortlist came down to Angular 2 vs. React, and we ultimately decided on React for several reasons.
The first reason was the community and its satisfaction. During the gap between Angular 1 and 2, a large community of engineers emerged around using, developing, and supporting React, and their feedback was quite positive. Many developers we spoke to praised how quickly they were able to add new features in React, albeit with a steeper learning curve than other frameworks they had worked with.
We also appreciated the performance we'd get using React, a large part of which comes from how it renders DOM elements in the browser. When paired with Redux, we're able to manage application state centrally and improve performance by controlling when the app re-renders in response to state changes.
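To make the idea of central state management concrete, here is a minimal sketch of a Redux-style store written from scratch in TypeScript. It is not mabl's code and does not use the Redux library itself; the state shape, action names, and `createStore` helper are all hypothetical and chosen only for illustration. One simplification to note: this sketch skips notifying listeners when the reducer returns the same state object, whereas in a real React and Redux setup that comparison is typically done by the React bindings rather than by the store.

```typescript
// Minimal Redux-style store, written from scratch for illustration.
// Every name here (TestRunState, Action, createStore, ...) is hypothetical.

type TestRunState = {
  readonly runs: readonly string[];
  readonly selectedRun: string | null;
};

type Action =
  | { type: "RUN_ADDED"; id: string }
  | { type: "RUN_SELECTED"; id: string };

const initialState: TestRunState = { runs: [], selectedRun: null };

// A reducer is a pure function: (state, action) -> next state.
// Returning the same object when nothing changed lets callers skip work.
function reducer(state: TestRunState, action: Action): TestRunState {
  switch (action.type) {
    case "RUN_ADDED":
      return { ...state, runs: [...state.runs, action.id] };
    case "RUN_SELECTED":
      return state.selectedRun === action.id
        ? state
        : { ...state, selectedRun: action.id };
    default:
      return state;
  }
}

type Listener = () => void;

function createStore(
  reduce: (s: TestRunState, a: Action) => TestRunState,
  initial: TestRunState
) {
  let state = initial;
  const listeners: Listener[] = [];
  return {
    getState: (): TestRunState => state,
    subscribe: (listener: Listener): void => {
      listeners.push(listener);
    },
    dispatch: (action: Action): void => {
      const next = reduce(state, action);
      // Simplification: only notify when the state reference changed.
      if (next !== state) {
        state = next;
        listeners.forEach((l) => l());
      }
    },
  };
}

// Usage: count how often a (pretend) UI would need to re-render.
const store = createStore(reducer, initialState);
let renderCount = 0;
store.subscribe(() => {
  renderCount += 1;
});

store.dispatch({ type: "RUN_ADDED", id: "run-1" });
store.dispatch({ type: "RUN_SELECTED", id: "run-1" });
store.dispatch({ type: "RUN_SELECTED", id: "run-1" }); // unchanged state, no notification
```

Because all updates flow through one reducer, the third dispatch above produces no new state and therefore triggers no listener, which is the kind of re-render control the paragraph above refers to.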
Overall, what we learned while researching front-end frameworks has held true: the learning curve is steep, but we're able to add new features very quickly, much more quickly than at my prior company, where we used Angular.
Next came the decision of which language to write the application in. Our engineering team had a broad range of skill sets, but we wanted to make an objective decision rather than default to whichever language most of our engineers already had experience with.
That brought us to our second-easiest decision: choosing between Python and Java for the back-end. We wanted a language that allowed us to move fast. We felt Java would be easier to debug and easier for people to reason about, since it is statically typed and compiled. Given that we didn't have many advocates for Python and we thought Java would allow us to move faster, we landed on Java.
Lastly, we had to decide which languages to use for the machine learning and analytics components of the application. We looked at both the types of machine learning models we were training (performance anomaly detection and visual change detection) and the data pipeline.
As we dug deeper, we found that the best support and libraries for our machine learning models (including TensorFlow) were in Python, so we decided on Python for the training side. On the analysis front, we wanted to use GCP Dataflow (Apache Beam), which supports Python only for batch, not streaming, so we had to use Java in order to use Dataflow for streaming.
Although we spent a lot of time on each of these decisions in the early days of mabl, that's not to say we won't revisit them over time. Our requirements may change, we may turn out to have been wrong about one of our decisions, or something else may prompt us to switch. If you feel there's an area of our application that could benefit from a different technology, we'd love to hear your perspective in the comments below.
Published at DZone with permission of Izzy Azeri, DZone MVB. See the original article here.