Full Stack Engineering in the Age of Hybrid Cloud
This article is featured in the new DZone Guide to Building and Deploying Applications on the Cloud.
Hybrid cloud describes an architecture that allows businesses to procure platform services from multiple sources. This implies that these services are divided across some physical boundary, each possibly with its own disparate characteristics with regard to use and operation. For the Full Stack Engineer (FSE), an engineer with experience understanding the various layers that comprise an application stack, inclusive of the physical and logical architectures, hybrid cloud will introduce additional levels of complexity and challenge the capabilities of even the best FSE.
In its November 2015 report, “IDC FutureScape: Worldwide Cloud 2016 Predictions — Mastering the Raw Material of Digital Transformation,” analyst firm IDC predicts that more than 80% of enterprise IT organizations will commit to hybrid cloud architectures by 2017. The driver behind this will be the desire to leverage the best and most compliant platform for a given application component. While this is a strong business strategy, it often ignores the consequences of operating these applications across a variety of cloud service providers and, more importantly, of performing root cause analysis when something doesn’t work as planned.
According to a recent IDC report entitled “DevOps and the Cost of Downtime: Fortune 1000 Best Practice Metrics Quantified,” infrastructure failure costs large enterprises $100,000 per hour on average. Critical application failures exact a far steeper toll, from $500,000 to $1 million per hour. This is often due to the lack of ownership by any single group. The difficulty will only become more problematic as applications are divided across cloud boundaries.
This leaves many enterprises with the considerable burden of figuring out how to deploy modern or cloud-native applications across multiple architectures and how to manage the release and operations of these applications. The answer to this problem requires more than just tearing down the silos; it requires employing more full stack engineers.
In today’s cloud world, the FSE is someone who understands the nature of a distributed application, regardless of whether its components are operating on bare metal servers in corporate-owned data centers or on public clouds. They may not be an expert on every component involved in making that application work, but they understand the basic flow of data across the entire application and the impact of the various components on application performance, and they know which questions to ask of subject matter experts to arrive at an understanding when an application is not operating at its defined service levels.
Value of FSE and the Case of the Chatty Application
A chatty application is one that tends to flood the network with messages regarding its activities. These messages include the application data itself on a data plane along with management data packets on a control plane. Under most normal conditions, the chattiness of the application is not observable by end users, but it may result in connectivity issues and failures due to latency.
Given these outcomes, an end user would probably call the service desk to report the issue. If the resulting ticket gets routed to the network and infrastructure team, they may assess the problem, determine that the chatty application is causing issues, and isolate that traffic using the quality of service (QoS) capabilities of the router. If the ticket gets routed to the level 2 engineering support team, they may determine that the application that failed needs to be modified to better deal with latency.
Ultimately, we don’t know which solution is the best one. Correcting the issue in the infrastructure means implementing QoS rules that may be operational 100% of the time even though the problem only occurs 20% of the time. Correcting the issue in code means that we now need to maintain specialized code for a unique circumstance that may be highly dependent upon other network events that are occurring, such as a corporate town hall meeting over the intranet.
In the end, the FSE could best assess the conditions under which the problem occurred and trace the data flow from the network through to the application. The FSE may determine that the application is unnecessarily chatty and that the appropriate fix would be to limit messages on the control plane. However, the appropriate solution can be implemented only by understanding the context in which the problem occurred (an end user asserting that the system responds with high latency at times) and having the skills to assess the function of the application within that context.
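To make the idea of limiting messages on the control plane more concrete, here is a minimal sketch of one possible approach: coalescing status updates so that each key is sent at most once per interval. The class name `ControlMessageThrottle`, the `send` callback, and the interval value are hypothetical and chosen only for illustration; a real fix would depend on the application's own messaging layer.

```python
import time
from typing import Callable, Dict


class ControlMessageThrottle:
    """Coalesce control-plane updates: at most one send per key per interval.

    Hypothetical helper for illustration only; the names and the `send`
    callback are assumptions, not part of any real product or library.
    """

    def __init__(self, send: Callable[[str, str], None], min_interval_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send = send
        self._min_interval_s = min_interval_s
        self._clock = clock                      # injectable so tests can control time
        self._last_sent: Dict[str, float] = {}   # key -> time of the last send
        self._pending: Dict[str, str] = {}       # key -> latest value held back

    def publish(self, key: str, value: str) -> bool:
        """Send now if the key is outside its quiet period; otherwise keep only the latest value."""
        now = self._clock()
        last = self._last_sent.get(key)
        if last is None or now - last >= self._min_interval_s:
            self._send(key, value)
            self._last_sent[key] = now
            self._pending.pop(key, None)
            return True
        self._pending[key] = value  # overwrite any older held-back value
        return False

    def flush(self) -> int:
        """Send held-back values whose quiet period has expired; return how many were sent."""
        now = self._clock()
        sent = 0
        for key in list(self._pending):
            if now - self._last_sent[key] >= self._min_interval_s:
                self._send(key, self._pending.pop(key))
                self._last_sent[key] = now
                sent += 1
        return sent
```

With a five-second interval, three rapid updates for the same key would produce one immediate message and, after the interval expires and `flush()` is called, one more carrying only the latest value. Whether dropping the intermediate updates is acceptable is exactly the kind of judgment that requires understanding the application in context.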
What The FSE Needs to Know
Solving the aforementioned chatty application problem requires a certain set of skills that defines the boundaries of a full stack engineer. The following is a sampling of skills that a full stack engineer might have:
- Networking – The FSE should have a detailed understanding of IP networking inclusive of routing. Additionally, the FSE should be able to use packet tracing and analysis tools.
- Servers – The FSE should have a detailed understanding of server configuration management inclusive of operating systems and hypervisors. This includes both bare metal and virtual servers as well as data center tools, such as KVM switching and keyboard logging.
- Application Infrastructure – The FSE should have an understanding of how various components of application infrastructure work, such as databases, message queuing, mail servers, web servers, and application servers. While it’s not feasible for any one individual to know all products in this space, there are certain standards and common implementations, such as SMTP, JMS, AMQP, and Tomcat, that the FSE should be very knowledgeable about.
- Data Persistence and Modeling – An FSE should have an understanding of how the data persistence architecture may impact performance. This includes:
  - How the data is physically stored, for example direct-attached storage or a Storage Area Network (SAN), as well as the types of medium used, such as flash, Solid State Drives (SSD), or Hard Disk Drives (HDD).
  - How the data is logically represented, such as a file or a database.
  - How the data is organized within the logical format.
- Data Flow – For a given application, an FSE should be knowledgeable about how the data moves between the various components of the application and across the application infrastructure. This should also account for how the physical architecture is connected.
- Information Security – Ideally, the FSE might also be a Certified Information Systems Security Professional (CISSP), which means that they will have a proven understanding of multiple domains surrounding information security. At a minimum, the FSE will need to understand authentication and authorization, firewalling, data loss prevention, and logging.
With these skills in place, the business will have a high likelihood of achieving solid root cause analysis in cases of systems and service outages, as well as someone who can review designs for implementation and provide guidance and assurances. Of course, the business could get this same result by hiring multiple individuals with domain expertise in each of these areas. Even then, however, it would lose the understanding of how these various domains impact each other. This is the true value of the FSE.
The FSE in a Hybrid Cloud World
As if being an FSE were not complex enough, the emergence of the hybrid cloud architecture introduces a whole new level of complexity. In addition to the aforementioned skills, the FSE must now also understand applications operating across resources that they do not directly control. This includes the public internet, shared Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) environments, and cloud services with their assorted Application Programming Interfaces (APIs), all operated within a pay-for-use business model.
This is a real game changer for the FSE in today’s world, as they must now find new methods of analyzing operational performance using only what the cloud service providers make available. In contrast, there is considerably more transparency when the FSE has access to the physical device specifications. Additionally, the FSE is now expected to estimate the costs of operating these applications based on usage projections and service pricing, one of the few skills that is inherently non-technical.
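As a rough illustration of estimating cost from usage projections and service pricing, the following sketch multiplies projected monthly usage by unit prices. The type names, the three usage dimensions, and every price shown are hypothetical placeholders, not any provider's actual pricing model; real price sheets typically include tiers, discounts, and many more line items.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageProjection:
    """Projected monthly usage (hypothetical dimensions chosen for illustration)."""
    compute_hours: float   # instance-hours per month
    storage_gb: float      # average GB stored over the month
    egress_gb: float       # GB transferred out per month


@dataclass(frozen=True)
class PriceSheet:
    """Unit prices in one currency (placeholder values, not real provider prices)."""
    per_compute_hour: float
    per_storage_gb_month: float
    per_egress_gb: float


def estimate_monthly_cost(usage: UsageProjection, prices: PriceSheet) -> float:
    """Return the projected monthly cost, rounded to two decimal places."""
    total = (
        usage.compute_hours * prices.per_compute_hour
        + usage.storage_gb * prices.per_storage_gb_month
        + usage.egress_gb * prices.per_egress_gb
    )
    return round(total, 2)


# Example: two instances running all month (2 x 720 hours), with made-up prices.
projection = UsageProjection(compute_hours=1440, storage_gb=500, egress_gb=200)
prices = PriceSheet(per_compute_hour=0.10, per_storage_gb_month=0.02, per_egress_gb=0.09)
monthly_cost = estimate_monthly_cost(projection, prices)
```

Keeping the projection and the price sheet as separate inputs makes it straightforward to compare the same workload across several providers by swapping only the price sheet.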
In line with current DevOps thinking, the FSE approach raises concerns about creating a culture of “heroes.” That is, how does an organization develop this capability without having it become a single point of responsibility for all systems and thus a bottleneck to design and delivery? Scalability will certainly be an issue here, as it will be difficult to find many of these people.
Businesses are going to have to rely on tooling to help with this problem. In this model, the FSEs will need to help develop blueprints that analysts can use to watch for and identify common issues, pulling in the FSE for decision management. This is akin to surveillance analysts in the military, who are responsible for discerning intelligence from multiple sources and then providing that information to senior military officials, who decide on appropriate action. In this case, the FSE can assess the intelligence gathered by operations analysts.
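One way such a blueprint could look in tooling is a small set of rules that analysts run against collected metrics, with a flag for when to pull in the FSE. The sketch below is an assumption-laden illustration: the metric names, thresholds, and the rule that issues spanning more than one domain are escalated are all hypothetical, not an established practice described in this article.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Metrics = Dict[str, float]


@dataclass(frozen=True)
class Blueprint:
    """A named rule for a common issue (hypothetical structure for illustration)."""
    name: str
    domain: str                          # e.g. "network", "application", "storage"
    matches: Callable[[Metrics], bool]   # returns True when the issue is present


# Illustrative rules; metric names and thresholds are made up.
BLUEPRINTS: List[Blueprint] = [
    Blueprint("high-latency", "network", lambda m: m.get("p95_latency_ms", 0) > 500),
    Blueprint("queue-backlog", "application", lambda m: m.get("queue_depth", 0) > 1000),
    Blueprint("disk-saturation", "storage", lambda m: m.get("disk_util_pct", 0) > 90),
]


def triage(metrics: Metrics, blueprints: Sequence[Blueprint] = BLUEPRINTS) -> dict:
    """Match metrics against blueprints and flag cross-domain cases for the FSE."""
    hits = [b for b in blueprints if b.matches(metrics)]
    domains = sorted({b.domain for b in hits})
    return {
        "issues": [b.name for b in hits],
        "domains": domains,
        # Assumed policy: a single-domain issue goes to that domain's team,
        # while issues spanning several domains are escalated to the FSE.
        "escalate_to_fse": len(domains) > 1,
    }
```

For example, `triage({"p95_latency_ms": 800, "queue_depth": 5000})` would match both the network and application rules and set `escalate_to_fse`, while a lone disk saturation alert would stay with the storage team.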
Sadly, there are no university programs or training courses one can take to become a full stack engineer. Some of the knowledge will be learned on the job, but for most who attain this level of understanding, it comes at the price of a personal investment of time and money to acquire and use the technologies that an FSE must know. Not surprisingly, most FSEs come from an application development background and learn the other skills as a way to deploy and debug their own applications.
The good news is that taking the time to learn the skills necessary to be considered a full stack engineer is definitely worth the effort. It will be one of the more important requirements as companies drive for applied DevOps principles and attempt to shift as much of the responsibility for application quality as possible left, to the development stages. Those with the skills to be considered an FSE will be paid, on average, 20%-30% more than typical developers and will see easier growth into roles such as Chief Architect and Chief Technology Officer.