Anyone involved in hiring DevOps engineers will realize that it is hard to find candidates who have all the skills listed in the job description. Most of the experienced applicants have deep knowledge of just a few tools. So, you look for people with the essential skills, intending to train them on the job. However, what are those essential skills?
That question nags the headhunters scouting for talent, the hiring managers who sift through the resumes of prospective candidates, and the interviewers who determine whether or not an applicant would deliver. DevOps job seekers face similar uncertainty as they wonder which skills to highlight or get trained on before approaching a prospective employer.
With so many tools and processes linked to the DevOps practice, and with almost any technology relatable to DevOps in some way, it becomes very confusing to differentiate the essential skills from the desired or the unrelated.
I don’t think that there is any single recipe to sort out this problem. Ultimately, the skill set needed for an incoming DevOps engineer would depend on the current and short-term focus of the operations team. A brand new team that is rolling out a new software service would require someone with good experience in infrastructure provisioning, deployment automation, and monitoring. A team that supports a stable product might require the service of an expert who could migrate home-grown automation projects to tools and processes around standard configuration management and continuous integration tools.
An experienced DevOps engineer would end up working across a very broad swath of the technology landscape, one that overlaps with software development, system integration, and operations engineering. Sometimes, DevOps practice is the glue between these related engineering disciplines.
To become that glue, bridging the gap (or pulling down the wall) between the teams, both technical and soft skills are needed. However, the focus of this article is on the technical skills. An experienced DevOps engineer should be able to describe most of the technologies covered in the following sections. The list is meant to be comprehensive, so that you can compare your own expertise against it and use it as a reference template for acquiring new skills.
Knowledge of Infrastructure
A DevOps engineer should have a good understanding of both classic (datacenter-based) and cloud infrastructure components, even if the team would have dedicated infrastructure engineers.
This involves how real hardware (servers and storage devices) is racked, networked, and accessed from both the corporate network and the internet. It also involves the provisioning of shared storage to be used across multiple servers and the methods available for that, as well as infrastructure and methods for load balancing.
- Virtual machines.
- Object storage.
- Running virtual machines on PC and Mac (Vagrant, VMware, etc.).
Cloud infrastructure has to do with core cloud computing and storage components as they are implemented in one of the popular virtualization technologies (VMware or OpenStack). It also involves the idea of elastic infrastructure and the options available to implement it.
- Network layers.
- Routers, domain controllers, etc.
- Networks and subnets.
- IP address.
- IP tables.
- Network access between applications (ACL).
- Networking in the cloud (e.g., AWS).
- Load balancing infrastructure and methods.
- Geographical load balancing.
- Understanding of CDN.
- Load balancing in the cloud.
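To make the "networks and subnets" item concrete, here is a minimal sketch using Python's standard `ipaddress` module. The `10.0.0.0/16` address plan and the `find_subnet` helper are assumptions made up for illustration; they are not tied to any particular datacenter or cloud provider.

```python
import ipaddress

# Hypothetical address plan, used only for illustration.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Split the /16 into /24 subnets and keep the first three.
subnets = list(vpc.subnets(new_prefix=24))[:3]


def find_subnet(address, candidates):
    """Return the first subnet that contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for net in candidates:
        if ip in net:
            return net
    return None


if __name__ == "__main__":
    for net in subnets:
        # Each /24 holds 256 addresses in total.
        print(net, "addresses:", net.num_addresses)
    print("10.0.1.25 belongs to", find_subnet("10.0.1.25", subnets))
```

A candidate who is comfortable with this kind of reasoning (which subnet a host lands in, and how many addresses a prefix allows) will usually find cloud network design and ACL rules much easier to pick up.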
A DevOps engineer should have experience using specialized tools for implementing various DevOps processes. While Jenkins, Ansible, and the like are known to almost everyone, the need for other tools might be less obvious (such as the importance of knowing one major monitoring tool inside and out). Some tools, such as source code control systems, are shared with development teams.
The list here has only examples of basic tools. An experienced DevOps engineer would have used some application or tool from all or most of these categories.
Source Code Management (SCM) System
- Expert-level knowledge of an SCM system such as Git or Subversion.
- Knowledge of code branching best practices, such as Git-Flow.
- Knowledge of the importance of checking in Ops code to the SCM system.
- Experience using GitHub.
Bug Management System
- Experience using a major bug management system such as Bugzilla or Jira.
- Ability to define a workflow for the bug filing and resolution process.
- Experience integrating SCM systems with the bug resolution process using triggers or REST APIs.
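One common way to link an SCM system to a bug tracker is to scan commit messages for issue keys and then comment on the matching issues. The sketch below covers only the parsing step. The Jira-style key pattern (`OPS-123`) and the helper names `issue_keys` and `build_comments` are assumptions for illustration, and the actual REST call to the tracker is deliberately left out.

```python
import re

# Assumed Jira-style key format such as "OPS-123"; adjust for your tracker.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def issue_keys(commit_message):
    """Return the unique issue keys in order of first appearance."""
    seen = []
    for key in ISSUE_KEY.findall(commit_message):
        if key not in seen:
            seen.append(key)
    return seen


def build_comments(commit_sha, commit_message):
    """Map each referenced issue to the comment text a hook might post."""
    short_sha = commit_sha[:7]
    keys = issue_keys(commit_message)
    if not keys:
        return {}
    subject = commit_message.splitlines()[0]
    return {key: f"Referenced by commit {short_sha}: {subject}" for key in keys}


if __name__ == "__main__":
    message = "OPS-12: rotate deploy keys\n\nAlso touches WEB-7 and OPS-12."
    for key, comment in build_comments("abcdef1234567890", message).items():
        print(key, "->", comment)
```

In a real setup, a post-receive hook or CI trigger would pass each commit to something like this and send the resulting comments to the tracker's API.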
Collaborative Documentation System
- Knowledge of Wiki basics.
- Experience using MediaWiki, Confluence, etc.
- Knowledge of why DevOps projects have to be documented.
- Knowledge of how documents were organized on a Wiki-based system for past projects.
Build and CI
- Experience building on Jenkins.
- Experience using Jenkins as a Continuous Integration (CI) platform.
- Experience with CI platform features such as:
  - Integration with SCM systems.
  - Secret management and SSH-based access management.
  - Scheduling and chaining of build jobs.
  - Source-code change based triggers.
  - Controller and agent (formerly master and slave) nodes.
  - REST API support and notification management.
- Should know what artifacts are and why they have to be managed.
- Experience using a standard artifacts management system such as Artifactory.
- Experience caching third-party tools and dependencies in-house.
- Should be able to explain configuration management.
- Experience using any Configuration Management Database (CMDB) system.
- Experience using open-source tools such as Cobbler for inventory management.
- Ability to do both agent-less and agent-driven enforcement of configuration.
- Experience using Ansible, Puppet, Chef, Cobbler, etc.
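The idea behind configuration enforcement, whether agent-less or agent-driven, is idempotency: compare the desired state with the actual state, change only what differs, and make a second run a no-op. Here is a minimal sketch of that idea using plain dictionaries. The setting names and the `plan_changes` and `enforce` helpers are hypothetical, and this is not how Ansible, Puppet, or Chef are implemented internally.

```python
def plan_changes(desired, actual):
    """List the settings whose actual value differs from the desired one."""
    changes = {}
    for key, value in desired.items():
        if actual.get(key) != value:
            changes[key] = value
    return changes


def enforce(desired, actual):
    """Apply only the differing settings and return what was changed."""
    changes = plan_changes(desired, actual)
    actual.update(changes)
    return changes


if __name__ == "__main__":
    # Hypothetical settings for a single host.
    desired = {"ntp_server": "time.example.internal", "max_open_files": 65536}
    host = {"ntp_server": "old.example.internal", "max_open_files": 65536}

    print("first run changed:", enforce(desired, host))
    print("second run changed:", enforce(desired, host))
```

Being able to explain why the second run reports nothing to do is a good sign that a candidate understands configuration management rather than just the syntax of one tool.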
Orchestration and Deployment
- Knowledge of the workflow of released code getting into production.
- Ability to push code to production with the use of SSH-based tools such as Ansible.
- Ability to perform on-demand or Continuous Delivery (CD) of code from Jenkins.
- Ability to perform agent-driven code pull to update production environment.
- Knowledge of deployment strategies, with or without an impact on the software service.
- Knowledge of code deployment in the cloud (using auto-scaling groups, machine images, etc.).
- Knowledge of all monitoring categories: system, platform, application, business, last-mile, log management, and meta-monitoring.
- Status-based monitoring with Nagios.
- Data-driven monitoring with Zabbix.
- Experience with last-mile monitoring, as done by Pingdom or Catchpoint.
- Experience doing log management with ELK.
- Experience with SaaS monitoring solutions (e.g., Datadog and Loggly).
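Status-based monitoring in the Nagios style comes down to small check programs that print one line and exit with a status code: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN. The sketch below shows that convention for disk usage. The 80% and 90% thresholds and the function names `classify` and `check_disk` are assumptions chosen for illustration, not values taken from any shipped plugin.

```python
import shutil
import sys

# Exit codes used by Nagios-style check plugins.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(used_percent, warn=80.0, crit=90.0):
    """Turn a usage percentage into a status code (thresholds are assumed)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path="/"):
    """Return (status, message) for disk usage at the given path."""
    try:
        usage = shutil.disk_usage(path)
    except OSError:
        return UNKNOWN, f"DISK UNKNOWN - cannot read {path}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero capacity"
    used_percent = 100.0 * usage.used / usage.total
    status = classify(used_percent)
    return status, f"DISK {LABELS[status]} - {used_percent:.1f}% used on {path}"


if __name__ == "__main__":
    status, message = check_disk()
    print(message)
    sys.exit(status)
```

Data-driven tools such as Zabbix take the opposite approach: they collect the raw number (here, the percentage) and evaluate thresholds on the server side, which is why knowing both models is useful.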
Stay tuned for Part II, where we'll talk about system tools and methods, programming, setting the bar high, and more.