17 Essential Skills for Growing Performance Engineers
It’s possible to dabble but harder to excel. Great performance engineers usually flesh out their skillsets with tuning expertise, development skills, and testing theory.
Performance engineering as a discipline goes back several decades. I’ve heard firsthand accounts of testing and optimization of software from the 1960s. Still, much of what we practice today has built up in the last twenty years or so, since the first generation of commercial performance testing tools started appearing.
It can be hard to describe all the different skills that go into performance engineering. Most people in the field agree that it is an intersection of disciplines that includes testing, optimization, and systems engineering. There are great depths to explore in each of these subjects, and we need to think them through if we want to work out how to train more performance engineers.
It turns out that this particular set of skills is difficult to accumulate organically. I recently met with a group of experienced performance engineers, and we talked about what we think the entry criteria are.
We tried to be reasonable in assessing what we learned along the way, and what we wished we had known at the beginning. After some debate, we ended up with seventeen skills that we think form a foundation for becoming a well-rounded performance engineer, and that we would offer as a course of study for people aspiring to the role.
Remember: Almost no one will have all of these skills before they become performance engineers. But you will need to develop them in order to be effective in this role.
Systems and Architecture Skills
1. Interpret and draw system diagrams.
2. Basic understanding of systems environments: Shared resources, components, and services; CPU, memory, storage, and network; soft resources.
3. The differences between production and test environments: Containers, cloud, virtualization, and configuration management.
Some exposure to resources and distributed systems is a good place to start. Understanding the difference in exhaustion behavior between memory and CPU is a good test for understanding resources. Visualizing and abstracting the system, then drawing a diagram checks for an architect’s eye.
I’ve heard it said that there aren’t more than a handful of people alive who could explain how every component of a modern system of hardware and software actually works. We all depend on abstraction and subject matter experts to help fill in all the hazy areas in our understanding. But can you read the index?
4. Goals, requirements, desirements, and stakeholders.
5. Concurrency, arrival rates, and scheduling.
6. Scalability, capacity, and reliability as quality attributes and requirements.
7. Test data and test data management.
The theory and practice of testing is an endlessly fascinating exercise in applied epistemology. Thinking through the design of a test – what information it will gather, what that information will mean, and where it is fallible – is great fun and really interesting (if you are doing it right).
There is no way to effectively understand that without understanding the context. What do the stakeholders need to know? What will help the rest of the team be successful?
8. Identifying transactions and workflows: Calculating workload TPS goals and rates.
9. Think time and pacing.
10. Log file analysis, running queries, and production monitoring.
Load tests are experiments. The parameters of the load test are the conditions of the experiment. In order for the experiment to be a valid predictor of future experience, the modeled conditions need to accurately project expected activity for the system’s workload. Describing these conditions requires counts and frequencies, which is more complex than the simple concurrency count most people think of when describing load amounts.
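As a rough illustration of why counts and frequencies matter more than a bare concurrency number, the sketch below turns an hourly volume goal into a TPS target, a virtual-user count, and a pacing interval. The function name, its parameters, and the sample figures are assumptions made for this example, not values from any particular tool or system.

```python
import math


def pacing_plan(transactions_per_hour, iteration_seconds, think_seconds=0.0):
    """Convert an hourly volume goal into load test parameters.

    Returns (target_tps, users, pacing_seconds). All inputs are
    hypothetical planning numbers supplied by the tester.
    """
    target_tps = transactions_per_hour / 3600.0
    # One virtual user finishes one iteration every cycle_seconds.
    cycle_seconds = iteration_seconds + think_seconds
    # Round up so the user pool can always reach the target rate.
    users = math.ceil(target_tps * cycle_seconds)
    # Interval between iteration starts for each user so that the
    # whole pool lands on the target rate rather than overshooting it.
    pacing_seconds = users / target_tps
    return target_tps, users, pacing_seconds


if __name__ == "__main__":
    tps, users, pacing = pacing_plan(18000, iteration_seconds=2.0, think_seconds=8.0)
    print(tps, users, pacing)
```

The same 50 users with no think time would generate far more load than the goal, which is why a concurrency count alone does not describe the experiment.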
Constructing these models requires distilling the guesses and projections of stakeholders and subject matter experts and then supplementing with data from monitoring systems and logs. Not every activity in the system can or should be part of a load test, but most of the critical and high-volume ones should.
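One way to supplement stakeholder estimates with log data is to count what production actually served. The sketch below assumes a made-up access log format (timestamp, method, path, status, duration, separated by spaces); real logs will differ, so the regular expression and the sample lines are purely illustrative.

```python
import re
from collections import Counter

# Assumed (hypothetical) line format:
#   2024-01-01T10:00:05 GET /checkout 200 0.231
LINE = re.compile(r"^(\S+)\s+(\S+)\s+(\S+)\s+(\d{3})\s+([\d.]+)$")


def transaction_mix(lines):
    """Return each (method, path) pair's share of all parsed requests."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            counts[(match.group(2), match.group(3))] += 1
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def peak_per_minute(lines):
    """Return the highest request count seen in any single minute."""
    per_minute = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            # The first 16 characters of the timestamp identify the minute.
            per_minute[match.group(1)[:16]] += 1
    return max(per_minute.values(), default=0)


SAMPLE = [
    "2024-01-01T10:00:05 GET /search 200 0.120",
    "2024-01-01T10:00:07 GET /search 200 0.140",
    "2024-01-01T10:00:30 POST /checkout 200 0.900",
    "2024-01-01T10:01:02 GET /search 200 0.110",
]

if __name__ == "__main__":
    print(transaction_mix(SAMPLE))
    print(peak_per_minute(SAMPLE))
```

The mix suggests which transactions deserve a script, and the peak rate gives a starting point for the arrival rates in the model.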
Scripting and Test Automation
11. Parameterization and dynamic content.
12. Transaction measurement and naming conventions.
Some scripting skills are needed to build load tests. It’s a smaller hill to climb than being a really good SDET for functional test automation, and many tools provide a rapid test development environment that minimizes the amount of coding necessary to get going. The bar for entry is not particularly high, but advanced PEs might even optimize source code. There is room for a wide range of skill levels here, but perf engineers need to solve a lot of different problems when recreating request workloads. Sometimes load tools help, and sometimes you have to work around limitations to get to where you need to go.
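To make parameterization and dynamic content concrete, here is a minimal sketch that is independent of any load tool. It feeds each virtual user a different row of test data and carries a server-generated value from one response into the next request. The CSV contents, the `csrf_token` field, the `/login` path, and the `Login_02_Submit` transaction name are all hypothetical.

```python
import csv
import io
import re

# Hypothetical test data; a real test would read this from a data file.
USERS_CSV = "username,password\nalice,pw1\nbob,pw2\n"

# Assumed shape of a hidden form field carrying a per-session token.
TOKEN = re.compile(r'name="csrf_token" value="([^"]+)"')


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def row_for(rows, user_index, iteration):
    """Pick a data row so users and iterations cycle through the data."""
    return rows[(user_index + iteration) % len(rows)]


def extract_token(html):
    """Pull the dynamic token out of a previous response body."""
    match = TOKEN.search(html)
    if match is None:
        raise ValueError("csrf_token not found in response")
    return match.group(1)


def build_login_request(row, login_page_html):
    """Describe the next request using test data plus the dynamic value."""
    return {
        "name": "Login_02_Submit",  # hypothetical naming convention
        "path": "/login",
        "form": {
            "username": row["username"],
            "password": row["password"],
            "csrf_token": extract_token(login_page_html),
        },
    }


if __name__ == "__main__":
    rows = load_rows(USERS_CSV)
    page = '<input type="hidden" name="csrf_token" value="abc123">'
    print(build_login_request(row_for(rows, 0, 0), page))
```

Replaying a recorded token instead of extracting a fresh one is a common reason a script appears to run while every request is quietly rejected.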
Interpreting Performance Test Results
14. Measurement and metrics.
16. Reading results and interpreting graphs.
17. Explain queues (Little’s Law).
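Little’s Law says that, for a stable system measured over a long enough interval, the average number of requests in the system equals the average arrival rate multiplied by the average time each request spends there. A small sketch of how that can be used to sanity-check load test numbers follows; the function names and sample figures are illustrative assumptions.

```python
def requests_in_system(throughput_per_sec, mean_response_sec):
    """Little's Law: average in-flight requests = rate x time in system."""
    return throughput_per_sec * mean_response_sec


def expected_throughput(users, mean_response_sec, think_sec):
    """Closed-loop form: each user cycles through response + think time."""
    return users / (mean_response_sec + think_sec)


if __name__ == "__main__":
    # 50 requests/s at a 0.4 s mean response time
    print(requests_in_system(50, 0.4))
    # 100 users, 0.5 s response time, 4.5 s think time
    print(expected_throughput(100, 0.5, 4.5))
```

If the throughput a tool reports is well below what the user count, response time, and think time imply, something other than the system under test (the script, the load generator, or the pacing) may be limiting the load.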
I’ve talked about this before, but stakes are high in a load test. It is critical to understand what a performance test does and does not tell us, and we can’t be wrong.
In all testing, there is the risk of being too invested in confirmation and under-invested in investigation and exploration. There can be much more to load testing than applying as much load as you think you need, checking that the average response time is acceptable and the error rate is low enough, and noting that the system didn’t crash. That might describe a good portion of the load testing that takes place, but here are just a few more things that could be examined in a load test:
- Did response time degrade? When? By how much, and in what pattern?
- How did resource consumption look against load? Any degradation patterns?
- Can we make any projections about system capacity from what we’ve learned?
- When did errors occur? Were they related to changes in response time or resource consumption?
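The first of these questions can be approached by looking at response times per time window instead of as one overall average. The sketch below groups samples into windows, computes a nearest-rank 95th percentile for each, and reports the first window that exceeds a multiple of the opening window. The window size, the factor of 2, and the use of the first window as a baseline are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def windowed_p95(samples, window_seconds=60):
    """samples: iterable of (elapsed_seconds, response_seconds) pairs.

    Returns a list of (window_start_seconds, p95) in time order.
    """
    buckets = defaultdict(list)
    for elapsed, response in samples:
        buckets[int(elapsed // window_seconds)].append(response)
    return [
        (window * window_seconds, percentile(values, 95))
        for window, values in sorted(buckets.items())
    ]


def first_degradation(series, factor=2.0):
    """Start time of the first window whose p95 exceeds factor times
    the first window's p95, or None if no window does."""
    if not series:
        return None
    baseline = series[0][1]
    for start, p95 in series[1:]:
        if p95 > baseline * factor:
            return start
    return None


if __name__ == "__main__":
    samples = (
        [(i, 0.20) for i in range(10)]
        + [(60 + i, 0.25) for i in range(10)]
        + [(120 + i, 0.90) for i in range(10)]
    )
    series = windowed_p95(samples)
    print(series)
    print(first_degradation(series))
```

Lining the degradation time up against the load profile and resource metrics for the same window is one way to start on the remaining questions.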
Who Becomes a Performance Engineer?
Some of us were functional testers who were assigned to performance. Others were developers who took an interest in optimization. Some SMEs, such as network admins or database administrators, get interested in performance and optimization and migrate towards it. Fresh engineering graduates are another source. Many highly talented people without formal education seem to turn up in this specialty, as well.
Modern thinking about Agile development emphasizes multiple roles over rigid titles. As with other specialties, there is a gradient of degrees of effectiveness based on experience and skill. It’s possible to dabble but harder to excel. Great performance engineers usually flesh out their skillsets with tuning expertise, development skills, and testing theory. As with anything worth doing, there is more than one way to do it.
A number of backgrounds can lead to performance engineering, but the diverse set of skills needed means that the best candidates are technology generalists with a healthy curiosity and a desire to learn. Perhaps this formulation of helpful skills can help a seeker find their way.
Published at DZone with permission of Eric Proegler. See the original article here.