Performance is a mystical thing our systems must have. But as with most things in software engineering, there is no clearly defined set of steps that have to be followed in order to have a performant system. It depends on the architecture, on the network, on the algorithms, on the domain problem, on the chosen technologies, on the database, etc.
Apart from applying common-sense driven development, I have “collected” some general tips on how problems with performance can usually be addressed.
But before that, I have to make a clarification. A “performance problem” is not only about problems that you realize after you run your performance tests or after you deploy to production. Not all optimization is premature, so most of these “tips” must be applied in advance. Of course, you should always monitor, measure and try to find bottlenecks in a running system, but you should also think ahead.
The first thing is using a cache. Anything that gets accessed many times but doesn’t change that often should be cached. If it’s a database table, the query should be cached. If a heavy method is invoked many times, its result can be cached. Static web resources should be cached. On an algorithmic level, memoization is a useful technique. So how do you do the caching?
It depends. An ORM can provide the relevant cache infrastructure for database queries, Spring has method-level cache support, web frameworks have resource caches. A distributed cache (Memcached/Redis/ElastiCache) can be set up, but that may be too much of an effort. Sometimes it’s better and easier to have a local cache. Guava has a good cache implementation, for example.
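To make the local-cache option concrete, here is a minimal sketch in Python, using the standard library's `functools.lru_cache` rather than Guava or Spring. The function `load_country_name` and the `CALLS` counter are hypothetical names invented for the illustration; the lookup stands in for any slow, rarely changing read.

```python
from functools import lru_cache

# Hypothetical counter, only here to show how often the slow path really runs.
CALLS = {"count": 0}


@lru_cache(maxsize=1024)  # keep up to 1024 results in process memory
def load_country_name(country_code: str) -> str:
    """Stand-in for a slow lookup such as a database query or a remote call."""
    CALLS["count"] += 1
    names = {"BG": "Bulgaria", "DE": "Germany"}
    return names.get(country_code, "unknown")


first = load_country_name("BG")   # computed, then stored in the cache
second = load_country_name("BG")  # served from the cache, the body is not run
```

The same decorator is also a quick way to get memoization for a pure function. It only fits data that is safe to keep for the lifetime of the process, or until `load_country_name.cache_clear()` is called.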
Of course, as Phil Karlton once said, “There are only two hard things in Computer Science: cache invalidation and naming things”. So a cache comes with a “mental” cost: how and when should it be invalidated? So don’t just cache everything – figure out where there’s benefit. In many cases, that is quite obvious.
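One common, if blunt, answer to the invalidation question is to let entries expire after a fixed time and to evict them explicitly when you know the underlying data changed. The sketch below assumes that approach; the `TtlCache` class and its method names are made up for the example and it is not thread-safe as written.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TtlCache:
    """Tiny time-based cache: an entry is reused until it is older than the TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable, loader: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]            # still fresh: no call to the loader
        value = loader(key)            # missing or expired: reload and store
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Explicit eviction, for when the source data is known to have changed."""
        self._entries.pop(key, None)
```

The TTL bounds how stale a value can get; `invalidate` covers the cases where you cannot afford to wait that long.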
The second tip is to use queues (and that does not contradict my claim that you probably don’t need an MQ). It can be an in-memory queue, or it can be a full-blown MQ system. In any case, if you have a heavy operation that has to be performed, you can just queue all the requests for that operation. Users will have to wait, but sometimes that doesn’t matter. For example, Twitter can generate your entire Twitter archive. That takes a while, as it has to go through a lot of records and aggregate them. My guess is that they use a queue for that – all requests for archive generation are queued. When your turn comes and your request is processed, you get an email. Queuing should not be overused, though. Simply having an expensive operation doesn’t mean a queue solves it.
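Here is a rough sketch of the in-memory variant, using Python's `queue.Queue` and a single worker thread. It is only an illustration of the idea, not how Twitter does it; `build_archive` and `request_archive` are hypothetical names, and a real system would notify the user instead of appending to a list.

```python
import queue
import threading

jobs = queue.Queue()   # pending requests, processed in arrival order
finished = []          # stand-in for "send the user an email"


def build_archive(user_id: str) -> str:
    """Placeholder for the expensive operation."""
    return f"archive-for-{user_id}"


def worker() -> None:
    while True:
        user_id = jobs.get()   # blocks until a request is available
        if user_id is None:    # sentinel value: stop the worker
            break
        finished.append(build_archive(user_id))


def request_archive(user_id: str) -> None:
    """Called from the request handler: enqueue and return immediately."""
    jobs.put(user_id)


thread = threading.Thread(target=worker, daemon=True)
thread.start()
for uid in ("alice", "bob"):
    request_archive(uid)
jobs.put(None)   # demo only: ask the worker to stop
thread.join()    # demo only: wait so the results can be inspected
```

With one worker the heavy operation never runs more than once at a time, which is the point: load on the server stays bounded no matter how many requests arrive.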
The third tip is background calculation. Some data you have to show to your users doesn’t have to be generated in real-time. So you can have a background task that does its job periodically, instead of having the user wait for the result in a very long request. For example, music generation in my Computoser takes a lot of time (due to the mp3 generation), so I can’t just generate tracks upon request. But there’s a background process that generates tracks and serves a newly generated track to each new visitor.
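The pattern can be sketched roughly like this: a background task keeps a small pool of ready results topped up, and the request handler only ever takes from the pool. This is not the actual Computoser code; `PrecomputedPool`, `run_periodically` and the parameters are names I made up for the example.

```python
import threading
from collections import deque
from typing import Any, Callable, Optional


class PrecomputedPool:
    """Holds results that were generated ahead of time by a background task."""

    def __init__(self, generate: Callable[[], Any], target_size: int) -> None:
        self._generate = generate
        self._target_size = target_size
        self._ready = deque()
        self._lock = threading.Lock()

    def refill(self) -> None:
        """Run by the background task: top the pool up to its target size."""
        while True:
            with self._lock:
                if len(self._ready) >= self._target_size:
                    return
            item = self._generate()   # the slow work happens outside the lock
            with self._lock:
                self._ready.append(item)

    def take(self) -> Optional[Any]:
        """Run by the request handler: returns at once, None if nothing is ready."""
        with self._lock:
            return self._ready.popleft() if self._ready else None


def run_periodically(task: Callable[[], None], interval_seconds: float,
                     stop_event: threading.Event) -> None:
    """Call task every interval_seconds until stop_event is set."""
    while not stop_event.wait(interval_seconds):
        task()
```

A typical wiring would be `threading.Thread(target=run_periodically, args=(pool.refill, 60, stop), daemon=True).start()`, where the 60-second interval is an arbitrary choice. The handler needs a fallback for the case where `take()` returns `None`.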
The previous two tips were more about making heavy operations not look slow, rather than actually optimizing them. But they are also about not using too many server resources to accomplish the required task.
Database optimizations are next. Quite obvious, you may say, but actually – no. Especially when using an ORM, many people have no idea what happens underneath (hint: it’s not the ORM’s fault). I’ve seen a production system with literally no secondary indexes, for example. It was fine until there were millions of records, but it gradually became unusable (why it wasn’t fixed – different story). So, yes, create indexes. Use EXPLAIN to see how your queries are executed and check whether there are any unnecessary full table scans.
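To see the effect of an index on a query plan without setting up a database server, here is a small sketch using SQLite through Python's `sqlite3` module. SQLite's variant of the command is `EXPLAIN QUERY PLAN`; other databases have their own syntax and output. The `orders` table, the index name and the `query_plan` helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)


def query_plan(sql: str, params: tuple = ()) -> str:
    """Return the human-readable plan SQLite would use for the given query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # the 4th column is the description


QUERY = "SELECT total FROM orders WHERE customer_id = ?"

plan_before = query_plan(QUERY, (42,))   # no index yet: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(QUERY, (42,))    # now a search using the new index

print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, but the first plan should report a scan of `orders` and the second a search that names `idx_orders_customer`.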
Another tip that I’ve already written about is using the right formats for internal communication. Formats like Thrift, Avro, protobuf, MessagePack, etc. exist for exactly this reason. If your systems/services have to communicate internally, you don’t want XML if there’s another format that takes 20% of the space and uses 30% of the CPU to serialize/deserialize. These things accumulate at scale.
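As a rough illustration of the size difference, the sketch below encodes the same record as XML and as a fixed binary layout. Python's `struct` module is used here only as a stand-in for a real binary format such as protobuf or Avro, which add schemas and versioning on top; the record fields and function names are made up.

```python
import struct
import xml.etree.ElementTree as ET

# Little-endian, no padding: uint32 id, float64 price, uint16 quantity = 14 bytes.
RECORD_FORMAT = "<IdH"


def to_xml(order_id: int, price: float, quantity: int) -> bytes:
    root = ET.Element("order")
    ET.SubElement(root, "id").text = str(order_id)
    ET.SubElement(root, "price").text = repr(price)
    ET.SubElement(root, "quantity").text = str(quantity)
    return ET.tostring(root)


def from_xml(payload: bytes) -> tuple:
    root = ET.fromstring(payload)
    return (int(root.findtext("id")),
            float(root.findtext("price")),
            int(root.findtext("quantity")))


def to_binary(order_id: int, price: float, quantity: int) -> bytes:
    return struct.pack(RECORD_FORMAT, order_id, price, quantity)


def from_binary(payload: bytes) -> tuple:
    return struct.unpack(RECORD_FORMAT, payload)


record = (123456, 19.99, 3)
xml_payload = to_xml(*record)
binary_payload = to_binary(*record)
print(len(xml_payload), len(binary_payload))  # compare the two encoded sizes
```

Most of the XML payload is tag names repeated for every record, which is exactly the overhead a schema-based binary format avoids. For CPU cost, measure with your own payloads (for example with `timeit`) rather than trusting a toy example.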
The final tip is “Don’t do stupid things”, and it’s harder than it sounds. It is a catch-all tip, but sometimes when you look at your code from a distance, you want to slap yourself. Have you just written an O(n²) array search? Have you just called an external service a thousand times where you could’ve cached the result the first time? Have you forgotten to add an index? Such obviously stupid things lurk in every project. So in order to minimize the stupid things being done, do code reviews. Code reviews are not premature optimization either.
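The accidental quadratic search is worth a tiny example, because it usually looks innocent. Both functions below (hypothetical names) return the items of `left` that also appear in `right`; the only difference is what the membership test runs against.

```python
def common_ids_quadratic(left: list, right: list) -> list:
    # `item in right` scans the whole list each time,
    # so this is O(len(left) * len(right)).
    return [item for item in left if item in right]


def common_ids_linear(left: list, right: list) -> list:
    # Build the set once, then each membership test is O(1) on average,
    # so the total is O(len(left) + len(right)).
    right_set = set(right)
    return [item for item in left if item in right_set]


evens = list(range(0, 2000, 2))
threes = list(range(0, 2000, 3))
same_result = common_ids_quadratic(evens, threes) == common_ids_linear(evens, threes)
```

The set version requires the items to be hashable, which is the usual case for IDs. The two look almost identical, which is why this kind of thing tends to survive until a reviewer or a profiler catches it.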
Will applying these tips mean your system performs well? Not at all. But it’s a good start.