I just read through a great Twitter blog post, New Tweets per second record, and how! It covers in-depth the changes to their engineering organization over the last three or so years. As CCP is undergoing similar technical stresses (we hit our PCU record recently), and responding with similar actions, I thought I’d write down my takeaways and personal feelings.
Interpreted languages are fast enough or too slow. Ruby helped Twitter get where it is, and without such insane peak performance needs I’m sure they would have stayed with Ruby. We’re actually about to embark on rewriting some core systems from Python into C++, because we can’t always throw (more or better) hardware at the problem.
Architect your organization the way you'd want your software to be architected. This is derived from Conway’s law (“Four teams working on a compiler will develop a four-pass compiler”), and I feel is central. If you want to rearchitect your technology, you need to reform your organization. There’s no way around it. The basis of discussion for those organizational changes should be how the technology should look.
Prefer interfaces at the service level. This isn’t always possible (core libraries or frameworks, fat clients, etc.), but should be preferred. It's a natural boundary. Although, interfaces at the module, package or class level can work fine for a certain scale (up to quite large, I imagine). Twitter has more open Software Engineer positions than most projects have programmers! But SOA is a great thing, for technology and for organizations.
Self-organized teams around services are effective. Self-organizing teams have been one of the keys to Agile’s effectiveness (even when Agile development principles aren’t followed). It is difficult to scale Agile up to multiple teams, though, so dividing at service boundaries is a convenient way to reduce to how large Agile must scale. (As an aside, I don’t mean to say “Agile doesn’t scale” or that other development methodologies scale better, it’s just really difficult to scale up software development.)
Monolithic databases will be a bottleneck. Twitter came up with some clever solutions for sharding and balancing that worked for Twitter. If you have a monolithic database under great strain, you will need to get creative with your own solution.