Being part of a startup is a bit like riding a roller coaster—there’s a great idea, lots of uncertainty, and too much work to do. Many industry talks have been held, tons of books have been written, and everyone has their opinion as to what the priorities of startups should be. If you’re working in a startup you know how easy it is to lose focus on your goals.
Founding a new company is a complex endeavor. You need to keep an eye on your competitors, be aware of relevant social media discussions, ensure that your business runs smoothly, pay taxes, and oh, by the way, you also need to create the best product in the world.
As I’ve worked at young tech companies for more than 10 years now, I’ll focus this blog post on keeping an eye on the backbone of most young companies: customer-facing IT systems. So if your company doesn’t and never will have customer-facing IT systems, you can stop reading right now. For all you others, I’ll provide a brief overview of how you can provide basic protection for your business and avoid bad surprises through effective monitoring of your IT systems.
1. Availability monitoring
It’s always good to be notified of outages at least as soon as your customers are aware of them (and ideally not from angry calls from your customers). So the first thing a startup needs is basic availability monitoring of its IT systems. I tried out several monitoring tools while working with some friends as a part-time sysadmin during my university days. The monitoring required regular pings to our servers, and the tools needed to reach specific URLs within our application. If a ping failed a few times, I received an alert on my mobile phone. Once the pings were successful, I received an additional all-clear SLA alert on my phone.
Most of the time I received the all-clear alerts after only a few minutes (usually about the time I’d opened up my laptop and begun investigating the problem). On a few occasions I restarted our web and application servers only to discover later that the availability issue had resolved itself without my involvement.
So, it’s crucial that you be aware of when your services go offline, but it’s preferable if you can avoid downtime entirely. Availability monitoring should be one of the first systems you put in place, but the monitoring needs to have intelligence behind it so that when you receive an alert you know that you have a real problem on your hands that needs your immediate attention.
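To make the "alert after a few failed pings, then send an all-clear" idea concrete, here is a minimal sketch in Python. It isn't taken from any particular monitoring tool: the class name `AvailabilityTracker` and the threshold of three consecutive failures are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class AvailabilityTracker:
    """Turns raw ping results into alerts only after repeated failures."""

    failures_before_alert: int = 3  # assumed threshold, tune for your setup
    consecutive_failures: int = 0
    alerting: bool = False

    def record(self, ping_ok):
        """Record one ping result; return "ALERT", "ALL_CLEAR", or None."""
        if ping_ok:
            self.consecutive_failures = 0
            if self.alerting:
                # First successful ping after an alert: send the all-clear.
                self.alerting = False
                return "ALL_CLEAR"
            return None
        self.consecutive_failures += 1
        if not self.alerting and self.consecutive_failures >= self.failures_before_alert:
            # Alert once per outage, not on every failed ping.
            self.alerting = True
            return "ALERT"
        return None


tracker = AvailabilityTracker()
results = [True, False, True, False, False, False, False, True]
events = [tracker.record(ok) for ok in results]
```

In this sample run, the single failed ping near the start never reaches your phone. Only the run of three consecutive failures raises an alert, and the next successful ping produces the all-clear.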
2. Infrastructure monitoring
Being aware of your system’s availability is important, but it’s preferable to identify issues early, before they become customer-facing problems. The most obvious point of failure is your IT infrastructure. You need to be aware of hard drive slow-downs (maybe via indicators that tell you when each hard drive has reached the end of its lifespan). You also need to know if your CPU is running high and dropping important tasks. And you need to know if you’re close to running out of memory, at which point your OS will begin swapping memory to your hard drive (and things become really sloooooow). If you know about these issues early enough, you can invest the time required to solve them before they affect your customers.
But your company is really hip and runs everything in the cloud, right? That’s great, but you should consider that cloud-based services also run on physical infrastructure and that your architecture may be dependent on CDNs or other cloud-based services (for example, if you allow your customers to pay with a credit card or you display third-party ads on your site). All of these third-party services need to be monitored.
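The resource checks described in this section boil down to comparing current usage against warning limits that sit well below 100%, so you hear about a problem while there is still time to act. The sketch below assumes made-up limits (`WARN_LIMITS`) and hypothetical function names. Only the disk figure is read from the real system, using Python's standard library; the CPU and memory values are placeholders you would replace with numbers from your own tooling.

```python
import shutil

# Assumed warning thresholds (fractions of capacity); adjust to your systems.
WARN_LIMITS = {"cpu": 0.85, "memory": 0.90, "disk": 0.80}


def find_warnings(usage, limits=WARN_LIMITS):
    """Return the sorted names of resources at or above their warning limit."""
    return sorted(
        name for name, value in usage.items()
        if name in limits and value >= limits[name]
    )


def disk_usage_fraction(path="/"):
    """Fraction of the disk holding `path` that is currently in use."""
    total, used, _free = shutil.disk_usage(path)
    return used / total if total else 0.0


# CPU and memory are placeholder readings; disk is measured for real.
sample = {"cpu": 0.42, "memory": 0.93, "disk": disk_usage_fraction()}
warnings = find_warnings(sample)
```

With these placeholder readings, `warnings` always contains `"memory"`, and it also contains `"disk"` if the machine running the sketch has a disk that is at least 80% full.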
3. Process/service monitoring
In these times of virtualization and PaaS, many startups believe they don’t need to monitor their network infrastructure if it’s cloud-based. You still need to know that your code and services are up and running though. Is your credit-card service responding quickly enough? Are there specific customer-facing problems that only affect specific services or service methods? Are customers relying on broken links to access your site? Which requests consume most of your users’ time? Is your CDN working correctly? Process and service monitoring answers all of these questions, which lets you take an active role in improving your customers’ experience and, with it, your business success.
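A question like "is my credit-card service responding quickly enough?" can be answered by timing a request and sorting the result into one of three outcomes. The following is a rough sketch rather than a real client: `timed_check` and the two probe functions are hypothetical stand-ins, and the probes simulate a service call instead of contacting one.

```python
import time


def timed_check(probe, slow_after_s=2.0):
    """Run `probe` and classify the outcome as "ok", "slow", or "failed"."""
    start = time.perf_counter()
    try:
        probe()
    except Exception:
        return "failed", time.perf_counter() - start
    elapsed = time.perf_counter() - start
    return ("slow" if elapsed > slow_after_s else "ok"), elapsed


def fake_payment_probe():
    """Hypothetical stand-in for a request to a credit-card service."""
    time.sleep(0.05)


def broken_link_probe():
    """Hypothetical stand-in for a request that hits a broken link."""
    raise ConnectionError("HTTP 404")


status_fast, _ = timed_check(fake_payment_probe, slow_after_s=2.0)
status_slow, _ = timed_check(fake_payment_probe, slow_after_s=0.01)
status_broken, _ = timed_check(broken_link_probe)
```

The same simulated call counts as "ok" against a generous limit and "slow" against a strict one, which is why the limit should reflect what your customers will actually tolerate.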
4. Real user monitoring
The final type of monitoring I want to mention here is real user monitoring (RUM). RUM isn’t necessarily new, but only recently has it become an affordable option for small businesses. RUM enables you to monitor each of your customers’ interactions with your website.
RUM enables you to see where your customers come from, what their experience is on your website, how they interact with it, how long they need to wait for results, and whether there are any issues with the delivery of third-party content.
Be aware, however, that RUM should be used only in conjunction with the other types of monitoring mentioned earlier. There’s no point in trying to measure user experience if your site is down. For young companies it isn’t always possible to distinguish the causes of zero user load from RUM data alone: is your site really down, or is there just no traffic on your site?
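One way to picture why RUM needs the other monitoring types alongside it: a count of zero page views only becomes meaningful once you combine it with an independent availability check. The function name and the three labels in this small sketch are assumptions made for illustration.

```python
def interpret_user_load(site_reachable, page_views_in_window):
    """Combine an availability check with RUM data to explain user load."""
    if not site_reachable:
        # RUM sees no users because nobody can reach the site.
        return "outage"
    if page_views_in_window == 0:
        # The site is up; there simply were no visitors in this window.
        return "no traffic"
    return "normal"


quiet_night = interpret_user_load(site_reachable=True, page_views_in_window=0)
real_outage = interpret_user_load(site_reachable=False, page_views_in_window=0)
```

Both calls report zero page views, yet only the second one is a problem that should wake you up.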
Take the time to focus on your product
Some of the types of monitoring I’ve detailed here can be achieved using free software tools. If you need to get up and running quickly (and focus on what really makes you your money: your product), I suggest you try out ruxit. ruxit is capable of doing all the types of monitoring I’ve described in this post in addition to much more, all with almost no effort. And the good news is, ruxit offers great startup packages to make monitoring affordable for your young company.