Golgi has recently launched its Golgi Programmable Device Cloud, a cloud infrastructure designed to let developers access their Internet of Things (IoT) devices via Web APIs. Golgi’s cloud infrastructure is secure, reliable, and scalable.
These three key features are a must for any cloud-based API or infrastructure; indeed, you’ve probably seen them mentioned on other cloud and web service providers’ sites. Whether you’re using a cloud service like Golgi, integrating your own cloud service alongside another, or building a standalone service for yourself, understanding a little more about how services are made secure, reliable, and scalable can be invaluable. With that in mind, that is what we’re going to discuss in this post.
Before we discuss details, it is worth noting that internet security is a massive area that is constantly evolving. This means that your security policies and practices will require continuous updating to stay ahead of emerging threats.
That said, there are some basics that can get you started. We’ll start our discussion with SSL/TLS. I’ve written a more detailed article on using SSL/TLS, but let’s look at some of the main points.
- You should encrypt all your data transmissions — if a user’s device is reporting back to your servers, you don’t want to make it easy for an attacker to intercept the data.
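- Use TLS and not its predecessor SSL: every version of SSL has known, practical attacks against it (POODLE against SSL 3.0, for example).
- Always verify the server’s certificate on the client side, checking that it chains to a trusted CA and matches the server’s hostname, to avoid man-in-the-middle (MiTM) attacks.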
- Keep up to date with developments in SSL/TLS — if someone demonstrates how a particular cipher suite can be broken, remove it from your server options. If there is a problem identified with your SSL/TLS library, make sure you patch it appropriately.
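As a rough illustration of the points above, here is a minimal client-side sketch using Python’s standard `ssl` module. The function names and the choice of TLS 1.2 as the minimum version are my own assumptions for the example, not a recommendation from any particular service.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    # create_default_context() loads the system's trusted CA certificates and
    # turns on certificate and hostname verification for client connections.
    context = ssl.create_default_context()
    # Refuse SSL and early TLS versions. TLS 1.2 as the floor is an assumption.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    # Hypothetical helper: the handshake fails with ssl.SSLError if the
    # server's certificate cannot be verified against the trusted CAs.
    raw_socket = socket.create_connection((host, port), timeout=10)
    return context.wrap_socket(raw_socket, server_hostname=host)


context = make_client_context()
```

The important part is what the code does not do: it never disables `check_hostname` or sets `verify_mode` to `ssl.CERT_NONE`, which are the usual shortcuts that open the door to MiTM attacks.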
There are other steps you can take on the server. Don’t allow unnecessary ports to remain publicly open. It’s easy to think only of security at the port where users are expected to access the service (in this case 443), but an attacker may try to compromise your server, and any other open port may end up being their route in. If you do require ports to be open, perhaps for internal network communications, add a whitelist of IP addresses that are allowed to connect to them. This will prevent outside attackers from targeting these ports, or at least make it more difficult.
Of course your system will have to store some user data. If one of your servers is compromised, that data may be lost or exposed. You can protect it with some (ideally all) of the following steps:
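A whitelist like this is normally enforced in a firewall or in your cloud provider’s network rules, but the check itself is simple enough to sketch. The network ranges below are hypothetical placeholders; substitute your own internal ranges.

```python
import ipaddress

# Hypothetical internal ranges that are allowed to reach the internal ports.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.10.0/24"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True only if client_ip falls inside one of the allowed networks."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a valid IP address is rejected outright.
        return False
    return any(address in network for network in ALLOWED_NETWORKS)
```

For example, `is_allowed("10.1.2.3")` is accepted while `is_allowed("8.8.8.8")` is rejected. Note the default: anything not explicitly listed is refused.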
- Segregate your network into front-end servers (to receive data) and back-end servers (to process and store data)
- Front-end servers never initiate connections; they only listen. Users connect into the front-end to send data, and the back-end servers connect into the front-end to receive data
- Make sure all sensitive user data is encrypted on disk – the extra processing overhead will be worth it if a back-end server ever gets compromised
- Ensure your processes don’t leak vital memory in response to requests – Heartbleed was a serious bug in OpenSSL that leaked potentially very sensitive information and was caused by the lack of a simple bounds check. Be careful with your programming!
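To make the last point concrete, here is a small sketch of the kind of check that was missing in Heartbleed. The message format (a 2-byte length followed by a payload) and the function name are made up for illustration and are not the real TLS heartbeat format. Python would not over-read memory the way C can, so this only shows the validation logic: never trust a length that the client claims.

```python
import struct


def build_echo_reply(request: bytes) -> bytes:
    """Echo back the payload of a hypothetical request: 2-byte length + payload."""
    if len(request) < 2:
        raise ValueError("request too short to contain a length field")
    # The first two bytes hold the payload length claimed by the client.
    (claimed_length,) = struct.unpack("!H", request[:2])
    payload = request[2:]
    # The bounds check: the claimed length must not exceed what was really sent.
    if claimed_length > len(payload):
        raise ValueError("claimed length exceeds the data actually sent")
    return payload[:claimed_length]
```

A well-formed request such as `struct.pack("!H", 4) + b"ping"` is echoed back, while a request that claims 64000 bytes but sends only four is refused instead of being answered.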
Reliability means that your users can always (or at least nearly always) access your service; even some of the giants of the internet have occasional interruptions of service. The key to reliability is redundancy. If all your servers are in a single location and that location has a critical failure (e.g. loss of power), your service will go down. But if you have multiple sites, you can mitigate this threat.
A common model is to have geographically distributed sites. This not only provides fallback sites in case of downtime but can also reduce transmission latency for your global community of users. You should also take steps to mitigate Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks. These kinds of attacks are relatively easy to launch and can bring down a service that is not properly protected. If you’re hosting your service with one of the large cloud service providers such as AWS, you will probably have tools available from the provider to help with that protection.
There are two distinct problems to address when thinking about scalability. There is long-term scalability (e.g. when your subscriber base grows over time) and short-term scalability (e.g. when there is an unexpected temporary spike in service usage).
To address long-term scalability, your system will need to be designed so that it can react to your service’s growing user base. A good example of helping your service to grow is the division of front-end and back-end servers we discussed above. We have already highlighted how this can be beneficial from a security point of view.
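One building block you may meet in this area is rate limiting. The sketch below is a simple token bucket, with class and parameter names of my own choosing. It is only an illustration of the idea: a limiter running on a single server will not stop a large distributed attack on its own, which is why the provider-level tools matter.

```python
import time
from typing import Optional


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True if a request may proceed, consuming one token."""
        now = time.monotonic() if now is None else now
        # Refill in proportion to the time elapsed, capped at the bucket capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a real service you would typically keep one bucket per client (keyed by IP address or API key, say) and reject or delay requests when `allow()` returns `False`. The optional `now` argument is there only so the behaviour can be tested without waiting on a real clock.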
However, it can also help when growing your service as it allows you to add servers where needed. For example, you may need to add additional front-end servers to handle increased requests or you may need additional back-end servers due to increased processing requirements. And this modularisation technique can be expanded to further subdivide your system so that it can be easily scaled.
Databases are another area that should be considered as your system expands. A single master database for 1000 users is probably achievable. But a single master database for 100 million users probably isn’t. Try to plan early how you can subdivide your database so that you can one day achieve a service for 100 or even 500 million users.
Addressing short-term scalability (assuming we don’t want any loss or degradation of service) requires one thing: more servers. This can be a challenging exercise. However, if you are hosted with a cloud service provider there is often an option for auto-scaling, that is, the ability to launch (and subsequently shut down) additional VMs on demand. Of course, you will have to ensure that when these additional systems auto-launch, they integrate with your existing infrastructure so that they can start to serve the network.
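One common way to subdivide a database is to shard it by user, so that each user’s data lives on exactly one of several databases. Here is a minimal sketch of hash-based sharding; the shard count and function name are assumptions for the example.

```python
import hashlib

SHARD_COUNT = 4  # hypothetical number of database shards


def shard_for_user(user_id: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a user ID to a shard index in the range [0, shard_count)."""
    # Use a stable hash so that every server computes the same shard for a user.
    # Python's built-in hash() is randomised per process, so it is unsuitable here.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

One thing to be aware of with this simple scheme is that changing `shard_count` later remaps most users to a different shard. Consistent hashing or a lookup table from user to shard are the usual ways around that, which is exactly the kind of thing worth planning early.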
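Cloud providers make the scaling decision for you based on rules you configure, but the logic behind such a rule can be sketched in a few lines. Everything here (the function name, the 25% headroom, the minimum and maximum server counts) is an assumption chosen for illustration, not any provider’s actual policy.

```python
import math


def desired_server_count(
    requests_per_second: float,
    capacity_per_server: float,
    min_servers: int = 2,
    max_servers: int = 20,
    headroom: float = 0.25,
) -> int:
    """Return how many servers to run for the current load."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Provision for the current load plus some headroom for further growth.
    needed = math.ceil(requests_per_second * (1 + headroom) / capacity_per_server)
    # Never drop below the redundancy floor or exceed the budget ceiling.
    return max(min_servers, min(max_servers, needed))
```

With servers that each handle 100 requests per second, a spike to 1000 requests per second gives `desired_server_count(1000, 100) == 13`, while a quiet period still keeps the minimum of two servers running for redundancy.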
And that’s it. I hope this brings some insight to security, reliability, and scalability on the web.