Curator's Note: If you need a tool to quickly define, instantiate and manage environments, the open source tool Terraform is your best bet. Terraform integrates with EC2 and vSphere to spin up new VMs in their own virtual network. Essentially you clone entire environments, not just VMs.
On my current project we’ve been setting up production and staging environments and Shodhan came up with the idea of making staging and production identical to the point that a machine wouldn’t even know what environment it was in.
Identical in this sense means:
- Puppet doesn't know which environment the machine is in. Our Facter variables report that the environment is production.
- We set the RACK_ENV variable to production so applications don’t know what environment they’re in.
- The IPs and host names of equivalent machines in production/staging are identical
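One way to pin the facts down, assuming Facter's external facts mechanism (a key=value file dropped into /etc/facter/facts.d; the file and fact names here are hypothetical), is a fact file that reports production on every box:

```
# /etc/facter/facts.d/environment.txt (hypothetical file name)
# Every machine, staging or production, reports the same value,
# so Puppet manifests can never branch on it.
app_environment=production
```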
The only thing that differs is the external IPs used to access the machines, and therefore the NATed address that they display to the world when making any outgoing requests.
The only place where we do something different based on an environment is when deploying applications from Jenkins.
At that stage if there needs to be a different configuration file or initializer depending on the environment we’ll have it placed in the correct location.
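A sketch of what that deploy step could look like, assuming a shell build step; the file names are made up and DEPLOY_ENV stands in for a Jenkins job parameter:

```shell
#!/bin/sh
set -e

# install_env_config DIR ENV: copy the environment-specific config
# file into the location the application actually reads. The
# database.yml naming is purely illustrative.
install_env_config() {
  cp "$1/database.$2.yml" "$1/database.yml"
}

# In the Jenkins job this would be driven by a job parameter, e.g.:
#   install_env_config config "$DEPLOY_ENV"
```

This keeps the environment decision in one place: the job that deploys, not the machines being deployed to.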
Why are we doing this?
The benefit of doing this is that we can have a much higher degree of confidence that something which works in staging is also going to work in production.
Previously we'd noticed that we were inclined to write switch or if/else statements in Puppet code based on the environment, which meant that the machines were being configured differently depending on the environment.
There have been a few problems with this approach, most of which we’ve come up with a solution for.
Applications that rely on knowing their environment
One problem we had while doing this was that some applications did internally rely on knowing which environment they were deployed on.
For example, we have an email processing job which relies on the RACK_ENV variable and we would end up processing production emails on staging if we deployed the job there with RACK_ENV set to ‘production’.
Our temporary fix for this was to change the application’s deployment mechanism so that this job wouldn’t run in staging but the long term fix is to make it environment agnostic.
Addressing machines from outside the network
We sometimes need to SSH into machines via a jumpbox and this becomes more difficult now that machines in production and staging have identical IPs and host names.
We got around this with some cleverness that rewrites the hostname if it ends in 'staging'. The ~/.ssh/config file reads like this:
```
Host *.staging
    IdentityFile ~/.ssh/id_rsa
    ProxyCommand sh -c "/usr/bin/ssh_staging %h %p"
```
And in /usr/bin/ssh_staging we have the following:
```
HOSTNAME=`echo $1 | sed s/staging/production/`
ssh staging-jumpbox1 nc $HOSTNAME $2
```
If we were to run `ssh some-box.staging`:
That would SSH us onto the staging jumpbox and then proxy an SSH connection onto some-box.production using netcat.
Since web facing applications in both staging and production are referred to by the same fully qualified domain name (FQDN) we need to update our /etc/hosts file to access the staging version.
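For example, to reach the staging copy of a hypothetical app (both the FQDN and the external IP below are made up), one could add an entry like:

```
# /etc/hosts on a developer machine: point the shared FQDN at
# staging's external IP instead of production's.
198.51.100.20   app.example.com
```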
When SSHing you can't tell which environment a machine is in
One problem with having the machines not know their environment is that we couldn't tell whether we'd SSHed into a staging or production machine by looking at the command prompt, since they have identical host names.
We ended up defining PS1 based on the public IP address of the machine, which we look up by calling icanhazip.com.
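A minimal sketch of that idea, assuming made-up address ranges for each environment (the real ranges would come from your network setup):

```shell
# Hypothetical mapping from a machine's public IP to an environment
# name; 203.0.113.* and 198.51.100.* are illustrative ranges only.
env_for_ip() {
  case "$1" in
    203.0.113.*)  echo production ;;
    198.51.100.*) echo staging ;;
    *)            echo unknown ;;
  esac
}

# In ~/.bashrc the prompt could then be set like this (not run here,
# since it makes a network call):
#   PS1="[$(env_for_ip "$(curl -s icanhazip.com)")] \u@\h:\w\$ "
```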
Overall, despite the fact that it was initially painful – and Shodhan gamely took most of that pain – I think this makes sense as an idea and I'd probably want to have it baked in from the beginning on anything I work on in future.