Many organizations require a secure infrastructure. I’ve yet to meet a customer who says that security isn’t a concern. But the decision on “how secure?” should be tied closely to a risk analysis for your organization.
Since Amazon Web Services (AWS) is often referred to as a “public cloud”, people sometimes infer that “public” must mean it’s “out in the public” for all to see. I’ve always seen “public/private clouds” as an unfortunate use of terms. In this context, “public” means something more like “public utility”. People often interpret “private clouds” to be inherently more secure. Assuming that “public cloud” = less secure and “private cloud” = more secure couldn’t be further from the truth. Like most things, it’s all about how you architect your infrastructure. While you can define your infrastructure to have open access, AWS provides many tools to create a truly secure infrastructure that restricts access to authorized users only.
I’ve created an initial list of many of the practices we use. We don’t employ all these practices in all situations, as it often depends on our customers’ particular security requirements. But if someone asked me “How do I create a secure AWS infrastructure using a Deployment Pipeline?”, I’d offer some of these practices in the solution. I’ll be expanding this list over the next few weeks, but I want to start with some of our practices.
* After initial AWS account creation and login, configure IAM so that there’s no need to use the AWS root account
* Apply least privilege to all IAM accounts. Be very careful about who gets Administrator access.
* Enable all IAM password rules
* Enable MFA for all users
* There should be a 1-to-1 relationship between AWS (IAM) users and EC2 Key Pairs; no EC2 Key Pair should be shared with others. The same goes for Access Keys.
* Do not open any EC2 instances to the Internet (i.e., no security group should allow inbound traffic from a CIDR source of 0.0.0.0/0). The bastion host might accept connections on port 22 (SSH), but you should restrict the source CIDR to specific subnets.
* Use IAM to limit access to specific AWS resources and/or remove/limit AWS console access
* Apply a bastion host configuration to reduce your attack profile
* Use IAM Roles so that there’s no need to share Access Keys.
* Use server-side encryption (SSE) to protect objects in S3 buckets
* Share initial IAM credentials with others through a secure mechanism (e.g. AES-256 encryption)
* Use and monitor AWS CloudTrail logs
* Automate everything: Networking (VPC, Route 53), Compute (EC2), Storage, etc. All AWS automation should be defined in CloudFormation. All environment configuration should be defined using infrastructure automation scripts, such as Chef, Puppet, etc.
* Version Everything: Application Code, Configuration, Infrastructure and Data
* Manage your binary dependencies. Be specific about binary version numbers. Ensure you have control over these binaries.
* Lock down pipeline environments. Do not allow SSH/RDP access to any environment in the deployment pipeline.
* Use the Disposable Environments pattern: instances are terminated once every few days. This approach reduces the attack profile.
* Store all logs outside of the EC2 instances (so that they can be accessed later). Ensure these log files are encrypted (e.g., stored securely in S3).
* Apply all canonical changes only through automation that is part of the deployment pipeline. No one has access to, or can make direct changes to, the environment.
* Create high-availability systems using Multi-AZ, Auto Scaling, Elastic Load Balancing, and Route 53
* For non-Admin AWS users, only provide access to AWS through a secure CI server or a self-service application
* Use Self-Service Deployments and give developers full SSH/RDP access to their self-service deployment. Only their particular EC2 Key Pair can access the instance(s) associated with the deployment. Self-Service Deployments can be defined in the CI server or a lightweight self-service application.
* Provide capability for any authorized user to perform a self-service deployment with full SSH/RDP access to the environment they created (while eliminating outside access)
* Run two active environments. We’ve yet to do this for customers, but if you want to eliminate all access to the canonical production environment, you might choose to run two active environments at once. Engineers can then troubleshoot a problem in the non-production environment, which has exactly the same configuration and data, so you’re troubleshooting accurately.
* Run automated infrastructure tests to test for security vulnerabilities with every change committed to the version-control repository as part of the deployment pipeline.
* What is a canonical environment? It’s your system of record. The canonical environment is defined in source code and versioned. If someone makes a change to the canonical system and it affects everyone, that change should only be made through automation. While you can use a self-service deployment to get a copy of the canonical system, any direct change you make to that environment is isolated and never becomes part of the canonical system unless code is committed to the version-control repository.
* How can I troubleshoot if I cannot directly access canonical environments? Using a self-service deployment, you can usually determine the cause of the problem. If it’s a data-specific problem, you might import a copy of the production database. If this isn’t possible for time or security reasons, you might run multiple versions of the application at once.
* Why should we dispose of environments regularly? Two primary reasons. The first is to reduce your attack profile (i.e., if environments always go up and down, it’s more difficult to home in on specific resources). The second reason is that it ensures that all team members are used to applying all canonical changes through automation and not relying on environments to always be up and running somewhere.
* Why should we lock down environments? To prevent people from making disruptive environment changes that don’t go through the version-control repository.
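
To make the idea of automated infrastructure tests a bit more concrete, here is a minimal sketch in Python of one such check: flagging any security group rule that allows inbound traffic from 0.0.0.0/0. It is only an illustration, not a description of our actual tooling. The input is assumed to follow the shape of the `SecurityGroups` list returned by the EC2 DescribeSecurityGroups API, and the group IDs, ports, and CIDR ranges in `SAMPLE_GROUPS` are hypothetical. In a real pipeline you would feed it live data and fail the build if it reports anything.

```python
# Sketch of an infrastructure test: find security group rules open to the world.
# Assumption: each group dict follows the EC2 DescribeSecurityGroups shape
# (GroupId, IpPermissions -> IpRanges/CidrIp and Ipv6Ranges/CidrIpv6).

OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def find_open_ingress(security_groups):
    """Return (group_id, from_port, to_port, cidr) for every world-open inbound rule."""
    findings = []
    for group in security_groups:
        for rule in group.get("IpPermissions", []):
            # Collect both IPv4 and IPv6 source ranges for this rule.
            cidrs = [r.get("CidrIp") for r in rule.get("IpRanges", [])]
            cidrs += [r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])]
            for cidr in cidrs:
                if cidr in OPEN_CIDRS:
                    findings.append(
                        (group["GroupId"], rule.get("FromPort"), rule.get("ToPort"), cidr)
                    )
    return findings


# Hypothetical data: a bastion limited to one subnet, and a group left wide open.
SAMPLE_GROUPS = [
    {
        "GroupId": "sg-bastion",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
             "IpRanges": [{"CidrIp": "203.0.113.0/24"}]},
        ],
    },
    {
        "GroupId": "sg-web",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        ],
    },
]

if __name__ == "__main__":
    for group_id, from_port, to_port, cidr in find_open_ingress(SAMPLE_GROUPS):
        print(f"{group_id}: ports {from_port}-{to_port} open to {cidr}")
```

Keeping the check as a pure function over plain data means it can run on every commit without AWS credentials, with the call that fetches the live security groups kept in a thin wrapper around it.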