Best Practices for Infrastructure as Code with Terraform, Kubernetes, and Helm (Part 1)

By Steven Hermans · May. 26, 20 · Opinion

In this series, I’m going to explain how to set up your workspace to accomplish Infrastructure as Code with Terraform, Kubernetes, and Helm. This setup is based on my real-world experience as a DevOps Engineer working with these tools for over 3 years.

Concepts that this series will cover:

  • Disaster Recovery and Infrastructure as Code.
  • Setting up a remote workspace.
  • File structure.
  • Storing secrets.
  • Setting up a Terraform project.
  • Deploying applications with Helm.
  • Backup and restore process.

Disaster Recovery (DR) and Infrastructure as Code (IaC)

In this article, I’ll cover what you need to know about a Disaster Recovery Plan and Infrastructure as Code. Disaster Recovery is the process of bringing your application back online and (at least partly) functional, by whatever means possible, after a major outage. It is good to have a plan for that. Infrastructure as Code, on the other hand, ensures that the current state of the infrastructure is written down in code, which helps a lot during a DR event.

Mean Time to Repair (MTTR)

A DR Plan and IaC have one thing in common: both reduce the Mean Time to Repair during an outage. Outages may be avoidable in principle, but they still happen, even to the best of us. Therefore, you should focus not only on how to prevent an outage but also on how to reduce the time it takes to repair it or to go back to the previous working state.
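
To make the metric concrete, here is a minimal Python sketch of how Mean Time to Repair could be computed from a list of outages. The incident timestamps are made up for illustration, and the function name `mean_time_to_repair` is hypothetical, not part of any tool mentioned in this series.

```python
from datetime import datetime, timedelta


def mean_time_to_repair(incidents):
    """Average outage duration, given (started, resolved) datetime pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    durations = (resolved - started for started, resolved in incidents)
    return sum(durations, timedelta()) / len(incidents)


# Hypothetical incident log: three outages lasting 30, 90, and 15 minutes.
incidents = [
    (datetime(2020, 5, 1, 10, 0), datetime(2020, 5, 1, 10, 30)),
    (datetime(2020, 5, 8, 14, 0), datetime(2020, 5, 8, 15, 30)),
    (datetime(2020, 5, 20, 9, 0), datetime(2020, 5, 20, 9, 15)),
]

mttr = mean_time_to_repair(incidents)
print(mttr)  # 0:45:00
```

Tracking this number over time shows whether changes to your process, such as a rehearsed restore or versioned infrastructure, actually shorten outages.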

Change Management

In order to reduce the Mean Time to Repair, it is important to have a clear overview of the changes that are made to the infrastructure and to the applications running on it. Alongside server overloads, changes are among the leading causes of outages. Hence the phrase: “Version everything!”

Git is a useful and easy tool for tracking changes. To use Git for this, you’ll first need to manage your infrastructure as code. Several tools are really helpful in accomplishing IaC, such as Terraform and Helm. I’ll dig deeper into these tools later in this series.

Deploy and Rollback Changes

I’ve read that some Ops teams use Continuous Deployment (CD) for deploying infrastructure changes. While this sounds like a good idea, it has some drawbacks.

The first is that you are not hands-on when things start to break, and the changes being deployed are no longer at the top of your mind. This will eventually increase the Mean Time to Repair. Beyond that, how are you going to do complex maintenance, like a database migration, through CD?

My personal preference is to always be at the buttons when deploying something, so that if it goes wrong you have all options open to resolve it quickly.

Consistency

Another thing that IaC solves is inconsistency throughout the infrastructure. In my experience, I’ve often seen two servers that should be identical drift apart and become inconsistent over time.
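
As an illustration of what such drift looks like, the Python sketch below compares the recorded settings of two servers that are supposed to be identical. The server names, package versions, and the `find_drift` helper are all hypothetical. In practice, a tool like Terraform does this comparison for you against the state described in code.

```python
def find_drift(a, b):
    """Return {key: (value_in_a, value_in_b)} for every setting that differs.

    A setting missing on one side is reported with None for that side.
    """
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Hypothetical snapshots of two servers that should be identical.
web1 = {"nginx": "1.18.0", "openssl": "1.1.1g", "max_connections": 1024}
web2 = {
    "nginx": "1.18.0",
    "openssl": "1.1.1d",       # patched by hand on web1 only
    "max_connections": 2048,   # tuned during an incident, never rolled out
    "htop": "2.2.0",           # installed ad hoc
}

drift = find_drift(web1, web2)
for key, (left, right) in drift.items():
    print(f"{key}: web1={left!r} web2={right!r}")
```

When both servers are created from the same versioned definition, every one of these differences has to show up as a reviewed change instead of a surprise during an outage.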

Backup and Restore Process

An important thing a DR Plan provides is a clear process for restoring backups. Making backups is usually automated and happens often, so you gain experience with it over time whenever it breaks and you have to fix it. The restore process, however, is something you will hopefully never need. Nevertheless, it should be clear how it is done, because when you do have to use it, you won’t have much time to figure it out.
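
One way to keep the restore process familiar is to rehearse it regularly. The following Python sketch shows the idea of such a restore drill under simplifying assumptions: the "backup" is just a zip archive of a local directory, and the drill restores it into a scratch location and compares checksums. The function names and sample files are hypothetical, and a real drill would target your actual backup tooling and storage.

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path


def checksums(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def backup(source: Path, archive: Path) -> None:
    """Write every file under source into a compressed zip archive."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in sorted(source.rglob("*")):
            if p.is_file():
                zf.write(p, p.relative_to(source).as_posix())


def restore(archive: Path, target: Path) -> None:
    """Unpack the archive into the target directory."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)


def restore_drill(source: Path) -> bool:
    """Back up source, restore it elsewhere, and verify the contents match."""
    with tempfile.TemporaryDirectory() as scratch:
        archive = Path(scratch) / "backup.zip"
        restored = Path(scratch) / "restored"
        restored.mkdir()
        backup(source, archive)
        restore(archive, restored)
        return checksums(source) == checksums(restored)


# Hypothetical data directory used only for this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    (data / "db").mkdir(parents=True)
    (data / "db" / "dump.sql").write_text("CREATE TABLE users (id INT);\n")
    (data / "app.conf").write_text("replicas = 3\n")
    drill_ok = restore_drill(data)

print("restore drill passed:", drill_ok)
```

Running a drill like this on a schedule means the first time you perform a restore is not during a real outage.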

In the end, there is one thing that is really important when doing Ops. In modern infrastructures, a lot of changes happen every day, so a mistake that causes a disruption is not rare. Mistakes are inevitable; what matters is how you respond to them. Therefore, focus on the Mean Time to Repair.

In the next article, I’ll talk about how to set up your workspace to get started with IaC. Stay tuned!
