GitOps: How to Ops Your Git the Right Way
Nowadays, there’s no lack of articles about the GitOps approach, ArgoCD, and other tools for Kubernetes configuration management and application deployments. Yet most of them stay pretty high level or don’t go beyond “hello world.”
In this series of articles, I’m going to explain in detail (and with examples) how to build Kubernetes infrastructure with the GitOps approach. We’ll talk about your Git repos, CI/CD pipelines for specific environments, and ways to organize your work and your automation. These guides represent and generalize my experience of building GitOps environments in different companies with different needs.
In this article specifically, we’ll look into the specifics of creating Git repositories structures — the very core of the GitOps approach.
But first, let’s talk about theory a little and make sure that we are on the same page.
Change of Paradigm
In 2017, Weaveworks published a post about their approach to Kubernetes configuration management utilizing Git as a single source of truth. They called this approach “GitOps.”
The main idea of GitOps is to shift focus from the infrastructure to the Git repo. Basically, this means that when you want to deploy a new application or configure an existing one, you should work with Git, not with servers. Just prepare changes, create a Pull Request, pass the review process, merge, and boom! GitOps magic happens.
Two Ways to GitOps
There are two ways to implement the deployment strategy for GitOps: Push-based and Pull-based.
The push-based strategy doesn’t differ much from classic deployment pipelines, and it’s also easier to implement. That’s why it can be used for a wider variety of scenarios, such as Kubernetes, Terraform, or Ansible-based environments. The main requirement here is idempotency of the tool of choice (i.e., the ability to apply the same configuration again and again without a change in behavior). As a rule, push-based GitOps pipelines use automation tools such as Jenkins, CircleCI, etc.
The major components of this infrastructure are:
Repository with configuration
CI/CD automation tool (e.g., Jenkins or CircleCI)
Target environment
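As an illustration, a minimal push-based pipeline might look like this GitHub Actions workflow. The workflow file name, branch, secret name, and manifests path are all hypothetical — this is a sketch of the idea, not a production pipeline:

```yaml
# .github/workflows/deploy.yml — hypothetical push-based GitOps pipeline
name: deploy
on:
  push:
    branches: [main]          # every merge to main triggers a deployment
jobs:
  apply:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Credentials for the target cluster come from a repository secret
      - run: echo "$KUBECONFIG_DATA" > kubeconfig
        env:
          KUBECONFIG_DATA: ${{ secrets.KUBECONFIG }}
      # kubectl apply is idempotent, so re-running this pipeline is safe
      - run: kubectl --kubeconfig=kubeconfig apply -f manifests/
```

The idempotency requirement mentioned above is what makes this safe: the pipeline can run on every merge, and unchanged manifests simply result in no-ops.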
The pull-based GitOps strategy is mostly used for Kubernetes environments because this approach requires unusual tooling — some kind of an agent that continuously polls the Git repo and reconciles the Kubernetes state to match the repo content. Examples of such agents are Weave Flux and ArgoCD.
The main components of this infrastructure are:
Repository with configuration
Target environment with an agent inside
The GitOps approach has many advantages, such as improved security, genuine IaC, easy rollbacks, and you name it! You can check a whole bunch of articles on this:
GitOps — Operations by Pull Request — I believe it’s the original post on GitOps
Guide to GitOps from Weave with key benefits
Pull-based CD Pipelines for Security — A nice article on GitOps from the infosec perspective
Now let’s get into practice.
Hello, GitOps World!
So, let’s kick off with a simple example of a GitOps-based Kubernetes infrastructure.
Simple enterprise-level infrastructure.
Simple. Enterprise. Wait. What? These two words shouldn’t be in one sentence, right? Not in this case. It’s not that easy to build an enterprise-grade GitOps configuration, but it’s relatively simple: just a series of simple steps, without complicated automagical components.
In this example, we’ll create a setting for multiple clusters and multiple teams with separated responsibilities. We need to cover:
Manifests quality control
Separation of access
QA and dev teams’ user experience and velocity
We can cover most of these points with a proper Git structure for the Kubernetes state repo(s); some of them require additional automation.
Our company, of course, has a bunch of microservices in a bunch of environments. You can usually divide them by purpose, like this:
QA, where QA engineers run end-to-end, acceptance, or any other kinds of tests — activities that require a complete, isolated environment with all components
Dev, where developers can perform R&D activities
Prod, where applications run and generate value
Of course, this isn’t the only possible model. You may also have one or more staging environments, for example, or dedicated performance clusters. That isn’t the topic of this article, though, so let’s assume those three are enough.
As for Dev and QA environments, they usually don’t require strict access isolation and can even sometimes use inter-environment communications. So we can put them together in one DevQA cluster if we’d like to.
The main principle here is that the particular use case in a specific organization defines what is allowed and what is not. Once again, let’s assume that here we can put them together.
First Glimpse on Git Repositories
Now, we have two major divisions: DevQA and Prod. And, accordingly, we can create two Kubernetes clusters and two Git repositories — one for each, where we’ll store all application-related configurations.
Depending on the approach you choose for manifests management, you may need additional Git repositories. In this case, we use Kustomize, and all kinds of environments should share the very same base. So we create a separate repo with base manifests, which we can include in clusters’ repositories as a submodule, remote base, or just automatically replicate files from it.
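For illustration, here is what a remote-base reference from a cluster repo might look like; the organization, repository name, path, and ref are all hypothetical:

```yaml
# dev/frontend/kustomization.yml — hypothetical overlay referencing the shared base
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # remote base: the shared base-manifests repository, pinned to a ref
  - https://github.com/example-org/k8s-base//frontend?ref=v1.0.0
```

Pinning the ref makes base changes explicit in the cluster repo’s history, which is the same property a submodule gives you; unpinned remote bases trade that away for convenience.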
Every cluster repo has a base folder (or submodule) and several others, each of which represents a separate namespace. Each folder in the namespace represents one application.
Three repositories allow us to create three separate change flows: one for each kind of environment (DevQA and Prod) and one for the base manifests, which affects both types. We aren’t limited to three repositories, however; we can create as many as the required level of separation demands.
Now, we create our application configuration and put it in appropriate folders. Let’s take this canonical example and deploy it to the dev environment, QA, and production.
Deep Dive Into Repo Structure
To prepare the base repository, the first thing to do is to define our applications — in this case: frontend, redis-master, and redis-slave. Three applications in total.
Then, create an environment-independent configuration for each application and place it in the base repository.
Note that the namespace level isn’t featured here. It’s the base repository, and the configurations of these applications can be inherited by any namespace in any cluster we want.
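For example, the environment-independent part of the frontend application might be sketched like this. The image follows the canonical guestbook example; treat all other values as illustrative:

```yaml
# base/frontend/deployment.yml — no namespace, no env-specific values
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 1               # a sane default; overridden per environment
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: gcr.io/google-samples/gb-frontend:v4
---
# base/frontend/kustomization.yml — ties the app's resources together
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yml
  - service.yml
```

Because no namespace is set anywhere in the base, these manifests stay inheritable by any overlay in any cluster.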
Cluster’s Repo Preparation
For the cluster-specific repositories, we start by adding a base repo as a submodule.
Now, we create separate folders for desired namespaces in our clusters. “dev” and “qa” for DevQA cluster and “prod” for Prod cluster.
Then, create folders for the applications in the namespace folders.
And finally, create env-specific overrides. For example, an override for a frontend application in dev env:
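Such an override might look like this. The paths assume the base repo is mounted as a `base` submodule, and the replica count is illustrative:

```yaml
# dev/frontend/kustomization.yml — dev-specific override
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: dev              # pin all resources to the dev namespace
resources:
  - ../../base/frontend     # environment-independent base from the submodule
patches:
  # inline strategic-merge patch overriding the base defaults
  - patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: frontend
      spec:
        replicas: 1         # a single replica is enough for dev
```

The prod overlay would look the same, differing only in the namespace and the patched values.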
You can find the cluster’s repo structure diagram below. It slightly differs from the previous one, because we added the namespace level.
Ok, now we have configuration repositories for clusters. We’ve built the user-facing part of the GitOps infrastructure — the piece where developers and QA engineers create their pull requests with changes for environments, and where we control the applications’ change flow.
The next step is what complements these repositories and makes them enterprise-ready: the system layer.
System Base Preparation
Now, we create three more repositories mirroring the Application layer, but this time for system-level configuration.
Here we deploy not only applications consisting of deployments and services, like those in the Application layer, but also slightly different things, such as namespace- or cluster-level configuration. This could be RBAC config, resource limits and quotas, monitoring solutions, etc. In our example, we’ll create a namespace template and an ingress-nginx controller.
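One possible sketch of such a namespace template: the base keeps environment-independent defaults, while the Namespace object itself is named per environment in the overlays. All names and quota values here are illustrative:

```yaml
# base/namespace/quota.yml — default quota inherited by every user-space namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: default-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    pods: "20"
---
# base/namespace/kustomization.yml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - quota.yml
```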
System Cluster’s Repo Preparation
Here, along with system applications, such as the ingress-nginx controller mentioned above, we’re going to configure user-space namespaces: dev and qa for the DevQA cluster, and prod for Prod.
Now, let’s define a step-by-step algorithm:
Add a system base repo as a submodule to our DevQA and Prod system repositories
Create folders for the user-space namespaces and for the ingress-nginx namespace, which is not present in the Application layer
Create a “namespace” folder in each namespace’s folder for our system-level “namespace” application
Create an env-specific override for the namespace application:
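Such an override could be sketched like this, assuming the shared defaults live under a `base/namespace` path (all paths and names are hypothetical):

```yaml
# dev/namespace/kustomization.yml — dev-specific namespace override
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: dev              # attach the shared defaults to the dev namespace
resources:
  - ../../base/namespace    # quota and other environment-independent defaults
  - namespace.yml           # the Namespace object itself
---
# dev/namespace/namespace.yml
apiVersion: v1
kind: Namespace
metadata:
  name: dev
```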
In the image below, note that we have not only the namespaces we had in the Application layer, but also new ones: argocd and ingress-nginx. We keep them here because we don’t want users to access them.
Now, we have an Application layer that is accessible by Dev and QA teams and a System layer for cluster administrators. Our Git core of the GitOps infrastructure is almost ready.
Bear in mind that the GitOps agent is the very center of our infrastructure, so we should create a separate repository with an ArgoCD configuration only. It’s our fourth system repository.
ArgoCD can manage multiple clusters from a single installation, but for simplicity, in our example, we install an ArgoCD instance in each cluster.
The Argo project utilizes Kustomize for manifests management, so we can just create a kustomization file with a remote base and then apply the overrides we need.
Cluster-Specific Overrides for ArgoCD
Instance configuration deserves some explanation, too. Don’t hesitate to check the files in the repo to learn more. DevQA cluster installation kustomization.yml looks like this:
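As a sketch — the pinned version and patch file name are illustrative — it could be along these lines:

```yaml
# kustomization.yml for the DevQA ArgoCD instance
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: argocd
resources:
  # official ArgoCD install manifests as a remote base, pinned to a release
  - https://raw.githubusercontent.com/argoproj/argo-cd/v2.9.3/manifests/install.yaml
patches:
  # cluster-specific overrides, e.g. the argocd-cm ConfigMap with repo settings
  - path: argocd-cm-patch.yml
```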
If you want to know exactly how we configure an ArgoCD instance, please check out the comments above and this repository storing all override files.
Once overrides are ready, we proceed with the initial installation by executing a simple command in the cluster’s instance folder:
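Assuming the standalone kustomize binary is installed, that command could be:

```shell
# Render manifests explicitly, then pipe them to kubectl
kustomize build . | kubectl apply -f -
```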
Note that we didn’t use the `kubectl apply -k` command: we render manifests explicitly to keep kubectl separate from the templating tool. This way, you can select your own tool, even one that doesn’t integrate with kubectl.
After the ArgoCD installation is done, it kicks off two continuous processes:
Polling the repositories referenced by all Argo Application objects
Reconciling the Kubernetes state based on what it finds in these repositories
For now, we have only two ArgoCD Applications: self and devqa-applications. The first one points to the repo with the ArgoCD configuration; the second one, to the devqa-system repo. The latter is dormant because we don’t have an argocd folder there yet.
This folder in the System repo is our last piece of the puzzle; it’s what makes this beautiful structure work. It’s the final preparation step, connecting our layers: the Application layer with the System layer, and both of them with the ArgoCD agent. The folder contains all ArgoCD Applications for this particular cluster.
Glue Everything Together
To begin with, create an “argocd” folder in the system repo of each cluster. Then, create an ArgoCD Application manifest for every single application in the cluster’s repositories. And finally, we get the complete structure, as shown below:
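A single such Application — say, for the frontend app in the dev namespace — might look like this; the repo URL, project, and paths are hypothetical:

```yaml
# argocd/dev-frontend.yml — one ArgoCD Application per app folder
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: dev-frontend
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/devqa-applications.git
    targetRevision: HEAD
    path: dev/frontend          # folder with the env-specific overlay
  destination:
    server: https://kubernetes.default.svc
    namespace: dev
  syncPolicy:
    automated:
      prune: true               # delete resources removed from Git
      selfHeal: true            # revert manual changes in the cluster
```

With `automated` sync enabled, merging a pull request into the cluster repo is all it takes to roll out a change.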
With this structure, we control manifests quality by enforcing the peer review process for all parts of the Kubernetes configuration. We separate access by having multiple repositories, too. We can implement role-based access control for different clusters and different parts of the cluster, without even touching Kubernetes — only by managing permissions of Git repositories.
This structure also allows us to automate manifests updates and to provide developers with a more user-friendly interface than plain YAML. It also gives you the ability to dynamically create and destroy applications (and even whole environments), because it’s still just a set of files in the Git. We’ll explore this topic, as well as automation, in the next article.
As I mentioned earlier, we started with a very basic enterprise example. So, get ready for something much more sophisticated!
Just kidding. This time we aren’t going to spend much time explaining more complex concepts, because the general idea is still the same; the hierarchy is the only difference.
For now, I’d like to illustrate that the previous example is not the only way of dealing with the GitOps repo structure. Here’s another one.
Here we have a little startup with two significant applications and two environments (Dev and Prod). Each dev team manages one application’s configuration in both environments. All Kubernetes system components, such as the ingress controller, the external-dns controller, and ArgoCD, are installed during cluster provisioning and don’t require additional management.
Startup Repo Structure
For such a case, we can build a simple one-repository structure:
All-in-one repository structure
The main difference here is that one repository contains all configurations for all applications and can even contain non-Kubernetes configurations, such as Terraform. Application configuration utilizes the classic overlays approach, which is described in the Kustomize documentation.
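As a sketch, such an all-in-one repository might be laid out like this (the application names are placeholders):

```
.
├── app-one/
│   ├── base/
│   └── overlays/
│       ├── dev/
│       └── prod/
├── app-two/
│   ├── base/
│   └── overlays/
│       ├── dev/
│       └── prod/
└── terraform/
```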
This example demonstrates an entirely different approach to repo organization. It can be easier to manage, since we don’t have to create seven separate repositories. But as the organization grows, every startup can eventually become an enterprise. So, when designing a repo structure, you should always keep the potential to scale in mind.
In this article, I’ve briefly reviewed the theory of GitOps and tried to elaborate on how exactly you can structure a Git repository in different scenarios.
Don’t hesitate to try to deploy everything described here in your local Kubernetes installation, such as docker-for-desktop or minikube. Experiment and share your feedback. Also, let me know if you know more efficient ways to organize a Kubernetes configuration!
All configurations featured in the article are here:
Opinions expressed by DZone contributors are their own.