Journey to Containers - Part III
In the third part of this series, we dive into the deployment of an application and the configuration of Kubernetes.
Before we move on to Part III, I would like to recap what we did in Parts I and II. It is important to understand the transition, as it will help you when you think about migrating your existing application(s) (which may currently be running on a VM or physical server) to Docker.
In Part I, we picked a simple two-tier Python web application and first installed it on the local machine; this is the equivalent of installing an application on a server. We walked through all the requirements needed to install this application. This application has no compilation (build) step, so no binaries are generated, but keep in mind that a build step is common for languages like Java and Go, and it is important to understand the overall requirements.
In Part II, we modified the Python application a little so it could run inside a container. Basically, we removed the VM/server dependencies. This is a very important step when you migrate an application to containers. Previously, we tightly coupled our applications to specific servers, so moving an application to another stack of servers became a nightmare: we had to rediscover all the application-specific setup (permissions, environment variables, additional utilities, libraries, etc.) along with the server environment, including the OS version and patch levels. Basically, we treated that server as a pet. I have experienced scenarios where a few applications could only run on specific servers and no one knew what special setup those servers had, either because all configuration was done manually with little or no documentation, or because the SME left the company or moved to another area. I am sure everyone has experienced such situations.

In the container world, there are no more pets; servers are treated as cattle. This means we should be able to easily replace or swap infrastructure components when needed. Do not put any dependencies on the server infrastructure. The application should be self-contained so that it can run anywhere, be it on-premise or on cloud infrastructure, with no changes to the underlying infrastructure. That's where containers play a critical role.
We also removed server dependencies from the application and packaged all of its requirements using a Dockerfile. The output of a Dockerfile build is a Docker image, which contains everything the application needs and can run on any Linux platform; the underlying infrastructure doesn't really matter. All it needs is a Docker runtime, and the application will come up, whether on cloud or on-premise infrastructure. When we run the image, it spins up a container and starts the application based on the instructions provided in the Dockerfile. In Part II, we also talked about the need for container orchestration platforms: when you run many containers that communicate with each other, managing them at scale on a single Docker host is not practical.
With the rise of containers, many orchestration platforms came to market, such as Amazon's ECS/EKS, Azure's ACS/AKS, Pivotal's PKS, IBM's IPC, and Red Hat's (now IBM's) OpenShift. Docker has its own orchestrator, Docker Swarm, and there are also Mesos and Kubernetes.
Of all these container orchestration platforms, Google's Kubernetes (K8s) quickly emerged as the most popular open source orchestration platform. It has been very well received by the industry and is now available from the major cloud providers as well. K8s is backed by Google's decades of experience running containers in production; Google later donated the project to the CNCF. Read more about the CNCF here.
I think that's enough background on Kubernetes and orchestration platforms. Let's dive deep into the technical details of Kubernetes.
A Kubernetes cluster is a pool of machines that handles container workloads. Master nodes run the Kubernetes control plane, while worker nodes (also called minions) run the workloads. When you deploy objects to Kubernetes, you provide the desired state in a declarative way. Basically, you just describe what you expect from the deployment, such as which image to use, how many replicas of the application, resource requirements, etc. You do not specify how to create those replicas or obtain those resources. Various processes within the control plane make sure the cluster reaches and maintains your desired state: they may move your application across nodes, schedule pods onto other nodes to get more resources, and so on.
The Kubernetes master is made up of three processes: kube-apiserver, kube-controller-manager, and kube-scheduler. Each process has a specific role in the cluster. A worker node runs two processes: kubelet and kube-proxy. Refer to more details on the architecture here.
Kubernetes works with basic objects such as Pods, Services, Volumes, ConfigMaps, Secrets, and Namespaces, which get deployed to the cluster. There are also higher-level abstractions called controllers, which provide additional management capabilities: ReplicaSet, Deployment, StatefulSet, DaemonSet, and Job.
Once you open the Kubernetes dashboard, you will see all of these objects and controllers. Note that Kubernetes provides a command-line tool, kubectl, which interacts with the API server to perform most tasks.
In this section of the article, we'll mainly be dealing with Pods, Services, Namespaces, Deployments, and ConfigMaps.
The Pod is the smallest Kubernetes object that you create and deploy. A Pod represents a running process in your cluster. It consists of one or more containers, along with the specifications you provide at deployment time. Docker is the most common runtime used in Kubernetes, although other container runtimes such as rkt are also supported. Note that pods are, in most cases, short-lived: they can come and go and be moved across nodes. Controllers such as ReplicaSets manage the lifecycle of pods. Each pod gets a unique IP address, but because of the ephemeral nature of pods, you cannot rely on that IP address to access them. This is where the Service object comes into play.
A Service object defines a stable name, address, and port for a set of pods with a common set of labels. It also provides basic load balancing across the pods in the cluster that carry those labels. Labels are key-value pairs applied to pods; they tie the pods to a specific Service. There are four Service types, with ClusterIP being the default. In this example, we'll use the NodePort type, which exposes the application on each node's IP address at a given port. This acts as a kind of load balancer for your application when you have multiple pods running, and it also helps with Blue/Green or Canary deployment mechanisms.
If the application is being deployed to the cloud, Kubernetes provides the "LoadBalancer" Service type, which uses the cloud provider's load balancer. Be cautious when using this type, as it creates a new load balancer per application.
Consider a Namespace as a kind of logical separation of environments in Kubernetes. Namespaces can also be used to separate application teams or functional areas. When you install Kubernetes, apart from the system namespaces, a default namespace gets created. If you deploy objects without specifying a namespace, they land in the default namespace.
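For illustration, a namespace can also be created declaratively; here is a minimal sketch (the name dev is an assumption, not part of this series' deployment):

```yaml
# Hypothetical namespace definition; the name "dev" is only an example.
apiVersion: v1
kind: Namespace
metadata:
  name: dev
```

Objects can then be deployed into it by adding `-n dev` to the kubectl command, e.g. `kubectl apply -f <file> -n dev`.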
The Deployment is one of these controllers; it manages updates to Pods (and ReplicaSets). In the Deployment object, you provide the desired state of your application, including the image name, resources, environment configuration, labels, and the name of the container.
We'll look at the Deployment object in more detail as we start writing YAML files for the Python application. We'll cover ConfigMaps later in this section to avoid confusion with the other Kubernetes objects.
For this section, I have installed minikube on Ubuntu, which creates a single-node cluster and provides most of the basic Kubernetes functionality to play with. If you don't want to install minikube, you can use the lab environment provided by Docker. You have to log in with either your GitHub or Dockerhub account.
I recommend creating both accounts if you have not done so yet. We'll use the GitHub account in upcoming sessions when we talk about the CI/CD pipeline. For this session, we need a Dockerhub account so that we can push and pull images to an image registry, which will be Dockerhub. By default, minikube, or for that matter any Kubernetes cluster, pulls images from Dockerhub if you don't explicitly specify an image registry.
As you set up the Kubernetes cluster, keep in mind that security is of the utmost importance; there is a lot of work going on to harden the K8s platform, and various security tools are available that provide great functionality in this area.
Prerequisites for this session:
Kubernetes cluster (any of the Kubernetes offerings, or minikube)
Dockerhub account ( https://hub.docker.com/) with one public repository
Kubectl client - Use this link for instructions
There are specific machine requirements for installing minikube locally. Please refer to the link for installation details, and this link for how to run minikube locally. It's easy to install minikube locally, but if you don't have the required machine configuration or for some reason don't want to install it on your laptop, the Docker lab mentioned earlier comes in handy. The only caveat is that to get the dashboard, you may have to do additional configuration; alternatively, you can just use the kubectl command line to interact with the cluster. The lab environment and VMs remain valid for only 4 hours. Another option is Katacoda, which provides various Kubernetes courses along with a single-node cluster. The issue there is that the environment remains available only for a certain duration, and you get disconnected after that time. However, you can open the dashboard and look at your actual deployments and all other Kubernetes objects.
Deploying Python Web Application to Kubernetes
We created the application image python-webapp:1.0.0 in Part II. We'll use the same image to run the application in Kubernetes. This is one of the biggest benefits of using containers: we don't have to change the application to deploy it to the target platform, because the image is packaged with everything the application needs. Also note that we did not make any changes to the Redis image, so we'll pull it directly from Dockerhub during deployment to Kubernetes.
As mentioned earlier, Kubernetes uses Dockerhub as the default image registry, so we need to push our local image python-webapp:1.0.0 to Dockerhub first.
Push local image to Dockerhub:
1. Make sure the application image is available locally. If the image is not available, build it using the instructions from Part II.

```
$docker images
REPOSITORY      TAG     IMAGE ID       CREATED       SIZE
python-webapp   1.0.0   8f7c151c354c   4 weeks ago   131MB
```
2. Tag the image with a Dockerhub repository name. The Dockerhub repository name is a combination of your username and the image name.
```
$docker tag python-webapp:1.0.0 docuser200/python-webapp:1.0.0
$docker images
REPOSITORY                 TAG     IMAGE ID       CREATED       SIZE
docuser200/python-webapp   1.0.0   8f7c151c354c   4 weeks ago   131MB
python-webapp              1.0.0   8f7c151c354c   4 weeks ago   131MB
```
3. Push the image to Dockerhub.

```
$docker push docuser200/python-webapp:1.0.0
The push refers to repository [docker.io/docuser200/python-webapp]
4d5fb3ff5474: Pushed
4837a60f6823: Pushed
4aabf78dee51: Pushed
639216b11d4f: Pushed
473e9a98e4dd: Pushed
9ff579683928: Pushed
237472299760: Pushed
1.0.0: digest: sha256:7e1d842e8dc293da9d624aa1b28bab2f586d9ce1b57b5cbd24d75f9715b24dad size: 1788
```
Now both of our images, the application image and the Redis database image, are in Dockerhub.
To deploy an application to Kubernetes, we need to use specific Kubernetes objects, and we'll define them using declarative YAML. Kubernetes also supports JSON, but I feel YAML is better because we don't have to deal with braces and other formatting for maps and lists. If you are more comfortable with JSON, you can use it instead.
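To see the difference concretely, here is a sketch (not from the article itself) that expresses the web app's Service as a Python dict and serializes it to JSON; the braces and brackets in the output show exactly what YAML lets us avoid:

```python
import json

# The Service for the web app (same fields as the YAML definition given
# later in this article), expressed as a Python dict. Kubernetes accepts
# the same structure serialized as either YAML or JSON.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "python-webapp-svc"},
    "spec": {
        "selector": {"app": "python-webapp"},
        "type": "NodePort",
        "ports": [{"protocol": "TCP", "port": 9000, "targetPort": 9000}],
    },
}

# The JSON form is what you would submit if you chose JSON over YAML.
service_json = json.dumps(service, indent=2)
print(service_json)
```

Both forms describe the identical object; the API server does not care which serialization you use.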
Create a YAML file for python-webapp (python-webapp-deployment.yaml).
For the application, we have to create two Kubernetes objects: a Service and a Deployment. The Service, as mentioned earlier, is an abstraction over pods and defines the policy for accessing them. The application service interacts with the database service, which internally talks to the Redis DB pod. The Deployment object creates the pods and the ReplicaSet controller that manages them.
Note: Make sure the service and deployment objects are separated by "---" (three hyphens).
The service definition looks like the one below. Note that this is the simplest form of a service; look at the Service core API for more attributes here.
The definition below declares a service named python-webapp-svc that routes to pods carrying the label app: python-webapp. The service type is NodePort, meaning the service can be accessed using a node's IP address and a port number generated by the service. targetPort is the port exposed within the container, which is 9000, and the service itself also exposes port 9000. Note, however, that externally the service is exposed on a random node port, which must be used when accessing the application.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: python-webapp-svc
spec:
  selector:
    app: python-webapp
  type: NodePort
  ports:
    - protocol: TCP
      port: 9000
      targetPort: 9000
```
The Deployment is the most common object in Kubernetes, and you will use it for most deployments. We are using the apps/v1 version of the API, which includes the Deployment object; you'll find that other versions, such as v1beta1 and v1beta2, also include this object with additional beta functionality. The name of the deployment is "python-webapp-deployment." The spec.selector is the label selector "app: python-webapp" for the pods that the ReplicaSet manages; this label must match the pod template's label. Also note that the service uses this same label to reach the pods on the backend, which is why the service's selector is also "app: python-webapp". We are asking for 2 replicas of the pod as the desired state after deployment, so Kubernetes will try to maintain that state. The container created in each pod will be named "hello-python" (Kubernetes appends a unique identifier to each pod's name). The image, pulled from Dockerhub by default, has the fully qualified name "docuser200/python-webapp:1.0.0". containerPort indicates the port exposed by the container, which is 9000.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-webapp-deployment
spec:
  selector:
    matchLabels:
      app: python-webapp
  replicas: 2
  template:
    metadata:
      labels:
        app: python-webapp
    spec:
      containers:
        - name: hello-python
          image: docuser200/python-webapp:1.0.0
          ports:
            - containerPort: 9000
```
Create a YAML file for the Redis DB deployment (python-redis-deployment.yaml).
This file also includes a service, which the application service uses to reach the Redis pods.
Note: Make sure the service and deployment objects are separated by "---" (three hyphens).
The target port is the port exposed from the container; in this case, the Redis container exposes port 6379. The service type here is ClusterIP, because we don't want to access the Redis DB from outside the cluster.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  selector:
    app: python-redis-app
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 6379
      targetPort: 6379
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-redis-app
spec:
  selector:
    matchLabels:
      app: python-redis-app
  replicas: 1
  template:
    metadata:
      labels:
        app: python-redis-app
    spec:
      containers:
        - name: python-redis
          image: redis:5.0
          ports:
            - containerPort: 6379
```
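With the Service named redis in place, in-cluster DNS lets the web app reach the database by that name instead of by pod IP. A minimal sketch of how the connection settings might be resolved (this is illustrative, not the article's actual application code; the environment-variable names and fallbacks are assumptions):

```python
import os

# Kubernetes DNS resolves the Service name "redis" to the Service's
# ClusterIP, so the application can connect by name. These variable names
# and defaults are assumptions for illustration only.
REDIS_HOST = os.environ.get("REDIS_HOST", "redis")      # the Service name
REDIS_PORT = int(os.environ.get("REDIS_PORT", "6379"))  # the Service port
print(f"connecting to {REDIS_HOST}:{REDIS_PORT}")
```

Because the Service name is stable, the app keeps working even as Redis pods are rescheduled and their IPs change.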
Deploy the YAML files to the Kubernetes cluster (minikube):
Make sure both images, docuser200/python-webapp:1.0.0 and redis:5.0, are available in Dockerhub. Also, delete the images from your local machine/laptop so that Kubernetes pulls them from Dockerhub.
1. If you are using minikube, start it if it's not already running. If you are using a cloud cluster such as EKS, AKS, or others, skip this step.
```
$minikube start
There is a newer version of minikube available (v0.30.0). Download it here:
https://github.com/kubernetes/minikube/releases/tag/v0.30.0

To disable this notification, run the following:
minikube config set WantUpdateNotification false

Starting local Kubernetes v1.10.0 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
Kubectl is now configured to use the cluster.
Loading cached images from config file.
```
2. Execute kubectl apply on both of the YAML files.

There are two ways to manage Kubernetes objects: imperative management and declarative management. With the imperative technique, you tell Kubernetes exactly which operation to perform, such as create, replace, or delete. This approach doesn't maintain change history. With the declarative approach, Kubernetes determines the operation from the object configuration in the file; create, update, delete, etc. are detected automatically. In addition, the declarative approach maintains history, so you can roll objects back to a previous version, which is very handy.

Note that a Kubernetes object should be managed using only one technique; mixing techniques for the same object results in undefined behavior. You can read more details about both techniques here.
- Create a folder named "kubernetes" in your project folder.
- Create both of the files above. Make sure the service definition and the deployment definition are separated by --- (three hyphens), and that the service definition comes first, followed by the deployment definition.
```
$ls
python-webapp-deployment.yaml  python-redis-deployment.yaml
```
- Execute the apply command on the Redis file first, and then on the webapp file.
```
$kubectl apply -f python-redis-deployment.yaml
service "redis" created
deployment "python-redis-app" configured
```
Validate that the deployment succeeded using the kubectl get command.

```
$kubectl get deployments
NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
python-redis-app   1         1         1            1           2m
```
```
$kubectl apply -f python-webapp-deployment.yaml
service "python-webapp-svc" created
deployment "python-webapp-deployment" created
```
Validate the deployment using the kubectl get command.

```
$kubectl get deployments
NAME                       DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
python-redis-app           1         1         1            1           8m
python-webapp-deployment   2         2         2            2           2m
```
Start the dashboard using the minikube dashboard command.
The dashboard should open in your default browser. In the dashboard, browse through objects such as Deployments, ReplicaSets, Services, etc.
To access the application in minikube, execute the command below, which will open the default browser.
```
$minikube service python-webapp-svc
Opening kubernetes service default/python-webapp-svc in default browser...
```
One other interesting thing to observe as you access the service is that it load balances across the pods. As you refresh the application page, you'll see the hostname change, which means requests are being routed to either of the two pods. You can increase the number of replicas for the same application, and the service will still balance requests across all pods (the distribution is handled by kube-proxy, which uses round robin or random selection depending on its mode).
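One plausible way an app can reveal which replica served a request, sketched here as an assumption rather than the article's actual source: inside a pod, the container's hostname is set to the pod name, so displaying it in the page is enough.

```python
import socket

# Inside a Kubernetes pod, the container hostname equals the pod name,
# so printing it identifies which replica handled the request.
hostname = socket.gethostname()
print(f"Served by: {hostname}")
```

Each replica prints a different pod name, which is why the hostname shown in the page changes as requests are balanced.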
If the cluster is in the cloud or on-premise, you need to access the application using the NodePort. Follow the steps below:
- Go to Dashboard -> Services
- Select python-webapp-svc
- You'll see two internal endpoints. The first one indicates the service port, 9000, and the second one, 31044 in this case, is the random node port used to access the application from outside the cluster (this changes every time the service is freshly installed).
- Go to the Nodes section in the dashboard and click on any node to get its IP address. In the case of minikube, execute the command "minikube ip" to get the node's IP address.
- Access the application in the browser using the node IP address and the node port, e.g. http://192.168.99.100:31044
Let's make the deployment more interesting by overriding the environment variables defined in the Dockerfile using a Kubernetes YAML file. Kubernetes provides the ConfigMap object to define environment variables, which are essentially key-value pairs. Below is the ConfigMap definition for our deployment. It overrides the same two environment variables that we defined earlier in the Dockerfile: NAME and BGCOLOR.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: python-webapp-config
  labels:
    app: python-webapp-config
data:
  NAME: This message is from configmap - Hello kubernetes !
  BGCOLOR: yellow
```
We have to reference this ConfigMap in the application's Deployment object definition, as below. No changes to the service object are needed, but the deployment object needs to refer to the ConfigMap.
Also, make sure the objects are separated by "---" (three hyphens).
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-webapp-deployment
spec:
  selector:
    matchLabels:
      app: python-webapp
  replicas: 2
  template:
    metadata:
      labels:
        app: python-webapp
    spec:
      containers:
        - name: hello-python
          image: docuser200/python-webapp:1.0.0
          ports:
            - containerPort: 9000
          envFrom:
            - configMapRef:
                name: python-webapp-config
```
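Inside the container, envFrom surfaces each key in the ConfigMap's data section as an ordinary environment variable. A sketch of how the Python app might read them (the fallback defaults below are assumptions for illustration, not the app's actual values):

```python
import os

# envFrom injects NAME and BGCOLOR from the ConfigMap into the container's
# environment; the defaults here only apply when the variables are unset.
NAME = os.environ.get("NAME", "Hello from the Dockerfile")
BGCOLOR = os.environ.get("BGCOLOR", "white")
print(f"message={NAME!r} background={BGCOLOR!r}")
```

This is why no application code change is needed: the app already reads its configuration from the environment, and the ConfigMap simply supplies new values.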
Execute the kubectl apply command on the file "python-webapp-deployment.yaml." Note that we don't have to delete the earlier deployment: since we are using the declarative approach, Kubernetes determines what operation to perform on each object based on the YAML file contents. In addition, the application will have no downtime, because by default Kubernetes performs a rolling update: it brings up new pods in a rolling manner and shuts down the old pods once the new ones are up and running.
```
$kubectl apply -f python-webapp-deployment.yaml
configmap "python-webapp-config" created
service "python-webapp-svc" configured
deployment "python-webapp-deployment" configured
```
Access the application again and you should see the page text change (NAME) and the background color (BGCOLOR) change to yellow.
This completes the Kubernetes section of this series. Although this was a very short crash course and we only scratched the surface of Kubernetes, look at what we have done so far: we have actually migrated an application that was created for installation on a virtual machine to a container platform. We used Docker to convert the application into a container, and Kubernetes to run those containers.
The activities we performed so far involved writing source code, creating a Dockerfile and YAML files, and executing a few commands to build images and deploy them. Doing this manually is fine the first time. But what if we have to change the source code again because the business changed the requirements, and we have to go through the entire cycle once more? Manually making changes, building new images, and performing deployments can be very time-consuming and error-prone. In addition, there are no logs of the activities performed, so there is no traceability.
This is where CI/CD pipeline tools come in: they automate the end-to-end lifecycle, from source code control all the way to deploying the application to various environments and, finally, to production. Docker and Kubernetes help speed up application delivery with very consistent application behavior across multiple environments. Issues such as the application not working in production, or differences between environments causing inconsistent behavior, will no longer exist; the images we build behave consistently in any environment, as long as the test data is valid and consistent with production.
Opinions expressed by DZone contributors are their own.