This article is part of a series exploring a workshop that guides you through the open source project Fluent Bit: what it is, a basic installation, and setting up a first telemetry pipeline project. You'll learn how to manage your cloud-native data from source to destination using the telemetry pipeline phases, covering collection, aggregation, transformation, and forwarding from any source to any destination.

The previous article in this series covered installing Fluent Bit on our local machine from the project source code. This time around, we'll learn how to use Fluent Bit in a container on our local machine, including how to run the container with local configuration files. You can find more details in the accompanying workshop lab. Let's get started with Fluent Bit in a container.

Installing Fluent Bit from a container image is demonstrated here using the open source project Podman. It's assumed you have already installed the Podman command line tooling. Note that the following code and command line examples are all based on an OSX machine. If you want to use other container tooling, such as Docker, most of the commands are the same with just a substitution of the tooling name (docker instead of podman).

Containerized Fluent Bit

While it's not that difficult to run Fluent Bit in a container, we'll also show you how to do it with local configuration files so that you can use it to build your first telemetry pipelines. Running Fluent Bit in a container is straightforward; just start the container image as follows:

$ podman run --name fb -ti cr.fluentbit.io/fluent/fluent-bit:2.2.2

Let's take a look at what this command is actually doing. First, the flag --name fb gives the container a name we can reference later.
The -ti flags assign the container a console for output and keep it interactive, and finally, the command uses the image version supported in this workshop (cr.fluentbit.io/fluent/fluent-bit:2.2.2).

You'll notice the container starts and takes over the console for its output, where Fluent Bit is measuring CPU usage and dumping it to the console (CTRL-C will stop the container):

Fluent Bit v2.2.2
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

...

[2024/03/01 10:38:52] [ info] [fluent bit] version=2.2.2, commit=eeea396e88, pid=1
[2024/03/01 10:38:52] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/03/01 10:38:52] [ info] [cmetrics] version=0.6.6
[2024/03/01 10:38:52] [ info] [ctraces ] version=0.4.0
[2024/03/01 10:38:52] [ info] [input:cpu:cpu.0] initializing
[2024/03/01 10:38:52] [ info] [input:cpu:cpu.0] storage_strategy='memory' (memory only)
[2024/03/01 10:38:52] [ info] [sp] stream processor started
[2024/03/01 10:38:52] [ info] [output:stdout:stdout.0] worker #0 started
[0] cpu.local: [[1709289532.997338599, {}], {"cpu_p"=>0.250000, "user_p"=>0.000000, "system_p"=>0.250000, "cpu0.p_cpu"=>0.000000
[0] cpu.local: [[1709289533.996516160, {}], {"cpu_p"=>0.000000, "user_p"=>0.000000, "system_p"=>0.000000, "cpu0.p_cpu"=>0.000000

Should we encounter failures at any time during installation, testing, data population, or build results, don't worry! These steps can be rerun any time after you fix any problems reported. Depending on how far you get before something goes wrong, you might have to remove the Fluent Bit container. Just stop, remove, and restart it as follows:

$ podman container stop fb
$ podman container rm fb
$ podman run --name fb -ti cr.fluentbit.io/fluent/fluent-bit:2.2.2

There may come a moment when we want to stop working with Fluent Bit and pause until a later time. To do this, we can shut down our container environment by stopping the running Fluent Bit container and then stopping the Podman virtual machine as follows:

$ podman container stop fb
$ podman machine stop

Now let's take a look at building container images with our specific telemetry pipeline configurations.

Building Container Images

Often you will want to set up a specific configuration and add it to the container image. This means building your own container image with the custom configurations copied into it.
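Before building an image, it helps to have the configuration files in hand. As a hedged sketch (the file names match those used below; the contents are illustrative of Fluent Bit's classic configuration format, with @INCLUDE pulling the split files together), they might look like this:

```shell
# Main configuration: service settings plus includes of the split files.
cat > workshop-fb.conf <<'EOF'
[SERVICE]
    flush     1
    log_level info

@INCLUDE inputs.conf
@INCLUDE outputs.conf
EOF

# Input: collect CPU metrics, matching the workshop's console examples.
cat > inputs.conf <<'EOF'
[INPUT]
    name cpu
    tag  cpu.local
EOF

# Output: print every record to standard output.
cat > outputs.conf <<'EOF'
[OUTPUT]
    name  stdout
    match *
EOF

# Alternative to baking files into an image: mount the main file at run time.
# podman run --name fb -ti \
#   -v ./workshop-fb.conf:/fluent-bit/etc/fluent-bit.conf \
#   cr.fluentbit.io/fluent/fluent-bit:2.2.2
```

Mounting the file at run time (the commented command) avoids a rebuild while you iterate; copying files into an image, as shown next, produces a self-contained artifact you can ship.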
For example, let's assume we have the following files in our current directory, all part of a Fluent Bit telemetry pipeline configuration:

workshop-fb.conf: The main configuration file, importing all other split-out configuration files
inputs.conf: File containing all input plugin configurations
outputs.conf: File containing all output plugin configurations

To build a container image, we need to provide a Buildfile, which defines what base container image to use and where to copy our configuration files into that base image, as shown here:

FROM cr.fluentbit.io/fluent/fluent-bit:2.2.2

COPY ./workshop-fb.conf /fluent-bit/etc/fluent-bit.conf
COPY ./inputs.conf /fluent-bit/etc/inputs.conf
COPY ./outputs.conf /fluent-bit/etc/outputs.conf

Once we have this file, we can build our container image as follows:

$ podman build -t workshop-fb:v1 -f Buildfile

STEP 1/4: FROM cr.fluentbit.io/fluent/fluent-bit:2.2.2
STEP 2/4: COPY ./workshop-fb.conf /fluent-bit/etc/fluent-bit.conf
--> a379e7611210
STEP 3/4: COPY ./inputs.conf /fluent-bit/etc/inputs.conf
--> f39b10d3d6d0
STEP 4/4: COPY ./outputs.conf /fluent-bit/etc/outputs.conf
COMMIT workshop-fb:v1
--> b06df84452b6
Successfully tagged localhost/workshop-fb:v1
b06df84452b6eb7a040b75a1cc4088c0739a6a4e2a8bbc2007608529576ebeba

The build command uses the flag -t workshop-fb:v1, which tags the container image with a name and version number. This lets us run the image by name in the next workshop lab. It also uses -f Buildfile, the file we just created, to copy in our custom configuration files, as shown in the build output.

This process has us ready to start building our first pipelines, but what if you want to use other versions of Fluent Bit container images?

Other Versions

You might be wondering how to run other versions of Fluent Bit in a container. For example, Fluent Bit v3.0.0 was just released, so the workshop will be updated to follow the releases.
Feel free to give this a try by starting the container image as follows:

$ podman run --name fb -ti cr.fluentbit.io/fluent/fluent-bit:3.0.0

This starts the container and takes over the console for its output, where Fluent Bit measures CPU usage and dumps it to the console (CTRL-C will stop the container):

Fluent Bit v3.0.0
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

...

[2024/03/29 15:29:26] [ info] [fluent bit] version=3.0.0, commit=f499a4fbe1, pid=1
[2024/03/29 15:29:26] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/03/29 15:29:26] [ info] [cmetrics] version=0.7.0
[2024/03/29 15:29:26] [ info] [ctraces ] version=0.4.0
[2024/03/29 15:29:26] [ info] [input:cpu:cpu.0] initializing
[2024/03/29 15:29:26] [ info] [input:cpu:cpu.0] storage_strategy='memory' (memory only)
[2024/03/29 15:29:26] [ info] [output:stdout:stdout.0] worker #0 started
[2024/03/29 15:29:26] [ info] [sp] stream processor started
[0] cpu.local: [[1711726167.221360412, {}], {"cpu_p"=>0.000000, "user_p"=>0.000000, "system_p"=>0.000000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000}]
[0] cpu.local: [[1711726168.216299447, {}], {"cpu_p"=>1.000000, "user_p"=>0.500000, "system_p"=>0.500000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000}]

This puts all available container images at your fingertips.

What's Next?
This article helped us install Fluent Bit on our local machine using the available container images. The series continues with the next step in this workshop: creating our telemetry pipelines using either the source install or the container images. Stay tuned for more hands-on material to help you with your cloud-native observability journey.
Are you using the C4 model to create your architecture diagrams? Then Structurizr might be a good option for you to consider. With Structurizr, you can create and maintain your diagrams as code. Let's take a closer look at it in this blog!

Introduction

The C4 model helps you visualize software architecture. We all know the whiteboard diagrams cluttered with boxes and connectors. The C4 model approach helps you visualize software architecture in a more structured way. A good explanation is given on the C4 model website, so if you do not know what it is, it is worth reading that first.

The next question is which tool to use to create the diagrams. You can use Visio, draw.io, PlantUML, even PowerPoint, or whatever tool you normally use for creating diagrams. However, these tools do not check whether naming, relations, etc. are used consistently across the different diagrams. Besides that, it can be difficult to review new versions of diagrams because it is not clear which changes were made. In order to solve these problems, Simon Brown, the author of the C4 model, created Structurizr. Structurizr allows you to create diagrams as code. Based on the code, Structurizr visualizes the diagrams for you and allows you to interact with the visualization. Because the diagrams are maintained in code, you can add them to your version control system (Git), and changes in the diagrams are tracked and can be easily reviewed.

In the remainder of this blog, you will explore some of the features of Structurizr. You will only use two diagram types of the C4 model (the most commonly used ones):

System context diagram: Your application as a black box, indicating the users of your application
Container diagram: An overview of your software architecture

Sources used in this blog can be found on GitHub.
Prerequisites

Prerequisites for this blog are:

Basic knowledge of the C4 model
Basic knowledge of Docker

Linux is used, so if you are using a different operating system, you will need to adjust the commands accordingly.

Installation

There are different installation options for Structurizr. In this blog, you will make use of Structurizr Lite: an easy Docker-based installation that supports one workspace. Create a data directory in the root of the repository. This directory will be mapped as a volume in the Docker container. Execute the following commands from within the root of the repository:

$ docker pull structurizr/lite
$ docker run -it --rm -p 8080:8080 -v ./data:/usr/local/structurizr structurizr/lite

Navigate in your browser to http://localhost:8080 and the Structurizr webpage is shown. In the data directory, you will notice that some data has been added. When you take a closer look, you will see that all files have root ownership. This is not very convenient, because when you want to edit the files, you have to do so as root.

$ ls -la
...
drwxr-xr-x 1 root root   22 feb 18 12:04 .structurizr/
-rw-r--r-- 1 root root  316 feb 18 12:03 workspace.dsl
-rw-r--r-- 1 root root 2218 feb 18 12:04 workspace.json

Stop the container with CTRL+C and remove the contents of the data directory. You will start the container with the same user you are using on your host machine. With the id command, you can retrieve your uid and gid. It is important to use the same uid and gid inside the container so that the files can be edited easily both inside and outside the container. Replace <uid> and <gid> in the following command with your own uid and gid, and start the Docker container again:

$ docker run -it --rm -p 8080:8080 -u <uid>:<gid> -v ./data:/usr/local/structurizr structurizr/lite

When you check the ownership of the files again, you will notice that the directory and files are now owned by your host user.
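If you prefer not to look the values up by hand, the id command can be substituted inline. A minimal sketch (the docker invocation itself is commented out because it requires a running Docker daemon):

```shell
# Derive the host uid/gid so the container writes files we can edit directly.
uid="$(id -u)"
gid="$(id -g)"
echo "Running Structurizr Lite as ${uid}:${gid}"

# Requires Docker; shown for illustration:
# docker run -it --rm -p 8080:8080 -u "${uid}:${gid}" \
#   -v ./data:/usr/local/structurizr structurizr/lite
```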
Navigate again in the browser to Structurizr and enable Auto canvas size and Auto-layout. This will create a more beautiful diagram.

Initial DSL

First, let's take a closer look at the initial DSL that has been created. The complete DSL reference can be found here. The initial DSL is the following:

workspace {

    model {
        user = person "User"
        softwareSystem = softwareSystem "Software System"

        user -> softwareSystem "Uses"
    }

    views {
        systemContext softwareSystem "Diagram1" {
            include *
        }
    }

    configuration {
        scope softwaresystem
    }

}

The following sections can be viewed:

model: The model contains the actors and the software system. If you need to reference these in the DSL, e.g., in relations, you assign them to a variable. The variables user and softwareSystem are used here. The model also contains the relations. One relation from user to softwareSystem is created.
views: To visualize the model, you need to create views. In this initial DSL, a view of the System Context Diagram is created. With the include keyword, you can include all or part of the model.
configuration: This section will not be covered in this blog.

Basic System Context DSL

Now it is time to create a System Context Diagram for your application. The application is a webshop with two types of users: a customer and an administrator. The webshop makes use of a global payment system that handles bank transactions.
The DSL is the following:

workspace {

    model {
        customer = person "Customer" "The customer of our webshop"
        administrator = person "Administrator" "The administrator of the webshop"
        globalPayment = softwareSystem "Global Payment" "Used for all banking transactions"
        myWebshop = softwareSystem "My Webshop" "Our beautiful webshop"

        customer -> myWebshop "Uses"
        administrator -> myWebshop "Uses"
        myWebshop -> globalPayment "Uses"
    }

    views {
        systemContext myWebshop "MyWebshopSystemContextView" {
            include *
            autolayout
        }
    }

}

Some things to notice here:

Relations have the following format: identifier -> identifier description technology. The identifiers must correspond to a variable defined above the relations.
Views have the following format: systemContext softwareSystem key. The softwareSystem must correspond to an identifier defined in the model. The key can be chosen freely.
The autolayout option can be added to the view so that it is enabled by default for this view.

A problem I encountered is the following: only one person was shown in the view, the last one defined. I managed to solve this by commenting out the entire views section. A default view is used this way, and with this default view, all persons were shown. After this, I enabled the views section again, and now all persons were shown.

The System Context Diagram is shown as follows.

You can also apply themes to the views, which will enhance your diagram:

views {
    systemContext myWebshop "MyWebshopSystemContextView" {
        include *
        autolayout
    }

    theme default
}

The System Context Diagram then becomes the following. This already looks more like a C4 model System Context Diagram.

Basic Container DSL

Time to create a diagram for the software architecture: the Container Diagram. Assume that you need a frontend for both users, a common backend, and, of course, a database.
The DSL becomes the following:

workspace {

    model {
        customer = person "Customer" "The customer of our webshop"
        administrator = person "Administrator" "The administrator of the webshop"
        globalPayment = softwareSystem "Global Payment" "Used for all banking transactions"
        myWebshop = softwareSystem "My Webshop" "Our beautiful webshop" {
            customerFrontend = container customerFrontend "The frontend for the customer"
            administratorFrontend = container administratorFrontend "The frontend for the administrator"
            webshopBackend = container webshopBackend "The webshop backend"
            webshopDatabase = container webshopDatabase "The webshop database"
        }

        // system context relationships
        customer -> myWebshop "Uses"
        administrator -> myWebshop "Uses"
        myWebshop -> globalPayment "Uses"

        // software system relationships
        customer -> customerFrontend "Uses" "https"
        administrator -> administratorFrontend "Uses" "https"
        customerFrontend -> webshopBackend "Uses" "http"
        administratorFrontend -> webshopBackend "Uses" "http"
        webshopBackend -> webshopDatabase "Uses" "ODBC"
        webshopBackend -> globalPayment "Uses" "https"
    }

    views {
        systemContext myWebshop "MyWebshopSystemContextView" {
            include *
            autolayout
        }

        container myWebshop "MyWebshopSoftwareSystemView" {
            include *
            autolayout
        }

        theme default
    }

}

What has been added to the DSL?

The software system myWebshop in the model has been extended with the containers defining the architecture.
In the model, the relations between the containers are defined. Note that this time, the technology used is also added to the relations.
A container view is added to the views.

The container view is represented as follows. The fun part is that when you navigate to the System Context Diagram, you can double-click the myWebshop software system and it will show you the Container Diagram. Awesome!

Styling

In the Container Diagram, the database is represented as a rounded box. Normally, a database is represented as a cylinder. Is it possible to adjust this? Yes, you can.
This can be done by means of styles. The list of possible shapes can be found here. A style is applied to an element by using a tag. First, in the views section, add a style for elements with the tag Database, applying the shape Cylinder:

views {
    ...

    styles {
        element "Database" {
            shape Cylinder
        }
    }
}

Now you need to add a tag to the corresponding container. A container is defined with the following format: container <name> [description] [technology] [tags]. In the container definition above, you did not specify the technology. There are two options here:

Add a technology to the container. This is a bit error-prone, as you have to know the format by heart and you can simply forget to add the technology.
Set the tags explicitly. This is the option chosen here.

webshopDatabase = container webshopDatabase "The webshop database" {
    tags "Database"
}

The complete DSL can be found on GitHub. The resulting diagram is the following.

Conclusion

Structurizr helps you create diagrams according to the C4 model. The diagrams are created by means of a DSL, which has several advantages. You need to learn the DSL, of course, but it can be picked up quite easily.
What Is Patch Management?

Patch management is a proactive approach to mitigating already-identified security gaps in software. Most of the time, these patches are provided by third-party vendors to proactively close the security gaps and secure the platform. For example, RedHat provides security advisories and patches for various RedHat products such as RHEL, OpenShift, and OpenStack, while Microsoft provides patches in the form of updates for Windows OS. These patches include updates to third-party libraries, modules, packages, or utilities.

Patches are prioritized and, in most organizations, patching of systems is done at a specific cadence and handled through a change control process. These patches are deployed through lower environments first to understand the impact and then applied in higher environments, such as production. Various tools such as Ansible and Puppet can handle patch management seamlessly for enterprise infrastructures. These tools can automate the patch management process, ensuring that security patches and updates are promptly applied to minimize application disruptions and security risks. Coordinating patching and testing with the various stakeholders using the infrastructure is a significant effort required to minimize interruptions.

What Is a Container?

A container is the smallest unit of software that runs in the container platform. Unlike traditional software that, in most cases, includes only application-specific components such as application files, executables, or binaries, a container includes the operating system pieces required to run the application and all other dependencies of the application. Containers include everything needed to run the application; hence, they are self-contained and provide greater isolation. With all necessary components packaged together, containers provide inherent security and control but, at the same time, are more vulnerable to threats.
Containers are created from a container image, and a container image is created using a Dockerfile/Containerfile that includes the instructions for building the image. Most container images use open-source components. Therefore, organizations have to make an effort to design and develop recommended methods to secure containers and container platforms. Traditional security strategies and tools do not work for securing containers. DZone previously covered how to health check Docker containers.

For infrastructure using physical machines or virtual machines for hosting applications, the operations team would SSH to servers (manually or with automation) and then upgrade the system to the latest version or latest patch on a specific cadence. If the application team needed to make any changes, such as updating configurations or libraries, they would do the same thing by logging in to the server and making the changes. In various cases, the servers are configured for running specific applications. The server then becomes a pet that needs to be cared for, as it creates a dependency for the application, and keeping such servers updated with the latest patches sometimes becomes challenging due to dependency issues. If the server is shared by multiple applications, then updating or patching it consumes a lot of effort from everyone involved to make sure applications run smoothly post-upgrade.

Containers, however, are meant to be immutable once created and are expected to be short-lived. As mentioned earlier, containers are created from container images, so it's really the container image that needs to be patched. Every image contains one or more file system layers that are built based on the instructions in the Containerfile/Dockerfile. Let's delve further into how patch management and vulnerability management are done for containers.

What Is Vulnerability Management?
While patch management is proactive, vulnerability management is a reactive approach to managing and maintaining the security posture within an organization. Platforms and systems are scanned in real time, on specific schedules, or on an ad hoc basis to identify common vulnerabilities, known as CVEs (Common Vulnerabilities and Exposures). The tools used to discover CVEs rely on various vulnerability databases, such as the U.S. National Vulnerability Database (NVD) and the CERT/CC Vulnerability Notes Database. Most vendors that provide scanning tools also maintain their own databases to compare CVEs and score them based on impact. Every CVE gets a unique identifier along with a severity score (CVSS) and a resolution, if one exists (e.g., CVE-2023-52136).

Once CVEs are discovered, they are categorized based on severity and prioritized based on impact. Not every CVE has a resolution available. Therefore, organizations must continuously monitor such CVEs to understand their impact and implement measures to mitigate them. This could involve steps such as temporarily removing the system from the network or shutting it down until a suitable solution is found. High-severity and critical vulnerabilities should be remediated so that they can no longer be exploited. As is evident, patch management and vulnerability management are intrinsically linked in terms of security: their shared objective is to safeguard an organization's infrastructure and data from cyber threats.

Container Security

Container security entails safeguarding containerized workloads and the broader ecosystem through a mix of security tools and technologies. Patch management and vulnerability management are integral parts of this process. The container ecosystem is often referred to as the container supply chain, which includes various components.
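As a brief aside to the scoring discussion above: CVSS v3 base scores map to fixed severity bands, which is what triage tooling typically keys on when prioritizing findings. A minimal sketch (the severity helper is our own; the thresholds follow the published CVSS v3 qualitative ratings):

```shell
# Map a CVSS v3 base score to its qualitative severity rating.
severity() {
  awk -v s="$1" 'BEGIN {
    if (s >= 9.0)      print "Critical"
    else if (s >= 7.0) print "High"
    else if (s >= 4.0) print "Medium"
    else if (s >= 0.1) print "Low"
    else               print "None"
  }'
}

severity 9.8   # prints: Critical
severity 5.3   # prints: Medium
```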
When we talk about securing containers, it is essentially monitoring and securing the various components listed below.

Containers

A container is a runtime instance of a container image. It uses the instructions provided in the container image to run itself. A container has lifecycle stages such as create, start, run, stop, and delete. It is the smallest unit that exists in the container platform; you can log in to it, execute commands, monitor it, and so on.

Container Orchestration Platform

Orchestration platforms provide capabilities such as high availability, scalability, self-healing, logging, monitoring, and visibility for container workloads.

Container Registry

A container registry includes one or more repositories where container images are stored, version-controlled, and made available to container platforms.

Container Images

A container image is sometimes called a build-time instance of a container. It is a read-only template or artifact that includes everything needed to start and run the container (e.g., a minimal operating system, libraries, packages, software) along with instructions for how to run and configure the container.

Development Workspaces

Development workspaces reside on the developer workstations used for writing code, packaging applications, and creating and testing containers.

Container Images: The Most Dynamic Component

Considering patch management and vulnerability management for containers, let's focus on container images, the most dynamic component of the supply chain. In the container management workflow, most exploits are encountered due to security gaps in container images. Let's categorize the various container images used in an organization based on hierarchy.

1. Base Images

This is the first level in the image hierarchy. As the name indicates, base images are used as parent images for most of the custom images built within the organization.
These images are pulled from various external public and private image registries, such as DockerHub, the RedHat Ecosystem Catalog, and IBM Cloud.

2. Enterprise Images

These custom images are created and built from base images and include enterprise-specific components, standard packages, or structures as part of enterprise security and governance. The images are modified to meet the organization's standards and published in private container registries for consumption by the various application teams. Each image has an assigned owner responsible for managing the image's lifecycle.

3. Application Images

These images are built using enterprise custom images as a base, with applications added on top to build application images. The application images are then deployed as containers to container platforms.

4. Builder Images

These images are primarily used in the CI/CD pipeline for compiling, building, and deploying application images. They are based on enterprise custom images and include the software required to build applications, create container images, perform testing, and finally, deploy images as part of the pipeline.

5. COTS Images

These are vendor-provided images for vendor products, also called commercial off-the-shelf (COTS) products, managed by vendors. The lifecycle of these images is owned by the vendors.

For simplification, the image hierarchy is represented in the diagram below. Now that we understand the various components of the container supply chain and the container image hierarchy, let's look at how patching and vulnerability management are done for containers.

Patching Container Images

Most base images are provided by community members or vendors. Similar to traditional patches provided by vendors, image owners proactively patch these base images to mitigate security issues and regularly make new versions available in the container registries. Let's take the example of the Python 3.11 image from RedHat.
RedHat patches this image regularly and provides a Health Index based on scan results. RedHat proactively fixes vulnerabilities and publishes new versions post-testing. The image below indicates that the Python image is patched every 2-3 months, and the corresponding CVEs are published by RedHat. This patching involves modifying the Containerfile to update the packages required to fix vulnerabilities, as well as building and publishing a new version (tag) of the image in the registry.

Let's move to the second level in the image hierarchy: enterprise custom images. These images are created by organizations using base images (e.g., Python 3.11), adding enterprise-specific components and hardening them further for use within the organization. If the base image changes in the external registry, the enterprise custom image should be updated to use the newer version of the base image. This creates a new version of the enterprise custom image using an updated Containerfile. The same workflow should be followed to update any of the downstream images, such as the application and builder images built from enterprise custom images. This way, the entire chain of images is patched. In this entire process, patching is done by updating the Containerfile and publishing new images to the image registry. As for COTS images, the same process is followed by the vendor, and consumers of the images have to make sure the new versions of the images are used in the organization.

Vulnerability Management for Containers

Patch management is only half of the process of securing containers. Container images have to be scanned regularly, or at a specific cadence, to identify newly discovered CVEs within the images. There are various scanning tools available in the market that scan container images as well as platforms to identify security gaps and provide visibility into such issues.
These tools identify security gaps such as images running with root privileges, world-writable directories, exposed secrets, exposed ports, vulnerable libraries, and many more. The resulting vulnerability reports help organizations understand the security posture of the images being used, as well as of the containers running in the platform, and provide enough information to address the issues. Some of these tools also provide the ability to define policies and controls that can block images from running if they violate the policies defined by the organization; they can even stop running containers if that's what the organization decides to implement.

As for mitigating such vulnerabilities, the process involves the same steps mentioned in the patch management section: updating the Containerfile to create a new image, rescanning the image to make sure the reported vulnerabilities no longer exist, testing the image, and publishing it to the image registry. Depending on where the vulnerability exists in the hierarchy, the respective image and all downstream images need to be updated.

Let's look at an example. Below is the scan report for the python-3.11:1-34 image. It reports 2 important CVEs against 3 packages. These 2 CVEs will also be reported in all downstream images built from the python-3.11:1-34 image. Browsing further into CVE-2023-38545, more information is provided, including the action required to remediate the CVE. It indicates that, based on the operating system within the corresponding image, the curl package should be upgraded in order to resolve the issue.

From an organizational standpoint, to address this vulnerability, a new Dockerfile or Containerfile needs to be developed. This file should contain instructions to upgrade the curl package and generate a new image with a unique tag. Once the new image is created, it can be used in place of the previously affected image.
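A minimal sketch of such a remediation Containerfile, assuming a RHEL-based image where dnf manages packages (the registry path and tags are illustrative, not the actual enterprise image names):

```shell
# Write a Containerfile that rebuilds the affected image with an upgraded curl.
cat > Containerfile <<'EOF'
FROM registry.example.com/enterprise/python-311:1-34
USER 0
RUN dnf -y upgrade curl libcurl && dnf clean all
USER 1001
EOF

# Build and publish the remediated image under a new tag (requires Podman):
# podman build -t registry.example.com/enterprise/python-311:1-35 -f Containerfile .
# podman push registry.example.com/enterprise/python-311:1-35
```

Bumping the tag rather than overwriting the old one is what lets downstream images pick up the fix explicitly and lets scanners distinguish remediated from affected versions.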
As per the hierarchy shown in image-1, all downstream images should be rebuilt on the new image in order to fix the reported CVE everywhere. All images, including COTS images, should be scanned regularly. For COTS images, the organization should contact the vendor (the image owner) to fix critical vulnerabilities. Shift Left Container Security Image scanning should be part of every stage in the supply chain pipeline. Detecting and addressing security issues early is crucial to avoid accumulating technical debt as we progress through the supply chain: the sooner we identify and fix a vulnerability, the less disruptive it is to operations and the less work it takes to fix. Local Scanning To build Docker images locally, developers need tools such as Docker or Podman installed on their workstations. Alongside these tools, scanning tools should be made available so that developers can scan images pulled from external registries and determine whether those images are safe to use. Likewise, once they build application images, developers should be able to scan those images locally before moving to the next stage in the pipeline. Analyzing and fixing vulnerabilities at the source is a great way to minimize security risks later in the lifecycle. Most tools provide a command line interface or IDE plugins to make local scanning easy. Some organizations create image governance teams that pull, scan, and approve images from external registries before allowing them to be used within the organization. These teams take ownership of base images and manage their lifecycle; they communicate image updates to all stakeholders and monitor new images being used by downstream consumers. This is a great way to maintain control over which images are used within an organization.
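A local or CI scan gate can be as small as a function that fails the build when findings cross a severity threshold. The sketch below is illustrative only: the report shape is a simplified stand-in loosely modeled on JSON output from scanners such as Trivy, and the field names are assumptions, not any tool's actual schema.

```python
# Minimal image-scan gate: surface the findings that should block a
# pipeline when they meet or exceed a chosen severity. The report
# structure here is hypothetical, loosely modeled on common scanners.

SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}

def gate(findings, threshold="HIGH"):
    """Return the subset of findings that should block the pipeline."""
    floor = SEVERITY_RANK[threshold]
    return [f for f in findings
            if SEVERITY_RANK.get(f.get("Severity", "LOW"), 0) >= floor]

if __name__ == "__main__":
    report = [
        {"VulnerabilityID": "CVE-2023-38545", "PkgName": "curl", "Severity": "HIGH"},
        {"VulnerabilityID": "CVE-2020-00000", "PkgName": "zlib", "Severity": "LOW"},
    ]
    for f in gate(report):
        print(f"BLOCK: {f['VulnerabilityID']} in {f['PkgName']} ({f['Severity']})")
```

A developer can run such a gate locally against exported scan output before pushing, and the CI pipeline can reuse the same threshold logic as a control gate.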
Build Time Scanning Integrate image scanning tools into the CI/CD pipeline during the image build stage to make sure every image gets scanned. Scanning an image as soon as it is built, and using the result to decide whether it can be published, is a good way to allow only safe images into the image registry. Additional control gates can be introduced before an image is used in production by enforcing policies specifically for production images. Image Registry Scanning Build-time scanning is essentially on-demand scanning of images. However, given that new vulnerabilities are constantly being reported and added to the Common Vulnerabilities and Exposures (CVE) database, images stored in the registry need to be scanned at regular intervals. Images with critical vulnerabilities have to be reported to the image owners so they can take action. Runtime Container Scanning This is real-time scanning of running containers within a platform to identify the security posture of those containers. Along with the analysis done for images, a runtime scan also detects additional issues, such as a container running with root privileges, which ports it is listening on, whether it is connected to the internet, and any runaway process being executed. Depending on its capabilities, the scanning tool can provide full visibility into the security of the entire container platform, including the hosts on which the platform runs. The tool may also enforce policies, such as blocking specific containers or images from running, or identifying specific CVEs and taking action. Note that this is the last stage in the container supply chain; hence, fixing issues at this stage is costlier than at any other stage. Challenges With Container Security From a process standpoint, it looks straightforward to update base images with new versions and then update all downstream images. However, this comes with various challenges.
Below are some of the common challenges you will encounter as you start looking into patching and vulnerability management for containers:
Identifying updates to any of the parent/base images in the hierarchy
Identifying the image hierarchy and the impacted images in the supply chain
Making sure all downstream images are updated when a new parent image is made available
Defining ownership of images and identifying image owners
Communicating across various groups within the organization to ensure controls are maintained
Building a list of trusted images to be used within the organization and managing their lifecycle
Managing vendor images, given the lack of control over them
Managing release timelines while also securing the pipeline
Defining controls across the enterprise with respect to audit, security, and governance
Defining exception processes to meet business needs
Selecting the right scanning tool for the organization and integrating it with the supply chain
Providing visibility of vulnerabilities across the organization, i.e., delivering scan results to the respective stakeholders
Patch Management and Containers Summarized This article discussed how important it is to keep container systems secure, especially by managing patches and dealing with vulnerabilities. Containers are self-contained software units that are convenient but need dedicated security attention. Patch management means keeping everything up to date, from base images through the more specific application and builder images. Vulnerability management, in turn, involves regularly checking for potential security issues and fixing them by updating Containerfiles and creating new images. The idea of shifting left is to include security checks at every step, from building containers to running them. Despite the benefits, there are challenges, such as communicating well across teams and handling images from external sources.
This highlights the need for careful control and ongoing attention to keep organizations safe from cyber threats throughout the container lifecycle.
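The patch cycle described in this article (update the Containerfile, rebuild, rescan, re-publish) lends itself to automation. The sketch below only assembles the command lines for one such cycle rather than executing anything; the tool choices (podman for builds, trivy for scanning) and the registry name are assumptions for illustration, not a prescribed toolchain.

```python
def plan_patch_cycle(image, new_tag, registry="registry.example.com"):
    """Build the command lines for one rebuild/rescan/publish cycle.

    Returns a list of argv lists; a real pipeline would run each with
    subprocess.run(..., check=True) and stop on the first failure.
    Tool names are illustrative: podman for builds, trivy for scans.
    """
    ref = f"{registry}/{image}:{new_tag}"
    return [
        ["podman", "build", "-t", ref, "."],  # rebuild from the updated Containerfile
        ["trivy", "image", ref],              # rescan the freshly built image
        ["podman", "push", ref],              # publish only if the scan passed
    ]

if __name__ == "__main__":
    for cmd in plan_patch_cycle("python-3.11", "1-35"):
        print(" ".join(cmd))
```

Keeping the plan separate from execution makes the cycle easy to dry-run and to audit before it touches the registry.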
Docker, the leading containerization technology, has transformed application packaging and deployment. While Docker makes it easier to run apps, it is just as critical to monitor and log your Dockerized environments to ensure they are working properly and staying secure. In this post, we’ll go into the realm of Docker logging and monitoring, looking at the best practices, tools, and techniques for keeping your containerized apps operating smoothly. The Importance of Logging and Monitoring Before we dive into the technical aspects of logging and monitoring in a Docker environment, let’s understand why these activities are crucial in a containerized setup. 1. Troubleshooting Dockerized applications can be complex, comprising multiple containers, each with its own dependencies. When things go wrong, it’s essential to quickly identify and rectify the issues. Logging and monitoring provide the visibility required to pinpoint problems, whether it’s a failing container, network issues, or resource constraints. 2. Performance Optimization To keep your applications running efficiently, you need insights into resource utilization, response times, and other performance metrics. Monitoring tools can help you fine-tune your Docker environment, ensuring that resources are allocated effectively and that your applications are performing at their best. 3. Scalability Docker’s lightweight and portable nature makes it an excellent choice for scaling applications. However, managing the scaling process effectively requires careful monitoring to prevent resource bottlenecks and optimize container placement. 4. Security Security is a top concern in any Docker environment. By monitoring and logging activities, you can detect security breaches and unusual behavior promptly. This allows you to respond quickly to mitigate risks and protect your applications and data.
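Troubleshooting often starts with raw container logs. Docker's default json-file logging driver writes each record as one JSON object per line (with log, stream, and time fields), so a few lines of Python are enough to slice them by stream. A minimal sketch; the sample lines are made up for illustration:

```python
import json

def parse_json_file_logs(lines, stream=None):
    """Parse docker json-file driver records: {"log": ..., "stream": ..., "time": ...}.

    Optionally filter by stream ("stdout" or "stderr"). Lines that are
    not valid JSON are skipped rather than failing the whole pass.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partial or corrupt lines
        if stream is None or rec.get("stream") == stream:
            records.append(rec)
    return records

if __name__ == "__main__":
    sample = [
        '{"log":"starting app\\n","stream":"stdout","time":"2024-01-01T00:00:00Z"}',
        '{"log":"oops\\n","stream":"stderr","time":"2024-01-01T00:00:01Z"}',
    ]
    errors = parse_json_file_logs(sample, stream="stderr")
    print(len(errors))  # count of stderr records
```

The same approach scales from ad-hoc inspection of a single container's log file to feeding records into the centralized tooling discussed below.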
Docker Logging Logging in a Docker environment involves capturing and managing the output of containerized applications, making it accessible for analysis and troubleshooting. Docker provides several ways to collect logs from your containers, and there are also third-party solutions available. Let’s explore some of the key options for logging in a Docker environment. 1. Docker Container Logs Docker itself provides the ability to view container logs using the docker logs command. You can retrieve logs for a specific container, making this a straightforward method for inspecting logs on a per-container basis. However, it may not be suitable for large-scale or automated log collection and analysis. 2. Docker Logging Drivers Docker supports various logging drivers that allow you to configure where container logs are sent. These include the JSON File driver, the Syslog driver, the Fluentd driver, and the Gelf driver, among others. By selecting an appropriate logging driver, you can send logs to different destinations, such as files, remote Syslog servers, or centralized log management systems. 3. Fluentd Fluentd is a popular open-source log collector that’s commonly used in Docker environments. Fluentd can be deployed as a sidecar container alongside your application containers or as part of an orchestrated logging pipeline. It can collect logs from various sources, including container runtimes, and forward them to centralized log storage, such as Elasticsearch, Logstash, or Kafka. 4. ELK Stack Elasticsearch, Logstash, and Kibana, collectively known as the ELK stack, are popular tools for log aggregation and analysis. You can use Elasticsearch to store log data, Logstash to process and enrich the logs, and Kibana to create visualizations and dashboards. This stack is highly extensible and can be integrated with Docker using various plugins and configurations. 5. Loki and Grafana Loki is a log aggregation system developed by Grafana Labs.
It is designed to work seamlessly with Grafana, a popular open-source monitoring and observability platform. Loki is efficient and cost-effective, as it stores logs in a compact, indexed format, allowing you to search and analyze logs effectively. Grafana can be used to create dashboards and alerts based on Loki data. 6. Graylog Graylog is an open-source log management platform that offers log collection, processing, and analysis capabilities. It is well-suited for Docker environments and provides a user-friendly web interface for exploring log data. Graylog can centralize logs from multiple containers and sources. Best Practices for Logging in Docker Effective logging in a Docker environment requires adherence to best practices to ensure that your logs are accessible, reliable, and actionable. Here are some tips to help you implement a robust logging strategy: 1. Standardize Log Formats Maintain a consistent log format across your applications. Using JSON or structured logging formats makes it easier to parse and analyze logs. Standardized logs facilitate automated processing and reduce the time required for troubleshooting. 2. Store Logs Off the Container Avoid storing logs within the container itself. Instead, use a centralized logging solution to store and manage logs. Storing logs off the container ensures that log data is preserved even if the container or host fails. 3. Set Log Rotation and Retention Policies Define log rotation and retention policies to manage log storage efficiently. You can configure log rotation and retention policies to automatically delete or archive old logs. This prevents your log storage from becoming overwhelmed with outdated data. 4. Implement Security Measures Protect your log data by applying access controls and encryption. Unauthorized access to logs can expose sensitive information and pose security risks. Ensure that only authorized personnel can access and modify log data. 5. 
Use Structured Logging Use structured logging to add context to your log entries. Include important information such as application names, versions, timestamps, and request IDs. This context is invaluable for tracing issues and identifying the source of problems. 6. Monitor Log Collection Monitor the log collection process itself. If log collection fails, it may indicate underlying issues in your logging infrastructure or containers. Set up alerts to be notified of any log collection failures. 7. Aggregate and Correlate Logs Collect logs from all parts of your Docker environment and correlate them to get a holistic view of your application’s behavior. Correlating logs from different services and components can help you identify and troubleshoot complex issues. 8. Automate Log Analysis Leverage log analysis tools to automatically detect anomalies and patterns in your log data. Machine learning and AI-based log analysis can help you identify issues before they impact your applications. 9. Create Dashboards and Alerts Use visualization tools to create dashboards that provide real-time insights into your Docker environment’s health. Set up alerts to notify you of critical events or unusual behavior, allowing for proactive responses to potential issues. Docker Monitoring Monitoring in a Docker environment goes beyond logging. While logs are crucial for troubleshooting, monitoring provides real-time visibility into your container’s performance and resource utilization. Here are some essential aspects of monitoring in a Docker environment: 1. Metrics Collection Collecting metrics is the foundation of Docker monitoring. Metrics can include CPU and memory usage, network traffic, storage consumption, and more. Docker exposes a rich set of metrics that you can use to gain insights into your container’s health. 2. Resource Utilization Monitoring resource utilization helps you ensure that your containers have enough capacity to handle your applications’ workloads. 
It also enables you to optimize resource allocation, preventing over-provisioning or resource bottlenecks. 3. Application Performance Monitoring application performance is essential for delivering a high-quality user experience. You can track response times, error rates, and throughput to identify performance bottlenecks and optimize your applications. 4. Auto-Scaling Orchestrators built around Docker, such as Docker Swarm and Kubernetes, provide auto-scaling capabilities, allowing your containerized applications to adapt to changing workloads. Monitoring helps you define the right metrics and thresholds to trigger automatic scaling actions, ensuring optimal resource utilization. 5. Security and Compliance Monitor your Docker environment for security vulnerabilities and compliance violations. Detecting unusual behavior or security threats in real time is critical for maintaining a secure environment. 6. Event Tracking Monitoring should also capture and track significant events in your Docker environment, such as container starts, stops, and resource allocation changes. Event tracking provides an audit trail and helps in root cause analysis. Docker Monitoring Tools There are several monitoring solutions and tools available for Docker environments, each with its own strengths and capabilities. Here are some of the widely used options: 1. Prometheus Prometheus is a popular open-source monitoring solution for Docker environments. It is designed for reliability and scalability and offers a flexible query language for extracting insights from your metrics. Prometheus can be integrated with Grafana to create interactive dashboards and alerts. 2. Grafana Grafana is an open-source platform for creating, sharing, and exploring interactive dashboards. When combined with Prometheus, Loki, or other data sources, Grafana provides a powerful visualization and alerting solution for monitoring your Docker environment. 3. cAdvisor Container Advisor (cAdvisor) is an open-source container monitoring tool developed by Google.
It provides detailed information about container resource usage, performance statistics, and container-level metrics. cAdvisor is often used in conjunction with other monitoring solutions. 4. Datadog Datadog is a cloud-based monitoring and analytics platform that offers comprehensive Docker monitoring. It provides real-time visibility into containerized applications, infrastructure, and logs. Datadog offers extensive integrations and automation features. 5. Sysdig Sysdig is a container intelligence platform that offers Docker monitoring and security capabilities. It provides detailed visibility into your containers, microservices, and applications, helping you detect and respond to security threats and performance issues. Best Practices for Docker Monitoring To effectively monitor your Docker environment, follow these best practices: 1. Define Monitoring Objectives Clearly define what you want to achieve with monitoring. Determine the key metrics and alerts that are critical to your applications’ performance and stability. 2. Collect Relevant Metrics Collect metrics that are relevant to your applications, including resource usage, application-specific metrics, and business-related KPIs. Avoid collecting excessive data that can lead to information overload. 3. Set Up Alerts Configure alerts based on your defined objectives. Alerts should be actionable and not generate noise. Consider using multiple notification channels, such as email, Slack, or SMS, for different severity levels. 4. Implement Monitoring as Code Use Infrastructure as Code (IaC) to define and configure your monitoring infrastructure. This ensures consistency and reproducibility of your monitoring setup. 5. Monitor the Entire Stack Monitor not only your applications but also the entire stack, including the underlying infrastructure and the Docker host. This comprehensive view helps you detect issues at any level of your environment. 6. 
Use Visualization and Dashboards Create interactive dashboards to visualize your metrics. Dashboards provide a real-time, at-a-glance view of your Docker environment’s health and are especially useful during incidents and investigations. 7. Continuously Review and Update Regularly review your monitoring setup to ensure it remains relevant and effective. Update alerting thresholds, metrics, and dashboards as your applications evolve. 8. Involve All Stakeholders Collaborate with all relevant stakeholders, including developers, operators, and business teams, to define monitoring requirements and objectives. This ensures that monitoring aligns with overall business goals. Conclusion Logging and monitoring are critical components of efficiently managing a Docker infrastructure. They give you the visibility and information required to solve issues, optimize performance, and keep your containerized applications secure. By following best practices and employing the right tools, you can keep your Docker environment robust, durable, and efficient. Remember that logging and monitoring are dynamic practices that should evolve in tandem with your apps and infrastructure. Review and update your logging and monitoring techniques regularly to adapt to changing requirements and stay ahead of potential problems. With the right strategy, your Docker environment can run smoothly and deliver the performance and dependability your users demand.
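One of the logging best practices above, standardized structured JSON logs, can be wired into Python's standard logging module with a small formatter. This is a minimal sketch, not a production formatter; the field names and the request_id attribute are illustrative choices, not a fixed convention.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Carry a correlation ID when callers attach one via `extra`.
        if hasattr(record, "request_id"):
            payload["request_id"] = record.request_id
        return json.dumps(payload)

if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    log = logging.getLogger("app")
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    log.info("visitor counted", extra={"request_id": "abc-123"})
```

Because every line is valid JSON, downstream collectors (Fluentd, Loki, ELK) can parse fields instead of regex-matching free-form text.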
What if you were asked to deploy your Python Flask application, or Dockerize a Flask app, 100 times a day on a virtual machine? Most people would agree this would be a tedious and frustrating task. This article shows you how to Dockerize a Flask Python application to overcome the above scenario. Setting up a machine manually to deploy your Python Flask application multiple times can easily lead to human error and increases the chances of missing certain dependencies. It takes plenty of time to figure out the errors, fix them, and then deploy the applications. Now, if you were asked to share the Python Flask (or any other) application with your team members across the globe, how would you do that? If you think you will not be able to share the machine, you are right. You can create a snapshot of the machine, but that’s about it. In this article, we will also see how the Dockerized Python Flask application can be used by global teams residing in different places. Before we dive into the nitty-gritty, let’s see the components that we are going to deploy as Docker containers.
Nginx Reverse Proxy: We will create a Docker image of Nginx to use as a reverse proxy, i.e., to forward user requests to the Python application.
Python Flask Application: We will create a simple Python Flask application providing 3 APIs. This application will store the count of visits or hits to the application in Redis. For this, we will write a Dockerfile and create a Docker image.
Redis Database: We will use the Redis database to store the count of visits or hits to our application, and a Redis image, which is already available, to create its container.
What Is Docker? Docker is an open platform written in the Go language for developing, shipping, and running applications. It enables applications to be separated from the infrastructure, which results in faster delivery. It makes it possible to manage the infrastructure the same way we manage the applications.
To achieve this, Docker uses a client-server architecture and has the following components.
Docker Client: The Docker client is a way to interact with the Docker daemon (dockerd).
Docker Daemon: The Docker daemon (dockerd) listens for Docker API requests and manages Docker objects.
Docker Objects:
Docker Image: A Docker image is a read-only template with instructions for creating a Docker container.
Docker Container: This is a runnable instance of a Docker image.
Docker Volumes: The persisting data generated and used by Docker containers is stored on Docker volumes.
Docker Network: The Docker network is a tunnel through which all isolated Docker containers communicate with each other.
Docker Registry: The Docker registry is a place where Docker images are stored.
If you’re interested in learning more about Docker, you can access its official guide here. Before we proceed, let’s see the steps to install Docker on an Ubuntu 20.04 server.
Check the Linux version: $ cat /etc/issue
Update the apt package index: $ sudo apt-get update
Install packages to allow apt to use a repository over HTTPS: $ sudo apt-get install apt-transport-https ca-certificates curl gnupg lsb-release
Add Docker’s official GPG key: $ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
Set up the stable repository: $ echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
Update the apt package index again: $ sudo apt-get update
Install the latest version of Docker Engine and containerd: $ sudo apt-get install docker-ce docker-ce-cli containerd.io
Check the Docker version: $ docker --version
Manage Docker as a non-root user:
Create the 'docker' group: $ sudo groupadd docker
Add your user to the docker group: $ sudo usermod -aG docker <your-user-name>
Exit and re-login.
Verify that you can run docker commands without sudo: $ docker run hello-world
Upon executing the above run command, you should see output as follows. Now you have Docker installed on your machine. To know more about installation and other installation alternatives, click here. To know more about post-installation steps, click here. What Are Containers? A Docker container is an object in the Docker engine that bundles up the code and all its dependencies; it is a running instance of a Docker image, i.e., Docker images become Docker containers when they run on a Docker engine. Docker containers virtualize the operating system (OS), unlike virtual machines, which virtualize the hardware. A Docker container is a loosely isolated environment where the application runs. It can be created on Linux or Windows. Docker containers provide flexibility in deploying and managing software applications. You can start, stop, delete, and perform various operations on a Docker container using the Docker CLI or API. Containers share the host OS kernel and run as isolated processes. They take up less space compared to virtual machines and can be up and running in a few seconds. Let’s get familiar with a few basic commands that we will need in the upcoming sections.
$ docker run hello-world
In the previous section, you saw the "docker run hello-world" command. This command created a container with the image "hello-world".
$ docker run --name my-second-container hello-world
You can give a name to the container by adding the "--name <container-name>" option to the "docker run" command. The "docker run --name my-second-container hello-world" command will create a container with image=hello-world and name it my-second-container.
$ docker ps -a
You can check all your containers using the "docker ps -a" command. If you are interested in seeing only running containers, do not add "-a" to the command; use "docker ps".
$ docker stop <CONTAINER ID>
You can stop a running container before deleting it.
$ docker rm <CONTAINER ID>
Free up your machine by deleting unnecessary containers.
$ docker images
You can also check the images that have been pulled onto your machine to create the containers.
$ docker image rm <image-name>
Similar to the way containers can be removed or deleted, you can also delete images.
Create a Flask Python Application Now that you have Docker installed on your machine and you have an idea of Docker containers, let’s create a Python Flask application and Dockerize it in the next section.
Go to the home directory: $ cd /home/ubuntu/
Create a new directory: $ mkdir clickittech
Change your current working directory and stay in it throughout the article: $ cd clickittech/
Create a directory to put the Python Flask application files under: $ mkdir flask
Create main.py under the flask directory: $ vim flask/main.py

import redis
from flask import Flask

app = Flask(__name__)
redis = redis.Redis(host='redis', port=6379, db=0)

@app.route('/')
def hello_world():
    return 'This is a Python Flask Application with redis and accessed through Nginx'

@app.route('/visitor')
def visitor():
    redis.incr('visitor')
    visitor_num = redis.get('visitor').decode("utf-8")
    return "Visit Number = : %s" % (visitor_num)

@app.route('/visitor/reset')
def reset_visitor():
    redis.set('visitor', 0)
    visitor_num = redis.get('visitor').decode("utf-8")
    return "Visitor Count has been reset to %s" % (visitor_num)

Creating a Dockerfile to Dockerize Your Python Flask Application A Dockerfile is a set of instructions in a text file containing all the commands required to generate a Docker image. The "docker build" command is used to create an image using the Dockerfile, and it takes additional arguments. Before we create a Dockerfile for your Python Flask application, let’s try to understand the Dockerfile instructions.
FROM – FROM is used to specify the base image to be used by the application/container.
LABEL – This instruction is used to provide metadata to an image.
ENV – To set environment variables in a Docker container, ENV is used. You can have multiple variables in a single Dockerfile.
ARG – It defines build-time variables that users can pass at build time to the builder with the docker build command.
WORKDIR – It sets the working directory for the instructions that follow.
RUN – Used to run a Linux command, install packages into containers, create folders, etc.
COPY – This copies files and directories from the host machine to the container.
ADD – It copies files and directories to the container from the host machine as well as from a URL location. It can also unpack compressed files.
VOLUME – It creates a directory mount point to access and store persistent data.
EXPOSE – It is used to expose ports on the containers, and it informs Docker that the container listens on the exposed network ports at runtime.
ENTRYPOINT – It provides a command and arguments for an executing container. If the user specifies any arguments or commands at the end of the "docker run" command, the specified command overrides the default in the CMD instruction.
CMD – It provides defaults for executing containers, and there can be only one CMD. It is used to set a command to be executed when running a container.
Refer to the official documentation here to know more about Dockerfile instructions. Dockerfile for Flask Python Application Now, let’s create a Dockerfile. Stay in the same directory (/home/ubuntu/clickittech) and create a Dockerfile in the "flask" directory for creating an image of the Python Flask application: $ vim flask/Dockerfile

FROM python:3.7-alpine
RUN mkdir /app
WORKDIR /app
ADD requirements.txt /app
ADD main.py /app
RUN pip3 install -r requirements.txt
CMD ["gunicorn", "-w", "4", "-b", "0.0.0.0:8000", "main:app"]

Now create a file containing the application requirements: $ vim flask/requirements.txt

Flask==1.1.2
redis==3.4.1
gunicorn>=19,<20

Now you should have the following files in your "flask" directory.
main.py, requirements.txt, and Dockerfile. Dockerfile for Nginx Create a new directory, "nginx", at the same location (/home/ubuntu/clickittech) where you created the "flask" directory: $ mkdir nginx
Stay in the same directory (/home/ubuntu/clickittech) and create a Dockerfile in the "nginx" directory for creating an image of the Nginx proxy: $ vim nginx/Dockerfile
Also, create a conf file that will hold the routing configuration: $ vim nginx/conf

server {
    listen 80;
    server_name localhost;
    location / {
        proxy_pass http://app:8000;
    }
    location /hit {
        proxy_pass http://app:8000/visitor;
    }
    location /hit/reset {
        proxy_pass http://app:8000/visitor/reset;
    }
}

Now you should have the following files in your "nginx" directory: conf and Dockerfile. Creating a Docker Image to Dockerize Your Flask Python Application After creating your Dockerfiles, the next step is to build the Docker images for your Python Flask application and for Nginx from the Dockerfiles we generated; for Redis, we will use a readily available image. We can create Docker images using the "docker build" command; however, this is not the only way. Docker images can also be created when you run your applications using Docker-Compose. Before we use Docker-Compose to build and deploy the applications, let’s just see the syntax of the "docker build" command. Syntax: docker build [OPTIONS] PATH | URL | - We won’t go into the details of this command, as we will be using Docker-Compose to build the Docker images. If you want to know more about this command, click here to visit the official documentation. Docker Compose You probably know by now that deploying, sharing, and managing software applications using Docker containers is very convenient. However, when you have multiple applications to manage, it becomes cumbersome. Docker commands can take a lot of arguments, like volume mapping, port mapping, environment variables, command, network, image name, working directory, etc.
Imagine you have hundreds of containers to manage and still want to keep it simple. This is where Docker Compose comes into the picture as the next step in managing multi-container applications. Docker Compose is a tool that helps define multi-container applications in a file and manage several containers quickly and easily. You use a YAML file to define and configure all your applications; then you need just one command, "docker-compose up", to get all the applications defined in the YAML file up and running. To destroy them, you again need just one command, "docker-compose down", and to stop applications safely, "docker-compose stop" comes to the rescue. The YAML file is used to define services, networks, and volumes, and can use either a .yml or .yaml extension. Let’s install docker-compose, as we will need it.
Download the current stable release of Docker Compose: $ sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
Apply executable permissions to the docker-compose binary we just downloaded: $ sudo chmod +x /usr/local/bin/docker-compose
Test that the installation was successful by checking the docker-compose version: $ docker-compose --version
Now it’s time to create a docker-compose file. Stay in your "/home/ubuntu/clickittech" directory and create a docker-compose.yml in it: $ vim docker-compose.yml

version: '3'
services:
  app:
    build: flask
    volumes:
      - app:/app
    ports:
      - "8000:8000"
    links:
      - redis:redis
    depends_on:
      - redis
  redis:
    image: "redis:alpine"
    expose:
      - "6379"
  proxy:
    build: nginx
    restart: always
    ports:
      - 80:80
    depends_on:
      - app
volumes:
  app:

This is what your folder structure should look like. (Tip: install the "tree" command using "sudo apt install tree".) 3. You are ready to deploy your Python Flask application using docker-compose in just 1 command.
Stay in the "/home/ubuntu/clickittech" directory and execute the following command to start your application containers.

$ docker-compose up

The first time you execute the above command, you will notice that the base images are pulled and then our images are built.

4. In the end, you will see that the applications have started. You can now open a new terminal and try to access the following APIs.

curl http://0.0.0.0:80/
curl http://0.0.0.0:80/hit
curl http://0.0.0.0:80/hit/reset

5. You can also hit the same APIs from the browser on Nginx's IP. Now, go back to the console and press "control + c" to stop the execution, as "docker-compose up" starts the applications in the foreground. Here are a few more commands you can play with.

To start containers in the background and leave them running, execute the command with the --detach option:

$ docker-compose up --detach

Check logs of containers by the service name defined in the docker-compose file:

$ docker-compose logs -f --tail 10 app
$ docker-compose logs -f --tail 10 redis

Check running processes:

$ docker-compose top

To stop containers:

$ docker-compose stop

Start the services/applications/containers again:

$ docker-compose start

Stop and remove containers, networks, images, and volumes:

$ docker-compose down

Deploy Your Flask Python Application Using Docker in Production

Deploying applications in production is a very different experience. It is not just writing Dockerfiles, building images, and using Docker-Compose to deploy the application; a lot changes when you promote an application to production. Consider the following points when deploying your Python Flask application to production. First and foremost, never build your Docker images on the production servers; always pull images from a central Docker registry/repository. Always verify the source of the base images you use in your Dockerfiles.
Clean up containers that are no longer running using the "docker rm" command. Use volumes to store your application logs on persistent storage. Containers that should be private must not be reachable by public traffic. Use the "docker logs" command to fetch logs from your containers, and the "docker inspect" command to get detailed information about Docker objects. Use "docker secret" to manage credentials or secrets; avoid storing secrets in plain text. Enforce resource limits so that containers use no more than a given amount of CPU and memory. Enable logging and monitoring for your containerized applications. Use the "restart: always" policy to avoid downtime. And, very importantly, use a container orchestration tool.

Troubleshooting Docker Containers

This section will give you an idea of how to troubleshoot your Dockerized applications. When you encounter issues in your Docker environment, run the following checks.

docker ps -a: Get information about all containers, running and stopped.
docker logs <container-id>: Check the logs of the container.
docker stop <container-id>: If the container is not behaving as expected, stop it.
docker start <container-id>: Try to start the container again.
docker rm <container-id>: Remove the container if it fails to start.

Also check mounted volumes, and if the data in them is irrelevant, clean them up.

Next Steps After Deploying Your Python Flask Application

So, now you understand the reasons for using Docker over virtual machines, and you have learned the steps to Dockerize your Python Flask application and deploy it using Docker-Compose. Docker-Compose is good when containers are deployed only on a single machine, but it cannot provide high availability for the application. Imagine: if the machine goes down, so do all your containers. In a Docker-Compose environment, all containers run on a single machine, and this machine can be a single point of failure.
Docker-Compose can be good in the early stages of the container world, as it does not have a steep learning curve; however, it has its limitations, as mentioned above. Once you become comfortable with Docker-Compose and reach that point, you need to move to one of the container orchestration tools, such as ECS, EKS, AKS, Kubernetes, or Docker Swarm. Container orchestration tools come with benefits that overcome the limitations of Docker-Compose: they deploy containers on multiple machines, called worker nodes (or simply nodes), which provides high application availability, and they can auto-scale containers or applications based on resource consumption. There are other benefits of using container orchestration tools for production applications as well. Let's look at a few of the orchestration tools.

Amazon ECS

AWS Elastic Container Service (ECS) is a fully managed container orchestration service provided by AWS, made available in 2015 in the US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland) regions. You do not need to learn Kubernetes to deploy containerized applications on ECS. It integrates easily with other AWS services to provide a secure, easy-to-use solution for running containers, and you pay only for the resources you configure. The smallest object that manages the container's lifecycle in ECS is a task.

Amazon EKS

AWS Elastic Kubernetes Service (EKS) is a Kubernetes-as-a-service platform provided by AWS, made available in 2018. You need to know Kubernetes to deploy your containerized applications on EKS. You get a fully managed Kubernetes cluster on AWS; however, you pay an additional cost for the Kubernetes control plane. The smallest object that manages the container's lifecycle in EKS is a pod.

Microsoft AKS

Azure Kubernetes Service (AKS) is a fully managed Kubernetes service provided by Azure. It was released in 2018.
While using AKS, you do not pay an additional cost for the Kubernetes control plane, only for the virtual machines and the associated storage and networking resources consumed. You need to know Kubernetes to deploy your containerized applications on AKS. The smallest object that manages the container's lifecycle in AKS is a pod.

Summary Table for ECS, EKS, and AKS

ECS: provided by AWS, available since 2015, no Kubernetes knowledge required, no separate control-plane charge, smallest lifecycle object is a task.
EKS: provided by AWS, available since 2018, Kubernetes knowledge required, additional charge for the control plane, smallest lifecycle object is a pod.
AKS: provided by Azure, available since 2018, Kubernetes knowledge required, no additional control-plane charge, smallest lifecycle object is a pod.

Conclusion

Containerization offers a container-based approach to virtualization. It creates an image of the deployable application with all its dependencies and reduces build and deployment time. Docker is a container-based virtualization platform that provides fast, consistent delivery of applications and helps you scale and run more workloads on the same hardware. In this article, we saw how to Dockerize a Flask Python application and reduce its build and deployment time. We deployed three different components, viz. a Python Flask application, a Redis database, and an Nginx reverse proxy, in three different containers. Dockerizing applications serves as the basis for a decomposed, microservices-based architecture. There is no doubt that Docker is a revolutionary technology for providing isolation and environmental independence. To make the best of it, container orchestration platforms like ECS, EKS, and AKS can be a great choice, since they make the scenario of thousands of containers deploying hundreds of microservices much easier to manage.
Continuous integration and continuous delivery (CI/CD) have become critical practices for software teams looking to accelerate development cycles and improve product quality. By automatically building, testing, and deploying application updates, CI/CD pipelines enable reliable and efficient software delivery. This article will discuss best practices for implementing robust CI/CD workflows using popular open-source tools like Jenkins and Docker.

Overview of CI/CD Concepts

Continuous integration (CI) refers to the practice of frequently merging developer code changes into a shared repository, triggering automated builds and tests to detect integration issues early. Common CI principles include committing code in small increments, continuously testing each change, and rapidly resolving identified problems to avoid the accumulation of technical debt. Continuous delivery (CD) extends CI by automating the release process all the way to production deployment using repeatable workflows. Each code change that passes the automated testing gates is considered releasable. Automated deployment allows development teams to deliver features faster and more reliably.

Benefits of Adopting CI/CD Practices

Implementing CI/CD pipelines provides multiple software development and delivery advantages, including:

Accelerated time to market: Automated workflows enable faster build, test, and release cycles.
Reduced risk: Continuous testing and version control identify defects early on.
Reliability: Repeatability ensures software updates are consistently delivered without manual errors.
Developer productivity: Automation frees up developers to focus on coding rather than builds.
Reputation: Users and customers benefit from faster features and minimal disruption.

Critical Components of a CI/CD Pipeline

A typical CI/CD pipeline comprises several key components connected together:

Version control system: Hosts application code in repositories.
Developers can collaboratively edit and track changes over time. Popular systems like Git facilitate branching and merging.
Build server: Automates compiling source code into executable applications by running build scripts. Popular build servers include Jenkins (open source) and Atlassian's commercial Bamboo.
Testing framework: Automatically runs unit, integration, and system tests to validate application integrity before release. JUnit and Selenium are commonly used.
Binary repository: Stores build artifacts and dependencies in a centralized, easily accessible package store. Artifactory and Nexus are common artifact repository examples.
Deployment automation: Scripts and configures deployment of built and tested code changes to progressive server environments, all the way up to production. Kubernetes facilitates container deployments.

Jenkins Overview

Jenkins is one of the most widely adopted open-source automation servers used to set up, operate, and manage CI/CD pipelines. It automates software development processes through a pipeline-oriented architecture. Key Jenkins capabilities include:

Easy installation: Available both on-premises and on cloud platforms; easily scalable.
Declarative pipelines: Pipeline workflows can be defined as code in a Jenkinsfile, facilitating version control.
Extensive ecosystem: A broad plugin ecosystem allows integration of the most common developer tools into pipelines.
Distributed builds: Supports distributed CI/CD by executing parallel tests and build routines across multiple machines.
Simple administration: Easy for admins to manage users, access controls, and Jenkins configuration.

Docker Overview

Docker has emerged as the de facto standard for containerization in development and deployment. Docker containers bundle application source code together with libraries, dependencies, and a lightweight runtime into an isolated package. Containers provide a predictable way to deploy applications across environments.
Benefits include:

Lightweight: Containers leverage the host OS instead of needing a guest OS, reducing overhead.
Portability: Containers run uniformly on any platform thanks to the shared runtime environment.
Scalability: Easily spawn multiple container instances due to low resource requirements.
Isolation: Changes made inside containers do not impact the host machine or other containers.

Implementing CI/CD Pipelines Using Jenkins and Docker

Leveraging both Jenkins and Docker, robust CI/CD pipelines can be designed that enable continuous code integration and reliable application deployments. Here is one recommended implementation pattern:

Code commits: Developers commit code changes frequently to Git repositories. Webhooks trigger Jenkins jobs upon code pushes.
Jenkins CI jobs: Jenkins pulls the source code and runs the CI workflow: clean > build > unit tests > static analysis > create a Docker image with dependencies.
Docker registry: The validated Docker image is pushed and versioned in a private Docker registry.
Jenkins deploy jobs: Deployment jobs pull images from the registry and deploy them onward to higher environments.
Infrastructure: Docker environments for progressive test, stage, and prod application deployment need to be set up. Kubernetes is great for container orchestration.
Rollback strategies: Rollback workflows are automated through Jenkins to revert to the last working version in case of production issues.

This pipeline gives developers a fast inner DevOps feedback loop through Jenkins, while Docker containers handle application encapsulation and deployment portability. Infrastructure-as-code practices help manage environment sprawl.

Best Practices for Effective Jenkins and Docker CI/CD

Based on industry-wide learnings, here are some best practices to follow:

Standardize pipelines through templatized Jenkinsfiles checked into source control.
Leverage Docker multi-stage builds to keep images lean.
Abstract environment differences using Docker runtime configuration rather than custom image builds.
Scale Jenkins dynamically using the Kubernetes plugin for on-demand build agents.
Implement Git hooks for commit syntax linting and automated tests before pushing code.
Integrate security scans into the pipeline and analyze images for vulnerabilities.
Enable traceability by integrating build numbers into application UIs and logs.
Simulate production load, traffic, and access conditions during later testing stages.
Build container images only once and promote them through registries; avoid image sprawl.

Conclusion

Implementing a high-performing CI/CD pipeline requires integrating disparate systems like code repositories, build servers, and container technologies while ensuring automated test coverage through all phases. Jenkins and Docker provide open-source solutions for building robust pipelines that augment developer productivity, release reliability, and operations efficiency. Standardizing pipelines, branching strategies, and environments provides consistency across the SDLC. By following industry best practices around CI/CD processes, test automation, and architectural decentralization, teams can accelerate innovation cycles dramatically.
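The commit-to-deploy pattern and the templatized-Jenkinsfile practice described above might be sketched as a declarative pipeline along these lines. The stage names, registry URL, credential handling, and deployment command are illustrative assumptions, not taken from any specific project:

```groovy
pipeline {
    agent any
    environment {
        // Hypothetical private registry and versioned image tag
        REGISTRY = 'registry.example.com'
        IMAGE    = "demo-app:${env.BUILD_NUMBER}"
    }
    stages {
        stage('Build') {
            steps { sh 'make clean build' }   // compile the application
        }
        stage('Unit Tests') {
            steps { sh 'make test' }          // fail fast on broken changes
        }
        stage('Docker Image') {
            steps { sh "docker build -t ${REGISTRY}/${IMAGE} ." }
        }
        stage('Push') {
            // version the validated image in the registry
            steps { sh "docker push ${REGISTRY}/${IMAGE}" }
        }
        stage('Deploy') {
            steps { sh "kubectl set image deployment/demo-app app=${REGISTRY}/${IMAGE}" }
        }
    }
}
```

Because the image is built and pushed exactly once, every later environment deploys the same tested artifact, which is the "build once, promote through registries" practice in action.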
DynamoDB Local is a version of Amazon DynamoDB that you can run locally as a Docker container (among other forms). It's super easy to get started:

```
# start container
docker run --rm -p 8000:8000 amazon/dynamodb-local

# connect and create a table
aws dynamodb create-table --endpoint-url http://localhost:8000 \
    --table-name Books \
    --attribute-definitions AttributeName=ISBN,AttributeType=S \
    --key-schema AttributeName=ISBN,KeyType=HASH \
    --billing-mode PAY_PER_REQUEST

# list tables
aws dynamodb list-tables --endpoint-url http://localhost:8000
```

More on the --endpoint-url soon.

Hello Testcontainers!

This is a good start. But DynamoDB Local is a great fit for Testcontainers, which "is an open source framework for providing throwaway, lightweight instances of databases, message brokers, web browsers, or just about anything that can run in a Docker container." It supports multiple languages (including Go!) and databases (as well as messaging infrastructure, etc.); all you need is Docker. Testcontainers for Go makes it simple to programmatically create and clean up container-based dependencies for automated integration/smoke tests: you can define test dependencies as code, run your tests, and delete the containers once done. Testcontainers has the concept of modules, which are "preconfigured implementations of various dependencies that make writing your tests even easier." Having a piece of infrastructure supported as a Testcontainers module provides a seamless, plug-and-play experience. This is where the Testcontainers module for DynamoDB Local comes in! It allows you to easily run/test your Go-based DynamoDB applications locally using Docker.

Getting Started With the Testcontainers Module for DynamoDB Local

Super easy!

```
go mod init demo
go get github.com/abhirockzz/dynamodb-local-testcontainers-go
```

You can go ahead and use the sample code in the project README.
To summarize, it consists of four simple steps:

1. Start the DynamoDB Local Docker container: dynamodblocal.RunContainer(ctx).
2. Get the client handle for the DynamoDB (local) instance: dynamodbLocalContainer.GetDynamoDBClient(context.Background()).
3. Use the client handle to execute operations. In this case: create a table, add an item, and query that item.
4. Terminate the container at the end of the program (typically registered with defer): dynamodbLocalContainer.Terminate(ctx).

Module Options

The following configuration parameters are supported:

WithTelemetryDisabled: When specified, DynamoDB Local will not send any telemetry.
WithSharedDB: With this option, DynamoDB Local creates a shared database file in which data is stored. This is useful if you want to persist data, e.g., between successive test executions.

To use WithSharedDB, here is a common workflow:

1. Start the container and get the client handle.
2. Create a table, add data, and query it.
3. Re-start the container.
4. Query the same data (again); it should still be there.

And here is how you might go about it (error handling and logging omitted):

```go
func withSharedDB() {
	ctx := context.Background()

	// start container
	dynamodbLocalContainer, _ := dynamodblocal.RunContainer(ctx)
	defer dynamodbLocalContainer.Terminate(ctx)

	// get client
	client, _ := dynamodbLocalContainer.GetDynamoDBClient(context.Background())

	// create table, add data
	createTable(client)
	value := "test_value"
	addDataToTable(client, value)

	// query same data
	queryResult, _ := queryItem(client, value)
	log.Println("queried data from dynamodb table. result -", queryResult)

	// re-start container
	dynamodbLocalContainer.Stop(context.Background(), aws.Duration(5*time.Second))
	dynamodbLocalContainer.Start(context.Background())

	// query same data
	client, _ = dynamodbLocalContainer.GetDynamoDBClient(context.Background())
	queryResult, _ = queryItem(client, value)
	log.Println("queried data from dynamodb table. result -", queryResult)
}
```

To use these options together:

```go
container, err := dynamodblocal.RunContainer(ctx, WithSharedDB(), WithTelemetryDisabled())
```

The Testcontainers documentation is pretty good in terms of detailing how to write an extension/module, but I had to deal with one nuance specific to DynamoDB Local.

DynamoDB Endpoint Resolution

Unlike the DynamoDB web service, to access DynamoDB Local (with the SDK, AWS CLI, etc.) you must specify a local endpoint: http://<your_host>:<service_port>. Most commonly, this is http://localhost:8000. The endpoint resolution process changed in AWS SDK for Go v2, and I had to do some digging to figure it out. You can read up on it in the SDK documentation, but the short version is that you have to specify a custom endpoint resolver. In this case, all it takes is retrieving the Docker container's host and port. Here is the implementation, which is used in the module as well:

```go
type DynamoDBLocalResolver struct {
	hostAndPort string
}

func (r *DynamoDBLocalResolver) ResolveEndpoint(ctx context.Context, params dynamodb.EndpointParameters) (endpoint smithyendpoints.Endpoint, err error) {
	return smithyendpoints.Endpoint{
		URI: url.URL{Host: r.hostAndPort, Scheme: "http"},
	}, nil
}
```

This Was Fun!

As I mentioned, Testcontainers has excellent documentation, which was helpful as I had to wrap my head around how to support the shared flag (using WithSharedDB). The solution was easy (ultimately), but the Reusable container section was the one that turned on the lightbulb for me! If you find this project interesting/helpful, don't hesitate to ⭐️ it and share it with your colleagues. Happy Building!
"Top" is a robust, lightweight command-line tool that provides real-time reports on system-wide resource utilization. It is commonly available in various Linux distributions. However, we have observed that it may not accurately report information when executed within a Docker container. This post aims to bring this issue to your attention.

CPU Stress Test in a Docker Container

Let's carry out a straightforward experiment: we'll deploy a container using an Ubuntu image and intentionally increase CPU consumption. Execute the following command:

```
docker run -ti --rm --name tmp-limit --cpus="1" -m="1G" ubuntu bash -c 'apt update; apt install -y stress; stress --cpu 4'
```

This command performs the following actions:

Initiates a container using the Ubuntu image
Sets a CPU limit of 1
Sets a memory limit of 1G
Executes 'apt update; apt install -y stress; stress --cpu 4', which runs a CPU stress test

CPU Utilization Reported by top on the Host

Now, let's run the top tool on the host where this Docker container is running. The output of the top tool is as follows:

Fig 1: top command from the host

Please take note of the orange rectangle in Fig 1. This metric indicates 25% CPU utilization, which is the correct value: the host has 4 cores, and we allocated our container a limit of 1 core. As this single core is fully utilized, the reported CPU utilization at the host level is 25% (i.e., 1/4 of the total cores).

CPU Utilization Reported by top in the Container

Now, let's execute the top command within the container. The following is the output reported by the top command:

Fig 2: top command from the container

Please observe the orange rectangle in Fig 2. The CPU utilization is reported as 25%, mirroring the host's value. This, however, is inaccurate from the container's viewpoint, since the container has fully utilized its allotted CPU limit, i.e., 100%. Nevertheless, it's important to note that the processes listed in Fig 2 are accurate.
The tool correctly reports only the processes running within this container and excludes processes from the rest of the host.

How To Find Accurate CPU Utilization in Containers

In such a scenario, to obtain accurate CPU utilization within the container, there are several solutions:

1. Docker Container Stats (docker stats)
2. Container Advisor (cAdvisor)
3. yCrash

1. Docker Stats

The docker stats command provides fundamental resource utilization metrics at the container level. Here is the output of `docker stats` for the previously launched container:

Fig 3: docker stats output

Note the orange rectangle in Fig 3. The CPU utilization is indicated as 100.64%. However, the challenge is that `docker stats` cannot be executed within the container (unless the Docker socket is passed into the container, which is uncommon and poses a security risk); it must be run from the host.

2. cAdvisor

You can utilize the cAdvisor (Container Advisor) tool, which natively supports Docker containers, to provide container-level resource utilization metrics.

3. yCrash

Fig 4: yCrash - root cause analysis report

Additionally, you have the option to employ the yCrash tool, which not only provides container-level metrics but also analyzes application-level dumps (such as garbage collection logs, application logs, threads, and memory dumps) and presents a comprehensive root cause analysis report.

Conclusion

While "top" serves as a reliable tool for monitoring system-wide resource utilization, its accuracy within Docker containers may be compromised. This discrepancy can lead to misleading insights into container performance, especially regarding CPU utilization. As demonstrated in our experiment, "top" reported 25% CPU usage within the container despite full utilization of the allocated CPU limit. To obtain precise metrics within Docker containers, alternative tools such as Docker Container Stats, cAdvisor, and yCrash offer valuable insights into resource utilization.
By leveraging these tools, users can ensure accurate monitoring and optimization of containerized environments, ultimately enhancing performance and operational efficiency.
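For completeness, a process inside the container can also read its own cgroup counters directly, without any extra tooling. The sketch below assumes a cgroup v2 host, where /sys/fs/cgroup/cpu.stat inside the container exposes a usage_usec counter scoped to that container (on cgroup v1 hosts the path and units differ, so treat the paths as assumptions):

```shell
#!/bin/sh
# Compute container CPU utilization (%) from two usage_usec samples.
# Args: first sample, second sample, interval in seconds, CPU limit.
cpu_pct() {
  awk -v a="$1" -v b="$2" -v dt="$3" -v lim="$4" \
    'BEGIN { printf "%.1f\n", (b - a) / (dt * 1000000 * lim) * 100 }'
}

STAT=/sys/fs/cgroup/cpu.stat
if [ -r "$STAT" ]; then
  u1=$(awk '/^usage_usec/ {print $2}' "$STAT")
  sleep 5
  u2=$(awk '/^usage_usec/ {print $2}' "$STAT")
  # With a --cpus="1" limit fully saturated, this prints roughly 100.0
  cpu_pct "$u1" "$u2" 5 1
fi
```

Sampled this way inside the stress-test container above, the result lines up with what docker stats reports from the host rather than with top's misleading 25%.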
In the ever-evolving landscape of cloud-native computing, containers have emerged as the linchpin, enabling organizations to build, deploy, and scale applications with unprecedented agility. However, as the adoption of containers accelerates, so does the imperative for robust container security strategies. The interconnected realms of containers and the cloud have given rise to innovative security patterns designed to address the unique challenges posed by dynamic, distributed environments. This article explores the latest patterns, anti-patterns, and practices for cloud-native architecture, including the orchestration intricacies of Kubernetes across Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), and Google Kubernetes Engine (GKE), as well as the nuances of securing microservices.

What Is Container Security?

Container security is the practice of ensuring that container environments are protected against any threats. As with any security implementation within the software development lifecycle (SDLC), the practice of securing containers is a crucial step, as it not only protects against malicious actors but also allows containers to run smoothly in production. The process of securing containers is continuous and can be applied at the infrastructure level, at runtime, and across the software supply chain, to name a few. As such, securing containers is not a one-size-fits-all approach. In the sections below, we will discuss different container management strategies and how security comes into play.

How to Build a Container Strategy With Security Forensics Embedded

A container management strategy involves a structured plan to oversee the creation, deployment, orchestration, maintenance, and discarding of containers and containerized applications.
It encompasses key elements to ensure efficiency, security, and scalability throughout a containerization-based software development lifecycle. Let's first analyze the prevailing and emerging anti-patterns for container management and security. Then, we will correlate possible solutions or alternative recommendations with each anti-pattern, along with optimization practices for fortifying container security strategies against today's and tomorrow's threats.

"Don't treat container security like a choose-your-own-adventure book; following every path might lead to a comedy of errors, not a happy ending!"

Container Security Best Practices

Weak Container Supply Chain Management

This anti-pattern overlooks container supply chain management (visible via "docker history"), risking compromised security. Hastily using unofficial Docker images without vetting their origin or build process poses a significant threat. Ensuring robust container supply chain management is vital for upholding integrity and security within the container environment.

Anti-Pattern: Potential Compromise

Pushing malicious code into Docker images is straightforward, but detecting such code is challenging. Blindly using others' images, or building new ones from them, can put security at risk even if those images solve similar problems.

Pattern: Secure Practices

Instead of relying solely on others' images, inspect their Dockerfiles, emulate their approach, and customize them for your needs. Ensure FROM lines in the Dockerfile point to trusted images, preferably official ones or images you've crafted from scratch; the added effort is small compared to the aftermath of a breach.

Installing Non-Essential Executables Into a Container Image

Non-essential executables for container images encompass anything unnecessary for the container's core function or app interpreter.
For production, omit tools like text editors. Java or Python apps may need specific executables, while Go apps can run directly from a minimal "scratch" base image.

Anti-Pattern: Excessive Size

Adding non-essential executables to a container amplifies vulnerability risks and enlarges image size. This surplus bulk slows pull times and increases network data transmission.

Pattern: Trim the Fat

Start with a minimal official or self-generated base image to curb potential threats. Assess your app's true executable necessities, avoiding unnecessary installations. Exercise caution while removing language-dependent executables to craft a lean, cost-effective container image.

Cloning an Entire Git Repo Into a Container Image

It could look something like this:

```
RUN git clone https://github.org/somerepo
```

Anti-Pattern: Unnecessary Complexity

External dependency: Relying on non-local sources for Docker image files introduces risk, as these files may not be vetted beforehand.
Git clutter: A git clone brings surplus files like the .git/ directory, increasing image size. The .git/ folder may contain sensitive information, and removing it is error-prone.
Network dependency: Depending on container engine networking to fetch remote files adds complexity, especially with corporate proxies, potentially causing build errors.
Executable overhead: Including the Git executable in the image is unnecessary unless you are directly manipulating Git repositories.

Pattern: Streamlined Assembly

Instead of a direct git clone in the Dockerfile, clone to a sub-directory of the build context via a shell script. Then, selectively add the needed files using the COPY directive, minimizing unnecessary components. Utilize a .dockerignore file to exclude undesired files from the Docker image.

Exception: Multi-Stage Build

For a multi-stage build, consider cloning the repository to a local folder and then copying it into the build-stage container.
While git clone might be acceptable there, this approach offers a more controlled and error-resistant alternative.

Building a Docker Container Image "On the Fly"

Anti-Pattern: Skipping Registry Deployment

Cloning, building, and running a Docker image without pushing it to an intermediary registry is an anti-pattern: it skips security screenings, lacks a backup, and introduces untested images into deployment. The main problems are the security and testing gaps:

Backup and rollback: Skipping the registry upload denies you a backup, which is crucial for quick rollbacks in case of deployment failures.
Vulnerability scanning: Neglecting registry uploads means missing out on vulnerability scanning, a key element in ensuring data and user safety.
Untested images: Deploying unpushed images means deploying untested ones, a risky practice, particularly in a production environment.

Pattern: Registry Best Practices

Build and uniquely version images in a dedicated environment, pushing them to a container registry. Let the registry scan for vulnerabilities, and ensure thorough testing before deployment. Utilize deployment automation for seamless image retrieval and execution.

Running as Root in the Container

Anti-Pattern: Defaulting to Root User

Many new container users inadvertently run containers with root as the default user, a default set by container engines during image creation. This can lead to the following security risks:

Root user vulnerabilities: Running a Linux-based container as root exposes the system to potential takeovers and breaches, allowing bad actors access inside the network and potentially to the container host system.
Container breakout risk: A compromised container could lead to a breakout, granting unauthorized root access to the container host system.
Pattern: User Privilege Management

Instead of defaulting to root, use the USER directive in the Dockerfile to specify a non-root user. Before this, ensure the user is created in the image and has adequate permissions to run the required commands, including the application itself. This practice reduces the security vulnerabilities associated with root privileges.

Running Multiple Services in One Container

Anti-Pattern: Co-Locating Multiple Tiers

This anti-pattern involves running multiple tiers of an application, such as APIs and databases, within the same container, contradicting the minimalist essence of container design. The complexity and deviation from that design cause the following challenges:

Minimalism violation: Containers are meant to be minimalistic instances, focusing on the essentials for running a specific application tier. Co-locating services in a single container introduces unnecessary complexity.
Exit code management: Containers are designed to exit when the primary executable ends, relaying the exit code to the launching shell. Running multiple services in one container requires manual management of unexpected exceptions and errors, bypassing the container engine's handling.

Pattern: Service Isolation

Adopt the principle of one container per task, ensuring each container hosts a single service. Establish a local virtualized container network (e.g., docker network create) for intra-container communication, enabling seamless interaction without compromising the minimalist design of individual containers.

Embedding Secrets in an Image

Anti-Pattern: Storing Secrets in Container Images

This anti-pattern involves storing sensitive information, such as local development secrets, within container images, often overlooked in various places like ENV directives in Dockerfiles.
This leads to the following security compromises:

Easy to forget: Container images offer numerous hiding spots for stored information, such as ENV directives, which makes secrets easy to overlook and forget.
Accidental copy of secrets: Without adequate precautions, local files containing secrets, such as .env files, can be copied into the container image.

Pattern: Secure Retrieval at Runtime

Dockerignore best practices: Implement a .dockerignore file covering local files that hold development secrets to prevent their inadvertent inclusion in the container image. The same files should also be listed in .gitignore.
Dockerfile security practices: Avoid placing secrets in Dockerfiles. For secure handling during build or testing phases, prefer secure alternatives such as Docker BuildKit's secret mounts over passing secrets via --build-arg.
Runtime secret retrieval: Retrieve secrets at runtime from secure stores like HashiCorp Vault, cloud-based services (e.g., AWS KMS), or Docker's built-in secrets functionality, which requires Docker Swarm mode.

Failing to Update Packages When Building Images

Anti-Pattern: Static Base Image Packages

This anti-pattern stems from a former best practice in which container image providers discouraged updating packages within base images. The current best practice is the opposite: update installed packages every time a new image is built. The reason is that base images are built periodically or on a schedule, so they do not always contain the latest versions of their installed packages, leaving systems exposed to outdated packages and their security vulnerabilities.

Pattern: Continuous Package Updates

To address this, regularly update installed packages using the distribution's package manager within the Dockerfile.
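On a Debian-based image, that update can look like the following minimal sketch (the base image tag is illustrative):

```dockerfile
FROM debian:bookworm-slim
# Refresh the package index, upgrade every package already installed in the
# base image, then remove the apt cache to keep the image small.
RUN apt-get update \
 && apt-get -y upgrade \
 && rm -rf /var/lib/apt/lists/*
```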
Incorporate this process early in the build, ideally within the first RUN directive, so that every new image build includes updated packages for better security and stability.

Building Container Security Into Development Pipelines Creates a Dynamic Landscape

Containers are more popular than ever, and the volume of security threats has grown in step. In navigating this ever-evolving landscape, we've covered a spectrum of crucial patterns and anti-patterns. From fortifying container images through careful supply chain management to embracing runtime secrets retrieval, each pattern serves as a cornerstone in the architecture of robust container security. By unraveling the complexities of co-located services and avoiding the pitfalls of outdated packages, we've highlighted the significance of adaptability and continuous improvement. As we champion the ethos of one container per task and the secure retrieval of secrets, we acknowledge that container security is not a static destination but an ongoing journey. By understanding and implementing these patterns, we fortify our containers against potential breaches, ensuring a resilient and proactive defense in an ever-shifting digital landscape.
Docker has transformed containerization by providing a powerful platform for packaging, shipping, and running applications within containers. Containers provide a lightweight, portable, and isolated environment for applications, which makes them appealing to developers and DevOps teams. A key aspect of containerization is networking: Docker offers a range of networking drivers to enable communication between containers and with external networks. In this guide, we will explore the significance of networking drivers in Docker, how they work, the different types available, and best practices for selecting the right driver to optimize container networking.

The Role of Networking Drivers

Networking drivers in Docker are essential components responsible for configuring the network interfaces of containers and connecting them to different network segments.
They play a critical role in enabling communication among containers, connecting containers to external networks, and ensuring network isolation and security. The primary functions of networking drivers include:

Creating isolated networks: Networking drivers can create isolated networks within the Docker host, enabling containers to communicate securely without interfering with one another.
Bridge and routing: They provide the bridge and routing functionality necessary to connect containers to the host network or other external networks.
Custom network topologies: Docker networking drivers allow users to create custom network topologies, connecting containers in various ways to achieve specific communication patterns.
Integration with external networks: Networking drivers enable Docker containers to communicate with external networks, such as the Internet or on-premises networks.

How Networking Drivers Work

Networking drivers in Docker operate by configuring network interfaces and rules on the host system to manage the network connectivity of containers. They allow containers to connect to virtual or physical network interfaces and interact with other containers or external systems. Here's a simplified overview of how networking drivers work:

Isolation: Docker creates isolated networks for containers, ensuring that each container operates in its own dedicated network namespace, preventing direct interference between containers.
Routing: Networking drivers set up routing tables and firewall rules to enable containers to communicate within their respective networks and with external systems.
Bridge and overlay networks: Networking drivers manage the bridge and overlay networks that facilitate communication between containers. Bridge networks are used for communication within a host, while overlay networks allow containers to communicate across hosts.
Custom configuration: Depending on the networking driver chosen, custom configurations like IP addressing, port mapping, and network discovery can be implemented to meet specific communication requirements.

Common Docker Networking Drivers

Docker offers a variety of networking drivers, each with its own strengths and use cases. The choice of a networking driver can significantly impact container communication, performance, and network security. Here are some of the most commonly used Docker networking drivers:

Bridge

Bridge is the default Docker networking driver and is commonly used for local communication between containers on a single host. Containers connected to a bridge network can communicate with each other over the host's internal network. The bridge network provides NAT (Network Address Translation) for traffic leaving the host and basic isolation.

Pros:
Simple to set up and use.
Suitable for scenarios where containers need to communicate with each other on the same host.
Provides basic network isolation.

Cons:
Limited to communication within the host.
Not ideal for multi-host communication.

Host

The host network driver allows containers to share the host's network namespace. Containers get full access to the host's network stack and can communicate with external networks directly using the host's IP address. It is primarily used when you need maximum network performance and don't require network isolation.

Pros:
Highest possible network performance.
Containers share the host's network namespace, enabling direct access to external networks.

Cons:
Minimal network isolation.
Containers may conflict with ports already in use on the host.

Overlay

The overlay network driver enables communication between containers running on different Docker hosts. It creates a distributed network that spans multiple hosts, making it suitable for building multi-host and multi-container applications.
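As a sketch, creating and using an overlay network looks like this. It assumes the Docker hosts have already been joined into a Swarm; the network and container names are illustrative:

```shell
# On a Swarm manager: create an attachable overlay network.
$ docker network create -d overlay --attachable app-net

# On any node in the swarm: attach a container to that network.
$ docker run -d --name api --network app-net nginx
```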
Overlay networks are based on the VXLAN protocol, providing encapsulation and tunneling for inter-host communication.

Pros:
Supports communication between containers on different hosts.
Scalable for multi-host environments.
Provides network isolation and segmentation.

Cons:
Requires more configuration than bridge networks.
Requires network plugins for integration with third-party networking technologies.

Macvlan

Macvlan allows you to assign a MAC address to each container, making containers appear as separate physical devices on the network. This is useful when containers must communicate with external networks using unique MAC and IP addresses. Macvlan is typically used in scenarios where containers need to behave like physical devices on the network.

Pros:
Containers appear as distinct devices on the network.
Useful for scenarios where containers require unique MAC addresses.
Supports direct external network communication.

Cons:
Requires careful configuration to avoid conflicts with existing network devices.
Limited to Linux hosts.

Ipvlan

Ipvlan is similar to Macvlan but gives containers separate IP addresses while sharing the same MAC address. Ipvlan is efficient in scenarios where multiple containers need to share a network link while having individual IP addresses.

Pros:
Provides separate IP addresses to containers.
More efficient resource usage compared to Macvlan.
Supports external network communication.

Cons:
Limited to Linux hosts.
Containers share the same MAC address, which may have limitations in specific network configurations.

Selecting the Right Networking Driver

Choosing the right networking driver for your Docker environment is a critical decision that depends on your specific use case and requirements.
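It helps to first see what your installation provides. These two commands list the existing networks with their drivers and show the configuration of one of them:

```shell
# List existing networks and the driver behind each one
# (a default installation includes bridge, host, and none).
$ docker network ls

# Inspect the default bridge network's configuration: subnet, gateway,
# and the containers currently attached to it.
$ docker network inspect bridge
```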
Consider the following factors when making your selection:

Container communication needs: Determine whether your containers need to communicate locally within the same host, across multiple hosts, or directly with external networks.
Network isolation: Consider the level of network isolation your application requires. Some drivers, like bridge and overlay, provide network segmentation and isolation, while others, like host and macvlan, offer less isolation.
Host OS compatibility: Ensure that the chosen networking driver is compatible with your host operating system. Some drivers are limited to Linux hosts, while others can be used in a broader range of environments.
Performance and scalability: Assess the performance characteristics of the networking driver in your specific environment. Different drivers excel at different workloads, so it's essential to align performance with your application's needs.
Configuration complexity: Evaluate the complexity of setting up and configuring the networking driver. Some drivers require more extensive configuration than others.

Best Practices for Docker Networking

Selecting the right networking driver is just the first step in optimizing Docker container communication. To ensure optimal performance, security, and network isolation, consider these best practices:

Performance Considerations

Monitor network traffic: Regularly monitor network traffic and bandwidth usage to identify bottlenecks and performance issues. Tools like iftop and netstat can help in this regard.
Optimize DNS resolution: Configure DNS resolution efficiently to reduce network latency and improve container name resolution.
Use overlay networks for multi-host communication: When building multi-host applications, use overlay networks for efficient and secure communication between containers on different hosts.
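One concrete step for the DNS point above: user-defined bridge networks include an embedded DNS server, so attached containers can resolve each other by container name. A minimal sketch, with illustrative network and container names (my-api:latest is a hypothetical application image):

```shell
# Create a user-defined bridge network; unlike the default bridge,
# it provides automatic DNS-based name resolution between containers.
$ docker network create app-net

$ docker run -d --name db --network app-net postgres:16
$ docker run -d --name api --network app-net my-api:latest
# Inside "api", the hostname "db" now resolves to the db container.
```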
Security and Isolation

Implement network segmentation: Use bridge or overlay networks for network segmentation and isolation between containers to prevent unauthorized communication.
Network policies and firewall rules: Define network policies and firewall rules to control container communication and enforce security measures.
Regular updates and security patches: Keep your Docker installation, host OS, and networking drivers up to date with the latest security patches and updates to mitigate vulnerabilities.
TLS encryption: Enable TLS (Transport Layer Security) encryption for container communication when transmitting sensitive data.
Container privileges: Limit container privileges and define user namespaces to restrict container access to host and network resources.

Conclusion

Docker networking drivers are what let containers communicate with one another and with external networks. They are central to creating isolated networks, routing communication, and building specialized network topologies. Selecting the correct networking driver for your Docker system is critical to achieving optimal container connectivity, performance, security, and network isolation. By knowing the strengths and limits of the common Docker networking drivers and following the recommended practices, you can leverage the full power of Docker containers and optimize communication for your applications. Whether you're developing single-host or multi-host applications, the networking driver you choose will be critical to the success of your containerized system.
Yitaek Hwang, Software Engineer, NYDIG
Emmanouil Gkatziouras, Cloud Architect, egkatzioura.com
Marija Naumovska, Product Manager, Microtica