Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Health Checking Your Docker Containers

DZone's Guide to

Health Checking Your Docker Containers

Are your containers feeling under the weather? Struggling to get out of bed? See how you can build in health checks to make sure your containers are fighting fit.

· Cloud Zone
Free Resource

Site24x7 - Full stack It Infrastructure Monitoring from the cloud. Sign up for free trial.

One of the new features in Docker 1.12 is how health check for a container can be baked into the image definition. And this can be overridden at the command line.

Just like the CMD instruction, there can be multiple HEALTHCHECK instructions in Dockerfile but only the last one is effective.

This is a great addition because a container reporting status as Up 1 hour may return errors. The container may be up but there is no way for the application inside the container to provide a status. This instruction fixes that.

The Dockerfile that builds arungupta/couchbase image is:

FROM couchbase:latest

COPY configure-node.sh /opt/couchbase

HEALTHCHECK --interval=5s --timeout=3s CMD curl --fail http://localhost:8091/pools || exit 1

CMD ["/opt/couchbase/configure-node.sh"]


It uses a configure-node.sh script to configure the server using the Couchbase REST API. The new instruction to notice here is HEALTHCHECK.

This instruction can be specified as:

HEALTHCHECK <options> CMD <command>


The <options> can be:

  • --interval=DURATION (default 30s).

  • --timeout=DURATION (default 30s).

  • --retries=N (default 3).

The <command> is the command that runs inside the container to check the health.

If health check is enabled, then the container can have three states:

  • Starting: Initial status when the container is still starting.

  • Healthy: If the command succeeds, then the container is healthy.

  • Unhealthy: If a single run of the <command> takes longer than the specified timeout, then it is considered unhealthy. If a health check fails, then the <command> will run retries number of times and will be declared unhealthy if the <command> still fails.

The commands exit status indicates the health status of the container. The following values are allowed:

  • 0: container is healthy.

  • 1: container is not healthy.

In our instruction, /pools REST API is invoked using curl. If the command fails then an exit status of 1 is returned, and this marks the container unhealthy for that attempt. This command is invoked every 5 seconds. The container is marked unhealthy if the command does not return successfully within 3 seconds.

Run the container as:

docker run -d --name db arungupta/couchbase:latest


Check the status:

docker ps
CONTAINER ID        IMAGE                        COMMAND                  CREATED             STATUS                            PORTS                                                        NAMES
55b14302671e        arungupta/couchbase:latest   "/entrypoint.sh /opt/"   2 seconds ago       Up 1 seconds (health: starting)   8091-8094/tcp, 11207/tcp, 11210-11211/tcp, 18091-18093/tcp   db


Notice how health: starting status is reported in the STATUS column. Checking after a few seconds shows the status:

docker ps
CONTAINER ID        IMAGE                        COMMAND                  CREATED              STATUS                        PORTS                                                        NAMES
55b14302671e        arungupta/couchbase:latest   "/entrypoint.sh /opt/"   About a minute ago   Up About a minute (healthy)   8091-8094/tcp, 11207/tcp, 11210-11211/tcp, 18091-18093/tcp   db


And now it's reported healthy.

More details about this HEALTHCHECK instruction can be found on docs.docker.com.

Now, if you are running an image that does not have HEALTHCHECK instruction then the docker run command can be used to specify similar values. An equivalent runtime command would be:

docker run -d --name db --health-cmd "curl --fail http://localhost:8091/pools || exit 1" --health-interval=5s --timeout=3s arungupta/couchbase


The last five health checks for a container can be obtained using the docker inspect command:

docker inspect --format='{{json .State.Health}}' db


The output is shown as:

{
  "Status": "healthy",
  "FailingStreak": 0,
  "Log": [
    {
      "Start": "2016-11-12T03:23:03.351561Z",
      "End": "2016-11-12T03:23:03.422176171Z",
      "ExitCode": 0,
      "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   768  100   768    0     0   595k      0 --:--:-- --:--:-- --:--:--  750k\n{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}"
    },
    {
      "Start": "2016-11-12T03:23:08.423558928Z",
      "End": "2016-11-12T03:23:08.510122392Z",
      "ExitCode": 0,
      "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   768  100   768    0     0   309k      0 --:--:-- --:--:-- --:--:--  375k\n{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}"
    },
    {
      "Start": "2016-11-12T03:23:13.511446818Z",
      "End": "2016-11-12T03:23:13.58141325Z",
      "ExitCode": 0,
      "Output": " {\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}} % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   768  100   768    0     0   248k      0 --:--:-- --:--:-- --:--:--  375k\n"
    },
    {
      "Start": "2016-11-12T03:23:18.583512367Z",
      "End": "2016-11-12T03:23:18.677727356Z",
      "ExitCode": 0,
      "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dlo{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}ad  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   768  100   768    0     0   307k      0 --:--:-- --:--:-- --:--:--  375k\n"
    },
    {
      "Start": "2016-11-12T03:23:23.679661467Z",
      "End": "2016-11-12T03:23:23.782372291Z",
      "ExitCode": 0,
      "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   768  100   768    0     0   439k      0 --:--:-- --:--:-- --:--:--  750k\n"
    }
  ]
}


Related Refcard:

Site24x7 - Full stack It Infrastructure Monitoring from the cloud. Sign up for free trial.

Topics:
cloud ,docker ,health checks ,command line

Published at DZone with permission of Arun Gupta, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}