DockerHub News — And the Impact for Us Developers
Docker Hub is in the news! But this news is not the best for us developers like you and me. So what is this DockerHub?
But First: What's the News?
So what is Docker Hub? It is a cloud-based registry where Docker images can be published and then used by others for their own needs. Until now it has been the central repository for images, much as Maven Central is for the Java world.
Access was free, and there were no restrictions on storage space or on how long images were stored. As a result, a large number of open source projects use this repository for their purposes, and a whole network of dependencies between these images has built up over the years. So much for the past.
Docker Hub Was in the News Recently for Two Reasons:
Storage limits: Until now, Docker images were stored on Docker Hub for an unlimited time. On the one hand, this meant that nobody cared about the storage space their images used. On the other hand, this state lasted so long that pretty much everyone relied on it never changing. Unfortunately, that has now changed. The retention period for inactive Docker images has been reduced to six months. What does not sound particularly critical at first turns out to be quite uncomfortable in detail. When selecting base images, i.e. the images used as the foundation for one's own compositions, it was not uncommon to overlook that exactly these images might one day no longer be available.
Download throttling: Docker has put a limit on the download rate of 100 pulls per six hours for anonymous users, and 200 pulls per six hours for free accounts. The number 200 sounds pretty bearable. However, it makes sense to do a more detailed calculation here. 200 pulls / 6 h is 200 pulls / 360 min, or about 0.56 pulls per minute at a constant rate. That is roughly one pull every two minutes. First, many systems do more than one build every two minutes. Second, once the limit has been reached, it can take more than half a working day to regain access. The latter is to be viewed as very critical. Usually, limits are given per hour, which then only leads to a delay of a little under an hour. Six hours is a different order of magnitude.
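The calculation above can be reproduced with a few lines of Python. This is only a sketch: the limits and the six-hour window come from the announcement as quoted here, while the build load in the last step (30 builds per hour with 3 pulls each) is a purely hypothetical workload chosen for illustration.

```python
WINDOW_MINUTES = 6 * 60  # six-hour window, as stated in the announcement


def pulls_per_minute(limit: int, window_minutes: int = WINDOW_MINUTES) -> float:
    """Average pull rate that exactly uses up the limit over the window."""
    return limit / window_minutes


def minutes_until_exhausted(limit: int, pulls_per_build: int, builds_per_hour: float) -> float:
    """Minutes until the limit is used up at a steady (assumed) build rate."""
    pulls_per_hour = pulls_per_build * builds_per_hour
    return 60 * limit / pulls_per_hour


if __name__ == "__main__":
    for label, limit in (("anonymous", 100), ("free account", 200)):
        print(f"{label}: {pulls_per_minute(limit):.2f} pulls/minute")
    # Hypothetical CI load: 30 builds per hour, 3 image pulls per build.
    remaining = minutes_until_exhausted(200, pulls_per_build=3, builds_per_hour=30)
    print(f"limit reached after about {remaining:.0f} minutes")
```

With this assumed load, a free account would hit the limit after a bit more than two hours and then stay throttled for the rest of the six-hour window.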
These two points alone are enough to bring some active open source projects to a standstill. And I have not even begun to look at what this can do in a global context.
For DevSecOps — A Nightmare Comes True
But let's come to a completely different point of view. We are talking about **business continuity** and the effects on the entire value chain. As a result of this announcement, every software project must now begin to examine its whole value chain to determine where it critically depends on such Docker images. And this must include the direct and indirect paths. A nightmare will come true, as this means an incalculable effort that, by definition, has to be completed within the next six months. The clock is already running, because the timer on the Docker Hub site has begun to tick!
It's interesting to see how carelessly the entire Docker universe has been handled. Freely available Docker images are simply integrated into your own process: in development to compile your source code, to operate the necessary infrastructure, or as a basis for the production systems. There are two types of dangers lurking here for a company, both of which can lead to financial loss.
The obvious danger comes from the area of classic security: break-ins into the operating system, bugs, compromised base packages and the like. In short, everything you associate with hackers and other evildoers. A completely different type of danger, on the other hand, is much more subtle. It can lie dormant indefinitely and is not one of the generally visible failures. I am talking about the legal pitfalls here. Docker images contain countless binary packages that have been made available as a composition. Every small part is linked to a license, even if it is only the explicit waiver of one. The wrong license in the right place can confront a company with financial risk on an unimagined scale.
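A first step for the direct paths could be to list the base images that your own Dockerfiles reference. The following is a minimal sketch under a few assumptions: the function name `base_images` is made up for this example, variables introduced with `ARG` are not resolved, and indirect dependencies (the base images of your base images) are not followed.

```python
import re

# Matches "FROM [--flag=value ...] image [AS stage]" (case-insensitive).
FROM_LINE = re.compile(r"^\s*FROM\s+(?:--\S+\s+)*(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE)


def base_images(dockerfile_text: str) -> list[str]:
    """Return the external base images named in FROM lines, in order of appearance."""
    stages: set[str] = set()   # names of earlier build stages, not registry images
    images: list[str] = []
    for line in dockerfile_text.splitlines():
        match = FROM_LINE.match(line)
        if not match:
            continue
        image, alias = match.group(1), match.group(2)
        # "scratch" is the empty image and earlier stages are local, so skip both.
        if image.lower() != "scratch" and image not in stages and image not in images:
            images.append(image)
        if alias:
            stages.add(alias)
    return images


if __name__ == "__main__":
    sample = """
    FROM maven:3.8-jdk-11 AS build
    RUN mvn package
    FROM --platform=linux/amd64 openjdk:11-jre
    COPY --from=build /app/target/app.jar /app.jar
    """
    print(base_images(sample))
```

Run over every Dockerfile in a code base, such a list gives a starting inventory of the images whose availability you currently rely on.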
After all the creepy explanations, the question of the solution naturally arises. I will now outline this point by point and show which of the risks mentioned can be eliminated with it.
Ensure the availability of system-critical elements. The goal must be to become independent of anything that can bring your own value chain to a standstill. The obvious step is to use your own registry in which the Docker images you use are stored. That way you control how long these images are kept.
Identify and start removing performance bottlenecks in production. The aim is to reduce the maximum number of queries per unit of time. When operating your own infrastructure, you do not have this form of limitation, and you can decide for yourself which load the system has to withstand to guarantee production as planned.
Implement technical security with regard to known vulnerabilities, bugs and compromised system components. Using your own infrastructure offers excellent potential here, because you also have a comprehensive picture of the metadata. Whenever you provide the artefacts yourself, you can, for example, incorporate the use of checksums, the purpose and the real necessity of the individual components into a uniform concept.
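To make the checksum idea from the last point concrete, here is a small sketch that compares an artefact against a recorded SHA-256 digest. The `sha256:<hex>` notation mirrors the way Docker writes content digests; the function names are hypothetical, and how the expected digest is recorded and distributed is left open.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return the digest of the given bytes in 'sha256:<hex>' notation."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_digest: str) -> bool:
    """True if the artefact bytes match the digest recorded for them."""
    # compare_digest avoids leaking the position of the first mismatch via timing.
    return hmac.compare_digest(sha256_digest(data), expected_digest.lower())


if __name__ == "__main__":
    artefact = b"example artefact content"  # stands in for a downloaded file
    recorded = sha256_digest(artefact)      # assumed to be stored at publish time
    print(verify_artifact(artefact, recorded))                # unchanged artefact
    print(verify_artifact(artefact + b" tampered", recorded)) # modified artefact
```

The check only proves that the artefact is unchanged since the digest was recorded. It says nothing about whether the original content was trustworthy.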
Let us now come to a practically applicable concept. If we look at the aspects mentioned, the focus is on the use of a central registry. Whether this is operated by a hoster or in your own data centre is secondary to the basic concept. You can also think about hybrid approaches that offer a company exciting change paths.
One such product that provides these integration options is Artifactory from JFrog. It handles not only Docker images but also the artefacts of pretty much all other package managers. If you look at how Docker Hub works and compare it with Maven Central, for example, you can see obvious parallels.
For this purpose, I have provided a talk on YouTube that sheds light on the same problem in the area of Debian repositories.
- The English version is under https://youtu.be/TqxdLOs0Q1E
- and the German version at https://youtu.be/tTslQjPiZ34
If you don't want to type in the link, you can search for "Why Debian-Repositories are mission-critical" on YouTube.
We now place this registry between Docker Hub and our components that need these Docker images. With each request, the image obtained from Docker Hub is cached in our own registry. This behaviour works around several limitations that we have recently been confronted with on Docker Hub. On the one hand, we can now ensure that all images used are kept in our own infrastructure for an indefinite period. On the other hand, this also circumvents the limitation on request frequency, since only elements that are not yet available in the system are requested from Docker Hub. These two points alone represent an enormous relief.
As soon as the repositories have been activated, you can additionally start thinking about security. JFrog offers you the possibility to use Xray (https://jfrog.com/xray/) to cover the topics of vulnerabilities and compliance at this central point. What you can do with the Docker images, you can also do with all other artefacts. A list of the supported package management systems can be found at https://jfrog.com/artifactory/. The effects are the same, and the production process immediately benefits from these improvements.
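The caching behaviour described here can be illustrated with a toy model. This is not how Artifactory is implemented. It is a minimal in-memory sketch with made-up names (`PullThroughCache`, `fake_docker_hub`) that ignores details such as tag updates and cache expiry.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Toy model of a registry that sits between clients and an upstream registry."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._store: Dict[str, bytes] = {}  # image reference -> cached content
        self.upstream_pulls = 0             # pulls that count against the upstream limit

    def pull(self, image: str) -> bytes:
        # Only images that are not yet cached are requested from upstream.
        if image not in self._store:
            self._store[image] = self._fetch_upstream(image)
            self.upstream_pulls += 1
        return self._store[image]


def fake_docker_hub(image: str) -> bytes:
    """Stand-in for the real upstream registry (assumption for this sketch)."""
    return f"layers-of-{image}".encode()


if __name__ == "__main__":
    cache = PullThroughCache(fake_docker_hub)
    for _ in range(500):  # 500 builds that all need the same base image
        cache.pull("openjdk:11-jre")
    cache.pull("maven:3.8-jdk-11")
    print(cache.upstream_pulls)  # prints 2
```

In this model, 501 pulls from the build systems result in only two pulls against the upstream limit, and the cached images stay available for as long as you decide to keep them.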
Talk — DevSecOps Up and running with JFrog-Xray
But the best of all is that you can just try it out yourself for an indefinite period of time. Activate a FreeTier and try out the functions. I have briefly shown on YouTube how to activate such a FreeTier of the JFrog platform.
We are currently in a time of change. Many things are in flux, but there are always approaches that have proven themselves. We're talking about maintaining control over your own process-relevant elements. This also includes the essential components of software development and IT operation. With a uniform concept, you can master many of the challenges you face and look forward to the future with a smile. And the best thing about it is that it can be tried out quickly. Because when you try it yourself and see how it works, you trust it in a completely different way.
Cheers! — Sven
Opinions expressed by DZone contributors are their own.