Docker-Based Application Stack for the Archival of Architectural Data with DCHQ
Ever heard the term "DURAARK"? It stands for Durable Architectural Knowledge and it's a helpful way to design applications. See how it is implemented using DCHQ.
Recently I stumbled across DCHQ, a deployment automation & life-cycle management platform for container-based applications. As I'm currently working on a project (DURAARK) that relies heavily on containerized services, I wanted to give it a try. This post describes my first experiences with the system and shows how to set up the multi-container DURAARK application with it. Before describing the DCHQ setup I'll introduce the DURAARK application to give the reader a bit more context. If you are only interested in the DCHQ part, simply skip the next two sections of the post.
DURAARK (Durable Architectural Knowledge)
DURAARK is an open source system for the semantic archival and retrieval of architectural data. The tool helps stakeholders from the architecture, engineering and construction (AEC) community to manage data like 3D models, point cloud scans or context information over the lifecycle of a building. Our goal is to transform building data into a *living archive* that serves as a knowledge base for stakeholders, from the design phase through planning, construction and facility management to renovation or retrofitting. The interested reader can look up the project on the official page here. If you happen to work in the AEC world and are interested in the topics described there, you should definitely check it out or drop us a note!
Service Based Architecture
From a technical point of view DURAARK is a set of components developed by individual project partners. The components are written in different programming languages, including Java, Python, C++ and JavaScript. To provide our stakeholders with a homogeneous interface we decided to provide a RESTful API for each of the components and tie the APIs together via a graphical web application. This setup has the advantage that the only prerequisite for using the application is a web browser. Additionally, we can expose DURAARK's functionality via REST APIs, so customers can easily integrate it into their existing workflows.
The current DURAARK system provides the following services:
duraark-sessions: management service for sessions
duraark-metadata: extraction of metadata from supported files (BIM models in the IFC format and point cloud files in the E57 format)
duraark-sda: a knowledge graph that stores semantic information about a building
duraark-geometricenrichment: tools for extracting information from point cloud scans
duraark-digitalpreservation: service for storing files and additional information into a long-term preservation system
Additionally we have a web application as frontend:
workbench-ui
Each service and the frontend live in their own GitHub repository and have a Dockerfile to manage their individual deployment. Additionally we provide Docker images on Docker Hub for each repository.
With this setup in place let's dive into DCHQ!
Setting Up DURAARK on DCHQ
DCHQ is a deployment automation & life-cycle management platform for container-based applications. For a few months now the DCHQ team has provided a hosted version of their system (in addition to their on-premise version) and you can try it for free.
The use case we want to achieve with DCHQ is to give our stakeholder group the possibility to set up their own copy of the DURAARK system with a "single-click" experience. We are a research company that develops a prototype of the system, but we are not hosting a production-ready system for potential customers (the reasons for that lie in our company setup as a non-profit organization). Therefore we are interested in using a hosted service where stakeholders can preview the DURAARK prototype on their own. Let's see how we can achieve that with DCHQ...
The first step is to navigate to Manage > Template and create a new application template for the DURAARK system with the + sign. The content of the template looks like this:
In the template we declare each backend service and the frontend web application and specify the Docker Hub image they are deployed from. The duraark-sessions service is set up as a data volume container that provides file-based storage to other containers that need to read data from or store data in files. The duraark-geometricenrichment service is configured to allow the execution of Docker containers within the Docker container itself (privileged keyword). The reason is that the components used by this service are provided as Docker containers themselves (they are implemented in C++ and we abstracted the compile process of those tools into a Dockerfile). You can read up on using docker-in-docker here.
For the duraark-* containers we are using the default Docker Compose format to describe how they should be deployed. The workbench-ui container uses a dedicated DCHQ feature - template parameters - to set up the URL that the frontend will use to connect to the backend services. In this case we use a template parameter to find out the IP address of the host the web container is running on (or to be more precise: will be running on after the deployment). This value is assigned to the environment variable DURAARK_API_ENDPOINT, which is read by the workbench-ui container to connect to the correct API endpoint.
The web container is an NGINX reverse proxy that serves the duraark-* backend services at a single base endpoint. For this setup the NGINX configuration needs to know the container IPs of the services, which are only available at runtime after the system is provisioned. We can use a second dedicated DCHQ feature - plugins - in combination with the template parameters to set up the NGINX instance.
Plugins are bash scripts that are executed after a container is provisioned (or at runtime, depending on your needs). In this case we update the NGINX configuration file to point to the respective IPs of the duraark-* containers. This is the relevant NGINX configuration file:
With a plugin we are going to replace the placeholders in the file (workbench-ui, duraark-sessions, etc.) with the actual IP addresses of the matching containers. To do that we first navigate to the Manage > Plugins section and create a new plugin with the + sign.
The Script section contains the logic that replaces the placeholder names in the NGINX configuration file with IP addresses read from environment variables. E.g., the placeholder 'duraark-sessions' gets replaced with the IP address stored in the environment variable $DURAARK_SESSIONS_IP. As a next step we will activate the plugin in the template and solve the mystery of how the environment variables are set to the correct IP addresses.
To activate the plugin, first note down the plugin ID (I took the ID directly from the URL after saving the plugin, as the page did not display the ID), then head back to the template and have a look at line 34. Here the plugins keyword is used, which references the plugin. The arguments section is then responsible for setting the environment variables used within the bash script. Again, template parameters are used to figure out the respective container IP addresses, which get set as environment variables. When the plugin is executed after the web container has started, it has all the information necessary to rewrite the NGINX configuration to match the provisioned system.
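The replacement logic described above can be sketched as a small shell script. This is a minimal sketch, not the actual plugin: the function name replace_placeholder, the sample configuration, the example IP addresses and the variable WORKBENCH_UI_IP are assumptions made for illustration. Only the placeholder names and $DURAARK_SESSIONS_IP are taken from the post.

```shell
#!/bin/sh
# Sketch only: the real DURAARK plugin is not reproduced here.

# Replace every occurrence of a placeholder host name in a config file
# with the given IP address. Returns 1 (and leaves the file untouched)
# if no IP address was supplied.
replace_placeholder() {
  rp_conf="$1"
  rp_name="$2"
  rp_ip="$3"
  if [ -z "$rp_ip" ]; then
    echo "no IP address given for placeholder '$rp_name'" >&2
    return 1
  fi
  # Write to a temporary file first so the edit works with any sed.
  sed "s|${rp_name}|${rp_ip}|g" "$rp_conf" > "${rp_conf}.tmp" &&
    mv "${rp_conf}.tmp" "$rp_conf"
}

# Demo on a throwaway file (made-up config, not the DURAARK one).
demo_conf="$(mktemp)"
cat > "$demo_conf" <<'EOF'
location /sessions/ { proxy_pass http://duraark-sessions/; }
location /          { proxy_pass http://workbench-ui/; }
EOF

# In a real deployment these values would be passed in by the template;
# the fallback addresses here are invented for the demo.
DURAARK_SESSIONS_IP="${DURAARK_SESSIONS_IP:-10.0.0.11}"
WORKBENCH_UI_IP="${WORKBENCH_UI_IP:-10.0.0.12}"

replace_placeholder "$demo_conf" duraark-sessions "$DURAARK_SESSIONS_IP"
replace_placeholder "$demo_conf" workbench-ui "$WORKBENCH_UI_IP"
cat "$demo_conf"
rm -f "$demo_conf"
```

A plain substring replacement like this also rewrites the placeholder if it appears in comments or paths, so the placeholder names should be unique within the file. A real plugin would presumably also have to make NGINX pick up the rewritten file, for example by reloading it with nginx -s reload.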
The template is now set up and ready to be deployed from the Library section:
After a click on Run the system gets deployed:
Conclusion
Our stakeholders commonly don't have the expertise to deploy multi-container web applications on their own. Via the Library section and the configured template the deployment literally boils down to a single click, which works great for our target audience. We are planning to provide a demo version of the DURAARK system that fits into the free plan (5 containers) so that stakeholders can get a free account and evaluate the system. If they are interested in the full version (which exceeds the five-container limit) they can upgrade to the paid plan. It has to be said that DURAARK is an open source system and can easily be hosted on-premise. However, many of our stakeholders prefer a hosted platform where they do not have to take care of administering the system.
Setting up the application template feels very familiar for developers who are experienced with Docker and Docker Compose. The DCHQ-specific extensions like plugins and template parameters are useful and make life easier when setting up post-provisioning tasks like an NGINX reverse proxy configuration. Have a look at the DCHQ blog to get more ideas on how to use the provided extensions. There is much more to see than what is touched on in this post, e.g., setting up multi-host environments for load balancing. The DCHQ team is also very responsive and willing to help if you experience any problems, which is great.
If you are interested in the DURAARK system, feel free to drop me a note and I'll keep you posted on updates to the system (my email is available here). Currently we have a prototype that showcases the developed functionality but still has some rough edges. You can also follow the development on GitHub or on the official project page.