
IoT Platform Design Doc: Distributed OurCI (Part 2)

Take a look at Cesanta's design doc for OurCI — a CI system created alongside the Mongoose IoT Platform. This part covers examples of OurCI in action.


In our last post, we went over the design philosophy and concepts of OurCI, the nickname for the Continuous Integration tool we built at Cesanta as part of the Mongoose IoT Platform. Now, we're going to walk through a few examples to see how it works in action.

This is just an example of how we can spawn build tasks that each build a subset of our code base. Some tasks spawn other tasks. The reason might be grouping (keeping all Mongoose IoT Platform build logic together instead of scattering it across a big master rules file) or sequencing (waiting for build artefacts to be built before spawning a test task).

This example shows how a simple distributed build can be orchestrated with those primitives.

Workflow Example

  1. Someone sends a Mongoose PR.
  2. The OurCI master process, running on ci.cesanta.com, catches the GitHub notification and sends a message containing the "all.sh" command and the name of a Docker container to the "ourci/build/docker" topic. This message is depicted in the leftmost green box in Figure 1.
  3. An agent running on a GCE machine having access to Docker and srcfs is subscribed to the "ourci/build/docker" topic. It runs the command contained in the message invoking the "all.sh" script. The script then sends more messages to the same "ourci/build/docker" topic, thus spawning V7, Mongoose Embedded Web Server, and Mongoose IoT Platform builds. The second column of green boxes depicts this wave of messages.
  4. The respective agents that perform builds for V7, Mongoose Embedded Web Server, and Mongoose IoT Platform will each execute their own shell script within the Docker container specified in the published message that defines the task. The first wave of messages will be executed in our basic build environment (the one that currently executes Makefile.ci).
  5. The Mongoose IoT Platform build script spawns a few child tasks. Figure 1 depicts only the spawn of esp8266 crosstool build environments (the script detail below shows how it also spawns a POSIX build in parallel, although it technically can run in the context of this build task as well).
  6. The esp8266 cross builds will still be executed by the agent running on GCE (because they are sent to the ourci/build/docker topic), but this time they will use the esp SDK Docker image instead of the generic build base image.
  7. When the Mongoose IoT Platform build tasks produce a firmware successfully, the script will upload a FW tarball and a test script to the blobstore and publish another build task request message to the ourci/test/esp-12e topic.
  8. The agents running on shepetovka class machines will then pick up this task, download the script, and run it. The script will download the FW, flash it on the device, and perform further steps that are out of the scope of this document. Briefly: use cadaver to upload a JS test script to the device, reboot it, and wait for the device to report test results by listening on a socket (e.g. with Netcat, or perhaps with the POSIX build of Mongoose IoT Platform itself, downloaded from the blobstore).
  9. The end state is reached when all currently open tasks are closed. The OurCI master can then treat the job as closed and set the GitHub PR status according to whether all tasks have succeeded. OurCI will set the PR status to failed as soon as the first task fails. In our first iteration, there are no plans to abort already spawned tasks. If they fail later, they will still contribute to the still-open build job. However, the PR will immediately reflect the failed status (and notification emails will be sent to the author and possibly the reviewer).
  10. If we left all the garbage in the blobstore, we would accumulate an estimated 7 GB of FW tarballs per architecture per year. The content of the blobstore can be cleaned up later with a cron job rather than immediately, which lets us easily fetch the same images and try them out interactively or use them along with core dumps to get stack traces. I don't see any other reason not to clean the blobstore as soon as the build job is marked as done (either because all tasks have reported completion or because the OurCI master closed it forcibly after a timeout).
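
The end-state rule from step 9 can be sketched as a small shell function. This is only an illustration under assumptions, not OurCI's actual code: the function name `job_status` and the task state words `open`, `ok`, and `failed` are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive the PR status and the job state from task states.
# Each argument is the state of one task: "open", "ok", or "failed"
# (assumed names, not taken from OurCI).
job_status() {
  local state open=0 failed=0
  for state in "$@"; do
    case "${state}" in
      failed) failed=1 ;;
      open)   open=1 ;;
    esac
  done

  # The PR is marked failed as soon as any task has failed,
  # even while other tasks are still running.
  local pr=success
  if [ "${failed}" -eq 1 ]; then
    pr=failure
  elif [ "${open}" -eq 1 ]; then
    pr=pending
  fi

  # The job closes only once no task is open anymore.
  local job=closed
  if [ "${open}" -eq 1 ]; then
    job=open
  fi

  echo "pr=${pr} job=${job}"
}

job_status ok open ok       # prints: pr=pending job=open
job_status ok failed open   # prints: pr=failure job=open
job_status ok ok ok         # prints: pr=success job=closed
```

The second call shows the case described above: the PR already reads as failed while the job stays open until the remaining task reports back.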

Agent A, Running on the CI Machines on GCE:

$ ourci_agent -t ourci/build/docker ourci_docker_agent.sh

The master script being run by the agent on each build is just a trampoline to a script stored in the repo:

$ cat ourci_docker_agent.sh

TMPCLONE=$(mktemp -d)
git clone --reference /data/srcfs/dev -b "${BRANCH}" \
         git@github.com:cesanta/dev.git "${TMPCLONE}"

# remove the temporary clone when the script exits
function cleanup {
   rm -rf "${TMPCLONE}"
}
trap cleanup EXIT

# see current ourci/cmd/ourciweb/tools/ourci
function run() {
   ...
}

run ${SCRIPT} "$@"
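
The trampoline relies on an EXIT trap to remove its temporary clone. A minimal, self-contained sketch of that pattern follows. It leaves git out entirely, and `with_tmpdir` is a hypothetical helper name chosen for this example, not part of OurCI.

```shell
#!/usr/bin/env bash
# Sketch of the mktemp + trap pattern. The body runs in a subshell, so the
# EXIT trap fires when that subshell ends, whether the command passed or failed.
# with_tmpdir is a hypothetical helper, not an OurCI function.
with_tmpdir() {
  (
    tmp=$(mktemp -d) || exit 1
    trap 'rm -rf "${tmp}"' EXIT
    cd "${tmp}" || exit 1
    "$@"
    status=$?
    # Exit explicitly so the command's status is what the caller sees.
    exit "${status}"
  )
}

# Run a command inside a scratch directory that disappears afterwards.
with_tmpdir sh -c 'touch scratch.txt; ls'
```

Because the trap is registered before the command runs, a failing build step still leaves no directory behind.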

The actual build logic is stored in bash scripts, makefiles, whatever. Here's a simple example where the build is defined with a set of build scripts that spawn other build tasks, whose bodies are themselves defined in shell scripts:

$ cat /data/srcfs/dev/ourci/scripts/functions.sh
ESP_SDK=$(cat smartjs/platforms/esp8266/sdk.version)

# publish a build task request to the "-t <topic>"
# encoding the rest of the arguments in the message
# the newly created task will be linked to the current job
# held in the $BUILD_JOB env
function spawn_task() {
   ...
}

OurCI master will spawn the first task and pass "all.sh" as a parameter.

$ cat /data/srcfs/dev/ci/all.sh
. $(dirname $0)/ourci/scripts/functions.sh

spawn_task -t ourci/build/docker ${BASE_SDK} "ci/v7.sh"
spawn_task -t ourci/build/docker ${BASE_SDK} "ci/mongoose.sh"
spawn_task -t ourci/build/docker ${BASE_SDK} "ci/cloud.sh"
spawn_task -t ourci/build/docker ${BASE_SDK} "ci/fw.sh"
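
The body of `spawn_task` is not shown in this document. One possible shape, purely as a sketch under stated assumptions: the message format (topic, job, then one argument per line) and the `OURCI_PUBLISH` variable naming the publishing command are both invented for this example. It defaults to `cat`, so the sketch only prints the message and runs without any broker.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of spawn_task; the real implementation is not shown
# in this document.
# Assumed message format: line 1 = topic, line 2 = job, then one argument
# per line, so arguments containing spaces stay intact.
# Assumed transport: the command named in $OURCI_PUBLISH (default: cat).
spawn_task() {
  local topic="" opt OPTIND=1
  while getopts "t:" opt; do
    case "${opt}" in
      t) topic="${OPTARG}" ;;
      *) echo "usage: spawn_task -t <topic> <args...>" >&2; return 2 ;;
    esac
  done
  shift $((OPTIND - 1))

  if [ -z "${topic}" ]; then
    echo "spawn_task: missing -t <topic>" >&2
    return 2
  fi

  # Link the new task to the current job held in $BUILD_JOB.
  printf '%s\n' "${topic}" "${BUILD_JOB:-unknown}" "$@" | ${OURCI_PUBLISH:-cat}
}
```

With `BUILD_JOB` set in the environment, a call such as `spawn_task -t ourci/build/docker some-image "ci/v7.sh"` would emit a four-line message. The image name here is a placeholder.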

Simple builds can be done by a shell script or even a makefile:

$ cat /data/srcfs/dev/ci/v7.sh
. $(dirname $0)/ourci/scripts/functions.sh

# we can further split the build steps in smaller pieces to be run in parallel
# or do something more clever in future
make -C v7/tests compile
make -C v7 w build_variants -j8
make -C v7/docs
make -C v7 v7.c difftest
make -C v7 run difftest
make -C v7 run CFLAGS_EXTRA=-m32
make -C v7/examples
git checkout v7/tests/ecma_report.txt
if [ "x${CI_BRANCH}" = xmaster ]; then make -C v7 report_footprint; fi

Bigger components, such as Mongoose IoT Platform, can spawn several subtasks, some of which might need to run in a different container: 

$ cat /data/srcfs/dev/ci/fw.sh
. $(dirname $0)/ourci/scripts/functions.sh

spawn_task -t ourci/build/docker ${BASE_SDK} "ci/fw/posix.sh"
spawn_task -t ourci/build/docker ${ESP_SDK} "ci/fw/esp8266.sh"
spawn_task -t ourci/build/docker ${ESP_SDK} "OTA=1 ci/fw/esp8266.sh"

A HW test is spawned after a build is made within the cross toolchain image:

$ cat /data/srcfs/dev/ci/fw/esp8266.sh
. $(dirname $0)/ourci/scripts/functions.sh


make -C fw/platforms/esp8266 -f Makefile.build OTA=${OTA}
tar cvfz fw.tar.gz -C fw/platforms/esp8266 firmware

blob_upload fw.tar.gz ${FW_BLOB}
blob_upload ci/fw/esp8266_test_driver.sh ${SCRIPT_BLOB}

spawn_task -t ourci/test/esp-12e ${FW_BLOB} ${SCRIPT_BLOB}

The actual test driver is stored in the repo and copied into the blobstore so the HW test agent can run it without needing git credentials:

$ cat /data/srcfs/dev/ci/fw/esp8266_test_driver.sh
curl ${BLOB_URL}/${1} >fw.tar.gz
tar ...
esptool.sh ...
sleep ...
cadaver ...
sleep ...
curl ...

Agent B, Running on shepetovka.corp.cesanta.com

The agent on shepetovka (the hostname of an internal test machine) runs a very simple script. The actual per-build script is fetched from the blobstore on each build:

$ ourci_agent -t ourci/test/esp-12e ourci_esp8266_agent.sh
$ cat ourci_esp8266_agent.sh
curl ${BLOB_URL}/${2} >test_script.sh
chmod +x test_script.sh
./test_script.sh ${1}


The price paid for keeping the workflow model dead simple, non-declarative, and without explicit dependencies is that we have to be explicit in our sequencing. If we need to spawn a task after some other task has done something, we have to put the spawn command after that something, which means that the "spawn ESP test" command will be buried in some non-top-level script.
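
As an illustration of this explicit sequencing, "spawn the ESP test only after the firmware was built" comes down to the order of commands in a script. The sketch below uses stand-ins defined only for this example: `build_fw`, the `FAIL_BUILD` switch, the blob names, and the stubbed `spawn_task` are all hypothetical.

```shell
#!/usr/bin/env bash
# Stubs for illustration only; the real build and spawn commands differ.
build_fw()   { [ "${FAIL_BUILD:-0}" -eq 0 ]; }   # succeeds unless FAIL_BUILD=1
spawn_task() { echo "spawned: $*"; }

build_and_test() {
  # The spawn sits after the build, so it runs only if the build succeeded.
  # There is no dependency declaration; the ordering is the dependency.
  build_fw || return 1
  spawn_task -t ourci/test/esp-12e fw-blob script-blob
}

build_and_test
```

Setting `FAIL_BUILD=1` before the call makes the stub build fail, and no test task is spawned.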

If this proves to be too cumbersome, we can improve the way we describe the build.


Published at DZone with permission of Marko Mikulicic, DZone MVB. See the original article here.
