
Back to the Roots: Towards True Continuous Integration (Part 1)


Continuous integration is much more than just tooling. Read up on the true meaning of CI as opposed to the common misconceptions.


In this article, I would like to show what many people believe CI is, what true Continuous Integration is, and what it is not. I will also give some examples to make the difference clearer.

What Is CI?

CI (continuous integration) is a software development practice in which a continuous integration server polls a version control repository, builds an artifact, and validates the artifact with a set of defined tests. It is a common practice for most enterprises and individuals… and this is not the true definition of Continuous Integration. Sorry for the joke.

What Is True Continuous Integration?

True Continuous Integration is not simply some kind of “Jenkins/Travis/Go/TeamCity” server that polls the project's Git repository, compiles it, and runs a bunch of tests against the artifact. In fact, this is the least interesting part of CI, which is not a technology (like Jenkins) but an agile practice first proposed by Grady Booch and adopted and prescribed by the Extreme Programming methodology.

As an analogy with another Extreme Programming technique: TDD is not about unit testing (although it uses unit tests). It is about obtaining feedback as soon as possible to speed up development cycles, and unit tests are simply the concrete means of getting that feedback.

With CI, software is built several times a day (ideally, every few hours): every time a developer integrates code into the mainline, which should be often. The goal is to avoid “integration hell” (merging code from different lines of development at the end of a development iteration). CI avoids it by integrating code as soon as possible, which forces team members to see what other developers are doing and to make shared team decisions about new code.

The methodology states that every team member integrates into the mainline as often as possible. Every contribution to the VCS (Version Control System) is potentially a release, so no contribution should break functionality, and every contribution should pass all known tests.

A CI server builds an artifact from the latest sources of the mainline and runs all known tests against it. If anything fails, the CI server warns every member of the team that the build is broken (RED). The team's top priority is to keep the build in its default state (GREEN).

What Is Not CI?

Once we realize that CI is far more than the simple use of a CI server, we can state that:

  • Working with feature branches while a CI server checks master is not CI.
  • Working with pull requests is not CI.

It’s important to note that I’m not judging whether these are good or bad practices; feature branches and pull requests are simply methodologies different from CI.

Both feature branches and pull requests mandate that work be done in a branch other than master (the one monitored by the CI server). This leads to longer cycles before the work can be merged into master.

Feature branches and pull requests rely heavily on team resource and task planning to keep a refactor in one task (branch) from affecting development in another task (branch), and so minimize the dreaded “integration hell.”

An example of integration hell: we have the following code, two classes that wrap the REST calls to an external API:

APIUsersAccessor
class APIUsersAccessor
{
    const USERS_API_PATH = "/users";

    /**
     * @var string
     */
    private $host;

    /**
     * @var string
     */
    private $username;

    /**
     * @var string
     */
    private $password;

    public function __construct(string $host, string $username, string $password)
    {
        $this->host = $host;
        $this->username = $username;
        $this->password = $password;
    }

    public function getAllUsers(): array
    {
        $data = array(
            "email" => $this->username,
            "password" => $this->password
        );
        $headers = array(
            "Content-Type" => "application/json;charset=UTF-8"
        );
        $request = \Requests::GET($this->host.self::USERS_API_PATH, $headers, json_encode($data));
        return json_decode($request->body);
    }
}

APIProductsAccessor
class APIProductsAccessor
{
    const PRODUCTS_API_PATH = "/products";

    /**
     * @var string
     */
    private $host;

    /**
     * @var string
     */
    private $username;

    /**
     * @var string
     */
    private $password;

    public function __construct(string $host, string $username, string $password)
    {
        $this->host = $host;
        $this->username = $username;
        $this->password = $password;
    }

    public function getAllProducts(): array
    {
        $data = array(
            "email" => $this->username,
            "password" => $this->password
        );
        $headers = array(
            "Content-Type" => "application/json;charset=UTF-8"
        );
        $request = \Requests::GET($this->host.self::PRODUCTS_API_PATH, $headers, json_encode($data));
        return json_decode($request->body);
    }
}

As you can see, both blocks of code are very similar (classic code duplication). Now we start two features on two development branches. The first must add a telephone number to the request sent to the Users API; the second must create a new accessor to query all cars available at a store. This is the Users API code after adding the telephone number:

APIUsersAccessor (with telephone)
class APIUsersAccessor
{
    // ....
    public function __construct(string $host, string $username, string $password, string $telephone)
    {
        // .......
        $this->telephone = $telephone;
    }

    public function getAllUsers(): array
    {
        $data = array(
            "email" => $this->username,
            "password" => $this->password,
            "tel" => $this->telephone
        );
        // .....
    }
}

The developer has declared the missing field and added it to the request. The developer of branch 1 expects the merge with master to be a small diff touching only those lines:

[Figure: the diff developer 1 expects when merging branch 1 into master]

The problem is that developer 1 does not know that developer 2 has done a refactor to reduce code duplication, because the new Cars API accessor was too similar to the Users and Products ones. The code in developer 2's branch looks like this:

BaseAPIAccessor
abstract class BaseAPIAccessor
{
    /**
     * @var string
     */
    private $apiPath;

    /**
     * @var string
     */
    private $host;

    /**
     * @var string
     */
    private $username;

    /**
     * @var string
     */
    private $password;

    protected function __construct(string $host, string $apiPath, string $username, string $password)
    {
        $this->host = $host;
        $this->username = $username;
        $this->password = $password;
        $this->apiPath = $apiPath;
    }

    protected function doGetRequest(): array
    {
        $data = array(
            "email" => $this->username,
            "password" => $this->password
        );
        $headers = array(
            "Content-Type" => "application/json;charset=UTF-8"
        );
        $request = \Requests::GET($this->host.$this->apiPath, $headers, json_encode($data));
        return json_decode($request->body);
    }
}

concrete APIs
class ApiCarsAccessor extends BaseAPIAccessor
{
    public function __construct(string $host, string $username, string $password)
    {
        parent::__construct($host, "/cars", $username, $password);
    }

    public function getAllCars(): array
    {
        return $this->doGetRequest();
    }
}

class APIUsersAccessor extends BaseAPIAccessor
{
    public function __construct(string $host, string $username, string $password)
    {
        parent::__construct($host, "/users", $username, $password);
    }

    public function getAllUsers(): array
    {
        return $this->doGetRequest();
    }
}

class APIProductsAccessor extends BaseAPIAccessor
{
    public function __construct(string $host, string $username, string $password)
    {
        parent::__construct($host, "/products", $username, $password);
    }

    public function getAllProducts(): array
    {
        return $this->doGetRequest();
    }
}

So the real merge will be:

[Figure: the actual diff when branch 1 is merged after developer 2's refactor]

Basically, we will have a big conflict at the end of the development cycle when we merge branch 1 and branch 2 into the mainline. We will have to do a lot of code review, which becomes an archaeological process of revisiting every past decision in the development phase to work out how to merge the code. In this concrete case, adding the telephone number will also require some kind of rewrite.

Some will argue that developer 2 should not have refactored, because the plan stated that he only had to develop the Cars API, and stated clearly that there should be no collision with the Users API. Well, yes… but for this kind of extreme planning to work, all resources must be planned well, and we need a lot of architectural meetings involving developer 1 and developer 2.

In these architectural meetings, developer 1 and developer 2 should have realized that there was code duplication, and they would have had to decide whether to intervene and replan, or to do nothing and increase technical debt, moving the refactoring decision to future iterations. This may not sound too agile, right? The point is that it is difficult to mix agile and non-agile practices.

Feature branches and pull requests work best with a fully planned iterative process, not with agile continuous integration. Again, I’m not stating that feature branches and pull requests are good or bad tools; I’m simply stating that they are non-agile practices.

Agile is all about communication and continuous improvement, and about getting feedback as soon as possible. In the agile approach, developer 1 is aware of developer 2's refactoring from the beginning, and can start a dialog with developer 2 to check whether the proposed abstraction is the right one and whether it accommodates the addition of a telephone number.
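As a rough illustration of what that dialog could produce, here is a minimal sketch of a shared base class that leaves room for per-API fields such as the telephone number. It is not the refactor from the branches above: the class names (BasePayloadBuilder, UsersPayloadBuilder, CarsPayloadBuilder) and the extraFields() hook are hypothetical, and the HTTP call is left out so the sketch runs without any library.

```php
<?php
// Hypothetical sketch: the base class builds the shared payload and lets
// each subclass contribute extra fields. No HTTP request is made here.
abstract class BasePayloadBuilder
{
    /** @var string */
    private $username;

    /** @var string */
    private $password;

    public function __construct(string $username, string $password)
    {
        $this->username = $username;
        $this->password = $password;
    }

    /** Extra request fields; subclasses override this when they need more. */
    protected function extraFields(): array
    {
        return [];
    }

    /** Payload shared by every accessor, merged with the subclass extras. */
    public function buildPayload(): array
    {
        return array_merge(
            ["email" => $this->username, "password" => $this->password],
            $this->extraFields()
        );
    }
}

// The Users API needs the telephone number, so it overrides the hook.
final class UsersPayloadBuilder extends BasePayloadBuilder
{
    /** @var string */
    private $telephone;

    public function __construct(string $username, string $password, string $telephone)
    {
        parent::__construct($username, $password);
        $this->telephone = $telephone;
    }

    protected function extraFields(): array
    {
        return ["tel" => $this->telephone];
    }
}

// The Cars API has no extra fields and inherits the default behaviour.
final class CarsPayloadBuilder extends BasePayloadBuilder
{
}

$users = new UsersPayloadBuilder("dev@example.com", "secret", "555-0100");
$cars = new CarsPayloadBuilder("dev@example.com", "secret");

echo json_encode($users->buildPayload()), PHP_EOL;
echo json_encode($cars->buildPayload()), PHP_EOL;
```

If both developers integrate into the mainline early, a hook like this can be agreed on while the refactor is still small, instead of being discovered during a late merge.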

OK….but wait! I need a feature branch! What if not all features are deliverable at the end of an iteration?

Feature branches are a solution to a problem - what to do if not all code is deliverable at the end of an iteration - but they are not the only solution.

CI has another solution to this problem: “feature toggles.” Feature branches isolate the work-in-progress feature from the final product via a branch (the WIP lives in a separate copy of the code), whereas feature toggles isolate the feature from the rest of the code using… code!

The simplest feature toggle one can write is the dreaded if-then-else; it is the example you will find on most sites when you google “feature toggle.” It is not the only way to implement one: as in any other area of software engineering, you can replace the conditional logic with polymorphism.
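To make the polymorphic variant concrete, here is a minimal sketch that does not depend on any framework. All names in it (EndpointSet, ProductionEndpoints, DevelopmentEndpoints, endpointsFor, and the route paths) are hypothetical. The environment check is still made once, in a single factory function, and the rest of the code only talks to the interface.

```php
<?php
// Hypothetical sketch of a feature toggle expressed through polymorphism.
interface EndpointSet
{
    /** @return string[] route paths that should be registered */
    public function routes(): array;
}

// What production exposes today.
final class ProductionEndpoints implements EndpointSet
{
    public function routes(): array
    {
        return ["/original"];
    }
}

// Everything production has, plus the work-in-progress endpoint.
final class DevelopmentEndpoints implements EndpointSet
{
    /** @var EndpointSet */
    private $base;

    public function __construct(EndpointSet $base)
    {
        $this->base = $base;
    }

    public function routes(): array
    {
        return array_merge($this->base->routes(), ["/new"]);
    }
}

// The only place where the environment name is inspected.
function endpointsFor(string $environment): EndpointSet
{
    $production = new ProductionEndpoints();

    return $environment === "development"
        ? new DevelopmentEndpoints($production)
        : $production;
}

foreach (endpointsFor("development")->routes() as $route) {
    echo $route, PHP_EOL;
}
```

Removing the toggle later means deleting DevelopmentEndpoints and moving its route into ProductionEndpoints; no conditional is scattered through the calling code.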

In the following Slim example, we are creating a new REST endpoint in the current iteration that we do not yet want available in production. We start from this code:

code prior to the toggling
<?php
require '../vendor/autoload.php';

use resources\OriginalEndpoint;

$config = [
    'settings' => [
        'displayErrorDetails' => true,
        'logger' => [
            'name' => "dexeus",
            'level' => Monolog\Logger::DEBUG,
            'path'  => 'php://stderr',
        ],
    ],
];
$app = new \Slim\App($config);
$c = $app->getContainer();
$c['logger'] = function ($c) {
    $settings = $c->get('settings');
    $logger = LoggerFactory::getInstance($settings['logger']['name'], $settings['logger']['level']);
    $logger->pushHandler(new Monolog\Handler\StreamHandler($settings['logger']['path'], $settings['logger']['level']));
    return $logger;
};
$app->group("", function () use ($app) {
    // we are registering the endpoint in Slim
    OriginalEndpoint::get()->add($app);
});

We can define the feature toggle with a simple if clause:

if clause feature toggle
<?php ....
$app->group("", function () use ($app) {
    OriginalEndpoint::get()->add($app);
    if (getenv("APP_ENV") === "development") {
        // we register the new endpoint only if the environment is set to
        // development (developers' machines should have the APP_ENV
        // environment variable set to development)
        NewEndpoint::get()->add($app);
    }
});

We can refine our code to better express what we’re doing and to support several environments (perhaps for A/B testing?):

configuration map feature toggle
<?php
......
$productionEnvironment = function ($app) {
    OriginalEndpoint::get()->add($app);
};
// each closure imports $productionEnvironment so it can call it
$aEnvironment = function ($app) use ($productionEnvironment) {
    $productionEnvironment($app);
    NewEndpointA::get()->add($app);
};
$bEnvironment = function ($app) use ($productionEnvironment) {
    $productionEnvironment($app);
    NewEndpointB::get()->add($app);
};
$develEnvironment = function ($app) use ($productionEnvironment) {
    $productionEnvironment($app);
    NewEndpointInEarlyDevelopment::get()->add($app);
};
$configurationMap = [
    "production" => $productionEnvironment,
    "testA" => $aEnvironment,
    "testB" => $bEnvironment,
    "development" => $develEnvironment
];
$app->group("", function () use ($app, $configurationMap) {
    $configurationMap[getenv("APP_ENV")]($app);
});
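
One detail the map lookup does not cover is what happens when APP_ENV is unset or misspelled. The sketch below is a toy version that runs without Slim and falls back to production in that case. The resolveEnvironment() helper, the route strings, and the fallback policy itself are assumptions for illustration, not part of the original example.

```php
<?php
// Toy stand-ins for the environment closures: each one receives the list
// of registered routes and returns it with its own routes appended.
$production = function (array $routes): array {
    $routes[] = "/original";
    return $routes;
};
$development = function (array $routes) use ($production): array {
    $routes = $production($routes);
    $routes[] = "/new-in-development";
    return $routes;
};

$configurationMap = [
    "production" => $production,
    "development" => $development,
];

/**
 * Hypothetical helper: picks the closure for the requested environment and
 * falls back to production when the name is missing or unknown.
 * getenv() returns false for an unset variable, hence the is_string() check.
 */
function resolveEnvironment(array $map, $requested): callable
{
    if (is_string($requested) && isset($map[$requested])) {
        return $map[$requested];
    }

    return $map["production"];
}

$register = resolveEnvironment($configurationMap, getenv("APP_ENV"));
print_r($register([]));
```

Falling back to production keeps unfinished features switched off by default, so a misconfigured machine fails in the safe direction.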

The advantages of this technique are consistent with the main goal of CI (constant feedback about code integration, validation, and collisions with other developments). The code in progress is developed and deployed into production, so we get constant feedback about how the new feature integrates with the rest of the code, which reduces the risk of enabling the feature once it is finished.

It is a good practice to remove this kind of toggle from code once a new feature has been stabilized in order to avoid adding complexity to the codebase.

We have arrived at the end of this first part of true Continuous Integration. We have rediscovered that continuous integration is “not only” using a CI server but adopting a practice with perseverance and discipline. In the second part, we will talk about how to model a good CI flow. 



Published at DZone with permission of
