DevOps and CI/CD Resources

The Latest DevOps and CI/CD Topics

Learning Spring-Cloud - Writing a Microservice

Continuing my Spring-Cloud learning journey, earlier I had covered how to write the infrastructure components of a typical Spring-Cloud and Netflix OSS based micro-services environment - in this specific instance two critical components, Eureka to register and discover services and Spring Cloud Configuration to maintain a centralized repository of configuration for a service. Here I will be showing how I developed two dummy micro-services, one a simple "pong" service and a "ping" service which uses the "pong" service. Sample-Pong microservice The endpoint handling the "ping" requests is a typical Spring MVC based endpoint: @RestController public class PongController { @Value("${reply.message}") private String message; @RequestMapping(value = "/message", method = RequestMethod.POST) public Resource pongMessage(@RequestBody Message input) { return new Resource<>( new MessageAcknowledgement(input.getId(), input.getPayload(), message)); } } It gets a message and responds with an acknowledgement. Here the service utilizes the Configuration server in sourcing the "reply.message" property. So how does the "pong" service find the configuration server, there are potentially two ways - directly by specifying the location of the configuration server, or by finding the Configuration server via Eureka. I am used to an approach where Eureka is considered a source of truth, so in this spirit I am using Eureka to find the Configuration server. Spring Cloud makes this entire flow very simple, all it requires is a "bootstrap.yml" property file with entries along these lines: --- spring: application: name: sample-pong cloud: config: discovery: enabled: true serviceId: SAMPLE-CONFIG eureka: instance: nonSecurePort: ${server.port:8082} client: serviceUrl: defaultZone: http://${eureka.host:localhost}:${eureka.port:8761}/eureka/ The location of Eureka is specified through the "eureka.client.serviceUrl" property and the "spring.cloud.config.discovery.enabled" is set to "true" to specify that the configuration server is discovered via the specified Eureka server. Just a note, this means that the Eureka and the Configuration server have to be completely up before trying to bring up the actual services, they are the pre-requisites and the underlying assumption is that the Infrastructure components are available at the application boot time. The Configuration server has the properties for the "sample-pong" service, this can be validated by using the Config-servers endpoint - http://localhost:8888/sample-pong/default, 8888 is the port where I had specified for the server endpoint, and should respond with a content along these lines: "name": "sample-pong", "profiles": [ "default" ], "label": "master", "propertySources": [ { "name": "classpath:/config/sample-pong.yml", "source": { "reply.message": "Pong" } } ] } As can be seen the "reply.message" property from this central configuration server will be used by the pong service as the acknowledgement message Now to set up this endpoint as a service, all that is required is a Spring-boot based entry point along these lines: @SpringBootApplication @EnableDiscoveryClient public class PongApplication { public static void main(String[] args) { SpringApplication.run(PongApplication.class, args); } } and that completes the code for the "pong" service. Sample-ping micro-service So now onto a consumer of the "pong" micro-service, very imaginatively named the "ping" micro-service. Spring-Cloud and Netflix OSS offer a lot of options to invoke endpoints on Eureka registered services, to summarize the options that I had: 1. Use raw Eureka DiscoveryClient to find the instances hosting a service and make calls using Spring's RestTemplate. 2. Use Ribbon, a client side load balancing solution which can use Eureka to find service instances 3. Use Feign, which provides a declarative way to invoke a service call. It internally uses Ribbon. I went with Feign. All that is required is an interface which shows the contract to invoke the service: package org.bk.consumer.feign; import org.bk.consumer.domain.Message; import org.bk.consumer.domain.MessageAcknowledgement; import org.springframework.cloud.netflix.feign.FeignClient; import org.springframework.http.MediaType; import org.springframework.web.bind.annotation.RequestBody; import org.springframework.web.bind.annotation.RequestMapping; import org.springframework.web.bind.annotation.RequestMethod; import org.springframework.web.bind.annotation.ResponseBody; @FeignClient("samplepong") public interface PongClient { @RequestMapping(method = RequestMethod.POST, value = "/message", produces = MediaType.APPLICATION_JSON_VALUE, consumes = MediaType.APPLICATION_JSON_VALUE) @ResponseBody MessageAcknowledgement sendMessage(@RequestBody Message message); } The annotation @FeignClient("samplepong") internally points to a Ribbon "named" client called "samplepong". This means that there has to be an entry in the property files for this named client, in my case I have these entries in my application.yml file: samplepong: ribbon: DeploymentContextBasedVipAddresses: sample-pong NIWSServerListClassName: com.netflix.niws.loadbalancer.DiscoveryEnabledNIWSServerList ReadTimeout: 5000 MaxAutoRetries: 2 The most important entry here is the "samplepong.ribbon.DeploymentContextBasedVipAddresses" which points to the "pong" services Eureka registration address using which the service instance will be discovered by Ribbon. The rest of the application is a routine Spring Boot application. I have exposed this service call behind Hystrix which guards against service call failures and essentially wraps around this FeignClient: package org.bk.consumer.service; import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand; import org.bk.consumer.domain.Message; import org.bk.consumer.domain.MessageAcknowledgement; import org.bk.consumer.feign.PongClient; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.beans.factory.annotation.Qualifier; import org.springframework.stereotype.Service; @Service("hystrixPongClient") public class HystrixWrappedPongClient implements PongClient { @Autowired @Qualifier("pongClient") private PongClient feignPongClient; @Override @HystrixCommand(fallbackMethod = "fallBackCall") public MessageAcknowledgement sendMessage(Message message) { return this.feignPongClient.sendMessage(message); } public MessageAcknowledgement fallBackCall(Message message) { MessageAcknowledgement fallback = new MessageAcknowledgement(message.getId(), message.getPayload(), "FAILED SERVICE CALL! - FALLING BACK"); return fallback; } } Boot"ing up I have dockerized my entire set-up, so the simplest way to start up the set of applications is to first build the docker images for all of the artifacts this way: mvn clean package docker:build -DskipTests and bring all of them up using the following command, the assumption being that both docker and docker-compose are available locally: docker-compose up Assuming everything comes up cleanly, Eureka should show all the registered services, at http://dockerhost:8761 url - The UI of the ping application should be available at http://dockerhost:8080 url - Additionally a Hystrix dashboard should be available to monitor the requests to the "pong" app at this url http://dockerhost:8989/hystrix/monitor?stream=http%3A%2F%2Fsampleping%3A8080%2Fhystrix.stream: References 1. The code is available at my github location - https://github.com/bijukunjummen/spring-cloud-ping-pong-sample 2. Most of the code is heavily borrowed from the spring-cloud-samples repository - https://github.com/spring-cloud-samples

July 1, 2015

by Biju Kunjummen

· 13,690 Views · 4 Likes

DevOps Leadership Series: Gov Does DevOps

This past week, I had the opportunity to catch up with some more industry thought leaders at the DevOpsDays DC event in our nation’s capital. This was the first major DevOps Days event to feature a large audience of government participants. It was an awesome event and is certainly going to be on my must-attend list for next year. First off for the series, I had the chance to chat with Nathen Harvey, the Community Director at Chef. Nathen also did a great job leading the organizing committee for DevOps Days DC. In this episode of the DevOps Leadership Series, Nathen illustrates some recurrent topics he noticed at DevOpsDays DC. Nathen ensures us that government is ready for DevOps, enterprises are ready for DevOps, and small businesses and web innovators are ready for DevOps. Then he highlights how we need to create high velocity organizations that are safe, scalable, and humane for our teams. Next, I had the pleasure of catching up with Greg Elin, the Executive Director at GovReady and former Chief Data Officer at the FCC. What really stood out to Greg at the event was how many people already have their DevOps “sea legs.” He notes the number of different organizations that are comfortable with DevOps, and are familiar with its language and concepts. Finally, I caught up with Dave Johnson, Senior Technical Director at Ansible. Dave shares that “the market really is not dominated yet by any particular player in this (DevOps) space.” Dave explains when he first entered the industry he thought the IT world was dominated by open source and commercial technologies. However, when talking to the industry at large now, he found most the enterprises aren’t yet doing anything yet for automation, and most enterprises are still tackling their problems with humans. Next up in the series, you’ll hear from John Willis (Co-Author of the upcoming DevOps Cookbook) and Leon Fayer (Vice President at OmniTI). If you have missed any of the other videos from this series, you can find them here. (We’re up to 15 so far).

July 1, 2015

by Derek Weeks

· 13,472 Views

Gene Kim Explains ‘Why DevOps Matters’

Ever wonder why DevOps gets so much attention these days? The answer is simple: “DevOps solves the most important business problem of our generation, [which is] how organizations make the transition from good to great.” That’s according to Gene Kim, co-author of The Phoenix Project, founder of Tripwire, and a DevOps advocate. Gene headlined a New Relic DevOps roadshow with stops in Chicago, Dallas, and Houston last month, regaling attendees with the inside scoop of what DevOps really is, what it does, and how to make it work (more on that in upcoming blog posts). But perhaps his most important point was the overwhelming importance of the effort. Traditional IT leads to “hopelessness and despair” According to Gene, the opportunity cost of wasted IT spending is some $2.6 trillion. These days, he says, “every company is an IT company”—we like to say “every company is a software company,” but you get the message. Gene observes that 95% of all capital projects have an IT component and 50% of all capital spending is technology related. And every IT organization is pressured to simultaneously respond more quickly to urgent business needs while also providing stable, secure, and predictable IT service. That chronic conflict created what Gene described as “a horrible downward spiral that leads to horrendous outcomes. Every time we cut corners, or manually deploy code, or write code that doesn’t have automated testing, it all leads to the accumulation of technical debt.” And the ever-increasing amount of technical debt sets the stage for intertribal warfare that can exist between dev and ops. Those wars mean that “Devs submit code at 5 p.m. on Friday, and ops then works all weekend to deploy it by 9 a.m. Monday. Everyone becomes buried in unplanned work, and this deprives our ability to pay down the technical debt being created. This led to hopelessness and despair, with everyone doomed to repeat the same mistakes.” DevOps offers a better way Fortunately, Gene explained, “We know now there is a better way. The DevOps exemplars have shown us that we can have incredibly fast flow from dev to ops to deployment while preserving world-class quality and security.” According to Gene, the top predictors of IT performance are all associated with DevOps: Version control of all production artifacts Continuous integration and deployment Automated acceptance testing Peer-review of production changes (vs. external change approval) High-trust culture Proactive monitoring of the production environment Win-win relationship between dev and ops Lead time is the key metric Lead time from raw material to finished product is the key metric in manufacturing, “and that’s true for software, too,” Gene said. “How long does it take to go from code committed to code successfully running in production?” The standard 9-month software lead time common in waterfall development projects is “highly correlated with catastrophic deployment errors,” Gene warned. The key, he said, is to have smaller deployments, and to do them more frequently. That approach is already working for high-performing organizations, he added, who are accelerating away from the herd. “Ten deploys a day used to be startling,” Gene noted. “Now it’s probably considered merely average among high performers.” Amazon Web Services deploys every 11.6 seconds! That kind of speed is possible only by doing small deployments more frequently, Gene said. “The bigger the change, the bigger the crater when it hits.” DevOps correlates with business success! IT high-performers who incorporate DevOps are much more agile and more reliable, Gene said. Critically, he added, “They are more likely to win in the marketplace!” The common reaction to that statement is shock. Gene noted he often hears: “That’s absurd! How can IT ops practices be visible on the bottom line or in the stock price?” But the Puppet Labs 2014 State of DevOps report noted that IT high-performers are twice as likely to exceed profitability, market share, and productivity goals as well as enjoy 50% higher market capitalization growth over three years. Of course, that doesn’t mean all those good things will happen to your company just by moving to DevOps. But do you really want to risk the “horrendous outcomes” of staying with outmoded models that lead to excruciatingly long deployment cycles?

July 1, 2015

by Fredric Paul

· 1,929 Views

DevOps Tools for Continuous Delivery: Workloads Distribution and Jenkins Installation

the vast majority of software development companies have to place a great emphasis on the process of continuous integration and rapid delivery of new versions of their product. obviously, when supplying enterprise-level projects, such processes need to be automated as much as possible. and this is when the cloud devops tools come in handy. thus, in today’s article we’d like to pay a special attention to the devops tools that automate the continuous integration and delivery within the jelastic paas that can be installed on any bare metal or cloud infrastructure as virtual private cloud or hybrid cloud. this is a pretty complex example of enterprise application life cycle with continuous integration and seamless migration throughout devops pipeline from development to several productions (you can use simplified process if you have less complex project ). the instruction below will be useful for jelastic cluster administrators such as systems integrators, hosting service providers, enterprises, and isv customers, who can easily implement it at their jelastic cloud installations. nevertheless, this guide contains plenty of features and continuous integration tips described, which can be interesting for different developers. so, let’s get started with the first part of the instruction! setting up dedicated user groups first of all, you need to allocate separate hardware sets for all your project teams (one per each development phase, i.e. development > testing > production ) and adjust the access permissions to make them completely isolated and not influenced by others. the multi-regions for a hybrid cloud option, that became available within the recently released jelastic 3.3 version , is optimally suited for this task. to start with, create three hardware node groups (within one region) and name them after the corresponding stages for more convenience (e.g. dev , test , production ). the next step is to prepare three user groups and attach them to the corresponding hardware – in our case the dev group has access to the dev hardware node group only, qa – to the test one, and ops should work specifically with the production set. in such a way, users from the appropriate groups can use the specified sets of hardware only, but at the same time – they have a possibility to transfer their environments throughout the whole platform, between different teams’ accounts. jenkins continuous integration server configuration now we need the integration tool, that will control and perform all of the required operations automatically, i.e. build the cloud devops pipeline. our choice fell on jenkins as one of the most popular solutions used for this goal – it can be easily installed from our marketplace either at the corresponding site page or directly via the dashboard . as a result, you’ll get the pure jenkins installed, which should be properly adjusted before you start organizing your application life cycle. thus, select the open in browser button and proceed with the following configurations steps: while at the home page, click on the manage jenkins option at the left-hand menu and select the manage plugins link within the appeared list. after you’ve been redirected to the plugin manager, switch to the available tab, find the following plugins using the search filter field above and tick them for installation: git plugin – is required for building our project’s source (stored at the github repository) envfile plugin – is used for storing system environment variables (its necessity is driven by security restrictions, implemented at jelastic, which forbid the direct exporting of environment variables from the tomcat server) click install without restart when ready. during the installation process, tick the restart jenkins when installation is complete and no jobs are running option to automatically restart jenkins for enabling the chosen plugins. then, you also need to install maven, which will be used for building the project. for that, navigate to the manage jenkins > configure system menu, scroll down to the maven section and click add maven. within the expanded section, type the desired name for your maven installation (e.g. maven ) and save the changes using the same-named button at the bottom of the page. in such a way, this tool will be also automatically installed when required (i.e. during the first app build). now your jenkins server is well-staffed for the further work. add deployment process scripts to the jenkins container the next step is to upload the scripts that you are going to use for automating different organizational actions, required to be applied to your application at the intermediate development life cycle phases (like deploying, placing it to the appropriate hardware according to the stage, running auto-test, etc). the easiest way to do this is to access your jenkins container via the jelastic ssh-gateway. in the case you haven’t performed similar operations before, you need to: generate an ssh keypair add your public ssh key to the dashboard access your account via ssh protocol once inside, create a new folder for your project (we’ll use demo ) and move in there: mkdir /opt/tomcat/demo cd /opt/tomcat/demo this location can be used for storing your scripts, variables, logs etc. here, you can upload the required scripts using the command of the following type: curl -fssl {link_to_script} -o {file_name} we also provide the set of script examples, which can be used as templates for your own ones: install.sh – gets a user session and creates a new environment via the jelastic api according to the specified manifest file. it also defines, that the name of this environment will be equal to its creation date and time (as a unique name is required for every script execution, but you won’t be able to set it manually as this operation would be run automatically). however, you can set your own dynamic name pattern to be used here transfer.sh – changes the environment ownership based on the jelastic environment transferring feature migrate.sh – physically moves an environment to another hardware set (hardnode group) note: that before the appliance, each of the script templates, presented above, have to be additionally adjusted to make them work properly within a particular jelastic installation. thus, the list of parameters below should be obligatory substituted according to your platform’s settings: /path/to/scripts/ – the full path to your scripts folder (created in the previous step) {cloud_domain} – your jelastic platform domain name {jca_dashboard_appid} – your dashboard id, that could be seen within the platform.dashboard_appid parameter at the jca > about section {jca_appstore_appid} – appstore id, listed within the same section (at the platform.appstore_appid parameter) {url_to_manifest} – link to the manifest file created according to our documentation (you may also use this one as an example – it sets up two tomcat application servers with the nginx load-balancer in front of them) note: above you can see one more runtest.sh script uploaded – it simulates the testing activities for demonstration purposes, thus we don’t provide its code in this tutorial. if required, create your own one according the specifics of your application and upload it alongside the rest of the scripts. in addition, you need to create a separate file for storing the variable with environment name (as it needs to be dynamically changed each time a new environment is created): echo env_name= > /opt/tomcat/demo/variables these are the main steps of preparation to achieve automatic continuous integration and delivery of your web application with a help of jenkins within jelastic cloud platform. in the second part of these blog series, we’ll configure the set of jobs at the jenkins server, which represents the core of our automation. each of them will be devoted for a particular operation, required to be run at the corresponding application life cycle phase: create environment > build and deploy > dev tests > migrate to qa > qa tests > migrate to production stay tuned to see the next steps. if you still don’t have jelastic installation, contact us to get access to our free demo for cloud platform evaluation or just start with trial registration at one of our hosting partners .

June 30, 2015

by Tetiana Markova

· 3,164 Views · 1 Like

Integrating SonarQube with Nexus Lifecycle

Many development organizations we work with have turned to SonarQube as a dashboard to visualize and measure their code quality. Customers using Nexus Lifecycle (formerly CLM) want to surface known security vulnerabilities and license risk in the same place developers or executives already go to assess the overall quality of their application. To support this growing interest from our customers, we have introduced Nexus Lifecycle integration with SonarQube. Figure 1. SonarQube widget example highlights open source policy violations that require attention. Drill down reports with with detailed analysis are accessible directly from this widget. This integration will allow you to access summary-level Nexus Lifecycle information for your applications, as well as link to Nexus Lifecycle Application Composition Reportsdirectly from your SonarQube projects. Figure 2. Nexus Lifecycle Application Composition Reports offer detailed analysis of license and security issues down to the individual components and risks. If you are already using SonarQube, you know first hand the impact that principles such as the 7 Axes of Code Quality can have on the applications and projects your teams create. Paralleling this, as a user of Nexus Lifecycle you also know how using good components is a critical and essential part of developing quality applications. Nexus Lifecycle for SonarQube brings both of these together. THE SOFTWARE: For Nexus Lifecycle users needing access to the 1.11 release, it can be found on our KnowledgeBase here. THE INTEGRATION: For Nexus Lifecycle users looking for more information on the SonarQube integration, you can quickly get up-and-running with our online guideshere. LEARN MORE: What to learn more about SonarQube? Here is an informativearticle I found from Nadeem Mohammad. Finally, if you are looking for information on how Nexus Lifecycle integrates into your complete development environment, here are some links that you might find helpful: Integration with continuous integration servers (e.g., Hudson/Jenkins), Integration integrates with IDEs (e.g., Eclipse) Integration integrates with repository management (e.g., Nexus) Integration integrates with build managers (e.g., Maven)

June 30, 2015

by Brian Fox

· 4,108 Views · 1 Like

Wrangling the Different Docker APIs

[This article was written by Alex Harford.] Docker APIs are a convenient way for your systems to talk to Docker infrastructure. But sometimes there are challenges associated with them. I've outlined in this blog the steps you need to take and the items you need to look out for when working with Docker APIs. Initial Docker Setup Ensure you have the latest Docker client installed. It should be v1.6 or newer. [alexh:~/work] docker pull ubuntu latest: Pulling from ubuntu 428b411c28f0: Pull complete 435050075b3f: Pull complete 9fd3c8c9af32: Pull complete 6d4946999d4f: Already exists ubuntu:latest: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security. Digest: sha256:45e42b43f2ff4850dcf52960ee89c21cda79ec657302d36faaaa07d880215dd9 Status: Downloaded newer image for ubuntu:latest [alexh:~/work] docker run -ti ubuntu /bin/bash root@1092e8ca2ead:/# ps PID TTY TIME CMD 1 ? 00:00:00 bash 14 ? 00:00:00 ps root@1092e8ca2ead:/# exit exit Daemons, Registries, Hubs The Docker registry is used to host docker images for download. In the most simple case, it can be a process serving static images. This would be a read-only registry supporting GET operations only. If you need something more complex, you need to use a Docker registry web service. You can [a target="_blank" href="http://www.activestate.com/blog/2014/01/deploying-your-own-private-docker-registry"]run your own private Docker registry or use the public official Docker Hub. The Docker Hub contains a Docker registry, but also includes other features, like user authentication. In our examples, we will run an unauthenticated Docker registry. Setup If you are using standard Docker images, most people will pull from the Docker Hub, which is a publically accessible Docker registry. However, a more complicated service may be talking to private Docker registries running different versions of the API. Let’s assemble a test environment with both versions of the docker registry API so we can see the different ways you can access it. First, pull down two versions of the docker registry from the Docker Hub: docker pull registry:0.9.1 0.9.1: Pulling from registry e9e06b06e14c: Pull complete a82efea989f9: Pull complete 37bea4ee0c81: Pull complete 07f8e8c5e660: Pull complete 1f4ab7282e19: Pull complete 3c27027cdae8: Pull complete 7e0e5314436e: Pull complete 2696504d3685: Pull complete 012772dbb1c6: Pull complete e24d9fce1d00: Pull complete fd2726a79da8: Pull complete bffc32d7113a: Pull complete 0cd49aa0e23c: Pull complete 4e698fa80441: Already exists registry:0.9.1: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security. Digest: sha256:98937757728eecbd72c9276bf711260aa29896f15217ce05be0562287e73232d Status: Downloaded newer image for registry:0.9.1 [alexh:~/work] docker pull registry:2.0.1 2.0.1: Pulling from registry 39bb80489af7: Pull complete df2a0347c9d0: Pull complete 7a3871ba15f8: Pull complete a2703ed272d7: Pull complete 68769176e114: Pull complete ab2ab59d7d1b: Pull complete 882ecee9f360: Pull complete 40de65f8e79f: Pull complete 0c4f9c7d798f: Pull complete ca29675fe853: Pull complete 89d10e9463e5: Pull complete 1a5aa415e484: Pull complete 3ea7a9e93b04: Pull complete 769d811a57fd: Pull complete ae8a4a3af1aa: Pull complete 85cc9a791bb5: Pull complete 9cd2c8646022: Pull complete 048c32c549b9: Pull complete cbbbda28c189: Pull complete 2602c005e534: Pull complete 136beb445cfa: Pull complete 0c5e5ef1d7da: Already exists registry:2.0.1: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security. Digest: sha256:0cd177d687589aff586aa2c66c64d1c25657b8d09cff9e1492192f496e7786c3 Status: Downloaded newer image for registry:2.0.1 The next step is to start them. We will start the v1 registry on port 5000, and the v2 registry on port 6000. The v1 registry occasionally fails when starting due to a lock file race condition, so tell it to restart if necessary. [alexh:~/work] docker run -p 5000:5000 -d --restart=on-failure:3 registry:0.9.1 896c651b9bfa9780b14e3710d20428baab8497c30b9bc89946b192e1d1c145aa [alexh:~/work] docker run -p 6000:5000 -d registry:2.0.1 e09d4204921c732879ee9b7544cd40a25275e0d1f1702cacd954412cfd586ffb [alexh:~/work] docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES e09d4204921c registry:2.0.1 "registry cmd/regist 4 seconds ago Up 3 seconds 0.0.0.0:6000->5000/tcp silly_albattani 896c651b9bfa registry:0.9.1 "docker-registry" 35 seconds ago Up 34 seconds 0.0.0.0:5000->5000/tcp jovial_leakey Understanding Docker Namespaces Docker has a concept of namespaces for its repositories which can be confusing. [a target="_blank" href="https://docs.docker.com/docker-hub/official_repos/"]Official Repositories can be referred to without a username prefix: CentOS Ubuntu Internally these are prefixed by library/. This means that command like docker pull ubuntu:15.10 and docker pull library/ubuntu:15.10 are equivalent. If the name includes a '/' character (samalba/docker-registry), the left side refers to the username, and the right side refers to the image name in their public repository. It gets more complex when accessing private registries. The format becomes HOST:PORT/[USERNAME/]IMAGE. However, you should note that there is no authentication performed at this layer of our docker registry environment: anyone can push, pull, or delete from any 'user'. If the USERNAME is omitted, it is internally treated as being an 'official' image, and prefixed with library/. docker pull 127.0.0.1:5000/library/test-ubuntu Pulling repository 127.0.0.1:5000/library/test-ubuntu FATA[0004] Error: image library/test-ubuntu:latest not found [alexh:~/work] docker tag 0fe5a10d2cf8 127.0.0.1:5000/test-ubuntu [alexh:~/work] docker push 127.0.0.1:5000/test-ubuntu The push refers to a repository [127.0.0.1:5000/test-ubuntu] (len: 1) Sending image list Pushing repository 127.0.0.1:5000/test-ubuntu (1 tags) Image 5c1d0c04c3b8 already pushed, skipping Image 8c63e4ac9a5f already pushed, skipping Image 5fc05c0feaea already pushed, skipping Image 0fe5a10d2cf8 already pushed, skipping Pushing tag for rev [0fe5a10d2cf8] on {http://127.0.0.1:5000/v1/repositories/test-ubuntu/tags/latest} [alexh:~/work] docker pull 127.0.0.1:5000/library/test-ubuntu Pulling repository 127.0.0.1:5000/library/test-ubuntu 0fe5a10d2cf8: Download complete 5c1d0c04c3b8: Download complete 8c63e4ac9a5f: Download complete 5fc05c0feaea: Download complete Status: Image is up to date for 127.0.0.1:5000/library/test-ubuntu:latest In the v2 Docker registry, the [a target="_blank" href="https://docs.docker.com/registry/spec/api/#overview"]URI scheme has changed to allow the repository name to be broken up into multiple components. However, the Docker client does not yet support this flexibility. In the future, you should be able to extend the namespace of your registries, ie `redhat/centos/beta or redhat/fedora/stable. Populating the Registries We'll use Ubuntu 15.10 as our example image: docker pull ubuntu:15.10 15.10: Pulling from ubuntu 5c1d0c04c3b8: Pull complete 8c63e4ac9a5f: Pull complete 5fc05c0feaea: Pull complete 0fe5a10d2cf8: Already exists ubuntu:15.10: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security. Digest: sha256:d569b6ebfc62f35f9792392724bd4a74a4f5f5af10ccbc1880974ae2f0660898 Status: Downloaded newer image for ubuntu:15.10 It needs to be tagged with the new URL in order to push it to the private registries: [alexh:~/work] docker tag ubuntu:15.10 127.0.0.1:5000/ubuntu:15.10 [alexh:~/work] docker tag ubuntu:15.10 127.0.0.1:6000/ubuntu:15.10 [alexh:~/work] docker push 127.0.0.1:5000/ubuntu:15.10 The push refers to a repository [127.0.0.1:5000/ubuntu] (len: 1) Sending image list Pushing repository 127.0.0.1:5000/ubuntu (1 tags) 5c1d0c04c3b8: Image successfully pushed 8c63e4ac9a5f: Image successfully pushed 5fc05c0feaea: Image successfully pushed 0fe5a10d2cf8: Image successfully pushed Pushing tag for rev [0fe5a10d2cf8] on {http://127.0.0.1:5000/v1/repositories/ubuntu/tags/15.10} [alexh:~/work] docker push 127.0.0.1:6000/ubuntu:15.10 The push refers to a repository [127.0.0.1:6000/ubuntu] (len: 1) 0fe5a10d2cf8: Image already exists 5fc05c0feaea: Image successfully pushed 8c63e4ac9a5f: Image successfully pushed 5c1d0c04c3b8: Image successfully pushed Digest: sha256:1f93077ce8f2fa1da8aae87735f395eae93a1c21928d3e2d130717c9aeff177d Note that the output between the v1 registry (on port 5000) and v2 (port 6000) are slightly different, but the result is the same: the Ubuntu image is now available on each registry. Docker Registry APIs At this point, we're able to compare the different APIs. In April 2015, Docker [a target="_blank" href="http://docs.docker.com/v1.6/release-notes/"]released version 1.6 and this included v2 of the Registry. Your software should be aware of the different versions of the Docker Registry API to handle these differences. Let's look at what it takes to download the image layers through the various APIs in order to make an offline cache. First, we'll prepare our environment: [alexh:~/work] export image=ubuntu [alexh:~/work] export tag=15.10 v1 The v1 private registry can be examined at this point: [alexh:~/work] curl -s http://127.0.0.1:5000/v1/repositories/library/$image/tags/$tag | python -m json.tool "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547" export v1_image_id=`curl -s http://127.0.0.1:5000/v1/repositories/library/$image/tags/$tag | sed 's/"//g'` [alexh:~/work] curl -s http://127.0.0.1:5000/v1/images/$v1_image_id/ancestry | python -m json.tool [ "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547", "5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1", "8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824", "5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73" ] [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547/layer > 0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1/layer > 5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824/layer > 8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73/layer > 5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73.tar.gz v1 on Docker Hub The Docker Hub currently implements the v1 API, but requires an authentication token for certain operations. It also allows multiple endpoints to be returned by the server. We'll take the simple approach of always using the first endpoint: [alexh:~/work] export endpoint=`curl -sSL -o /dev/null -D- "https://index.docker.io/v1/repositories/$image/images" | awk '/X-Docker-Endpoints/{print $2}' | tr -d '\r' | sed 's/,//'` [alexh:~/work] echo $endpoint registry-1.docker.io [alexh:~/work] export token=`curl -sSL -o /dev/null -D- -H 'X-Docker-Token: true' "https://index.docker.io/v1/repositories/$image/images" | tr -d '\r' | awk '/X-Docker-Token/{print $2}'` The token needs to be used for authentication for the rest of the commands, but otherwise they are the same as the v1 private registry: [alexh:~/work] export v1_image_id=`curl -s -H "Authorization: Token $token" https://$endpoint/v1/repositories/library/$image/tags/$tag | sed 's/"//g'` [alexh:~/work] curl -sSL -H "Authorization: Token $token" "https://registry-1.docker.io/v1/images/$v1_image_id/ancestry" | python -m json.tool [ "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547", "5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1", "8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824", "5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73" ] [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547/layer > 0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547.tar.gz [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1/layer > 5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1.tar.gz [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824/layer > 8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824.tar.gz [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73/layer > 5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73.tar.gz v2 API The v2 API works with manifest files that include checksums. It's also slightly simpler. A manifest file for a tag contains all of the layer information, rather than requiring an image ID to be looked up for a tag, and then the ancestry for that image to be looked up. [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/manifests/$tag | python -c 'import sys, json, pprint; pprint.pprint(json.load(sys.stdin)["fsLayers"])' [{u'blobSum': u'sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4'}, {u'blobSum': u'sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4'}, {u'blobSum': u'sha256:d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d'}, {u'blobSum': u'sha256:23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca'}, {u'blobSum': u'sha256:7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107'}] [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4 > a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d > d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca > 23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107 > 7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107.tar.gz We can get the checksum for these files to verify that they are what is described in the manifest file: [alexh:~/work] sha256sum *.tar.gz a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4 a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d.tar.gz 23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca 23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca.tar.gz 7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107 7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107.tar.gz The Remote (daemon) API Another API that is available is the Docker daemon running locally. It can be accessed over a Unix socket, or over TCP if the daemon is configured to allow it. [alexh:~/work] echo -e "GET /images/json HTTP/1.0\r\n" | nc -U /var/run/docker.sock | tail -n +6 | python -m json.tool [ { "Created": 1433116930, "Id": "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547", "Labels": {}, "ParentId": "5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1", "RepoDigests": [], "RepoTags": [ "127.0.0.1:6000/ubuntu:15.10", "ubuntu:15.10", "127.0.0.1:5000/ubuntu:15.10" ], "Size": 0, "VirtualSize": 132392276 }, { "Created": 1432704049, "Id": "0c5e5ef1d7dac23c7164ea48faafc79f0c921f6cf87d2d8ea7469832ea31e4ca", "Labels": {}, "ParentId": "136beb445cfa7f48dbe4e36a80a83d4b7945682827fd8bfb1510ac17b6a200c0", "RepoDigests": [], "RepoTags": [ "registry:2.0.1" ], "Size": 0, "VirtualSize": 548626543 }, { "Created": 1432703977, "Id": "4e698fa804417b34b334793bab8a143403be9384e0651067b0c3933fe8d90eb2", "Labels": {}, "ParentId": "0cd49aa0e23cfe176cbea4bf622d552a6f16b21965cf52d633f8c9e27438f52c", "RepoDigests": [], "RepoTags": [ "registry:0.9.1" ], "Size": 0, "VirtualSize": 413940033 } ] A tarball containing all of the layers for a tag can be generated: [alexh:~/work] echo -e "GET /images/get?names=$image:$tag HTTP/1.0\r\n" | nc -U /var/run/docker.sock | tail -n +5 > $image-$tag.tar [alexh:~/work] mkdir tmp [alexh:~/work] tar -C tmp -xf ubuntu-15.10.tar [alexh:~/work] ls -l tmp total 20 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824 -rw-r--r-- 1 alexh alexh 87 Jun 2 15:33 repositories Conclusions Docker is a great technology and there are a lot of improvements and new features coming out at a rapid pace. Fortunately it's well documented and discussions about bugs are in the open on GitHub. However, there are still some edge cases to be aware of when talking to the Docker APIs. With some good design choices, your applications can be made backwards and forwards compatible, and will be able to use a wide range of Docker client versions and remote APIs.

June 30, 2015

by Kathy Thomas

· 1,985 Views · 2 Likes

Instant Enterprise REST Accelerates the Software Driven Business

Software Driven Business is a consensus goal. But real challenges exist: the time, cost and complexity of building such apps is substantial. Business Agility – and strategic business advantage – is lost. We need another revolution – Instant Enterprise REST – that provides Business Agility using business-level specifications rather than low-level code, and delivers Enterprise-class scalability, integration, enforcement and extensibility. It’s now a reality with Instant Enterprise REST. Software Driven Business: Consensus Vision Businesses have seen the value in providing mobile and tablet apps that bring the business into the hands of customers and employees. They provide information at their finger tips – wherever they are. Industry Leaders like CA have pioneered the vision of a Software Driven Business. They argue persuasively that strategic business advantage lies in Time to Market and Time to Decision: “reveal the need for speed in the application economy. As companies transform into software-driven enterprises, bringing high-quality applications to market faster becomes one of the most critical differentiators.” The Business Agility Gap While there is consensus around this vision, there is a substantial gap in realizing the Software Driven Business. It centers around Agility – time to market. As CA argues, this drives strategic business advantage. This problem manifests both to Business Users and IT, although differently. You might have been party to a discussion like this: Business Users are frustrated about how long it takes to create systems, and revise them. They see problems that look nearly as simple as a spreadsheet take weeks… to months. How can it months for IT to build a system that takes days on a spreadsheet? IT is no less frustrated. They understand the deep technology it takes to build Enterprise-class systems: We’re working 90 hours a week. And falling behind. Gap Analysis For apps about critical corporate data, there’s general consensus that the time and cost for such systems are about evenly split between backends and front ends. And there’s nearly universal consensus that, independent of the UI technology, that RESTful APIs deliver the backend data. But the backend is far more than basic data access. A “SQL Pass-through” – simply restifying SQL data – does not meet Enterprise-class requirements to scale, integrate and enforce: Scale – APIs require Pagination to address large result sets, Nested Documents to reduce latency, Optimistic Locking to ensure concurrency. These are not provided in a simple SQL Pass-through – you must program them, by hand. Integrate – a wizard can produce an API from schema objects, but it cannot address multiple databases, or integrate non-SQL data sources such as ERP, other RESTful services, or NoSQL. Enforce – an API needs to enforce our security (down to the row level), and the integrity of the data. These are significant tasks, which are sadly often placed in client buttons where they cannot be shared. Providing these Enterprise class services takes significant time, expertise and expense. Business Agility is reduced. IT is essentially being forced to cover inadequate technology infrastructure. The Business Users are right: if the Business Specification is clear, then that ought to be enough: A clear business specification should be sufficient. Everything else is just friction. The vision of the Software Driven Business requires Business Driven Software that pre-supplies the infrastructure. We are not seeking 10 or 15%. We are looking for orders of magnitude. Our vision must be: We should be able to create RESTful APIs (mainly) from business specifications, not low level code. It should be no more difficult to create a system than it is toimagine it. Business-Driven Software: Instant Enterprise REST Business Driven Software is more than just a clever play on words. It’s a real implementation that delivers this vision, and we call it Instant Enterprise REST. It consists of 3 core technologies: Enterprise Pattern Automation – creates APIs that with Enterprise-class scalability built-in (pagination, nested documents, optimistic locking, etc) Declarative – specify your API, integration and enforcement policies with spreadsheet-like rules in a simple point-and-click UI Extensibility – enables the RESTful APIs to invoke your existing logic, inside or outside the JVM, via standard server-side JavaScript. The combination of these 3 technologies enables you to create RESTful APIs for database backends – half your system – 10 times faster. Let’s briefly examine them below. Technology 1: Enterprise Pattern Automation There are well known patterns in the data domain, describing data structure and access via SQL. There are also well-known patterns for managing SQL data in the context of RESTful services. Well known patterns can be automated. Let’s imagine a service (say, a server accessed via a browser) that automates these patterns, as described below, just by connecting the service to a database: Schema Discovery – tables, views, stored procedures: The system creates a complete (default) API for each schema object. Note this includes Stored Procedures, which often represent a significant investment. Enterprise Pattern Automation: the resultant API provides well-known services for Filter, Sort, Pagination, Optimistic Locking, handling Generated Keys and so forth. So, the service has provided a default Enterprise-class API, instantly. So, literally seconds into your project, you can test your running API: Not enough, not done, but a great start. Technology 2: Declarative Declarative is the key (“what, not how”). It has had striking impacts on domains where there are well-understood underlying patterns. Max Tardiveau has put it well: Whatever can be declarative, will be declarative. For example, spreadsheets are declarative – and they gave birth to the PC industry. And SQL is declarative – itself an industry. Two game-changers. So, the challenge is to apply the spirit of declarative to REST integration and enforcement. The stakes are high – success can deliver breathtaking agility. Declarative Integration: Multi-Database Custom API, Point and Click Enterprise Pattern Automation provides a good start, but the API is not rich. It is a flat, single-table API, really just “restified” SQL. What we really need is Nested Documents – returning multiple types (e.g., an Order, a list of Items, and a list of contact names) in a single call can reduce latency (vs. a separate call for each type). REST is perfect for this. Multi-database APIs – a RESTful server provides the opportunity to integrate multiple databases in single call, shielding clients from underlying complexity. Nested Documents are easy: define them by simply selecting tables (via a User Interface or Command Line). Foreign Keys are used to default the joins. Add the ability to choose / alias columns, and we’re on the way to a pretty good API. But what about databases that have no Foreign Keys? Or multi-database APIs? Leveraging the schema does not mean we are limited to it. All we need to do is: Provide a means to define “Virtual” Foreign Keys for the service (i.e., stored outside the schema) Extend this to Foreign Keys between databases We now have a rich, multi-database API. Defined declaratively as shown below, no code required, running in minutes, ready for client development: Declarative Enforcement: Integrity Logic, with spreadsheet-like rules So now consider enforcement, specifically database integrity. A very significant portion of any project is the multi-table validations and computations that define how the data is processed. “Your code goes here” means, well, a lot of code. We need a more powerful, more declarative, paradigm. In a spreadsheet, you assign expressions to cells. Whenever the referenced data is changed, the cell is updated. Since the cells references can chain, a series of simple expressions can solve remarkably complex problems. What if we did the same for database data? We could assign derivation expressions to columns, and validation expressions to tables. Then, the API could “watch” for requests that change the referenced column, and recompute (efficiently) the calculated column. Just as in a spreadsheet, support for chaining and proper ordering is required and implicit. To address multi-table logic, such expressions would need to address references to related tables. It’s only at this point that the logic becomes seriously powerful. Let’s take an example. To check credit in a Customer / Purchaseorder / Lineitem application, we could define spreadsheet-like expressions such as: There is actually a sub-branch of declarative that addresses this: Reactive Programming. Here it’s declarative,since you don’t need to code a Observer handler. The result is that the logic above can be fully executable. No need to code Change Detection / Change Dependency – it’s invoked and enforced automatically by the API in reaction to RESTful updates. SQL handling is also implicit, including underlying optimizations (caching, pruning etc). The impact is massive – the 5 expressions above express the same logic as hundreds of lines of code. That’s a massive 40X more concise. Game changer. And quality goes up, since the rules are applied automatically. Declarative Enforcement: Security, filter expressions for role/table We can provide an analogous approach to security: define filter expressions for roles (like SalesRep), so that when a table is accessed by the role, the API adds the filter. That way, a user with that role sees only the rows for which they are authorized. Technology 3: Standards-based Extensibility Declarative is great, but you’re probably thinking “ok, but you can’t solve every problem declaratively”. And you’re dead right. Business Value requires that we integrate a declarative approach with a procedural one that is familiar, standards-based, and enables us to integrate existing software. Automatic JavaScript Object Model The first phase of many projects is to build an ORM for natural programmatic access to data: JPA, Hibernate, Entity Framework. It’s not a small project, and cumbersome to maintain as changes occur. In fact, the Object Model can be created directly from the schema. So, you’d have an object type for Purchaseorder, for Lineitem, and so forth. The model provides access to attributes and related data, and persistence services. You could then use it as shown below. JavaScript seems like the best language choice: reasonable across technology bases (everybody uses JavaScript), and its dynamic nature eliminates code generation hassles. JavaScript Events In addition to accessors and persistence, the JavaScript objects are Logic Aware. That is, the save operation above executes any rules associated with OrderAudit (e.g., updated-by), and JavaScript Events. Here is a sample event for the PurchaseOrder object, where you access the JavaScript Object Model via the system-supplied row variable: Extensible Logic Auditing is a common pattern. It should be possible to solve this once in a genericmanner, then re-use it (e.g, to audit employees, orders and so forth). So, Instant Enterprise REST should enable you to provide Extensible Logic – load your own JavaScript code, and invoke it. So, the code above could become: MyLibrary.auditFromTo(orderRow,"OrderAudit"); where auditFromTo creates an instance of OrderAudit, sets the foreign key, sets like-named attributes, and saves it. Pluggable Authentication Most organizations have existing data stores that identify users and their roles, such as Active Directory, LDAP, OAuth, etc. Security should integrate with such systems as a function of enforcing row/column access. Standard deployment Finally, the system should deploy in a familiar manner: available on the cloud, or an on-premise virtual appliance or war file. Standards also enable integration with related critical infrastructure, such as API Management, ERP Systems, etc. See a project in 3 minutes To see how it all fits together, you can view this video to see a full project built: from concept, through initial implementation, and an iteration cycle. Actual project time was about half an hour. Instant Enterprise REST: Business Agility Instant Enterprise REST enables us to close the Agility Gap in realizing the Software Driven Business vision. We can now create important portions of our software in largely business terms, rather than technical terms. This offers major advantages: Time to Market: spreadsheet-like rules are 40X more concise. Instant REST eliminates all the SQL / REST / JSON boilerplate. Simplicity: team members can learn the basics of Espresso in days, and be as productive as rocket scientists using alternative technologies Leverage Expertise and Software: Espresso is built on standards like REST, JavaScript, and Event Oriented Programming. You can call out to existing software, and extend the rule types by identifying your own patterns and loading their implementations into Espresso. Quality: at the defect level, automatic invocation and ordering eliminate large classes of bugs. At the architectural level, centralized enforcement factors logic out of the client buttons where it can be shared, audited for compliances, etc

June 30, 2015

by Val Huber

CORE

· 1,406 Views

Sync issues with your codes on GitHub

It’s no surprise that many if not all programmers use GitHub today to store their codes, but it can be frustrating to keep everyone up to date with the code changes. Recently, GitHub has been integrated with Quire, a tree-structured task management tool that lets programmers to easily keep track of code changes. By linking GitHub commits to the so-called tasks (issues), users can refer to these tasks when they look at code changes, and also trace back to the codes when they look at the tasks. In a blog article, Quire goes into a bit more detail about their new integration and what exactly users can do and benefit from it. Check out the details at the link below. Hello GitHub, We’re Quire | Quire Blog

June 30, 2015

by Crystal Chen

· 869 Views

Bharath Gowda Focuses on Business Success with DevOps—In CIOReview

“Companies report incredible benefits by adopting a DevOps approach: faster delivery of software, fewer defects, faster resolution of problems, and better allocation of limited IT resources. If your organization has already adopted a DevOps approach, you’re on the right track to developing and deploying software faster and better than ever. “But that’s not the end of the story nor should it be the ultimate driver behind DevOps. To really succeed with a DevOps approach, you need to think about what faster and better software actually means for the business and the all-important bottom line. You must make the connection between your DevOps approach and goals and their impact on your business’ goals. To do this, you must link and balance the goals of faster (speed of delivery), better software (high performing, quality software that delivers a good customer experience) to goals for innovation and business success.” Bharath Gowda in CIOReview’s DevOps Special Edition So begins New Relic director of market leadership Bharath Gowda’s recent article inCIOReview’s new DevOps Special Edition, which features insights from a number of DevOps thought leaders. In an article titled Keep the Focus on the Business to Succeed with DevOps, Bharath makes the point that faster and better software development is only the beginning for DevOps. To extract the most value from the DevOps approach, he says, “you need to think about what faster and better software actually means for the business and the all-important bottom line.” The problem, Bharath warns, is that “working better, smarter, and faster isn’t quite enough to drive business success.” Application performance and customer experience are necessary but not sufficient factors to ensure business success, as measured by revenue, customer adoption, retention, customer satisfaction, and so on. DevOps packs its biggest punch when used to enable rapid innovation driven by business outcomes. DevOps drivers and metrics With that in mind, Bharath names four key drivers for DevOps implementations: Speed Innovation/business success Customer experience Application performance In the CIOReview article, Bharath offers a helpful framework for measuring each driver, along with suggested resource allocations for each one. He notes that unlike traditional waterfall development processes, DevOps’ iterative, data-driven approach allows for the ongoing insights needed to be a better software business. That’s critical, because these days, every business is a software business. Read Bharath’s whole article here!

June 29, 2015

by Fredric Paul

· 1,877 Views

Level Up Your Automated Tests

I presented a new talk at GOTO Chicago 2015 about how to change a team’s attitude towards writing automated tests. The talk covers the same case study as Groovy vs Java for Testing, adopting Spock in MongoDB, but this is a more process/agile/people perspective, not a technical look at the merits of one language over another. Slides available below. As always, the slides are not super-useful out of context, but they do contain my conclusions (also note that due to a technology fail, my hand-drawn style is even more hand-drawn than usual). Questions I sadly did not have a lot of time for questions during the presentation, but thanks to the wonders of modern technology, I have a list of unanswered questions which I will attempt to address here. Is testing to find out your system works? Or is it so you know when your system is broken? Excellent question. I would expect that if you have a system that’s in production (which is probably the large majority of the projects we work on), we can assume the system is working, for some definition of working. Automated testing is particularly good at catching when your system stops doing the things you thought it was doing when you wrote the tests (which may, or may not, mean the system is genuinely “broken”). Regression testing is to find out when your system is no longer doing what you expect, and automated tests are really good for this. But testing can also make sure you implement code that behaves the way you expect, especially if you write the tests first. Automated tests can be used to determine that your code is complete, according to some pre-agreed specification (in this case, the automated tests you wrote up front). So I guess what I’m trying to say is, when you first write the tests you have tests that, when they pass, proves the system works (assumingyour tests are testing the right things and/or not giving you false positives). Subsequent passes show that you haven’t broken anything. At what level do “tests documenting code” actually become useful? And who is/should the documentation be targeted to? In the presentation, my case study is the MongoDB Java Driver. Our users were Java programmers, who were going to be coding using our driver. So in this example, it makes a lot of sense to document the code using a language that our users understood. We started with Java, and ended up using Groovy because it was also understandable for our users and a bit more succinct. On a previous project we had different types of tests. The unit and system tests documented what the expected behaviour was at the class or module level, and was aimed at developers in the team. The acceptance tests were written in Java, but in a friendly DSL-style way. These were usually written by a triad of tester, business analyst and developer, and documented to all these guys and girls what the top-level behaviour should be. Our audience here was fairly technical though, so there was no need to go to the extent of trying to write English-language-style tests, they were readable enough for a reasonably techy (but non-programmer) audience. These were not designed to be read by “the business” - us developers might use them to answer questions about the behaviour of the system, but they didn’t document it in a way that just anyone could understand. These are two different approaches for two different-sized team/organisations, with different users. So I guess in summary the answer is “it depends”. But at the very least, developers on your own team should be able to read your tests and understand what the expected behaviour of the code is. How do you become a team champion? I.e. get authority and acceptance that people listen to you? In my case, it was just by accident - I happened to care about the tests being green and also being useful, so I moaned at people until it happened. But it’s not just about nagging, you get more buy-in if other people see you doing the right things the right way, and it’s not too painful for them to follow your example. There are going to be things that you care about that you’ll never get other people to care about, and this will be different from team to team. You have two choices here - if you care that much, and it bothers you that much, you have to do it yourself (often on your own time, especially if your boss doesn’t buy into it). Or, you have to let it go - when it comes to quality, there are so many things you could care about that it might be more beneficial to drop one cause and pick another that you can get people to care about. For example, I wanted us to use assertThat instead of assertFalse (or true, or equals, or whatever). I tried to demo the advantages (as I saw them) of my approach to the team, and tried to push this in code reviews, but in the end the other developers weren’t sold on the benefits, and from my point of view the benefits weren’t big enough to force the issue. Those of us who cared, used assertThat. For the rest, I was just happy people were writing and maintaining tests. So, pick your battles. You’ll be surprised at how many people do get on board with things. I thought implementing checkstyle and setting draconian formatting standards was going to be a tough battle, but in the end people were just happy to have any standards, especially when they were enforced by the build. Do you report test, style, coverage, etc failures separately? Why? We didn’t fail on coverage. Enforcing a coverage percentage is a really good way to end up with crappy tests, like for getters/setters and constructors (by the way, if there’s enough logic in your constructor that it needs a test, You’re Doing It Wrong). Generally different types of failures are found by different tools, so for this reason alone they will be reported separately - for example, checkstyle will fail the build if it doesn’t conform to our style standards, codenarc fails it for Groovy style failures, and Gradle will run the tests in a different task to these two. What’s actually important, though, is time-to-failure. For checkstyle, for example, it will fail on something silly like curly braces in the wrong place. You want this to fail within seconds, so you can fix the silly mistake quickly. Ideally you’d have IntelliJ (perhaps) run your checks before it even makes it into your CI environment. Compiler errors should, of course, fail things before you run a test, short-running tests should fail before long-running tests. Basically, the easier it is to fix the problem, the sooner you want to know, I guess. Our build was relatively small and not too complex, so actually we ran all our types of tests (integration and unit, both Groovy and Java) in a single task, because this turned out to be much quicker in Gradle (in our case) than splitting things up into a simple pipeline. You might have a reason to report stuff separately, but for me it’s much more important to understand how fast I need to be aware of a particular type of failure. Sometimes I find myself modifying code design and architecture to enable testing. How can I avoid damaging design? This is a great question, and a common one too. The short answer is: in general writing code that’s easier to test leads to a cleaner design anyway (for example, dependency injection at that appropriate places). If you find you need to rip your design apart to test it, there’s a smell there somewhere - either your design isn’t following SOLID principals, or you’re trying to test the wrong things. Of course, the common example here is testing private methods - how do you test these without exposing secrets? I think for me, if it’s important enough to be tested it’s important enough to be exposed in some way - it might belong in some sort of util or helper (right now I’m not going to go into whether utils or helpers are, in themselves a smell), in a smaller class that only provides this sort of functionality, or simply a protected method. Or, if you’re testing with Groovy, you can access private methods anyway so this becomes a moot point (i.e. your testing framework may be limiting you). In another story from LMAX, we found we had created methods just for testing. It seemed a bit wrong to have these methods only available for testing, but later on down the line, we needed access to many of these methods In Real Life (well, from our Admin app), so our testing had “found” a missing feature. When we came to implement it, it was pretty easy as we’d already done most of it for testing. My co-workers often point to a lack of end-to-end testing as the reason why a lot of bugs get out to production even though they don’t have much unit tests nor integration tests. What, in your experience, is a good balance between unit tests, integration tests and end-to-end testing? Hmm, sounds to me like “lack of tests” is your problem! This is a big (and contentious!) topic. Martin Fowler has written about it, Google wrote something I completely disagree with (so I’m not even going to link to it, but you’ll find references in the links in this paragraph), and my ex-colleague Adrian talks about what we, at LMAX, meant by end-to-end tests. I hope that’s enough to get you started, there’s plenty more out there too. How did you go about getting buy in from the team to use Spock? I cover this in my other presentation on the topic - the short version is, I did a week-long spike to investigate whether Spock would make testing easier for us, showed the pros and cons to the whole team, and then led by example writing tests that (I thought) were more readable than what we had before and, probably most importantly, much easier to write than what we were previously doing. I basically got buy-in by showing how much easier it was for us to use the tool than even JUnit (which we were all familiar with). It did help that we were already using Gradle, so we already had a development dependency on Groovy. It also helped that adding Spock made no changes to the dependencies of the final Jar, which was very important. Over time, further buy-in (certainly from management) came when the new tests started catching more errors - usually regressions in our code or regressions in the server’s overnight builds. I don’t think it was Spock specifically that caught more problems - I think it was writing more tests, and better tests, that caught the issues. Can we do data driven style tests in frameworks like junit or cucumber? I don’t think you can in JUnit (although maybe there’s something out there). I believe someone told me you can do it in TestNG. Are there drawbacks to having tests that only run in ci? I.e I have Java 8 on my machine, but the test requires Java 7 Yes, definitely - the drawback is Time. You have to commit your code to a branch that is being checked by CI and wait for CI to finish before you find the error. In practice, we found very little that was different between Java 7 and 8, for example, but this is a valid concern (otherwise you wouldn’t be testing a complex matrix of dependencies at all). In our case, our Java 6 driver used Netty for async capabilities, as the stuff we were using from Java 7 wasn’t available. This was clearly a different code path that wasn’t tested by us locally as we were all running Java 8. Probably more importantly for us is we were testing against at least 3 different major versions of the server, which all supported different features (and had different APIs). I would often find I’d broken the tests for version 2.2 as I’d only been running it on 2.6, and had forgotten to either turn off the new tests for the old server versions, or didn’t realise the new functionality wouldn’t work there. So the main drawback is time - it takes a lot longer to find out about these errors. There are a few ways to get around this: Commit often!! And to a branch that’s actually going to be run by CI Make your build as fast as possible, so you get failures fast (you should be doing this anyway) You could set up virtual machines locally or somewhere cloudy to run these configurations before committing, but that sounds kinda painful (and to my mind defeats a lot of the point of CI). I set up Travis on my fork of the project, so I could have that running a different version of Java and MongoDB when I committed to my own fork - I’d be able to see some errors before they made it into the “real” project. If you can, you probably want these specific tests run first so they can fail fast. E.g. if you’re running a Java 6 & MongoDB 2.2 configuration on CI, run those tests that only work in that environment first. Would probably need some Gradle magic, and/or might need you to separate these into a different set of folders. The advantage of this approach though is if you set up some aliases on your local machine you could sanity check just these special cases before checking in. For example, I had aliases to start MongoDB versions/configurations from a single command, and to set JAVA_HOME to whichever version I wanted. Do you have any tips for unit tests that pass on dev machines but not on Jenkins because it’s not as powerful as our own machines? E.g. Synchronous calls timeout on the Jenkins builds intermittently. Erk! Yes, not uncommon. No, not really. We had our timeouts set longer than I would have liked to prevent these sorts of errors, and they still intermittently failed. You can also set some sort of retry on the test, and get your build system to re-run those that fail to see if they pass later. It’s kinda nasty though. At LMAX they were able to take testing seriously enough to really invest in their testing architecture, and, of course, this is The Correct Answer. Just often very difficult to sell. If you ask where are tests and dev asks if code is correct? And you say yes. Then dev asks why you’re delaying shipping value, how do you manage that? These are my opinions: Your code is not complete without tests that show me it’s complete. Your code might do what you think it’s supposed to do right now, but given Shared Code Ownership, anyone can come in and change it at any time, you want tests in place to make sure they don’t change it to break what you thought it did The tests are not so much to show it works right now, the tests are to show it continues to work in future Having automated tests will speed you up in future. You can refactor more safely, you can fix bugs and know almost immediately if you broke something, you can read from the test what the author of the code thought the code should do, getting you up to speed faster. You don’t know you’re shipping value without tests - you’re only shipping code (to be honest, you never know if you’re shipping value until much later on when you also analyse if people are even using the feature). Testing almost never slows you down in the long run. Show me the bits of your code base which are poorly tested, and I bet I can show you the bits of your code base that frequently have bugs (either because the code is not really doing what the author thinks, or because subsequent changes break things in subtle ways). If you say code is hard to understand and dev asks if you seriously don’t understand the code, how do you explain you mean easy to understand without thinking rather than ‘can I compile this in my head’? I have zero problem with saying “I’m too stupid to understand this code, and I expect you’re much smarter than me for writing it. Can you please write it in a way so that a less smart person like myself won’t trample all over your beautiful code at a later date through lack of understanding?” By definition, code should be easy to understand by someone who’s not the author. If someone who is not the author says the code is hard to understand, then the code is hard to understand. This is not negotiable. This is what code reviews or pair programming should address. What is effective nagging like? (Whether or not you get what you want) Mmm, good question. Off the top of my head: Don’t make the people who are the target of the nagging feel stupid - they’ll get defensive. If necessary, take the burden of “stupidity” on yourself. E.g. “I’m just not smart enough to be able to tell if this test is failing because the test is bad or because the code is bad. Can you walk me through it and help me fix it?” Do at least your fair share of the work, if not more. When I wanted to get the code to a state where we could fail style errors, I fixed 99% of the problems, and delegated the handful of remaining ones that I just didn’t have the context to fix. In the face of three errors to fix each, the team could hardly say “no” after I’d fixed over 6000. Explain why things need to be done. Developers are adults and don’t want to be treated like children. Give them a good reason and they’ll follow the rules. The few times I didn’t have good reasons, I could not get the team to do what I wanted. Find carrots and sticks that work. At LMAX, a short e-mail at the start of the day summarising the errors that had happened overnight, who seemed to be responsible, and whether they looked like real errors or intermittencies, was enough to get people to fix their problems2 - they didn’t like to look bad, but they also had enough information to get right on it, they didn’t have to wade through all the build info. On occasion, when people were ignoring this, I’d turn up to work with bags of chocolate that I’d bought with my own money, offering chocolate bars to anyone who fixed up the tests. I was random with my carrot offerings so people didn’t game the system. Give up if it’s not working. If you’ve tried to phrase the “why” in a number of ways, if you’ve tried to show examples of the benefits, if you’ve tried to work the results you want into a process, but it’s still not getting done, just accept the fact that this isn’t working for the team. Move on to something else, or find a new angle. 1 I had a colleague at LMAX who was working with a hypothesis that All Private Methods Were Evil - they were clearly only sharable within single class, so provided no reuse elsewhere, and if you have the same bit of code being called multiple times from within the same class (but it’s not valuable elsewhere) then maybe your design is wrong. I’m still pondering this specific hypothesis 4 years on, and I admit I see its pros and cons. 2 This worked so well that this process was automated by one of the guys and turned into a tool called AutoTrish, which as far as I know is still used at LMAX. Dave Farley talks about it in some of hisContinuous Delivery talks. Resources My talk that specifically looks at the advantages of Spock over JUnit, plus some Spock-specific resources. I love Jay Fields book Working Effectively With Unit Tests - if I could have made the whole team read this before moving to Spock, we might have stuck with JUnit. Go read everything Adrian Sutton has written about testing at LMAX. If not everything, definitely Abstraction by DSL and Making End-to-End Tests Work If you can’t make it all the way through Dave Farley and Jez Humble’s excellent Continuous Delivery book, do take a look at one of Dave’s presentations on the subject, for example The Rationale for Continuous Delivery or The Process, Technology and Practice of Continuous Delivery - my own talk was around testing, but I’m working off the assumption that you’re at least running some sort of Continuous Integration, if not Continuous Delivery. Martin Fowler has loads of interesting and useful articles on testing. Abstract What can you do to help developers a) write tests b) write meaningful tests and c) write readable tests? Trisha will talk about her experiences of working in a team that wanted to build quality into their new software version without a painful overhead - without a QA / Testing team, without putting in place any formal processes, without slavishly improving the coverage percentage. The team had been writing automated tests and running them in a continuous integration environment, but they were simply writing tests as another tick box to check, there to verify the developer had done what the developer had aimed to do. The team needed to move to a model where tests provided more than this. The tests needed to: Demonstrate that the library code was meeting the requirements Document in a readable fashion what those requirements were, and what should happen under non-happy-path situations Provide enough coverage so a developer could confidently refactor the code This talk will cover how the team selected a new testing framework (Spock, a framework written in Groovy that can be used to test JVM code) to aid with this effort, and how they evaluated whether this tool would meet the team’s needs. And now, two years after starting to use Spock, Trisha can talk about how both the tool and the shift in the focus of the purpose of tests has affected the quality of the code. And, interestingly, the happiness of the developers.

June 29, 2015

by Trisha Gee

· 2,072 Views

The Cloudcast #198 - Architecting Cloud Foundry

Download the MP3 Date: June 19, 2015 By: Aaron Delp and Brian Gracely Description: Aaron and Brian talk to Chip Childers (@chipchilders, VP of Technology @CloudFoundryOrg) about the current status of Cloud Foundry projects, how Microsoft .NET will be integrated, IaaS vs. PaaS, and the CF.org thinking about overall interoperability Interested in the O'Reilly OSCON? Want to register for OSCON now? Use promo code 20CLOUD for 20% off Details to win an OSCON pass coming soon! Check out the OSCON Schedule Free eBook from O'Reilly Media for Cloudcast Listeners! Check out an excerpt from the upcoming Docker Cookbook Topic 1 - From an overall project perspective, what grades would you give Cloud Foundry in terms of stability, core functionality, security, operations, etc? Topic 2 - You were previously involved (directly/indirectly)with CloudStack. As you talk to people in the marketplace, how is it different discussing IaaS vs. PaaS. Topic 3 - How much ability will you have to drive prioritization within sub-projects or new projects? (eg. Security vs. new Languages vs. Interop, etc.) Topic 4 - What’s the CF.org way of thinking about interoperability? Topic 5 - What guidance are you giving the teams in terms of expandability of Cloud Foundry? Architecturally, are there certain places you recommend over other places? Topic 6 - Is there a place for integrating SaaS applications (monitoring, logging, etc.) into Cloud Foundry?

June 29, 2015

by Brian Gracely

· 1,156 Views

Get CoreOS Logs into ELK in 5 Minutes

CoreOS Linux is the operating system for “Super Massive Deployments”. We wanted to see how easily we can get CoreOS logs into Elasticsearch / ELK-powered centralized logging service. Here’s how to get your CoreOS logs into ELK in about 5 minutes, give or take. If you’re familiar with CoreOS and Logsene, you can grab CoreOS/Logsene config files from Github. Here’s an example Kibana Dashboard you can get in the end: CoreOS Kibana Dashboard CoreOS is based on the following: Docker and rkt for containers systemd for startup scripts, and restarting services automatically etcd as centralized configuration key/value store fleetd to distribute services over all machines in the cluster. Yum. journald to manage logs. Another yum. Amazingly, with CoreOS managing a cluster feels a lot like managing a single machine! We’ve come a long way since ENIAC! There’s one thing people notice when working with CoreOS – the repetitive inspection of local or remote logs using “journalctl -M machine-N -f | grep something“. It’s great to have easy access to logs from all machines in the cluster, but … grep? Really? Could this be done better? Of course, it’s 2015! Here is a quick example that shows how to centralize logging with CoreOS with just a few commands. The idea is to forward the output of “journalctl -o short” to Logsene‘s Syslog Receiver and take advantage of all its functionality – log searching, alerting, anomaly detection, integrated Kibana, even correlation of logs with Docker performance metrics — hey, why not, it’s all available right there, so we may as well make use of it all! Let’s get started! Preparation: 1) Get a list of IP addresses of your CoreOS machines fleetctl list-machines 2) Create a new Logsene App (here) 3) Change the Logsene App Settings, and authorize the CoreOS host IP Addresses from step 1) (here’s how/where) Congratulations – you just made it possible for your CoreOS machines to ship their logs to your new Logsene app! Test it by running the following on any of your CoreOS machines: journalctl -o short -f | ncat --ssl logsene-receiver-syslog.sematext.com 10514 …and check if the logs arrive in Logsene (here). If they don’t, yell at us @sematext – there’s nothing better than public shaming on Twitter to get us to fix things. :) Create a fleet unit file called logsene.service [Unit] Description=Logsene Log Forwarder [Service] Restart=always RestartSec=10s ExecStartPre=/bin/sh -c "if [ -n \"$(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\" ]; then echo \"Value Exists: /sematext.com/logsene/`hostname`/lastlog $(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\"; else etcdctl set /sematext.com/logsene/`hostname`/lastlog\"`date +\"%Y-%%m-%d %%H:%M:%S\"`\"; true; fi" ExecStart=/bin/sh -c "journalctl --since \"$(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\" -o short -f | ncat --ssl logsene-receiver-syslog.sematext.com 10514" ExecStopPost=/bin/sh -c "export D=\"`date +\"%Y-%%m-%%d %%H:%M:%S\"`\"; /bin/etcdctl set /sematext.com/logsene/$(hostname)/lastlog \"$D\"" [Install] WantedBy=multi-user.target [X-Fleet] Global=true Activate cluster-wide logging to Logsene with fleet To start logging to Logsene from all machines activate logsene.service: fleetctl load logsene.service fleetctl start logsene.service There. That’s all there is to it! Hope this worked for you! At this point all your CoreOS logs should be going to Logsene. Now you have a central place to see all your CoreOS logs. If you want to send your app logs to Logsene, you can do that, too — anything that can send logs via Syslog or to Elasticsearch can also ship logs to Logsene. If you want some Docker containers & host monitoring to go with your CoreOS logs, just pull spm-agent-docker from Docker Registry. Enjoy!

June 29, 2015

by Stefan Thies

· 2,639 Views

Apache Camel in Space—Erh, in Docker and Kubernetes and Fancy Fabric8 Web Console

I have just recorded a 5 minute video that demonstrates running an out of stock example from Apache Camel release, the camel-example-servlet packaged as a docker container and running on a kubernetes platform, such as openshift 3. camel-servlet-example scaled up to 3 running containers (pods) which is easy with kubernetes and fabric8 In this video I have already deployed the example and then demonstrates how we can use the fabric8 web console to manage our application. And also connect to the running container and see inside, such as the Camel routes visually as shown above. Then I run a simple bash script from my laptop that sends a HTTP GET to the Camel example and prints the response. The script runs in a endless loop and demonstrates how kubernetes can easily scale up and down multiple Camel containers and load balance across the running containers. And at the end its even self healing when I force killing docker containers. So I suggest to grab a fresh cup of tea or coffee and sit back and play the 5 minutes video. The video is hosted on vimeo and can be seen from this link.

June 29, 2015

by Claus Ibsen

· 2,148 Views · 1 Like

R: Scraping the Release Dates of Github Projects

Continuing on from my blog post about scraping Neo4j’s release dates I thought it’d be even more interesting to chart the release dates of some github projects. In theory the release dates should be accessible through the github API but the few that I looked at weren’t returning any data so I scraped the data together. We’ll be using rvest again and I first wrote the following function to extract the release versions and dates from a single page: library(dplyr) library(rvest) process_page = function(releases, session) { rows = session %>% html_nodes("ul.release-timeline-tags li") for(row in rows) { date = row %>% html_node("span.date") version = row %>% html_node("div.tag-info a") if(!is.null(version) && !is.null(date)) { date = date %>% html_text() %>% str_trim() version = version %>% html_text() %>% str_trim() releases = rbind(releases, data.frame(date = date, version = version)) } } return(releases) } Let’s try it out on the Cassandra release page and see what it comes back with: > r = process_page(data.frame(), html_session("https://github.com/apache/cassandra/releases")) > r date version 1 Jun 22, 2015 cassandra-2.1.7 2 Jun 22, 2015 cassandra-2.0.16 3 Jun 8, 2015 cassandra-2.1.6 4 Jun 8, 2015 cassandra-2.2.0-rc1 5 May 19, 2015 cassandra-2.2.0-beta1 6 May 18, 2015 cassandra-2.0.15 7 Apr 29, 2015 cassandra-2.1.5 8 Apr 1, 2015 cassandra-2.0.14 9 Apr 1, 2015 cassandra-2.1.4 10 Mar 16, 2015 cassandra-2.0.13 That works pretty well but it’s only one page! To get all the pages we can use the follow_link function to follow the ‘Next’ link until there aren’t anymore pages to process. We end up with the following function to do this: find_all_releases = function(starting_page) { s = html_session(starting_page) releases = data.frame() next_page = TRUE while(next_page) { possibleError = tryCatch({ releases = process_page(releases, s) s = s %>% follow_link("Next") }, error = function(e) { e }) if(inherits(possibleError, "error")){ next_page = FALSE } } return(releases) } Let’s try it out starting from the Cassandra page: > cassandra = find_all_releases("https://github.com/apache/cassandra/releases") Navigating to https://github.com/apache/cassandra/releases?after=cassandra-2.0.13 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-2.0.10 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-2.0.8 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.2.13 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-2.0.0-rc1 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.2.3 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.2.0-beta2 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.0.10 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.0.6 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.0.0-rc2 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.7.7 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.7.4 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.7.0-rc3 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.6.4 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.5.0-rc3 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.4.0-final > cassandra %>% sample_n(10) date version 151 Mar 13, 2010 cassandra-0.5.0-rc2 25 Jul 3, 2014 cassandra-1.2.18 51 Jul 27, 2013 cassandra-1.2.8 21 Aug 19, 2014 cassandra-2.1.0-rc6 73 Sep 24, 2012 cassandra-1.2.0-beta1 158 Mar 13, 2010 cassandra-0.4.0-rc2 113 May 20, 2011 cassandra-0.7.6-2 15 Oct 24, 2014 cassandra-2.1.1 103 Sep 15, 2011 cassandra-1.0.0-beta1 93 Nov 29, 2011 cassandra-1.0.4 I want to plot when the different releases happened in time and in order to do that we need to create an extra column containing the ‘release series’ which we can do with the following transformation: series = function(version) { parts = strsplit(as.character(version), "\\.") return(unlist(lapply(parts, function(p) paste(p %>% unlist %>% head(2), collapse = ".")))) } bySeries = cassandra %>% mutate(date2 = mdy(date), series = series(version), short_version = gsub("cassandra-", "", version), short_series = series(short_version)) > bySeries %>% sample_n(10) date version date2 series short_version short_series 3 Jun 8, 2015 cassandra-2.1.6 2015-06-08 cassandra-2.1 2.1.6 2.1 161 Mar 13, 2010 cassandra-0.4.0-beta1 2010-03-13 cassandra-0.4 0.4.0-beta1 0.4 62 Feb 15, 2013 cassandra-1.1.10 2013-02-15 cassandra-1.1 1.1.10 1.1 153 Mar 13, 2010 cassandra-0.5.0-beta2 2010-03-13 cassandra-0.5 0.5.0-beta2 0.5 37 Feb 7, 2014 cassandra-2.0.5 2014-02-07 cassandra-2.0 2.0.5 2.0 36 Feb 7, 2014 cassandra-1.2.15 2014-02-07 cassandra-1.2 1.2.15 1.2 29 Jun 2, 2014 cassandra-2.1.0-rc1 2014-06-02 cassandra-2.1 2.1.0-rc1 2.1 21 Aug 19, 2014 cassandra-2.1.0-rc6 2014-08-19 cassandra-2.1 2.1.0-rc6 2.1 123 Feb 16, 2011 cassandra-0.7.2 2011-02-16 cassandra-0.7 0.7.2 0.7 135 Nov 1, 2010 cassandra-0.7.0-beta3 2010-11-01 cassandra-0.7 0.7.0-beta3 0.7 Now let’s plot those releases and see what we get: ggplot(aes(x = date2, y = short_series), data = bySeries %>% filter(!grepl("beta|rc", short_version))) + geom_text(aes(label=short_version),hjust=0.5, vjust=0.5, size = 4, angle = 90) + theme_bw() An interesting thing we can see from this visualisation is what overlap the various series of versions have. Most of the time there are only two series of versions overlapping but the 1.2, 2.0 and 2.1 series all overlap which is unusual. In this chart we excluded all beta and RC versions. Let’s bring those back in and just show the last 3 versions: ggplot(aes(x = date2, y = short_series), data = bySeries %>% filter(grepl("2\\.[012]\\.|1\\.2\\.", short_version))) + geom_text(aes(label=short_version),hjust=0.5, vjust=0.5, size = 4, angle = 90) + theme_bw() From this chart it’s clearer that the 2.0 and 2.1 series have recent releases so there will probably be three overlapping versions when the 2.2 series is released as well. The chart is still a bit cluttered although less than before. I’m not sure of a better way of visualising this type of data so if you have any ideas do let me know!

June 29, 2015

by Mark Needham

· 1,445 Views

Scraping Github Pull Requests and Their Code Review Comments

Github stores its pull-request and code review data in MySql. I’d much prefer a git reperentation for both (JSON, commits, audit trail, etc). Kinda the way Github Wiki pages are stored. That’s an aside though, this article is about storing code-review comments long term. The problem I’m trying to solve is one of deletion of users thich causes their pull requenst commentary to also get deleted. Sure the commits make it back to the origin/master (in the pull request is processed), but many things are left assoctaed with the fork. If the user gets deleted such info is gone forever :( I want a permanent copy, so the interim answer is to scrape the data I fear losing, while it still exists. Hence a scrape-pull-requests.sh bash script (for Mac and maybe Linux). Github’s portal is written in Ruby on Rails. It is extremely fast which helps scraping generally. There’s not a lot of JavaScript and that means that Wget is a viable extraction tool. Anyway the script runs quickly, and leaves a decent HTML interface for easy access later. I’ve tested it, but won’t leave up a scraped set of pull-requests as our GH overlords might object on copyright grounds. They can’t object for your own GithubEnterprise instance of course. Github could change the structure of their HTML, and the script might stop work so well.If that happens I’m happy to accept back pull-requests via the usual mechanism :)

June 29, 2015

by Paul Hammant

· 2,042 Views

Working with Merge and Identity Column -- A Practical Scenario

Introduction As we all know about the Identity columns and Merge statement. We are not going to discuss any boring theoretical tropics related to it. Better we are discussing here with a practical scenario of merging records. Hope all of you must enjoy it and it will be informative. The Scenario We have a Table with Identity Columns named #tbl_TempStudentRecords. The table details are mentioned bellow. Column Name SrlNo StudentName StudentClass StudentSection We have another table named #tbl_TempStudrntMsrks. The table details are mentioned bellow. Column Name SrlNo StubjectName MarksObtain What we want to do is, we have another set of table called tbl_StudentDetails mentioned bellow. Column Name StdRoll (PK) StudentName StudentClass StudentSection Another table named tbl_StudentMarks Column Name IdNo (PK) StdRoll (FK) References [tbl_StudentDetails].[StdRoll] SubjectName MarksObtain Her we can insert records very easily in tbl_StudentDetails from #tbl_TempStudentRecords very easily. But the main problem is the IDENTITY columns in the Table named [tbl_StudentDetails].[ StdRoll]. When we insert records the Identity columns values generate automatically. When we are trying to insert records into the table named tbl_StudentMarks from Table named #tbl_TempStudrntMsrks we have to provide the StdRoll values, which is the Foreign Key References to the Table named [tbl_StudentDetails].[ StdRoll]. Think one minute with the case scenario. Hope you can understand the problem. Now we have to solve it and we are not using any LOOP for that and NOT even any DDL operation to change the structure of base table. We are just using the SET BASED operation to make performance high. How to Solve it Step – 1 [ Create the Base Table First ] IF OBJECT_ID('tempdb..#tbl_TempStudentRecords')IS NOT NULL BEGIN DROP TABLE #tbl_TempStudentRecords; END GO CREATE TABLE #tbl_TempStudentRecords ( SrlNo BIGINT NOT NULL, StudentName VARCHAR(50) NOT NULL, StudentClass INT NOT NULL, StudentSection CHAR(1) NOT NULL ); GO IF OBJECT_ID('tempdb..#tbl_TempStudrntMsrks')IS NOT NULL BEGIN DROP TABLE #tbl_TempStudrntMsrks; END GO CREATE TABLE #tbl_TempStudrntMsrks ( SrlNo BIGINT NOT NULL, StubjectName VARCHAR(50) NOT NULL, MarksObtain INT NOT NULL ); GO IF OBJECT_ID(N'[dbo].[tbl_StudentDetails]', N'U')IS NOT NULL BEGIN DROP TABLE [dbo].[tbl_StudentDetails]; END GO CREATE TABLE [dbo].[tbl_StudentDetails] ( StdRoll BIGINT NOT NULL IDENTITY(100,1) PRIMARY KEY, StudentName VARCHAR(50) NOT NULL, StudentClass INT NOT NULL, StudentSection CHAR(1) NOT NULL ); GO IF OBJECT_ID(N'[dbo].[tbl_StudentMarks]', N'U')IS NOT NULL BEGIN DROP TABLE [dbo].[tbl_StudentMarks]; END GO CREATE TABLE [dbo].[tbl_StudentMarks] ( IdNo BIGINT NOT NULL IDENTITY(1,1) PRIMARY KEY, StdRoll BIGINT NOT NULL, SubjectName VARCHAR(50) NOT NULL, MarksObtain INT NOT NULL ); GO ALTER TABLE [dbo].[tbl_StudentMarks] ADD CONSTRAINT FK_StdRoll_tbl_StudentMarks FOREIGN KEY(StdRoll) REFERENCES [dbo].[tbl_StudentDetails](StdRoll); Step – 2 [ Inserting Records in Temp Table ] INSERT INTO #tbl_TempStudentRecords (SrlNo, StudentName, StudentClass, StudentSection) VALUES(1, 'Joydeep Das', 1, 'A'), (2, 'Preeti Sharma', 1, 'A'), (3, 'Deepasree Das', 1, 'A'); INSERT INTO #tbl_TempStudrntMsrks (SrlNo, StubjectName, MarksObtain) VALUES (1, 'Bengali', 50), (1, 'English', 70), (1, 'Math', 80), (2, 'Bengali', 0), (2, 'English', 70), (2, 'Math', 80), (3, 'Bengali', 20), (3, 'English', 90), (3, 'Math', 95); Step – 3 [ Now Solve it By MERGE Statement ] BEGIN DECLARE @MappingTable TABLE ([NewRecordID] BIGINT, [OldRecordID] BIGINT) MERGE [dbo].[tbl_StudentDetails] AS target USING (SELECT [SrlNo] AS RecordID_Original ,[StudentName] ,[StudentClass] ,[StudentSection] FROM #tbl_TempStudentRecords ) AS source ON (target.StdRoll = NULL) WHEN NOT MATCHED THEN INSERT ([StudentName], [StudentClass], [StudentSection]) VALUES (source.[StudentName],source.[StudentClass], source.[StudentSection]) OUTPUT inserted.[StdRoll], source.[RecordID_Original] INTO @MappingTable; --- Now Map table is ready and we can use it --- INSERT INTO [dbo].[tbl_StudentMarks] (StdRoll, SubjectName, MarksObtain) SELECT b.NewRecordID, a.StubjectName, a.MarksObtain FROM #tbl_TempStudrntMsrks AS a INNER JOIN @MappingTable AS b ON a.SrlNo = b.OldRecordID; END GO Step – 4 [ Observation ] SELECT * FROM [dbo].[tbl_StudentDetails]; GO SELECT * FROM [dbo].[tbl_StudentMarks]; GO StdRoll StudentName StudentClass StudentSection 100 Joydeep Das 1 A 101 Preeti Sharma 1 A 102 Deepasree Das 1 A IdNo StdRoll SubjectName MarksObtain 1 100 Bengali 50 2 100 English 70 3 100 Math 80 4 101 Bengali 0 5 101 English 70 6 101 Math 80 7 102 Bengali 20 8 102 English 90 9 102 Math 95

June 28, 2015

by Joydeep Das

· 5,514 Views

Fixing Your Org by Continually Breaking It - DevOps Days Austin

Here is the third post in a six part series focusing on DevOps Days Austin. Following Cote’s presentation, next in the line up of speakers on day one was Paul Read of Release Engineering Approaches. Paul’s talk tackles the “importance of continually breaking your organization with experiments, some surprising examples, and how to do it in such a way that you aren’t left with a broken organization.” Take a listen, and check out his awesome tie! (also see his slides below) <br> Some of the ground Paul covers: Learning from Toyota Kata The culture of continuous learning and experimentation/improvement Making “the pain visible” Where Paul sees DevOps going in the future Still to come Cameron Haight – Gartner John Willis – Docker Barton George (Sputnik ignite talk) – Dell Extra credit reading The coming donkey apocalypse — Cote Opening Keynote – Damon Edwards Pau for now…

June 28, 2015

by Barton George

· 1,820 Views

Docker Events and Docker Metrics Monitoring

Docker deployments can be very dynamic with containers being started and stopped, moved around the YARN or Mesos-managed clusters, having very short life spans (the so-called pets) or long uptimes (aka cattle). Getting insight into the current and historical state of such clusters goes beyond collecting container performance metrics and sending alert notifications. If a container dies or gets paused, for example, you may want to know about it, right? Or maybe you’d want to be able to see that a container went belly up in retrospect when troubleshooting, wouldn’t you? Just two weeks ago we added Docker Monitoring (docker image is right here for your pulling pleasure) to SPM. We didn’t stop there — we’ve now expanded SPM’s Docker support by adding Docker Event collection, charting, and correlation. Every time a container is created or destroyed, started, stopped, or when it dies, spm-agent-docker captures the appropriate event so you can later see what happened where and when, correlate it with metrics, alerts, anomalies — all of which are captured in SPM — or with any other information you have at your disposal. The functionality and the value this brings should be pretty obvious from the annotated screenshot below. Like this post? Please tweet about Docker Events and Docker Metrics Monitoring Know somebody who’d find this post useful? Please let them know… Here’s the list of Docker events SPM Docker monitoring agent currently captures: Version Information on Startup: server-info – created by spm-agent framework with node.js and OS version info on startup docker-info – Docker Version, API Version, Kernel Version on startup Docker Status Events: Container Lifecycle Events like create, exec_create, destroy, export Container Runtime Events like die, exec_start, kill, oom, pause, restart, start, stop, unpause Every time a Docker container emits one of these events spm-agent-docker will capture it in real-time, ship it over to SPM, and you’ll be able to see it as shown in the above screenshot. Oh, and if you’re running CoreOS, you may also want to see how to index CoreOS logs into ELK/Logsene. Why? Because then you can have not only metrics and container events in one place, but also all container and application logs, too! If you’re using Docker, we hope you find this useful! Anything else you’d like us to add to SPM (for Docker or anyother integration)? Leave a comment, ping @sematext, or send us email – tell us what you’d like to get for early Christmas!

June 27, 2015

by Stefan Thies

· 3,201 Views

Does DevOps Reduce Technical Debt--or Make it Worse?

DevOps can help reduce technical debt in some fundamental ways. Continuous Delivery/Deployment First, building a Continuous Delivery/Deployment pipeline, automating the work of migration and deployment, will force you to clean up inconsistencies and holes in configuration and code deployment, and inconsistencies between development, test and production environments. And automated Continuous Delivery and Infrastructure as Code gets rid of dangerous one-of-a-kind snowflakes and configuration drift caused by making configuration changes and applying patches manually over time. Which makes systems easier to setup and manage, and reduces the risk of an un-patched system becoming the target of a security attack or the cause of an operational problem. A CD pipeline also makes it easier, cheaper and faster to pay down other kinds of technical debt. With Continuous Delivery/Deployment, you can test and push out patches and refactoring changes and platform upgrades faster and with more confidence. Positive Feedback The Lean feedback cycle and Just-in-Time prioritization in DevOps ensures that you’re working on whatever is most important to the business. This means that bugs and usability issues and security vulnerabilities don’t have to wait until after the next feature release to get fixed. Instead, problems that impact operations or the users will get fixed immediately. Teams that do Blameless Post-Mortems and Root Cause(s) Analysis when problems come up will go even further, and fix problems at the source and improve in fundamental and important ways. But there’s a negative side to DevOps that can add to technical debt costs. Erosive Change Michael Feathers’ research has shown that constant, iterative change is erosive: the same code gets changed over and over, the same classes and methods become bloated (because it is naturally easier to add code to an existing method or a method to an existing class), structure breaks down and the design is eventually lost. DevOps can make this even worse. DevOps and Continuous Delivery/Deployment involves pushing out lots of small changes, running experiments and iteratively tuning features and the user experience based on continuous feedback from production use. Many DevOps teams work directly on the code mainline, “branching in code” to “dark launch” code changes, while code is still being developed, using conditional logic and flags to skip over sections of code at run-time. This can make the code hard to understand, and potentially dangerous: if a feature toggle is turned on before the code is ready, bad things can happen. Feature flags are also used to run A/B experiments and control risk on release, by rolling out a change incrementally to a few users to start. But the longer that feature flags are left in the code, the harder it is to understand and change. There is a lot of housekeeping that needs to be done in DevOps: upgrading the CD pipeline and making sure that all of the tests are working; maintaining Puppet or Chef (or whatever configuration management tool you are using) recipes; disciplined, day-to-day refactoring; keeping track of features and options and cleaning them up when they are no longer needed, getting rid of dead code and trying to keep the code as simple as possible. Microservices and Technology Choices Microservices are a popular architectural approach for DevOps teams. This is because loosely-coupled Microservices are easier for individual teams to independently deploy, change, refactor or even replace. And a Microservices-based approach provides developers with more freedom when deciding on language or technology stack: teams don’t necessarily have to work the same way, they can choose the right tool for the job, as long as they support an API contract for the rest of the system. In the short term there are obvious advantages to giving teams more freedom in making technology choices. They can deliver code faster, quickly try out prototypes, and teams get a chance to experiment and learn about different technologies and languages. But Microservices “are not a free lunch”. As you add more services, system testing costs and complexity increase. Debugging and problem solving gets harder. And as more teams choose different languages and frameworks, it’s harder to track vulnerabilities, harder to operate, and harder for people to switch between teams. Code gets duplicated because teams want to minimize coupling and it is difficult or impossible to share libraries in a polyglot environment. Data is often duplicated between services for the same reason, and data inconsistencies creep in over time. Negative Feedback There is a potentially negative side to the Lean delivery feedback cycle too. Constantly responding to production feedback, always working on what’s most immediately important to the organization, doesn’t leave much space or time to consider bigger, longer-term technical issues, and to work on paying off deeper architectural and technical design debt that result from poor early decisions or incorrect assumptions. Smaller, more immediate problems get fixed fast in DevOps. Bugs that matter to operations and the users can get fixed right away instead of waiting until all the features are done, and patches and upgrades to the run-time can be pushed out more often. Which means that you can pay off a lot of debt before costs start to compound. But behind-the-scenes, strategic debt will continue to add up. Nothing’s broke, so you don’t have to fix anything right away. And you can’t refactor your way out of it either, at least not easily. So you end up living with a poor design or an aging technology platform, slowly slowing down your ability to respond to changes, to come up with new solutions. Or forcing you to continue filling in security holes as they come up, or scrambling to scale as load increases. DevOps can reduce technical debt. But only if you work in a highly disciplined way. And only if you raise your head up from tactical optimization to deal with bigger, more strategic issues before they become real problems.

June 27, 2015

by Jim Bird

· 7,421 Views

Employee Engagement: The Magic Potion?

I am sure by now most people understand that there is strong correlation between employee satisfaction and business results. If you need more convincing have a read of these two articles: Forbes & Research Paper So how do you best go about measuring it? On my current project I have decided to go with the following 4 questions: I would recommend this account and my project as a good place to work I have the tools and resources to do my role well I rarely think about rolling off this account or project My role makes good use of my skills and abilities For those of you who have read Jez Humble’s “Lean Enterprise”, these questions will look familiar. I have adopted them to the project setting that I work within. We have just set out on a cultural transformation to become truly Agile and adopt DevOps in a large complex legacy environment. To me measuring the above will give me the best indicator that we are doing the right thing. Of course there will be other measures who determine the quality of the outcomes and the levels of automation among others, but changing the culture of an organisation is critical if your Agile and DevOps adoption is to be successful. I will report back throughout that journey to tell you what my experiences is with the above questions. IT delivery is complex and it is not always clear what the right solution is. I found in the past that it is near impossible to create processes and tools that work by itself, you need to have the right mindset that people use the processes and tools with the right intent. It’s very frustrating when you implement great automation only to see a few months later that the solution has degraded. It is with hindsight that I understand that the solution is to not just implement process and tools but to instill the right culture and mindset for progression, a culture where we blamelessly identify a way to avoid the same mistake again rather than looking for the person in fault, a culture where we strive for automation and lean processes and are not concerned about the size of our teams or budgets, a culture where you don’t have to protect your fiefdom and where you are happy to collaborate with others to solve problems no matter where the root cause lies. I think we all in IT need to understand this dynamic between employee satisfaction and outcomes better, I for sure believe that I have come across a magic potion that I aim to bring to all my future projects. About these ads

June 27, 2015

by Mirco Hering

CORE

· 1,557 Views