Testing, Deployment, and Maintenance Resources

The Latest Testing, Deployment, and Maintenance Topics

Captains with Benefits

When it comes to teaching or learning, video streaming is something that still frightens people away. As a matter of fact that video chats and webinars have been around for a relatively long time, however; its still hard to encourage an individual or business to take part as such. And yet the benefits of CaptainLive can be substantial in both, short as well as long term. As we have already seen the benefits of video marketing therefore, we want to encourage you to use CaptainLive in order to take advantage of your potential whether it’s hidden in you or you are well aware of it. CaptainLive was launched in early 2015 with a mission to connect people in need of knowledge and skills with Captains with Benefits that are willing to share and give their expertise and mentor skills. CaptainLive’s integrated service now allows for text, video and audio conferencing. It’s been used by a variety of individuals with different backgrounds. At CaptainLive you can schedule an online live video stream with the experts in number topics ranging from counseling up to entertainment. Captains/Experts on the site charges from $5 USD up to $150 USD, most of which offer free 5 minute sessions with no obligation to book their session thereafter. Who knows you might end up registering as Captain yourself and start a part time business of your own to help others with your skills while making a healthy stream of income for yourself, it’s surely well worth your effort.

July 1, 2015

by Peter Watson

· 834 Views

Interoute Virtual Data Centre is the fastest transatlantic cloud service

Double the throughput and lower latency than the leading global cloud providers between the US and Europe in independent comparison research London & New York, 1 July, 2015. Interoute has today announced that its global cloud platform Interoute Virtual Data Centre (VDC), has been proven to deliver nearly double the throughput across the Atlantic than the next best cloud provider in comparison research conducted by Cloud Spectator. The research from March 2015 compared Interoute VDC with three leading cloud providers (Amazon AWS, Rackspace and Microsoft Azure), testing network throughput and latency between Europe and USA and between providers' European data centres. In all of the comparisons, Interoute VDC demonstrated the highest throughputs and lowest latencies. Cloud Spectator's full research report, and more information about Interoute VDC's performance and features, can be viewed here: http://bit.ly/1GHyzwJ Network performance is a significant factor in cloud computing for business services requiring the highest network capacity (throughput) and the shortest possible time from the server to the client (latency), to meet the needs of the businesses and their users. Innovating new applications and business services in the cloud needs network performance to match and this report shows the advantages of building the cloud into a huge global high performance network. Key research findings: Transatlantic: Interoute VDC delivered 1.1 Gbit/s throughput, which was 96% better than Amazon AWS, 141% better than Rackspace, and 195% better than Microsoft Azure. Interoute VDC had the lowest latency, between its London and New York data centres. Interoute was the only provider in the comparison with both of its transatlantic data centres located in key business cities, meaning that VDC users can access compute and storage resources, and deliver data to their customers, from two centres of European and US business activity. Within Europe: Interoute VDC achieved 1.3 Gbit/s throughput between its London and Amsterdam data centres. This was 52% better than Amazon AWS (Dublin - Frankfurt) and 73% better than Microsoft Azure (Dublin - Amsterdam) Interoute VDC achieved a latency of 6 milliseconds between London and Amsterdam, over three times better than the inter-data centre latency of the comparison providers. Matthew Finnie, CTO of Interoute, commented: "This independent report confirms and validates our networked cloud strategy. Building cloud into a world class network provides our customers with significantly better performance when compared with the traditional cloud models. Businesses looking to grow between Europe and US should definitely be looking at the importance of these network characteristics for their ability to shift workloads into the cloud. Interoute's fourteen global zones are all built into high performance network with over 300 interconnects in Europe alone. So wherever you choose to put your data and connect to us, your services are typically going to perform faster on Interoute than on many other global providers." Danny Gee, Senior Analyst, Cloud Spectator: "Users want to transfer large amounts of data between data centres quickly. Our study revealed that for a trans-Atlantic connection between cloud data centers, Interoute provided the highest throughput and lowest latency out of AWS, Rackspace and Azure. Interoute also had the higher network throughput and lowest latency in European testing compared to Azure and AWS (Rackspace was excluded, having only one location in Europe), making it a good option for users operating servers within this region. Interoute also provided the best latency, ideal for real-time communications. Users running geographically dispersed environments for such things as geo-redundancy would benefit from Interoute's high performance cloud connectivity."

July 1, 2015

by Fran Cator

· 1,185 Views

Learning Spring-Cloud - Writing a Microservice

Continuing my Spring-Cloud learning journey, earlier I had covered how to write the infrastructure components of a typical Spring-Cloud and Netflix OSS based micro-services environment - in this specific instance two critical components, Eureka to register and discover services and Spring Cloud Configuration to maintain a centralized repository of configuration for a service. Here I will be showing how I developed two dummy micro-services, one a simple "pong" service and a "ping" service which uses the "pong" service. Sample-Pong microservice The endpoint handling the "ping" requests is a typical Spring MVC based endpoint: @RestController public class PongController { @Value("${reply.message}") private String message; @RequestMapping(value = "/message", method = RequestMethod.POST) public Resource pongMessage(@RequestBody Message input) { return new Resource<>( new MessageAcknowledgement(input.getId(), input.getPayload(), message)); } } It gets a message and responds with an acknowledgement. Here the service utilizes the Configuration server in sourcing the "reply.message" property. So how does the "pong" service find the configuration server, there are potentially two ways - directly by specifying the location of the configuration server, or by finding the Configuration server via Eureka. I am used to an approach where Eureka is considered a source of truth, so in this spirit I am using Eureka to find the Configuration server. Spring Cloud makes this entire flow very simple, all it requires is a "bootstrap.yml" property file with entries along these lines: --- spring: application: name: sample-pong cloud: config: discovery: enabled: true serviceId: SAMPLE-CONFIG eureka: instance: nonSecurePort: ${server.port:8082} client: serviceUrl: defaultZone: http://${eureka.host:localhost}:${eureka.port:8761}/eureka/ The location of Eureka is specified through the "eureka.client.serviceUrl" property and the "spring.cloud.config.discovery.enabled" is set to "true" to specify that the configuration server is discovered via the specified Eureka server. Just a note, this means that the Eureka and the Configuration server have to be completely up before trying to bring up the actual services, they are the pre-requisites and the underlying assumption is that the Infrastructure components are available at the application boot time. The Configuration server has the properties for the "sample-pong" service, this can be validated by using the Config-servers endpoint - http://localhost:8888/sample-pong/default, 8888 is the port where I had specified for the server endpoint, and should respond with a content along these lines: "name": "sample-pong", "profiles": [ "default" ], "label": "master", "propertySources": [ { "name": "classpath:/config/sample-pong.yml", "source": { "reply.message": "Pong" } } ] } As can be seen the "reply.message" property from this central configuration server will be used by the pong service as the acknowledgement message Now to set up this endpoint as a service, all that is required is a Spring-boot based entry point along these lines: @SpringBootApplication @EnableDiscoveryClient public class PongApplication { public static void main(String[] args) { SpringApplication.run(PongApplication.class, args); } } and that completes the code for the "pong" service. Sample-ping micro-service So now onto a consumer of the "pong" micro-service, very imaginatively named the "ping" micro-service. Spring-Cloud and Netflix OSS offer a lot of options to invoke endpoints on Eureka registered services, to summarize the options that I had: 1. Use raw Eureka DiscoveryClient to find the instances hosting a service and make calls using Spring's RestTemplate. 2. Use Ribbon, a client side load balancing solution which can use Eureka to find service instances 3. Use Feign, which provides a declarative way to invoke a service call. It internally uses Ribbon. I went with Feign. All that is required is an interface which shows the contract to invoke the service: package org.bk.consumer.feign; import org.bk.consumer.domain.Message; import org.bk.consumer.domain.MessageAcknowledgement; import org.springframework.cloud.netflix.feign.FeignClient; import org.springframework.http.MediaType; import org.springframework.web.bind.annotation.RequestBody; import org.springframework.web.bind.annotation.RequestMapping; import org.springframework.web.bind.annotation.RequestMethod; import org.springframework.web.bind.annotation.ResponseBody; @FeignClient("samplepong") public interface PongClient { @RequestMapping(method = RequestMethod.POST, value = "/message", produces = MediaType.APPLICATION_JSON_VALUE, consumes = MediaType.APPLICATION_JSON_VALUE) @ResponseBody MessageAcknowledgement sendMessage(@RequestBody Message message); } The annotation @FeignClient("samplepong") internally points to a Ribbon "named" client called "samplepong". This means that there has to be an entry in the property files for this named client, in my case I have these entries in my application.yml file: samplepong: ribbon: DeploymentContextBasedVipAddresses: sample-pong NIWSServerListClassName: com.netflix.niws.loadbalancer.DiscoveryEnabledNIWSServerList ReadTimeout: 5000 MaxAutoRetries: 2 The most important entry here is the "samplepong.ribbon.DeploymentContextBasedVipAddresses" which points to the "pong" services Eureka registration address using which the service instance will be discovered by Ribbon. The rest of the application is a routine Spring Boot application. I have exposed this service call behind Hystrix which guards against service call failures and essentially wraps around this FeignClient: package org.bk.consumer.service; import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand; import org.bk.consumer.domain.Message; import org.bk.consumer.domain.MessageAcknowledgement; import org.bk.consumer.feign.PongClient; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.beans.factory.annotation.Qualifier; import org.springframework.stereotype.Service; @Service("hystrixPongClient") public class HystrixWrappedPongClient implements PongClient { @Autowired @Qualifier("pongClient") private PongClient feignPongClient; @Override @HystrixCommand(fallbackMethod = "fallBackCall") public MessageAcknowledgement sendMessage(Message message) { return this.feignPongClient.sendMessage(message); } public MessageAcknowledgement fallBackCall(Message message) { MessageAcknowledgement fallback = new MessageAcknowledgement(message.getId(), message.getPayload(), "FAILED SERVICE CALL! - FALLING BACK"); return fallback; } } Boot"ing up I have dockerized my entire set-up, so the simplest way to start up the set of applications is to first build the docker images for all of the artifacts this way: mvn clean package docker:build -DskipTests and bring all of them up using the following command, the assumption being that both docker and docker-compose are available locally: docker-compose up Assuming everything comes up cleanly, Eureka should show all the registered services, at http://dockerhost:8761 url - The UI of the ping application should be available at http://dockerhost:8080 url - Additionally a Hystrix dashboard should be available to monitor the requests to the "pong" app at this url http://dockerhost:8989/hystrix/monitor?stream=http%3A%2F%2Fsampleping%3A8080%2Fhystrix.stream: References 1. The code is available at my github location - https://github.com/bijukunjummen/spring-cloud-ping-pong-sample 2. Most of the code is heavily borrowed from the spring-cloud-samples repository - https://github.com/spring-cloud-samples

July 1, 2015

by Biju Kunjummen

· 13,692 Views · 4 Likes

Gene Kim Explains ‘Why DevOps Matters’

Ever wonder why DevOps gets so much attention these days? The answer is simple: “DevOps solves the most important business problem of our generation, [which is] how organizations make the transition from good to great.” That’s according to Gene Kim, co-author of The Phoenix Project, founder of Tripwire, and a DevOps advocate. Gene headlined a New Relic DevOps roadshow with stops in Chicago, Dallas, and Houston last month, regaling attendees with the inside scoop of what DevOps really is, what it does, and how to make it work (more on that in upcoming blog posts). But perhaps his most important point was the overwhelming importance of the effort. Traditional IT leads to “hopelessness and despair” According to Gene, the opportunity cost of wasted IT spending is some $2.6 trillion. These days, he says, “every company is an IT company”—we like to say “every company is a software company,” but you get the message. Gene observes that 95% of all capital projects have an IT component and 50% of all capital spending is technology related. And every IT organization is pressured to simultaneously respond more quickly to urgent business needs while also providing stable, secure, and predictable IT service. That chronic conflict created what Gene described as “a horrible downward spiral that leads to horrendous outcomes. Every time we cut corners, or manually deploy code, or write code that doesn’t have automated testing, it all leads to the accumulation of technical debt.” And the ever-increasing amount of technical debt sets the stage for intertribal warfare that can exist between dev and ops. Those wars mean that “Devs submit code at 5 p.m. on Friday, and ops then works all weekend to deploy it by 9 a.m. Monday. Everyone becomes buried in unplanned work, and this deprives our ability to pay down the technical debt being created. This led to hopelessness and despair, with everyone doomed to repeat the same mistakes.” DevOps offers a better way Fortunately, Gene explained, “We know now there is a better way. The DevOps exemplars have shown us that we can have incredibly fast flow from dev to ops to deployment while preserving world-class quality and security.” According to Gene, the top predictors of IT performance are all associated with DevOps: Version control of all production artifacts Continuous integration and deployment Automated acceptance testing Peer-review of production changes (vs. external change approval) High-trust culture Proactive monitoring of the production environment Win-win relationship between dev and ops Lead time is the key metric Lead time from raw material to finished product is the key metric in manufacturing, “and that’s true for software, too,” Gene said. “How long does it take to go from code committed to code successfully running in production?” The standard 9-month software lead time common in waterfall development projects is “highly correlated with catastrophic deployment errors,” Gene warned. The key, he said, is to have smaller deployments, and to do them more frequently. That approach is already working for high-performing organizations, he added, who are accelerating away from the herd. “Ten deploys a day used to be startling,” Gene noted. “Now it’s probably considered merely average among high performers.” Amazon Web Services deploys every 11.6 seconds! That kind of speed is possible only by doing small deployments more frequently, Gene said. “The bigger the change, the bigger the crater when it hits.” DevOps correlates with business success! IT high-performers who incorporate DevOps are much more agile and more reliable, Gene said. Critically, he added, “They are more likely to win in the marketplace!” The common reaction to that statement is shock. Gene noted he often hears: “That’s absurd! How can IT ops practices be visible on the bottom line or in the stock price?” But the Puppet Labs 2014 State of DevOps report noted that IT high-performers are twice as likely to exceed profitability, market share, and productivity goals as well as enjoy 50% higher market capitalization growth over three years. Of course, that doesn’t mean all those good things will happen to your company just by moving to DevOps. But do you really want to risk the “horrendous outcomes” of staying with outmoded models that lead to excruciatingly long deployment cycles?

July 1, 2015

by Fredric Paul

· 1,929 Views

DevOps Leadership Series: Gov Does DevOps

This past week, I had the opportunity to catch up with some more industry thought leaders at the DevOpsDays DC event in our nation’s capital. This was the first major DevOps Days event to feature a large audience of government participants. It was an awesome event and is certainly going to be on my must-attend list for next year. First off for the series, I had the chance to chat with Nathen Harvey, the Community Director at Chef. Nathen also did a great job leading the organizing committee for DevOps Days DC. In this episode of the DevOps Leadership Series, Nathen illustrates some recurrent topics he noticed at DevOpsDays DC. Nathen ensures us that government is ready for DevOps, enterprises are ready for DevOps, and small businesses and web innovators are ready for DevOps. Then he highlights how we need to create high velocity organizations that are safe, scalable, and humane for our teams. Next, I had the pleasure of catching up with Greg Elin, the Executive Director at GovReady and former Chief Data Officer at the FCC. What really stood out to Greg at the event was how many people already have their DevOps “sea legs.” He notes the number of different organizations that are comfortable with DevOps, and are familiar with its language and concepts. Finally, I caught up with Dave Johnson, Senior Technical Director at Ansible. Dave shares that “the market really is not dominated yet by any particular player in this (DevOps) space.” Dave explains when he first entered the industry he thought the IT world was dominated by open source and commercial technologies. However, when talking to the industry at large now, he found most the enterprises aren’t yet doing anything yet for automation, and most enterprises are still tackling their problems with humans. Next up in the series, you’ll hear from John Willis (Co-Author of the upcoming DevOps Cookbook) and Leon Fayer (Vice President at OmniTI). If you have missed any of the other videos from this series, you can find them here. (We’re up to 15 so far).

July 1, 2015

by Derek Weeks

· 13,474 Views

DevOps Tools for Continuous Delivery: Workloads Distribution and Jenkins Installation

the vast majority of software development companies have to place a great emphasis on the process of continuous integration and rapid delivery of new versions of their product. obviously, when supplying enterprise-level projects, such processes need to be automated as much as possible. and this is when the cloud devops tools come in handy. thus, in today’s article we’d like to pay a special attention to the devops tools that automate the continuous integration and delivery within the jelastic paas that can be installed on any bare metal or cloud infrastructure as virtual private cloud or hybrid cloud. this is a pretty complex example of enterprise application life cycle with continuous integration and seamless migration throughout devops pipeline from development to several productions (you can use simplified process if you have less complex project ). the instruction below will be useful for jelastic cluster administrators such as systems integrators, hosting service providers, enterprises, and isv customers, who can easily implement it at their jelastic cloud installations. nevertheless, this guide contains plenty of features and continuous integration tips described, which can be interesting for different developers. so, let’s get started with the first part of the instruction! setting up dedicated user groups first of all, you need to allocate separate hardware sets for all your project teams (one per each development phase, i.e. development > testing > production ) and adjust the access permissions to make them completely isolated and not influenced by others. the multi-regions for a hybrid cloud option, that became available within the recently released jelastic 3.3 version , is optimally suited for this task. to start with, create three hardware node groups (within one region) and name them after the corresponding stages for more convenience (e.g. dev , test , production ). the next step is to prepare three user groups and attach them to the corresponding hardware – in our case the dev group has access to the dev hardware node group only, qa – to the test one, and ops should work specifically with the production set. in such a way, users from the appropriate groups can use the specified sets of hardware only, but at the same time – they have a possibility to transfer their environments throughout the whole platform, between different teams’ accounts. jenkins continuous integration server configuration now we need the integration tool, that will control and perform all of the required operations automatically, i.e. build the cloud devops pipeline. our choice fell on jenkins as one of the most popular solutions used for this goal – it can be easily installed from our marketplace either at the corresponding site page or directly via the dashboard . as a result, you’ll get the pure jenkins installed, which should be properly adjusted before you start organizing your application life cycle. thus, select the open in browser button and proceed with the following configurations steps: while at the home page, click on the manage jenkins option at the left-hand menu and select the manage plugins link within the appeared list. after you’ve been redirected to the plugin manager, switch to the available tab, find the following plugins using the search filter field above and tick them for installation: git plugin – is required for building our project’s source (stored at the github repository) envfile plugin – is used for storing system environment variables (its necessity is driven by security restrictions, implemented at jelastic, which forbid the direct exporting of environment variables from the tomcat server) click install without restart when ready. during the installation process, tick the restart jenkins when installation is complete and no jobs are running option to automatically restart jenkins for enabling the chosen plugins. then, you also need to install maven, which will be used for building the project. for that, navigate to the manage jenkins > configure system menu, scroll down to the maven section and click add maven. within the expanded section, type the desired name for your maven installation (e.g. maven ) and save the changes using the same-named button at the bottom of the page. in such a way, this tool will be also automatically installed when required (i.e. during the first app build). now your jenkins server is well-staffed for the further work. add deployment process scripts to the jenkins container the next step is to upload the scripts that you are going to use for automating different organizational actions, required to be applied to your application at the intermediate development life cycle phases (like deploying, placing it to the appropriate hardware according to the stage, running auto-test, etc). the easiest way to do this is to access your jenkins container via the jelastic ssh-gateway. in the case you haven’t performed similar operations before, you need to: generate an ssh keypair add your public ssh key to the dashboard access your account via ssh protocol once inside, create a new folder for your project (we’ll use demo ) and move in there: mkdir /opt/tomcat/demo cd /opt/tomcat/demo this location can be used for storing your scripts, variables, logs etc. here, you can upload the required scripts using the command of the following type: curl -fssl {link_to_script} -o {file_name} we also provide the set of script examples, which can be used as templates for your own ones: install.sh – gets a user session and creates a new environment via the jelastic api according to the specified manifest file. it also defines, that the name of this environment will be equal to its creation date and time (as a unique name is required for every script execution, but you won’t be able to set it manually as this operation would be run automatically). however, you can set your own dynamic name pattern to be used here transfer.sh – changes the environment ownership based on the jelastic environment transferring feature migrate.sh – physically moves an environment to another hardware set (hardnode group) note: that before the appliance, each of the script templates, presented above, have to be additionally adjusted to make them work properly within a particular jelastic installation. thus, the list of parameters below should be obligatory substituted according to your platform’s settings: /path/to/scripts/ – the full path to your scripts folder (created in the previous step) {cloud_domain} – your jelastic platform domain name {jca_dashboard_appid} – your dashboard id, that could be seen within the platform.dashboard_appid parameter at the jca > about section {jca_appstore_appid} – appstore id, listed within the same section (at the platform.appstore_appid parameter) {url_to_manifest} – link to the manifest file created according to our documentation (you may also use this one as an example – it sets up two tomcat application servers with the nginx load-balancer in front of them) note: above you can see one more runtest.sh script uploaded – it simulates the testing activities for demonstration purposes, thus we don’t provide its code in this tutorial. if required, create your own one according the specifics of your application and upload it alongside the rest of the scripts. in addition, you need to create a separate file for storing the variable with environment name (as it needs to be dynamically changed each time a new environment is created): echo env_name= > /opt/tomcat/demo/variables these are the main steps of preparation to achieve automatic continuous integration and delivery of your web application with a help of jenkins within jelastic cloud platform. in the second part of these blog series, we’ll configure the set of jobs at the jenkins server, which represents the core of our automation. each of them will be devoted for a particular operation, required to be run at the corresponding application life cycle phase: create environment > build and deploy > dev tests > migrate to qa > qa tests > migrate to production stay tuned to see the next steps. if you still don’t have jelastic installation, contact us to get access to our free demo for cloud platform evaluation or just start with trial registration at one of our hosting partners .

June 30, 2015

by Tetiana Markova

· 3,165 Views · 1 Like

Integrating SonarQube with Nexus Lifecycle

Many development organizations we work with have turned to SonarQube as a dashboard to visualize and measure their code quality. Customers using Nexus Lifecycle (formerly CLM) want to surface known security vulnerabilities and license risk in the same place developers or executives already go to assess the overall quality of their application. To support this growing interest from our customers, we have introduced Nexus Lifecycle integration with SonarQube. Figure 1. SonarQube widget example highlights open source policy violations that require attention. Drill down reports with with detailed analysis are accessible directly from this widget. This integration will allow you to access summary-level Nexus Lifecycle information for your applications, as well as link to Nexus Lifecycle Application Composition Reportsdirectly from your SonarQube projects. Figure 2. Nexus Lifecycle Application Composition Reports offer detailed analysis of license and security issues down to the individual components and risks. If you are already using SonarQube, you know first hand the impact that principles such as the 7 Axes of Code Quality can have on the applications and projects your teams create. Paralleling this, as a user of Nexus Lifecycle you also know how using good components is a critical and essential part of developing quality applications. Nexus Lifecycle for SonarQube brings both of these together. THE SOFTWARE: For Nexus Lifecycle users needing access to the 1.11 release, it can be found on our KnowledgeBase here. THE INTEGRATION: For Nexus Lifecycle users looking for more information on the SonarQube integration, you can quickly get up-and-running with our online guideshere. LEARN MORE: What to learn more about SonarQube? Here is an informativearticle I found from Nadeem Mohammad. Finally, if you are looking for information on how Nexus Lifecycle integrates into your complete development environment, here are some links that you might find helpful: Integration with continuous integration servers (e.g., Hudson/Jenkins), Integration integrates with IDEs (e.g., Eclipse) Integration integrates with repository management (e.g., Nexus) Integration integrates with build managers (e.g., Maven)

June 30, 2015

by Brian Fox

· 4,111 Views · 1 Like

Instant Enterprise REST Accelerates the Software Driven Business

Software Driven Business is a consensus goal. But real challenges exist: the time, cost and complexity of building such apps is substantial. Business Agility – and strategic business advantage – is lost. We need another revolution – Instant Enterprise REST – that provides Business Agility using business-level specifications rather than low-level code, and delivers Enterprise-class scalability, integration, enforcement and extensibility. It’s now a reality with Instant Enterprise REST. Software Driven Business: Consensus Vision Businesses have seen the value in providing mobile and tablet apps that bring the business into the hands of customers and employees. They provide information at their finger tips – wherever they are. Industry Leaders like CA have pioneered the vision of a Software Driven Business. They argue persuasively that strategic business advantage lies in Time to Market and Time to Decision: “reveal the need for speed in the application economy. As companies transform into software-driven enterprises, bringing high-quality applications to market faster becomes one of the most critical differentiators.” The Business Agility Gap While there is consensus around this vision, there is a substantial gap in realizing the Software Driven Business. It centers around Agility – time to market. As CA argues, this drives strategic business advantage. This problem manifests both to Business Users and IT, although differently. You might have been party to a discussion like this: Business Users are frustrated about how long it takes to create systems, and revise them. They see problems that look nearly as simple as a spreadsheet take weeks… to months. How can it months for IT to build a system that takes days on a spreadsheet? IT is no less frustrated. They understand the deep technology it takes to build Enterprise-class systems: We’re working 90 hours a week. And falling behind. Gap Analysis For apps about critical corporate data, there’s general consensus that the time and cost for such systems are about evenly split between backends and front ends. And there’s nearly universal consensus that, independent of the UI technology, that RESTful APIs deliver the backend data. But the backend is far more than basic data access. A “SQL Pass-through” – simply restifying SQL data – does not meet Enterprise-class requirements to scale, integrate and enforce: Scale – APIs require Pagination to address large result sets, Nested Documents to reduce latency, Optimistic Locking to ensure concurrency. These are not provided in a simple SQL Pass-through – you must program them, by hand. Integrate – a wizard can produce an API from schema objects, but it cannot address multiple databases, or integrate non-SQL data sources such as ERP, other RESTful services, or NoSQL. Enforce – an API needs to enforce our security (down to the row level), and the integrity of the data. These are significant tasks, which are sadly often placed in client buttons where they cannot be shared. Providing these Enterprise class services takes significant time, expertise and expense. Business Agility is reduced. IT is essentially being forced to cover inadequate technology infrastructure. The Business Users are right: if the Business Specification is clear, then that ought to be enough: A clear business specification should be sufficient. Everything else is just friction. The vision of the Software Driven Business requires Business Driven Software that pre-supplies the infrastructure. We are not seeking 10 or 15%. We are looking for orders of magnitude. Our vision must be: We should be able to create RESTful APIs (mainly) from business specifications, not low level code. It should be no more difficult to create a system than it is toimagine it. Business-Driven Software: Instant Enterprise REST Business Driven Software is more than just a clever play on words. It’s a real implementation that delivers this vision, and we call it Instant Enterprise REST. It consists of 3 core technologies: Enterprise Pattern Automation – creates APIs that with Enterprise-class scalability built-in (pagination, nested documents, optimistic locking, etc) Declarative – specify your API, integration and enforcement policies with spreadsheet-like rules in a simple point-and-click UI Extensibility – enables the RESTful APIs to invoke your existing logic, inside or outside the JVM, via standard server-side JavaScript. The combination of these 3 technologies enables you to create RESTful APIs for database backends – half your system – 10 times faster. Let’s briefly examine them below. Technology 1: Enterprise Pattern Automation There are well known patterns in the data domain, describing data structure and access via SQL. There are also well-known patterns for managing SQL data in the context of RESTful services. Well known patterns can be automated. Let’s imagine a service (say, a server accessed via a browser) that automates these patterns, as described below, just by connecting the service to a database: Schema Discovery – tables, views, stored procedures: The system creates a complete (default) API for each schema object. Note this includes Stored Procedures, which often represent a significant investment. Enterprise Pattern Automation: the resultant API provides well-known services for Filter, Sort, Pagination, Optimistic Locking, handling Generated Keys and so forth. So, the service has provided a default Enterprise-class API, instantly. So, literally seconds into your project, you can test your running API: Not enough, not done, but a great start. Technology 2: Declarative Declarative is the key (“what, not how”). It has had striking impacts on domains where there are well-understood underlying patterns. Max Tardiveau has put it well: Whatever can be declarative, will be declarative. For example, spreadsheets are declarative – and they gave birth to the PC industry. And SQL is declarative – itself an industry. Two game-changers. So, the challenge is to apply the spirit of declarative to REST integration and enforcement. The stakes are high – success can deliver breathtaking agility. Declarative Integration: Multi-Database Custom API, Point and Click Enterprise Pattern Automation provides a good start, but the API is not rich. It is a flat, single-table API, really just “restified” SQL. What we really need is Nested Documents – returning multiple types (e.g., an Order, a list of Items, and a list of contact names) in a single call can reduce latency (vs. a separate call for each type). REST is perfect for this. Multi-database APIs – a RESTful server provides the opportunity to integrate multiple databases in single call, shielding clients from underlying complexity. Nested Documents are easy: define them by simply selecting tables (via a User Interface or Command Line). Foreign Keys are used to default the joins. Add the ability to choose / alias columns, and we’re on the way to a pretty good API. But what about databases that have no Foreign Keys? Or multi-database APIs? Leveraging the schema does not mean we are limited to it. All we need to do is: Provide a means to define “Virtual” Foreign Keys for the service (i.e., stored outside the schema) Extend this to Foreign Keys between databases We now have a rich, multi-database API. Defined declaratively as shown below, no code required, running in minutes, ready for client development: Declarative Enforcement: Integrity Logic, with spreadsheet-like rules So now consider enforcement, specifically database integrity. A very significant portion of any project is the multi-table validations and computations that define how the data is processed. “Your code goes here” means, well, a lot of code. We need a more powerful, more declarative, paradigm. In a spreadsheet, you assign expressions to cells. Whenever the referenced data is changed, the cell is updated. Since the cells references can chain, a series of simple expressions can solve remarkably complex problems. What if we did the same for database data? We could assign derivation expressions to columns, and validation expressions to tables. Then, the API could “watch” for requests that change the referenced column, and recompute (efficiently) the calculated column. Just as in a spreadsheet, support for chaining and proper ordering is required and implicit. To address multi-table logic, such expressions would need to address references to related tables. It’s only at this point that the logic becomes seriously powerful. Let’s take an example. To check credit in a Customer / Purchaseorder / Lineitem application, we could define spreadsheet-like expressions such as: There is actually a sub-branch of declarative that addresses this: Reactive Programming. Here it’s declarative,since you don’t need to code a Observer handler. The result is that the logic above can be fully executable. No need to code Change Detection / Change Dependency – it’s invoked and enforced automatically by the API in reaction to RESTful updates. SQL handling is also implicit, including underlying optimizations (caching, pruning etc). The impact is massive – the 5 expressions above express the same logic as hundreds of lines of code. That’s a massive 40X more concise. Game changer. And quality goes up, since the rules are applied automatically. Declarative Enforcement: Security, filter expressions for role/table We can provide an analogous approach to security: define filter expressions for roles (like SalesRep), so that when a table is accessed by the role, the API adds the filter. That way, a user with that role sees only the rows for which they are authorized. Technology 3: Standards-based Extensibility Declarative is great, but you’re probably thinking “ok, but you can’t solve every problem declaratively”. And you’re dead right. Business Value requires that we integrate a declarative approach with a procedural one that is familiar, standards-based, and enables us to integrate existing software. Automatic JavaScript Object Model The first phase of many projects is to build an ORM for natural programmatic access to data: JPA, Hibernate, Entity Framework. It’s not a small project, and cumbersome to maintain as changes occur. In fact, the Object Model can be created directly from the schema. So, you’d have an object type for Purchaseorder, for Lineitem, and so forth. The model provides access to attributes and related data, and persistence services. You could then use it as shown below. JavaScript seems like the best language choice: reasonable across technology bases (everybody uses JavaScript), and its dynamic nature eliminates code generation hassles. JavaScript Events In addition to accessors and persistence, the JavaScript objects are Logic Aware. That is, the save operation above executes any rules associated with OrderAudit (e.g., updated-by), and JavaScript Events. Here is a sample event for the PurchaseOrder object, where you access the JavaScript Object Model via the system-supplied row variable: Extensible Logic Auditing is a common pattern. It should be possible to solve this once in a genericmanner, then re-use it (e.g, to audit employees, orders and so forth). So, Instant Enterprise REST should enable you to provide Extensible Logic – load your own JavaScript code, and invoke it. So, the code above could become: MyLibrary.auditFromTo(orderRow,"OrderAudit"); where auditFromTo creates an instance of OrderAudit, sets the foreign key, sets like-named attributes, and saves it. Pluggable Authentication Most organizations have existing data stores that identify users and their roles, such as Active Directory, LDAP, OAuth, etc. Security should integrate with such systems as a function of enforcing row/column access. Standard deployment Finally, the system should deploy in a familiar manner: available on the cloud, or an on-premise virtual appliance or war file. Standards also enable integration with related critical infrastructure, such as API Management, ERP Systems, etc. See a project in 3 minutes To see how it all fits together, you can view this video to see a full project built: from concept, through initial implementation, and an iteration cycle. Actual project time was about half an hour. Instant Enterprise REST: Business Agility Instant Enterprise REST enables us to close the Agility Gap in realizing the Software Driven Business vision. We can now create important portions of our software in largely business terms, rather than technical terms. This offers major advantages: Time to Market: spreadsheet-like rules are 40X more concise. Instant REST eliminates all the SQL / REST / JSON boilerplate. Simplicity: team members can learn the basics of Espresso in days, and be as productive as rocket scientists using alternative technologies Leverage Expertise and Software: Espresso is built on standards like REST, JavaScript, and Event Oriented Programming. You can call out to existing software, and extend the rule types by identifying your own patterns and loading their implementations into Espresso. Quality: at the defect level, automatic invocation and ordering eliminate large classes of bugs. At the architectural level, centralized enforcement factors logic out of the client buttons where it can be shared, audited for compliances, etc

June 30, 2015

by Val Huber

CORE

· 1,411 Views

Azure Service Bus – As I Understand It: Part I (Overview)

Recently we started working on including support for Azure Service Bus in Cloud Portam. Prior to this, I had no experience with this service though it has been around for quite some time and I always wanted to try this out but one thing or another (oh, my stupid excuses :)!) prevented me from doing so. I learned a lot (and I am still learning) about this service while including support for it in Cloud Portam and this blog post talks about my learning. Please note that at the time of writing of all in all I have about a week of learning about this service so it is quite possible that I may be wrong about certain things. If that’s the case, please let me know and I will fix them ASAP. Now that the tone is set, let’s start! Azure Service Bus Offering The way I understand is that “Azure Service Bus” is a cloud-based messaging service that enables you to connect virtually anything – be it applications, services or devices. The beauty of Service Bus is that these things need not be in the cloud. They can run anywhere even inside the firewalled networks! Another thing I learned is that “Azure Service Bus” is essentially an umbrella service. At the time of writing of this post, there are actually four distinct services that are collectively offered under “Service Bus” umbrella – Queues, Topics & Subscriptions, Relays and Notification Hubs. Each service serves a different purpose yet the common theme is that all of them provide rich messaging infrastructure. To give you an analogy, if you have used Azure Storage Service you may already know that it offers four distinct services – Blobs, Files, Queues and Tables. It is the same with Service Bus as well. Queues Queues is the simplest of the service and kind of compares with Azure Storage Queue Service in the sense that it provides a unidirectional messaging infrastructure where a publisher publishes a message and the message is received by a receiver. There can be many receivers ready to receive the messages however one receiver can only receive a message. No two receivers can receive a single message simultaneously. For an in-depth comparison of Service Bus Queue and Storage Queues, please see this link: https://msdn.microsoft.com/en-us/library/azure/hh767287.aspx. Topics Topics are like queues in the sense that it also provides a unidirectional messaging infrastructure where a publisher publishes a message and receivers receive the message. The key difference is that same message can be received by multiple receivers (subscribers). Each subscriber can optionally specify a filter criteria so that they only receive the messages matching that criteria. To understand the difference between the two, let’s consider an example. Let’s say you run an e-commerce site and on successful completion of order, you have two tasks: 1) Send an email to customer about the order and 2) Notify the warehouse. If you were using Queues, you would either create 2 queues and put email notification message in one queue and warehouse notification message in another queue or build a workflow where you would send order confirmation message to a queue. Receiver would take that message and send out an email and then put warehouse notification message in the same queue (or other queue) and then another receiver would receive the message and notify the warehouse. However if you were using Topics, things would be much simpler logistically speaking. Essentially you would have just one message (order confirmation) but there will be two subscribers – one will be responsible for sending the email confirmation and the other will be responsible for notifying the warehouse. Relays Unlike Queues and Topics, which provide unidirectional flow of messages a Relay provides bi-directional flow. Using Relays, two disparate applications, services or devices can exchange messages. Other key difference is that a Relay doesn’t store the message like Queues and Topics. It just passes the messages from source to destination. Event Hubs Event Hubs service is meant for ingesting events and telemetry data in the cloud at massive scale (millions of events / second). Event Hubs are now more than important considering the push for connected devices (Internet-of-Things). Azure Service Bus Tiers Azure Service Bus is offered under two tiers (or SKUs if you would like): Basic and Standard. The difference is the level of functionality offered in each tier and the pricing. For example, Topics, Relays and Notification Hubs are only offered under Standard tier. Even with Queues, a limited set of functionality is exposed under Basic tier. For a list of features offered under each tier, please see this link: http://azure.microsoft.com/en-in/pricing/details/service-bus/. Summary That’s it for this post. In the next posts in this series, I will share my learnings about Queues and other Service Bus services. So stay tuned for that! Again, if you think that I have provided some incorrect information, please let me know and I will fix them ASAP.

June 30, 2015

by Gaurav Mantri

· 1,262 Views

Wrangling the Different Docker APIs

[This article was written by Alex Harford.] Docker APIs are a convenient way for your systems to talk to Docker infrastructure. But sometimes there are challenges associated with them. I've outlined in this blog the steps you need to take and the items you need to look out for when working with Docker APIs. Initial Docker Setup Ensure you have the latest Docker client installed. It should be v1.6 or newer. [alexh:~/work] docker pull ubuntu latest: Pulling from ubuntu 428b411c28f0: Pull complete 435050075b3f: Pull complete 9fd3c8c9af32: Pull complete 6d4946999d4f: Already exists ubuntu:latest: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security. Digest: sha256:45e42b43f2ff4850dcf52960ee89c21cda79ec657302d36faaaa07d880215dd9 Status: Downloaded newer image for ubuntu:latest [alexh:~/work] docker run -ti ubuntu /bin/bash root@1092e8ca2ead:/# ps PID TTY TIME CMD 1 ? 00:00:00 bash 14 ? 00:00:00 ps root@1092e8ca2ead:/# exit exit Daemons, Registries, Hubs The Docker registry is used to host docker images for download. In the most simple case, it can be a process serving static images. This would be a read-only registry supporting GET operations only. If you need something more complex, you need to use a Docker registry web service. You can [a target="_blank" href="http://www.activestate.com/blog/2014/01/deploying-your-own-private-docker-registry"]run your own private Docker registry or use the public official Docker Hub. The Docker Hub contains a Docker registry, but also includes other features, like user authentication. In our examples, we will run an unauthenticated Docker registry. Setup If you are using standard Docker images, most people will pull from the Docker Hub, which is a publically accessible Docker registry. However, a more complicated service may be talking to private Docker registries running different versions of the API. Let’s assemble a test environment with both versions of the docker registry API so we can see the different ways you can access it. First, pull down two versions of the docker registry from the Docker Hub: docker pull registry:0.9.1 0.9.1: Pulling from registry e9e06b06e14c: Pull complete a82efea989f9: Pull complete 37bea4ee0c81: Pull complete 07f8e8c5e660: Pull complete 1f4ab7282e19: Pull complete 3c27027cdae8: Pull complete 7e0e5314436e: Pull complete 2696504d3685: Pull complete 012772dbb1c6: Pull complete e24d9fce1d00: Pull complete fd2726a79da8: Pull complete bffc32d7113a: Pull complete 0cd49aa0e23c: Pull complete 4e698fa80441: Already exists registry:0.9.1: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security. Digest: sha256:98937757728eecbd72c9276bf711260aa29896f15217ce05be0562287e73232d Status: Downloaded newer image for registry:0.9.1 [alexh:~/work] docker pull registry:2.0.1 2.0.1: Pulling from registry 39bb80489af7: Pull complete df2a0347c9d0: Pull complete 7a3871ba15f8: Pull complete a2703ed272d7: Pull complete 68769176e114: Pull complete ab2ab59d7d1b: Pull complete 882ecee9f360: Pull complete 40de65f8e79f: Pull complete 0c4f9c7d798f: Pull complete ca29675fe853: Pull complete 89d10e9463e5: Pull complete 1a5aa415e484: Pull complete 3ea7a9e93b04: Pull complete 769d811a57fd: Pull complete ae8a4a3af1aa: Pull complete 85cc9a791bb5: Pull complete 9cd2c8646022: Pull complete 048c32c549b9: Pull complete cbbbda28c189: Pull complete 2602c005e534: Pull complete 136beb445cfa: Pull complete 0c5e5ef1d7da: Already exists registry:2.0.1: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security. Digest: sha256:0cd177d687589aff586aa2c66c64d1c25657b8d09cff9e1492192f496e7786c3 Status: Downloaded newer image for registry:2.0.1 The next step is to start them. We will start the v1 registry on port 5000, and the v2 registry on port 6000. The v1 registry occasionally fails when starting due to a lock file race condition, so tell it to restart if necessary. [alexh:~/work] docker run -p 5000:5000 -d --restart=on-failure:3 registry:0.9.1 896c651b9bfa9780b14e3710d20428baab8497c30b9bc89946b192e1d1c145aa [alexh:~/work] docker run -p 6000:5000 -d registry:2.0.1 e09d4204921c732879ee9b7544cd40a25275e0d1f1702cacd954412cfd586ffb [alexh:~/work] docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES e09d4204921c registry:2.0.1 "registry cmd/regist 4 seconds ago Up 3 seconds 0.0.0.0:6000->5000/tcp silly_albattani 896c651b9bfa registry:0.9.1 "docker-registry" 35 seconds ago Up 34 seconds 0.0.0.0:5000->5000/tcp jovial_leakey Understanding Docker Namespaces Docker has a concept of namespaces for its repositories which can be confusing. [a target="_blank" href="https://docs.docker.com/docker-hub/official_repos/"]Official Repositories can be referred to without a username prefix: CentOS Ubuntu Internally these are prefixed by library/. This means that command like docker pull ubuntu:15.10 and docker pull library/ubuntu:15.10 are equivalent. If the name includes a '/' character (samalba/docker-registry), the left side refers to the username, and the right side refers to the image name in their public repository. It gets more complex when accessing private registries. The format becomes HOST:PORT/[USERNAME/]IMAGE. However, you should note that there is no authentication performed at this layer of our docker registry environment: anyone can push, pull, or delete from any 'user'. If the USERNAME is omitted, it is internally treated as being an 'official' image, and prefixed with library/. docker pull 127.0.0.1:5000/library/test-ubuntu Pulling repository 127.0.0.1:5000/library/test-ubuntu FATA[0004] Error: image library/test-ubuntu:latest not found [alexh:~/work] docker tag 0fe5a10d2cf8 127.0.0.1:5000/test-ubuntu [alexh:~/work] docker push 127.0.0.1:5000/test-ubuntu The push refers to a repository [127.0.0.1:5000/test-ubuntu] (len: 1) Sending image list Pushing repository 127.0.0.1:5000/test-ubuntu (1 tags) Image 5c1d0c04c3b8 already pushed, skipping Image 8c63e4ac9a5f already pushed, skipping Image 5fc05c0feaea already pushed, skipping Image 0fe5a10d2cf8 already pushed, skipping Pushing tag for rev [0fe5a10d2cf8] on {http://127.0.0.1:5000/v1/repositories/test-ubuntu/tags/latest} [alexh:~/work] docker pull 127.0.0.1:5000/library/test-ubuntu Pulling repository 127.0.0.1:5000/library/test-ubuntu 0fe5a10d2cf8: Download complete 5c1d0c04c3b8: Download complete 8c63e4ac9a5f: Download complete 5fc05c0feaea: Download complete Status: Image is up to date for 127.0.0.1:5000/library/test-ubuntu:latest In the v2 Docker registry, the [a target="_blank" href="https://docs.docker.com/registry/spec/api/#overview"]URI scheme has changed to allow the repository name to be broken up into multiple components. However, the Docker client does not yet support this flexibility. In the future, you should be able to extend the namespace of your registries, ie `redhat/centos/beta or redhat/fedora/stable. Populating the Registries We'll use Ubuntu 15.10 as our example image: docker pull ubuntu:15.10 15.10: Pulling from ubuntu 5c1d0c04c3b8: Pull complete 8c63e4ac9a5f: Pull complete 5fc05c0feaea: Pull complete 0fe5a10d2cf8: Already exists ubuntu:15.10: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security. Digest: sha256:d569b6ebfc62f35f9792392724bd4a74a4f5f5af10ccbc1880974ae2f0660898 Status: Downloaded newer image for ubuntu:15.10 It needs to be tagged with the new URL in order to push it to the private registries: [alexh:~/work] docker tag ubuntu:15.10 127.0.0.1:5000/ubuntu:15.10 [alexh:~/work] docker tag ubuntu:15.10 127.0.0.1:6000/ubuntu:15.10 [alexh:~/work] docker push 127.0.0.1:5000/ubuntu:15.10 The push refers to a repository [127.0.0.1:5000/ubuntu] (len: 1) Sending image list Pushing repository 127.0.0.1:5000/ubuntu (1 tags) 5c1d0c04c3b8: Image successfully pushed 8c63e4ac9a5f: Image successfully pushed 5fc05c0feaea: Image successfully pushed 0fe5a10d2cf8: Image successfully pushed Pushing tag for rev [0fe5a10d2cf8] on {http://127.0.0.1:5000/v1/repositories/ubuntu/tags/15.10} [alexh:~/work] docker push 127.0.0.1:6000/ubuntu:15.10 The push refers to a repository [127.0.0.1:6000/ubuntu] (len: 1) 0fe5a10d2cf8: Image already exists 5fc05c0feaea: Image successfully pushed 8c63e4ac9a5f: Image successfully pushed 5c1d0c04c3b8: Image successfully pushed Digest: sha256:1f93077ce8f2fa1da8aae87735f395eae93a1c21928d3e2d130717c9aeff177d Note that the output between the v1 registry (on port 5000) and v2 (port 6000) are slightly different, but the result is the same: the Ubuntu image is now available on each registry. Docker Registry APIs At this point, we're able to compare the different APIs. In April 2015, Docker [a target="_blank" href="http://docs.docker.com/v1.6/release-notes/"]released version 1.6 and this included v2 of the Registry. Your software should be aware of the different versions of the Docker Registry API to handle these differences. Let's look at what it takes to download the image layers through the various APIs in order to make an offline cache. First, we'll prepare our environment: [alexh:~/work] export image=ubuntu [alexh:~/work] export tag=15.10 v1 The v1 private registry can be examined at this point: [alexh:~/work] curl -s http://127.0.0.1:5000/v1/repositories/library/$image/tags/$tag | python -m json.tool "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547" export v1_image_id=`curl -s http://127.0.0.1:5000/v1/repositories/library/$image/tags/$tag | sed 's/"//g'` [alexh:~/work] curl -s http://127.0.0.1:5000/v1/images/$v1_image_id/ancestry | python -m json.tool [ "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547", "5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1", "8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824", "5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73" ] [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547/layer > 0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1/layer > 5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824/layer > 8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73/layer > 5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73.tar.gz v1 on Docker Hub The Docker Hub currently implements the v1 API, but requires an authentication token for certain operations. It also allows multiple endpoints to be returned by the server. We'll take the simple approach of always using the first endpoint: [alexh:~/work] export endpoint=`curl -sSL -o /dev/null -D- "https://index.docker.io/v1/repositories/$image/images" | awk '/X-Docker-Endpoints/{print $2}' | tr -d '\r' | sed 's/,//'` [alexh:~/work] echo $endpoint registry-1.docker.io [alexh:~/work] export token=`curl -sSL -o /dev/null -D- -H 'X-Docker-Token: true' "https://index.docker.io/v1/repositories/$image/images" | tr -d '\r' | awk '/X-Docker-Token/{print $2}'` The token needs to be used for authentication for the rest of the commands, but otherwise they are the same as the v1 private registry: [alexh:~/work] export v1_image_id=`curl -s -H "Authorization: Token $token" https://$endpoint/v1/repositories/library/$image/tags/$tag | sed 's/"//g'` [alexh:~/work] curl -sSL -H "Authorization: Token $token" "https://registry-1.docker.io/v1/images/$v1_image_id/ancestry" | python -m json.tool [ "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547", "5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1", "8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824", "5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73" ] [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547/layer > 0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547.tar.gz [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1/layer > 5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1.tar.gz [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824/layer > 8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824.tar.gz [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73/layer > 5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73.tar.gz v2 API The v2 API works with manifest files that include checksums. It's also slightly simpler. A manifest file for a tag contains all of the layer information, rather than requiring an image ID to be looked up for a tag, and then the ancestry for that image to be looked up. [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/manifests/$tag | python -c 'import sys, json, pprint; pprint.pprint(json.load(sys.stdin)["fsLayers"])' [{u'blobSum': u'sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4'}, {u'blobSum': u'sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4'}, {u'blobSum': u'sha256:d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d'}, {u'blobSum': u'sha256:23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca'}, {u'blobSum': u'sha256:7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107'}] [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4 > a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d > d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca > 23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107 > 7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107.tar.gz We can get the checksum for these files to verify that they are what is described in the manifest file: [alexh:~/work] sha256sum *.tar.gz a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4 a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d.tar.gz 23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca 23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca.tar.gz 7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107 7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107.tar.gz The Remote (daemon) API Another API that is available is the Docker daemon running locally. It can be accessed over a Unix socket, or over TCP if the daemon is configured to allow it. [alexh:~/work] echo -e "GET /images/json HTTP/1.0\r\n" | nc -U /var/run/docker.sock | tail -n +6 | python -m json.tool [ { "Created": 1433116930, "Id": "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547", "Labels": {}, "ParentId": "5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1", "RepoDigests": [], "RepoTags": [ "127.0.0.1:6000/ubuntu:15.10", "ubuntu:15.10", "127.0.0.1:5000/ubuntu:15.10" ], "Size": 0, "VirtualSize": 132392276 }, { "Created": 1432704049, "Id": "0c5e5ef1d7dac23c7164ea48faafc79f0c921f6cf87d2d8ea7469832ea31e4ca", "Labels": {}, "ParentId": "136beb445cfa7f48dbe4e36a80a83d4b7945682827fd8bfb1510ac17b6a200c0", "RepoDigests": [], "RepoTags": [ "registry:2.0.1" ], "Size": 0, "VirtualSize": 548626543 }, { "Created": 1432703977, "Id": "4e698fa804417b34b334793bab8a143403be9384e0651067b0c3933fe8d90eb2", "Labels": {}, "ParentId": "0cd49aa0e23cfe176cbea4bf622d552a6f16b21965cf52d633f8c9e27438f52c", "RepoDigests": [], "RepoTags": [ "registry:0.9.1" ], "Size": 0, "VirtualSize": 413940033 } ] A tarball containing all of the layers for a tag can be generated: [alexh:~/work] echo -e "GET /images/get?names=$image:$tag HTTP/1.0\r\n" | nc -U /var/run/docker.sock | tail -n +5 > $image-$tag.tar [alexh:~/work] mkdir tmp [alexh:~/work] tar -C tmp -xf ubuntu-15.10.tar [alexh:~/work] ls -l tmp total 20 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824 -rw-r--r-- 1 alexh alexh 87 Jun 2 15:33 repositories Conclusions Docker is a great technology and there are a lot of improvements and new features coming out at a rapid pace. Fortunately it's well documented and discussions about bugs are in the open on GitHub. However, there are still some edge cases to be aware of when talking to the Docker APIs. With some good design choices, your applications can be made backwards and forwards compatible, and will be able to use a wide range of Docker client versions and remote APIs.

June 30, 2015

by Kathy Thomas

· 1,988 Views · 2 Likes

Sync issues with your codes on GitHub

It’s no surprise that many if not all programmers use GitHub today to store their codes, but it can be frustrating to keep everyone up to date with the code changes. Recently, GitHub has been integrated with Quire, a tree-structured task management tool that lets programmers to easily keep track of code changes. By linking GitHub commits to the so-called tasks (issues), users can refer to these tasks when they look at code changes, and also trace back to the codes when they look at the tasks. In a blog article, Quire goes into a bit more detail about their new integration and what exactly users can do and benefit from it. Check out the details at the link below. Hello GitHub, We’re Quire | Quire Blog

June 30, 2015

by Crystal Chen

· 869 Views

Bharath Gowda Focuses on Business Success with DevOps—In CIOReview

“Companies report incredible benefits by adopting a DevOps approach: faster delivery of software, fewer defects, faster resolution of problems, and better allocation of limited IT resources. If your organization has already adopted a DevOps approach, you’re on the right track to developing and deploying software faster and better than ever. “But that’s not the end of the story nor should it be the ultimate driver behind DevOps. To really succeed with a DevOps approach, you need to think about what faster and better software actually means for the business and the all-important bottom line. You must make the connection between your DevOps approach and goals and their impact on your business’ goals. To do this, you must link and balance the goals of faster (speed of delivery), better software (high performing, quality software that delivers a good customer experience) to goals for innovation and business success.” Bharath Gowda in CIOReview’s DevOps Special Edition So begins New Relic director of market leadership Bharath Gowda’s recent article inCIOReview’s new DevOps Special Edition, which features insights from a number of DevOps thought leaders. In an article titled Keep the Focus on the Business to Succeed with DevOps, Bharath makes the point that faster and better software development is only the beginning for DevOps. To extract the most value from the DevOps approach, he says, “you need to think about what faster and better software actually means for the business and the all-important bottom line.” The problem, Bharath warns, is that “working better, smarter, and faster isn’t quite enough to drive business success.” Application performance and customer experience are necessary but not sufficient factors to ensure business success, as measured by revenue, customer adoption, retention, customer satisfaction, and so on. DevOps packs its biggest punch when used to enable rapid innovation driven by business outcomes. DevOps drivers and metrics With that in mind, Bharath names four key drivers for DevOps implementations: Speed Innovation/business success Customer experience Application performance In the CIOReview article, Bharath offers a helpful framework for measuring each driver, along with suggested resource allocations for each one. He notes that unlike traditional waterfall development processes, DevOps’ iterative, data-driven approach allows for the ongoing insights needed to be a better software business. That’s critical, because these days, every business is a software business. Read Bharath’s whole article here!

June 29, 2015

by Fredric Paul

· 1,878 Views

Level Up Your Automated Tests

I presented a new talk at GOTO Chicago 2015 about how to change a team’s attitude towards writing automated tests. The talk covers the same case study as Groovy vs Java for Testing, adopting Spock in MongoDB, but this is a more process/agile/people perspective, not a technical look at the merits of one language over another. Slides available below. As always, the slides are not super-useful out of context, but they do contain my conclusions (also note that due to a technology fail, my hand-drawn style is even more hand-drawn than usual). Questions I sadly did not have a lot of time for questions during the presentation, but thanks to the wonders of modern technology, I have a list of unanswered questions which I will attempt to address here. Is testing to find out your system works? Or is it so you know when your system is broken? Excellent question. I would expect that if you have a system that’s in production (which is probably the large majority of the projects we work on), we can assume the system is working, for some definition of working. Automated testing is particularly good at catching when your system stops doing the things you thought it was doing when you wrote the tests (which may, or may not, mean the system is genuinely “broken”). Regression testing is to find out when your system is no longer doing what you expect, and automated tests are really good for this. But testing can also make sure you implement code that behaves the way you expect, especially if you write the tests first. Automated tests can be used to determine that your code is complete, according to some pre-agreed specification (in this case, the automated tests you wrote up front). So I guess what I’m trying to say is, when you first write the tests you have tests that, when they pass, proves the system works (assumingyour tests are testing the right things and/or not giving you false positives). Subsequent passes show that you haven’t broken anything. At what level do “tests documenting code” actually become useful? And who is/should the documentation be targeted to? In the presentation, my case study is the MongoDB Java Driver. Our users were Java programmers, who were going to be coding using our driver. So in this example, it makes a lot of sense to document the code using a language that our users understood. We started with Java, and ended up using Groovy because it was also understandable for our users and a bit more succinct. On a previous project we had different types of tests. The unit and system tests documented what the expected behaviour was at the class or module level, and was aimed at developers in the team. The acceptance tests were written in Java, but in a friendly DSL-style way. These were usually written by a triad of tester, business analyst and developer, and documented to all these guys and girls what the top-level behaviour should be. Our audience here was fairly technical though, so there was no need to go to the extent of trying to write English-language-style tests, they were readable enough for a reasonably techy (but non-programmer) audience. These were not designed to be read by “the business” - us developers might use them to answer questions about the behaviour of the system, but they didn’t document it in a way that just anyone could understand. These are two different approaches for two different-sized team/organisations, with different users. So I guess in summary the answer is “it depends”. But at the very least, developers on your own team should be able to read your tests and understand what the expected behaviour of the code is. How do you become a team champion? I.e. get authority and acceptance that people listen to you? In my case, it was just by accident - I happened to care about the tests being green and also being useful, so I moaned at people until it happened. But it’s not just about nagging, you get more buy-in if other people see you doing the right things the right way, and it’s not too painful for them to follow your example. There are going to be things that you care about that you’ll never get other people to care about, and this will be different from team to team. You have two choices here - if you care that much, and it bothers you that much, you have to do it yourself (often on your own time, especially if your boss doesn’t buy into it). Or, you have to let it go - when it comes to quality, there are so many things you could care about that it might be more beneficial to drop one cause and pick another that you can get people to care about. For example, I wanted us to use assertThat instead of assertFalse (or true, or equals, or whatever). I tried to demo the advantages (as I saw them) of my approach to the team, and tried to push this in code reviews, but in the end the other developers weren’t sold on the benefits, and from my point of view the benefits weren’t big enough to force the issue. Those of us who cared, used assertThat. For the rest, I was just happy people were writing and maintaining tests. So, pick your battles. You’ll be surprised at how many people do get on board with things. I thought implementing checkstyle and setting draconian formatting standards was going to be a tough battle, but in the end people were just happy to have any standards, especially when they were enforced by the build. Do you report test, style, coverage, etc failures separately? Why? We didn’t fail on coverage. Enforcing a coverage percentage is a really good way to end up with crappy tests, like for getters/setters and constructors (by the way, if there’s enough logic in your constructor that it needs a test, You’re Doing It Wrong). Generally different types of failures are found by different tools, so for this reason alone they will be reported separately - for example, checkstyle will fail the build if it doesn’t conform to our style standards, codenarc fails it for Groovy style failures, and Gradle will run the tests in a different task to these two. What’s actually important, though, is time-to-failure. For checkstyle, for example, it will fail on something silly like curly braces in the wrong place. You want this to fail within seconds, so you can fix the silly mistake quickly. Ideally you’d have IntelliJ (perhaps) run your checks before it even makes it into your CI environment. Compiler errors should, of course, fail things before you run a test, short-running tests should fail before long-running tests. Basically, the easier it is to fix the problem, the sooner you want to know, I guess. Our build was relatively small and not too complex, so actually we ran all our types of tests (integration and unit, both Groovy and Java) in a single task, because this turned out to be much quicker in Gradle (in our case) than splitting things up into a simple pipeline. You might have a reason to report stuff separately, but for me it’s much more important to understand how fast I need to be aware of a particular type of failure. Sometimes I find myself modifying code design and architecture to enable testing. How can I avoid damaging design? This is a great question, and a common one too. The short answer is: in general writing code that’s easier to test leads to a cleaner design anyway (for example, dependency injection at that appropriate places). If you find you need to rip your design apart to test it, there’s a smell there somewhere - either your design isn’t following SOLID principals, or you’re trying to test the wrong things. Of course, the common example here is testing private methods - how do you test these without exposing secrets? I think for me, if it’s important enough to be tested it’s important enough to be exposed in some way - it might belong in some sort of util or helper (right now I’m not going to go into whether utils or helpers are, in themselves a smell), in a smaller class that only provides this sort of functionality, or simply a protected method. Or, if you’re testing with Groovy, you can access private methods anyway so this becomes a moot point (i.e. your testing framework may be limiting you). In another story from LMAX, we found we had created methods just for testing. It seemed a bit wrong to have these methods only available for testing, but later on down the line, we needed access to many of these methods In Real Life (well, from our Admin app), so our testing had “found” a missing feature. When we came to implement it, it was pretty easy as we’d already done most of it for testing. My co-workers often point to a lack of end-to-end testing as the reason why a lot of bugs get out to production even though they don’t have much unit tests nor integration tests. What, in your experience, is a good balance between unit tests, integration tests and end-to-end testing? Hmm, sounds to me like “lack of tests” is your problem! This is a big (and contentious!) topic. Martin Fowler has written about it, Google wrote something I completely disagree with (so I’m not even going to link to it, but you’ll find references in the links in this paragraph), and my ex-colleague Adrian talks about what we, at LMAX, meant by end-to-end tests. I hope that’s enough to get you started, there’s plenty more out there too. How did you go about getting buy in from the team to use Spock? I cover this in my other presentation on the topic - the short version is, I did a week-long spike to investigate whether Spock would make testing easier for us, showed the pros and cons to the whole team, and then led by example writing tests that (I thought) were more readable than what we had before and, probably most importantly, much easier to write than what we were previously doing. I basically got buy-in by showing how much easier it was for us to use the tool than even JUnit (which we were all familiar with). It did help that we were already using Gradle, so we already had a development dependency on Groovy. It also helped that adding Spock made no changes to the dependencies of the final Jar, which was very important. Over time, further buy-in (certainly from management) came when the new tests started catching more errors - usually regressions in our code or regressions in the server’s overnight builds. I don’t think it was Spock specifically that caught more problems - I think it was writing more tests, and better tests, that caught the issues. Can we do data driven style tests in frameworks like junit or cucumber? I don’t think you can in JUnit (although maybe there’s something out there). I believe someone told me you can do it in TestNG. Are there drawbacks to having tests that only run in ci? I.e I have Java 8 on my machine, but the test requires Java 7 Yes, definitely - the drawback is Time. You have to commit your code to a branch that is being checked by CI and wait for CI to finish before you find the error. In practice, we found very little that was different between Java 7 and 8, for example, but this is a valid concern (otherwise you wouldn’t be testing a complex matrix of dependencies at all). In our case, our Java 6 driver used Netty for async capabilities, as the stuff we were using from Java 7 wasn’t available. This was clearly a different code path that wasn’t tested by us locally as we were all running Java 8. Probably more importantly for us is we were testing against at least 3 different major versions of the server, which all supported different features (and had different APIs). I would often find I’d broken the tests for version 2.2 as I’d only been running it on 2.6, and had forgotten to either turn off the new tests for the old server versions, or didn’t realise the new functionality wouldn’t work there. So the main drawback is time - it takes a lot longer to find out about these errors. There are a few ways to get around this: Commit often!! And to a branch that’s actually going to be run by CI Make your build as fast as possible, so you get failures fast (you should be doing this anyway) You could set up virtual machines locally or somewhere cloudy to run these configurations before committing, but that sounds kinda painful (and to my mind defeats a lot of the point of CI). I set up Travis on my fork of the project, so I could have that running a different version of Java and MongoDB when I committed to my own fork - I’d be able to see some errors before they made it into the “real” project. If you can, you probably want these specific tests run first so they can fail fast. E.g. if you’re running a Java 6 & MongoDB 2.2 configuration on CI, run those tests that only work in that environment first. Would probably need some Gradle magic, and/or might need you to separate these into a different set of folders. The advantage of this approach though is if you set up some aliases on your local machine you could sanity check just these special cases before checking in. For example, I had aliases to start MongoDB versions/configurations from a single command, and to set JAVA_HOME to whichever version I wanted. Do you have any tips for unit tests that pass on dev machines but not on Jenkins because it’s not as powerful as our own machines? E.g. Synchronous calls timeout on the Jenkins builds intermittently. Erk! Yes, not uncommon. No, not really. We had our timeouts set longer than I would have liked to prevent these sorts of errors, and they still intermittently failed. You can also set some sort of retry on the test, and get your build system to re-run those that fail to see if they pass later. It’s kinda nasty though. At LMAX they were able to take testing seriously enough to really invest in their testing architecture, and, of course, this is The Correct Answer. Just often very difficult to sell. If you ask where are tests and dev asks if code is correct? And you say yes. Then dev asks why you’re delaying shipping value, how do you manage that? These are my opinions: Your code is not complete without tests that show me it’s complete. Your code might do what you think it’s supposed to do right now, but given Shared Code Ownership, anyone can come in and change it at any time, you want tests in place to make sure they don’t change it to break what you thought it did The tests are not so much to show it works right now, the tests are to show it continues to work in future Having automated tests will speed you up in future. You can refactor more safely, you can fix bugs and know almost immediately if you broke something, you can read from the test what the author of the code thought the code should do, getting you up to speed faster. You don’t know you’re shipping value without tests - you’re only shipping code (to be honest, you never know if you’re shipping value until much later on when you also analyse if people are even using the feature). Testing almost never slows you down in the long run. Show me the bits of your code base which are poorly tested, and I bet I can show you the bits of your code base that frequently have bugs (either because the code is not really doing what the author thinks, or because subsequent changes break things in subtle ways). If you say code is hard to understand and dev asks if you seriously don’t understand the code, how do you explain you mean easy to understand without thinking rather than ‘can I compile this in my head’? I have zero problem with saying “I’m too stupid to understand this code, and I expect you’re much smarter than me for writing it. Can you please write it in a way so that a less smart person like myself won’t trample all over your beautiful code at a later date through lack of understanding?” By definition, code should be easy to understand by someone who’s not the author. If someone who is not the author says the code is hard to understand, then the code is hard to understand. This is not negotiable. This is what code reviews or pair programming should address. What is effective nagging like? (Whether or not you get what you want) Mmm, good question. Off the top of my head: Don’t make the people who are the target of the nagging feel stupid - they’ll get defensive. If necessary, take the burden of “stupidity” on yourself. E.g. “I’m just not smart enough to be able to tell if this test is failing because the test is bad or because the code is bad. Can you walk me through it and help me fix it?” Do at least your fair share of the work, if not more. When I wanted to get the code to a state where we could fail style errors, I fixed 99% of the problems, and delegated the handful of remaining ones that I just didn’t have the context to fix. In the face of three errors to fix each, the team could hardly say “no” after I’d fixed over 6000. Explain why things need to be done. Developers are adults and don’t want to be treated like children. Give them a good reason and they’ll follow the rules. The few times I didn’t have good reasons, I could not get the team to do what I wanted. Find carrots and sticks that work. At LMAX, a short e-mail at the start of the day summarising the errors that had happened overnight, who seemed to be responsible, and whether they looked like real errors or intermittencies, was enough to get people to fix their problems2 - they didn’t like to look bad, but they also had enough information to get right on it, they didn’t have to wade through all the build info. On occasion, when people were ignoring this, I’d turn up to work with bags of chocolate that I’d bought with my own money, offering chocolate bars to anyone who fixed up the tests. I was random with my carrot offerings so people didn’t game the system. Give up if it’s not working. If you’ve tried to phrase the “why” in a number of ways, if you’ve tried to show examples of the benefits, if you’ve tried to work the results you want into a process, but it’s still not getting done, just accept the fact that this isn’t working for the team. Move on to something else, or find a new angle. 1 I had a colleague at LMAX who was working with a hypothesis that All Private Methods Were Evil - they were clearly only sharable within single class, so provided no reuse elsewhere, and if you have the same bit of code being called multiple times from within the same class (but it’s not valuable elsewhere) then maybe your design is wrong. I’m still pondering this specific hypothesis 4 years on, and I admit I see its pros and cons. 2 This worked so well that this process was automated by one of the guys and turned into a tool called AutoTrish, which as far as I know is still used at LMAX. Dave Farley talks about it in some of hisContinuous Delivery talks. Resources My talk that specifically looks at the advantages of Spock over JUnit, plus some Spock-specific resources. I love Jay Fields book Working Effectively With Unit Tests - if I could have made the whole team read this before moving to Spock, we might have stuck with JUnit. Go read everything Adrian Sutton has written about testing at LMAX. If not everything, definitely Abstraction by DSL and Making End-to-End Tests Work If you can’t make it all the way through Dave Farley and Jez Humble’s excellent Continuous Delivery book, do take a look at one of Dave’s presentations on the subject, for example The Rationale for Continuous Delivery or The Process, Technology and Practice of Continuous Delivery - my own talk was around testing, but I’m working off the assumption that you’re at least running some sort of Continuous Integration, if not Continuous Delivery. Martin Fowler has loads of interesting and useful articles on testing. Abstract What can you do to help developers a) write tests b) write meaningful tests and c) write readable tests? Trisha will talk about her experiences of working in a team that wanted to build quality into their new software version without a painful overhead - without a QA / Testing team, without putting in place any formal processes, without slavishly improving the coverage percentage. The team had been writing automated tests and running them in a continuous integration environment, but they were simply writing tests as another tick box to check, there to verify the developer had done what the developer had aimed to do. The team needed to move to a model where tests provided more than this. The tests needed to: Demonstrate that the library code was meeting the requirements Document in a readable fashion what those requirements were, and what should happen under non-happy-path situations Provide enough coverage so a developer could confidently refactor the code This talk will cover how the team selected a new testing framework (Spock, a framework written in Groovy that can be used to test JVM code) to aid with this effort, and how they evaluated whether this tool would meet the team’s needs. And now, two years after starting to use Spock, Trisha can talk about how both the tool and the shift in the focus of the purpose of tests has affected the quality of the code. And, interestingly, the happiness of the developers.

June 29, 2015

by Trisha Gee

· 2,073 Views

Persistence and DAO Testing Made Simple (with Exparity-Stub and Hamcrest-Bean)

Persistence of model objects is a part of many Java projects and a part which deserves, and often gets, high test coverage as one of the key layer integration points in the code. However, I've often felt the testing paradigms for this can be cumbersome, often involving a large amount of setup with an equivalent amount of validation. This can be tedious to both create and maintain. As a solution to this I've been testing persistence with a different pattern; by combining both the exparity-stub and the hamcrest-bean library you can thoroughly test model persistence in a few lines of test code as per the snippet below; .. User user = aRandomInstanceOf(User.class); User saved = dao.save(user); assertThat(dao.getUserById(saved.getId()), theSameBeanAs(saved)); The test snippet above is small but in those few lines will thoroughly test that all fields in a graph can be persisted and retrieved without loss, that any JPA or other mapping is valid, and that your queries are valid. For a complete example we'll work through testing a simple DAO for storing and retrieving User objects using the in-memory H2 database for simplicity. The same example will work for any persistence mechanism. Before we get started with an example lets briefly outline what the libraries are and what they do. The Exparity-Stub Library The exparity-stub libraries provides a set of static methods for creating stubs of model objects, object graphs, collections, types, and primitive types. For our example we'll be creating random stubs because we want to completely fill the graph with junk data and check it can be written down. exparity-stub offers two approaches to this, the RandomBuilder or the BeanBuilder. The RandomBuilder provides a terser notation to create random objects with less code. For example: User user = RandomBuilder.aRandomInstanceOf(User.class); List users = RandomBuilder.aRandomListOf(User.class); String anyString = RandomBuilder.aRandomString(); Whereas the BeanBuilder provides a fluent interface with finer control for building individual objects and graphs, for example; User user = BeanBuilder.aRandomInstanceOf(User.class) .excludeProperty("Id").build(); For this example i'm going to use the BeanBuilder so I can exclude the User.Id property from being populated by the random builder. The Hamcrest-Bean Library The hamcrest-bean library is an extension library to the Java Hamcrest library. The hamcrest-bean library provides a set of matchers specifically for testing Java objects and object graphs and performs deep inspections of those objects. It supports exclusions and overrides to allow fine control, if required, of how matching of any property, path, or type is handled, for example: User expected = new User("Jane", "Doe"); assertThat(new User("John", "Doe"), BeanMatchers.theSameAs(expected).excludeProperty("FirstName")); A Sample Project The sample project I'll work through is persistence of a simple User object with a child list of UserComment objects. This simple graph will be persisted to a H2 database with hibernate handling the Object-Relational Mapping (ORM) mapping, and Java Persistence Annotation (JPA) used to mark-up the model. The Model Below are the two model classes; first the User class. package org.exparity.hamcrest.bean.sample.dao; import java.util.*; import javax.persistence.*; @Entity @Table public class User { @Id @GeneratedValue(strategy = GenerationType.SEQUENCE) private Long id; private Date createTs; private String username, firstName, surname; @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER) private List comments = new ArrayList<>(); public Long getId() { return id; } public void setId(Long id) { this.id = id; } public Date getCreateTs() { return createTs; } public void setCreateTs(Date createTs) { this.createTs = createTs; } public String getUsername() { return username; } public void setUsername(String username) { this.username = username; } public String getFirstName() { return firstName; } public void setFirstName(String firstName) { this.firstName = firstName; } public String getSurname() { return surname; } public void setSurname(String surname) { this.surname = surname; } public List getComments() { return comments; } public void setComments(List comments) { this.comments = comments; } } Followed by the UserComment class. package org.exparity.hamcrest.bean.sample.dao; import java.util.Date; import javax.persistence.*; @Table @Entity public class UserComment { private Long id; private Date timestamp; @Transient private String text; private String title; public Date getTimestamp() { return timestamp; } public void setTimestamp(Date timestamp) { this.timestamp = timestamp; } public String getText() { return text; } public void setText(String text) { this.text = text; } public String getTitle() { return title; } public void setTitle(String title) { this.title = title; } } Followed by the UserComment class. package org.exparity.hamcrest.bean.sample.dao; import java.util.Date; import javax.persistence.*; @Table @Entity public class UserComment { private Long id; private Date timestamp; @Transient private String text; private String title; public Date getTimestamp() { return timestamp; } public void setTimestamp(Date timestamp) { this.timestamp = timestamp; } public String getText() { return text; } public void setText(String text) { this.text = text; } public String getTitle() { return title; } public void setTitle(String title) { this.title = title; } } The Data Access Object (DAO) Next up we write our DAO layer. I've excluded the UserDAO interface from this post but it is available in the sample project ongithub .The full, if somewhat crude, implementation of the UserDAO is below. package org.exparity.hamcrest.bean.sample.dao; import org.hibernate.boot.registry.StandardServiceRegistryBuilder; import org.hibernate.cfg.Configuration; import org.hibernate.*; public class UserDAOHibernateImpl implements UserDAO { private final SessionFactory factory; public UserDAOHibernateImpl(final String resourceFile) { this.factory = new Configuration() .addAnnotatedClass(User.class) .addAnnotatedClass(UserComment.class) .buildSessionFactory( new StandardServiceRegistryBuilder() .loadProperties(resourceFile) .build()); } @Override public User save(final User user) { Session session = factory.getCurrentSession(); Transaction txn = session.beginTransaction(); try { session.save(user); txn.commit(); } catch (final Exception e) { txn.rollback(); } return user; } @Override public User getUserById(Long userId) { Session session = factory.getCurrentSession(); Transaction txn = session.beginTransaction(); try { return (User) session.get(User.class, userId); } finally { txn.rollback(); } } } Integration Test And finally, onto our integration test. The hibernate.properties will create an instance of an in-memory database and create the necessary tables on instantiation of the DAO. hibernate.dialect=org.hibernate.dialect.H2Dialect hibernate.connection.username=sa hibernate.connection.password= hibernate.connection.driver_class=org.h2.Driver hibernate.connection.url=jdbc:h2:mem:test hibernate.current_session_context_class=thread hibernate.cache.provider_class=org.hibernate.cache.internal.NoCacheProvider hibernate.show_sql=true hibernate.hbm2ddl.auto=update The integration test is below. package org.exparity.hamcrest.bean.sample.dao; import static org.exparity.hamcrest.BeanMatchers.theSameBeanAs; import static org.exparity.stub.bean.BeanBuilder.aRandomInstanceOf; import static org.hamcrest.MatcherAssert.assertThat; import static org.hamcrest.Matchers.*; import org.junit.Test; public class UserDAOHibernateImplTest { @Test public void canSaveAUser() { User user = aRandomInstanceOf(User.class).excludeProperty("Id").build(); UserDAOHibernateImpl dao = new UserDAOHibernateImpl("hibernate.properties"); User saved = dao.save(user); User loaded = dao.getUserById(saved.getId()); assertThat(loaded, not(sameInstance(user))); assertThat(loaded, theSameBeanAs(user)); } } Let's break the test down step by step to see what each step is doing and why the test is put together this way. 1) Model Setup User user = aRandomInstanceOf(User.class).excludeProperty("Id").build(); Create a random instance of the User class and it's associates using exparity-stub. The instance will be populated with random data with the exception of the Id property. I've excluded the Id property so that is left null to test that the id is being generated in the database. 2) DAO Setup UserDAOHibernateImpl dao = new UserDAOHibernateImpl("hibernate.properties") Instantiate the DAO ready to be tested, passing in the property file to use for the test. The hibernate properties used will configure an in-memory instance of H2 and create the schema automatically. 3) Exercise the DAO User saved = dao.save(user); User loaded = dao.getUserById(saved.getId()); Save the random instance of the model set up in step (1) and then query the object back out again. 4) Verify the results assertThat(loaded, not(sameInstance(user))); assertThat(loaded, theSameBeanAs(user)); The first line verifies that the loaded User instance is not the same instance as the originally saved User. This prevents false positive results when the loaded instance is returned directly from a cache. The second line uses hamcrest-bean to perform a deep comparison of the loaded User instance against the original user instance. Running the Test The first run of the test yields an error; specifically a hibernate warning because a @Id annotation has been missed on UserComment. org.hibernate.AnnotationException: No identifier specified for entity: org.exparity.hamcrest.bean.sample.dao.UserComment at org.hibernate.cfg.InheritanceState.determineDefaultAccessType(InheritanceState.java:277) at org.hibernate.cfg.InheritanceState.getElementsToProcess(InheritanceState.java:224) at org.hibernate.cfg.AnnotationBinder.bindClass(AnnotationBinder.java:775) at org.hibernate.cfg.Configuration$MetadataSourceQueue.processAnnotatedClassesQueue(Configuration.java:3845) at org.hibernate.cfg.Configuration$MetadataSourceQueue.processMetadata(Configuration.java:3799) at org.hibernate.cfg.Configuration.secondPassCompile(Configuration.java:1412) at org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1846) at org.exparity.hamcrest.bean.sample.dao.UserDAOHibernateImpl.(UserDAOHibernateImpl.java:15) at org.exparity.hamcrest.bean.sample.dao.UserDAOHibernateImplTest.canSaveAUser(UserDAOHibernateImplTest.java:18) A fix to the UserComment object and we can run the test again. @Table @Entity public class UserComment { @Id @GeneratedValue(strategy = GenerationType.SEQUENCE) private Long id; private Date timestamp; @Transient private String text; private String title; ... After running the test again we get another failure. The presence of the @Transient annotation on the UserComment.text property is preventing the value being persisted java.lang.AssertionError: Expected: the same as but: User.Comments[0].Text is null instead of "mDAWDJXbheIHbbHLR1NNVJqAki49RvaVwQtKD38r79u0y3MTDD" at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8) at org.exparity.hamcrest.bean.sample.dao.UserDAOHibernateImplTest.canSaveAUser(UserDAOHibernateImplTest.java:19) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) Another change to the UserComment object to remove the @Transient annotation and we can run the test again. @Table @Entity public class UserComment { @Id @GeneratedValue(strategy = GenerationType.SEQUENCE) private Long id; private Date timestamp; private String text; private String title; ... After running the test again it all passes. Try It Out To try hamcrest-bean and exparity-stub out for yourself include the dependency in your maven pom or other dependency manager. org.exparity hamcrest-bean 1.0.10 test org.exparity exparity-stub 1.1.5 test

June 29, 2015

by Stewart Bissett

· 3,260 Views

The Cloudcast #198 - Architecting Cloud Foundry

Download the MP3 Date: June 19, 2015 By: Aaron Delp and Brian Gracely Description: Aaron and Brian talk to Chip Childers (@chipchilders, VP of Technology @CloudFoundryOrg) about the current status of Cloud Foundry projects, how Microsoft .NET will be integrated, IaaS vs. PaaS, and the CF.org thinking about overall interoperability Interested in the O'Reilly OSCON? Want to register for OSCON now? Use promo code 20CLOUD for 20% off Details to win an OSCON pass coming soon! Check out the OSCON Schedule Free eBook from O'Reilly Media for Cloudcast Listeners! Check out an excerpt from the upcoming Docker Cookbook Topic 1 - From an overall project perspective, what grades would you give Cloud Foundry in terms of stability, core functionality, security, operations, etc? Topic 2 - You were previously involved (directly/indirectly)with CloudStack. As you talk to people in the marketplace, how is it different discussing IaaS vs. PaaS. Topic 3 - How much ability will you have to drive prioritization within sub-projects or new projects? (eg. Security vs. new Languages vs. Interop, etc.) Topic 4 - What’s the CF.org way of thinking about interoperability? Topic 5 - What guidance are you giving the teams in terms of expandability of Cloud Foundry? Architecturally, are there certain places you recommend over other places? Topic 6 - Is there a place for integrating SaaS applications (monitoring, logging, etc.) into Cloud Foundry?

June 29, 2015

by Brian Gracely

· 1,158 Views

Get CoreOS Logs into ELK in 5 Minutes

CoreOS Linux is the operating system for “Super Massive Deployments”. We wanted to see how easily we can get CoreOS logs into Elasticsearch / ELK-powered centralized logging service. Here’s how to get your CoreOS logs into ELK in about 5 minutes, give or take. If you’re familiar with CoreOS and Logsene, you can grab CoreOS/Logsene config files from Github. Here’s an example Kibana Dashboard you can get in the end: CoreOS Kibana Dashboard CoreOS is based on the following: Docker and rkt for containers systemd for startup scripts, and restarting services automatically etcd as centralized configuration key/value store fleetd to distribute services over all machines in the cluster. Yum. journald to manage logs. Another yum. Amazingly, with CoreOS managing a cluster feels a lot like managing a single machine! We’ve come a long way since ENIAC! There’s one thing people notice when working with CoreOS – the repetitive inspection of local or remote logs using “journalctl -M machine-N -f | grep something“. It’s great to have easy access to logs from all machines in the cluster, but … grep? Really? Could this be done better? Of course, it’s 2015! Here is a quick example that shows how to centralize logging with CoreOS with just a few commands. The idea is to forward the output of “journalctl -o short” to Logsene‘s Syslog Receiver and take advantage of all its functionality – log searching, alerting, anomaly detection, integrated Kibana, even correlation of logs with Docker performance metrics — hey, why not, it’s all available right there, so we may as well make use of it all! Let’s get started! Preparation: 1) Get a list of IP addresses of your CoreOS machines fleetctl list-machines 2) Create a new Logsene App (here) 3) Change the Logsene App Settings, and authorize the CoreOS host IP Addresses from step 1) (here’s how/where) Congratulations – you just made it possible for your CoreOS machines to ship their logs to your new Logsene app! Test it by running the following on any of your CoreOS machines: journalctl -o short -f | ncat --ssl logsene-receiver-syslog.sematext.com 10514 …and check if the logs arrive in Logsene (here). If they don’t, yell at us @sematext – there’s nothing better than public shaming on Twitter to get us to fix things. :) Create a fleet unit file called logsene.service [Unit] Description=Logsene Log Forwarder [Service] Restart=always RestartSec=10s ExecStartPre=/bin/sh -c "if [ -n \"$(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\" ]; then echo \"Value Exists: /sematext.com/logsene/`hostname`/lastlog $(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\"; else etcdctl set /sematext.com/logsene/`hostname`/lastlog\"`date +\"%Y-%%m-%d %%H:%M:%S\"`\"; true; fi" ExecStart=/bin/sh -c "journalctl --since \"$(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\" -o short -f | ncat --ssl logsene-receiver-syslog.sematext.com 10514" ExecStopPost=/bin/sh -c "export D=\"`date +\"%Y-%%m-%%d %%H:%M:%S\"`\"; /bin/etcdctl set /sematext.com/logsene/$(hostname)/lastlog \"$D\"" [Install] WantedBy=multi-user.target [X-Fleet] Global=true Activate cluster-wide logging to Logsene with fleet To start logging to Logsene from all machines activate logsene.service: fleetctl load logsene.service fleetctl start logsene.service There. That’s all there is to it! Hope this worked for you! At this point all your CoreOS logs should be going to Logsene. Now you have a central place to see all your CoreOS logs. If you want to send your app logs to Logsene, you can do that, too — anything that can send logs via Syslog or to Elasticsearch can also ship logs to Logsene. If you want some Docker containers & host monitoring to go with your CoreOS logs, just pull spm-agent-docker from Docker Registry. Enjoy!

June 29, 2015

by Stefan Thies

· 2,646 Views

Apache Camel in Space—Erh, in Docker and Kubernetes and Fancy Fabric8 Web Console

I have just recorded a 5 minute video that demonstrates running an out of stock example from Apache Camel release, the camel-example-servlet packaged as a docker container and running on a kubernetes platform, such as openshift 3. camel-servlet-example scaled up to 3 running containers (pods) which is easy with kubernetes and fabric8 In this video I have already deployed the example and then demonstrates how we can use the fabric8 web console to manage our application. And also connect to the running container and see inside, such as the Camel routes visually as shown above. Then I run a simple bash script from my laptop that sends a HTTP GET to the Camel example and prints the response. The script runs in a endless loop and demonstrates how kubernetes can easily scale up and down multiple Camel containers and load balance across the running containers. And at the end its even self healing when I force killing docker containers. So I suggest to grab a fresh cup of tea or coffee and sit back and play the 5 minutes video. The video is hosted on vimeo and can be seen from this link.

June 29, 2015

by Claus Ibsen

· 2,148 Views · 1 Like

Stackato on the Microsoft Azure Cloud

The growth of Azure has been outstanding--more than 90,000 new subscriptions every month. And the innovation is exponential with over 500 new features and services being added to the platform in the last 12 months. We're very excited to be part of this growth. As we announced yesterday, you can now access Stackato through Azure. We think it's a great way for Azure customers to get access to a Cloud Foundry and Docker based PaaS. With Azure, Microsoft provides an easy path to the cloud for their customers. All applications can be run on one cloud. Microsoft wants to dominate the cloud the same as it has with on-premise software and rarely does a day go by without reading an article about Azure. Whether it's their recent announcement to help encourage start-up's use of Azure by providing $120,000 worth of credits per year or their commitment to open source. Azure gives its customers a growing collection of integrated services that make it easier to build and manage enterprise, mobile, web and Internet of Things (IoT) apps faster. Enterprises face real complexities when building their cloud solution. Having a solid infrastructure is really just the first step in the process--companies also need the right platform to support the deployment and management of their cloud-native applications. The platform should give their developers the freedom to use the language best suited to build the application. In addition, enterprises are on more than one cloud. They need to have the versatility to scale out or move their applications to whatever cloud is appropriate in order to meet end user demand without any downtime. With Stackato, we help remove these complexities. We provide enterprises with a polyglot PaaS that supports the development of applications in virtually any language. We like to refer to Stackato as being "infrastructure-agnostic" and allow companies to deploy their applications to any cloud--private, public or hybrid--without the need to run new scripts or re-package the application in order for it to work in the new environment. The combination of Stackato on Azure gives enterprises the technology they need to streamline application delivery, drive innovation and meet the demands of their customers.

June 29, 2015

by Kathy Thomas

· 950 Views

R: Scraping the Release Dates of Github Projects

Continuing on from my blog post about scraping Neo4j’s release dates I thought it’d be even more interesting to chart the release dates of some github projects. In theory the release dates should be accessible through the github API but the few that I looked at weren’t returning any data so I scraped the data together. We’ll be using rvest again and I first wrote the following function to extract the release versions and dates from a single page: library(dplyr) library(rvest) process_page = function(releases, session) { rows = session %>% html_nodes("ul.release-timeline-tags li") for(row in rows) { date = row %>% html_node("span.date") version = row %>% html_node("div.tag-info a") if(!is.null(version) && !is.null(date)) { date = date %>% html_text() %>% str_trim() version = version %>% html_text() %>% str_trim() releases = rbind(releases, data.frame(date = date, version = version)) } } return(releases) } Let’s try it out on the Cassandra release page and see what it comes back with: > r = process_page(data.frame(), html_session("https://github.com/apache/cassandra/releases")) > r date version 1 Jun 22, 2015 cassandra-2.1.7 2 Jun 22, 2015 cassandra-2.0.16 3 Jun 8, 2015 cassandra-2.1.6 4 Jun 8, 2015 cassandra-2.2.0-rc1 5 May 19, 2015 cassandra-2.2.0-beta1 6 May 18, 2015 cassandra-2.0.15 7 Apr 29, 2015 cassandra-2.1.5 8 Apr 1, 2015 cassandra-2.0.14 9 Apr 1, 2015 cassandra-2.1.4 10 Mar 16, 2015 cassandra-2.0.13 That works pretty well but it’s only one page! To get all the pages we can use the follow_link function to follow the ‘Next’ link until there aren’t anymore pages to process. We end up with the following function to do this: find_all_releases = function(starting_page) { s = html_session(starting_page) releases = data.frame() next_page = TRUE while(next_page) { possibleError = tryCatch({ releases = process_page(releases, s) s = s %>% follow_link("Next") }, error = function(e) { e }) if(inherits(possibleError, "error")){ next_page = FALSE } } return(releases) } Let’s try it out starting from the Cassandra page: > cassandra = find_all_releases("https://github.com/apache/cassandra/releases") Navigating to https://github.com/apache/cassandra/releases?after=cassandra-2.0.13 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-2.0.10 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-2.0.8 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.2.13 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-2.0.0-rc1 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.2.3 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.2.0-beta2 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.0.10 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.0.6 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-1.0.0-rc2 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.7.7 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.7.4 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.7.0-rc3 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.6.4 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.5.0-rc3 Navigating to https://github.com/apache/cassandra/releases?after=cassandra-0.4.0-final > cassandra %>% sample_n(10) date version 151 Mar 13, 2010 cassandra-0.5.0-rc2 25 Jul 3, 2014 cassandra-1.2.18 51 Jul 27, 2013 cassandra-1.2.8 21 Aug 19, 2014 cassandra-2.1.0-rc6 73 Sep 24, 2012 cassandra-1.2.0-beta1 158 Mar 13, 2010 cassandra-0.4.0-rc2 113 May 20, 2011 cassandra-0.7.6-2 15 Oct 24, 2014 cassandra-2.1.1 103 Sep 15, 2011 cassandra-1.0.0-beta1 93 Nov 29, 2011 cassandra-1.0.4 I want to plot when the different releases happened in time and in order to do that we need to create an extra column containing the ‘release series’ which we can do with the following transformation: series = function(version) { parts = strsplit(as.character(version), "\\.") return(unlist(lapply(parts, function(p) paste(p %>% unlist %>% head(2), collapse = ".")))) } bySeries = cassandra %>% mutate(date2 = mdy(date), series = series(version), short_version = gsub("cassandra-", "", version), short_series = series(short_version)) > bySeries %>% sample_n(10) date version date2 series short_version short_series 3 Jun 8, 2015 cassandra-2.1.6 2015-06-08 cassandra-2.1 2.1.6 2.1 161 Mar 13, 2010 cassandra-0.4.0-beta1 2010-03-13 cassandra-0.4 0.4.0-beta1 0.4 62 Feb 15, 2013 cassandra-1.1.10 2013-02-15 cassandra-1.1 1.1.10 1.1 153 Mar 13, 2010 cassandra-0.5.0-beta2 2010-03-13 cassandra-0.5 0.5.0-beta2 0.5 37 Feb 7, 2014 cassandra-2.0.5 2014-02-07 cassandra-2.0 2.0.5 2.0 36 Feb 7, 2014 cassandra-1.2.15 2014-02-07 cassandra-1.2 1.2.15 1.2 29 Jun 2, 2014 cassandra-2.1.0-rc1 2014-06-02 cassandra-2.1 2.1.0-rc1 2.1 21 Aug 19, 2014 cassandra-2.1.0-rc6 2014-08-19 cassandra-2.1 2.1.0-rc6 2.1 123 Feb 16, 2011 cassandra-0.7.2 2011-02-16 cassandra-0.7 0.7.2 0.7 135 Nov 1, 2010 cassandra-0.7.0-beta3 2010-11-01 cassandra-0.7 0.7.0-beta3 0.7 Now let’s plot those releases and see what we get: ggplot(aes(x = date2, y = short_series), data = bySeries %>% filter(!grepl("beta|rc", short_version))) + geom_text(aes(label=short_version),hjust=0.5, vjust=0.5, size = 4, angle = 90) + theme_bw() An interesting thing we can see from this visualisation is what overlap the various series of versions have. Most of the time there are only two series of versions overlapping but the 1.2, 2.0 and 2.1 series all overlap which is unusual. In this chart we excluded all beta and RC versions. Let’s bring those back in and just show the last 3 versions: ggplot(aes(x = date2, y = short_series), data = bySeries %>% filter(grepl("2\\.[012]\\.|1\\.2\\.", short_version))) + geom_text(aes(label=short_version),hjust=0.5, vjust=0.5, size = 4, angle = 90) + theme_bw() From this chart it’s clearer that the 2.0 and 2.1 series have recent releases so there will probably be three overlapping versions when the 2.2 series is released as well. The chart is still a bit cluttered although less than before. I’m not sure of a better way of visualising this type of data so if you have any ideas do let me know!

June 29, 2015

by Mark Needham

· 1,446 Views

Building an App with MongoDB: Creating a REST API Using the MEAN Stack Part 2

Written by Norberto Leite In the first part of this blog series, we covered the basic mechanics of our application and undertook some data modeling. In this second part, we will create tests that validate the behavior of our application and then describe how to set-up and run the application. Write the tests first Let’s begin by defining some small configuration libraries. file name: test/config/test_config.js Our server will be running on port 8000 on localhost. This will be fine for initial testing purposes. Later, if we change the location or port number for a production system, it would be very easy to just edit this file. To prepare for our test cases, we need to ensure that we have a good test environment. The following code achieves this for us. First, we connect to the database. file name: test/setup_tests.js Next, we drop the user collection. This ensures that our database is in a known starting state. Next, we will drop the user feed entry collection. Next, we will connect to Stormpath and delete all the users in our test application. Next, we close the database. Finally, we call async.series to ensure that all the functions run in the correct order. Frisby was briefly mentioned earlier. We will use this to define our test cases, as follows. file name: test/create_accounts_error_spec.js We will start with the enroll route in the following code. In this case we are deliberately missing the first name field, so we expect a status reply of 400 with a JSON error that we forgot to define the first name. Let’s “toss that frisby”: In the following example, we are testing a password that does not have any lower-case letters. This would actually result in an error being returned by Stormpath, and we would expect a status reply of 400. In the following example, we are testing an invalid email address. So, we can see that there is no @ sign and no domain name in the email address we are passing, and we would expect a status reply of 400. Now, let’s look at some examples of test cases that should work. Let’s start by defining 3 users. file name: test/create_accounts_spec.js In the following example, we are sending the array of the 3 users we defined above and are expecting a success status of 201. The JSON document returned would show the user object created, so we can verify that what was created matched our test data. Next, we will test for a duplicate user. In the following example, we will try to create a user where the email address already exists. One important issue is that we don’t know what API key will be returned by Stormpath a priori. So, we need to create a file dynamically that looks like the following. We can then use this file to define test cases that require us to authenticate a user. file name: /tmp/readerTestCreds.js In order to create the temporary file above, we need to connect to MongoDB and retrieve user information. This is achieved by the following code. file name: tests/writeCreds.js In the following code, we can see that the first line uses the temporary file that we created with the user information. We have also defined several feeds, such as Dilbert and the Eater Blog. file name: tests/feed_spec.js Previously, we defined some users but none of them had subscribed to any feeds. In the following code we test feed subscription. Note that authentication is required now and this is achieved using .auth with the Stormpath API keys. Our first test is to check for an empty feed list. In our next test case, we will subscribe our first test user to the Dilbert feed. In our next test case, we will try to subscribe our first test user to a feed that they are already subscribed-to. Next, we will subscribe our test user to a new feed. The result returned should confirm that the user is subscribed now to 2 feeds. Next, we will use our second test user to subscribe to a feed. The REST API Before we begin writing our REST API code, we need to define some utility libraries. First, we need to define how our application will connect to the database. Putting this information into a file gives us the flexibility to add different database URLs for development or production systems. file name: config/db.js If we wanted to turn on database authentication we could put that information in a file, as shown below. This file should not be checked into source code control for obvious reasons. file name: config/security.js We can keep Stormpath API and Secret keys in a properties file, as follows, and need to carefully manage this file as well. file name: config/stormpath_apikey.properties Express.js overview In Express.js, we create an “application” (app). This application listens on a particular port for HTTP requests to come in. When requests come in, they pass through a middleware chain. Each link in the middleware chain is given a req (request) object and a res (results) object to store the results. Each link can choose to do work, or pass it to the next link. We add new middleware via app.use(). The main middleware is called our “router”, which looks at the URL and routes each different URL/verb combination to a specific handler function. Creating our application Now we can finally see our application code, which is quite small since we can embed handlers for various routes into separate files. file name: server.js We define our own middleware at the end of the chain to handle bad URLs. Now our server application is listening on port 8000. Let’s print a message on the console to the user. Defining our Mongoose data models We use Mongoose to map objects on the Node.js side to documents inside MongoDB. Recall that earlier, we defined 4 collections: Feed collection. Feed entry collection. User collection. User feed-entry-mapping collection. So we will now define schemas for these 4 collections. Let’s begin with the user schema. Notice that we can also format the data, such as converting strings to lowercase, and remove leading or trailing whitespace using trim. file name: app/routes.js In the following code, we can also tell Mongoose what indexes need to exist. Mongoose will also ensure that these indexes are created if they do not already exist in our MongoDB database. The unique constraint ensures that duplicates are not allowed. The “email : 1” maintains email addresses in ascending order. If we used “email : -1” it would be in descending order. We repeat the process for the other 3 collections. The following is an example of a compound index on 4 fields. Each index is maintained in ascending order. Every route that comes in for GET, POST, PUT and DELETE needs to have the correct content type, which is application/json. Then the next link in the chain is called. Now we need to define handlers for each combination of URL/verb. The link to the complete code is available in the resources section and we just show a few examples below. Note the ease with which we can use Stormpath. Furthermore, notice that we have defined /api/v1.0, so the client would actually call /api/v1.0/user/enroll, for example. In the future, if we changed the API, say to 2.0, we could use /api/v2.0. This would have its own router and code, so clients using the v1.0 API would still continue to work. Starting the server and running tests Finally, here is a summary of the steps we need to follow to start the server and run the tests. Ensure that the MongoDB instance is running mongod Install the Node libraries npm install Start the REST API server node server.js Run test cases node setup_tests.js jasmine-node create_accounts_error_spec.js jasmine-node create_accounts_spec.js node write_creds.js jasmine-node feed_spec.js MongoDB University provides excellent free training. There is a course specifically aimed at Node.js developers and the link can be found in the resources section below. The resources section also contains links to good MongoDB data modeling resources. Resources HTTP status code definitions Chad Tindel’s Github Repository M101JS: MongoDB for Node.js Developers Data Models Data Modeling Considerations for MongoDB Applications

June 29, 2015

by Dana Groce

· 2,279 Views