DevOps and CI/CD Resources

The Latest DevOps and CI/CD Topics

Implementing Correlation IDs in Spring Boot (for Distributed Tracing in SOA/Microservices)

After attending Sam Newman’s microservice talks at Geecon last week I started to think more about what is most likely an essential feature of service-oriented/microservice platforms for monitoring, reporting and diagnostics: correlation ids. Correlation ids allow distributed tracing within complex service oriented platforms, where a single request into the application can often be dealt with by multiple downstream service. Without the ability to correlate downstream service requests it can be very difficult to understand how requests are being handled within your platform. I’ve seen the benefit of correlation ids in several recent SOA projects I have worked on, but as Sam mentioned in his talks, it’s often very easy to think this type of tracing won’t be needed when building the initial version of the application, but then very difficult to retrofit into the application when you do realise the benefits (and the need for!). I’ve not yet found the perfect way to implement correlation ids within a Java/Spring-based application, but after chatting to Sam via email he made several suggestions which I have now turned into a simple project using Spring Boot to demonstrate how this could be implemented. Why? During both of Sam’s Geecon talks he mentioned that in his experience correlation ids were very useful for diagnostic purposes. Correlation ids are essentially an id that is generated and associated with a single (typically user-driven) request into the application that is passed down through the stack and onto dependent services. In SOA or microservice platforms this type of id is very useful, as requests into the application typically are ‘fanned out’ or handled by multiple downstream services, and a correlation id allows all of the downstream requests (from the initial point of request) to be correlated or grouped based on the id. So called ‘distributed tracing’ can then be performed using the correlation ids by combining all the downstream service logs and matching the required id to see the trace of the request throughout your entire application stack (which is very easy if you are using a centralised logging framework such as logstash) The big players in the service-oriented field have been talking about the need for distributed tracing and correlating requests for quite some time, and as such Twitter have created their open source Zipkin framework (which often plugs into their RPC framework Finagle), and Netflix has open-sourced their Karyon web/microservice framework, both of which provide distributed tracing. There are of course commercial offering in this area, one such product being AppDynamics, which is very cool, but has a rather hefty price tag. Creating a proof-of-concept in Spring Boot As great as Zipkin and Karyon are, they are both relatively invasive, in that you have to build your services on top of the (often opinionated) frameworks. This might be fine for some use cases, but no so much for others, especially when you are building microservices. I’ve been enjoying experimenting with Spring Boot of late, and this framework builds on the much known and loved (at least by me :-) ) Spring framework by providing lots of preconfigured sensible defaults. This allows you to build microservices (especially ones that communicate via RESTful interfaces) very rapidly. The remainder of this blog pos explains how I implemented a (hopefully) non-invasive way of implementing correlation ids. Goals Allow a correlation id to be generated for a initial request into the application Enable the correlation id to be passed to downstream services, using as method that is as non-invasive into the code as possible Implementation I have created two projects on GitHub, one containing an implementation where all requests are being handled in a synchronous style (i.e. the traditional Spring approach of handling all request processing on a single thread), and also one for when an asynchronous (non-blocking) style of communication is being used (i.e., using the Servlet 3 asynchronous support combined with Spring’s DeferredResult and Java’s Futures/Callables). The majority of this article describes the asynchronous implementation, as this is more interesting: Spring Boot asynchronous (DeferredResult + Futures) communication correlation id Github repo The main work in both code bases is undertaken by the CorrelationHeaderFilter, which is a standard Java EE Filter that inspects the HttpServletRequest header for the presence of a correlationId. If one is found then we set a ThreadLocal variable in the RequestCorrelation Class (discussed later). If a correlation id is not found then one is generated and added to the RequestCorrelation Class: public class CorrelationHeaderFilter implements Filter { //... @Override public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain filterChain) throws IOException, ServletException { final HttpServletRequest httpServletRequest = (HttpServletRequest) servletRequest; String currentCorrId = httpServletRequest.getHeader(RequestCorrelation.CORRELATION_ID_HEADER); if (!currentRequestIsAsyncDispatcher(httpServletRequest)) { if (currentCorrId == null) { currentCorrId = UUID.randomUUID().toString(); LOGGER.info("No correlationId found in Header. Generated : " + currentCorrId); } else { LOGGER.info("Found correlationId in Header : " + currentCorrId); } RequestCorrelation.setId(currentCorrId); } filterChain.doFilter(httpServletRequest, servletResponse); } //... private boolean currentRequestIsAsyncDispatcher(HttpServletRequest httpServletRequest) { return httpServletRequest.getDispatcherType().equals(DispatcherType.ASYNC); } The only thing is this code that may not instantly be obvious is the conditional check currentRequestIsAsyncDispatcher(httpServletRequest), but this is here to guard against the correlation id code being executed when the Async Dispatcher thread is running to return the results (this is interesting to note, as I initially didn’t expect the Async Dispatcher to trigger the execution of the filter again?) Here is the RequestCorrelation Class, which contains a simple ThreadLocal static variable to hold the correlation id for the current Thread of execution (set via the CorrelationHeaderFilter above) public class RequestCorrelation { public static final String CORRELATION_ID = "correlationId"; private static final ThreadLocal id = new ThreadLocal(); public static String getId() { return id.get(); } public static void setId(String correlationId) { id.set(correlationId); } } Once the correlation id is stored in the RequestCorrelation Class it can be retrieved and added to downstream service requests (or data store access etc) as required by calling the static getId() method within RequestCorrelation. It is probably a good idea to encapsulate this behaviour away from your application services, and you can see an example of how to do this in a RestClient Class I have created, which composes Spring’s RestTemplate and handles the setting of the correlation id within the header transparently from the calling Class. @Component public class CorrelatingRestClient implements RestClient { private RestTemplate restTemplate = new RestTemplate(); @Override public String getForString(String uri) { String correlationId = RequestCorrelation.getId(); HttpHeaders httpHeaders = new HttpHeaders(); httpHeaders.set(RequestCorrelation.CORRELATION_ID, correlationId); LOGGER.info("start REST request to {} with correlationId {}", uri, correlationId); //TODO: error-handling and fault-tolerance in production ResponseEntity response = restTemplate.exchange(uri, HttpMethod.GET, new HttpEntity(httpHeaders), String.class); LOGGER.info("completed REST request to {} with correlationId {}", uri, correlationId); return response.getBody(); } } //... calling Class public String exampleMethod() { RestClient restClient = new CorrelatingRestClient(); return restClient.getForString(URI_LOCATION); //correlation id handling completely abstracted to RestClient impl } Making this work for asynchronous requests… The code included above works fine when you are handling all of your requests synchronously, but it is often a good idea in a SOA/microservice platform to handle requests in a non-blocking asynchronous manner. In Spring this can be achieved by using the DeferredResult Class in combination with the Servlet 3 asynchronous support. The problem with using ThreadLocal variables within the asynchronous approach is that the Thread that initially handles the request (and creates the DeferredResult/Future) will not be the Thread doing the actual processing. Accordingly, a bit of glue code is needed to ensure that the correlation id is propagated across the Threads. This can be achieved by extending Callable with the required functionality: (don’t worry if example Calling Class code doesn’t look intuitive – this adaption between DeferredResults and Futures is a necessary evil within Spring, and the full code including the boilerplate ListenableFutureAdapter is in my GitHub repo): public class CorrelationCallable implements Callable { private String correlationId; private Callable callable; public CorrelationCallable(Callable targetCallable) { correlationId = RequestCorrelation.getId(); callable = targetCallable; } @Override public V call() throws Exception { RequestCorrelation.setId(correlationId); return callable.call(); } } //... Calling Class @RequestMapping("externalNews") public DeferredResult externalNews() { return new ListenableFutureAdapter<>(service.submit(new CorrelationCallable<>(externalNewsService::getNews))); } And there we have it – the propagation of correlation id regardless of the synchronous/asynchronous nature of processing! You can clone the Github report containing my asynchronous example, and execute the application by running mvn spring-boot:run at the command line. If you access http://localhost:8080/externalNews in your browser (or via curl) you will see something similar to the following in your Spring Boot console, which clearly demonstrates a correlation id being generated on the initial request, and then this being propagated through to a simulated external call (have a look in the ExternalNewsServiceRest Class to see how this has been implemented): [nio-8080-exec-1] u.c.t.e.c.w.f.CorrelationHeaderFilter : No correlationId found in Header. Generated : d205991b-c613-4acd-97b8-97112b2b2ad0 [pool-1-thread-1] u.c.t.e.c.w.c.CorrelatingRestClient : start REST request to http://localhost:8080/news with correlationId d205991b-c613-4acd-97b8-97112b2b2ad0 [nio-8080-exec-2] u.c.t.e.c.w.f.CorrelationHeaderFilter : Found correlationId in Header : d205991b-c613-4acd-97b8-97112b2b2ad0 [pool-1-thread-1] u.c.t.e.c.w.c.CorrelatingRestClient : completed REST request to http://localhost:8080/news with correlationId d205991b-c613-4acd-97b8-97112b2b2ad0 Conclusion I’m quite happy with this simple prototype, and it does meet the two goals I listed above. Future work will include writing some tests for this code (shame on me for not TDDing!), and also extend this functionality to a more realistic example. I would like to say a massive thanks to Sam, not only for sharing his knowledge at the great talks at Geecon, but also for taking time to respond to my emails. If you’re interested in microservices and related work I can highly recommend Sam’s Microservice book which is available in Early Access at O’Reilly. I’ve enjoyed reading the currently available chapters, and having implemented quite a few SOA projects recently I can relate to a lot of the good advice contained within. I’ll be following the development of this book with keen interest! If you have any comments or thoughts then please do share them via the comment below, or feel free to get in touch via the usual mechanisms! References I used Tomasz Nurkiewicz’s excellent blog several times for learning how best to wire up all of the DeferredResult/Future code in Spring: http://www.nurkiewicz.com/2013/03/deferredresult-asynchronous-processing.html

May 28, 2014

by Daniel Bryant

· 73,989 Views · 2 Likes

Running the Maven Release Plugin with Jenkins

Learn more about using the Maven Release plugin on Jenkins, including subversion source control, artifactory, continuous integration, and more.

May 23, 2014

by $$anonymous$$

· 104,809 Views · 6 Likes

Understanding the Cloud Foundry Java Buildpack Code with Tomcat Example

Cloudfoundry's java buildpack is supporting some popular jvm based applications. This article is oriented to the audiences already with experience of cloudfoundry/heroku buildpack who want to have more understanding of how buildpack and cloudfoundry works internally. cf push app -p app.war -b build-pack-url The above command demonstrates the usage of pushing a war file to cloudfoundry by using a custom buildpack (E.g. https://github.com/cloudfoundry/java-buildpack). However, what exactly happens inside, or how cloudfoundry bootstrap the war file with tomcat? There are three contracts phase that bridge communication between buildpack and cloudfoundry. The three phases are detect, compile and release, which are three ruby shell scripts: Java buildpack has multiple sub components, while each of them has all of these three phases (E.g. tomcat is one of the sub components, while it contained another layer of sub components). Detect Phase: detect phase is to check whether a particular buildpack/component applies to the deployed application. Take the war file example, tomcat applies only when https://github.com/cloudfoundry/java-buildpack/blob/master/lib/java_buildpack/container/tomcat.rb is true: def supports? web_inf? && !JavaBuildpack::Util::JavaMainUtils.main_class(@application) end The above code means, the tomcat applies when the application has a WEB-INF folder andthisisnot a main class bootstrapped application. Compile Phase: Compile phase would be the major/comprehensive work for a customized buildpack, while it is trying to build a file system on a lxc container. Take the example of our war application and tomcat example. In https://github.com/cloudfoundry/java-buildpack/blob/master/lib/java_buildpack/container/tomcat/tomcat_instance.rb def compile download(@version, @uri) { |file| expand file } link_to(@application.root.children, root) @droplet.additional_libraries << tomcat_datasource_jar if tomcat_datasource_jar.exist? @droplet.additional_libraries.link_to web_inf_lib end def expand(file) with_timing "Expanding Tomcat to #{@droplet.sandbox.relative_path_from(@droplet.root)}" do FileUtils.mkdir_p @droplet.sandbox shell "tar xzf #{file.path} -C #{@droplet.sandbox} --strip 1 --exclude webapps 2>&1" @droplet.copy_resources end The above code is all about preparing the tomcat and link the application files, so the application files will be available for the tomcat classpath. Before going to the code, we have to understand the working directory when the above code executes: . => working directory .app => @application, contains the extracted war archive .buildpack/tomcat => @droplet.sandbox .buildpack/jdk .buildpack/other needed components Inside compile method: download method will download tomcat binary file (specified here: https://github.com/cloudfoundry/java-buildpack/blob/master/config/tomcat.yml), and then extract the archive file to @droplet.sandbox directory. Then copy the resources folder's files to https://github.com/cloudfoundry/java-buildpack/tree/master/resources/tomcat/conf to @droplet.sandbox/conf Symlink the @droplet.sandbox/webapps/ROOT to .app/ Symlink additional libraries (comes from other component rather than application) to the WEB-INF/lib Note: All the symlinks use relative path, since when the container deployed to DEA, the absolute paths would be different. RELEASE PHASE: Release phase is to setup instructions of how to start tomcat. Look at the code in :https://github.com/cloudfoundry/java-buildpack/blob/master/lib/java_buildpack/container/tomcat.rb def command @droplet.java_opts.add_system_property 'http.port', '$PORT' [ @droplet.java_home.as_env_var, @droplet.java_opts.as_env_var, "$PWD/#{(@droplet.sandbox + 'bin/catalina.sh').relative_path_from(@droplet.root)}", 'run' ].flatten.compact.join(' ') end The above code does: Add java system properties http.port (referenced in tomcat server.xml) with environment properties ($PORT), this is the port on the DEA bridging to the lxc container already setup when the container was provisioned. instruction of how to run the tomcat Eg. "./bin/catalina.sh run"

May 9, 2014

by Shaozhen Ding

· 23,309 Views · 1 Like

Git Showing File as Modified Even if It Is Unchanged

This is one annoying problem that happens sometimes to git users: the symptom is: git status command shows you some files as modified (you are sure that you had not modified that files), you revert all changes with a git checkout — . but the files stills are in modified state if you issue another git status. This is a real annoying problem, suppose you want to switch branch with git checkout branchname, you will find that git does not allow you to switch because of uncommitted changes. This problem is likely caused by the end-of-line normalization (I strongly suggest you to read all the details in Pro Git book or read the help of github). I do not want to enter into details of this feature, but I only want to help people to diagnose and avoid this kind of problem. To understand if you really have a Line Ending Issue you should run git diff -w command to verify what is really changed in files that git as modified with git status command. The -w options tells git to ignore whitespace and line endings, if this command shows no differences, you are probably victim of problem in Line Ending Normalization. This is especially true if you are working with git svn, connecting to a subversion repository where developers did not pay attention to line endings and it happens usually when you have files with mixed CRLF / CR / LF. If you work in mixed environment (Unix/Linux, Windows, Macintosh) it is better to find files that are listed as modified and manually (or with some tool) normalize Line Endings. If you do not work in mixed environment you can simply turn off eol normalizationfor the single repository where you experience the problem. To do this you can issue a git config –local core.autocrlf false but it works only for you and not for all the other developers that works to the project. Moreover some people reports that they still have problem even with core.autocrlf to false. Remember that git supports .gitattributes files, used to change settings for a single subdirectory. If you set core.autocrlf to false and still have line ending normalization problem, please search for .gitattribuges files in every subdirectory of your repository, and verify if it has a line where autocrlf is turned on: * text=auto now you can turn off in all .gitattributes files you find in your repository * text=off To be sure that every developer of the team works with autocrlf turned off, you should place a .gitattributes file in repository root with autocrlf turned off. Remember that it is a better option to normalize files and leave autocrlf turned on, but if you are working with legacy code imported from another VCS, or you work with git svn, git-tf or similar tools, probably it is better turn autocrlf to off if you start experiencing that kind of problems.

April 29, 2014

by Ricci Gian Maria

· 93,109 Views

Java EE: The Basics

wanted to go through some of the basic tenets, the technical terminology related to java ee. for many people, java ee/j2ee still mean servlets, jsps or maybe struts at best. no offence or pun intended! this is not a java ee 'bible' by any means. i am not capable enough of writing such a thing! so let us line up the 'keywords' related to java ee and then look at them one by one java ee java ee apis (specifications) containers services multitiered applications components let's try to elaborate on the above mentioned points. ok. so what is java ee? 'ee' stands for enterprise edition. that essentially makes java ee - java enterprise edition. if i had to summarize java ee in a couple of sentences, it would go something like this "java ee is a platform which defines 'standard specifications/apis' which are then implemented by vendors and used for development of enterprise (distributed, 'multi-tired', robust) 'applications'. these applications are composed of modules or 'components' which use java ee 'containers' as their run-time infrastructure." what is this 'standardized platform' based upon? what does it constitute? the platform revolves around 'standard' specifications or apis . think of these as contracts defined by a standard body e.g. enterprise java beans (ejb), java persistence api (jpa), java message service (jms) etc. these contracts/specifications/apis are implemented by different vendors e.g. glassfish, oracle weblogic, apache tomee etc alright. what about containers? containers can be visualized as 'virtual/logical partitions' . each container supports a subset of the apis/specifications defined by the java ee platform they provide run-time 'services' to the 'applications' which they host the java ee specification lists 4 types of containers ejb container web container application client container applet container java ee containers i am not going to dwell into details of these containers in this post. services?? well, 'services' are nothing but a result of the vendor implementations of the standard 'specifications' (mentioned above). examples of specifications are - jersey for jax-rs (restful services), tyrus (web sockets), eclipselink (jpa), weld (cdi) etc. the 'container' is the interface between the deployed application ('service' consumer) and the application server. here is a list of 'services' which are rendered by the 'container' to the underlying 'components' (this is not an exhaustive list) persistence - offered by the java persistence api (jpa) which drives object relational mapping (orm) and an abstraction for the database operations. messaging - the java message service (jms) provides asynchronous messaging between disparate parts of your applications. contexts & dependency injection - cdi provides loosely coupled and type safe injection of resources. web services - jaxrs and jaxws provide support for rest and soap style services respectively transaction - provided by the java transaction api (jta) implementation what is a typical java ee 'application'? what does it comprise of? applications are composed of different ' components ' which in turn are supported by their corresponding ' container ' supported 'component' types are: enterprise applications - make use of the specifications like ejb, jms, jpa etc and are executed within an ejb container web applications - they leverage the servlet api, jsp, jsf etc and are supported by a web container application client - executed in client side. they need an application client container which has a set of supported libraries and executes in a java se environment. applets - these are gui applications which execute in a web browser. how are java ee applications structured? as far as java ee 'application' architecture is concerned, they generally tend follow the n-tier model consisting of client tier, server tier and of course the database (back end) tier client tier - consists of web browsers or gui (swing, java fx) based clients. web browsers tend to talk to the 'web components' on the server tier while the gui clients interact directly with the 'business' layer within the server tier server tier - this tier comprises of the dynamic web components (jsp, jsf, servlets) and the business layer driven by ejbs, jms, jpa, jta specifications. database tier - contains 'enterprise information systems' backed by databases or even legacy data repositories. generic 3-tier java ee application architecture java ee - bare bones, basics.... as quickly and briefly as i possibly could. that's all for now! :-) stay tuned for more java ee content, specifically around the latest and greatest version of the java ee platform --> java ee 7 happy reading!

April 29, 2014

by Abhishek Gupta

CORE

· 40,710 Views · 3 Likes

Continuous Delivery: Maturity Checklist

41% of developers believe they are achieving Continuous Delivery while only 8% actually are. Use the Continuous Delivery Maturity Checklist from DZone's 2014 Guide to Continuous Delivery to determine how close you are to achieving true Continuous Delivery, and be sure to download DZone's 2014 Guide to Continuous Delivery to learn how to improve your Continuous Delivery process. (Download as a PDF)

April 18, 2014

by Alec Noller

· 25,484 Views · 3 Likes

Mule Meets Zuul: A Centralized Properties Management – Part I, Server Side

It is always recommended to use Spring properties with Mule, to externalize any configuration parameters (URLs, ports, user names, passwords, etc.). For example, the Acme APIfrom my previous post connects to an external database. So instead of hard-coding connectivity options inside my application code, I would create a properties file, e.g. acme.properties, as follows: acme.jdbc.host=acmedb acme.jdbc.port=3306 acme.jdbc.database=acmeProducts acme.jdbc.user=WileECoyote acme.jdbc.password=GeeWhizz Obviously, as a developer, I would use a test instance of Acme database to test my application. I’d commit the code to the version control system, including the properties file. Then my application would begin its journey from the automated build system to the Dev environment, to QA, Pre-Prod, and finally Prod – and fail to deploy on production because it wouldn’t be able to connect to the test database! Or even worse, it would connect to the test database and use it and no one would notice the problem until customers placed $0 order for an Acme widget which would normally cost $1000, all because the test database didn’t contain actual prices! Sure, I could just follow the recommendations on our web site and create multiple sets of properties, e.g. acme.dev.properties, acme.qa.properties, acme.prod.properties etc. But instead of solving the problem, it would create a few new ones. First, those properties must still be packaged within the application. Needless to say, IT guys would never give me the credentials for the production database, so I’d have to provide instructions for them on how to modify the properties file AFTER the application is deployed on the prod platform. Second, if (or rather WHEN) any of those properties will need to be changed (for example, the production DB is migrated to a new server), the whole process has to be repeated. And don’t forget about passwords and other sensitive data that should never appear in the code as open text and have to be encrypted. It seems like every single customer I’ve worked with has this problem. And there was no convincing solution until one of our customers told me about an application called Zuul. As the description on the Zuul web site says, “Zuul is a free, open source web application which can be used to centralize and manage configuration for your internal applications. It enables your operations team to control changes and your developers a centralized place to organize settings.” Of course, I couldn’t resist the urge to download it and try it out with Mule. The installation and configuration of the Zuul server was pretty straightforward. After all, Zuul is a standard web application, so I just deployed it to my local Tomcat instance, alongside with MMC which was already deployed on it. I configured the database settings to point to my local MySQL instance. For the LDAP server I used OpenLDAP. I had to download and install the Unlimited Strength JCE Policy Files. Then I started Tomcat and opened the Zuul URL in my browser and logged in as administrator. The first task is to create my environments. Navigating to Administration->Environments menu, I see three environments, prod, qa, and dev, which Zuul creates by default. Just what I need! Moreover, the prod environment is red – which means, only someone with Administrator privileges can mess with it. And while we are in the Administration screen, let’s create a new encryption key for our password values. Administration->Key Management, then click on Create New... button and populate the form: And now we can create our properties. Select Settings->Create New, give it a name, e.g. AcmeProperties. On the next screen, you’re given the option to create a new properties set from scratch, or to upload an existing properties file. Since we already have acme.properties for our dev environment, let’s just use it. Select dev environment on the left tab, then click Upload File button: Upload acme.properties and you’ll see the following screen: Now we can encrypt the database password. Just make sure the correct key is selected, then click Edit and select Encrypt. To finish the server setup, we replicate this set of properties on the qa and prod environments. Select qa tab, then click Copy Existing, then in the Search text box type dev. Your properties set "/dev/AcmeProperties.properties" will be highlighted. Click Copy button and now you have the identical set of properties in qa. Repeat the process for the prod environment. Change properties values on each environment accordingly. This concludes the server setup procedure. In the next post, I will show you how to configure Mule to use Zuul properties management. UPDATE: Zuul can be downloaded at http://www.devnull.org/zuul

April 17, 2014

by Ross Mason

· 7,549 Views · 1 Like

Continuous Delivery: Visualized

For DZone's 2014 Guide to Continuous Delivery we created a detailed infographic to illustrate the creation of deployment pipelines. Download DZone's 2014 Guide to Continuous Delivery to read in-depth articles written by industry experts, see the survey results from 500+ developers, and see profiles on 38 popular Continuous Delivery solutions. (Download this infographic as a PDF)

April 16, 2014

by Alec Noller

· 22,705 Views

A Docker ‘Hello World' With Mono

Docker is a lightweight virtualization technology for Linux that promises to revolutionize the deployment and management of distributed applications. Rather than requiring a complete operating system, like a traditional virtual machine, Docker is built on top of Linux containers, a feature of the Linux kernel, that allows light-weight Docker containers to share a common kernel while isolating applications and their dependencies. There’s a very good Docker SlideShare presentation here that explains the philosophy behind Docker using the analogy of standardized shipping containers. Interesting that the standard shipping container has done more to create our global economy than all the free-trade treaties and international agreements put together. A Docker image is built from a script, called a ‘Dockerfile’. Each Dockerfile starts by declaring a parent image. This is very cool, because it means that you can build up your infrastructure from a layer of images, starting with general, platform images and then layering successively more application specific images on top. I’m going to demonstrate this by first building an image that provides a Mono development environment, and then creating a simple ‘Hello World’ console application image that runs on top of it. Because the Dockerfiles are simple text files, you can keep them under source control and version your environment and dependencies alongside the actual source code of your software. This is a game changer for the deployment and management of distributed systems. Imagine developing an upgrade to your software that includes new versions of its dependencies, including pieces that we’ve traditionally considered the realm of the environment, and not something that you would normally put in your source repository, like the Mono version that the software runs on for example. You can script all these changes in your Dockerfile, test the new container on your local machine, then simply move the image to test and then production. The possibilities for vastly simplified deployment workflows are obvious. Docker brings concerns that were previously the responsibility of an organization’s operations department and makes them a first class part of the software development lifecycle. Now your infrastructure can be maintained as source code, built as part of your CI cycle and continuously deployed, just like the software that runs inside it. Docker also provides docker index, an online repository of docker images. Anyone can create an image and add it to the index and there are already images for almost any piece of infrastructure you can imagine. Say you want to use RabbitMQ, all you have to do is grab a handy RabbitMQ images such as https://index.docker.io/u/tutum/rabbitmq/ and run it like this: docker run -d -p 5672:5672 -p 55672:55672 tutum/rabbitmq The –p flag maps ports between the image and the host. Let’s look at an example. I’m going to show you how to create a docker image for the Mono development environment and have it built and hosted on the docker index. Then I’m going to build a local docker image for a simple ‘hello world’ console application that I can run on my Ubuntu box. First we need to create a Docker file for our Mono environment. I’m going to use the Mono debian packages from directhex. These are maintained by the official Debian/Ubuntu Mono team and are the recommended way of installing the latest Mono versions on Ubuntu. Here’s the Dockerfile: #DOCKER-VERSION 0.9.1 # #VERSION 0.1 # # monoxide mono-devel package on Ubuntu 13.10 FROM ubuntu:13.10 MAINTAINER Mike Hadlow RUN sudo DEBIAN_FRONTEND=noninteractive apt-get install -y -q software-properties-common RUN sudo add-apt-repository ppa:directhex/monoxide -y RUN sudo apt-get update RUN sudo DEBIAN_FRONTEND=noninteractive apt-get install -y -q mono-devel Notice the first line (after the comments) that reads, ‘FROM ubuntu:13.10’. This specifies the parent image for this Dockerfile. This is the official docker Ubuntu image from the index. When I build this Dockerfile, that image will be automatically downloaded and used as the starting point for my image. But I don’t want to build this image locally. Docker provide a build server linked to the docker index. All you have to do is create a public GitHub repository containing your dockerfile, then link the repository to your profile on docker index. You can read the documentation for the details. The GitHub repository for my Mono image is at https://github.com/mikehadlow/ubuntu-monoxide-mono-devel. Notice how the Docker file is in the root of the repository. That’s the default location, but you can have multiple files in sub-directories if you want to support many images from a single repository. Now any time I push a change of my Dockerfile to GitHub, the docker build system will automatically build the image and update the docker index. You can see image listed here:https://index.docker.io/u/mikehadlow/ubuntu-monoxide-mono-devel/ I can now grab my image and run it interactively like this: $ sudo docker pull mikehadlow/ubuntu-monoxide-mono-devel Pulling repository mikehadlow/ubuntu-monoxide-mono-devel f259e029fcdd: Download complete 511136ea3c5a: Download complete 1c7f181e78b9: Download complete 9f676bd305a4: Download complete ce647670fde1: Download complete d6c54574173f: Download complete 6bcad8583de3: Download complete e82d34a742ff: Download complete $ sudo docker run -i mikehadlow/ubuntu-monoxide-mono-devel /bin/bash mono --version Mono JIT compiler version 3.2.8 (Debian 3.2.8+dfsg-1~pre1) Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com TLS: __thread SIGSEGV: altstack Notifications: epoll Architecture: amd64 Disabled: none Misc: softdebug LLVM: supported, not enabled. GC: sgen exit Next let’s create a new local Dockerfile that compiles a simple ‘hello world’ program, and then runs it when we run the image. You can follow along with these steps. All you need is a Ubuntu machine with Docker installed. First here’s our ‘hello world’, save this code in a file named hello.cs: using System; namespace Mike.MonoTest { public class Program { public static void Main() { Console.WriteLine("Hello World"); } } } Next we’ll create our Dockerfile. Copy this code into a file called ‘Dockerfile’: #DOCKER-VERSION 0.9.1 FROM mikehadlow/ubuntu-monoxide-mono-devel ADD . /src RUN mcs /src/hello.cs CMD ["mono", "/src/hello.exe"] Once again, notice the ‘FROM’ line. This time we’re telling Docker to start with our mono image. The next line ‘ADD . /src’, tells Docker to copy the contents of the current directory (the one containing our Dockerfile) into a root directory named ‘src’ in the container. Now our hello.cs file is at /src/hello.cs in the container, so we can compile it with the mono C# compiler, mcs, which is the line ‘RUN mcs /src/hello.cs’. Now we will have the executable, hello.exe, in the src directory. The line ‘CMD [“mono”, “/src/hello.exe”]’ tells Docker what we want to happen when the container is run: just execute our hello.exe program. As an aside, this exercise highlights some questions around what best practice should be with Docker. We could have done this in several different ways. Should we build our software independently of the Docker build in some CI environment, or does it make sense to do it this way, with the Docker build as a step in our CI process? Do we want to rebuild our container for every commit to our software, or do we want the running container to pull the latest from our build output? Initially I’m quite attracted to the idea of building the image as part of the CI but I expect that we’ll have to wait a while for best practice to evolve. Anyway, for now let’s manually build our image: $ sudo docker build -t hello . Uploading context 1.684 MB Uploading context Step 0 : FROM mikehadlow/ubuntu-monoxide-mono-devel ---> f259e029fcdd Step 1 : ADD . /src ---> 6075dee41003 Step 2 : RUN mcs /src/hello.cs ---> Running in 60a3582ab6a3 ---> 0e102c1e4f26 Step 3 : CMD ["mono", "/src/hello.exe"] ---> Running in 3f75e540219a ---> 1150949428b2 Successfully built 1150949428b2 Removing intermediate container 88d2d28f12ab Removing intermediate container 60a3582ab6a3 Removing intermediate container 3f75e540219a You can see Docker executing each build step in turn and storing the intermediate result until the final image is created. Because we used the tag (-t) option and named our image ‘hello’, we can see it when we list all the docker images: $ sudo docker images REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE hello latest 1150949428b2 10 seconds ago 396.4 MB mikehadlow/ubuntu-monoxide-mono-devel latest f259e029fcdd 24 hours ago 394.7 MB ubuntu 13.10 9f676bd305a4 8 weeks ago 178 MB ubuntu saucy 9f676bd305a4 8 weeks ago 178 MB ... Now let’s run our image. The first time we do this Docker will create a container and run it. Each subsequent run will reuse that container: $ sudo docker run hello Hello World And that’s it. Imagine that instead of our little hello.exe, this image contained our web application, or maybe a service in some distributed software. In order to deploy it, we’d simply ask Docker to run it on any server we like; development, test, production, or on many servers in a web farm. This is an incredibly powerful way of doing consistent repeatable deployments. To reiterate, I think Docker is a game changer for large server side software. It’s one of the most exciting developments to have emerged this year and definitely worth your time to check out.

April 3, 2014

by Mike Hadlow

· 11,309 Views

Multi-Level Argparse in Python (Parsing Commands Like Git)

It’s a common pattern for command line tools to have multiple subcommands that run off of a single executable. For example, git fetch origin and git commit --amend both use the same executable /usr/bin/git to run. Each subcommand has its own set of required and optional parameters. This pattern is fairly easy to implement in your own Python command-line utilities using argparse. Here is a script that pretends to be git and provides the above two commands and arguments. #!/usr/bin/env python import argparse import sys class FakeGit(object): def __init__(self): parser = argparse.ArgumentParser( description='Pretends to be git', usage='''git [] The most commonly used git commands are: commit Record changes to the repository fetch Download objects and refs from another repository ''') parser.add_argument('command', help='Subcommand to run') # parse_args defaults to [1:] for args, but you need to # exclude the rest of the args too, or validation will fail args = parser.parse_args(sys.argv[1:2]) if not hasattr(self, args.command): print 'Unrecognized command' parser.print_help() exit(1) # use dispatch pattern to invoke method with same name getattr(self, args.command)() def commit(self): parser = argparse.ArgumentParser( description='Record changes to the repository') # prefixing the argument with -- means it's optional parser.add_argument('--amend', action='store_true') # now that we're inside a subcommand, ignore the first # TWO argvs, ie the command (git) and the subcommand (commit) args = parser.parse_args(sys.argv[2:]) print 'Running git commit, amend=%s' % args.amend def fetch(self): parser = argparse.ArgumentParser( description='Download objects and refs from another repository') # NOT prefixing the argument with -- means it's not optional parser.add_argument('repository') args = parser.parse_args(sys.argv[2:]) print 'Running git fetch, repository=%s' % args.repository if __name__ == '__main__': FakeGit() The argparse library gives you all kinds of great stuff. You can run ./git.py --help and get the following: usage: git [] The most commonly used git commands are: commit Record changes to the repository fetch Download objects and refs from another repository Pretends to be git positional arguments: command Subcommand to run optional arguments: -h, --help show this help message and exit You can get help on a particular subcommand with ./git.py commit --help. usage: git.py [-h] [--amend] Record changes to the repository optional arguments: -h, --help show this help message and exit --amend Want bash completion on your awesome new command line utlity? Try argcomplete, a drop in bash completion for Python + argparse.

April 3, 2014

by Chase Seibert

· 18,333 Views · 1 Like

Docker: Bulk Remove Images and Containers

I’ve just started looking at Docker. It’s a cool new technology that has the potential to make the management and deployment of distributed applications a great deal easier. I’d very much recommend checking it out. I’m especially interested in using it to deploy Mono applications because it promises to remove the hassle of deploying and maintaining the mono runtime on a multitude of Linux servers. I’ve been playing around creating new images and containers and debugging my Dockerfile, and I’ve wound up with lots of temporary containers and images. It’s really tedious repeatedly running ‘docker rm’ and ‘docker rmi’, so I’ve knocked up a couple of bash commands to bulk delete images and containers. Delete all containers: sudo docker ps -a -q | xargs -n 1 -I {} sudo docker rm {} Delete all un-tagged (or intermediate) images: sudo docker rmi $( sudo docker images | grep '' | tr -s ' ' | cut -d ' ' -f 3)

April 2, 2014

by Mike Hadlow

· 14,681 Views

How To Add Images To A GitHub Wiki

Every GitHub repository comes with its own wiki. This is a great place to put the documentation for your project. What isn’t clear from the wiki documentation is how to add images to your wiki. Here’s my step-by-step guide. I’m going to add a logo to the main page of my WikiDemo repository’s wiki: https://github.com/mikehadlow/WikiDemo/wiki/Main-Page First clone the wiki. You grab the clone URL from the button at the top of the wiki page. $ git clone [email protected]:mikehadlow/WikiDemo.wiki.git Cloning into 'WikiDemo.wiki'... Enter passphrase for key '/home/mike.hadlow/.ssh/id_rsa': remote: Counting objects: 6, done. remote: Compressing objects: 100% (3/3), done. remote: Total 6 (delta 0), reused 0 (delta 0) Receiving objects: 100% (6/6), done. Create a new directory called ‘images’ (it doesn’t matter what you call it, this is just a convention I use): $ mkdir images Then copy your picture(s) into the images directory (I’ve copied my logo_design.png file to my images directory). $ ls -l -rwxr-xr-x 1 mike.hadlow Domain Users 12971 Sep 5 2013 logo_design.png Commit your changes and push back to GitHub: $ git add -A $ git status # On branch master # Changes to be committed: # (use "git reset HEAD ..." to unstage) # # new file: images/logo_design.png # $ git commit -m "Added logo_design.png" [master 23a1b4a] Added logo_design.png 1 files changed, 0 insertions(+), 0 deletions(-) create mode 100755 images/logo_design.png $ git push Enter passphrase for key '/home/mike.hadlow/.ssh/id_rsa': Counting objects: 5, done. Delta compression using up to 4 threads. Compressing objects: 100% (3/3), done. Writing objects: 100% (4/4), 9.05 KiB, done. Total 4 (delta 0), reused 0 (delta 0) To [email protected]:mikehadlow/WikiDemo.wiki.git 333a516..23a1b4a master -> master Now we can put a link to our image in ‘Main Page’: Save and there’s your image for all to see:

March 27, 2014

by Mike Hadlow

· 25,508 Views · 1 Like

Integration Testing for Spring Applications with JNDI Connection Pools

We all know we need to use connection pools where ever we connect to a database. All of the modern drivers using JDBC type 4 support it. In this post we will have look at an overview ofconnection pooling in spring applications and how to deal with same context in a non JEE enviorements (like tests). Most examples of connecting to database in spring is done using DriverManagerDataSource. If you don't read the documentation properly then you are going to miss a very important point. NOTE: This class is not an actual connection pool; it does not actually pool Connections. It just serves as simple replacement for a full-blown connection pool, implementing the same standard interface, but creating new Connections on every call. Useful for test or standalone environments outside of a J2EE container, either as a DataSource bean in a corresponding ApplicationContext or in conjunction with a simple JNDI environment. Pool-assuming Connection.close() calls will simply close the Connection, so any DataSource-aware persistence code should work. Yes, by default the spring applications does not use pooled connections. There are two ways to implement the connection pooling. Depending on who is managing the pool. If you are running in a JEE environment, then it is prefered use the container for it. In a non-JEE setup there are libraries which will help the application to manage the connection pools. Lets discuss them in bit detail below. 1. Server (Container) managed connection pool (Using JNDI) When the application connects to the database server, establishing the physical actual connection takes much more than the execution of the scripts. Connection pooling is a technique that was pioneered by database vendors to allow multiple clients to share a cached set of connection objects that provide access to a database resource. The JavaWorld article gives a good overview about this. In a J2EE container, it is recommended to use a JNDI DataSource provided by the container. Such a DataSource can be exposed as a DataSource bean in a Spring ApplicationContext via JndiObjectFactoryBean, for seamless switching to and from a local DataSource bean like this class. The below articles helped me in setting up the data source in JBoss AS. 1. DebaJava Post 2. JBoss Installation Guide 3. JBoss Wiki Next step is to use these connections created by the server from the application. As mentioned in the documentation you can use the JndiObjectFactoryBean for this. It is as simple as below If you want to write any tests using springs "SpringJUnit4ClassRunner" it can't load the context becuase the JNDI resource will not be available. For tests, you can then either set up a mock JNDI environment through Spring's SimpleNamingContextBuilder, or switch the bean definition to a local DataSource (which is simpler and thus recommended). As I was looking for a good solutions to this problem (I did not want a separate context for tests) this SO answer helped me. It sort of uses the various tips given in the Javadoc to good effect. The issue with the above solution is the repetition of code to create the JNDI connections. I have solved it using a customized runner SpringWithJNDIRunner. This class adds the JNDI capabilities to the SpringJUnit4ClassRunner. It reads the data source from "test-datasource.xml" file in the class path and binds it to the JNDI resource with name "java:/my-ds". After the execution of this code the JNDI resource is available for the spring container to consume. import javax.naming.NamingException; import org.junit.runners.model.InitializationError; import org.springframework.context.ApplicationContext; import org.springframework.context.support.ClassPathXmlApplicationContext; import org.springframework.mock.jndi.SimpleNamingContextBuilder; import org.springframework.test.context.junit4.SpringJUnit4ClassRunner; /** * This class adds the JNDI capabilities to the SpringJUnit4ClassRunner. * @author mkadicha * */ public class SpringWithJNDIRunner extends SpringJUnit4ClassRunner { public static boolean isJNDIactive; /** * JNDI is activated with this constructor. * * @param klass * @throws InitializationError * @throws NamingException * @throws IllegalStateException */ public SpringWithJNDIRunner(Class klass) throws InitializationError, IllegalStateException, NamingException { super(klass); synchronized (SpringWithJNDIRunner.class) { if (!isJNDIactive) { ApplicationContext applicationContext = new ClassPathXmlApplicationContext( "test-datasource.xml"); SimpleNamingContextBuilder builder = new SimpleNamingContextBuilder(); builder.bind("java:/my-ds", applicationContext.getBean("dataSource")); builder.activate(); isJNDIactive = true; } } } } To use this runner you just need to use the annotation @RunWith(SpringWithJNDIRunner.class) in your test. This class extends SpringJUnit4ClassRunner beacuse a there can only be one class in the @RunWith annotation. The JNDI is created only once is a test cycle. This class provides a clean solution to the problem. 2. Application managed connection pool If you need a "real" connection pool outside of a J2EE container, consider Apache's Jakarta Commons DBCP or C3P0. Commons DBCP's BasicDataSource and C3P0's ComboPooledDataSource are full connection pool beans, supporting the same basic properties as this class plus specific settings (such as minimal/maximal pool size etc). Below user guides can help you configure this. 1. Spring Docs 2. C3P0 Userguide 3. DBCP Userguide The below articles speaks about the general guidelines and best practices in configuring the connection pools. 1. SO question on Spring JDBC Connection pools 2. Connection pool max size in MS SQL Server 2008 3. How to decide the max number of connections 4. Monitoring the number of active connections in SQL Server 2008 Note:- All the text in italics are copied from the spring documentation of the DriverManagerDataSource.

March 26, 2014

by Manu Pk

· 25,327 Views · 1 Like

Distributed Counters Feature Design

this is another experiment with longer posts. previously, i used the time series example as the bed on which to test some ideas regarding feature design, to explain how we work and in general work out the rough patches along the way. i should probably note that these posts are purely fiction at this point. we have no plans to include a time series feature in ravendb at this time. i am trying to work out some thoughts in the open and get your feedback. at any rate, yesterday we had a request for cassandra style counters at the mailing list. and as long as i am doing feature design series, i thought that i could talk about how i would go about implementing this. again, consider this fiction, i have no plans of implementing this at this time. the essence of what we want is to be able to… count stuff. efficiently, in a distributed manner, with optional support for cross data center replication. very roughly, the idea is to have “sub counters”, unique for every node in the system. whenever you increment the value, we log this to our own sub counter, and then replicate it out. whenever you read it, we just sum all the data we have from all the sub counters. let us outline the various parts of the solution in the same order as the one i used for time series. storage a counter is just a named 64 bits signed integer. a counter name can be any string up to 128 printable characters. the external interface of the storage would look like this: 1: public struct counterincrement 2: { 3: public string name; 4: public long change; 5: } 6: 7: public struct counter 8: { 9: public string name; 10: public string source; 11: public long value; 12: } 13: 14: public interface icounterstorage 15: { 16: void localincrementbatch(counterincrement[] batch); 17: 18: counter[] read(string name); 19: 20: void replicatedupdates(counter[] updates); 21: } as you can see, this gives us very simple interface for the storage. we can either change the data locally (which modify our own storage) or we can get an update from a replica about its changes. there really isn’t much more to it, to be fair. the localincrementbatch() increment a local value, and read() will return all the values for a counter. there is a little bit of trickery involved in how exactly one would store the counter values. for now, i think we’ll store each counter as two step values. we’ll have a tree of multi tree values that will carry each value from each source. that means that a counter will take roughly 4kb or so. this is easy to work with and nicely fit the model voron uses internally. note that we’ll outline additional requirement for storage (searching for counter by prefix, iterating over counters, addresses of other servers, stats, etc) below. i’m not showing them here because they aren’t the major issue yet. over the wire skipping out on any optimizations that might be required, we will expose the following endpoints: get /counters/read?id=users/1/visits&users/1/posts <—will return json response with all the relevant values (already summed up). { “users/1/visits”: 43, “users/1/posts”: 3 } get /counters/read?id=users/1/visits&users/1/1/posts&raw=true <—will return json response with all the relevant values, per source. { “users/1/visits”: {“rvn1”: 21, “rvn2”: 22 } , “users/1/posts”: { “rvn1”: 2, “rvn3”: 1 } } post /counters/increment <– allows to increment counters. the request is a json array of the counter name and the change. for a real system, you’ll probably need a lot more stuff, metrics, stats, etc. but this is the high level design, so this would be enough. note that we are skipping the high performance stream based writes we outlined for time series. we’ll probably won’t need them, so that doesn’t matter, but they are an option if we need them. system behavior this is where it is really not interesting, there is very little behavior here, actually. we only have to read the data from the storage, sum it up, and send it to the user. hardly what i’ll call business logic. client api the client api will probably look something like this: 1: counters.increment("users/1/posts"); 2: counters.increment("users/1/visits", 4); 3: 4: using(var batch = counters.batch()) 5: { 6: batch.increment("users/1/posts"); 7: batch.increment("users/1/visits",5); 8: batch.submit(); 9: } note that we’re offering both batch and single api. we’ll likely also want to offer a fire & forget style, which will be able to offer even better performance (because they could do batching across more than a single thread), but that is out of scope for now. for simplicity sake, we are going to have the client just a container for all of endpoints that it knows about. the container would be responsible for… updating the client visible topology, selecting the best server to use at any given point, etc. user interface there isn’t much to it. just show a list of counter values in a list. allow to search by prefix, allow to dive into a particular counter and read its raw values, but that is about it. oh, and allow to delete a counter. deleting data honestly, i really hate deletes. they are very expensive to handle properly the moment you have more than a single node. in this case, there is an inherent race condition between a delete going out and another node getting an increment. and then there is the issue of what happens if you had a node down when you did the delete, etc. this just sucks. deletion are handled normally, (with the race condition caveat, obviously), and i’ll discuss how we replicate them in a bit. high availability / scale out by definition, we actually don’t want to have storage replication here. either log shipping or consensus based. we actually do want to have different values, because we are going to be modifying things independently on many servers. that means that we need to do replication at the database level. and that leads to some interesting questions. again, the hard part here is the deletes. actually, the really hard part is what we are going to do with the new server problem. the new server problem dictates how we are going to bring a new server into the cluster. if we could fix the size of the cluster, that would make things a lot easier. however, we are actually interested in being able to dynamically grow the cluster size. therefor, there are only two real ways to do it: add a new empty node to the cluster, and have it be filled from all the other servers. add a new node by backing up an existing node, and restoring as a new node. ravendb, for example, follows the first option. but it means that in needs to track a lot more information. the second option is actually a lot simpler, because we don’t need to care about keeping around old data. however, this means that the process of bringing up a new server would now be: update all nodes in the cluster with the new node address (node isn’t up yet, replication to it will fail and be queued). backup an existing node and restore at the new node. start the new node. the order of steps is quite important. and it would be easy to get it wrong. also, on large systems, backup & restore can take a long time. operationally speaking, i would much rather just be able to do something like, bring a new node into the cluster in “silent” mode. that is, it would get information from all the other nodes, and i can “flip the switch” and make it visible to clients at any point in time. that is how you do it with ravendb, and it is an incredibly powerful system, when used properly. that means that for all intents and purposes, we don’t do real deletes. what we’ll actually do is replace the counter value with delete marker. this turns deletes into a much simple “just another write”. it has the sad implication of not free disk space on deletes, but deletes tend to be rare, and it is usually fine to add a “purge” admin option that can be run on as needed basis. but that brings us to an interesting issue, how do we actually handle replication. the topology map to simplify things, we are going to go with one way replication from a node to another. that allows complex topologies like master-master, cluster-cluster, replication chain, etc. but in the end, this is all about a single node replication to another. the first question to ask is, are we going to replicate just our local changes, or are we going to have to replicate external changes as well? the problem with replicating external changes is that you may have the following topology: now, server a got a value and sent it to server b. server b then forwarded it to server c. however, at that point, we also have a the value from server a replicated directly to server c. which value is it supposed to pick? and what about a scenario where you have more complex topology? in general, because in this type of system, we can have any node accept writes, and we actually desire this to be the case , we don’t want this behavior. we want to only replicate local data, not all the data. of course, that leads to an annoying question, what happens if we have a 3 node cluster, and one node fails catastrophically. we can bring a new node in, and the other two nodes will be able to fill in their values via replication, but what about the node that is down? the data isn’t gone, it is still right there in the other two nodes, but we need a way to pull it out. therefor, i think that the best option would be to say that nodes only replicate their local state, except in the case of a new node. a new node will be told the address of an existing node in the cluster, at which point it will: register itself in all the nodes in the cluster (discoverable from the existing node). this assumes a standard two way replication link between all servers, if this isn’t the case, the operators would have the responsibility to setup the actual replication semantics on their own. new node now starts getting updates from all the nodes in the cluster. it keeps them in a log for now, not doing anything yet. ask that node for a complete update of all of its current state. when it has all the complete state of the existing node, it replays all of the remembered logs that it didn’t have a chance to apply yet. then it announces that it is in a valid state to start accepting client connections. note that this process is likely to be very sensitive to high data volumes. that is why you’ll usually want to select a backup node to read from, and that decision is an ops decision. you’ll also want to be able to report extensively on the current status of the node, since this can take a while, and ops will be watching this very closely. server name a node requires a unique name. we can use guids, but those aren’t readable, so we can use machine name + port, but those can change. ideally, we can require the user to set us up with a unique name. that is important for readability and for being able to alter see all the values we have in all the nodes. it is important that names are never repeated, so we’ll probably have a guid there anyway, just to be on the safe side. actual replication semantics since we have the new server problem down to an automated process, we can choose the drastically simpler model of just having an internal queue per each replication destination. whenever we make a change, we also make a note of that in the queue for that destination, then we start an async replication process to that server, sending all of our updates there. it is always safe to overwrite data using replication, because we are overwriting our own data, never anyone else. and… that is about it, actually. there are probably a lot of details that i am missing / would discover if we were to actually implement this. but i think that this is a pretty good idea about what this feature is about.

March 25, 2014

by Oren Eini

· 12,657 Views · 1 Like

JavaScript Webapps with Gradle

Gradle, a versatile JVM build tool, effectively handles JavaScript and CSS tasks for web applications and server components.

March 24, 2014

by Kon Soulianidis

· 39,500 Views · 4 Likes

Automating the build of MSI setup packages on Jenkins

a short "how-to" based on an issue one of my work mates recently faced when trying to automate the creation of an msi package on jenkins. normally, visual studio solutions can be build on jenkins by using the appropriate msbuild plugin . apparently though, for visual studio setup projects, msbuild cannot be used and one has to switch to using visual studio itself to execute the build. so the first approach was to use devenv.exe as follows devenv.exe visualstudiosolution.sln /build "release" while this works, the problem is that it is an "async call", meaning that the compilation goes on in the background while the console from which the build is executed, immediately returns. obviously this isn't suited for being used on jenkins. searching around for a while, it turned out that you have to use devenv.com instead of devenv.exe : "c:\program files (x86)\microsoft visual studio 10.0\common7\ide\devenv.com"visualstudiosolution.sln /build "release" once you got that, integrating everything into jenkins is quite straightforward: (obviously you may also simply set an environment variable pointing to devenv.com on your build server rather than indicating the entire path)

March 13, 2014

by Juri Strumpflohner

· 13,635 Views

Spring Boot & JavaConfig integration

Java EE in general and Context and Dependency Injection has been part of the Vaadin ecosystem since ages. Recently, Spring Vaadin is a joint effort of the Vaadin and the Spring teams to bring the Spring framework into the Vaadin ecosystem, lead by Petter Holmström for Vaadin and Josh Long for Pivotal. Integration is based on the Spring Boot project - and its sub-modules, that aims to ease creating new Spring web projects. This article assumes the reader is familiar enough with Spring Boot. If not the case, please take some time to get to understand basic notions about the library. Note that at the time of this writing, there's no release for Spring Vaadin. You'll need to clone the project and build it yourself. The first step is to create the UI. In order to display usage of Spring's Dependency Injection, it should use a service dependency. Let's injection the UI through Constructor Injection to favor immutability. The only addition to a standard UI is to annotate it with org.vaadin.spring.@VaadinUI. @VaadinUI public class VaadinSpringExampleUi extends UI { private HelloService helloService; public VaadinSpringExampleUi(HelloService helloService) { this.helloService = helloService; } @Override protected void init(VaadinRequest vaadinRequest) { String hello = helloService.sayHello(); setContent(new Label(hello)); } } The second step is standard Spring Java configuration. Let's create two configuration classes, one for the main context and the other for the web one. Two thing of note: The method instantiating the previous UI has to be annotated with org.vaadin.spring.@UIScope in addition to standard Spring org.springframework.context.annotation.@Bean to bind the bean lifecycle to the new scope provided by the Spring Vaadin library. At the time of this writing, a RequestContextListener bean must be provided. In order to be compliant with future versions of the library, it's a good practice to annotate the instantiating method with @ConditionalOnMissingBean(RequestContextListener.class). @Configuration public class MainConfig { @Bean public HelloService helloService() { return new HelloService(); } } @Configuration public class WebConfig extends MainConfig { @Bean @ConditionalOnMissingBean(RequestContextListener.class) public RequestContextListener requestContextListener() { return new RequestContextListener(); } @Bean @UIScope public VaadinSpringExampleUi exampleUi() { return new VaadinSpringExampleUi(helloService()); } } The final step is to create a dedicated WebApplicationInitializer. Spring Boot already offers a concrete implementation, we just need to reference our previous configuration classes as well as those provided by Spring Vaadin, namely VaadinAutoConfiguration and VaadinConfiguration. public class ApplicationInitializer extends SpringBootServletInitializer { @Override protected SpringApplicationBuilder configure(SpringApplicationBuilder application) { return application.showBanner(false) .sources(MainConfig.class) .sources(VaadinAutoConfiguration.class, VaadinConfiguration.class) .sources(WebConfig.class); } } At this point, we demonstrated a working Spring Vaadin sample application. Code for this article can be browsed and forked on Github.

March 10, 2014

by Nicolas Fränkel

· 13,548 Views

How to Install R Packages with Ansible

Here is a short snippet of Ansible playbook that installs R and any required packages to any nodes of the cluster: - name: Making sure R is installed apt: pkg=r-base state=installed - name: adding a few R packages command: /usr/bin/Rscript --slave --no-save --no-restore-history -e "if (! ('{{item}' %in% installed.packages()[,'Package'])) install.packages(pkgs={{item}, repos=c('http://www.freestatistics.org/cran/'))" with_items: - rjson - rPython - plyr - psych - reshape2 You should replace the repos with one chosen from the list of Cran mirrors. Note that the command above installs each package only if it is not already present, but messes up the “changed” status of Ansible’s PLAY RECAP by incorrectly reporting a change per R package at every run. Find more big data technical posts on my blog.

March 5, 2014

by Svend Vanderveken

· 6,189 Views

Step-by-Step: Live Migrate Multiple (Clustered) VMs in One Line of PowerShell - Revisited

A while back, I wrote an article showing how to Live Migrate Your VMs in One Line of Powershell between non-clustered Windows Server 2012 Hyper-V hosts using Shared Nothing Live Migration. Since then, I’ve been asked a few times for how this type of parallel Live Migration would be performed for highly available virtual machines between Hyper-V hosts within a cluster. In this article, we’ll walk through the steps of doing exactly that … via Windows PowerShell on Windows Server 2012 or 2012 R2 or our FREE Hyper-V Server 2012 R2 bare-metal, enterprise-grade hypervisor in a clustered configuration. Wait! Do I need PowerShell to Live Migrate multiple VMs within a Cluster? Well, actually … No. You could certainly use the Failover Cluster Manager GUI tool to select multiple highly available virtual machines, right-click and select Move | Live Migration … Failover Cluster Manager – Performing Multi-VM Live Migration But, you may wish to script this process for other reasons … perhaps to efficiently drain all VM’s from a host as part of a maintenance script that will be performing other tasks. Can I use the same PowerShell cmdlets for Live Migrating within a Cluster? Well, actually … No again. When VMs are made highly available resources within a cluster, they’re managed as cluster group resources instead of being standalone VM resources. As a result, we have a different set of Cluster-aware PowerShell cmdlets that we use when managing these cluster groups. To perform a scripted multi-VM Live Migration, we’ll be leveraging three of these cmdlets: Get-ClusterNode, Get-ClusterGroup and Move-ClusterVirtualMachineRole Now, let’s see that one line of PowerShell! Before getting to the point of actually performing the multi-VM Live Migration in a single PowerShell command line, we first need to setup a few variables to handle the "what" and "where" of moving these VMs. First, let’s specify the name of the cluster with which we’ll be working. We’ll store it in a $clusterName variable. $clusterName = read-host -Prompt "Cluster name" Next, we’ll need to select the cluster node to which we’ll be Live Migrating the VMs. Lets use the Get-ClusterNode and Out-GridView cmdlets together to prompt for the cluster node and store the value in a $targetClusterNode variable. $targetClusterNode = Get-ClusterNode -Cluster $clusterName | Out-GridView -Title "Select Target Cluster Node" ` -OutputMode Single And then, we’ll need to create a list of all the VMs currently running in the cluster. We can use the Get-ClusterGroup cmdlet to retrieve this list. Below, we have an example where we are combining this cmdlet with a Where-Object cmdlet to return only the virtual machine cluster groups that are running on any node except the selected target cluster node. After all, it really doesn’t make any sense to Live Migrate a VM to the same node on which it’s currently running! $haVMs = Get-ClusterGroup -Cluster $clusterName | Where-Object {($_.GroupType -eq "VirtualMachine") ` -and ($_.OwnerNode -ne $targetClusterNode.Name)} We’ve stored the resulting list of VMs in a $haVMs variable. Ready to Live Migrate! OK … Now we have all of our variables defined for the cluster, the target cluster node and the list of VMs from which to choose. Here’s our single line of PowerShell to do the magic … $haVMs | Out-GridView -Title "Select VMs to Move" –PassThru | Move-ClusterVirtualMachineRole -MigrationType Live ` -Node $targetClusterNode.Name -Wait 0 Proceed with care: Keep in mind that your target cluster node will need to have sufficient available resources to run the VM's that you select for Live Migration. Of course, it's best to initially test tasks like this in your lab environment first. Here’s what is happening in this single PowerShell command line: We’re passing the list of VMs stored in the $haVMs variable to the Out-GridView cmdlet. Out-GridView prompts for which VMs to Live Migrate and then passes the selected VMs down the PowerShell object pipeline to the Move-ClusterVirtualMachineRole cmdlet. This cmdlet initiates the Live Migration for each selected VM, and because it’s using a –Wait 0 parameter, it initiates each Live Migration one-after-another without waiting for the prior task to finish. As a result, all of the selected VMs will Live Migrate in parallel, up to the maximum number of concurrent Live Migrations that you’ve configured on these cluster nodes. The VMs selected beyond this maximum will simply queue up and wait their turn. Unlike some competing hypervisors, Hyper-V doesn't impose an artificial hard-coded limit on how many VMs for you can Live Migrate concurrently. Instead, it's up to you to set the maximum to a sensible value based on your hardware and network capacity. Do you have your own PowerShell automation ideas for Hyper-V? Feel free to share your ideas in the Comments section below. See you in the Clouds! - Keith

March 3, 2014

by Keith Mayer

· 10,738 Views

Hibernate Query by Example (QBE)

What is It Query by example is an alternative querying technique supported by the main JPA vendors but not by the JPA specification itself. QBE returns a result set depending on the properties that were set on an instance of the queried class. So if I create an Address entity and fill in the city field then the query will select all the Address entities having the same city field as the given Address entity. The typical use case of QBE is evaluating a search form where the user can fill in any search fields and gets the results based on the given search fields. In this case QBE can reduce code size significantly. When to Use · Using many fields of an entity in a query · User selects which fields of an Entity to use in a query · We are refactoring the entities frequently and don’t want to worry about breaking the queries that rely on them Limitations · QBE is not available in JPA 1.0 or 2.0 · Version properties, identifiers and associations are ignored · The query object should be annotated with @Entity Test Data I used the following entities to test the QBE feature of Hibernate: · Address (long id, String city, String street, String countryISO2Code, AddressType addressType) · AddressType (Integer type, String description) Imports The examples will refer to the following classes: import org.hibernate.Criteria; import org.hibernate.Session; import org.hibernate.criterion.Example; import org.hibernate.criterion.Restrictions; import org.junit.Test; import java.util.List; Utility Methods I also made two utility methods to present a list of the two entity types: private void listAddresses(List addresses) { for (Address address : addresses) { System.out.println(address.getId() + ", " + address.getCountryISO2Code() + ", " + address.getCity() + ", " + address.getStreet() + ", " + address.getAddressType().getType() + ", " + address.getAddressType().getDescription()); } } private void listAddressTypes(List addressTypes) { for (AddressType addressType : addressTypes) { System.out.println(addressType.getType() + ", " + addressType.getDescription()); } } Example 1: Equals This example code returns the Address entities matching the given CountryISO2Code and City. Method: @Test public void testEquals() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("CHICAGO"); Example addressExample = Example.create(address); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 75, US, CHICAGO, Los Angeles Way2, 6, Customer 170, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 63, US, CHICAGO, Main Avenue 1, 5, Bill to 37, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 36, US, CHICAGO, Jackson Blvd 33a, 4, Delivery Example 2: Id Limitation This example presents that id fields in the query object are ignored. Method: @Test public void testIdLimitation() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("CHICAGO"); address.setId(100); // setting id is ignored Example addressExample = Example.create(address); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 75, US, CHICAGO, Los Angeles Way2, 6, Customer 170, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 63, US, CHICAGO, Main Avenue 1, 5, Bill to 37, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 36, US, CHICAGO, Jackson Blvd 33a, 4, Delivery Example 3: Association Limitation Associations of the query object are ignored, too. Method: @Test public void testAssociationLimitation() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("CHICAGO"); AddressType addressType = new AddressType(); addressType.setType(5); address.setAddressType(addressType); // setting an association is ignored Example addressExample = Example.create(address); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 75, US, CHICAGO, Los Angeles Way2, 6, Customer 170, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 63, US, CHICAGO, Main Avenue 1, 5, Bill to 37, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 36, US, CHICAGO, Jackson Blvd 33a, 4, Delivery Example 4: Like QBE supports like in the query object if we enable it with Example.enableLike(). Method: @Test public void testLike() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("AT%"); Example addressExample = Example.create(address).enableLike(); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 83, US, ATLANTA, null, 6, Customer 184, US, ATLANTA, null, 1, Shipper 25, US, ATLANTA, null, 1, Shipper Example 5: ExcludeProperty We can exclude a property with Example.excludeProperty(String propertyName). Method: @Test public void testExcludeProperty() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("AT%"); Example addressExample = Example.create(address).enableLike() .excludeProperty("countryISO2Code"); // countryISO2Code is a property of Address Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 154, GR, ATHENS, BETA ALPHA Street 5, 2, Consignee 83, US, ATLANTA, null, 6, Customer 25, US, ATLANTA, null, 1, Shipper 184, US, ATLANTA, null, 1, Shipper Example 6: IgnoreCase Case-insensitive search is supported by Example.ignoreCase(). Method: @Test public void testIgnoreCase() throws Exception { Session session = (Session) entityManager.getDelegate(); AddressType addressType = new AddressType(); addressType.setDescription("customer"); Example addressTypeExample = Example.create(addressType).ignoreCase(); Criteria criteria = session.createCriteria(AddressType.class) .add(addressTypeExample); listAddressTypes(criteria.list()); } Result: 6, Customer Example 7: ExcludeZeroes We can ignore 0 values of the query object by Example.excludeZeroes(). Method: @Test public void testExcludeZeroes() throws Exception { Session session = (Session) entityManager.getDelegate(); AddressType addressType = new AddressType(); addressType.setType(0); addressType.setDescription("Customer"); Example addressTypeExample = Example.create(addressType) .excludeZeroes(); Criteria criteria = session.createCriteria(AddressType.class) .add(addressTypeExample); listAddressTypes(criteria.list()); } Result: 6, Customer Example 8: Combining with Criteria QBE can be combined with criteria query. In this example we add further restriction to the query object using criteria query. Method: @Test public void testCombiningWithCriteria() throws Exception { Session session = (Session) entityManager.getDelegate(); AddressType addressType = new AddressType(); addressType.setDescription("Customer"); Example addressTypeExample = Example.create(addressType); Criteria criteria = session .createCriteria(AddressType.class).add(addressTypeExample) .add(Restrictions.eq("type", 6)); listAddressTypes(criteria.list()); } Result: 6, Customer Example 9: Association With criteria query we can filter both sides of an association, using two query objects. Method: @Test public void testAssociation() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); AddressType addressType = new AddressType(); addressType.setType(6); Example addressExample = Example.create(address); Example addressTypeExample = Example.create(addressType); Criteria criteria = session.createCriteria(Address.class).add(addressExample) .createCriteria("addressType").add(addressTypeExample); // addressType is a property of Address listAddresses(criteria.list()); } Result: 84, US, BOSTON, null, 6, Customer 83, US, ATLANTA, null, 6, Customer 82, US, SAN FRANCISCO, null, 6, Customer 75, US, CHICAGO, Los Angeles Way2, 6, Customer EclipseLink EclipseLink QBE uses QueryByExamplePolicy, ReadObjectQuery and JpaHelper: QueryByExamplePolicy qbePolicy =newQueryByExamplePolicy(); qbePolicy.excludeDefaultPrimitiveValues(); Address address =newAddress(); address.setCity("CHICAGO"); ReadObjectQuery roq =newReadObjectQuery(address, qbePolicy); Query query =JpaHelper.createQuery(roq, entityManager); OpenJPA OpenJPA uses OpenJPAQueryBuilder: CriteriaQuery cq = openJPAQueryBuilder.createQuery(Address.class); Address address =newAddress(); address.setCity("CHICAGO"); cq.where(openJPAQueryBuilder.qbe(cq.from(Address.class), address); References Hibernate: · Srinivas Guruzu and Gary Mak: Hibernate Recipes: A Problem-Solution Approach (Apress) · http://docs.jboss.org/hibernate/core/3.3/reference/en/html/querycriteria.html#querycriteria-examples · http://www.java2s.com/Code/Java/Hibernate/CriteriaQBEQueryByExampleCriteria.htm · http://www.dzone.com/snippets/hibernate-query-example · http://gal-levinsky.blogspot.de/2012/01/qbe-pattern.html Hibernate associations: · http://stackoverflow.com/questions/9309884/query-by-example-on-associations · http://stackoverflow.com/questions/8236596/hibernate-query-by-example-equivalent-of-association-criteria-query JPA: · http://stackoverflow.com/questions/2880209/jpa-findbyexample EclipseLink: · http://www.coderanch.com/t/486528/ORM/databases/findByExample-JPA-book OpenJPA: · http://www.ibm.com/developerworks/java/library/j-typesafejpa/#N10C18

February 27, 2014

by Donat Szilagyi

· 62,584 Views · 3 Likes