DevOps and CI/CD Resources

The Latest DevOps and CI/CD Topics

The Risks Of Big-Bang Deployments And Techniques For Step-wise Deployment

If you ever need to persuade management why it might be better to deploy a larger change in multiple stages and push it to customers gradually, read on.

A deployment of many changes is risky. We therefore want to deploy them in a way that minimizes the risk of harm to our customers and our companies. The deployment can be done either all at once (also known as big-bang) or gradually. We will argue here for the more gradual (“stepwise”) approach.

Big-bang or stepwise deployment?

A big-bang deployment seems to be the natural thing to do: the full solution is developed and tested and then replaces the current system at once. However, it has two crucial flaws.

First, it assumes that most defects can be discovered by testing. However, due to differences between test and production environments, unknown dependencies, and the sheer scale of a typical larger system, there will always be problems that are not discovered until production deployment, or even until the application has run for a while in production (which applies even to airplanes). The more parts have been changed, the more of these production defects will happen at the same time. A gradual deployment makes it possible to discover and handle them one by one.

Second, the more complex the deployment, the higher the chance of human error, i.e. the deployment itself is a likely source of serious defects.

Some of the drawbacks of a big-bang deployment in more detail:

Complexity: A big-bang deployment requires coordination of many people and “moving parts” that depend on each other, providing a huge opportunity for human mistakes (i.e. there will be mistakes).

Lot of time: Such a deployment requires a lot of time (typically more than planned or expected) and thus a lot of downtime during which users cannot use the system.

Hard troubleshooting: With a network of inter-dependent parts that all changed at the same time, while perhaps also changing the infrastructure (i.e. the connections between them), it is extremely hard to pinpoint the source of defects. This considerably increases the time to detect and correct defects, and it increases the risk of people stepping on each other's toes and of “panic fixes” that either cause more problems than they remove or are not good enough (as with the rollback that sped up Knight’s downfall).

Rollback is likely either impossible or as time-consuming and risky as the deployment itself, thus increasing the impact of defects and inviting even more human errors.

Impact: Deploying everything to all users at the same time means that everybody will be impacted by a potential defect, error, or mistake.

Long freeze: Everything needs to be tested together after all development is finished, which requires a lot of time while the code is frozen and no more fixes and changes can get into production for weeks.

Risk mitigation

The goal of a good deployment plan is to mitigate the risk of the deployment and get it to an acceptable level. There are two aspects to risk: the probability of a defect and the impact of the defect. The possible measures affect them as follows:

Defect probability reduction:
testing
stepwise deployment

Defect impact reduction:
gradual migration of users to the new version (e.g. 1 in 1000 or particular subsets)
rollback mechanism
=> these also lead to a much lower time to detect and fix defects

Practices for stepwise deployment

Enable stepwise deployment: Use parallel change and other Continuous Delivery techniques to make it possible to deploy updated components independently of each other, to switch new features on and off, and to switch which versions of the components they depend on are currently used. (Parallel change, i.e. keeping the old and new code and being able to use one or the other, is crucial here. Also notice that parallel change applies to data as well: you will need to evolve your data schema gradually and keep both the old and the new one for a period of time.)

Enable rollback: The previous measure, stepwise deployment, also makes it easy (or easier) to roll back the changes by switching to a previous version of a dependency or by switching back to the old code.

Migrate users gradually to the new version, i.e. expose the new version only to a small subset of the users initially and increase that subset until everybody uses it. This can be done e.g. by deploying to only a subset of servers and sending a random or particular subset of users to the new servers, but there are also ways to do it if you have only a single machine. (See e.g. my post Webapp Blue-Green Deployment Without Breaking Sessions/With Fallback With HAProxy.)

Monitoring: Make sure you are able to monitor the flow of users through the system and detect any anomalies and errors early, long before angry calls from the business. Tools such as Logstash, Google Analytics (with custom events from JavaScript), and client-side error logging via one of the existing services or a custom solution are invaluable.
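The gradual migration of users (for example 1 in 1000) can be illustrated with a small sketch. This is not from the original post: it assumes each user has a stable identifier, and the function name `in_rollout` and the `salt` parameter are hypothetical. Hashing the identifier gives every user a fixed position, so raising the fraction only ever adds users to the new version and never moves anyone back.

```python
import hashlib


def in_rollout(user_id: str, fraction: float, salt: str = "new-version") -> bool:
    """Return True if this user should be sent to the new version.

    `fraction` is the share of users to expose, between 0.0 and 1.0.
    `salt` (hypothetical) lets different rollouts pick different subsets.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes of the hash as an integer in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    # Compare against an integer threshold so 0.0 means nobody and 1.0 everybody.
    return bucket < int(fraction * 2**64)
```

Starting with `in_rollout(user_id, 0.001)` and increasing the fraction step by step corresponds to growing the subset until everybody uses the new version; setting it back to 0.0 acts as a simple rollback switch.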

February 20, 2014

by Jakub Holý

· 22,201 Views

To ServiceMix or Not to ServiceMix

This morning an interesting topic was posted to the Apache ServiceMix user forum, asking the question: To ServiceMix or not to ServiceMix. In my mind the short answer is: NO.

Guillaume Nodet, one of the key architects and a long-time committer on Apache ServiceMix, already had his mind set 3 years ago when he wrote this blog post - Thoughts about ServiceMix. What happened on the ServiceMix project was that the ServiceMix kernel was pulled out of ServiceMix into its own project - Apache Karaf. That happened in spring 2009, which Guillaume also blogged about.

So is all that bad? No, IMHO it's all great. In fact, having the kernel as a separate project, and Camel and CXF as the integration and WS/RS frameworks, would have allowed the ServiceMix team to focus on building an ESB that truly had value-add. But that did not happen. ServiceMix did not create a cross-product security model, web console, audit and trace tooling, clustering, governance, service registry, and much more that people were looking for in an ESB (or related to a SOA suite). There were only small pieces of it, never really baked well into the project.

That said, it's not too late. I think the ServiceMix project is dying, but if a lot of people in the community step up, contribute, and work on these things, then it can bring value to some users. But I seriously doubt this will happen.

PS: 6 years ago I was working as a consultant and looked at the next integration platform for a major Danish organization. We looked at ServiceMix back then and dismissed it due to its JBI nature, and the new OSGi-based architecture had only just been started. And frankly it has taken a long, long time to mature Apache Karaf / Felix / Aries and the other pieces in OSGi to what they are today, offering a stable and sound platform for users to build their integration applications. That was not the case 4-6 years ago.

Okay, No to ServiceMix - what are my options then?

So what should you use instead of ServiceMix?
Well, in my mind you have at least these two options.

1) Use Apache Karaf and add the pieces you need, such as Camel, CXF, and ActiveMQ, and build your own ESB. These individual projects have regular releases, and you can upgrade as you need. The only thing the ServiceMix project has in addition is the JBI components, which you should NOT use. Only legacy users who got on the old ServiceMix 3.x wagon may need them for a graceful upgrade from JBI to Karaf-based containers.

2) Take a look at fabric8. IMHO fabric8 is all the value-add the ServiceMix project did not create, and a lot more. James Strachan just blogged today about some of his thoughts on fabric8, JBoss Fuse, and Karaf. I encourage you to take a read. For example, he talks about how fabric becomes poly-container, so you have a much wider choice of which containers/JVMs to run your integration applications on. OSGi is no longer a requirement. (IMHO that is very, very exciting and potentially a game changer.)

I encourage you to check out the fabric8 web site, and also read the overview and motivation sections of the documentation. And then check out some of the videos. After the upcoming JBoss Fuse 6.1 release, the Fuse team at Red Hat will have more time and focus to bring the documentation at fabric8 up to date, covering all the functionality we have (there is a lot more), and also to bring out a 1.0 community release using pure community releases. This gives end users a 100% free-to-use, out-of-the-box release. And users looking for a commercial release can then use JBoss Fuse. Best of both worlds.

Summary

Okay, back to the question - to ServiceMix or not. The answer is NO. Innovation happens outside ServiceMix, and also more and more outside Apache. If you have thoughts, you can share them in comments to this blog, or better yet, get involved in the discussion at the ServiceMix user forum.

PPS: The thoughts in this blog are mine alone, and are not official words from my employer.

February 12, 2014

by Claus Ibsen

· 16,983 Views

Couchbase .NET SDK 2.0 Development Series: Part 1-1: Server Configuration

This article was originally written by Jeff Morris.

In the introduction to this series, I discussed some of the motivation for rewriting the .NET SDK, the goals, objectives, and major features of the upcoming 2.0 release, and we examined the high-level architecture (the 10,000-foot view) of a Couchbase Server Client SDK. In this post we will go over the design and development of one of the core configuration components of a Couchbase SDK: Server Configuration.

Introduction

A Couchbase SDK client requires configuration from two sources: the Client Configuration, which defines the IP of the cluster to connect to, the number of connections to use, and other important information regarding how the client will interact with the cluster; and the Server Configuration, which defines the current state of the cluster (e.g. number of nodes, buckets that are available, etc.), thus driving the internal state of a client (the Cluster Map). This post will only discuss the Server Configuration aspects and will largely revolve around implementing several well-defined interfaces or contracts.

HTTP Streaming Configuration

Currently, most clients use a “bootstrapping” technique via client configuration and a “Streaming Configuration” exposed by the Couchbase REST API. This is supported by Couchbase Server versions 2.2 and earlier.
The usual approach is as follows:

1. Within the “uris” element of a Client Configuration (semantics vary per client), a URL is defined from which to start the bootstrapping process: http://[SERVER]:8091/pools
2. The response is parsed and a request is made to get the buckets configuration: http://[SERVER]:8091/pools/default?uuid=[UUID]
3. This response is parsed and another request is made to get the streaming URL from: http://[SERVER]:8091/pools/default/buckets?v=[VERSION]&uuid=[UUID]
4. Finally, the streaming URL connection is made, which is long-lived and raises events in the client with respect to changes in the cluster: http://[SERVER]:8091/pools/default/bucketsStreaming/default?bucket_uuid=[UUID]

The client will then change its internal state to match that of the current server configuration.

There are some problems with this approach, among others:

- The “streaming URL” is resource intensive to create and maintain (mainly memory) on the server side.
- During a rebalance or failover situation, the cluster configuration may change many, many times. Each time this happens the client must tear down all of its resources (socket connections, VBucket mappings) and build its state up again and again, which can lead to reduced throughput, increased latency, higher than expected memory and CPU usage, and so on.
- Operations that are in flight may be terminated and then retried on a new config state; it’s as if the “carpet has been pulled out from underneath them”.
- NOT_MY_VBUCKET responses are handled inefficiently by simply trying the next node in the list; there is no information to tell the client which node to redirect the operation to.
A New Model for Configuration Management: CCCP

While the streaming HTTP “bootstrapping” approach has worked reasonably well for most clients, the downsides have begun to outweigh the pluses. A new model for updating client configuration has therefore been defined and is available starting with version 2.5 of the Couchbase Server: Client Cluster Configuration Publication, or “CCCP”.

CCCP introduces a new operation, to be used before or after authentication, to request configuration, as well as a mechanism for returning configuration information when a NOT_MY_VBUCKET response is returned for a failed operation. In a CCCP-supporting SDK, the client will react by using that configuration to update itself before resending the operation. Note that NOT_MY_VBUCKET is the standard response returned by the cluster when the cluster itself has changed (during a rebalance or failover scenario, for example) and the client has not yet “synched” up and is using a stale configuration, resulting in an invalid key mapping.

Whereas the “bootstrapping” approach is somewhat of a “pull” type operation, CCCP is either “push” or “pull” depending upon whether the request was initiated by the client (via an explicit CMD_GET_CLUSTER_CONFIG operation) or by the server itself (via a NOT_MY_VBUCKET response to an operation). We will go over CCCP in more detail in a later post.

File Based Configuration

One other semi-supported configuration option exists: file based configuration. File based configuration is primarily useful for testing and development, and we will provide an implementation in the test projects to remove some of the dependencies that are difficult to replicate and/or cause false positives when running the test suite.
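The NOT_MY_VBUCKET handling described above can be sketched with a toy model. This is an illustration only and not the SDK's code: `FakeNode`, `Client`, the four-entry map, and the CRC32-modulo key mapping are all simplifying assumptions. It shows the “push” case, where a node rejects an operation, returns the current map with the rejection, and the client updates itself before resending.

```python
import zlib
from typing import Dict, List, Optional, Tuple

NUM_VBUCKETS = 4  # tiny value chosen for illustration only


def vbucket_for(key: str) -> int:
    # Simplified key mapping (an assumption): CRC32 of the key modulo the map size.
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


class FakeNode:
    """Hypothetical stand-in for a cluster node that knows the current map."""

    def __init__(self, name: str, cluster_map: List[str]) -> None:
        self.name = name
        self.cluster_map = cluster_map  # vbucket index -> owning node name
        self.data: Dict[str, str] = {}

    def set(self, key: str, value: str) -> Tuple[str, Optional[List[str]]]:
        if self.cluster_map[vbucket_for(key)] != self.name:
            # Reject the operation and return the current map with the rejection.
            return "NOT_MY_VBUCKET", list(self.cluster_map)
        self.data[key] = value
        return "OK", None


class Client:
    """Hypothetical client holding a possibly stale copy of the map."""

    def __init__(self, nodes: Dict[str, FakeNode], vbucket_map: List[str]) -> None:
        self.nodes = nodes
        self.vbucket_map = list(vbucket_map)

    def set(self, key: str, value: str, max_attempts: int = 3) -> str:
        for _ in range(max_attempts):
            node = self.nodes[self.vbucket_map[vbucket_for(key)]]
            status, new_map = node.set(key, value)
            if status == "OK":
                return node.name
            # Update local state from the returned configuration, then resend.
            self.vbucket_map = new_map
        raise RuntimeError("operation failed after %d attempts" % max_attempts)
```

Compared with trying the next node in the list, the client here learns exactly which node owns the key from the rejected response, so one retry is normally enough.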
Structural Architecture View

Internally, the Server Configuration component of the client is a provider-based model, in which multiple implementations of a configuration provider can be configured in the client and a strategy can then be chosen to determine which provider should be used. The default is a simple linear fallback approach, where the first configured provider is used and, if it fails, the next provider in sequence takes its place.

The main actor objects, and their relationships with some of the other key objects within the client (which will be discussed in subsequent posts), are described below:

ConfigurationProvider: a source which shall yield a new ConfigInfo. It’s the responsibility of the provider to provide the mechanism for fetching the configuration from its source.

ConfigurationInformation: the configuration info contains a list of possible nodes and the VBucket map informing clients about which servers within said nodes a given key should be forwarded to.

ConfigurationManager: the bridge between the client and the providers, and the strategy used to determine which provider to use and what retry logic to apply.

A more detailed document of this architecture can be found here. Please note that this, like all development, is an evolutionary process, so expect some changes and revisions over time.

Conclusion and Next Steps

This post discussed the history (HTTP Streaming) and the future (CCCP) of Couchbase SDK Server Configuration Management. In the next post we will go into detail on the implementation of the HTTP Streaming configuration provider, which is required for clients targeting pre-2.5 versions of the Couchbase Server.
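The linear fallback strategy described in the architecture section could look roughly like the sketch below. It is written in Python purely for illustration and is not the .NET SDK's API: the class name is borrowed from the description above, while the provider functions, the `ConfigInfo` alias, and the returned dictionary shape are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the cluster state a provider yields.
ConfigInfo = Dict[str, object]
Provider = Callable[[], ConfigInfo]


class ConfigurationManager:
    """Tries each configured provider in order until one yields a configuration."""

    def __init__(self, providers: List[Provider]) -> None:
        if not providers:
            raise ValueError("at least one configuration provider is required")
        self._providers = list(providers)

    def get_config(self) -> ConfigInfo:
        errors: List[Exception] = []
        for provider in self._providers:
            try:
                return provider()
            except Exception as exc:  # this provider failed; fall back to the next
                errors.append(exc)
        raise RuntimeError(
            "all %d configuration providers failed" % len(errors)
        ) from errors[-1]


def cccp_provider() -> ConfigInfo:
    # Hypothetical: pretend the server is older than 2.5 and lacks CCCP.
    raise ConnectionError("CCCP not supported by this server")


def http_streaming_provider() -> ConfigInfo:
    # Hypothetical: pretend the HTTP bootstrap succeeded.
    return {"source": "http-streaming", "nodes": ["10.0.0.1", "10.0.0.2"]}
```

With `ConfigurationManager([cccp_provider, http_streaming_provider])`, a call to `get_config()` falls through to the HTTP streaming provider. Retry logic per provider, which the description also assigns to the manager, is left out to keep the sketch short.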

February 7, 2014

by Don Pinto

· 3,791 Views

Java: Handling a RuntimeException in a Runnable

At the end of last year I was playing around with running scheduled tasks to monitor a Neo4j cluster and one of the problems I ran into was that the monitoring would sometimes exit. I eventually realised that this was because a RuntimeException was being thrown inside the Runnable method and I wasn’t handling it. The following code demonstrates the problem: import java.util.ArrayList; import java.util.List; import java.util.concurrent.*; public class RunnableBlog { public static void main(String[] args) throws ExecutionException, InterruptedException { ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(); executor.scheduleAtFixedRate(new Runnable() { @Override public void run() { System.out.println(Thread.currentThread().getName() + " -> " + System.currentTimeMillis()); throw new RuntimeException("game over"); } }, 0, 1000, TimeUnit.MILLISECONDS).get(); System.out.println("exit"); executor.shutdown(); } } If we run that code we’ll see the RuntimeException but the executor won’t exit because the thread died without informing it: Exception in thread "main" pool-1-thread-1 -> 1391212558074 java.util.concurrent.ExecutionException: java.lang.RuntimeException: game over at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) at java.util.concurrent.FutureTask.get(FutureTask.java:111) at RunnableBlog.main(RunnableBlog.java:11) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120) Caused by: java.lang.RuntimeException: game over at RunnableBlog$1.run(RunnableBlog.java:16) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) at 
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) At the time I ended up adding a try catch block and printing the exception like so: public class RunnableBlog { public static void main(String[] args) throws ExecutionException, InterruptedException { ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(); executor.scheduleAtFixedRate(new Runnable() { @Override public void run() { try { System.out.println(Thread.currentThread().getName() + " -> " + System.currentTimeMillis()); throw new RuntimeException("game over"); } catch (RuntimeException e) { e.printStackTrace(); } } }, 0, 1000, TimeUnit.MILLISECONDS).get(); System.out.println("exit"); executor.shutdown(); } } This allows the exception to be recognised and as far as I can tell means that the thread executing the Runnable doesn’t die. 
java.lang.RuntimeException: game over pool-1-thread-1 -> 1391212651955 at RunnableBlog$1.run(RunnableBlog.java:16) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) pool-1-thread-1 -> 1391212652956 java.lang.RuntimeException: game over at RunnableBlog$1.run(RunnableBlog.java:16) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) pool-1-thread-1 -> 1391212653955 java.lang.RuntimeException: game over at RunnableBlog$1.run(RunnableBlog.java:16) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) This worked well and allowed me to keep monitoring the cluster. However, I recently started reading ‘Java Concurrency in Practice‘ (only 6 years after I bought it!) and realised that this might not be the proper way of handling the RuntimeException. public class RunnableBlog { public static void main(String[] args) throws ExecutionException, InterruptedException { ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(); executor.scheduleAtFixedRate(new Runnable() { @Override public void run() { try { System.out.println(Thread.currentThread().getName() + " -> " + System.currentTimeMillis()); throw new RuntimeException("game over"); } catch (RuntimeException e) { Thread t = Thread.currentThread(); t.getUncaughtExceptionHandler().uncaughtException(t, e); } } }, 0, 1000, TimeUnit.MILLISECONDS).get(); System.out.println("exit"); executor.shutdown(); } } I don’t see much difference between the two approaches so it’d be great if someone could explain to me why this approach is better than my previous one of catching the exception and printing the stack trace.

February 6, 2014

by Mark Needham

· 19,664 Views

How to Set Up a Multi-Node Hadoop Cluster on Amazon EC2, Part 1

Learn how to set up a four-node Hadoop cluster using AWS EC2, PuTTY (and PuTTYgen), and WinSCP.

January 23, 2014

by Hardik Pandya

· 136,021 Views · 3 Likes

Google's vs Facebook's Trunk-Based Development

i’ve been pushing this branching model for something like 14 years now. it’s nice to see facebook say a little more about their trunk based development . of course they’re not doing it because they read anything i wrote, as the practice isn’t mine, it’s been hanging around in the industry for many years, but always as bridesmaid so to speak. if not trunk, what? mainline? mainline as popularized by clearcase is what we’re trying to kill. at least historically. it’s very different to trunk based development, and even having vastly improved merge tools doesn’t make it better – you still risk regressions, and huge nerves around ordering of releases. clearcase’s best-practices also foisted a ‘many repos’ (vobs) on teams using it, and that courted the whole conway’s law prophesy. i mentioned conway’s law before in scaling trunk based development and it concerns undue self-importance of teams around arbitrary separations. multiple small repos for a dvcs ? there is a great statement by a reddit user in the programming section of reddit, in conjunction with the facebook announcement: comment ref or all comments this redditor is right, there’s a lack of atomicity around a many-repos design, that stymies bisect. it could be that git subtrees (not submodules) are a way of getting that back (thanks @chris_stevenson on a back channel). there’s also a real problem moving code easily between repos (with history) though @offbytwo (back channel again) points out that subtrees carefully used can help do that. trunk at google vs facebook tuesday’s announcement was from facebook, and to give some balance, there’s deeper info on google’s trunk design in: google’s scaled trunk based development . subsetting the trunk for checkouts tl;dr: different google have many thousands of buildable and deployable things, which have very different release schedules. facebook don’t as they substantially have the php web-app, and apps for ios and android in different repos. 
well at least the main php web-app is in the mercurial trunk they talked about on tuesday. i’m not sure how the ios and android apps are managed, but at least the android one is outside the main trunk. google subset their trunk. i posted about that on monday . in that article i pointed out that the checkout can grow (or shrink) depending on the nature of the change being undertaken. it’s very different to a multiple-small-repos design. facebook don’t subset their trunk on checkout, as they do not need to; the head revisions of everything in that trunk are not big enough for a c: drive or ide to buckle. there’s also no compile stage for php , for regular development work. maximized sharing of code tl;dr: the same code is shared using globbed directories within the source tree. it’s shared as source files, in situ, rather than classes in a jar (or equivalent). refactoring tl;dr: the same developers take on refactorings where appropriate. sure it means a bigger atomic commit, but knowing all the affected source is in front of you as you do the refactoring is comforting. at least, knowing that if intellij (or eclipse, etc) completes the refactoring there’s a very strong possibility that the build will stay green, and that you’re only going to have a slight impact on other people’s working copy, and only if they are concurrently editing the same files. bigger refactoring probably still require a warning email. super tooling of the build phase tl;dr: the same google have what amounts to a super-computer doing the compilation for them (all languages that are compiled). all developers and all ci daemons leverage it. and by effective super-computer, i mean previous-compiled bits and pieces are pulled out of an internal cloud-map-thing for source permutations that have been compiled before. the distributed hashmap is possibly lru centric rather that everything forever. 
facebook don’t have that big hashmap of recently compiled bits and pieces, but they do have hiphop in the toolchain (originally a php to c++ compiler) which is interesting because at face value php is an interpreted language and ‘compile’ makes no sense. hiphop was created to reduce the server footprint and requirements for production deployments, while still being 100% functionally identical to the interpreted php app. it’s also faster in production. more recently hiphop became a virtual machine. it continues to be incrementally improved. like google, facebook can measure cost-benefit of continued work on it (prod rack space & prod electricity vs developer salaries). source-control weapons of choice tl;dr: different google use perforce for their trunk (with additional tooling), and many (but not all) developers use git on their local workstation to gain local-branching with an inhouse developed bridge for interop with perforce. facebook uses mercurial with additional tooling for the central server/repo. it’s unclear whether developers, by habit, exist with the mercurial client software or use git which can interop with mercurial backends. both google and facebook do trunk based development of course. branches & merge pain tl;dr: the same they don’t have merge pain, because as a rule developers are not merging to/from branches. at least up to the central repo’s server they are not. on workstations, developers may be merging to/from local branches, and rebasing when the push something that’s “done” back to the central repo. release engineers might cherry-pick defect fixes from time to time, but regular developers are not merging (you should not count to-working-copy merges) eating own dog-food tl;dr: mostly different all staff at facebook use a not-live-yet version of the web-app for all of their communication, documentation, management etc. 
If there’s a bug everyone feels it – though Selenium2 functional tests and zillions of unit tests guard against that happening too often. Google has too many different apps for the team making each one to be called a daily user of it. For example, an AdSense developer may use a dog-food version of Gmail, but they are making AdSense, so they are hardly hurting themselves, as they are not using that interface minute by minute as part of their regular existence at Google.

Code review

tl;dr: same

Both Google and Facebook insist on code reviews before the commit is accepted into the remote repo’s trunk for all others to use. There’s no mechanism of code review that’s more efficient or effective. Back in 2009, Google were pivoting incoming changes to the trunk around the code-review process managed by Mondrian. I wrote about that in “Continuous Review #1” in December. I think they are unchanged in that respect: developers actively push their commit after a code review has been completed.

Facebook have just flipped to Mercurial (from Subversion). In the article linked to at the top of the page, Facebook have not mentioned “pull request” or “patch queue”, or indeed “code review”. The article was mostly about speed, robustness and scale. I suspect they are working within the semantics of Mercurial’s patch-queue processing, though assigning a bot to it rather than a human.

Update: Simon Stewart pinged me and reminded me that they use (and made) Phabricator. He spoke about it in a Mobile@Scale presentation, and that video is here. In the video he says the review is queue-based now, but that they are experimenting with landing the change sets into master. The video is from November, and was for the Android + iOS platforms, but it is likely to be used today for the main trunk for the PHP web app.

Automated testing

tl;dr: same

Heavy reliance on unit tests (not necessarily made in a TDD style). Later in the build pipeline, Selenium2 tests (for web apps at least) kick in to guard the functional quality of the deployed app.

Manual QA

tl;dr: mostly the same

Both companies have progressively moved away from manual QA and dedicated testing professionals, towards developers testing their own stuff at discrete moments (note the dog-food item above too).

Prod release frequency

tl;dr: it varies

Facebook, for the main web app, release twice a day presently (at least on weekdays). I published info on that at the start of last year. Google have many apps with different release schedules: some are “many times a day”, while others are “planned releases every few weeks”. Many are in between.

Prod DB deployment

tl;dr: mostly the same

Database (or equivalent) table shapes (or equivalent) are designed to be forwards/backwards compatible as far as possible.

Pull requests as part of workflow

tl;dr: same

Etsy, GitHub, and other high-throughput organizations are trunking by some definition, but using pull requests to merge in things being done. That carries different obligations, but Google and Facebook are not doing this in their trunks – they both essentially push (after review). Refer to the “Code review” section above.

Common code ownership

tl;dr: the same

You can commit to any part of the source tree, provided the change passes a fair code review. Notional owners of directories within the source tree take a boy-scout pledge to do their best with unsolicited incoming change-lists. There are strong permissions in the Google Perforce implementation, but the pledge means that contributions are not often rejected if the merit is there.

Build is ever broken

tl;dr: the same

Almost never.

Directionality of merge for prod bug fixes

tl;dr: the same

Trunk receives the defect fix, and it gets cherry-picked to the release branch. The release branch might have been made from a tag, if it didn’t exist before.

Binary dependencies

tl;dr: the same

Checked into source control without version suffixing (harmonized versions across all apps), e.g. log4j.jar rather than log4j-1.2.8.jar.

January 21, 2014

by Paul Hammant

· 18,876 Views

Python Script to Delete Merged Git Branches

One of the great things about git is how fast it is. You can create a new branch, or switch to another branch, almost as fast as you can type the command. This tends to lower the impedance of branching. As a result, many individuals and teams will naturally converge on a process where they create many, many branches. If you’re like me, you may have 30 branches at any given time. This can make viewing all the branches unwieldy. Once a week or so, I would go on a branch deletion spree by manually copying and pasting multiple branch names into a git branch -D statement. The basic use case is that you want to delete any branches that are already merged into master. Here is a Python script that automates just that.

    from subprocess import check_output
    import sys


    def get_merged_branches():
        ''' a list of merged branches, not counting the current branch or master '''
        # universal_newlines=True returns text rather than bytes on Python 3
        raw_results = check_output('git branch --merged upstream/master',
                                   shell=True, universal_newlines=True)
        return [b.strip() for b in raw_results.split('\n')
                if b.strip() and not b.startswith('*') and b.strip() != 'master']


    def delete_branch(branch):
        return check_output('git branch -D %s' % branch,
                            shell=True, universal_newlines=True).strip()


    if __name__ == '__main__':
        dry_run = '--confirm' not in sys.argv
        for branch in get_merged_branches():
            if dry_run:
                print(branch)
            else:
                print(delete_branch(branch))
        if dry_run:
            print('*****************************************************************')
            print('Did not actually delete anything yet, pass in --confirm to delete')
            print('*****************************************************************')

To print the branches that would be deleted, just execute python delete_merged_branches.py. To actually delete the branches, execute python delete_merged_branches.py --confirm.
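As a variation on the script above, here is a minimal sketch that separates the parsing of git's output from the git calls, so the filtering logic can be exercised without a repository. The PROTECTED set and the function names are assumptions made for this sketch, not part of the original script. It also passes arguments as a list instead of going through a shell, and uses -d rather than -D so that git itself refuses to delete a branch it considers unmerged.

```python
import subprocess

# Branch names that should never be deleted; this set is an assumption of the sketch.
PROTECTED = {"master", "main", "develop"}


def parse_merged_branches(raw_output, protected=PROTECTED):
    """Return deletable branch names from `git branch --merged` output."""
    branches = []
    for line in raw_output.splitlines():
        name = line.strip()
        # Skip blank lines, the current branch ("* name"), branches checked
        # out in another worktree ("+ name"), and protected names.
        if not name or name.startswith(("*", "+")) or name in protected:
            continue
        branches.append(name)
    return branches


def merged_branches(base="upstream/master"):
    """Ask git which local branches are fully merged into `base`."""
    raw = subprocess.check_output(
        ["git", "branch", "--merged", base], universal_newlines=True
    )
    return parse_merged_branches(raw)


def delete_branch(branch):
    """Delete one branch; -d (unlike -D) fails if git considers it unmerged."""
    return subprocess.check_output(
        ["git", "branch", "-d", branch], universal_newlines=True
    ).strip()
```

Keeping the parser pure means a dry run is just a matter of printing the result of merged_branches() without calling delete_branch().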

January 21, 2014

by Chase Seibert

· 8,140 Views

Using Grunt with AngularJS for Front End Optimization

I'm passionate about front end optimization and have been for years. My original inspiration was Steve Souders and his Even Faster Web Sites talk at OSCON 2008. Since then, I've optimized this blog, made it even faster with a new design, doubled the speed of several apps for clients and showed how to make AppFuse faster. As part of my Devoxx 2013 presentation, I showed how to do page speed optimization in a Java webapp.

I developed a couple of AngularJS apps last year. To concat and minify their stylesheets and scripts, I used mechanisms that already existed in the projects. On one project, it was Ant and its concat task. On the other, it was part of a Grails application, so I used the resources and yui-minify-resources plugins.

The Angular project I'm working on now will be published on a web server, as well as bundled in an iOS native app. Therefore, I turned to Grunt to do the optimization this time. I found it to be quite simple, once I figured out how to make it work with Angular. Based on my findings, I submitted a pull request to add Grunt to angular-seed.

Below are the steps I used to add Grunt to my Angular project.

1. Install Grunt's command line interface with "sudo npm install -g grunt-cli".

2. Edit package.json to include a version number (e.g. "version": "1.0.0").

3. Add Grunt plugins in package.json to do concat/minify/asset versioning:

    "grunt": "~0.4.1",
    "grunt-contrib-concat": "~0.3.0",
    "grunt-contrib-uglify": "~0.2.7",
    "grunt-contrib-cssmin": "~0.7.0",
    "grunt-usemin": "~2.0.2",
    "grunt-contrib-copy": "~0.5.0",
    "grunt-rev": "~0.1.0",
    "grunt-contrib-clean": "~0.5.0"

4. Create a Gruntfile.js that runs all the plugins.

    module.exports = function (grunt) {

      grunt.initConfig({
        pkg: grunt.file.readJSON('package.json'),

        clean: ["dist", '.tmp'],

        copy: {
          main: {
            expand: true,
            cwd: 'app/',
            src: ['**', '!js/**', '!lib/**', '!**/*.css'],
            dest: 'dist/'
          },
          shims: {
            expand: true,
            cwd: 'app/lib/webshim/shims',
            src: ['**'],
            dest: 'dist/js/shims'
          }
        },

        rev: {
          files: {
            src: ['dist/**/*.{js,css}', '!dist/js/shims/**']
          }
        },

        useminPrepare: {
          html: 'app/index.html'
        },

        usemin: {
          html: ['dist/index.html']
        },

        uglify: {
          options: {
            report: 'min',
            mangle: false
          }
        }
      });

      grunt.loadNpmTasks('grunt-contrib-clean');
      grunt.loadNpmTasks('grunt-contrib-copy');
      grunt.loadNpmTasks('grunt-contrib-concat');
      grunt.loadNpmTasks('grunt-contrib-cssmin');
      grunt.loadNpmTasks('grunt-contrib-uglify');
      grunt.loadNpmTasks('grunt-rev');
      grunt.loadNpmTasks('grunt-usemin');

      // Tell Grunt what to do when we type "grunt" into the terminal
      grunt.registerTask('default', [
        'copy', 'useminPrepare', 'concat', 'uglify', 'cssmin', 'rev', 'usemin'
      ]);
    };

5. Add comments to app/index.html so usemin knows what files to process. The comments are the important part; your files will likely be different.

    ...

A couple of things to note: 1) the copy task copies the "shims" directory from Webshims lib because it loads files dynamically and 2) setting "mangle: false" on the uglify task is necessary for Angular's dependency injection to work. I tried to use grunt-ngmin with uglify and had no luck.

After making these changes, I'm able to run "grunt" and get an optimized version of my app in the "dist" folder of my project. For development, I continue to run the app from my "app" folder, so I don't currently have a need for watching and processing assets on-the-fly. That could change if I start using LESS or CoffeeScript.

The results speak for themselves: from 27 requests to 5 on initial load, and only 3 requests for less than 2K after that.
Configuration | YSlow | Page Speed
No optimization | 75 (27 HTTP requests / 464K) | 55/100
Apache optimization (gzip and expires headers) | 89 (initial load: 26 requests / 166K; primed cache: 4 requests / 40K) | 88/100
Apache + concat/minified/versioned files | 98 (initial load: 5 requests / 136K; primed cache: 3 requests / 1.4K) | 93/100

January 16, 2014

by Matt Raible

· 67,881 Views · 2 Likes

Custom Checkstyle’s checks integration into SonarQube

Companies that use Checkstyle usually extend the standard set of checks with their own, or modify existing ones to suit their needs. There are lots of ready-to-use solutions that help to use Checkstyle in a number of ways: the Maven Checkstyle Plugin, the IntelliJ IDEA Checkstyle Plugin and the Eclipse Checkstyle Plugin. However, the IDE environment often differs between departments of the same company, or even between team members, and integrating custom checks into all of them is not that simple. The Sonar Checkstyle Plugin can help here: it integrates the checks and shows validation results to all of its users, no matter what IDE they use. In this article I'll provide an example of Checkstyle usage in Sonar, which is a cross-IDE solution for different platforms and environments.

The example is based on the sevntu.checkstyle project, which contains a number of additional (non-standard) checks for Checkstyle. Here are some of the most valuable checks, in my opinion (7 out of 32):

AvoidNotShortCircuitOperatorsForBooleanCheck – forbids the non-short-circuit operators ("|", "&") in boolean calculations.
CustomDeclarationOrderCheck – adjusts class structure to make it more predictable.
VariableDeclarationUsageDistanceCheck – checks the distance between the declaration of a variable and its first usage.
EitherLogOrThrowException – reports code that both logs an exception and throws it; do one or the other, but never both.
AvoidHidingCauseExceptionCheck – checks for hiding the cause of an exception by throwing a new exception.
ConfusingConditionCheck – prevents negation within an "if" expression if "else" is present.
ReturnNullInsteadOfBoolean – reports returning null instead of a boolean.

There is an extension for Sonar's Checkstyle plugin which allows the use of non-standard checks within Sonar. Let's dive a bit into the process of integration. Each check is represented as a separate rule in Sonar.
After creating a new check, we have to add a new rule so that Sonar can understand and use it. To accomplish this we use the checkstyle-extensions.xml configuration file in the sevntu-checkstyle-sonar-plugin project. For instance, here is the rule for ReturnNullInsteadOfBoolean:

    com.github.sevntu.checkstyle.checks.coding.ReturnNullInsteadOfBoolean
    Returning Null Instead of Boolean
    Method declares to return Boolean, but returns null.
    Checker/TreeWalker/com.github.sevntu.checkstyle.checks.coding.ReturnNullInsteadOfBoolean

To make Sonar aware of a new check, we have to complete the following steps:

    # build the project
    $ cd sevntu-checkstyle-sonar-plugin
    $ mvn clean install

    # copy the resulting jar file into Sonar
    $ cp target/sevntu-checkstyle-sonar-plugin-x.x.x.jar [SONAR_HOME]/extensions/plugins/

    # restart Sonar
    $ [SONAR_HOME]/bin/linux-x86-64/sonar.sh restart

The only thing left is to create a new profile in Sonar's “Quality Profiles” tab. We have already created a default Checkstyle configuration which contains all the non-standard checks from the “sevntu.checkstyle” project, so we can just import this configuration when creating a new profile, and that's it. Now we can configure and use non-standard Checkstyle checks in addition to the standard ones within Sonar.

This project is a good example of how you can integrate your custom checks into the static code analysis stage and make them user-friendly and accessible to all members of your team, without getting involved in a war of “which IDE is the best and most functional for static code analysis”.

Useful links:

Install Sonar and analyze a project
How to integrate sevntu checks into SonarQube™ (developer's guide)
How to integrate sevntu checks into SonarQube™ (user's guide)
Mail-list for QnA

January 15, 2014

by Ruslan Diachenko

· 21,411 Views

Introduction to Codenvy

What is Codenvy exactly? Well, their website states: Codenvy is a cloud environment for coding, building, and debugging apps. Basically, it’s an IDE in the cloud (“IDE as a service?”) accessible from all the major browsers. It started out as an additional feature to the eXo Platform in early 2009 and gained a lot of traction after the first PaaS (OpenShift) and Git integration were added mid-2011. Codenvy targets me as a (Java) software developer to run and debug applications in their hosted cloud IDE, while being able to share and collaborate during development and finally publish to a repository – e.g. Git – or a number of deployment platforms – e.g. Amazon, OpenShift or Google App Engine. I first encountered their booth at JavaOne last September, but they couldn’t demo their product right there on the spot over the Wi-Fi, because their online demo workspace never finished loading. Well, I got the T-shirt instead then, but now’s the time to see what Codenvy has in store as a cloud IDE.

Signing up

Signing up took 3 seconds. All you have to do is go to codenvy.com, use the “Sign up” button, choose an email address and a name for your workspace, confirm the email they’ll send you and you’re done. The “workspace” holds all your projects and is part of the URL Codenvy will create for you, like “https://codenvy.com/ide/”. Although it is not very clear during the registration process – which of course nowadays is usually as minimalistic as can be – it seems that I’ve signed up for Codenvy’s free Community plan, which gives me an unlimited number of public projects. You can even start coding without registration. After confirming the registration mail, I’m in. Finally I end up in the browser, where my (empty) workspace has been opened.
Empty workspace

A few options are possible from here on, as seen in the figure above:

Create a new project from scratch – generate an empty project from predefined project types
Import from GitHub – import projects from your GitHub account
Clone a Git repository – create a new project from any public Git repository
Browse documentation
Invite people – get team members on board
Support – questions, feedback and troubleshooting

Let’s…

Create a new project from scratch

This option allows you to name the new project – e.g. “myproject” – and choose a technology and a PaaS. The technology is a defined set of languages or frameworks to develop with.

Available technologies

At the moment the technologies are:

Java JAR
Java WAR
Java Spring
JavaScript
Ruby on Rails
Python
PHP
Node.js
Android
Maven Multi-module

At the time of writing Java 1.6 is supported.

Available PaaS

At the moment the available platforms are:

Amazon Web Services (AWS) Elastic Beanstalk
Savvis Cloud
AppFog
CloudBees
Google App Engine (GAE)
Heroku
Manymo Android emulator
Red Hat’s OpenShift
None

Depending on the choice of technology, one or more PaaS options become available. A single JAR cannot be deployed onto any of the platforms, leaving only the option “None” available. A Java web application (WAR) can be deployed onto any number of platforms, except Heroku and Manymo. Node.js can only be deployed to OpenShift.

Creating a simple JAR project

After having selected a JAR (and no platform) one can select a project template. E.g. if a web application (WAR) had been selected, Codenvy would present project templates such as Google App Engine Java project illustrating simple examples that use the Search API, Java Web project with Datasource usage or a Demonstration of accessing Amazon S3 buckets using the Java SDK. The JAR technology has only one template: Simple JAR project. After having finished the wizard, our JAR project has been created in our workspace.
We’ll see two views of our project: a project explorer and a package explorer.

Project- and package explorer

What we can see is that our JAR project has been given a Maven pom.xml with the following content:

    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <groupId>com.codenvy.workspaceyug8g52wjwb5im13</groupId>
      <artifactId>testjarproject</artifactId>
      <version>1.0-SNAPSHOT</version>
      <packaging>jar</packaging>

      <name>sample-lib</name>

      <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
      </properties>

      <dependencies>
        <dependency>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
          <version>3.8.1</version>
          <scope>test</scope>
        </dependency>
      </dependencies>
    </project>

We have a generated group id com.codenvy.workspaceyug8g52wjwb5im13, our own artifact id and the JUnit dependency, which is a decent choice since many Java developers use it as a testing framework. The source encoding has already been set to UTF-8, which is also a sensible choice. As a convenience we’ve also been given a hello.sayhello class, so we know we’re actually in a Java project.

Say hello

File & project management

So what about the browser-based editor we’re working in? On top we’re seeing a few menus: File, Project, Edit, View, Run, Git, PaaS, Window, Share and Help. I’ll be highlighting a few.

File- and project menu

The File menu allows creating folders, packages and various kinds of file types, such as text, XML (1.0 at time of writing), HTML (4.1), CSS (2.0), Java classes and JSPs (2.1). Although I’m in a JAR project, I am still able to create e.g. Ruby, PHP or Python files here. A very convenient feature is uploading existing files to the workspace, either separately or in zip archives.
I’ve tried dropping a file onto the package explorer from the file system, but the browser (in this case, Chrome) tries to open it instead.

The Project menu allows you to create new projects by launching the Create Project wizard again, and also allows importing from GitHub. In order to clone a repository, you’ll have to authorize Codenvy to access github.com. After having authorized GitHub, Codenvy presents me with a list of projects to choose from. After having imported all the necessary stuff, it somehow needs to know what kind of project I’m importing.

Selecting a file type after importing a project from GitHub

The project I imported didn’t give Codenvy any clues as to what kind of project it is (which is right, since I only had a README.md in it), so it lists a few options to choose from. I chose the Maven Multi-module type, after which the output window shows:

    git@github.com:tvinke/examples.git was successfully cloned.
    [INFO] Project type updated.

If you had a pom.xml in the root of your project, it would immediately recognize it as a Maven project. Apart from going through the Project > Import from GitHub option, you can also go directly to the Git menu and choose Clone Repository. This allows you to manually enter the remote repository URI, the desired project name and the remote name (e.g. “origin”).

Cloning a repository

Once you have pulled in a Git project, the Git menu allows all kinds of common operations, such as adding and removing files, committing, pushing, pulling and much more.

Git menu

The SSH keys can be found under the menu Window > Preferences, where you can view the details of the github.com entry or delete it. A new key can also be generated or uploaded here.

Sharing the project

One of the unique selling points of Codenvy is the collaboration possibilities that come along with any project.
You can:

Invite other developers with read-only rights or full read-write rights to your workspace and every project in it. When you’re pair-programming like this, or co-editing a file with a colleague, you can also send each other code pointers – small shortcuts to code lines.

Use factories to create temporary workspaces, through cloning, off one source project (“factory”) and represent the cloning mechanism as a URL which can be given to other developers. A use case might be to get a colleague quickly started on a project by providing a fully working development environment. There’s a lot more about creating factories in the docs (such as through REST), but the nice thing is that once you have a factory URL, you can embed it as a button, send it through email or publish it somewhere for others!

A factory URL to load up e.g. their Twitter Bootstrap sample – as they use on their website themselves – looks like:

    https://codenvy.com/factory?v=1.0&pname=sample-twitterbootstrap&wname=codenvy-factories&vcs=git&vcsurl=http%3a%2f%2fcodenvy.com%2fgit%2f04%2f0f%2f7f%2fworkspacegcpv6cdxy1q34n1i%2fsample-twitterbootstrap&idcommit=c1443ecea63471f5797f172c081cd802bac6e6b0&action=openproject&ptype=javascript

Conclusion

Applications are run in the cloud nowadays, so why not create them there too? Codenvy brings some interesting features, such as being able to instantly provision workspaces (through factory URLs) and share projects in real time. It supports common operations with projects, files and version control. With a slew of languages and platforms, and as an IDE that is always accessible through the internet, it could lower the barrier to actually code anytime and anywhere. In a future post I will try and see whether or not it can actually replace my conventional desktop IDE for Java development.

January 4, 2014

by Ted Vinke

· 7,982 Views

Make Jenkins Windows Service use your Preferred JRE

Recently I was working on installing and configuring a new instance of Jenkins. For some reason, which is outside the scope of this post, I wanted to make Jenkins run with a specific version of the Java environment. Fortunately it was really easy. This post is mainly a reminder to myself for the next time I’d like to do the same.

By default, Jenkins uses the JRE located under the jre sub-directory of your Jenkins installation home (%JENKINS_HOME%). To change this, find the file named jenkins.xml, which is located in your %JENKINS_HOME% directory. Edit it and look for the following section:

    <executable>%BASE%\jre\bin\java</executable>

Now change the content of the executable property to point to your favorite JRE. You can specify it as an absolute or relative path, or you can even use environment variables. Save the file and restart Jenkins. That’s it! Enjoy!
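If the change has to be repeated on several machines, the edit can be scripted. The following is only a sketch: it assumes the file has a single executable element, the function name is made up for illustration, and the Java path in the usage note is hypothetical. Note that ElementTree drops XML comments when parsing, so keep a backup of the original jenkins.xml.

```python
import xml.etree.ElementTree as ET


def set_jenkins_executable(xml_text, java_path):
    """Return the jenkins.xml content with <executable> set to java_path."""
    root = ET.fromstring(xml_text)
    # iter() finds the element at any depth below the root.
    node = next(root.iter("executable"), None)
    if node is None:
        raise ValueError("no <executable> element found")
    node.text = java_path
    return ET.tostring(root, encoding="unicode")
```

For example, set_jenkins_executable(open("jenkins.xml").read(), r"C:\Java\jre7\bin\java") returns the updated XML as a string, which can then be written back to the file before restarting the service.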

November 26, 2013

by Patroklos Papapetrou

· 16,784 Views · 2 Likes

Integration vs. Orchestration

Applications are at the center of the IT universe. As IT shifts its primary goal from connectivity to experience, it will require tighter collaboration between the various infrastructure elements that support application workloads. There are two philosophical approaches to how this orchestration might take place: through a tightly-integrated system, or through a more loose coupling of heterogeneous components. But how should architects make the choice between these approaches?

The principles of architecture tend to be most vehemently argued by the vendors competing to sell the underlying solutions. IT vendors generally (and networking vendors in particular) tend to turn these principle discussions into tit-for-tat FUD wars, arguing in absolutes that one approach or another is the right way to go. But the ones who put their careers on the line when they select an architectural approach should understand more fully what drives specific architectural selections.

The difference between tightly-integrated systems and more loosely federated components is really performance. Whenever two components come together, that boundary is defined by some interface. If you need to extract performance out of the coupled system, you have to make changes on one or both sides of said interface. As a vendor, if you can twiddle the bits on only one side, you can improve the overall system performance up to but not beyond whatever the other side can do. So when performance is the primary objective, you will tend to see solutions where both sides of that interface are owned (or at least controlled) by the same party. The ability to make changes on both sides of the interface is the only way to maximize performance. When the primary objective is not performance, you will see a generalized interface that sits between a decoupled pairing of solution components.

Enter SDN. Or network virtualization. Or NFV. Or DevOps. When we talk about performance as an industry, we usually mean capacity and speed.
But performance is more than bandwidth and latency. The whole reason any of the SDN technologies is emerging is to satisfy operational issues. Getting applications provisioned, monitored, troubleshot, billed, upgraded, and so on has taken over the top spot on the pain list for many companies. The question we ought to be asking is what are the operational performance requirements. The answer isn't black or white. What does performance even mean in an operational setting? It seems at least plausible that operational performance translates to things like the rate of change (think provisioning changes per second or call setup and teardown rates, for example) or the rate of polling (queries per second, as with monitoring or billing). For some environments, it might be that the scale of configuration management or data querying is quite high. Any company that is doing fine-grained monitoring or rapid state-based network changes, for example, might have very high operational performance requirements. Meanwhile, most normal networks will likely have a much lower performance bar. For the former, the objective has to be to eke out every bit of operational performance from the system. This will demand a more tightly-integrated solution. Both sides of the resource boundary (network and storage, as an example) might need to be within the same system, and the interface between them should appropriately be very specific to the implementation. For the latter, a more generalized interface between infrastructure elements should be more than sufficient. The primary goal is not to maximize performance but rather enable collaboration between components. In these architectures, the generalized interface is the most important thing as it will optimize choice and flexibility between the individual system elements. Both are absolutely valid use cases; there is no judgment in which is the more noble cause. But architects ought to be clear about what it is they are optimizing for. 
Selecting a generalized interface merely because it is open could be disastrous if it turns out that the performance requirements exceed what that interface provides. Conversely, selecting a tightly-integrated system might be more costly or limiting than is necessary if the real problem is orchestration rather than performance. So where do architects start? Everything starts with requirements. Is the objective to achieve a specific rate of change? Or is the objective merely to make tasks like provisioning and troubleshooting more coordinated across infrastructure silos? Are you planning to do anything exotic in terms of polling data on the system elements? Or are you expecting data to be accessed at a more casual rate? The real point here is that architects should start to express their orchestration requirements in terms of both capability and performance. We do this instinctively when we think about how we move bits back and forth, or how we access storage, or how we allot cycles on a server. But when it comes to management, because our collective capabilities have been so lacking, we have ignored performance. As SDN and other technologies continue to advance, operational performance will take on a more important role. And without knowing what the requirements are, designers will really be flying blind, making tradeoffs that might not even be necessary. [Today's fun fact: In ancient Rome, it was considered a sign of leadership to be born with a crooked nose. If Mike Tyson were born earlier, we'd call him Emperor.]

November 20, 2013

by Mike Bushong

· 8,957 Views · 1 Like

How to Integrate Apache Shiro into a Web Application

Apache Shiro is a Java security framework that can be used in a wide range of applications.

November 4, 2013

by Hüseyin Akdoğan

· 39,383 Views · 2 Likes

EasyNetQ: Publisher Confirms

Publisher confirms are a RabbitMQ addition to AMQP to guarantee message delivery. You can read all about them here and here. In short, they provide an asynchronous confirmation that a publish has successfully reached all the queues that it was routed to.

To turn on publisher confirms with EasyNetQ, set the publisherConfirms connection string parameter like this:

    var bus = RabbitHutch.CreateBus("host=localhost;publisherConfirms=true");

When you set this flag, EasyNetQ will wait for the confirmation, or a timeout, before returning from the Publish method:

    bus.Publish(new MyMessage { Text = "Hello World!" });
    // here the publish has been confirmed.

Nice and easy. There’s a problem though. If I run the above code in a while loop without publisher confirms, I can publish around 4000 messages per second, but with publisher confirms switched on that drops to around 140 per second. Not so good.

With EasyNetQ 0.15 we introduced a new PublishAsync method that returns a Task. The Task completes when the publish is confirmed:

    bus.PublishAsync(message).ContinueWith(task =>
    {
        // IsCompleted is also true for a faulted task, so exclude that case here
        if (task.IsCompleted && !task.IsFaulted)
        {
            Console.WriteLine("Publish completed fine.");
        }
        if (task.IsFaulted)
        {
            Console.WriteLine(task.Exception);
        }
    });

Using this code in a while loop gets us back to 4000 messages per second with publisher confirms on. Happy confirms!
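The throughput difference comes from waiting for each confirm before sending the next message versus letting many confirms be outstanding at once. The sketch below is a simulation only, not EasyNetQ or RabbitMQ code: publish_with_confirm is a made-up stand-in that sleeps for an assumed round-trip delay, and the two helpers contrast the sequential and pipelined patterns.

```python
import asyncio


async def publish_with_confirm(message, confirm_delay=0.01):
    """Stand-in for a broker round trip: 'send', then wait for the confirm."""
    await asyncio.sleep(confirm_delay)  # simulated network and broker latency
    return message


async def publish_sequentially(messages):
    """Wait for each confirm before sending the next message."""
    confirmed = []
    for message in messages:
        confirmed.append(await publish_with_confirm(message))
    return confirmed


async def publish_pipelined(messages):
    """Start every publish, then collect the confirms as they complete."""
    return await asyncio.gather(*(publish_with_confirm(m) for m in messages))
```

With N messages, the sequential version takes roughly N times the confirm delay, while the pipelined version takes roughly one delay in this simulation, which is the same shape of improvement the article reports when moving from Publish to PublishAsync.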

November 1, 2013

by Mike Hadlow

· 8,930 Views

Securing Docker’s Remote API

One piece of Docker that is AMAZING is the Remote API, which can be used to programmatically interact with docker. I recently had a situation where I wanted to run many containers on a host, with a single container managing the other containers through the API. But the problem I soon discovered is that, at the moment, when you turn networking on it is an all-or-nothing type of thing… you can’t turn networking off selectively on a container-by-container basis. You can disable IPv4 forwarding, but you can still reach the docker remote API on the machine if you can guess its IP address.

One solution I came up with for this is to use nginx to expose the unix socket for docker over HTTPS and utilize client-side SSL certificates to only allow trusted containers to have access. I liked this setup a lot, so I thought I would share how it’s done. Disclaimer: assumes some knowledge of docker!

Generate The SSL Certificates

We’ll use openssl to generate and self-sign the certs. Since this is for an internal service we’ll just sign it ourselves. We also remove the password from the keys so that we aren’t prompted for it each time we start nginx.

    # Create the CA Key and Certificate for signing Client Certs
    openssl genrsa -des3 -out ca.key 4096
    openssl rsa -in ca.key -out ca.key # remove password!
    openssl req -new -x509 -days 365 -key ca.key -out ca.crt

    # Create the Server Key, CSR, and Certificate
    openssl genrsa -des3 -out server.key 1024
    openssl rsa -in server.key -out server.key # remove password!
    openssl req -new -key server.key -out server.csr

    # We're self signing our own server cert here. This is a no-no in production.
    openssl x509 -req -days 365 -in server.csr -CA ca.crt -CAkey ca.key -set_serial 01 -out server.crt

    # Create the Client Key and CSR
    openssl genrsa -des3 -out client.key 1024
    openssl rsa -in client.key -out client.key # no password!
    openssl req -new -key client.key -out client.csr

    # Sign the client certificate with our CA cert. Unlike signing our own server cert, this is what we want to do.
    openssl x509 -req -days 365 -in client.csr -CA ca.crt -CAkey ca.key -set_serial 01 -out client.crt

Another option may be to leave the passphrase in and provide it as an environment variable when running a docker container, or through some other means, as an extra layer of security.

We’ll move ca.crt, server.key and server.crt to /etc/nginx/certs.

Setup Nginx

The nginx setup for this is pretty straightforward. We just listen for traffic on localhost on port 4242. We require client-side SSL certificate validation and reference the certificates we generated in the previous step. And most important of all, we set up an upstream proxy to the docker unix socket. I simply overwrote what was already in /etc/nginx/sites-enabled/default.

    upstream docker {
      server unix:/var/run/docker.sock fail_timeout=0;
    }

    server {
      listen 4242;
      server_name localhost;

      ssl on;
      ssl_certificate /etc/nginx/certs/server.crt;
      ssl_certificate_key /etc/nginx/certs/server.key;
      ssl_client_certificate /etc/nginx/certs/ca.crt;
      ssl_verify_client on;

      access_log on;
      error_log /dev/null;

      location / {
        proxy_pass http://docker;
        proxy_redirect off;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        client_max_body_size 10m;
        client_body_buffer_size 128k;

        proxy_connect_timeout 90;
        proxy_send_timeout 120;
        proxy_read_timeout 120;

        proxy_buffer_size 4k;
        proxy_buffers 4 32k;
        proxy_busy_buffers_size 64k;
        proxy_temp_file_write_size 64k;
      }
    }

One important piece to make this work is that you should add the user nginx runs as to the docker group so that it can read from the socket. This could be www-data, nginx, or something else!

Hack It Up!

With this setup in place and nginx restarted, let’s first run a few curl commands to make sure that it is set up correctly. First we’ll make calls without the client cert to double check that we get denied access, then a proper one.

    # Is normal http traffic denied?
    curl -v http://localhost:4242/info

    # How about https, sans client cert and key?
    curl -v -s -k https://localhost:4242/info

    # And the final good request!
    curl -v -s -k --key client.key --cert client.crt https://localhost:4242/info

For the first two we should get some run-of-the-mill 400 HTTP response codes before we get a proper JSON response from the final command! Woot!

But wait, there’s more… let’s build a container that can call the service to launch other containers!

For this example we’ll simply build two containers: one that has the client certificate and key and one that doesn’t. The code for these examples is pretty straightforward, and to save space I’ll leave the untrusted container out. You can view the untrusted container on github (although it is nothing exciting).

First, the node.js application that will connect and display information:

    https = require 'https'
    fs = require 'fs'

    options =
      host: '172.42.1.62'
      port: 4242
      method: 'GET'
      path: '/containers/json'
      key: fs.readFileSync('ssl/client.key')
      cert: fs.readFileSync('ssl/client.crt')
      headers: { 'Accept': 'application/json' } # not required, but being semantic here!

    req = https.request options, (res) ->
      console.log res

    req.end()

And the Dockerfile used to build the container. Notice we add the client.crt and client.key as part of building it!

    FROM shykes/nodejs
    MAINTAINER James R. Carr

    ADD ssl/client* /srv/app/ssl
    ADD package.json /srv/app/package.json
    ADD app.coffee /srv/app/app.coffee

    RUN cd /srv/app && npm install .

    CMD cd /srv/app && npm start

That’s about it. Run docker build . and docker run -n <IMAGE ID> and we should see a JSON dump to the console of the actively running containers. Doing the same in the untrusted directory should present us with some 400 error about not providing a client SSL certificate.

I’ve shared a project with all this code plus a Vagrant file on github for your own perusal. Enjoy!
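For comparison, the same client-certificate call can be sketched with Python's standard library. This is not part of the original project: the function names are made up for illustration, and the sketch assumes the server certificate's name matches the host being contacted, since the default SSL context verifies hostnames. The Id and Image fields are taken from the container objects returned by /containers/json.

```python
import http.client
import json
import ssl


def docker_connection(host, port, key_file, cert_file, ca_file):
    """Open an HTTPS connection that presents a client certificate."""
    context = ssl.create_default_context(cafile=ca_file)  # trust our own CA
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return http.client.HTTPSConnection(host, port, context=context)


def summarize_containers(body):
    """Reduce a /containers/json response body to (short id, image) pairs."""
    return [(c["Id"][:12], c["Image"]) for c in json.loads(body)]


def list_containers(conn):
    """Request the running containers through the nginx proxy."""
    conn.request("GET", "/containers/json", headers={"Accept": "application/json"})
    response = conn.getresponse()
    if response.status != 200:
        # Without a valid client certificate nginx answers with an error status.
        raise RuntimeError("request refused: HTTP %d" % response.status)
    return summarize_containers(response.read().decode("utf-8"))
```

A trusted container would call list_containers(docker_connection("localhost", 4242, "ssl/client.key", "ssl/client.crt", "ssl/ca.crt")), with the host and file paths adjusted to its own layout.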

October 31, 2013

by James Carr

· 14,313 Views

Writing Git Hooks Using Python

Since git hooks can be any executable script with an appropriate #! line, Python is more than suitable for writing your git hooks. Simply stated, git hooks are scripts which are called at different points in the life cycle of working with your git repository. Let’s start by creating a new git repository:

    ~/work> git init git-hooks-exp
    Initialized empty Git repository in /home/gene/work/git-hooks-exp/.git/
    ~/work> cd git-hooks-exp/
    ~/work/git-hooks-exp (master)> tree -al .git/
    .git/
    ├── branches
    ├── config
    ├── description
    ├── HEAD
    ├── hooks
    │   ├── applypatch-msg.sample
    │   ├── commit-msg.sample
    │   ├── post-update.sample
    │   ├── pre-applypatch.sample
    │   ├── pre-commit.sample
    │   ├── prepare-commit-msg.sample
    │   ├── pre-rebase.sample
    │   └── update.sample
    ├── info
    │   └── exclude
    ├── objects
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags

    9 directories, 12 files

Inside .git are a number of directories and files, one of them being hooks/, which is where the hooks live. By default, you will have a number of hooks with file names ending in .sample. They may be useful as starting points for your own scripts. However, since they all have the extension .sample, none of the hooks are actually activated. For a hook to be activated, it must have the right file name and it must be executable.

Let’s see how we can write a hook using Python. We will write a post-commit hook. This hook is called immediately after you have made a commit. We are going to do something fairly useless, but quite interesting, in this hook. We will take the SHA1 of this commit and print what it may look like in a more human form. I do the latter using the humanhash module. You will need to have it installed.
Here is what the hook looks like:

    #!/usr/bin/python
    import subprocess
    import humanhash

    # get the last commit SHA and print it after humanizing it
    # https://github.com/zacharyvoase/humanhash
    print humanhash.humanize(
        subprocess.check_output(
            ['git', 'rev-parse', 'HEAD']))

I use the subprocess.check_output() function to execute the command git rev-parse HEAD so that I can get the commit SHA1, and then call the humanhash.humanize() function with it. Save the hook as a file named post-commit in your hooks/ directory and make it executable using chmod +x .git/hooks/post-commit. Let’s see the hook in action:

    ~/work/git-hooks-exp (master)> touch file
    ~/work/git-hooks-exp (master)> git add file
    ~/work/git-hooks-exp (master)> git commit -m "Added a file"
    carbon-network-connecticut-equal
    [master (root-commit) 2d7880b] Added a file
     1 file changed, 0 insertions(+), 0 deletions(-)
     create mode 100644 file

The SHA1 for the commit turned out to be 2d7880be746a1c1e75844fc1aa161e2b8d955427. Let’s pass it to the humanize function and check that we get the same message as above:

    >>> humanhash.humanize('2d7880be746a1c1e75844fc1aa161e2b8d955427')
    'carbon-network-connecticut-equal'

This matches the message the hook printed. Some hooks are called with parameters. In Python you can access them through the sys.argv list from the sys module: the first member is the name of the hook, and the remaining members are the parameters the hook was called with.
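As an illustration of a hook that receives parameters, here is a minimal sketch of a commit-msg hook in Python 3. Git calls commit-msg with one argument, the path of the file holding the proposed commit message, and aborts the commit if the hook exits with a non-zero status. The 72-character limit and the function names are assumptions made for this example, not anything git prescribes.

```python
#!/usr/bin/env python3
import sys

MAX_SUBJECT_LENGTH = 72  # arbitrary limit chosen for this sketch


def check_message(text):
    """Return a list of problems found in a commit message (empty if none)."""
    # Lines starting with '#' are comments that git removes from the message.
    lines = [line for line in text.splitlines() if not line.startswith("#")]
    subject = lines[0].strip() if lines else ""
    problems = []
    if not subject:
        problems.append("subject line is empty")
    elif len(subject) > MAX_SUBJECT_LENGTH:
        problems.append(
            "subject line is longer than %d characters" % MAX_SUBJECT_LENGTH)
    return problems


def main(argv):
    # argv[0] is the hook's own path; argv[1] is the message file git passes.
    with open(argv[1], encoding="utf-8") as handle:
        problems = check_message(handle.read())
    for problem in problems:
        print("commit-msg: " + problem, file=sys.stderr)
    return 1 if problems else 0


# Only run when invoked the way git invokes the hook: with one argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv))
```

Saved as .git/hooks/commit-msg and made executable, this would reject commits whose subject line is empty or too long.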

October 31, 2013

by Amit Saha

· 13,621 Views

ElasticSearch: Java API

ElasticSearch provides a Java API that executes all operations asynchronously through a client object.

September 30, 2013

by Hüseyin Akdoğan


· 137,611 Views · 4 Likes

The Real Cost of Change in Software Development

There are two widely opposed (and often misunderstood) positions on how expensive it can be to change or fix software once it has been designed, coded, tested and implemented. One holds that it is extremely expensive to leave changes until late, that the cost of change rises exponentially. The other position is that changes should be left as late as possible, because the cost of changing software is – or at least can be – essentially flat (that’s why we call it software). Which position is right? Why should we care? And what can we do about it? Exponential Cost of Change Back in the early 1980s, Barry Boehm published some statistics (Software Engineering Economics, 1981) which showed that the cost of making a software change or fix increases significantly over time – you can see the original curve that he published here. Boehm looked at data collected from Waterfall-based projects at TRW and IBM in the 1970s, and found that the cost of making a change increases as you move from the stages of requirements analysis to architecture, design, coding, testing and deployment. A requirements mistake found and corrected while you are still defining the requirements costs almost nothing. But if you wait until after you've finished designing, coding and testing the system and delivering it to the customer, it can cost up to 100 times as much. A few caveats here. First, the cost curve is much higher in large projects (in smaller projects, the cost curve is more like 1:4 instead of 1:100). Those cases when the cost of change rises up to 100 times are rare, what Boehm calls Architecture-Breakers, where the team gets a fundamental architectural assumption wrong (scaling, performance, reliability) and doesn't find out until after customers are already using the system and running into serious operational problems. 
This analysis was all done on a small data sample from more than 30 years ago, when developing code was much more expensive and time-consuming and paperworky, and the tools sucked. A few other studies have been done since then that mostly back up Boehm's findings – at least the basic idea that the longer it takes for you to find out that you made a mistake, the more expensive it is to correct it. These studies have been widely referenced in books like Steve McConnell’s Code Complete, and used to justify the importance of early reviews and testing: Studies over the last 25 years have proven conclusively that it pays to do things right the first time. Unnecessary changes are expensive. Researchers at Hewlett-Packard, IBM, Hughes Aircraft, TRW, and other organizations have found that purging an error by the beginning of construction allows rework to be done 10 to 100 times less expensively than when it's done in the last part of the process, during system test or after release (Fagan 1976; Humphrey, Snyder, and Willis 1991; Leffingwell 1997; Willis et al. 1998; Grady 1999; Shull et al. 2002; Boehm and Turner 2004). In general, the principle is to find an error as close as possible to the time at which it was introduced. The longer the defect stays in the software food chain, the more damage it causes further down the chain. Since requirements are done first, requirements defects have the potential to be in the system longer and to be more expensive. Defects inserted into the software upstream also tend to have broader effects than those inserted further downstream. That also makes early defects more expensive. There’s some controversy over how accurate and complete this data is, how much we can rely on it, and how relevant it is today when we have much better development tools and many teams have moved from heavyweight sequential Waterfall development to lightweight iterative, incremental development approaches. 
Flattening the Cost of Changing Code The rules of the game should change with iterative and incremental development – because they have to. Boehm realized back in the 1980s that we could catch more mistakes early (and therefore reduce the cost of development) if we think about risks upfront and design and build software in increments, using what he called the Spiral Model, rather than trying to define, design and build software in a Waterfall sequence. The same ideas are behind more modern, lighter Agile development approaches. In Extreme Programming Explained (the first edition, but not the second) Kent Beck states that minimizing the cost of change is one of the goals of Extreme Programming, and that a flattened change cost curve is “the technical premise of XP”: Under certain circumstances, the exponential rise in the cost of changing software over time can be flattened. If we can flatten the curve, old assumptions about the best way to develop software no longer hold … You would make big decisions as late in the process as possible, to defer the cost of making the decisions and to have the greatest possible chance that they would be right. You would only implement what you had to, in hopes that the needs you anticipate for tomorrow wouldn't come true. You would introduce elements to the design only as they simplified existing code or made writing the next bit of code simpler. It’s important to understand that Beck doesn't say that with XP the change curve is flat. 
He says that these costs can be flattened if teams work toward this, leveraging key practices and principles in XP, such as:

- Simple Design, doing the simplest thing that works, and deferring design decisions as late as possible (YAGNI), so that the design is easy to understand and easy to change
- Continuous, disciplined refactoring to keep the code easy to understand and easy to change
- Test-First Development: writing automated tests upfront to catch coding mistakes immediately, and to build up a testing safety net to catch mistakes in the future
- Developers collaborating closely and constantly with the customer to confirm their understanding of what they need to build, and working together in pairs to design solutions, solve problems, and catch mistakes and misunderstandings early
- Relying on working software over documentation to minimize the amount of paperwork that needs to be done with each change (write code, not specs)
- The team’s experience working incrementally and iteratively: the more that people work and think this way, the better they will get at it.

All of this makes sense and sounds right, although there are no studies that back up these assertions, which is why Beck dropped this change curve discussion from the second edition of his XP book. But, by then, the idea that change could be flat with Agile development had already become accepted by many people.

The Importance of Feedback

Scott Ambler agrees that the cost curve can be flattened in Agile development, not because of Simple Design, but because of the feedback loops that are fundamental to iterative, incremental development. Agile methods optimize feedback within the team, with developers working closely together with each other and with the customer and relying on continuous face-to-face communication. Following technical practices like test-first development, pair programming and continuous integration makes these feedback loops even tighter.
But what really matters is getting feedback from the people using the system – it’s only then that you know if you got it right or what you missed. The longer that it takes to design and build something and get feedback from real users, the more time and work that is required to get working software into a real customer’s hands, the higher your cost of change really is. Optimizing and streamlining this feedback loop is what is driving the lean startup approach to development: defining a minimum viable product (something that just barely does the job), getting it out to customers as quickly as you can, and then responding to user feedback through continuous deployment and A/B testing techniques until you find out what customers really want. Even Flat Change Can Still Be Expensive Even if you do everything to optimize these feedback loops and minimize your overheads, this still doesn’t mean that change will come cheap. Being fast isn’t good enough if you make too many mistakes along the way. The Post Agilist uses the example of painting a house: Assume that it costs $1,000 each time you paint the house, whether you paint it blue, red or white. The cost of change is flat. But if you have to paint it blue first, then red, then white before everyone is happy, you’re wasting time and money. “No matter how expensive or cheap the "cost of change" curve may be, the fewer changes that are made, the cheaper and faster the result will be … Planning is not a four letter word.” (However, I would like to point out that “plan” is.) Spending too much time upfront in planning and design is waste. But not spending enough time upfront to find out what you should be building and how you should be building it before you build it, and not taking the care to build it carefully, is also a waste. Change Gets More Expensive Over Time You also have to accept that the incremental cost of change will go up over the life of a system, especially once a system is being used. 
This is not just a technical debt problem. The more people using the system, the more people who might be impacted by the change if you get it wrong, the more careful you have to be. This means that you need to spend more time on planning and communicating changes, building and testing a roll-back capability, and roll changes out slowly using canary releases and dark launching – which add costs and delays to getting feedback. There are also more operational dependencies that you have to understand and take care of, and more data that you have to change or fix up, making changes even more difficult and expensive. If you do things right, keep a good team together and manage technical debt responsibly, these costs should rise gently over the life of a system – and if you don’t, that exponential change curve will kick in. What is the real cost of change? Is the real cost of change exponential, or is it flat? The truth is somewhere in between. There’s no reason that the cost of making a change to software has to be as high as it was 30 years ago. We can definitely do better today, with better tools and better, cheaper ways of developing software. The keys to minimizing the costs of change seem to be: Get your software into customer hands as quickly as you can. I am not convinced that any organization really needs to push out software changes 10 to 50 to 100 times a day, but you don’t want to wait months or years for feedback, either. Deliver less, but more often. And because you’re going to deliver more often, it makes sense to build a continuous delivery pipeline so that you can push changes out efficiently and with confidence. Use ideas from lean software development and maybe Kanban to identify and eliminate waste and to minimize cycle time. We know that, even with lots of upfront planning and design thinking, we won’t get everything right upfront -- this is the Waterfall fallacy. But it’s also important not to waste time and money iterating when you don’t need to. 
Spending enough time upfront in understanding requirements and in design to get it at least mostly right the first time can save a lot later on. Whether you’re working incrementally and iteratively, or sequentially, it makes good sense to catch mistakes early when you can, whether you do this through test-first development and pairing, or requirements workshops and code reviews -- whatever works for you.

September 20, 2013

by Jim Bird

· 22,330 Views

This is how Facebook develops and deploys software. Should you care?

A recently published academic paper by Prof. Dror Feitelson at Hebrew University, Eitan Frachtenberg a research scientist at Facebook, and Kent Beck (who is also doing something at Facebook), describes Facebook’s approach to developing and deploying its front-end software. While it would be more interesting to understand how back-end development is done (this is where the real heavy lifting is done scaling up to handle hundreds of millions of users), there are a few things in the paper that are worth knowing about. Continuous Deployment at Facebook is Not Continuous Deployment Rather than planning work out into projects or breaking work into time-boxed Sprints, Facebook developers do most of their work in independent, small changes that are released frequently. This makes sense in Facebook’s online business model, everyone constantly tuning the platform and trying out new options and applications in different user communities, seeing what sticks. It’s a credit to their architecture that so many small, independent changes can actually be done independently and cheaply. Facebook says that it follows Continuous Deployment, but it’s not Continuous Deployment the way that IMVU made popular where every change is pushed out to customers immediately, or even how a company like Etsy does Continuous Deployment. At Facebook, code can be released twice a day, but this is done mostly for bug fixes and internal code. New production code is released once per week: thousands of changes by hundreds of developers are packaged up by their small release team on Sundays, run through automated regression testing, and released on Tuesday if the developers who contributed the changes are present. Release engineers assess the risk of changes based on the size of the change, the amount of discussion done in code reviews (which is recorded through an internal code review tool), and on each developer’s “push karma”: how many problems they have seen from code by this developer before. 
A tool called “Gatekeeper” controls what features are available to which customers to support dark launching, and all code is released incrementally – to staging, then a subset of users, and so on. Changes can be rolled back if necessary – individually, or, as a last resort, an entire code release. However, like a lot of Silicon Valley DevOps shops, they mostly follow the “Real Men only Roll Forward” motto.

Code Ownership

A key to the culture at Facebook is that developers are individually responsible for the code that they wrote, for testing it and supporting it in production. This is reflected in their code ownership model:

Developers must also support the operational use of their software – a combination that’s become known as “DevOps.” This further motivates writing good code and testing it thoroughly. Developers’ personal stake in keeping the system running smoothly complements the engineering procedures and lets the system maintain quality at scale. Methodologies and tools aren’t enough by themselves because they can always be misused. Thus, a culture of personal responsibility is critical.

Consequently, most source files are modified by only a few engineers. Although at least one other engineer reviews all changes before they’re committed, a third of the source files have only been edited by one engineer, and another quarter by two. Only 10 percent of the files are handled by more than seven engineers. On the other hand, the distribution of engineers per file has a heavy tail, with the most widely shared file handled by no fewer than 870 distinct engineers. These widely shared files are predominantly library files and also include major configuration and top-level PHP files.

Testing? We don’t need no stinking testing …

Facebook doesn't have an independent test team because, it says, it doesn't need one. First, they depend a lot on code reviews to find bugs:

At Facebook, code review occupies a central position.
Every line of code that’s written is reviewed by a different engineer than the original author. This serves multiple purposes: the original engineer is motivated to ensure that the code is of high quality, the reviewer comes with a fresh mind and might find defects or suggest alternatives, and, in general, knowledge about coding practices and the code itself spreads throughout the company. Developers are also responsible for writing unit tests and their own regression tests – they have “tens of thousands of regression tests” (which doesn't sound like nearly enough for 10+ million lines of mostly PHP code compiled into C++, in both of which languages coding mistakes are easy to make) and automated performance tests. And developers also test the software by using the development version of Facebook for their personal Facebook use. According to the authors, “this is just one aspect of the departure from traditional software development”. But Facebook developers using their own software internally (and passing this off as “testing”) is no different than the early days at Microsoft where employees were supposed to “eat their own dog food”, a practice that did little if anything to improve the quality of Microsoft products. Facebook also depends on customers to test the software for it. Software is released in steps for A/B testing and “live experimentation” on subsets of the user base, whether customers want to participate in this testing or not. Because its customer base is so large, it can get meaningful feedback from testing with even a small percentage of users, which at least minimizes the risk and inconvenience to customers. Security??? While performance is an important consideration for developers at Facebook, there is no mention of security checks or testing anywhere in this description of how Facebook develops and deploys software. 
No static analysis, dynamic analysis/scanning, pen testing or explanation of how the security team and developers work together, not even for “privacy sensitive code” – although this code is “held to a higher standard” it doesn’t explain what this “higher standard” is. Presumably it relies on the use of libraries and frameworks to handle at least some AppSec problems, and possibly to look for security bugs in its code reviews, but it doesn't say. There isn’t much information available on Facebook’s AppSec program anywhere. The security team at Facebook seems to spend a lot of time educating people on how to use Facebook safely and how to develop Facebook apps safely and running their bug bounty program which pays outsiders to find security bugs for them. A search on security on Facebook mostly comes back with a long list of public security failures, privacy violations and application security vulnerabilities found over the years and continuing up to the present day. Maybe the lack of an effective AppSec program is the reason for this. This is the way Facebook is Developed. Should you care? While it’s interesting to get a look inside a high-profile organization like Facebook and how it approaches development at scale, it’s not clear why this paper was written. There is little about what Facebook is doing (on its front-end development at least) that is unique or innovative, except maybe the way it uses BitTorrent to push code changes out to thousands of servers like Twitter does, something that I already heard about a few years ago at Velocity and that has been written about before. I like the idea of developers being responsible for their work, all the way into production, which is a principle that we also follow. Code reviews are good. Dark launching features is a good practice and has been a common practice in systems for a long time (even before it was called "dark launching"). Not having testers or doing AppSec is not good. 
Otherwise, I'm not sure what the rest of us can learn from or would want to use from this.
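The incremental release idea discussed in this article (a gate decides which users see a feature, and the audience is widened step by step) can be sketched in a few lines. This is a generic percentage-rollout illustration in Python, not a description of how Facebook's Gatekeeper works. The function names and the hashing scheme are assumptions made for the example.

```python
import hashlib


def bucket_for(user_id, feature):
    """Map a user/feature pair to a stable bucket in the range 0..99."""
    key = "{}:{}".format(feature, user_id).encode("utf-8")
    # Hashing keeps the assignment stable across requests and servers
    # without storing per-user state.
    return int(hashlib.sha256(key).hexdigest(), 16) % 100


def is_enabled(user_id, feature, rollout_percent):
    """Return True if this user falls inside the current rollout percentage."""
    return bucket_for(user_id, feature) < rollout_percent
```

Because a user's bucket never changes, raising `rollout_percent` from 1 to 10 to 100 only ever adds users to the feature. Setting it back to 0 acts as a kill switch without redeploying code.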

September 4, 2013

by Jim Bird

· 43,052 Views · 1 Like

OpenStack Savanna: Fast Hadoop Cluster Provisioning on OpenStack

Introduction

OpenStack is one of the most popular open source cloud computing projects to provide an Infrastructure as a Service solution. Its key components are Compute (Nova), Networking (Neutron, formerly known as Quantum), Storage (object and block storage, Swift and Cinder, respectively), OpenStack Dashboard (Horizon), Identity Service (Keystone) and Image Service (Glance). There are other official incubated projects like Metering (Ceilometer) and Orchestration and Service Definition (Heat).

Savanna is a Hadoop as a Service for OpenStack introduced by Mirantis. It is still in an early phase (version 0.2 was released in summer 2013) and according to its roadmap version 1.0 is targeted for official OpenStack incubation. In principle, Heat also could be used for Hadoop cluster provisioning, but Savanna is especially tuned for providing Hadoop-specific API functionality while Heat is meant to be used for generic purposes.

Savanna Architecture

Savanna is integrated with the core OpenStack components such as Keystone, Nova, Glance, Swift and Horizon. It has a REST API that supports the Hadoop cluster provisioning steps. The Savanna API is implemented as a WSGI server that, by default, listens on port 8386. In addition, Savanna can also be integrated with Horizon, the OpenStack Dashboard, to create a Hadoop cluster from the management console. Savanna also comes with a Vanilla plugin that deploys a Hadoop cluster image. The standard out-of-the-box Vanilla plugin supports Hadoop version 1.1.2.

Installing Savanna

The simplest option to try out Savanna is to use devstack in a virtual machine. I was using an Ubuntu 12.04 virtual instance in my tests.
In that environment we need to execute the following commands to install devstack and the Savanna API:

    $ sudo apt-get install git-core
    $ git clone https://github.com/openstack-dev/devstack.git
    $ vi localrc
    # edit localrc
    ADMIN_PASSWORD=nova
    MYSQL_PASSWORD=nova
    RABBIT_PASSWORD=nova
    SERVICE_PASSWORD=$ADMIN_PASSWORD
    SERVICE_TOKEN=nova
    # Enable Swift
    ENABLED_SERVICES+=,swift
    SWIFT_HASH=66a3d6b56c1f479c8b4e70ab5c2000f5
    SWIFT_REPLICAS=1
    SWIFT_DATA_DIR=$DEST/data
    # Force checkout prerequisites
    # FORCE_PREREQ=1
    # Keystone is now configured by default to use PKI as the token format which produces huge tokens.
    # Set UUID as keystone token format which is much shorter and easier to work with.
    KEYSTONE_TOKEN_FORMAT=UUID
    # Change the FLOATING_RANGE to whatever IPs VM is working in.
    # In NAT mode it is subnet VMware Fusion provides, in bridged mode it is your local network.
    FLOATING_RANGE=192.168.55.224/27
    # Enable auto assignment of floating IPs. By default Savanna expects this setting to be enabled
    EXTRA_OPTS=(auto_assign_floating_ip=True)
    # Enable logging
    SCREEN_LOGDIR=$DEST/logs/screen
    $ ./stack.sh   # this will take a while to execute
    $ sudo apt-get install python-setuptools python-virtualenv python-dev
    $ virtualenv savanna-venv
    $ savanna-venv/bin/pip install savanna
    $ mkdir savanna-venv/etc
    $ cp savanna-venv/share/savanna/savanna.conf.sample savanna-venv/etc/savanna.conf
    # To start Savanna API:
    $ savanna-venv/bin/python savanna-venv/bin/savanna-api --config-file savanna-venv/etc/savanna.conf

To install the Savanna UI integrated with Horizon, we need to run the following commands:

    $ sudo pip install savanna-dashboard
    $ cd /opt/stack/horizon/openstack-dashboard
    $ vi settings.py
    HORIZON_CONFIG = {
        'dashboards': ('nova', 'syspanel', 'settings', 'savanna'),
    INSTALLED_APPS = (
        'savannadashboard',
        ....
    $ cd /opt/stack/horizon/openstack-dashboard/local
    $ vi local_settings.py
    SAVANNA_URL = 'http://localhost:8386/v1.0'
    $ sudo service apache2 restart

Provisioning a Hadoop Cluster

As a first step, we need to configure Keystone-related environment variables to get the authentication token:

    ubuntu@ip-10-59-33-68:~$ vi .bashrc
    $ export OS_AUTH_URL=http://127.0.0.1:5000/v2.0/
    $ export OS_TENANT_NAME=admin
    $ export OS_USERNAME=admin
    $ export OS_PASSWORD=nova
    ubuntu@ip-10-59-33-68:~$ source .bashrc
    ubuntu@ip-10-59-33-68:~$
    ubuntu@ip-10-59-33-68:~$ env | grep OS
    OS_PASSWORD=nova
    OS_AUTH_URL=http://127.0.0.1:5000/v2.0/
    OS_USERNAME=admin
    OS_TENANT_NAME=admin
    ubuntu@ip-10-59-33-68:~$ keystone token-get
    +-----------+----------------------------------+
    | Property  | Value                            |
    +-----------+----------------------------------+
    | expires   | 2013-08-09T20:31:12Z             |
    | id        | bdb582c836e3474f979c5aa8f844c000 |
    | tenant_id | 2f46e214984f4990b9c39d9c6222f572 |
    | user_id   | 077311b0a8304c8e86dc0dc168a67091 |
    +-----------+----------------------------------+
    $ export auth_token="bdb582c836e3474f979c5aa8f844c000"
    $ export tenant_id="2f46e214984f4990b9c39d9c6222f572"

Then we need to create the Glance image that we want to use for our Hadoop cluster.
in our example we have used mirantis's vanilla image but we can also build our own image: $ wget http://savanna-files.mirantis.com/savanna-0.2-vanilla-1.1.2-ubuntu-12.10.qcow2 $ glance image-create --name=savanna-0.2-vanilla-hadoop-ubuntu.qcow2 --disk-format=qcow2 --container-format=bare < ./savanna-0.2-vanilla-1.1.2-ubuntu-12.10.qcow2 ubuntu@ip-10-59-33-68:~/devstack$ glance image-list +--------------------------------------+-----------------------------------------+-------------+------------------+-----------+--------+ | id | name | disk format | container format | size | status | +--------------------------------------+-----------------------------------------+-------------+------------------+-----------+--------+ | d0d64f5c-9c15-4e7b-ad4c-13859eafa7b8 | cirros-0.3.1-x86_64-uec | ami | ami | 25165824 | active | | fee679ee-e0c0-447e-8ebd-028050b54af9 | cirros-0.3.1-x86_64-uec-kernel | aki | aki | 4955792 | active | | 1e52089b-930a-4dfc-b707-89b568d92e7e | cirros-0.3.1-x86_64-uec-ramdisk | ari | ari | 3714968 | active | | d28051e2-9ddd-45f0-9edc-8923db46fdf9 | savanna-0.2-vanilla-hadoop-ubuntu.qcow2 | qcow2 | bare | 551699456 | active | +--------------------------------------+-----------------------------------------+-------------+------------------+-----------+--------+ $ export image_id=d28051e2-9ddd-45f0-9edc-8923db46fdf9 then we have installed httpie , an open source http client that can be used to send rest requests to savanna api: $ sudo pip install httpie from now on we will use httpie to send savanna commands. 
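For readers who would rather script these calls than type them, a Savanna request is an ordinary HTTP call carrying a JSON body and an X-Auth-Token header. The sketch below only assembles such a request with Python's standard library and does not send it. The helper name `build_request` is made up for this example, and the URL layout follows the commands used in this article.

```python
import json
import urllib.request


def build_request(base_url, path, token, body=None, method="POST"):
    """Assemble (but do not send) a request for the Savanna REST API."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"X-Auth-Token": token, "Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url + path, data=data, headers=headers, method=method)
```

For instance, `build_request(savanna_url, "/images/" + image_id, auth_token, {"username": "ubuntu"})` would describe an image registration call. Passing the result to `urllib.request.urlopen` would actually send it.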
we need to register the image with savanna: $ export savanna_url="http://localhost:8386/v1.0/$tenant_id" $ http post $savanna_url/images/$image_id x-auth-token:$auth_token username=ubuntu http/1.1 202 accepted content-length: 411 content-type: application/json date: thu, 08 aug 2013 21:28:07 gmt { "image": { "os-ext-img-size:size": 551699456, "created": "2013-08-08t21:05:55z", "description": "none", "id": "d28051e2-9ddd-45f0-9edc-8923db46fdf9", "metadata": { "_savanna_description": "none", "_savanna_username": "ubuntu" }, "mindisk": 0, "minram": 0, "name": "savanna-0.2-vanilla-hadoop-ubuntu.qcow2", "progress": 100, "status": "active", "tags": [], "updated": "2013-08-08t21:28:07z", "username": "ubuntu" } } $ http $savanna_url/images/$image_id/tag x-auth-token:$auth_token tags:='["vanilla", "1.1.2", "ubuntu"]' http/1.1 202 accepted content-length: 532 content-type: application/json date: thu, 08 aug 2013 21:29:25 gmt { "image": { "os-ext-img-size:size": 551699456, "created": "2013-08-08t21:05:55z", "description": "none", "id": "d28051e2-9ddd-45f0-9edc-8923db46fdf9", "metadata": { "_savanna_description": "none", "_savanna_tag_1.1.2": "true", "_savanna_tag_ubuntu": "true", "_savanna_tag_vanilla": "true", "_savanna_username": "ubuntu" }, "mindisk": 0, "minram": 0, "name": "savanna-0.2-vanilla-hadoop-ubuntu.qcow2", "progress": 100, "status": "active", "tags": [ "vanilla", "ubuntu", "1.1.2" ], "updated": "2013-08-08t21:29:25z", "username": "ubuntu" } } then we need to create a nodegroup templates (json files) that will be sent to savanna. there is one template for the master nodes ( namenode , jobtracker ) and another template for the worker nodes such as datanode and tasktracker . the hadoop version is 1.1.2. 
    $ vi ng_master_template_create.json
    {
        "name": "test-master-tmpl",
        "flavor_id": "2",
        "plugin_name": "vanilla",
        "hadoop_version": "1.1.2",
        "node_processes": ["jobtracker", "namenode"]
    }

    $ vi ng_worker_template_create.json
    {
        "name": "test-worker-tmpl",
        "flavor_id": "2",
        "plugin_name": "vanilla",
        "hadoop_version": "1.1.2",
        "node_processes": ["tasktracker", "datanode"]
    }

    $ http $savanna_url/node-group-templates x-auth-token:$auth_token < ng_master_template_create.json
    HTTP/1.1 202 Accepted
    Content-Length: 387
    Content-Type: application/json
    Date: Thu, 08 Aug 2013 21:58:00 GMT

    {
        "node_group_template": {
            "created": "2013-08-08t21:58:00",
            "flavor_id": "2",
            "hadoop_version": "1.1.2",
            "id": "b3a79c88-b6fb-43d2-9a56-310218c66f7c",
            "name": "test-master-tmpl",
            "node_configs": {},
            "node_processes": [
                "jobtracker",
                "namenode"
            ],
            "plugin_name": "vanilla",
            "updated": "2013-08-08t21:58:00",
            "volume_mount_prefix": "/volumes/disk",
            "volumes_per_node": 0,
            "volumes_size": 10
        }
    }

    $ http $savanna_url/node-group-templates x-auth-token:$auth_token < ng_worker_template_create.json
    HTTP/1.1 202 Accepted
    Content-Length: 388
    Content-Type: application/json
    Date: Thu, 08 Aug 2013 21:59:41 GMT

    {
        "node_group_template": {
            "created": "2013-08-08t21:59:41",
            "flavor_id": "2",
            "hadoop_version": "1.1.2",
            "id": "773b2cfb-1e05-46f4-923f-13edc7d6aac6",
            "name": "test-worker-tmpl",
            "node_configs": {},
            "node_processes": [
                "tasktracker",
                "datanode"
            ],
            "plugin_name": "vanilla",
            "updated": "2013-08-08t21:59:41",
            "volume_mount_prefix": "/volumes/disk",
            "volumes_per_node": 0,
            "volumes_size": 10
        }
    }

The next step is to define the cluster template. It refers to the two nodegroup templates by the IDs returned above:

    $ vi cluster_template_create.json
    {
        "name": "demo-cluster-template",
        "plugin_name": "vanilla",
        "hadoop_version": "1.1.2",
        "node_groups": [
            {
                "name": "master",
                "node_group_template_id": "b3a79c88-b6fb-43d2-9a56-310218c66f7c",
                "count": 1
            },
            {
                "name": "workers",
                "node_group_template_id": "773b2cfb-1e05-46f4-923f-13edc7d6aac6",
                "count": 2
            }
        ]
    }

    $ http $savanna_url/cluster-templates x-auth-token:$auth_token < cluster_template_create.json
    HTTP/1.1 202 Accepted
    Content-Length: 815
    Content-Type: application/json
    Date: Fri, 09 Aug 2013 07:04:24 GMT

    {
        "cluster_template": {
            "anti_affinity": [],
            "cluster_configs": {},
            "created": "2013-08-09t07:04:24",
            "hadoop_version": "1.1.2",
            "id": "64c4117b-acee-4da7-937b-cb964f0471a9",
            "name": "demo-cluster-template",
            "node_groups": [
                {
                    "count": 1,
                    "flavor_id": "2",
                    "name": "master",
                    "node_configs": {},
                    "node_group_template_id": "b3a79c88-b6fb-43d2-9a56-310218c66f7c",
                    "node_processes": [
                        "jobtracker",
                        "namenode"
                    ],
                    "volume_mount_prefix": "/volumes/disk",
                    "volumes_per_node": 0,
                    "volumes_size": 10
                },
                {
                    "count": 2,
                    "flavor_id": "2",
                    "name": "workers",
                    "node_configs": {},
                    "node_group_template_id": "773b2cfb-1e05-46f4-923f-13edc7d6aac6",
                    "node_processes": [
                        "tasktracker",
                        "datanode"
                    ],
                    "volume_mount_prefix": "/volumes/disk",
                    "volumes_per_node": 0,
                    "volumes_size": 10
                }
            ],
            "plugin_name": "vanilla",
            "updated": "2013-08-09t07:04:24"
        }
    }

Now we are ready to create the Hadoop cluster. The request refers to the cluster template ID, the keypair and the image registered earlier:

    $ vi cluster_create.json
    {
        "name": "cluster-1",
        "plugin_name": "vanilla",
        "hadoop_version": "1.1.2",
        "cluster_template_id": "64c4117b-acee-4da7-937b-cb964f0471a9",
        "user_keypair_id": "savanna",
        "default_image_id": "d28051e2-9ddd-45f0-9edc-8923db46fdf9"
    }

    $ http $savanna_url/clusters x-auth-token:$auth_token < cluster_create.json
    HTTP/1.1 202 Accepted
    Content-Length: 1153
    Content-Type: application/json
    Date: Fri, 09 Aug 2013 07:28:14 GMT

    {
        "cluster": {
            "anti_affinity": [],
            "cluster_configs": {},
            "cluster_template_id": "64c4117b-acee-4da7-937b-cb964f0471a9",
            "created": "2013-08-09t07:28:14",
            "default_image_id": "d28051e2-9ddd-45f0-9edc-8923db46fdf9",
            "hadoop_version": "1.1.2",
            "id": "d919f1db-522f-45ab-aadd-c078ba3bb4e3",
            "info": {},
            "name": "cluster-1",
            "node_groups": [
                {
                    "count": 1,
                    "created": "2013-08-09t07:28:14",
                    "flavor_id": "2",
                    "instances": [],
                    "name": "master",
                    "node_configs": {},
                    "node_group_template_id": "b3a79c88-b6fb-43d2-9a56-310218c66f7c",
                    "node_processes": [
                        "jobtracker",
                        "namenode"
                    ],
                    "updated": "2013-08-09t07:28:14",
                    "volume_mount_prefix": "/volumes/disk",
                    "volumes_per_node": 0,
                    "volumes_size": 10
                },
                {
                    "count": 2,
                    "created": "2013-08-09t07:28:14",
                    "flavor_id": "2",
                    "instances": [],
                    "name": "workers",
                    "node_configs": {},
                    "node_group_template_id": "773b2cfb-1e05-46f4-923f-13edc7d6aac6",
                    "node_processes": [
                        "tasktracker",
                        "datanode"
                    ],
                    "updated": "2013-08-09t07:28:14",
                    "volume_mount_prefix": "/volumes/disk",
                    "volumes_per_node": 0,
                    "volumes_size": 10
                }
            ],
            "plugin_name": "vanilla",
            "status": "validating",
            "updated": "2013-08-09t07:28:14",
            "user_keypair_id": "savanna"
        }
    }

After a while we can run the nova command to check whether the instances have been created and are running:

    $ nova list
    +--------------------------------------+-----------------------+--------+------------+-------------+----------------------------------+
    | id                                   | name                  | status | task state | power state | networks                         |
    +--------------------------------------+-----------------------+--------+------------+-------------+----------------------------------+
    | 1a9f43bf-cddb-4556-877b-cc993730da88 | cluster-1-master-001  | active | none       | running     | private=10.0.0.2, 192.168.55.227 |
    | bb55f881-1f96-4669-a94a-58cbf4d88f39 | cluster-1-workers-001 | active | none       | running     | private=10.0.0.3, 192.168.55.226 |
    | 012a24e2-fa33-49f3-b051-9ee2690864df | cluster-1-workers-002 | active | none       | running     | private=10.0.0.4, 192.168.55.225 |
    +--------------------------------------+-----------------------+--------+------------+-------------+----------------------------------+

Now we can log in to the Hadoop master instance (cluster-1-master-001 in the listing above) and run the required Hadoop commands:

    $ ssh -i savanna.pem [email protected]
    $ sudo chmod 777 /usr/share/hadoop
    $ sudo su hadoop
    $ cd /usr/share/hadoop
    $ hadoop jar hadoop-examples-1.1.2.jar pi 10 100

Savanna UI via Horizon

In order to create nodegroup templates, cluster templates and the cluster itself, we used a command-line tool, httpie, to send REST API calls. The same functionality is also available via Horizon, the standard OpenStack dashboard. The steps follow the same order as on the command line: first we register the image with Savanna, then we create the nodegroup templates, after that the cluster template, and finally the cluster itself.
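Whether the calls are made with httpie or through Horizon, each step consumes identifiers produced by an earlier one: the cluster template refers to the nodegroup template IDs, and the cluster refers to the cluster template ID and the registered image ID. The Python sketch below illustrates that chaining. It is only a sketch under stated assumptions: the helper names are hypothetical, and the responses are hard-coded excerpts using the IDs from this walkthrough rather than live API calls.

```python
import json


def cluster_template_body(name, groups, plugin_name="vanilla",
                          hadoop_version="1.1.2"):
    """Build a cluster template request body (hypothetical helper).

    groups is a list of (group name, nodegroup template response, count).
    """
    return {
        "name": name,
        "plugin_name": plugin_name,
        "hadoop_version": hadoop_version,
        "node_groups": [
            {
                "name": group_name,
                "node_group_template_id": response["node_group_template"]["id"],
                "count": count,
            }
            for group_name, response, count in groups
        ],
    }


def cluster_body(name, cluster_template_response, image_id, keypair_name,
                 plugin_name="vanilla", hadoop_version="1.1.2"):
    """Build a cluster creation request body (hypothetical helper)."""
    return {
        "name": name,
        "plugin_name": plugin_name,
        "hadoop_version": hadoop_version,
        "cluster_template_id": cluster_template_response["cluster_template"]["id"],
        "user_keypair_id": keypair_name,
        "default_image_id": image_id,
    }


# Hard-coded response excerpts (assumption: only the "id" field is needed).
master_resp = {"node_group_template": {"id": "b3a79c88-b6fb-43d2-9a56-310218c66f7c"}}
worker_resp = {"node_group_template": {"id": "773b2cfb-1e05-46f4-923f-13edc7d6aac6"}}
template_resp = {"cluster_template": {"id": "64c4117b-acee-4da7-937b-cb964f0471a9"}}
image_id = "d28051e2-9ddd-45f0-9edc-8923db46fdf9"

template = cluster_template_body(
    "demo-cluster-template",
    [("master", master_resp, 1), ("workers", worker_resp, 2)],
)
cluster = cluster_body("cluster-1", template_resp, image_id, "savanna")

print(json.dumps(template, indent=4))
print(json.dumps(cluster, indent=4))
```

Reading the IDs out of the previous responses, instead of pasting them by hand, avoids the kind of copy-and-paste mismatch that is easy to introduce when editing the JSON files manually.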

August 20, 2013

by Istvan Szegedi
