DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

The Latest Integration Topics

article thumbnail
Java concurrency: the hidden thread deadlocks
Most Java programmers are familiar with the Java thread deadlock concept. It essentially involves 2 threads waiting forever for each other. This condition is often the result of flat (synchronized) or ReentrantLock (read or write) lock-ordering problems. Found one Java-level deadlock: ============================= "pool-1-thread-2": waiting to lock monitor 0x0237ada4 (object 0x272200e8, a java.lang.Object), which is held by "pool-1-thread-1" "pool-1-thread-1": waiting to lock monitor 0x0237aa64 (object 0x272200f0, a java.lang.Object), which is held by "pool-1-thread-2" The good news is that the HotSpot JVM is always able to detect this condition for you…or is it? A recent thread deadlock problem affecting an Oracle Service Bus production environment has forced us to revisit this classic problem and identify the existence of “hidden” deadlock situations. This article will demonstrate and replicate via a simple Java program a very special lock-ordering deadlock condition which is not detected by the latest HotSpot JVM 1.7. You will also find a video at the end of the article explaining you the Java sample program and the troubleshooting approach used. The crime scene I usually like to compare major Java concurrency problems to a crime scene where you play the lead investigator role. In this context, the “crime” is an actual production outage of your client IT environment. Your job is to: Collect all the evidences, hints & facts (thread dump, logs, business impact, load figures…) Interrogate the witnesses & domain experts (support team, delivery team, vendor, client…) The next step of your investigation is to analyze the collected information and establish a potential list of one or many “suspects” along with clear proofs. Eventually, you want to narrow it down to a primary suspect or root cause. Obviously the law “innocent until proven guilty” does not apply here, exactly the opposite. Lack of evidence can prevent you to achieve the above goal. What you will see next is that the lack of deadlock detection by the Hotspot JVM does not necessary prove that you are not dealing with this problem. The suspect In this troubleshooting context, the “suspect” is defined as the application or middleware code with the following problematic execution pattern. Usage of FLAT lock followed by the usage of ReentrantLock WRITE lock (execution path #1) Usage of ReentrantLock READ lock followed by the usage of FLAT lock (execution path #2) Concurrent execution performed by 2 Java threads but via a reversed execution order The above lock-ordering deadlock criteria’s can be visualized as per below: Now let’s replicate this problem via our sample Java program and look at the JVM thread dump output. Sample Java program This above deadlock conditions was first identified from our Oracle OSB problem case. We then re-created it via a simple Java program. You can download the entire source code of our program here. The program is simply creating and firing 2 worker threads. Each of them execute a different execution path and attempt to acquire locks on shared objects but in different orders. We also created a deadlock detector thread for monitoring and logging purposes. For now, find below the Java class implementing the 2 different execution paths. package org.ph.javaee.training8; import java.util.concurrent.locks.ReentrantReadWriteLock; /** * A simple thread task representation * @author Pierre-Hugues Charbonneau * */ public class Task { // Object used for FLAT lock private final Object sharedObject = new Object(); // ReentrantReadWriteLock used for WRITE & READ locks private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); /** * Execution pattern #1 */ public void executeTask1() { // 1. Attempt to acquire a ReentrantReadWriteLock READ lock lock.readLock().lock(); // Wait 2 seconds to simulate some work... try { Thread.sleep(2000);}catch (Throwable any) {} try { // 2. Attempt to acquire a Flat lock... synchronized (sharedObject) {} } // Remove the READ lock finally { lock.readLock().unlock(); } System.out.println("executeTask1() :: Work Done!"); } /** * Execution pattern #2 */ public void executeTask2() { // 1. Attempt to acquire a Flat lock synchronized (sharedObject) { // Wait 2 seconds to simulate some work... try { Thread.sleep(2000);}catch (Throwable any) {} // 2. Attempt to acquire a WRITE lock lock.writeLock().lock(); try { // Do nothing } // Remove the WRITE lock finally { lock.writeLock().unlock(); } } System.out.println("executeTask2() :: Work Done!"); } public ReentrantReadWriteLock getReentrantReadWriteLock() { return lock; } } As soon ad the deadlock situation was triggered, a JVM thread dump was generated using JVisualVM. As you can see from the Java thread dump sample. The JVM did not detect this deadlock condition (e.g. no presence of Found one Java-level deadlock) but it is clear these 2 threads are in deadlock state. Root cause: ReetrantLock READ lock behavior The main explanation we found at this point is associated with the usage of the ReetrantLock READ lock. The read locks are normally not designed to have a notion of ownership. Since there is not a record of which thread holds a read lock, this appears to prevent the HotSpot JVM deadlock detector logic to detect deadlock involving read locks. Some improvements were implemented since then but we can see that the JVM still cannot detect this special deadlock scenario. Now if we replace the read lock (execution pattern #2) in our program by a write lock, the JVM will finally detect the deadlock condition but why? Found one Java-level deadlock: ============================= "pool-1-thread-2": waiting for ownable synchronizer 0x272239c0, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync), which is held by "pool-1-thread-1" "pool-1-thread-1": waiting to lock monitor 0x025cad3c (object 0x272236d0, a java.lang.Object), which is held by "pool-1-thread-2" Found one Java-level deadlock: ============================= "pool-1-thread-2": waiting for ownable synchronizer 0x272239c0, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync), which is held by "pool-1-thread-1" "pool-1-thread-1": waiting to lock monitor 0x025cad3c (object 0x272236d0, a java.lang.Object), which is held by "pool-1-thread-2" Java stack information for the threads listed above: =================================================== "pool-1-thread-2": at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x272239c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945) at org.ph.javaee.training8.Task.executeTask2(Task.java:54) - locked <0x272236d0> (a java.lang.Object) at org.ph.javaee.training8.WorkerThread2.run(WorkerThread2.java:29) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) "pool-1-thread-1": at org.ph.javaee.training8.Task.executeTask1(Task.java:31) - waiting to lock <0x272236d0> (a java.lang.Object) at org.ph.javaee.training8.WorkerThread1.run(WorkerThread1.java:29) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) This is because write locks are tracked by the JVM similar to flat locks. This means the HotSpot JVM deadlock detector appears to be currently designed to detect: Deadlock on Object monitors involving FLAT locks Deadlock involving Locked ownable synchronizers associated with WRITE locks The lack of read lock per-thread tracking appears to prevent deadlock detection for this scenario and significantly increase the troubleshooting complexity. I suggest that you read Doug Lea’s comments on this whole issue since concerns were raised back in 2005 regarding the possibility to add per-thread read-hold tracking due to some potential lock overhead. Find below my troubleshooting recommendations if you suspect a hidden deadlock condition involving read locks: Analyze closely the thread call stack trace, it may reveal some code potentially acquiring read locks and preventing other threads to acquire write locks. If you are the owner of the code, keep track of the read lock count via the usage of the lock.getReadLockCount() method I’m looking forward for your feedback, especially from individuals with experience on this type of deadlock involving read locks. Finally, find below a video explaining such findings via the execution and monitoring of our sample Java program.
January 28, 2013
by Pierre - Hugues Charbonneau
· 105,691 Views · 3 Likes
article thumbnail
The Definitive Gradle Guide for NetBeans IDE
Gradle is a build tool like Ant and Maven only much, much better!
January 14, 2013
by Attila Kelemen
· 65,591 Views · 1 Like
article thumbnail
Measure Elapsed Time with Camel
Apache Camel provides an event notifier support class which allows you to keep information about what happened on Exchange, Route and Endpoint. One of the benefits of this class is that you can easily audit messages created in Camel Routes, collect information and report that in log by example. When developing an application, it is very important to calculate/measure elapsed time on the platform to find which part of your code, processor or system integrated which is the bad duck and must be improved. In three steps, I would show you How to enable this mechanism to report : - Time elapsed to call an endpoint (could be another camel route, web service, ...) - Time elapsed on the route exchange STEP 1 - Create a Class implementing the EventNotifierSupport public class AuditEventNotifier extends EventNotifierSupport { public void notify(EventObject event) throws Exception { if (event instanceof ExchangeSentEvent) { ExchangeSentEvent sent = (ExchangeSentEvent) event; log.info(">>> Took " + sent.getTimeTaken() + " millis to send to external system : " + sent.getEndpoint()); } if (event instanceof ExchangeCompletedEvent) {; ExchangeCompletedEvent exchangeCompletedEvent = (ExchangeCompletedEvent) event; Exchange exchange = exchangeCompletedEvent.getExchange(); String routeId = exchange.getFromRouteId(); Date created = ((ExchangeCompletedEvent) event).getExchange().getProperty(Exchange.CREATED_TIMESTAMP, Date.class); // calculate elapsed time Date now = new Date(); long elapsed = now.getTime() - created.getTime(); log.info(">>> Took " + elapsed + " millis for the exchange on the route : " + routeId); } } public boolean isEnabled(EventObject event) { return true; } protected void doStart() throws Exception { // filter out unwanted events setIgnoreCamelContextEvents(true); setIgnoreServiceEvents(true); setIgnoreRouteEvents(true); setIgnoreExchangeCreatedEvent(true); setIgnoreExchangeCompletedEvent(false); setIgnoreExchangeFailedEvents(true); setIgnoreExchangeRedeliveryEvents(true); setIgnoreExchangeSentEvents(false); } protected void doStop() throws Exception { // noop } } Not really complicated and the code is explicit. Check the doStart() method to enable/disable the events for which you would like to gather information. This example uses only Exchange.CREATED_TIMESTAMP property but the next version of Camel 2.7.0 will provide you the property exchange.RECEIVED_TIMESTAMP and so you will be able to calculate more easily the time spend by the exchange to call the different processors till it arrives at the end of the route. This example collects Date information but you can imagine to use this mechanism to check if your route processes the message according to SLA, .... STEP 2 - Instantiate the bean in Camel Spring XML By adding this bean definition, Camel will automatically register it to the CamelContext created. STEP 3 - Collect info into the log 18:10:46,060 | INFO | tp1238469515-285 | AuditEventNotifier | ? ? | 68 - org.apache.camel.camel-core - 2.6.0.fuse-00-00 | >>> Took 3 millis for the exchange on the route : mock-HTTP-Server 18:10:46,062 | INFO | tp2056154542-293 | AuditEventNotifier | ? ? | 68 - org.apache.camel.camel-core - 2.6.0.fuse-00-00 | >>> Took 25 millis to send to external system : Endpoint[http://localhost:9191/sis] 18:10:46,077 | INFO | tp2056154542-293 | AuditEventNotifier | ? ? | 68 - org.apache.camel.camel-core - 2.6.0.fuse-00-00 | >>> Took 103 millis for the exchange on the route : ws-to-sis
January 11, 2013
by Charles Moulliard
· 12,304 Views
article thumbnail
Distributed Lock using Zookeeper
This article is by Stephen Mouring, Jr. On my project we have a number of software components that run concurrently, some on a cron, and some as part of our build process. Many of these components need to mutate data in our data store and have the possibility of conflicting with one another. What is worse is that many of these processes run on separate machines making language level or even file system level synchronization impossible. Zookeeper is a natural solution to the problem. It is a distributed system for, among other things, managing coordination across a cluster of machines. Zookeeper manages information as a hierarchical system of "nodes" (much like a file system). Each node can contain data or can contain child nodes. Zookeeper supports several types of nodes. A node can be either "ephemeral" or "persistent" meaning it is either deleted when the process that created it ends or it remains until manually deleted. A node can also be "sequential" meaning each time a node is created with a given name, a sequence number is postfixed to that name. This allows you to create a series of nodes with the same name that are ordered in the same order they were created. To solved our problem we need to have a locking mechanism that works across processes and across machines that allows one holder of the lock to execute at a given time. Below is the Java code we wrote to solve the problem. I will go through it step by step. public class DistributedLock { private final ZooKeeper zk; private final String lockBasePath; private final String lockName; private String lockPath; public DistributedLock(ZooKeeper zk, String lockBasePath, String lockName) { this.zk = zk; this.lockBasePath = lockBasePath; this.lockName = lockName; } public void lock() throws IOException { try { // lockPath will be different than (lockBasePath + "/" + lockName) becuase of the sequence number ZooKeeper appends lockPath = zk.create(lockBasePath + "/" + lockName, null, Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL); final Object lock = new Object(); synchronized(lock) { while(true) { List nodes = zk.getChildren(lockBasePath, new Watch() { @Override public void process(WatchedEvent event) { synchronized (lock) { lock.notifyAll(); } } }); Collections.sort(nodes); // ZooKeeper node names can be sorted lexographically if (lockPath.endsWith(nodes.get(0)) { return; } else { lock.wait(); } } } } catch (KeeperException e) { throw new IOException (e); } catch (InterruptedException e) { throw new IOException (e); } } public void unlock() throws IOException { try { zk.delete(lockPath, -1); lockPath = null; } catch (KeeperException e) { throw new IOException (e); } catch (InterruptedException e) { throw new IOException (e); } } } (Disclaimer: Credit for this code goes to Aaron McCurry for developing the core mechanism of this lock as well as the design for using ZooKeeper. Kudos to Aaron!) Each process that wants to use the lock should instantiate an object of the DistributedLock class. The DistributedLock constructor takes three parameters. The first parameter is a reference to the ZooKeeper client. The second parameter is the "base path" where you want your lock nodes to reside in. Remember that ZooKeeper stores its nodes like a file system, so think of this base path as the directory you want your lock nodes created in. The third parameter is the name of the lock to use. Note you should use the same lock name for every process that you want to share the same lock. The lock name is the common reference that multiple processes lock on. Note: This class can support multiple locks if you use a different lock name for each lock you want to create. Say you have two data stores (A and B). You have several processes that need mutate A and B. You could use two different lock names (say LockA and LockB) to represent the locks for each data store. Any process that needs to mutate data store A could create a DistributedLock with a lockname of LockA. Likewise, any process that needs to mutate data store B could create a DistributedLock with a lockname of LockB. A proces that needs to mutate both datastores would create two DistributedLock objects (one with lock name of LockA and one with a lock name of LockB). Once your process has created a DistributedLock object it can then call the lock() method to attempt to acquire the lock. The lock() method will block until the lock is acquired. // lockPath will be different than (lockBasePath + "/" + lockName) becuase of the sequence number ZooKeeper appends lockPath = zk.create(lockBasePath + "/" + lockName, null, Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL); First of all, the lock() method creates a node in ZooKeeper to represent its "position in line" waiting for the lock. The node created is EPHEMERAL which means if our process dies for some reason, its lock or request for the lock with automatically disappear thanks to ZooKeeper's node management, so we do not have worry about timing out nodes or cleaning up stale nodes. final Object lock = new Object(); synchronized(lock) { while(true) { List nodes = zk.getChildren(lockBasePath, new Watch() { @Override public void process(WatchedEvent event) { synchronized (lock) { lock.notifyAll(); } } }); // Sequential ZooKeeper node names can be sorted lexographically! Collections.sort(nodes); // Are we the "topmost" node? (The node with the lowest sequence number that is.) if (lockPath.endsWith(nodes.get(0)) { return; } else { lock.wait(); } } } To understand the code above you need to understand how ZooKeeper works. ZooKeeper operates through a system of callbacks. When you call getChildren() you can pass in a "watcher" that will get called anytime the list of children changes. The gist of what we are doing here is this. We are creating an ordered list of nodes (sharing the same name). Whenever the list changes, every process that has registered a node is notified. Since the nodes are ordered, one node will be "on top" or in other words have the lowest sequence number. That node is the node that owns the lock. When a process detects that its node is the top most node, it proceeds to execute. When it is finished, it deletes its node, triggering a notification to all other processes who then determine who the next node is who has the lock. The tricky part of the code from a Java perspective is the use of nested synchronized blocks. The nested synchronization structure is used to ensure that the DistributedLock is able to process every update it gets from ZooKeeper and does not "lose" an update if two or more updates come from ZooKeeper in quick succession. The inner synchronized block in the Watcher method is called from an outside thread whenever ZooKeeper reports a change to its children. Since the Watcher callback is in a synchronized block keyed to the same Java lock object as the outer synchronized block, it means that the update from ZooKeeper cannot be processed until the contents of the outer synchronized block is finished. In other words, when an update comes in from ZooKeeper, it fires a notifyAll() which wakes up the loop in the lock() method. That lock method gets the updated children and sets a new Watcher. (Watchers have to be reset once they fire as they are not a perpetual callback. They fire once and then disappear.) If the newly reset Watcher fires before the rest of the loop executes, it will block because it is synchronized on the same Java lock object as the loop. The loop finishes its pass, and if it has not acquired the distrubted lock, it waits on the Java lock object. This frees the Watcher to execute whenever a new update comes, repeating the cycle. Once the lock() method returns, it means your process has the dsitributed lock and can continue to execute its business logic. Once it is complete it can release the lock by calling the unlock() method. public void unlock() throws IOException { try { zk.delete(lockPath, -1); lockPath = null; } catch (KeeperException e) { throw new IOException (e); } catch (InterruptedException e) { throw new IOException (e); } } All unlock() does is explictly delete this process's node which notifies all the other waiting processes and allows the next one in line to go. Because the nodes are EPHEMERAL, the process can exit without unlocking and ZooKeeper will eventually reap its node allowing the next process to execute. This is a good thing because it means if your process ends prematurely without you having a chance to call unlock() it will not block the remaining processes. Note that it is best to explicitly call unlock() if you can, because it is much faster than waiting for ZooKeeper to reap your node. You will delay the other processes less if you explicity unlock.
January 8, 2013
by Scott Leberknight
· 62,326 Views · 5 Likes
article thumbnail
SLF4J Logging in Eclipse Plugins
developing with maven and pure java libraries all the time, i never thought it could be a problem to issue a few log statements when developing an eclipse plugin. but it looks like in the imaginary of an eclipse developer everything is always inside the eclipse environment and nothing is outside the eclipse universe. if you search for the above headline using google, one of the first articles you’ll find is one about the “platform logging facility”. but what about 3rd libraries? they cannot use an eclipse-based logging framework. in my libraries i use the slf4j api and leave it up to the user to decide what logging implementation (log4j, logback, jdk) he or she wants to use. and that’s exactly what i want to do in eclipse. it was hard to figure out exactly how to do it, but here are the pieces of that puzzle. phase 1: development this describes the steps during the development phase of your custom plugin. step 1: get your libaries into a p2 repository everything you want to use in eclipse has to be installed from a p2 repository. but most of the libaries i use are in a maven repository. as far as i know there is no such thing as a main p2 repository similar to the “maven central,” and all libraries i found in p2 repositories were pretty old. so you have to create one by yourself. luckily there is a maven plugin called p2-maven-plugin that converts all your maven jars into a single p2 repository. you can upload the plugin to a folder of your website or simply install it from your local hard drive. for this example you’ll need the following libraries: org.slf4j:slf4j-api:1.6.6 org.slf4j:slf4j-log4j12:1.6.6 log4j:log4j:1.2.17 org.ops4j.pax.logging:pax-logging-api:1.7.0 org.ops4j.pax.logging:pax-logging-service:1.7.0 org.ops4j.pax.confman:pax-confman-propsloader:0.2.2 format “groupid:artifactid:version” is as used for the “p2-maven-plugin.” to skip this step you could also use http://www.fuin.org/p2-repository/ . step 2: install the slf4j api in the eclipse ide select “help / install new software…”. add the p2 repository url and install the “slf4j-api”—you could directly use the folder from step 1 with a file url like this: “file:/pathtoyour/p2-repository/”. add the freshly installed “slf4j.api” to your manifest.mf. start using slf4j logs in your code as usual. phase 2: production this describes the tasks a user of your custom plugin has to complete to start logging with log4j. the following assumes that your custom plugin is already installed. step 1: install the log libraries in the eclipse ide select “help / install new software…”. install the “equinox target components” from the eclipse update site. add the p2 repository url and install the following plugins: apache log4j ops4j pax confman–properties loader ops4j pax logging–api ops4j pax logging–service step 2: configure pax logging set the location for your log configuration in the “eclipse.ini” as “vmarg" … -vmargs -xms40m -xmx512m -dbundles.configuration.location= … create a folder named “services” in the above “config-dir.” create log4j properties named “org.ops4j.pax.logging.properties” in “services.” log4j.rootlogger=info, file log4j.appender.file=org.apache.log4j.fileappender log4j.appender.file.file=/example.log log4j.appender.file.layout=org.apache.log4j.patternlayout log4j.appender.file.layout.conversionpattern=%d{yyyy/mm/dd hh:mm:ss,sss} [%t] %-5p %c %x - %m%n log4j.logger.your.package=debug step 3: activate pax logging open the “console” view. select the “host osgi console.” start the following bundles: start org.eclipse.equinox.cm start org.ops4j.pax.logging.pax-logging-api start org.ops4j.pax.logging.pax-logging-service start org.ops4j.pax.configmanager now you should be able to see your log statements in the configured “example.log” file. step 4: changing the configuration if you want to change the configuration in the “org.ops4j.pax.logging.properties”, simply restart the pax configmanager in the osgi console stop org.ops4j.pax.configmanager start org.ops4j.pax.configmanager happy logging!
January 6, 2013
by Michael Schnell
· 39,182 Views · 1 Like
article thumbnail
How to Un-install a Plugin From Eclipse / STS?
It is easy to do - a few button clicks (generally) - but the button location is damn unintuitive. So, this is what you have got to do Go to "Help" menu item. Click on "About ..." button (why on earth should I click that when I am trying to un-install a plugin. By the way, the menu item just above "About ..." is "Install New Software ...". Would it have been too much pain to have a "Manage plugins" and / or "Un-install plugins" right underneath it?) A form opens up. At the bottom of it there is button "Installation details". Click that. (Again, why on earth would anyone think "Installation details" would have anything to do with un-installing stuff. I would have expected only a static display of stuff that are already installed.) Another multi tabbed form opens up (Anyone keeping count of the number of windows opened already. This is the 3rd window by now, including the parent editor window) which shows all the installed plugins. If you select any of the installed plugins, a button to "uninstall" becomes available. Click that and you should be able to un-install and after a restart everything should be fine. My interest in software and IT has always been much more than a 9 to 5 job (and I am sure there is a huge population that it holds equally true for). I have always wanted software to be efficient and beautiful apart from doing it's job. However, it took an excellent session on usability (which I joined only with casual curiosity but left with renewed interest in the subject and admiration for David Travis who delivered the course) to get me to start looking at all software with an "user's" perspective. And I was surprised with what I found and how it changed my coding. I have been using Eclipse and STS for years now (nearing a decade now) and I absolutely love these software. However when you start looking at them as a "user" and not only as a developer, there are quite a few usability opportunities of improvement that meets the eye. This article - apart from helping folks looking to un-install plugins in Eclipse - is also intended at folks who design Eclipse - just a humble request to consider this also as a usability improvement.
January 1, 2013
by Partha Bhattacharjee
· 16,163 Views
article thumbnail
JSON-Schema in WADL
In between other jobs I have recently been reviewing the WADL specification with a view to fixing some documentation problems and to producing an updated version. One of the things that was apparent was the lack of any grammar support for languages other than XML - yes you can use a mapping from JSON<->XML Schema but this would be less than pleasant for a JSON purist. So I began to look at how one would go about attaching a JSON-Schema grammar of a JSON document in a WADL description of a service. This isn't a specification yet; but a proposal of how it might work consistently. Now I work with Jersey mostly, so lets consider what Jersey will currently generate for a service that returns both XML and JSON. So the service here is implemented using the JAX-B binding so they both use a similar structure as defined by the XML-Schema reference by the include. So the first thing we considered was re-using the existing element property, which is defined as a QName, on the representation element to reference an imported JSON-Schema. It is shown here both with another an arbitrary namespace so it can be told apart from XML elements without a namespace. Or xmlns:json="http://wadl.dev.java.net/2009/02/json" The problem is that the JSON-Schema specification as it stands doesn't have a concept of a "name" property, so each JSON-Schema is uniquely identified by its URI. Also from my read of the specification, each JSON-Schema contains the definition for at most one document - not the multiple types / documents that can be contained in XML-Schema. So the next best suggestion would be to just use the "filename" part of the URI as a proxy for the URI; but of course that won't necessarily be unique. I could see for example the US government and Yahoo both publishing there own "address" micro format. The better solution to this problem is to introduce a new attribute, luckily the WADL spec was designed with this in mind, that is a type of URI that can be used to directly reference the JSON-Schema definitions. So rather than the direct import in the previous example we have a URI property on the element itself. The "describedby" attribute name comes from the JSON-Schema proposal and is consistent with the rel used on atom links in the spec. xmlns:json="http://wadl.dev.java.net/2009/02/json-schema" xmlns:m="urn:message" The secondary advantage is that this format is backward-compatible with tooling that was relying on the XML-Schema grammar. Although this is probably only of interest to people who work in tooling / testing tools like myself. Once you have the JSON-Schema definition then some users are going to want to do away with the XML all together, so finally here is a simple mapping of the WADL to a JSON document that contains just the JSON-Schema information. It has been suggested by Sergey Breyozkin that the JSON mapping would only show the json grammars and I am coming around to that way of thinking. I would be interested to hear of a usecase for the JSON mapping that would want access to the XML Schema. { "doc":{ "@generatedBy":"Jersey: 1.16-SNAPSHOT 10/26/2012 09:28 AM" }, "resources":{ "@base":"http://localhost/", "resource":{ "@path":"/root", "method":{ "@id":"hello", "@name":"PUT", "request":{ "representation":[ { "@mediaType":"application/json", "@describedby":"application.wadl/requestMessage" } ] }, "response":{ "representation":[ { "@mediaType":"application/json", "@describedby":"application.wadl/responseMessage" } ] } } } } } I am currently using the mime type of "application/vnd.sun.wadl+json" for this mapping to be consistent with the default WADL mime type. I suspect we would want to change this in the future; but it will do for starters. So this is all very interesting but you can't play with it unless you have an example implementation. I have something working for both the server side and for a Java client generator in Jersey and wadl2java respectively, and that will be the topic of my next post. I have been working with Pavel Bucek and the Jersey team on these implementations and the WADL proposal. Thanks very much to him for putting up with me.
December 17, 2012
by Gerard Davison
· 40,362 Views · 1 Like
article thumbnail
How to Change the Default Webapp Deployment Location of Tomcat in Eclipse
When you deploy your Java web application to the Apache Tomcat server, via Eclipse, by default the web app will be deployed under {YOUR_ECLIPSE_WORKSPACE}\.metadata\.plugins\org.eclipse.wst.server.core\tmp{a-number}\wtpwebapps. Suppose if you want to deploy your web app to a location that is easily navigable, follow these steps. First make sure you have removed all the web apps that are currently added to your server instance (In servers view, right click on the server name and then Add and Remove). And then double click on the server instance in servers view which will open up that server’s configuration page. On that page, see under Server Locations and select either the option Use Tomcat Installation to deploy the web app under the directory where the Tomcat server is installed or Use custom location to manually specify. Save, Re-add the web application and then Publish. Now the deployed web app will be under the directory of your choice.
December 15, 2012
by Veera Sundar
· 26,130 Views
article thumbnail
All about Two-Phase Locking and a little bit MVCC
In this blog I will describe the concurrency control methods implemented in database management systems, and the differences between them. I will also explain about what locking technique is used in CUBRID RDBMS, about locking modes and their compatibility, and finally, the deadlocks and the solution for them. Overview When multiple transactions, which change the data, are executed simultaneously, it is required to control the order of processing these transactions to satisfy the ACID (Atomicity, Consistency, Integrity, Durability) property of the database. Executing multiple transactions simultaneously should lead to the same result as executing each transaction independently, in other words, one transaction should not be affected by another transaction. If different data is changed for each transaction, no interference between transactions is made, so there is no issue. However, if the same data is simultaneously changed by multiple transactions, the order of processing each transaction should be controlled. Types of Concurrency Control For example, the T1 transaction changes the A record from 1 to 2 and then changes the B record, the T2 transaction can simultaneously change the A record, too. Let's assume that the T2 transaction changes the A record from 2 to 4 by adding +2. If two transactions are successfully terminated, there is no issue. But it is important that all transactions can be rolled back. If the T1 transaction is rolled back, the value of the A record should be returned to 1, i.e. the value before the T1 transaction was executed. This is to satisfy the ACID property of the database. However, the T2 transaction has already changed the A record value to 3. So, it is impossible to return the A record to 1 regardless of the situation. In this case, there can be two options. Two-phase locking (2PL) The first one is when the T2 transaction tries to change the A record, it knows that the T1 transaction has already changed the A record and waits until the T1 transaction is completed because the T2 transaction cannot know whether the T1 transaction will be committed or rolled back. This method is called Two-phase locking (2PL). Multi-version concurrency control (MVCC) The other one is to allow each of them, T1 and T2 transactions, to have their own changed versions. Even when the T1 transaction has changed the A record from 1 to 2, the T1 transaction leaves the original value 1 as it is and writes that the T1 transaction version of the A record is 2. Then, the following T2 transaction changes the A record from 1 to 3, not from 2 to 4, and writes that the T2 transaction version of the A record is 3. When the T1 transaction is rolled back, it does not matter if the 2, the T1 transaction version, is not applied to the A record. After that, if the T2 transaction is committed, the 3, the T2 transaction version, will be applied to the A record. If the T1 transaction is committed prior to the T2 transaction, the A record is changed to 2, and then to 3 at the time of committing the T2 transaction. The final database status is identical to the status of executing each transaction independently, without any impact on other transactions. Therefore, it satisfies the ACID property. This method is called Multi-version concurrency control (MVCC). CUBRID has implemented 2PL method as well as DB2 and SQL Server, while Oracle, InnoDB and PostgreSQL have implemented MVCC. Two-phase locking in CUBRID The 2PL adopted by CUBRID uses locks to ensure the consistency between transactions that change the identical data. As the "lock" literally means, the locking is executed through two phases: expanding phase (acquiring) shrinking phase (releasing) More accurately, all transactions should acquire lock for the data to be accessed and the acquired locks are released only when the transaction is terminated. After a transaction has acquired the lock for a certain data (regardless of the lock type, S_LOCK for read, stands for Shared Lock, or X_LOCK for write, stands for Exclusive Lock), when another transaction tries to acquire a new lock for the data, the new lock is allowed or pended depending on the lock compatibility rule. Therefore, success or failure of the prior transaction does not have impact on the following transactions, so the data consistency is maintained. Lock Manager in CUBRID Thus, the key point of 2PL, adopted by CUBRID, is that the lock must be processed through two phases: expanding phase and shrinking phase. Then, [Figure 1] release all locks, acquired while executing a transaction, only after the transaction ends (commit or rollback). Figure 1: Two-Phase Locking. 2PL concurrency control method naturally controls access to the identical data from transactions by making all transactions observe the 2PL protocol. The following Figure 2 below shows an example of three transactions using 2PL: Transaction 1 executes B=B+A operation, Transaction 2 executes C=A+B operation, and Transaction 3 executes Print C operation. Since all three transactions are accessing the data A, B and C, the concurrency control is required. In this case, each transaction is executed according to the 2PL protocol so that there is no data conflict. Figure 2: Concurrency Control by using 2PL. Lock modes To understand the concurrency control of multiple transactions more deeply, let's discuss about lock modes, lock conversion and transaction isolation level. In the above figure, you can see that S-lock, Shared Lock, for A was first acquired by Transaction 1, but it is also acquired by Transaction 2, too. On the contrary, the transaction which requested X-lock is blocked until S-lock is released. In this matter, a variety of lock modes are used to minimize conflicts by lockers. Major types of locks utilized in DBMSs are. Shared (S) Lock: Used for read operation. It is generally set on the target record when SELECT statement is executed. It blocks a transaction from changing data which was already read by other transactions. Exclusive (X) Lock: Used for write-operations such as INSERT, UPDATE, DELETE. It blocks one data from being changed by multiple transactions. Update (U) Lock: Used to define that the target resource will be changed. It is used to minimize deadlock which may occur when multiple transactions are executing both read and write. Intent Shared (IS) Lock: Set on the upper resource (e.g. tables) to set the S-lock on some lower resources (e.g. records or pages). It is to prevent other transactions from setting X-lock on the upper resource. Intent lock will soon be described. Intent Exclusive (IX) Lock: Set on the upper resource to set X-lock on some lower resources. Shared with Intent Exclusive (SIX) Lock: Set on the upper resource to set S-lock and X-lock on some lower resources. Lock mode compatibility Among the lock modes above, intent locks are used to improve the transaction concurrency and to prevent deadlock between the upper resources and the lower resources. For example, when Transaction A tries to read Record R on Table T, it sets IS_LOCK on Table T before setting S_LOCK on Record R. Then, Transaction B is prevented from setting X_LOCK on Table T to change the structure of Table T. If Transaction A has not set IS_LOCK on Table T, Transaction B would change the structure of Table T. Then, Transaction A would perform a wrong read operation. This way Transaction B has no need to check all records in Table T to check whether there is any lock set by other transactions for setting X_LOCK on Table T. The following lock mode compatibility table will clearly show the effect of intent locks: Table 1: The lock mode compatibility table of CUBRID. Current Lock Mode NULL IS NS S IX SIX U NX X Newly-requested Lock Mode NULL True True True True True True True True True IS True True N/A True True True N/A N/A False NS True N/A True True N/A N/A False True False S True True True True False False False False False IX True True N/A False True False N/A N/A False SIX True True N/A False False False N/A N/A False U True N/A True True N/A N/A False False False NX True N/A True False N/A N/A False False False X True False False False False False False False False From the lock mode compatibility table, you can see that X_LOCK cannot be set on a table if IS_LOCK is set on the table. And only IS_LOCK can be compatible with SIX_LOCK. This means that SIX_LOCK intends to set S_LOCK and X_LOCK on the record and it will not allow any lock but IS_LOCK for S_LOCK on other non-conflicting records. From the table, you can see that IX_LOCK and IX_LOCK can be compatible with each other. IX_LOCK intends to set X_LOCK for some records. So, the compatibility is available. If there are two transactions that try to change an identical record, IX_LOCK for the table is allowed. However, there is no problem in concurrency control since only the transaction that has acquired X_LOCK for the record first can change the record (X_LOCK and X_LOCK are not compatible). The lock mode compatibility table is expressed as a global variable lock_Comp[][] in the lock_table.c file in CUBRID source code. Among CUBRID sources, most codes related to lock modes are implemented in lock_manager.c file. To set lock on a data object, the lock_object() function is used which receives three parameters: the OID of an object where the lock mode will be set, the OID of the class where the object belongs, and the desired lock mode. In the source code of the function, you can see that the function is executed in several ways based on the target of the lock mode, the lock mode for an instance object or for a class object. Take note of this: in CUBRID, a class object is also an object. Keep it in mind that a class object has an OID and all class objects are the instances of a root class, so it uses ROOTOID, the OID of the root object, as its class OID. From the code, you can see that the required intent lock is set on a class object when a lock mode is required for an instance object. And there is a concept of lock waiting time in the lock mode request. To retrieve the lock timeout value set on the current transaction, the logtb_find_wait_secs() function is called. CUBRID supports the SET TRANSACTION LOCK TIMEOUT SQL command and the setLockTimeout() method in JDBC. The command is to specify the lock timeout of the current transaction. Lock waiting time means the time for a transaction, which has made a request for lock mode, to wait when a lock mode is set on an object by a transaction and the requested lock is not compatible with the already-set lock mode. As you have seen before, the 2PL concurrency control method does not allow lock from other transactions until the existing lock is released. For the following two reasons, lock timeout should be set by a transaction: When a user does not want to wait too long because of the lock mode. To lower the frequency of deadlock. Deadlocks A deadlock occurs when two or more transactions request resources locked by each of them, so all transactions cannot be progressed. Figure 8 below shows an example of a deadlock. Figure 2: Transaction Deadlock. First, Transaction 1 executes UPDATE participant SET gold=10 WHERE host_year=2004 AND nation_code=’KOR’ statement and sets X_LOCK on the ‘KOR’ record. Transaction 2 sets X_LOCK on the ‘JPN’ record. Transaction 3 sets X_LOCK on the ‘CHN’ record. After that, Transaction 1 requests X_LOCK on the ‘JPN’ record for executing UPDATE for that record. However, the ‘JPN’ record is already locked with X_LOCK by Transaction 2. So, Transaction 1 should wait until Transaction 2 ends. Based on the 2PL protocol, the X_LOCK is released when the transaction ends. Transaction 2 requests X_LOCK on the ‘CHN’record and waits for Transaction 3. Finally, Transaction 3 waits for Transaction 1 to acquire the 'KOR' record of Transaction 1 as it has X_LOCK on the ‘CHN’ record. As a result,Transaction 1 waits for Transaction 2 to end, Transaction 2 waits for Transaction 3 to end, and Transaction 3 waits for Transaction 1 to end. So, no transaction can be progressed. This is called a deadlock. Most DBMSs which use the 2PL method, including CUBRID, use the deadlock detection method to solve the deadlock problem. It periodically checks whether the cycle illustrated in the above figure occurs by drawing a Lock Wait Graph for the transactions being executed. In CUBRID, the thread for detecting deadlock checks the Lock Wait Graph every second. When a deadlock is detected, one transaction among the transactions is randomly selected and aborted by force. This is called unilateral abort. When a transaction is selected as a victim to be sacrificed to solve the deadlock and unilaterally aborted, the corresponding SQL statement returns an error code. The error message is "The transaction has timed out due to deadlock while waiting for X_LOCK for an object. It waited until User 2 ended.” When an error is returned and the application aborts the transaction, the locks of the transaction are released and other transactions can be continuously processed. To see how the deadlock is detected, see the lock_detect_local_deadlock() function in the source code. This function is called with the intervals (in seconds) specified by the PRM_LK_RUN_DEADLOCK_INTERVAL variable (the deadlock_detection_interval_in_secs parameter in cubrid.conf file) on the background thread which executes thread_deadlock_detect_thread(). Even if a deadlock does not occur, when the execution time of a transaction is too long, other transactions should wait for too long as well. For a certain application, it is wiser to give up rather than wait. In particular, when a web server has called DB tasks and the wait time is too long, all threads of the web server are used to process the DB, so they cannot be used to process external HTTP requests any more, causing service failures. Therefore, for a web application, the threads should be returned without waiting an unlimited amount of time for DB processing even if an error occurs. Two methods are used for that: One is lock timeout supported by CUBRID. The other is query cancel. JDBC is defined with an API which can cancel the SQL statement being executed. The key data structure of the lock manager is defined in the lock_manager.c file. typedef struct lk_entry LK_ENTRY; struct lk_entry { #if defined(SERVER_MODE) struct lk_res *res_head; /* back to resource entry */ THREAD_ENTRY *thrd_entry; /* thread entry pointer */ int tran_index; /* transaction table index */ LOCK granted_mode; /* granted lock mode */ LOCK blocked_mode; /* blocked lock mode */ int count; /* number of lock requests */ struct lk_entry *next; /* next entry */ struct lk_entry *tran_next; /* list of locks that trans. holds */ struct lk_entry *class_entry; /* ptr. to class lk_entry */ LK_ACQUISITION_HISTORY *history; /* lock acquisition history */ LK_ACQUISITION_HISTORY *recent; /* last node of history list */ int ngranules; /* number of finer granules */ int mlk_count; /* number of instant lock requests */ unsigned char scanid_bitset[1]; /* PRM_LK_MAX_SCANID_BIT/8]; */ #else /* not SERVER_MODE */ int dummy; #endif /* not SERVER_MODE */ }; typedef struct lk_res LK_RES; struct lk_res { MUTEX_T res_mutex; /* resource mutex */ LOCK_RESOURCE_TYPE type; /* type of resource: class,instance */ OID oid; OID class_oid; LOCK total_holders_mode; /* total mode of the holders */ LOCK total_waiters_mode; /* total mode of the waiters */ LK_ENTRY *holder; /* lock holder list */ LK_ENTRY *waiter; /* lock waiter list */ LK_ENTRY *non2pl; /* non2pl list */ LK_RES *hash_next; /* for hash chain */ }; From the file, the lk_Gl global variable of LK_GLOBAL_DATA type is the core. The LK_ENTRY structure stands for the lock itself. For example, when the Transaction T1 has requested a lock, one LK_ENTRY is created. LK_RES is a structure that shows to which resource the lock belongs. In CUBRID, all resources are objects (instance objects and class objects), so they are shaped as OIDs. In the LK_RES structure, you can see the list of holders with LK_ENTRY type and the list of waiters. The list of holders is a list of transactions that hold the lock for the resource now. For example, when Transaction T1 and Transaction T2 have acquired S_LOCK for the data record with OID1, LK_ENTRY that corresponds to the S_LOCK of T1 and T2 will be registered in the list of holders. When Transaction T3 requests X_LOCK on the OID1 record, T3 should wait because of the existing S_LOCK. So, the LK_ENTRY corresponding to X_LOCK of T3 will be registered to the list of waiters. Which lock is held by which transaction is maintained in the tran_lock_table variable which has the LK_TRAN_LOCK structure as a table. The Wait For Graph for detecting a deadlock is expressed as TWFG_node and TWFG_edge of the LK_WFG_NODE structure and the LK_WFG_EDGE structure. The lock_detect_local_deadlock() function creates a Wait For Graph and detects whether there is a cycle on the graph. When a cycle is detected, the lock_select_deadlock_victim() function selects a victim transaction to be sacrificed for solving the deadlock. For reference, transactions are continuously executed while a Wait For Graph is drawn up and checked, the information of the ended transaction is removed from the graph. The victim transaction is selected based on the following criteria: If a transaction is not a holder, it cannot be a victim. When a transaction is in the commit phase or the rollback phase, it cannot be selected as a victim. Select a transaction of which lock timeout is not set to -1 (unlimited waiting) first. Select the latest transaction rather than the older one. (The transaction ID is an incremental number. A transaction with smaller transaction number is the older one.) Conclusion This concludes the talk about Two-Phase Locking in CUBRID. I briefly covered the types of concurrency control, the difference between 2PL and MVCC, about what locking technique is used in CUBRID RDBMS, about locking modes and their compatibility, and finally, the deadlocks and the solution for them. In this article I have mentioned about OID (Object Identifiers) which are used to identify instance objects as well as class objects. In the next article I will continue this talk and explain what Object, Class, and OID are.
December 14, 2012
by Esen Sagynov
· 11,158 Views · 1 Like
article thumbnail
Spring Integration Mock SftpServer Example
In this example I will show how to test Spring Integration flow using Mock SftpServer.
December 14, 2012
by Krishna Prasad
· 47,608 Views · 3 Likes
article thumbnail
Using Spring FakeFtpServer to JUnit test a Spring Integration Flow
for people in hurry, get the latest code and the steps in github . to run the junit test, run “mvn test” and understand the test flow. introduction: fakeftpserver in this spring integration fakeftpserver example, i will demonstrate using spring fakeftpserver to junit test a spring integration flow. this is an interesting topic, and there are few articles on unit testing file transfers , which gives some insight on this topic. in this blog, we will test a spring integration flow which checks for a list of files, apply a splitter to separate each file and start downloading them into a local location. once the download is complete, it will delete the files on the ftp server. in my next blog, i will show how to do junit testing of spring integration flow with sftp server. spring integration flow spring integration fakeftpserver example in order to use fakeftpserver we need to have maven dependency as below, org.mockftpserver mockftpserver 2.3 test the first step to this is to create a fakeftpserver before every test runs as below, @before public void setup() throws exception { fakeftpserver = new fakeftpserver(); fakeftpserver.setservercontrolport(9999); // use any free port filesystem filesystem = new unixfakefilesystem(); filesystem.add(new fileentry(file, contents)); fakeftpserver.setfilesystem(filesystem); useraccount useraccount = new useraccount("user", "password", home_dir); fakeftpserver.adduseraccount(useraccount); fakeftpserver.start(); } @after public void teardown() throws exception { fakeftpserver.stop(); } finally run the junit test case as seen below, @autowired private filedownloadutil downloadutil; @test public void testftpdownload() throws exception { file file = new file("src/test/resources/output"); delete(file); ftpclient client = new ftpclient(); client.connect("localhost", 9999); client.login("user", "password"); string files[] = client.listnames("/dir"); client.help(); logger.debug("before delete" + files[0]); assertequals(1, files.length); downloadutil.downloadfilesfromremotedirectory(); logger.debug("after delete"); files = client.listnames("/dir"); client.help(); assertequals(0, files.length); assertequals(1, file.list().length); } i hope this blog helped.
December 13, 2012
by Krishna Prasad
· 17,412 Views
article thumbnail
Configuring IIS methods for ASP.NET Web API on Windows Azure Websites
That’s a pretty long title, I agree. When working on my implementation of RFC2324, also known as the HyperText Coffee Pot Control Protocol, I’ve been struggling with something that you will struggle with as well in your ASP.NET Web API’s: supporting additional HTTP methods like HEAD, PATCH or PROPFIND. ASP.NET Web API has no issue with those, but when hosting them on IIS you’ll find yourself in Yellow-screen-of-death heaven. The reason why IIS blocks these methods (or fails to route them to ASP.NET) is because it may happen that your IIS installation has some configuration leftovers from another API: WebDAV. WebDAV allows you to work with a virtual filesystem (and others) using a HTTP API. IIS of course supports this (because flagship product “SharePoint” uses it, probably) and gets in the way of your API. Bottom line of the story: if you need those methods or want to provide your own HTTP methods, here’s the bit of configuration to add to your Web.config file: Here’s what each part does: Under modules, the WebDAVModule is being removed. Just to make sure that it’s not going to get in our way ever again. The security/requestFiltering element I’ve added only applies if you want to define your own HTTP methods. So unless you need the XYZ method I’ve defined here, don’t add it to your config. Under handlers, I’m removing the default handlers that route into ASP.NET. Then, I’m adding them again. The important part? The "verb attribute. You can provide a list of comma-separated methods that you want to route into ASP.NET. Again, I’ve added my XYZ methodbut you probably don’t need it. This will work on any IIS server as well as on Windows Azure Websites. It will make your API… happy.
December 11, 2012
by Maarten Balliauw
· 20,490 Views
article thumbnail
A C# .NET Client Proxy for RabbitMQ Management API
RabbitMQ comes with a very nice Management UI and a HTTP JSON API, that allows you to configure and monitor your RabbitMQ broker. From the website: “The rabbitmq-management plugin provides an HTTP-based API for management and monitoring of your RabbitMQ server, along with a browser-based UI and a command line tool, rabbitmqadmin. Features include: Declare, list and delete exchanges, queues, bindings, users, virtual hosts and permissions. Monitor queue length, message rates globally and per channel, data rates per connection, etc. Send and receive messages. Monitor Erlang processes, file descriptors, memory use. Export / import object definitions to JSON. Force close connections, purge queues.” Wouldn’t it be cool if you could do all these management tasks from your .NET code? Well now you can. I’ve just added a new project to EasyNetQ called EasyNetQ.Management.Client. This is a .NET client-side proxy for the HTTP-based API. It’s on NuGet, so to install it, you simply run: PM> Install-Package EasyNetQ.Management.Client To give an overview of the sort of things you can do with EasyNetQ.Client.Management, have a look at this code. It first creates a new Virtual Host and a User, and gives the User permissions on the Virtual Host. Then it re-connects as the new user, creates an exchange and a queue, binds them, and publishes a message to the exchange. Finally it gets the first message from the queue and outputs it to the console. var initial = new ManagementClient("http://localhost", "guest", "guest"); // first create a new virtual host var vhost = initial.CreateVirtualHost("my_virtual_host"); // next create a user for that virutal host var user = initial.CreateUser(new UserInfo("mike", "topSecret")); // give the new user all permissions on the virtual host initial.CreatePermission(new PermissionInfo(user, vhost)); // now log in again as the new user var management = new ManagementClient("http://localhost", user.name, "topSecret"); // test that everything's OK management.IsAlive(vhost); // create an exchange var exchange = management.CreateExchange(new ExchangeInfo("my_exchagne", "direct"), vhost); // create a queue var queue = management.CreateQueue(new QueueInfo("my_queue"), vhost); // bind the exchange to the queue management.CreateBinding(exchange, queue, new BindingInfo("my_routing_key")); // publish a test message management.Publish(exchange, new PublishInfo("my_routing_key", "Hello World!")); // get any messages on the queue var messages = management.GetMessagesFromQueue(queue, new GetMessagesCriteria(1, false)); foreach (var message in messages) { Console.Out.WriteLine("message.payload = {0}", message.payload); } This library is also ideal for monitoring queue levels, channels and connections on your RabbitMQ broker. For example, this code prints out details of all the current connections to the RabbitMQ broker: var connections = managementClient.GetConnections(); foreach (var connection in connections) { Console.Out.WriteLine("connection.name = {0}", connection.name); Console.WriteLine("user:\t{0}", connection.client_properties.user); Console.WriteLine("application:\t{0}", connection.client_properties.application); Console.WriteLine("client_api:\t{0}", connection.client_properties.client_api); Console.WriteLine("application_location:\t{0}", connection.client_properties.application_location); Console.WriteLine("connected:\t{0}", connection.client_properties.connected); Console.WriteLine("easynetq_version:\t{0}", connection.client_properties.easynetq_version); Console.WriteLine("machine_name:\t{0}", connection.client_properties.machine_name); } On my machine, with one consumer running it outputs this: connection.name = [::1]:64754 -> [::1]:5672 user: guest application: EasyNetQ.Tests.Performance.Consumer.exe client_api: EasyNetQ application_location: D:\Source\EasyNetQ\Source\EasyNetQ.Tests.Performance.Consumer\bin\Debug connected: 14/11/2012 15:06:19 easynetq_version: 0.9.0.0 machine_name: THOMAS You can see the name of the application that’s making the connection, the machine it’s running on and even its location on disk. That’s rather nice. From this information it wouldn’t be too hard to auto-generate a complete system diagram of your distributed messaging application. Now there’s an idea :)
December 7, 2012
by Mike Hadlow
· 8,048 Views
article thumbnail
Pushing twice daily: our conversation with Facebook’s Chuck Rossi
At my new job we’re reigniting an effort to move to continuous delivery for our software releases. We figured that we could learn a thing or two from Facebook, so we reached out to Chuck Rossi, Facebook’s first release engineer and the head of their release engineering team. He generously gave us an hour of his time, offering insights into how Facebook releases software, as well as specific improvements we could make to our existing practice. This post describes several highlights of that conversation. What’s so good about Facebook release engineering? The core capability my company wants to reproduce is Facebook’s ability to release its frontend web UI on demand, invisibly and with high levels of control and quality. In fact Facebook does a traditional-style large weekly release each Tuesday, as well as not-so-traditional two daily pushes on all other weekdays. They are also able to release on demand as needed. This capability is impressive in any context; it’s all the more impressive when you consider Facebook’s incredible scale: Over 1B users worldwide About 700 developers committing against their frontend source code repo Single frontend code binary about 1.5GB in size Pushed out to many thousands of servers (the number is not public) Changes can go from check-in to end users in as quickly as 40 minutes Release process almost entirely invisible to the users Holy cow. While the release engineering problem for my company is considerably smaller than the one confronting Facebook, it’s not by any means small. (Facebook is so massive that user bases orders of magnitude smaller than Facebook can still have nontrivial scale.) We don’t have to contend with the 1B users, 700 developers, 1.5GB binary or many thousands of servers. But we do want to be able to release on demand, quickly, reliably and invisibly to our users. How Facebook pushes twice daily to over 1B users The common thread running through the practices below is that they reject the supposed tradeoff between speed and quality. Releases are going to happen twice a day, and this needs to occur without sacrificing quality. Indeed, the quality requirements are very high. So any approach to quality incompatible with the always-be-pushing requirement is a non-starter. Here are some of the key themes and techniques. Empower your release engineers Chuck mentioned early on that the whole thing rides on having an empowered release engineering team. Ultimately release engineers have to strike a balance between development’s desire to ship software and operations’ desire to keep everything running smoothly. Release engineers therefore need access to the information that tells them whether a given change is a good risk for some upcoming push, as well as the authority to reject changes that aren’t in fact good risks. At the same time, we want release engineers that “get it” when it comes to software development. We don’t want them blocking changes just because they don’t understand them, or just because they can. Facebook’s release engineers are all programmers, so they understand the importance of shipping software, and they know how to look at test plans, stack traces and the code itself should the need arise. Empowerment is part cultural, part process and part tool-related. On the cultural side, Chuck introduces new hires to the release process, and makes it clear that the release engineering team makes the decision. As part of that presentation, he explains how the development, test and review processes generate data about the risk associated with a change. The highly integrated toolset, based largely around Facebook’s open source Phabricator suite, provides visibility into that change risk data. Just to give you an idea of the expectation on the developers, there are a number of factors that determine whether a change will go through: The size of the diff. Bigger = more risky. The quality of the test plan. The amount of back-and-forth that occurred in the code review (see below). The more back-and-forth, the more rejections, the more requests for change—the more risk. The developer’s “push karma”. Developers with a history of pushing garbage through get more scrutiny. They track this, though any given developer’s push karma isn’t public. The day of the week. Mondays are for small, not-risky changes because they don’t want to wreck Tuesday’s bigger weekly release. Wednesdays allow the bigger changes that were blocked for Monday. Thursdays allow normal changes. Changes for Friday can’t be too risky, partly because weekend traffic tends to be heavier than Friday traffic (so they don’t want any nasty weekend surprises), and partly because developers can be harder to reach on weekends. The release engineers evaluate every change against these criteria, and then decide accordingly. They process 30-300 changes per day. Test suite should take no longer than the slowest test When you’re releasing code twice a day, you have to take testing very seriously. Part of this is making sure that developers write tests, and part of this is running the full test suite—including integration and acceptance tests—against every change before pushing it. In some development organizations, one major challenge with doing this is that integration tests are slow, and so running a full regression against every change becomes impractical. Such organizations—especially those that practice a lot of manual regression testing—often handle this by postponing full regression testing until late in the release cycle. This makes regression testing more cost-feasible because it happens only once per release. But if we’re trying to push twice daily, the run-regression-at-the-end-of-the-release-cycle approach doesn’t work. And neither does truncating the test suite. We can’t give up the quality. Facebook’s alternative is simple: apply extreme parallelization such that it’s the slowest integration test that limits the performance of the overall suite. Buy as many machines as are required to make this real. Now we can run the full battery of tests quickly against every single change. No more speed/quality tradeoff. Code review EVERYTHING Chuck was at Google before he joined Facebook, and apparently at both Google and Facebook they review every code change, no matter how small. Whereas some development shops either practice code review only in limited contexts or else not at all, pre-push code reviews are fundamental to Facebook’s development and release process. The process flat out doesn’t work without them. As the session progressed, I came to understand some reasons why. One key reason is that it promotes the right-sizing of changes so they can be developed, tested, understood and cherry-picked appropriately. Since Facebook releases are based on sets of cherry picks, commits need to be smallish and coherent in a way that reviews promote. And (as noted above) the release engineers depend upon the review process to generate data as to any given change’s riskiness so they can decide whether to perform the cherry pick. Another important benefit is that pre-push code reviews can make it feasible to pursue a single monolithic code repo strategy (often favored for frontend applications involving multiple components that must be tested together), because breaking changes are much less likely to make it into the central, upstream repo. Facebook has about 700 developers committing against a single source repository, so they can’t afford to have broken builds. Facebook uses Phabricator (specifically, Differential and Arcanist) for code reviews. Practice canary releases Testing and pre-push reviews are critical, but they aren’t the entire quality strategy. The problem is that testing and reviews don’t (and can’t) catch everything. So there has to be a way to detect and limit the impact of problems that make their way into the production environment. Facebook handles this using “canary releases”. The name comes from the practice of using canaries to test coal mines for the presence of poisonous gases. Facebook starts by pushing to six internal servers that their employees see. If no problems surface, they push to 2% of their overall server fleet and once again watch closely to see how it goes. If that passes, they release to 100% of the fleet. There’s a bunch of instrumentation in place to make sure that no fatal errors, performance issues and other such undesirables occur during the phased releases. Decouple stuff Chuck made a number of suggestions that I consider to fall under the general category “decouple stuff”. Whereas many of the previous suggestions were more about process, the ones below are more architectural in nature. Decouple the user from the web server. Sessions are stateless, so there’s no server affinity. This makes it much easier to push without impacting users (e.g., downtime, forcing them to reauthenticate, etc.). It also spreads the pain of a canary-test-gone-wrong across the entire user population, thus thinning it out. Users who run into a glitch can generally refresh their browser to get another server. Decouple the UI from the service. Facebook’s operational environment is extremely large and dynamic. Because of this, the environment is never homogeneous with respect to which versions of services and UI are running on the servers. Even though pushes are fast, they’re not instantaneous, so there has to be an accommodation for that reality. It becomes very important for engineers to design with backward and forward compatibility in mind. Contracts can evolve over time, but the evolution has to occur in a way that avoids strong assumptions about which exact software versions are operating across the contract. Decouple pushes from feature activation. Facebook uses dark launches and feature flags to decouple binary pushes from the activation of features. The general concept is for the features to exist in latent form in the production environment, with a means to activate and deactivate them at will. Dark launches and feature flags further erode the speed/quality tradeoff. You can release code without activating it, giving you a way to get it out the door without impacting users. And when you do activate it, you have a way to turn it off immediately should a problem arise. These techniques also simplify source code management because you can just manage everything on trunk instead of having a bunch of branches sitting around waiting to be merged. Facebook uses an internally-developed tool called Gatekeeper to manage feature flags. Gatekeeper allows Facebook to turn feature flags on and off, and to do that in a flexibly segmented fashion. Recap and concluding thoughts I mentioned earlier that Facebook rejects the apparent tradeoff between speed and quality. At their core, the practices above amount to ways to maintain quality in the face of rapid fire releases. As the overall release practice and infrastructure matures, opportunities for further speedups and quality enhancements emerge. As you can see, our one hour conversation was packed with a lot of outstanding information. I hope that others might benefit from this material in the way that I know my company will. Thanks Chuck! Additional resources for Facebook release engineering Facebook publishes a great deal of useful information about their release engineering processes. Here are some good resources to learn more, mostly directly from Chuck himself. Push: Tech Talk – May 26, 2011 (video): This is a class that Chuck gives to new developers when they join Facebook. It’s just slightly out of date as Facebook now does two daily pushes instead of one. Outstanding information about release schedule, branching strategy, cultural norms, tools and more. Just under an hour but well worth the watch. Release engineering and push karma: Chuck Rossi: Interview covering some highlights of the Facebook release process and its supporting culture. Ship early and ship twice as often: Chuck explains how Facebook moved from a once-per-day push schedule to a twice-per-day schedule. Release Engineering at Facebook: Secondary source with highlights on the Facebook release process. Hammering Usernames: Facebook explains how they use dark launches to mitigate risk. Girish Patangay keynote Velocity Europe 2012 “Move Fast and Ship Things” (video) – Keynote by Facebook’s Girish Patangay describing some additional elements of the Facebook release process, including its use of a BitTorrent-based system to push a large binary very quickly out to many thousands of servers.
December 6, 2012
by Willie Wheeler
· 15,431 Views
article thumbnail
How to Integrate FitNesse Test into Jenkins
In an ideal continuous integration pipeline different levels of testing are involved. Individual software modules are typically validated through unit tests, whereas aggregates of software modules are validated through integration tests. When a continuous integration build tool like Jenkins is used it is natural to define different build steps, each step returning feedback and generating test reports and trend charts for a specific level of testing. FitNesse is a lightweight testing framework that is meant to implement integration testing in a highly collaborative way, which makes it very suitable to be used within agile software projects. With Jenkins and Maven it is quite easy to trigger the execution of FitNesse integration tests automatically. When properly configured and bootstrapped, Jenkins can treat the FitNesse test results in a very similar way as it treats regular JUnit test results. Now lets suppose within a Maven project we have a FitNesse suite that contains the integration tests we want to be executed by a Jenkins job. With the Maven Failsafe Plugin and the help of some convenient FitNesse built-in JUnit utility classes this can be accomplished really easily. First of all we need to create a JUnit integration test class that will actually bootstrap the FitNesse tests. Lets says this class is named FitNesseIT. Within this class we need to instantiate a JUnitXMLTestListener and a JUnitHelper in such a way that Jenkins will automatically recognize the test results as regular JUnit test results: import fitnesse.junit.*; resultListener = new JUnitXMLTestListener("target/failsafe-reports"); jUnitHelper = new JUnitHelper(".", "target/fitnesse-reports", resultListener); The port property of the JUnitHelper does not need to be set when using the SLIM test system. However, if the FIT test system is used, this port must be set to an appropriate value as it specifies the port number of the FitServer that will be launched to execute the FIT tests. It is recommended to assign a random free available port, as it is considered a good practice to avoid using any fixed port on the executing Jenkins node: // if test system == FIT socket = new ServerSocket(0); jUnitHelper.setPort(socket.getLocalPort()); socket.close(); The debugMode property of the JUnitHelper should not be changed. It is set to true by default, which means that the SlimService or FitServer will efficiently run within the same Java process that is created by the Maven Failsafe Plugin to run the integration test. The JUnitHelper will be used to kick off the execution of the actual FitNesse tests: @Test public void assertSuitePasses() throws Exception { jUnitHelper.assertSuitePasses(suiteName); } The execution of the FitNesseIT test class itself can be triggered through the use of the Maven Failsafe Plugin. In this way the FitNesse suite will be executed automatically as part of the Maven lifecycle integration-test build phase. The FitNesseIT test class can also be executed from your IDE, which makes it really easy to actually debug the FitNesse tests by stepping through the fixture classes. Instead of instantiating a JUnitHelper ourself, we could have used the JUnit runner class FitNesseSuite and specified by annotation the actual FitNesse suite that needs to be executed as a JUnit test. However this runner class does not create the JUnit XML report files that need to be processed by Jenkins. As the JUnitXMLTestListener will already create report files for all individual FitNesse tests, there is no need to have a separate report file for the bootstrapping FitNesseIT test class itself. Therefore, the disableXmlReport configuration property of the Maven Failsafe Plugin need to be enabled. In this way the Jenkins job will only take the results of the individual FitNesse tests into account when generating its test report and trend chart. Furthermore, the system property variables TEST_SYSTEM and SLIM_PORT need to be configured appropriately: org.apache.maven.plugins maven-failsafe-plugin integration-test true slim 0 By setting the SLIM_PORT to 0, the SLIM executor will run on a random free available port, so no fixed port will be used on the executing Jenkins node. Obviously, when using FIT the TEST_SYSTEM variable must be set to fit instead of slim and the SLIM_PORT variable is not needed. Alternatively, the TEST_SYSTEM and SLIM_PORT variables can be defined with the Fitnesse define keyword: !define TEST_SYSTEM {slim} !define SLIM_PORT {0} As Jenkins automatically scans the failsafe-reports directories “**/target/failsafe-reports”, the FitNesse test results will be processed out of the box. No additional Jenkins plugins are required. The JUnitHelper also creates a nice HTML report that consist of a summary including some useful statistics as well as detailed test result pages for all executed tests. This report can be found in the “target/fitnesse-reports” directory and can be published by a post-build action with the HTML Publisher Plugin. In a continuous integration pipeline it makes sense to trigger the execution of the integration tests in an individual build step. This can be accomplished typically by activating the Maven Failsafe Plugin using a Maven profile. In this way the integration test results and unit test results are not mixed into the same reports and trend charts by Jenkins.
December 3, 2012
by Marcus Martina
· 15,802 Views · 1 Like
article thumbnail
Easy Integration Testing with Spring+Hibernate
I am guilty of not writing integration testing (At least for database related transactions) up until now. So in order to eradicate the guilt i read up on how one can achieve this with minimal effort during the weekend. Came up with a small example depicting how to achieve this with ease using Spring and Hibernate. With integration testing, you can test your DAO(Data access object) layer without ever having to deploy the application. For me this is a huge plus since now i can even test my criteria's, named queries and the sort without having to run the application. There is a property in hibernate that allows you to specify an sql script to run when the Session factory is initialized. With this, i can now populate tables with data that required by my DAO layer. The property is as follows; import.sql According to the hibernate documentation, you can have many comma separated sql scripts.One gotcha here is that you cannot create tables using the script. Because the schema needs to be created first in order for the script to run. Even if you issue a create table statement within the script, this is ignored when executing the script as i saw it. Let me first show you the DAO class i am going to test; package com.unittest.session.example1.dao; import org.springframework.transaction.annotation.Propagation; import org.springframework.transaction.annotation.Transactional; import com.unittest.session.example1.domain.Employee; @Transactional(propagation = Propagation.REQUIRED) public interface EmployeeDAO { public Long createEmployee(Employee emp); public Employee getEmployeeById(Long id); } package com.unittest.session.example1.dao.hibernate; import org.springframework.orm.hibernate3.support.HibernateDaoSupport; import com.unittest.session.example1.dao.EmployeeDAO; import com.unittest.session.example1.domain.Employee; public class EmployeeHibernateDAOImpl extends HibernateDaoSupport implements EmployeeDAO { @Override public Long createEmployee(Employee emp) { getHibernateTemplate().persist(emp); return emp.getEmpId(); } public Employee getEmployeeById(Long id) { return getHibernateTemplate().get(Employee.class, id); } } Nothing major, just a simple DAO with two methods where one is to persist and one is to retrieve. For me to test the retrieval method i need to populate the Employee table with some data. This is where the import sql script which was explained before comes into play. The import.sql file is as follows; insert into Employee (empId,emp_name) values (1,'Emp test'); This is just a basic script in which i am inserting one record to the employee table. Note again here that the employee table should be created through the hibernate auto create DDL option in order for the sql script to run. More info can be found here. Also the import.sql script in my instance is within the classpath. This is required in order for it to be picked up to be executed when the Session factory is created. Next up let us see how easy it is to run integration tests with Spring. package com.unittest.session.example1.dao.hibernate; import static org.junit.Assert.*; import org.junit.Test; import org.junit.runner.RunWith; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.test.context.ContextConfiguration; import org.springframework.test.context.junit4.SpringJUnit4ClassRunner; import org.springframework.test.context.transaction.TransactionConfiguration; import com.unittest.session.example1.dao.EmployeeDAO; import com.unittest.session.example1.domain.Employee; @RunWith(SpringJUnit4ClassRunner.class) @ContextConfiguration(locations="classpath:spring-context.xml") @TransactionConfiguration(defaultRollback=true,transactionManager="transactionManager") public class EmployeeHibernateDAOImplTest { @Autowired private EmployeeDAO employeeDAO; @Test public void testGetEmployeeById() { Employee emp = employeeDAO.getEmployeeById(1L); assertNotNull(emp); } @Test public void testCreateEmployee() { Employee emp = new Employee(); emp.setName("Emp123"); Long key = employeeDAO.createEmployee(emp); assertEquals(2L, key.longValue()); } } A few things to note here is that you need to instruct to run the test within a Spring context. We use the SpringJUnit4ClassRunner for this. Also the transction attribute is set to defaultRollback=true. Note that with MySQL, for this to work, your tables must have the InnoDB engine set as the MyISAM engine does not support transactions. And finally i present the spring configuration which wires everything up; com.unittest.session.example1.**.* org.hibernate.dialect.MySQLDialect com.mysql.jdbc.Driver jdbc:mysql://localhost:3306/hbmex1 root password true org.hibernate.dialect.MySQLDialect create import.sql That is about it. Personally i would much rather use a more light weight in-memory database such as hsqldb in order to run my integration tests. Here is the eclipse project for anyone who would like to run the program and try it out.
November 27, 2012
by Dinuka Arseculeratne
· 56,181 Views · 2 Likes
article thumbnail
Enterprise-ready Tool Support for Apache Camel
apache camel is my favorite integration framework on the java platform due to great dsls, a huge community, and so many different components. camel is used by many developers from different companies all over the world. however, most guys are not aware that some really cool and – more important – enterprise-ready tooling is available for camel, too. many people ask me about camel tooling when i do talks at conferences. this is the reason for this short blog post about camel tooling. [fyi: i work for talend (one of the vendors).] ide support camel consists of a set of normal java libraries and is therefore usable with any java ide (such as eclipse, netbeans or intellij idea) or even a classic text editor. programming dsls are available for java, groovy, and scala. even a kotlin dsl is in the works, thanks to camel’s founder james strachan. all familiar ide features such as code completion or javadoc view are available for these dsls. in the spring xml dsl, the eclipse-based springsource tool suite (sts) should be emphasized, which provides the best support for the spring framework and xml configurations. camel-specific tooling besides classical ide support, further products are available to provide additional functionality. integration problems can be modeled with the help of enterprise integration patterns (eip, http://www.eaipatterns.com/). eips are implemented by camel. visual designers are available to help modeling integration problems with these eips. these tools even generate the corresponding source code automatically. ideally, developers do not have to write any source code by hand. camel tooling is offered by talend with talend esb (http://de.talend.com/products/esb) and jboss, formerly fusesource, with fuse ide (http://fusesource.com/products/fuse-ide). both companies also provide full-time committers for the apache camel project. let’s take a short look at these two products in the following. open studio for talend esb talend esb is an eclipse-based integration platform within the talend unified platform. the familiar “look and feel” and the intuitive use of eclipse remain. the esb is open source and freely available. the paid enterprise version offers additional features and support. the esb can be used independently or in combination with other parts of the talend unified platform, such as BPM, big data, or master data management. the great benefit is that everything can be done within one suite using the same gui and concepts, based on eclipse. the entire talend unified platform is based on the “zero-coding” approach. this way, a very efficient implementation of integration problems is possible using the eips and components. routes are modeled and configured with intuitive tool support, all source code is generated. of course, custom integration logic can still be written and included, for example, pojos, spring beans, scripts in different languages, or own camel components. plenty of other components besides camel’s ones are available for talend esb – for example connectors to alfresco, jasper, sap, salesforce, or host systems. figure 1: visual designer of talend’s esb fuse ide the fuse ide is an eclipse plugin, which is installed from the eclipse update site. the visual designer (see figure 2) generates camel routes as xml code using the spring xml dsl. the generated code is editable vice-versa, i.e. the developer can change the source code. the graphical model applies changes automatically. fuse ide is intuitive to use for creating camel routes. fusesource offers some other products, which can be used in combination with fuse ide – such as management console or fuse mq for messaging. under fusesource, fuse ide was a proprietary product. however, fusesource was recently taken over by redhat (http://www.redhat.com/about/news/press-archive/2012/6/red-hat-to-acquire-fusesource) and now belongs to the jboss division. in the new roadmap, the fuse ide is still included. it will probably be integrated into the jboss enterprise soa platform and become “open sourced”. the integration of fusesource will take at least a few more months time to complete (http://www.redhat.com/promo/jboss_integration_week/). jboss now “owns” three esb products (jboss esb, switchyard and fuse esb). probably, these will be merged into one product in the end (switchyard is also based on camel). nevertheless, the fusesource products will also be supported for some time – primarily in order to satisfy existing customers (my guess). figure 2: visual designer of fuse ide (jboss, former fusesource) enterprise-ready tooling is already available for apache camel! the bottom line is that enterprise-ready tooling is already available for apache camel. it is great to see different companies working on tooling for apache camel. the winner definitely is apache camel… and there is no loser! talend esb and fuse ide are two different approaches for different kinds of projects. if you like the „zero-coding“ approach, then take a closer look at talend’s esb. it is really easy and efficient to realize integration projects without writing source code – nevertheless, there is enough flexibility for customization and adding own source code. the combination with bpm, mdm or big data (based on hadoop) is also supported within the unified platform using the same open source and „zero-coding“ concepts. if you „insist“ on writing and refactoring all source code by yourself within the text editor of an ide, then take a look at fuse ide. your best would be to try out both and see which one fits best into your next enterprise integration project. if you know any other cool camel tooling (no matter if it is enterprise-ready or not), or if you have any other feedback, please write a comment. thank you. best regards, kai wähner (twitter: @kaiwaehner) content from my blog: http://www.kai-waehner.de/blog/2012/11/23/enterprise-ready-tool-support-for-apache-camel/
November 26, 2012
by Kai Wähner DZone Core CORE
· 15,517 Views
article thumbnail
Lightweight RPC with ZeroMQ (ØMQ) and Protocol Buffers
A frequent issue I come across writing integration applications with Mule is deciding how to communicate back and forth between my front end application, typically a web or mobile application, and a flow hosted on Mule. I could use web services and do something like annotate a component with JAX-RS and expose this out over HTTP. This is potentially overkill, particularly if I only want to host a few methods, the methods are asynchronous or I don’t want to deal with the overhead of HTTP. It also could be a lot of extra effort if the only consumers of the API, at least initially, are internal facing applications. Another choice is to use “synchronous” JMS with temporary reply queues. While Mule makes this easy to do, particularly with MuleClient, I now have to deal with the overhead of spinning up a JMS infrastructure. I could also be limited to Java only clients, depending on which JMS broker I choose. The latter is particularly signifcant, as Java probably isn’t the technology of choice on the web or mobile layer. ØMQ for RPC ØMQ, or ZeroMQ, is a networking library designed from the ground up to ease integration between distributed applications. In addition to supporting a variety of messaging patterns, which are enumerated in the extremely well written guide, the library is written in platform agnostic C with wrappers for different languages like Java, Python and Ruby. These features make it a good candidate to solve the challenges I introduced above, particularly since a community contributed module for ØMQ was released recently. Let’s consider a simple service that accepts a request for a range of stock quotes and returns the results and see how we can host this service with Mule and expose it out with the ØMQ Module. Data Serialization with Protocol Buffers Data is transported back and forth over ØMQ as byte arrays. We, as such, need to decide on a way to serialize our stock quote request and responses “on the wire.” Before we do that, however, let’s take a look at the Java canonical data model we’re using on the client and server side. The following Gists show the important bits of the StockQuote and StockQuoteResponse classes. public class StockQuote implements Serializable { String symbol; Date date; Double open; Double high; Double low; Double close; Long volume; Double adjustedClose; public class StockQuoteRequest implements Serializable { String symbol; Date startDate; Date endDate; public interface StockDataService { public List getQuote(StockQuoteRequest request); } We could use Java serialization to get the objects into byte arrays. Ignoring the other deficiencies of default Java serialization, the main drawback is that it limits our clients to one’s running on a JVM. XML or JSON provide better alternatives, but for the purposes of this example we’ll assume we want a more compact representation of the data (this isn’t totally unrealistic, stock quote data can be extremely time sensitive and we probably want to minimize serialization and deserialization overhead.) Protocol Buffers provide a good middle ground and also boast a Mule Module to provide the necessary transformers we need to move back and forth from the byte array representations. Let’s define two .proto files to define the wire format and generate the intermediary stubs for serialization. package com.acmesoft.zeromq; option java_package = "com.acmesoft.stock.model.serialization.protobuf"; option optimize_for = SPEED;package com.acmesoft.zeromq; option java_package = "com.acmesoft.stock.model.serialization.protobuf"; option optimize_for = SPEED; option java_multiple_files = true; message StockQuoteResponseBuffer { repeated StockQuoteBuffer result = 1; } message StockQuoteBuffer { required string symbol = 1; required int64 date = 2; required double open = 3; required double high = 4; required double low = 5; required double close = 6; required int64 volume = 7; required double adjustedClose = 8; } option java_multiple_files = true; message StockQuoteRequestBuffer { required string symbol = 1; required int64 start = 2; required int64 end = 3; } You typically would use the “protoc” compiler to generate the Java stubs. This is tedious, however, so we’ll instead modify the pom.xml of our project to compile the protoc files during the compile goals: com.google.protobuf.tools maven-protoc-plugin /usr/local/bin/protoc compile testCompile Since we already have a domain model we’ll add some helper classes to simplify the serialization tasks on the client side. public byte[] toProtocolBufferAsBytes() { return StockQuoteRequestBuffer.newBuilder() .setSymbol(symbol) .setStart(startDate.getTime()) .setEnd(endDate.getTime()).build().toByteArray(); } public static StockQuoteRequest fromProtocolBuffer(StockQuoteRequestBuffer buffer) { StockQuoteRequest request = new StockQuoteRequest(); request.setSymbol(buffer.getSymbol()); request.setStartDate(new Date(buffer.getStart())); request.setEndDate(new Date(buffer.getEnd())); return request; } public static StockQuoteResponseBuffer toProtocolBuffer(List quotes) { StockQuoteResponseBuffer.Builder responseBuilder = StockQuoteResponseBuffer.newBuilder(); for (StockQuote quote : quotes) { responseBuilder.addResult(StockQuoteBuffer.newBuilder() .setAdjustedClose(quote.getAdjustedClose()) .setClose(quote.getClose()) .setDate(quote.getDate().getTime()) .setHigh(quote.getHigh()) .setLow(quote.getLow()) .setOpen(quote.getOpen()) .setSymbol(quote.getSymbol()) .setVolume(quote.getVolume()).build()); } return responseBuilder.build(); } public static List listOfStockQuotesFromBytes(byte[] bytes) { List buffer; try { buffer = StockQuoteResponseBuffer.parseFrom(bytes).getResultList(); } catch (InvalidProtocolBufferException e) { throw new SerializationException(e); } List quotes = new ArrayList(); for (StockQuoteBuffer stockQuoteBuffer : buffer) { StockQuote stockQuote = new StockQuote(); stockQuote.setClose(stockQuoteBuffer.getClose()); stockQuote.setDate(new Date(stockQuoteBuffer.getDate())); stockQuote.setHigh(stockQuoteBuffer.getHigh()); stockQuote.setOpen(stockQuoteBuffer.getOpen()); stockQuote.setSymbol(stockQuoteBuffer.getSymbol()); stockQuote.setVolume(stockQuoteBuffer.getVolume()); stockQuote.setAdjustedClose(stockQuoteBuffer.getAdjustedClose()); stockQuote.setLow(stockQuoteBuffer.getLow()); quotes.add(stockQuote); } return quotes; } Configuring StockDataService Now that we have a canonical data model and a wire format defined we’re ready to wire up a Mule flow to expose the service out. Note that for this to work you need to have jzmq installed locally on your system. The following dependency needs to be added to your pom.xml once its installed: org.zeromq zmq 2.2.0 /usr/local/lib/zmq.jar system Where systemPath is the location of the zmq.jar on your filesystem. Once that’s out of the way we can configure the flow, as illustrated below: The ZeroMQ inbound-endpoint will be bound to TCP port 9090 with a request-response exchange pattern. The deserialize MP in the protobuf module will deserialize the byte array to the generated StockQuoteRequestBuffer class. From there we’ll use MEL to invoke the helper method on StockQuoteRequest to transform the intermediary class to the domain model. The List of StockQuotes returned from StockDataService will be transformed by the MEL expression using the “toProtocolBuffer” helper method on the domain model. The Protocol Buffer Module is then smart enough to implicitly transform the intermediary object to a byte array for the response. Consuming the Service from the Client Side Now that the server is ready we can turn our attention to the client side code to invoke the remote service. Let’s take a look at how this works: StockQuoteRequest stockQuoteRequest = new StockQuoteRequest(); stockQuoteRequest.setSymbol("FB"); stockQuoteRequest.setStartDate(new Date( new Date().getTime() - (86400000 * 7))); stockQuoteRequest.setEndDate(new Date()); ZMQ.Socket zmqSocket = zmqContext.socket(ZMQ.REQ); zmqSocket.setReceiveTimeOut(RECEIVE_TIMEOUT); zmqSocket.connect("tcp://localhost:9090"); zmqSocket.send(stockQuoteRequest.toProtocolBufferAsBytes(), 0); List quotes = StockQuote.listOfStockQuotesFromBytes(zmqSocket.recv(0)); We start off by defining the StockQuoteRequest object to give us all the quotes for Facebook stock from the last week. We can then open up a ZMQ socket, set the timeout, connect to the ZMQ socket on the remote Mule instance and send the byte representation of the StockQuoteRequest to it. zmqSocket.recv is then used to receive the bytes back from Mule. From here we can use the listOfStockQuotesFromBytes helper method we wrote above to convert the Protocol Buffer representation to a List of StockQuotes. Despite the fair bit of plumbing we did above, this is a pretty concise bit of client side code to invoke the remote service. Conclusion This blog post only touched on the features of ØMQ and the ØMQ Mule Module. In addition to request-reply, other exchange-patterns are supported, like one-way, push and pull. This effectively gives you the benefits of a reliable, asynchronous messaging layer without a centralized infrastructure. I hope to cover this in a later post. Protocol buffers also seem like a natural fit as a wire format for ØMQ. protobuffers echo ØMQ’s principals of being lightweight, fast and platform agnostic. These are also, not coincidently, principals Mule shares as an integration framework. The project for this example is available on GitHub.
November 26, 2012
by John D'Emic
· 28,583 Views
article thumbnail
API Server Design - Making De-Normalization the Norm
In database design classes in Computer Science, we learn that normalization is a good thing. And it certainly is a good thing, for databases. In the case of APIs, it is a different story. If a client must do multiple GETs to obtain the data it needs, or multiple PUTs or POSTs to send up data, just because your database happens to be normalized, then something is wrong. One of the functions of an API Server is to de-normalize your data so that clients are spared from making extra REST API calls, with all of the overhead which goes with that. Mugunth Kumar explains this very well in this excellent presentation, using Twitter as an example. When you do a GET on a tweet, it not only returns you the Tweet itself, but also other information (e.g. description of the Twitter user who sent the tweet). This saves the API client (often a mobile app) from making another request for that data. Effectively, the API Server has gathered up that data, which may come from different database tables, and de-normalized it for the response. You can try it out yourself here, by looking at the JSON which comes back from this Twitter API GET the most recent Tweet from my timeline. Many Vordel customers are using the API Server to gather together the data which is returned to the API clients, often taking this data from multiple sources (not only databases, but also message queues and even from other APIs). This data is then amalgamated into single JSON or XML structures. It often then cached at the API Server, in this structure. In this way, clients are spared from doing multiple calls, and instead (like the Twitter API example above) get the data they need in one request, or can PUT or POST up data in one action, rather than piecemeal. De-normalization is key to this process, and is one of the great benefits of an API Server.
November 21, 2012
by Oren Eini
· 9,831 Views
article thumbnail
Spock and testing RESTful API services
Spock is a BBD testing framework that allows for easy BDD tests to be written. The framework is an extension upon JUnit which allows for easy IDE integration and using existing JUnit functionality. Spock tests are written in Groovy and can be used for writing a wide range of tests from small unit tests to full application integration tests. Without going into too much detail on how to write Spock based tests (see below for a few excellent links), lets go through how we can use the framework to build integration tests for testing a RESTful API. Our first RESTful API Test package com.wolfware.integration import groovyx.net.http.RESTClient import spock.lang.* import spock.lang.Specification import com.movideo.spock.extension.APIVersion import com.movideo.spock.extension.EnvironmentEndPoint @APIVersion(minimimApiVersion="1.0.0.0") class GetAuthenticationToken extends Specification { @EnvironmentEndPoint protected def environmentHost def "Get authentication token XML from API for valid account"() { given: "a valid account" def authenticationTokenRequestParams = ['key':"AAABBBCCC123", 'user':"[email protected]"] and: "a client to get the authentication token XML" def client = new RESTClient(environmentHost) when: "we attempt to retrieve authentication token XML" def resp = client.get(path : "/authenticate", query : authenticationTokenRequestParams) then: "we should get a valid authentication token XML response" assert resp.data.token.isEmpty() == false // lots more asserts } } As you can see, apart from the @APIVersion and @EnvironmentEndPoint annotations (these are Spock extensions as explained later), the spec is a fairly simple Spock test. This specification has a feature that, as the name suggests, gets a authentication token in XML format and validates it. Lets look at each step: Given The url parameters required to get a authentication token from the RESTful service When using the Groovy RestClient to call the RESTful service for the authentication token details Then We can assert all the details of the response. The thing I really like about Spock is the readability of the tests. From the name being a descriptive sentence rather than some short hand with _ throughout to make a valid method name to being able to easily see where setup of the test is done and then the expectations and assertions. Trying to test any environment RESTful service I've found that when trying to write integration tests, there has either been: Hard coded environment details and the code branched for each environment making it near impossible to keep code in sync as merge hell becomes the norm. Config files that define the environment are used to define environment details, again checked into each branch for each environment. Trying to follow the principles of continuous delivery, it would be great to be able to use the same code base to test against any environment. This is where Spock Extensions come into play to help us out. Spock Extensions In short Spock allows us to extend it to perform other functionality during the test life-cycle (a great post on extensions can be read on this excellent blog post). I've developed two extensions which help to make the idea of running the same test suite across different environments easier. The @EnvironmentEndPoint Extension The aim of this Spock extension is to have a placeholder variable in code that at run-time, can be defined with the environment host of the RESTful services that we want to test. package com.movideo.runtime.extension.custom import org.apache.commons.logging.Log import org.apache.commons.logging.LogFactory import org.spockframework.runtime.extension.AbstractAnnotationDrivenExtension import org.spockframework.runtime.extension.AbstractMethodInterceptor import org.spockframework.runtime.extension.IMethodInvocation import org.spockframework.runtime.model.FieldInfo import org.spockframework.runtime.model.SpecInfo /** * Spock Environment Annotation Extension */ class EnvironmentEndPointExtension extends AbstractAnnotationDrivenExtension { private static final Log LOG = LogFactory.getLog(getClass()); private static def config = new ConfigSlurper().parse(new File('src/test/resources/SpockConfig.groovy').toURL()) /** * env environment variable * * Defaults to {@code LOCAL_END_POINT} */ private static final String envString = System.getProperties().getProperty("env", config.envHost); static { LOG.info("Environment End Point [" + envString + "]") } /** * {@inheritDoc} */ @Override void visitFieldAnnotation(EnvironmentEndPoint annotation, FieldInfo field) { def interceptor = new EnvironmentInterceptor(field, envString) interceptor.install(field.parent.getTopSpec()) } } /** * * Environment Intercepter * */ class EnvironmentInterceptor extends AbstractMethodInterceptor { private final FieldInfo field private final String envString EnvironmentInterceptor(FieldInfo field, String envString) { this.field = field this.envString = envString } private void injectEnvironmentHost(target) { field.writeValue(target, envString) } @Override void interceptSetupMethod(IMethodInvocation invocation) { injectEnvironmentHost(invocation.target) invocation.proceed() } @Override void install(SpecInfo spec) { spec.setupMethod.addInterceptor this } } The EnvironmentEndPointExtension class defines the following: config: is a ConfigSlurper that parses a config file 'SpockConfig.groovy' that is used to define the default environment host (envHost) envString: gets the value of 'env' from all System Properties (these include run-time properties) and defaults to config.envHost With the environment host able to be accessed by Spock, now we need to inject this into the placeholder variable for Spock tests to access. An interceptor is created which is used to inject(field.writeValue method) the value of the environment host into the placeholder variable. This placeholder is the one that the @EnvironmentEndPoint is annotating. When the test is run, the interceptor sets the placeholder variable and the test can then use this value as the host for the RestClient object. When running the Spock tests either the default value from the config file will be used or the JVM argument -Denv=? can be used. This makes running the same test code base against any environment so much easier. A note on Gradle builds. By default, Gradle will not pass through JVM arguments through to forked processes such as running tests. The code snippet below shows how to achieve this: /* * Required to pass all system properties to Test tasks. * Not default for Gradle to pass system properties through to forked processes. */ tasks.withType(Test) { def config = new ConfigSlurper().parse(new File('src/test/resources/SpockConfig.groovy').toURL()) systemProperty 'env', System.getProperty('env', config.envHost) } This allows all tasks that are a type of 'Test' to have some custom code run. In this case, we are defining the 'SpockConfig.groovy' config file and then setting 'systemPropery' within Gradle Test tasks to 'env' and either getting the value from the passed in JVM argument or from the config file. With this code in the build.gradle, we're able to run all tests via a Gradle test build, which will produce lovely test reports (in Gradle HTML and JUnit XML). The @APIVersion Extension Another integration testing problem I've found is that if we try and develop our tests first (or at least during the process of developing a feature or bug fix) that running the same tests against an environment that doesn't yet have the new code base (but we are using the same test code base everywhere), we'll have failing tests that aren't really failures as the new code isn't there yet. To help solve this problem, I've developed the @APIVersion extension to help with this issue. As newly developed code should be deployed with a new version, we can use this version to compare to a minimum version that a test can be run against. package com.movideo.runtime.extension.custom import groovyx.net.http.RESTClient import java.lang.annotation.Annotation import java.util.regex.Pattern import org.apache.commons.logging.Log import org.apache.commons.logging.LogFactory import org.spockframework.runtime.extension.AbstractAnnotationDrivenExtension import org.spockframework.runtime.model.FeatureInfo import org.spockframework.runtime.model.SpecInfo /** * API Version Extension * */ class APIVersionExtension extends AbstractAnnotationDrivenExtension { /** * Logger */ private static final Log LOG = LogFactory.getLog(getClass()); /** * */ private static def config = new ConfigSlurper().parse(new File('src/test/resources/SpockConfig.groovy').toURL()) /** * env environment variable * * Defaults to {@code LOCAL_END_POINT} */ private static final String envString = System.getProperties().getProperty("env", config.envHost); /** * Version REGX pattern */ private static final def VERSION_PATTERN = Pattern.compile(".", Pattern.LITERAL); /** * Max version length */ private static final def MAX_VERSION_LENGTH = 4; /** * Current API Version */ private static final def CURRENT_API_VERSION = getDeployedAPIVersion(); /** * {@inheritDoc} */ @Override void visitFeatureAnnotation(APIVersion annotation, FeatureInfo feature) { if(!isApiVersionGreaterThanMinApiVersion(annotation, feature.name)) { feature.setSkipped(true) } } /** * {@inheritDoc} */ @Override public void visitSpecAnnotation(APIVersion annotation, SpecInfo spec) { if(!isApiVersionGreaterThanMinApiVersion(annotation, spec.name)) { spec.setSkipped(true) } } /** * Get the current deployed API version * * Performs a HTTP request to the current deployed API version. Parses the returned data and get the {@code version} node data. * @return current deployed API version */ private static String getDeployedAPIVersion() { def apiVersion = null try { def client = new RESTClient(envString) def resp = client.get(path : config.versionServiceUri) apiVersion = resp.data.version LOG.info("Current deployed API version [" + apiVersion + "]"); } catch (ex) { APIVersionError apiVersionError = new APIVersionError("Error occurred attempting to get current deployed API version from %s", envString + config.versionServiceUri); apiVersionError.setStackTrace(ex.stackTrace); throw apiVersionError; } return apiVersion } * @param annotation * @param infoName * @return */ private boolean isApiVersionGreaterThanMinApiVersion(APIVersion annotation, String infoName) { def isApiVersionGreaterThanMinApiVersion = true def minApiVersionRequired = annotation.minimimApiVersion(); // normalise both version id's def apiVersionNormalised = normaliseVersion(CURRENT_API_VERSION); def minApiVersionRequiredNormalised = normaliseVersion(minApiVersionRequired); // compare version id's int cmp = apiVersionNormalised.compareTo(minApiVersionRequiredNormalised); // if the comparison is less than 0, min API version is greater than the deployed API version if(cmp < 0) { LOG.info("min api version [" + minApiVersionRequired + "] greater than api version [" + CURRENT_API_VERSION + "], skipping [" + infoName + "]") isApiVersionGreaterThanMinApiVersion = false } return isApiVersionGreaterThanMinApiVersion } * @param version * @return */ private String normaliseVersion(String version) { String[] split = VERSION_PATTERN.split(version); StringBuilder sb = new StringBuilder(); for (String s : split) { sb.append(String.format("%" + MAX_VERSION_LENGTH + 's', s)); } return sb.toString(); } } The @APIVersion extension defines the same environment config as the @EnvironmentEndPoint extension does so that the environment can be injected and used purely for accessing the API version endpoint without the need for @EnvironmentEndPoint. The RESTful API version endpoint is required to be setup and publicly available. The @APIVersion extension will call this service to get details about the version of RESTful API. The version response data should be as follows: Media API 1.51.1 The @APIVersion extension will look for the version data to define what the current deployed version of the RESTful API is. Once the version of the RESTful API is known, the extension then checks the minimum API version required. Example @APIVersion(minimimApiVersion="1.0.0.0") The extension then uses this value to compare against the response data version and if the required version is greater than that of the deployed RESTful API services, then the test is skipped. This extension annotation can be placed on Specification's or Feature's allowing whole Specs to have a minimum version and / or Features to have their own minimum version. This extension has made writing integration tests with Spock even more portable and allows for a 'build once' set of tests that can be run against any environment, with some small changes to allow getting the API version. The SpockConfig.groovy file Here is an example of the SpockConfig.groovy config file used to configure defaults for both @EnvironmentEndPoint and @APIVersion extensions. versionServiceUri="/public/serviceInformation" envHost="http://api.preview.movideo.com" The 'versionServiceUri' is required for @APIVersion extension as the URI for the RESTful API version The 'envHost' is required for both @APIVersion and @EnvironmentEndPoint extensions as the host of the RESTful API Go and start testing Hopefully these Spock extensions might help your Spock integration tests. The framework is really easy and fun to use to build essential tests for the whole test stack. Checkout my GitHub projects for the code for both extensions. Hope this post has been helpful and hopefully I'll post something sooner for my next post. References and really helpful links Spock Homepage Annotation Driven Extensions With Spock
November 14, 2012
by Christian Strzadala
· 39,888 Views · 1 Like
  • Previous
  • ...
  • 255
  • 256
  • 257
  • 258
  • 259
  • 260
  • 261
  • 262
  • 263
  • 264
  • ...
  • Next
  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook
×