Microservices Resources

The Latest Microservices Topics

Please find below a sample that demonstrates the nested iterator. The following is the source of the configuration. 15000 $1 Use the following request with SoapUI to test the above sample. SUN9 SUN10 In the second iterate mediator, the messages retrieved from the first iterator are iterated again into smaller messages, as below. SUN10

February 23, 2013

by Achala Chathuranga Aponso

· 7,453 Views

When to use Aspect Oriented Architecture (AOA/AOD)

When is it appropriate to use aspect oriented architecture? I think the only honest answer to this question is that it depends on the context in which the question is being asked. There really are no hard and fast rules regarding the selection of an architectural model(s) for a project because each model has both benefits and drawbacks. Every system is built with unique requirements and constraints. This context will dictate when to use one type of architecture over another or in conjunction with others.

To me, aspect oriented architecture models should be a sub-phase in the architectural modeling and design process, especially when creating enterprise level models. Personally, I like to use this approach to create a base architectural model that is defined by non-functional requirements and system quality attributes. This general model can then be used as a starting point for additional models because it targets all of the key business quality attributes required by the system.

Aspect oriented architecture is a method for modeling the non-functional requirements and quality attributes of a system, known as aspects. These models do not deal directly with specific functionality. They do categorize functionality of the system. This approach allows a system to be created with a strong emphasis on separating system concerns into individual components. These cross-cutting components enable a system to be built with its non-functional requirements and quality attributes compartmentalized. This reduces the amount of code because each component maintains one aspect of the system that can be called by other aspects. This approach also allows for a much cleaner and smaller code base during the implementation and support of a system.
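As a rough illustration of a cross-cutting component, the Python sketch below treats logging and error handling as two aspects that are written once and then applied to any function. The names audit_log, logged, errors_to_default and load_page are hypothetical and do not come from any framework; this is only one way such concerns could be separated.

```python
import functools

# Hypothetical in-memory sink standing in for a real logging component.
audit_log = []


def logged(func):
    """Logging aspect: records entry and exit of the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        audit_log.append(f"enter {func.__name__}")
        try:
            return func(*args, **kwargs)
        finally:
            audit_log.append(f"exit {func.__name__}")
    return wrapper


def errors_to_default(default):
    """Error-handling aspect: records the failure and returns a fallback value."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                audit_log.append(f"error in {func.__name__}: {exc}")
                return default
        return wrapper
    return decorate


# The business function contains no logging or error-handling code of its own.
@logged
@errors_to_default(default=None)
def load_page(page_id):
    if page_id < 0:
        raise ValueError("unknown page")
    return f"page {page_id}"


print(load_page(1))
print(load_page(-1))
print(audit_log)
```

The point of the design is that load_page only states what it does; the two concerns are attached from outside and can be reused by every other function in the system.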
Additionally, when developers build systems on an aspect-oriented design, projects will be completed faster and will be more reliable because existing components can be shared across a system; thus, the time needed to create and test the functionality is reduced.

Example of an effective use of Aspect Oriented Architecture

In my experience, aspect oriented architecture can be very effective with large or more complex systems. Typically, these types of systems have a large number of concerns, so the act of defining them is very beneficial for reducing the system’s complexity because components can be developed to address each concern while exposing functionality to the other system components. The benefit of using the aspect oriented approach as the starting point for a system is that it promotes communication between IT and the business: aspect oriented models focus on quality attributes, so not much technical knowledge is needed to understand the model. An example of this is the development of a new intranet website.

Common intranet concerns: error handling, security, logging, notifications, and database connectivity.

Example of a not as effective use of Aspect Oriented Architecture

Again, in my experience, aspect oriented architecture is not as effective with small or less complex systems. There is no need to model concerns for a system that has a limited number of them because the added overhead would not be justified by the actual benefits of creating the aspect oriented architecture model. Furthermore, these types of projects typically have a reduced time schedule and a limited budget. The creation of the aspect oriented models would increase the overhead of a project and thus increase the time needed to implement the system. An example of this is seen by creating a small application to poll a network share for new files and then FTP them to a new location.
The two primary concerns for this project are to monitor a network drive and to FTP files to a new location. There is no need to create an aspect model for this system because there will never be a need to share functionality between these concerns. To add to my point, this system is so small that it could be created with just a few classes, so the added layer of componentizing the concerns would be complete overkill for this situation.

References: Brichau, Johan; D'Hondt, Theo. (2006) Aspect-Oriented Software Development (AOSD) - An Introduction. Retrieved from: http://www.info.ucl.ac.be/~jbrichau/courses/introductionToAOSD.pdf

February 21, 2013

by Todd Merritt

· 9,578 Views

Apache Camel Meets Redis

The Lamborghini of Key-Value stores

Camel is the best of breed integration framework, and in this post I'm going to show you how to make it even more powerful by leveraging another great project - Redis. Camel 2.11 is on its way to be released soon with lots of new features, bug fixes and components. A couple of these new components are authored by me, redis-component being my favourite one. Redis - a light key/value store - is an amazing piece of Italian software designed for speed (same as Lamborghini - a two-seater Italian car designed for speed). Written in C and having an in-memory, closer-to-the-metal nature, Redis performs extremely well (Lamborghini's motto is "Closer to the Road"). Redis is often referred to as a data structure server since keys can contain strings, hashes, lists and sorted sets. A fast and light data structure server is like a super sports car for software engineers - it just flies. If you want to find out more about Redis' and Lamborghini's unique performance characteristics, google around and you will see for yourself.

Getting started with Redis is easy: download, make, and start a redis-server. After these steps, you are ready to use it from your Camel application. The component internally uses Spring Data, which in turn uses the Jedis driver, but with the possibility to switch to other Redis drivers. Here are a few use cases where the camel-redis component is a good fit:

Idempotent Repository

The term idempotent is used in mathematics to describe a function that produces the same result if it is applied to its own result, that is, f(f(x)) = f(x). In messaging, this concept translates into a message that has the same effect whether it is received once or multiple times.
In Camel this pattern is implemented using the IdempotentConsumer class, which uses an Expression to calculate a unique message ID string for a given message exchange; this ID can then be looked up in the IdempotentRepository to see if it has been seen before; if it has, the message is consumed (dropped as a duplicate); if it has not, the message is processed and the ID is added to the repository. RedisIdempotentRepository uses a set structure to store and check for existing IDs.

${in.body.id}

Caching

One of the main uses of Redis is as an LRU cache. It can store data in memory like Memcached, or it can be tuned to be durable by flushing data to a log file that can be replayed if the node restarts. The various policies applied when maxmemory is reached allow creating caches for specific needs:

volatile-lru: remove a key among the ones with an expire set, trying to remove keys not recently used.
volatile-ttl: remove a key among the ones with an expire set, trying to remove keys with a short remaining time to live.
volatile-random: remove a random key among the ones with an expire set.
allkeys-lru: like volatile-lru, but will remove every kind of key, both normal keys and keys with an expire set.
allkeys-random: like volatile-random, but will remove every kind of key, both normal keys and keys with an expire set.

Once your Redis server is configured with the right policies and running, the operations you need are SET and GET:

SET keyOne valueOne

Inter-app pub/sub with Redis

Camel has various components for interacting between routes:

direct: provides direct, synchronous invocation in the same Camel context.
seda: asynchronous behavior, where messages are exchanged on a BlockingQueue, again in the same Camel context.
vm: asynchronous behavior like seda, but also supports communication across CamelContexts as long as they are in the same JVM.

Complex applications usually consist of more than one standalone Camel instance running on separate machines.
For this kind of scenario, Camel provides jms, activemq, and a combination of AWS SNS with SQS for messaging between instances. Redis has a simpler solution for the Publish/Subscribe messaging paradigm. Subscribers subscribe to one or more channels, by specifying the channel names or by using pattern matching to receive messages from multiple channels. Then the publisher publishes the messages to a channel, and Redis makes sure they reach all the matching subscribers.

PUBLISH testChannel Test Message

Other usages

Guaranteed Delivery: Camel supports this EIP using JMS, File, JPA and a few other components. Here Redis can be used as a lightweight persistent key-value store with its transaction support.

The Claim Check from the EIP patterns allows you to replace message content with a claim check (a unique key), which can be used to retrieve the message content at a later time. The message content can be stored temporarily in Redis.

Redis is also very popular for implementing counters, leaderboards, tagging systems and many more functionalities. Now, with two Swiss Army knives under your belt, the integrations to make are limited only by your imagination.
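To make the idempotent repository use case described earlier more concrete, here is a minimal Python sketch of the idea, not of Camel's actual classes. A plain in-memory set stands in for the Redis set, and the names InMemoryIdempotentRepository and idempotent_consumer are hypothetical. The add method is assumed to answer "was this ID new?", which is the same question a Redis SADD reply answers.

```python
class InMemoryIdempotentRepository:
    """Hypothetical stand-in for a set-backed repository of seen message IDs."""

    def __init__(self):
        self._seen = set()

    def add(self, message_id):
        """Record the ID; return True only if it had not been seen before."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True


def idempotent_consumer(messages, repository, message_id, process):
    """Process each message at most once, keyed by message_id(message)."""
    for message in messages:
        if repository.add(message_id(message)):
            process(message)
        # Otherwise the message is a duplicate and is silently dropped.


# Usage: the second delivery of id 1 is filtered out.
processed = []
incoming = [
    {"id": 1, "body": "order created"},
    {"id": 2, "body": "order paid"},
    {"id": 1, "body": "order created"},
]
idempotent_consumer(
    incoming,
    InMemoryIdempotentRepository(),
    message_id=lambda m: m["id"],
    process=processed.append,
)
print([m["id"] for m in processed])
```

Checking and recording the ID in a single add call, rather than a separate contains followed by add, is a deliberate choice in this sketch: with a shared store it avoids two consumers both deciding that the same ID is new.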

February 20, 2013

by Bilgin Ibryam

· 10,927 Views

CPU Cache Flushing Fallacy

Even from highly experienced technologists I often hear talk about how certain operations cause a CPU cache to "flush". This seems to illustrate a very common fallacy about how CPU caches work, and how the cache sub-system interacts with the execution cores. In this article I will attempt to explain the function CPU caches fulfil, and how the cores, which execute our programs of instructions, interact with them. For a concrete example I will dive into one of the latest Intel x86 server CPUs. Other CPUs use similar techniques to achieve the same ends.

Most modern systems that execute our programs are shared-memory multi-processor systems in design. A shared-memory system has a single memory resource that is accessed by 2 or more independent CPU cores. Latency to main memory is highly variable, from 10s to 100s of nanoseconds. Within 100ns a 3.0GHz CPU runs through 300 cycles, and each Sandy Bridge core is capable of retiring up to 4 instructions-per-cycle (IPC) in parallel, so it is possible to process up to 1200 instructions in that time. CPUs employ cache sub-systems to hide this latency and allow them to exercise their huge capacity to process instructions. Some of these caches are small, very fast, and local to each core; others are slower, larger, and shared across cores. Together with registers and main-memory, these caches make up our non-persistent memory hierarchy.

Next time you are developing an important algorithm, try pondering that a cache-miss is a lost opportunity to have executed ~500 CPU instructions! This is for a single-socket system; on a multi-socket system you can effectively double the lost opportunity as memory requests cross socket interconnects.

Memory Hierarchy

Figure 1.

For the circa 2012 Sandy Bridge E class servers our memory hierarchy can be decomposed as follows:

Registers: Within each core are separate register files containing 160 entries for integers and 144 for floating point numbers.
These registers are accessible within a single cycle and constitute the fastest memory available to our execution cores. Compilers will allocate our local variables and function arguments to these registers. When hyperthreading is enabled these registers are shared between the co-located hyperthreads.

Memory Ordering Buffers (MOB): The MOB comprises a 64-entry load buffer and a 36-entry store buffer. These buffers are used to track in-flight operations while waiting on the cache sub-system. The store buffer is a fully associative queue that can be searched for existing store operations, which have been queued when waiting on the L1 cache. These buffers enable our fast processors to run asynchronously while data is transferred to and from the cache sub-system. When the processor issues asynchronous reads and writes then the results can come back out-of-order. The MOB is used to disambiguate the ordering for compliance to the published memory model.

Level 1 Cache: The L1 is a core-local cache split into separate 32K data and 32K instruction caches. Access time is 3 cycles and can be hidden as instructions are pipelined by the core for data already in the L1 cache.

Level 2 Cache: The L2 cache is a core-local cache designed to buffer access between the L1 and the shared L3 cache. The L2 cache is 256K in size and acts as an effective queue of memory accesses between the L1 and L3. L2 contains both data and instructions. L2 access latency is 12 cycles.

Level 3 Cache: The L3 cache is shared across all cores within a socket. The L3 is split into 2MB segments each connected to a ring-bus network on the socket. Each core is also connected to this ring-bus. Addresses are hashed to segments for greater throughput. Latency can be up to 38 cycles depending on cache size. Cache size can be up to 20MB depending on the number of segments, with each additional hop around the ring taking an additional cycle.
The L3 cache is inclusive of all data in the L1 and L2 for each core on the same socket. This inclusiveness, at the cost of space, allows the L3 cache to intercept requests, thus removing the burden from the private core-local L1 & L2 caches.

Main Memory: DRAM channels are connected to each socket with an average latency of ~65ns for socket local access on a full cache-miss. This is however extremely variable, being much less for subsequent accesses to columns in the same row buffer, through to significantly more when queuing effects and memory refresh cycles conflict. 4 memory channels are aggregated together on each socket for throughput, and to hide latency via pipelining on the independent memory channels.

NUMA: In a multi-socket server we have non-uniform memory access. It is non-uniform because the required memory may be on a remote socket, incurring an additional 40ns hop across the QPI bus. Sandy Bridge is a major step forward for 2-socket systems over Westmere and Nehalem. With Sandy Bridge the QPI limit has been raised from 6.4GT/s to 8.0GT/s, and two lanes can be aggregated, thus eliminating the bottleneck of the previous systems. For Nehalem and Westmere the QPI link is only capable of ~40% of the bandwidth that could be delivered by the memory controller for an individual socket. This limitation made accessing remote memory a choke point. In addition, the QPI link can now forward pre-fetch requests, which previous generations could not.

Associativity Levels

Caches are effectively hardware based hash tables. The hash function is usually a simple masking of some low-order bits for cache indexing. Hash tables need some means to handle a collision for the same slot. The associativity level is the number of slots, also known as ways or sets, which can be used to hold a hashed version of an address. Having more levels of associativity is a trade-off between storing more data vs. power requirements and time to search each of the ways.
For Sandy Bridge the L1 and L2 are 8-way and the L3 is 12-way associative.

Cache Coherence

With some caches being local to cores, we need a means of keeping them coherent so all cores can have a consistent view of memory. The cache sub-system is considered the "source of truth" for mainstream systems. If memory is fetched from the cache it is never stale; the cache is the master copy when data exists in both the cache and main-memory. This style of memory management is known as write-back, whereby data in the cache is only written back to main-memory when the cache-line is evicted because a new line is taking its place. An x86 cache works on blocks of data that are 64-bytes in size, known as a cache-line. Other processors can use a different size for the cache-line. A larger cache-line size reduces effective latency at the expense of increased bandwidth requirements.

To keep the caches coherent the cache controller tracks the state of each cache-line as being in one of a finite number of states. The protocol Intel employs for this is MESIF; AMD employs a variant known as MOESI. Under the MESIF protocol each cache-line can be in 1 of the 5 following states:

Modified: Indicates the cache-line is dirty and must be written back to memory at a later stage. When written back to main-memory the state transitions to Exclusive.
Exclusive: Indicates the cache-line is held exclusively and that it matches main-memory. When written to, the state then transitions to Modified. To achieve this state a Request-For-Ownership (RFO) message is sent which involves a read plus an invalidate broadcast to all other copies.
Shared: Indicates a clean copy of a cache-line that matches main-memory.
Invalid: Indicates an unused cache-line.
Forward: Indicates a specialised version of the shared state i.e. this is the designated cache which should respond to other caches in a NUMA system.
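The state descriptions above can be read as a small state machine. The Python sketch below encodes only the transitions the list states, plus the standard rule that a remote Request-For-Ownership invalidates local copies. It is a deliberately simplified model with hypothetical function names, and it ignores the messages, data transfers and Forward hand-off that real hardware performs.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"
    FORWARD = "F"


def on_local_write(state):
    """Return (new_state, rfo_needed) when the owning core writes to the line.

    A line already held Modified or Exclusive can be written immediately.
    In any other state the core must first issue an RFO so that all other
    copies are invalidated.
    """
    if state in (State.MODIFIED, State.EXCLUSIVE):
        return State.MODIFIED, False
    return State.MODIFIED, True


def on_write_back(state):
    """A dirty line written back to main-memory becomes Exclusive (per the list above)."""
    if state is State.MODIFIED:
        return State.EXCLUSIVE
    return state


def on_remote_rfo(state):
    """Another core's RFO invalidates this copy (supplying dirty data is not modelled)."""
    return State.INVALID


# Usage: a Shared line is written locally, written back, then claimed by another core.
state, rfo_needed = on_local_write(State.SHARED)
print(state, rfo_needed)
state = on_write_back(state)
print(state)
state = on_remote_rfo(state)
print(state)
```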
To transition from one state to another, a series of messages is sent between the caches to effect state changes. Prior to Nehalem for Intel, and Opteron for AMD, this cache coherence traffic between sockets had to share the memory bus, which greatly limited scalability. These days the memory controller traffic is on a separate bus. The Intel QPI, and AMD HyperTransport, buses are used for cache coherence between sockets.

The cache controller exists as a module within each L3 cache segment that is connected to the on-socket ring-bus network. Each core, L3 cache segment, QPI controller, memory controller, and integrated graphics sub-system is connected to this ring-bus. The ring is made up of 4 independent lanes for: request, snoop, acknowledge, and 32-bytes data per cycle. The L3 cache is inclusive in that any cache-line held in the L1 or L2 caches is also held in the L3. This provides for rapid identification of the core containing a modified line when snooping for changes. The cache controller for the L3 segment keeps track of which core could have a modified version of a cache-line it owns.

If a core wants to read some memory, and it does not have it in a Shared, Exclusive, or Modified state, then it must make a read on the ring bus. It will then either be read from main-memory if not in the cache sub-systems, or read from L3 if clean, or snooped from another core if Modified. In any case the read will never return a stale copy from the cache sub-system; it is guaranteed to be coherent.

Concurrent Programming

If our caches are always coherent then why do we worry about visibility when writing concurrent programs? This is because within our cores, in their quest for ever greater performance, data modifications can appear out-of-order to other threads. There are 2 major reasons for this. Firstly, our compilers can generate programs that store variables in registers for relatively long periods of time for performance reasons, e.g.
variables used repeatedly within a loop. If we need these variables to be visible across cores then the updates must not be register allocated. This is achieved in C by qualifying a variable as "volatile". Beware that C/C++ volatile is inadequate for telling the compiler to order other instructions. For this you need fences/barriers.

The second major issue with ordering we have to be aware of is that a thread could write a variable and then, if it reads it shortly after, could see the value in its store buffer, which may be older than the latest value in the cache sub-system. This is never an issue for algorithms following the Single Writer Principle but is an issue for the likes of the Dekker and Peterson lock algorithms. To overcome this issue, and ensure the latest value is observed, the thread must wait for the store buffer to drain on that core. This can be achieved by issuing a fence instruction. The write of a volatile variable in Java, in addition to never being register allocated, is accompanied by a full fence instruction. This fence instruction on x86 has a significant performance impact by preventing progress on the issuing thread until the store buffer is drained. Fences on other processors can have more efficient implementations that simply put a marker in the store buffer for the search boundary, e.g. the Azul Vega does this.

If you want to ensure memory ordering across Java threads when following the Single Writer Principle, and avoid the store fence, it is possible by using the j.u.c.Atomic(Int|Long|Reference).lazySet() method, as opposed to setting a volatile variable.

The Fallacy

Let us return to the fallacy of "flushing the cache" as part of a concurrent algorithm. I think we can safely say that we never "flush" the CPU cache within our user space programs.
I believe the source of this fallacy is the need to flush, mark or drain to a point, the store buffer for some classes of concurrent algorithms so the latest value can be observed on a subsequent load operation. For this we require a memory ordering fence and not a cache flush.

Another possible source of this fallacy is that L1 caches, or the TLB, may need to be flushed based on address indexing policy on a context switch. ARM, prior to ARMv6, did not use address space tags on TLB entries, thus requiring the whole L1 cache to be flushed on a context switch. Many processors require the L1 instruction cache to be flushed for similar reasons; in many cases this is simply because instruction caches are not required to be kept coherent. The bottom line is that context switching is expensive, and a bit off topic here: in addition to the cache pollution of the L2, a context switch can also cause the TLB and/or L1 caches to require a flush. Intel x86 processors require only a TLB flush on context switch.
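As a footnote to the Associativity Levels section, the "simple masking of some low-order bits" can be worked through with the figures quoted in this article: a 32K L1 data cache, 8 ways, and 64-byte cache-lines. The Python sketch below is an idealised model under the assumption that the set index is taken directly from the address bits above the line offset; real hardware may index with virtual or hashed bits, and the function name split_address is hypothetical.

```python
LINE_SIZE = 64            # bytes per cache-line on x86
L1D_SIZE = 32 * 1024      # 32K L1 data cache
WAYS = 8                  # 8-way associative

# Each set holds WAYS lines, so the cache has this many sets.
NUM_SETS = L1D_SIZE // (LINE_SIZE * WAYS)


def split_address(addr):
    """Split an address into (tag, set_index, offset) for this idealised cache."""
    offset = addr & (LINE_SIZE - 1)             # byte within the cache-line
    set_index = (addr // LINE_SIZE) % NUM_SETS  # which set the line hashes to
    tag = addr // (LINE_SIZE * NUM_SETS)        # identifies the line within its set
    return tag, set_index, offset


print(NUM_SETS)
print(split_address(0x12345))

# Addresses exactly LINE_SIZE * NUM_SETS bytes apart collide in the same set,
# so a ninth such line would have to evict one of the eight ways.
stride = LINE_SIZE * NUM_SETS
print([split_address(0x12345 + i * stride)[1] for i in range(3)])
```

Under these assumptions the cache has 64 sets, so any two addresses 4096 bytes apart compete for the same 8 ways, which is one reason power-of-two strides can behave badly.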

February 15, 2013

by Martin Thompson

· 11,554 Views · 3 Likes

The Saga Pattern and That Architecture vs. Design Thing

It has been a few months since SOA Patterns was published and so far the book has sold somewhere between 2K-3K copies, which I guess is not bad for an unknown author. So first off, thanks to all of you who bought a copy (by the way, if you found the book useful I’d be grateful if you could also rate it on Amazon so that others would know about it too). I know at least a few of you actually read the book, as from time to time I get questions about it :). Not all the questions are interesting to “the general public” but some are. One interesting question I got is about the so-called “Canonical schema pattern”. I have a post in the making (for too long now, sorry about that Bill) that explains why I don’t consider it a pattern and why I think it verges on being an anti-pattern. Another question I got more recently, which is also the subject of this post, was about the Saga pattern. Here is (most of) the email I got from Ashic:

“Garcia-Molina’s paper focuses on failure management and compensation so as to prevent partial success. It discusses a variety of approaches – with an SEC, with application code outside of the database, backward-forward and even forward-only (the latter having no “compensate” step per activity, rather a forward flow that takes care of the partial success). Nowadays, I see two viewpoints regarding sagas:

1. People calling process managers sagas, which is obviously incorrect. [e.g. NServiceBus "sagas".]
2. People focusing very strongly on a “context” of work, whereby the context gets passed around from activity to activity. For linear up front workflows, routing slips are an easy solution. An example of this can be found at Clemens’s post here: http://vasters.com/clemensv/2012/09/01/Sagas.aspx . For more complicated workflows, graph-like slips may be used.

After discussing with some enthusiasts, they seem very keen to suggest that the context has to move along. They seem to reject the notion of a saga where a central coordinator controls the process.
In other words, even if a process manager takes care of only routing messages, and that routing includes compensations to alleviate partial successes, they are unwilling (sometimes vehemently) to call that a saga. They acknowledge it can be useful, but say that is not a saga. I find this to be confusing. In this case the process manager acts as the SEC would in a Garcia-Molina saga capable database. This approach still allows interleaved transactions (or steps) without a global lock. Why would this not be a saga? In your book, I did see you mentioned orchestration as a way of implementing sagas. However, when this was brought up, the proponents of point 2 suggest that that is not what you really mean. To me it seems quite clear, and it aligns with Hector’s paper. I just want to make sure I have this right. I’d love your thoughts on this.”

Let’s start with the answer to the question: When I think about the Saga pattern I see it as the application of the notions in the Garcia-Molina paper (which talked about databases) to SOA. In other words, I see sagas as the notion of getting distributed agreement on a process with reduced guarantees (vs. distributed transactions that offer ACID guarantees across systems). So, basically, a Saga is a loose transaction-like flow where, in case of failures, the involved services perform compensation steps (which may be nothing, a complete undo or something else entirely). The Saga pattern can augment this process with temporary promises (which I call reservations). Under this definition both centrally managed processes and “choreographed” processes are Sagas, as long as the semantics and intent mentioned above are kept.
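To illustrate the definition above, here is a minimal Python sketch of the centrally managed variant: a coordinator runs the steps in order and, on the first failure, runs the compensations of the steps that already completed, in reverse order. The function run_saga and the booking steps are hypothetical, and a real implementation would also have to persist its progress and cope with compensations that fail.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps; compensate completed ones on failure.

    Returns (succeeded, log) where log lists what happened in order.
    """
    completed = []
    log = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception as exc:
            log.append(f"failed {name}: {exc}")
            # Undo the steps that did succeed, most recent first.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated {done_name}")
            return False, log
        log.append(f"done {name}")
        completed.append((name, compensation))
    return True, log


# Usage: a seat reservation succeeds, the payment fails, the seat is released.
booking = {"seat_reserved": False}


def reserve_seat():
    booking["seat_reserved"] = True


def release_seat():
    booking["seat_reserved"] = False


def charge_card():
    raise RuntimeError("card declined")


def no_compensation():
    pass


succeeded, events = run_saga([
    ("reserve seat", reserve_seat, release_seat),
    ("charge card", charge_card, no_compensation),
])
print(succeeded, events, booking)
```

No lock is held across the steps here; the seat really is reserved for a while, which corresponds to the temporary promise (reservation) mentioned above.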
The centrally managed orchestration provides visibility of processes, ease of management etc. The cooperative, event based, context shared sagas provide flexibility and allow serendipity of new processes. Both have merit and both have a place, at least in my opinion :)

The main reason both of these very different approaches are valid designs and implementations for the Saga pattern is that the Saga pattern (like others in the book) is an Architectural pattern and not a Design pattern. Which brings us to the second reason for this post, the difference between “Architecture” and “Design”. In a nutshell, architecture is a type of design where the focus is quality attributes and wide(r) scope, whereas design focuses on functional requirements and more localized concerns. The Saga pattern is an architectural pattern that focuses on the integrity and reliability quality attributes, and it pertains to the communication patterns between services. When it comes to designing the implementation of the pattern, you need to decide how to implement the concerns and roles defined in the pattern, e.g. controlling the flow and the status of the saga. One decision can be to implement it centrally and use orchestration; another decision can be to decentralize it and use context…

Design decisions can be so significant that sometimes it is hard to find what’s left of the architecture. Consider, for example, the whole idea behind blogging and RSS feeds. The architectural notion is a publish/subscribe system where the blog writer publishes an “event” (a new post) and subscribers get a copy. When it came to design and implementation, considering it was implemented on top of HTTP and REST, where there is no publish/subscribe capability, it was actually designed as a pull system where the publisher provides a list of recent changes (the feed) and subscribers sample it and check if anything changed since the last time.
So architecturally it is pub/sub, while by design it is a pull from a centralized server that exposes the latest changes: a really big difference.

Does it matter at all? I think yes. Architecture lets us think about the system at a higher level of abstraction and thus tackle more complex systems. When we design and focus on more local issues we can tackle the nitty gritty details and make sure things actually work. We need to check the effects of design on architecture and vice versa to make sure the whole thing sticks together and actually does what we want/need. Note that architecture and design are not the complete story. Another variable is the technology (e.g. HTTP in the example above), which affects the design decisions and thus also the architecture (you can read a little more about it in my posts on SAF).

February 1, 2013

by Arnon Rotem-gal-oz

· 9,850 Views

How to Publish Maven Site Docs to BitBucket or GitHub Pages

In this post we will utilize GitHub and/or BitBucket's static web page hosting capabilities to publish our project's Maven 3 Site Documentation. Each of the two SCM providers offers a slightly different solution to host static pages. The approach spelled out in this post would also be a viable solution to "backup" your site documentation in a supported SCM like Git or SVN. This solution does not directly cover site documentation deployment covered by the maven-site-plugin and the Wagon library (scp, WebDAV or FTP). There is one main project hosted on GitHub that I have posted with the full solution. The project URL is https://github.com/mike-ensor/clickconcepts-master-pom/. The POM has been pushed to Maven Central (groupId com.clickconcepts.project, artifactId master-site-pom, version 0.16) and will continue to be updated and maintained.

GitHub Pages

GitHub hosts static pages by using a special branch "gh-pages" available to each GitHub project. This special branch can host any HTML and local resources like JavaScript, images and CSS. There is no server side development. To navigate to your static pages, the URL structure is as follows: http://<username>.github.com/<project>/ An example of the project I am using in this blog post: http://mike-ensor.github.com/clickconcepts-master-pom/ where the first bold URL segment is a username and the second bold URL segment is the project. GitHub does allow you to create a base statically hosted site for your username by creating a repository named with your username.github.com. The contents would be all of your HTML and associated static resources. This is not required to post documentation for your project, unlike the BitBucket solution. There is a GitHub Site plugin that publishes site documentation via GitHub's object API, but this is outside the scope of this blog post because it does not provide a single solution for GitHub and BitBucket projects using Maven 3.
BitBucket

BitBucket provides a similar service to GitHub in that it hosts static HTML pages and their associated static resources. However, there is one large difference in how those pages are stored. Unlike GitHub, BitBucket requires you to create a new repository with a name fitting the convention. The files will be located on the master branch and each project will need to be a directory off of the root.

mikeensor.bitbucket.org/
    /some-project
        +index.html
        +...
        /css
        /img
    /some-other-project
        +index.html
        +...
        /css
        /img
    index.html
    .git
    .gitignore

The naming convention is as follows: <username>.bitbucket.org An example of a BitBucket static pages repository for me would be: http://mikeensor.bitbucket.org/. The structure does not require that you create an index.html page at the root of the project, but it would be advisable to avoid 404s.

Generating Site Documentation

Maven provides the ability to post documentation for your project by using the maven-site-plugin. This plugin is difficult to use due to the many configuration options that oftentimes are not well documented. There are many blog posts that can help you write your documentation, including my post on maven site documentation. I did not mention how to use "xdoc", "apt" or other templating technologies to create documentation pages, but not to fear, I have provided this in my GitHub project.

Putting it all Together

The Maven SCM Publish plugin (http://maven.apache.org/plugins/maven-scm-publish-plugin/) publishes site documentation to a supported SCM. In our case, we are going to use Git through BitBucket or GitHub. Maven SCM Plugin does allow you to publish multi-module site documentation through the various properties, but the scope of this blog post is to cover single/mono module projects, and the process is a bit painful. Take a moment to look at the POM file located in the clickconcepts-master-pom project.
This master POM is rather comprehensive, and the site documentation is only one portion of the project, but it is the portion we will focus on. There are a few things to point out here. The first is the scm-publish plugin and its idiosyncrasies.

In order to create the site documentation, the "site" plugin must first be run. This is accomplished by running site:site. The plugin generates the documentation into the "target/site" folder by default. The SCM Publish Plugin, by default, looks for the site documents in "target/staging"; this location is controlled by the content parameter. As you can see, there is a mismatch between the two folders.

NOTE: My first approach was to run the site:stage command, which is supposed to put the site documents into the "target/staging" folder. This is not entirely correct: the site plugin combines with the distributionManagement.site.url property to stage the documents, but the behavior is very strange and it is not documented well.

In order to get the site plugin's output and the SCM Publish Plugin's input location to match up, use the content property and set it to the location of the Site Plugin output (siteOutputDirectory). If you are using GitHub, no modification to the siteOutputDirectory is needed. If you are using BitBucket, however, you will need to modify that property to add a directory layer to the generated site documentation (see above for the differences between GitHub and BitBucket pages). The second property tells the SCM Publish Plugin to look at the root "site" folder, so that when the files are copied into the repository, the project folder is the containing folder. The two values will look like:

    site output directory : ${project.build.directory}/site/${project.artifactId}
    SCM Publish content   : ${project.build.directory}/site

Next we will take a look at the custom properties defined in the master POM and used by the SCM Publish Plugin above.
Each project that uses the Master POM needs to define several properties, which the plugins read during site publishing. Fill in the values with your own settings.

BitBucket

    ...
    publish branch        : master
    SCM URL               : scm:git:git@bitbucket.org:mikeensor/mikeensor.bitbucket.org.git
    site output directory : ${project.build.directory}/site/${project.artifactId}
    SCM Publish content   : ${project.build.directory}/site
    changelog file URI    : ${changelog.bitbucket.fileUri}
    changelog revision URI: ${changelog.revision.bitbucket.fileUri}
    ...

GitHub

    ...
    publish branch        : gh-pages
    SCM URL               : scm:git:git@github.com:mikeensor/clickconcepts-master-pom.git
    changelog file URI    : ${changelog.github.fileUri}
    changelog revision URI: ${changelog.revision.github.fileUri}
    ...

NOTE: The changelog parameters are required to use the Master POM and are not directly related to publishing site docs to GitHub or BitBucket.

How to Generate

If you are using the Master POM (or have abstracted out the Site Plugin and the SCM Publish Plugin), then generating and publishing the documentation is simple. To publish:

    mvn clean site:site scm-publish:publish-scm

To run the same process without committing or pushing:

    mvn clean site:site scm-publish:publish-scm -Dscmpublish.dryRun=true

Gotchas

The "tips" section of the SCM Publish Plugin documentation recommends creating a location to hold the checked-out repository so that the repo is not cloned each time. There is a risk here: if there is already a git repository in that folder, the plugin will overwrite it with the new site documentation. I discovered this by publishing two different projects and having my root repository wiped out by documentation from the second project. There are ways to mitigate this by adding another folder layer, but make sure you test often!

Another tip is to use -Dscmpublish.dryRun=true to test the site documentation process without making the SCM commit and push.

Project and Documentation URLs

Here is a list of the fully working projects used to create this blog post:

Master POM with Site and SCM Publish plugins: https://github.com/mike-ensor/clickconcepts-master-pom. Documentation URL: http://mike-ensor.github.com/clickconcepts-master-pom/

Child project using the Master POM: http://mikeensor.bitbucket.org/fest-expected-exception. Documentation URL: http://mikeensor.bitbucket.org/fest-expected-exception/
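The two hosting conventions described above differ only in how the documentation URL is assembled. The following is a minimal Java sketch of that difference, not part of the Master POM project: the class name PagesUrlSketch and both method names are hypothetical, and the URL patterns are taken from the example URLs in this post.

```java
public class PagesUrlSketch {

    // GitHub project pages (gh-pages branch): http://<username>.github.com/<project>/
    public static String gitHubPagesUrl(String username, String project) {
        return "http://" + username + ".github.com/" + project + "/";
    }

    // BitBucket pages: one repository per user, one directory per project:
    // http://<username>.bitbucket.org/<project>/
    public static String bitBucketPagesUrl(String username, String project) {
        return "http://" + username + ".bitbucket.org/" + project + "/";
    }

    public static void main(String[] args) {
        // Rebuilds the two documentation URLs listed in this post.
        System.out.println(gitHubPagesUrl("mike-ensor", "clickconcepts-master-pom"));
        System.out.println(bitBucketPagesUrl("mikeensor", "fest-expected-exception"));
    }
}
```

On BitBucket the project name is a directory inside the shared repository, which is why the site output directory needs the extra ${project.artifactId} layer there and not on GitHub.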

January 23, 2013

by Mike Ensor

· 13,447 Views

Using Redis with Spring

As NoSQL solutions become more and more popular for many kinds of problems, modern projects increasingly consider using one (or several) of them instead of, or side by side with, a traditional RDBMS. I have already covered my experience with MongoDB in several earlier posts. In this post I would like to switch gears a bit towards Redis, an advanced key-value store. Aside from very rich key-value semantics, Redis also supports pub-sub messaging and transactions. In this post I am only going to scratch the surface and demonstrate how simple it is to integrate Redis into your Spring application.

As always, we will start with the Maven POM file for our project. Its coordinates, properties and dependencies are:

    project         : com.example.spring : redis : 0.0.1-SNAPSHOT (packaging jar)
    source encoding : UTF-8
    spring.version  : 3.1.0.RELEASE

    org.springframework.data : spring-data-redis : 1.0.0.RELEASE
    cglib : cglib-nodep : 2.2
    log4j : log4j : 1.2.16
    redis.clients : jedis : 2.0.0 (type jar)
    org.springframework : spring-core : ${spring.version}
    org.springframework : spring-context : ${spring.version}

Spring Data Redis is another project under the Spring Data umbrella, and it provides seamless injection of Redis into your application. There are several Redis clients for Java; I have chosen Jedis because it is stable and recommended by the Redis team at the moment of writing this post.

We will start with a simple configuration and introduce the necessary components first. Then, as we move forward, the configuration will be extended a bit to demonstrate pub-sub capabilities.
Thanks to Java config support, we will create a configuration class and have all our dependencies strongly typed, with no XML anymore:

    package com.example.redis.config;

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.redis.connection.jedis.JedisConnectionFactory;
    import org.springframework.data.redis.core.RedisTemplate;
    import org.springframework.data.redis.serializer.GenericToStringSerializer;
    import org.springframework.data.redis.serializer.StringRedisSerializer;

    @Configuration
    public class AppConfig {
        // Connects to localhost on the default port unless configured otherwise
        @Bean
        JedisConnectionFactory jedisConnectionFactory() {
            return new JedisConnectionFactory();
        }

        @Bean
        RedisTemplate< String, Object > redisTemplate() {
            final RedisTemplate< String, Object > template = new RedisTemplate< String, Object >();
            template.setConnectionFactory( jedisConnectionFactory() );
            template.setKeySerializer( new StringRedisSerializer() );
            template.setHashValueSerializer( new GenericToStringSerializer< Object >( Object.class ) );
            template.setValueSerializer( new GenericToStringSerializer< Object >( Object.class ) );
            return template;
        }
    }

That's basically everything we need, assuming we have a single Redis server up and running on localhost with the default configuration. Let's consider several common use cases: setting a key to some value, storing an object and, finally, a pub-sub implementation.

Storing and retrieving a key/value pair is very simple:

    @Autowired private RedisTemplate< String, Object > template;

    public Object getValue( final String key ) {
        return template.opsForValue().get( key );
    }

    public void setValue( final String key, final String value ) {
        template.opsForValue().set( key, value );
    }

Optionally, the key can be set to expire (yet another useful feature of Redis). For example, let our keys expire in 1 second:

    // Requires: import java.util.concurrent.TimeUnit;
    public void setValue( final String key, final String value ) {
        template.opsForValue().set( key, value );
        template.expire( key, 1, TimeUnit.SECONDS );
    }

Arbitrary objects can be saved into Redis as hashes (maps). For example, let's save an instance of some class User

    public class User {
        private final Long id;
        private String name;
        private String email;

        // Constructor used by getUser() further down
        public User( final Long id, final String name, final String email ) {
            this.id = id;
            this.name = name;
            this.email = email;
        }

        // Setters and getters are omitted for simplicity
    }

into Redis using the key pattern "user:<id>":

    public void setUser( final User user ) {
        final String key = String.format( "user:%s", user.getId() );
        final Map< String, Object > properties = new HashMap< String, Object >();

        properties.put( "id", user.getId() );
        properties.put( "name", user.getName() );
        properties.put( "email", user.getEmail() );

        template.opsForHash().putAll( key, properties );
    }

Likewise, the object can easily be inspected and retrieved using its id:

    public User getUser( final Long id ) {
        final String key = String.format( "user:%s", id );

        final String name = ( String )template.opsForHash().get( key, "name" );
        final String email = ( String )template.opsForHash().get( key, "email" );

        return new User( id, name, email );
    }

There is much, much more that can be done using Redis, and I highly encourage you to take a look at it. It surely is not a silver bullet, but it can solve many challenging problems very easily. Finally, let me show how to use pub-sub messaging with Redis.
Let's add a bit more configuration here (as part of the AppConfig class):

    @Bean
    MessageListenerAdapter messageListener() {
        return new MessageListenerAdapter( new RedisMessageListener() );
    }

    @Bean
    RedisMessageListenerContainer redisContainer() {
        final RedisMessageListenerContainer container = new RedisMessageListenerContainer();
        container.setConnectionFactory( jedisConnectionFactory() );
        container.addMessageListener( messageListener(), new ChannelTopic( "my-queue" ) );
        return container;
    }

The style of the message listener definition should look very familiar to Spring users: it is generally the same approach we follow to define JMS message listeners. The missing piece is our RedisMessageListener class definition:

    package com.example.redis.impl;

    import org.springframework.data.redis.connection.Message;
    import org.springframework.data.redis.connection.MessageListener;

    public class RedisMessageListener implements MessageListener {
        @Override
        public void onMessage( Message message, byte[] paramArrayOfByte ) {
            System.out.println( "Received by RedisMessageListener: " + message.toString() );
        }
    }

Now that we have our message listener, let's see how we can publish some messages to the channel using Redis. As always, it's pretty simple:

    @Autowired private RedisTemplate< String, Object > template;

    public void publish( final String message ) {
        template.execute(
            new RedisCallback< Long >() {
                @SuppressWarnings( "unchecked" )
                @Override
                public Long doInRedis( RedisConnection connection ) throws DataAccessException {
                    // The channel name must match the ChannelTopic registered in redisContainer()
                    return connection.publish(
                        ( ( RedisSerializer< String > )template.getKeySerializer() ).serialize( "my-queue" ),
                        ( ( RedisSerializer< Object > )template.getValueSerializer() ).serialize( message ) );
                }
            }
        );
    }

That's basically it for a very quick introduction, but it is definitely enough to fall in love with Redis.
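The hash mapping used by setUser and getUser earlier in this post can be tried without a running Redis server. The sketch below is only a stand-in under stated assumptions: a plain HashMap plays the role of the Redis hash, the UserHashMapper class and its keyFor, toHash and fromHash methods are hypothetical names, and the nested User class is a simplified copy with an assumed constructor. Values are kept as strings, which roughly mirrors what a to-string serializer would store.

```java
import java.util.HashMap;
import java.util.Map;

public class UserHashMapper {

    // Simplified stand-in for the User class from the post (constructor is an assumption).
    public static final class User {
        private final Long id;
        private final String name;
        private final String email;

        public User(Long id, String name, String email) {
            this.id = id;
            this.name = name;
            this.email = email;
        }

        public Long getId() { return id; }
        public String getName() { return name; }
        public String getEmail() { return email; }
    }

    // Builds the "user:<id>" key used for the hash.
    public static String keyFor(Long id) {
        return String.format("user:%s", id);
    }

    // Flattens a User into field/value pairs, all stored as strings.
    public static Map<String, String> toHash(User user) {
        Map<String, String> hash = new HashMap<String, String>();
        hash.put("id", String.valueOf(user.getId()));
        hash.put("name", user.getName());
        hash.put("email", user.getEmail());
        return hash;
    }

    // Rebuilds a User from the field/value pairs, parsing the id back into a Long.
    public static User fromHash(Map<String, String> hash) {
        return new User(Long.valueOf(hash.get("id")), hash.get("name"), hash.get("email"));
    }

    public static void main(String[] args) {
        User original = new User(42L, "Jane", "jane@example.com");
        Map<String, String> hash = toHash(original);
        User restored = fromHash(hash);
        System.out.println(keyFor(restored.getId()) + " name=" + restored.getName());
    }
}
```

Keeping the mapping in two small functions like this makes the round trip easy to unit test before a real RedisTemplate is wired in.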

January 17, 2013

by Andriy Redko

· 81,397 Views · 36 Likes

Hazelcast Distributed Execution with Spring

The ExecutorService feature came with Java 5 and lives in the java.util.concurrent package. It extends the Executor interface and provides thread pool functionality for executing short asynchronous tasks. The article "Java Executor Service Types" is suggested for an overview of the basic ExecutorService implementations.

ThreadPoolExecutor is also a very useful implementation of the ExecutorService interface. It extends AbstractExecutorService, which provides default implementations of the ExecutorService execution methods. It offers improved performance when executing large numbers of asynchronous tasks and maintains basic statistics, such as the number of completed tasks. The article "How to develop and monitor Thread Pool Services by using Spring" is suggested for learning how to develop and monitor thread pool services.

So far, we have only talked about the non-distributed Executor Service implementation. Let us also investigate the Distributed Executor Service. The Hazelcast Distributed Executor Service feature is a distributed implementation of java.util.concurrent.ExecutorService. It allows business logic to be executed in the cluster. There are four alternative ways to do so:

1) The logic can be executed on a specific cluster member which is chosen.
2) The logic can be executed on the member owning the key which is chosen.
3) The logic can be executed on a member that Hazelcast picks.
4) The logic can be executed on all or a subset of the cluster members.

This article shows how to develop a Distributed Executor Service via Hazelcast and Spring.

Used Technologies :

JDK 1.7.0_09
Spring 3.1.3
Hazelcast 2.4
Maven 3.0.4

STEP 1 : CREATE MAVEN PROJECT

A Maven project is created. (It can be created by using Maven or an IDE plug-in.)
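For comparison with the distributed version built in the following steps, the local ExecutorService behaviour described in the introduction can be sketched with the JDK alone. This is a minimal illustration, not part of the original project: the class name LocalExecutorSketch and the helper runLocally are hypothetical, and the pool size of 2 is an arbitrary choice.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LocalExecutorSketch {

    // Hypothetical helper: runs one task on a local two-thread pool and returns its result.
    public static String runLocally(Callable<String> task)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> future = pool.submit(task); // asynchronous submission
            return future.get();                       // blocks until the task has completed
        } finally {
            pool.shutdown();                           // stop accepting tasks and let the threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        String result = runLocally(new Callable<String>() {
            @Override
            public String call() {
                return "Local task is called...";
            }
        });
        System.out.println(result);
    }
}
```

The distributed service keeps this submit-and-get contract; what changes is that the Callable must also be Serializable so that it can be sent to another cluster member.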
STEP 2 : LIBRARIES

Firstly, the dependencies are added to Maven's pom.xml:

    spring.version  : 3.1.3.RELEASE
    source encoding : UTF-8

    org.springframework : spring-core : ${spring.version}
    org.springframework : spring-context : ${spring.version}
    com.hazelcast : hazelcast-all : 2.4
    log4j : log4j : 1.2.16

maven-compiler-plugin (Maven plugin) is used to compile the project with JDK 1.7:

    org.apache.maven.plugins : maven-compiler-plugin : 3.0
    source : 1.7
    target : 1.7

maven-shade-plugin (Maven plugin) can be used to create a runnable jar:

    org.apache.maven.plugins : maven-shade-plugin : 2.0
    phase : package
    goal : shade
    main class : com.onlinetechvision.exe.Application
    appended resources : META-INF/spring.handlers, META-INF/spring.schemas

STEP 3 : CREATE Customer BEAN

A new Customer bean is created. This bean will be distributed between the two nodes of the OTV cluster. In the following sample, all defined properties (id, name and surname) are of type String, and the standard java.io.Serializable interface is implemented for serialization. If custom or third-party object types are used, the com.hazelcast.nio.DataSerializable interface can be implemented for better serialization performance.

    package com.onlinetechvision.customer;

    import java.io.Serializable;

    /**
     * Customer Bean.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    public class Customer implements Serializable {

        private static final long serialVersionUID = 1856862670651243395L;

        private String id;
        private String name;
        private String surname;

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }

        public String getSurname() { return surname; }
        public void setSurname(String surname) { this.surname = surname; }

        @Override
        public int hashCode() {
            final int prime = 31;
            int result = 1;
            result = prime * result + ((id == null) ? 0 : id.hashCode());
            result = prime * result + ((name == null) ? 0 : name.hashCode());
            result = prime * result + ((surname == null) ? 0 : surname.hashCode());
            return result;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null) return false;
            if (getClass() != obj.getClass()) return false;
            Customer other = (Customer) obj;
            if (id == null) {
                if (other.id != null) return false;
            } else if (!id.equals(other.id)) return false;
            if (name == null) {
                if (other.name != null) return false;
            } else if (!name.equals(other.name)) return false;
            if (surname == null) {
                if (other.surname != null) return false;
            } else if (!surname.equals(other.surname)) return false;
            return true;
        }

        @Override
        public String toString() {
            return "Customer [id=" + id + ", name=" + name + ", surname=" + surname + "]";
        }
    }

STEP 4 : CREATE ICacheService INTERFACE

A new ICacheService interface is created for the service layer to expose cache functionality.

    package com.onlinetechvision.cache.srv;

    import com.hazelcast.core.IMap;
    import com.onlinetechvision.customer.Customer;

    /**
     * Service layer interface exposing cache functionality.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    public interface ICacheService {

        /**
         * Adds a Customer entry to the cache.
         *
         * @param key the cache key
         * @param customer the customer to store
         */
        void addToCache(String key, Customer customer);

        /**
         * Deletes a Customer entry from the cache.
         *
         * @param key the cache key
         */
        void deleteFromCache(String key);

        /**
         * Gets the Customer cache.
         *
         * @return the Hazelcast map holding the customers
         */
        IMap<String, Customer> getCache();
    }

STEP 5 : CREATE CacheService IMPLEMENTATION

CacheService is the implementation of the ICacheService interface.

    package com.onlinetechvision.cache.srv;

    import com.hazelcast.core.IMap;
    import com.onlinetechvision.customer.Customer;
    import com.onlinetechvision.test.listener.CustomerEntryListener;

    /**
     * CacheService is the implementation of the ICacheService interface.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    public class CacheService implements ICacheService {

        private IMap<String, Customer> customerMap;

        /**
         * Constructor of CacheService. Registers an entry listener on the map.
         *
         * @param customerMap the Hazelcast map holding the customers
         */
        @SuppressWarnings("unchecked")
        public CacheService(IMap<String, Customer> customerMap) {
            setCustomerMap(customerMap);
            getCustomerMap().addEntryListener(new CustomerEntryListener(), true);
        }

        /** Adds a Customer entry to the cache. */
        @Override
        public void addToCache(String key, Customer customer) {
            getCustomerMap().put(key, customer);
        }

        /** Deletes a Customer entry from the cache. */
        @Override
        public void deleteFromCache(String key) {
            getCustomerMap().remove(key);
        }

        /** Gets the Customer cache. */
        @Override
        public IMap<String, Customer> getCache() {
            return getCustomerMap();
        }

        public IMap<String, Customer> getCustomerMap() {
            return customerMap;
        }

        public void setCustomerMap(IMap<String, Customer> customerMap) {
            this.customerMap = customerMap;
        }
    }

STEP 6 : CREATE IDistributedExecutorService INTERFACE

A new IDistributedExecutorService interface is created for the service layer to expose distributed execution functionality.

    package com.onlinetechvision.executor.srv;

    import java.util.Collection;
    import java.util.Set;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;

    import com.hazelcast.core.Member;

    /**
     * Service layer interface exposing distributed execution functionality.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    public interface IDistributedExecutorService {

        /** Executes the callable object on the stated member. */
        String executeOnStatedMember(Callable<String> callable, Member member)
                throws InterruptedException, ExecutionException;

        /** Executes the callable object on the member owning the key. */
        String executeOnTheMemberOwningTheKey(Callable<String> callable, Object key)
                throws InterruptedException, ExecutionException;

        /** Executes the callable object on any member. */
        String executeOnAnyMember(Callable<String> callable)
                throws InterruptedException, ExecutionException;

        /** Executes the callable object on all of the given members. */
        Collection<String> executeOnMembers(Callable<String> callable, Set<Member> members)
                throws InterruptedException, ExecutionException;
    }

STEP 7 : CREATE DistributedExecutorService IMPLEMENTATION

DistributedExecutorService is the implementation of the IDistributedExecutorService interface.

    package com.onlinetechvision.executor.srv;

    import java.util.Collection;
    import java.util.Set;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Future;
    import java.util.concurrent.FutureTask;

    import org.apache.log4j.Logger;

    import com.hazelcast.core.DistributedTask;
    import com.hazelcast.core.Member;
    import com.hazelcast.core.MultiTask;

    /**
     * DistributedExecutorService is the implementation of the IDistributedExecutorService interface.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    public class DistributedExecutorService implements IDistributedExecutorService {

        private static final Logger logger = Logger.getLogger(DistributedExecutorService.class);

        private ExecutorService hazelcastDistributedExecutorService;

        /** Executes the callable object on the stated member. */
        @SuppressWarnings("unchecked")
        public String executeOnStatedMember(Callable<String> callable, Member member)
                throws InterruptedException, ExecutionException {
            logger.debug("Method executeOnStatedMember is called...");
            ExecutorService executorService = getHazelcastDistributedExecutorService();
            FutureTask<String> task = (FutureTask<String>) executorService.submit(
                    new DistributedTask<String>(callable, member));
            String result = task.get();
            logger.debug("Result of method executeOnStatedMember is : " + result);
            return result;
        }

        /** Executes the callable object on the member owning the key. */
        @SuppressWarnings("unchecked")
        public String executeOnTheMemberOwningTheKey(Callable<String> callable, Object key)
                throws InterruptedException, ExecutionException {
            logger.debug("Method executeOnTheMemberOwningTheKey is called...");
            ExecutorService executorService = getHazelcastDistributedExecutorService();
            FutureTask<String> task = (FutureTask<String>) executorService.submit(
                    new DistributedTask<String>(callable, key));
            String result = task.get();
            logger.debug("Result of method executeOnTheMemberOwningTheKey is : " + result);
            return result;
        }

        /** Executes the callable object on any member. */
        public String executeOnAnyMember(Callable<String> callable)
                throws InterruptedException, ExecutionException {
            logger.debug("Method executeOnAnyMember is called...");
            ExecutorService executorService = getHazelcastDistributedExecutorService();
            Future<String> task = executorService.submit(callable);
            String result = task.get();
            logger.debug("Result of method executeOnAnyMember is : " + result);
            return result;
        }

        /** Executes the callable object on all of the given members. */
        public Collection<String> executeOnMembers(Callable<String> callable, Set<Member> members)
                throws ExecutionException, InterruptedException {
            logger.debug("Method executeOnMembers is called...");
            MultiTask<String> task = new MultiTask<String>(callable, members);
            ExecutorService executorService = getHazelcastDistributedExecutorService();
            executorService.execute(task);
            Collection<String> results = task.get();
            logger.debug("Result of method executeOnMembers is : " + results.toString());
            return results;
        }

        public ExecutorService getHazelcastDistributedExecutorService() {
            return hazelcastDistributedExecutorService;
        }

        public void setHazelcastDistributedExecutorService(ExecutorService hazelcastDistributedExecutorService) {
            this.hazelcastDistributedExecutorService = hazelcastDistributedExecutorService;
        }
    }

STEP 8 : CREATE TestCallable CLASS

The TestCallable class holds the business logic to be executed.

TestCallable task for the first member of the cluster :

    package com.onlinetechvision.task;

    import java.io.Serializable;
    import java.util.concurrent.Callable;

    /**
     * TestCallable holds the business logic to be executed.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    public class TestCallable implements Callable<String>, Serializable {

        private static final long serialVersionUID = -1839169907337151877L;

        /**
         * Computes a result, or throws an exception if unable to do so.
         *
         * @return the computed result
         * @throws Exception if unable to compute a result
         */
        public String call() throws Exception {
            return "First Member's TestCallable Task is called...";
        }
    }

TestCallable task for the second member of the cluster :

    package com.onlinetechvision.task;

    import java.io.Serializable;
    import java.util.concurrent.Callable;

    /**
     * TestCallable holds the business logic to be executed.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    public class TestCallable implements Callable<String>, Serializable {

        private static final long serialVersionUID = -1839169907337151877L;

        /**
         * Computes a result, or throws an exception if unable to do so.
         *
         * @return the computed result
         * @throws Exception if unable to compute a result
         */
        public String call() throws Exception {
            return "Second Member's TestCallable Task is called...";
        }
    }

STEP 9 : CREATE AnotherAvailableMemberNotFoundException CLASS

AnotherAvailableMemberNotFoundException is thrown when another available member is not found. To avoid this exception, the first node should be started before the second node.

    package com.onlinetechvision.exception;

    /**
     * AnotherAvailableMemberNotFoundException is thrown when another available member is not found.
     * To avoid this exception, the first node should be started before the second node.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    public class AnotherAvailableMemberNotFoundException extends Exception {

        private static final long serialVersionUID = -3954360266393077645L;

        /**
         * Constructor of AnotherAvailableMemberNotFoundException.
         *
         * @param message the exception message
         */
        public AnotherAvailableMemberNotFoundException(String message) {
            super(message);
        }
    }

STEP 10 : CREATE CustomerEntryListener CLASS

The CustomerEntryListener class listens for entry changes on the named cache object.
    package com.onlinetechvision.test.listener;

    import com.hazelcast.core.EntryEvent;
    import com.hazelcast.core.EntryListener;

    /**
     * CustomerEntryListener listens for entry changes on the named cache object.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    @SuppressWarnings("rawtypes")
    public class CustomerEntryListener implements EntryListener {

        /** Invoked when an entry is added. */
        public void entryAdded(EntryEvent ee) {
            System.out.println("EntryAdded... Member : " + ee.getMember() + ", Key : " + ee.getKey()
                    + ", OldValue : " + ee.getOldValue() + ", NewValue : " + ee.getValue());
        }

        /** Invoked when an entry is removed. */
        public void entryRemoved(EntryEvent ee) {
            System.out.println("EntryRemoved... Member : " + ee.getMember() + ", Key : " + ee.getKey()
                    + ", OldValue : " + ee.getOldValue() + ", NewValue : " + ee.getValue());
        }

        /** Invoked when an entry is evicted. */
        public void entryEvicted(EntryEvent ee) {
        }

        /** Invoked when an entry is updated. */
        public void entryUpdated(EntryEvent ee) {
        }
    }

STEP 11 : CREATE Starter CLASS

The Starter class loads Customers into the cache and executes the distributed tasks.

Starter class of the first member of the cluster :

    package com.onlinetechvision.exe;

    import com.onlinetechvision.cache.srv.ICacheService;
    import com.onlinetechvision.customer.Customer;

    /**
     * Starter loads Customers into the cache.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    public class Starter {

        private ICacheService cacheService;

        /** Loads the cache. */
        public void start() {
            loadCacheForFirstMember();
        }

        /** Loads Customers into the cache. */
        public void loadCacheForFirstMember() {
            Customer firstCustomer = new Customer();
            firstCustomer.setId("1");
            firstCustomer.setName("Jodie");
            firstCustomer.setSurname("Foster");

            Customer secondCustomer = new Customer();
            secondCustomer.setId("2");
            secondCustomer.setName("Kate");
            secondCustomer.setSurname("Winslet");

            getCacheService().addToCache(firstCustomer.getId(), firstCustomer);
            getCacheService().addToCache(secondCustomer.getId(), secondCustomer);
        }

        public ICacheService getCacheService() {
            return cacheService;
        }

        public void setCacheService(ICacheService cacheService) {
            this.cacheService = cacheService;
        }
    }

Starter class of the second member of the cluster :

    package com.onlinetechvision.exe;

    import java.util.Set;
    import java.util.concurrent.ExecutionException;

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.Member;
    import com.onlinetechvision.cache.srv.ICacheService;
    import com.onlinetechvision.customer.Customer;
    import com.onlinetechvision.exception.AnotherAvailableMemberNotFoundException;
    import com.onlinetechvision.executor.srv.IDistributedExecutorService;
    import com.onlinetechvision.task.TestCallable;

    /**
     * Starter loads Customers into the cache and executes the distributed tasks.
     *
     * @author onlinetechvision.com
     * @since 27 Nov 2012
     * @version 1.0.0
     */
    public class Starter {

        private String hazelcastInstanceName;
        private Hazelcast hazelcast;
        private IDistributedExecutorService distributedExecutorService;
        private ICacheService cacheService;

        /** Loads the cache and executes the tasks. */
        public void start() {
            loadCache();
            executeTasks();
        }

        /** Loads Customers into the cache. */
        public void loadCache() {
            Customer firstCustomer = new Customer();
            firstCustomer.setId("3");
            firstCustomer.setName("Bruce");
            firstCustomer.setSurname("Willis");

            Customer secondCustomer = new Customer();
            secondCustomer.setId("4");
            secondCustomer.setName("Colin");
            secondCustomer.setSurname("Farrell");

            getCacheService().addToCache(firstCustomer.getId(), firstCustomer);
            getCacheService().addToCache(secondCustomer.getId(), secondCustomer);
        }

        /** Executes the tasks in each of the four supported ways. */
        public void executeTasks() {
            try {
                getDistributedExecutorService().executeOnStatedMember(new TestCallable(), getAnotherMember());
                getDistributedExecutorService().executeOnTheMemberOwningTheKey(new TestCallable(), "3");
                getDistributedExecutorService().executeOnAnyMember(new TestCallable());
                getDistributedExecutorService().executeOnMembers(new TestCallable(), getAllMembers());
            } catch (InterruptedException | ExecutionException | AnotherAvailableMemberNotFoundException e) {
                e.printStackTrace();
            }
        }

        /**
         * Gets the cluster members.
         *
         * @return the set of cluster members
         */
        private Set<Member> getAllMembers() {
            Set<Member> members = getHazelcastLocalInstance().getCluster().getMembers();
            return members;
        }

        /**
         * Gets another (non-local) member of the cluster.
         *
         * @return another member of the cluster
         * @throws AnotherAvailableMemberNotFoundException if no other member is available
         */
        private Member getAnotherMember() throws AnotherAvailableMemberNotFoundException {
            Set<Member> members = getAllMembers();
            for (Member member : members) {
                if (!member.localMember()) {
                    return member;
                }
            }
            throw new AnotherAvailableMemberNotFoundException(
                    "No other available member on the cluster. Please make sure that all members are active on the cluster");
        }

        /**
         * Gets the Hazelcast local instance.
         *
         * @return the Hazelcast local instance
         */
        @SuppressWarnings("static-access")
        private HazelcastInstance getHazelcastLocalInstance() {
            HazelcastInstance instance = getHazelcast().getHazelcastInstanceByName(getHazelcastInstanceName());
            return instance;
        }

        public String getHazelcastInstanceName() { return hazelcastInstanceName; }

        public void setHazelcastInstanceName(String hazelcastInstanceName) {
            this.hazelcastInstanceName = hazelcastInstanceName;
        }

        public Hazelcast getHazelcast() { return hazelcast; }

        public void setHazelcast(Hazelcast hazelcast) { this.hazelcast = hazelcast; }

        public IDistributedExecutorService getDistributedExecutorService() {
            return distributedExecutorService;
        }

        public void setDistributedExecutorService(IDistributedExecutorService distributedExecutorService) {
            this.distributedExecutorService = distributedExecutorService;
        }

        public ICacheService getCacheService() { return cacheService; }

        public void setCacheService(ICacheService cacheService) { this.cacheService = cacheService; }
    }

STEP 12 : CREATE hazelcast-config.properties FILE

The hazelcast-config.properties file holds the properties of the cluster members.
First member properties :

    hz.instance.name = OTVInstance1
    hz.group.name = dev
    hz.group.password = dev
    hz.management.center.enabled = true
    hz.management.center.url = http://localhost:8080/mancenter
    hz.network.port = 5701
    hz.network.port.auto.increment = false
    hz.tcp.ip.enabled = true
    hz.members = 192.168.1.32
    hz.executor.service.core.pool.size = 2
    hz.executor.service.max.pool.size = 30
    hz.executor.service.keep.alive.seconds = 30
    hz.map.backup.count=2
    hz.map.max.size=0
    hz.map.eviction.percentage=30
    hz.map.read.backup.data=true
    hz.map.cache.value=true
    hz.map.eviction.policy=NONE
    hz.map.merge.policy=hz.ADD_NEW_ENTRY

Second member properties :

    hz.instance.name = OTVInstance2
    hz.group.name = dev
    hz.group.password = dev
    hz.management.center.enabled = true
    hz.management.center.url = http://localhost:8080/mancenter
    hz.network.port = 5702
    hz.network.port.auto.increment = false
    hz.tcp.ip.enabled = true
    hz.members = 192.168.1.32
    hz.executor.service.core.pool.size = 2
    hz.executor.service.max.pool.size = 30
    hz.executor.service.keep.alive.seconds = 30
    hz.map.backup.count=2
    hz.map.max.size=0
    hz.map.eviction.percentage=30
    hz.map.read.backup.data=true
    hz.map.cache.value=true
    hz.map.eviction.policy=NONE
    hz.map.merge.policy=hz.ADD_NEW_ENTRY

STEP 13 : CREATE applicationContext-hazelcast.xml

The Spring Hazelcast configuration file, applicationContext-hazelcast.xml, is created, and the Hazelcast Distributed Executor Service and the Hazelcast instance are configured in it.

    ${hz.instance.name}
    ${hz.members}

STEP 14 : CREATE applicationContext.xml

The Spring configuration file, applicationContext.xml, is created.

    classpath:/hazelcast-config.properties

STEP 15 : CREATE Application CLASS

The Application class is created to run the application.
package com.onlinetechvision.exe;

import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

/**
 * Application class starts the application
 *
 * @author onlinetechvision.com
 * @since 27 Nov 2012
 * @version 1.0.0
 *
 */
public class Application {

    /**
     * Starts the application
     *
     * @param args command-line arguments
     *
     */
    public static void main(String[] args) {
        ApplicationContext context = new ClassPathXmlApplicationContext("applicationContext.xml");
        Starter starter = (Starter) context.getBean("starter");
        starter.start();
    }
}

STEP 16 : BUILD PROJECT

After the OTV_Spring_Hazelcast_DistributedExecution project is built, OTV_Spring_Hazelcast_DistributedExecution-0.0.1-SNAPSHOT.jar is created.

Important Note : The members of the cluster have different Hazelcast configurations, so the project should be built separately for each member.

STEP 17 : INTEGRATION with HAZELCAST MANAGEMENT CENTER

Hazelcast Management Center lets you monitor and manage the nodes in the cluster.

The entry and backup counts owned by customerMap can be seen in the Map Memory Data Table. We have distributed 4 entries via customerMap as shown below :

Sample keys and values can be seen via Map Browser :

Added First Entry :

Added Third Entry :

hazelcastDistributedExecutorService details can be seen via the Executors tab. We have executed 3 tasks on the first member and 2 tasks on the second member as shown below :

STEP 18 : RUN PROJECT BY STARTING THE CLUSTER'S MEMBERS

After the created OTV_Spring_Hazelcast_DistributedExecution-0.0.1-SNAPSHOT.jar file is run on the cluster's members, the following console output logs are shown ("Kas" in the timestamps is the Turkish abbreviation for November) :

First member console output :

Kas 25, 2012 4:07:20 PM com.hazelcast.impl.AddressPicker INFO: Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [x.y.z.t]
Kas 25, 2012 4:07:20 PM com.hazelcast.impl.AddressPicker INFO: Prefer IPv4 stack is true.
Kas 25, 2012 4:07:20 PM com.hazelcast.impl.AddressPicker INFO: Picked Address[x.y.z.t]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
Kas 25, 2012 4:07:21 PM com.hazelcast.system INFO: [x.y.z.t]:5701 [dev] Hazelcast Community Edition 2.4 (20121017) starting at Address[x.y.z.t]:5701
Kas 25, 2012 4:07:21 PM com.hazelcast.system INFO: [x.y.z.t]:5701 [dev] Copyright (C) 2008-2012 Hazelcast.com
Kas 25, 2012 4:07:21 PM com.hazelcast.impl.LifecycleServiceImpl INFO: [x.y.z.t]:5701 [dev] Address[x.y.z.t]:5701 is STARTING
Kas 25, 2012 4:07:24 PM com.hazelcast.impl.TcpIpJoiner INFO: [x.y.z.t]:5701 [dev]
--A new cluster is created and First Member joins the cluster.
Members [1] { Member [x.y.z.t]:5701 this }
Kas 25, 2012 4:07:24 PM com.hazelcast.impl.MulticastJoiner INFO: [x.y.z.t]:5701 [dev] Members [1] { Member [x.y.z.t]:5701 this }
...
-- First member adds two new entries to the cache...
EntryAdded... Member : Member [x.y.z.t]:5701 this, Key : 1, OldValue : null, NewValue : Customer [id=1, name=Jodie, surname=Foster]
EntryAdded... Member : Member [x.y.z.t]:5701 this, Key : 2, OldValue : null, NewValue : Customer [id=2, name=Kate, surname=Winslet]
...
--Second Member joins the cluster.
Members [2] { Member [x.y.z.t]:5701 this Member [x.y.z.t]:5702 }
...
-- Second member adds two new entries to the cache...
EntryAdded... Member : Member [x.y.z.t]:5702, Key : 4, OldValue : null, NewValue : Customer [id=4, name=Colin, surname=Farrell]
EntryAdded... Member : Member [x.y.z.t]:5702, Key : 3, OldValue : null, NewValue : Customer [id=3, name=Bruce, surname=Willis]

Second member console output :

Kas 25, 2012 4:07:48 PM com.hazelcast.impl.AddressPicker INFO: Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [x.y.z.t]
Kas 25, 2012 4:07:48 PM com.hazelcast.impl.AddressPicker INFO: Prefer IPv4 stack is true.
Kas 25, 2012 4:07:48 PM com.hazelcast.impl.AddressPicker INFO: Picked Address[x.y.z.t]:5702, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5702], bind any local is true
Kas 25, 2012 4:07:49 PM com.hazelcast.system INFO: [x.y.z.t]:5702 [dev] Hazelcast Community Edition 2.4 (20121017) starting at Address[x.y.z.t]:5702
Kas 25, 2012 4:07:49 PM com.hazelcast.system INFO: [x.y.z.t]:5702 [dev] Copyright (C) 2008-2012 Hazelcast.com
Kas 25, 2012 4:07:49 PM com.hazelcast.impl.LifecycleServiceImpl INFO: [x.y.z.t]:5702 [dev] Address[x.y.z.t]:5702 is STARTING
Kas 25, 2012 4:07:49 PM com.hazelcast.impl.Node INFO: [x.y.z.t]:5702 [dev] ** setting master address to Address[x.y.z.t]:5701
Kas 25, 2012 4:07:49 PM com.hazelcast.impl.MulticastJoiner INFO: [x.y.z.t]:5702 [dev] Connecting to master node: Address[x.y.z.t]:5701
Kas 25, 2012 4:07:49 PM com.hazelcast.nio.ConnectionManager INFO: [x.y.z.t]:5702 [dev] 55715 accepted socket connection from /x.y.z.t:5701
Kas 25, 2012 4:07:55 PM com.hazelcast.cluster.ClusterManager INFO: [x.y.z.t]:5702 [dev]
--Second Member joins the cluster.
Members [2] { Member [x.y.z.t]:5701 Member [x.y.z.t]:5702 this }
Kas 25, 2012 4:07:56 PM com.hazelcast.impl.LifecycleServiceImpl INFO: [x.y.z.t]:5702 [dev] Address[x.y.z.t]:5702 is STARTED
-- Second member adds two new entries to the cache...
EntryAdded... Member : Member [x.y.z.t]:5702 this, Key : 3, OldValue : null, NewValue : Customer [id=3, name=Bruce, surname=Willis]
EntryAdded... Member : Member [x.y.z.t]:5702 this, Key : 4, OldValue : null, NewValue : Customer [id=4, name=Colin, surname=Farrell]
25.11.2012 16:07:56 DEBUG (DistributedExecutorService.java:42) - Method executeOnStatedMember is called...
25.11.2012 16:07:56 DEBUG (DistributedExecutorService.java:46) - Result of method executeOnStatedMember is : First Member' s TestCallable Task is called...
25.11.2012 16:07:56 DEBUG (DistributedExecutorService.java:61) - Method executeOnTheMemberOwningTheKey is called...
25.11.2012 16:07:56 DEBUG (DistributedExecutorService.java:65) - Result of method executeOnTheMemberOwningTheKey is : First Member' s TestCallable Task is called...
25.11.2012 16:07:56 DEBUG (DistributedExecutorService.java:78) - Method executeOnAnyMember is called...
25.11.2012 16:07:57 DEBUG (DistributedExecutorService.java:82) - Result of method executeOnAnyMember is : Second Member' s TestCallable Task is called...
25.11.2012 16:07:57 DEBUG (DistributedExecutorService.java:96) - Method executeOnMembers is called...
25.11.2012 16:07:57 DEBUG (DistributedExecutorService.java:101) - Result of method executeOnMembers is : [First Member' s TestCallable Task is called..., Second Member' s TestCallable Task is called...]

STEP 19 : DOWNLOAD

https://github.com/erenavsarogullari/OTV_Spring_Hazelcast_DistributedExecution

REFERENCES :

Java ExecutorService Interface
Hazelcast Distributed Executor Service

December 11, 2012

by Eren Avsarogullari

· 29,976 Views · 1 Like

Lightweight RPC with ZeroMQ (ØMQ) and Protocol Buffers

A frequent issue I come across writing integration applications with Mule is deciding how to communicate back and forth between my front end application, typically a web or mobile application, and a flow hosted on Mule. I could use web services and do something like annotate a component with JAX-RS and expose it over HTTP. This is potentially overkill, particularly if I only want to host a few methods, the methods are asynchronous, or I don’t want to deal with the overhead of HTTP. It could also be a lot of extra effort if the only consumers of the API, at least initially, are internal-facing applications.

Another choice is to use “synchronous” JMS with temporary reply queues. While Mule makes this easy to do, particularly with MuleClient, I now have to deal with the overhead of spinning up a JMS infrastructure. I could also be limited to Java-only clients, depending on which JMS broker I choose. The latter is particularly significant, as Java probably isn’t the technology of choice on the web or mobile layer.

ØMQ for RPC

ØMQ, or ZeroMQ, is a networking library designed from the ground up to ease integration between distributed applications. In addition to supporting a variety of messaging patterns, which are enumerated in the extremely well written guide, the library is written in platform-agnostic C with wrappers for different languages like Java, Python and Ruby. These features make it a good candidate to solve the challenges I introduced above, particularly since a community-contributed module for ØMQ was released recently. Let’s consider a simple service that accepts a request for a range of stock quotes and returns the results, and see how we can host this service with Mule and expose it with the ØMQ Module.

Data Serialization with Protocol Buffers

Data is transported back and forth over ØMQ as byte arrays.
We therefore need to decide on a way to serialize our stock quote requests and responses “on the wire.” Before we do that, however, let’s take a look at the Java canonical data model we’re using on the client and server side. The following Gists show the important bits of the StockQuote and StockQuoteRequest classes and the StockDataService interface.

public class StockQuote implements Serializable {
    String symbol;
    Date date;
    Double open;
    Double high;
    Double low;
    Double close;
    Long volume;
    Double adjustedClose;
    // getters and setters omitted
}

public class StockQuoteRequest implements Serializable {
    String symbol;
    Date startDate;
    Date endDate;
    // getters and setters omitted
}

public interface StockDataService {
    public List<StockQuote> getQuote(StockQuoteRequest request);
}

We could use Java serialization to get the objects into byte arrays. Ignoring the other deficiencies of default Java serialization, the main drawback is that it limits our clients to ones running on a JVM. XML or JSON provide better alternatives, but for the purposes of this example we’ll assume we want a more compact representation of the data (this isn’t totally unrealistic: stock quote data can be extremely time sensitive and we probably want to minimize serialization and deserialization overhead). Protocol Buffers provide a good middle ground and also boast a Mule Module that provides the transformers we need to move back and forth from the byte array representations. Let’s define two .proto files to describe the wire format and generate the intermediary stubs for serialization.
First, the response format:

package com.acmesoft.zeromq;

option java_package = "com.acmesoft.stock.model.serialization.protobuf";
option optimize_for = SPEED;
option java_multiple_files = true;

message StockQuoteResponseBuffer {
    repeated StockQuoteBuffer result = 1;
}

message StockQuoteBuffer {
    required string symbol = 1;
    required int64 date = 2;
    required double open = 3;
    required double high = 4;
    required double low = 5;
    required double close = 6;
    required int64 volume = 7;
    required double adjustedClose = 8;
}

Then the request format:

package com.acmesoft.zeromq;

option java_package = "com.acmesoft.stock.model.serialization.protobuf";
option optimize_for = SPEED;
option java_multiple_files = true;

message StockQuoteRequestBuffer {
    required string symbol = 1;
    required int64 start = 2;
    required int64 end = 3;
}

You typically would use the “protoc” compiler to generate the Java stubs. This is tedious, however, so we’ll instead modify the pom.xml of our project to compile the .proto files during the compile goals:

<plugin>
    <groupId>com.google.protobuf.tools</groupId>
    <artifactId>maven-protoc-plugin</artifactId>
    <configuration>
        <protocExecutable>/usr/local/bin/protoc</protocExecutable>
    </configuration>
    <executions>
        <execution>
            <goals>
                <goal>compile</goal>
                <goal>testCompile</goal>
            </goals>
        </execution>
    </executions>
</plugin>

Since we already have a domain model we’ll add some helper methods to simplify the serialization tasks on the client side.
// Helpers on StockQuoteRequest

public byte[] toProtocolBufferAsBytes() {
    return StockQuoteRequestBuffer.newBuilder()
            .setSymbol(symbol)
            .setStart(startDate.getTime())
            .setEnd(endDate.getTime()).build().toByteArray();
}

public static StockQuoteRequest fromProtocolBuffer(StockQuoteRequestBuffer buffer) {
    StockQuoteRequest request = new StockQuoteRequest();
    request.setSymbol(buffer.getSymbol());
    request.setStartDate(new Date(buffer.getStart()));
    request.setEndDate(new Date(buffer.getEnd()));
    return request;
}

// Helpers for lists of StockQuote

public static StockQuoteResponseBuffer toProtocolBuffer(List<StockQuote> quotes) {
    StockQuoteResponseBuffer.Builder responseBuilder = StockQuoteResponseBuffer.newBuilder();
    for (StockQuote quote : quotes) {
        responseBuilder.addResult(StockQuoteBuffer.newBuilder()
                .setAdjustedClose(quote.getAdjustedClose())
                .setClose(quote.getClose())
                .setDate(quote.getDate().getTime())
                .setHigh(quote.getHigh())
                .setLow(quote.getLow())
                .setOpen(quote.getOpen())
                .setSymbol(quote.getSymbol())
                .setVolume(quote.getVolume()).build());
    }
    return responseBuilder.build();
}

public static List<StockQuote> listOfStockQuotesFromBytes(byte[] bytes) {
    List<StockQuoteBuffer> buffer;
    try {
        buffer = StockQuoteResponseBuffer.parseFrom(bytes).getResultList();
    } catch (InvalidProtocolBufferException e) {
        throw new SerializationException(e);
    }
    List<StockQuote> quotes = new ArrayList<StockQuote>();
    for (StockQuoteBuffer stockQuoteBuffer : buffer) {
        StockQuote stockQuote = new StockQuote();
        stockQuote.setClose(stockQuoteBuffer.getClose());
        stockQuote.setDate(new Date(stockQuoteBuffer.getDate()));
        stockQuote.setHigh(stockQuoteBuffer.getHigh());
        stockQuote.setOpen(stockQuoteBuffer.getOpen());
        stockQuote.setSymbol(stockQuoteBuffer.getSymbol());
        stockQuote.setVolume(stockQuoteBuffer.getVolume());
        stockQuote.setAdjustedClose(stockQuoteBuffer.getAdjustedClose());
        stockQuote.setLow(stockQuoteBuffer.getLow());
        quotes.add(stockQuote);
    }
    return quotes;
}

Configuring StockDataService

Now that we have a canonical data model and a wire format defined, we’re ready to wire up a Mule flow to expose the service.
Note that for this to work you need to have jzmq installed locally on your system. The following dependency needs to be added to your pom.xml once it’s installed:

<dependency>
    <groupId>org.zeromq</groupId>
    <artifactId>zmq</artifactId>
    <version>2.2.0</version>
    <systemPath>/usr/local/lib/zmq.jar</systemPath>
    <scope>system</scope>
</dependency>

Here systemPath is the location of zmq.jar on your filesystem. Once that’s out of the way we can configure the flow, as illustrated below:

The ZeroMQ inbound-endpoint will be bound to TCP port 9090 with a request-response exchange pattern. The deserialize MP in the protobuf module will deserialize the byte array to the generated StockQuoteRequestBuffer class. From there we’ll use MEL to invoke the helper method on StockQuoteRequest to transform the intermediary class to the domain model. The List of StockQuotes returned from StockDataService will be transformed by the MEL expression using the “toProtocolBuffer” helper method on the domain model. The Protocol Buffer Module is then smart enough to implicitly transform the intermediary object to a byte array for the response.

Consuming the Service from the Client Side

Now that the server is ready we can turn our attention to the client side code to invoke the remote service. Let’s take a look at how this works:

StockQuoteRequest stockQuoteRequest = new StockQuoteRequest();
stockQuoteRequest.setSymbol("FB");
stockQuoteRequest.setStartDate(new Date(new Date().getTime() - (86400000 * 7)));
stockQuoteRequest.setEndDate(new Date());

ZMQ.Socket zmqSocket = zmqContext.socket(ZMQ.REQ);
zmqSocket.setReceiveTimeOut(RECEIVE_TIMEOUT);
zmqSocket.connect("tcp://localhost:9090");
zmqSocket.send(stockQuoteRequest.toProtocolBufferAsBytes(), 0);

List<StockQuote> quotes = StockQuote.listOfStockQuotesFromBytes(zmqSocket.recv(0));

We start off by defining the StockQuoteRequest object to give us all the quotes for Facebook stock from the last week. We can then open up a ZMQ socket, set the timeout, connect to the ZMQ socket on the remote Mule instance and send the byte representation of the StockQuoteRequest to it.
zmqSocket.recv is then used to receive the bytes back from Mule. From here we can use the listOfStockQuotesFromBytes helper method we wrote above to convert the Protocol Buffer representation to a List of StockQuotes. Despite the fair bit of plumbing we did above, this is a pretty concise bit of client side code to invoke the remote service.

Conclusion

This blog post only touched on the features of ØMQ and the ØMQ Mule Module. In addition to request-reply, other exchange patterns are supported, like one-way, push and pull. This effectively gives you the benefits of a reliable, asynchronous messaging layer without a centralized infrastructure. I hope to cover this in a later post. Protocol Buffers also seem like a natural fit as a wire format for ØMQ. They echo ØMQ’s principles of being lightweight, fast and platform agnostic. These are also, not coincidentally, principles Mule shares as an integration framework. The project for this example is available on GitHub.
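As a footnote to the serialization helpers, the idea of sending each Date as a 64-bit epoch-millisecond value inside a byte array can be tried with the JDK alone. The sketch below is not the Protocol Buffers wire format and does not use the generated classes: DataOutputStream stands in for the encoder, and the class name StockQuoteRequestWire is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Date;

// Hypothetical stand-in for StockQuoteRequest plus its byte-array helpers.
// Assumption: DataOutputStream replaces the protobuf encoder for illustration only.
final class StockQuoteRequestWire {
    final String symbol;
    final Date startDate;
    final Date endDate;

    StockQuoteRequestWire(String symbol, Date startDate, Date endDate) {
        this.symbol = symbol;
        this.startDate = startDate;
        this.endDate = endDate;
    }

    /** Encodes the request; each Date travels as a 64-bit epoch-millisecond value. */
    byte[] toBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(symbol);
            out.writeLong(startDate.getTime());
            out.writeLong(endDate.getTime());
        } catch (IOException e) {
            // In-memory streams do not fail in practice; rethrow unchecked.
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    /** Decodes a request, reading the fields in the order toBytes() wrote them. */
    static StockQuoteRequestWire fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String symbol = in.readUTF();
            Date start = new Date(in.readLong());
            Date end = new Date(in.readLong());
            return new StockQuoteRequestWire(symbol, start, end);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Date now = new Date();
        Date weekAgo = new Date(now.getTime() - 7L * 86400000L);
        byte[] wire = new StockQuoteRequestWire("FB", weekAgo, now).toBytes();
        StockQuoteRequestWire received = fromBytes(wire);
        System.out.println(received.symbol + " from " + received.startDate
                + " to " + received.endDate + " in " + wire.length + " bytes");
    }
}
```

Carrying dates as plain integers keeps the wire format independent of java.util.Date, so a non-JVM client only needs to agree on the field order and the epoch.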

November 26, 2012

by John D'Emic

· 28,583 Views

How to Monitor Java Garbage Collection

This is the second article in the series "Become a Java GC Expert". In the first issue, Understanding Java Garbage Collection, we learned about the processes for different GC algorithms, how GC works, what the Young and Old Generations are, what you should know about the 5 types of GC in the new JDK 7, and what the performance implications are for each of these GC types. In this article, I will explain how the JVM actually runs Garbage Collection in real time.

What is GC Monitoring?

Garbage Collection Monitoring refers to the process of figuring out how the JVM is running GC. For example, we can find out when an object in young has moved to old and by how much, or when stop-the-world has occurred and for how long. GC monitoring is carried out to see if the JVM is running GC efficiently, and to check if additional GC tuning is necessary. Based on this information, the application can be edited or the GC method can be changed (GC tuning).

How to Monitor GC?

There are different ways to monitor GC, but the only difference is how the GC operation information is shown. GC is done by the JVM, and since the GC monitoring tools disclose the GC information provided by the JVM, you will get the same results no matter how you monitor GC. Therefore, you do not need to learn all methods to monitor GC, but since it only takes a small amount of time to learn each GC monitoring method, knowing a few of them can help you use the right one for different situations and environments.

The tools and JVM options listed below cannot be used universally regardless of the JVM vendor. This is because there is no need for a "standard" for disclosing GC information. In this example we will use the HotSpot JVM (Oracle JVM). Since NHN is using the Oracle (Sun) JVM, there should be no difficulties in applying the tools or JVM options that we are explaining here.

First, the GC monitoring methods can be separated into CUI and GUI depending on the access interface.
The typical CUI GC monitoring method involves using a separate CUI application called "jstat", or selecting a JVM option called "verbosegc" when running the JVM. GUI GC monitoring is done by using a separate GUI application, and the three most commonly used applications are "jconsole", "jvisualvm" and "Visual GC". Let's learn more about each method.

jstat

jstat is a monitoring tool in the HotSpot JVM. Other monitoring tools for the HotSpot JVM are jps and jstatd. Sometimes, you need all three tools to monitor a Java application. jstat does more than display GC operation information; it also provides class loader and Just-in-Time compiler operation information. Among all the information jstat can provide, in this article we will only cover its functionality to monitor GC operation information.

jstat is located in $JDK_HOME/bin, so if java or javac can run without setting a separate directory from the command line, so can jstat. You can try running the following in the command line.

$> jstat -gc <vmid> 1000
S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
3008.0 3072.0 0.0 1511.1 343360.0 46383.0 699072.0 283690.2 75392.0 41064.3 2540 18.454 4 1.133 19.588
3008.0 3072.0 0.0 1511.1 343360.0 47530.9 699072.0 283690.2 75392.0 41064.3 2540 18.454 4 1.133 19.588
3008.0 3072.0 0.0 1511.1 343360.0 47793.0 699072.0 283690.2 75392.0 41064.3 2540 18.454 4 1.133 19.588
$>

Just like in the example, the actual data will be output under columns such as S0C, S1C, S0U, S1U, EC, EU, OC, OU and PC.

vmid (Virtual Machine ID), as its name implies, is the ID for the VM. Java applications running either on a local machine or on a remote machine can be specified using vmid. The vmid for a Java application running on a local machine is called lvmid (Local vmid), and usually is the PID. To find out the lvmid, you can look up the PID value using the ps command or Windows task manager, but we suggest jps because PID and lvmid do not always match. jps stands for Java PS.
jps shows vmids and main method information, just like ps shows PIDs and process names. Find out the vmid of the Java application that you want to monitor by using jps, then use it as a parameter in jstat. If you use jps alone, only bootstrap information will show when several WAS instances are running on one machine. We suggest that you use the ps -ef | grep java command along with jps.

GC performance data needs constant observation, therefore when running jstat, try to output the GC monitoring information on a regular basis. For example, running "jstat -gc <vmid> 1000" (or 1s) will display the GC monitoring data on the console every 1 second. "jstat -gc <vmid> 1000 10" will display the GC monitoring information once every 1 second, 10 times in total.

There are many options other than -gc, among which the GC related ones are listed below.

Option Name | Description
gc | It shows the current size for each heap area and its current usage (Eden, survivor, old, etc.), total number of GC performed, and the accumulated time for GC operations.
gccapacity | It shows the minimum size (ms) and maximum size (mx) of each heap area, current size, and the number of GC performed for each area. (Does not show current usage and accumulated time for GC operations.)
gccause | It shows the "information provided by -gcutil" + reason for the last GC and the reason for the current GC.
gcnew | Shows the GC performance data for the new area.
gcnewcapacity | Shows statistics for the size of new area.
gcold | Shows the GC performance data for the old area.
gcoldcapacity | Shows statistics for the size of old area.
gcpermcapacity | Shows statistics for the permanent area.
gcutil | Shows the usage for each heap area in percentage. Also shows the total number of GC performed and the accumulated time for GC operations.

Looking only at frequency of use, you will probably use -gcutil (or -gccause), -gc and -gccapacity the most, in that order.
-gcutil is used to check the usage of heap areas, the number of GC performed, and the total accumulated time for GC operations, while the -gccapacity option and others can be used to check the actual size allocated. You can see the following output by using the -gc option:

S0C S1C … GCT
1248.0 896.0 … 1.246
1248.0 896.0 … 1.246
… … … …

Different jstat options show different types of columns, which are listed below. Each column is displayed when you use the jstat options listed on the right.

Column | Description | jstat Option
S0C | Displays the current size of Survivor0 area in KB | -gc -gccapacity -gcnew -gcnewcapacity
S1C | Displays the current size of Survivor1 area in KB | -gc -gccapacity -gcnew -gcnewcapacity
S0U | Displays the current usage of Survivor0 area in KB | -gc -gcnew
S1U | Displays the current usage of Survivor1 area in KB | -gc -gcnew
EC | Displays the current size of Eden area in KB | -gc -gccapacity -gcnew -gcnewcapacity
EU | Displays the current usage of Eden area in KB | -gc -gcnew
OC | Displays the current size of old area in KB | -gc -gccapacity -gcold -gcoldcapacity
OU | Displays the current usage of old area in KB | -gc -gcold
PC | Displays the current size of permanent area in KB | -gc -gccapacity -gcold -gcoldcapacity -gcpermcapacity
PU | Displays the current usage of permanent area in KB | -gc -gcold
YGC | The number of GC events that occurred in the young area | -gc -gccapacity -gcnew -gcnewcapacity -gcold -gcoldcapacity -gcpermcapacity -gcutil -gccause
YGCT | The accumulated time for GC operations in the young area | -gc -gcnew -gcutil -gccause
FGC | The number of full GC events that occurred | -gc -gccapacity -gcnew -gcnewcapacity -gcold -gcoldcapacity -gcpermcapacity -gcutil -gccause
FGCT | The accumulated time for full GC operations | -gc -gcold -gcoldcapacity -gcpermcapacity -gcutil -gccause
GCT | The total accumulated time for GC operations | -gc -gcold -gcoldcapacity -gcpermcapacity -gcutil -gccause
NGCMN | The minimum size of new area in KB | -gccapacity -gcnewcapacity
NGCMX | The maximum size of new area in KB | -gccapacity -gcnewcapacity
NGC | The current size of new area in KB | -gccapacity -gcnewcapacity
OGCMN | The minimum size of old area in KB | -gccapacity -gcoldcapacity
OGCMX | The maximum size of old area in KB | -gccapacity -gcoldcapacity
OGC | The current size of old area in KB | -gccapacity -gcoldcapacity
PGCMN | The minimum size of permanent area in KB | -gccapacity -gcpermcapacity
PGCMX | The maximum size of permanent area in KB | -gccapacity -gcpermcapacity
PGC | The current size of permanent generation area in KB | -gccapacity -gcpermcapacity
PC | The current size of permanent area in KB | -gccapacity -gcpermcapacity
PU | The current usage of permanent area in KB | -gc -gcold
LGCC | The cause for the last GC occurrence | -gccause
GCC | The cause for the current GC occurrence | -gccause
TT | Tenuring threshold. Objects copied this number of times within the young area (S0 -> S1, S1 -> S0) are then moved to the old area. | -gcnew
MTT | Maximum tenuring threshold. Objects copied this number of times inside the young area are then moved to the old area. | -gcnew
DSS | Adequate size of survivor in KB | -gcnew

The advantage of jstat is that it can always monitor the GC operation data of Java applications running on a local or remote machine, as long as a console can be used. When -gcutil is used, the output looks like the following. At the time of GC tuning, pay careful attention to YGC, YGCT, FGC, FGCT and GCT.

S0 S1 E O P YGC YGCT FGC FGCT GCT
0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995
0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995
0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995

These items are important because they show how much time was spent in running GC. In this example, YGC is 217 and YGCT is 0.928. So, after calculating the arithmetic average (0.928 / 217), you can see that each young GC required about 4 ms (0.004 seconds). Likewise, the average full GC time (0.067 / 2) is about 33 ms. But the arithmetic average often does not help in analyzing the actual GC problem.
This is due to the severe deviations in GC operation time. (In other words, if the accumulated time is 0.067 seconds for two full GCs, one GC may have lasted 10 ms while the other one lasted 57 ms.) In order to check the individual GC time instead of the arithmetic average time, it is better to use -verbosegc.

-verbosegc

-verbosegc is one of the JVM options specified when running a Java application. While jstat can monitor any JVM application, even one started without any special options, -verbosegc needs to be specified at startup, so it could be seen as an unnecessary option (since jstat can be used instead). However, as -verbosegc displays easy to understand output whenever a GC occurs, it is very helpful for monitoring rough GC information.

 | jstat | -verbosegc
Monitoring Target | Java application running on a machine that can log in to a terminal, or a remote Java application that can connect to the network by using jstatd | Only when -verbosegc was specified as a JVM starting option
Output information | Heap status (usage, maximum size, number of times for GC/time, etc.) | Size of new and old area before/after GC, and GC operation time
Output Time | Every designated time | Whenever GC occurs
When useful | When trying to observe the changes of the size of heap area | When trying to see the effect of a single GC

The following are other options that can be used with -verbosegc.

-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
-XX:+PrintGCDateStamps (from JDK 6 update 4)

If only -verbosegc is used, then -XX:+PrintGCDetails is applied by default. Additional options for -verbosegc are not exclusive and can be mixed and used together. When using -verbosegc, you can see the results in the following format whenever a minor GC occurs.
[GC [<collector>: <starting occupancy1> -> <ending occupancy1>, <pause time1> secs] <starting occupancy3> -> <ending occupancy3>, <pause time3> secs]

Collector | Name of Collector Used for minor gc
starting occupancy1 | The size of young area before GC
ending occupancy1 | The size of young area after GC
pause time1 | The time when the Java application stopped running for minor GC
starting occupancy3 | The total size of heap area before GC
ending occupancy3 | The total size of heap area after GC
pause time3 | The time when the Java application stopped running for overall heap GC, including major GC

This is an example of -verbosegc output for minor GC:

S0 S1 E O P YGC YGCT FGC FGCT GCT
0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995
0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995
0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995

This is an example of the output after a Full GC occurred.

[Full GC [Tenured: 3485K->4095K(4096K), 0.1745373 secs] 61244K->7418K(63104K), [Perm : 10756K->10756K(12288K)], 0.1762129 secs] [Times: user=0.19 sys=0.00, real=0.19 secs]

If a CMS collector is used, then the following CMS information can be provided as well.

As the -verbosegc option outputs a log every time a GC event occurs, it is easy to see the changes in heap usage rates caused by GC operation.

(Java) VisualVM + Visual GC

Java VisualVM is a GUI profiling/monitoring tool provided with the Oracle JDK.

Figure 1: VisualVM Screenshot.

Instead of the version that is included with the JDK, you can download Visual VM directly from its website. For the sake of convenience, the version included with the JDK will be referred to as Java VisualVM (jvisualvm), and the version available from the website will be referred to as Visual VM (visualvm). The features of the two are not exactly identical, as there are slight differences, such as when installing plug-ins. Personally, I prefer the Visual VM version, which can be downloaded from the website.

After running Visual VM, if you select the application that you wish to monitor from the window on the left side, you can find the "Monitoring" tab there.
You can get the basic information about GC and Heap from this Monitoring tab. Though the basic GC status is also available through the basic features of VisualVM, you cannot access the detailed information that is available from either jstat or the -verbosegc option. If you want the detailed information provided by jstat, then it is recommended to install the Visual GC plug-in. Visual GC can be accessed in real time from the Tools menu.

Figure 2: Visual GC Installation Screenshot.

By using Visual GC, you can see the information provided by running jstatd in a more intuitive way.

Figure 3: Visual GC execution screenshot.

HPJMeter

HPJMeter is convenient for analyzing -verbosegc output results. If Visual GC can be considered the GUI equivalent of jstat, then HPJMeter would be the GUI equivalent of -verbosegc. Of course, GC analysis is just one of the many features provided by HPJMeter. HPJMeter is a performance monitoring tool developed by HP. It can be used on HP-UX, as well as Linux and MS Windows. Originally, a tool called HPTune provided the GUI analysis feature for -verbosegc. However, since the HPTune feature has been integrated into HPJMeter as of version 3.0, there is no need to download HPTune separately. When executing an application, redirect the -verbosegc output to a separate file. You can then open the redirected file with HPJMeter, which allows faster and easier GC performance data analysis through the intuitive GUI.

Figure 4: HPJMeter.
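As a footnote to the jstat section, the per-collection averages quoted there (YGCT / YGC and FGCT / FGC) can be reproduced with a few lines of Java. This is a minimal sketch that assumes the header and one data row of `jstat -gcutil` output have already been captured as strings; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: averages the accumulated GC times from one jstat -gcutil row.
final class GcUtilAverages {

    /** Maps each column name in the header row to the numeric value beneath it. */
    static Map<String, Double> parse(String header, String row) {
        String[] names = header.trim().split("\\s+");
        String[] values = row.trim().split("\\s+");
        if (names.length != values.length) {
            throw new IllegalArgumentException("header and row have different column counts");
        }
        Map<String, Double> columns = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            columns.put(names[i], Double.parseDouble(values[i]));
        }
        return columns;
    }

    /** Average time per collection in milliseconds; 0 when no collection has run yet. */
    static double averageMillis(double accumulatedSeconds, double count) {
        return count == 0 ? 0.0 : accumulatedSeconds * 1000.0 / count;
    }

    public static void main(String[] args) {
        Map<String, Double> c = parse(
                "S0 S1 E O P YGC YGCT FGC FGCT GCT",
                "0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995");
        System.out.println("average young GC (ms): " + averageMillis(c.get("YGCT"), c.get("YGC")));
        System.out.println("average full GC (ms): " + averageMillis(c.get("FGCT"), c.get("FGC")));
    }
}
```

Because these are averages over accumulated totals, they hide exactly the per-collection deviations that -verbosegc exposes, so they are best treated as a first check rather than a diagnosis.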

October 24, 2012

by Esen Sagynov

· 99,799 Views · 7 Likes

EasyNetQ Cluster Support

EasyNetQ, my super simple .NET API for RabbitMQ, now (from version 0.7.2.34) supports RabbitMQ clusters without any need to deploy a load balancer. Simply list the nodes of the cluster in the connection string ...

var bus = RabbitHutch.CreateBus("host=ubuntu:5672,ubuntu:5673");

In this example I have set up a cluster on a single machine, 'ubuntu', with node 1 on port 5672 and node 2 on port 5673. When the CreateBus statement executes, EasyNetQ will attempt to connect to the first host listed (ubuntu:5672). If it fails to connect it will attempt to connect to the second host listed (ubuntu:5673). If neither node is available it will sit in a retry loop attempting to connect to both servers every five seconds. It logs all this activity to the registered IEasyNetQLogger. You might see something like this if the first node was unavailable:

DEBUG: Trying to connect
ERROR: Failed to connect to Broker: 'ubuntu', Port: 5672 VHost: '/'. ExceptionMessage: 'None of the specified endpoints were reachable'
DEBUG: OnConnected event fired
INFO: Connected to RabbitMQ. Broker: 'ubuntu', Port: 5674, VHost: '/'

If the node that EasyNetQ is connected to fails, EasyNetQ will attempt to connect to the next listed node. Once connected, it will re-declare all the exchanges and queues and restart all the consumers. Here's an example log record showing one node failing, then EasyNetQ connecting to the other node and recreating the subscribers:

INFO: Disconnected from RabbitMQ Broker
DEBUG: Trying to connect
DEBUG: OnConnected event fired
DEBUG: Re-creating subscribers
INFO: Connected to RabbitMQ. Broker: 'ubuntu', Port: 5674, VHost: '/'

You get automatic fail-over out of the box. That’s pretty cool.

If you have multiple services using EasyNetQ to connect to a RabbitMQ cluster, they will all initially connect to the first listed node in their respective connection strings. For this reason the EasyNetQ cluster support is not really suitable for load balancing high throughput systems.
I would recommend that you use a dedicated hardware or software load balancer instead, if that’s what you want.
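The connection behaviour described above (try each listed host in order, then wait and retry) can be sketched independently of EasyNetQ. The Java below is only an illustration under assumptions, not EasyNetQ's code: `Connector`, `HostFailover`, `connectWithRetry` and `parseHosts` are hypothetical names, the retry delay and the number of rounds are parameters rather than the library's behaviour, and `parseHosts` handles only the simple `host=a,b` form shown in the article.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for "open a connection to host:port"; true means connected. */
interface Connector {
    boolean tryConnect(String hostAndPort);
}

final class HostFailover {
    /**
     * Tries each host in listed order. If none accepts, waits and starts again
     * from the first host. Returns the host that accepted, or null when every
     * round failed or the thread was interrupted.
     */
    static String connectWithRetry(List<String> hosts, Connector connector,
                                   long retryDelayMillis, int maxRounds) {
        for (int round = 0; round < maxRounds; round++) {
            for (String host : hosts) {
                if (connector.tryConnect(host)) {
                    return host;
                }
            }
            if (round < maxRounds - 1) {
                try {
                    Thread.sleep(retryDelayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    return null;
                }
            }
        }
        return null;
    }

    /** Splits a simple "host=a:1,b:2" value into its host entries. */
    static List<String> parseHosts(String connectionString) {
        String value = connectionString.substring(connectionString.indexOf('=') + 1);
        return Arrays.asList(value.split(","));
    }
}
```

Because every client walks the same list from the top, all clients end up on the first reachable node, which is why this scheme gives fail-over but not load balancing.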

October 14, 2012

by Mike Hadlow

· 6,886 Views

Redis pub/sub Using Spring

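Continuing to explore the powerful Redis feature set, one feature worth mentioning is its out-of-the-box support for pub/sub messaging. Pub/sub messaging is an essential part of many software architectures. Some software systems demand that a messaging solution provide high performance, scalability, queue persistence and durability, fail-over support, transactions, and many more nice-to-have features, which in the Java world almost always leads to using one of the JMS implementation providers. In my previous projects I have actively used Apache ActiveMQ (now moving towards Apache ActiveMQ Apollo). Though it's a great implementation, sometimes I just needed simple queuing support and Apache ActiveMQ just looked overcomplicated for that. Alternatives? Please welcome Redis pub/sub! If you are already using Redis as a key/value store, a few additional lines of configuration will bring pub/sub messaging to your application in no time.

The Spring Data Redis project abstracts the Redis pub/sub API very well and provides a model familiar to everyone who uses Spring's JMS integration. As always, let's start with the POM configuration file. It's pretty small and simple: it includes the necessary Spring dependencies, Spring Data Redis and Jedis, a great Java client for Redis.

    <project>
        <modelVersion>4.0.0</modelVersion>
        <groupId>com.example.spring</groupId>
        <artifactId>redis</artifactId>
        <version>0.0.1-SNAPSHOT</version>
        <packaging>jar</packaging>

        <properties>
            <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
            <spring.version>3.1.1.RELEASE</spring.version>
        </properties>

        <dependencies>
            <dependency>
                <groupId>org.springframework.data</groupId>
                <artifactId>spring-data-redis</artifactId>
                <version>1.0.1.RELEASE</version>
            </dependency>
            <dependency>
                <groupId>cglib</groupId>
                <artifactId>cglib-nodep</artifactId>
                <version>2.2</version>
            </dependency>
            <dependency>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
                <version>1.2.16</version>
            </dependency>
            <dependency>
                <groupId>redis.clients</groupId>
                <artifactId>jedis</artifactId>
                <version>2.0.0</version>
                <type>jar</type>
            </dependency>
            <dependency>
                <groupId>org.springframework</groupId>
                <artifactId>spring-core</artifactId>
                <version>${spring.version}</version>
            </dependency>
            <dependency>
                <groupId>org.springframework</groupId>
                <artifactId>spring-context</artifactId>
                <version>${spring.version}</version>
            </dependency>
        </dependencies>

        <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <version>2.3.2</version>
                    <configuration>
                        <source>1.6</source>
                        <target>1.6</target>
                    </configuration>
                </plugin>
            </plugins>
        </build>
    </project>

Moving on to configuring the Spring context, let's understand what we need in order for a publisher to publish some messages and for a consumer to consume them. Knowing the respective Spring abstractions for JMS will help a lot with that.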
We need a connection factory -> JedisConnectionFactory
We need a template for the publisher to publish messages -> RedisTemplate
We need a message listener for the consumer to consume messages -> RedisMessageListenerContainer

Using Spring Java configuration, let's describe our context:

    package com.example.redis.config;

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.redis.connection.jedis.JedisConnectionFactory;
    import org.springframework.data.redis.core.RedisTemplate;
    import org.springframework.data.redis.listener.ChannelTopic;
    import org.springframework.data.redis.listener.RedisMessageListenerContainer;
    import org.springframework.data.redis.listener.adapter.MessageListenerAdapter;
    import org.springframework.data.redis.serializer.GenericToStringSerializer;
    import org.springframework.data.redis.serializer.StringRedisSerializer;
    import org.springframework.scheduling.annotation.EnableScheduling;

    import com.example.redis.IRedisPublisher;
    import com.example.redis.impl.RedisMessageListener;
    import com.example.redis.impl.RedisPublisherImpl;

    @Configuration
    @EnableScheduling
    public class AppConfig {
        @Bean
        JedisConnectionFactory jedisConnectionFactory() {
            return new JedisConnectionFactory();
        }

        @Bean
        RedisTemplate< String, Object > redisTemplate() {
            final RedisTemplate< String, Object > template = new RedisTemplate< String, Object >();
            template.setConnectionFactory( jedisConnectionFactory() );
            template.setKeySerializer( new StringRedisSerializer() );
            template.setHashValueSerializer( new GenericToStringSerializer< Object >( Object.class ) );
            template.setValueSerializer( new GenericToStringSerializer< Object >( Object.class ) );
            return template;
        }

        @Bean
        MessageListenerAdapter messageListener() {
            return new MessageListenerAdapter( new RedisMessageListener() );
        }

        @Bean
        RedisMessageListenerContainer redisContainer() {
            final RedisMessageListenerContainer container = new RedisMessageListenerContainer();
            container.setConnectionFactory( jedisConnectionFactory() );
            container.addMessageListener( messageListener(), topic() );
            return container;
        }

        @Bean
        IRedisPublisher redisPublisher() {
            return new RedisPublisherImpl( redisTemplate(), topic() );
        }

        @Bean
        ChannelTopic topic() {
            return new ChannelTopic( "pubsub:queue" );
        }
    }

Very easy and straightforward. The @EnableScheduling annotation is not needed for pub/sub itself; it is required only by our publisher implementation, which publishes a string message every 100 ms.

    package com.example.redis.impl;

    import java.util.concurrent.atomic.AtomicLong;

    import org.springframework.data.redis.core.RedisTemplate;
    import org.springframework.data.redis.listener.ChannelTopic;
    import org.springframework.scheduling.annotation.Scheduled;

    import com.example.redis.IRedisPublisher;

    public class RedisPublisherImpl implements IRedisPublisher {
        private final RedisTemplate< String, Object > template;
        private final ChannelTopic topic;
        private final AtomicLong counter = new AtomicLong( 0 );

        public RedisPublisherImpl( final RedisTemplate< String, Object > template, final ChannelTopic topic ) {
            this.template = template;
            this.topic = topic;
        }

        // Runs 100 ms after the previous invocation completes
        @Scheduled( fixedDelay = 100 )
        public void publish() {
            template.convertAndSend( topic.getTopic(),
                "Message " + counter.incrementAndGet() + ", " + Thread.currentThread().getName() );
        }
    }

And finally our message listener implementation (which just prints the message to the console).

    package com.example.redis.impl;

    import org.springframework.data.redis.connection.Message;
    import org.springframework.data.redis.connection.MessageListener;

    public class RedisMessageListener implements MessageListener {
        @Override
        public void onMessage( final Message message, final byte[] pattern ) {
            System.out.println( "Message received: " + message.toString() );
        }
    }

Awesome, just two small classes, one configuration to wire things together and we have full pub/sub messaging support in our application! Let's run the application as standalone ...

    package com.example.redis;

    import org.springframework.context.annotation.AnnotationConfigApplicationContext;

    import com.example.redis.config.AppConfig;

    public class RedisPubSubStarter {
        public static void main(String[] args) {
            new AnnotationConfigApplicationContext( AppConfig.class );
        }
    }

... and see the following output in the console:

    ...
    Message received: Message 1, pool-1-thread-1
    Message received: Message 2, pool-1-thread-1
    Message received: Message 3, pool-1-thread-1
    Message received: Message 4, pool-1-thread-1
    Message received: Message 5, pool-1-thread-1
    Message received: Message 6, pool-1-thread-1
    Message received: Message 7, pool-1-thread-1
    Message received: Message 8, pool-1-thread-1
    Message received: Message 9, pool-1-thread-1
    Message received: Message 10, pool-1-thread-1
    Message received: Message 11, pool-1-thread-1
    Message received: Message 12, pool-1-thread-1
    Message received: Message 13, pool-1-thread-1
    Message received: Message 14, pool-1-thread-1
    Message received: Message 15, pool-1-thread-1
    Message received: Message 16, pool-1-thread-1
    ...

Great! There is much more you can do with Redis pub/sub; excellent documentation is available on the official Redis web site.

October 13, 2012

by Andriy Redko

· 42,914 Views · 4 Likes

SOA Service Design Cheat Sheet

This simple cheat sheet contains all the key goals, principles and design patterns that you should be aware of when designing SOA services, and contains helpful links to places where you can find more in-depth information on each topic. When I was studying for my SOA Certified Architect exams, I kept notes on all the best bits from the course material. After 9 months and several hundred hours of study, I found that there were certain key pieces of information that I kept referring back to time and time again, such as:

How do you define service-orientation?
What are the goals and strategic benefits of having a service-oriented business?
What are the design principles you should apply to SOA service design and SOA governance?
What are the characteristics of SOA-based businesses, and how can you recognise one?
What are the most useful SOA design patterns and how are they grouped?

I thought it might be useful to bring all this information together into one place, so in collaboration with soagrowers we have published a free PDF cheat sheet on SOA service design which you can print out and keep close to hand so it’s there whenever you need it. It’s not meant to be an exhaustive guide; it’s just a set of place-holders to remind you of the topics that may be of relevance to you when designing services. However, it should prove useful to any service architect or developer who’s interested in service design, or anyone who is going through the same certification programme as I did, even if you just use it as a check-list or aide-mémoire. None of it is particularly technology specific. The same set of goals, principles and patterns can be applied equally to SOAP-based web services, RESTful services or any other kind of distributed components. That’s the beauty of service-orientation: it’s vendor and technology neutral.

In the sheet I’ve also highlighted something that often gets overlooked when technologists have the lead on SOA implementations: SOA has some very attractive and unique business benefits that can only be fully realised when you apply the design paradigm correctly. For my money, it’s this outcome-oriented viewpoint (the business case, if you like) that really differentiates SOA from other tactics like EAI/ESB, but all too often this message gets lost in the melee. We hope you find it useful. To get your copy of the SOA service design cheat sheet, just click on the image below. If you like it, please share it.

Get involved. Did you find this useful? Is there something you think could be added or removed? Did you notice how ESB is just a small fraction of the bigger picture? Let me know your thoughts in the comments below.

Updated: 18/09/2012. I’ve now added a small section on contract-first service design, just because it so fundamentally underpins many of the most important goals, principles and patterns used to deliver successful SOA. For more information on contract-first, see Spring-WS’s excellent whitepaper. Contract-first isn’t just a SOAP thing, by the way. ‘Contract’ in a SOA design context means operations, data types, policies and anything else to do with the service’s public facade. So although REST has an implicit contract with predetermined operations (GET, PUT, POST, etc.) it still has data types and flexible URLs that convey some meaning. Therefore, if you want to make a REST architecture more interoperable and less brittle for clients, it helps to plan these data types and URLs in advance if you can, so they become better standardised and therefore more reusable.

October 2, 2012

by Ben Wilcock

· 19,652 Views

Resolve Circular Dependency in Spring Autowiring

I would consider this post as best practice for using Spring in enterprise application development.

September 27, 2012

by Gal Levinsky

· 127,561 Views · 24 Likes

New ActiveMQ failover and Clustering Goodies

For the last two weeks I’ve been working on some interesting use cases for the good ol’ failover transport. I finally have some time on my hands, so here’s a brief recap of what’s coming in the 5.6 release in this area.

First there’s a new feature, called Priority Backup. It’s described in detail here, but in a nutshell it gives you a mechanism for prioritizing your failover URLs and keeping your clients connected to the priority ones whenever they are available. The most obvious use case for this is to keep your clients connected to the broker in the local data center whenever you can. By doing this, you get better performance and stability for your clients, and you also save on your bandwidth bills.

Another improvement is coming for the automatic broker cluster feature. Although this feature is not new, I spent some time hardening it and thought I’d share some more insight into how (and when) to use it in your projects. In search of high availability, people often default to a master-slave architecture. This makes sense in most use cases, but if your flow is purely non-persistent you can probably come up with a more optimal architecture. Instead of having one broker at a time handling all your load, and another one just waiting for it to fail, you’ll get a more efficient system with some kind of active-active configuration where (possibly multiple) brokers share the load all the time. Ideally clients would be evenly distributed and would rebalance if anything changes. Brokers don’t need to share any messages, as clients are distributed and messages are non-persistent, so they will be lost if a broker fails.

So can you achieve this kind of architecture with ActiveMQ? Sure you can. That’s where automatic rebalance and clustering shine. First of all, brokers should be networked, but only so they can exchange information on their availability. They shouldn’t exchange the messages (but of course can if your use case needs it). In 5.6 you do that with pure static networks. Now imagine three brokers, A, B and C, forming a full mesh. In addition, every broker uses rebalance options on its transport connectors. All that is left for the client to do is connect to one of the brokers it knows, like failover:(brokerA), and the broker will supply it with all the information on the other brokers in the cluster and whether it should reconnect to one of them or not. So with a large number of clients connecting like this, very soon they’ll rebalance over the available brokers. You can stop one of the brokers in the cluster for updates and clients will rebalance over the remaining ones. You can even add a new broker to the cluster and everything will get rebalanced without any need for you to touch your clients.

So, basically, in this way you have both load balancing and high availability for your non-persistent messages. Additionally, your clients are automatically updated with all the information they need, and no manual intervention is needed. Although the basic support for clustering has been there since 5.4, I did some more hardening and better rebalancing, so it’s coming in the Apache ActiveMQ 5.6 (and the next Fuse 5.5.1) release. Also, there is some more great stuff regarding broker clustering coming soon, so stay tuned and happy messaging.

September 10, 2012

by Dejan Bosanac

· 15,514 Views

Managing Camel Routes With JMX APIs

Here is a quick example of how to programmatically access Camel MBeans to monitor and manipulate routes.

First, get a connection to a JMX server (this assumes localhost, port 1099, no auth). Note: always cache the connection for subsequent requests, as creating a new one each time can cause memory utilization issues.

    JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
    JMXConnector jmxc = JMXConnectorFactory.connect(url);
    MBeanServerConnection server = jmxc.getMBeanServerConnection();

Use the following to iterate over all routes and retrieve statistics (state, exchanges, etc.):

    // Wildcard pattern matching every Camel route MBean
    ObjectName objName = new ObjectName("org.apache.camel:type=routes,*");
    List<ObjectName> cacheList = new LinkedList<ObjectName>(server.queryNames(objName, null));
    for (Iterator<ObjectName> iter = cacheList.iterator(); iter.hasNext();) {
        objName = iter.next();
        String keyProps = objName.getCanonicalKeyPropertyListString();
        ObjectName objectInfoName = new ObjectName("org.apache.camel:" + keyProps);
        String routeId = (String) server.getAttribute(objectInfoName, "RouteId");
        String description = (String) server.getAttribute(objectInfoName, "Description");
        String state = (String) server.getAttribute(objectInfoName, "State");
        ...
    }

Use the following to execute operations against a Camel route (stop, start, etc.):

    // routeID and operationName are supplied by the enclosing method
    ObjectName objName = new ObjectName("org.apache.camel:type=routes,*");
    List<ObjectName> cacheList = new LinkedList<ObjectName>(server.queryNames(objName, null));
    for (Iterator<ObjectName> iter = cacheList.iterator(); iter.hasNext();) {
        objName = iter.next();
        String keyProps = objName.getCanonicalKeyPropertyListString();
        if (keyProps.contains(routeID)) {
            ObjectName objectRouteName = new ObjectName("org.apache.camel:" + keyProps);
            Object[] params = {};   // the operation takes no arguments
            String[] sig = {};      // so its signature is empty too
            server.invoke(objectRouteName, operationName, params, sig);
            return;
        }
    }

Summary

These APIs can easily be used to build a web or command-line tool to support remote Camel management features. All of these features are available via the JMX console, and Camel does provide a web console to support some management/monitoring tasks. See these pages for more information:

http://camel.apache.org/camel-jmx.html
http://camel.apache.org/web-console.html
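To try the query-then-invoke pattern without a running Camel instance, the same JMX calls can be pointed at the JVM's own platform MBean server. The sketch below is an assumption-laden stand-in: `FakeRoute`, `FakeRouteMBean`, the `demo.routes` domain and the helper names are all hypothetical and are not Camel's MBeans. Only `queryNames`, `invoke` and `getAttribute` are the same standard JMX calls used above.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

final class JmxRouteDemo {
    /** Stand-in management interface; real Camel route MBeans expose far more. */
    public interface FakeRouteMBean {
        String getState();
        void start();
        void stop();
    }

    public static class FakeRoute implements FakeRouteMBean {
        private volatile String state = "Started";
        public String getState() { return state; }
        public void start() { state = "Started"; }
        public void stop() { state = "Stopped"; }
    }

    /** Registers a fake route in the local platform MBean server (no-op if already present). */
    static void register(String routeId) {
        try {
            MBeanServer local = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("demo.routes:type=routes,name=" + routeId);
            if (!local.isRegistered(name)) {
                local.registerMBean(new FakeRoute(), name);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Finds the route with a wildcard query, invokes the operation, returns the new State. */
    static String invokeOnRoute(String routeId, String operationName) {
        try {
            MBeanServerConnection server = ManagementFactory.getPlatformMBeanServer();
            ObjectName pattern = new ObjectName("demo.routes:type=routes,*");
            for (ObjectName name : server.queryNames(pattern, null)) {
                if (routeId.equals(name.getKeyProperty("name"))) {
                    server.invoke(name, operationName, new Object[0], new String[0]);
                    return (String) server.getAttribute(name, "State");
                }
            }
            return null; // no route with that id
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        register("route1");
        System.out.println(invokeOnRoute("route1", "stop"));
    }
}
```

Matching on the `name` key property with `equals` avoids the false positives that a substring test on the whole key-property string could produce when one route id is a prefix of another.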

July 30, 2012

by Ben O'Day

· 12,048 Views

How Changing Java Package Names Transformed my System Architecture

Changing your perspective even a small amount can have profound effects on how you approach your system. Let’s say you’re writing a web application in Java. In the system you deal with orders, customers and products. As a web application, your classes include staples like PersonController, PersonRepository, CustomerController and OrderService. How do you organize your classes into packages?

There are two fundamental ways to structure your packages. Either you can focus on the logical tiers, like com.brodwall.myapp.controllers, com.brodwall.myapp.domain or perhaps com.brodwall.myapp.services.customer. Or you can focus on the domain contexts, like com.brodwall.myapp.customer, com.brodwall.myapp.orders and com.brodwall.myapp.products. The first approach is by far the most prevalent. In my view, it’s also the least helpful. Here are some ways your thinking changes if you structure your packages around domain concepts, rather than technological tiers:

First, and most fundamentally, your mental model will now be aligned with that of the users of your system. If you’re asked to implement a typical feature, it is now more likely to be focused around a strict subset of the packages of your system. For example, adding a new field to a form will at least affect the presentation logic, entity and persistence layer for the corresponding domain concept. If your packages are organized around tiers, this change will hit all over your system. In a word: a system organized around features, rather than technologies, has higher coherence. This technical term means that a large percentage of the dependencies of a class are located close to that class.

Secondly, organizing around domain concepts will give you more options when your software grows. When a package contains tens of classes, you may want to split it up into several packages. The discussion can itself be enlightening. “Maybe we should separate out the customer address classes into a com.brodwall.myapp.customer.address package. It seems to have a bit of a life of its own.” “Yeah, and maybe we can use the same classes for other places we need addresses, such as suppliers?” “Cool, so com.brodwall.myapp.address, then?” Or maybe you decide that order status codes and payment status codes deserve to be in the “com.brodwall.myapp.order.codes” package. On the other hand, what options do you have for splitting up com.brodwall.myapp.controllers? You could create subpackages for customer, orders and products, but these subpackages may only have one or possibly two classes each.

Finally, and perhaps most intriguingly, using domain concepts for packages allows you to vary the design on a case-by-case basis. Maybe you really need an OrderService which coordinates the payment and shipping of an order, while ProductController only needs basic create-retrieve-update-delete functionality with a repository. A ProductService would just get in the way. If ProductService is missing from the com.brodwall.myapp.services package, this may be confusing or at the very least give you a nagging feeling that something is wrong. On the other hand, if there’s no Controller in the com.brodwall.myapp.product package, it doesn’t matter much.

Also, most systems have some good parts and some not-so-good parts. If your Services package is not working for you, there’s not much you can do. But if the Products package is rotten, you can throw it out and reimplement it without the whole system being thrown into a state of chaos. By putting the classes needed to implement a feature together with each other and apart from the classes needed to implement other features, developers can be pragmatic and innovative when developing one feature without negatively affecting other features.

The flip side of this is that most developers are more comfortable with some technologies in the application and less comfortable with others. Organizing around features instead of technologies forces each developer to consider a larger set of technological challenges. Some programmers take this as a motivating challenge to learn, while others, it seems, would rather not have to learn something new. If it were my money being spent to create features, I know what kind of developer I would want.

Trivial changes can have large effects. By organizing your software around features, you get a more coherent system that allows for growth. It may challenge your developers, but it drives down the number of hand-offs needed to implement a feature and it challenges the developers to improve the parts of the application they are working on. See also my blog post on Architecture as tidying up.

July 20, 2012

by Johannes Brodwall

· 17,518 Views

Everything You Need To Know About Couchbase Architecture

After receiving a lot of good feedback and comments on my last blog post on MongoDB, I was encouraged to do another deep dive on another popular document-oriented DB: Couchbase. I have been a long-time fan of CouchDB and wrote a blog post on it many years ago. Now that it has merged with Membase, I am very excited to take a deep look into it again.

Couchbase is the merger of two popular NOSQL technologies:

Membase, which adds persistence, replication and sharding to the high-performance memcached technology
CouchDB, which pioneered the document-oriented model based on JSON

Like other NOSQL technologies, both Membase and CouchDB are built from the ground up on a highly distributed architecture, with data sharded across machines in a cluster. Built around the Memcached protocol, Membase provides an easy migration path for existing Memcached users who want to add persistence, sharding and fault resilience to their familiar Memcached model. On the other hand, CouchDB provides first-class support for storing JSON documents as well as a simple RESTful API to access them. Underneath, CouchDB also has a highly tuned storage engine that is optimized for both update transactions and query processing. Taking the best of both technologies, Couchbase is well-positioned in the NOSQL marketplace.

Programming model

Couchbase provides client libraries for different programming languages such as Java / .NET / PHP / Ruby / C / Python / Node.js.

For reads, Couchbase provides a key-based lookup mechanism where the client is expected to provide the key, and only the server hosting the data (with that key) will be contacted. Couchbase also provides a query mechanism to retrieve data, where the client provides a query (for example, a range based on some secondary key) as well as the view (basically the index). The query will be broadcast to all servers in the cluster and the results will be merged and sent back to the client.

For writes, Couchbase provides a key-based update mechanism where the client sends in an updated document with the key (as doc id). When handling a write request, the server will return to the client as soon as the data is stored in RAM on the active server, which offers the lowest latency for write requests.

Following is the core API that Couchbase offers (in an abstract sense):

    # Get a document by key
    doc = get(key)

    # Modify a document; notice the whole document
    # needs to be passed in
    set(key, doc)

    # Modify a document when no one has modified it
    # since my last read
    casVersion = doc.getCas()
    cas(key, casVersion, changedDoc)

    # Create a new document, with an expiration time
    # after which the document will be deleted
    addIfNotExist(key, doc, timeToLive)

    # Delete a document
    delete(key)

    # When the value is an integer, increment the integer
    increment(key)

    # When the value is an integer, decrement the integer
    decrement(key)

    # When the value is an opaque byte array, append more
    # data to the existing value
    append(key, newData)

    # Query the data
    results = query(viewName, queryParameters)

In Couchbase, the document is the unit of manipulation. Currently Couchbase doesn't support server-side execution of custom logic. The Couchbase server is basically a passive store and, unlike other document-oriented DBs, Couchbase doesn't support field-level modification. To modify a document, the client needs to retrieve the document by its key, do the modification locally and then send the whole (modified) document back to the server. This design trades network bandwidth (since more data will be transferred across the network) for server CPU (the CPU load shifts to the client). Couchbase currently doesn't support bulk modification based on condition matching. Modification happens only on a per-document basis (the client saves the modified documents one at a time).

Transaction Model

Similar to many NOSQL databases, Couchbase’s transaction model is primitive compared to an RDBMS. Atomicity is guaranteed at the level of a single document, and transactions that span updates of multiple documents are unsupported. To provide the necessary isolation for concurrent access, Couchbase provides a CAS (compare and swap) mechanism which works as follows. When the client retrieves a document, a CAS ID (equivalent to a revision number) is attached to it. While the client is manipulating the retrieved document locally, another client may modify this document. When this happens, the CAS ID of the document at the server will be incremented. Now, when the original client submits its modification to the server, it can attach the original CAS ID to its request. The server will compare this ID with the actual ID on the server. If they differ, the document has been updated in between and the server will not apply the update. The original client will re-read the document (which now has a newer ID) and re-submit its modification.

Couchbase also provides a locking mechanism for clients to coordinate their access to documents. A client can request a LOCK on the document it intends to modify, update the document and then release the LOCK. To prevent deadlock, each LOCK grant has a timeout so it will automatically be released after a period of time.

Deployment Architecture

In a typical setting, a Couchbase DB resides in a server cluster involving multiple machines. The client library will connect to the appropriate servers to access the data. Each machine contains a number of daemon processes which provide data access as well as management functions. The data server, written in C/C++, is responsible for handling get/set/delete requests from clients. The management server, written in Erlang, is responsible for handling the query traffic from clients, as well as managing the configuration and communicating with other member nodes in the cluster.
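The CAS read-modify-write cycle from the Transaction Model section can be sketched with a toy in-memory store. This is a minimal illustration under assumptions, not the Couchbase client API: `CasStore`, `Versioned` and `update` are hypothetical names, the CAS ID is modelled as a simple counter, and `update` assumes the key already exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Toy in-memory store with per-document CAS IDs (hypothetical, for illustration only). */
final class CasStore {
    static final class Versioned {
        final String doc;
        final long cas;
        Versioned(String doc, long cas) { this.doc = doc; this.cas = cas; }
    }

    private final Map<String, Versioned> data = new HashMap<>();

    synchronized Versioned get(String key) { return data.get(key); }

    /** Unconditional write; bumps the CAS ID. */
    synchronized void set(String key, String doc) {
        Versioned old = data.get(key);
        data.put(key, new Versioned(doc, old == null ? 1 : old.cas + 1));
    }

    /** Applies the write only if expectedCas still matches; returns whether it was applied. */
    synchronized boolean cas(String key, long expectedCas, String newDoc) {
        Versioned current = data.get(key);
        if (current == null || current.cas != expectedCas) return false;
        data.put(key, new Versioned(newDoc, current.cas + 1));
        return true;
    }

    /** Client-side loop: re-read and re-apply the change until the CAS write succeeds. */
    static String update(CasStore store, String key, UnaryOperator<String> change) {
        while (true) {
            Versioned seen = store.get(key);           // read document plus CAS ID
            String changed = change.apply(seen.doc);   // modify locally
            if (store.cas(key, seen.cas, changed)) {   // submit with the original CAS ID
                return changed;
            }
            // Someone else wrote in between: loop to re-read and retry.
        }
    }
}
```

The retry loop is optimistic: it costs nothing when there is no contention, but a heavily contended document can force many re-reads, which is where the LOCK mechanism becomes the better fit.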
Virtual Buckets

The basic unit of data storage in Couchbase DB is a JSON document (or a primitive data type such as an int or byte array) which is associated with a key. The overall key space is partitioned into 1024 logical storage units called "virtual buckets" (or vBuckets). vBuckets are distributed across machines within the cluster via a map that is shared among the servers in the cluster as well as the client library.

High availability is achieved through data replication at the vBucket level. Currently Couchbase supports one active vBucket and zero or more standby replicas hosted on other machines. Currently the standby servers are idle and do not serve any client requests. In a future version of Couchbase, the standby replicas will be able to serve read requests.

Load balancing in Couchbase is achieved as follows:

Keys are uniformly distributed based on the hash function.
When machines are added to or removed from the cluster, the administrator can request a redistribution of vBuckets so that data is evenly spread across physical machines.

Management Server

The management server performs the management functions and coordinates the other nodes within the cluster. It includes the following monitoring and administration functions:

Heartbeat: A watchdog process periodically communicates with all member nodes within the same cluster to provide Couchbase Server health updates.
Process monitor: This subsystem monitors execution of the local data manager, restarting failed processes as required and providing status information to the heartbeat module.
Configuration manager: Each Couchbase Server node shares a cluster-wide configuration which contains the member nodes within the cluster and a vBucket map. The configuration manager pulls this config from other member nodes at bootup time.
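The shared vBucket map is what lets a client route a key to the right server without asking anyone. A minimal sketch of that two-step lookup (key to vBucket, vBucket to server) follows. Several things here are assumptions for illustration: `VBucketRouter` and its methods are hypothetical names, CRC32 stands in for "the hash function" (the article does not name one), and the round-robin initial placement is not how a real cluster computes its map.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class VBucketRouter {
    static final int NUM_VBUCKETS = 1024; // figure taken from the article

    // index = vBucket id, value = server currently hosting the active copy
    private final String[] vbucketToServer = new String[NUM_VBUCKETS];

    /** Builds a map that spreads vBuckets round-robin over the servers (assumed placement). */
    VBucketRouter(String... servers) {
        for (int vb = 0; vb < NUM_VBUCKETS; vb++) {
            vbucketToServer[vb] = servers[vb % servers.length];
        }
    }

    /** Hashes the key into a vBucket id; CRC32 is an assumption, see the note above. */
    static int vbucketOf(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % NUM_VBUCKETS);
    }

    /** The only server a key-based get or set needs to contact. */
    String serverFor(String key) {
        return vbucketToServer[vbucketOf(key)];
    }

    /** Failover or rebalance: point one vBucket at a different server. */
    void reassign(int vbucket, String server) {
        vbucketToServer[vbucket] = server;
    }
}
```

The indirection is the point of the design: the key-to-vBucket hash never changes, so moving data between machines only requires updating map entries, not rehashing keys.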
Within a cluster, one node’s Management Server will be elected as the leader which performs the following cluster-wide management function Controls the distribution of vBuckets among other nodes and initiate vBucket migration Orchestrates the failover and update the configuration manager of member nodes If the leader node crashes, a new leader will be elected from surviving members in the cluster. When a machine in the cluster has crashed, the leader will detect that and notify member machines in the cluster that all vBuckets hosted in the crashed machine is dead. After getting this signal, machines hosting the corresponding vBucket replica will set the vBucket status as “active”. The vBucket/server map is updated and eventually propagated to the client lib. Notice that at this moment, the replication level of the vBucket will be reduced. Couchbase doesn’t automatically re-create new replicas which will cause data copying traffic. Administrator can issue a command to explicitly initiate a data rebalancing. The crashed machine, after reboot can rejoin the cluster. At this moment, all the data it stores previously will be completely discard and the machine will be treated as a brand new empty machine. As more machines are put into the cluster (for scaling out), vBucket should be redistributed to achieve a load balance. This is currently triggered by an explicit command from the administrator. Once receive the “rebalance” command, the leader will compute the new provisional map which has the balanced distribution of vBuckets and send this provisional map to all members of the cluster. To compute the vBucket map and migration plan, the leader attempts the following objectives: Evenly distribute the number of active vBuckets and replica vBuckets among member nodes. Place the active copy and each replicas in physically separated nodes. Spread the replica vBucket as wide as possible among other member nodes. 
Minimize the amount of data migration Orchestrate the steps of replica redistribution so no node or network will be overwhelmed by the replica migration. Once the vBucket maps is determined, the leader will pass the redistribution map to each member in the cluster and coordinate the steps of vBucket migration. The actual data transfer happens directly between the origination node to the destination node. Notice that since we have generally more vBuckets than machines. The workload of migration will be evenly distributed automatically. For example, when new machines are added into the clusters, all existing machines will migrate some portion of its vBucket to the new machines. There is no single bottleneck in the cluster. Throughput the migration and redistribution of vBucket among servers, the life cycle of a vBucket in a server will be in one of the following states “Active”: means the server is hosting the vBucket is ready to handle both read and write request “Replica”: means the server is hosting the a copy of the vBucket that may be slightly out of date but can take read request that can tolerate some degree of outdate. “Pending”: means the server is hosting a copy that is in a critical transitional state. The server cannot take either read or write request at this moment. “Dead”: means the server is no longer responsible for the vBucket and will not take either read or write request anymore. Data Server Data server implements the memcached APIs such as get, set, delete, append, prepend, etc. It contains the following key datastructure: One in-memory hashtable (key by doc id) for the corresponding vBucket hosted. The hashtable acts as both a metadata for all documents as well as a cache for the document content. Maintain the entry gives a quick way to detect whether the document exists on disk. 
To support asynchronous writes, there is also a checkpoint linked list per vBucket that holds the doc ids of modified documents that haven't yet been flushed to disk or replicated to the replica.

To handle a "GET" request:

- The data server routes the request to the ep-engine responsible for the vBucket.
- The ep-engine looks up the document id in the in-memory hashtable. If the document content is found in the cache (stored in the value of the hashtable), it is returned.
- Otherwise, a background disk fetch task is created and queued into the RO dispatcher queue.
- The RO dispatcher reads the value from the underlying storage engine and populates the corresponding entry in the vBucket hashtable.
- Finally, the notification thread notifies the pending memcached connection that the disk fetch has completed, so that the memcached worker thread can revisit the engine to process the get request.

To handle a "SET" request, a success response is returned to the calling client once the updated document has been put into the in-memory hashtable and a write request has been put into the checkpoint buffer. Later, the Flusher thread picks up the outstanding write requests from each checkpoint buffer, looks up the corresponding document content in the hashtable and writes it out to the storage engine.

Of course, data can be lost if the server crashes before the data has been replicated to another server and/or persisted. If the client requires high data availability across crashes, it can issue a subsequent observe() call, which blocks until the server has persisted the data to disk or has replicated the data to another server (and received its ACK). Overall, the client has various options for trading off data integrity against throughput.

Hashtable Management

To synchronize access to a vBucket hashtable, each incoming thread needs to acquire a lock before accessing a key region of the hashtable.
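Per-region locking of this kind is often implemented as lock striping. The following is only a minimal sketch of that general technique, not Couchbase's actual code: the class and method names are hypothetical, the region is chosen by hashing the document id (an assumption), and the number of regions is fixed for simplicity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Toy hashtable split into key regions, each guarded by its own lock (hypothetical names). */
final class StripedHashtableSketch {
    private final List<ReentrantLock> locks = new ArrayList<>();
    private final List<Map<String, String>> regions = new ArrayList<>();

    StripedHashtableSketch(int regionCount) {
        for (int i = 0; i < regionCount; i++) {
            locks.add(new ReentrantLock());
            regions.add(new HashMap<String, String>());
        }
    }

    /** Index of the key region (and lock) responsible for a document id. */
    int regionOf(String docId) {
        return Math.floorMod(docId.hashCode(), regions.size());
    }

    void put(String docId, String content) {
        int region = regionOf(docId);
        ReentrantLock lock = locks.get(region);
        lock.lock(); // only threads touching this region have to wait
        try {
            regions.get(region).put(docId, content);
        } finally {
            lock.unlock();
        }
    }

    String get(String docId) {
        int region = regionOf(docId);
        ReentrantLock lock = locks.get(region);
        lock.lock();
        try {
            return regions.get(region).get(docId);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final StripedHashtableSketch table = new StripedHashtableSketch(8);
        Thread writerA = new Thread(() -> { for (int i = 0; i < 1000; i++) table.put("a" + i, "A"); });
        Thread writerB = new Thread(() -> { for (int i = 0; i < 1000; i++) table.put("b" + i, "B"); });
        writerA.start();
        writerB.start();
        writerA.join();
        writerB.join();
        // Both writers' last entries are visible once the threads have finished.
        System.out.println(table.get("a999") + table.get("b999"));
    }
}
```

Two threads that touch different regions never wait on each other, which is the point of having more than one lock per hashtable.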
There are multiple locks per vBucket hashtable, each of which controls exclusive access to a certain key region of that hashtable. The number of regions in a hashtable can grow dynamically as more documents are inserted.

To control the memory size of the hashtable, the Item pager thread monitors its memory utilization. Once a high watermark is reached, it initiates an eviction process that removes certain document content from the hashtable. Only entries that are not referenced by entries in the checkpoint buffer can be evicted, because otherwise the outstanding update (which exists only in the hashtable and has not been persisted) would be lost. After eviction, the document's entry still remains in the hashtable; only the document content is removed from memory, while the metadata stays. The eviction process stops once the low watermark is reached. The high and low watermarks are determined by the bucket memory quota. By default, the high watermark is set to 75% of the bucket quota and the low watermark to 60% of the bucket quota. These watermarks can be configured at runtime.

In Couchbase, every document is associated with an expiration time and is deleted once it has expired. The Expiry pager is responsible for tracking expired documents and removing them from both the hashtable and the storage engine (by scheduling a delete operation).

Checkpoint Manager

The checkpoint manager is responsible for recycling the checkpoint buffer, which holds the outstanding update requests consumed by the two downstream processes, the Flusher and the TAP replicator. When all the requests in the checkpoint buffer have been processed, the checkpoint buffer is deleted and a new one is created.

TAP Replicator

The TAP replicator is responsible for handling vBucket migration as well as vBucket replication from the active server to the replica server.
It does this by propagating the latest modified documents to the corresponding replica server.

When a replica vBucket is established, the entire vBucket needs to be copied from the active server to the empty destination replica server, as follows:

- The in-memory hashtable at the active server is transferred to the replica server. Notice that during this period some data may be updated, and therefore the data set transferred to the replica can be inconsistent (some documents are the latest and some are outdated).
- Nevertheless, all updates that happen after the start of the transfer are tracked in the checkpoint buffer. Therefore, after the in-memory hashtable transfer is completed, the TAP replicator can pick up those updates from the checkpoint buffer. This ensures that the latest versions of changed documents are sent to the replica, which fixes the inconsistency.
- However, the hashtable cache doesn’t contain all the document content. Data also needs to be read from the vBucket file and sent to the replica. Notice that during this period updates to the vBucket still happen at the active server. However, since the file is append-only, subsequent data updates won’t interfere with the vBucket copying process.

After the replica server has caught up, subsequent updates at the active server become available in its checkpoint buffer, where they are picked up by the TAP replicator and sent to the replica server.

CouchDB Storage Structure

The data server defines an interface through which different storage structures can be plugged in. Currently it supports both SQLite and CouchDB. Here we describe the details of CouchDB, which provides a very high performance storage mechanism underneath the Couchbase technology.

Under the CouchDB structure, there is one file per vBucket. Data is written to this file in an append-only manner, which enables Couchbase to do mostly sequential writes for updates and gives the most efficient access pattern for disk I/O.
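To make the append-only idea concrete, here is a minimal sketch in which an in-memory byte array stands in for the vBucket file. It assumes a record framing of [length, crc, content] with a 4-byte length and a CRC32 checksum stored in 8 bytes; the class name, the field widths and the plain map used as the by-id index are all illustrative assumptions rather than Couchbase's on-disk format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

/** Toy append-only store: a byte array stands in for the vBucket file (hypothetical names). */
final class AppendOnlyStoreSketch {
    private final ByteArrayOutputStream file = new ByteArrayOutputStream();
    private final Map<String, Integer> offsetById = new HashMap<>(); // doc id -> offset of latest record

    /** Appends a [length, crc, content] record; an update never overwrites the old record. */
    void put(String docId, String content) {
        byte[] body = content.getBytes(StandardCharsets.UTF_8);
        CRC32 crc = new CRC32();
        crc.update(body, 0, body.length);
        ByteBuffer record = ByteBuffer.allocate(4 + 8 + body.length);
        record.putInt(body.length).putLong(crc.getValue()).put(body);
        offsetById.put(docId, file.size()); // the index now points at the new copy
        file.write(record.array(), 0, record.capacity());
    }

    /** Reads the latest record for a document and verifies its checksum. */
    String get(String docId) {
        Integer offset = offsetById.get(docId);
        if (offset == null) {
            return null;
        }
        ByteBuffer in = ByteBuffer.wrap(file.toByteArray());
        in.position(offset);
        byte[] body = new byte[in.getInt()];
        long storedCrc = in.getLong();
        in.get(body);
        CRC32 crc = new CRC32();
        crc.update(body, 0, body.length);
        if (crc.getValue() != storedCrc) {
            throw new IllegalStateException("CRC mismatch for " + docId);
        }
        return new String(body, StandardCharsets.UTF_8);
    }

    int fileSize() {
        return file.size();
    }

    public static void main(String[] args) {
        AppendOnlyStoreSketch store = new AppendOnlyStoreSketch();
        store.put("doc1", "v1");
        store.put("doc1", "v2"); // the first record stays in the file as garbage
        System.out.println(store.get("doc1") + " " + store.fileSize());
    }
}
```

After the second put the file holds both versions of doc1, but only the newer one is reachable through the index. The unreachable first record is the kind of garbage that compaction later removes.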
This unique storage structure contributes to Couchbase’s fast on-disk performance for write-intensive applications.

The following diagram illustrates the storage model and how it is modified by 3 batch updates (notice that since updates are asynchronous, they are performed by the "Flusher" thread in batches).

The Flusher thread works as follows:

1) Pick up all pending write requests from the dirty queue and de-duplicate multiple update requests to the same document.
2) Sort each request (by key) into its corresponding vBucket and open the corresponding file.
3) Append the following to the vBucket file (in the following contiguous sequence):
   - All document contents in the write request batch. Each document is written as [length, crc, content], one after another sequentially.
   - The index that stores the mapping from document id to the document’s position on disk (called the by-id BTree).
   - The index that stores the mapping from update sequence number to the document’s position on disk (called the by-seq BTree).

The by-id index plays an important role in looking up a document by its id. It is organized as a B-Tree where each node contains a key range. To look up a document by id, we start from the header (which is at the end of the file), move to the root BTree node of the by-id index, and then traverse down to the leaf BTree node that contains the pointer to the actual document position on disk.

During a write, a similar mechanism is used to trace back to the BTree node that contains the id of the modified document. Notice that in the append-only model updates do not happen in place; instead, we locate the existing node and copy it over by appending. In other words, the modified BTree node needs to be copied, modified and finally appended to the end of the file. Its parent then needs to be modified to point to the new location, which causes the parent to be copied and appended to the end of the file as well.
The same happens to the parent’s parent, and so on all the way up to the root node of the BTree. The number of disk seeks is therefore O(log N).

The by-seq index keeps track of the update sequence of live documents and is used for asynchronous catch-up purposes. When a document is created, modified or deleted, a new sequence number is added to the by-seq BTree and the document's previous sequence entry is deleted. Therefore, for cross-site replication, view index updates and compaction, we can quickly locate all the live documents in the order of their update sequence. When a vBucket replicator asks for the list of updates since a particular time, it provides the last sequence number from the previous update. The system then scans the by-seq BTree to locate all the documents with a larger sequence number, which effectively includes all the documents that have been modified since the last replication.

As time goes by, certain data becomes garbage (see the greyed-out region above) and becomes unreachable in the file. Therefore, we need a garbage collection mechanism to clean it up. To trigger this process, the by-id and by-seq BTree nodes keep track of the data size of the live documents (those that are not garbage) under their subtrees. Therefore, by examining the root BTree node, we can determine the size of all live documents within the vBucket. When the ratio of live data size to vBucket file size falls below a certain threshold, a compaction process is triggered whose job is to open the vBucket file and copy the surviving data to another file.

Technically, the compaction process opens the file and reads the by-seq BTree at the end of the file. It traces the BTree all the way to the leaf nodes and copies the corresponding document content to the new file. The compaction process happens while the vBucket is being updated.
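The by-seq bookkeeping described above can be sketched with a sorted map. This is an illustrative model only: a TreeMap stands in for the on-disk BTree, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy by-seq index: update sequence number -> doc id, with one live entry per document. */
final class BySeqIndexSketch {
    private final TreeMap<Long, String> bySeq = new TreeMap<>(); // seq -> doc id, kept sorted
    private final Map<String, Long> latestSeq = new HashMap<>(); // doc id -> its current seq
    private long nextSeq = 1;

    /** Records a create/update/delete of a document and returns the sequence number assigned. */
    long recordUpdate(String docId) {
        Long previous = latestSeq.get(docId);
        if (previous != null) {
            bySeq.remove(previous); // the document's previous sequence entry is dropped
        }
        long seq = nextSeq++;
        bySeq.put(seq, docId);
        latestSeq.put(docId, seq);
        return seq;
    }

    /** Doc ids changed after the given sequence number, in update order. */
    List<String> changesSince(long lastSeenSeq) {
        return new ArrayList<>(bySeq.tailMap(lastSeenSeq, false).values());
    }

    public static void main(String[] args) {
        BySeqIndexSketch index = new BySeqIndexSketch();
        index.recordUpdate("a");                 // seq 1
        long lastSeen = index.recordUpdate("b"); // seq 2, where the replicator last caught up
        index.recordUpdate("a");                 // seq 3, replaces the entry at seq 1
        index.recordUpdate("c");                 // seq 4
        System.out.println(index.changesSince(lastSeen)); // documents changed after seq 2
    }
}
```

Because each document keeps only its latest sequence entry, a replicator that passes in its last seen sequence number gets each changed document once, no matter how many times it was updated in between.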
However, since the file is append-only, new changes are recorded after the BTree root that the compaction process has opened, so subsequent data updates won’t interfere with the compaction process. When the compaction is completed, the system needs to copy the data that was appended since the beginning of the compaction over to the new file.

View Index Structure

Unlike most indexing structures, which provide a pointer from the search attribute back to the document, the CouchDB index (called the View Index) is better perceived as a denormalized table with arbitrary keys and values loosely associated with the document. This denormalized table is defined by user-provided map() and reduce() functions.

    map = function(doc) {
      …
      emit(k1, v1)
      …
      emit(k2, v2)
      …
    }

    reduce = function(keys, values, isRereduce) {
      if (isRereduce) {
        // Do the re-reduce only on values (keys will be null)
      } else {
        // Do the reduce on keys and values
      }
      // result must be ready for input values to re-reduce
      return result
    }

Whenever a document is created, updated or deleted, the corresponding map(doc) function is invoked (asynchronously) to generate a set of key/value pairs. These key/value pairs are stored in a B-Tree structure. All the key/value pairs of each B-Tree node are passed into the reduce() function, which computes an aggregated value for that B-Tree node. Re-reduce also happens in non-leaf B-Tree nodes, which further aggregate the aggregated values of their child B-Tree nodes.

The management server maintains the view index and persists it to a separate file.

Creating a view index is performed by broadcasting the index creation request to all machines in the cluster. The management process of each machine reads its active vBucket files and feeds each surviving document to the map function. The key/value pairs emitted by the map function are stored in a separate BTree index file. When a BTree node is written out, the reduce() function is called with the list of all values in the tree node.
Its return value, which represents a partially reduced value, is attached to the BTree node.

The view index is updated incrementally as documents subsequently enter the system. Periodically, the management process opens the vBucket file and scans all documents changed since the last sequence number. For each document changed since the last sync, it invokes the corresponding map function to determine the key/value pairs to insert into the BTree node. The BTree node is split if appropriate.

Under the hood, Couchbase uses a back index to keep track of each document together with the keys that it previously emitted. Later, when the document is deleted, Couchbase can look up the back index to determine what those keys are and remove them. When the document is updated, the back index can also be examined; semantically, a modification is equivalent to a delete followed by an insert.

The following diagram illustrates how the view index file is incrementally updated via the append-only mechanism.

Query Processing

A query in Couchbase is made against the view index. A query is composed of the view name, a start key and an end key. If the reduce() function isn’t defined, the query result is the list of values sorted by key within the key range. If the reduce() function is defined, the query result is a single aggregated value over all keys within the key range.

If the view has no reduce() function defined, query processing proceeds as follows:

- The client issues a query (with view, start/end key) to the management process of any server (unlike a key-based lookup, there is no need to locate a specific server).
- The management process broadcasts the request to the management processes on all servers (including itself) within the cluster.
- Each management process (after receiving the broadcast request) does a local search for values within the key range by traversing the BTree nodes of its view file, and starts sending the results (automatically sorted by key) back to the initial server.
- The initial server merges the sorted results and streams them back to the client.

However, if the view has a reduce() function defined, query processing involves computing a single aggregated value, as follows:

- The client issues a query (with view, start/end key) to the management process of any server (unlike a key-based lookup, there is no need to locate a specific server).
- The management process broadcasts the request to the management processes on all servers (including itself) within the cluster.
- Each management process does a local reduce over the values within the key range by traversing the BTree nodes of its view file. If the key range spans an entire BTree node, the pre-computed value for that sub-range can be used. This way, the reduce function can reuse many partially reduced values and doesn’t need to recompute every value in the key range from scratch.
- The original server does a final re-reduce() over the values returned from every server, and then passes the final reduced value back to the client.

To illustrate the re-reduce concept, let's say the query has a key range from A to F. Instead of calling reduce([A,B,C,D,E,F]), the system recognizes that the BTree node containing [B,C,D] has been pre-reduced and that the result P is stored in the BTree node, so it only needs to call reduce(A,P,E,F).

Update View Index as vBucket migrates

Since the view index is synchronized with the vBuckets on the same server, when a vBucket has migrated to a different server the view index is no longer correct; the key/value pairs that belong to the migrated vBucket should be discarded, and the reduce values can no longer be used.
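Going back to the re-reduce illustration with the key range A to F, the reuse of the pre-reduced value P can be checked with a small sketch. It assumes a reduce function that simply sums its inputs and uses made-up values for the keys; a real view would call the user's reduce() with the isRereduce flag instead.

```java
import java.util.Arrays;
import java.util.List;

/** Toy re-reduce: a sum stands in for a user-defined reduce function (hypothetical names). */
final class ReReduceSketch {
    /** reduce(): here a plain sum, so a partial result is valid input to a later call. */
    static long reduce(List<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Values emitted for keys A..F (made-up numbers).
        long a = 1, b = 2, c = 3, d = 4, e = 5, f = 6;

        // P is the value stored in the BTree node covering [B, C, D] when the node was written.
        long p = reduce(Arrays.asList(b, c, d));

        long fromScratch = reduce(Arrays.asList(a, b, c, d, e, f));
        long reusingP = reduce(Arrays.asList(a, p, e, f)); // re-reduce over A, P, E, F

        System.out.println(fromScratch + " " + reusingP); // both computations agree
    }
}
```

The shortcut only holds when the output of reduce is valid as an input value to a later call, which is what the comment in the reduce() template ("result must be ready for input values to re-reduce") asks for.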
To keep track of the vBuckets and keys in the view index, each BTree node has a 1024-bit mask indicating all the vBuckets that are covered by its subtree (i.e., the subtree contains a key emitted from a document belonging to that vBucket). This bitmask is maintained whenever the BTree node is updated. At the server level, a global bitmask indicates all the vBuckets that the server is responsible for.

When processing a query against a map-only view, an extra check is performed on each key/value pair before it is returned to make sure its associated vBucket is one that this server is responsible for.

When processing a query against a view that has a reduce() function, we cannot use the pre-computed reduce value if the BTree node contains a vBucket that the server is not responsible for. In this case, the BTree node’s bitmask is compared with the global bitmask. If they are not aligned, the reduce value needs to be recomputed.

Here is an example to illustrate this process.

Couchbase is one of the popular NOSQL technologies, built on a solid technology foundation designed for high performance. In this post, we have examined a number of its key features:

- Load balancing between servers inside a cluster that can grow and shrink according to workload conditions. Data migration can be used to re-establish workload balance.
- Asynchronous writes provide the lowest possible latency to the client, since a write returns once the data is stored in memory.
- The append-only update model turns most update transactions into sequential disk access, and hence provides extremely high throughput for write-intensive applications.
- Automatic compaction keeps the data layout on disk optimized at all times.
- The map function can be used to pre-compute a view index to enable query access.
- Summary data can be pre-aggregated using the reduce function. Overall, this cuts down the workload of query processing dramatically.
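As a footnote to the view-index bitmask check described in the vBucket migration section, here is a minimal sketch using java.util.BitSet as the 1024-bit mask. The class and method names are hypothetical, and "aligned" is interpreted here as "every vBucket in the node's mask is also in the server's mask", which is an assumption about the comparison.

```java
import java.util.BitSet;

/** Toy version of the per-node and per-server vBucket bitmask checks (hypothetical names). */
final class VBucketMaskSketch {
    static final int NUM_VBUCKETS = 1024;

    /** True when every vBucket covered by the BTree node is owned by this server. */
    static boolean canReusePrecomputedReduce(BitSet nodeMask, BitSet serverMask) {
        BitSet notOwned = (BitSet) nodeMask.clone();
        notOwned.andNot(serverMask); // bits set in the node but not on the server
        return notOwned.isEmpty();
    }

    /** Per-row check for map-only views: is the row's vBucket owned by this server? */
    static boolean shouldReturnRow(int vBucketId, BitSet serverMask) {
        return serverMask.get(vBucketId);
    }

    public static void main(String[] args) {
        BitSet serverMask = new BitSet(NUM_VBUCKETS);
        serverMask.set(0, 512); // this server owns vBuckets 0..511

        BitSet nodeMask = new BitSet(NUM_VBUCKETS);
        nodeMask.set(10);
        nodeMask.set(20);
        System.out.println(canReusePrecomputedReduce(nodeMask, serverMask)); // all covered vBuckets are owned

        nodeMask.set(700); // the node also holds a key from a vBucket that has migrated away
        System.out.println(canReusePrecomputedReduce(nodeMask, serverMask));
        System.out.println(shouldReturnRow(700, serverMask));
    }
}
```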
For a review of NOSQL architecture in general and some of its theoretical foundations, I have written a NOSQL design pattern blog post, as well as one on some fundamental differences between SQL and NOSQL. For other NOSQL technologies, please read my other blog posts on MongoDb, Cassandra, HBase and Memcached.

Special thanks to Damien Katz and Frank Weigel from the Couchbase team, who provided a lot of implementation details of Couchbase.

July 7, 2012

by Ricky Ho

· 84,879 Views · 5 Likes

Apache Camel Monitoring

I've seen a lot of discussion about how to monitor Camel-based applications. Most people are looking for the following features: the ability to view services (contexts, endpoints, routes), to view performance statistics (route throughput, etc.) and to perform basic operations (start/stop routes, send messages, etc.).

This post will break down the options (that I know of) that are available today (as of Camel 2.8). If you have used other approaches or know of other ongoing development in this area, please let me know.

JMX APIs

Camel uses JMX to provide a standardized way to access metadata about the contexts/routes/endpoints defined in a given application. You can also use JMX to interact with these components (start/stop routes, etc.) in some interesting ways.

I recently had some very specific Camel/ActiveMQ monitoring requests from a client. After looking at the options, we ended up building a standalone Tomcat web app that used JSPs, jQuery, Ajax and JMX APIs to view route/endpoint statistics, manage Camel routes (stop, start, etc.) and monitor/manipulate ActiveMQ queues. It provided some much needed visibility and management features for our Camel/ActiveMQ-based message processing application...

CamelContext

If you have a handle to the CamelContext, there are various APIs that can help describe and manage routes and endpoints. These are used by the existing Camel Web Console and can be used to build a custom interface that retrieves and uses this information in various ways. Here are some of the notable APIs:

- getRouteDefinitions()
- getEndpoints()
- getEndpointsMap()
- getRouteStatus(routeId)
- startRoute(routeId)
- stopRoute(routeId)
- removeRoute(routeId)
- addRoutes(routeBuilder)
- suspendRoute(routeId)
- resumeRoute(routeId)

With a little creativity, you can use these APIs to manage/monitor and re-wire a Camel application dynamically.
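For the JMX route described above, here is a minimal sketch that uses only the JDK's javax.management classes, so it compiles without Camel on the classpath. The object name pattern and the attribute names ("State", "ExchangesCompleted") are assumptions about how Camel registers its route MBeans; check the names your own JMX console shows before relying on them. The class name is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Lists MBeans that look like Camel routes and prints a couple of their attributes. */
final class CamelRouteLister {
    /** Assumed pattern for Camel route MBeans; adjust it to what your JMX console shows. */
    static final String ROUTE_PATTERN = "org.apache.camel:type=routes,*";

    static ObjectName routePattern() {
        try {
            return new ObjectName(ROUTE_PATTERN);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Works with a local MBeanServer or a remote MBeanServerConnection. */
    static Set<ObjectName> findRoutes(MBeanServerConnection server) {
        try {
            return server.queryNames(routePattern(), null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // In-process lookup; prints nothing unless Camel runs in this JVM with JMX enabled.
        MBeanServerConnection server = ManagementFactory.getPlatformMBeanServer();
        for (ObjectName route : findRoutes(server)) {
            // Attribute names are assumptions; a missing one raises AttributeNotFoundException.
            Object state = server.getAttribute(route, "State");
            Object completed = server.getAttribute(route, "ExchangesCompleted");
            System.out.println(route.getKeyProperty("name") + " state=" + state + " completed=" + completed);
        }
    }
}
```

The same findRoutes call should work against a remote JVM if you pass in an MBeanServerConnection obtained from a JMXConnector instead of the platform MBean server.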
Camel Web Console

This console provides web and REST interfaces to Camel contexts/routes/endpoints and allows you to view/manage endpoints/routes, send messages to endpoints, view route statistics, etc.

That being said, using this web console with an existing Camel application is tricky at the moment. It's currently deployed as a war file that only has access to the CamelContext defined in its embedded Spring XML file. The entire camel-web project can, however, be embedded and customized in your application if you desire (and know Scalate). Given my recent client requirements, I opted to build my own basic app using JSPs/JMX as described above.

There has been some recent support for deploying this console in OSGi, where it should be able to view any CamelContexts deployed in the container, etc. However, I have yet to see this work...more on this later.

Using Camel APIs

There are also a number of Camel technologies/patterns that can be used to add monitoring to existing routes.

- wire tap - can add message logging (to a file or JMS queue/topic, etc.) or other inline processing
- adviceWith - can be used to modify existing routes to apply before/after operations or add/remove operations in a route
- intercept - can be used to intercept Exchanges while they are en route; can apply to all endpoints, certain endpoints or just starting endpoints
- BrowsableEndpoint - an interface which Endpoints may implement to support browsing of the exchanges which are pending or have been sent on them

That being said, it takes some creativity to use these effectively and some caution not to adversely affect the routes you are trying to monitor.

Hyperic HQ

You can use this tool to monitor ServiceMix (or any process), but it is more geared towards system monitoring and JVM stats. I didn't find it useful for any Camel-specific monitoring.

jConsole/VisualVM

These are standard JMX-based consoles. They aren't web based and can't be customized (easily anyway) to provide anything more than a tree-like view of JMX MBeans. If you know where to look though, you can do a lot with them.

Summary

These are just some quick notes at this point. As I learn about other ways of monitoring Camel, I'll update this list and give a more detailed comparison. Any comments are welcome...

June 27, 2012

by Ben O'Day

· 20,150 Views

Managing ActiveMQ with JMX APIs

Here is a quick example of how to programmatically access ActiveMQ MBeans to monitor and manipulate message queues...

First, get a connection to a JMX server (assumes localhost, port 1099, no auth). Note: always cache the connection for subsequent requests (opening a new one per request can cause memory utilization issues).

    import javax.management.*;
    import javax.management.remote.*;
    import org.apache.activemq.broker.jmx.BrokerViewMBean;
    import org.apache.activemq.broker.jmx.QueueViewMBean;

    JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
    JMXConnector jmxc = JMXConnectorFactory.connect(url);
    MBeanServerConnection conn = jmxc.getMBeanServerConnection();

Then, you can execute various operations such as addQueue, removeQueue, etc...

    String operationName = "addQueue";
    String parameter = "MyNewQueue";
    ObjectName activeMQ = new ObjectName("org.apache.activemq:BrokerName=localhost,Type=Broker");
    if (parameter != null) {
        // operation that takes a single String argument
        Object[] params = {parameter};
        String[] sig = {"java.lang.String"};
        conn.invoke(activeMQ, operationName, params, sig);
    } else {
        // operation without arguments
        conn.invoke(activeMQ, operationName, null, null);
    }

Also, you can get an ActiveMQ QueueViewMBean instance for a specified queue name...

    // Fragment of a lookup method: queueName, cacheKey and queueViewBeanCache
    // are expected to come from the enclosing method/class.
    ObjectName activeMQ = new ObjectName("org.apache.activemq:BrokerName=localhost,Type=Broker");
    BrokerViewMBean mbean = (BrokerViewMBean) MBeanServerInvocationHandler.newProxyInstance(conn, activeMQ, BrokerViewMBean.class, true);
    for (ObjectName name : mbean.getQueues()) {
        // use the same connection (conn) that was opened above
        QueueViewMBean queueMbean = (QueueViewMBean) MBeanServerInvocationHandler.newProxyInstance(conn, name, QueueViewMBean.class, true);
        if (queueMbean.getName().equals(queueName)) {
            queueViewBeanCache.put(cacheKey, queueMbean);
            return queueMbean;
        }
    }

Then, execute one of several APIs against the QueueViewMBean instance...

Queue monitoring - getEnqueueCount(), getDequeueCount(), getConsumerCount(), etc...

Queue manipulation - purge(), getMessage(String messageId), removeMessage(String messageId), moveMessageTo(String messageId, String destinationName), copyMessageTo(String messageId, String destinationName), etc...
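If you want to try the proxy technique without a running broker, the sketch below registers a tiny stand-in MBean on the JVM's own platform MBean server and drives it through MBeanServerInvocationHandler.newProxyInstance, the same call used above. DemoQueueMBean, DemoQueue and the object name are hypothetical and only mimic two members of QueueViewMBean; nothing here talks to ActiveMQ.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.MBeanServerInvocationHandler;
import javax.management.ObjectName;

/** In-process demo of a typed MBean proxy; no broker or remote connection involved. */
final class QueueProxySketch {
    /** Hypothetical stand-in for ActiveMQ's QueueViewMBean, reduced to two members. */
    public interface DemoQueueMBean {
        long getEnqueueCount();
        void purge();
    }

    /** Hypothetical in-memory "queue" exposing the interface above. */
    public static final class DemoQueue implements DemoQueueMBean {
        private long enqueued = 3;
        @Override public synchronized long getEnqueueCount() { return enqueued; }
        @Override public synchronized void purge() { enqueued = 0; }
    }

    /** Registers a demo queue, then reads and purges it through a typed proxy. */
    static long[] countBeforeAndAfterPurge() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("demo.sketch:type=Queue,name=MyNewQueue");
            if (!server.isRegistered(name)) {
                server.registerMBean(new DemoQueue(), name);
            }
            DemoQueueMBean queue =
                MBeanServerInvocationHandler.newProxyInstance(server, name, DemoQueueMBean.class, false);
            long before = queue.getEnqueueCount(); // translated into a getAttribute call
            queue.purge();                         // translated into an invoke call
            long after = queue.getEnqueueCount();
            server.unregisterMBean(name);
            return new long[] { before, after };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long[] counts = countBeforeAndAfterPurge();
        System.out.println(counts[0] + " -> " + counts[1]);
    }
}
```

Against a real broker the only differences are the connection (a remote MBeanServerConnection instead of the platform server) and the interface class passed to the proxy.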
Summary

The APIs can easily be used to build a web or command-line tool to support remote ActiveMQ management features. That being said, all of these features are available via the JMX console itself, and ActiveMQ does provide a web console to support some management/monitoring tasks. See these pages for more information...

http://activemq.apache.org/jmx-support.html
http://activemq.apache.org/web-console.html

June 22, 2012

by Ben O'Day

· 32,238 Views · 1 Like