DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

The Latest Microservices Topics

article thumbnail
In-Memory Data Grids
Introduction The IT buzzword of 2012 is without a doubt Big Data. It’s new and here to stay, and for a good reason. Big data is data that exceeds the processing capacity of conventional database systems. Great examples are CERN with the Large Hadron Collider, whose experiments generate 25 petabytes of data annually, or Walmart, which handles more than one million customer transaction every hour. Problems These vast amounts of data leave us with two problems. Problem 1: To gain value from this data, one must choose an alternative way to process it. The value of big data to an organization falls into two categories: analytical use, and enabling new products. Big data analytics can reveal insights hidden previously by data too costly to process, such as peer influence among customers, revealed by analyzing shoppers’ transactions, social and geographical data. Being able to process every item of data in reasonable time removes the troublesome need for sampling and promotes an investigative approach to data, in contrast to the somewhat static nature of running predetermined reports. Problem 2: The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. Remember the CERN case where the LHC produces over 25 Petabytes of data annually? No “classic” database architecture or setup is capable of holding these amounts of data. Solutions Fortunately, both problems can be solved by implementing the correct infrastructure and rethinking data storage. There are two critical factors in Big Data environments: size and speed. We already discussed the vast amounts of data and desire to be able to access and process the data fast. The latter is the main differentiator from more traditional data warehouses. Just imagine what you can do when you can access all your data real-time. Enter big data. A common Big Data implementation is an in-memory data grid that lives in a distributed cluster, ensuring both speed, by storing data in-memory, and capacity by using scalability features provided by a cluster. As a bonus, availability is ensured by using a distributed cluster. As for the data storage, there are typically two kinds: in-memory databases and in-memory data grids. But first some background. It is not a new attempt to use main memory as a storage area instead of a disk. In our daily lives there are numerous examples of main memory databases (MMDB), as they perform much faster than disk-based databases. An every day example is a mobile phone. When you SMS or call someone most mobile service providers use MMDB to get the information on your contact as soon as possible. The same applies to your phone. When someone calls you, the caller details are looked up in the contacts application, usually providing a name and sometimes a picture. In memory data grids In Memory Data Grid (IMDG) is the same as MMDB in that it stores data in main memory, but it has a totally different architecture. The features of IMDG can be summarized as follows: Data is distributed and stored on multiple servers. Each server operates in the active mode. A data model is usually object-oriented (serialized) and non-relational. According to the necessity, you often need to add or reduce servers. No traditional database features such as tables. In other words, IMDG is designed to store data in main memory, ensure scalability and store an object itself. These days, there are many IMDG products, both commercial and open source. Some of the most commonly used products are: Hazelcast (http://www.hazelcast.com) JBoss Infinispan (http://www.jboss.org/infinispan) GridGain DataGrid (http://www.gridgain.com/features/in-memory-data-grid/) VMware Gemfire (http://www.vmware.com/nl/products/application-platform/vfabric-gemfire/overview.html) Oracle Coherence (http://www.oracle.com/technetwork/middleware/coherence/overview/index.html) Gigaspaces XAP (http://www.gigaspaces.com/datagrid) Terracotta Enterprise Suite (http://terracotta.org/products/enterprise-suite) Why Memory? The main reasons for using main memory for data storage are once again the two main themes of Big Data: speed and capacity. The processing performance of main memory is 800 times faster than an HDD and up to 40 times faster than an. Moreover, the latest x86 server supports main memory of hundreds of GB per server. It is said that the limit of a traditional processing database’s (OLTP) data capacity is approximately 1 TB and that the OLTP processing data capacity would not increment well. If servers using main memory of 1 TB or larger become more commonly used, you will be able to conduct operations with the entire data placed in main memory, at least in the field of OLTP. IMDG Architecture To use main memory as a storage area, two weak points should be overcome: Limited capacity: involves data that exceeds the maximum capacity of the main memory of the server Reliability: involves data loss in case of a (system) failure. IMDG overcomes the limit of capacity by ensuring horizontal scalability using a distributed architecture, and resolves the issue of reliability through a replication system as part of the grid (or a distributed cluster). Now let’s discuss how an IMDG actually works. First of all, it is important to understand that an IMDG is not the same as an in-memory database, also referred to as MMDB (main memory databases). Typical examples of MMDBs are Oracle TimesTen or Sap Hana. MMDBs are full database products that simply reside in memory. As a result of being a full-blown database, they also carry the weight and overhead of database management features. IMDG is different. No tables, indexes, triggers, stored procedures, process managers etc. Just plain storage. The data model used in IMDG is key-value pairs. A key-value pair is a list with only two parts: a key and a value. The key can be used for storing and retrieving the values in the list. A key can be compared to the index or primary key of a table in a database. Note that IMDG are closely tied to development environments such as Java as the key-value pairs are represented by the structures provided by such a programming environment. Most IMDGs are written in Java, and can only be used within other Java applications. Therefore, the values of key-value pairs can be anything supported by Java, ranging from simple data types such as a string or number, to complex objects. This overcomes the two important hurdles: as you can store complex Java objects as value, there’s no need to translate these objects into a relational datamodel (which is the case in more traditional applications using a database for storage). Furthermore, the seeming limitation of being able to store only one value per key, is actually no limitation at all. Large memory sizes Most of the products introduced above use Java as an implementation language. Java reserves and uses a part of the RAM (internal memory) for dynamic memory allocation. This reserved memory space is called the Java heap. All runtime objects created by a Java application are stored in heap. Using large amounts of data causes two problems. Size limitation: By default, the heap size is 128 MB, but for current business applications, this limit is reached easily. Once the heap is “full”, no new objects can be created and the Java application will show some nasty errors. Performance: It is possible to increase the size of the heap, but this introduces some new problems. When a heap reaches a size of more than 4 gigabytes, Java will have serious issues with memory managements, causing your application to slow down or even freeze. Java has a feature called Garbage Collector, which periodically scans the heap and checks each object if it is still valid and being used. If not, the garbage collector removes the object and defragments the newly available space. The problem is, the larger the heap size, the more work to do for the garbage collector, resulting in performance degradation. Imagine a large bank has a Java application that manages customers, accounts and transactions. We have seen that an IMDG allows the application to store and access all data very quickly by caching it in memory, instead of storing the data in relatively slow databases. Let’s assume the combined data has a size of 40 gigabytes. Storing it in heap is simply not possible, considering the performance penalties of Java’s memory management capabilities. The graph below illustrates the garbage collection pause time when placing cached data in heap: Terracotta’s BigMemory product has a method to overcome these limitations. The method is to use an off-heap memory (direct buffer). Data will not be stored in Java’s heap, but directly in the available internal memory (RAM). Since, this is not subject to Java’s garbage collector, there are no performance penalties. The differences on performance are significant, as can be seen in the graph below: Using off-heap storage has some major benefits: You can use all the available memory on your machine, not just the memory that is allocated to the heap (usually less that 512 Mb). This allows you to store more data in a in-memory data grid, greatly speeding up your application. The heap can be relieved by storing data in native memory, speeding up Java applications as less heap space has to be garbage collected. Clustering, fail over and high availability So far, we have seen IMDG features that are applicable to a single server. However, the real power of IMDG lies in it’s networking and clustering capabilities, providing features as data replication, data synchronization between clients, fail over and high availability. To achieve this, a cluster of servers (or server array) acts a backbone of the infrastructure. Applications (that still can have their own IMDG or off-heap cache) that are connected to the cluster can share, replicate and backup their data with either the cluster or other applications. The graph below depicts a typical setup using Terracotta's BigMemory: The caches on the application servers are usually referred to as “level 1” cache, while the data cache on the server array is referred to as “level 2” cache. There are many different scenarios possible for storing, clustering, synchronizing and replicating data. Covering all these topics goes far beyond the scope of this article. For more information, consult the technical documentation of the product of your choice. Conclusion Big Data brings us some new challenges. First of all, storing and accessing vast amounts of data makes us rethink traditional methods and technologies. Next, there’s the question what to do with all the available data. The potential value for marketing, financial and other businesses is huge. In order to facilitate Big Data, in-memory data grids are considered the best option. IMDGs with off-heap storage are even more powerful, allowing data centric enterprise application to overcome certain limits of the Java platform, such as memory and performance constraints. As the amount of data that (large) companies produce and store, grows exponentially, databases will hit a limit. Accessing your data without a performance penalty simply will not be possible. The answer to this is using an IMDG.
March 13, 2013
by Roy Prins
· 32,632 Views · 5 Likes
article thumbnail
JUnit testing of Spring MVC application: Testing the Service Layer
In continuation of my earlier blogs on Introduction to Spring MVC and Testing DAO layer in Spring MVC, in this blog I will demonstrate how to test Service layer in Spring MVC. The objective of this demo is 2 fold, to build the Service layer using TDD and increase the code coverage during JUnit testing of Service layer. For people in hurry, get the latest code from Github and run the below command mvn clean test -Dtest=com.example.bookstore.service.AccountServiceTest Since in my earlier blog, we have already tested the DAO layer, in this blog we only need to focus on testing service layer. We need to mock the DAO layer so that we can control the behavior in Service layer and cover various scenarios. Mockito is a good framework which is used to mock a method and return known data and assert that in the JUnit. As a first step we define the AccountServiceTestContextConfiguration class with AccountServiceTest class. If you notice there are 2 beans defined in that class and we marked the as a @Configuration which shows that it is a Spring Context class. In the JUnit test we @Autowired AccountService class. And AccountServiceImpl @Autowired the AccountRepository class. When creating the Bean in the configuration file we also stubbed the AccountRepository class using Mockito, @RunWith(SpringJUnit4ClassRunner.class) @ContextConfiguration public class AccountServiceTest { @Configuration static class AccountServiceTestContextConfiguration { @Bean public AccountService accountService() { return new AccountServiceImpl(); } @Bean public AccountRepository accountRepository() { return Mockito.mock(AccountRepository.class); } } //We Autowired the AccountService bean so that it is injected from the configuration @Autowired private AccountService accountService; @Autowired private AccountRepository accountRepository; During the setup of the JUnit we use Mockito mock findByUsername method to return a predefined account object as below @Before public void setup() { Account account = new AccountBuilder() { { address("Herve", "4650", "Rue de la gare", "1", null, "Belgium"); credentials("john", "secret"); name("John", "Doe"); } }.build(true); Mockito.when(accountRepository.findByUsername("john")).thenReturn(account); } Now we write the tests as below and test both the positive and negative scenarios, @Test(expected = AuthenticationException.class) public void testLoginFailure() throws AuthenticationException { accountService.login("john", "fail"); } @Test() public void testLoginSuccess() throws AuthenticationException { Account account = accountService.login("john", "secret"); assertEquals("John", account.getFirstName()); assertEquals("Doe", account.getLastName()); } } Finally we verify if the findByUsername method is called only once successfully as below in the teardown, @After public void verify() { Mockito.verify(accountRepository, VerificationModeFactory.times(1)).findByUsername(Mockito.anyString()); // This is allowed here: using container injected mocks Mockito.reset(accountRepository); } AccountService class looks as below, @Service @Transactional(readOnly = true) public class AccountServiceImpl implements AccountService { @Autowired private AccountRepository accountRepository; @Override public Account login(String username, String password) throws AuthenticationException { Account account = this.accountRepository.findByUsername(username, password); } else { throw new AuthenticationException("Wrong username/password", "invalid.username"); } return account; } } I hope this blog helped you. In my next blog, I will demo how to build a controller JUnit test. Reference: Pro Spring MVC: With Web Flow by by Marten Deinum, Koen Serneels
March 3, 2013
by Krishna Prasad
· 81,621 Views · 3 Likes
article thumbnail
Parallelizing Work with Redis
Curator's Note: Here's a Redis how-to from back in 2011. Anyone who has heard of Redis has probably also heard of Resque, which is a lightweight queue'ing system. To the uninitiated it might seem strange, or maybe even impossible, to construct a queue'ing system using just a key-value store. In this article, I’m going to break down some of the primitives redis exposes that make building a queue'ing system over it trivial and show how Redis is so much more than just a key-value store. The problem Let’s say, you are a mathematician and have just come up with this super performant way of computing factors of numbers. You decide to write up the following sinatra service: def compute_factors(number) factors = crazy_performant_computation number end get "/compute_factors" do number, post_back_url = params[:number].to_i, params[:post_back_url] RestClient.post post_back_url, factors => compute_factors(number).to_json "OK" end You soon start seeing crazy traffic and realize, performant as your factor computation algorithm is, it’s not fast enough to keep up with the speed at which you are getting requests to your service. First pass at optimization by forking You realize that it’s going to be far more efficient to fork off a new Process or Thread and have that perform the computation and post back the result. So your code now changes to: get "/compute_factors" do number, post_back_url = params[:number].to_i, params[:post_back_url] Process.fork do RestClient.post post_back_url, factors => compute_factors(number).to_json end "OK" end While, this is great you soon realize that filling up the process table in your OS is not such a good idea. Capping process creation using a process pool It is exactly this problem that a process pool was meant to solve. The basic idea is that you would still like to perform your time-intensive task in the background, but would like to put a cap on the number of background processes you have running. There are some excellent libraries that solve this problem such as Delayed Job and Resque. However, being the hacker that you are, you decide to roll one yourself. There are however a bunch of issues that these libraries solve and you decide to pull a pen and paper and note them down to ensure that you are not missing anything: Cap how many workers you create You need to have a way to cap the number of background workers you create, that way you don’t have the same problem you were having before. Control worker creation and destruction You would like to be able to boot up and bring down your workers reasonably gracefully. Handle race conditions You realize, that spinning new processes means that you now have to ensure your code is concurrent-safe. Redis provides, some wonderful atomic operations out-of-the-box so this shouldn’t be too hard. Second pass using BRPOP Redis supports a couple of interesting data-structures including lists, sets and hashes. Redis lists have a command called RPOP which basically lets you pop an item off the tail of a list, in essence treating it like a queue. The RPOP command comes with a blocking variant of itself called BRPOP that blocks on the call to popping an element from the list. You can also specify a timeout for how long (in seconds) you would like to block on the call. def compute_factors(number) factors = crazy_performant_computation number end NUMBER_OF_WORKERS = (ENV['NUMBER_OF_WORKERS'] || 50).to_i NUMBER_OF_WORKERS.times do Process.fork do redis = Redis.new loop do val = redis.brpop "work_queue", 1 unless val puts "Process: #{Process.pid} is exiting" exit 0 end number, postback_url = Marshal.load val.last RestClient.post postback_url, factors => compute_factors(number).to_json end end end redis = Redis.new get "/compute_factors" do number, post_back_url = params[:number].to_i, params[:post_back_url] redis.lpush "work_queue", Marshal.dump([number, post_back_url]) "OK" end So you now have solved a bunch of problems in this new approach. We have a fixed number of workers running to handle our background processing – so now our process table getting filled is not subject to traffic conditions. Race conditions are handled for us by Redis, since BRPOP is atomic and guarantees no two workers will do duplicate work. And finally, workers destroy themselves if they break out of the brpop call due to their timeout being hit, in this case 1 second. So, that’s quite a slew of problems that have been solved for us by virtue of just using redis. We soon start, seeing a different problem though. As traffic in our site lags, workers seem to be dying off since their timeout is being hit. We’d really like to now have the workers block for a longer time than just 1 second, while also having the option to kill them off sooner if we need to. That way, they’ll not be hanging around for any longer than they have to. Gracefully shutting down workers Our mandate now is to shutdown our workers gracefully, using redis and little bit of UNIX signals magic (for examples of using signals in this area checkout Unicorn Is Unix and the Unicorn web-server. Our code now morphs to: def compute_factors(number) factors = crazy_performant_computation number end NUMBER_OF_WORKERS = (ENV['NUMBER_OF_WORKERS'] || 50).to_i NUMBER_OF_WORKERS.times do Process.fork do redis = Redis.new loop do val = redis.brpop "work_queue", 30 unless val puts "Process: #{Process.pid} is signing off due to timeout!" exit 0 end if val.last == "DIE!" puts "Process: #{Process.pid} has been asked to kill itself by parent" exit 0 end number, postback_url = Marshal.load val.last RestClient.post postback_url, factors => compute_factors(number).to_json end end end redis = Redis.new get "/compute_factors" do number, post_back_url = params[:number].to_i, params[:post_back_url] redis.lpush "work_queue", Marshal.dump([number, post_back_url]) "OK" end `echo #{Process.pid} > /tmp/factors.pid` puts "Parent process wrote PID to /tmp/factors.pid" trap('QUIT') do NUMBER_OF_WORKERS.times do redis.lpush "work_queue", "DIE!" end end We have now bumped up the timeout to 30 seconds and also have in place a way to bring down the workers near instantly. This is accomplished by the web-server trapping the QUIT signal and when it does, it pushes a “DIE!” message onto the redis “work_queue”. It pushes this message the same number of times as the NUMBER_OF_WORKERS. And since BRPOP is an atomic and concurrent-safe operation we are now supporting the bringing down of workers via redis. How cool is that! To gracefully shutdown the server and workers we just need to: kill -s QUIT `cat /tmp/factors.pid` Conclusion The next time you need to get some background job action going, stop yourself from just grabbing a library. Instead, toy around with redis lists a little. You’ll be surprised by how much you can accomplish with just straight redis primitives.
February 28, 2013
by Santosh Kumar
· 8,156 Views
article thumbnail
Nested Iterator with WSO2 ESB
Please find the following sample which demonstrates the Nested iterator. The following is the source of the configuration. 15000 $1 Use the following request with soapui in order to test the above sample. SUN9 SUN10 From the second iterate mediator, messages retrieved from the first iterator are again iterated into smaller messages as below SUN10
February 23, 2013
by Achala Chathuranga Aponso
· 7,421 Views
article thumbnail
When to use Aspect Oriented Architecture (AOA/AOD)
When is it appropriate to use aspect oriented architecture? I think the only honest answer to this question is that it depends on the context for which the question is being asked. There really are no hard and fast rules regarding the selection of an architectural model(s) for a project because each model provides good and bad benefits. Every system is built with a unique requirements and constraints. This context will dictate when to use one type of architecture over another or in conjunction with others. To me aspect oriented architecture models should be a sub-phase in the architectural modeling and design process especially when creating enterprise level models. Personally, I like to use this approach to create a base architectural model that is defined by non-functional requirements and system quality attributes. This general model can then be used as a starting point for additional models because it is targets all of the business key quality attributes required by the system. Aspect oriented architecture is a method for modeling non-functional requirements and quality attributes of a system known as aspects. These models do not deal directly with specific functionality. They do categorize functionality of the system. This approach allows a system to be created with a strong emphasis on separating system concerns into individual components. These cross cutting components enables a systems to create with compartmentalization in regards to non-functional requirements or quality attributes. This allows for the reduction in code because an each component maintains an aspect of a system that can be called by other aspects. This approach also allows for a much cleaner and smaller code base during the implementation and support of a system. Additionally, enabling developers to develop systems based on aspect-oriented design projects will be completed faster and will be more reliable because existing components can be shared across a system; thus, the time needed to create and test the functionality is reduced. Example of an effective use of Aspect Oriented Architecture In my experiences, aspect oriented architecture can be very effective with large or more complex systems. Typically, these types of systems have a large number of concerns so the act of defining them is very beneficial for reducing the system’s complexity because components can be developed to address each concern while exposing functionality to the other system components. The benefits to using the aspect oriented approach as the starting point for a system is that it promotes communication between IT and the business due to the fact that the aspect oriented models are quality attributes focused so not much technical understanding is needed to understand the model. An example of this can be in developing a new intranet website. Common Intranet Concerns: Error Handling Security Logging Notifications Database connectivity Example of a not as effective use of Aspect Oriented Architecture Again in my experiences, aspect oriented architecture is not as effective with small or less complex systems in comparison. There is no need to model concerns for a system that has a limited amount of them because the added overhead would not be justified for the actual benefits of creating the aspect oriented architecture model. Furthermore, these types of projects typically have a reduced time schedule and a limited budget. The creation of the Aspect oriented models would increase the overhead of a project and thus increase the time needed to implement the system. An example of this is seen by creating a small application to poll a network share for new files and then FTP them to a new location. The two primary concerns for this project is to monitor a network drive and FTP files to a new location. There is no need to create an aspect model for this system because there will never be a need to share functionality amongst either of these concerns. To add to my point, this system is so small that it could be created with just a few classes so the added layer of componentizing the concerns would be complete overkill for this situation. References: Brichau, Johan; D'Hondt, Theo. (2006) Aspect-Oriented Software Development (AOSD) - An Introduction. Retreived from: http://www.info.ucl.ac.be/~jbrichau/courses/introductionToAOSD.pdf
February 21, 2013
by Todd Merritt
· 9,543 Views
article thumbnail
Apache Camel Meets Redis
The Lamborghini of Key-Value stores Camel is the best of bread Integration framework and in this post I'm going to show you how to make it even more powerful by leveraging another great project - Redis. Camel 2.11 is on its way to be released soon with lots of new features, bug fixes and components. Couple of these new components are authored by me, redis-component being my favourite one. Redis - a ligth key/value store is an amazing piece of Italian software designed for speed (same as Lamborghini - a two-seater Italian car designed for speed). Written in C and having an in-memory closer to the metal nature, Redis performs extremely well (Lamborgini's motto is "Closer to the Road"). Redis is often referred to as a data structure server since keys can contain strings, hashes, lists and sorted sets. A fast and light data structure server is like a super sportscars for software engineers - it just flies. If you want to find out more about Redis' and Lamborghini's unique performance characteristics google around and you will see for yourself. Getting started with Redis is easy: download, make, and start a redis-server. After these steps, you ready to use it from your Camel application. The component uses internally Spring Data which in turn uses Jedis driver, but with possibility to switch to other Redis drivers. Here are few use cases where the camel-redis component is a good fit: Idempotent Repository The term idempotent is used in mathematics to describe a function that produces the same result if it is applied to itself. In Messaging this concepts translates into the a message that has the same effect whether it is received once or multiple times. In Camel this pattern is implemented using the IdempotentConsumer class which uses an Expression to calculate a unique message ID string for a given message exchange; this ID can then be looked up in the IdempotentRepository to see if it has been seen before; if it has the message is consumed; if its not then the message is processed and the ID is added to the repository. RedisIdempotentRepository is using a set structure to store and check for existing Ids. ${in.body.id} Caching One of the main uses of Redis is as LRU cache. It can store data inmemory as Memcached or can be tuned to be durable flushing data to a log file that can be replayed if the node restarts.The various policies when maxmemory is reached allows creating caches for specific needs: volatile-lru remove a key among the ones with an expire set, trying to remove keys not recently used. volatile-ttl remove a key among the ones with an expire set, trying to remove keys with short remaining time to live. volatile-random remove a random key among the ones with an expire set. allkeys-lru like volatile-lru, but will remove every kind of key, both normal keys or keys with an expire set. allkeys-random like volatile-random, but will remove every kind of keys, both normal keys and keys with an expire set. Once your Redis server is configured with the right policies and running, the operation you need to do are SET and GET: SET keyOne valueOne Interap pub/sub with Redis Camel has various components for interacting between routes: direct: provides direct, synchronous invocation in the same camel context. seda: asynchronous behavior, where messages are exchanged on a BlockingQueue, again in the same camel context. vm: asynchronous behavior like seda, but also supports communication across CamelContext as long as they are in the same JVM. Complex applications usually consist of more than one standalone Camel instances running on separate machines. For this kind of scenarios, Camel provides jms, activemq, combination of AWS SNS with SQS, for messaging between instances. Redis has a simpler solution for the Publish/Subscribe messaging paradigm. Subscribers subscribes to one or more channels, by specifying the channel names or using pattern matching for receiving messages from multiple channels. Then the publisher publishes the messages to a channel, and Redis makes sure it reaches all the matching subscribers. PUBLISH testChannel Test Message Other usages Guaranteed Delivery: Camel supports this EIP using JMS, File, JPA and few other components. Here Redis can be used as lightweight key-value persistent store with its transaction support. The Claim Check from the EIP patterns allows you to replace message content with a claim check (a unique key), which can be used to retrieve the message content at a later time. The message content can be stored temporarily in Redis. Redis is also very popular for implementing counters, leaderboards, tagging systems and many more functionalities. Now, with two swiss army knives under your belt, the integrations to make are limited only by your imagination.
February 20, 2013
by Bilgin Ibryam
· 10,876 Views
article thumbnail
CPU Cache Flushing Fallacy
Even from highly experienced technologists I often hear talk about how certain operations cause a CPU cache to "flush". This seems to be illustrating a very common fallacy about how CPU caches work, and how the cache sub-system interacts with the execution cores. In this article I will attempt to explain the function CPU caches fulfil, and how the cores, which execute our programs of instructions, interact with them. For a concrete example I will dive into one of the latest Intel x86 server CPUs. Other CPUs use similar techniques to achieve the same ends. Most modern systems that execute our programs are shared-memory multi-processor systems in design. A shared-memory system has a single memory resource that is accessed by 2 or more independent CPU cores. Latency to main memory is highly variable from 10s to 100s of nanoseconds. Within 100ns it is possible for a 3.0GHz CPU to process up to 1200 instructions. Each Sandy Bridge core is capable of retiring up to 4 instructions-per-cycle (IPC) in parallel. CPUs employ cache sub-systems to hide this latency and allow them to exercise their huge capacity to process instructions. Some of these caches are small, very fast, and local to each core; others are slower, larger, and shared across cores. Together with registers and main-memory, these caches make up our non-persistent memory hierarchy. Next time you are developing an important algorithm, try pondering that a cache-miss is a lost opportunity to have executed ~500 CPU instructions! This is for a single-socket system, on a multi-socket system you can effectively double the lost opportunity as memory requests cross socket interconnects. Memory Hierarchy Figure 1. For the circa 2012 Sandy Bridge E class servers our memory hierarchy can be decomposed as follows: Registers: Within each core are separate register files containing 160 entries for integers and 144 floating point numbers. These registers are accessible within a single cycle and constitute the fastest memory available to our execution cores. Compilers will allocate our local variables and function arguments to these registers. When hyperthreading is enabled these registers are shared between the co-located hyperthreads. Memory Ordering Buffers (MOB): The MOB is comprised of a 64-entry load and 36-entry store buffer. These buffers are used to track in-flight operations while waiting on the cache sub-system. The store buffer is a fully associative queue that can be searched for existing store operations, which have been queued when waiting on the L1 cache. These buffers enable our fast processors to run asynchronously while data is transferred to and from the cache sub-system. When the processor issues asynchronous reads and writes then the results can come back out-of-order. The MOB is used to disambiguate the ordering for compliance to the published memory model. Level 1 Cache: The L1 is a core-local cache split into separate 32K data and 32K instruction caches. Access time is 3 cycles and can be hidden as instructions are pipelined by the core for data already in the L1 cache. Level 2 Cache: The L2 cache is a core-local cache designed to buffer access between the L1 and the shared L3 cache. The L2 cache is 256K in size and acts as an effective queue of memory accesses between the L1 and L3. L2 contains both data and instructions. L2 access latency is 12 cycles. Level 3 Cache: The L3 cache is shared across all cores within a socket. The L3 is split into 2MB segments each connected to a ring-bus network on the socket. Each core is also connected to this ring-bus. Addresses are hashed to segments for greater throughput. Latency can be up to 38 cycles depending on cache size. Cache size can be up to 20MB depending on the number of segments, with each additional hop around the ring taking an additional cycle. The L3 cache is inclusive of all data in the L1 and L2 for each core on the same socket. This inclusiveness, at the cost of space, allows the L3 cache to intercept requests thus removing the burden from private core-local L1 & L2 caches. Main Memory: DRAM channels are connected to each socket with an average latency of ~65ns for socket local access on a full cache-miss. This is however extremely variable, being much less for subsequent accesses to columns in the same row buffer, through to significantly more when queuing effects and memory refresh cycles conflict. 4 memory channels are aggregated together on each socket for throughput, and to hide latency via pipelining on the independent memory channels. NUMA: In a multi-socket server we have non-uniform memory access. It is non-uniform because the required memory maybe on a remote socket having an additional 40ns hop across the QPI bus. Sandy Bridge is a major step forward for 2-socket systems over Westmere and Nehalem. With Sandy Bridge the QPI limit has been raised from 6.4GT/s to 8.0GT/s, and two lanes can be aggregated thus eliminating the bottleneck of the previous systems. For Nehalem and Westmere the QPI link is only capable of ~40% the bandwidth that could be delivered by the memory controller for an individual socket. This limitation made accessing remote memory a choke point. In addition, the QPI link can now forward pre-fetch requests which previous generations could not. Associativity Levels Caches are effectively hardware based hash tables. The hash function is usually a simple masking of some low-order bits for cache indexing. Hash tables need some means to handle a collision for the same slot. The associativity level is the number of slots, also known as ways or sets, which can be used to hold a hashed version of an address. Having more levels of associativity is a trade off between storing more data vs. power requirements and time to search each of the ways. For Sandy Bridge the L1 and L2 are 8-way and the L3 is 12-way associative. Cache Coherence With some caches being local to cores, we need a means of keeping them coherent so all cores can have a consistent view of memory. The cache sub-system is considered the "source of truth" for mainstream systems. If memory is fetched from the cache it is never stale; the cache is the master copy when data exists in both the cache and main-memory. This style of memory management is known as write-back whereby data in the cache is only written back to main-memory when the cache-line is evicted because a new line is taking its place. An x86 cache works on blocks of data that are 64-bytes in size, known as a cache-line. Other processors can use a different size for the cache-line. A larger cache-line size reduces effective latency at the expense of increased bandwidth requirements. To keep the caches coherent the cache controller tracks the state of each cache-line as being in one of a finite number of states. The protocol Intel employs for this is MESIF, AMD employs a variant know as MOESI. Under the MESIF protocol each cache-line can be in 1 of the 5 following states: Modified: Indicates the cache-line is dirty and must be written back to memory at a later stage. When written back to main-memory the state transitions to Exclusive. Exclusive: Indicates the cache-line is held exclusively and that it matches main-memory. When written to, the state then transitions to Modified. To achieve this state a Request-For-Ownership (RFO) message is sent which involves a read plus an invalidate broadcast to all other copies. Shared: Indicates a clean copy of a cache-line that matches main-memory. Invalid: Indicates an unused cache-line. Forward: Indicates a specialised version of the shared state i.e. this is the designated cache which should respond to other caches in a NUMA system. To transition from one state to another, a series of messages are sent between the caches to effect state changes. Previous to Nehalem for Intel, and Opteron for AMD, this cache coherence traffic between sockets had to share the memory bus which greatly limited scalability. These days the memory controller traffic is on a separate bus. The Intel QPI, and AMD HyperTransport, buses are used for cache coherence between sockets. The cache controller exists as a module within each L3 cache segment that is connected to the on-socket ring-bus network. Each core, L3 cache segment, QPI controller, memory controller, and integrated graphics sub-system are connected to this ring-bus. The ring is made up of 4 independent lanes for: request, snoop, acknowledge, and 32-bytes data per cycle. The L3 cache is inclusive in that any cache-line held in the L1 or L2 caches is also held in the L3. This provides for rapid identification of the core containing a modified line when snooping for changes. The cache controller for the L3 segment keeps track of which core could have a modified version of a cache-line it owns. If a core wants to read some memory, and it does not have it in a Shared, Exclusive, or Modified state; then it must make a read on the ring bus. It will then either be read from main-memory if not in the cache sub-systems, or read from L3 if clean, or snooped from another core if Modified. In any case the read will never return a stale copy from the cache sub-system, it is guaranteed to be coherent. Concurrent Programming If our caches are always coherent then why do we worry about visibility when writing concurrent programs? This is because within our cores, in their quest for ever greater performance, data modifications can appear out-of-order to other threads. There are 2 major reasons for this. Firstly, our compilers can generate programs that store variables in registers for relatively long periods of time for performance reasons, e.g. variables used repeatedly within a loop. If we need these variables to be visible across cores then the updates must not be register allocated. This is achieved in C by qualifying a variable as "volatile". Beware that C/C++ volatile is inadequate for telling the compiler to order other instructions. For this you need fences/barriers. The second major issue with ordering we have to be aware of is a thread could write a variable and then, if it reads it shortly after, could see the value in its store buffer which may be older than the latest value in the cache sub-system. This is never an issue for algorithms following the Single Writer Principle but is an issue for the likes of the Dekker and Peterson lock algorithms. To overcome this issue, and ensure the latest value is observed, the thread must wait for the store buffer to drain on that core. This can be achieved by issuing a fence instruction. The write of a volatile variable in Java, in addition to never being register allocated, is accompanied by a full fence instruction. This fence instruction on x86 has a significant performance impact by preventing progress on the issuing thread until the store buffer is drained. Fences on other processors can have more efficient implementations that simply put a marker in the store buffer for the search boundary, e.g. the Azul Vega does this. If you want to ensure memory ordering across Java threads when following the Single Writer Principle, and avoid the store fence, it is possible by using the j.u.c.Atomic(Int|Long|Reference).lazySet() method, as opposed to setting a volatile variable. The Fallacy Returning to the fallacy of "flushing the cache" as part of a concurrent algorithm. I think we can safely say that we never "flush" the CPU cache within our user space programs. I believe the source of this fallacy is the need to flush, mark or drain to a point, the store buffer for some classes of concurrent algorithms so the latest value can be observed on a subsequent load operation. For this we require a memory ordering fence and not a cache flush. Another possible source of this fallacy is that L1 caches, or the TLB, may need to be flushed based on address indexing policy on a context switch. ARM, previous to ARMv6, did not use address space tags on TLB entries thus requiring the whole L1 cache to be flushed on a context switch. Many processors require the L1 instruction cache to be flushed for similar reasons, in many cases this is simply because instruction caches are not required to be kept coherent. The bottom line is, context switching is expensive and a bit off topic, so in addition to the cache pollution of the L2, a context switch can also cause the TLB and/or L1 caches to require a flush. Intel x86 processors require only a TLB flush on context switch.
February 15, 2013
by Martin Thompson
· 11,485 Views · 3 Likes
article thumbnail
The Saga Pattern and That Architecture vs. Design Thing
It has been few months since SOA Patterns was published and so far the book sold somewhere between 2K-3K copies which I guess is not bad for an unknown author – so first off, thanks to all of you who bought a copy (by the way, if you found the book useful I’d be grateful if you could also rate it on Amazon so that others would know about it too) I know at least a few of you actually read the book as from time to time I get questions about it :). Not all the questions are interesting to “the general public” but some are. One interesting question I got is about the so called “Canonical schema pattern“. I have a post in the making (for too long now,sorry about that Bill) that explains why I don’t consider it a pattern and why I think it verges on being an anti-pattern. Another question I got more recently, which is also the subject of this post, was about the Saga pattern. Here is (most of) the email I got from Ashic : “Garcia-Molina’s paper focuses on failure management and compensation so as to prevent partial success. It discusses a variety of approaches – with an SEC, with application code outside of the database, backward-forward and even forward-only (the latter having no “compensate” step per activity, rather a forward flow that takes care of the partial success). Nowadays, I see two viewpoints regarding sagas: 1. People calling process managers sagas, which is obviously incorrect. [e.g. NServiceBus "sagas".] 2. People focusing very strongly on a “context” of work, whereby the context gets passed around from activity to activity. For linear up front workflows, routing slips are an easy solution. An example of this can be found at Clemens’s post here: http://vasters.com/clemensv/2012/09/01/Sagas.aspx . For more complicated workflows, graph-like slips may be used. After discussing with some enthusiasts, they seem very keen to suggest that the context has to move along. They seem to reject the notion of a saga where a central coordinator controls the process. In other words, even if a process manager takes care of only routing messages, and that routing includes compensations to alleviate partial successes, they are unwilling (sometimes vehemently) to call that a saga. They acknowledge it can be useful, but say that is not a saga. I find this to be confusing. In this case the process manager acts as the SEC would in a Garcia-Molina saga capable database. This approach still allows interleaved transactions (or steps) without a global lock. Why would this not be a saga? In your book, I did see you mentioned orchestration as a way of implementing sagas. However, when this was brought up, the proponents of point 2 suggest that that is not what you really mean. To me it seems quite clear, and it aligns with Hector’s paper. I just want to make sure I have this right. I’d love your thoughts on this.” Let’s start with the answer to the question: When I think about the Saga pattern I see it as the application of the notions in the Garcia-Molina paper (which talked about databases) to SOA. In other words, I see sagas as the notion of getting distributed agreement of a process with reduced guarantees (vs. distributed transactions that propose ACID guarantees across systems). – So,basically, a Saga is loose transaction-like flow where, in case of failures, involved services perform compensation steps (which may be nothing, a complete undo or something else entirely). The Saga pattern can augment this process with temporary promises (which I call reservations). Under this definition both centrally managed processes and a “choreographed” processes are Sagas – as long as the semantics and intent mentioned above are kept. The centrally managed orchestration provides visibility of processes, ease of management etc; The cooperative event based, context shared sagas provide flexibility and allow serendipity of new processes; Both have merit and both have a place, at least in my opinion :) The main reason both of these, very different, approaches are valid designs and implementations for the Saga pattern is that the Saga pattern (like others in the book) is an Architectural pattern and not a Design pattern. Which brings us to the second reason for this post, the difference between “Architecture” and “Design”. In a nutshell, architecture is a type of design where the focus is quality attributes and wide(er) scope whereas design focuses on functional requirements and more localized concerns. The Saga pattern is an architectural pattern that focused on the integrity reliability quality attributes and it pertains to the communication patterns between services. When it comes to design the implementation of the pattern. you need to decide how to implement the concerns and roles defined in the pattern -e.g. controlling the flow and the status of the saga. One decision can be to implement it centrally and use orchestration another decision can be to decentralize it and use context… Design decision can be very meaningful sometimes it can be hard to find what’s left of the architecture – consider for example the whole idea behind blogging and RSS feeds. The architectural notion is a publish/subscribe system where the blog writer publish an “event” (a new post) and subscribers get a copy. When it came to design and implementation, considering it was implemented on top of HTTP and REST where there is no publish/subscribe capability it was actually designed as a pull system where the publisher provides a list of recent changes (the feed) and subscribers sample it and check if anything changed since the last time. So architecturally pub/sub, design pull a centralized server that exposes latest changes – a really big difference Does it matter at all? I think yes. Architecture lets us think about the system at a higher level of abstraction and thus tackle more complex systems. When we design and focus on more local issues we can tackle the nitty gritty details and make sure things actually work. we need to check the effects of design on architecture and vice versa to make sure the whole thing sticks together and actually does what we want/need. Note that architecture and design are not the complete story – another variable is the technology (e.g. HTTP in the example above) which affects the design decision and thus also the architecture (you can read a little more about it in my posts on SAF)
February 1, 2013
by Arnon Rotem-gal-oz
· 9,798 Views
article thumbnail
How to Publish Maven Site Docs to BitBucket or GitHub Pages
In this post we will Utilize GitHub and/or BitBucket's static web page hosting capabilities to publish our project's Maven 3 Site Documentation. Each of the two SCM providers offer a slightly different solution to host static pages. The approach spelled out in this post would also be a viable solution to "backup" your site documentation in a supported SCM like Git or SVN. This solution does not directly cover site documentation deployment covered by the maven-site-plugin and the Wagon library (scp, WebDAV or FTP). There is one main project hosted on GitHub that I have posted with the full solution. The project URL is https://github.com/mike-ensor/clickconcepts-master-pom/. The POM has been pushed to Maven Central and will continue to be updated and maintained. com.clickconcepts.project master-site-pom 0.16 GitHub Pages GitHub hosts static pages by using a special branch "gh-pages" available to each GitHub project. This special branch can host any HTML and local resources like JavaScript, images and CSS. There is no server side development. To navigate to your static pages, the URL structure is as follows: http://.github.com/ An example of the project I am using in this blog post: http://mike-ensor.github.com/clickconcepts-master-pom/ where the first bold URL segment is a username and the second bold URL segment is the project. GitHub does allow you to create a base static hosted static site for your username by creating a repository with your username.github.com. The contents would be all of your HTML and associated static resources. This is not required to post documentation for your project, unlike the BitBucket solution. There is a GitHub Site plugin that publishes site documentation via GitHub's object API but this is outside the scope of this blog post because it does not provide a single solution for GitHub and BitBucket projects using Maven 3. BitBucket BitBucket provides a similar service to GitHub in that it hosts static HTML pages and their associated static resources. However, there is one large difference in how those pages are stored. Unlike GitHub, BitBucket requires you to create a new repository with a name fitting the convention. The files will be located on the master branch and each project will need to be a directory off of the root. mikeensor.bitbucket.org/ /some-project +index.html +... /css /img /some-other-project +index.html +... /css /img index.html .git .gitignore The naming convention is as follows: .bitbucket.org An example of a BitBucket static pages repository for me would be: http://mikeensor.bitbucket.org/. The structure does not require that you create an index.html page at the root of the project, but it would be advisable to avoid 404s. Generating Site Documentation Maven provides the ability to post documentation for your project by using the maven-site-plugin. This plugin is difficult to use due to the many configuration options that oftentimes are not well documented. There are many blog posts that can help you write your documentation including my post on maven site documentation. I did not mention how to use "xdoc", "apt" or other templating technologies to create documentation pages, but not to fear, I have provided this in my GitHub project. Putting it all Together The Maven SCM Publish plugin (http://maven.apache.org/plugins/maven-scm-publish-plugin/ publishes site documentation to a supported SCM. In our case, we are going to use Git through BitBucket or GitHub. Maven SCM Plugin does allow you to publish multi-module site documentation through the various properties, but the scope of this blog post is to cover single/mono module projects and the process is a bit painful. Take a moment to look at the POM file located in the clickconcepts-master-pom project. This master POM is rather comprehensive and the site documentation is only one portion of the project, but we will focus on the site documentation. There are a few things to point out here, first, the scm-publish plugin and the idiosyncronies when implementing the plugin. In order to create the site documentation, the "site" plugin must first be run. This is accomplished by running site:site. The plugin will generate the documentation into the "target/site" folder by default. The SCM Publish Plugin, by default, looks for the site documents to be in "target/staging" and is controlled by the content parameter. As you can see, there is a mismatch between folders. NOTE: My first approach was to run the site:stage command which is supposed to put the site documents into the "target/staging" folder. This is not entirely correct, the site plugin combines with the distributionManagement.site.url property to stage the documents, but there is very strange behavior and it is not documented well. In order to get the site plugin's site documents and the SCM Publish's location to match up, use the content property and set that to the location of the Site Plugin output (). If you are using GitHub, there is no modification to the siteOutputDirectory needed, however, if you are using BitBucket, you will need to modify the property to add in a directory layer into the site documentation generation (see above for differences between GitHub and BitBucket pages). The second property will tell the SCM Publish Plugin to look at the root "site" folder so that when the files are copied into the repository, the project folder will be the containing folder. The property will look like: ${project.build.directory}/site/ ${project.artifactId} ${project.build.directory} /site Next we will take a look at the custom properties defined in the master POM and used by the SCM Publish Plugin above. Each project will need to define several properties to use the Master POM that are used within the plugins during the site publishing. Fill in the variables with your own settings. BitBucket ... ... master scm:git:[email protected]:mikeensor/mikeensor.bitbucket.org.git ${project.build.directory}/site/${project.artifactId} ${project.build.directory}/site ${changelog.bitbucket.fileUri} ${changelog.revision.bitbucket.fileUri} ... ... GitHub ... ... gh-pages scm:git:[email protected]:mikeensor/clickconcepts-master-pom.git ${changelog.github.fileUri} ${changelog.revision.github.fileUri} ... ... NOTE: changelog parameters are required to use the Master POM and are not directly related to publishing site docs to GitHub or BitBucket How to Generate If you are using the Master POM (or have abstracted out the Site Plugin and the SCM Plugin) then to generate and publish the documentation is simple. mvn clean site:site scm-publish:publish-scm mvn clean site:site scm-publish:publish-scm -Dscmpublish.dryRun=true Gotchas In the SCM Publish Plugin documentation's "tips" they recommend creating a location to place the repository so that the repo is not cloned each time. There is a risk here in that if there is a git repository already in the folder, the plugin will overwrite the repository with the new site documentation. This was discovered by publishing two different projects and having my root repository wiped out by documentation from the second project. There are ways to mitigate this by adding in another folder layer, but make sure you test often! Another gotcha is to use the -Dscmpublish.dryRun=true to test out the site documentation process without making the SCM commit and push Project and Documentation URLs Here is a list of the fully working projects used to create this blog post: Master POM with Site and SCM Publish plugins &ndash https://github.com/mike-ensor/clickconcepts-master-pom. Documentation URL: http://mike-ensor.github.com/clickconcepts-master-pom/ Child Project using Master Pom &ndash http://mikeensor.bitbucket.org/fest-expected-exception. Documentation URL: http://mikeensor.bitbucket.org/fest-expected-exception/
January 23, 2013
by Mike Ensor
· 13,411 Views
article thumbnail
Using Redis with Spring
As NoSQL solutions are getting more and more popular for many kind of problems, more often the modern projects consider to use some (or several) of NoSQLs instead (or side-by-side) of traditional RDBMS. I have already covered my experience with MongoDB in this, this and this posts. In this post I would like to switch gears a bit towards Redis, an advanced key-value store. Aside from very rich key-value semantics, Redis also supports pub-sub messaging and transactions. In this post I am going just to touch the surface and demonstrate how simple it is to integrate Redis into your Spring application. As always, we will start with Maven POM file for our project: 4.0.0 com.example.spring redis 0.0.1-SNAPSHOT jar UTF-8 3.1.0.RELEASE org.springframework.data spring-data-redis 1.0.0.RELEASE cglib cglib-nodep 2.2 log4j log4j 1.2.16 redis.clients jedis 2.0.0 jar org.springframework spring-core ${spring.version} org.springframework spring-context ${spring.version} Spring Data Redis is the another project under Spring Data umbrella which provides seamless injection of Redis into your application. The are several Redis clients for Java and I have chosen the Jedis as it is stable and recommended by Redis team at the moment of writing this post. We will start with simple configuration and introduce the necessary components first. Then as we move forward, the configuration will be extended a bit to demonstrated pub-sub capabilities. Thanks to Java config support, we will create the configuration class and have all our dependencies strongly typed, no XML anymore: package com.example.redis.config; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Configuration; import org.springframework.data.redis.connection.jedis.JedisConnectionFactory; import org.springframework.data.redis.core.RedisTemplate; import org.springframework.data.redis.serializer.GenericToStringSerializer; import org.springframework.data.redis.serializer.StringRedisSerializer; @Configuration public class AppConfig { @Bean JedisConnectionFactory jedisConnectionFactory() { return new JedisConnectionFactory(); } @Bean RedisTemplate< String, Object > redisTemplate() { final RedisTemplate< String, Object > template = new RedisTemplate< String, Object >(); template.setConnectionFactory( jedisConnectionFactory() ); template.setKeySerializer( new StringRedisSerializer() ); template.setHashValueSerializer( new GenericToStringSerializer< Object >( Object.class ) ); template.setValueSerializer( new GenericToStringSerializer< Object >( Object.class ) ); return template; } } That's basically everything we need assuming we have single Redis server up and running on localhost with default configuration. Let's consider several common uses cases: setting a key to some value, storing the object and, finally, pub-sub implementation. Storing and retrieving a key/value pair is very simple: @Autowired private RedisTemplate< String, Object > template; public Object getValue( final String key ) { return template.opsForValue().get( key ); } public void setValue( final String key, final String value ) { template.opsForValue().set( key, value ); } Optionally, the key could be set to expire (yet another useful feature of Redis), f.e. let our keys expire in 1 second: public void setValue( final String key, final String value ) { template.opsForValue().set( key, value ); template.expire( key, 1, TimeUnit.SECONDS ); } Arbitrary objects could be saved into Redis as hashes (maps), f.e. let save instance of some class User public class User { private final Long id; private String name; private String email; // Setters and getters are omitted for simplicity } into Redis using key pattern "user:": public void setUser( final User user ) { final String key = String.format( "user:%s", user.getId() ); final Map< String, Object > properties = new HashMap< String, Object >(); properties.put( "id", user.getId() ); properties.put( "name", user.getName() ); properties.put( "email", user.getEmail() ); template.opsForHash().putAll( key, properties); } Respectively, object could easily be inspected and retrieved using the id. public User getUser( final Long id ) { final String key = String.format( "user:%s", id ); final String name = ( String )template.opsForHash().get( key, "name" ); final String email = ( String )template.opsForHash().get( key, "email" ); return new User( id, name, email ); } There are much, much more which could be done using Redis, I highly encourage to take a look on it. It surely is not a silver bullet but could solve many challenging problems very easy. Finally, let me show how to use a pub-sub messaging with Redis. Let's add a bit more configuration here (as part of AppConfig class): @Bean MessageListenerAdapter messageListener() { return new MessageListenerAdapter( new RedisMessageListener() ); } @Bean RedisMessageListenerContainer redisContainer() { final RedisMessageListenerContainer container = new RedisMessageListenerContainer(); container.setConnectionFactory( jedisConnectionFactory() ); container.addMessageListener( messageListener(), new ChannelTopic( "my-queue" ) ); return container; } The style of message listener definition should look very familiar to Spring users: generally, the same approach we follow to define JMS message listeners. The missed piece is our RedisMessageListener class definition: package com.example.redis.impl; import org.springframework.data.redis.connection.Message; import org.springframework.data.redis.connection.MessageListener; public class RedisMessageListener implements MessageListener { @Override public void onMessage(Message message, byte[] paramArrayOfByte) { System.out.println( "Received by RedisMessageListener: " + message.toString() ); } } Now, when we have our message listener, let see how we could push some messages into the queue using Redis. As always, it's pretty simple: @Autowired private RedisTemplate< String, Object > template; public void publish( final String message ) { template.execute( new RedisCallback< Long >() { @SuppressWarnings( "unchecked" ) @Override public Long doInRedis( RedisConnection connection ) throws DataAccessException { return connection.publish( ( ( RedisSerializer< String > )template.getKeySerializer() ).serialize( "queue" ), ( ( RedisSerializer< Object > )template.getValueSerializer() ).serialize( message ) ); } } ); } That's basically it for very quick introduction but definitely enough to fall in love with Redis.
January 17, 2013
by Andriy Redko
· 81,275 Views · 36 Likes
article thumbnail
Hazelcast Distributed Execution with Spring
The ExecutorService feature had come with Java 5 and is under the java.util.concurrent package. It extends the Executor interface and provides a thread pool functionality to execute asynchronous short tasks. Java Executor Service Types is suggested to look over basic ExecutorService implementation. Also ThreadPoolExecutor is a very useful implementation of ExecutorService ınterface. It extends AbstractExecutorService providing default implementations of ExecutorService execution methods. It provides improved performance when executing large numbers of asynchronous tasks and maintains basic statistics, such as the number of completed tasks. How to develop and monitor Thread Pool Services by using Spring is also suggested to investigate how to develop and monitor Thread Pool Services. So far, we have just talked Undistributed Executor Service implementation. Let us also investigate Distributed Executor Service. Hazelcast Distributed Executor Service feature is a distributed implementation of java.util.concurrent.ExecutorService. It allows to execute business logic in cluster. There are four alternative ways to realize it : 1) The logic can be executed on a specific cluster member which is chosen. 2) The logic can be executed on the member owning the key which is chosen. 3) The logic can be executed on the member Hazelcast will pick. 4) The logic can be executed on all or subset of the cluster members. This article shows how to develop Distributed Executor Service via Hazelcast and Spring. Used Technologies : JDK 1.7.0_09 Spring 3.1.3 Hazelcast 2.4 Maven 3.0.4 STEP 1 : CREATE MAVEN PROJECT A maven project is created as below. (It can be created by using Maven or IDE Plug-in). STEP 2 : LIBRARIES Firstly, Spring dependencies are added to Maven’ s pom.xml 3.1.3.RELEASE UTF-8 org.springframework spring-core ${spring.version} org.springframework spring-context ${spring.version} com.hazelcast hazelcast-all 2.4 log4j log4j 1.2.16 maven-compiler-plugin(Maven Plugin) is used to compile the project with JDK 1.7 org.apache.maven.plugins maven-compiler-plugin 3.0 1.7 1.7 maven-shade-plugin(Maven Plugin) can be used to create runnable-jar org.apache.maven.plugins maven-shade-plugin 2.0 package shade com.onlinetechvision.exe.Application META-INF/spring.handlers META-INF/spring.schemas STEP 3 : CREATE Customer BEAN A new Customer bean is created. This bean will be distributed between two node in OTV cluster. In the following sample, all defined properties(id, name and surname)’ types are String and standart java.io.Serializable interface has been implemented for serializing. If custom or third-party object types are used, com.hazelcast.nio.DataSerializable interface can be implemented for better serialization performance. package com.onlinetechvision.customer; import java.io.Serializable; /** * Customer Bean. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public class Customer implements Serializable { private static final long serialVersionUID = 1856862670651243395L; private String id; private String name; private String surname; public String getId() { return id; } public void setId(String id) { this.id = id; } public String getName() { return name; } public void setName(String name) { this.name = name; } public String getSurname() { return surname; } public void setSurname(String surname) { this.surname = surname; } @Override public int hashCode() { final int prime = 31; int result = 1; result = prime * result + ((id == null) ? 0 : id.hashCode()); result = prime * result + ((name == null) ? 0 : name.hashCode()); result = prime * result + ((surname == null) ? 0 : surname.hashCode()); return result; } @Override public boolean equals(Object obj) { if (this == obj) return true; if (obj == null) return false; if (getClass() != obj.getClass()) return false; Customer other = (Customer) obj; if (id == null) { if (other.id != null) return false; } else if (!id.equals(other.id)) return false; if (name == null) { if (other.name != null) return false; } else if (!name.equals(other.name)) return false; if (surname == null) { if (other.surname != null) return false; } else if (!surname.equals(other.surname)) return false; return true; } @Override public String toString() { return "Customer [id=" + id + ", name=" + name + ", surname=" + surname + "]"; } } STEP 4 : CREATE ICacheService INTERFACE A new ICacheService Interface is created for service layer to expose cache functionality. package com.onlinetechvision.cache.srv; import com.hazelcast.core.IMap; import com.onlinetechvision.customer.Customer; /** * A new ICacheService Interface is created for service layer to expose cache functionality. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public interface ICacheService { /** * Adds Customer entries to cache * * @param String key * @param Customer customer * */ void addToCache(String key, Customer customer); /** * Deletes Customer entries from cache * * @param String key * */ void deleteFromCache(String key); /** * Gets Customer cache * * @return IMap Coherence named cache */ IMap getCache(); } STEP 5 : CREATE CacheService IMPLEMENTATION CacheService is implementation of ICacheService Interface. package com.onlinetechvision.cache.srv; import com.hazelcast.core.IMap; import com.onlinetechvision.customer.Customer; import com.onlinetechvision.test.listener.CustomerEntryListener; /** * CacheService Class is implementation of ICacheService Interface. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public class CacheService implements ICacheService { private IMap customerMap; /** * Constructor of CacheService * * @param IMap customerMap * */ @SuppressWarnings("unchecked") public CacheService(IMap customerMap) { setCustomerMap(customerMap); getCustomerMap().addEntryListener(new CustomerEntryListener(), true); } /** * Adds Customer entries to cache * * @param String key * @param Customer customer * */ @Override public void addToCache(String key, Customer customer) { getCustomerMap().put(key, customer); } /** * Deletes Customer entries from cache * * @param String key * */ @Override public void deleteFromCache(String key) { getCustomerMap().remove(key); } /** * Gets Customer cache * * @return IMap Coherence named cache */ @Override public IMap getCache() { return getCustomerMap(); } public IMap getCustomerMap() { return customerMap; } public void setCustomerMap(IMap customerMap) { this.customerMap = customerMap; } } STEP 6 : CREATE IDistributedExecutorService INTERFACE A new IDistributedExecutorService Interface is created for service layer to expose distributed execution functionality. package com.onlinetechvision.executor.srv; import java.util.Collection; import java.util.Set; import java.util.concurrent.Callable; import java.util.concurrent.ExecutionException; import com.hazelcast.core.Member; /** * A new IDistributedExecutorService Interface is created for service layer to expose distributed execution functionality. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public interface IDistributedExecutorService { /** * Executes the callable object on stated member * * @param Callable callable * @param Member member * @throws InterruptedException * @throws ExecutionException * */ String executeOnStatedMember(Callable callable, Member member) throws InterruptedException, ExecutionException; /** * Executes the callable object on member owning the key * * @param Callable callable * @param Object key * @throws InterruptedException * @throws ExecutionException * */ String executeOnTheMemberOwningTheKey(Callable callable, Object key) throws InterruptedException, ExecutionException; /** * Executes the callable object on any member * * @param Callable callable * @throws InterruptedException * @throws ExecutionException * */ String executeOnAnyMember(Callable callable) throws InterruptedException, ExecutionException; /** * Executes the callable object on all members * * @param Callable callable * @param Set all members * @throws InterruptedException * @throws ExecutionException * */ Collection executeOnMembers(Callable callable, Set members) throws InterruptedException, ExecutionException; } STEP 7 : CREATE DistributedExecutorService IMPLEMENTATION DistributedExecutorService is implementation of IDistributedExecutorService Interface. package com.onlinetechvision.executor.srv; import java.util.Collection; import java.util.Set; import java.util.concurrent.Callable; import java.util.concurrent.ExecutionException; import java.util.concurrent.ExecutorService; import java.util.concurrent.Future; import java.util.concurrent.FutureTask; import org.apache.log4j.Logger; import com.hazelcast.core.DistributedTask; import com.hazelcast.core.Member; import com.hazelcast.core.MultiTask; /** * DistributedExecutorService Class is implementation of IDistributedExecutorService Interface. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public class DistributedExecutorService implements IDistributedExecutorService { private static final Logger logger = Logger.getLogger(DistributedExecutorService.class); private ExecutorService hazelcastDistributedExecutorService; /** * Executes the callable object on stated member * * @param Callable callable * @param Member member * @throws InterruptedException * @throws ExecutionException * */ @SuppressWarnings("unchecked") public String executeOnStatedMember(Callable callable, Member member) throws InterruptedException, ExecutionException { logger.debug("Method executeOnStatedMember is called..."); ExecutorService executorService = getHazelcastDistributedExecutorService(); FutureTask task = (FutureTask) executorService.submit( new DistributedTask(callable, member)); String result = task.get(); logger.debug("Result of method executeOnStatedMember is : " + result); return result; } /** * Executes the callable object on member owning the key * * @param Callable callable * @param Object key * @throws InterruptedException * @throws ExecutionException * */ @SuppressWarnings("unchecked") public String executeOnTheMemberOwningTheKey(Callable callable, Object key) throws InterruptedException, ExecutionException { logger.debug("Method executeOnTheMemberOwningTheKey is called..."); ExecutorService executorService = getHazelcastDistributedExecutorService(); FutureTask task = (FutureTask) executorService.submit(new DistributedTask(callable, key)); String result = task.get(); logger.debug("Result of method executeOnTheMemberOwningTheKey is : " + result); return result; } /** * Executes the callable object on any member * * @param Callable callable * @throws InterruptedException * @throws ExecutionException * */ public String executeOnAnyMember(Callable callable) throws InterruptedException, ExecutionException { logger.debug("Method executeOnAnyMember is called..."); ExecutorService executorService = getHazelcastDistributedExecutorService(); Future task = executorService.submit(callable); String result = task.get(); logger.debug("Result of method executeOnAnyMember is : " + result); return result; } /** * Executes the callable object on all members * * @param Callable callable * @param Set all members * @throws InterruptedException * @throws ExecutionException * */ public Collection executeOnMembers(Callable callable, Set members) throws ExecutionException, InterruptedException { logger.debug("Method executeOnMembers is called..."); MultiTask task = new MultiTask(callable, members); ExecutorService executorService = getHazelcastDistributedExecutorService(); executorService.execute(task); Collection results = task.get(); logger.debug("Result of method executeOnMembers is : " + results.toString()); return results; } public ExecutorService getHazelcastDistributedExecutorService() { return hazelcastDistributedExecutorService; } public void setHazelcastDistributedExecutorService(ExecutorService hazelcastDistributedExecutorService) { this.hazelcastDistributedExecutorService = hazelcastDistributedExecutorService; } } STEP 8 : CREATE TestCallable CLASS TestCallable Class shows business logic to be executed. TestCallable task for first member of the cluster : package com.onlinetechvision.task; import java.io.Serializable; import java.util.concurrent.Callable; /** * TestCallable Class shows business logic to be executed. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public class TestCallable implements Callable, Serializable{ private static final long serialVersionUID = -1839169907337151877L; /** * Computes a result, or throws an exception if unable to do so. * * @return String computed result * @throws Exception if unable to compute a result */ public String call() throws Exception { return "First Member' s TestCallable Task is called..."; } } TestCallable task for second member of the cluster : package com.onlinetechvision.task; import java.io.Serializable; import java.util.concurrent.Callable; /** * TestCallable Class shows business logic to be executed. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public class TestCallable implements Callable, Serializable{ private static final long serialVersionUID = -1839169907337151877L; /** * Computes a result, or throws an exception if unable to do so. * * @return String computed result * @throws Exception if unable to compute a result */ public String call() throws Exception { return "Second Member' s TestCallable Task is called..."; } } STEP 9 : CREATE AnotherAvailableMemberNotFoundException CLASS AnotherAvailableMemberNotFoundException is thrown when another available member is not found. To avoid this exception, first node should be started before the second node. package com.onlinetechvision.exception; /** * AnotherAvailableMemberNotFoundException is thrown when another available member is not found. * To avoid this exception, first node should be started before the second node. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public class AnotherAvailableMemberNotFoundException extends Exception { private static final long serialVersionUID = -3954360266393077645L; /** * Constructor of AnotherAvailableMemberNotFoundException * * @param String Exception message * */ public AnotherAvailableMemberNotFoundException(String message) { super(message); } } STEP 10 : CREATE CustomerEntryListener CLASS CustomerEntryListener Class listens entry changes on named cache object. package com.onlinetechvision.test.listener; import com.hazelcast.core.EntryEvent; import com.hazelcast.core.EntryListener; /** * CustomerEntryListener Class listens entry changes on named cache object. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ @SuppressWarnings("rawtypes") public class CustomerEntryListener implements EntryListener { /** * Invoked when an entry is added. * * @param EntryEvent * */ public void entryAdded(EntryEvent ee) { System.out.println("EntryAdded... Member : " + ee.getMember() + ", Key : "+ee.getKey()+", OldValue : "+ee.getOldValue()+", NewValue : "+ee.getValue()); } /** * Invoked when an entry is removed. * * @param EntryEvent * */ public void entryRemoved(EntryEvent ee) { System.out.println("EntryRemoved... Member : " + ee.getMember() + ", Key : "+ee.getKey()+", OldValue : "+ee.getOldValue()+", NewValue : "+ee.getValue()); } /** * Invoked when an entry is evicted. * * @param EntryEvent * */ public void entryEvicted(EntryEvent ee) { } /** * Invoked when an entry is updated. * * @param EntryEvent * */ public void entryUpdated(EntryEvent ee) { } } STEP 11 : CREATE Starter CLASS Starter Class loads Customers to cache and executes distributed tasks. Starter Class of first member of the cluster : package com.onlinetechvision.exe; import com.onlinetechvision.cache.srv.ICacheService; import com.onlinetechvision.customer.Customer; /** * Starter Class loads Customers to cache and executes distributed tasks. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public class Starter { private ICacheService cacheService; /** * Loads cache and executes the tasks * */ public void start() { loadCacheForFirstMember(); } /** * Loads Customers to cache * */ public void loadCacheForFirstMember() { Customer firstCustomer = new Customer(); firstCustomer.setId("1"); firstCustomer.setName("Jodie"); firstCustomer.setSurname("Foster"); Customer secondCustomer = new Customer(); secondCustomer.setId("2"); secondCustomer.setName("Kate"); secondCustomer.setSurname("Winslet"); getCacheService().addToCache(firstCustomer.getId(), firstCustomer); getCacheService().addToCache(secondCustomer.getId(), secondCustomer); } public ICacheService getCacheService() { return cacheService; } public void setCacheService(ICacheService cacheService) { this.cacheService = cacheService; } } Starter Class of second member of the cluster : package com.onlinetechvision.exe; import java.util.Set; import java.util.concurrent.ExecutionException; import com.hazelcast.core.Hazelcast; import com.hazelcast.core.HazelcastInstance; import com.hazelcast.core.Member; import com.onlinetechvision.cache.srv.ICacheService; import com.onlinetechvision.customer.Customer; import com.onlinetechvision.exception.AnotherAvailableMemberNotFoundException; import com.onlinetechvision.executor.srv.IDistributedExecutorService; import com.onlinetechvision.task.TestCallable; /** * Starter Class loads Customers to cache and executes distributed tasks. * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public class Starter { private String hazelcastInstanceName; private Hazelcast hazelcast; private IDistributedExecutorService distributedExecutorService; private ICacheService cacheService; /** * Loads cache and executes the tasks * */ public void start() { loadCache(); executeTasks(); } /** * Loads Customers to cache * */ public void loadCache() { Customer firstCustomer = new Customer(); firstCustomer.setId("3"); firstCustomer.setName("Bruce"); firstCustomer.setSurname("Willis"); Customer secondCustomer = new Customer(); secondCustomer.setId("4"); secondCustomer.setName("Colin"); secondCustomer.setSurname("Farrell"); getCacheService().addToCache(firstCustomer.getId(), firstCustomer); getCacheService().addToCache(secondCustomer.getId(), secondCustomer); } /** * Executes Tasks * */ public void executeTasks() { try { getDistributedExecutorService().executeOnStatedMember(new TestCallable(), getAnotherMember()); getDistributedExecutorService().executeOnTheMemberOwningTheKey(new TestCallable(), "3"); getDistributedExecutorService().executeOnAnyMember(new TestCallable()); getDistributedExecutorService().executeOnMembers(new TestCallable(), getAllMembers()); } catch (InterruptedException | ExecutionException | AnotherAvailableMemberNotFoundException e) { e.printStackTrace(); } } /** * Gets cluster members * * @return Set Set of Cluster Members * */ private Set getAllMembers() { Set members = getHazelcastLocalInstance().getCluster().getMembers(); return members; } /** * Gets an another member of cluster * * @return Member Another Member of Cluster * @throws AnotherAvailableMemberNotFoundException An Another Available Member can not found exception */ private Member getAnotherMember() throws AnotherAvailableMemberNotFoundException { Set members = getAllMembers(); for(Member member : members) { if(!member.localMember()) { return member; } } throw new AnotherAvailableMemberNotFoundException("No Other Available Member on the cluster. Please be aware that all members are active on the cluster"); } /** * Gets Hazelcast local instance * * @return HazelcastInstance Hazelcast local instance */ @SuppressWarnings("static-access") private HazelcastInstance getHazelcastLocalInstance() { HazelcastInstance instance = getHazelcast().getHazelcastInstanceByName(getHazelcastInstanceName()); return instance; } public String getHazelcastInstanceName() { return hazelcastInstanceName; } public void setHazelcastInstanceName(String hazelcastInstanceName) { this.hazelcastInstanceName = hazelcastInstanceName; } public Hazelcast getHazelcast() { return hazelcast; } public void setHazelcast(Hazelcast hazelcast) { this.hazelcast = hazelcast; } public IDistributedExecutorService getDistributedExecutorService() { return distributedExecutorService; } public void setDistributedExecutorService(IDistributedExecutorService distributedExecutorService) { this.distributedExecutorService = distributedExecutorService; } public ICacheService getCacheService() { return cacheService; } public void setCacheService(ICacheService cacheService) { this.cacheService = cacheService; } } STEP 12 : CREATE hazelcast-config.properties FILE hazelcast-config.properties file shows the properties of cluster members. First member properties : hz.instance.name = OTVInstance1 hz.group.name = dev hz.group.password = dev hz.management.center.enabled = true hz.management.center.url = http://localhost:8080/mancenter hz.network.port = 5701 hz.network.port.auto.increment = false hz.tcp.ip.enabled = true hz.members = 192.168.1.32 hz.executor.service.core.pool.size = 2 hz.executor.service.max.pool.size = 30 hz.executor.service.keep.alive.seconds = 30 hz.map.backup.count=2 hz.map.max.size=0 hz.map.eviction.percentage=30 hz.map.read.backup.data=true hz.map.cache.value=true hz.map.eviction.policy=NONE hz.map.merge.policy=hz.ADD_NEW_ENTRY Second member properties : hz.instance.name = OTVInstance2 hz.group.name = dev hz.group.password = dev hz.management.center.enabled = true hz.management.center.url = http://localhost:8080/mancenter hz.network.port = 5702 hz.network.port.auto.increment = false hz.tcp.ip.enabled = true hz.members = 192.168.1.32 hz.executor.service.core.pool.size = 2 hz.executor.service.max.pool.size = 30 hz.executor.service.keep.alive.seconds = 30 hz.map.backup.count=2 hz.map.max.size=0 hz.map.eviction.percentage=30 hz.map.read.backup.data=true hz.map.cache.value=true hz.map.eviction.policy=NONE hz.map.merge.policy=hz.ADD_NEW_ENTRY STEP 13 : CREATE applicationContext-hazelcast.xml Spring Hazelcast Configuration file, applicationContext-hazelcast.xml, is created and Hazelcast Distributed Executor Service and Hazelcast Instance are configured. ${hz.instance.name} ${hz.members} STEP 14 : CREATE applicationContext.xml Spring Configuration file, applicationContext.xml, is created. classpath:/hazelcast-config.properties STEP 15 : CREATE Application CLASS Application Class is created to run the application. ackage com.onlinetechvision.exe; import org.springframework.context.ApplicationContext; import org.springframework.context.support.ClassPathXmlApplicationContext; /** * Application class starts the application * * @author onlinetechvision.com * @since 27 Nov 2012 * @version 1.0.0 * */ public class Application { /** * Starts the application * * @param String[] args * */ public static void main(String[] args) { ApplicationContext context = new ClassPathXmlApplicationContext("applicationContext.xml"); Starter starter = (Starter) context.getBean("starter"); starter.start(); } } STEP 16 : BUILD PROJECT After OTV_Spring_Hazelcast_DistributedExecution Project is built, OTV_Spring_Hazelcast_DistributedExecution-0.0.1-SNAPSHOT.jar will be created. Important Note : The Members of the cluster have got different configuration for Coherence so the project should be built separately for each member. STEP 17 : INTEGRATION with HAZELCAST MANAGEMENT CENTER Hazelcast Management Center enables to monitor and manage nodes in the cluster. Entity and backup counts which are owned by customerMap, can be seen via Map Memory Data Table. We have distributed 4 entries via customerMap as shown below : Sample keys and values can be seen via Map Browser : Added First Entry : Added Third Entry : hazelcastDistributedExecutorService details can be seen via Executors tab. We have executed 3 task on first member and 2 tasks on second member as shown below : STEP 18 : RUN PROJECT BY STARTING THE CLUSTER’ s MEMBER After created OTV_Spring_Hazelcast_DistributedExecution-0.0.1-SNAPSHOT.jar file is run at the cluster’ s members, the following console output logs will be shown : First member console output : Kas 25, 2012 4:07:20 PM com.hazelcast.impl.AddressPicker INFO: Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [x.y.z.t] Kas 25, 2012 4:07:20 PM com.hazelcast.impl.AddressPicker INFO: Prefer IPv4 stack is true. Kas 25, 2012 4:07:20 PM com.hazelcast.impl.AddressPicker INFO: Picked Address[x.y.z.t]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true Kas 25, 2012 4:07:21 PM com.hazelcast.system INFO: [x.y.z.t]:5701 [dev] Hazelcast Community Edition 2.4 (20121017) starting at Address[x.y.z.t]:5701 Kas 25, 2012 4:07:21 PM com.hazelcast.system INFO: [x.y.z.t]:5701 [dev] Copyright (C) 2008-2012 Hazelcast.com Kas 25, 2012 4:07:21 PM com.hazelcast.impl.LifecycleServiceImpl INFO: [x.y.z.t]:5701 [dev] Address[x.y.z.t]:5701 is STARTING Kas 25, 2012 4:07:24 PM com.hazelcast.impl.TcpIpJoiner INFO: [x.y.z.t]:5701 [dev] --A new cluster is created and First Member joins the cluster. Members [1] { Member [x.y.z.t]:5701 this } Kas 25, 2012 4:07:24 PM com.hazelcast.impl.MulticastJoiner INFO: [x.y.z.t]:5701 [dev] Members [1] { Member [x.y.z.t]:5701 this } ... -- First member adds two new entries to the cache... EntryAdded... Member : Member [x.y.z.t]:5701 this, Key : 1, OldValue : null, NewValue : Customer [id=1, name=Jodie, surname=Foster] EntryAdded... Member : Member [x.y.z.t]:5701 this, Key : 2, OldValue : null, NewValue : Customer [id=2, name=Kate, surname=Winslet] ... --Second Member joins the cluster. Members [2] { Member [x.y.z.t]:5701 this Member [x.y.z.t]:5702 } ... -- Second member adds two new entries to the cache... EntryAdded... Member : Member [x.y.z.t]:5702, Key : 4, OldValue : null, NewValue : Customer [id=4, name=Colin, surname=Farrell] EntryAdded... Member : Member [x.y.z.t]:5702, Key : 3, OldValue : null, NewValue : Customer [id=3, name=Bruce, surname=Willis] Second member console output : Kas 25, 2012 4:07:48 PM com.hazelcast.impl.AddressPicker INFO: Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [x.y.z.t] Kas 25, 2012 4:07:48 PM com.hazelcast.impl.AddressPicker INFO: Prefer IPv4 stack is true. Kas 25, 2012 4:07:48 PM com.hazelcast.impl.AddressPicker INFO: Picked Address[x.y.z.t]:5702, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5702], bind any local is true Kas 25, 2012 4:07:49 PM com.hazelcast.system INFO: [x.y.z.t]:5702 [dev] Hazelcast Community Edition 2.4 (20121017) starting at Address[x.y.z.t]:5702 Kas 25, 2012 4:07:49 PM com.hazelcast.system INFO: [x.y.z.t]:5702 [dev] Copyright (C) 2008-2012 Hazelcast.com Kas 25, 2012 4:07:49 PM com.hazelcast.impl.LifecycleServiceImpl INFO: [x.y.z.t]:5702 [dev] Address[x.y.z.t]:5702 is STARTING Kas 25, 2012 4:07:49 PM com.hazelcast.impl.Node INFO: [x.y.z.t]:5702 [dev] ** setting master address to Address[x.y.z.t]:5701 Kas 25, 2012 4:07:49 PM com.hazelcast.impl.MulticastJoiner INFO: [x.y.z.t]:5702 [dev] Connecting to master node: Address[x.y.z.t]:5701 Kas 25, 2012 4:07:49 PM com.hazelcast.nio.ConnectionManager INFO: [x.y.z.t]:5702 [dev] 55715 accepted socket connection from /x.y.z.t:5701 Kas 25, 2012 4:07:55 PM com.hazelcast.cluster.ClusterManager INFO: [x.y.z.t]:5702 [dev] --Second Member joins the cluster. Members [2] { Member [x.y.z.t]:5701 Member [x.y.z.t]:5702 this } Kas 25, 2012 4:07:56 PM com.hazelcast.impl.LifecycleServiceImpl INFO: [x.y.z.t]:5702 [dev] Address[x.y.z.t]:5702 is STARTED -- Second member adds two new entries to the cache... EntryAdded... Member : Member [x.y.z.t]:5702 this, Key : 3, OldValue : null, NewValue : Customer [id=3, name=Bruce, surname=Willis] EntryAdded... Member : Member [x.y.z.t]:5702 this, Key : 4, OldValue : null, NewValue : Customer [id=4, name=Colin, surname=Farrell] 25.11.2012 16:07:56 DEBUG (DistributedExecutorService.java:42) - Method executeOnStatedMember is called... 25.11.2012 16:07:56 DEBUG (DistributedExecutorService.java:46) - Result of method executeOnStatedMember is : First Member' s TestCallable Task is called... 25.11.2012 16:07:56 DEBUG (DistributedExecutorService.java:61) - Method executeOnTheMemberOwningTheKey is called... 25.11.2012 16:07:56 DEBUG (DistributedExecutorService.java:65) - Result of method executeOnTheMemberOwningTheKey is : First Member' s TestCallable Task is called... 25.11.2012 16:07:56 DEBUG (DistributedExecutorService.java:78) - Method executeOnAnyMember is called... 25.11.2012 16:07:57 DEBUG (DistributedExecutorService.java:82) - Result of method executeOnAnyMember is : Second Member' s TestCallable Task is called... 25.11.2012 16:07:57 DEBUG (DistributedExecutorService.java:96) - Method executeOnMembers is called... 25.11.2012 16:07:57 DEBUG (DistributedExecutorService.java:101) - Result of method executeOnMembers is : [First Member' s TestCallable Task is called..., Second Member' s TestCallable Task is called...] STEP 19 : DOWNLOAD https://github.com/erenavsarogullari/OTV_Spring_Hazelcast_DistributedExecution REFERENCES : Java ExecutorService Interface Hazelcast Distributed Executor Service
December 11, 2012
by Eren Avsarogullari
· 29,924 Views · 1 Like
article thumbnail
Lightweight RPC with ZeroMQ (ØMQ) and Protocol Buffers
A frequent issue I come across writing integration applications with Mule is deciding how to communicate back and forth between my front end application, typically a web or mobile application, and a flow hosted on Mule. I could use web services and do something like annotate a component with JAX-RS and expose this out over HTTP. This is potentially overkill, particularly if I only want to host a few methods, the methods are asynchronous or I don’t want to deal with the overhead of HTTP. It also could be a lot of extra effort if the only consumers of the API, at least initially, are internal facing applications. Another choice is to use “synchronous” JMS with temporary reply queues. While Mule makes this easy to do, particularly with MuleClient, I now have to deal with the overhead of spinning up a JMS infrastructure. I could also be limited to Java only clients, depending on which JMS broker I choose. The latter is particularly signifcant, as Java probably isn’t the technology of choice on the web or mobile layer. ØMQ for RPC ØMQ, or ZeroMQ, is a networking library designed from the ground up to ease integration between distributed applications. In addition to supporting a variety of messaging patterns, which are enumerated in the extremely well written guide, the library is written in platform agnostic C with wrappers for different languages like Java, Python and Ruby. These features make it a good candidate to solve the challenges I introduced above, particularly since a community contributed module for ØMQ was released recently. Let’s consider a simple service that accepts a request for a range of stock quotes and returns the results and see how we can host this service with Mule and expose it out with the ØMQ Module. Data Serialization with Protocol Buffers Data is transported back and forth over ØMQ as byte arrays. We, as such, need to decide on a way to serialize our stock quote request and responses “on the wire.” Before we do that, however, let’s take a look at the Java canonical data model we’re using on the client and server side. The following Gists show the important bits of the StockQuote and StockQuoteResponse classes. public class StockQuote implements Serializable { String symbol; Date date; Double open; Double high; Double low; Double close; Long volume; Double adjustedClose; public class StockQuoteRequest implements Serializable { String symbol; Date startDate; Date endDate; public interface StockDataService { public List getQuote(StockQuoteRequest request); } We could use Java serialization to get the objects into byte arrays. Ignoring the other deficiencies of default Java serialization, the main drawback is that it limits our clients to one’s running on a JVM. XML or JSON provide better alternatives, but for the purposes of this example we’ll assume we want a more compact representation of the data (this isn’t totally unrealistic, stock quote data can be extremely time sensitive and we probably want to minimize serialization and deserialization overhead.) Protocol Buffers provide a good middle ground and also boast a Mule Module to provide the necessary transformers we need to move back and forth from the byte array representations. Let’s define two .proto files to define the wire format and generate the intermediary stubs for serialization. package com.acmesoft.zeromq; option java_package = "com.acmesoft.stock.model.serialization.protobuf"; option optimize_for = SPEED;package com.acmesoft.zeromq; option java_package = "com.acmesoft.stock.model.serialization.protobuf"; option optimize_for = SPEED; option java_multiple_files = true; message StockQuoteResponseBuffer { repeated StockQuoteBuffer result = 1; } message StockQuoteBuffer { required string symbol = 1; required int64 date = 2; required double open = 3; required double high = 4; required double low = 5; required double close = 6; required int64 volume = 7; required double adjustedClose = 8; } option java_multiple_files = true; message StockQuoteRequestBuffer { required string symbol = 1; required int64 start = 2; required int64 end = 3; } You typically would use the “protoc” compiler to generate the Java stubs. This is tedious, however, so we’ll instead modify the pom.xml of our project to compile the protoc files during the compile goals: com.google.protobuf.tools maven-protoc-plugin /usr/local/bin/protoc compile testCompile Since we already have a domain model we’ll add some helper classes to simplify the serialization tasks on the client side. public byte[] toProtocolBufferAsBytes() { return StockQuoteRequestBuffer.newBuilder() .setSymbol(symbol) .setStart(startDate.getTime()) .setEnd(endDate.getTime()).build().toByteArray(); } public static StockQuoteRequest fromProtocolBuffer(StockQuoteRequestBuffer buffer) { StockQuoteRequest request = new StockQuoteRequest(); request.setSymbol(buffer.getSymbol()); request.setStartDate(new Date(buffer.getStart())); request.setEndDate(new Date(buffer.getEnd())); return request; } public static StockQuoteResponseBuffer toProtocolBuffer(List quotes) { StockQuoteResponseBuffer.Builder responseBuilder = StockQuoteResponseBuffer.newBuilder(); for (StockQuote quote : quotes) { responseBuilder.addResult(StockQuoteBuffer.newBuilder() .setAdjustedClose(quote.getAdjustedClose()) .setClose(quote.getClose()) .setDate(quote.getDate().getTime()) .setHigh(quote.getHigh()) .setLow(quote.getLow()) .setOpen(quote.getOpen()) .setSymbol(quote.getSymbol()) .setVolume(quote.getVolume()).build()); } return responseBuilder.build(); } public static List listOfStockQuotesFromBytes(byte[] bytes) { List buffer; try { buffer = StockQuoteResponseBuffer.parseFrom(bytes).getResultList(); } catch (InvalidProtocolBufferException e) { throw new SerializationException(e); } List quotes = new ArrayList(); for (StockQuoteBuffer stockQuoteBuffer : buffer) { StockQuote stockQuote = new StockQuote(); stockQuote.setClose(stockQuoteBuffer.getClose()); stockQuote.setDate(new Date(stockQuoteBuffer.getDate())); stockQuote.setHigh(stockQuoteBuffer.getHigh()); stockQuote.setOpen(stockQuoteBuffer.getOpen()); stockQuote.setSymbol(stockQuoteBuffer.getSymbol()); stockQuote.setVolume(stockQuoteBuffer.getVolume()); stockQuote.setAdjustedClose(stockQuoteBuffer.getAdjustedClose()); stockQuote.setLow(stockQuoteBuffer.getLow()); quotes.add(stockQuote); } return quotes; } Configuring StockDataService Now that we have a canonical data model and a wire format defined we’re ready to wire up a Mule flow to expose the service out. Note that for this to work you need to have jzmq installed locally on your system. The following dependency needs to be added to your pom.xml once its installed: org.zeromq zmq 2.2.0 /usr/local/lib/zmq.jar system Where systemPath is the location of the zmq.jar on your filesystem. Once that’s out of the way we can configure the flow, as illustrated below: The ZeroMQ inbound-endpoint will be bound to TCP port 9090 with a request-response exchange pattern. The deserialize MP in the protobuf module will deserialize the byte array to the generated StockQuoteRequestBuffer class. From there we’ll use MEL to invoke the helper method on StockQuoteRequest to transform the intermediary class to the domain model. The List of StockQuotes returned from StockDataService will be transformed by the MEL expression using the “toProtocolBuffer” helper method on the domain model. The Protocol Buffer Module is then smart enough to implicitly transform the intermediary object to a byte array for the response. Consuming the Service from the Client Side Now that the server is ready we can turn our attention to the client side code to invoke the remote service. Let’s take a look at how this works: StockQuoteRequest stockQuoteRequest = new StockQuoteRequest(); stockQuoteRequest.setSymbol("FB"); stockQuoteRequest.setStartDate(new Date( new Date().getTime() - (86400000 * 7))); stockQuoteRequest.setEndDate(new Date()); ZMQ.Socket zmqSocket = zmqContext.socket(ZMQ.REQ); zmqSocket.setReceiveTimeOut(RECEIVE_TIMEOUT); zmqSocket.connect("tcp://localhost:9090"); zmqSocket.send(stockQuoteRequest.toProtocolBufferAsBytes(), 0); List quotes = StockQuote.listOfStockQuotesFromBytes(zmqSocket.recv(0)); We start off by defining the StockQuoteRequest object to give us all the quotes for Facebook stock from the last week. We can then open up a ZMQ socket, set the timeout, connect to the ZMQ socket on the remote Mule instance and send the byte representation of the StockQuoteRequest to it. zmqSocket.recv is then used to receive the bytes back from Mule. From here we can use the listOfStockQuotesFromBytes helper method we wrote above to convert the Protocol Buffer representation to a List of StockQuotes. Despite the fair bit of plumbing we did above, this is a pretty concise bit of client side code to invoke the remote service. Conclusion This blog post only touched on the features of ØMQ and the ØMQ Mule Module. In addition to request-reply, other exchange-patterns are supported, like one-way, push and pull. This effectively gives you the benefits of a reliable, asynchronous messaging layer without a centralized infrastructure. I hope to cover this in a later post. Protocol buffers also seem like a natural fit as a wire format for ØMQ. protobuffers echo ØMQ’s principals of being lightweight, fast and platform agnostic. These are also, not coincidently, principals Mule shares as an integration framework. The project for this example is available on GitHub.
November 26, 2012
by John D'Emic
· 28,583 Views
article thumbnail
How to Monitor Java Garbage Collection
This is the second article in the series of "Become a Java GC Expert". In the first issue Understanding Java Garbage Collection we have learned about the processes for different GC algorithms, about how GC works, what Young and Old Generation is, what you should know about the 5 types of GC in the new JDK 7, and what the performance implications are for each of these GC types. In this article, I will explain how JVM is actually running Garbage Collection in the real time. What is GC Monitoring? Garbage Collection Monitoring refers to the process of figuring out how JVM is running GC. For example, we can find out: when an object in young has moved to old and by how much, or when stop-the-world has occurred and for how long. GC monitoring is carried out to see if JVM is running GC efficiently, and to check if additional GC tuning is necessary. Based on this information, the application can be edited or GC method can be changed (GC tuning). How to Monitor GC? There are different ways to monitor GC, but the only difference is how the GC operation information is shown. GC is done by JVM, and since the GC monitoring tools disclose the GC information provided by JVM, you will get the same results no matter how you monitor GC. Therefore, you do not need to learn all methods to monitor GC, but since it only requires a little amount of time to learn each GC monitoring method, knowing a few of them can help you use the right one for different situations and environments. The tools or JVM options listed below cannot be used universally regardless of the HVM vendor. This is because there is no need for a "standard" for disclosing GC information. In this example we will use HotSpot JVM (Oracle JVM). Since NHN is using Oracle (Sun) JVM, there should be no difficulties in applying the tools or JVM options that we are explaining here. First, the GC monitoring methods can be separated into CUI and GUI depending on the access interface. The typical CUI GC monitoring method involves using a separate CUI application called "jstat", or selecting a JVM option called "verbosegc" when running JVM. GUI GC monitoring is done by using a separate GUI application, and three most commonly used applications would be "jconsole", "jvisualvm" and "Visual GC". Let's learn more about each method. jstat jstat is a monitoring tool in HotSpot JVM. Other monitoring tools for HotSpot JVM are jps and jstatd. Sometimes, you need all three tools to monitor a Java application. jstat does not provide only the GC operation information display. It also provides class loader operation information or Just-in-Time compiler operation information. Among all the information jstat can provide, in this article we will only cover its functionality to monitor GC operating information. jstat is located in $JDK_HOME/bin, so if java or javac can run without setting a separate directory from the command line, so can jstat. You can try running the following in the command line. $> jstat –gc $ 1000 S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT 3008.0 3072.0 0.0 1511.1 343360.0 46383.0 699072.0 283690.2 75392.0 41064.3 2540 18.454 4 1.133 19.588 3008.0 3072.0 0.0 1511.1 343360.0 47530.9 699072.0 283690.2 75392.0 41064.3 2540 18.454 4 1.133 19.588 3008.0 3072.0 0.0 1511.1 343360.0 47793.0 699072.0 283690.2 75392.0 41064.3 2540 18.454 4 1.133 19.588 $> Just like in the example, the real type data will be output along with the following columns: S0C S1C S0U S1U EC EU OC OU PC. vmid (Virtual Machine ID), as its name implies, is the ID for the VM. Java applications running either on a local machine or on a remote machine can be specified using vmid. The vmid for Java application running on a local machine is called lvmid (Local vmid), and usually is PID. To find out the lvmid, you can write the PID value using a ps command or Windows task manager, but we suggest jps because PID and lvmid does not always match. jps stands for Java PS. jps shows vmids and main method information. Just like ps shows PIDs and process names. Find out the vmid of the Java application that you want to monitor by using jps, then use it as a parameter in jstat. If you use jps alone, only bootstrap information will show when several WAS instances are running in one equipment. We suggest that you use ps -ef | grep java command along with jps. GC performance data needs constant observation, therefore when running jstat, try to output the GC monitoring information on a regular basis. For example, running "jstat –gc 1000" (or 1s) will display the GC monitoring data on the console every 1 second. "jstat –gc 1000 10" will display the GC monitoring information once every 1 second for 10 times in total. There are many options other than -gc, among which GC related ones are listed below. Option Name Description gc It shows the current size for each heap area and its current usage (Ede, survivor, old, etc.), total number of GC performed, and the accumulated time for GC operations. gccapactiy It shows the minimum size (ms) and maximum size (mx) of each heap area, current size, and the number of GC performed for each area. (Does not show current usage and accumulated time for GC operations.) gccause It shows the "information provided by -gcutil" + reason for the last GC and the reason for the current GC. gcnew Shows the GC performance data for the new area. gcnewcapacity Shows statistics for the size of new area. gcold Shows the GC performance data for the old area. gcoldcapacity Shows statistics for the size of old area. gcpermcapacity Shows statistics for the permanent area. gcutil Shows the usage for each heap area in percentage. Also shows the total number of GC performed and the accumulated time for GC operations. Only looking at frequency, you will probably use -gcutil (or -gccause), -gc and -gccapacity the most in that order. -gcutil is used to check the usage of heap areas, the number of GC performed, and the total accumulated time for GC operations, while -gccapacity option and others can be used to check the actual size allocated. You can see the following output by using the -gc option: S0C S1C … GCT 1248.0 896.0 … 1.246 1248.0 896.0 … 1.246 … … … … Different jstat options show different types of columns, which are listed below. Each column information will be displayed when you use the "jstat option" listed on the right. Column Description Jstat Option S0C Displays the current size of Survivor0 area in KB -gc -gccapacity -gcnew -gcnewcapacity S1C Displays the current size of Survivor1 area in KB -gc -gccapacity -gcnew -gcnewcapacity S0U Displays the current usage of Survivor0 area in KB -gc -gcnew S1U Displays the current usage of Survivor1 area in KB -gc -gcnew EC Displays the current size of Eden area in KB -gc -gccapacity -gcnew -gcnewcapacity EU Displays the current usage of Eden area in KB -gc -gcnew OC Displays the current size of old area in KB -gc -gccapacity -gcold -gcoldcapacity OU Displays the current usage of old area in KB -gc -gcold PC Displays the current size of permanent area in KB -gc -gccapacity -gcold -gcoldcapacity -gcpermcapacity PU Displays the current usage of permanent area in KB -gc -gcold YGC The number of GC event occurred in young area -gc -gccapacity -gcnew -gcnewcapacity -gcold -gcoldcapacity -gcpermcapacity -gcutil -gccause YGCT The accumulated time for GC operations for Yong area -gc -gcnew -gcutil -gccause FGC The number of full GC event occurred -gc -gccapacity -gcnew -gcnewcapacity -gcold -gcoldcapacity -gcpermcapacity -gcutil -gccause FGCT The accumulated time for full GC operations -gc -gcold -gcoldcapacity -gcpermcapacity -gcutil -gccause GCT The total accumulated time for GC operations -gc -gcold -gcoldcapacity -gcpermcapacity -gcutil -gccause NGCMN The minimum size of new area in KB -gccapacity -gcnewcapacity NGCMX The maximum size of max area in KB -gccapacity -gcnewcapacity NGC The current size of new area in KB -gccapacity -gcnewcapacity OGCMN The minimum size of old area in KB -gccapacity -gcoldcapacity OGCMX The maximum size of old area in KB -gccapacity -gcoldcapacity OGC The current size of old area in KB -gccapacity -gcoldcapacity PGCMN The minimum size of permanent area in KB -gccapacity -gcpermcapacity PGCMX The maximum size of permanent area in KB -gccapacity -gcpermcapacity PGC The current size of permanent generation area in KB -gccapacity -gcpermcapacity PC The current size of permanent area in KB -gccapacity -gcpermcapacity PU The current usage of permanent area in KB -gc -gcold LGCC The cause for the last GC occurrence -gccause GCC The cause for the current GC occurrence -gccause TT Tenuring threshold. If copied this amount of times in young area (S0 ->S1, S1->S0), they are then moved to old area. -gcnew MTT Maximum Tenuring threshold. If copied this amount of times inside young arae, then they are moved to old area. -gcnew DSS Adequate size of survivor in KB -gcnew The advantage of jstat is that it can always monitor the GC operation data of Java applications running on local/remote machine, as long as a console can be used. From these items, the following result is output when –gcutil is used. At the time of GC tuning, pay careful attention to YGC, YGCT, FGC, FGCT and GCT. S0 S1 E O P YGC YGCT FGC FGCT GCT 0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995 0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995 0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995 These items are important because they show how much time was spent in running GC. In this example, YGC is 217 and YGCT is 0.928. So, after calculating the arithmetical average, you can see that it required about 4 ms (0.004 seconds) for each young GC. Likewise, the average full GC time us 33ms. But the arithmetical average often does not help analyzing the actual GC problem. This is due to the severe deviations in GC operation time. (In other words, if the average time is 0.067 seconds for a full GC, one GC may have lasted 1 ms while the other one lasted 57 ms.) In order to check the individual GC time instead of the arithmetical average time, it is better to use -verbosegc. -verbosegc -verbosegc is one of the JVM options specified when running a Java application. While jstat can monitor any JVM application that has not specified any options, -verbosegc needs to be specified in the beginning, so it could be seen as an unnecessary option (since jstat can be used instead). However, as -verbosegc displays easy to understand output results whenever a GC occurs, it is very helpful for monitoring rough GC information. jstat -verbosegc Monitoring Target Java application running on a machine that can log in to a terminal, or a remote Java application that can connect to the network by using jstatd Only when -verbogc was specified as a JVM starting option Output information Heap status (usage, maximum size, number of times for GC/time, etc.) Size of ew and old area before/after GC, and GC operation time Output Time Every designated time Whenever GC occurs Whenever useful When trying to observe the changes of the size of heap area When trying to see the effect of a single GC The followings are other options that can be used with -verbosegc. -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps (from JDK 6 update 4) If only -verbosegc is used, then -XX:+PrintGCDetails is applied by default. Additional options for –verbosgc are not exclusive and can be mixed and used together. When using -verbosegc, you can see the results in the following format whenever a minor GC occurs. [GC [: -> , secs] -> , secs] ] Collector Name of Collector Used for minor gc starting occupancy1 The size of young area before GC ending occupancy1 The size of young area after GC pause time1 The time when the Java application stopped running for minor GC starting occupancy3 The total size of heap area before GC ending occupancy3 The total size of heap area after GC pause time3 The time when the Java application stopped running for overall heap GC, including major GC This is an example of -verbosegc output for minor GC: S0 S1 E O P YGC YGCT FGC FGCT GCT 0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995 0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995 0.00 66.44 54.12 10.58 86.63 217 0.928 2 0.067 0.995 This is the example of output results after an Full GC occurred. [Full GC [Tenured: 3485K->4095K(4096K), 0.1745373 secs] 61244K->7418K(63104K), [Perm : 10756K->10756K(12288K)], 0.1762129 secs] [Times: user=0.19 sys=0.00, real=0.19 secs] If a CMS collector is used, then the following CMS information can be provided as well. As -verbosegc option outputs a log every time a GC event occurs, it is easy to see the changes of the heap usage rates caused by GC operation. (Java) VisualVM + Visual GC Java Visual VM is a GUI profiling/monitoring tool provided by Oracle JDK. Figure 1: VisualVM Screenshot. Instead of the version that is included with JDK, you can download Visual VM directly from its website. For the sake of convenience, the version included with JDK will be referred to as Java VisualVM (jvisualvm), and the version available from the website will be referred to as Visual VM (visualvm). The features of the two are not exactly identical, as there are slight differences, such as when installing plug-ins. Personally, I prefer the Visual VM version, which can be downloaded from the website. After running Visual VM, if you select the application that you wish to monitor from the window on the left side, you can find the "Monitoring" tab there. You can get the basic information about GC and Heap from this Monitoring tab. Though the basic GC status is also available through the basic features of VisualVM, you cannot access detailed information that is available from either jstat or -verbosegc option. If you want the detailed information provided by jstat, then it is recommended to install the Visual GC plug-in. Visual GC can be accessed in real time from the Tools menu. Figure 2: Viusal GC Installation Screenshot. By using Visual GC, you can see the information provided by running jstatd in a more intuitive way. Figure 3: Visual GC execution screenshot. HPJMeter HPJMeter is convenient for analyzing -verbosegc output results. If Visual GC can be considered as the GUI equivalent of jstat, then HPJMeter would be the GUI equivalent of -verbosgc. Of course, GC analysis is just one of the many features provided by HPJMeter. HPJMeter is a performance monitoring tool developed by HP. It can be used in HP-UX, as well as Linux and MS Windows. Originally, a tool called HPTune used to provide the GUI analysis feature for -verbosegc. However, since the HPTune feature has been integrated into HPJMeter since version 3.0, there is no need to download HPTune separately. When executing an application, the -verbosegc output results will be redirected to a separate file. You can open the redirected file with HPJMeter, which allows faster and easier GC performance data analysis through the intuitive GUI. Figure 4: HPJMeter.
October 24, 2012
by Esen Sagynov
· 99,692 Views · 7 Likes
article thumbnail
EasyNetQ Cluster Support
EasyNetQ, my super simple .NET API for RabbitMQ, now (from version 0.7.2.34) supports RabbitMQ clusters without any need to deploy a load balancer. Simply list the nodes of the cluster in the connection string ... var bus = RabbitHutch.CreateBus("host=ubuntu:5672,ubuntu:5673"); In this example I have set up a cluster on a single machine, 'ubuntu', with node 1 on port 5672 and node 2 on port 5673. When the CreateBus statement executes, EasyNetQ will attempt to connect to the first host listed (ubuntu:5672). If it fails to connect it will attempt to connect to the second host listed (ubuntu:5673). If neither node is available it will sit in a re-try loop attempting to connect to both servers every five seconds. It logs all this activity to the registered IEasyNetQLogger. You might see something like this if the first node was unavailable: DEBUG: Trying to connect ERROR: Failed to connect to Broker: 'ubuntu', Port: 5672 VHost: '/'. ExceptionMessage: 'None of the specified endpoints were reachable' DEBUG: OnConnected event fired INFO: Connected to RabbitMQ. Broker: 'ubuntu', Port: 5674, VHost: '/' If the node that EasyNetQ is connected to fails, EasyNetQ will attempt to connect to the next listed node. Once connected, it will re-declare all the exchanges and queues and re-start all the consumers. Here's an example log record showing one node failing then EasyNetQ connecting to the other node and recreating the subscribers: INFO: Disconnected from RabbitMQ Broker DEBUG: Trying to connect DEBUG: OnConnected event fired DEBUG: Re-creating subscribers INFO: Connected to RabbitMQ. Broker: 'ubuntu', Port: 5674, VHost: '/' You get automatic fail-over out of the box. That’s pretty cool. If you have multiple services using EasyNetQ to connect to a RabbitMQ cluster, they will all initially connect to the first listed node in their respective connection strings. For this reason the EasyNetQ cluster support is not really suitable for load balancing high throughput systems. I would recommend that you use a dedicated hardware or software load balancer instead, if that’s what you want.
October 14, 2012
by Mike Hadlow
· 6,851 Views
article thumbnail
Redis pub/sub Using Spring
Continuing to discover the powerful set of Redis features, the one worth mentioning about is out of the box support of pub/sub messaging. Pub/Sub messaging is essential part of many software architectures. Some software systems demand from messaging solution to provide high-performance, scalability, queues persistence and durability, fail-over support, transactions, and many more nice-to-have features, which in Java world mostly always leads to using one of JMS implementation providers. In my previous projects I have actively used Apache ActiveMQ (now moving towards Apache ActiveMQ Apollo). Though it's a great implementation, sometimes I just needed simple queuing support and Apache ActiveMQ just looked overcomplicated for that. Alternatives? Please welcome Redis pub/sub! If you are already using Redis as key/value store, few additional lines of configuration will bring pub/sub messaging to your application in no time. Spring Data Redis project abstracts very well Redis pub/sub API and provides the model so familiar to everyone who uses Spring capabilities to integrate with JMS. As always, let's start with the POM configuration file. It's pretty small and simple, includes necessary Spring dependencies, Spring Data Redis and Jedis, great Java client for Redis. 4.0.0 com.example.spring redis 0.0.1-SNAPSHOT jar UTF-8 3.1.1.RELEASE org.springframework.data spring-data-redis 1.0.1.RELEASE cglib cglib-nodep 2.2 log4j log4j 1.2.16 redis.clients jedis 2.0.0 jar org.springframework spring-core ${spring.version} org.springframework spring-context ${spring.version} org.apache.maven.plugins maven-compiler-plugin 2.3.2 1.6 1.6 Moving on to configuring Spring context, let's understand what we need to have in order for a publisher to publish some messages and for a consumer to consume them. Knowing the respective Spring abstractions for JMS will help a lot with that. we need connection factory -> JedisConnectionFactory we need a template for publisher to publish messages -> RedisTemplate we need a message listener for consumer to consume messages -> RedisMessageListenerContainer Using Spring Java configuration, let's describe our context: package com.example.redis.config; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Configuration; import org.springframework.data.redis.connection.jedis.JedisConnectionFactory; import org.springframework.data.redis.core.RedisTemplate; import org.springframework.data.redis.listener.ChannelTopic; import org.springframework.data.redis.listener.RedisMessageListenerContainer; import org.springframework.data.redis.listener.adapter.MessageListenerAdapter; import org.springframework.data.redis.serializer.GenericToStringSerializer; import org.springframework.data.redis.serializer.StringRedisSerializer; import org.springframework.scheduling.annotation.EnableScheduling; import com.example.redis.IRedisPublisher; import com.example.redis.impl.RedisMessageListener; import com.example.redis.impl.RedisPublisherImpl; @Configuration @EnableScheduling public class AppConfig { @Bean JedisConnectionFactory jedisConnectionFactory() { return new JedisConnectionFactory(); } @Bean RedisTemplate< String, Object > redisTemplate() { final RedisTemplate< String, Object > template = new RedisTemplate< String, Object >(); template.setConnectionFactory( jedisConnectionFactory() ); template.setKeySerializer( new StringRedisSerializer() ); template.setHashValueSerializer( new GenericToStringSerializer< Object >( Object.class ) ); template.setValueSerializer( new GenericToStringSerializer< Object >( Object.class ) ); return template; } @Bean MessageListenerAdapter messageListener() { return new MessageListenerAdapter( new RedisMessageListener() ); } @Bean RedisMessageListenerContainer redisContainer() { final RedisMessageListenerContainer container = new RedisMessageListenerContainer(); container.setConnectionFactory( jedisConnectionFactory() ); container.addMessageListener( messageListener(), topic() ); return container; } @Bean IRedisPublisher redisPublisher() { return new RedisPublisherImpl( redisTemplate(), topic() ); } @Bean ChannelTopic topic() { return new ChannelTopic( "pubsub:queue" ); } } Very easy and straightforward. The presence of @EnableScheduling annotation is not necessary and is required only for our publisher implementation: the publisher will publish a string message every 100 ms. package com.example.redis.impl; import java.util.concurrent.atomic.AtomicLong; import org.springframework.data.redis.core.RedisTemplate; import org.springframework.data.redis.listener.ChannelTopic; import org.springframework.scheduling.annotation.Scheduled; import com.example.redis.IRedisPublisher; public class RedisPublisherImpl implements IRedisPublisher { private final RedisTemplate< String, Object > template; private final ChannelTopic topic; private final AtomicLong counter = new AtomicLong( 0 ); public RedisPublisherImpl( final RedisTemplate< String, Object > template, final ChannelTopic topic ) { this.template = template; this.topic = topic; } @Scheduled( fixedDelay = 100 ) public void publish() { template.convertAndSend( topic.getTopic(), "Message " + counter.incrementAndGet() + ", " + Thread.currentThread().getName() ); } } And finally our message listener implementation (which just prints message on a console). package com.example.redis.impl; import org.springframework.data.redis.connection.Message; import org.springframework.data.redis.connection.MessageListener; public class RedisMessageListener implements MessageListener { @Override public void onMessage( final Message message, final byte[] pattern ) { System.out.println( "Message received: " + message.toString() ); } } Awesome, just two small classes, one configuration to wire things together and we have full pub/sub messaging support in our application! Let's run the application as standalone ... package com.example.redis; import org.springframework.context.ApplicationContext; import org.springframework.context.annotation.AnnotationConfigApplicationContext; import com.example.redis.config.AppConfig; public class RedisPubSubStarter { public static void main(String[] args) { new AnnotationConfigApplicationContext( AppConfig.class ); } } ... and see following output in a console: ... Message received: Message 1, pool-1-thread-1 Message received: Message 2, pool-1-thread-1 Message received: Message 3, pool-1-thread-1 Message received: Message 4, pool-1-thread-1 Message received: Message 5, pool-1-thread-1 Message received: Message 6, pool-1-thread-1 Message received: Message 7, pool-1-thread-1 Message received: Message 8, pool-1-thread-1 Message received: Message 9, pool-1-thread-1 Message received: Message 10, pool-1-thread-1 Message received: Message 11, pool-1-thread-1 Message received: Message 12, pool-1-thread-1 Message received: Message 13, pool-1-thread-1 Message received: Message 14, pool-1-thread-1 Message received: Message 15, pool-1-thread-1 Message received: Message 16, pool-1-thread-1 ... Great! There is much more which you could do with Redis pub/sub, excellent documentation is available for you on Redis official web site.
October 13, 2012
by Andriy Redko
· 42,839 Views · 4 Likes
article thumbnail
SOA Service Design Cheat Sheet
this simple cheat sheet contains all the key goals, principals and design patterns that you should be aware of when designing soa services and contains helpful links to places where you can find more in-depth information on each topic. when i was studying for my soa certified architect exams, i kept notes on all the best bits from the course material. after 9 months and several hundred hours of study, i found that there were certain key pieces of information that i kept referring back to time and time again, such as… how do you define service-orientation? what are the goals and strategic benefits of having a service-oriented business? what are the design principals you should apply to soa service design & soa governance? what are the characteristics of soa based businesses – how can you recognise one? what are the most useful soa design patterns and how are they grouped? i thought it might be useful to bring all this information together into one place, so in collaboration with soagrowers we have published a free pdf cheat sheet on soa service design which you can print out and keep close to hand so it’s there whenever you need it. it’s not meant to be an exhaustive guide – it’s just a set of place-holders to remind you of the topics that may be of relevance to you when designing services. however, it should prove useful to any service architect or developer who’s interested in service design or anyone who is going through the same certification programme as i did – even if you just use it as a check-list or aide-mémoire . none of it is particularly technology specific. the same set of goals, principals and patterns can be applied equally to soap based web services , restful services or any other kind of distributed components – that’s the beauty of service-orientation, it’s vendor and technology neutral. in the sheet i’ve also highlighted something that often get’s overlooked when technologists have the lead on soa implementations:- soa has some very attractive and unique business benefits that can only be fully realised when you apply the design paradigm correctly. for my money, it’s this outcome oriented viewpoint (the business case if you like) that really differentiates soa from other tactics like eai/esb, but all too often this message gets lost in the melee . we hope you find it useful. to get your copy of the soa service design cheat sheet, just click on the image below. if you like it please share it (there are handy share buttons on the page below). click on the image to download the pdf get involved. did you find this useful? is there something you think could be added or removed? did you notice how esb is just a small fraction of the bigger picture? let me know your thoughts in the comments below. ————————————————————————— updated: 18/09/2012. i’ve now added a small section on contract first service design, just because it so fundamentally underpins many of the most important goals, principals and patterns used to deliver successful soa. for more information on contract first, see spring-ws’s excellent whitepaper . contract-first isn’t just a soap thing by the way. ‘contract’ in a soa design context means operations, data types, policies and anything else to do with the service’s public facia. so although rest has an implicit contract with predetermined operations (get, put, post, etc.) it still has data type’s and flexible url’s that convey some meaning. therefore, if you want to make a rest architecture more interoperable and less brittle for clients, it helps to plan these datatypes and url’s in advance if you can so they become better standardised and therefore more reusable.
October 2, 2012
by Ben Wilcock
· 19,571 Views
article thumbnail
Resolve Circular Dependency in Spring Autowiring
I would consider this post as best practice for using Spring in enterprise application development.
September 27, 2012
by Gal Levinsky
· 127,433 Views · 24 Likes
article thumbnail
New ActiveMQ failover and Clustering Goodies
For the last two weeks I’ve been working on some interesting use cases for the good ol’ failover transport. I finally have some time at my hands, so here’s a brief recap of what’s coming in 5.6 release in this area. First there’s a new feature, called Priority Backup. It’s described in details here, but in a nutshell it provides you with the mechanism of prioritizing your failover urls and keep your clients connected to them as soon as they are available. The most obvious use case for this is to keep your clients connected to the broker in local data center whenever you can. By doing this, you can both have better performances and stability of your clients, but also save on your bandwidth bills. Another improvement is coming for automatic broker cluster feature. Although this feature is not new, I spent some time hardening it and thought to share some more insight in how (and when) to use it in your projects. In search of high availability, people often default to master-slave architecture. This makes sense in most use cases, but if your flow is purely non-persistent you can probably come up with more optimal architecture. Instead of having one broker at the time handling all your load, and other one just waiting for it to fail, you’ll get more efficient system with some kind of active-active configuration where (possibly multiple) brokers share the load all the time. Ideally clients would be evenly distributed and would rebalance if anything changes. Brokers don’t need to share any messages as clients are distributed and messages are non-persistent so they will be lost if broker fails. So can you achieve this kind of architecture with ActiveMQ? Sure you do. That’s where automatic rebalance and clustering shines. First of all, brokers should be networked but only so they can exchange information on their availability. They shouldn’t exchange the messages (but of course can if your use case needs it). In 5.6 you do that with pure static networks, using configuration like So now imagine three brokers A,B and C forming a full mesh. In addition every broker uses rebalance options on their transport connectors All that is left for the client to do is connect to one of the brokers it knows like failover:(brokerA) and the broker will fill it with all information on other brokers in the cluster and whether it should reconnect to one of them or not. So having a large number of clients connecting like this, very soon they’ll rebalance over available brokers. You can stop one of the brokers in the cluster for updates and clients will rebalance over remaining ones. You can even add a new broker to the cluster and everything will get rebalanced without any need for you to touch your clients. So, basically in this way you have both load balancing and high availability for your non-persistent messages. Additionally, your clients are automatically updated with all information they need, and no manual intervention is needed. Although the basic support for clustering was there since 5.4, I did some more hardening and better rebalancing, so it’s coming in the Apache ActiveMQ 5.6 (and the next Fuse 5.5.1) release. Also, there are some more great stuff regarding broker clustering coming soon, so stay tuned and happy messaging.
September 10, 2012
by Dejan Bosanac
· 15,408 Views
article thumbnail
Managing Camel Routes With JMX APIs
Here is a quick example of how to programmatically access Camel MBeans to monitor and manipulate routes... first, get a connection to a JMX server (assumes localhost, port 1099, no auth) note, always cache the connection for subsequent requests (can cause memory utilization issues otherwise) JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi"); JMXConnector jmxc = JMXConnectorFactory.connect(url); MBeanServerConnection server = jmxc.getMBeanServerConnection(); use the following to iterate over all routes and retrieve statistics (state, exchanges, etc)... ObjectName objName = new ObjectName("org.apache.camel:type=routes,*"); List cacheList = new LinkedList(server.queryNames(objName, null)); for (Iterator iter = cacheList.iterator(); iter.hasNext();) { objName = iter.next(); String keyProps = objName.getCanonicalKeyPropertyListString(); ObjectName objectInfoName = new ObjectName("org.apache.camel:" + keyProps); String routeId = (String) server.getAttribute(objectInfoName, "RouteId"); String description = (String) server.getAttribute(objectInfoName, "Description"); String state = (String) server.getAttribute(objectInfoName, "State"); ... } use the following to execute operations against a Camel route (stop,start, etc) ObjectName objName = new ObjectName("org.apache.camel:type=routes,*"); List cacheList = new LinkedList(server.queryNames(objName, null)); for (Iterator iter = cacheList.iterator(); iter.hasNext();) { objName = iter.next(); String keyProps = objName.getCanonicalKeyPropertyListString(); if(keyProps.contains(routeID)) { ObjectName objectRouteName = new ObjectName("org.apache.camel:" + keyProps); Object[] params = {}; String[] sig = {}; server.invoke(objectRouteName, operationName, params, sig); return; } } summary These APIs can easily be used to build a web or command line based tool to support remote Camel management features. All of these features are available via the JMX console and Camel does provide a web console to support some management/monitoring tasks. See these pages for more information... http://camel.apache.org/camel-jmx.html http://camel.apache.org/web-console.html
July 30, 2012
by Ben O'Day
· 12,008 Views
article thumbnail
How Changing Java Package Names Transformed my System Architecture
Changing your perspective even a small amount can have profound effects on how you approach your system. Let’s say you’re writing a web application in Java. In the system you deal with orders, customers and products. As a web application, your classes include staples like PersonController, PersonRepository, CustomerController and OrderService. How do you organize your classes into packages? There are two fundamental ways to structure your packages. Either you can focus on the logical tiers, like com.brodwall.myapp.controllers, com.brodwall.myapp.domain or perhaps com.brodwall.myapp.services.customer. Or you can focus on the domain contexts, like com.brodwall.myapp.customer, com.brodwall.myapp.orders and com.brodwall.myapp.products. The first approach is by far the most prevalent. In my view, it’s also the least helpful. Here are some ways your thinking changes if you structure your packages around domain concepts, rather than technological tiers: First, and most fundamentally, your mental model will now be aligned with that of the users of your system. If you’re asked to implement a typical feature, it is now more likely to be focused around a strict subset of the packages of your system. For example, adding a new field to a form will at least affect the presentation logic, entity and persistence layer for the corresponding domain concept. If your packages are organized around tiers, this change will hit all over your system. In a word: A system organized around features, rather than technologies, have higher coherence. This technical term means that a large percentage of a the dependencies of a class are located close to that class. Secondly, organizing around domain concepts will give you more options when your software grows. When a package contains tens of classes, you may want to split it up in several packages. The discussion can itself be enlightening. “Maybe we should separate out the customer address classes into a com.brodwall.myapp.customer.address package. It seems to have a bit of a life on its own.” “Yeah, and maybe we can use the same classes for other places we need addresses, such as suppliers?” “Cool, so com.brodwall.myapp.address, then?” Or maybe you decide that order status codes and payment status codes deserve to be in the “com.brodwall.myapp.order.codes” package. On the other hand, what options do you have for splitting up com.brodwall.myapp.controllers? You could create subpackages for customer, orders and products, but these subpackages may only have one or possibly two classes each. Finally, and perhaps most intriguingly, using domain concepts for packages allows you to vary the design according on a case by case basis. Maybe you really need a OrderService which coordinates the payment and shipping of an order, while ProductController only needs basic create-retrieve-update-delete functionality with a repository. A ProductService would just get in the way. If ProductService is missing from the com.brodwall.myapp.services package, this may be confusing or at the very least give you a nagging feeling that something is wrong. On the other hand, if there’s no Controller in the com.brodwall.myapp.product package, it doesn’t matter much. Also, most systems have some good parts and some not-so-good parts. If your Services package is not working for you, there’s not much you can do. But if the Products package is rotten, you can throw it out and reimplement it without the whole system being thrown into a state of chaos. By putting the classes needed to implement a feature together with each other and apart from the classes needed to implement other features, developers can be pragmatic and innovative when developing one feature without negatively affecting other features. The flip side of this is that most developers are more comfortable with some technologies in the application and less comfortable with other technologies. Organizing around features instead of technologies force each developer to consider a larger set of technological challenges. Some programmers take this as a motivating challenge to learn, while others, it seems, would rather not have to learn something new. If it were my money being spend to create features, I know what kind of developer I would want. Trivial changes can have large effects. By organizing your software around features, you get a more coherent system that allows for growth. It may challenge your developers, but it drives down the number of hand-offs needed to implement a feature and it challenges the developers to improve the parts of the application they are working on. See also my blog post on Architecture as tidying up.
July 20, 2012
by Johannes Brodwall
· 17,427 Views
  • Previous
  • ...
  • 274
  • 275
  • 276
  • 277
  • 278
  • 279
  • 280
  • Next
  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook
×