The Latest and Popular SDLC Topics

The Latest Popular Topics

Agrona Event Counters

Efficient open source event counters from the Agrona library. Agrona The Agrona library is an open source Java library of utility code. Unlike libraries such as Google Guava or Apache Commons which are general purpose Java utility libraries, Agrona is targeted at providing high performance code. It initially consists of code from the open source Aeron messaging library. Event Counters One of the features of the Agrona library is the event counters framework. One of the design goals of Aeron was to be easy to monitor. We wanted to make sure that people could easily check up on what Aeron is doing with services such as Nagios, internally written monitoring software or just from the commandline. Writing integrations with many services is a herculean task in and of itself, but we definitely wanted to be able to expose an API. We also didn't want to incorporate large 3rd party external dependencies, any allocation heavy code or things we couldn't control the performance of. This meant that we were going to have to write our own event counters rather than using something like the Coda Hale metrics framework. Our requirements for monitoring were very simple though. Update or increment the counter's value. Read or write to/from the counter value from a different thread or process. No garbage creation after initial setup. Labels should be associated with each counter value for readability's sake. Design Threadsafe updates of a long value is a very simple operation and already supported in Java through the AtomicLong class. The problem with using an AtomicLong as your event counter is that an external program, running in a different process, can't read from the value from your Java heap. Consequently Agrona's event counters are allocated in an off-heap buffer. This can be placed on a memory-mapped file which means it can be shared between two different processes. In order to give our counters names we store a table of name and counter id entries on another buffer. This can be placed on the same memory mapped file for convenience. API The CountersManager is responsible for allocating event counters. It needs to be instantiated with the buffers upon which to store the event counters and their labels. Here is an example of how to instantiate a counter from the counters manager. AtomicCounter conductorProxyFails = countersManager.newCounter("Failed offers to DriverConductorProxy"); You can also iterate over the current counter names and their ids. Here is some code that uses that to print a table of the event counter values: countersManager.forEach((id, label)->{finalint offset =CountersManager.counterOffset(id);finallong value = valuesBuffer.getLongVolatile(offset);System.out.format("%3d: %,20d - %s\n", id, value, label);}); Each instance of AtomicCounter represents one event counter. Here are some examples of using the atomic counter in code. // Increment the counter in a thread-safe manner conductorProxyFails.increment();// Increment the counter if you're only writing from a single thread conductorProxyFails.orderedIncrement();// atomically add 5 to the counter value conductorProxyFails.add(5);// Reset the counter conductorProxyFails.set(0); Conclusions I've just gone through a few simple examples of how to use the event counters from Agrona, which hopefully you've found useful. This isn't the only code in Agrona though - there are utilities for agents, and executing timing events as well as collections such as queues, ringbuffers and hashmaps. We're also expanding the library which is already on maven central. Currently documentation is a little bit thin on the ground, but contributions are always welcome. Thanks to Martin Thompson and Chris West for feedback on this blog post.

April 23, 2015

by Richard Warburton

· 5,818 Views

Why Elasticsearch is Suitable for Application Log Analytics

Handling Application Logs Enterprise application development using Web technologies has been around for a long time. In recent years we have seen a sharp increase in the deployment of such applications. This is partly due to the proliferation of ecommerce sites, social media sites, mobile application supporting sites, as well as the desire of enterprises to have their applications available 24x7. In most cases, such applications cater to huge load and are deployed on cloud infrastructure. Monitoring deployed applications is increasingly becoming a crucial task, as deployed applications are bound to fail, irrespective of the robust techniques used during development. Whenever an application fails, the most common resolution method starts by examining the application log. If the application has implemented logging properly, the logs can reveal the cause of application failure. Examination of log files is usually done by viewing the file using tools like vi, less, more, tail or grep. Another method is to download the file to a Windows system and viewing it using an editor like Notepad++. Engineers usually scan the log information to look for clues that point to the reasons for failure. Once the cause of failure is identified, suitable action is taken for restoring the application and/or service. The Key to Application Log Analytics This process, of logging onto a remote system and viewing logs is tedious. Additionally, many of the tools do not provide support to make the task of issue identification any simpler. Even when using tools like grep (if we know the pattern), we still need to view the logs in order to go through other information that has been logged, such as the log information that precedes the failure point. While it has always been possible to develop applications to parse application logs, the recent renewed interest in application log analytics is due to the acceptance of NoSQL-like technologies and the availability of standard tools to parse application logs. Though relational databases (RDBMS) have for many years provided the facility to store structured data, they are not well-suited for handling log data, as in many cases, the structure of the logged information is not the same across the file. This does not fit well in the rigidly defined world of an RDBMS. In comparison, NoSQL allows document flexibility and documents with different schemas can be stored in the same database / index / store. The ability to convert log data into a well-defined structure, as well as the ability to search, are key to implement a modern log analytics solution. In this document, we cover how Elasticsearch. Elasticsearch can store documents, giving us the benefit of structured storage without the overheads of a database system. The Suitability of Elasticsearch In the following subsections, we share our views as to why Elasticsearch is a suitable data store for an application log analytics solution. Elasticsearch is part of a popular trio of tools, commonly known as ELK. Of these, L stands for Logstash, the log parser; E stands for Elasticsearch, the document store; and K stands for Kibana, the visualization tool. Storing Documents Logstash can be used to parse plain text data into structured text. Once data has some structure, it becomes easy to find information by enabling search on it. While parsing application logs is not a challenge, the challenge has been in storing the data and enabling search on it. Most prior solutions have used an RDBMS for storage, but the varying structure and textual nature of application logs makes it difficult to use an RDBMS table structure to store data. RDBMSs are not geared toward ‘search’. They are geared for maintaining a ‘single value of truth’ for the data, defining relations between the data, ensuring their consistency and so on. Search is also not a strong point for RDBMSs as they use exact matches for values, while Elasticsearch supports exact matches as well as partial matches. It also supports document scoring, which attaches a confidence factor to the documents located. Elasticsearch supports documents in JSON format and uses the NoSQL philosophy for document storage. This has the advantage of allowing a flexible schema for the data. Unlike an RDBMS, Elasticsearch is a search engine at heart and hence is built for the same. Though Elasticsearch uses NoSQL for storing documents, it does not provide robust methods to update stored data. Not supporting updates is a serious disadvantage in most cases. In the case of application logs, not supporting updates actually works in favour of Elasticsearch. In case of machine logs, updates are not really required. Application logs are generated from a debugging perspective – having data handy for debugging purposes in the event of application crash or incorrect execution. They usually record important events from application execution and provide additional information to allow application developers to identify the reasons for failure. Additionally, existing information in application logs is rarely, if ever, updated. New information is continually being written to the logs, with no need to refer to old information. This plays to Elasticsearch’s strength, which is able to ingest and index new information very quickly. Search One of the easiest ways of locating information from large volumes of logs is to perform a search. Elasticsearch is well suited not only to handle search, it also supports huge volume of data, using distributed computing (implemented using Shards). While Kibana is one of the commonly used tools to display and visualize information stored in Elasticsearch, it is more suited to display standard charts like bar chart, column chart and pie chart. If the features provided by Kibana are not enough, we can always use Elasticsearch’s REST API support and it’s Query DSL (Domain-Specific Language), to search for required information. The Query DSL and the result of the query are in JSON format. Though this format makes it easy for applications to parse and process, users would need a friendly user interface to interact with the data. Handling Voluminous Data Elasticsearch supports distributed search out of the box – using the concept of ‘shards’. A shard is a single Lucene instance and is managed by Elasticsearch. Two types of shards, namely ‘primary shard’ and ‘replica shard’ are supported. By default, a document is first indexed on the primary shard and then on the replica shards. The number of primary shards can be specified, to cater to the expected volume. By default, Elasticsearch creates five shards for an index. But, once the number of primary shards is decided, it cannot be changed. A replica shards are copies the primary shard. They are used to handle fail-over and the increase performance. While performance across voluminous data can be handled by sharding, it is important to note that shards, once created for an index, cannot be changed. Thus, the sharding strategy of the data has to be decided in advance, after an assessment of the data and an estimation of its growth. In the case of application logs, the sharding strategy can be based on the application name, the business unit ID, the application OD or the application’s geolocation, just to name a few. Analytics By storing data in a structure, analytics can be enabled on the data. Not only can application perform a simple search, it is also possible to restrict the search for specific terms or over a specified time period. Structured storage also makes it easier to develop reports with well-defined visualizations, which in turn makes it easy to understand the current state of applications. It is also possible to perform various analytics operations like time series analysis using the timestamp and identification of patterns from the data using machine learning techniques (assuming, we have the right kind of data in the logs). Though Elasticsearch does not provide built-in support for analytics, applications can benefit from its fast search capability and also from its ability to handle voluminous data sets. In Closing One of the main hurdles for application logs has been the ability to search for information from the huge volume of data. By parsing application log files using Logstash, we can convert a flat file into structured data. Structured data, once stored in Elasticsearch, is easier to search and locate. Visualizations and business logic for generating alerts and tickets is easier to develop on structured data. Elasticsearch, which stores and searches documents, along with its ability to scale over huge volume of data, is a good candidate for inclusion in an application log analytics solution.

April 22, 2015

by Bipin Patwardhan

· 11,685 Views · 2 Likes

Using Apache Kafka for Integration and Data Processing Pipelines with Spring

written by josh long on the spring blog applications generated more and more data than ever before and a huge part of the challenge - before it can even be analyzed - is accommodating the load in the first place. apache’s kafka meets this challenge. it was originally designed by linkedin and subsequently open-sourced in 2011. the project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. the design is heavily influenced by transaction logs. it is a messaging system, similar to traditional messaging systems like rabbitmq, activemq, mqseries, but it’s ideal for log aggregation, persistent messaging, fast (_hundreds_ of megabytes per second!) reads and writes, and can accommodate numerous clients. naturally, this makes it perfect for cloud-scale architectures! kafka powers many large production systems . linkedin uses it for activity data and operational metrics to power the linkedin news feed, and linkedin today, as well as offline analytics going into hadoop. twitter uses it as part of their stream-processing infrastructure. kafka powers online-to-online and online-to-offline messaging at foursquare. it is used to integrate foursquare monitoring and production systems with hadoop-based offline infrastructures. square uses kafka as a bus to move all system events through square’s various data centers. this includes metrics, logs, custom events, and so on. on the consumer side, it outputs into splunk, graphite, or esper-like real-time alerting. netflix uses it for 300-600bn messages per day. it’s also used by airbnb, mozilla, goldman sachs, tumblr, yahoo, paypal, coursera, urban airship, hotels.com, and a seemingly endless list of other big-web stars. clearly, it’s earning its keep in some powerful systems! installing apache kafka there are many different ways to get apache kafka installed. if you’re on osx, and you’re using homebrew, it can be as simple as brew install kafka . you can also download the latest distribution from apache . i downloaded kafka_2.10-0.8.2.1.tgz , unzipped it, and then within you’ll find there’s a distribution of apache zookeeper as well as kafka, so nothing else is required. i installed apache kafka in my $home directory, under another directory, bin , then i created an environment variable, kafka_home , that points to $home/bin/kafka . start apache zookeeper first, specifying where the configuration properties file it requires is: $kafka_home/bin/zookeeper-server-start.sh $kafka_home/config/zookeeper.properties the apache kafka distribution comes with default configuration files for both zookeeper and kafka, which makes getting started easy. you will in more advanced use cases need to customize these files. then start apache kafka. it too requires a configuration file, like this: $kafka_home/bin/kafka-server-start.sh $kafka_home/config/server.properties the server.properties file contains, among other things, default values for where to connect to apache zookeeper ( zookeeper.connect ), how much data should be sent across sockets, how many partitions there are by default, and the broker id ( broker.id - which must be unique across a cluster). there are other scripts in the same directory that can be used to send and receive dummy data, very handy in establishing that everything’s up and running! now that apache kafka is up and running, let’s look at working with apache kafka from our application. some high level concepts.. a kafka broker cluster consists of one or more servers where each may have one or more broker processes running. apache kafka is designed to be highly available; there are no master nodes. all nodes are interchangeable. data is replicated from one node to another to ensure that it is still available in the event of a failure. in kafka, a topic is a category, similar to a jms destination or both an amqp exchange and queue. topics are partitioned, and the choice of which of a topic’s partition a message should be sent to is made by the message producer. each message in the partition is assigned a unique sequenced id, its offset . more partitions allow greater parallelism for consumption, but this will also result in more files across the brokers. producers send messages to apache kafka broker topics and specify the partition to use for every message they produce. message production may be synchronous or asynchronous. producers also specify what sort of replication guarantees they want. consumers listen for messages on topics and process the feed of published messages. as you’d expect if you’ve used other messaging systems, this is usually (and usefully!) asynchronous. like spring xd and numerous other distributed system, apache kafka uses apache zookeeper to coordinate cluster information. apache zookeeper provides a shared hierarchical namespace (called znodes ) that nodes can share to understand cluster topology and availability (yet another reason that spring cloud has forthcoming support for it..). zookeeper is very present in your interactions with apache kafka. apache kafka has, for example, two different apis for acting as a consumer. the higher level api is simpler to get started with and it handles all the nuances of handling partitioning and so on. it will need a reference to a zookeeper instance to keep the coordination state. let’s turn now turn to using apache kafka with spring. using apache kafka with spring integration the recently released apache kafka 1.1 spring integration adapter is very powerful, and provides inbound adapters for working with both the lower level apache kafka api as well as the higher level api. the adapter, currently, is xml-configuration first, though work is already underway on a spring integration java configuration dsl for the adapter and milestones are available. we’ll look at both here, now. to make all these examples work, i added the libs-milestone-local maven repository and used the following dependencies: org.apache.kafka:kafka_2.10:0.8.1.1 org.springframework.boot:spring-boot-starter-integration:1.2.3.release org.springframework.boot:spring-boot-starter:1.2.3.release org.springframework.integration:spring-integration-kafka:1.1.1.release org.springframework.integration:spring-integration-java-dsl:1.1.0.m1 using the spring integration apache kafka with the spring integration xml dsl first, let’s look at how to use the spring integration outbound adapter to send message instances from a spring integration flow to an external apache kafka instance. the example is fairly straightforward: a spring integration channel named inputtokafka acts as a conduit that forwards message messages to the outbound adapter, kafkaoutboundchanneladapter . the adapter itself can take its configuration from the defaults specified in the kafka:producer-context element or it from the adapter-local configuration overrides. there may be one or many configurations in a given kafka:producer-context element. here’s the java code from a spring boot application to trigger message sends using the outbound adapter by sending messages into the incoming inputtokafka messagechannel . package xml; import org.apache.commons.logging.log; import org.apache.commons.logging.logfactory; import org.springframework.beans.factory.annotation.qualifier; import org.springframework.boot.commandlinerunner; import org.springframework.boot.springapplication; import org.springframework.boot.autoconfigure.springbootapplication; import org.springframework.context.annotation.bean; import org.springframework.context.annotation.dependson; import org.springframework.context.annotation.importresource; import org.springframework.integration.config.enableintegration; import org.springframework.messaging.messagechannel; import org.springframework.messaging.support.genericmessage; @springbootapplication @enableintegration @importresource("/xml/outbound-kafka-integration.xml") public class demoapplication { private log log = logfactory.getlog(getclass()); @bean @dependson("kafkaoutboundchanneladapter") commandlinerunner kickoff(@qualifier("inputtokafka") messagechannel in) { return args -> { for (int i = 0; i < 1000; i++) { in.send(new genericmessage<>("#" + i)); log.info("sending message #" + i); } }; } public static void main(string args[]) { springapplication.run(demoapplication.class, args); } } using the new apache kafka spring integration java configuration dsl shortly after the spring integration 1.1 release, spring integration rockstar artem bilan got to work on adding a spring integration java configuration dsl analog and the result is a thing of beauty! it’s not yet ga (you need to add the libs-milestone repository for now), but i encourage you to try it out and kick the tires. it’s working well for me and the spring integration team are always keen on getting early feedback whenever possible! here’s an example that demonstrates both sending messages and consuming them from two different integrationflow s. the producer is similar to the example xml above. new in this example is the polling consumer. it is batch-centric, and will pull down all the messages it sees at a fixed interval. in our code, the message received will be a map that contains as its keys the topic and as its value another map with the partition id and the batch (in this case, of 10 records), of records read. there is a messagelistenercontainer -based alternative that processes messages as they come. package jc; import org.apache.commons.logging.log; import org.apache.commons.logging.logfactory; import org.springframework.beans.factory.annotation.autowired; import org.springframework.beans.factory.annotation.qualifier; import org.springframework.beans.factory.annotation.value; import org.springframework.boot.commandlinerunner; import org.springframework.boot.springapplication; import org.springframework.boot.autoconfigure.springbootapplication; import org.springframework.context.annotation.bean; import org.springframework.context.annotation.configuration; import org.springframework.context.annotation.dependson; import org.springframework.integration.integrationmessageheaderaccessor; import org.springframework.integration.config.enableintegration; import org.springframework.integration.dsl.integrationflow; import org.springframework.integration.dsl.integrationflows; import org.springframework.integration.dsl.sourcepollingchanneladapterspec; import org.springframework.integration.dsl.kafka.kafka; import org.springframework.integration.dsl.kafka.kafkahighlevelconsumermessagesourcespec; import org.springframework.integration.dsl.kafka.kafkaproducermessagehandlerspec; import org.springframework.integration.dsl.support.consumer; import org.springframework.integration.kafka.support.zookeeperconnect; import org.springframework.messaging.messagechannel; import org.springframework.messaging.support.genericmessage; import org.springframework.stereotype.component; import java.util.list; import java.util.map; /** * demonstrates using the spring integration apache kafka java configuration dsl. * thanks to spring integration ninja artem bilan * for getting the java configuration dsl working so quickly! * * @author josh long */ @enableintegration @springbootapplication public class demoapplication { public static final string test_topic_id = "event-stream"; @component public static class kafkaconfig { @value("${kafka.topic:" + test_topic_id + "}") private string topic; @value("${kafka.address:localhost:9092}") private string brokeraddress; @value("${zookeeper.address:localhost:2181}") private string zookeeperaddress; kafkaconfig() { } public kafkaconfig(string t, string b, string zk) { this.topic = t; this.brokeraddress = b; this.zookeeperaddress = zk; } public string gettopic() { return topic; } public string getbrokeraddress() { return brokeraddress; } public string getzookeeperaddress() { return zookeeperaddress; } } @configuration public static class producerconfiguration { @autowired private kafkaconfig kafkaconfig; private static final string outbound_id = "outbound"; private log log = logfactory.getlog(getclass()); @bean @dependson(outbound_id) commandlinerunner kickoff( @qualifier(outbound_id + ".input") messagechannel in) { return args -> { for (int i = 0; i < 1000; i++) { in.send(new genericmessage<>("#" + i)); log.info("sending message #" + i); } }; } @bean(name = outbound_id) integrationflow producer() { log.info("starting producer flow.."); return flowdefinition -> { consumer spec = (kafkaproducermessagehandlerspec.producermetadataspec metadata)-> metadata.async(true) .batchnummessages(10) .valueclasstype(string.class) .valueencoder(string::getbytes); kafkaproducermessagehandlerspec messagehandlerspec = kafka.outboundchanneladapter( props -> props.put("queue.buffering.max.ms", "15000")) .messagekey(m -> m.getheaders().get(integrationmessageheaderaccessor.sequence_number)) .addproducer(this.kafkaconfig.gettopic(), this.kafkaconfig.getbrokeraddress(), spec); flowdefinition .handle(messagehandlerspec); }; } } @configuration public static class consumerconfiguration { @autowired private kafkaconfig kafkaconfig; private log log = logfactory.getlog(getclass()); @bean integrationflow consumer() { log.info("starting consumer.."); kafkahighlevelconsumermessagesourcespec messagesourcespec = kafka.inboundchanneladapter( new zookeeperconnect(this.kafkaconfig.getzookeeperaddress())) .consumerproperties(props -> props.put("auto.offset.reset", "smallest") .put("auto.commit.interval.ms", "100")) .addconsumer("mygroup", metadata -> metadata.consumertimeout(100) .topicstreammap(m -> m.put(this.kafkaconfig.gettopic(), 1)) .maxmessages(10) .valuedecoder(string::new)); consumer endpointconfigurer = e -> e.poller(p -> p.fixeddelay(100)); return integrationflows .from(messagesourcespec, endpointconfigurer) .>>handle((payload, headers) -> { payload.entryset().foreach(e -> log.info(e.getkey() + '=' + e.getvalue())); return null; }) .get(); } } public static void main(string[] args) { springapplication.run(demoapplication.class, args); } } the example makes heavy use of java 8 lambdas. the producer spends a bit of time establishing how many messages will be sent in a single send operation, how keys and values are encoded (kafka only knows about byte[] arrays, after all) and whether messages should be sent synchronously or asynchronously. in the next line, we configure the outbound adapter itself and then define an integrationflow such that all messages get sent out via the kafka outbound adapter. the consumer spends a bit of time establishing which zookeeper instance to connect to, how many messages to receive (10) in a batch, etc. once the message batches are recieved, they’re handed to the handle method where i’ve passed in a lambda that’ll enumerate the payload’s body and print it out. nothing fancy. using apache kafka with spring xd apache kafka is a message bus and it can be very powerful when used as an integration bus. however, it really comes into its own because it’s fast enough and scalable enough that it can be used to route big-data through processing pipelines. and if you’re doing data processing, you really want spring xd ! spring xd makes it dead simple to use apache kafka (as the support is built on the apache kafka spring integration adapter!) in complex stream-processing pipelines. apache kafka is exposed as a spring xd source - where data comes from - and a sink - where data goes to. spring xd exposes a super convenient dsl for creating bash -like pipes-and-filter flows. spring xd is a centralized runtime that manages, scales, and monitors data processing jobs. it builds on top of spring integration, spring batch, spring data and spring for hadoop to be a one-stop data-processing shop. spring xd jobs read data from sources , run them through processing components that may count, filter, enrich or transform the data, and then write them to sinks. spring integration and spring xd ninja marius bogoevici , who did a lot of the recent work in the spring integration and spring xd implementation of apache kafka, put together a really nice example demonstrating how to get a full working spring xd and kafka flow working . the readme walks you through getting apache kafka, spring xd and the requisite topics all setup. the essence, however, is when you use the spring xd shell and the shell dsl to compose a stream. spring xd components are named components that are pre-configured but have lots of parameters that you can override with --.. arguments via the xd shell and dsl. (that dsl, by the way, is written by the amazing andy clement of spring expression language fame!) here’s an example that configures a stream to read data from an apache kafka source and then write the message a component called log , which is a sink. log , in this case, could be syslogd, splunk, hdfs, etc. xd> stream create kafka-source-test --definition "kafka --zkconnect=localhost:2181 --topic=event-stream | log"--deploy and that’s it! naturally, this is just a tase of spring xd, but hopefully you’ll agree the possibilities are tantalizing. deploying a kafka server with lattice and docker it’s easy to get an example kafka installation all setup using lattice , a distributed runtime that supports, among other container formats, the very popular docker image format. there’s a docker image provided by spotify that sets up a collocated zookeeper and kafka image . you can easily deploy this to a lattice cluster, as follows: ltc create --run-as-root m-kafka spotify/kafka from there, you can easily scale the apache kafka instances and even more easily still consume apache kafka from your cloud-based services. next steps you can find the code for this blog on my github account . we’ve only scratched the surface! if you want to learn more (and why wouldn’t you?), then be sure to check out marius bogoevici and dr. mark pollack’s upcoming webinar on reactive data-pipelines using spring xd and apache kafka where they’ll demonstrate how easy it can be to use rxjava, spring xd and apache kafka!

April 18, 2015

by Pieter Humphrey

· 29,091 Views

Meet Molly, the virtual nurse

Telemedicine is a topic that I’ve touched on numerous times on this blog over the past year or so, with a number of platforms emerging to offer patients the opportunity to consult with a medical professional from anywhere in the world. Most of these platforms are clinical based, such as the Babylon service, and therefore provide you with access to doctors when you have something wrong with you. There are however, also a number that are taking a more preventative approach and seek to keep you healthier in the first place. For instance, last autumn I wrote about the Vida platform that is providing coaching and support to ensure you stay out of the emergency room altogether. Of course, all of these services require a human healthcare professional at the other end of your video call to answer your queries for you. Sense.ly are taking another approach by offering an AI based nurse, called Molly, who aims to provide help and support in those periods between appointments with real life professionals. The service is aimed specifically at patients with common medical conditions such as diabetes or heart failure. The patient signs up to the site either direct or via their GP, and the platform then draws up a personalized care plan for them based upon both their medical records and the individual needs of the patient. The patient then follows this prescribed plan (hopefully), with regular check-ins with the virtual nurse via their smartphone or computer to help their progress. Doctors can also access information via the site to see how their patient is getting on, whilst the system will alert them automatically if worrying symptoms begin to emerge. I’ve written previously about the important role the appearance of avatars has on how we engage with them, so Molly has been designed with a friendly face and a softly spoken voice. She interacts with patients using voice recognition technology and can ask relatively simple questions of the patient whilst guiding them through exercise plans and collecting medical data from them. This data can then be analyzed by a doctor (although AI analysis seems inevitable), whilst the patient can also use Molly as a sort of health PA and book appointments with their doctor through her. With voice recognition technology growing at an incredible pace and the AI grunt of services such as Watson increasingly capable of making complex decisions, this seems the beginning of an inevitable trend towards more automated healthcare. Check out the video below that explains more about Molly and the service she offers. Original post

April 15, 2015

by Adi Gaskell

· 2,084 Views

Using Multiple Grok Statements to Parse a Java Stack Trace

Parse your Java stack trace log information with the Logstash tool.

April 14, 2015

by Bipin Patwardhan

· 77,925 Views · 6 Likes

MQ Trace on Java MQ Clients

I find it very difficult to debug the Java client application written for MQ. Most of the error messages are self explanatory but some of the error messages are difficult to diagnose and find the cause. Enable tracing would really help to identify the root of the problem. In this post, we will see how to enable tracing on Java applications. MQ Trace Simple MQ Trace enabling can be done by adding a JVM parameter like this java -Dcom.ibm.mq.commonservices=trace.properties ... Create trace.properties with the following attributes Diagnostics.MQ=enabled Diagnostics.Java=explorer,wmqjavaclasses,all Diagnostics.Java.Trace.Detail=high Diagnostics.Java.Trace.Destination.File=enabled Diagnostics.Java.Trace.Destination.Console=disabled Diagnostics.Java.Trace.Destination.Pathname=/tmp/trace Diagnostics.Java.FFDC.Destination.Pathname=/tmp/FFDC Diagnostics.Java.Errors.Destination.Filename=/tmp/errors/AMQJERR.LOG oints to be noted Add libraries of IBM JRE (From the directory $JRE_HOME/lib) to CLASSPATH Create the directories /tmp/FFDC, /tmp/trace and /tmp/errors Add read permission to trace.properties Keep an eye on the log file size and make sure to disable logging when not required. Otherwise, it will keep filling the disk space. (Logs size will become some gigs in few hours easily) One you start the Java Client, it will start printing the trace into the mentioned files. If you want to disable the tracing, make the Diagnostics.MQ property value to disabled so that trace will be stopped. JSSE Debug In case of secure connection with MQ, if you want to enable debug for JSSE, then you can use the JVM parameter java.net.debug with different values -Djavax.net.debug=true This prints full trace of JSSE, this can be limited to only handshake by changing the value of the parameter to ssl:handshake -Djavax.net.debug=ssl:handshake This prints only the trace related to handshake, all other trace will be simply ignored. For more articles read my blog . Happy Learning!!!!

April 14, 2015

by Veeresham Kardas

· 7,342 Views

Mockito & DBUnit: Implementing a Mocking Structure Focused and Independent to Automated Tests on Java

On this post, we will make a hands-on about Mockito and DBUnit, two libraries from Java's open source ecosystem which can help us in improving our JUnit tests on focus and independence. But why mocking is so important on our unit tests? Focusing the tests Let's imagine a Java back-end application with a tier-like architecture. On this application, we could have 2 tiers: The service tier, which have the business rules and make as a interface for the front-end; The entity tier, which have the logic responsible for making calls to a database, utilizing techonologies like JDBC or JPA; Of course, on a architecture of this kind, we will have the following dependence of our tiers: Service >>> Entity On this kind of architecture, the most common way of building our automated tests is by creating JUnit Test Classes which test each tier independently, thus we can make running tests that reflect only the correctness of the tier we want to test. However, if we simply create the classes without any mocking, we will got problems like the following: On the JUnit tests of our service tier, for example, if we have a problem on the entity tier, we will have also our tests failed, because the error from the entity tier will reverberate across the tiers; If we have a project where different teams are working on the same system, and one team is responsible for the construction of the service tier, while the other is responsible for the construction of the entity tier, we will have a dependency of one team with the other before the tests could be made; To resolve such issues, we could mock the entity tier on the service tier's unit test classes, so we can have independence and focus of our tests on the service tier, which it belongs. independence One point that it is specially important when we make our JUnit test classes in the independence department is the entity tier. Since in our example this tier is focused in the connection and running of SQL commands on a database, it makes a break on our independence goal, since we will need a database so we can run our tests. Not only that, if a test breaks any structure that it is used by the subsequent tests, all of them will also fail. It is on this point that enters our other library, DBUnit. With DBUnit, we can use embedded databases, such as HSQLDB, to make our database exclusive to the running of our tests. So, without further delay, let's begin our hands-on! Hands-on For this lab, we will create a basic CRUD for a Client entity. The structure will follow the simple example we talked about previously, with the DAO (entity) and Service tiers. We will use DBUnit and JUnit to test the DAO tier, and Mockito with JUnit to test the Service tier. First, let's create a Maven project, without any archetype and include the following dependencies on pom.xml: . . . junit junit 4.12 org.dbunit dbunit 2.5.0 org.mockito mockito-all 1.10.19 org.hibernate hibernate-entitymanager 4.3.8.Final org.hsqldb hsqldb 2.3.2 org.springframework spring-core 4.1.4.RELEASE org.springframework spring-context 4.1.5.RELEASE org.springframework spring-test 4.1.5.RELEASE org.springframework spring-tx 4.1.5.RELEASE org.springframework spring-orm 4.1.5.RELEASE . . . On the previous snapshot, we included not only the Mockito, DBUnit and JUnit libraries, but we also included Hibernate to implement the persistence layer and Spring 4 to use the IoC container and the transaction management. We also included the Spring Test library, which includes some features that we will use later on this lab. Finally, to simplify the setup and remove the need of installing a database to run the code, we will use HSQLDB as our database. Our lab will have the following structure: One class will represent the application itself, as a standalone class, where we will consume the tiers, like a real application would do; We will have another 2 classes, each one with JUnit tests, that will test each tier independently; First, we define a persistence unit, where we define the name of the unit and the properties to make Hibernate create the table for us and populate her with some initial rows. The code of the persistence.xml can be seen bellow: com.alexandreesl.handson.model.Client And the initial data to populate the table can be seen bellow: insert into Client(id,name,sex, phone) values (1,'Alexandre Eleuterio Santos Lourenco','M','22323456'); insert into Client(id,name,sex, phone) values (2,'Lucebiane Santos Lourenco','F','22323876'); insert into Client(id,name,sex, phone) values (3,'Maria Odete dos Santos Lourenco','F','22309456'); insert into Client(id,name,sex, phone) values (4,'Eleuterio da Silva Lourenco','M','22323956'); insert into Client(id,name,sex, phone) values (5,'Ana Carolina Fernandes do Sim','F','22123456'); In order to not making the post burdensome, we will not discuss the project structure during the lab, but just show the final structure at the end. The code can be found on a Github repository, at the end of the post. With the persistence unit defined, we can start coding! First, we create the entity class: package com.alexandreesl.handson.model; import javax.persistence.Column; import javax.persistence.Entity; import javax.persistence.Id; import javax.persistence.Table; @Table(name = "Client") @Entity public class Client { @Id private long id; @Column(name = "name", nullable = false, length = 50) private String name; @Column(name = "sex", nullable = false) private String sex; @Column(name = "phone", nullable = false) private long phone; public long getId() { return id; } public void setId(long id) { this.id = id; } public String getName() { return name; } public void setName(String name) { this.name = name; } public String getSex() { return sex; } public void setSex(String sex) { this.sex = sex; } public long getPhone() { return phone; } public void setPhone(long phone) { this.phone = phone; } } In order to create the persistence-related beans to enable Hibernate and the transaction manager, alongside all the rest of the beans necessary for the application, we use a Java-based Spring configuration class. The code of the class can be seen bellow: package com.alexandreesl.handson.core; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.ComponentScan; import org.springframework.context.annotation.Configuration; import org.springframework.jdbc.datasource.DriverManagerDataSource; import org.springframework.orm.jpa.JpaTransactionManager; import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean; import org.springframework.orm.jpa.vendor.Database; import org.springframework.orm.jpa.vendor.HibernateJpaDialect; import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter; import org.springframework.transaction.annotation.EnableTransactionManagement; @Configuration @EnableTransactionManagement @ComponentScan({ "com.alexandreesl.handson.dao", "com.alexandreesl.handson.service" }) public class AppConfiguration { @Bean public DriverManagerDataSource dataSource() { DriverManagerDataSource dataSource = new DriverManagerDataSource(); dataSource.setDriverClassName("org.hsqldb.jdbcDriver"); dataSource.setUrl("jdbc:hsqldb:mem://standalone"); dataSource.setUsername("sa"); dataSource.setPassword(""); return dataSource; } @Bean public JpaTransactionManager transactionManager() { JpaTransactionManager transactionManager = new JpaTransactionManager(); transactionManager.setEntityManagerFactory(entityManagerFactory() .getNativeEntityManagerFactory()); transactionManager.setDataSource(dataSource()); transactionManager.setJpaDialect(jpaDialect()); return transactionManager; } @Bean public HibernateJpaDialect jpaDialect() { return new HibernateJpaDialect(); } @Bean public HibernateJpaVendorAdapter jpaVendorAdapter() { HibernateJpaVendorAdapter jpaVendor = new HibernateJpaVendorAdapter(); jpaVendor.setDatabase(Database.HSQL); jpaVendor.setDatabasePlatform("org.hibernate.dialect.HSQLDialect"); return jpaVendor; } @Bean public LocalContainerEntityManagerFactoryBean entityManagerFactory() { LocalContainerEntityManagerFactoryBean entityManagerFactory = new LocalContainerEntityManagerFactoryBean(); entityManagerFactory .setPersistenceXmlLocation("classpath:META-INF/persistence.xml"); entityManagerFactory.setPersistenceUnitName("persistence"); entityManagerFactory.setDataSource(dataSource()); entityManagerFactory.setJpaVendorAdapter(jpaVendorAdapter()); entityManagerFactory.setJpaDialect(jpaDialect()); return entityManagerFactory; } } And finally, we create the classes that represent the tiers itself. This is the DAO class: package com.alexandreesl.handson.dao; import javax.persistence.EntityManager; import javax.persistence.PersistenceContext; import org.springframework.stereotype.Component; import org.springframework.transaction.annotation.Transactional; import com.alexandreesl.handson.model.Client; @Component public class ClientDAO { @PersistenceContext private EntityManager entityManager; @Transactional(readOnly = true) public Client find(long id) { return entityManager.find(Client.class, id); } @Transactional public void create(Client client) { entityManager.persist(client); } @Transactional public void update(Client client) { entityManager.merge(client); } @Transactional public void delete(Client client) { entityManager.remove(client); } } And this is the service class: package com.alexandreesl.handson.service; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.stereotype.Component; import com.alexandreesl.handson.dao.ClientDAO; import com.alexandreesl.handson.model.Client; @Component public class ClientService { @Autowired private ClientDAO clientDAO; public ClientDAO getClientDAO() { return clientDAO; } public void setClientDAO(ClientDAO clientDAO) { this.clientDAO = clientDAO; } public Client find(long id) { return clientDAO.find(id); } public void create(Client client) { clientDAO.create(client); } public void update(Client client) { clientDAO.update(client); } public void delete(Client client) { clientDAO.delete(client); } } The reader may notice that we created a getter/setter to the DAO class on the Service class. This is not necessary for the Spring injection, but we made this way to get easier to change the real DAO by a Mockito's mock on the tests class. Finally, we code the class we talked about previously, the one that consume the tiers: package com.alexandreesl.handson.core; import org.springframework.context.ApplicationContext; import org.springframework.context.annotation.AnnotationConfigApplicationContext; import com.alexandreesl.handson.model.Client; import com.alexandreesl.handson.service.ClientService; public class App { public static void main(String[] args) { ApplicationContext context = new AnnotationConfigApplicationContext( AppConfiguration.class); ClientService service = (ClientService) context .getBean(ClientService.class); System.out.println(service.find(1).getName()); System.out.println(service.find(3).getName()); System.out.println(service.find(5).getName()); Client client = new Client(); client.setId(6); client.setName("Celina do Sim"); client.setPhone(44657688); client.setSex("F"); service.create(client); System.out.println(service.find(6).getName()); System.exit(0); } } If we run the class, we can see that the console print all the clients we searched for and that Hibernate is initialized properly, proving our implementation is a success: Mar 28, 2015 1:09:22 PM org.springframework.context.annotation.AnnotationConfigApplicationContext prepareRefresh INFO: Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@6433a2: startup date [Sat Mar 28 13:09:22 BRT 2015]; root of context hierarchy Mar 28, 2015 1:09:22 PM org.springframework.jdbc.datasource.DriverManagerDataSource setDriverClassName INFO: Loaded JDBC driver: org.hsqldb.jdbcDriver Mar 28, 2015 1:09:22 PM org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean createNativeEntityManagerFactory INFO: Building JPA container EntityManagerFactory for persistence unit 'persistence' Mar 28, 2015 1:09:22 PM org.hibernate.jpa.internal.util.LogHelper logPersistenceUnitInformation INFO: HHH000204: Processing PersistenceUnitInfo [ name: persistence ...] Mar 28, 2015 1:09:22 PM org.hibernate.Version logVersion INFO: HHH000412: Hibernate Core {4.3.8.Final} Mar 28, 2015 1:09:22 PM org.hibernate.cfg.Environment INFO: HHH000206: hibernate.properties not found Mar 28, 2015 1:09:22 PM org.hibernate.cfg.Environment buildBytecodeProvider INFO: HHH000021: Bytecode provider name : javassist Mar 28, 2015 1:09:22 PM org.hibernate.annotations.common.reflection.java.JavaReflectionManager INFO: HCANN000001: Hibernate Commons Annotations {4.0.5.Final} Mar 28, 2015 1:09:23 PM org.hibernate.dialect.Dialect INFO: HHH000400: Using dialect: org.hibernate.dialect.HSQLDialect Mar 28, 2015 1:09:23 PM org.hibernate.hql.internal.ast.ASTQueryTranslatorFactory INFO: HHH000397: Using ASTQueryTranslatorFactory Mar 28, 2015 1:09:23 PM org.hibernate.tool.hbm2ddl.SchemaExport execute INFO: HHH000227: Running hbm2ddl schema export Mar 28, 2015 1:09:23 PM org.hibernate.tool.hbm2ddl.SchemaExport execute INFO: HHH000230: Schema export complete Alexandre Eleuterio Santos Lourenco Maria Odete dos Santos Lourenco Ana Carolina Fernandes do Sim Celina do Sim Now, let's move on for the tests themselves. For the DBUnit tests, we create a Base class, which will provide the base DB operations which all of our JUnit tests will benefit. On the @PostConstruct method, which is fired after all the injections of the Spring context are made - reason why we couldn't use the @BeforeClass annotation, because we need Spring to instantiate and inject the EntityManager first - we use DBUnit to make a connection to our database, with the class DatabaseConnection and populate the table using the DataSet class we created, passing a XML structure that represents the data used on the tests. This operation of populating the table is made by the DatabaseOperation class, which we use with the CLEAN_INSERT operation, that truncate the table first and them insert the data on the dataset. Finally, we use one of JUnit's event listeners, the @After event, which is called after every test case. On our scenario, we use this event to call the clear() method on the EntityManager, which forces Hibernate to query against the Database for the first time at every test case, thus eliminating possible problems we could have between our test cases because of data that it is different on the second level cache than it is on the DB. The code for the base class is the following: package com.alexandreesl.handson.dao.test; import java.io.InputStream; import java.sql.SQLException; import javax.annotation.PostConstruct; import javax.persistence.EntityManager; import javax.persistence.EntityManagerFactory; import javax.persistence.PersistenceUnit; import org.dbunit.DatabaseUnitException; import org.dbunit.database.DatabaseConfig; import org.dbunit.database.DatabaseConnection; import org.dbunit.database.IDatabaseConnection; import org.dbunit.dataset.IDataSet; import org.dbunit.dataset.xml.FlatXmlDataSetBuilder; import org.dbunit.ext.hsqldb.HsqldbDataTypeFactory; import org.dbunit.operation.DatabaseOperation; import org.hibernate.HibernateException; import org.hibernate.internal.SessionImpl; import org.junit.After; public class BaseDBUnitSetup { private static IDatabaseConnection connection; private static IDataSet dataset; @PersistenceUnit public EntityManagerFactory entityManagerFactory; private EntityManager entityManager; @PostConstruct public void init() throws HibernateException, DatabaseUnitException, SQLException { entityManager = entityManagerFactory.createEntityManager(); connection = new DatabaseConnection( ((SessionImpl) (entityManager.getDelegate())).connection()); connection.getConfig().setProperty( DatabaseConfig.PROPERTY_DATATYPE_FACTORY, new HsqldbDataTypeFactory()); FlatXmlDataSetBuilder flatXmlDataSetBuilder = new FlatXmlDataSetBuilder(); InputStream dataSet = Thread.currentThread().getContextClassLoader() .getResourceAsStream("test-data.xml"); dataset = flatXmlDataSetBuilder.build(dataSet); DatabaseOperation.CLEAN_INSERT.execute(connection, dataset); } @After public void afterTests() { entityManager.clear(); } } The xml structure used on the test cases is the following: And the code of our test class of the DAO tier is the following: package com.alexandreesl.handson.dao.test; import static org.junit.Assert.assertNotNull; import static org.junit.Assert.assertNull; import org.junit.Test; import org.junit.runner.RunWith; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.test.context.ContextConfiguration; import org.springframework.test.context.junit4.SpringJUnit4ClassRunner; import org.springframework.test.context.transaction.TransactionConfiguration; import org.springframework.transaction.annotation.Transactional; import com.alexandreesl.handson.core.test.AppTestConfiguration; import com.alexandreesl.handson.dao.ClientDAO; import com.alexandreesl.handson.model.Client; @RunWith(SpringJUnit4ClassRunner.class) @ContextConfiguration(classes = AppTestConfiguration.class) @TransactionConfiguration(defaultRollback = true) public class ClientDAOTest extends BaseDBUnitSetup { @Autowired private ClientDAO clientDAO; @Test public void testFind() { Client client = clientDAO.find(1); assertNotNull(client); client = clientDAO.find(2); assertNotNull(client); client = clientDAO.find(3); assertNull(client); client = clientDAO.find(4); assertNull(client); client = clientDAO.find(5); assertNull(client); } @Test @Transactional public void testInsert() { Client client = new Client(); client.setId(3); client.setName("Celina do Sim"); client.setPhone(44657688); client.setSex("F"); clientDAO.create(client); } @Test @Transactional public void testUpdate() { Client client = clientDAO.find(1); client.setPhone(12345678); clientDAO.update(client); } @Test @Transactional public void testRemove() { Client client = clientDAO.find(1); clientDAO.delete(client); } } The code is very self explanatory so we will just focus on explaining the annotations at the top-level class. The @RunWith(SpringJUnit4ClassRunner.class) annotationchanges the JUnit base class that runs our test cases, using rather one made by Spring that enable support of the IoC container and the Spring's annotations. The @TransactionConfiguration(defaultRollback = true) annotation is from Spring's test library and change the behavior of the @Transactional annotation, making the transactions to roll back after execution, instead of a commit. That ensures that our test cases wont change the structure of the DB, so a test case wont break the execution of his followers. The reader may notice that we changed the configuration class to another one, exclusive for the test cases. It is essentially the same beans we created on the original configuration class, just changing the database bean to point to a different DB then the previously one, showing that we can change the database of our tests without breaking the code. On a real world scenario, the configuration class of the application would be pointing to a relational database like Oracle, DB2, etc and the test cases would use a embedded database such as HSQLDB, which we are using on this case: package com.alexandreesl.handson.core.test; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.ComponentScan; import org.springframework.context.annotation.Configuration; import org.springframework.jdbc.datasource.DriverManagerDataSource; import org.springframework.orm.jpa.JpaTransactionManager; import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean; import org.springframework.orm.jpa.vendor.Database; import org.springframework.orm.jpa.vendor.HibernateJpaDialect; import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter; import org.springframework.transaction.annotation.EnableTransactionManagement; @Configuration @EnableTransactionManagement @ComponentScan({ "com.alexandreesl.handson.dao", "com.alexandreesl.handson.service" }) public class AppTestConfiguration { @Bean public DriverManagerDataSource dataSource() { DriverManagerDataSource dataSource = new DriverManagerDataSource(); dataSource.setDriverClassName("org.hsqldb.jdbcDriver"); dataSource.setUrl("jdbc:hsqldb:mem://standalone-test"); dataSource.setUsername("sa"); dataSource.setPassword(""); return dataSource; } @Bean public JpaTransactionManager transactionManager() { JpaTransactionManager transactionManager = new JpaTransactionManager(); transactionManager.setEntityManagerFactory(entityManagerFactory() .getNativeEntityManagerFactory()); transactionManager.setDataSource(dataSource()); transactionManager.setJpaDialect(jpaDialect()); return transactionManager; } @Bean public HibernateJpaDialect jpaDialect() { return new HibernateJpaDialect(); } @Bean public HibernateJpaVendorAdapter jpaVendorAdapter() { HibernateJpaVendorAdapter jpaVendor = new HibernateJpaVendorAdapter(); jpaVendor.setDatabase(Database.HSQL); jpaVendor.setDatabasePlatform("org.hibernate.dialect.HSQLDialect"); return jpaVendor; } @Bean public LocalContainerEntityManagerFactoryBean entityManagerFactory() { LocalContainerEntityManagerFactoryBean entityManagerFactory = new LocalContainerEntityManagerFactoryBean(); entityManagerFactory .setPersistenceXmlLocation("classpath:META-INF/persistence.xml"); entityManagerFactory.setPersistenceUnitName("persistence"); entityManagerFactory.setDataSource(dataSource()); entityManagerFactory.setJpaVendorAdapter(jpaVendorAdapter()); entityManagerFactory.setJpaDialect(jpaDialect()); return entityManagerFactory; } } If we run the test class, we can see that it runs the test cases successfully, showing that our code is a success. If we see the console, we can see that transactions were created and rolled back, respecting our configuration: . . . ar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext startTransaction INFO: Began transaction (1) for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@1a411233, testMethod = testInsert@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]; transaction manager [org.springframework.orm.jpa.JpaTransactionManager@7c2327fa]; rollback [true] Mar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext endTransaction INFO: Rolled back transaction for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@1a411233, testMethod = testInsert@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]. Mar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext startTransaction INFO: Began transaction (1) for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@2adddc06, testMethod = testRemove@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]; transaction manager [org.springframework.orm.jpa.JpaTransactionManager@7c2327fa]; rollback [true] Mar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext endTransaction INFO: Rolled back transaction for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@2adddc06, testMethod = testRemove@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]. Mar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext startTransaction INFO: Began transaction (1) for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@4905c46b, testMethod = testUpdate@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]; transaction manager [org.springframework.orm.jpa.JpaTransactionManager@7c2327fa]; rollback [true] Mar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext endTransaction INFO: Rolled back transaction for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@4905c46b, testMethod = testUpdate@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]. Now let's move on to the Service tests, with the help of Mockito. The class to test the Service tier is very simple, as we can see bellow: package com.alexandreesl.handson.service.test; import static org.junit.Assert.assertEquals; import org.junit.BeforeClass; import org.junit.Test; import org.mockito.Mockito; import org.mockito.invocation.InvocationOnMock; import org.mockito.stubbing.Answer; import com.alexandreesl.handson.dao.ClientDAO; import com.alexandreesl.handson.model.Client; import com.alexandreesl.handson.service.ClientService; public class ClientServiceTest { private static ClientDAO clientDAO; private static ClientService clientService; @BeforeClass public static void beforeClass() { clientService = new ClientService(); clientDAO = Mockito.mock(ClientDAO.class); clientService.setClientDAO(clientDAO); Client client = new Client(); client.setId(0); client.setName("Mocked client!"); client.setPhone(11111111); client.setSex("M"); Mockito.when(clientDAO.find(Mockito.anyLong())).thenReturn(client); Mockito.doThrow(new RuntimeException("error on client!")) .when(clientDAO).delete((Client) Mockito.any()); Mockito.doNothing().when(clientDAO).create((Client) Mockito.any()); Mockito.doAnswer(new Answer

April 14, 2015

by Alexandre Lourenco

· 21,768 Views · 2 Likes

A cluster management framework, Apache Helix

What is Helix? It is used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes. Helix automates reassignment of resources in the face of node failure and recovery, cluster expansion, and reconfiguration. Modeling a distributed system as a state machine with constraints on states and transitions. Terminologies Node : A single machine Cluster: Set of Nodes Resource : A logical entry (e.g. database, index, task) Partition: Subset of the resource (Each subtask is referred to as a partition) Replica: Copy of a Partition State (e.g Master, Slave). It increase the availability of the system State: Describes the role of a replica (Each node in the cluster has its own Current State) State Machine and Transitions: An action that allows a replica to move from one state to another, thus changing its role. ( e.g Slave --> Master ) spectators: the external clients. Helix provides an External View that is an aggregated view of the current state across all nodes. Current State: represents resource's actual state at a participating node. - INSTANCE_NAME: Unique name representing the process - SESSION_ID: ID that is automatically assigned every time a process joins the cluster Rebalancer: The core component of Helix is the Controller which runs the Rebalance algorithm on every cluster event. Dynamic Ideal State: Helix powerful is that Ideal State can be changed dynamically. It is adjusting the ideal state. Whenever a cluster event occurs, Helix can operate in one of three modes FULL_AUTO SEMI_AUTO CUSTOMIZED Cluster events can be one of the following: Nodes start and/or stop Nodes experience soft and/or hard failures New nodes are added/removed [1] http://helix.apache.org/Concepts.html

April 13, 2015

by Madhuka Udantha

· 7,854 Views

How to Batch DELETE Statements with Hibernate

Introduction In my , I explained the Hibernate configurations required for batching INSERT and UPDATE statements. This post will continue this topic with DELETE statements batching. Domain model entities We’ll start with the following entity model: The Post entity has a one-to-many association to a Comment and a one-to-one relationship with the PostDetails entity: @OneToMany(cascade = CascadeType.ALL, mappedBy = "post", orphanRemoval = true) private List comments = new ArrayList<>(); @OneToOne(cascade = CascadeType.ALL, mappedBy = "post", orphanRemoval = true, fetch = FetchType.LAZY) private PostDetails details; The up-coming tests will be run against the following data: doInTransaction(session -> { int batchSize = batchSize(); for(int i = 0; i < itemsCount(); i++) { int j = 0; Post post = new Post(String.format( "Post no. %d", i)); post.addComment(new Comment( String.format( "Post comment %d:%d", i, j++))); post.addComment(new Comment(String.format( "Post comment %d:%d", i, j++))); post.addDetails(new PostDetails()); session.persist(post); if(i % batchSize == 0 && i > 0) { session.flush(); session.clear(); } } }); Hibernate Configuration As , the following properties are required for batching INSERT and UPDATE statements: properties.put("hibernate.jdbc.batch_size", String.valueOf(batchSize())); properties.put("hibernate.order_inserts", "true"); properties.put("hibernate.order_updates", "true"); properties.put("hibernate.jdbc.batch_versioned_data", "true"); Next, we are going to check if DELETE statements are batched as well. JPA Cascade Delete Because is convenient, I’m going to prove that CascadeType.DELETE and JDBC batching don’t mix well. The following tests is going to: Select some Posts along with Comments and PostDetails Delete the Posts, while propagating the delete event to Comments and PostDetails as well @Test public void testCascadeDelete() { LOGGER.info("Test batch delete with cascade"); final AtomicReference startNanos = new AtomicReference<>(); addDeleteBatchingRows(); doInTransaction(session -> { List posts = session.createQuery( "select distinct p " + "from Post p " + "join fetch p.details d " + "join fetch p.comments c") .list(); startNanos.set(System.nanoTime()); for (Post post : posts) { session.delete(post); } }); LOGGER.info("{}.testCascadeDelete took {} millis", getClass().getSimpleName(), TimeUnit.NANOSECONDS.toMillis( System.nanoTime() - startNanos.get() )); } Running this test gives the following output: Query:{[delete from Comment where id=? and version=?][55,0]} {[delete from Comment where id=? and version=?][56,0]} Query:{[delete from PostDetails where id=?][3]} Query:{[delete from Post where id=? and version=?][3,0]} Query:{[delete from Comment where id=? and version=?][54,0]} {[delete from Comment where id=? and version=?][53,0]} Query:{[delete from PostDetails where id=?][2]} Query:{[delete from Post where id=? and version=?][2,0]} Query:{[delete from Comment where id=? and version=?][52,0]} {[delete from Comment where id=? and version=?][51,0]} Query:{[delete from PostDetails where id=?][1]} Query:{[delete from Post where id=? and version=?][1,0]} Only the Comment DELETE statements were batched, the other entities being deleted in separate database round-trips. The reason for this behaviour is given by the ActionQueue sorting implementation: if ( session.getFactory().getSettings().isOrderUpdatesEnabled() ) { // sort the updates by pk updates.sort(); } if ( session.getFactory().getSettings().isOrderInsertsEnabled() ) { insertions.sort(); } While INSERTS and UPDATES are covered, DELETE statements are not sorted at all. A JDBC batch can only be reused when all statements belong to the same database table. When an incoming statement targets a different database table, the current batch has to be released, so that the new batch matches the current statement database table: public Batch getBatch(BatchKey key) { if ( currentBatch != null ) { if ( currentBatch.getKey().equals( key ) ) { return currentBatch; } else { currentBatch.execute(); currentBatch.release(); } } currentBatch = batchBuilder().buildBatch(key, this); return currentBatch; } If you enjoy reading this article, you might want to subscribe to my newsletter and get a discount for my book as well. Orphan removal and manual flushing A work-around is to dissociate all Child entities while manually flushing the HibernateSession before advancing to a new Child association: @Test public void testOrphanRemoval() { LOGGER.info("Test batch delete with orphan removal"); final AtomicReference startNanos = new AtomicReference<>(); addDeleteBatchingRows(); doInTransaction(session -> { List posts = session.createQuery( "select distinct p " + "from Post p " + "join fetch p.details d " + "join fetch p.comments c") .list(); startNanos.set(System.nanoTime()); posts.forEach(Post::removeDetails); session.flush(); posts.forEach(post -> { for (Iterator commentIterator = post.getComments().iterator(); commentIterator.hasNext(); ) { Comment comment = commentIterator.next(); comment.post = null; commentIterator.remove(); } }); session.flush(); posts.forEach(session::delete); }); LOGGER.info("{}.testOrphanRemoval took {} millis", getClass().getSimpleName(), TimeUnit.NANOSECONDS.toMillis( System.nanoTime() - startNanos.get() )); } This time all DELETE statements are properly batched: Query:{[delete from PostDetails where id=?][2]} {[delete from PostDetails where id=?][3]} {[delete from PostDetails where id=?][1]} Query:{[delete from Comment where id=? and version=?][53,0]} {[delete from Comment where id=? and version=?][54,0]} {[delete from Comment where id=? and version=?][56,0]} {[delete from Comment where id=? and version=?][55,0]} {[delete from Comment where id=? and version=?][52,0]} {[delete from Comment where id=? and version=?][51, Query:{[delete from Post where id=? and version=?][2,0]} {[delete from Post where id=? and version=?][3,0]} {[delete from Post where id=? and version=?][1,0]} SQL Cascade Delete A better solution is to use SQL cascade deletion, instead of JPA entity state propagation mechanism. This way, we can also reduce the DML statements count. Because Hibernate Session acts as a , we must be extra cautious when mixing entity state transitions with database-side automatic actions, as the Persistence Context might not reflect the latest database changes. The Post entity one-to-manyComment association is marked with the Hibernate specific @OnDelete annotation, so that the auto-generated database schema includes the ON DELETE CASCADE directive: @OneToMany(cascade = { CascadeType.PERSIST, CascadeType.MERGE}, mappedBy = "post") @OnDelete(action = OnDeleteAction.CASCADE) private List comments = new ArrayList<>(); Generating the following DDL: alter table Comment add constraint FK_apirq8ka64iidc18f3k6x5tc5 foreign key (post_id) references Post on delete cascade The same is done with the PostDetails entity one-to-one Post association: @OneToOne(fetch = FetchType.LAZY) @JoinColumn(name = "id") @MapsId @OnDelete(action = OnDeleteAction.CASCADE) private Post post; And the associated DDL: alter table PostDetails add constraint FK_h14un5v94coafqonc6medfpv8 foreign key (id) references Post on delete cascade The CascadeType.ALL and orphanRemoval were replaced with CascadeType.PERSIST and CascadeType.MERGE, because we no longer want Hibernate to propagate the entity removal event. The test only deletes the Post entities. doInTransaction(session -> { List posts = session.createQuery( "select p from Post p") .list(); startNanos.set(System.nanoTime()); for (Post post : posts) { session.delete(post); } }); The DELETE statements are properly batched as there’s only one target table. Query:{[delete from Post where id=? and version=?][1,0]} {[delete from Post where id=? and version=?][2,0]} {[delete from Post where id=? and version=?][3,0]} If you enjoyed this article, I bet you are going to love my book as well. Conclusion If INSERT and UPDATE statements batching is just a matter of configuration, DELETE statements require some additional steps, which may increase the data access layer complexity. Code available on GitHub. If you have enjoyed reading my article and you’re looking forward to getting instant email notifications of my latest posts, consider .

April 11, 2015

by Vlad Mihalcea

· 22,615 Views · 1 Like

Currency Format Validation and Parsing

In Java, formatting a number according to a locale-specific currency format is pretty simple. You use an instance of java.text.NumberFormat class by instantiating it through NumberFormat.getCurrencyInstance() and invoke one of the format() methods. Following is a code-snippet from https://docs.oracle.com/javase/tutorial/i18n/format/numberFormat.html static public void displayCurrency( Locale currentLocale) { Double currencyAmount = new Double(9876543.21); Currency currentCurrency = Currency.getInstance(currentLocale); NumberFormat currencyFormatter = NumberFormat.getCurrencyInstance(currentLocale); System.out.println( currentLocale.getDisplayName() + ", " + currentCurrency.getDisplayName() + ": " + currencyFormatter.format(currencyAmount)); } However, if an application allows users to enter an amount as a string using separators and currency symbols, there is quite a possibility that currency format may not have been followed according to the locale. For example, user can use a wrong thousand or decimal separator, or a wrong currency symbol altogether. In that case, the application should incorporate a mechanism to first ensure that format is followed and then parse that string to convert it into a proper number in order to perform several mathematical calculations in a currency format independent manner. It is important to note that parsing a string using java.text.NumberFormat#public Number parse(String source) method requires that the currency string must contain a value following the locale-specific pattern defined in java.text.DecimalFormat and symbols defined in java.text.DecimalFormatSymbols. If the pattern and/or symbols are not followed, program will throw java.text.ParseException. For example, currency pattern for it_CH = Italian (Switzerland) is ¤ #,##0.00 and grouping separator is ', hence SFr. 1'234.56 is valid while 1,234.56 SFr. is invalid In order to check the validity of a currency amount before actually parsing it, Apache Commons Validator project's org.apache.commons.validator.routines.CurrencyValidator comes in real handy. All you have to do is to construct its object and call public boolean isValid(String value,Locale locale) method to check whether locale-specific format is followed or not. Once you are done with validation and found that number is valid, you can then parse the number by calling the parse() method. Following code snippet shows how to validate and parse. Note that the validation is lenient and if currency symbol is not already present in the string, it appends appropriate currency symbol according to the pattern and then validate and parse it. /** * Converts given item price into a number based on given * currency code. * * It works generically for all currencies supported by Java * * * @param itemPrice currency amount * @param currencyCode 3-letter ISO country code * @return {@link String} containing stripped price or same as given price * if parsing failed or formatter couldn't be constructed * @author Muhammad Haris */ public static Double convertPrice(String itemPrice, String currencyCode) { Double itemPriceConverted = null; Locale currencyLocale = LocaleUtility .getLocaleAgainstCurrency(currencyCode); DecimalFormat currencyFormatter = getCurrencyFormatter(currencyLocale); if (currencyFormatter != null) { itemPrice = appendCurrencySymbol(itemPrice, currencyFormatter); try { Number number = currencyFormatter.parse(itemPrice); itemPriceConverted = number.doubleValue(); } catch (ParseException e) { LOG.error("Failed to parse currency: " + currencyCode + ", value: " + itemPrice + ". " + e.getMessage(), e); } } else { LOG.error("No appropriate formatter found for currency: " + currencyCode + ", value: " + itemPrice + ". "); } return itemPriceConverted; } /** * Gets currency formatter against given currency locale * * @param currencyCode * {@link String} containing 3 letter ISO currency code * @return {@link NumberFormat} object specialized for the currency or null * if it couldn't be composed * @author Muhammad Haris */ public static DecimalFormat getCurrencyFormatter(Locale currencyLocale) { if (currencyLocale != null) { return (DecimalFormat) NumberFormat .getCurrencyInstance(currencyLocale); } return null; } /** * Appends appropriate currency symbol to the given price using the pattern * defined in the given currency formatter * * @param itemPrice * {@link String} containing price of the item in locale specific * format * @param currencyFormatter * {@link DecimalFormat} object containing currency locale * specific formatting info * @author Muhammad Haris */ public static String appendCurrencySymbol(String itemPrice, DecimalFormat currencyFormatter) { String currencySymbol = currencyFormatter.getDecimalFormatSymbols() .getCurrencySymbol(); String pattern = currencyFormatter.toPattern(); if (!itemPrice.contains(currencySymbol)) { if (pattern.startsWith("¤ ")) { itemPrice = currencySymbol + " " + itemPrice; } else if (pattern.endsWith(" ¤")) { itemPrice = itemPrice + " " + currencySymbol; } else if (pattern.startsWith("¤")) { itemPrice = currencySymbol + itemPrice; } else if (pattern.endsWith("¤")) { itemPrice = itemPrice + currencySymbol; } } return itemPrice; }

April 10, 2015

by Muhammad Haris

· 14,935 Views

Date and Size Rolling in log4j

Currently I am working on a project, where we have faced the problem of the huge size of daily logs. We have some environment, which exists on the same shared file system, so if logs on some environment will consume free disk space, it will cause crashing of other environments. Our log policy uses daily rolling of logs (every day a new log file is created and logs are stored only for N last days, any log that becomes older than N days will be deleted). And on a usual day this logic is ok, but in case of any errors it can be dangerous. For example, some module has lost connection to the database and starts logging an error with details, attempting to restore connection every 10 seconds. It works in case of any minor network issue, when connection is restored in minutes, maybe hours... But in case of serious issues the connection could not be restored without participation of the developer. Imagine then that it happens in the night when everybody is offline and nobody is looking at the issue in real time (just forgot to mention: these are dev environments, not prod). As a result the module will produce a huge log with a lot of identical errors. Going further... What if the issue happens on the database side and all modules on the environment lose connection? Correct, every module will produce logs at high rate. In the morning a developer will find out logs files of some GB, no free space on the shared disk and all other dev environments in a dead state, due to lack of space. There are some ways to resolve this issue and improve stability of development environments: increase disk space; review logging policy on the application level: produce less messages; review the recovery policy of the module: for example, increase time between attempts to reestablish connection; review logging policy on the logger level: introduce a log size limit, enable ZIP for old logs; Increase disk space. In enterprise development? Huh... you might be kidding? It could last for ages. No, of course, it's possible, but still we have limits. Though it will require more than one day to exhaust disk space, it can still fail on long weekends. What is about changing the logging policy? First of all, it can involve more complicated logic of messages output and possibly decrease chances of finding the root cause of some problem due to lack of details. Secondly, it requires changes of the code and, by the way, application code is really huge, so it is a challenge to carefully review all the code to find messages which can be output less often. Changing the module recovery policy: there is also a problem with code changes, also it is not clear what a new time interval between recover attempts should be, and how it will affect other modules. For example, in case of a minor issue, recovering of the module in 10 seconds will save working state of dependant module, but, for example, recovery time of 15 seconds will cause crash of other modules, so minor issue will become a serious problem. As a result the most painless way is to introduce the log size limit. The project uses log4j of version 1.2.17 currently. Short investigation shows that log rolling on a daily basis taking into account log size at the same time is not possible out of the box (you can find DailyRollingFileAppender and RollingFileAppender, but they act independently). It can be done in log4j starting from version 2. Migrating to log4j 2 promises to be very painful due to API changes and the size of our project (also at that time log4j 2 was in beta). So we took a look in the direction of log4j 3rd party appenders. As a result we found and started to use TimeAndSizeRollingAppender. It comes as a maven dependency and can be configured as any other log4j appender: All params here have clear names, I suppose. To put it simply the above configuration allows you to have 10 log files on the disk at the same time irrespective of the reason why a new file was created: because of the daily roll or because of the file size exceeded the specified limit. Every day a new log file will be created, the previous day log will be renamed accordingly to DatePattern param, in addition to this MaxFileSize sets file size limit, once it exceeds - a new log file will also be created. In both cases the oldest log will be deleted, if number of files become greater than MaxRollFileCount. Pay attention to the compression settings. CompressionAlgorithm enables compression of old logs and CompressionMinQueueSize defines the max number of existing uncompressed files. So in this case latest 5 logs will be stored uncompressed, all other log files will be compressed. Such feature is good, because you are able to see the latest logs without decompressing them. This appender will save the file system from unexpected logs growing. Of course, you can lose the original error message during log rolling, but it depends on the way you output errors. At least you will surely see the error message that spams you log, but overall system stability will not be affected by 'no free space' error.

April 8, 2015

by Ivan Zerin

· 16,766 Views

Using Oauth 2.0 in your Web Browser with AngularJS

I have a few popular Oauth related posts on my blog. I have one pertaining to Oauth 1.0a, and I have one on the topic of Oauth 2.0 for use in mobile application development. However, I get a lot of requests to show how to accomplish an Oauth 2.0 connection in a web browser using only JavaScript and AngularJS. We’re going to better explore the process flow behind Oauth 2.0 to establish a secure connection with a provider of our choice. In this particular example we’ll be using Imgur because I personally think it is a great service. Before we begin, it is important to note that this tutorial will only work with providers that offer the implicit grant type. Oauth Implicit Grant Type via OauthLib: The implicit grant type is used to obtain access tokens (it does not support the issuance of refresh tokens) and is optimized for public clients known to operate a particular redirection URI. These clients are typically implemented in a browser using a scripting language such as JavaScript. Unlike the authorization code grant type, in which the client makes separate requests for authorization and for an access token, the client receives the access token as the result of the authorization request. You’ll know the provider supports the implicit grant type when they make use of response_type=token rather than response_type=code. So there are going to be a few requirements to accomplish this in AngularJS: We are going to be using the AngularJS UI-Router library We are going to have a stand-alone index.html page with multiple templates We are going to have a stand-alone oauth_callback.html page with no AngularJS involvement With that said, let’s go ahead and create our project to look like the following: project root templates login.html secure.html js app.js index.html oauth_callback.html The templates/login.html page is where we will initialize the Oauth flow. After reaching the oauth_callback.html page we will redirect to the templates/secure.html page which requires a successful sign in. Crack open your index.html file and add the following code: Now it is time to add some very basic HTML to our templates/login.html and templates/secure.html pages: Login Login with Imgur Secure Web Page Access Token: {{accessToken} Not much left to do now. Open your js/app.js file and add the following AngularJS code: var example = angular.module("example", ['ui.router']); example.config(function($stateProvider, $urlRouterProvider) { $stateProvider .state('login', { url: '/login', templateUrl: 'templates/login.html', controller: 'LoginController' }) .state('secure', { url: '/secure', templateUrl: 'templates/secure.html', controller: 'SecureController' }); $urlRouterProvider.otherwise('/login'); }); example.controller("LoginController", function($scope) { $scope.login = function() { window.location.href = "https://api.imgur.com/oauth2/authorize?client_id=" + "CLIENT_ID_HERE" + "&response_type=token" } }); example.controller("SecureController", function($scope) { $scope.accessToken = JSON.parse(window.localStorage.getItem("imgur")).oauth.access_token; }); We are first going to focus on the login method of the LoginController. Go ahead and add the following, pretty much taken exactly from the Imgur documentation: $scope.login = function() { window.location.href = "https://api.imgur.com/oauth2/authorize?client_id=" + "CLIENT_ID_HERE" + "&response_type=token" } This long URL has the following components: Parameter Description client_id The application id found in your Imgur developer dashboard response_type Authorization grant or implicit grant type. In our case token for implicit grant The values will typically change per provider, but the parameters will usually remain the same. Now let’s dive into the callback portion. After the Imgur login flow, it is going to send you to http://localhost/oauth_callback.html because that is what we’ve decided to enter into the Imgur dashboard. Crack open your oauth_callback.html file and add the following source code: Redirecting... If you’re familiar with the ng-cordova-oauth library that I made, you’ll know much of this code was copied from it. Basically what we’re doing is grabbing the current URL and parsing out all the token parameters that Imgur has provided us. We are then going to construct an object with these parameters and serialize them into local storage. Finally we are going to redirect into the secure area of our application. In order to test this we need to be running our site from a domain or localhost. We cannot test this via a file:// URL. If you’re on a Mac or Linux machine, the simplest thing to do is run sudo python -m SimpleHTTPServer 80 since both these platforms ship with Python. This will run your web application as localhost on port 80. A video version of this article can be seen below.

April 7, 2015

by Nic Raboy

· 27,392 Views · 1 Like

Parse an XML Response with Java and Dom4J

Previously we’ve explored how to parse XML data using NodeJS as well as PHP. Continuing on the trend of parsing data using various programming languages, this time we’re going to take a look at parsing XML data using the dom4j library with Java. Now dom4j, is not the only way to parse XML data in Java. There are many other ways including using the SAX parser. Everyone will have their own opinions on which of the many to use. To keep up with my previous two XML tutorials, we’re going to use the following XML data saved in a file called data.xml at the root of the project: Code Blog Nic Raboy Nic Raboy Maria Campos With our XML content figured out, let’s make sure we structure our project like the following: project root src xmlparser MainDriver.java libs dom4j-1.6.1.jar build.xml data.xml Based on our project structure, you can probably tell that we’re going to be using Apache Ant for building. Say what you want about using Ant, but I’m still one of many who still uses it. Feel free to make changes to Apache Maven or other to better meet your needs. We’re now ready to crack open our src/xmlparser/MainDriver.java to start adding our parse logic. package xmlparser; import java.io.*; import java.util.*; import org.dom4j.*; import org.dom4j.io.*; public class MainDriver { public static void main(String[] args) { } public static void printRecursive(Element element) { } public static Document readFile(String filename) throws Exception { } } To further explain our intentions, the readFile(String filename) function will load the data.xmlfile and return it as a Document object for further parsing. The printRecursive(Element element)function will iterate through each node of the XML and print it out if it contains text. All levels of the XML will be iterated through. So let’s start with readFile(String filename): public static Document readFile(String filename) throws Exception { SAXReader reader = new SAXReader(); Document document = reader.read(new File(filename)); return document; } Nothing really to the above code. In fact, I pulled most of it from the dom4j quick-start code. The printRecursive(Element element) function is where things get more complex: public static void printRecursive(Element element) { for(int i = 0, size = element.nodeCount(); i < size; i++) { Node node = element.node(i); if(node instanceof Element) { Element currentNode = (Element) node; if(currentNode.isTextOnly()) { System.out.println(currentNode.getText()); } printRecursive(currentNode); } } } Some of the above code was taken from the dom4j quick-start, but the rest is some custom work. We are basically looking at each node and trying to visit any available children. If none exist, bail out. We also only want to print if there is text. Finally, we’re looking at the main(String[] args) function to bring it all together: public static void main(String[] args) { try { Element root = readFile("data.xml").getRootElement(); printRecursive(root); } catch (Exception e) { e.printStackTrace(); } } Just like that we’ve printed our each node of our XML document. In case you’re interested in the build.xml code, it can be seen below: To test the project you’d just run ant buildandrun from your command prompt or Terminal. Assuming of course you have Apache Ant configured correctly. The dom4j library is very thorough so I recommend have a look at the Javadocs that go with it.

April 7, 2015

by Nic Raboy

· 15,864 Views

Introduction to Apache Cassandra's Architecture

Some key concepts for Apache's popular Cassandra Architecture include partitioning, replication, consistency, bootstrapping, and write paths.

April 6, 2015

by Akhil Mehra

· 118,058 Views · 38 Likes

How to Configure a Simple JBoss Cluster in Domain Mode

Clustering is a very important thing to master for any serious user of an application server. Clustering allows for high availability by making your application available on secondary servers when the primary instance is down or it lets you scale up or out by increasing the server density on the host, or by adding servers on other hosts. It can even help to increase performance with effective load balancing between servers based on their respective hardware. Andy Overton has already covered how to set up a cluster of servers in standalone mode fronted by mod_cluster for load balancing, so in this post I'll cover clustering in domain mode. I won't rehash mod_cluster settings, so this will just cover the set up of a doman controller on one host, and the host controller and server instances on another host. To follow along with this blog, you'll need to download either JBoss EAP 6.x or WildFly. I'll be using WildFly 8.2 on Xubuntu 14.04. I'll be using $WF_HOME to refer to your WildFly home directory. Configuring the Domain Controller The domain controller needs both the domain.xml and host.xml configured. In the $WF_HOME/domain/configuration directory, you'll see that those two files are joined by a host-master.xml and a host-slave.xml. These are preconfigured host.xml files which you can use to give you a head start in making a host.xml for the domain controller (master) and host controller (slave) to use. You can either change the name of the file to be host.xml, so it will get picked up and used by default, or you can specify the host configuration you want to use on the command line by adding the --host-config argument: domain.sh --host-config=host-master.xml Whether you choose to modify the host.xml or the host-master.xml, you need to make sure that the empty element has been added to the section. This is so that when WildFly looks to see which server is the domain controller, it knows to become the domain controller itself. The other change is optional, but recommended. We need to tell the domain controller to bind its management interface to the correct IP address because, by default, it will bind to localhost, so the management communication it needs to do with the remote hosts won't be able to reach the domain controller at all! We can set this address permanently in the host.xml by making sure the inet-address value is set to the right IP, by changing the 127.0.0.1 in the example below to the correct IP: The result of that is that the default bind IP of the management interface is no longer localhost, although you can still override this value by starting JBoss with the variable left of the colon as a -D argument: domain.sh -Djboss.bind.address.management=10.0.0.1 Next, we need to modify the domain.xml file, where we need to define our server groups; essentially just defining the cluster. Each server group is named, so we can reference it later, and references a particular profile which needs to be one of the profiles named and defined in the same XML file. As I mentioned in my previous blog, domain mode has several profiles in the same file (domain.xml) rather than multiple files for each, like standalone mode (standalone.xml, standalone-ha.xml etc.). In the screenshot, there are two server groups defined - "main-server-group" which references the "full" profile, and "other-server-group" which references the "full-ha" profile. These are just the defaults which come with WildFly, so you're free to use them and modify the settings or create your own from scratch. Whichever you choose, it's a good idea to rename your server group to something meaningful, like a description of the workload, or the application name. Configuring the Host Controllers Every host server which you want to be part of the cluster must have the host.xml file configured. We've already configured the host.xml on the domain controller, so now we'll focus on the host controller. Remember, this process can be repeated on any number of hosts, depending on how many servers you want in your server group and their topology. First, we need to make sure that the domain controller and the host controller can communicate, and to do that we need a valid management user. On the domain controller, run the add-user.sh or add-user.bat script. You will need to make sure to: Choose a management user Make sure the user is different than the one you would use to log in to the web console Confirm that the new user will connect one AS process to another AS process Make a note of the secret value (this is very important!) You will find that you get prompts similar to the following: mike@mike-C2B2:~$ /opt/wildfly/wildfly-8.2.0.Final/bin/add-user.sh What type of user do you wish to add? a) Management User (mgmt-users.properties) b) Application User (application-users.properties) (a): a Enter the details of the new user to add. Using realm 'ManagementRealm' as discovered from the existing property files. Username : mgmt Password recommendations are listed below. To modify these restrictions edit the add-user.properties configuration file. - The password should not be one of the following restricted values {root, admin, administrator} - The password should contain at least 8 characters, 1 alphabetic character(s), 1 digit(s), 1 non-alphanumeric symbol(s) - The password should be different from the username Password : Re-enter Password : What groups do you want this user to belong to? (Please enter a comma separated list, or leave blank for none)[ ]: About to add user 'mgmt' for realm 'ManagementRealm' Is this correct yes/no? yes Added user 'mgmt' to file '/opt/wildfly/wildfly-8.2.0.Final/standalone/configuration/mgmt-users.properties' Added user 'mgmt' to file '/opt/wildfly/wildfly-8.2.0.Final/domain/configuration/mgmt-users.properties' Added user 'mgmt' with groups to file '/opt/wildfly/wildfly-8.2.0.Final/standalone/configuration/mgmt-groups.properties' Added user 'mgmt' with groups to file '/opt/wildfly/wildfly-8.2.0.Final/domain/configuration/mgmt-groups.properties' Is this new user going to be used for one AS process to connect to another AS process? e.g. for a slave host controller connecting to the master or for a Remoting connection for server to server EJB calls. yes/no? yes To represent the user add the following to the server-identities definition Once we have the secret value for our management user, we can add it to the host.xml file. I'm choosing to modify the host-slave.xml file, since much of the configuration I need is done for me: Next, we need to tell the host controller where to look for the domain controller. We set this to for the domain controller's host.xml file, but in the host-slave.xml we have an example tag filled out for us. All we need to do is add the domain controller's IP or hostname exactly as we did for the management bind address earlier. So our host-slave.xml should go from this: to this: This way, like with the management interface on the domain controller, the default address will be 10.0.0.1, but it can also be overridden on the command line if needed. Once we've sorted the communication out, we need to tell the host controller to actually start some server instances! At the bottom of the host-slave.xml file, there are two predefined servers to use: These are already configured to become members of the two server groups configured in the domain.xml. Note that the second server has to have a port offset. Despite it being in a different server group, it's still going to run on the same host and will attempt to bind to the same ports as the first server unless we tell it not to! We would also need to do the same thing if we added other server instances. Optionally, we can make things a little easier for ourselves when managing a lot of servers on a lot of hosts. We can give each server instance its own unique name, but we can also name the host by adding a name attribute to the parent tag, changing it from: to So both in the logs and in the admin console, you should see this host controller referred to as "host1". Now, if you wanted to name your server instances the same across hosts, you'll be able to tell which is which! If all you wanted was to configure a single domain controller and a single host controller, then that's all we need to do to get them speaking to each other. You can then carry on and configure mod_cluster and Apache to forward requests on to the right server, or just deploy your applications and connect to them directly.

April 3, 2015

by Mike Croft

· 23,548 Views

To Shard, or Not to Shard

When I talk with customers about sharding decisions I often start by telling the following true story… A couple of years ago, a customer came to me looking for advice on how to shard his system. He told me he was already convinced he needed to do that since he read that some smart people at MySQL giants like Facebook and Twitter were sharding—so naturally this was something he should be doing, too. I paused for a moment and then I asked him what the size of his database was. “10GB,” he said. I nodded and asked if he handles many queries or if they were very complicated. “No,” he said. “Just a few hundred queries per second, and they have not been loading down the system by more than a few percent.” I asked him whether he was expecting exponential growth in the near future—looking to double every week or something like that. “No, our load and data size grew about 7 percent last year and we expect about the same growth this year and for the foreseeable future.” My recommendation to him was not to waste time and effort on sharding because it is just not needed in his company’s case. Before you decide how to shard, you’d best understand whether or not you really need to shard to begin with. Yes, on the extremely large-scale side of database demands, sharding is the only game in town. And not just for MySQL, but for pretty much any technology out there. Yet thanks to emerging technologies there is an increasing amount of applications that can run databases without sharding. Today we can easily run with terabytes of data per MySQL instance and serve tens of thousands of queries in many OLTP environments. This allows organizations to build very large applications without needing to shard. And keep this in mind: Sharding is a pain under all circumstances. Even if you have sharding provided out of the box by the database system, it is a pain because it introduces more components and complexity. Creating good distributed query execution plans is a very complicated task that needs to take network topology and load into account in addition to the data distribution and load of individual nodes. Before you decide if you need to shard, you should look at alternatives to scale your application. In the MySQL world, the solutions are typically as follows: Alternatives to Sharding Functional Partitioning: In many environments a single MySQL instance becomes a dumping ground for all kinds of databases—you might end up having your main application share a database instance with Drupal, which powers your website, with WordPress, which powers your blog, and with vBulletin, which powers your forums. Splitting those pieces into different database instances is something you should consider before you look into sharding. Custom-made systems will often have many applications using different data sets that can be easily split out. Replication: Many applications are read-heavy, so scaling reads becomes the issue earlier than it does with scaling writes. Replication is a great solution for this. MySQL’s built-in replication is very robust, though due to its asynchronous nature it adds complexity to the application. The developer must decide which of the reads can be done from the replica servers and which can’t, because you must be absolutely certain that you’re reading the most recent, actual data. This is the reason that alternative, synchronous replication technologies for MySQL like Percona XtraDB Cluster, are gaining popularity: They provide single database-like behavior from the cluster in most cases. Caching and Queueing: Caching is a great technology for reducing the amount of reads that hit the database. There are many applications that have reduced read load on the database by 80-95% using this technology. Queueing, in contrast, optimizes writes. It does this by merging multiple write operations together so they hit the database efficiently. Most large-scale applications should rely heavily on both of these technologies. Memcached and Redis are two popular caching technologies in the MySQL space. For queueing, the most popular technologies are ActiveMQ and RabbitMQ [1]. Supplemental Technologies: MySQL is great at many things but not at everything. If you’re looking for high-performance full-text search, consider ElasticSearch, Sphinx, or Lucene. If you’re looking at large-scale data analytics, a Hadoop-based infrastructure or Vertica might work well for you. You should let MySQL handle the things it is good at, and leave the rest to supporting tools. Optimizations to Make Before Sharding Scaling isn’t just about architecture either. You also need to make sure your system is reasonably optimized. Many people decide sharding is inevitable for them even though there are much easier and more cost-effective ways to get the performance and scale they are looking for. All of which, I might add, are also going to be valuable if sharding is indeed eventually needed. Hardware: Are you using the right hardware? I’ve seen many people looking into sharding when in fact simply purchasing decent hardware would solve their problems for years to come. Make sure you have plenty of memory and high-performance flash storage if you’re working with a large database. In many cases it can transform your system so much it will look like magic. MySQL version and Configuration: Use a recent MySQL version. By that I mean the latest GA version (MySQL 5.6 at the time of this article’s publication). Percona Server, which is free, often offers additional performance improvements for demanding workloads. Use the most recent operating system too, especially if you’re using modern hardware. Finally, make sure MySQL is configured properly. The difference in MySQL performance between poorly configured MySQL and well-tuned MySQL can be 10x or more. Schema and Queries: The same application logic can be expressed using a variety of schema and queries. I’ve seen a lot of similar applications approaching things differently, and the difference in the performance between an optimal approach and a poor one (but still used in production) can be 100x or more. Many of the changes can be retrofitted to existing schema—such as minor query changes and changes to the index structure—however, if your schema doesn’t fit your application needs well, then you might be looking at a complete redesign. So it is a good idea to think things through early. When to Shard So when should you start thinking about sharding? Basically, if none of the measures listed above have given you the performance you need, it might be time to consider sharding. Sharding does have the advantage of allowing you to potentially use lower-cost hardware or cheaper cloud instances. Most developers are using agile development methods these days and there is a common term, “Architectural Runway,” which defines how far the application can go with its current architecture. If you’ve already found success using replication in particular, it might be a bad decision to add sharding because it will force your developers to deal with the complexity of sharding and asynchronous replication. However, replication is still typically used to achieve high availability even if you’re already sharding, but in this case it’s not for scaling reads. If you’ve come to the point where you’re sure you need to shard, here are some of the questions you need to ask about how you’ll implement your sharding strategy: Shard Level: At which level should we shard? It does not have to be at the database level. Many applications, SaaS in particular, often “shard” on higher levels, deploying multiple copies of their full stack to offer complete isolation for availability, performance, security etc. In many large scale applications you will see multiple copies of a full stack deployed, each having its own sharded MySQL environment. Shard Key: How do we shard? In many cases the choice depends on whether you’re authenticating for user accounts or your organization, but in other cases it is not so obvious. When making a sharding choice, you need to think about two things: 1) as many data access points as possible should go into a single shard, because cross-shard access is expensive if supported at all, and 2) making sure such sharding does not produce a shard that is too large to handle either in terms of data size or traffic. For example, sharding by country is a poor idea because the requirements to handle Belgium traffic won’t be the same for the United States or China, which require a lot more resources. Shard by Schema or Instance: What is the unit of your shard? The typical choices are MySQL instance or database (schema). I like the shard = database approach, which doesn’t limit you to a single MySQL instance per physical box. That way you do not have to run too many MySQL instances, but you can run more than one if the application works better that way. Shard Unit: If you shard by a single MySQL server, you will run into a problem with high availability very soon. When you have 100 MySQL servers there are roughly 100 more chances for one of them to crash compared with having only one, so ensuring there is a high availability solution becomes critical. Instead of sharing across MySQL servers you will usually be sharding across “Replication Clusters,” such as one MySQL primary node and one or several replica or PXC (Percona XtraDB Cluster) nodes. Shard Technology: What technology can you use to assist you with sharding? Within the MySQL world there is no standard sharding technology as of yet that everyone uses. Most of the large web properties have implemented something in-house for their sharding needs, and some have released their solutions as open source projects. One example is Vitess, contributed by Google, and another is JetPants, contributed by Tumblr. Rolling out your own simple sharding framework might look easy for some developers until you have to deal with operational issues like balancing the shards, resharding, etc., on a large scale. There are a number of purpose-built technologies that can help you with sharding if this doesn’t sound like something your team can manage. Sharding Technologies Here are technologies that you should consider: MySQL Fabric: This is the sharding technology being developed by the MySQL team at Oracle. MySQL Fabric is GA, but its functionality right now is rather limited, especially in terms of their support for multi-sharded queries. Given more time however, it has the potential to become the standard sharding technology for MySQL. Tesora: Tesora has a proxy-based solution for MySQL sharding that became open source some time ago. I would be especially looking at Tesora if you’re also looking at deploying OpenStack, as they’ve invested a lot into the integration. ScaleArc: ScaleArc is a commercial database proxy solution that can do caching, filtering, routing, and sharding. It is a pretty mature solution that handles multiple database technologies and not just MySQL. ScaleBase: ScaleBase is a sharding solution designed specifically for MySQL and the cloud, which similarly to MySQL, operates at the proxy level. There are many technologies in the MySQL space that can help you scale your application without sharding. If you’re going to build the next “Facebook,” however, you will surely need to shard, and there are a number of technologies that can help you do it as painlessly as possible. Large-scale applications on large-scale databases will always introduce complexity, which makes them more complicated to develop against and manage. Success comes with cost. [1] http://dzone.com/research/guide-to-enterprise-integration Peter Zaitsev co-founded Percona in 2006, assuming the role of CEO. Percona helps companies of all sizes maximize their success with MySQL. Peter enjoys mixing business leadership with hands on technical expertise. Peter is also the co-author of O’Reilly’s High Performance MySQL, one of the most popular books on MySQL performance.

April 2, 2015

by Peter Zaitsev

· 20,882 Views · 1 Like

Dismantling invokedynamic

Many Java developers regarded the JDK's version seven release as somewhat a disappointment. On the surface, merely a few language and library extensions made it into the release, namely Project Coin and NIO2. But under the covers, the seventh version of the platform shipped the single biggest extension to the JVM's type system ever introduced after its initial release. Adding the invokedynamic instruction did not only lay the foundation for implementing lambda expressions in Java 8, it also was a game changer for translating dynamic languages into the Java byte code format. While the invokedynamic instruction is an implementation detail for executing a language on the Java virtual machine, understanding the functioning of this instruction gives true insights into the inner workings of executing a Java program. This article gives a beginner's view on what problem the invokedynamic instruction solves and how it solves it. Method handles Method handles are often described as a retrofitted version of Java's reflection API, but this is not what they are meant to represent. While method handles can represent a method, constructor or field, they are not intended to describe properties of these class members. It is for example not possible to directly extract metadata from a method handle such as modifiers or annotation values of the represented method. And while method handles allow for the invocation of a referenced method, their main purpose is to be used together with an invokedynamic call site. For gaining a better understanding of method handles, looking at them as an imperfect replacement for the reflection API is however a reasonable starting point. Method handles cannot be instantiated. Instead, method handles are created by using a designated lookup object. These objects are themselves created by using a factory method that is provided by the MethodHandles class. Whenever this factory is invoked, it first creates a security context which ensures that the resulting lookup object can only locate methods that are also visible to the class from which the factory method was invoked. A lookup object can then be created as follows: class Example { void doSomething() { MethodHandles.Lookup lookup = MethodHandles.lookup(); } private void foo() { /* ... */ } } As argued before, the above lookup object could only be used to locate methods that are also visible to the Example class such asfoo. It would for example be impossible to look up a private method of another class. This is a first major difference to using the reflection API where private methods of outside classes can be located just as any other method and where these methods can even be invoked after marking such a method as accessible. Method handles are therefore sensible of their creation context which is a first major difference to the reflection API. Apart from that, a method handle is more specific than the reflection API by describing a specific type of method rather than representing just any method. In a Java program, a method's type is a composite of both the method's return type and the types of its parameters. For example, the only method of the following Counter class returns an int representing the number of characters of the only String-typed argument: class Counter { static int count(String name) { return name.length(); } } A representation of this method's type can be created by using another factory. This factory is found in the MethodType class which also represents instances of created method types. Using this factory, the method type for Counter::count can be created by handing over the method's return type and its parameter types bundled as an array: MethodType methodType = MethodType.methodType(int.class, new Class[] {String.class}); By using the lookup object that was created before and the above method type, it is now possible to locate a method handle that represents the Counter::count method as depicted in the following code: MethodType methodType = MethodType.methodType(int.class, new Class[] {String.class}); MethodHandles.Lookup lookup = MethodHandles.lookup(); MethodHandle methodHandle = lookup.findStatic(Counter.class, "count", methodType); int count = methodHandle.invokeExact("foo"); assertThat(count, is(3)); At first glance, using a method handle might seem like an overly complex version of using the reflection API. However, keep in mind that the direct invocation of a method using a handle is not the main intent of its use. The main difference of the above example code and of invoking a method via the reflection API is only revealed when looking into the differences of how the Java compiler translates both invocations into Java byte code. When a Java program invokes a method, this method is uniquely identified by its name and by its (non-generic) parameter types and even by its return type. It is for this reason that it is possible to overload methods in Java. And even though the Java programming language does not allow it, the JVM does in theory allow to overload a method by its return type. Following this principle, a reflective method call is executed as a common method call of the Method::invoke method. This method is identified by its two parameters which are of the types Object and Object[]. In addition to this, the method is identified by its Object return type. Because of this signature, all arguments to this method need to always be boxed and enclosed in an array. Similarly, the return value needs to be boxed if it was primitive or null is returned if the method was void. Method handles are the exception to this rule. Instead of invoking a method handle by referring to the signature ofMethodHandle::invokeExact signature which takes an Object[] as its single argument and returns Object, method handles are invoked by using a so-called polymorphic signature. A polymorphic signature is created by the Java compiler dependant on the types of the actual arguments and the expected return type at a call site. For example, when invoking the method handle as above with int count = methodHandle.invokeExact("foo"); the Java compiler translates this invocation as if the invokeExact method was defined to accept a single single argument of typeString and returning an int type. Obviously, such a method does not exist and for (almost) any other method, this would result in a linkage error at runtime. For method handles, the Java Virtual Machine does however recognize this signature to be polymorphic and treats the invocation of the method handle as if the Counter::count method that the handle refers to was inset directly into the call site. Thus, the method can be invoked without the overhead of boxing primitive values or the return type and without placing the argument values inside an array. At the same time, when using the invokeExact invocation, it is guaranteed to the Java virtual machine that the method handle always references a method at runtime that is compatible to the polymorphic signature. For the example, the JVM expected that the referenced method actually accepts a String as its only argument and that it returns a primitive int. If this constraint was not fulfilled, the execution would instead result in a runtime error. However, any other method that accepts a single String and that returns a primitive int could be successfully filled into the method handle's call site to replace Counter::count. In contrast, using the Counter::count method handle at the following three invocations would result in runtime errors, even though the code compiles successfully: int count1 = methodHandle.invokeExact((Object) "foo"); int count2 = (Integer) methodHandle.invokeExact("foo"); methodHandle.invokeExact("foo"); The first statement results in an error because the argument that is handed to the handle is too general. While the JVM expected a String as an argument to the method, the Java compiler suggested that the argument would be an Object type. It is important to understand that the Java compiler took the casting as a hint for creating a different polymorphic signature with anObject type as a single parameter type while the JVM expected a String at runtime. Note that this restriction also holds for handing too specific arguments, for example when casting an argument to an Integer where the method handle required aNumber type as its argument. In the second statement, the Java compiler suggested to the runtime that the handle's method would return an Integer wrapper type instead of the primitive int. And without suggesting a return type at all in the third statement, the Java compiler implicitly translated the invocation into a void method call. Hence, invokeExact really does mean exact. This restriction can sometimes be too harsh. For this reason, instead of requiring an exact invocation, the method handle also allows for a more forgiving invocation where conversions such as type castings and boxings are applied. This sort of invocation can be applied by using the MethodHandle::invoke method. Using this method, the Java compiler still creates a polymorphic signature. This time, the Java virtual machine does however test the actual arguments and the return type for compatibility at run time and converts them by applying boxings or castings, if appropriate. Obviously, these transformations can sometimes add a runtime overhead. Fields, methods and constructors: handles as a unified interface Other than Method instances of the reflection API, method handles can equally reference fields or constructors. The name of theMethodHandle type could therefore be seen as too narrow. Effectively, it does not matter what class member is referenced via a method handle at runtime as long as its MethodType, another type with a misleading name, matches the arguments that are passed at the associated call site. Using the appropriate factories of a MethodHandles.Lookup object, a field can be looked up to represent a getter or a setter. Using getters or setters in this context does not refer to invoking an actual method that follows the Java bean specification. Instead, the field-based method handle directly reads from or writes to the field but in shape of a method call via invoking the method handle. By representing such field access via method handles, field access or method invocations can be used interchangeably. As an example for such interchange, take the following class: class Bean { String value; void print(String x) { System.out.println(x); } } For the above Bean class, the following method handles can be used for either writing a string to the value field or for invoking the print method with the same string as an argument: MethodHandle fieldHandle = lookup.findSetter(Bean.class, "value", String.class); MethodType methodType = MethodType.methodType(void.class, new Class[] {String.class}); MethodHandle methodHandle = lookup.findVirtual(Bean.class, "print", methodType); As long as the method handle call site is handed an instance of Bean together with a String while returning void, both method handles could be used interchangeably as shown here: anyHandle.invokeExact((Bean) mybean, (String) myString); Note that the polymorphic signature of the above call site does not match the method type of the above handle. However, within Java byte code, non-static methods are invoked as if they were static methods with where the this reference is handed as a first, implicit argument. A non-static method's nominal type does therefore diverge from its actual runtime type. Similarly, access to a non-static field requires an instance to be access. Similarly to fields and methods, it is possible to locate and invoke constructors which are considered as methods with a voidreturn value for their nominal type. Furthermore, one can not only invoke a method directly but even invoke a super method as long as this super method is reachable for the class from which the lookup factory was created. In contrast, invoking a super method is not possible at all when relying on the reflection API. If required, it is even possible to return a constant value from a handle. Performance metrics Method handles are often described as being a more performant as the Java reflection API. At least for recent releases of the HotSpot virtual machine, this is not true. The simplest way of proving this is writing an appropriate benchmark. Then again, is not all too simple to write a benchmark for a Java program which is optimized while it is executed. The de facto standard for writing a benchmark has become using JMH, a harness that ships under the OpenJDK umbrella. The full benchmark can be found as a gist in my GitHub profile. In this article, only the most important aspects of this benchmark are covered. From the benchmark, it becomes obvious that reflection is already implemented quite efficiently. Modern JVMs know a concept named inflation where a frequently invoked reflective method call is replaced with runtime generated Java byte code. What remains is the overhead of applying the boxing for passing arguments and receiving a return values. These boxings can sometimes be eliminated by the JVM's Just-in-time compiler but this is not always possible. For this reason, using method handles can be more performant than using the reflection API if method calls involve a significant amount of primitive values. This does however require that the exact method signatures are already known at compile time such that the appropriate polymorphic signature can be created. For most use cases of the reflection API, this guarantee can however not be given because the invoked method's types are not known at compile time. In this case, using method handles does not offer any performance benefits and should not be used to replace it. Creating an invokedynamic call site Normally, invokedynamic call sites are created by the Java compiler only when it needs to translate a lambda expression into byte code. It is worthwhile to note that lambda expressions could have been implemented without invokedynamic call sites altogether, for example by converting them into anonymous inner classes. As a main difference to the suggested approach, using invokedynamic delays the creation of a similar class to runtime. We are looking into class creation in the next section. For now, bear however in mind that invokedynamic does not have anything to do with class creation, it only allows to delay the decision of how to dispatch a method until runtime. For a better understanding of invokedynamic call sites, it helps to create such call sites explicitly in order to look at the mechanic in isolation. To do so, the following example makes use of my code generation framework Byte Buddy which provides explicit byte code generation of invokedynamic call sites without requiring a any knowledge of the byte code format. Any invokedynamic call site eventually yields a MethodHandle that references the method to be invoked. Instead of invoking this method handle manually, it is however up to the Java runtime to do so. Because method handles have become a known concept to the Java virtual machine, these invocations are then optimized similarly to a common method call. Any such method handle is received from a so-called bootstrap method which is nothing more than a plain Java method that fulfills a specific signature. For a trivial example of a bootstrap method, look at the following code: class Bootstrapper { public static CallSite bootstrap(Object... args) throws Throwable { MethodType methodType = MethodType.methodType(int.class, new Class[] {String.class}) MethodHandles.Lookup lookup = MethodHandles.lookup(); MethodHandle methodHandle = lookup.findStatic(Counter.class, "count", methodType); return new ConstantCallSite(methodHandle); } } For now, we do not care much about the arguments of the method. Instead, notice that the method is static what is as a matter of fact a requirement. Within Java byte code, an invokedynamic call site references the full signature of a bootstrap method but not a specific object which could have a state and a life cycle. Once the invokedynamic call site is invoked, control flow is handed to the referenced bootstrap method which is now responsible for identifying a method handle. Once this method handle is returned from the bootstrap method, it is invoked by the Java runtime. As obvious from the above example, a MethodHandle is not returned directly from a bootstrap method. Instead, the handle is wrapped inside of a CallSite object. Whenever a bootstrap method is invoked, the invokedynamic call site is later permanently bound to the CallSite object that is returned from this method. Consequently, a bootstrap method is only invoked a single time for any call site. Thanks to this intermediate CallSite object, it is however possible to exchange the referenced MethodHandle at a later point. For this purpose, the Java class library already offers different implementations of CallSite. We have already seen a ConstantCallSite in the example code above. As the name suggests, a ConstantCallSite always references the same method handle without a possibility of a later exchange. Alternatively, it is however also possible to for example use aMutableCallSite which allows to change the referenced MethodHandle at a later point in time or it is even possible to implement a custom CallSite class. With the above bootstrap method and Byte Buddy, we can now implement a custom invokedynamic instruction. For this, Byte Buddy offers the InvokeDynamic instrumentation that accepts a bootstrap method as its only mandatory argument. Such instrumentations are then fed to Byte Buddy. Assuming the following class: abstract class Example { abstract int method(); } we can use Byte Buddy to subclass Example in order to override method. We are then going to implement this method to contain a single invokedynamic call site. Without any further configuration, Byte Buddy creates a polymorphic signature that resembles the method type of the overridden method. However, for non-static methods, the this reference is set as a first, implicit argument. Assuming that we want to bind the Counter::count method which expects a String as a single argument, we could not bind this handle to Example::method because of this type mismatch. Therefore, we need to create a different call site without the implicit argument but with an String in its place. This can be achieved by using Byte Buddy's domain specific language: Instrumentation invokeDynamic = InvokeDynamic .bootstrap(Bootstrapper.class.getDeclaredMethod(“bootstrap”, Object[].class)) .withoutImplicitArguments() .withValue("foo"); With this instrumentation in place, we can finally extend the Example class and override method to implement the invokedynamic call site as in the following code snippet: Example example = new ByteBuddy() .subclass(Example.class) .method(named(“method”)).intercept(invokeDynamic) .make() .load(Example.class.getClassLoader(), ClassLoadingStrategy.Default.INJECTION) .getLoaded() .newInstance(); int result = example.method(); assertThat(result, is(3)); As obvious from the above assertion, the characters of the "foo" string were counted correctly. By setting appropriate break points in the code, it is further possible to validate that the bootstrap method is called and that control flow further reaches theCounter::count method. So far, we did not gain much from using an invokedynamic call site. The above bootstrap method would always bindCounter::count and can therefore only produce a valid result if the invokedynamic call site really wanted to transform a Stringinto an int. Obviously, bootstrap methods can however be more flexible thanks to the arguments they receive from the invokedynamic call site. Any bootstrap method receives at least three arguments: As a first argument, the bootstrap method receives a MethodHandles.Lookup object. The security context of this object is that of the class that contains the invokedynamic call site that triggered the bootstrapping. As discussed before, this implies that private methods of the defining class could be bound to the invokedynamic call site using this lookup instance. The second argument is a String representing a method name. This string serves as a hint to indicate from the call site which method should be bound to it. Strictly speaking, this argument is not required as it is perfectly legal to bind a method with another name. Byte Buddy simply serves the the name of the overridden method as this argument, if not specified differently. Finally, the MethodType of the method handle that is expected to be returned is served as a third argument. For the example above, we specified explicitly that we expect a String as a single parameter. At the same time, Byte Buddy derived that we require an int as a return value from looking at the overridden method, as we again did not specify any explicit return type. It is up to the implementor of a bootstrap method what exact signature this method should portray as long as it can at least accept these three arguments. If the last parameter of a bootstrap method represents an Object array, this last parameter is treated as a varargs and can therefore accept any excess arguments. This is also the reason why the above example bootstrap method is valid. Additionally, a bootstrap method can receive several arguments from an invokedynamic call site as long as these arguments can be stored in a class's constant pool. For any Java class, a constant pool stores values that are used inside of a class, largely numbers or string values. As of today, such constants can be primitive values of at least 32 bit size, Strings, Classes,MethodHandles and MethodTypes. This allows bootstrap methods to be used more flexible, if locating a suitable method handle requires additional information in form of such arguments. Lambda expressions Whenever the Java compiler translates a lambda expression into byte code, it copies the lambda's body into a private method inside of the class in which the expression is defined. These methods are named lambda$X$Y with X being the name of the method that contains the lambda expression and with Y being a zero-based sequence number. The parameters of such a method are those of the functional interface that the lambda expression implements. Given that the lambda expression makes no use of non-static fields or methods of the enclosing class, the method is also defined to be static. For compensation, the lambda expression is itself substituted by an invokedynamic call site. On its invocation, this call site requests the binding of a factory for an instance of the functional interface. As arguments to this factory, the call site supplies any values of the lambda expression's enclosing method which are used inside of the expression and a reference to the enclosing instance, if required. As a return type, the factory is required to provide an instance of the functional interface. For bootstrapping a call site, any invokedynamic instruction currently delegates to the LambdaMetafactory class which is included in the Java class library. This factory is then responsible for creating a class that implements the functional interface and which invokes the appropriate method that contains the lambda's body which, as described before, is stored in the original class. In the future, this bootstrapping process might however change which is one of the major advantages of using invokedynamic for implementing lambda expressions. If one day, a better suited language feature was available for implementing lambda expressions, the current implementation could simply be swapped out. In order to being able to create a class that implements the functional interface, any call site representing a lambda expression provides additional arguments to the bootstrap method. For the obligatory arguments, it already provides the name of the functional interface's method. Also, it provides a MethodType of the factory method that the bootstrapping is supposed to yield as a result. Additionally, the bootstrap method is supplied another MethodType that describes the signature of the functional interface's method. To that, it receives a MethodHandle referencing the method that contains the lambda's method body. Finally, the call site provides a MethodType of the generic signature of the functional interface's method, i.e. the signature of the method at the call site before type-erasure was applied. When invoked, the bootstrap method looks at these arguments and creates an appropriate implementation of a class that implements the functional interface. This class is created using the ASM library, a low-level byte code parser and writer that has become the de facto standard for direct Java byte code manipulation. Besides implementing the functional interface's method, the bootstrap method also adds an appropriate constructor and a static factory method for creating instances of the class. It is this factory method that is later bound to the invokedyanmic call site. As arguments, the factory receives an instance to the lambda method's enclosing instance, in case it is accessed and also any values that are read from the enclosing method. As an example, consider the following lambda expression: class Foo { int i; void bar(int j) { Consumer consumer = k -> System.out.println(i + j + k); } } In order to be executed, the lambda expression requires access to both the enclosing instance of Foo and to the value j of its enclosing method. Therefore, the desugared version of the above class looks something like the following where the invokedynamic instruction is represented by some pseudo-code: class Foo { int i; void bar(int j) { Consumer consumer = ; } private /* non-static */ void lambda$foo$0(int j, int k) { System.out.println(this.i + j + k); } } In order to being able to invoke lambda$foo$0, both the enclosing Foo instance and the j variable are handed to the factory that is bound by the invokedyanmic instruction. This factory then receives the variables it requires in order to create an instance of the generated class. This generated class would then look something like the following: class Foo$$Lambda$0 implements Consumer { private final Foo _this; private final int j; private Foo$$Lambda$0(Foo _this, int j) { this._this = _this; this.j = j; } private static Consumer get$Lambda(Foo _this, int j) { return new Foo$$Lambda$0(_this, j); } public void accept(Object value) { // type erasure _this.lambda$foo$0(_this, j, (Integer) value); } } Eventually, the factory method of the generated class is bound to the invokedynamic call site via a method handle that is contained by a ConstantCallSite. However, if the lambda expression is fully stateless, i.e. it does not require access to the instance or method in which it is enclosed, the LambdaMetafactory returns a so-called constant method handle that references an eagerly created instance of the generated class. Hence, this instance serves as a singleton to be used for every time that the lambda expression's call site is reached. Obviously, this optimization decision affects your application's memory footprint and is something to keep in mind when writing lambda expressions. Also, no factory method is added to a class of a stateless lambda expression. You might have noticed that the lambda expression's method body is contained in a private method which is now invoked from another class. Normally, this would result in an illegal access error. To overcome this limitation, the generated classes are loaded using so-called anonymous class loading. Anonymous class loading can only be applied when a class is loaded explicitly by handing a byte array. Also, it is not normally possible to apply anonymous class loading in user code as it is hidden away in the internal classes of the Java class library. When a class is loaded using anonymous class loading, it receives a host class of which it inherits its full security context. This involves both method and field access rights and the protection domain such that a lambda expression can also be generated for signed jar files. Using this approch, lambda expression can be considered more secure than anonymous inner classes because private methods are never reachable from outside of a class. Under the covers: lambda forms Lambda forms are an implementation detail of how MethodHandles are executed by the virtual machine. Because of their name, lambda forms are however often confused with lambda expressions. Instead, lambda forms are inspired by lambda calculus and received their name for that reason, not for their actual usage to implement lambda expressions in the OpenJDK. In earlier versions of the OpenJDK 7, method handles could be executed in one of two modes. Method handles were either directly rendered as byte code or they were dispatched using explicit assembly code that was supplied by the Java runtime. The byte code rendering was applied to any method handle that was considered to be fully constant throughout the lifetime of a Java class. If the JVM could however not prove this property, the method handle was instead executed by dispatching it to the supplied assembly code. Unfortunately, because assembly code cannot be optimized by Java's JIT-compiler, this lead to non-constant method handle invocations to "fall off the performance cliff". As this also affected the lazily bound lambda expressions, this was obviously not a satisfactory solution. LambdaForms were introduced to solve this problem. Roughly speaking, lambda forms represent byte code instructions which, as stated before, can be optimized by a JIT-compiler. In the OpenJDK, a MethodHandle's invocation semantics are today represented by a LambdaForm to which the handle carries a reference. With this optimizable intermediate representation, the use of non-constant MethodHandles has become significantly more performant. As a matter of fact, it is even possible to see a byte-code compiled LambdaForm in action. Simply place a break point inside of a bootstrap method or inside of a method that is invoked via a MethodHandle. Once the break point kicks it, the byte code-translated LambdaForms can be found on the call stack. Why this matters for dynamic languages Any language that should be executed on the Java virtual machine needs to be translated to Java byte code. And as the name suggests, Java byte code aligns rather close to the Java programming language. This includes the requirement to define a strict type for any value and before invokedynamic was introduced, a method call required to specify an explicit target class for dispatching a method. Looking at the following JavaScript code, specifying either information is however not possible when translating the method into byte code: 1 2 3 function (foo) { foo.bar(); } Using an invokedynamic call site, it has become possible to delay the identification of the method's dispatcher until runtime and furthermore, to rebind the invocation target, in case that a previous decision needs to be corrected. Before, using the reflection API with all of its performance drawbacks was the only real alternative to implementing a dynamic language. The real profiteer of the invokedynamic instruction are therefore dynamic programming languages. Adding the instruction was a first step away from aligning the byte code format to the Java programming language, making the JVM a powerful runtime even for dynamic languages. And as lambda expressions proved, this stronger focus on hosting dynamic languages on the JVM does neither interfere with evolving the Java language. In contrast, the Java programming languages gained from these efforts.

April 2, 2015

by Rafael Winterhalter

· 13,685 Views · 7 Likes

How CAS (Compare And Swap) in Java works

Before we dig into CAS (Compare And Swap) strategy and how is it used by atomic constructs like AtomicInteger, first consider this code: public class MyApp { private volatile int count = 0; public void upateVisitors() { ++count; //increment the visitors count } } This sample code is tracking the count of visitors to the application. Is there anything wrong with this code? What will happen if multiple threads try to update count? Actually the problem is simply marking count as volatile does not guarantee atomicity and ++count is not an atomic operations. To read more check this. Can we solve this problem if we mark the method itself synchronized as shown below: public class MyApp { private int count = 0; public synchronized void upateVisitors() { ++count; //increment the visitors count } } Will this work? If yes then what changes have we made actually? Does this code guarantee atomicity? Yes. Does this code guarantee visibility? Yes. Then what is the problem? It makes use of locking and that introduces lot of delay and overhead. Check this article. This is very expensive way of making things work. To overcome these problems atomic constructs were introduced. If we make use of an AtomicInteger to track the count it will work. public class MyApp { private AtomicInteger count = new AtomicInteger(0); public void upateVisitors() { count.incrementAndGet(); //increment the visitors count } } The classes that support atomic operations e.g. AtomicInteger, AtomicLong etc. makes use of CAS. CAS does not make use of locking rather it is very optimistic in nature. It follows these steps: Compare the value of the primitive to the value we have got in hand. If the values do not match it means some thread in between has changed the value. Else it will go ahead and swap the value with new value. Check the following code in AtomicLong class: public final long incrementAndGet() { for (;;) { long current = get(); long next = current + 1; if (compareAndSet(current, next)) return next; } } In JDK 8 the above code has been changed to a single intrinsic: public final long incrementAndGet() { return unsafe.getAndAddLong(this, valueOffset, 1L) + 1L; } What advantage this single intrinsic have? Actually this single line is JVM intrinsic which is translated by JIT into an optimized instruction sequence. In case of x86 architecture it is just a single CPU instruction LOCK XADD which might yield better performance than classic load CAS loop. Now think about the possibility when we have high contention and a number of threads want to update the same atomic variable. In that case there is a possibility that locking will outperform the atomic variables but in realistic contention levels atomic variables outperform lock. There is one more construct introduced in Java 8, LongAdder. As per the documentation: This class is usually preferable to AtomicLong when multiple threads update a common sum that is used for purposes such as collecting statistics, not for fine-grained synchronization control. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption. So LongAdder is not always a replacement for AtomicLong. We need to consider the following aspects: When no contention is present AtomicLong performs better. LongAdder will allocate Cells (a final class declared in abstract class Striped64) to avoid contention which consumes memory. So in case we have a tight memory budget we should prefer AtomicLong. That's all folks. Hope you enjoyed it.

April 1, 2015

by Akhil Mittal

· 71,081 Views · 2 Likes

Fork/Join Framework vs. Parallel Streams vs. ExecutorService: The Ultimate Fork/Join Benchmark

How does the Fork/Join framework act under different configurations? Just like the upcoming episode of Star Wars, there has been a lot of excitement mixed with criticism around Java 8 parallelism. The syntactic sugar of parallel streams brought some hype almost like the new lightsaber we’ve seen in the trailer. With many ways now to do parallelism in Java, we wanted to get a sense of the performance benefits and the dangers of parallel processing. After over 260 test runs, some new insights rose from the data and we wanted to share these with you in this post. Fork/Join Framework vs. Parallel Streams vs. ExecutorService: The Ultimate Fork/Join Benchmark http://t.co/CMNfYZe58Z pic.twitter.com/6WExlmbyo6 — Takipi (@takipid) January 20, 2015 ExecutorService vs. Fork/Join Framework vs. Parallel Streams A long time ago, in a galaxy far, far away.... I mean, some 10 years ago concurrency was available in Java only through 3rd party libraries. Then came Java 5 and introduced the java.util.concurrent library as part of the language, strongly influenced by Doug Lea. The ExecutorService became available and provided us a straightforward way to handle thread pools. Of course java.util.concurrent keeps evolving and in Java 7 the Fork/Join framework was introduced, building on top of the ExecutorService thread pools. With Java 8 streams, we’ve been provided an easy way to use Fork/Join that remains a bit enigmatic for many developers. Let’s find out how they compare to one another. We’ve taken 2 tasks, one CPU-intensive and the other IO-intensive, and tested 4 different scenarios with the same basic functionality. Another important factor is the number of threads we use for each implementation, so we tested that as well. The machine we used had 8 cores available so we had variations of 4, 8, 16 and 32 threads to get a sense of the general direction the results are going. For each of the tasks, we’ve also tried a single threaded solution, which you’ll not see in the graphs since, well, it took much much longer to execute. To learn more about exactly how the tests ran you can check out the groundwork section below. Now, let’s get to it. Indexing a 6GB file with 5.8M lines of text In this test, we’ve generated a huge text file, and created similar implementations for the indexing procedure. Here’s what the results looked like: ** Single threaded execution: 176,267msec, or almost 3 minutes. ** Notice the graph starts at 20000 milliseconds. 1. Fewer threads will leave CPUs unutilized, too many will add overhead The first thing you notice in the graph is the shape the results are starting to take - you can get an impression of how each implementation behaves from only these 4 data points. The tipping point here is between 8 and 16 threads, since some threads are blocking in file IO, and adding more threads than cores helped utilize them better. When 32 threads are in, performance got worse because of the additional overhead. 2. Parallel Streams are the best! Almost 1 second better than the runner up: using Fork/Join directly Syntactic sugar aside (lambdas! we didn’t mention lambdas), we’ve seen parallel streams perform better than the Fork/Join and the ExecutorService implementations. 6GB of text indexed in 24.33 seconds. You can trust Java here to deliver the best result. 3. But… Parallel Streams also performed the worst: The only variation that went over 30 seconds This is another reminder of how parallel streams can slow you down. Let’s say this happens on machines that already run multithreaded applications. With a smaller number of threads available, using Fork/Join directly could actually be better than going through parallel streams - a 5 second difference, which makes for about an 18% penalty when comparing these 2 together. 4. Don’t go for the default pool size with IO in the picture When using the default pool size for Parallel Streams, the same number of cores on the machine (which is 8 here), performed almost 2 seconds worse than the 16 threads version. That’s a 7% penalty for going with the default pool size. The reason this happens is related with blocking IO threads. There’s more waiting going on, so introducing more threads lets us get more out of the CPU cores involved while other threads wait to be scheduled instead of being idle. How do you change the default Fork/Join pool size for parallel streams? You can either change the common Fork/Join pool size using a JVM argument: [java] -Djava.util.concurrent.ForkJoinPool.common.parallelism=16 [/java] (All Fork/Join tasks are using a common static pool the size of the number of your cores by default. The benefit here is reducing resource usage by reclaiming the threads for other tasks during periods of no use.) Or... You can use this trick and run Parallel Streams within a custom Fork/Join pool. This overrides the default use of the common Fork/Join pool and lets you use a pool you’ve set up yourself. Pretty sneaky. In the tests, we’ve used the common pool. 5. Single threaded performance was 7.25x worse than the best result Parallelism provided a 7.25x improvement, and considering the machine had 8 cores, it got pretty close to the theoretic 8x prediction! We can attribute the rest to overhead. With that being said, even the slowest parallelism implementation we tested, which this time was parallel streams with 4 threads (30.24sec), performed 5.8x better than the single threaded solution (176.27sec). What happens when you take IO out of the equation? Checking if a number is prime For the next round of tests, we’ve eliminated IO altogether and examined how long it would take to determine if some really big number is prime or not. How big? 19 digits. 1,530,692,068,127,007,263, or in other words: one quintillion seventy nine quadrillion three hundred sixty four trillion thirty eight billion forty eight million three hundred five thousand thirty three. Argh, let me get some air. Anyhow, we haven’t used any optimization other than running to its square root, so we checked all even numbers even though our big number doesn’t divide by 2 just to make it process longer. Spoiler alert: it’s a prime, so each implementation ran the same number of calculations. Here’s how it turned out: ** Single threaded execution: 118,127msec, or almost 2 minutes. ** Notice the graph starts at 20000 milliseconds 1. Smaller differences between 8 and 16 threads Unlike the IO test, we don’t have IO calls here so the performance of 8 and 16 threads was mostly similar, except for the Fork/Join solution. We’ve actually ran a few more sets of tests to make sure we’re getting good results here because of this “anomaly” but it turned out very similar time after time. We’d be glad to hear your thoughts about this in the comment section below. 2. The best results are similar for all methods We see that all implementations share a similar best result of around 28 seconds. No matter which way we tried to approach it, the results came out the same. This doesn’t mean that we’re indifferent to which method to use. Check out the next insight. 3. Parallel streams handle the thread overload better than other implementations This is the more interesting part. With this test, we see again that the the top results for running 16 threads are coming from using parallel streams. Moreover, in this version, using parallel streams was a good call for all variations of thread numbers. 4. Single threaded performance was 4.2x worse than the best result In addition, the benefit of using parallelism when running computationally intensive tasks is almost 2 times worse than the IO test with file IO. This makes sense since it’s a CPU intensive test, unlike the previous one where we could get an extra benefit from cutting down the time our cores were waiting on threads stuck with IO. Conclusion I’d recommend going to the source to learn more about when to use parallel streams and applying careful judgement anytime you do parallelism in Java. The best path to take would be running similar tests to these in a staging environment where you can try and get a better sense of what you’re up against. The factors you have to be mindful of are of course the hardware you’re running on (and the hardware you’re testing on), and the total number of threads in your application. This includes the common Fork/Join pool and code other developers on your team are working on. So try to keep those in check and get a full view of your application before adding parallelism of your own. Groundwork To run this test we’ve used an EC2 c3.2xlarge instance with 8 vCPUs and 15GB of RAM. A vCPU means there’s hyperthreading in place so in fact we have here 4 physical cores that each act as if it were 2. As far as the OS scheduler is concerned, we have 8 cores here. To try and make it as fair as we could, each implementation ran 10 times and we’ve taken the average run time of runs 2 through 9. That’s 260 test runs, phew! Another thing that was important is the processing time. We’ve chosen tasks that would take well over 20 seconds to process so the differences will be easier to spot and less affected by external factors. What’s next? The raw results are available right here, and the code is on GitHub. Please feel free to tinker around with it and let us know what kind of results you’re getting. If you have any more interesting insights or explanations for the results that we’ve missed, we’d be happy to read them and add it to the post. Originally posted on Takipi's blog

April 1, 2015

by Chen Harel

· 16,700 Views

Get Client (Browser) timezone and maintain it in cookie

Recently, I came with requirement where we need to get browser timezone and maintain it so our Spring MVC application can use it. Our application need to convert date and time from server timezone to client timezone. Below is overall idea of implementation: Get Browser timezone by javascript. We can use opensource 'jstz.min.js' file for getting this. We can find this from ‘http://pellepim.bitbucket.org/jstz/’. We need to maintain this timezone. For same, we will store this timezone in cookie. This can be done by creating one jsp 'findTimeZonePage.jsp'. This page will store timezone in cookie and again redirect to original page. Every method of Spring MVC controller will check whether cookie is available, If not then it will redirect to findTimeZonePage.jsp. While doing this we will also pass current Url(will set in model) so that findTimeZonePage jsp can redirect to same page again. Code: 1. findTimeZonePage.jsp loading the page... 2. Add below Methods in Util class: public static TimeZonegetBrowserTimeZone(HttpServletRequest request){ Cookie[] cookieArray = request.getCookies(); if(cookieArray != null){ for(Cookie cookie : cookieArray){ if("CalenderAppTimeZone".equals(cookie.getName())){ String timeZoneId = cookie.getValue(); return TimeZone.getTimeZone(timeZoneId); } } } return null; } public static StringgetFullURL(HttpServletRequest request) { StringBuffer requestURL = request.getRequestURL(); String queryString = request.getQueryString(); if (queryString == null) { return requestURL.toString(); } else { return requestURL.append('?').append(queryString).toString(); } } 3. In each method of MVC Controller class, Add below code at start of method: TimeZone currentTimeZone = MyUtil.getBrowserTimeZone(request); if(currentTimeZone == null){ String url = MyUtil.getFullURL(request); System.out.println("Url="+url); model.addAttribute("redirectUrl", url); //Redirect to 'findTimeZone' for setting timezone. System.out.println("####Timezone is not set. Redirecting to findTimeZone.jsp for setting timezone."); return "findTimeZonePage"; } System.out.println("####Current TimeZone="+currentTimeZone.getID()); Hope this will help.

March 28, 2015

by Rajeshkumar Dave

· 12,814 Views