The Latest and Popular SDLC Topics

A Procedural City in 100 Lines of Three.js

The skyline comes from "City," a simple flight simulator by Ricardo "Mr. Doob" Cabello. "City" is a demo of the capabilities of WebGL, and is written in an impressive 100 lines of JavaScript using Three.js. In his blog post "How to Do a Procedural City in 100 Lines," Jerome Etienne walks you through the process of recreating Cabello's "City." The secret lies in creating 20,000 cubes that are given random sizes and positions and merging them together to create a city. Let's hope that this algorithm is never used for actual city planning, though, because the buildings can randomly intersect each other and there are no logical spaces for streets! Check out Etienne's blog post and the accompanying screencast introduction for the details.
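The "random cubes, then merge" idea can be sketched outside the browser as well. The following is only a rough Java illustration, not Etienne's Three.js code: the class name, the size ranges, and the 2000 x 2000 ground area are all assumptions made for the example. It creates boxes with random footprints, heights, and positions, then concatenates their corner coordinates into one flat array, which is the essence of merging many small geometries into a single one.

```java
import java.util.Random;

/** Hypothetical sketch: random boxes merged into a single vertex buffer. */
class ProceduralCitySketch {

    /** One building: centre on the ground plane plus width, depth and height. */
    static final class Building {
        final float x, z, width, depth, height;

        Building(float x, float z, float width, float depth, float height) {
            this.x = x;
            this.z = z;
            this.width = width;
            this.depth = depth;
            this.height = height;
        }
    }

    /** Creates `count` buildings; all numeric ranges are assumed values for illustration. */
    static Building[] randomBuildings(int count, long seed) {
        Random random = new Random(seed);
        Building[] buildings = new Building[count];
        for (int i = 0; i < count; i++) {
            float x = (random.nextFloat() - 0.5f) * 2000f;  // position on a 2000 x 2000 area
            float z = (random.nextFloat() - 0.5f) * 2000f;
            float width = 10f + random.nextFloat() * 40f;
            float depth = 10f + random.nextFloat() * 40f;
            float height = 20f + random.nextFloat() * 180f;
            buildings[i] = new Building(x, z, width, depth, height);
        }
        return buildings;
    }

    /** Merges every box into one flat array: 8 corners per box, 3 floats (x, y, z) per corner. */
    static float[] mergeVertices(Building[] buildings) {
        float[] merged = new float[buildings.length * 8 * 3];
        int offset = 0;
        for (Building b : buildings) {
            for (int corner = 0; corner < 8; corner++) {
                float signX = (corner & 1) == 0 ? -0.5f : 0.5f;
                float signZ = (corner & 2) == 0 ? -0.5f : 0.5f;
                float y = (corner & 4) == 0 ? 0f : b.height;  // bottom corners sit on the ground
                merged[offset++] = b.x + signX * b.width;
                merged[offset++] = y;
                merged[offset++] = b.z + signZ * b.depth;
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        float[] city = mergeVertices(randomBuildings(20_000, 42L));
        // prints 480000 (20,000 boxes x 8 corners x 3 coordinates)
        System.out.println(city.length);
    }
}
```

Nothing here stops two boxes from overlapping, which matches the remark about intersecting buildings.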

August 9, 2013

by Allen Coin

· 8,779 Views

AspectJ with Akka and Scala

Imagine that you need to find out how your actors are performing, but without using the Typesafe Console. (You don’t actually want to use this code in production, but it is an interesting project to learn from.) What we are interested in is intercepting the receiveMessage call of the ActorCell class.

A Short Introduction to AOP

We could roll our own build of Akka, but that is not a good idea. Instead, we would like to somehow modify the bytecode of the existing Akka JARs to include our instrumentation code. It turns out that we can use aspect-oriented programming to implement cross-cutting concerns in our code. Cross-cutting concerns are pieces of code that cut across the object hierarchy; in other words, we inject the same functionality into different levels of our class structure. In OOP, the only way to share functionality is with inheritance. I will explain the simplest case here, but the same rules apply to mix-in inheritance. If I have:

    class A { def foo: Int = 42 }
    class B { def bar: Int = 42 }
    class AA extends A
    class BB extends B

Then the only way to share the implementation of foo is to extend A; the only way to share the implementation of bar is to extend B. Suppose we now want to count how many times we call foo on all subtypes of A, and bar only on subtypes of BB. This turns out to be rather clumsy. Even if all we do is println("Executing foo") or println("Executing bar"), we have a lot of duplication on our hands. Instead, we would like to write something like this:

    before executing A.foo { println("Executing foo") }
    before executing BB.bar { println("Executing bar") }

And have some machinery apply this to the classes that make up our system. Importantly, we would like this logic to be applied to every subtype of A and every subtype of BB, even if we only have their compiled forms in a JAR.

Enter AspectJ

And this is exactly what AspectJ does. It allows us to define these cross-cutting concerns and to weave them into our classes.
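To get a feel for the "before executing" idea without any weaving, it can be approximated in plain Java with a JDK dynamic proxy. This is only a sketch under stated assumptions: the names BeforeAdviceSketch, Foo, and withBefore are hypothetical, and a proxy is a different technique from AspectJ's bytecode weaving. It works only for interfaces and only for objects we wrap ourselves, which is exactly the limitation that weaving removes.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of "before" advice using a JDK dynamic proxy (not AspectJ). */
class BeforeAdviceSketch {

    interface Foo {
        int foo();
    }

    static class A implements Foo {
        @Override
        public int foo() {
            return 42;
        }
    }

    /** Collects one entry per intercepted call; stands in for println. */
    static final List<String> LOG = new ArrayList<>();

    /** Wraps target so that the "before" logic runs ahead of every call made through iface. */
    @SuppressWarnings("unchecked")
    static <T> T withBefore(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            LOG.add("Executing " + method.getName());  // the "before" advice
            return method.invoke(target, args);         // then proceed to the real method
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        Foo foo = withBefore(new A(), Foo.class);
        System.out.println(foo.foo());  // prints 42
        System.out.println(LOG);        // prints [Executing foo]
    }
}
```

Every caller has to hold the proxy rather than the original object, so code inside an existing JAR that creates its own instances is never intercepted. That gap is what load-time weaving closes.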
The weaving can be done at compile time, or even at class load time. In other words, the AspectJ weaver can modify the classes with our cross-cutting concerns as the JVM loads them. The technical details now involve translating our before executing pseudo-syntax into the syntax that AspectJ can use. We are going to be particularly lazy and use AspectJ’s Java syntax.

    package monitor;

    import org.aspectj.lang.JoinPoint;
    import org.aspectj.lang.annotation.Aspect;
    import org.aspectj.lang.annotation.Before;
    import org.aspectj.lang.annotation.Pointcut;

    import javax.management.InstanceAlreadyExistsException;
    import javax.management.MBeanRegistrationException;
    import javax.management.MalformedObjectNameException;
    import javax.management.NotCompliantMBeanException;

    @Aspect
    public class MonitorAspect {

        @Pointcut(
            value = "execution (* akka.actor.ActorCell.receiveMessage(..)) " +
                    "&& args(msg)",
            argNames = "msg")
        public void receiveMessagePointcut(Object msg) {}

        @Before(value = "receiveMessagePointcut(msg)", argNames = "jp,msg")
        public void message(JoinPoint jp, Object msg) {
            // log the message
            System.out.println("Message " + msg);
        }
    }

The annotations come from the AspectJ dependencies, which, in SBT syntax, are just:

    "org.aspectj" % "aspectjweaver" % "1.7.2",
    "org.aspectj" % "aspectjrt" % "1.7.2"

Excellent. We have defined a pointcut, which you can imagine as a target in the structure of our system. In this particular case, the pointcut says: on execution of the receiveMessage method in the akka.actor.ActorCell class, with any parameters of any type (..), returning any type (*), as long as the argument is called msg and inferred to be of the type Object. Without an advice, though, a pointcut would be useless, just like a method that is never called. An advice applies our logic at the point identified by the pointcut. Here, we have the message advice, which runs on execution of the methods matched by the receiveMessagePointcut. At the moment, it does nothing of importance: it only prints the message.
To see this in action, we need to tell the JVM to use the AspectJ weaver. We do so by specifying the Java agent, which registers a Java Instrumentation implementation, which performs the class file transformation. However, this transformation is costly. Imagine if we had to re-compile every class as it is loaded. To restrict the scope of the transformations, the AspectJ load-time weaver loads an XML file (boo, hiss, I know) from META-INF/aop.xml, which includes its settings. Now, to get it all running, all you need to do is include -javaagent:~/path-to/aspectjweaver.jar when you start the JVM.

Measuring

Now that we have our monitor.MonitorAspect and META-INF/aop.xml ready, all we need to do is implement the logic that records the messages and prints out their per-second averages. I will leave it to the curious reader to come up with a much better approach, but here is one that works, albeit in a very naive way:

    package monitor;

    import java.util.HashMap;
    import java.util.Map;

    class ActorSystemMessages {
        // number of messages received, keyed by the second in which they arrived
        private final Map<Long, Integer> messages = new HashMap<>();

        void recordMessage() {
            long second = System.currentTimeMillis() / 1000;
            int count = 0;
            if (messages.containsKey(second)) {
                count = messages.get(second);
            }
            messages.put(second, ++count);
        }

        float average() {
            if (messages.isEmpty()) return 0;
            int total = 0;
            for (Integer i : messages.values()) {
                total += i;
            }
            // cast first so that the division is not truncated to an integer
            return (float) total / messages.size();
        }
    }

We can now use it in our MonitorAspect:

    @Aspect
    public class MonitorAspect {
        private ActorSystemMessages asm = new ActorSystemMessages();

        @Pointcut(
            value = "execution (* akka.actor.ActorCell.receiveMessage(..)) " +
                    "&& args(msg)",
            argNames = "msg")
        public void receiveMessagePointcut(Object msg) {}

        @Before(value = "receiveMessagePointcut(msg)", argNames = "jp,msg")
        public void message(JoinPoint jp, Object msg) {
            asm.recordMessage();
            System.out.println("Average throughput " + asm.average());
        }
    }

Bringing in an Actor

Let's see it all run. We will make a simple actor.
This time, we will use the Actor DSL. We are not that interested in its behavior, but we want to see messages being sent around. So, we construct a simple App subclass with two actors: one that sends the messages around and one that prints any message it receives.

    import akka.actor.{ActorRef, ActorSystem}

    object Main extends App {
      import akka.actor.ActorDSL._
      import Commands._

      implicit val system = ActorSystem()

      val chatter = actor(new Act {
        become {
          case i: Int => self ! (sender, i)
          case (sender: ActorRef, i: Int) =>
            if (i > 0) self ! (sender, i - 1) else sender ! "zero"
        }
      })

      implicit val _ = actor(new Act {
        become {
          case x => println(">>> " + x)
        }
      })

      def commandLoop(): Unit = {
        readLine() match {
          case CountdownCommand(count) => chatter ! count.toInt
          case QuitCommand => return
        }
        commandLoop()
      }

      commandLoop()
      system.shutdown()
    }

    object Commands {
      val CountdownCommand = """(\d+)""".r
      val QuitCommand = "quit"
    }

The chatter actor, as you can see, receives the number of messages to be crunched. It then sends a message to itself as a tuple containing the original sender and the number of messages, which keeps decreasing until we hit 0, at which point we send the "zero" String back to the original sender. As an interesting aside, we have the tail-recursive commandLoop() function that deals with the input that the users type in.

Running the Example

If you run the example without specifying the -javaagent JVM parameter, the aspect will not be woven in; consequently, no bytecode will be modified and our logging will not work. Because your IDEs are different, the only reliable way is to run it in SBT. And so, execute sbt run, enter the number of messages and see them displayed. Note that I’m setting javaOptions, fork and connectInput in the SBT build. The javaOptions setting is obvious, since that’s how I specify the -javaagent parameter, and fork makes SBT fork the Java process so that the javaOptions take effect.
Finally, the connectInput parameter connects System.in to the console’s STDIN. (We must do this because we use readLine() in our app.)

JMX

Now, I don’t like println at the best of times, and System.out.println is even worse. So, the last modification is to add a JMX exporter and to expose the ActorSystemPerformance MBean. The rather baroque JMX code is:

    public interface ActorSystemPerformanceMXBean {
        float getMessagesPerSecond();
    }

    public class ActorSystemPerformanceMXBeanImpl implements ActorSystemPerformanceMXBean {
        private ActorSystemMessages messages;

        ActorSystemPerformanceMXBeanImpl(ActorSystemMessages messages) {
            this.messages = messages;
        }

        @Override
        public float getMessagesPerSecond() {
            return this.messages.average();
        }
    }

    public class JMXEndpoint {
        public static void start(ActorSystemMessages messages) throws ... {
            MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("monitor:type=Performance");
            ActorSystemPerformanceMXBeanImpl mbean = new ActorSystemPerformanceMXBeanImpl(messages);
            mbs.registerMBean(mbean, name);
        }
    }

With this in place, we can remove the System.out.println and call the JMXEndpoint.start() method in the aspect’s constructor, giving us:

    @Aspect
    public class MonitorAspect {
        final ActorSystemMessages messages;

        public MonitorAspect() throws ... {
            this.messages = new ActorSystemMessages();
            JMXEndpoint.start(messages);
        }

        @Pointcut(...)
        public void receiveMessagePointcut(Object msg) {}

        @Before(...)
        public void message(JoinPoint jp, Object msg) {
            messages.recordMessage();
        }
    }

Run the application using sbt run again, connect to the JMX MBean using jconsole and see the wonders.

Summary

This article is a simple exploration of AOP (as implemented in AspectJ) and its use in Scala and Akka. The implementation is very simplistic. If you use it in production, I will endorse you for Enterprise PHP on LinkedIn.
However, it is an interesting exercise and really shows how well Scala fits even with the slightly more esoteric Java libraries. The source code for your compiling pleasure is at https://github.com/eigengo/activator-akka-aspectj.
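One reason the ActorSystemMessages class from the Measuring section is naive is that it keeps its counts in a plain HashMap, while the advice may be invoked from several dispatcher threads at once. Below is one possible thread-safe variant, offered only as a sketch: the class name ConcurrentMessageCounter and the injectable clock are assumptions introduced for the example and are not part of the original project.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

/** Hypothetical thread-safe take on the per-second message counter. */
class ConcurrentMessageCounter {
    // one adder per wall-clock second; safe to update from many threads
    private final Map<Long, LongAdder> perSecond = new ConcurrentHashMap<>();
    private final LongSupplier clockMillis;

    /** The clock is injectable so that the class can be exercised deterministically. */
    ConcurrentMessageCounter(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    ConcurrentMessageCounter() {
        this(System::currentTimeMillis);
    }

    void recordMessage() {
        long second = clockMillis.getAsLong() / 1000;
        perSecond.computeIfAbsent(second, s -> new LongAdder()).increment();
    }

    /** Average number of messages over the seconds in which at least one message arrived. */
    float average() {
        if (perSecond.isEmpty()) return 0f;
        long total = 0;
        for (LongAdder count : perSecond.values()) {
            total += count.sum();
        }
        return (float) total / perSecond.size();
    }

    public static void main(String[] args) {
        AtomicLong now = new AtomicLong(0);
        ConcurrentMessageCounter counter = new ConcurrentMessageCounter(now::get);
        counter.recordMessage();
        counter.recordMessage();
        counter.recordMessage();   // three messages in second 0
        now.set(1000);
        counter.recordMessage();   // one message in second 1
        System.out.println(counter.average());  // prints 2.0
    }
}
```

Like the original, this keeps one entry per second forever, so a long-running system would still need some form of eviction.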

August 8, 2013

by Jan Machacek

· 9,701 Views

Spock - Return Nested Spies / Mocks

Hi! Some time ago I wrote an article about Mockito and using RETURNS_DEEP_STUBS when working with JAXB. Quite recently we faced a similar issue with deeply nested JAXB and the awesome testing framework written in Groovy called Spock. Natively, Spock does not support creating deep stubs or spies, so we needed to create a workaround for it, and this article will show you how to do it.

Project structure

We will be working on the same data structure as in the RETURNS_DEEP_STUBS when working with JAXB article, so the project structure will be quite similar. The main difference is that the tests are in the /test/groovy/ folder instead of the /test/java/ folder.

Extended Spock Specification

In order to use Spock as a testing framework you have to create Groovy test scripts that extend the Spock Specification class. The Spock documentation covers the details of how to use it. In this project I have created an abstract class that extends Specification and adds two additional methods for creating nested Test Doubles (I may have seen a prototype of this approach somewhere on the internet).

ExtendedSpockSpecification.groovy

    package com.blogspot.toomuchcoding.spock;

    import spock.lang.Specification

    /**
     * Created with IntelliJ IDEA.
     * User: MGrzejszczak
     * Date: 14.06.13
     * Time: 15:26
     */
    abstract class ExtendedSpockSpecification extends Specification {

        /**
         * The method creates nested structure of spies for all the elements present in the property parameter. Those spies are set on the input object.
         *
         * @param object - object on which you want to create nested spies
         * @param property - field accessors delimited by a dot - JavaBean convention
         * @return Spy of the last object from the property path
         */
        protected def createNestedSpies(object, String property) {
            def lastObject = object
            property.tokenize('.').inject object, { obj, prop ->
                if (obj[prop] == null) {
                    def foundProp = obj.metaClass.properties.find { it.name == prop }
                    obj[prop] = Spy(foundProp.type)
                }
                lastObject = obj[prop]
            }
            lastObject
        }

        /**
         * The method creates nested structure of mocks for all the elements present in the property parameter. Those mocks are set on the input object.
         *
         * @param object - object on which you want to create nested mocks
         * @param property - field accessors delimited by a dot - JavaBean convention
         * @return Mock of the last object from the property path
         */
        protected def createNestedMocks(object, String property) {
            def lastObject = object
            property.tokenize('.').inject object, { obj, prop ->
                def foundProp = obj.metaClass.properties.find { it.name == prop }
                def mockedProp = Mock(foundProp.type)
                lastObject."${prop}" >> mockedProp
                lastObject = mockedProp
            }
            lastObject
        }
    }

These two methods work in a very similar manner. Assuming that the method's property argument looks like "a.b.c.d", the methods tokenize the string by "." and iterate over the resulting list ["a", "b", "c", "d"]. For each element we iterate over the properties of the Meta Class to find the one whose name is equal to prop (for example "a"). Once it is found, we use Spock's Mock/Spy method to create a Test Double of the given class (type). Finally we have to bind the nested Test Double to its parent. For the Spy it's easy, since we set the value on the parent (obj[prop] = Spy(foundProp.type)). For the mocks, however, we need to use the overloaded >> operator to record the behavior for our mock - that's why we dynamically call the property whose name is held in the prop variable (lastObject."${prop}" >> mockedProp).
Then we return the mocked/spied instance from the closure and repeat the process for it.

Class to be tested

Let's take a look at the class to be tested:

PlayerServiceImpl.java

    package com.blogspot.toomuchcoding.service;

    import com.blogspot.toomuchcoding.model.PlayerDetails;

    /**
     * User: mgrzejszczak
     * Date: 08.06.13
     * Time: 19:02
     */
    public class PlayerServiceImpl implements PlayerService {
        @Override
        public boolean isPlayerOfGivenCountry(PlayerDetails playerDetails, String country) {
            String countryValue = playerDetails.getClubDetails().getCountry().getCountryCode().getCountryCode().value();
            return countryValue.equalsIgnoreCase(country);
        }
    }

The test class

And now the test class:

PlayerServiceImplWrittenUsingSpockTest.groovy

    package com.blogspot.toomuchcoding.service

    import com.blogspot.toomuchcoding.model.*
    import com.blogspot.toomuchcoding.spock.ExtendedSpockSpecification

    /**
     * User: mgrzejszczak
     * Date: 14.06.13
     * Time: 16:06
     */
    class PlayerServiceImplWrittenUsingSpockTest extends ExtendedSpockSpecification {

        public static final String COUNTRY_CODE_ENG = "ENG";

        PlayerServiceImpl objectUnderTest

        def setup() {
            objectUnderTest = new PlayerServiceImpl()
        }

        def "should return true if country code is the same when creating nested structures using groovy"() {
            given:
                PlayerDetails playerDetails = new PlayerDetails(
                    clubDetails: new ClubDetails(
                        country: new CountryDetails(
                            countryCode: new CountryCodeDetails(
                                countryCode: CountryCodeType.ENG
                            )
                        )
                    )
                )
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                playerIsOfGivenCountry
        }

        def "should return true if country code is the same when creating nested structures using spock mocks - requires CGLIB for non interface types"() {
            given:
                PlayerDetails playerDetails = Mock()
                ClubDetails clubDetails = Mock()
                CountryDetails countryDetails = Mock()
                CountryCodeDetails countryCodeDetails = Mock()
                countryCodeDetails.countryCode >> CountryCodeType.ENG
                countryDetails.countryCode >> countryCodeDetails
                clubDetails.country >> countryDetails
                playerDetails.clubDetails >> clubDetails
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                playerIsOfGivenCountry
        }

        def "should return true if country code is the same using ExtendedSpockSpecification's createNestedMocks"() {
            given:
                PlayerDetails playerDetails = Mock()
                CountryCodeDetails countryCodeDetails = createNestedMocks(playerDetails, "clubDetails.country.countryCode")
                countryCodeDetails.countryCode >> CountryCodeType.ENG
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                playerIsOfGivenCountry
        }

        def "should return false if country code is not the same using ExtendedSpockSpecification createNestedMocks"() {
            given:
                PlayerDetails playerDetails = Mock()
                CountryCodeDetails countryCodeDetails = createNestedMocks(playerDetails, "clubDetails.country.countryCode")
                countryCodeDetails.countryCode >> CountryCodeType.PL
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                !playerIsOfGivenCountry
        }

        def "should return true if country code is the same using ExtendedSpockSpecification's createNestedSpies"() {
            given:
                PlayerDetails playerDetails = Spy()
                CountryCodeDetails countryCodeDetails = createNestedSpies(playerDetails, "clubDetails.country.countryCode")
                countryCodeDetails.countryCode = CountryCodeType.ENG
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                playerIsOfGivenCountry
        }

        def "should return false if country code is not the same using ExtendedSpockSpecification's createNestedSpies"() {
            given:
                PlayerDetails playerDetails = Spy()
                CountryCodeDetails countryCodeDetails = createNestedSpies(playerDetails, "clubDetails.country.countryCode")
                countryCodeDetails.countryCode = CountryCodeType.PL
            when:
                boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
            then:
                !playerIsOfGivenCountry
        }
    }

Let's move through the test methods one by one. First I present the code and then give a quick description of the presented snippet.

    def "should return true if country code is the same when creating nested structures using groovy"() {
        given:
            PlayerDetails playerDetails = new PlayerDetails(
                clubDetails: new ClubDetails(
                    country: new CountryDetails(
                        countryCode: new CountryCodeDetails(
                            countryCode: CountryCodeType.ENG
                        )
                    )
                )
            )
        when:
            boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
        then:
            playerIsOfGivenCountry
    }

Here you can see the approach of creating nested structures by using the Groovy feature of passing the properties to be set to the constructor.

    def "should return true if country code is the same when creating nested structures using spock mocks - requires CGLIB for non interface types"() {
        given:
            PlayerDetails playerDetails = Mock()
            ClubDetails clubDetails = Mock()
            CountryDetails countryDetails = Mock()
            CountryCodeDetails countryCodeDetails = Mock()
            countryCodeDetails.countryCode >> CountryCodeType.ENG
            countryDetails.countryCode >> countryCodeDetails
            clubDetails.country >> countryDetails
            playerDetails.clubDetails >> clubDetails
        when:
            boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
        then:
            playerIsOfGivenCountry
    }

Here you can find a test that creates mocks using Spock - mind you that you need CGLIB as a dependency when you are mocking non-interface types.
    def "should return true if country code is the same using ExtendedSpockSpecification's createNestedMocks"() {
        given:
            PlayerDetails playerDetails = Mock()
            CountryCodeDetails countryCodeDetails = createNestedMocks(playerDetails, "clubDetails.country.countryCode")
            countryCodeDetails.countryCode >> CountryCodeType.ENG
        when:
            boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
        then:
            playerIsOfGivenCountry
    }

Here you have an example of creating nested mocks using the createNestedMocks method.

    def "should return false if country code is not the same using ExtendedSpockSpecification createNestedMocks"() {
        given:
            PlayerDetails playerDetails = Mock()
            CountryCodeDetails countryCodeDetails = createNestedMocks(playerDetails, "clubDetails.country.countryCode")
            countryCodeDetails.countryCode >> CountryCodeType.PL
        when:
            boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
        then:
            !playerIsOfGivenCountry
    }

An example showing that creating nested mocks using the createNestedMocks method really works - it should return false for a non-matching country code.

    def "should return true if country code is the same using ExtendedSpockSpecification's createNestedSpies"() {
        given:
            PlayerDetails playerDetails = Spy()
            CountryCodeDetails countryCodeDetails = createNestedSpies(playerDetails, "clubDetails.country.countryCode")
            countryCodeDetails.countryCode = CountryCodeType.ENG
        when:
            boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
        then:
            playerIsOfGivenCountry
    }

Here you have an example of creating nested spies using the createNestedSpies method.
    def "should return false if country code is not the same using ExtendedSpockSpecification's createNestedSpies"() {
        given:
            PlayerDetails playerDetails = Spy()
            CountryCodeDetails countryCodeDetails = createNestedSpies(playerDetails, "clubDetails.country.countryCode")
            countryCodeDetails.countryCode = CountryCodeType.PL
        when:
            boolean playerIsOfGivenCountry = objectUnderTest.isPlayerOfGivenCountry(playerDetails, COUNTRY_CODE_ENG);
        then:
            !playerIsOfGivenCountry
    }

An example showing that creating nested spies using the createNestedSpies method really works - it should return false for a non-matching country code.

Summary

In this post I have shown you how you can create nested mocks and spies using Spock. It can be useful especially when you are working with nested structures such as JAXB. Still, you have to bear in mind that those structures to some extent violate the Law of Demeter. If you check my previous article about Mockito you will see that:

"We are getting the nested elements from the JAXB generated classes. Although it violates the Law of Demeter it is quite common to call methods of structures because JAXB generated classes are in fact structures so in fact I fully agree with Martin Fowler that it should be called the Suggestion of Demeter."

In this example the idea is the same - we are talking about structures, so we don't violate the Law of Demeter.

Advantages

- With a single method you can mock/spy nested elements
- The code is cleaner than creating tons of objects that you then have to set manually

Disadvantages

- Your IDE won't help you with the property names, since the properties are passed as Strings
- You can set up the Test Doubles only in the Specification context (and sometimes you want to go outside this scope)

Sources

As usual, the sources are available at BitBucket and GitHub.
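The path-walking idea behind createNestedSpies can also be sketched in plain Java, without Spock. The example below is an illustration under assumptions: NestedPathSketch and its tiny Player/Club/Country model are hypothetical names, and it instantiates ordinary objects through reflection rather than creating spies or mocks. It splits the dotted path, looks up each field on the current object, creates the missing child, binds it to its parent, and moves on.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;

/** Hypothetical sketch: create missing objects along a dot-delimited field path. */
class NestedPathSketch {

    // Minimal stand-ins for a nested, JAXB-like structure.
    static class Country { String code; }
    static class Club { Country country; }
    static class Player { Club club; }

    /** Walks the path from root, filling in null fields, and returns the last object. */
    static Object createNested(Object root, String path) {
        Object current = root;
        try {
            for (String name : path.split("\\.")) {
                Field field = current.getClass().getDeclaredField(name);
                field.setAccessible(true);
                Object value = field.get(current);
                if (value == null) {
                    Constructor<?> constructor = field.getType().getDeclaredConstructor();
                    constructor.setAccessible(true);
                    value = constructor.newInstance();
                    field.set(current, value);  // bind the new child to its parent
                }
                current = value;                // continue from the child
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot resolve path: " + path, e);
        }
        return current;
    }

    public static void main(String[] args) {
        Player player = new Player();
        Country country = (Country) createNested(player, "club.country");
        country.code = "ENG";
        System.out.println(player.club.country.code);  // prints ENG
    }
}
```

As with the Spock helpers, the path is a plain String, so a typo in a property name only shows up at run time.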

August 8, 2013

by Marcin Grzejszczak

· 16,275 Views · 1 Like

EclipseLink MOXy and the Java API for JSON Processing - Object Model APIs

The Java API for JSON Processing (JSR-353) is the Java standard for producing and consuming JSON, which was introduced as part of Java EE 7. JSR-353 includes object (DOM-like) and stream (StAX-like) APIs. In this post I will demonstrate the initial JSR-353 support we have added to MOXy's JSON binding in EclipseLink 2.6.

You can now use MOXy to marshal to:

- javax.json.JsonArrayBuilder
- javax.json.JsonObjectBuilder

And unmarshal from:

- javax.json.JsonStructure
- javax.json.JsonObject
- javax.json.JsonArray

You can try this out today using a nightly build of EclipseLink 2.6.0: http://www.eclipse.org/eclipselink/downloads/nightly.php

The JSR-353 reference implementation is available here: https://java.net/projects/jsonp/downloads/download/ri/javax.json-ri-1.0.zip

Java Model

Below is the simple customer model that we will use for this post. Note that for this example we are only using the standard JAXB (JSR-222) annotations.

Customer

    package blog.jsonp.moxy;

    import java.util.*;
    import javax.xml.bind.annotation.*;

    @XmlType(propOrder={"id", "firstName", "lastName", "phoneNumbers"})
    public class Customer {

        private int id;
        private String firstName;
        private String lastName;
        private List<PhoneNumber> phoneNumbers = new ArrayList<PhoneNumber>();

        public int getId() {
            return id;
        }

        public void setId(int id) {
            this.id = id;
        }

        public String getFirstName() {
            return firstName;
        }

        public void setFirstName(String firstName) {
            this.firstName = firstName;
        }

        @XmlElement(nillable=true)
        public String getLastName() {
            return lastName;
        }

        public void setLastName(String lastName) {
            this.lastName = lastName;
        }

        @XmlElement
        public List<PhoneNumber> getPhoneNumbers() {
            return phoneNumbers;
        }
    }

PhoneNumber

    package blog.jsonp.moxy;

    import javax.xml.bind.annotation.*;

    @XmlAccessorType(XmlAccessType.FIELD)
    public class PhoneNumber {

        private String type;
        private String number;

        public String getType() {
            return type;
        }

        public void setType(String type) {
            this.type = type;
        }

        public String getNumber() {
            return number;
        }

        public void setNumber(String number) {
            this.number = number;
        }
    }

jaxb.properties

To specify MOXy as your JAXB provider you need to include a file called jaxb.properties in the same package as your domain model with the following entry (see: Specifying EclipseLink MOXy as your JAXB Provider):

    javax.xml.bind.context.factory=org.eclipse.persistence.jaxb.JAXBContextFactory

Marshal Demo

In the demo code below we will use a combination of JSR-353 and MOXy APIs to produce JSON. JSR-353's JsonObjectBuilder and JsonArrayBuilder are used to produce instances of JsonObject and JsonArray. We can use MOXy to marshal to these builders by wrapping them in instances of MOXy's JsonObjectBuilderResult and JsonArrayBuilderResult.

    package blog.jsonp.moxy;

    import java.util.*;
    import javax.json.*;
    import javax.json.stream.JsonGenerator;
    import javax.xml.bind.*;
    import org.eclipse.persistence.jaxb.JAXBContextProperties;
    import org.eclipse.persistence.oxm.json.*;

    public class MarshalDemo {

        public static void main(String[] args) throws Exception {
            // Create the EclipseLink JAXB (MOXy) Marshaller
            Map<String, Object> jaxbProperties = new HashMap<String, Object>(2);
            jaxbProperties.put(JAXBContextProperties.MEDIA_TYPE, "application/json");
            jaxbProperties.put(JAXBContextProperties.JSON_INCLUDE_ROOT, false);
            JAXBContext jc = JAXBContext.newInstance(new Class[] {Customer.class}, jaxbProperties);
            Marshaller marshaller = jc.createMarshaller();

            // Create the JsonArrayBuilder
            JsonArrayBuilder customersArrayBuilder = Json.createArrayBuilder();

            // Build the First Customer
            Customer customer = new Customer();
            customer.setId(1);
            customer.setFirstName("Jane");
            customer.setLastName(null);

            PhoneNumber phoneNumber = new PhoneNumber();
            phoneNumber.setType("cell");
            phoneNumber.setNumber("555-1111");
            customer.getPhoneNumbers().add(phoneNumber);

            // Marshal the First Customer Object into the JsonArray
            JsonArrayBuilderResult result = new JsonArrayBuilderResult(customersArrayBuilder);
            marshaller.marshal(customer, result);

            // Build List of PhoneNumber Objects for Second Customer
            List<PhoneNumber> phoneNumbers = new ArrayList<PhoneNumber>(2);

            PhoneNumber workPhone = new PhoneNumber();
            workPhone.setType("work");
            workPhone.setNumber("555-2222");
            phoneNumbers.add(workPhone);

            PhoneNumber homePhone = new PhoneNumber();
            homePhone.setType("home");
            homePhone.setNumber("555-3333");
            phoneNumbers.add(homePhone);

            // Marshal the List of PhoneNumber Objects
            JsonArrayBuilderResult arrayBuilderResult = new JsonArrayBuilderResult();
            marshaller.marshal(phoneNumbers, arrayBuilderResult);

            customersArrayBuilder
                // Use JSR-353 APIs for Second Customer's Data
                .add(Json.createObjectBuilder()
                    .add("id", 2)
                    .add("firstName", "Bob")
                    .addNull("lastName")
                    // Included Marshalled PhoneNumber Objects
                    .add("phoneNumbers", arrayBuilderResult.getJsonArrayBuilder())
                )
                .build();

            // Write JSON to System.out
            Map<String, Object> jsonProperties = new HashMap<String, Object>(1);
            jsonProperties.put(JsonGenerator.PRETTY_PRINTING, true);
            JsonWriterFactory writerFactory = Json.createWriterFactory(jsonProperties);
            JsonWriter writer = writerFactory.createWriter(System.out);
            writer.writeArray(customersArrayBuilder.build());
            writer.close();
        }
    }

Output

Below is the output from running the marshal demo (MarshalDemo). The first customer object and the entries of the second customer's phoneNumbers array are the portions that were populated from our Java model.

    [
       {
          "id":1,
          "firstName":"Jane",
          "lastName":null,
          "phoneNumbers":[
             {
                "type":"cell",
                "number":"555-1111"
             }
          ]
       },
       {
          "id":2,
          "firstName":"Bob",
          "lastName":null,
          "phoneNumbers":[
             {
                "type":"work",
                "number":"555-2222"
             },
             {
                "type":"home",
                "number":"555-3333"
             }
          ]
       }
    ]

Unmarshal Demo

MOXy enables you to unmarshal from a JSR-353 JsonStructure (JsonObject or JsonArray). To do this, simply wrap the JsonStructure in an instance of MOXy's JsonStructureSource and use one of the unmarshal operations that takes an instance of Source.
    package blog.jsonp.moxy;

    import java.io.FileInputStream;
    import java.util.*;
    import javax.json.*;
    import javax.xml.bind.*;
    import org.eclipse.persistence.jaxb.JAXBContextProperties;
    import org.eclipse.persistence.oxm.json.JsonStructureSource;

    public class UnmarshalDemo {

        public static void main(String[] args) throws Exception {
            try (FileInputStream is = new FileInputStream("src/blog/jsonp/moxy/input.json")) {
                // Create the EclipseLink JAXB (MOXy) Unmarshaller
                Map<String, Object> jaxbProperties = new HashMap<String, Object>(2);
                jaxbProperties.put(JAXBContextProperties.MEDIA_TYPE, "application/json");
                jaxbProperties.put(JAXBContextProperties.JSON_INCLUDE_ROOT, false);
                JAXBContext jc = JAXBContext.newInstance(new Class[] {Customer.class}, jaxbProperties);
                Unmarshaller unmarshaller = jc.createUnmarshaller();

                // Parse the JSON
                JsonReader jsonReader = Json.createReader(is);

                // Unmarshal Root Level JsonArray
                JsonArray customersArray = jsonReader.readArray();
                JsonStructureSource arraySource = new JsonStructureSource(customersArray);
                List<Customer> customers = (List<Customer>) unmarshaller.unmarshal(arraySource, Customer.class)
                    .getValue();
                for(Customer customer : customers) {
                    System.out.println(customer.getFirstName());
                }

                // Unmarshal Nested JsonObject
                JsonObject customerObject = customersArray.getJsonObject(1);
                JsonStructureSource objectSource = new JsonStructureSource(customerObject);
                Customer customer = unmarshaller.unmarshal(objectSource, Customer.class)
                    .getValue();
                for(PhoneNumber phoneNumber : customer.getPhoneNumbers()) {
                    System.out.println(phoneNumber.getNumber());
                }
            }
        }
    }

Input (input.json)

The following JSON input will be converted to a JsonArray using a JsonReader.

    [
       {
          "id":1,
          "firstName":"Jane",
          "lastName":null,
          "phoneNumbers":[
             {
                "type":"cell",
                "number":"555-1111"
             }
          ]
       },
       {
          "id":2,
          "firstName":"Bob",
          "lastName":null,
          "phoneNumbers":[
             {
                "type":"work",
                "number":"555-2222"
             },
             {
                "type":"home",
                "number":"555-3333"
             }
          ]
       }
    ]

Output

Below is the output from running the unmarshal demo (UnmarshalDemo).

    Jane
    Bob
    555-2222
    555-3333

August 7, 2013

by Blaise Doughan

· 13,751 Views

Getting started with CQEngine: LINQ for Java, Only Faster

CQEngine or collection query engine is a library that allows you to build indices over java collections and query them for objects using exposed properties. It offers similar capability to LINQ in .net but is thought to be faster because it builds indices over collections before querying them and uses set theory instead of iterations. In this post we will see how to query a simple collection of objects, in our example, a collection of users of a hypothetical system, using CQEngine. In a subsequent post, we will also see how iteratively searching a collection compares to querying via CQEngine. The first step is to get the CQEengine jar file. Download the jar from the CQEngine website or if you are using Maven, add the following dependency. com.googlecode.cqengine cqengine 1.0.3 Next, lets create the Class whose object we will be searching for: package co.syntx.examples.cqengine; import com.googlecode.cqengine.attribute.Attribute; import com.googlecode.cqengine.attribute.SimpleAttribute; public class User { private String username; private String password; private String fullname; private Role role; public User(String username, String password, String fullname, Role role) { super(); this.username = username; this.password = password; this.fullname = fullname; this.role = role; } public static final Attribute FULL_NAME = new SimpleAttribute("fullname") { public String getValue(User user) { return user.fullname; } }; public static final Attribute USERNAME = new SimpleAttribute("username") { public String getValue(User user) { return user.username; } }; public String getUsername() { return username; } public void setUsername(String username) { this.username = username; } public String getPassword() { return password; } public void setPassword(String password) { this.password = password; } public String getFullname() { return fullname; } public void setFullname(String fullname) { this.fullname = fullname; } public Role getRole() { return role; } public void setRole(Role 
role) { this.role = role; } } Next, we write a class to perform our searches. I will go function by function. 1. Function to Build a Test Indexed Collection: In the following function, we build an indexed collection, define indices on attributes, and populate the collection with a given number of objects. In actual usage, your collection will probably be filled with objects returned from a database, read from a file, or obtained in some similar way. public void buildIndexedCollection(int size) throws Exception { indexedUsers = CQEngine.newInstance(); indexedUsers.addIndex(HashIndex.onAttribute(User.FULL_NAME)); indexedUsers.addIndex(SuffixTreeIndex.onAttribute(User.FULL_NAME)); for (int i = 0; i < size; i++) { String username = RandomStringGenerator.generateRandomString(8, RandomStringGenerator.Mode.ALPHANUMERIC); String password = RandomStringGenerator.generateRandomString(8, RandomStringGenerator.Mode.ALPHANUMERIC); String fullname = RandomStringGenerator.generateRandomString(5, RandomStringGenerator.Mode.ALPHA) + " " + RandomStringGenerator.generateRandomString(5, RandomStringGenerator.Mode.ALPHA); Role role = new Role(); role.setName("admin"); indexedUsers.add(new User(username, password, fullname, role)); } } The first statement initializes a new indexed collection, a reference to which is stored in the class variable indexedUsers. The reference is of type IndexedCollection<User>. The next two statements define two indices: i) a hash index, suitable for equals-style queries, and ii) a suffix tree index, suitable for ends-with-style queries. For the purpose of this example, we build indices only on the full name field. Inside the loop, we use a random string generator to populate dummy objects, and the last statement of the loop adds each object to the indexed collection. 2. Function to Perform Indexed Search for Exact Matches: In this function, we are querying for names that exactly match a given name. The equal function takes the attribute on which to perform the query and the value to search for.
The method equal is statically imported via import static com.googlecode.cqengine.query.QueryFactory.*; In the example below, we loop over the results returned by the retrieve method without doing anything with them. In your case, you may choose to return the result set returned by retrieve. public void indexedSearchForEquals(String fullname) throws Exception { Query<User> query = equal(User.FULL_NAME, fullname); for (User user : indexedUsers.retrieve(query)) { // System.out.println(user.getFullname()); } } 3. Function to Perform Indexed Search for Ends With Matches: In this function, we are querying for names that end with a certain suffix. public void indexedSearchForEndsWith(String endswith) throws Exception { Query<User> query1 = endsWith(User.FULL_NAME, endswith); for (User user : indexedUsers.retrieve(query1)) { // System.out.println(user.getFullname()); } } 4. Function to Perform Indexed Search for Equals or Ends With Matches: This function combines the two queries shown above with an or relation between them. public void indexedSearchForEqualOrEndsWith(String equals, String ends) throws Exception { Query<User> query = or(equal(User.FULL_NAME, equals), endsWith(User.FULL_NAME, ends)); for (User user : indexedUsers.retrieve(query)) { // System.out.println(user.getFullname()); } } 5. Putting it together: All the functions above belong to a class called CQEngineTest. We create a new object, build a test collection, then search for exact matches, for strings that end with a certain suffix, or for either. CQEngineTest test = new CQEngineTest(); test.buildIndexedCollection(size); test.indexedSearchForEqualOrEndsWith("test", "test"); In this example, we have used a hash index and a suffix tree index. There are many other types of indices to choose from, depending on the type of query that you want to perform. A list of these indices and when to use them can be found on the CQEngine project page.
Apart from the equal and endsWith operations, there are others that you would typically expect to find. In a subsequent post we will also see how an indexed search compares with a typical iterative search in terms of time.
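To make the idea of "an index plus set operations instead of iteration" concrete, here is a minimal plain-Java sketch. It is not CQEngine code and says nothing about how CQEngine is implemented internally: the Person and HashIndexSketch classes are hypothetical, the index is an ordinary HashMap, and the "or" query is simplified to a union of two equality lookups.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the article's User class; only the full name matters here.
class Person {
    final String fullname;
    Person(String fullname) { this.fullname = fullname; }
}

class HashIndexSketch {
    private final List<Person> all = new ArrayList<>();
    // The "index": attribute value -> set of objects holding that value.
    private final Map<String, Set<Person>> byFullname = new HashMap<>();

    void add(Person p) {
        all.add(p);
        byFullname.computeIfAbsent(p.fullname, k -> new LinkedHashSet<>()).add(p);
    }

    // Iterative search: inspects every element of the collection.
    Set<Person> scanEquals(String name) {
        Set<Person> out = new LinkedHashSet<>();
        for (Person p : all) {
            if (p.fullname.equals(name)) out.add(p);
        }
        return out;
    }

    // Indexed search: a single hash lookup; returns a copy so callers cannot alter the index.
    Set<Person> indexedEquals(String name) {
        return new LinkedHashSet<>(byFullname.getOrDefault(name, Collections.<Person>emptySet()));
    }

    // An "or" of two equality queries expressed as a set union of two index lookups.
    Set<Person> indexedEqualsEither(String a, String b) {
        Set<Person> out = indexedEquals(a);
        out.addAll(indexedEquals(b));
        return out;
    }

    public static void main(String[] args) {
        HashIndexSketch idx = new HashIndexSketch();
        idx.add(new Person("Ann Lee"));
        idx.add(new Person("Bob Ray"));
        idx.add(new Person("Ann Lee"));
        System.out.println(idx.indexedEquals("Ann Lee").size());
        System.out.println(idx.indexedEqualsEither("Ann Lee", "Bob Ray").size());
    }
}
```

The trade-off is the usual one for indices: every add does a little extra work so that each query can skip the full scan.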

August 5, 2013

by Faheem Sohail

· 28,032 Views · 1 Like

Configuring log4j Loggers to Ignore Spring/Hibernate Logging

If you use log4j in a project that also uses the Spring framework, Hibernate, or any other framework, you will probably want different log levels for your own code and for third-party components. You may also want different log levels for your own components. The answer is simple: configure different loggers for different package names and give each its own log level. In a log4j XML configuration, the logger tag specifies the package name or package prefix and then the log level. For example, a logger named org.springframework with level ERROR outputs only logs of level ERROR and above for packages starting with org.springframework.
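A log4j 1.x XML configuration along these lines might look like the fragment below. It is a sketch, not the author's original listing: the appender name, the conversion pattern, the root level, and the com.example.myapp package are all assumptions chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">

  <!-- Assumed console appender; any appender would do -->
  <appender name="console" class="org.apache.log4j.ConsoleAppender">
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d %-5p [%c] %m%n"/>
    </layout>
  </appender>

  <!-- Third-party frameworks: only ERROR and above -->
  <logger name="org.springframework">
    <level value="ERROR"/>
  </logger>
  <logger name="org.hibernate">
    <level value="ERROR"/>
  </logger>

  <!-- Hypothetical application package: more verbose -->
  <logger name="com.example.myapp">
    <level value="DEBUG"/>
  </logger>

  <root>
    <level value="INFO"/>
    <appender-ref ref="console"/>
  </root>
</log4j:configuration>
```

Because logger names are hierarchical, the org.springframework entry also covers sub-packages such as org.springframework.beans unless a more specific logger overrides it.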

August 5, 2013

by Faheem Sohail

· 14,780 Views · 1 Like

JPA Searching Using Lucene - A Working Example with Spring and DBUnit

Working Example on GitHub There's a small, self-contained Mavenised example project over on GitHub to accompany this post. Check it out here: https://github.com/adrianmilne/jpa-lucene-spring-demo Running the Demo See the README file on GitHub for details of running the demo. Essentially, it's just running the unit tests, with the usual Maven build and test results output to the console. This is the result of running the DBUnit test, which inserts Book data into the HSQL database using JPA and then uses Lucene to query the data, testing that the expected Books are returned (i.e. only those in the SCI-FI category containing the word 'Space', with any that have 'Space' in the title appearing before those with 'Space' only in the description). The Book Entity Our simple example stores Books. The Book entity class below is a standard JPA Entity with a few additional annotations to identify it to Lucene: @Indexed - this identifies that the class will be added to the Lucene index. You can define a specific index by adding the 'index' attribute to the annotation. We're just choosing the simplest, minimal configuration for this example. In addition to this, you also need to specify which properties on the entity are to be indexed, and how they are to be indexed. For our example we are again going for the default option by just adding an @Field annotation with no extra parameters. We are adding one other annotation to the 'title' field - @Boost - which tells Lucene to give more weight to search term matches that appear in this field than to the same term appearing in the description field. This example is purposefully kept minimal in terms of the ins-and-outs of Lucene (I may cover that in a later post); we're really just concentrating on the integration with JPA and Spring for now.
package com.cor.demo.jpa.entity; import javax.persistence.Entity; import javax.persistence.EnumType; import javax.persistence.Enumerated; import javax.persistence.GeneratedValue; import javax.persistence.Id; import javax.persistence.Lob; import org.hibernate.search.annotations.Boost; import org.hibernate.search.annotations.Field; import org.hibernate.search.annotations.Indexed; /** * Book JPA Entity. */ @Entity @Indexed public class Book { @Id @GeneratedValue private Long id; @Field @Boost(value = 1.5f) private String title; @Field @Lob private String description; @Field @Enumerated(EnumType.STRING) private BookCategory category; public Book(){ } public Book(String title, BookCategory category, String description){ this.title = title; this.category = category; this.description = description; } public Long getId() { return id; } public void setId(Long id) { this.id = id; } public String getTitle() { return title; } public void setTitle(String title) { this.title = title; } public BookCategory getCategory() { return category; } public void setCategory(BookCategory category) { this.category = category; } public String getDescription() { return description; } public void setDescription(String description) { this.description = description; } @Override public String toString() { return "Book [id=" + id + ", title=" + title + ", description=" + description + ", category=" + category + "]"; } } The Book Manager The BookManager class acts as a simple service layer for the Book operations - used for adding books and searching books. As you can see, the JPA database resources are autowired in by Spring from the application-context.xml. We are just using an in-memory hsql database in this example. 
package com.cor.demo.jpa.manager; import java.util.List; import javax.persistence.EntityManager; import javax.persistence.PersistenceContext; import javax.persistence.PersistenceContextType; import javax.persistence.Query; import org.hibernate.search.jpa.FullTextEntityManager; import org.hibernate.search.jpa.Search; import org.hibernate.search.query.dsl.QueryBuilder; import org.slf4j.Logger; import org.slf4j.LoggerFactory; import org.springframework.context.annotation.Scope; import org.springframework.stereotype.Component; import org.springframework.transaction.annotation.Transactional; import com.cor.demo.jpa.entity.Book; import com.cor.demo.jpa.entity.BookCategory; /** * Manager for persisting and searching on Books. Uses JPA and Lucene. */ @Component @Scope(value = "singleton") public class BookManager { /** Logger. */ private static Logger LOG = LoggerFactory.getLogger(BookManager.class); /** JPA Persistence Unit. */ @PersistenceContext(type = PersistenceContextType.EXTENDED, name = "booksPU") private EntityManager em; /** Hibernate Full Text Entity Manager. */ private FullTextEntityManager ftem; /** * Method to manually update the Full Text Index. This is not required if inserting entities * using this Manager as they will automatically be indexed. Useful though if you need to index * data inserted using a different method (e.g. pre-existing data, or test data inserted via * scripts or DbUnit). */ public void updateFullTextIndex() throws Exception { LOG.info("Updating Index"); getFullTextEntityManager().createIndexer().startAndWait(); } /** * Add a Book to the Database. */ @Transactional public Book addBook(Book book) { LOG.info("Adding Book : " + book); em.persist(book); return book; } /** * Delete All Books. 
*/ @SuppressWarnings("unchecked") @Transactional public void deleteAllBooks() { LOG.info("Delete All Books"); Query allBooks = em.createQuery("select b from Book b"); List<Book> books = allBooks.getResultList(); // We need to delete individually (rather than a bulk delete) to ensure they are removed // from the Lucene index correctly for (Book b : books) { em.remove(b); } } @SuppressWarnings("unchecked") @Transactional public void listAllBooks() { LOG.info("List All Books"); LOG.info("------------------------------------------"); Query allBooks = em.createQuery("select b from Book b"); List<Book> books = allBooks.getResultList(); for (Book b : books) { LOG.info(b.toString()); getFullTextEntityManager().index(b); } } /** * Search for a Book. */ @SuppressWarnings("unchecked") @Transactional public List<Book> search(BookCategory category, String searchString) { LOG.info("------------------------------------------"); LOG.info("Searching Books in category '" + category + "' for phrase '" + searchString + "'"); // Create a Query Builder QueryBuilder qb = getFullTextEntityManager().getSearchFactory().buildQueryBuilder().forEntity(Book.class).get(); // Create a Lucene Full Text Query org.apache.lucene.search.Query luceneQuery = qb.bool() .must(qb.keyword().onFields("title", "description").matching(searchString).createQuery()) .must(qb.keyword().onField("category").matching(category).createQuery()).createQuery(); Query fullTextQuery = getFullTextEntityManager().createFullTextQuery(luceneQuery, Book.class); // Run Query and print out results to console List<Book> result = (List<Book>) fullTextQuery.getResultList(); // Log the Results LOG.info("Found Matching Books :" + result.size()); for (Book b : result) { LOG.info(" - " + b); } return result; } /** * Convenience method to get Full Text Entity Manager. Protected scope to assist mocking in Unit * Tests. * @return Full Text Entity Manager.
*/ protected FullTextEntityManager getFullTextEntityManager() { if (ftem == null) { ftem = Search.getFullTextEntityManager(em); } return ftem; } /** * Get the JPA Entity Manager (required for the DBUnit Tests). * @return Entity manager */ protected EntityManager getEntityManager() { return em; } /** * Sets the JPA Entity Manager (required to assist with mocking in Unit Test) * @param em EntityManager */ protected void setEntityManager(EntityManager em) { this.em = em; } } application-context.xml This is the Spring configuration file. In the JPA Entity Manager configuration, the key 'hibernate.search.default.indexBase' is added to the jpaPropertyMap to tell Lucene where to create the index. We have also externalised the database login credentials to a properties file (as you may wish to change these for different environments, for example by updating the propertyConfigurer to look for and use a different external properties file if it finds one on the file system). The properties file location is given as classpath:/system.properties. Testing Using DBUnit The project includes an example of using DBUnit with Spring to test adding to and searching against the database: DBUnit populates the database with test data, the test exercises the Book Manager search operations, and then the database is cleaned down. This is a great way to test database functionality and can be easily integrated into Maven and continuous build environments. Because DBUnit bypasses the standard JPA insertion calls, the data does not get automatically added to the Lucene index. We have a method exposed on the service interface to update the full text index, 'updateFullTextIndex()'; calling this causes Lucene to update the index with the current data in the database. This can be useful when you are adding search to pre-populated databases and need to index the existing content.
package com.cor.demo.jpa.manager; import java.io.InputStream; import java.util.List; import org.dbunit.DBTestCase; import org.dbunit.database.DatabaseConnection; import org.dbunit.database.IDatabaseConnection; import org.dbunit.dataset.IDataSet; import org.dbunit.dataset.xml.FlatXmlDataSetBuilder; import org.dbunit.operation.DatabaseOperation; import org.hibernate.impl.SessionImpl; import org.junit.After; import org.junit.Before; import org.junit.Test; import org.junit.runner.RunWith; import org.slf4j.Logger; import org.slf4j.LoggerFactory; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.test.context.ContextConfiguration; import org.springframework.test.context.junit4.SpringJUnit4ClassRunner; import com.cor.demo.jpa.entity.Book; import com.cor.demo.jpa.entity.BookCategory; /** * DBUnit Test - loads data defined in 'test-data-set.xml' into the database to run tests against the * BookManager. More thorough (and ultimately easier in this context) than using mocks. */ @RunWith(SpringJUnit4ClassRunner.class) @ContextConfiguration(locations = { "classpath:/application-context.xml" }) public class BookManagerDBUnitTest extends DBTestCase { /** Logger. */ private static Logger LOG = LoggerFactory.getLogger(BookManagerDBUnitTest.class); /** Book Manager Under Test. */ @Autowired private BookManager bookManager; @Before public void setup() throws Exception { DatabaseOperation.CLEAN_INSERT.execute(getDatabaseConnection(), getDataSet()); } @After public void tearDown() { deleteBooks(); } @Override protected IDataSet getDataSet() throws Exception { InputStream inputStream = this.getClass().getClassLoader().getResourceAsStream("test-data-set.xml"); FlatXmlDataSetBuilder builder = new FlatXmlDataSetBuilder(); return builder.build(inputStream); } /** * Get the underlying database connection from the JPA Entity Manager (DBUnit needs this connection). 
* @return Database Connection * @throws Exception */ private IDatabaseConnection getDatabaseConnection() throws Exception { return new DatabaseConnection(((SessionImpl) (bookManager.getEntityManager().getDelegate())).connection()); } /** * Tests the expected results for searching for 'Space' in SCI-FI books. */ @Test public void testSciFiBookSearch() throws Exception { bookManager.listAllBooks(); bookManager.updateFullTextIndex(); List<Book> results = bookManager.search(BookCategory.SCIFI, "Space"); assertEquals("Expected 2 results for SCI FI search for 'Space'", 2, results.size()); assertEquals("Expected 1st result to be '2001: A Space Oddysey'", "2001: A Space Oddysey", results.get(0).getTitle()); assertEquals("Expected 2nd result to be 'Apollo 13'", "Apollo 13", results.get(1).getTitle()); } private void deleteBooks() { LOG.info("Deleting Books...-"); bookManager.deleteAllBooks(); } } The source data for the test is defined in an XML file.

August 5, 2013

by Adrian Milne

· 31,293 Views

JMS vs RabbitMQ

Definition: JMS: Java Message Service is an API that is part of Java EE for sending messages between two or more clients. There are many JMS providers, such as OpenMQ (GlassFish’s default), HornetQ (JBoss), and ActiveMQ. RabbitMQ: an open source message broker that uses the AMQP standard and is written in Erlang. Messaging model: JMS supports two models: point-to-point (one to one) and publish/subscribe. RabbitMQ supports the AMQP model, which has four exchange types: direct, fanout, topic, and headers. Data types: JMS supports five different message types, but RabbitMQ supports only binary data. Workflow strategy: In AMQP, producers send to an exchange, which then routes the message to queues; in JMS, producers send to the queue or topic directly. Technology compatibility: JMS is specific to Java, but RabbitMQ supports many technologies. Performance: If you would like to know more about their performance, this benchmark is a good place to start, but look for others as well.
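The difference between sending to a queue directly and sending through an exchange is easier to see with the routing decision written out. The following plain-Java sketch is only an illustration of how direct, fanout, and topic exchanges decide whether a message reaches a bound queue; it uses no RabbitMQ client API, the ExchangeSketch class and its routes method are hypothetical, and the headers exchange is left out.

```java
class ExchangeSketch {

    // Returns true if a message with the given routing key would be delivered
    // to a queue bound to the exchange with the given binding key.
    static boolean routes(String exchangeType, String bindingKey, String routingKey) {
        switch (exchangeType) {
            case "fanout":
                return true; // every bound queue receives the message; keys are ignored
            case "direct":
                return bindingKey.equals(routingKey); // exact match only
            case "topic":
                return topicMatches(bindingKey.split("\\."), 0, routingKey.split("\\."), 0);
            default:
                throw new IllegalArgumentException("unsupported exchange type: " + exchangeType);
        }
    }

    // Topic patterns are dot-separated words: '*' matches exactly one word,
    // '#' matches zero or more words.
    private static boolean topicMatches(String[] pattern, int pi, String[] key, int ki) {
        if (pi == pattern.length) {
            return ki == key.length;
        }
        if (pattern[pi].equals("#")) {
            // Try letting '#' absorb 0, 1, 2, ... of the remaining words.
            for (int next = ki; next <= key.length; next++) {
                if (topicMatches(pattern, pi + 1, key, next)) return true;
            }
            return false;
        }
        if (ki == key.length) {
            return false;
        }
        boolean wordMatches = pattern[pi].equals("*") || pattern[pi].equals(key[ki]);
        return wordMatches && topicMatches(pattern, pi + 1, key, ki + 1);
    }

    public static void main(String[] args) {
        System.out.println(routes("direct", "orders", "orders"));
        System.out.println(routes("topic", "stock.*", "stock.nyse"));
        System.out.println(routes("topic", "stock.#", "stock.nyse.ibm"));
    }
}
```

In JMS terms, a queue corresponds roughly to a direct binding with one consumer group and a topic to a fanout, which is why the AMQP model is usually described as the more flexible of the two.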

July 30, 2013

by Saeid Siavashi

· 51,713 Views · 16 Likes

An Introduction to Generics in Java – Part 5

This is a continuation of an introductory discussion on generics, previous parts of which can be found here. In this post I will focus on type parameter bounds and their usage. Bounded Type Parameters When a generic type is compiled, all occurrences of a type parameter are removed by the compiler and replaced by a concrete type. The compiler also generates the casts needed for type safety during this procedure. This concrete type is typically Object, but the compiler can use other types as well. This process is called Type Erasure and will be explained in a future post. For the time being, all we need to understand is that the type information of generic types is lost once they are compiled. For this reason, if we want to access a method/property using a type parameter, we’ll typically be able to access only those that are defined in the class Object (I am oversimplifying here, as we’ll be able to access other methods/properties as well if we use a bound, which we will discuss in this post). For example, take a look at the following code snippet – public class MyGenericClass<E> { private E prop; public MyGenericClass(E prop) { this.prop = prop; } public E getProp() { return this.prop; } public void printProp() { //OK, because toString is defined within Object System.out.println(this.prop.toString()); } public int getValue() { /** * NOT OK, because Object doesn’t have * an intValue method. Compile-time Error.
*/ return this.prop.intValue(); } } After compiling the above class, I get the following message – MyGenericClass.java:37: error: cannot find symbol return this.prop.intValue(); ^ symbol: method intValue() location: variable prop of type E where E is a type-variable: E extends Object declared in class MyGenericClass 1 error Although the message seems cryptic, the reason behind it is the one that I’ve stated above – when this type is compiled, the type parameter E will be replaced by Object, and Object doesn’t have an intValue method. So the problem occurs because the compiler is using Object to replace the type parameter. If I could somehow tell the compiler to use another type during this erasure, one which has an intValue method (Number, for example), then this error would be resolved. This is exactly what parameter bounds do. By using a type as a bound on a type parameter, I can instruct the compiler to use that type during the erasure in place of Object, and then I can easily access the methods/properties defined in that type. The general syntax for specifying a type parameter bound is as follows – public class MyGenericClass<E extends MyBoundType> This also tells the compiler that when a type argument is passed during an instance creation of this generic type, it will be a subtype of MyBoundType, so it can safely let us access the methods that are defined in that type using the type parameter E. If any other type is passed, then the compiler will issue a compile-time error. The extends keyword specifies the bound relation between E and MyBoundType. We will use the same keyword even if E is bounded by an interface type, that is, if MyBoundType is an interface. Here extends means both classical extends and implements.
So, if we use Number as our parameter bound for our last example, the error message will be gone because now the compiler will use Number to erase the type parameter E, and Number has an intValue method defined in it - public class MyGenericClass<E extends Number> { private E prop; public MyGenericClass(E prop) { this.prop = prop; } public E getProp() { return this.prop; } public void printProp() { // OK, because toString is defined within Object System.out.println(this.prop.toString()); } public int getValue() { // Now it compiles just fine! return this.prop.intValue(); } } Remember our generic Insertion Sort algorithm from the first part of the series? We had declared it like this – public class InsertionSort<E extends Comparable<E>> This told the compiler that the type arguments that will be passed here will all implement the Comparable interface, so they will have a compareTo method. As a result, the compiler allowed us to do this inside the sort method – for (int i = 1; i <= elements.length - 1; i++) { E valueToInsert = elements[i]; int holePos = i; /** * See how we are calling compareTo method * on the type parameter? */ while (holePos > 0 && valueToInsert.compareTo(elements[holePos - 1]) < 0) { elements[holePos] = elements[holePos - 1]; holePos--; } elements[holePos] = valueToInsert; } This example also shows that we can pass another generic type as a type parameter bound. In fact we can use all classes, interfaces and enums, and even another type parameter, as a bound. Only primitive and array types are not allowed as a bound. Multiple and Recursive Bounds We can define multiple bounds on a single parameter. In this case we use the & operator to list them in the following way - public class MyGenericClass<E extends MyBoundClass & MyBoundInterface> This tells the compiler that the type argument that is passed will be a subtype of MyBoundClass and will implement MyBoundInterface. So, we will be able to access all the methods/properties defined in those types.
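As a small runnable illustration of multiple bounds, the sketch below uses generic methods rather than a generic class, but the bound syntax is the same. The BoundedMax class and its method names are made up for this example; Number and Comparable are the standard JDK types.

```java
import java.util.Arrays;
import java.util.List;

class BoundedMax {

    // E must be a subtype of Number (the class bound, listed first)
    // and must implement Comparable<E> (the interface bound).
    // Assumes a non-empty list; an empty one would throw IndexOutOfBoundsException.
    static <E extends Number & Comparable<E>> E max(List<E> values) {
        E best = values.get(0);
        for (E v : values) {
            // compareTo is available because of the Comparable<E> bound
            if (v.compareTo(best) > 0) best = v;
        }
        return best;
    }

    static <E extends Number & Comparable<E>> double maxAsDouble(List<E> values) {
        // doubleValue is available because of the Number bound
        return max(values).doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(max(Arrays.asList(3, 9, 4)));
        System.out.println(maxAsDouble(Arrays.asList(1.5, 0.5)));
    }
}
```

Integer and Double both satisfy the two bounds; a call such as max(Arrays.asList("a", "b")) is rejected at compile time because String is not a Number.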
The Java Language Specification requires us to list the class bound first, otherwise the compiler will report an error. For example, the following will not compile – /** * Will cause an error because we are not * listing the class bound first. */ public class MyGenericClass<E extends MyBoundInterface & MyBoundClass> We can also declare recursive bounds, so that a bound can depend on the type parameter itself. Consider our sorting example from the first part of the series. We declared it like this – public class InsertionSort<E extends Comparable<E>> Here, the bound is recursive because the bound on E itself depends upon E (the one that is supplied to Comparable). If we pass Integer as a type argument when creating an instance of InsertionSort, then the type argument to Comparable will be Integer too. We can also declare mutually recursive bounds like this – public class MyGenericClass<T extends FirstType<U>, U extends SecondType<T>> Java Enum Declaration Let us now consider an example from the Java API itself. We all know that enumerations in Java are all objects of a class, and that class extends the Enum class. The declaration of that class looks like this – public abstract class Enum<E extends Enum<E>> implements Comparable<E>, Serializable Beginners in Java Generics find the first line very confusing. Before explaining the reasoning behind this weird declaration, let us explore an example. Suppose that we are going to build a software system which will have various types of vehicles (a vehicle simulation system, perhaps?). The vehicles will have a name and a length. We also want to compare vehicles with each other based on their lengths.
An approach for building the vehicle classes might be something like this – public abstract class Vehicle { private String name; private double length; public String getName() { return name; } public void setName(String name) { this.name = name; } public double getLength() { return length; } public void setLength(double length) { this.length = length; } } public class Car extends Vehicle implements Comparable<Car> { public int compareTo(Car anotherCar) { double thisLength = this.getLength(); double thatLength = anotherCar.getLength(); if (thisLength > thatLength) return 1; else if (thisLength < thatLength) return -1; return 0; } // other methods and properties } public class Bus extends Vehicle implements Comparable<Bus> { public int compareTo(Bus anotherBus) { double thisLength = this.getLength(); double thatLength = anotherBus.getLength(); if (thisLength > thatLength) return 1; else if (thisLength < thatLength) return -1; return 0; } // other methods and properties } The problem with the above implementation is pretty obvious – even though the comparison logic is almost the same among all the subtypes of Vehicle, it’s duplicated in all of them. This creates a maintenance problem, as changing the comparison logic now forces us to change the code in many places. To remedy this, we can remove the comparison from the subtypes and move it up into Vehicle.
To do this, we will rewrite those classes as follows – public abstract class Vehicle implements Comparable<Vehicle> { private String name; private double length; public String getName() { return name; } public void setName(String name) { this.name = name; } public double getLength() { return length; } public void setLength(double length) { this.length = length; } public int compareTo(Vehicle anotherVehicle) { double thisLength = this.getLength(); double thatLength = anotherVehicle.getLength(); if (thisLength > thatLength) return 1; else if (thisLength < thatLength) return -1; return 0; } } public class Car extends Vehicle { // car-specific methods and properties } public class Bus extends Vehicle { // bus-specific methods and properties } This approach also has a problem. The above implementation will allow us to compare a car with a bus without any errors – Car car = new Car(); car.setName("Toyota"); car.setLength(2); Bus bus = new Bus(); bus.setName("Volvo"); bus.setLength(4); car.compareTo(bus); // No error This is certainly a problem, since comparing a bus with a car should not be done using the same logic that is used to compare a car with a car. How can we solve this? How can we reuse the comparison logic among all the subtypes, while at the same time raising error flags at compile time whenever someone tries to compare two incompatible types? In the last example the problem occurred because the compareTo method has a parameter of type Vehicle. As a result we were able to pass any subtype of Vehicle to it, the way we passed a bus to compare with a car. Let’s try to change the type of this parameter so that this kind of mixing generates an error. If we want to allow the comparison of a car only with a car, then the argument to the compareTo method must be of type Car.
If we change it to Car, we will also need to change the type argument that we are passing to Comparable in the Vehicle class declaration – public abstract class Vehicle implements Comparable<Car> { // other methods and properties public int compareTo(Car car) { // method implementation } } But then this will not allow us to compare any other types. We will not be able to compare a bus with another bus. To allow this, we will need to change the parameter to be of type Bus. If we declare a new subtype named Cycle, we will need this method to support that type too! So we can see that the parameter type of this compare method should vary if we need to enforce compatible comparison. From the above discussion it’s clear that we need to parameterize the parameter type of the compareTo method, and in turn, parameterize the Vehicle class itself. If we do this, we will then be able to pass Car, Bus, Cycle etc. as its type argument, which in turn will be used as the parameter type of the compare method. In general, after we declare Vehicle as a generic type, all of its subtypes will pass themselves as a type argument while extending from it, so that the parameter type of this compareTo method matches their type – public abstract class Vehicle<E> implements Comparable<E> { // other methods and properties public int compareTo(E vehicle) { // method implementation } } /** * Now this class’s compareTo version will take a Car type * as its argument. */ public class Car extends Vehicle<Car> { // car specific method and properties } /** * Now this class’s compareTo version will take a Bus type * as its argument. */ public class Bus extends Vehicle<Bus> { // bus specific method and properties } /** * Doing something like this will now generate a * compile-time error. */ car.compareTo(bus); This approach solves the last problem that we were facing, but introduces a new one.
After converting Vehicle to a generic type and using the type parameter as the parameter type of the compare method, it looks like this – public int compareTo(E anotherVehicle) { double thisLength = this.getLength(); // Now the following line is an error. double thatLength = anotherVehicle.getLength(); if (thisLength > thatLength) return 1; else if (thisLength < thatLength) return -1; return 0; } Since we didn’t put any bound on the type parameter, and the Object class doesn’t have a getLength method, the compiler will generate an error. We get to call this method on an object of type E only if E is bounded by Vehicle itself, because then the compiler will know that objects of this type will have this method. So our compare method will work only if E is bounded by Vehicle itself! After this modification, the classes look like this – public abstract class Vehicle<E extends Vehicle<E>> implements Comparable<E> { private String name; private double length; public String getName() { return name; } public void setName(String name) { this.name = name; } public double getLength() { return length; } public void setLength(double length) { this.length = length; } public int compareTo(E anotherVehicle) { double thisLength = this.getLength(); double thatLength = anotherVehicle.getLength(); if (thisLength > thatLength) return 1; else if (thisLength < thatLength) return -1; return 0; } } public class Car extends Vehicle<Car> { // Car-specific properties and methods } public class Bus extends Vehicle<Bus> { // Bus-specific properties and methods } // and in main Car car = new Car(); car.setName("Toyota"); car.setLength(2); Bus bus = new Bus(); bus.setName("Volvo"); bus.setLength(4); car.compareTo(car); // Works as expected car.compareTo(bus); // compile-time error Even with the above example, a certain kind of type mixing is possible. Rather than discussing it here, I am going to leave it to you to figure it out. If you can’t, check out the next post of this series!
I guess now you know why the Enum class is declared in that way. This kind of recursive bound allows us to write methods in a supertype which take its subtypes as arguments, or return them as return values. I encourage you to check out the source code of the Enum class to find these methods. That’s it for today. Stay tuned for the next post! Resources: Java Generics and Collections; Java Generics FAQs by Angelika Langer
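Two of the Enum methods that benefit from the recursive bound are compareTo, which takes an E, and getDeclaringClass, which returns a Class<E>. The sketch below shows both with the JDK's own Enum class; the EnumBoundSketch class, its Color and Size enums, and the later helper are hypothetical names introduced for the example.

```java
class EnumBoundSketch {
    enum Color { RED, GREEN, BLUE }
    enum Size { SMALL, LARGE }

    // Uses the same recursive bound as Enum itself, so it works for any single enum type.
    // Returns whichever constant is declared later (compareTo follows declaration order).
    static <E extends Enum<E>> E later(E a, E b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        // compareTo(E) on Color accepts only a Color
        System.out.println(Color.RED.compareTo(Color.BLUE) < 0);

        // getDeclaringClass() returns Class<E>, here Class<Color>, with no cast needed
        Class<Color> declaring = Color.GREEN.getDeclaringClass();
        System.out.println(declaring.getSimpleName());

        System.out.println(later(Size.SMALL, Size.LARGE));

        // Color.RED.compareTo(Size.SMALL);  // does not compile: Size is not a Color
    }
}
```

This mirrors the Vehicle example: the supertype implements the comparison once, and each subtype only ever compares against its own kind.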

July 29, 2013

by MD Sayem Ahmed

· 32,282 Views · 5 Likes

Why String is Immutable in Java

This is an old yet still popular question. There are multiple reasons why String is designed to be immutable in Java. A good answer depends on a good understanding of memory, synchronization, data structures, etc. In the following, I will summarize some answers. 1. Requirement of the string pool. The string pool (string intern pool) is a special storage area in the Java heap. When a string is created and an equal string already exists in the pool, the reference to the existing string is returned instead of creating a new object and returning its reference. The following code creates only one String object in the heap: String string1 = "abcd"; String string2 = "abcd"; Both variables refer to the same pooled object. If String were not immutable, changing the string through one reference would lead to the wrong value for the other references. 2. Allow String to cache its hash code. The hash code of a String is frequently used in Java, for example in a HashMap. Being immutable guarantees that the hash code will always be the same, so it can be cached without worrying about changes. That means there is no need to calculate the hash code every time it is used, which is more efficient. 3. Security. String is widely used as a parameter for many Java classes, e.g. for network connections, opening files, etc. Were String not immutable, a connection or file could be changed, leading to a serious security threat: the method thought it was connecting to one machine, but was not. Mutable strings could cause security problems in reflection too, as the parameters are strings. Here is a code example: boolean connect(String s) { if (!isSecure(s)) { throw new SecurityException(); } /* this would cause a problem if s were changed before this point through another reference */ causeProblem(s); return true; } In summary, the reasons include design, efficiency, and security. Actually, this is also true for many other “why” questions in a Java interview.
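The string pool behaviour described above can be observed directly. The following is a small sketch rather than anything from the original post; the class and method names (`StringPoolDemo`, `literalsShareInstance`, and so on) are made up for illustration.

```java
public class StringPoolDemo {

    // Two identical literals resolve to the same pooled object, so == is true.
    static boolean literalsShareInstance() {
        String string1 = "abcd";
        String string2 = "abcd";
        return string1 == string2;
    }

    // new String(...) creates a separate heap object; intern() maps it back to the pooled one.
    static boolean explicitCopyIsDistinctUntilInterned() {
        String pooled = "abcd";
        String copy = new String("abcd");
        return copy != pooled && copy.equals(pooled) && copy.intern() == pooled;
    }

    // "Modifying" a String returns a new object; the original value is untouched.
    static boolean originalSurvivesConcat() {
        String original = "abcd";
        String longer = original.concat("ef");
        return original.equals("abcd") && longer.equals("abcdef");
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());
        System.out.println(explicitCopyIsDistinctUntilInterned());
        System.out.println(originalSurvivesConcat());
    }
}
```

Because every reference to the pooled "abcd" sees the same object, sharing is only safe as long as no reference can change it, which is the design argument made in point 1.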

July 29, 2013

by Ryan Wang

· 217,298 Views · 9 Likes

Node.js Call HTTPS With BASIC Authentication

The Node.js https module can be used to call a remote server over HTTPS with BASIC authentication: var https = require('https'); var options = { host: 'test.example.com', port: 443, path: '/api/service/' + servicename, /* authentication headers */ headers: { 'Authorization': 'Basic ' + Buffer.from(username + ':' + passw).toString('base64') } }; /* this is the call */ var request = https.get(options, function(res) { var body = ""; res.on('data', function(data) { body += data; }); res.on('end', function() { /* here we have the full response, HTML or JSON */ console.log(body); }); res.on('error', function(e) { console.log("Got error: " + e.message); }); });

July 29, 2013

by Santiago Urrizola

· 106,772 Views · 2 Likes

Displaying and Searching std::map Contents in WinDbg

This time we’re up for a bigger challenge. We want to automatically display and possibly search and filter std::map objects in WinDbg. The script for std::vectors was relatively easy because of the flat structure of the data in a vector; maps are more complex beasts. Specifically, a map in the Visual C++ STL is implemented as a red-black tree. Each tree node has three important pointers: _Left, _Right, and _Parent. Additionally, each node has a _Myval field that contains the std::pair with the key and value represented by the node. Iterating a tree structure requires recursion, and WinDbg scripts don’t have any syntax to define functions. However, we can invoke a script recursively – a script is allowed to contain the $$>a< command that invokes it again with a different set of arguments. The path to the script is also readily available in ${$arg0}. Before I show you the script, there’s just one little challenge I had to deal with. When you call a script recursively, the values of the pseudo-registers (like $t0) will be clobbered by the recursive invocation. I was on the verge of allocating memory dynamically or calling into a shell process to store and load variables, when I stumbled upon the .push and .pop commands, which store the register context and load it, respectively. These are a must for recursive WinDbg scripts. OK, so suppose you want to display values from an std::map where the key is less than or equal to 2. Here we go: 0:000> $$>a< traverse_map.script my_map -c ".block { .if (@@(@$t9.first) <= 2) { .echo ----; ?? @$t9.second } }" size = 10 ---- struct point +0x000 x : 0n1 +0x004 y : 0n2 +0x008 data : extra_data ---- struct point +0x000 x : 0n0 +0x004 y : 0n1 +0x008 data : extra_data ---- struct point +0x000 x : 0n2 +0x004 y : 0n3 +0x008 data : extra_data For each pair (stored in the $t9 pseudo-register), the block checks if the first component is less than or equal to 2, and if it is, outputs the second component. Next, here’s the script. 
Note that it’s considerably more complex than what we had to do with vectors, because it essentially invokes itself with a different set of parameters and then repeats recursively. .if ($sicmp("${$arg1}", "-n") == 0) { .if (@@(@$t0->_Isnil) == 0) { .if (@$t2 == 1) { .printf /D "%p\n", @$t0, @$t0 .printf "key = " ?? @$t0->_Myval.first .printf "value = " ?? @$t0->_Myval.second } .else { r? $t9 = @$t0->_Myval command } } $$ Recurse into _Left, _Right unless they point to the root of the tree .if (@@(@$t0->_Left) != @@(@$t1)) { .push /r /q r? $t0 = @$t0->_Left $$>a< ${$arg0} -n .pop /r /q } .if (@@(@$t0->_Right) != @@(@$t1)) { .push /r /q r? $t0 = @$t0->_Right $$>a< ${$arg0} -n .pop /r /q } } .else { r? $t0 = ${$arg1} .if (${/d:$arg2}) { .if ($sicmp("${$arg2}", "-c") == 0) { r $t2 = 0 aS ${/v:command} "${$arg3}" } } .else { r $t2 = 1 aS ${/v:command} " " } .printf "size = %d\n", @@(@$t0._Mysize) r? $t0 = @$t0._Myhead->_Parent r? $t1 = @$t0->_Parent $$>a< ${$arg0} -n ad command } Of particular note are the aS command, which configures an alias that the recursive invocation then uses to run a command block for each of the map’s elements; the $sicmp function, which compares strings; and the .printf /D command, which outputs a chunk of DML. Finally, the recursion terminates when _Left or _Right are equal to the root of the tree (that’s just how the tree is implemented in this case).
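For readers who find the recursion hard to follow in WinDbg syntax, the same traversal shape can be sketched in Java. This is only an analogy under simplifying assumptions: the `Node` class is a hypothetical stand-in whose fields loosely mirror _Left, _Right, _Parent and _Isnil, and it is not the actual Visual C++ node layout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class SentinelTreeWalk {

    // Simplified stand-in for a tree node; not the real STL layout.
    static final class Node {
        Node left, right, parent;
        final boolean isNil;
        final int key;

        Node(int key, boolean isNil) {
            this.key = key;
            this.isNil = isNil;
        }
    }

    // Creates the sentinel head of an empty tree: all links point back at itself.
    static Node newHead() {
        Node head = new Node(0, true);
        head.left = head;
        head.right = head;
        head.parent = head;
        return head;
    }

    // Creates a real node whose missing children are represented by the sentinel.
    static Node newNode(int key, Node head) {
        Node node = new Node(key, false);
        node.left = head;
        node.right = head;
        return node;
    }

    // Visit the node, then recurse left and right unless the child is the sentinel,
    // which corresponds to the script comparing _Left and _Right against $t1.
    static void walk(Node node, Node head, Consumer<Node> visitor) {
        if (!node.isNil) {
            visitor.accept(node);
        }
        if (node.left != head) {
            walk(node.left, head, visitor);
        }
        if (node.right != head) {
            walk(node.right, head, visitor);
        }
    }

    // Builds a three-node tree (root 1 with children 0 and 2) and returns keys in visit order.
    static List<Integer> demoKeys() {
        Node head = newHead();
        Node root = newNode(1, head);
        root.left = newNode(0, head);
        root.right = newNode(2, head);
        root.parent = head;
        head.parent = root;   // the walk starts at the head's parent link, like _Myhead->_Parent

        List<Integer> keys = new ArrayList<>();
        walk(head.parent, head, node -> keys.add(node.key));
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(demoKeys());
    }
}
```

In Java the call stack preserves `node` across recursive calls automatically; the .push and .pop commands play that role for the $t0 pseudo-register in the script.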

July 26, 2013

by Sasha Goldshtein

· 4,906 Views

Why I Never Use the Maven Release Plugin

Just about every 6 months or so an article appears cursing Maven, attracting both proponents as opponents to Maven and Ant. While it’s real fun to watch (I really get a laugh when people start to advocate the return to Ant), most of the time it’s always the same arguments. Maven lacks flexibility, the plugin system sucks (when will people learn to use plugin versions…), you can’t use scripting and the all time favorite: the release plugin sucks. Well, I am a Maven addict and I’m happy to say: yes, I agree, the release plugin sucks. Big time. But here’s something you may have forgotten: you don’t need it! Even more: you shouldn’t use it. The Maven release plugin tries to make releasing software a breeze. That’s where the plugin authors got it wrong to start with. Releases are not something done on a whim. They are carefully planned and orchestrated actions, preceded by countless rules and followed by more rules. Assuming you can bundle all that in a simple mvn release:release is just plain naive. Even Maven’s most fierce supporters agree on this. The Maven release plugin just tries to do too much stuff at once: build your software, tag it, build it again, deploy it, build the site (triggering yet another build in the process) and deploy the site. And whilst doing that, running the tests x times. Most of the time, you’re making candidate releases, so building the complete documentation is a complete waste of time. Now, if you break down the release plugin into sensible steps, you’ll really save yourself a whole lot of trouble. I use these steps to release something. As a sidenote: I use git and git-flow standards (as described here). Assume the POM’s version’s currently on 1.0-SNAPSHOT. Announce the release process Very important. As I said, you don’t release on a whim. Make sure everyone on your team knows a release is pending and has all their stuff pushed to the development branch that needs to be included. Branch the development branch into a release branch. 
Following git-flow rules, I make a release branch 1.0. Update the POM version of the development branch. Update the version to the next release version. For example mvn versions:set -DnewVersion=2.0-SNAPSHOT. Commit and push. Now you can put resources developing towards the next release version. Update the POM version of the release branch. Update the version to the standard CR version. For example mvn versions:set -DnewVersion=1.0.CR-SNAPSHOT. Commit and push. Run tests on the release branch. Run all the tests. If one or more fail, fix them first. Create a candidate release from the release branch. Use the Maven version plugin to update your POM’s versions. For example mvn versions:set -DnewVersion=1.0.CR1. Commit and push. Make a tag on git. Use the Maven version plugin to update your POM’s versions back to the standard CR version. For example mvn versions:set -DnewVersion=1.0.CR-SNAPSHOT. Commit and push. Checkout the new tag. Do a deployment build (mvn clean deploy). Since you’ve just run your tests and fixed any failing ones, this shouldn’t fail. Put deployment on QA environment. Iterate until QA gives a green light on the candidate release. Fix bugs. Fix bugs reported on the CR releases on the release branch. Merge into development branch on regular intervals (or even better, continuous). Run tests continuously, making bug reports on failures and fixing them as you go. Create a candidate release. Use the Maven version plugin to update your POM’s versions. For example mvn versions:set -DnewVersion=1.0.CRx. Commit and push. Make a tag on git. Use the Maven version plugin to update your POM’s versions back to the standard CR version. For example mvn versions:set -DnewVersion=1.0.CR-SNAPSHOT. Commit and push. Checkout the new tag. Do a deployment build (mvn clean deploy). Since you’ve run your tests continuously, this shouldn’t fail. Put deployment on QA environment. Once QA has signed off on the release, create a final release. 
Check whether there are no new commits since the last release tag (if there are, slap developers as they have done stuff that wasn’t needed or asked for). Use the Maven version plugin to update your POM’s versions. For example mvn versions:set -DnewVersion=1.0. Commit and push. Tag the release branch. Merge into the master branch. Checkout the master branch. Do a deployment build (mvn clean deploy). Start production release and deployment process (in most companies, not a small feat). This can involve building the site and doing other stuff, some not even Maven related. There’s no way in hell Maven can automate this process and if you try, you’ll bump into the many pitfalls the release plugin has to offer. The release plugin is just a combination of the versions, scm, deploy and site plugin that seriously violates the single responsibility principle. The release plugin is one of the reasons Maven has gotten a bad reputation with some people. It’s long due for an overhaul, but if you ask me, they should just remove it altogether. Releasing software is a process, not a single command on the command line. The process I just described isn’t perfect in any way, but it works and I avoid using the release plugin as it just does too much stuff. Have fun bashing Maven, but please, keep it clean :) .

July 26, 2013

by Lieven Doclo

· 117,903 Views · 13 Likes

Using Morphia to Map Java Objects in MongoDB

MongoDB is an open source document-oriented NoSQL database system which stores data as JSON-like documents with dynamic schemas. As it doesn't store data in tables as is done in the usual relational database setup, it doesn't map well to the JPA way of storing data. Morphia is an open source lightweight type-safe library designed to bridge the gap between the MongoDB Java driver and domain objects. It can be an alternative to SpringData if you're not using the Spring Framework to interact with MongoDB. This post will cover the basics of persisting and querying entities along the lines of JPA by using Morphia and a MongoDB database instance. There are four POJOs this example will be using. First we have BaseEntity which is an abstract class containing the Id and Version fields: package com.city81.mongodb.morphia.entity; import org.bson.types.ObjectId; import com.google.code.morphia.annotations.Id; import com.google.code.morphia.annotations.Property; import com.google.code.morphia.annotations.Version; public abstract class BaseEntity { @Id @Property("id") protected ObjectId id; @Version @Property("version") private Long version; public BaseEntity() { super(); } public ObjectId getId() { return id; } public void setId(ObjectId id) { this.id = id; } public Long getVersion() { return version; } public void setVersion(Long version) { this.version = version; } } Whereas JPA would use @Column to rename the attribute, Morphia uses @Property. Another difference is that @Property needs to be on the variable whereas @Column can be on the variable or the get method. 
The main entity we want to persist is the Customer class: package com.city81.mongodb.morphia.entity; import java.util.List; import com.google.code.morphia.annotations.Embedded; import com.google.code.morphia.annotations.Entity; @Entity public class Customer extends BaseEntity { private String name; private List<Account> accounts; @Embedded private Address address; public String getName() { return name; } public void setName(String name) { this.name = name; } public List<Account> getAccounts() { return accounts; } public void setAccounts(List<Account> accounts) { this.accounts = accounts; } public Address getAddress() { return address; } public void setAddress(Address address) { this.address = address; } } As with JPA, the POJO is annotated with @Entity. The class also shows an example of @Embedded on the address field. The Address class itself is also annotated with @Embedded, as shown below: package com.city81.mongodb.morphia.entity; import com.google.code.morphia.annotations.Embedded; @Embedded public class Address { private String number; private String street; private String town; private String postcode; public String getNumber() { return number; } public void setNumber(String number) { this.number = number; } public String getStreet() { return street; } public void setStreet(String street) { this.street = street; } public String getTown() { return town; } public void setTown(String town) { this.town = town; } public String getPostcode() { return postcode; } public void setPostcode(String postcode) { this.postcode = postcode; } } Finally, we have the Account class, of which the Customer class holds a collection: package com.city81.mongodb.morphia.entity; import com.google.code.morphia.annotations.Entity; @Entity public class Account extends BaseEntity { private String name; public String getName() { return name; } public void setName(String name) { this.name = name; } } The above shows only a small subset of the annotations that can be applied to domain classes. 
More can be found at http://code.google.com/p/morphia/wiki/AllAnnotations The Example class shown below goes through the steps involved in connecting to the MongoDB instance, populating the entities, persisting them and then retrieving them: package com.city81.mongodb.morphia; import java.net.UnknownHostException; import java.util.ArrayList; import java.util.List; import com.city81.mongodb.morphia.entity.Account; import com.city81.mongodb.morphia.entity.Address; import com.city81.mongodb.morphia.entity.Customer; import com.google.code.morphia.Datastore; import com.google.code.morphia.Key; import com.google.code.morphia.Morphia; import com.mongodb.Mongo; import com.mongodb.MongoException; /** * A MongoDB and Morphia Example * */ public class Example { public static void main( String[] args ) throws UnknownHostException, MongoException { String dbName = "bank"; Mongo mongo = new Mongo(); Morphia morphia = new Morphia(); Datastore datastore = morphia.createDatastore(mongo, dbName); morphia.mapPackage("com.city81.mongodb.morphia.entity"); Address address = new Address(); address.setNumber("81"); address.setStreet("Mongo Street"); address.setTown("City"); address.setPostcode("CT81 1DB"); Account account = new Account(); account.setName("Personal Account"); List<Account> accounts = new ArrayList<Account>(); accounts.add(account); Customer customer = new Customer(); customer.setAddress(address); customer.setName("Mr Bank Customer"); customer.setAccounts(accounts); Key<Customer> savedCustomer = datastore.save(customer); System.out.println(savedCustomer.getId()); } } Executing the first few lines will result in the creation of a Datastore. This interface provides the ability to get, delete and save objects in the 'bank' MongoDB database. The mapPackage method call on the morphia object determines which classes are mapped by that instance of Morphia. In this case, that is all classes in the supplied package. 
Other alternatives exist to map classes, including the map method, which takes a single class (this method can be chained, as it returns the morphia object), or passing a Set of classes to the Morphia constructor. After creating instances of the entities, they can be saved by calling save on the datastore instance and can be found using the primary key via the get method. The output from the Example class would look something like the below: 11-Jul-2012 13:20:06 com.google.code.morphia.logging.MorphiaLoggerFactory chooseLoggerFactory INFO: LoggerImplFactory set to com.google.code.morphia.logging.jdk.JDKLoggerFactory 4ffd6f7662109325c6eea24f Mr Bank Customer There are many other methods on the Datastore interface and they can be found along with the other Javadocs at http://morphia.googlecode.com/svn/site/morphia/apidocs/index.html An alternative to using the Datastore directly is to use the built-in DAO support. This can be done by extending the BasicDAO class as shown below for the Customer entity: package com.city81.mongodb.morphia.dao; import org.bson.types.ObjectId; import com.city81.mongodb.morphia.entity.Customer; import com.google.code.morphia.Morphia; import com.google.code.morphia.dao.BasicDAO; import com.mongodb.Mongo; public class CustomerDAO extends BasicDAO<Customer, ObjectId> { public CustomerDAO(Morphia morphia, Mongo mongo, String dbName) { super(mongo, morphia, dbName); } } To then make use of this, the Example class can be changed (and enhanced to show a query and a delete): ... 
CustomerDAO customerDAO = new CustomerDAO(morphia, mongo, dbName); customerDAO.save(customer); Query<Customer> query = datastore.createQuery(Customer.class); query.and( query.criteria("accounts.name").equal("Personal Account"), query.criteria("address.number").equal("81"), query.criteria("name").contains("Bank") ); QueryResults<Customer> retrievedCustomers = customerDAO.find(query); for (Customer retrievedCustomer : retrievedCustomers) { System.out.println(retrievedCustomer.getName()); System.out.println(retrievedCustomer.getAddress().getPostcode()); System.out.println(retrievedCustomer.getAccounts().get(0).getName()); customerDAO.delete(retrievedCustomer); } ... The output from running the above is shown below: 11-Jul-2012 13:30:46 com.google.code.morphia.logging.MorphiaLoggerFactory chooseLoggerFactory INFO: LoggerImplFactory set to com.google.code.morphia.logging.jdk.JDKLoggerFactory Mr Bank Customer CT81 1DB Personal Account This post covers only a few basics of Morphia, but it shows how Morphia can help bridge the gap between JPA and NoSQL.

July 25, 2013

by Geraint Jones

· 76,116 Views

Asynchronous Retry Pattern

When you have a piece of code that often fails and must be retried, this Java 7/8 library provides rich and unobtrusive API with fast and scalable solution to this problem: ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(); RetryExecutor executor = new AsyncRetryExecutor(scheduler). retryOn(SocketException.class). withExponentialBackoff(500, 2). //500ms times 2 after each retry withMaxDelay(10_000). //10 seconds withUniformJitter(). //add between +/- 100 ms randomly withMaxRetries(20); You can now run arbitrary block of code and the library will retry it for you in case it throws SocketException: final CompletableFuture future = executor.getWithRetry(() -> new Socket("localhost", 8080) ); future.thenAccept(socket -> System.out.println("Connected! " + socket) ); Please look carefully! getWithRetry() does not block. It returns CompletableFuture immediately and invokes given function asynchronously. You can listen for that Future or even for multiple futures at once and do other work in the meantime. So what this code does is: trying to connect to localhost:8080 and if it fails with SocketException it will retry after 500 milliseconds (with some random jitter), doubling delay after each retry, but not above 10 seconds. Equivalent but more concise syntax: executor. getWithRetry(() -> new Socket("localhost", 8080)). thenAccept(socket -> System.out.println("Connected! " + socket)); This is a sample output that you might expect: TRACE | Retry 0 failed after 3ms, scheduled next retry in 508ms (Sun Jul 21 21:01:12 CEST 2013) java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0-ea] //... TRACE | Retry 1 failed after 0ms, scheduled next retry in 934ms (Sun Jul 21 21:01:13 CEST 2013) java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0-ea] //... 
TRACE | Retry 2 failed after 0ms, scheduled next retry in 1919ms (Sun Jul 21 21:01:15 CEST 2013) java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0-ea] //... TRACE | Successful after 2 retries, took 0ms and returned: Socket[addr=localhost/127.0.0.1,port=8080,localport=46332] Connected! Socket[addr=localhost/127.0.0.1,port=8080,localport=46332] Imagine you connect to two different systems, one is slow, second unreliable and fails often: CompletableFuture stringFuture = executor.getWithRetry(ctx -> unreliable()); CompletableFuture intFuture = executor.getWithRetry(ctx -> slow()); stringFuture.thenAcceptBoth(intFuture, (String s, Integer i) -> { //both done after some retries }); thenAcceptBoth() callback is executed asynchronously when both slow and unreliable systems finally reply without any failure. Similarly (using CompletableFuture.acceptEither()) you can call two or more unreliable servers asynchronously at the same time and be notified when the first one succeeds after some number of retries. I can’t emphasize this enough - retries are executed asynchronously and effectively use thread pool, rather than sleeping blindly. Rationale Often we are forced to retry given piece of code because it failed and we must try again, typically with a small delay to spare CPU. This requirement is quite common and there are few ready-made generic implementations with retry support in Spring Batch through RetryTemplate class being best known. But there are few other, quite similar approaches ([1], [2]). All of these attempts (and I bet many of you implemented similar tool yourself!) suffer the same issue - they are blocking, thus wasting a lot of resources and not scaling well. This is not bad per se because it makes programming model much simpler - the library takes care of retrying and you simply have to wait for return value longer than usual. 
However, it not only creates a leaky abstraction (a method that is typically very fast suddenly becomes slow due to retries and delays), but also wastes valuable threads, since such a facility spends most of its time sleeping between retries. Therefore the Async-Retry utility was created, targeting Java 8 (with a Java 7 backport available) and addressing the issues above. The main abstraction is RetryExecutor, which provides a simple API: public interface RetryExecutor { CompletableFuture<Void> doWithRetry(RetryRunnable action); <V> CompletableFuture<V> getWithRetry(Callable<V> task); <V> CompletableFuture<V> getWithRetry(RetryCallable<V> task); <V> CompletableFuture<V> getFutureWithRetry(RetryCallable<CompletableFuture<V>> task); } Don’t worry about RetryRunnable and RetryCallable - they allow checked exceptions for your convenience, and most of the time we will use lambda expressions anyway. Please note that every method returns CompletableFuture. We no longer pretend that calling a faulty method is fast. If the library encounters an exception, it will retry our block of code with preconfigured backoff delays. The invocation time will skyrocket from milliseconds to several seconds. CompletableFuture clearly indicates that. Moreover, it’s not the dumb java.util.concurrent.Future we all know - CompletableFuture in Java 8 is very powerful and, most importantly, non-blocking by default. If you need a blocking result after all, just call .get() on the Future object. Basic API The API is very simple. You provide a block of code and the library will run it multiple times until it returns normally rather than throwing an exception. It may also put configurable delays between retries: RetryExecutor executor = //... executor.getWithRetry(() -> new Socket("localhost", 8080)); The returned CompletableFuture will be resolved once connecting to localhost:8080 succeeds. Optionally we can consume RetryContext to get extra context, such as which retry is currently being executed: executor. getWithRetry(ctx -> new Socket("localhost", 8080 + ctx.getRetryCount())).
thenAccept(System.out::println); This code is more clever than it looks. During the first execution ctx.getRetryCount() returns 0, therefore we try to connect to localhost:8080. If this fails, the next retry will try localhost:8081 (8080 + 1) and so on. And once you realize that all of this happens asynchronously, you can scan ports of several machines and be notified about the first responding port on each host: Arrays.asList("host-one", "host-two", "host-three"). stream(). forEach(host -> executor. getWithRetry(ctx -> new Socket(host, 8080 + ctx.getRetryCount())). thenAccept(System.out::println) ); For each host RetryExecutor will attempt to connect to port 8080 and retry with higher ports. getFutureWithRetry() requires special attention. Suppose you want to retry a method that already returns CompletableFuture, e.g. the result of an asynchronous HTTP call: private CompletableFuture<String> asyncHttp(URL url) { /*...*/} //... final CompletableFuture<CompletableFuture<String>> response = executor.getWithRetry(ctx -> asyncHttp(new URL("http://example.com"))); Passing asyncHttp() to getWithRetry() will yield CompletableFuture<CompletableFuture<String>>. Not only is it awkward to work with, it is also broken. The library will merely call asyncHttp() and retry only if that call itself throws, but not if the returned CompletableFuture fails. The solution is simple: final CompletableFuture<String> response = executor.getFutureWithRetry(ctx -> asyncHttp(new URL("http://example.com"))); In this case RetryExecutor will understand that whatever was returned from asyncHttp() is actually just a Future and will (asynchronously) wait for the result or failure. This library is much more powerful, so let’s dive in. Configuration Options In general there are two important factors you can configure: RetryPolicy, which controls whether the next retry attempt should be made, and Backoff, which optionally adds a delay between subsequent retry attempts. By default RetryExecutor repeats the user task infinitely on every Throwable and adds a 1 second delay between retry attempts.
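To illustrate what "retrying asynchronously rather than sleeping" can look like, here is a minimal sketch built only on JDK classes. It is not the library's implementation; `TinyRetry` and its `getWithRetry` signature (with explicit `maxRetries` and `delayMillis` parameters) are assumptions made for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TinyRetry {

    // Returns immediately; the task runs on the scheduler and is rescheduled after each failure.
    static <V> CompletableFuture<V> getWithRetry(ScheduledExecutorService scheduler,
                                                 Callable<V> task,
                                                 int maxRetries,
                                                 long delayMillis) {
        CompletableFuture<V> result = new CompletableFuture<>();
        scheduler.execute(() -> attempt(scheduler, task, maxRetries, delayMillis, 0, result));
        return result;
    }

    private static <V> void attempt(ScheduledExecutorService scheduler,
                                    Callable<V> task,
                                    int maxRetries,
                                    long delayMillis,
                                    int retry,
                                    CompletableFuture<V> result) {
        try {
            result.complete(task.call());
        } catch (Exception e) {
            if (retry >= maxRetries) {
                // Out of retries: surface the last failure to whoever holds the future.
                result.completeExceptionally(e);
            } else {
                // No thread sleeps here; the next attempt is simply scheduled for later.
                scheduler.schedule(
                        () -> attempt(scheduler, task, maxRetries, delayMillis, retry + 1, result),
                        delayMillis, TimeUnit.MILLISECONDS);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger calls = new AtomicInteger();
        try {
            CompletableFuture<String> future = getWithRetry(scheduler, () -> {
                // Fails on the first two calls, succeeds on the third.
                if (calls.incrementAndGet() < 3) {
                    throw new IllegalStateException("not yet");
                }
                return "succeeded on call " + calls.get();
            }, 5, 10);
            // Blocking here is only for the demo; normally you would chain thenAccept().
            System.out.println(future.get(5, TimeUnit.SECONDS));
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

The sketch uses a fixed delay and a retry limit to stay short; the retry policies and backoff strategies discussed in this article are exactly the parts it leaves out.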
Creating an Instance of RetryExecutor Default implementation of RetryExecutor is AsyncRetryExecutor which you can create directly: ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(); RetryExecutor executor = new AsyncRetryExecutor(scheduler); //... scheduler.shutdownNow(); The only required dependency is standard ScheduledExecutorService from JDK. One thread is enough in many cases but if you want to concurrently handle retries of hundreds or more tasks, consider increasing the pool size. Notice that the AsyncRetryExecutor does not take care of shutting down the ScheduledExecutorService. This is a conscious design decision which will be explained later. AsyncRetryExecutor has few other constructors but most of the time altering the behaviour of this class is most convenient with calling chained with*() methods. You will see plenty of examples written this way. Later on we will simply use executor reference without defining it. Assume it’s of RetryExecutor type. Retrying Policy Exceptions By default every Throwable (except special AbortRetryException) thrown from user task causes retry. Obviously this is configurable. For example in JPA you may want to retry a transaction that failed due to OptimisticLockException - but every other exception should fail immediately: executor. retryOn(OptimisticLockException.class). withNoDelay(). getWithRetry(ctx -> dao.optimistic()); Where dao.optimistic() may throw OptimisticLockException. In that case you probably don’t want any delay between retries, more on that later. If you don’t like the default of retrying on every Throwable, just restrict that using retryOn(): executor.retryOn(Exception.class) Of course the opposite might also be desired - to abort retrying and fail immediately in case of certain exception being thrown rather than retrying. It’s that simple: executor. abortOn(NullPointerException.class). abortOn(IllegalArgumentException.class). 
getWithRetry(ctx -> dao.optimistic()); Clearly you don’t want to retry on NullPointerException or IllegalArgumentException, as they indicate a programming bug rather than a transient failure. And finally you can combine both retry and abort policies. User code will be retried in case of any retryOn() exception (or subclass) unless the exception matches one given to abortOn(). For example, we want to retry on every IOException or SQLException but abort if FileNotFoundException or java.sql.DataTruncation is encountered (order is irrelevant): executor. retryOn(IOException.class). abortOn(FileNotFoundException.class). retryOn(SQLException.class). abortOn(DataTruncation.class). getWithRetry(ctx -> dao.load(42)); If this is not enough, you can provide a custom predicate that will be invoked on each failure: executor. abortIf(throwable -> throwable instanceof SQLException && throwable.getMessage().contains("ORA-00911") ); Max Number of Retries Another way of interrupting the retrying “loop” (remember that this process is asynchronous, there is no blocking loop) is by specifying the maximum number of retries: executor.withMaxRetries(5) In rare cases you may want to disable retries and merely take advantage of asynchronous execution. In that case try: executor.dontRetry() Delays Between Retries (backoff) Retrying immediately after failure is sometimes desired (see the OptimisticLockException example) but in most cases it’s a bad idea. If you can’t connect to an external system, waiting a little bit before the next attempt sounds reasonable. You save CPU, bandwidth and the other server’s resources. But there are quite a few options to consider: should we retry at constant intervals or increase the delay after each failure? Should there be a lower and upper limit on waiting time? Should we add random “jitter” to delay times to spread retries of many tasks in time? This library answers all these questions. Fixed Interval Between Retries By default each retry is preceded by 1 second waiting time.
So if the initial attempt fails, the first retry will be executed after 1 second. Of course we can change that default, e.g. to 200 milliseconds: executor.withFixedBackoff(200) While we are here: by default the backoff is applied after executing the user task. If the user task itself consumes some time, retries will be less frequent. For example, with a retry delay of 200ms and an average of about 50ms before the user task fails, RetryExecutor will retry about 4 times per second (one attempt every 50ms + 200ms = 250ms). However, if you want to keep the retry frequency at a more predictable level, you can use the fixedRate flag: executor. withFixedBackoff(200). withFixedRate() This is similar to the “fixed rate” vs. “fixed delay” approaches in ScheduledExecutorService. BTW don’t expect RetryExecutor to be very precise; it does its best, but it heavily depends on the accuracy of the aforementioned ScheduledExecutorService. Exponentially Growing Intervals Between Retries It’s probably an active research subject, but in general you may wish to expand the retry delay over time, assuming that if the user task fails several times we should try less frequently. For example, let’s say we start with a 100ms delay before the first retry attempt, but if that one fails as well, we should wait twice as long (200ms). And later 400ms, 800ms… You get the idea: executor.withExponentialBackoff(100, 2) This is an exponential function that can grow very fast. Thus it’s useful to set the maximum backoff time at some reasonable level, e.g. 10 seconds: executor. withExponentialBackoff(100, 2). withMaxDelay(10_000) //10 seconds Random Jitter One phenomenon often observed during major outages is that systems tend to synchronize. Imagine a busy system that suddenly stops responding. Hundreds or thousands of requests fail and are retried. It depends on your backoff, but by default all these requests will retry exactly after one second, producing a huge wave of traffic at one point in time.
Finally such failures are propagated to other systems that, in turn, synchronize as well. To avoid this problem it’s useful to spread retries over time, flattening the load. A simple solution is to add random jitter to the delay time so that not all requests are scheduled for retry at the exact same time. You have a choice between uniform jitter (a random value from -100ms to 100ms):

    executor.withUniformJitter(100)     //ms

…and proportional jitter, multiplying the delay time by a random factor, by default between 0.9 and 1.1 (10%):

    executor.withProportionalJitter(0.1)        //10%

You may also put a hard lower limit on the delay time to avoid scheduling retries that are too short:

    executor.withMinDelay(50)   //ms

Implementation Details

This library was built with Java 8 in mind to take advantage of lambdas and the new CompletableFuture abstraction (but a Java 7 port with a Guava dependency exists). It uses ScheduledExecutorService underneath to run tasks and schedule retries in the future, which allows the best thread utilization. But what is really interesting is that the whole library is fully immutable: there is not a single mutable field, at all. This might be counter-intuitive at first. Take, for example, this trivial code sample:

    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    AsyncRetryExecutor first = new AsyncRetryExecutor(scheduler).
        retryOn(Exception.class).
        withExponentialBackoff(500, 2);
    AsyncRetryExecutor second = first.abortOn(FileNotFoundException.class);
    AsyncRetryExecutor third = second.withMaxRetries(10);

It might seem that all with*() methods or retryOn()/abortOn() mutate the existing executor. But that’s not the case: each configuration change creates a new instance, leaving the old one untouched. So, for example, while the first executor will retry on FileNotFoundException, the second and third won’t. However, they all share the same scheduler. This is the reason why AsyncRetryExecutor does not shut down the ScheduledExecutorService (it doesn’t even have a close() method).
Since we have no idea how many copies of AsyncRetryExecutor exist pointing to the same scheduler, we don’t even try to manage its lifecycle. However, this is typically not a problem (see Spring integration below). You might be wondering, why such an awkward design decision? There are three reasons:

First, when writing concurrent code, immutability can greatly reduce the risk of multi-threading bugs. For example, RetryContext holds the number of retries. But instead of mutating it we simply create a new instance (copy) with an incremented but final counter. No race condition or visibility problem can ever occur.

Second, if you are given an existing RetryExecutor which is almost exactly what you want but needs one minor tweak, you simply call executor.with...() and get a fresh copy. You don’t have to worry about other places where the same executor was used (see: Spring integration for further examples).

Third, functional programming and immutable data structures are sexy these days ;-).

N.B.: AsyncRetryExecutor is not marked final, so you can break immutability by subclassing it and adding mutable state. Please don’t do this; subclassing is only permitted to alter behaviour.

Dependencies

This library requires Java 8 and SLF4J for logging. The Java 7 port additionally depends on Guava.

Spring Integration

If you are just about to use RetryExecutor in Spring, feel free, but the configuration API might not work for you. Spring promotes (or used to promote) the convention of mutable services with plenty of setters. In XML you define a bean and invoke setters (via <property> elements) on it. This convention assumes the existence of mutating setters. But I found this approach error-prone and counter-intuitive under some circumstances. Let’s say we globally defined an org.springframework.transaction.support.TransactionTemplate bean and injected it in multiple places. Great.
Now there is this one single request that requires a slightly different timeout:

    @Autowired
    private TransactionTemplate template;

and later in the same class:

    final int oldTimeout = template.getTimeout();
    template.setTimeout(10_000);
    //do the work
    template.setTimeout(oldTimeout);

This code is wrong on so many levels! First of all, if something fails we never restore oldTimeout. OK, finally to the rescue. But also notice how we changed the global, shared TransactionTemplate instance. Who knows how many other beans and threads are just about to use it, unaware of the changed configuration? And even if you do want to change the transaction timeout globally, fair enough, but this is still the wrong way to do it. The private timeout field is not volatile, and thus changes made to it may or may not be visible to other threads. What a mess! The same problem appears with many other classes like JmsTemplate. You see where I’m going? Just create one immutable service class and safely adjust it by creating copies whenever you need to. And using such services is equally simple these days:

    @Configuration
    class Beans {

        @Bean
        public RetryExecutor retryExecutor() {
            return new AsyncRetryExecutor(scheduler()).
                retryOn(SocketException.class).
                withExponentialBackoff(500, 2);
        }

        @Bean(destroyMethod = "shutdownNow")
        public ScheduledExecutorService scheduler() {
            return Executors.newSingleThreadScheduledExecutor();
        }

    }

Hey! It’s the 21st century, we don’t need XML in Spring any more. Bootstrap is simple as well:

    final ApplicationContext context = new AnnotationConfigApplicationContext(Beans.class);
    final RetryExecutor executor = context.getBean(RetryExecutor.class);
    //...
    context.close();

As you can see, integrating modern, immutable services with Spring is just as simple. BTW if you are not prepared for such a big change when designing your own services, at least consider constructor injection.

Maturity

This library is covered with a strong battery of unit tests.
However, it has not yet been used in any production code and the API is subject to change. Of course you are encouraged to submit bugs, feature requests and pull requests. It was developed with Java 8 in mind, but a Java 7 backport exists with a slightly more verbose API and a mandatory Guava dependency (ListenableFuture instead of CompletableFuture from Java 8). Full source code on GitHub.
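To make the backoff options above more concrete, here is a minimal sketch (in Python, not the library's Java code) of how the delay before each retry could be computed from an initial delay, a multiplier, an optional cap, a minimum delay and uniform jitter. The function name backoff_delays, its parameters, and the order in which jitter and limits are applied are assumptions made for illustration only; the library may combine them differently.

```python
import random


def backoff_delays(initial_ms, multiplier, retries,
                   max_delay_ms=None, min_delay_ms=0,
                   uniform_jitter_ms=0, rng=None):
    """Return the wait (in ms) before each retry.

    Hypothetical helper: exponential growth first, then optional
    uniform jitter, then clamping to [min_delay_ms, max_delay_ms].
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(retries):
        # 100, 200, 400, ... for initial_ms=100 and multiplier=2
        delay = initial_ms * multiplier ** attempt
        if uniform_jitter_ms:
            # spread retries of many tasks over time
            delay += rng.uniform(-uniform_jitter_ms, uniform_jitter_ms)
        if max_delay_ms is not None:
            delay = min(delay, max_delay_ms)
        delay = max(delay, min_delay_ms)
        delays.append(delay)
    return delays


# Exponential backoff starting at 100ms, doubling, capped at 10 seconds
print(backoff_delays(100, 2, 8, max_delay_ms=10_000))
# [100, 200, 400, 800, 1600, 3200, 6400, 10000]
```

Without jitter the sequence is fully deterministic, which makes the effect of the cap easy to see: the eighth delay would be 12800ms but is limited to 10000ms.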

July 24, 2013

by Tomasz Nurkiewicz

· 77,021 Views · 2 Likes

Algorithm of the Week: Spatial Indexing with Quadtrees and Hilbert Curves

some time ago at oredev, after the sessions, there was "birds of a feather" - a sort of mini-unconference. anyone could write up a topic on the whiteboard; interested individuals added their names, and each group got allocated a room to chat about the topic. i joined the "spatial indexing" group, and we spent a fascinating hour and a half talking about spatial indexing methods, reminding me of several interesting algorithms and techniques. spatial indexing is increasingly important as more and more data and applications are geospatially-enabled. efficiently querying geospatial data, however, is a considerable challenge: because the data is two-dimensional (or sometimes, more), you can't use standard indexing techniques to query on position. spatial indexes solve this through a variety of techniques. in this post, we'll cover several - quadtrees , geohashes (not to be confused with geohashing ), and space-filling curves - and reveal how they're all interrelated. quadtrees quadtrees are a very straightforward spatial indexing technique. in a quadtree, each node represents a bounding box covering some part of the space being indexed, with the root node covering the entire area. each node is either a leaf node - in which case it contains one or more indexed points, and no children, or it is an internal node, in which case it has exactly four children, one for each quadrant obtained by dividing the area covered in half along both axes - hence the name. a representation of how a quadtree divides an indexed area. source: wikipedia inserting data into a quadtree is simple: starting at the root, determine which quadrant your point occupies. recurse to that node and repeat, until you find a leaf node. then, add your point to that node's list of points. if the list exceeds some pre-determined maximum number of elements, split the node, and move the points into the correct subnodes. a representation of how a quadtree is structured internally. 
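as a rough sketch of the insertion steps described above (descend to a leaf, append the point, split when the leaf is over capacity), something like the following would do. the class name QuadTree, the max_points and max_depth parameters, and the child ordering are assumptions made for this illustration, not part of any particular library; max_depth is only there so that many identical points cannot cause endless splitting.

```python
class QuadTree:
    """Minimal point quadtree: a node is a leaf until it overflows."""

    def __init__(self, x0, y0, x1, y1, max_points=4, max_depth=16, depth=0):
        self.bounds = (x0, y0, x1, y1)
        self.max_points = max_points
        self.max_depth = max_depth
        self.depth = depth
        self.points = []      # only used while this node is a leaf
        self.children = None  # list of four nodes once split

    def _child_for(self, x, y):
        # Pick the quadrant by comparing against the midpoint of each axis.
        x0, y0, x1, y1 = self.bounds
        mid_x, mid_y = (x0 + x1) / 2, (y0 + y1) / 2
        index = (1 if x >= mid_x else 0) + (2 if y >= mid_y else 0)
        return self.children[index]

    def insert(self, x, y):
        node = self
        while node.children is not None:   # walk down to a leaf
            node = node._child_for(x, y)
        node.points.append((x, y))
        if len(node.points) > node.max_points and node.depth < node.max_depth:
            node._split()

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mid_x, mid_y = (x0 + x1) / 2, (y0 + y1) / 2
        args = (self.max_points, self.max_depth, self.depth + 1)
        self.children = [
            QuadTree(x0, y0, mid_x, mid_y, *args),
            QuadTree(mid_x, y0, x1, mid_y, *args),
            QuadTree(x0, mid_y, mid_x, y1, *args),
            QuadTree(mid_x, mid_y, x1, y1, *args),
        ]
        # Move the points of the former leaf into the new subnodes.
        points, self.points = self.points, []
        for px, py in points:
            self._child_for(px, py).insert(px, py)


tree = QuadTree(0, 0, 8, 8, max_points=4)
for point in [(1, 1), (6, 1), (1, 6), (6, 6), (7, 7)]:
    tree.insert(*point)
```

with a capacity of four, the fifth insert splits the root, and each point ends up in the child covering its quadrant.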
to query a quadtree, starting at the root, examine each child node, and check if it intersects the area being queried for. if it does, recurse into that child node. whenever you encounter a leaf node, examine each entry to see if it intersects with the query area, and return it if it does. note that a quadtree is very regular - it is, in fact, a trie , since the values of the tree nodes do not depend on the data being inserted. a consequence of this is that we can uniquely number our nodes in a straightforward manner: simply number each quadrant in binary (00 for the top left, 10 for the top right, and so forth), and the number for a node is the concatenation of the quadrant numbers for each of its ancestors, starting at the root. using this system, the bottom right node in the sample image would be numbered 11 01. if we define a maximum depth for our tree, then, we can calculate a point's node number without reference to the tree - simply normalize the point's coordinates to an appropriate integer range (for example, 32 bits each), and then interleave the bits from the x and y coordinates - each pair of bits specifies a quadrant in the hypothetical quadtree.

geohashes

this system might seem familiar: it's a geohash ! at this point, you can actually throw out the quadtree itself - the node number, or geohash, contains all the information we need about its location in the tree. each leaf node in a full-height tree is a complete geohash, and each internal node is represented by the range from its smallest leaf node to its largest one. thus, you can efficiently locate all the points under any internal node by indexing on the geohash and performing a query for everything within the numeric range covered by the desired node. querying once we've thrown away the tree itself becomes a little more complex. instead of refining our search set recursively, we need to construct a search set ahead of time.
first, find the smallest prefix (or quadtree node) that completely covers the query area. in the worst case, this may be substantially larger than the actual query area - for example, a small shape in the center of the indexed area that intersects all four quadrants would require selecting the root node for this step. the aim, now, is to construct a set of prefixes that completely covers the query region, while including as little area outside the region as possible. if we had no other constraints, we could simply select the set of leaf nodes that intersect the query area - but that would result in a lot of queries. another constraint, then, is that we want to minimise the number of distinct ranges we have to query for. one approach to doing this is to start by setting a maximum number of ranges we're willing to have. construct a set of ranges, initially populated with the prefix we identified earlier. pick the node in the set that can be subdivided without exceeding the maximum range count and will remove the most unwanted area from the query region. repeat this until there are no ranges in the set that can be further subdivided. finally, examine the resulting set, and join any adjacent ranges, if possible. the diagram below demonstrates how this works for a query on a circular area with a limit of 5 query ranges. how a query for a region is broken into a series of geohash prefixes/ranges. this approach works well, and it allows us to avoid the need to do recursive lookups - the set of range lookups we do execute can all be done in parallel. since each lookup can be expected to require a disk seek, parallelizing our queries allows us to substantially cut down the time required to return the results. still, we can do better. you may notice that all the areas we need to query in the above diagram are adjacent, yet we can only merge two of them (the two in the bottom right of the selected area) into a single range query, requiring us to do 4 separate queries. 
this is due in part to the order that our geohashing approach 'visits' subregions, working left to right, then top to bottom in each quad. the discontinuity as we go from top right to bottom left quad results in us having to split up some ranges that we could otherwise make contiguous. if we were to visit regions in a different order, perhaps we could minimise or eliminate these discontinuities, resulting in more areas that can be treated as adjacent and fetched with a single query. with an improvement in efficiency like that, we could do fewer queries for the same area covered, or conversely, the same number of queries, but including less extraneous area. illustrates the order in which the geohashing approach 'visits' each quad. hilbert curves suppose instead, we visit regions in a 'u' shape. within each quad, of course, we also visit subquads in the same 'u' shape, but aligned so as to match up with neighbouring quads. if we organise the orientation of these 'u's correctly, we can completely eliminate any discontinuities, and visit the entire area at whatever resolution we choose continuously, fully exploring each region before moving on to the next. not only does this eliminate discontinuities, but it also improves the overall locality. the pattern we get if we do this may look familiar - it's a hilbert curve. hilbert curves are part of a class of one-dimensional fractals known as space-filling curves , so named because they are one dimensional lines that nevertheless fill all available space in a fixed area. they're fairly well known, in part thanks to xkcd's use of them for a map of the internet . as you can see, they're also of use for spatial indexing, since they exhibit exactly the locality and continuity required. 
for example, if we take another look at the example we used for finding the set of queries required to encompass a circle above, we find that we can reduce the number of queries by one: the small region in the lower left is now contiguous with the region to its right, and whilst the two regions at the bottom are no longer contiguous with each other, the rightmost one is now contiguous with the large area in the upper right.

illustrates the order in which a hilbert curve 'visits' each quad.

one thing that our elegant new system is lacking, so far, is a way of converting between a pair of (x,y) coordinates and the corresponding position in the hilbert curve. with geohashing it was easy and obvious - just interleave the x and y coordinates - but there's no obvious way to modify that for a hilbert curve. searching the internet, you're likely to come across many descriptions of how hilbert curves are drawn, but few if any descriptions of how to find the position of an arbitrary point. to figure this out, we need to take a closer look at how the hilbert curve can be recursively constructed. the first thing to observe is that although most references to hilbert curves focus on how to draw the curve, this is a distraction from the essential property of the curve, and its importance to us: it's an ordering for points on a plane. if we express a hilbert curve in terms of this ordering, drawing the curve itself becomes trivial - simply a matter of connecting the dots. forget about how to connect adjacent sub-curves, and instead focus on how we can recursively enumerate the points.

hilbert curves are all about ordering a set of points on a 2d plane

at the root level, enumerating the points is simple: pick a direction and a start point, and proceed around the four quadrants, numbering them 0 to 3. the difficulty is introduced when we want to determine the order we visit the sub-quadrants in while maintaining the overall adjacency property.
examination reveals that each of the sub-quadrants' curves is a simple transformation of the original curve: there are only four possible transformations. naturally, this applies recursively to sub-sub quadrants, and so forth. the curve we use for a given quadrant is determined by the curve we used for the square it's in, and the quadrant's position. with a little work, we can construct a table that encapsulates this:

suppose we want to use this table to determine the position of a point on a third-level hilbert curve. for the sake of this example, assume our point has coordinates (5,2). starting with the first square on the diagram, find the quadrant your point is in - in this case, it's the upper right quadrant. the first part of our hilbert curve position, then, is 3 (11 in binary). next, we consult the square shown in the inset of square 3 - in this case, it's the second square. repeat the process: which sub-quadrant does our point fall into? here, it's the lower left one, meaning the next part of our position is 1, and the square we should consult next is the second one again. repeating the process one final time, we find our point falls in the upper right sub-sub-quadrant, so our final coordinate is 3 (11 in binary). stringing them together, we now know the position of our point on the curve is 110111 binary, or 55. let's be a little more methodical, and write methods to convert between x,y coordinates and hilbert curve positions. first, we need to express our diagram above in terms a computer can understand:

    hilbert_map = {
        'a': {(0, 0): (0, 'd'), (0, 1): (1, 'a'), (1, 0): (3, 'b'), (1, 1): (2, 'a')},
        'b': {(0, 0): (2, 'b'), (0, 1): (1, 'b'), (1, 0): (3, 'a'), (1, 1): (0, 'c')},
        'c': {(0, 0): (2, 'c'), (0, 1): (3, 'd'), (1, 0): (1, 'c'), (1, 1): (0, 'b')},
        'd': {(0, 0): (0, 'a'), (0, 1): (3, 'c'), (1, 0): (1, 'd'), (1, 1): (2, 'd')},
    }

in the snippet above, each element of 'hilbert_map' corresponds to one of the four squares in the diagram above.
to make things easier to follow, i've identified each one with a letter - 'a' is the first square, 'b' the second, and so forth. the value for each square is a dict, mapping x and y coordinates for the (sub-)quadrant to the position along the line (the first part of the value tuple) and the square to use next (the second part of the value tuple). here's how we can use this to translate x and y coordinates into a hilbert curve position:

    def point_to_hilbert(x, y, order=16):
        current_square = 'a'
        position = 0
        # walk the coordinate bits from most to least significant
        for i in range(order - 1, -1, -1):
            position <<= 2
            quad_x = 1 if x & (1 << i) else 0
            quad_y = 1 if y & (1 << i) else 0
            quad_position, current_square = hilbert_map[current_square][(quad_x, quad_y)]
            position |= quad_position
        return position

the input to this function is the integer x and y coordinates, and the order of the curve. an order 1 curve fills a 2x2 grid, an order 2 curve fills a 4x4 grid, and so forth. our x and y coordinates, then, should be normalized to a range of 0 to 2^order - 1. the function works by stepping over each bit of the x and y coordinates, starting with the most significant. for each, it determines which (sub-)quadrant the coordinate lies in, by testing the corresponding bit, then fetches the position along the line and the next square to use from the table we defined earlier. the curve position is set as the least significant 2 bits on the position variable, and at the beginning of the next loop, it's left-shifted to make room for the next set of coordinates. let's check that we've written the function correctly by running our example from above through it:

    >>> point_to_hilbert(5, 2, 3)
    55

presto! for a further test, we can use the function to generate a complete list of ordered points for a hilbert curve, then use a spreadsheet to graph them and see if we get a hilbert curve.
enter the following expressions into an interactive python interpreter:

    >>> points = [(x, y) for x in range(8) for y in range(8)]
    >>> sorted_points = sorted(points, key=lambda k: point_to_hilbert(k[0], k[1], 3))
    >>> print('\n'.join('%s,%s' % x for x in sorted_points))

take the resulting text, paste it into a file called 'hilbert.csv', open it in your favorite spreadsheet, and instruct it to generate a scatter plot. the result is, of course, a nicely plotted hilbert curve! the inverse of point_to_hilbert is a straightforward reversal of the hilbert_map; implementing it is left as an exercise for the reader.

conclusion

there you have it - spatial indexing, from quadtrees to geohashes to hilbert curves. one final observation: if you express the ordered sequence of x,y coordinates required to draw a hilbert curve in binary, do you notice anything interesting about the ordering? does it remind you of anything? just to wrap up, a caveat: all of the indexing methods i've described today are only well-suited to indexing points. if you want to index lines, polylines, or polygons, you're probably out of luck with these methods - and so far as i'm aware, the only known algorithm for effectively indexing shapes is the r-tree , an entirely different and more complex beast.
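for readers who would rather check their answer to that exercise, here is one possible sketch of the inverse. it repeats hilbert_map and point_to_hilbert from the article so that it runs on its own; the names inverse_map and hilbert_to_point are introduced here for illustration and are not from the original post.

```python
hilbert_map = {
    'a': {(0, 0): (0, 'd'), (0, 1): (1, 'a'), (1, 0): (3, 'b'), (1, 1): (2, 'a')},
    'b': {(0, 0): (2, 'b'), (0, 1): (1, 'b'), (1, 0): (3, 'a'), (1, 1): (0, 'c')},
    'c': {(0, 0): (2, 'c'), (0, 1): (3, 'd'), (1, 0): (1, 'c'), (1, 1): (0, 'b')},
    'd': {(0, 0): (0, 'a'), (0, 1): (3, 'c'), (1, 0): (1, 'd'), (1, 1): (2, 'd')},
}

# Reverse each square's table: position along the line -> (quadrant, next square).
inverse_map = {
    square: {pos: (quad, nxt) for quad, (pos, nxt) in quads.items()}
    for square, quads in hilbert_map.items()
}


def point_to_hilbert(x, y, order=16):
    current_square = 'a'
    position = 0
    for i in range(order - 1, -1, -1):
        position <<= 2
        quad_x = 1 if x & (1 << i) else 0
        quad_y = 1 if y & (1 << i) else 0
        quad_position, current_square = hilbert_map[current_square][(quad_x, quad_y)]
        position |= quad_position
    return position


def hilbert_to_point(position, order=16):
    current_square = 'a'
    x = y = 0
    for i in range(order - 1, -1, -1):
        # Take the next two bits of the position, most significant pair first.
        quad_position = (position >> (2 * i)) & 3
        (quad_x, quad_y), current_square = inverse_map[current_square][quad_position]
        # Each pair contributes one bit to x and one bit to y.
        x |= quad_x << i
        y |= quad_y << i
    return x, y


print(hilbert_to_point(55, 3))  # (5, 2), the worked example from above
```

converting every point of the 8x8 grid to a position and back is a quick way to convince yourself that the two functions agree.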

July 23, 2013

by Nick Johnson

· 43,710 Views

Converting Java Objects to Byte Array, JSON and XML

Quick reference for converting Java objects to various formats (byte array, JSON, XML) and back, using different libraries for serialization and deserialization.

July 22, 2013

by Faheem Sohail

· 107,031 Views

Log4j 2: Performance close to insane

Recently a respected member of the Apache community tried Log4j 2 and wrote on Twitter: (Quote from Mark Struberg: @TheASF #log4j2 rocks big times! Performance is close to insane ^^ http://logging.apache.org/log4j/2.x/ ) It happened shortly after Remko Popma contributed something which is now called the “AsyncLoggers”. Some of you might know Log4j 2 has AsyncAppenders already. They are similar to the ones you can find in Log4j 1 and other logging frameworks. To be honest, I wasn’t so excited about the new feature until I read the tweet on its performance and became curious.

Clearly Java logging has many goals. Among them: logging must be as fast as hell. Nobody wants his logging framework to become a bottleneck. Of course you’ll always have a cost when logging. There is some operation the CPU must perform. Something is happening, even when you decide NOT to write a log statement. Logging is expected to be invisible. Until now, the well-known logging frameworks were similar in speed. Benchmarks are unreliable after all. We have run some benchmarks over at Apache Logging. Sometimes one logging framework wins, sometimes the other. But at the end of the day you can say they are all very good and you can choose whichever you like. Until we got Remko’s contribution and Log4j 2 became “insanely fast”.

Small software projects running one thread might not care about performance so much. When running a SaaS you simply don’t know when your app will get so much attention that you need to scale. Then you suddenly need some extra power. With Log4j 2, running 64 threads might bring you twelve times more logging throughput than with comparable frameworks. We speak of more than 18,000,000 messages per second, while others do around 1,500,000 or less in the same environment. I saw the chart, but simply couldn’t believe it. There must be something wrong. I rechecked. I ran the tests myself. It’s like that: Log4j 2 is insanely fast.
Async Performance, last read on July 19, 2013

As of now, we have a logging framework which performs a lot better than every other logging framework out there. As of now we need to justify our decision when we do not want to use Log4j 2, if speed matters. Anything other than Log4j 2 can become a bottleneck and a risk. With such a fast logging framework you might even consider logging a bit more in production than you did before. Eventually I wrote Remko an e-mail and asked him what exactly the difference between the old AsyncAppenders and the new Asynchronous Loggers is.

The difference between old AsyncAppenders and new AsyncLoggers

“The Asynchronous Loggers do two things differently than the AsyncAppender”, he told me, “they try to do the minimum amount of work before handing off the log message to another thread, and they use a different mechanism to pass information between the producer and consumer threads. AsyncAppender uses an ArrayBlockingQueue to hand off the messages to the thread that writes to disk, and Asynchronous Loggers use the LMAX Disruptor library. Especially the Disruptor has made a large performance difference.”

In other words, the AsyncAppender uses a first-in-first-out queue to work through messages. But the Async Logger uses something new: the Disruptor. To be honest, I had never heard of it. And furthermore, I never thought much about scaling my logging framework. When somebody said “scale the system”, I thought about the database, the app server and much more, but usually not logging. In production, logging was off. End of story. But Remko thinks about scaling when it comes to logging.

“Looking at the performance test results for the Asynchronous Loggers, the first thing you notice is that some ways of logging scale much better than others. By scaling better I mean that you get more throughput when you add more threads. If your throughput increases a constant amount with every thread you add, you have linear scalability.
This is very desirable but can be difficult to achieve”, he wrote me. “Comparing synchronous to asynchronous, you would expect any asynchronous mechanism to scale much better than synchronous logging because you don’t do the I/O in the producing thread any more, and we all know that ‘I/O is slow’ (and I’ll get back to this in a bit)”.

Yes, exactly my understanding. I thought it would be enough to send something to a queue, and something else would pick it up and write the message. The app would go on. This is exactly what the old AsyncAppender does, wrote Remko: “With AsyncAppender, all your application thread needs to do is create a LogEvent object and put it on the ArrayBlockingQueue; the consuming thread will then take these events off the queue and do all the time-consuming work. That is, the work of turning the event into bytes and writing these bytes to the I/O device. Since the application threads do not need to do the I/O, you would expect this to scale better, meaning adding threads will allow you to log more events.”

If you believed that like me, take a seat and a deep breath. We were wrong. “What may surprise you is that this is not the case”, he wrote. “If you look at the performance numbers for the AsyncAppenders of all logging frameworks, you’ll see that every time you double the number of threads, your throughput per thread roughly halves.” “So your total throughput remains more or less flat! AsyncAppenders are faster than synchronous logging, but they are similar in the sense that neither of them gives you more total throughput when you add more threads”, he told me.

It hit me like a hammer. Instead of making your logging faster by adding more threads, you achieved basically nothing. After all, appenders didn’t scale until now. I asked Remko why this was the case. “It turns out that queues are not the most optimal data structure to pass information between threads.
The concurrent queues that are part of the standard Java libraries use locks to make sure that values don’t get corrupted and to ensure data visibility between threads.”

LMAX Disruptor?

“The LMAX team did a lot of research on this and found that these queues have a lot of lock contention. An interesting thing they found is that queues are always either full or empty: If your producer is faster, your queue will be full most of the time (and that may be a problem in itself). If your consumer is fast enough, your queue will be empty most of the time. Either way, you will have contention on the head or on the tail of the queue, where both the producer and the consumer thread want to update the same field. To resolve this, the LMAX team came up with the Disruptor library, which is a lock-free data structure for passing messages between threads. Here is a performance comparison between the Disruptor and ArrayBlockingQueue: Performance Comparison.”

Wow. After all these years of Java programming I actually felt a bit like a junior programmer again. I had missed the LMAX Disruptor and had never even considered using a queue to be a performance problem. I wonder what other performance problems I have not discovered so far. I realized I had to re-learn Java. I asked Remko how he came to find a library like the LMAX Disruptor. I mean, nobody writes software, creates an instance of a Queue class, doubts its performance and finally searches the internet for “something better”. Or are there really people of that kind?

“How I found out about the Disruptor? The short answer is, it was all a mistake”, he started. “Okay, perhaps that was a bit too short, so here is the longer answer: a colleague of mine wrote a small logger, essentially adding a time-stamped log message to a queue, with a background thread that took these strings off the queue and wrote them to disk. He did this because he needed better performance than what he could get with log4j-1.x.
I did some testing and found it was faster, I don’t remember exactly by how much. I was quite surprised because I had been using log4j for years and had never thought it would be easily outperformed. Until then I had assumed that the well-known libraries would be fast, because, well… To be honest, I had just assumed. So this was a bit of an eye-opener for me. However, the custom logger was a bit bare-bones in terms of functionality so I started to look around for alternatives.”

“Before I start talking about the Disruptor, I have to confess something. I recently went back to see how much faster the custom logger was than log4j-1.x, but when I measured it, it was actually slower! It turned out that I had been comparing the custom logger to an old beta of log4j-2.0, I think beta3 or beta4. AsyncAppender in those betas still had a performance issue (LOG4J2-153 if you’re curious). If I had compared the custom logger to the AsyncAppender in log4j-1.x, I would have found that log4j-1.x was faster and I would not have thought about it further. But because I made this mistake I started to look for other high-performance logging libraries that were richer in functionality. I did not find such a logging library, but I ran into a whole bunch of other interesting stuff, including the Disruptor. Eventually I decided to try to combine Log4j-2, which has a very nice code base, with the Disruptor. The result of this was eventually accepted into Log4j-2 itself, and the rest, as they say, was history.”

“One thing I came across that I should mention here is Peter Lawrey’s Chronicle library. Chronicle uses memory-mapped files to write tens of millions of messages per second to disk with very low latency. Remember that above I said that ‘we all know that I/O is slow’? Chronicle shows that synchronous I/O can be very, very fast.”

“It was via Peter’s work that I came across the Disruptor. There is a lot of good material out there about the Disruptor.
Just to give you a few pointers:

Martin Fowler: LMAX
Trisha Gee on LMAX under the hood (slightly outdated now but the most detailed material I know of)
…and video presentations like this

The Disruptor google group is also highly recommended. Recommended readings on Java performance in general are:

Martin Thompson’s “Mechanical Sympathy”
Martin Thompson Presentations. Martin Thompson has done a number of articles and presentations on various aspects of high performance computing in Java. He does a great job of making the complex stuff that is going on under the hood accessible.”

My bookmarks folder filled up after reading this e-mail, and I appreciate the many starting points for improving my knowledge of Java performance.

Should I use AsyncLoggers by default?

I was sure I wanted to use the new Async Loggers. This all sounds just fantastic. But on the other hand, I am a bit scared and even a little paranoid about including new dependencies or new technologies like the new Log4j 2 Async Loggers. I asked Remko if he would use the new feature by default or if he would enable it just for a few limited use cases.

“I use Async Loggers by default, yes”, he wrote me. “One use case when you would _not_ want to use asynchronous logging is when you use logging for audit purposes. In that case a logging error is a problem that your application needs to know about and deal with. I believe that most applications are different, in that they don’t care too much about logging errors. Most applications don’t want to stop if a logging exception occurs, in fact, they don’t even want to know about it. By default, appenders in Log4j-2.0 will suppress exceptions so the application doesn’t need to try/catch every log statement.
If that is your usage, then you will not lose anything by using asynchronous loggers, so you get only the benefit, which is improved performance.”

“One nice little detail I should mention is that both Async Loggers and Async Appenders fix something that has always bothered me in Log4j-1.x, which is that they will flush the buffer after logging the last event in the queue. With Log4j-1.x, if you used buffered I/O, you often could not see the last few log events, as they were still stuck in the memory buffer. Your only option was setting immediateFlush to true, which forces disk I/O on every single log event and has a performance impact. With Async Loggers and Appenders in Log4j-2.0 your log statements are all flushed to disk, so they are always visible, but this happens in a very efficient manner.”

Isn’t it risky to use Log4j’s AsyncLoggers?

But considering that Log4j-1 had serious threading issues and that the modern world uses cloud computing and clustering all the time to scale its apps, isn’t asynchronous logging some kind of additional risk? Or is it safe? I knew my questions would sound like the questions of a decision maker, not of a developer. But the whole LMAX thing was so new to me, and since I maintain the old and really ugly Log4j 1 code, I simply had to ask.

Remko: “There are a number of questions in there. First, is Log4j-2 safer from a concurrency perspective than Log4j-1.x? I believe so. The Log4j-2 team has put in considerable effort to support multi-threaded applications, and the asynchronous loggers are just a very recent and relatively small addition to the project. Log4j-2 uses more granular locking than log4j-1.x, and is architecturally simpler, which should result in fewer issues, and any issues that do come up will be easier to fix.”

“On the other hand, Log4j-2 is still in beta and is under active development, although recently I think most effort is being spent on fixing things and tying up loose ends rather than adding new features.
I believe it is stable enough for production use. If you are considering using Log4j-2, for performance or other reasons, I’d suggest you do your due diligence and test, just like you would before adopting any other 3rd party library in your project.” (Sidenote: A stable version of Log4j 2 can be expected soon, most likely in autumn 2013.)

Sounded good to me. And yes, I can fully agree with that from my own observations of the project, though I personally have not written code in the Log4j 2 repository.

“The other question I see is: Is asynchronous logging riskier than synchronous logging? I don’t think so; in fact, if your application is multi-threaded the opposite may be the case: once the log event has been handed off to the consumer thread that does the I/O, there is only that one thread dealing with the layouts, appenders and all the other logging-related components. So after the hand-off you’re single-threaded and you don’t need to worry about any threading issues like deadlock, liveness, etc. any more.”

“You can take this one step further and make your business logic completely single-threaded, using the Disruptor for all I/O or communication with external systems. Single-threaded business logic without lock contention can be blazingly fast. The results at LMAX (6 million transactions/sec, with less than 10 ms latency) speak for themselves.”

Reading Remko’s message, I learned three things. First, I have to learn more about Java performance. Second, I definitely want to make my applications use Log4j 2. As a first step, I will enable it in my Struts 2 apps, which I use often. Third, a web application framework using the LMAX Disruptor might blow us all away.

I would like to give a big thank you and a hug to Remko Popma for answering my questions and working on this blog post with me. All errors are my own.
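
To make the hand-off and the flush-on-last-event behaviour from the interview a little more concrete, here is a minimal sketch in plain Java. It is not how Log4j 2 is implemented: a bounded BlockingQueue stands in for the Disruptor ring buffer, a list of strings stands in for the file on disk, and the class name, the stop marker and the queue size of 1024 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Sketch only, not Log4j 2 code: a bounded queue stands in for the Disruptor ring buffer. */
final class AsyncHandOffSketch {
    // Hypothetical shutdown marker; a distinct instance so it can be compared by identity.
    private static final String STOP = new String("stop");

    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
    private final StringBuilder buffer = new StringBuilder(); // touched only by the consumer thread
    private final List<String> flushed = new ArrayList<>();   // stands in for the file on disk
    private final Thread consumer = new Thread(this::drain, "log-consumer");

    AsyncHandOffSketch() {
        consumer.start();
    }

    /** Called from any application thread: hand the event off and return. */
    void log(String message) {
        put(message);
    }

    /** Signals the consumer to finish, waits for it, and returns everything it flushed. */
    List<String> stop() {
        put(STOP);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the consumer", e);
        }
        return flushed;
    }

    private void put(String event) {
        try {
            queue.put(event); // blocks only while the queue is full
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during hand-off", e);
        }
    }

    /** Runs on the single consumer thread: buffer each event, flush once the queue is drained. */
    private void drain() {
        try {
            while (true) {
                String event = queue.take();
                if (event == STOP) {
                    flush();
                    return;
                }
                buffer.append(event).append('\n');
                if (queue.isEmpty()) {
                    flush(); // last event in the queue: make it visible now
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void flush() {
        if (buffer.length() > 0) {
            flushed.add(buffer.toString());
            buffer.setLength(0);
        }
    }

    public static void main(String[] args) {
        AsyncHandOffSketch logger = new AsyncHandOffSketch();
        logger.log("first");
        logger.log("second");
        System.out.print(String.join("", logger.stop()));
    }
}
```

Only the consumer thread ever touches the buffer, which is the single-threaded situation Remko describes after the hand-off. How many flushes happen depends on timing, but the events always come out in the order they were handed over.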

July 20, 2013

by Christian Grobmeier

· 7,450 Views · 1 Like

Creating External DSLs using ANTLR and Java

In a previous post quite some time back, I wrote about internal DSLs using Java. In the book Domain Specific Languages, Martin Fowler discusses another type of DSL, the external DSL, in which the DSL is written in another language that is then parsed by the host language to populate the semantic model. In the previous example I discussed creating a DSL for defining a graph. The advantage of using an external DSL is that a change in the graph data does not require recompiling the program; instead, the program can just load the external DSL, create a parse tree and then populate the semantic model. The semantic model remains the same, and the advantage of having a semantic model is that one can modify the DSL without making many changes to the semantic model. Between the internal DSL example and this external DSL example I have not modified the semantic model. To create the external DSL I am making use of ANTLR.

What is ANTLR?

The definition as given on the official site is: ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It’s widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build and walk parse trees. The notable features of ANTLR from the above definition are that it is a parser generator for structured text or binary files, and that it can build and walk parse trees.

Semantic Model

In this example I will exploit the above features of ANTLR to parse a DSL and then walk through the parse tree to populate the semantic model. To recap, the semantic model consists of the Graph, Edge and Vertex classes, which represent a graph, an edge and a vertex of the graph respectively.
The code below shows the class definitions:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Set;
    import java.util.TreeSet;

    public class Graph {
        private List<Edge> edges;
        private Set<Vertex> vertices;

        public Graph() {
            edges = new ArrayList<>();
            // TreeSet keeps the vertices sorted by label and free of duplicates
            vertices = new TreeSet<>();
        }

        public void addEdge(Edge edge) {
            getEdges().add(edge);
            getVertices().add(edge.getFromVertex());
            getVertices().add(edge.getToVertex());
        }

        public void addVertice(Vertex v) {
            getVertices().add(v);
        }

        public List<Edge> getEdges() { return edges; }

        public Set<Vertex> getVertices() { return vertices; }

        public static void printGraph(Graph g) {
            System.out.println("Vertices...");
            for (Vertex v : g.getVertices()) {
                System.out.print(v.getLabel() + " ");
            }
            System.out.println("");
            System.out.println("Edges...");
            for (Edge e : g.getEdges()) {
                System.out.println(e);
            }
        }
    }

    public class Edge {
        private Vertex fromVertex;
        private Vertex toVertex;
        private Double weight;

        public Edge() { }

        public Edge(Vertex fromVertex, Vertex toVertex, Double weight) {
            this.fromVertex = fromVertex;
            this.toVertex = toVertex;
            this.weight = weight;
        }

        @Override
        public String toString() {
            return fromVertex.getLabel() + " to " + toVertex.getLabel()
                    + " with weight " + getWeight();
        }

        public Vertex getFromVertex() { return fromVertex; }
        public void setFromVertex(Vertex fromVertex) { this.fromVertex = fromVertex; }
        public Vertex getToVertex() { return toVertex; }
        public void setToVertex(Vertex toVertex) { this.toVertex = toVertex; }
        public Double getWeight() { return weight; }
        public void setWeight(Double weight) { this.weight = weight; }
    }

    public class Vertex implements Comparable<Vertex> {
        private String label;

        public Vertex(String label) {
            // Labels are normalized to upper case
            this.label = label.toUpperCase();
        }

        @Override
        public int compareTo(Vertex o) {
            return this.getLabel().compareTo(o.getLabel());
        }

        public String getLabel() { return label; }
        public void setLabel(String label) { this.label = label; }
    }

Creating the DSL

Let’s come up with the structure of the language before going into creating the grammar rules.
The structure which I am planning to come up with is something like:

    Graph {
      A -> B (10)
      B -> C (20)
      D -> E (30)
    }

Each line in the Graph block represents an edge and the vertices involved in it, and the value in parentheses represents the weight of the edge. One limitation I am enforcing is that the graph cannot have dangling vertices, i.e. vertices which are not part of any edge. This limitation can be removed by slightly changing the grammar, but I leave that as an exercise for the readers of this post. The first task in creating the DSL is to define the grammar rules. These are the rules which your lexer and parser will use to convert the DSL into an abstract syntax tree/parse tree. ANTLR then makes use of this grammar to generate the Parser, Lexer and a Listener, which are nothing but Java classes extending/implementing some classes from the ANTLR library. The creators of the DSL must make use of these Java classes to load the external DSL, parse it, and then, using the listener, populate the semantic model as and when the parser encounters certain nodes (think of this as a variant of a SAX parser for XML). Now that we know very briefly what ANTLR can do and the steps in using it, we have to set up ANTLR, i.e. download the ANTLR API jar and set up some scripts for generating the parser and lexer, and then try out the language via the command-line tool. For that, please visit the official tutorial from ANTLR, which shows how to set up ANTLR and a simple Hello World example.

Grammar for the DSL

Now that you have ANTLR set up, let me dive into the grammar for my DSL:

    grammar Graph;
    graph: 'Graph {' edge+ '}';
    vertex: ID;
    edge: vertex '->' vertex '(' NUM ')' ;
    ID: [a-zA-Z]+;
    NUM: [0-9]+;
    WS: [ \t\r\n]+ -> skip;

Let’s go through the rules:

    graph: 'Graph {' edge+ '}';

The above grammar rule, which is the start rule, says that the language should start with ‘Graph {’ and end with ‘}’ and must contain one or more edges.
    vertex: ID;
    edge: vertex '->' vertex '(' NUM ')' ;
    ID: [a-zA-Z]+;
    NUM: [0-9]+;

The above four rules say that a vertex consists of one or more letters, and that an edge is defined as two vertices separated by ‘->’ and followed by one or more digits in ‘()’. I have named the grammar “Graph”, and hence once we use ANTLR to generate the Java classes, i.e. the parser and lexer, we will end up with the following classes: GraphParser, GraphLexer, GraphListener and GraphBaseListener. The first two classes deal with the generation of the parse tree and the last two deal with walking through it. GraphListener is an interface which contains all the methods for dealing with the parse tree, i.e. events such as entering a rule, exiting a rule and visiting a terminal node; in addition, it contains methods for events related to entering and exiting the graph rule, the edge rule and the vertex rule. We will make use of these methods to intercept the data present in the DSL and then populate the semantic model.

Populating the semantic model

I have created a file graph.gr in the resource package which contains the DSL for populating the graph. As the files in the resource package are available to the ClassLoader at runtime, we can use the ClassLoader to read the DSL script and then pass it on to the lexer and parser classes. The DSL script used is:

    Graph {
      A -> B (10)
      B -> C (20)
      D -> E (30)
      A -> E (12)
      B -> D (8)
    }

And the code which loads the DSL and populates the semantic model:

    import java.io.IOException;
    import java.io.InputStream;

    import org.antlr.v4.runtime.ANTLRInputStream;
    import org.antlr.v4.runtime.CharStream;
    import org.antlr.v4.runtime.CommonTokenStream;

    // GraphLexer, GraphParser and GraphBaseListener are the classes generated
    // by ANTLR from the grammar; they are assumed to be in the same package.
    public class GraphDslAntlrSample {
        public static void main(String[] args) throws IOException {
            // Reading the DSL script
            InputStream is = ClassLoader.getSystemResourceAsStream("resources/graph.gr");
            // Loading the DSL script into the ANTLR stream
            CharStream cs = new ANTLRInputStream(is);
            // Passing the input to the lexer to create tokens
            GraphLexer lexer = new GraphLexer(cs);
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            // Passing the tokens to the parser to create the parse tree
            GraphParser parser = new GraphParser(tokens);
            // Semantic model to be populated
            Graph g = new Graph();
            // Adding the listener to facilitate walking through the parse tree
            parser.addParseListener(new MyGraphBaseListener(g));
            // Invoking the parser, starting at the graph rule
            parser.graph();
            Graph.printGraph(g);
        }
    }

    /**
     * Listener used for walking through the parse tree.
     */
    class MyGraphBaseListener extends GraphBaseListener {
        Graph g;

        public MyGraphBaseListener(Graph g) {
            this.g = g;
        }

        @Override
        public void exitEdge(GraphParser.EdgeContext ctx) {
            // Once the edge rule is exited, the data required for the edge, i.e.
            // the vertices and the weight, is available in the EdgeContext
            // and can be used to populate the semantic model.
            Vertex fromVertex = new Vertex(ctx.vertex(0).ID().getText());
            Vertex toVertex = new Vertex(ctx.vertex(1).ID().getText());
            double weight = Double.parseDouble(ctx.NUM().getText());
            Edge e = new Edge(fromVertex, toVertex, weight);
            g.addEdge(e);
        }
    }

And the output when the above is executed is:

    Vertices...
    A B C D E
    Edges...
    A to B with weight 10.0
    B to C with weight 20.0
    D to E with weight 30.0
    A to E with weight 12.0
    B to D with weight 8.0

To summarize, this post creates an external DSL for populating graph data by making use of ANTLR. I will enhance this simple DSL and expose it as a utility which can be used by programmers working on graphs. The post is very heavy on concepts and code; feel free to drop in any queries you have so that I can try to address them for the benefit of others as well.
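
For readers who want to see what the grammar accepts before setting up ANTLR, the rules can be approximated with a small hand-rolled recognizer. The sketch below is only an illustration: it uses java.util.regex instead of the generated lexer and parser, the class and method names are hypothetical, and it returns each edge as the same text that Edge.toString() would produce rather than populating the Graph model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hand-rolled stand-in for the generated parser (hypothetical names, no ANTLR involved). */
final class GraphDslRegexSketch {
    // Mirrors graph: 'Graph {' edge+ '}'
    private static final Pattern GRAPH =
            Pattern.compile("\\s*Graph \\{(.*)\\}\\s*", Pattern.DOTALL);
    // Mirrors edge: vertex '->' vertex '(' NUM ')' with ID: [a-zA-Z]+ and NUM: [0-9]+
    private static final Pattern EDGE = Pattern.compile(
            "\\s*([a-zA-Z]+)\\s*->\\s*([a-zA-Z]+)\\s*\\(\\s*([0-9]+)\\s*\\)");

    /** Returns one line of text per edge, or throws if the script does not fit the rules. */
    static List<String> parseEdges(String script) {
        Matcher graph = GRAPH.matcher(script);
        if (!graph.matches()) {
            throw new IllegalArgumentException("expected: Graph { edge+ }");
        }
        String body = graph.group(1);
        List<String> edges = new ArrayList<>();
        Matcher edge = EDGE.matcher(body);
        int pos = 0;
        // Consume one edge at a time, always starting where the previous one ended.
        while (edge.region(pos, body.length()).lookingAt()) {
            double weight = Double.parseDouble(edge.group(3));
            edges.add(edge.group(1).toUpperCase() + " to "
                    + edge.group(2).toUpperCase() + " with weight " + weight);
            pos = edge.end();
        }
        // edge+ means at least one edge, and nothing but whitespace may be left over.
        if (edges.isEmpty() || !body.substring(pos).trim().isEmpty()) {
            throw new IllegalArgumentException("expected one or more edges, stopped at offset " + pos);
        }
        return edges;
    }

    public static void main(String[] args) {
        for (String e : parseEdges("Graph { A -> B (10) B -> C (20) D -> E (30) }")) {
            System.out.println(e);
        }
    }
}
```

A regular expression is enough here only because the language is flat; once rules nest or recurse, a generated parser such as the one ANTLR produces is the more suitable tool.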

July 19, 2013

by Mohamed Sanaulla

· 25,298 Views · 1 Like

Java 8 APIs: java.util.time - Instant, LocalDate, LocalTime, and LocalDateTime

An overview starting with some basic classes of the Java 8 package: Instant, LocalDate, LocalTime, and LocalDateTime.
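
As a quick taste of the four classes named above, here is a minimal sketch. It assumes the released JDK 8, where the package is java.time; the class name and the sample values are made up for illustration.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;

/** Illustrative only: hypothetical class name, arbitrary sample values. */
final class JavaTimeSketch {
    /** An Instant is a point on the UTC timeline, here one day after the epoch. */
    static Instant oneDayAfterEpoch() {
        return Instant.ofEpochSecond(86_400);
    }

    /** A LocalDateTime is a date plus a time of day, with no zone or offset attached. */
    static LocalDateTime combine(LocalDate date, LocalTime time) {
        return LocalDateTime.of(date, time);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2013, 7, 19);
        LocalTime time = LocalTime.of(9, 30);
        LocalDateTime dateTime = combine(date, time);

        System.out.println(oneDayAfterEpoch());                 // 1970-01-02T00:00:00Z
        System.out.println(date.plusDays(13));                  // 2013-08-01
        System.out.println(time.plusHours(15));                 // 00:30 (wraps past midnight)
        System.out.println(dateTime);                           // 2013-07-19T09:30
        // A local date-time becomes an Instant only once an offset is supplied.
        System.out.println(dateTime.toInstant(ZoneOffset.UTC)); // 2013-07-19T09:30:00Z
    }
}
```

All four types are immutable, so methods such as plusDays and plusHours return new objects rather than changing the original.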

July 19, 2013

by Eyal Lupu

· 215,395 Views · 7 Likes