The Latest Coding Topics

Synchronizing transactions with asynchronous events in Spring
Today as an example we will take a very simple scenario: placing an order stores it and sends an e-mail about that order: @Service class OrderService @Autowired() (orderDao: OrderDao, mailNotifier: OrderMailNotifier) { @Transactional def placeOrder(order: Order) { orderDao save order mailNotifier sendMail order } } So far so good, but e-mail functionality has nothing to do with placing an order. It's just a side-effect that distracts rather than part of business logic. Moreover sending an e-mail unnecessarily prolongs transaction and introduces latency. So we decided to decouple these two actions by using events. For simplicity I will take advantage of Spring built-in custom events but our discussion is equally relevant for JMS or other producer-consumer library/queue. case class OrderPlacedEvent(order: Order) extends ApplicationEvent @Service class OrderService @Autowired() (orderDao: OrderDao, eventPublisher: ApplicationEventPublisher) { @Transactional def placeOrder(order: Order) { orderDao save order eventPublisher publishEvent OrderPlacedEvent(order) } } As you can see instead of accessing OrderMailNotifier bean directly we send OrderPlacedEvent wrapping newly created order. ApplicationEventPublisher is needed to send an event. Of course we also have to implement the client side receiving messages: @Service class OrderMailNotifier extends ApplicationListener[OrderPlacedEvent] { def onApplicationEvent(event: OrderPlacedEvent) { //sending e-mail... } } ApplicationListener[OrderPlacedEvent] indicates what type of events are we interested in. This works, however by default Spring ApplicationEvents are synchronous, which means publishEvent() is actually blocking. Knowing Spring it shouldn't be hard to turn event broadcasting into asynchronous mode. Indeed there are two ways: one suggested in JavaDoc and the other I discovered because I failed to read the JavaDoc first... According to documentation if you want your events to be delivered asynchronously, you should define bean named applicationEventMulticaster of type SimpleApplicationEventMulticaster and define taskExecutor: @Bean def applicationEventMulticaster() = { val multicaster = new SimpleApplicationEventMulticaster() multicaster.setTaskExecutor(taskExecutor()) multicaster } @Bean def taskExecutor() = { val pool = new ThreadPoolTaskExecutor() pool.setMaxPoolSize(10) pool.setCorePoolSize(10) pool.setThreadNamePrefix("Spring-Async-") pool } Spring already supports broadcasting events using custom TaskExecutor. I didn't know about it so first I simply annotated onApplicationEvent() with @Async: @Async def onApplicationEvent(event: OrderPlacedEvent) { //... no further modifications, once Spring discovers @Async method it runs it in different thread asynchronously. Period. Well, you still have to enable @Async support if you don't use it already: @Configuration @EnableAsync class ThreadingConfig extends AsyncConfigurer { def getAsyncExecutor = taskExecutor() @Bean def taskExecutor() = { val pool = new ThreadPoolTaskExecutor() pool.setMaxPoolSize(10) pool.setCorePoolSize(10) pool.setThreadNamePrefix("Spring-Async-") pool } } Technically @EnableAsync is enough. However by default Spring uses SimpleAsyncTaskExecutor which creates new thread on every @Async invocation. A bit unfortunate default for enterprise framework, luckily easy to change. Undoubtedly @Async seems cleaner than defining some magic beans. All above was just a setup to expose the real problem. We now send an asynchronous message that is processed in other thread. 
Unfortunately we introduced a race condition that manifests itself under heavy load, or maybe only on some particular operating system. Can you spot it? To give you a hint, here is what happens: (1) starting the transaction, (2) storing the order in the database, (3) sending a message wrapping the order, (4) commit. In the meantime some asynchronous thread picks up OrderPlacedEvent and starts processing it. The question is, does it happen right after point (3) but before point (4), or maybe after (4)? That makes a big difference! In the former case the transaction has not yet committed, thus the Order is not yet in the database. On the other hand lazy loading might work on that object as it's still bound to a PersistenceContext (in case we are using JPA). However if the original transaction has already committed, the order will behave much differently. If you rely on one behaviour or the other, due to the race condition, your event listener might fail spuriously under hard-to-predict circumstances. Of course there is a solution1: using the not commonly known TransactionSynchronizationManager. Basically it allows us to register an arbitrary number of TransactionSynchronization listeners. Each such listener will then be notified about various events like transaction commit and rollback. Here is a basic API: @Transactional def placeOrder(order: Order) { orderDao save order afterCommit { eventPublisher publishEvent OrderPlacedEvent(order) } } private def afterCommit[T](fun: => T) { TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronizationAdapter { override def afterCommit() { fun } }) } afterCommit() takes a function and calls it after the current transaction commits. We use it to hide the complexity of the Spring API. One can safely call registerSynchronization() multiple times - listeners are stored in a Set and are local to the current transaction, disappearing after commit. So, the publishEvent() method will be called after the enclosing transaction commits, which makes our code predictable and race-condition free. However, even with the higher-order function afterCommit() it still feels a bit unwieldy and unnecessarily complex. Moreover it's easy to forget to wrap every publishEvent(), thus maintainability suffers. Can we do better? One solution is to write a custom utility class wrapping publishEvent() or employ AOP. But there is a much simpler, proven solution that works great with Spring - the Decorator pattern. We shall wrap the original implementation of ApplicationEventPublisher provided by Spring and decorate its publishEvent(): class TransactionAwareApplicationEventPublisher(delegate: ApplicationEventPublisher) extends ApplicationEventPublisher { override def publishEvent(event: ApplicationEvent) { if (TransactionSynchronizationManager.isActualTransactionActive) { TransactionSynchronizationManager.registerSynchronization( new TransactionSynchronizationAdapter { override def afterCommit() { delegate publishEvent event } }) } else delegate publishEvent event } } As you can see, if the transaction is active, we register a commit listener and postpone sending the message until the transaction completes. Otherwise we simply forward the event to the original ApplicationEventPublisher, which delivers it immediately. Of course we somehow have to plug in this new implementation in place of the original one.
@Primary does the trick: @Resource val applicationContext: ApplicationContext = null @Bean @Primary def transactionAwareApplicationEventPublisher() = new TransactionAwareApplicationEventPublisher(applicationContext) Notice that the original implementation of ApplicationEventPublisher is provided by core ApplicationContext class. After all these changes our code looks... exactly the same as in the beginning: @Service class OrderService @Autowired() (orderDao: OrderDao, eventPublisher: ApplicationEventPublisher) { @Transactional def placeOrder(order: Order) { orderDao save order eventPublisher publishEvent OrderPlacedEvent(order) } However this time auto-injected eventPublisher is our custom decorator. Eventually we managed to fix the race condition problem without touching the business code. Our solution is safe, predictable and robust. Notice that the exact same approach can be taken for any other queuing technology, including JMS (if complex transaction manager was not used) or custom queues. We also discovered an interesting low-level API for transaction lifecycle listening. Might be useful one day. 1 - one might argue that a much simpler solution would be to publishEvent() outside of the transaction: def placeOrder(order: Order) { storeOrder(order) eventPublisher publishEvent OrderPlacedEvent(order) } @Transactional def storeOrder(order: Order) = orderDao save order That's true, but this solution doesn't "scale" well (what if placeOrder() has to be part of a greater transaction?) and is most likely incorrect due to proxying peculiarities.
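The footnote's caveat about "proxying peculiarities" is worth spelling out. Below is a minimal Java sketch (my own illustration, not the author's code) of why moving @Transactional to a method called from the same bean usually fails: Spring applies the annotation through a proxy, and a self-invocation never goes through that proxy, so no transaction is started for storeOrder(). Order, OrderDao and OrderPlacedEvent are the article's own types, assumed to exist.

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class OrderService {

    @Autowired
    private OrderDao orderDao;                        // the article's DAO, assumed to exist

    @Autowired
    private ApplicationEventPublisher eventPublisher;

    public void placeOrder(Order order) {
        storeOrder(order);                            // self-invocation: the Spring proxy is bypassed
        eventPublisher.publishEvent(new OrderPlacedEvent(order));
    }

    @Transactional
    public void storeOrder(Order order) {             // @Transactional is silently ignored on this call path
        orderDao.save(order);
    }
}

Extracting storeOrder() into a separate @Transactional bean would make the footnote's variant behave as expected, at the cost of the "scaling" problem the author mentions.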
March 31, 2013
by Tomasz Nurkiewicz
· 10,065 Views
How Does Java Handle Aliasing?
Aliasing means there are multiple aliases to a location that can be updated, and these aliases have different types. In the following example, a and b are two variable names with two different types, A and B, where B extends A. B[] b = new B[10]; A[] a = b; a[0] = new A(); b[0].methodParent(); In memory, both variables refer to the same location: the same array is reachable through both a and b. During run-time, the actual object stored determines which method to call. How does Java handle the aliasing problem? If you copy this code into Eclipse, there will be no compilation errors. class A { public void methodParent() { System.out.println("method in Parent"); } } class B extends A { public void methodParent() { System.out.println("override method in Child"); } public void methodChild() { System.out.println("method in Child"); } } public class Main { public static void main(String[] args) { B[] b = new B[10]; A[] a = b; a[0] = new A(); b[0].methodParent(); } } But if you run the code, the output is as follows: Exception in thread "main" java.lang.ArrayStoreException: aliasingtest.A at aliasingtest.Main.main(Main.java:26) The reason is that Java handles aliasing at run-time. At run-time, it knows that the first element should be a B object instead of an A. Therefore, the code only runs correctly if it is changed to: B[] b = new B[10]; A[] a = b; a[0] = new B(); b[0].methodParent(); and the output is: override method in Child
March 30, 2013
by Ryan Wang
· 36,565 Views
How to Validate WSDLs with Eclipse
Create a new project and add the resources you need to validate. Then right-click the WSDL file and click "Validate". This will detect errors and issues, and it will save a lot of time. Check out the WSDL validator to go more in depth: http://wiki.eclipse.org/WSDL_Validator
March 30, 2013
by Achala Chathuranga Aponso
· 13,447 Views
HashSet vs. TreeSet vs. LinkedHashSet
in a set, there are no duplicate elements. that is one of the major reasons to use a set. there are 3 commonly used implementations of set in java: hashset, treeset and linkedhashset. when and which to use is an important question. in brief, if we want a fast set, we should use hashset; if we need a sorted set, then treeset should be used; if we want a set that can be read by following its insertion order, linkedhashset should be used. 1. set interface set interface extends collection interface. in a set, no duplicates are allowed. every element in a set must be unique. we can simply add elements to a set, and finally we will get a set of elements with duplicates removed automatically. 2. hashset vs. treeset vs. linkedhashset hashset is implemented using a hash table. elements are not ordered. the add, remove, and contains methods has constant time complexity o(1). treeset is implemented using a tree structure(red-black tree in algorithm book). the elements in a set are sorted, but the add, remove, and contains methods has time complexity of o(log (n)). it offers several methods to deal with the ordered set like first(), last(), headset(), tailset(), etc. linkedhashset is between hashset and treeset. it is implemented as a hash table with a linked list running through it, so it provides the order of insertion. the time complexity of basic methods is o(1). 3. treeset example treeset tree = new treeset(); tree.add(12); tree.add(63); tree.add(34); tree.add(45); iterator iterator = tree.iterator(); system.out.print("tree set data: "); while (iterator.hasnext()) { system.out.print(iterator.next() + " "); } output is sorted as follows: tree set data: 12 34 45 63 now let's define a dog class as follows: class dog { int size; public dog(int s) { size = s; } public string tostring() { return size + ""; } } let's add some dogs to treeset like the following: import java.util.iterator; import java.util.treeset; public class testtreeset { public static void main(string[] args) { treeset dset = new treeset(); dset.add(new dog(2)); dset.add(new dog(1)); dset.add(new dog(3)); iterator iterator = dset.iterator(); while (iterator.hasnext()) { system.out.print(iterator.next() + " "); } } } compile ok, but run-time error occurs: exception in thread "main" java.lang.classcastexception: collection.dog cannot be cast to java.lang.comparable at java.util.treemap.put(unknown source) at java.util.treeset.add(unknown source) at collection.testtreeset.main(testtreeset.java:22) because treeset is sorted, the dog object need to implement java.lang.comparable's compareto() method like the following: class dog implements comparable{ int size; public dog(int s) { size = s; } public string tostring() { return size + ""; } @override public int compareto(dog o) { return size - o.size; } } the output is: 1 2 3 4. hashset example hashset dset = new hashset(); dset.add(new dog(2)); dset.add(new dog(1)); dset.add(new dog(3)); dset.add(new dog(5)); dset.add(new dog(4)); iterator iterator = dset.iterator(); while (iterator.hasnext()) { system.out.print(iterator.next() + " "); } output: 5 3 2 1 4 note the order is not certain. 5. linkedhashset example linkedhashset dset = new linkedhashset(); dset.add(new dog(2)); dset.add(new dog(1)); dset.add(new dog(3)); dset.add(new dog(5)); dset.add(new dog(4)); iterator iterator = dset.iterator(); while (iterator.hasnext()) { system.out.print(iterator.next() + " "); } the order of the output is certain and it is the insertion order. 2 1 3 5 4 6. 
performance testing the following method tests the performance of the three class on add() method. public static void main(string[] args) { random r = new random(); hashset hashset = new hashset(); treeset treeset = new treeset(); linkedhashset linkedset = new linkedhashset(); // start time long starttime = system.nanotime(); for (int i = 0; i < 1000; i++) { int x = r.nextint(1000 - 10) + 10; hashset.add(new dog(x)); } // end time long endtime = system.nanotime(); long duration = endtime - starttime; system.out.println("hashset: " + duration); // start time starttime = system.nanotime(); for (int i = 0; i < 1000; i++) { int x = r.nextint(1000 - 10) + 10; treeset.add(new dog(x)); } // end time endtime = system.nanotime(); duration = endtime - starttime; system.out.println("treeset: " + duration); // start time starttime = system.nanotime(); for (int i = 0; i < 1000; i++) { int x = r.nextint(1000 - 10) + 10; linkedset.add(new dog(x)); } // end time endtime = system.nanotime(); duration = endtime - starttime; system.out.println("linkedhashset: " + duration); } from the output below, we can clearly wee that hashset is the fastest one. hashset: 2244768 treeset: 3549314 linkedhashset: 2263320 if you enjoyed this article and want to learn more about java collections, check out this collection of tutorials and articles on all things java collections.
March 29, 2013
by Ryan Wang
· 180,817 Views · 3 Likes
Introduction to Functional Interfaces – A Concept Recreated in Java 8
Any Java developer around the world would have used at least one of the following interfaces: java.lang.Runnable, java.awt.event.ActionListener, java.util.Comparator, java.util.concurrent.Callable. These interfaces have one feature in common: each declares exactly one method in its interface definition. There are many more such interfaces in the JDK, and many more created by Java developers. These interfaces are also called Single Abstract Method interfaces (SAM interfaces). A popular way to use them is by creating anonymous inner classes, something like: public class AnonymousInnerClassTest { public static void main(String[] args) { new Thread(new Runnable() { @Override public void run() { System.out.println("A thread created and running ..."); } }).start(); } } With Java 8 the same concept of SAM interfaces is recreated under the name functional interfaces. These can be represented using lambda expressions, method references and constructor references (I will cover the latter two topics in upcoming blog posts). A new annotation, @FunctionalInterface, has been introduced; it produces a compiler-level error when the interface you have annotated is not a valid functional interface. Let's have a look at a simple functional interface with only one abstract method: @FunctionalInterface public interface SimpleFuncInterface { public void doWork(); } The interface can also declare abstract methods from the java.lang.Object class and still be considered a functional interface: @FunctionalInterface public interface SimpleFuncInterface { public void doWork(); public String toString(); public boolean equals(Object o); } Once you add another abstract method to the interface, the compiler/IDE will flag it as an error. An interface can extend another interface; if the interface it extends is functional and the new interface doesn't declare any new abstract methods, then the new interface is also functional. An interface can have one abstract method and any number of default methods and still be a functional interface. To get an idea of default methods please read here. @FunctionalInterface public interface ComplexFunctionalInterface extends SimpleFuncInterface { default public void doSomeWork(){ System.out.println("Doing some work in interface impl..."); } default public void doSomeOtherWork(){ System.out.println("Doing some other work in interface impl..."); } } The above interface is still a valid functional interface. Now let's see how we can use a lambda expression instead of an anonymous inner class to implement a functional interface: /* * Implementing the interface by creating an * anonymous inner class versus using * lambda expression. */ public class SimpleFunInterfaceTest { public static void main(String[] args) { carryOutWork(new SimpleFuncInterface() { @Override public void doWork() { System.out.println("Do work in SimpleFun impl..."); } }); carryOutWork(() -> System.out.println("Do work in lambda exp impl...")); } public static void carryOutWork(SimpleFuncInterface sfi){ sfi.doWork(); } } And the output would be: Do work in SimpleFun impl... Do work in lambda exp impl...
If you are using an IDE which supports the Java lambda expression syntax (NetBeans 8 nightly builds), it provides a hint when you use an anonymous inner class as shown above. This was a brief introduction to the concept of functional interfaces in Java 8 and how they can be implemented using lambda expressions.
March 29, 2013
by Mohamed Sanaulla
· 212,512 Views · 18 Likes
Solving the IBM MQ Client Error – No mqjbnd in java.library.path
If you come across this issue when you try to connect a JMS client to IBM MQ (v7.0.x.x), it has nothing to do with any environment variables or VM arguments, at least it wasn't for me (there are quite a lot of articles out there that make you think this is the problem). The fix has to be done on the server side. Open MQ Explorer. If you have not done so already, add your JNDI directory to the JMS administered objects. In the connection factories, you will note that your factories' transport type is actually "Bindings". Right-click and use the "Switch transport" option, which has an "MQ Client" option that needs to be selected. Now the transport type will be "Client". Do this for all connection factories that you are connecting to. Run your client again, and the error should go away. HTH.
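For context, a bindings-mode connection factory is what triggers the mqjbnd error on a machine without the MQ native libraries. A minimal client that merely looks the factory up from a file-based JNDI store looks roughly like the sketch below; the JNDI path and factory name are hypothetical, and the transport type itself lives in the administered object, which is why the fix above is made in MQ Explorer rather than in code.

import java.util.Hashtable;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.naming.Context;
import javax.naming.InitialContext;

public class MqClient {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.fscontext.RefFSContextFactory");
        env.put(Context.PROVIDER_URL, "file:///var/mqm/jndi");               // hypothetical JNDI store location
        Context ctx = new InitialContext(env);

        // The factory's transport type (Bindings vs Client) was configured in MQ Explorer.
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("MyConnectionFactory"); // hypothetical name
        Connection connection = cf.createConnection();
        connection.start();
        // ... create sessions, producers and consumers here ...
        connection.close();
    }
}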
March 29, 2013
by Tharindu Mathew
· 19,222 Views
How To Style A Checkbox With CSS
The checkbox is an HTML element used on just about every website, but most people don't style checkboxes, so they look the same as on every other site.
March 28, 2013
by Paul Underwood
· 179,767 Views
Introduction to Default Methods (Defender Methods) in Java 8
We all know that interfaces in Java contain only method declarations and no implementations and any non-abstract class implementing the interface had to provide the implementation. Lets look at an example: public interface SimpleInterface { public void doSomeWork(); } class SimpleInterfaceImpl implements SimpleInterface{ @Override public void doSomeWork() { System.out.println("Do Some Work implementation in the class"); } public static void main(String[] args) { SimpleInterfaceImpl simpObj = new SimpleInterfaceImpl(); simpObj.doSomeWork(); } } Now what if I add a new method in the SimpleInterface? public interface SimpleInterface { public void doSomeWork(); public void doSomeOtherWork(); } and if we try to compile the code we end up with: $javac .\SimpleInterface.java .\SimpleInterface.java:18: error: SimpleInterfaceImpl is not abstract and does not override abstract method doSomeOtherWork() in SimpleInterface class SimpleInterfaceImpl implements SimpleInterface{ ^ 1 error And this limitation makes it almost impossible to extend/improve the existing interfaces and APIs. The same challenge was faced while enhancing the Collections API in Java 8 to support lambda expressions in the API. To overcome this limitation a new concept is introduced in Java 8 called default methods which is also referred to as Defender Methods or Virtual extension methods. Default methods are those methods which have some default implementation and helps in evolving the interfaces without breaking the existing code. Lets look at an example: public interface SimpleInterface { public void doSomeWork(); //A default method in the interface created using "default" keyword default public void doSomeOtherWork(){ System.out.println("DoSomeOtherWork implementation in the interface"); } } class SimpleInterfaceImpl implements SimpleInterface{ @Override public void doSomeWork() { System.out.println("Do Some Work implementation in the class"); } /* * Not required to override to provide an implementation * for doSomeOtherWork. */ public static void main(String[] args) { SimpleInterfaceImpl simpObj = new SimpleInterfaceImpl(); simpObj.doSomeWork(); simpObj.doSomeOtherWork(); } } and the output is: Do Some Work implementation in the class DoSomeOtherWork implementation in the interface This is a very brief introduction to default methods. One can read in depth about default methods here.
March 28, 2013
by Mohamed Sanaulla
· 27,793 Views
AWS VPC NAT Instance Failover and High Availability
Amazon Virtual Private Cloud (VPC) is a great way to setup an isolated portion of AWS and control the network topology. It is a great way to extend your data center and use AWS for burst requirements. With the latest VPC for Everyone announcement, what was earlier "Classic" and "VPC" in AWS will soon be only VPC. That is, every deployment in AWS will be on a VPC even though one might not need all the additional features that VPC provides. One might eventually start looking at utilizing VPC features such as multiple Subnets, Network isolation, Network ACLs, etc.. Those who have already worked with VPC's understand the role of NAT Instance in a VPC. When you create a VPC, you create them with multiple Subnets (Public and Private). Instances launched in the Public Subnet have direct internet connectivity to send and receive internet traffic through the internet gateway of the VPC. Typically, internet facing servers such as web servers are kept in the Public Subnet. A Private Subnet can be used to launch Instances that do not require direct access from the internet. Instances in a Private Subnet can access the Internet without exposing their private IP address by routing their traffic through a Network Address Translation (NAT) instance in the Public Subnet. AWS provides an AMI that can be launched as a NAT Instance. Following diagram is the representation of a standard VPC that gets provisioned through the AWS Management Console wizard. Standard Private and Public Subnets in a VPC The above architecture has A Public Subnet that has direct internet connectivity through the Internet Gateway. Web Instances can be placed within the Public Subnet The custom Route Table associated with Public Subnet will have the necessary routing information to route traffic to the Internet Gateway A NAT Instance is also provisioned in the Public Subnet A Private Subnet that has outbound internet connectivity through the NAT Instance in the Public Subnet The Main Route Table is by default associated with the Private Subnet. This will have necessary routing information to route internet traffic to the NAT Instance Instances in the Private Subnet will use the NAT Instance for outbound internet connectivity. For example, DB backups from standby that needs to be stored in S3. Background programs that make external web services calls Of course, the above architecture has limited High Availability since all the Subnets are created within the same Availability Zone. We can avoid this by creating multiple Subnets in multiple Availability Zones. Public and Private Subnets with multiple Availability Zones Additional Subnets (Public and Private) are created in one another Availability Zone Both Private Subnets are attached to the Main Routing Table Both Public Subnets are attached to the same Custom Routing Table Instances in the Private Subnet still continue to use the NAT Instance for outbound internet connectivity Though we increased the High Availability by utilizing multiple Availability Zones, the NAT Instance is still a Single Point of Failure. NAT Instance is just another EC2 Instance that can become unavailable any time. 
The updated architecture below uses two NAT Instances to provide failover and High Availability for the NAT Instances NAT Instance High Availability Each Subnet is associated with its own Route Table NAT1 is provisioned in Public Subnet 1 NAT2 is provisioned in Public Subnet 2 Private Subnet 1's Route Table (RT) has routing entry to NAT1 for internet traffic Private Subnet 2's Route Table (RT) has routing entry to NAT2 for internet traffic NAT Instance HA Illustration A script can be installed on both the NAT Instances to monitor each other and swap the routing table association if one of them fails. For example, if NAT1 detects that NAT2 is not responding to its ping requests, it can change the Route Table of Private Subnet 2 to NAT1 for internet traffic. Once NAT2 becomes operational again, a reverse swapping can happen. AWS has a pretty good documentation on this and a sample script for the swapping. Apart from HA, the above architecture also provides better overall throughput, since during normal conditions, both NAT Instances can be used to drive the outbound internet requirements of the VPC. If there are workloads that requires a lot of outbound internet connectivity, having more than one NAT Instance would make sense. Of course, you are still limited with one NAT Instance per Subnet.
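AWS documents the failover monitor as a shell script run on each NAT instance; purely as an illustration, the route-swap step it performs can also be expressed with the AWS SDK for Java along the lines of the sketch below. The route table and instance IDs are hypothetical, and the health check is reduced to a stub.

import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.services.ec2.AmazonEC2Client;
import com.amazonaws.services.ec2.model.ReplaceRouteRequest;

public class NatFailover {

    private static final String PRIVATE_SUBNET_2_ROUTE_TABLE_ID = "rtb-22222222"; // hypothetical
    private static final String NAT1_INSTANCE_ID = "i-11111111";                  // hypothetical

    public static void main(String[] args) {
        // Picks up the NAT instance's IAM role credentials via the default chain.
        AmazonEC2Client ec2 = new AmazonEC2Client(new DefaultAWSCredentialsProviderChain());

        if (!isNat2Healthy()) {
            // Point Private Subnet 2's default route at NAT1 instead of the failed NAT2.
            ec2.replaceRoute(new ReplaceRouteRequest()
                    .withRouteTableId(PRIVATE_SUBNET_2_ROUTE_TABLE_ID)
                    .withDestinationCidrBlock("0.0.0.0/0")
                    .withInstanceId(NAT1_INSTANCE_ID));
        }
    }

    private static boolean isNat2Healthy() {
        // Placeholder: the AWS sample script pings the peer NAT instance here.
        return true;
    }
}

The reverse swap, back to NAT2 once it recovers, works the same way with the IDs exchanged.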
March 28, 2013
by Raghuraman Balachandran
· 18,335 Views
Why We Need Lambda Expressions in Java - Part 1
Lambda expressions are coming in Java 8, and together with Raoul-Gabriel Urma and Alan Mycroft I have started writing a book on the topic.
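The listing above is only a teaser, but the core motivation is easy to sketch: a lambda lets the behaviour itself be the argument, without the ceremony of an anonymous inner class. A minimal illustration of my own, not taken from the article:

public class WhyLambdas {
    public static void main(String[] args) {
        // Pre-Java 8: an anonymous inner class wrapping one line of behaviour
        new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println("running");
            }
        }).start();

        // Java 8: the behaviour itself is passed as the argument
        new Thread(() -> System.out.println("running")).start();
    }
}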
March 27, 2013
by Mario Fusco
· 180,151 Views · 11 Likes
Debugging “Wrong FS expected: file:///” exception from HDFS
I just spent some time putting together some basic Java code to read some data from HDFS. Pretty basic stuff. No map reduce involved. Pretty boilerplate code like the stuff from this popular tutorial on the topic. No matter what, I kept hitting my head on this error: Exception in thread “main” java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/user/hadoop/DOUG_SVD/out.txt, expected: file:/// If you checkout the tutorial above, what’s supposed to be happening is that an instance of Hadoop’s Configuration should encounter a fs.default.name property, in one of the config files its given. The Configuration should realize that this property has a value of hdfs://localhost:9000. When you use the Configuration to create a Hadoop FileSystem instance, it should happily read this property from Configuration and process paths from HDFS. That’s a long way of saying these three lines of Java code: // pickup config files off classpath Configuration conf = new Configuration() // explicitely add other config files conf.addResource("/home/hadoop/conf/core-site.xml"); // create a FileSystem object needed to load file resources FileSystem fs = FileSystem.get(conf); // load files and stuff below! Well… My Hadoop config files (core-site.xml) appear setup correctly. It appears to be in my CLASSPATH. I’m even trying to explicitly add the resource. Basically I’ve followed all the troubleshooting tips you’re supposed to follow when you encounter this exception. But I’m STILL getting this exception. Head meet wall. This has to be something stupid. Troubleshooting Hadoop’s Configuration & FileSystem Objects Well before I reveal my dumb mistake in the above code, it turns out there’s some helpful functions to help debug these kind of problems: As Configuration is just a bunch of key/value pairs from a set of resources, its useful to know what resources it thinks it loaded and what properties it thinks it loaded from those files. getRaw() — return the raw value for a configuration item (like conf.getRaw("fs.default.name")) toString() — Configuration‘s toString shows the resources loaded You can similarly checkout FileSystem‘s helpful toString method. It nicely lays out where it thinks its pointing (native vs HDFS vs S3 etc). So if you similarly are looking for a stupid mistake like I was, pepper your code with printouts of these bits of info. They will at least point you in a new direction to search for your dumb mistake. Drumroll Please Turns out I missed the crucial step of passing a Path object not a String to addResource. They appear to do slightly different things. Adding a String adds a resource relative to the classpath. Adding a Path is used to add a resource at an absolute location and does not consider the classpath. So to explicitly load the correct config file, the code above gets turned into (drumroll please): // pickup config files off classpath Configuration conf = new Configuration() // explicitely add other config files // PASS A PATH NOT A STRING! conf.addResource(new Path("/home/hadoop/conf/core-site.xml")); FileSystem fs = FileSystem.get(conf); // load files and stuff below! Then Tada! everything magically works! Hopefully these tips can save you the next time you encounter these kinds of problems.
March 27, 2013
by Doug Turnbull
· 17,642 Views
Accessing AWS Without Key and Secret
If you are using Amazon Web Services(AWS), you are probably aware how to access and use resources like SNS, SQS, S3 using key and secret. With the aws-java-sdk that is straight forward: AmazonSNSClient snsClient = new AmazonSNSClient( new BasicAWSCredentials("your key", "your secret")) One of the difficulties with this approach is storing the key/secret securely especially when there are different set of these for different environments. Using java property files, combined with maven or spring profiles might help a little bit to externalize the key/secret out of your source code, but still doesn't solve the issue of securely accessing these resources. Amazon has another service to help you in this occasion. No, no, this is not one more service to pay for in order to use the previous services. It is a free service, actually it is a feature of the amazon account. AWS Identity and Access Management (IAM) lets you securely control access to AWS services and resources for your users, you can manage users and groups and define permissions for AWS resources. One interesting functionality of IAM is the ability to assign roles to EC2 instances. The idea is you create roles with sets of permissions and you launch an EC2 instance by assigning the role to the instance. And when you deploy an application on that instance, the application doesn't need to have access key and secret in order to access other amazon resource. The application will use the role credentials to sign the requests. This has a number of benefits like a centralized place to control all the instances credentials, reduced risk with auto refreshing credentials and so on. Here is a short video demonstrating how to assign roles to an EC2 instance: Once you have role based security enabled for an instance, to access other resources from that instances you have to create and AwsClient using the chained credential provider: AmazonSNSClient snsClient = new AmazonSNSClient( new DefaultAWSCredentialsProviderChain()) The provider will search your system properties, environment properties and finally call instance metadata API to retrieve the role credentials in chain of responsibility fashion. It will also refresh the credentials in the background periodically depending on its expiration period. And finally, if you want to use role based security from Camel applications running on Amazon, all you have to do is create an instance of the client with configured chained credentials object and don't specify any key or secret: from("direct:start") .to("aws-sns://MyTopic?amazonSNSClient=#snsClient");
March 26, 2013
by Bilgin Ibryam
· 14,078 Views
MySQL 5.6 Compatible Percona XtraBackup 2.0.6 Released
This post comes from Hrvoje Matijakovic at the MySQL Performance Blog. Percona is glad to announce the release of Percona XtraBackup 2.0.6 for MySQL 5.6 on March 20, 2013. Downloads are available from our download site here and Percona Software Repositories. This release is the current GA (Generally Available) stable release in the 2.0 series. New Features: XtraBackup 2.0.6 has implemented basic support for MySQL 5.6, Percona Server 5.6 and MariaDB 10.0. Basic support means that these versions are are recognized by XtraBackup, and that backup/restore works as long as no 5.6-specific features are used (such as GTID, remote/transportable tablespaces, separate undo tablespace, 5.6-style buffer pool dump files). Bugs Fixed: Individual InnoDB tablespaces with size less than 1MB were extended to 1MB on the backup prepare operation. This led to a large increase in disk usage in cases when there are many small InnoDB tablespaces. Bug fixed #950334 (Daniel Frett, Alexey Kopytov). Fixed the issue that caused databases corresponding to inaccessible datadir subdirectories to be ignored by XtraBackup without warning or error messages. This was happening because InnoDB code silently ignored datadir subdirectories it could not open. Bug fixed #664986 (Alexey Kopytov). Under some circumstances XtraBackup could fail to copy a tablespace with a high --parallel option value and a low innodb_open_files value. Bug fixed #870119 (Alexey Kopytov). Fix for the bug #711166 introduced a regression that caused individual partition backups to fail when used with --include option in innobackupex or the --tables option in xtrabackup. Bug fixed #1130627 (Alexey Kopytov). innobackupex didn’t add the file-per-table setting for table-independent backups. Fixed by making XtraBackup auto-enable innodb_file_per_table when the --export option is used. Bug fixed #930062 (Alexey Kopytov). Under some circumstances XtraBackup could fail on a backup prepare with innodb_flush_method=O_DIRECT. Bug fixed #1055547 (Alexey Kopytov). innobackupex did not pass the --tmpdir option to the xtrabackup binary resulting in the server’s tmpdir always being used for temporary files. Bug fixed #1085099 (Alexey Kopytov). XtraBackup for MySQL 5.6 has improved the error reporting for unrecognized server versions. Bug fixed #1087219 (Alexey Kopytov). Fixed the missing rpm dependency for Perl Time::HiRes package that caused innobackupex to fail on minimal CentOS installations. Bug fixed #1121573 (Alexey Bychko). innobackupex would fail when --no-lock and --rsync were used in conjunction. Bug fixed #1123335 (Sergei Glushchenko). Fix for the bug #1055989 introduced a regression that caused xtrabackup_pid file to remain in the temporary dir after execution. Bug fixed #1114955 (Alexey Kopytov). Unnecessary debug messages have been removed from the XtraBackup output. Bug fixed #1131084 (Alexey Kopytov). Other bug fixes: bug fixed #1153334 (Alexey Kopytov), bug fixed #1098498 (Laurynas Biveinis), bug fixed #1132763 (Laurynas Biveinis), bug fixed #1142229 (Laurynas Biveinis), bug fixed #1130581 (Laurynas Biveinis). Release notes with all the bugfixes for Percona XtraBackup 2.0.6 for MySQL 5.6 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.
March 24, 2013
by Peter Zaitsev
· 2,747 Views
Extracting the Elements of the Java Collection- The Java 8 Way
We all have extensively used Collection classes like List, Map and their derived versions. And each time we used them we had to iterate through them to either find some element or update the elements or find different elements matching some condition. Consider a List of Person as shown below: List personList = new ArrayList<>(); personList.add(new Person("Virat", "Kohli",22)); personList.add(new Person("Arun", "Kumar",25)); personList.add(new Person("Rajesh", "Mohan", 32)); personList.add(new Person("Rahul", "Dravid", 35)); To find out all the Person instances with age greater than 30, we would do: List olderThan30OldWay = new ArrayList<>(); for ( Person p : personList){ if ( p.age >= 30){ olderThan30OldWay.add(p); } } System.out.println(olderThan30OldWay); and this gives me the output as: [Rajesh Mohan, 32, Rahul Dravid, 35] The code is easy to write, but is it not a bit more verbose, especially the iteration part? Why would we have to iterate? Would it not be cool if there was an API which would iterate the contents and give us the end result i.e we give the source List and use a series of method calls to get us the result List we are looking for? Yes, this is possible in other languages like Scala, Groovy which support passing closures and also support internal iteration. But is there a solution for Java developers? Yes, this exact problem is being solved by introducing support for Lambda Expressions(closures) and a enhanced Collection API to leverage the lambda expression support. The sad news is that it’s going to be part of Java 8 and will take some time to be into mainstream development. Leveraging the Java 8 enhancements to the above scenario As I said before the Collections API is being enhanced to support the use of Lambda Expression and more about it can be read here. Instead of adding all the new APIs to the Collection class the JDK team created a new concept called “Stream” and added most of the APIs in that class. “Stream” is a sequence of elements obtained from the Collection from which it is created. To read more about the origin of Stream class please refer to this document. To implement the example I started with using the enhancements in Java 8 we would be using few new APIs namely: stream(), filter(), collect(), Collectors.toCollection(). stream(): Uses the collection on which this API is called to create an instance of Stream class. filter():This method accepts a lambda expression which takes in one parameter and returns a boolean value. This lambda expression is written as a replacement for implementing the Predicate class. collect(): There are 2 overloaded versions of this method. The one I am using here takes an instance of Collector. This method takes the contents of the stream and constructs another collection. This construction logic is defined by the Collector. Collectors.toCollection(): Collectors is a factory for Collector. And the toCollection() takes a Lambda expression/Method reference which should return a new instance of any derivatives of Collection class. With brief introduction to the APIs used, let me show the code which is equivalent to the first code sample: List olderThan30 = //Create a Stream from the personList personList.stream(). //filter the element to select only those with age >= 30 filter(p -> p.age >= 30). //put those filtered elements into a new List. 
collect(Collectors.toCollection(() -> new ArrayList())); System.out.println(olderThan30); The above code uses both Internal iteration and lambda expressions to make it intuitive, concise and soothing to the eye. If you are not familiar with the idea of Lambda Expressions, check out my previous entry which covers in brief about Lambda expressions.
March 23, 2013
by Mohamed Sanaulla
· 77,934 Views · 2 Likes
5 Ways Objects Can Communicate With Each Other Heading Towards Decoupling
Way 1. Simple method call Object A calls a method on object B. This is clearly the simplest type of communication between two objects but is also the way which results in the highest coupling. Object A’s class has a dependency upon object B’s class. Wherever you try to take object A’s class, object B’s class (and all of its dependencies) are coming with it. Way 2. Decouple the callee from the caller Object A’s class declares an interface and calls a method on that interface. Object B’s class implements that interface. This is a step in the right direction as object A’s class has no dependency on object B’s class. However, something else has to create object B and introduce it to object A for it to call. So we have created the need for an additional class which has a dependency upon object B’s class. We have also created a dependency from B to A. However, these can be a small price to pay if we are serious about taking object A’s class off to other projects. Way 3. Use an Adaptor Object A’s class declares an interface and calls a method on that interface. An adaptor class implements the interface and wraps object B, forwarding calls to it. This frees up object B’s class from being dependent on object A’s class. Now we are getting closer to some real decoupling. This is particularly useful if object B’s class is a third-party class which we have no control over. Way 4. Dependency Injection Dependency injection is used to find, create and call object B. This amounts to deferring until runtime how object A will talk to object B. This way certainly feels to have the lowest coupling, but in reality just shifts the coupling problem into the wiring realm. At least before we could rely on the compiler to ensure that there was a concrete object on the other end of each call – and furthermore we had the convenience of using the development tools to help us unpick the interaction between objects. Way 5. Chain of command pattern The chain of command pattern is used to allow object A to effectively say “does anyone know how to handle this call?”. Object B, which is listening out for these cries for help, picks up the message and figures out for itself if it is able to respond. This approach does mean that object A has to be ready for the outcome that nobody is able to respond, however it buys us great flexibility in how the responder is implemented. Chain of command – way 5 – is the decoupling winner and here's an example to help explain why. Let object A be a raster image file viewer, with responsibilities for allowing the user to pick the file to open, and zoom in and out on the image as it is displayed. Let object B be a loader which has the responsibility of opening a gif file and returning an array of colored pixels. Our aim is to avoid tying object A's class to object B's class because object B's class uses a third party library. Additionally, object A doesn't want to know about how the image file is interpreted, or even if it is a gif, jpg, png or whatever. In this example object B, or more likely a wrapper of object B, will declare a method which equips it to respond to any requests to open an image file. The method will respond with an array of pixels if the file is of a format it recognizes, or respond with null if it does not recognize the format. The framework then simply asks handlers in turn until one provides a non-null response. With this framework in place we are now free to slide in more image loaders with the addition of just one more handler class. 
And furthermore, on the source end of the call, we can add other classes to not just view the images, but print them, edit them or manipulate them in any other way we choose. In conclusion, we can see that decoupling can be achieved and yield flexibility, but this does not mean it is appropriate for every call from one object to another. The best thing to do is start with straight method calls, but keep cohesion in mind. Then if at a later stage it becomes necessary to swap in and out different objects it won't be too hard to extract an interface and put in place a decoupling mechanism.
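As a rough sketch of way 5's image-viewer example, the viewer walks a chain of registered loaders and takes the first non-null answer. All class and method names below are invented for illustration; they are not from the article.

import java.util.ArrayList;
import java.util.List;

interface ImageLoader {
    int[] load(String fileName);   // pixels, or null if the format is not recognised
}

class GifLoader implements ImageLoader {
    public int[] load(String fileName) {
        return fileName.endsWith(".gif") ? decodeGif(fileName) : null;
    }
    private int[] decodeGif(String fileName) { return new int[0]; } // decoding omitted in this sketch
}

class ImageViewer {
    private final List<ImageLoader> handlers = new ArrayList<>();

    void register(ImageLoader loader) { handlers.add(loader); }

    int[] open(String fileName) {
        for (ImageLoader loader : handlers) {
            int[] pixels = loader.load(fileName);   // ask each handler in turn
            if (pixels != null) {
                return pixels;
            }
        }
        return null;   // object A must be ready for "nobody could handle this"
    }

    public static void main(String[] args) {
        ImageViewer viewer = new ImageViewer();
        viewer.register(new GifLoader());
        int[] pixels = viewer.open("photo.gif");
        System.out.println(pixels != null ? "handled" : "no handler recognised the format");
    }
}

Adding PNG or JPG support then means registering one more loader class; the viewer itself never changes.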
March 22, 2013
by Paul Wells
· 42,824 Views · 2 Likes
OpenJPA: Memory Leak Case Study
This article will provide the complete root cause analysis details and resolution of a Java heap memory leak (Apache OpenJPA leak) affecting an Oracle Weblogic server 10.0 production environment. This post will also demonstrate the importance to follow the Java Persistence API best practices when managing the javax.persistence.EntityManagerFactory lifecycle. Environment specifications Java EE server: Oracle Weblogic Portal 10.0 OS: Solaris 10 JDK: Oracle/Sun HotSpot JVM 1.5 32-bit @2 GB capacity Java Persistence API: Apache OpenJPA 1.0.x (JPA 1.0 specifications) RDBMS: Oracle 10g Platform type: Web Portal Troubleshooting tools Quest Foglight for Java (Java heap monitoring) MAT (Java heap dump analysis) Problem description & observations The problem was initially reported by our Weblogic production support team following production outages. An initial root cause analysis exercise did reveal the following facts and observations: Production outages were observed on regular basis after ~2 weeks of traffic. The failures were due to Java heap (OldGen) depletion e.g. OutOfMemoryError: Java heap space error found in the Weblogic logs. A Java heap memory leak was confirmed after reviewing the Java heap OldGen space utilization over time from Foglight monitoring tool along with the Java verbose GC historical data. Following the discovery of the above problems, the decision was taken to move to the next phase of the RCA and perform a JVM heap dump analysis of the affected Weblogic (JVM) instances. JVM heap dump analysis ** A video explaining the following JVM Heap Dump analysis is now available here. In order to generate a JVM heap dump, the supported team did use the HotSpot 1.5 jmap utility which generated a heap dump file (heap.bin) of about ~1.5 GB. The heap dump file was then analyzed using the Eclipse Memory Analyzer Tool. Now let’s review the heap dump analysis so we can understand the source of the OldGen memory leak. MAT provides an initial Leak Suspects report which can be very useful to highlight your high memory contributors. For our problem case, MAT was able to identify a leak suspect contributing to almost 600 MB or 40% of the total OldGen space capacity. At this point we found one instance of java.util.LinkedList using almost 600 MB of memory and loaded to one of our application parent class loader (@ 0x7e12b708). The next step was to understand the leaking objects along with the source of retention. MAT allows you to inspect any class loader instance of your application, providing you with capabilities to inspect the loaded classes & instances. Simply search for the desired object by providing the address e.g. 0x7e12b708 and then inspect the loaded classes & instances by selecting List Objects > with outgoing references. As you can see from the above snapshot, the analysis was quite revealing. What we found was one instance of org.apache.openjpa.enhance.PCRegistry at the source of the memory retention; more precisely the culprit was the _listeners field implemented as a LinkedList. For your reference, the Apache OpenJPA PCRegistry is used internally to track the registered persistence-capable classes. Find below a snippet of the PCRegistry source code from Apache OpenJPA version 1.0.4 exposing the _listeners field. /** * Tracks registered persistence-capable classes. 
* * @since 0.4.0 * @author Abe White */ public class PCRegistry { // DO NOT ADD ADDITIONAL DEPENDENCIES TO THIS CLASS private static final Localizer _loc = Localizer.forPackage (PCRegistry.class); // map of pc classes to meta structs; weak so the VM can GC classes private static final Map _metas = new ConcurrentReferenceHashMap (ReferenceMap.WEAK, ReferenceMap.HARD); // register class listeners private static final Collection _listeners = new LinkedList(); ... Now the question is: why is the memory footprint of this internal data structure so big and potentially leaking over time? The next step was to deep dive into the _listeners LinkedList instance in order to review the leaking objects. We finally found that the leaking objects were actually the JDBC & SQL mapping definitions (metadata) used by our application in order to execute various queries against our Oracle database. A review of the JPA specifications, OpenJPA documentation and source did confirm that the root cause was associated with a wrong usage of the javax.persistence.EntityManagerFactory, such as the lack of closure of a newly created EntityManagerFactory instance. If you look closely at the above code snapshot, you will realize that the close() method is indeed responsible for cleaning up any recently used metadata repository instance. It did also raise another concern: why are we creating such factory instances over and over? The next step of the investigation was to perform a code walkthrough of our application code, especially around the life cycle management of the JPA EntityManagerFactory and EntityManager objects. Root cause and solution A code walkthrough of the application code did reveal that the application was creating a new instance of EntityManagerFactory on each single request and not closing it properly. public class Application { @Resource private UserTransaction utx = null; // Initialized on each application request and not closed! @PersistenceUnit(unitName = "UnitName") private EntityManagerFactory emf = Persistence.createEntityManagerFactory("PersistenceUnit"); public EntityManager getEntityManager() { return this.emf.createEntityManager(); } public void businessMethod() { // Create a new EntityManager instance from the newly created EntityManagerFactory instance // Do something... // Close the EntityManager instance } } This code defect and improper use of the JPA EntityManagerFactory was causing a leak, or accumulation, of metadata repository instances within the OpenJPA _listeners data structure, as demonstrated by the earlier JVM heap dump analysis. The solution was to centralize the management & life cycle of the thread-safe javax.persistence.EntityManagerFactory via the Singleton pattern. The final solution was implemented as per below: create and maintain only one static instance of javax.persistence.EntityManagerFactory per application class loader, implemented via the Singleton pattern, and create and dispose of new instances of EntityManager for each application request. Please review this discussion from Stackoverflow as the solution we implemented is quite similar. Following the implementation of the solution in our production environment, no more Java heap OldGen memory leaks have been observed. Please feel free to provide your comments and share your experience on the same.
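A minimal sketch of the described fix, assuming the persistence unit name from the snippet above: one application-wide EntityManagerFactory created when the holder class is first used, with short-lived EntityManagers created per request and closed by their callers.

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

// One factory per application class loader, never created per request.
public final class EntityManagerFactoryHolder {

    private static final EntityManagerFactory EMF =
            Persistence.createEntityManagerFactory("PersistenceUnit");

    private EntityManagerFactoryHolder() { }

    public static EntityManager createEntityManager() {
        // EntityManagers stay cheap and request-scoped; callers close them in a finally block.
        return EMF.createEntityManager();
    }

    public static void shutdown() {
        EMF.close();   // releases the OpenJPA metadata the heap dump showed accumulating
    }
}

Callers obtain an EntityManager from the holder, use it for the duration of the request and close it in a finally block; only shutdown(), typically on application undeploy, closes the factory itself.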
March 21, 2013
by Pierre - Hugues Charbonneau
· 8,961 Views
Using Lambda Expression to Sort a List in Java 8 using NetBeans Lambda Support
As part of JSR 335, lambda expressions are being introduced to the Java language from Java 8 onwards; this is a major change to the language.
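The listing above is only a teaser, but the idea in the title is easy to sketch: a Comparator supplied as a lambda instead of an anonymous inner class. This is my own minimal example, not the article's NetBeans walkthrough.

import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class LambdaSort {
    public static void main(String[] args) {
        List<String> players = Arrays.asList("Virat", "Rahul", "Arun");

        // Pre-Java 8: implement Comparator as an anonymous inner class
        Collections.sort(players, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return a.compareTo(b);
            }
        });

        // Java 8: the comparator as a lambda expression
        Collections.sort(players, (a, b) -> a.compareTo(b));

        System.out.println(players);
    }
}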
March 21, 2013
by Mohamed Sanaulla
· 345,707 Views · 6 Likes
Use Eclipse JDT to dynamically create, access, and load projects
in this article, we are going to use eclipse jdt to create, access and load projects. i assume that you know how to create a simple eclipse plug-in project which adds a menu item that you can click and trigger some actions. if you don’t know, you can go to this article . the reason why we need a plug-in project is that java model only work inside of a plug-in, a standalone application will not support java model. this article only focus on jdt java model. the following three topics will be explored: create projects in workspace access projects in workspace dynamically import existing projects into workspace those are essentially important when you want to process a large number of java projects. 1. create projects we can use java model to create a new project in the work space. the following here requires the following dependencies: import org.eclipse.core.resources.ifolder; import org.eclipse.core.resources.iproject; import org.eclipse.core.resources.iprojectdescription; import org.eclipse.core.resources.iworkspaceroot; import org.eclipse.core.resources.resourcesplugin; import org.eclipse.core.runtime.coreexception; import org.eclipse.jdt.core.iclasspathentry; import org.eclipse.jdt.core.icompilationunit; import org.eclipse.jdt.core.ijavaproject; import org.eclipse.jdt.core.ipackagefragment; import org.eclipse.jdt.core.ipackagefragmentroot; import org.eclipse.jdt.core.itype; import org.eclipse.jdt.core.javacore; import org.eclipse.jdt.core.javamodelexception; import org.eclipse.jdt.launching.javaruntime; add code to run method. the code ignores the try/catch statements, eclipse will ask you to add exception handling code. // create a project with name "testjdt" iworkspaceroot root = resourcesplugin.getworkspace().getroot(); iproject project = root.getproject("testjdt"); project.create(null); project.open(null); //set the java nature iprojectdescription description = project.getdescription(); description.setnatureids(new string[] { javacore.nature_id }); //create the project project.setdescription(description, null); ijavaproject javaproject = javacore.create(project); //set the build path iclasspathentry[] buildpath = { javacore.newsourceentry(project.getfullpath().append("src")), javaruntime.getdefaultjrecontainerentry() }; javaproject.setrawclasspath(buildpath, project.getfullpath().append( "bin"), null); //create folder by using resources package ifolder folder = project.getfolder("src"); folder.create(true, true, null); //add folder to java element ipackagefragmentroot srcfolder = javaproject .getpackagefragmentroot(folder); //create package fragment ipackagefragment fragment = srcfolder.createpackagefragment( "com.programcreek", true, null); //init code string and create compilation unit string str = "package com.programcreek;" + "\n" + "public class test {" + "\n" + " private string name;" + "\n" + "}"; icompilationunit cu = fragment.createcompilationunit("test.java", str, false, null); //create a field itype type = cu.gettype("test"); type.createfield("private string age;", null, true, null); when you trigger the action, the following project will be created. 2. access projects if there are already projects in our work space, we can use java model to loop through each of them. 
public void run(IAction action) {
    // get the root of the workspace
    IWorkspace workspace = ResourcesPlugin.getWorkspace();
    IWorkspaceRoot root = workspace.getRoot();
    // get all projects in the workspace
    IProject[] projects = root.getProjects();
    // loop over all projects
    for (IProject project : projects) {
        System.out.println(project.getName());
    }
}

If we import or create some projects and then click the menu item we created, the project names will show up as follows.

3. Dynamically load/import existing projects into the workspace

In the previous step, we had to import existing projects into the workspace manually. If the number of projects is large, that is not practical. Eclipse JDT provides functions to do this dynamically, so let's see how to import a large number of existing projects into the workspace. This does not copy files to the workspace root directory; it only points to the projects in their external directory. In the example, I use a flash drive to hold my open-source projects. In this way, you can parse thousands of projects and get the information you need without copying anything.

IWorkspaceRoot root = ResourcesPlugin.getWorkspace().getRoot();
final IWorkspace workspace = ResourcesPlugin.getWorkspace();
System.out.println("root" + root.getLocation().toOSString());

Runnable runnable = new Runnable() {
    public void run() {
        try {
            IPath projectDotProjectFile = new Path("/media/flashx/testprojectimport" + "/.project");
            IProjectDescription projectDescription =
                    workspace.loadProjectDescription(projectDotProjectFile);
            IProject project = workspace.getRoot().getProject(projectDescription.getName());
            JavaCapabilityConfigurationPage.createProject(project,
                    projectDescription.getLocationURI(), null);
            //project.create(null);
        } catch (CoreException e) {
            e.printStackTrace();
        }
    }
};

// and now get the workbench to do the work
final IWorkbench workbench = PlatformUI.getWorkbench();
workbench.getDisplay().syncExec(runnable);

IProject[] projects = root.getProjects();
for (IProject project : projects) {
    System.out.println(project.getName());
}

What if the project we want to load does not contain a .project file? That is the complicated case: we need to create those projects dynamically from their source code; a minimal sketch of one way to do this follows the notes below.

Notes

When you run the examples above, you may get an error message like "The type org.eclipse.core.runtime.IAdaptable cannot be resolved. It is indirectly referenced from required .class files". The solution is to add org.eclipse.core.runtime as a dependency through the plug-in manifest editor; simply adding it to the build path will not work.

If you think this article is useful and want to read more, you can go to the Eclipse JDT tutorial series I wrote.
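Coming back to the case of a project without a .project file: the article leaves it open, so here is a minimal sketch of one way it could be approached (my own illustration, not code from the article; the path, project name and nature setup are assumptions):

// Hypothetical sketch only: import a source tree that has no .project file.
// Requires java.io.File in addition to the imports listed earlier.
IWorkspace workspace = ResourcesPlugin.getWorkspace();
IProjectDescription description = workspace.newProjectDescription("ImportedProject");

// point the workspace at the external directory instead of copying it
description.setLocationURI(new File("/media/flashx/some-source-tree").toURI());
description.setNatureIds(new String[] { JavaCore.NATURE_ID });

IProject project = workspace.getRoot().getProject(description.getName());
project.create(description, null);
project.open(null);

// from here on, the source folder and build path can be configured
// exactly as in the "create projects" example above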
March 21, 2013
by Ryan Wang
· 8,039 Views
Entity Framework 5/6 vs NHibernate 3 – The State of Affairs
It has been almost two years since I last compared NHibernate and Entity Framework, so with the recent alpha version of EF 6 it's about time to look at the current state of affairs. I've been using NHibernate for more than six years, so obviously I'm a bit biased. But I can't ignore that EF's feature list is growing and that some of the things I like about the NHibernate ecosystem, such as code-based mappings and automatic migrations, have found a place in EF. Moreover, EF is now open source, so they're accepting pull requests as well. Rather than doing a typical feature-by-feature comparison, I'll be looking at those aspects of an object-relational mapper that I think are important when building large-scale enterprise applications. So let's see how the frameworks match up. Just for your information, I've been looking at Entity Framework 6 Alpha 3 and NHibernate 3.3.1GA.

Support for rich domain models

When you're practicing Domain-Driven Design it is crucial to be able to model your domain using the right object-oriented principles. For example, you should be able to encapsulate data and only expose properties if that is needed by the functional requirements. If you model an association using a UML qualifier, you should be able to implement that using an IDictionary. Similarly, collection properties should be based on IEnumerable or any of the newer read-only collections introduced in .NET 4.5 so that your collections are protected from external changes. NHibernate supports all these requirements and adds quite a lot of flexibility, such as ordered and unordered sets. Unfortunately, neither EF5 nor EF6 supports mapping private fields (yet), nor can you directly use a dictionary class. In fact, EF only supports ICollections of entities, so collections of value objects are out of the question. One notable type that still isn't fully supported is the enum. It was introduced in EF5, but only if you target .NET 4.5; EF6 fortunately fixes this so that it is also available in .NET 4.0 applications. A good ORM should also allow your domain model to be as persistence-ignorant as possible. In other words, you shouldn't need to decorate your classes with attributes or subclass some framework-provided base class (something you might remember from Linq2Sql). Both frameworks impose some limitations, such as protected default constructors or virtual members, but that's not going to be too much of an issue.

Vendor support

Although Microsoft would have us believe that corporate clients only use SQL Server or SQL Azure, we all know that the opposite is closer to the truth. The big drawback of EF compared to NH is that the latter has all the providers built in, so whenever a new version of the framework is released you don't have to worry about vendor support. Both EF5 and NH 3.3 support various flavors of SQL Server/Azure, SQLite, PostgreSQL, Oracle, Sybase, Firebird and DB2. Most of these providers originate from EF 4, so they don't support code-first (migrations) or the new DbContext façade. EF6 is still an alpha release and its provider model seems to contain some breaking changes, so don't expect support for anything other than Microsoft's own databases anytime soon.

Support for switching databases for automated testing purposes

Our architecture uses a repository pattern implementation that allows swapping the actual data mapper on the fly. Since we're heavily practicing Test-Driven Development, we use this opportunity to approach our testing in different ways.
  • We use an in-memory Dictionary for unit tests where the subject under test simply needs some data to be set up in a specific way (using Test Data Builders).
  • We use an in-memory SQLite database when we want to verify that NHibernate can process a LINQ query correctly and performs well enough, which we check with NHProf.
  • We use an actual SQL Server for unit tests that verify that our mapping against the database schema is correct.
  • We have some integration code that interacts with a third-party Oracle system; it is tested against SQL Server on a local development box, but uses Oracle on our automated SpecFlow build.

So you can imagine that switching between database providers without changing the mapping code is quite essential for us. During development, we decided that we did not care about the actual classes that represented the integration tables, so we tried to use the Entity Framework model-first approach. Unfortunately, when you do that, you're basically locking yourself to a particular database. After switching back to our normal NHibernate approach, changing the connection string during deployment was enough to switch between SQL Server and Oracle. Fortunately this has also been possible since EF 4.1, and Jason Short wrote a good blog post about that.

Automatic schema migration

When you're practicing an agile methodology such as Scrum, you'll probably try to deliver a potentially shippable release at the end of every sprint. Part of being agile is that functionality can be added at any time, and some of it might affect the database schema. The most traditional way of dealing with that is to generate or hand-write SQL scripts that are applied during deployment. The problem with SQL scripts is that they are tedious to write, might contain bugs, and are often closely coupled to the database vendor. Wouldn't it be great if the ORM framework supported some way of figuring out which version of the schema is in use and automatically upgraded the database schema as part of your normal development cycle? Or what about the ability to revert the schema to an older version? The good news is that this exists for both frameworks, but with a caveat. NHibernate doesn't support this out of the box (although you can generate the initial schema), but with the help of another open-source project, Fluent Migrations, you can get very far. We currently use it in an enterprise system and it works like a charm. The caveat is that support for the various databases is not always at the same level. For instance, SQLite doesn't allow renaming a column and Fluent Migrations doesn't support it (although theoretically it could create a new column, copy the old data over, and drop the old column). As an example of a fluent migration supporting both an update and a rollback, check out this snippet.
[Migration(1)]
public class TestCreateAndDropTableMigration : Migration
{
    public override void Up()
    {
        Create.Table("TestTable")
            .WithColumn("Id").AsInt32().NotNullable().PrimaryKey().Identity()
            .WithColumn("Name").AsString(255).NotNullable().WithDefaultValue("Anonymous");

        Create.Table("TestTable2")
            .WithColumn("Id").AsInt32().NotNullable().PrimaryKey().Identity()
            .WithColumn("Name").AsString(255).Nullable()
            .WithColumn("TestTableId").AsInt32().NotNullable();

        Create.Index("ix_Name").OnTable("TestTable2").OnColumn("Name").Ascending()
            .WithOptions().NonClustered();

        Create.Column("Name2").OnTable("TestTable2").AsBoolean().Nullable();

        Create.ForeignKey("fk_TestTable2_TestTableId_TestTable_Id")
            .FromTable("TestTable2").ForeignColumn("TestTableId")
            .ToTable("TestTable").PrimaryColumn("Id");

        Insert.IntoTable("TestTable").Row(new { Name = "Test" });
    }

    public override void Down()
    {
        Delete.Table("TestTable2");
        Delete.Table("TestTable");
    }
}

Entity Framework has had something similar built in since version 5. It's called Code-First Migrations and looks surprisingly similar to Fluent Migrations. Just like the NHibernate solution, EF's has its limitations, and the big one is vendor support: at the time of this writing not a single vendor supports Code-First Migrations. On the other hand, if you're only using SQL Server, SQL Express, SQL Compact or SQL Azure, there's nothing stopping you from using it.

Code-based mapping

If you remember the old days of NHibernate, you might recall those ugly XML files that were needed to configure the mapping of your .NET classes to the underlying database. Fluent NHibernate has been offering a very nice fluent API for replacing those mappings with code. Not only does this prevent errors in the XML, it is also a very refactor-friendly approach. We've been using it for years, and the extensive (and customizable) convention-based mapping engine even allows auto-mapping entities to tables without the need for explicit mapping code. Strangely enough, NHibernate 3.2 introduced a brand new fluent API that directly competes with Fluent NHibernate. Because of the lack of documentation, I've never bothered to look at it, especially since Fluent NHibernate has been doing its job remarkably well. But during my research for this post I noticed that Adam Bar has written a very extensive series on the new API, and he actually managed to raise renewed interest in it. Until Entity Framework 4.1 the only way to set up the mapping was through its data model designer (not to be confused with an OO designer). But apparently the team behind it learned from Fluent NHibernate and decided to introduce their own code-first approach, surprisingly named Code-First. In terms of convention-based mapping it was quite limited, especially compared to Fluent NHibernate. EF 6 is going to introduce a lot of hooks for changing the conventions, both on the property level and on the class level.

Supporting custom types and collections

One of the rules in my own coding guidelines is to consider wrapping primitive types in more domain-specific types. Coincidentally, it is also one of the rules of Object Calisthenics. In Domain-Driven Design these types are called Value Objects, and their purpose is to encapsulate all the data and behavior associated with a recurring domain concept.
For instance, rather than having two separate DateTime properties to represent a period and separate methods for determining whether some point in time occurs within that period, I would prefer to have a dedicated Period class that contains all that logic (a minimal sketch of such a type is shown after the "Other notable features" list below). This approach results in a design that contains less duplication and is easier to understand. In contrast to NHibernate, Entity Framework doesn't offer anything like this and, as far as I know, doesn't plan to. NH, on the other hand, offers a myriad of options for creating custom types, custom collections or even composite types. Granted, you have to do a bit of digging to find the right documentation (and StackOverflow is your friend here), but if you do, it really helps to enrich your domain model.

Query flexibility

Some would argue that EF's LINQ support is much more mature, and until NH 3.2 I would have agreed. But since then, NH's LINQ support has improved substantially. For instance, during development we use an in-memory SQLite database in our query-related unit tests to make sure a query can actually be executed by NH. Before 3.2, we regularly ran into strange cast exceptions or exceptions because of unsupported expressions. Since 3.2, we've never seen those anymore. I haven't tried to run all our existing queries against EF, but I have no doubt that it would handle them without any issues. In terms of non-LINQ querying, EF supports Entity SQL as well as native SQL (although I don't know if all vendors are supported). NHibernate offers the HQL, QueryOver and Criteria APIs alongside native vendor-specific SQL. Both frameworks support stored procedures. All in all, plenty of flexibility.

Extensibility

EF 6 uses the service locator pattern to allow replacing certain aspects of the framework at runtime. This is a good starting point for extensibility, but unfortunately the team always demonstrates this by replacing the pluralization service. As if someone would actually like to do that. Nonetheless, I'm sure the team's plan is to expose more extension points in the near future. NH has a very extensive set of observable events, called listeners, that can be used to hook into virtually every part of the framework. We've been using them for cross-cutting concerns, for hooking up auditing services, and also for some CQRS-related aspects. You can also tweak a lot of NH's behavior through configuration properties (although you'll have to Google…eh…Bing for the right examples).

Other notable features

Each of the frameworks has some unique features that don't fit any of the topics I've discussed up to now. A short summary:

  • Entity Framework 6 adds async/await support, a feature that NHibernate may never get due to the impact it has on the entire architecture. It also has built-in support for automatically reconnecting to the database, which is particularly useful for a known issue with SQL Azure.
  • Both NHibernate and Entity Framework support .NET 4.5, but only the latter gains significant performance improvements from it.
  • NHibernate offers a unique feature called Futures that you can use to compose a set of queries and send them to the database as a single request.
  • Entity Framework allows creating a DbContext with an existing open connection. As far as I know that's not possible in NHibernate.
  • Version 6 of Entity Framework adds spatial support, something for which you need a third-party library in NHibernate. Pedro Sousa wrote an in-depth blog post series about that.
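As a side note on the Period idea from the custom-types section above, here is a minimal sketch of such a Value Object. It is written in Java purely for illustration (my own example, not code from either framework or from the article):

import java.time.LocalDateTime;

// A small immutable Value Object that owns both boundary dates and the logic that uses them
public final class Period {
    private final LocalDateTime start;
    private final LocalDateTime end;

    public Period(LocalDateTime start, LocalDateTime end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end must not be before start");
        }
        this.start = start;
        this.end = end;
    }

    // The behavior that would otherwise live in separate helper methods
    public boolean contains(LocalDateTime pointInTime) {
        return !pointInTime.isBefore(start) && !pointInTime.isAfter(end);
    }
}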
Wrap-up

The big difference between Entity Framework and NHibernate from a developer perspective is that the former offers an integrated set of services whereas the latter requires the combination of several open-source libraries. That in itself is not a big issue - isn't that why we have NuGet? - but we've noticed that those libraries are not always updated soon enough when new NHibernate versions are released. From that same perspective NHibernate does offer a lot of flexibility and clearly shows its maturity. On the other hand, you could also see that as a potential barrier for new developers; it's just much easier to get started with Entity Framework than with NH. The documentation on Entity Framework is quite comprehensive, and even the new functionality for version 6 is extensively documented using feature specifications. The NHibernate documentation has always been lagging behind a bit. For instance, the new mapping system is not even mentioned, even though the reference documentation mentions the correct version. The information is available, but you just have to search a bit. The fact that EF is being developed in a Git source control repository is also a big plus; just look at the many pull requests they've been taking in. On the other hand, to my surprise somebody moved the NHibernate source code to GitHub while I wasn't paying attention, so in that respect they are equals. And does NHibernate have a future at all? Some would argue it is dead already. I don't agree, though. Just look at the statistics on GitHub: 240 forks, almost 200 pull requests and a lot of commits in the last few months. I do agree that NoSQL solutions like RavenDB are extremely powerful and offer a lot of fun and flexibility for development, but the fact of the matter is that they are still not widely accepted by enterprises with a history in SQL Server or Oracle. Nevertheless, the RAD aspect of EF cannot be ignored and is important for small, short-running projects where SQL Server is the norm. For those projects, I would wholeheartedly recommend EF. But for bigger systems where a NoSQL solution is not an option, especially those based on Domain-Driven Design, NHibernate is still the king of the hill. As usual, I might have overlooked a feature or misinterpreted some aspect of both frameworks. If so, leave a comment, email me or drop me a tweet at @ddoomen.
March 20, 2013
by Dennis Doomen
· 38,690 Views
Encode Email Addresses With PHP
Spam bots crawl your pages in exactly the same way search engines do, but while search engines crawl to index your content, spam bots crawl to find certain information. One of the most common bits of data they search for is email addresses, because they can store these addresses in a list and send out spam. Displaying your email address on your web page makes it very easy for visitors to save it and send you email, but you need to get around the spam-bot problem; to do this you can HTML-encode the address. The encoded address still displays correctly in the browser, but in the page source it is encoded so that spam bots cannot read it. Below is a PHP snippet that takes an email address and encodes it so that you can display it on your website. It uses the PHP function ord() to return the ASCII value of each character in the email address.

/**
 * Encode an email address to display on your website
 */
function encode_email_address( $email ) {
    $output = '';

    // Convert every character to its numeric HTML entity (&#NNN;)
    for ($i = 0; $i < strlen($email); $i++) {
        $output .= '&#' . ord($email[$i]) . ';';
    }

    return $output;
}

To use it on your page, pass in the email address and the function will return the encoded version, which you can then output on your website, for example as a mailto link:

$encodedEmail = encode_email_address( 'example@domain.com' );
printf('<a href="mailto:%s">%s</a>', $encodedEmail, $encodedEmail);
March 20, 2013
by Paul Underwood
· 9,853 Views