DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

The Latest Testing, Deployment, and Maintenance Topics

article thumbnail
XML unmarshalling benchmark in Java: JAXB vs STax vs Woodstox
towards the end of last week i started thinking how to deal with large amounts of xml data in a resource-friendly way.the main problem that i wanted to solve was how to process large xml files in chunks while at the same time providing upstream/downstream systems with some data to process. of course i've been using jaxb technology for few years now; the main advantage of using jaxb is the quick time-to-market; if one possesses an xml schema, there are tools out there to auto-generate the corresponding java domain model classes automatically (eclipse indigo, maven jaxb plugins in various sauces, ant tasks, to name a few). the jaxb api then offers a marshaller and an unmarshaller to write/read xml data, mapping the java domain model. when thinking of jaxb as solution for my problem i suddendlly realised that jaxb keeps the whole objectification of the xml schema in memory, so the obvious question was: "how would our infrastructure cope with large xml files (e.g. in my case with a number of elements > 100,000) if we were to use jaxb?". i could have simply produced a large xml file, then a client for it and find out about memory consumption. as one probably knows there are mainly two approaches to processing xml data in java: dom and sax. with dom, the xml document is represented into memory as a tree; dom is useful if one needs cherry-pick access to the tree nodes or if one needs to write brief xml documents. on the other side of the spectrum there is sax, an event-driven technology, where the whole document is parsed one xml element at the time, and for each xml significative event, callbacks are "pushed" to a java client which then deals with them (such as start_document, start_element, end_element, etc). since sax does not bring the whole document into memory but it applies a cursor like approach to xml processing it does not consume huge amounts of memory. the drawback with sax is that it processes the whole document start to finish; this might not be necessarily what one wants for large xml documents. in my scenario, for instance, i'd like to be able to pass to downstream systems xml elements as they are available, but at the same time maybe i'd like to pass only 100 elements at the time, implementing some sort of pagination solution. dom seems too demanding from a memory-consumption point of view, whereas sax seems to coarse-grained for my needs. i remembered reading something about stax, a java technology which offered a middle ground between the capability to pull xml elements (as opposed to pushing xml elements, e.g. sax) while being ram-friendly. i then looked into the technology and decided that stax was probably the compromise i was looking for; however i wanted to keep the easy programming model offered by jaxb, so i really needed a combination of the two. while investigating stax, i came across woodstox; this open source project promises to be a faster xml parser than many othrers, so i decided to include it in my benchmark as well. i now had all elements to create a benchmark to give me memory consumption and processing speed metrics when processing large xml documents. the benchmark plan in order to create a benchmark i needed to do the following: create an xml schema which defined my domain model. this would be the input for jaxb to create the java domain model create three large xml files representing the model, with 10,000 / 100,000 / 1,000,000 elements respectively have a pure jaxb client which would unmarshall the large xml files completely in memory have a stax/jaxb client which would combine the low-memory consumption of sax technologies with the ease of programming model offered by jaxb have a woodstox/jaxb client with the same characteristics of the stax/jaxb client (in few words i just wanted to change the underlying parser and see if i could obtain any performance boost) record both memory consumption and speed of processing (e.g. how quickly would each solution make xml chunks available in memory as jaxb domain model classes) make the results available graphically, since, as we know, one picture tells one thousands words. the domain model xml schema i decided for a relatively easy domain model, with xml elements representing people, with their names and addresses. i also wanted to record whether a person was active. using jaxb to create the java model i am a fan of maven and use it as my default tool to build systems. this is the pom i defined for this little benchmark: 4.0.0 uk.co.jemos.tests.xml large-xml-parser 1.0.0-snapshot jar large-xml-parser http://www.jemos.co.uk utf-8 org.apache.maven.plugins maven-compiler-plugin 2.3.2 1.6 1.6 org.jvnet.jaxb2.maven2 maven-jaxb2-plugin 0.7.5 generate ${basedir}/src/main/resources **/*.xsd true -enableintrospection -xtostring -xequals -xhashcode true true org.jvnet.jaxb2_commons jaxb2-basics 0.6.1 org.apache.maven.plugins maven-jar-plugin 2.3.1 true uk.co.jemos.tests.xml.xmlpullbenchmarker org.apache.maven.plugins maven-assembly-plugin 2.2 ${project.build.directory}/site/downloads src/main/assembly/project.xml src/main/assembly/bin.xml junit junit 4.5 test uk.co.jemos.podam podam 2.3.11.release commons-io commons-io 2.0.1 com.sun.xml.bind jaxb-impl 2.1.3 org.jvnet.jaxb2_commons jaxb2-basics-runtime 0.6.0 org.codehaus.woodstox stax2-api 3.0.3 just few things to notice about this pom.xml. i use java 6, since starting from version 6, java contains all the xml libraries for jaxb, dom, sax and stax. to auto-generate the domain model classes from the xsd schema, i used the excellent maven-jaxb2-plugin, which allows, amongst other things, to obtain pojos with tostring, equals and hashcode support. i have also declared the jar plugin, to create an executable jar for the benchmark and the assembly plugin to distribute an executable version of the benchmark. the code for the benchmark is attached to this post, so if you want to build it and run it yourself, just unzip the project file, open a command line and run: $ mvn clean install assembly:assembly this command will place *-bin.* files into the folder target/site/downloads. unzip the one of your preference and to run the benchmark use (-dcreate.xml=true will generate the xml files. don't pass it if you have these files already, e.g. after the first run): $ java -jar -dcreate.xml=true large-xml-parser-1.0.0-snapshot.jar creating the test data to create the test data, i used podam , a java tool to auto-fill pojos and javabeans with data. the code is as simple as: jaxbcontext context = jaxbcontext .newinstance("xml.integration.jemos.co.uk.large_file"); marshaller marshaller = context.createmarshaller(); marshaller.setproperty(marshaller.jaxb_formatted_output, boolean.true); marshaller.setproperty(marshaller.jaxb_encoding, "utf-8"); personstype personstype = new objectfactory().createpersonstype(); list persons = personstype.getperson(); podamfactory factory = new podamfactoryimpl(); for (int i = 0; i < nbrelements; i++) { persons.add(factory.manufacturepojo(persontype.class)); } jaxbelement towrite = new objectfactory() .createpersons(personstype); file file = new file(filename); bufferedoutputstream bos = new bufferedoutputstream( new fileoutputstream(file), 4096); try { marshaller.marshal(towrite, bos); bos.flush(); } finally { ioutils.closequietly(bos); } the xmlpullbenchmarker generates three large xml files under ~/xml-benchmark: large-person-10000.xml (approx 3m) large-person-100000.xml (approx 30m) large-person-1000000.xml (approx 300m) each file looks like the following: ult6yn0d7l u8djoutlk2 dxwlpow6x3 o4ggvximo7 io7kuz0xmz lmiy1uqkxs zhtukbtwti gbc7kex9tn kxmwnlprep 9bibs1m5gr hmtqpxjcpw bhpf1rrldm ydjjillyrw xgstdjcfjc [..etc] each file contains 10,000 / 100,000 / 1,000,000 elements. the running environments i tried the benchmarker on three different environments: ubuntu 10, 64-bit running as virtual machine on a windows 7 ultimate, with cpu i5, 750 @2.67ghz and 2.66ghz, 8gb ram of which 4gb dedicated to the vm. jvm: 1.6.0_25, hotspot windows 7 ultimate , hosting the above vm, therefore with same processor. jvm, 1.6.0_24, hotspot ubuntu 10, 32-bit , 3gb ram, dual core. jvm, 1.6.0_24, openjdk the xml unmarshalling to unmarshall the code i used three different strategies: pure jaxb stax + jaxb woodstox + jaxb pure jaxb unmarshalling the code which i used to unmarshall the large xml files using jaxb follows: private void readlargefilewithjaxb(file file, int nbrrecords) throws exception { jaxbcontext ucontext = jaxbcontext .newinstance("xml.integration.jemos.co.uk.large_file"); unmarshaller unmarshaller = ucontext.createunmarshaller(); bufferedinputstream bis = new bufferedinputstream(new fileinputstream( file)); long start = system.currenttimemillis(); long memstart = runtime.getruntime().freememory(); long memend = 0l; try { jaxbelement root = (jaxbelement) unmarshaller .unmarshal(bis); root.getvalue().getperson().size(); memend = runtime.getruntime().freememory(); long end = system.currenttimemillis(); log.info("jaxb (" + nbrrecords + "): - total memory used: " + (memstart - memend)); log.info("jaxb (" + nbrrecords + "): time taken in ms: " + (end - start)); } finally { ioutils.closequietly(bis); } } the code uses a one-liner to unmarshall each xml file: jaxbelement root = (jaxbelement) unmarshaller .unmarshal(bis); i also accessed the size of the underlying persontype collection to "touch" in memory data. btw, debugging the application showed that all 10,000 elements were indeed available in memory after this line of code. jaxb + stax with stax, i just had to use an xmlstreamreader, iterate through all elements, and pass each in turn to jaxb to unmarshall it into a persontype domain model object. the code follows: // set up a stax reader xmlinputfactory xmlif = xmlinputfactory.newinstance(); xmlstreamreader xmlr = xmlif .createxmlstreamreader(new filereader(file)); jaxbcontext ucontext = jaxbcontext.newinstance(persontype.class); unmarshaller unmarshaller = ucontext.createunmarshaller(); long start = system.currenttimemillis(); long memstart = runtime.getruntime().freememory(); long memend = 0l; try { xmlr.nexttag(); xmlr.require(xmlstreamconstants.start_element, null, "persons"); xmlr.nexttag(); while (xmlr.geteventtype() == xmlstreamconstants.start_element) { jaxbelement pt = unmarshaller.unmarshal(xmlr, persontype.class); if (xmlr.geteventtype() == xmlstreamconstants.characters) { xmlr.next(); } } memend = runtime.getruntime().freememory(); long end = system.currenttimemillis(); log.info("stax - (" + nbrrecords + "): - total memory used: " + (memstart - memend)); log.info("stax - (" + nbrrecords + "): time taken in ms: " + (end - start)); } finally { xmlr.close(); } } note that this time when creating the context, i had to specify that it was for the persontype object, and when invoking the jaxb unmarshalling i had to pass also the desired returned class type, with: jaxbelement pt = unmarshaller.unmarshal(xmlr, persontype.class); note that i don't to anything with the object, just create it, to keep the benchmark as truthful and possible by not introducing any unnecessary steps. jaxb + woodstox with woodstox, the approach is very similar to the one used with stax. in fact woodstox provides a stax2 compatible api, so all i had to do was to provide the correct factory and...bang! i had woodstox under the cover working. private void readlargexmlwithfasterstax(file file, int nbrrecords) throws factoryconfigurationerror, xmlstreamexception, filenotfoundexception, jaxbexception { // set up a woodstox reader xmlinputfactory xmlif = xmlinputfactory2.newinstance(); xmlstreamreader xmlr = xmlif .createxmlstreamreader(new filereader(file)); jaxbcontext ucontext = jaxbcontext.newinstance(persontype.class); unmarshaller unmarshaller = ucontext.createunmarshaller(); long start = system.currenttimemillis(); long memstart = runtime.getruntime().freememory(); long memend = 0l; try { xmlr.nexttag(); xmlr.require(xmlstreamconstants.start_element, null, "persons"); xmlr.nexttag(); while (xmlr.geteventtype() == xmlstreamconstants.start_element) { jaxbelement pt = unmarshaller.unmarshal(xmlr, persontype.class); if (xmlr.geteventtype() == xmlstreamconstants.characters) { xmlr.next(); } } memend = runtime.getruntime().freememory(); long end = system.currenttimemillis(); log.info("woodstox - (" + nbrrecords + "): total memory used: " + (memstart - memend)); log.info("woodstox - (" + nbrrecords + "): time taken in ms: " + (end - start)); } finally { xmlr.close(); } } note the following line: xmlinputfactory xmlif = xmlinputfactory2.newinstance(); where i pass in a stax2 xmlinputfactory. this uses the woodstox implementation. the main loop once the files are in place (you obtain this by passing -dcreate.xml=true), the main performs the following: system.gc(); system.gc(); for (int i = 0; i < 10; i++) { main.readlargefilewithjaxb(new file(output_folder + file.separatorchar + "large-person-10000.xml"), 10000); main.readlargefilewithjaxb(new file(output_folder + file.separatorchar + "large-person-100000.xml"), 100000); main.readlargefilewithjaxb(new file(output_folder + file.separatorchar + "large-person-1000000.xml"), 1000000); main.readlargexmlwithstax(new file(output_folder + file.separatorchar + "large-person-10000.xml"), 10000); main.readlargexmlwithstax(new file(output_folder + file.separatorchar + "large-person-100000.xml"), 100000); main.readlargexmlwithstax(new file(output_folder + file.separatorchar + "large-person-1000000.xml"), 1000000); main.readlargexmlwithfasterstax(new file(output_folder + file.separatorchar + "large-person-10000.xml"), 10000); main.readlargexmlwithfasterstax(new file(output_folder + file.separatorchar + "large-person-100000.xml"), 100000); main.readlargexmlwithfasterstax(new file(output_folder + file.separatorchar + "large-person-1000000.xml"), 1000000); } it invites the gc to run, although as we know this is at the gc thread discretion. it then executes each strategy 10 times, to normalise ram and cpu consumption. the final data are then collected by running an average on the ten runs. the benchmark results for memory consumption here follow some diagrams which show memory consumption across the different running environments, when unmarshalling 10,000 / 100,000 / 1,000,000 files. you will probably notice that memory consumption for stax-related strategies often shows a negative value. this means that there was more free memory after unmarshalling all elements than there was at the beginning of the unmarshalling loop; this, in turn, suggests that the gc ran a lot more with stax than with jaxb. this is logical if one thinks about it; since with stax we don't keep all objects into memory there are more objects available for garbage collection. in this particular case i believe the persontype object created in the while loop gets eligible for gc and enters the young generation area and then it gets reclamed by the gc. this, however, should have a minimum impact on performance, since we know that claiming objects from the young generation space is done very efficiently. summary for 10,000 xml elements summary for 100,000 xml elements summary for 1,000,000 xml elements the benchmark results for processing speed results for 10,000 elements results for 100,000 elements results for 1,000,000 elements conclusions the results on all three different environments, although with some differences, all tell us the same story: if you are looking for performance (e.g. xml unmarshalling speed), choose jaxb if you are looking for low-memory usage (and are ready to sacrifice some performance speed), then use stax. my personal opinion is also that i wouldn't go for woodstox, but i'd choose either jaxb (if i needed processing power and could afford the ram) or stax (if i didn't need top speed and was low on infrastructure resources). both these technologies are java standards and part of the jdk starting from java 6. resources benchmarker source code zip version: download large-xml-parser-1.0.0-snapshot-project tar.gz version: download large-xml-parser-1.0.0-snapshot-project.tar tar.bz2 version: download large-xml-parser-1.0.0-snapshot-project.tar benchmarker executables: zip version: download large-xml-parser-1.0.0-snapshot-bin tar.gz version: download large-xml-parser-1.0.0-snapshot-bin.tar tar.bz2 version: download large-xml-parser-1.0.0-snapshot-bin.tar data files: ubuntu 64-bit vm running environment: download stax-vs-jaxb-ubuntu-64-vm ubuntu 32-bit running environment : download stax-vs-jaxb-ubuntu-32-bit windows 7 ultimate running environment : download stax-vs-jaxb-windows7 from http://tedone.typepad.com/blog/2011/06/unmarshalling-benchmark-in-java-jaxb-vs-stax-vs-woodstox.html
June 27, 2011
by Marco Tedone
· 71,367 Views · 9 Likes
article thumbnail
Java EE6 CDI, Named Components and Qualifiers
One of the biggest promises java EE6 made, was to ease the use of dependency injection. They did, using CDI. CDI, which stands for Contexts and Dependency Injection for Java EE, offers a base set to apply dependency injection in your enterprise application. Before CDI, EJB 3 also introduced dependency injection, but this was a bit basic. You could inject an EJB (statefull or stateless) into another EJB or Servlet (if you container supported this). Offcourse not every application needs EJB’s, that is why CDI is gaining so much popularity. To start, I have made this example. There is a Payment interface, and 2 implementations. A cash payment and a visa payment. I want to be able to choose witch type of payment I inject, still using the same interface. public interface Payment { void pay(BigDecimal amount); } and the 2 implementations public class CashPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(CashPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} cash", amount.toString()); } } public class VisaPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(VisaPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} with visa", amount.toString()); } } To inject the interface we use the @Inject annotation. The annotation does basically what it says. It injects a component, that is available in your application. 1 @Inject private Payment payment; Off course, you saw this coming from a mile away, this won’t work. The container has 2 implementations of our Payment interface, so he does not know which one to inject. Unsatisfied dependencies for type [Payment] with qualifiers [@Default] at injection point [[field] @Inject private be.styledideas.blog.qualifier.web.PaymentBackingAction.payment] So we need some sort of qualifier to point out what implementation we want. CDI offers the @Named Annotation, allowing you to give a name to an implementation. @Named("cash") public class CashPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(CashPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} cash", amount.toString()); } } @Named("visa") public class VisaPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(VisaPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} with visa", amount.toString()); } } When we now change our injection code, we can specify wich implementation we need. @Inject private @Named("visa") Payment payment; This works, but the flexibility is limited. When we want to rename our @Named parameter, we have to change it on everyplace where it is used. There is also no refactoring support. There is a beter alternative using Custom made annotations using the @Qualifier annotation. Let us change the code a little bit. First of all, we create new Annotation types. @java.lang.annotation.Documented @java.lang.annotation.Retention(RetentionPolicy.RUNTIME) @javax.inject.Qualifier public @interface CashPayment {} @java.lang.annotation.Documented @java.lang.annotation.Retention(RetentionPolicy.RUNTIME) @javax.inject.Qualifier public @interface VisaPayment {} The @Qualifier annotation that is added to the annotation, makes this annotation discoverable by the container. We can now simply add these annotations to our implementations. @CashPayment public class CashPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(CashPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} cash", amount.toString()); } } @VisaPayment public class VisaPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(VisaPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} with visa", amount.toString()); } } The only thing we now need to do, is change our injection code to @Inject private @VisaPayment Payment payment; When we now change something to our qualifier, we have nice compiler and refactoring support. This also adds extra flexibilty for API or Domain-specific language design. From http://styledideas.be/blog/2011/06/16/java-ee6-cdi-named-components-and-qualifiers/
June 24, 2011
by Jelle Victoor
· 72,924 Views · 5 Likes
article thumbnail
Git Tutorial: Comparing Files With diff
The most common scenario to use diff is to see what changes you made after your last commit. Let’s see how to do it.
June 19, 2011
by Veera Sundar
· 271,836 Views · 2 Likes
article thumbnail
Developing Android Apps with NetBeans, Maven, and VirtualBox
I am an experienced Java developer who has used various IDEs and prefer NetBeans IDE over all others by a long shot. I am also very fond of Maven as the tool to simplify and automate nearly every aspect of the development of my Java project throughout its lifecycle. Recently, I started developing Android applications and naturally I looked for a Maven plugin that would manage my Android projects. Luckily I found the maven-android-plugin which worked like a charm and allowed me to use Maven for developing my Android projects. The Android Emulator from the Android SDK seemed unusably slow. Lucklily, I found a way to use an Android Virtual Machine for VirtualBox that worked nearly as fast as my native computer! This page documents my experiences. Tested Environment Dev machine: Ubuntu 11.04 Linux IDE: NetBeans VirtualBox: 4.0.8 r71778 Android SDK Revision 11, Add on XML Schema #1, Repository XML Schema #3 (from About in SDK and AVD Manager) Android Version: 2.2 Overview of Steps Download and install the Android SDK on your dev machine Attach an Android Device to dev machine Configure and load your device for development and other use Create an initial Android maven project Connect Android Device to Android SDK Debug Android app using NetBeans Graphical Debuger Download and Install Android SDK Download and install the Android SDK on your dev machine as described here. Make sure to set the following in dev machine ~/.bashrc file: export ANDROID_HOME=$HOME/android-sdk-linux_x86 #Change as needed export PATH="$ANDROID_HOME/tools:$ANDROID_HOME/platform-tools:$PATH" Attaching an Android Device to Dev Machine If you have an actual device that is usually always best. If not, you must use a virtual Android device which usually has various limitations (e.g. no GPS, Camera etc.). The Android SDK makes it easy to create a new Virtual Device but the resulting device is painfully slow in my experience and not usable. Do not bother with this. Instead, create a virtual Android device using VirtualBox as described in the following steps: Install virtual box and initial Android VM as described here: http://androidspin.com/2011/01/24/howto-install-android-x86-2-2-in-virtualbox/ http://geeknizer.com/how-to-run-google-android-in-virtualbox-vmware-on-netbooks/ Configure Android VM so it is connected bidirectionally with your dev machine over TCP as described here: http://stackoverflow.com/questions/61156/virtualbox-host-guest-network-setup I used the approach of configuring a HOST ONLY network adapater and a second NAT adapter on the Android VM within virtual box. Configuring your Android Device This section describes various things I did to setup a dev environment for my Android device: Root the device. I used Universal AndRoot Install ConnectBot so you have ssh and related network utilities Creating Initial Android Maven Application Create initial project using instructions here. I found it best to create stub project structure using the maven-archtype-plugin and the archtypes at https://github.com/akquinet/android-archetypes/wiki Connecting Android VM Device to Android SDK In order for your code to be deployed from NetBeans IDE to Android Device and in order for you to monitor your deployed app from the Dalvik Debug Monitor (ddms) you need to connect your android VM device to the android sdk over TCP as described in the following steps. On Android Device open the Terminal Emulator Type su to become root (your device must be rooted for this Type following commands in root shell: setprop service.adb.tcp.port 5555 stop adbd start adbd Type the following commands on dev machine shell. TODO: Note that IP address below is whatever is the ip address associated with the device (see ifconfig on linux for device vboxnet0) adb tcpip 5555 adb connect 192.168.0.101:5555 For details on above steps see: http://stackoverflow.com/questions/2604727/how-can-i-connect-to-android-with-adb-over-tcp Set up port forwarding as described here http://redkrieg.com/2010/10/11/adb-over-ssh-fun-with-port-forwards/ (this is where I am most fuzzy) Build your maven android project using Right-Click / Clean and Build Now for the acid test whether you can deploy your app to the device from NetBeans IDE! Right-click / Custom / Goal to show Run Maven dialog. Enter android:deploy in Goals field. Select Remember As button and enter android:deploy for its text field. If all is well, the app will deploy to the device and will show up in its "Applications" screen. Debugging Android App Using NetBeans Graphical Debugger Once you can build and deploy your app to the real or virtual Android device, here are the steps to debug the app using NetBeans debugger: On Device: Start the app (TODO: determine how to start app on device with JVM options so it can wait for debugger connection. This should be easy) On Dev Machine run Dalvik Debug Monitor (ddms) in background: $ANDROID_HOME/tools/ddms & Lookup your app in ddms and get its debug port. This is described here but does not address NetBeans specifically In NetBeans do: Debug / Attach Debugger and specify the port looked up in ddms in previous step. You may leave rest of the fields with defaults. Click OK
June 18, 2011
by Farrukh Najmi
· 173,467 Views
article thumbnail
Git Tip : Restore a deleted tag
A little tip that can be very useful, how to restore a deleted Git tag. If you juste deleted a tag by error, you can easily restore it following these steps. First, use git fsck --unreachable | grep tag then, you will see the unreachable tag. If you have several tags on the list, use git show KEY to found the good tag and finally, when you know which tag to restore, use git update-ref refs/tags/NAME KEY and the previously deleted tag with restore with NAME. Thanks to Shawn Pearce for the tip. From http://www.baptiste-wicht.com/2011/06/git-tip-restore-a-deleted-tag/
June 16, 2011
by Baptiste Wicht
· 27,676 Views
article thumbnail
Android Tutorial: How to Parse/Read JSON Data Into a Android ListView
Today we get on with our series that will connect our Android applications to internet webservices! Next up in line: from JSON to a Listview. A lot of this project is identical to the previous post in this series so try to look there first if you have any problems. On the bottom of the post ill add the Eclipse project with the source. For this example i made use of an already existing JSON webservice located here. This is a piece of the JSON array that gets returned: {"earthquakes": [ { "eqid": "c0001xgp", "magnitude": 8.8, "lng": 142.369, "src": "us", "datetime": "2011-03-11 04:46:23", "depth": 24.4, "lat": 38.322 }, { "eqid": "2007hear", "magnitude": 8.4, "lng": 101.3815, "src": "us", "datetime": "2007-09-12 09:10:26", "depth": 30, "lat": -4.5172 }<--more -->]} So how do we get this data into our application! Behold our getJSON class! getJSON(String url) public static JSONObject getJSONfromURL(String url){//initializeInputStream is = null;String result = "";JSONObject jArray = null;//http posttry{HttpClient httpclient = new DefaultHttpClient();HttpPost httppost = new HttpPost(url);HttpResponse response = httpclient.execute(httppost);HttpEntity entity = response.getEntity();is = entity.getContent();}catch(Exception e){Log.e("log_tag", "Error in http connection "+e.toString());}//convert response to stringtry{BufferedReader reader = new BufferedReader(new InputStreamReader(is,"iso-8859-1"),8);StringBuilder sb = new StringBuilder();String line = null;while ((line = reader.readLine()) != null) {sb.append(line + "\n");}is.close();result=sb.toString();}catch(Exception e){Log.e("log_tag", "Error converting result "+e.toString());}//try parse the string to a JSON objecttry{ jArray = new JSONObject(result);}catch(JSONException e){Log.e("log_tag", "Error parsing data "+e.toString());}return jArray;} The code above can be divided in 3 parts. the first part makes the HTTP call the second part converts the stream into a String the third part converts the string to a JSPNObject Now we only have to implement this into out ListView. We can use the same method as in the XML tutorial. We make a HashMap that stores our data and we put JSON values in the HashMap. After that we will bind that HashMap to a SimpleAdapter. Here is how its done: Implementation ArrayList> mylist = new ArrayList>();//Get the data (see above)JSONObject json =JSONfunctions.getJSONfromURL("http://api.geonames.org/postalCodeSearchJSON?formatted=true&postalcode=9791&maxRows=10&username=demo&style=full"); try{//Get the element that holds the earthquakes ( JSONArray )JSONArray earthquakes = json.getJSONArray("earthquakes"); //Loop the Array for(int i=0;i < earthquakes.length();i++){ HashMap map = new HashMap(); JSONObject e = earthquakes.getJSONObject(i); map.put("id", String.valueOf(i)); map.put("name", "Earthquake name:" + e.getString("eqid")); map.put("magnitude", "Magnitude: " + e.getString("magnitude")); mylist.add(map);} }catch(JSONException e) { Log.e("log_tag", "Error parsing data "+e.toString()); } After this we only need to make up the Simple Adapter ListAdapter adapter = new SimpleAdapter(this, mylist , R.layout.main, new String[] { "name", "magnitude" }, new int[] { R.id.item_title, R.id.item_subtitle }); setListAdapter(adapter); final ListView lv = getListView(); lv.setTextFilterEnabled(true); lv.setOnItemClickListener(new OnItemClickListener() { public void onItemClick(AdapterView parent, View view, int position, long id) { @SuppressWarnings("unchecked") Toast.makeText(Main.this, "ID '" + o.get("id") + "' was clicked.", Toast.LENGTH_SHORT).show(); }); Now we have a ListView filled with JSON data! Here is the Eclipse project: source code Have fun playing around with it.
June 8, 2011
by Mark Mooibroek
· 260,339 Views
article thumbnail
When to Use Apache Camel?
When to use Apache Camel, a popular JVM/Java environment, and when to use other alternatives.
June 5, 2011
by Kai Wähner DZone Core CORE
· 153,182 Views · 12 Likes
article thumbnail
CDI AOP Tutorial: Java Standard Method Interception Tutorial - Java EE
This article discusses CDI based AOP in a tutorial format. CDI is the Java standard for dependency injection (DI) and interception (AOP). It is evident from the popularity of DI and AOP that Java needs to address DI and AOP so that it can build other standards on top of it. DI and AOP are already the foundation of many Java frameworks. CDI is a foundational aspect of Java EE 6. It is or will be shortly supported by Caucho's Resin Java Application Server, Java EE WebProfile certified, IBM's WebSphere, Oracle's Glassfish, Red Hat's JBoss and many more application servers. CDI is similar to core Spring and Guice frameworks. Like JPA did for ORM, CDI simplifies and sanitizes the API for DI and AOP. If you have worked with Spring or Guice, you will find CDI easy to use and easy to learn. If you are new to AOP, then CDI is an easy on ramp for picking up AOP quickly, as it uses a small subset of what AOP provides. CDI based AOP is simpler to use and learn. One can argue that CDI only implements a small part of AOP—method interception. While this is a small part of what AOP has to offer, it is also the part that most developers use. CDI can be used standalone and can be embedded into any application. Here is the source code for this tutorial, and instructions for use. It is no accident that this tutorial follows many of the same examples in the written three years ago. It will be interesting to compare and contrast the examples in this tutorial with the one written three years ago for Spring based AOP. Design goals of this tutorial This tutorial is meant to be a description and explanation of AOP in CDI without the clutter of EJB 3.1 or JSF. There are already plenty of tutorials that cover EJB 3.1 and JSF (and CDI). We believe that CDI has merit on its own outside of the EJB and JSF space. This tutorial only covers CDI. Repeat there is no JSF 2 or EJB 3.1 in this tutorial. There are plenty of articles and tutorials that cover using CDI as part of a larger JEE 6 application. This tutorial is not that. This tutorial series is CDI and only CDI. This tutorial only has full, complete code examples with source code you can download and try out on your own. There are no code snippets where you can't figure out where in the code you are suppose to be. So far these tutorials have been well recieved and we got a lot of feedback. There appears to be a lot of interest in the CDI standard. Thanks for reading and thanks for your comments and participation so far. AOP Basics For some, AOP seems like voodoo magic. For others, AOP seems like a cure-all. For now, let's just say that AOP is a tool that you want in your developer toolbox. It can make seemingly impossible things easy. Aagin, when we talk about AOP in CDI, we are really talking about interception which is a small but very useful part of AOP. For brevity, I am going to refer to interception as AOP. The first time that I used AOP was with Spring's transaction management support. I did not realize I was using AOP. I just knew Spring could apply EJB-style declarative transaction management to POJOs. It was probably three to six months before I realized that I was using was Spring's AOP support. The Spring framework truly brought AOP out of the esoteric closet into the main stream light of day. CDI brings these concepts into the JSR standards where other Java standards can build on top of CDI. You can think of AOP as a way to apply services (called cross-cutting concerns) to objects. AOP encompasses more than this, but this is where it gets used mostly in the main stream. I've using AOP to apply caching services, transaction management, resource management, etc. to any number of objects in an application. I am currently working with a team of folks on the CDI implementation for the revived JSR-107 JCache. AOP is not a panacea, but it certainly fits a lot of otherwise difficult use cases. You can think of AOP as a dynamic decorator design pattern. The decorator pattern allows additional behavior to be added to an existing class by wrapping the original class and duplicating its interface and then delegating to the original. See this article decorator pattern for more detail about the decorator design pattern. (Notice in addition to supporting AOP style interception CDI also supports actual decorators, which are not covered in this article.) Sample application revisited For this introduction to AOP, let's take a simple example, let's apply security services to our Automated Teller Machine example from the first the first in this series. Let's say when a user logs into a system that a SecurityToken is created that carries the user's credentials and before methods on objects get invoked, we want to check to see if the user has credentials to invoke these methods. For review, let's look at the AutomatedTellerMachine interface. Code Listing: AutomatedTellerMachine interface package org.cdi.advocacy; import java.math.BigDecimal; public interface AutomatedTellerMachine { public abstract void deposit(BigDecimal bd); public abstract void withdraw(BigDecimal bd); } In a web application, you could write a ServletFilter, that stored this SecurityToken in HttpSession and then on every request retrieved the token from Session and put it into a ThreadLocal variable where it could be accessed from a SecurityService that you could implement. Perhaps the objects that needed the SecurityService could access it as follows: Code Listing: AutomatedTellerMachineImpl implementing security without AOP public void deposit(BigDecimal bd) { /* If the user is not logged in, don't let them use this method */ if(!securityManager.isLoggedIn()){ throw new SecurityViolationException(); } /* Only proceed if the current user is allowed. */ if (!securityManager.isAllowed("AutomatedTellerMachine", operationName)){ throw new SecurityViolationException(); } ... transport.communicateWithBank(...); } In our ATM example, the above might work out well, but imagine a system with thousands of classes that needed security. Now imagine, the way we check to see if a user is "logged in" changed. If we put this code into every method that needed security, then we could possibly have to change this a thousand times if we changed the way we checked to see if a user was logged in. What we want to do instead is to use CDI to create a decorated version of the AutomateTellerMachineImpl bean. The decorated version would add the additional behavior to the AutomateTellerMachineImpl object without changing the actual implementation of the AutomateTellerMachineImpl. In AOP speak, this concept is called a cross-cutting concern. A cross-cutting concern is a concern that crosses the boundry of many objects. CDI does this by creating what is called an AOP proxy. An AOP proxy is like a dynamic decorator. Underneath the covers CDI can generate a class at runtime (the AOP proxy) that has the same interface as our AutomatedTellerMachine. The AOP proxy wraps our existing atm object and provides additional behavior by delegating to a list of method interceptors. The method interceptors provide the additional behavior and are similar to ServletFilters but for methods instead of requests. Diagrams of CDI AOP support Thus before we added CDI AOP, our atm example was like Figure 1. Figure 1: Before AOP advice After we added AOP support, we now get an AOP proxy that applies the securityAdvice to the atm as show in figure 2. Figure 2: After AOP advice You can see that the AOP proxy implements the AutomatedTellerMachine interface. When the client object looks up the atm and starts invoking methods instead of executing the methods directly, it executes the method on the proxy, which then delegates the call to a series of method interceptor called advice, which eventually invoke the actual atm instance (now called atmTarget). Let's actually look at the code for this example. For this example, we will use a simplified SecurityToken that gets stored into a ThreadLocal variable, but one could imagine one that was populated with data from a database or an LDAP server or some other source of authentication and authorization. Here is the SecurityToken, which gets stored into a ThreadLocal variable, for this example: SecurityToken.java Gets stored in ThreadLocal package org.cdi.advocacy.security; /** * @author Richard Hightower * */ public class SecurityToken { private boolean allowed; private String userName; public SecurityToken() { } public SecurityToken(boolean allowed, String userName) { super(); this.allowed = allowed; this.userName = userName; } public boolean isAllowed(String object, String methodName) { return allowed; } /** * @return Returns the allowed. */ public boolean isAllowed() { return allowed; } /** * @param allowed The allowed to set. */ public void setAllowed(boolean allowed) { this.allowed = allowed; } /** * @return Returns the userName. */ public String getUserName() { return userName; } /** * @param userName The userName to set. */ public void setUserName(String userName) { this.userName = userName; } } The SecurityService stores the SecurityToken into the ThreadLocal variable, and then delegates to it to see if the current user has access to perform the current operation on the current object as follows: SecurityService.java Service package org.cdi.advocacy.security; public class SecurityService { private static ThreadLocal currentToken = new ThreadLocal(); public static void placeSecurityToken(SecurityToken token){ currentToken.set(token); } public static void clearSecuirtyToken(){ currentToken.set(null); } public boolean isLoggedIn(){ SecurityToken token = currentToken.get(); return token!=null; } public boolean isAllowed(String object, String method){ SecurityToken token = currentToken.get(); return token.isAllowed(); } public String getCurrentUserName(){ SecurityToken token = currentToken.get(); if (token!=null){ return token.getUserName(); }else { return "Unknown"; } } } The SecurityService will throw a SecurityViolationException if a user is not allowed to access a resource. SecurityViolationException is just a simple exception for this example. SecurityViolationException.java Exception package com.arcmind.springquickstart.security; /** * @author Richard Hightower * */ public class SecurityViolationException extends RuntimeException { /** * */ private static final long serialVersionUID = 1L; } To remove the security code out of the AutomatedTellerMachineImpl class and any other class that needs security, we will write an Aspect in CDI to intercept calls and perform security checks before the method call. To do this we will create a method interceptor (known is AOP speak as an advice) and intercept method calls on the atm object. Here is the SecurityAdvice class which will intercept calls on the AutomatedTellerMachineImpl class. SecurityAdvice package org.cdi.advocacy.security; import javax.inject.Inject; import javax.interceptor.AroundInvoke; import javax.interceptor.Interceptor; import javax.interceptor.InvocationContext; /** * @author Richard Hightower */ @Secure @Interceptor public class SecurityAdvice { @Inject private SecurityService securityManager; @AroundInvoke public Object checkSecurity(InvocationContext joinPoint) throws Exception { System.out.println("In SecurityAdvice"); /* If the user is not logged in, don't let them use this method */ if(!securityManager.isLoggedIn()){ throw new SecurityViolationException(); } /* Get the name of the method being invoked. */ String operationName = joinPoint.getMethod().getName(); /* Get the name of the object being invoked. */ String objectName = joinPoint.getTarget().getClass().getName(); /* * Invoke the method or next Interceptor in the list, * if the current user is allowed. */ if (!securityManager.isAllowed(objectName, operationName)){ throw new SecurityViolationException(); } return joinPoint.proceed(); } } Notice that we annotate the SecuirtyAdvice class with an @Secure annotation. The @Secure annotation is an @InterceptorBinding. We use it to denote both the interceptor and the classes it intercepts. More on this later. Notice that we use @Inject to inject the securityManager. Also we mark the method that implements that around advice with and @AroundInvoke annotation. This essentially says this is the method that does the dynamic decoration. Thus, the checkSecurity method of SecurityAdvice is the method that implements the advice. You can think of advice as the decoration that we want to apply to other objects. The objects getting the decoration are called advised objects. Notice that the SecurityService gets injected into the SecurityAdvice and the checkSecurity method uses the SecurityService* to see if the user is logged in and the user has the rights to execute the method. An instance of InvocationContext, namely joinPoint, is passed as an argument to checkSecurity. The InvocationContext has information about the method that is being called and provides control that determines if the method on the advised object's methods gets invoked (e.g., AutomatedTellerMachineImpl.withdraw and AutomatedTellerMachineImpl.deposit). If *`joinPoint.proceed()`* is not called then the wrapped method of the advised object (withdraw or deposit) is not called. (The proceed method causes the actual decorated method to be invoked or the next interceptor in the chain to get invoked.) In Spring, to apply an Advice like SecurityAdvice to an advised object, you need a pointcut. A pointcut is like a filter that picks the objects and methods that get decorated. In CDI, you just mark the class or methods of the class that you want decorated with an interceptor binding annotation. There is no complex pointcut language. You could implement one as a CDI extention, but it does not come with CDI by default. CDI uses the most common way developer apply interceptors, i.e., with annotations. CDI scans each class in each jar (and other classpath locations) that has a META-INF/beans.xml. The SecurityAdvice get installed in the CDI beans.xml. META-INF/beans.xml org.cdi.advocacy.security.SecurityAdvice You can install interceptors in the order you want them called. In order to associate a interceptor with the classes and methods it decorates, you have to define an InterceptorBinding annotation. An example of such a binding is defined below in the @Secure annotation. Secure.java annotation package org.cdi.advocacy.security; import java.lang.annotation.Retention; import java.lang.annotation.Target; import static java.lang.annotation.ElementType.*; import static java.lang.annotation.RetentionPolicy.*; import javax.interceptor.InterceptorBinding; @InterceptorBinding @Retention(RUNTIME) @Target({TYPE, METHOD}) public @interface Secure { } Notice that we annotated the @Secure annotation with the @InterceptorBinding annotation. InterceptorBindings follow a lot of the same rules as Qualifiers as discussed in the first two articles in this series. InterceptorBindings are like qaulifiers for injection in that they can have members which can further qualify the injection. You can also disable InterceptorBinding annotation members from qualifying an interception by using the @NonBinding just like you can in Qualifiers. To finish our example, we need to annotate our AutomatedTellerMachine with the same @Secure annotation; thus, associating the AutomatedTellerMachine with our SecurityAdvice. AutomatedTellerMachine class using @Secure package org.cdi.advocacy; ... import javax.inject.Inject; import org.cdi.advocacy.security.Secure; @Secure public class AutomatedTellerMachineImpl implements AutomatedTellerMachine { @Inject @Json private ATMTransport transport; public void deposit(BigDecimal bd) { System.out.println("deposit called"); transport.communicateWithBank(null); } public void withdraw(BigDecimal bd) { System.out.println("withdraw called"); transport.communicateWithBank(null); } } You have the option of use @Secure on the methods or at the class level. In this example, we annotated the class itself, which then applies the interceptor to every method. Let's complete our example by reviewing the AtmMain main method that looks up the atm out of CDI's beanContainer. Let's review AtmMain as follows: AtmMain.java package org.cdi.advocacy; import java.math.BigDecimal; import org.cdi.advocacy.security.SecurityToken; import org.cdiadvocate.beancontainer.BeanContainer; import org.cdiadvocate.beancontainer.BeanContainerManager; import org.cdi.advocacy.security.SecurityService; public class AtmMain { public static void simulateLogin() { SecurityService.placeSecurityToken(new SecurityToken(true, "Rick Hightower")); } public static void simulateNoAccess() { SecurityService.placeSecurityToken(new SecurityToken(false, "Tricky Lowtower")); } public static BeanContainer beanContainer = BeanContainerManager .getInstance(); static { beanContainer.start(); } public static void main(String[] args) throws Exception { simulateLogin(); //simulateNoAccess(); AutomatedTellerMachine atm = beanContainer .getBeanByType(AutomatedTellerMachine.class); atm.deposit(new BigDecimal("1.00")); } } Continue reading... Click on the navigation links below the author bio to read the other pages of this article. Be sure to check out part I of this series as well! Although not a fan of EJB 3, Rick is a big fan of the potential of CDI and thinks that EJB 3.1 has come a lot closer to the mark. CDI Implementations - Resin Candi - Seam Weld - Apache OpenWebBeans Before we added AOP support when we looked up the atm, we looked up the object directly as shown in figure 1, now that we applied AOP when we look up the object we get what is in figure 2. When we look up the atm in the application context, we get the AOP proxy that applies the decoration (advice, method interceptor) to the atm target by wrapping the target and delegating to it after it invokes the series of method interceptors. Victroy lap The last code listing works just like you think. If you use simulateLogin, atm.deposit does not throw a SecurityException. If you use simulateNoAccess, it does throw a SecurityException. Now let's weave in a few more "Aspects" to the mix to drive some points home and to show how interception works with multiple interceptors. I will go quicker this time. LoggingInterceptor package org.cdi.advocacy; import java.util.Arrays; import java.util.logging.Logger; import javax.interceptor.AroundInvoke; import javax.interceptor.Interceptor; import javax.interceptor.InvocationContext; @Logable @Interceptor public class LoggingInterceptor { @AroundInvoke public Object log(InvocationContext ctx) throws Exception { System.out.println("In LoggingInterceptor"); Logger logger = Logger.getLogger(ctx.getTarget().getClass().getName()); logger.info("before call to " + ctx.getMethod() + " with args " + Arrays.toString(ctx.getParameters())); Object returnMe = ctx.proceed(); logger.info("after call to " + ctx.getMethod() + " returned " + returnMe); return returnMe; } } Now we need to define the Logable interceptor binding annotation as follows: package org.cdi.advocacy; import java.lang.annotation.Retention; import java.lang.annotation.Target; import static java.lang.annotation.ElementType.*; import static java.lang.annotation.RetentionPolicy.*; import javax.interceptor.InterceptorBinding; @InterceptorBinding @Retention(RUNTIME) @Target({TYPE, METHOD}) public @interface Logable { } Now to use it we just mark the methods where we want this logging. AutomatedTellerMachineImpl.java using Logable package org.cdi.advocacy; ... @Secure public class AutomatedTellerMachineImpl implements AutomatedTellerMachine { ... @Logable public void deposit(BigDecimal bd) { System.out.println("deposit called"); transport.communicateWithBank(null); } public void withdraw(BigDecimal bd) { System.out.println("withdraw called"); transport.communicateWithBank(null); } } Notice that we use the @Secure at the class level which will applies the security interceptor to every mehtod in the AutomatedTellerMachineImpl. But, we use @Logable only on the deposit method which applies it, you guessed it, only on the deposit method. Now you have to add this interceptor to the beans.xml: META-INF/beans.xml org.cdi.advocacy.LoggingInterceptor org.cdi.advocacy.security.SecurityAdvice When we run this again, we get something like this in our console output: May 15, 2011 6:46:22 PM org.cdi.advocacy.LoggingInterceptor log INFO: before call to public void org.cdi.advocacy.AutomatedTellerMachineImpl.deposit(java.math.BigDecimal) with args [1.00] May 15, 2011 6:46:22 PM org.cdi.advocacy.LoggingInterceptor log INFO: after call to public void org.cdi.advocacy.AutomatedTellerMachineImpl.deposit(java.math.BigDecimal) returned null Notice that the order of interceptors in the beans.xml file determines the order of execution in the code. (I added a println to each interceptor just to show the ordering.) When we run this, we get the following output. Output: In LoggingInterceptor In SecurityAdvice If we switch the order in the beans.xml file, we will get a different order in the console output. META-INF/beans.xml org.cdi.advocacy.security.SecurityAdvice org.cdi.advocacy.LoggingInterceptor In SecurityAdvice In LoggingInterceptor This is important as many interceptors can be applied. You have one place to set the order. Conclusion AOP is neither a cure all or voodoo magic, but a powerful tool that needs to be in your bag of tricks. The Spring framework has brought AOP to the main stream masses and Spring 2.5/3.x has simplified using AOP. CDI brings AOP and DI into the standard's bodies where it can get further mainstreamed, refined and become part of future Java standards like JCache, Java EE 6 and Java EE 7. You can use Spring CDI to apply services (called cross-cutting concerns) to objects using AOP's interception model. AOP need not seem like a foreign concept as it is merely a more flexible version of the decorator design pattern. With AOP you can add additional behavior to an existing class without writing a lot of wrapper code. This can be a real time saver when you have a use case where you need to apply a cross cutting concern to a slew of classes. To reiterate... CDI is the Java standard for dependency injection and interception (AOP). It is evident from the popularity of DI and AOP that Java needs to address DI and AOP so that it can build other standards on top of it. DI and AOP are the foundation of many Java frameworks. I hope you share my excitement of CDI as a basis for other JSRs, Java frameworks and standards. CDI is a foundational aspect of Java EE 6. It is or will be shortly supported by Caucho's Resin, IBM's WebSphere, Oracle's Glassfish, Red Hat's JBoss and many more application servers. CDI is similar to core Spring and Guice frameworks. However CDI is a general purpose framework that can be used outside of JEE 6. CDI simplifies and sanitizes the API for DI and AOP. I find that working with CDI based AOP is easier and covers the most common use cases. CDI is a rethink on how to do dependency injection and AOP (interception really). It simplifies it. It reduces it. It gets rid of legacy, outdated ideas. CDI is to Spring and Guice what JPA is to Hibernate, and Toplink. CDI will co-exist with Spring and Guice. There are plugins to make them interoperate nicely (more on these shortly). This is just a brief taste. There is more to come. Resources CDI Source CDI advocacy group CDI advocacy blog CDI advocacy google code project Google group for CDI advocacy Manisfesto version 1 Weld reference documentation CDI JSR299 Resin fast and light CDI and Java EE 6 Web Profile implementation CDI & JSF Part 1 Intro by Andy Gibson CDI & JSF Part 2 Intro by Andy Gibson CDI & JSF Part 3 Intro by Andy Gibson About the Author This article was written with CDI advocacy in mind by Rick Hightower with some collaboration from others. Rick Hightower has worked as a CTO, Director of Development and a Developer for the last 20 years. He has been involved with J2EE since its inception. He worked at an EJB container company in 1999. He has been working with Java since 1996, and writing code professionally since 1990. Rick was an early Spring enthusiast. Rick enjoys bouncing back and forth between C, Python, Groovy and Java development. Although not a fan of EJB 3, Rick is a big fan of the potential of CDI and thinks that EJB 3.1 has come a lot closer to the mark. Rick Hightower is CTO of Mammatus and is an expert on Java and Cloud Computing. There are 18 code listings in this article
May 25, 2011
by Rick Hightower
· 83,641 Views · 10 Likes
article thumbnail
Merge Policy Internals in Solr
last week, a colleague asked me a really simple question about segments merging in solr. after discussing the answer for some minutes while playing around with solr, i realized that there are a lot of subtleties in this subject. so i started to look at the code and discovered some interesting things, which i’m going to summarize in this post. what is a merge policy? the process of choosing which segments are going to be merged, and in which way, is carried out by an abstract class named mergepolicy. this class builds a mergespecification, which is an object composed of a list of onemerge objects. each one of them represents a single merge operation, defined by the segments that will be merged into a new one. after a change in the index, the indexwriter invokes the mergepolicy to obtain a mergespecification. next, it invokes the mergescheduler, who is in charge of determining when the merges will be executed. there are mainly two implementations of mergescheduler: the concurrentmergescheduler, which runs each merge in a separate thread, and the serialmergescheduler, which runs all the merges sequentially in the current thread. finally, when the time to do a merge comes, the indexwriter does part of the job, and delegates the other part to a segmentmerger. so, if we want to know when a set of segments is going to be merged, why a segment is being merged and another one isn’t, and other things like these, we should take a look at the mergepolicy. there are many implementations of mergepolicy, but i’m going to focus in one of them (logbytesizemergepolicy), because it’s solr’s default, and i believe that it’s the one used by most people. mergepolicy defines three abstract methods to construct a mergespecification: findmerges is invoked whenever there is a change in the index. this is the method that i’ll study in this post. findmergesforoptimize is invoked whenever an optimize operation is called. findmergestoexpungedeletes is invoked whenever an expunge deletes operation is called. step by step i’ll start by giving a brief and conceptual description of how the merge policy works, that you can follow by looking at the figure below: sort the segments by name. group the existing segments into levels. each level is a contiguous set of segments. for each level, determine the interval of segments that are going to be merged. parameters lets define a couple of parameters involved in the process of merging the segments that compose an index: mergefactor: this parameter determines many things, like how many segments are going to be merged into a new one, the maximum number of segments that can be in a level and the span of each level. can be set in solrconfig.xml. minmergesize: all segments whose size is less than this parameter’s value will belong to the same level. this value is fixed. maxmergesize: all segments whose size is greater than this parameter’s value won’t be ever merged. this value is fixed. maxmergedocs: all segments containing more documents than this parameter’s value won’t be merged. can be set in solrconfig.xml. constructing the levels let’s see how levels are constructed. to define the first level, the algorithm searches the maximum segment’ size. let’s call this value levelmaxsize. if this value is less than minmergesize, then all the segments are considered to be at the same level. otherwise, let’s define levelminsize as: that’s quite an ugly arithmetic formula, but it means something like this: levelminsize will be almost mergefactor times less than levelmaxsize (it would have been mergefactor times if 1 had been used as exponent instead of 0.75), but if it goes below minmergesize, make it equal to minmergesize. after this calculation, the algorithm will choose which segments belong to the current level. to do this, it will find the newest segment whose size is greater or equal than levelminsize, and will consider this segment, and all the segments older than it, to be part of this level. the next levels will be constructed using the same algorithm, but constraining the set of available segments to the ones newer than the newest of the previous level. in the next figure you can see a simple example with mergefactor=10 and minmergesize=1.6mib. the intention behind using a mergefactor of 10 is to obtain levels with span of decreasing order of magnitude. but in some cases, this algorithm can result in a structure of levels that you won’t expect if you don’t know the internals. take, for example, the segments of the following table, assuming mergefactor=10 and minmergesize=1.6mib: segment size a 200 mib l 88 mib m 8.9 mib n 6.5 mib o 1.4 mib p 842 kib q 842 kib r 842 kib s 842 kib t 842 kib u 842 kib v 842 kib w 842 kib x 160 mib how many levels are in this case? let’s see: the maximum segment size is 200 mib, thus, levelminsize is 35 mib. the newest segment with size greater than levelminsize is x , so the first level will include x and all the segments older than it. this means that, in this case, there’s only one level! choosing which segments to merge after defining the levels, the mergepolicy will choose which segments to merge. to do this, it’ll analyze each level separately. if a level has less than mergefactor segments, then it’ll be skipped. otherwise, each group of mergefactor contiguous segments will be merged into a new one. if at least one of the segments in a group is bigger than maxmergefactor, or has more than maxmergedocs documents, then the group is skipped. resuming the second example of the previous section, where only one level is present, the result of the merge will be: segment size u 842 kib v 842 kib w 842 kib x 160 mib y 311 mib minmergesize and maxmergesize through the post, i’ve talked about these two parameters, which are involved in the process of merging. it’s worth noting that the value of them is hard coded in the source code of lucene, and their values are the following: minmergesize has a fixed value of 1.6 mib. this means that any segment whose size is less than 1.6 mib will be included in the last level. maxmergesize has a fixed value of 2 gib. this means that any segment whose size is greater than 2 gib will never be merged. conclusion while the algorithm itself isn’t extremely complex, you need to understand the internals if you want to predict what will happen to your index when more documents are added. also, knowing how the merge policy works will help you in determining the value of some parameters like mergefactor and maxmergedocs.
May 23, 2011
by Juan Grande
· 16,403 Views
article thumbnail
Git backups, and no, it's not just about pushing
Git is a backup system itself: for example, you can version your .txt folders containing TODO lists. Since Git version your files just like it does for code, after accidental deletion or modifications it will be able to bring you back. Yet, if you do not regularly push your commits, a problem with the drive containing the repository may cause the loss of all your work. You can put the repository in Dropbox or on a similar service, but I don't trust it. Dropbox syncs files in .git independently from the rest and from one another, and it may break temporarily or for good the repository. By the way, I only want to snapshot a backup at specific points in time, not always occupying my connection by instant mirroring. A note before beginning: with binary data Git is not proficient as a backup tool: text works a lot better (it's like code). This article is dedicated to the backup of code and textual content. Push is not a backup For example, because it may lack branches. In general, pushing to origin is not even an option as you may not want to push your changes yet, but still perform a backup. It's only in the open source world that backup corresponds to publishing online. However, thanks to decentralization there are some simple solutions, involving the creation of repositorite different from origin: git clone /path/to/working/copy #creates the backup git pull #origin master of course, updates the backup # you can specify better branches via the local configuration of the backup copy (git config) The inverse solution, involging pulling, is also possible: git init . #in the folder of your backup, or you can use a remote repository git remote add backup_repo /path/to/backup/repo #or a git:// repo git push backup #master usually, but also multiple branches git push --all backup #an alternative that pushes all branches All the commands, also the one that will follow, are just bash commands: it's easy to create a script and automate its execution with cron, anacron or whatever you want. The Force"del" Unix is powerful in you. git bundle git bundle is another command that may be used for backup purposes. It will create a single file containing all the refs you need to export from your local repository. It's often used for publishing commits via USB keys or other media in absence of a connection. For one branch, it's simple. This command will create a myrepo.bundle file. git bundle create myrepo.bundle master For more branches or tags, it's still simple: git bundle create myrepo.bundle master other_branch Restoring the content of the bundle is a single comment. Inside an empty repo, type: git bundle unbundle myrepo.bundle Instead if you do not have a repo, and just want to recreate the old one: git clone myrepo.bundle -b master myrepo_folder In emergency situations, bundle comes handy. But my issue with that command is that I always forget something when I use it: for example in my tutorial repository I had a lot of tags, but bundle did not include them by default (you have to specify the whole references list like for master other_branch.) Tarballs An alternative is just to archive the repository in a tar.gz or tar.bz file. tar -cf repository.tar repository/ gzip repository.tar # or bzip2 repository.tar After that, you can use scp or even rsync (but I don't think it will speed up much) to put repository.tar.gz on another medium. The weight is higher in this case, since the repository contains also the checked out working copy. But you don't have to learn new commands: apart from the weight and the lack of incremental updates, this solution works fine. Bare repositories You can use git clone --bare repository/ backup_folder/ to create a bare copy of the repository, as a backup. The bare repository does not maintain a checked out working tree, and as so saves space and time for its transferral. This method can be used in conjunction with the pull/push or the tarball method. For restoring the backup: git clone backup_folder/ new_repository/ will recreate the original situation in new_repository. In any of the cases the new folders are created automatically. I won't advise to just copy the folder as often on other backup filesystems (like an USB key's vfat) permissions, owner and other metadata are lost. Conclusion So now you have some alternatives for backing up your repositories or transporting them without setting up a server like Gitosis or passing from the publicly available Github. In fact, I researched this techniques for transporting my tutorial code to phpDay 2011 and the Dutch PHP Conference, and they have worked pretty well.
May 18, 2011
by Giorgio Sironi
· 72,924 Views · 1 Like
article thumbnail
Practical PHP Testing Patterns: Transaction Rollback Teardown
Maintaining isolation of tests when they have a database as Shared Fixture is not a trivial task. An important constraint is not having the headache of keeping track what manipulations on the database has your code done; in that case the rollback may not even be performed in case of a regression. An alternative way to resetting the database via DELETE and TRUNCATE queries is to roll back a transaction which has been started in the setup phase during the teardown. Implementation The phases of a database test involving Transaction Rollback Teardown are roughly the following: begin transaction, usually in setUp(). arrange, act, assert actions in the various Test Methods. rollback of the transaction in teardown(). The active transaction is never committed. An issue with using this pattern is that code that already uses transaction is prone to generate errors, and ultimately should never be tested with this technique. The rules for your safety are simple: the SUT should never start a transaction or committing it. Some databases support nested transaction levels, but it's very brittle to use them for testing purposes, and in case of any failure the whole suite will blow up as test executes teardowns at the wrong level of nesting. This pattern safety is also difficult to ensure, as DDL statements like CREATE/DELETE or other commands may commit the current transaction automatically. Check the documentation of your testing database. The advantage of this pattern is great performance: rollback is faster than every other command, including TRUNCATE. Moreover, if you encapsulate transactions well in your production code, most of it won't commit them (typically leaving the control over the transaction to an upper layer). Doctrine 2 In a sense, we already use this pattern with an UnitOfWork ORM such as Doctrine 2 when we do not flush() the ORM in our code. The flow is: The database is ready by setup. Exercise code. Check results as persisted or removed entities. Instead of calling flush() over the Entity Manager, call clear(). In this case, the database never sees a transaction, as Doctrine 2 keeps everything in the Unit Of Work until you say to flush it. Even when your code is calling flush(), you can explicitly use beginTransaction() and rollback() over the connection object: in this other scenario, the testing database sees an open transaction, but it's never committed and can be discarded in teardown() like the pattern prescribes. Example The code sample is the same test case shown in the Table Truncation Teardown article, which now uses transactions to encapsulate the single tests. The various tests check the tables content is restored, along with the AUTOINCREMENT next value. exec('CREATE TABLE users ( id INTEGER PRIMARY KEY AUTOINCREMENT, name VARCHAR(255) )'); } $this->connection = self::$sharedConnection; $this->connection->beginTransaction(); } public function teardown() { $this->connection->rollback(); } public function testTableCanBePopulated() { $this->connection->exec('INSERT INTO users (name) VALUES ("Giorgio")'); $this->assertEquals(1, $this->howManyUsers()); } public function testTableRestartsFrom1() { $this->assertEquals(0, $this->howManyUsers()); $this->connection->exec('INSERT INTO users (name) VALUES ("Isaac")'); $stmt = $this->connection->query('SELECT name FROM users WHERE id=1'); $result = $stmt->fetch(); $this->assertEquals('Isaac', $result['name']); } public function testTableIsEmpty() { $this->assertEquals(0, $this->howManyUsers()); } private function howManyUsers() { $stmt = $this->connection->query('SELECT COUNT(*) AS number FROM users'); $result = $stmt->fetch(); return $result['number']; } }
May 11, 2011
by Giorgio Sironi
· 7,745 Views
article thumbnail
Practical PHP Testing Patterns: Stored Procedure Test
It happened in the day before the advent of DDD and the Hexagonal architecture, that you had code that lived inside the database, such as Stored Procedures, constraints, and triggers. Back in the day, the relational database was considered the single source of truth instead of a Domain Model written in a language like PHP or Java. Today the picture is different - but there are still scenarios where pushing code in the database make sense. One of the reasons for having logic expressed as SQL and in other database languages is their power, and their performance. SQL operators, especially when augmented by proprietary extensions, let you declare pieces of logic that you would instead have to code by hand. SQL that is executed directly on the database can accomplish operations too onerous to perform over a reconstituted object graph with a subsequent saving. In fact, every decent ORM include a language for batch updates that translates to SQL, like Doctrine with DQL; and also a mechanism for providing hints for the underlying database, like indexes definitions. The problem with SQL derivates and other database-specific embedded logic is that we cannot execute it and test it in isolation - we need a real copy of the database to perform our tests. Thus the Stored Procedure Test is an umbrella term for tests that encompasses database code, even when they're not actually stored procedures. When I'll use the term stored procedure in this article, it will be to signify any database-specific code, such as complex queries, triggers and so on. Implementation The pattern prescribes to write unit tests for the stored procedure, to test it in isolation from the rest of application a first simplification. These tests cover nontrivial logic in database code - probably you don't need them for indexes definition, but more for queries with aggregate functions. In the PHP world, Sqlite often suffices for testing queries - as long as you have an intermediate layer like Doctrine DBAL (part of Doctrine 2) which smooths out the differences between vendors. You use MySQL in production, Sqlite in the test suite, and you can write queries in Doctrine's DQL being confident that it will translate them to the right SQL dialect. These tests should be executed in a sandbox - a database with just enough structure and data to test the stored procedure at hand. This sandbox should run by definition on the production dbms. The most difficult aspect of the pattern is integrating with the dbms: it should be running and listening on the right port. A sandbox should be created in setUp() or setUpBeforeClass(), and destroyed during teardown. In case the database is not available, the tests should be marked as skipped or incomplete. Variations In In-Database Stored Procedure Tests, the test is written in the same language as the database code. I cannot imagine something more boring for a PHP programmer. In Remoted Stored Procedure Tests, which is the variation of interest for us, the tests are written in PHPUnit and integrated with the suite (slowing it down a bit). The logic is that whatever SQL logic you're going to add to your application, is already encapsulated in some PHP class: for example, complex queries are encapsulated in Repositories or DAO. So it's going to be feasible to build a sandbox via PHP code, and test the stored procedure as a black box. It will be encapsulated for a unique execution, like schema creation, or for executing multiple times in case of queries. Example The example shows you how to test a query with a real database - supposing a surrogate database does not support all the needed functions - from inside a test suite. I thought it would be difficult to write this test, but instead it required less than a Pomodoro. connection = new PDO("mysql:host=localhost;dbname=sandbox", 'root', ''); $this->connection->exec("CREATE TABLE users (name VARCHAR(255) NOT NULL PRIMARY KEY, year YEAR)"); $this->repository = new UserRepository($this->connection, 2011); } public function testAverageAgeIsCalculated() { $this->insertUser('Giorgio', 1942); $this->insertUser('Isaac', 1920); $this->assertEquals(80, $this->repository->getAverageAge()); } private function insertUser($name, $year) { $stmt = $this->connection->prepare("INSERT INTO users (name, year) VALUES (:name, :year)"); $stmt->bindValue('name', $name, PDO::PARAM_STR); $stmt->bindValue('year', $year, PDO::PARAM_INT); return $stmt->execute(); } public function tearDown() { $this->connection->exec('DROP TABLE users'); } } class UserRepository { private $connection; private $currentYear; public function __construct(PDO $connection, $currentYear) { $this->connection = $connection; $this->currentYear = $currentYear; } /** * We suppose AVG() cannot be correctly implemented by Sqlite or * another surrogate database (substitute another vendor feature * for the same effect). * We also suppose reconstituting millions of User objects to calculate * their average age isn't feasible: that's why we used SQL directly. */ public function getAverageAge() { $stmt = $this->connection->prepare('SELECT AVG(:year - year) AS average_age FROM users'); $stmt->bindValue('year', $this->currentYear, PDO::PARAM_INT); $stmt->execute(); $row = $stmt->fetch(); return $row['average_age']; } }
May 4, 2011
by Giorgio Sironi
· 2,794 Views
article thumbnail
Bootstrapping CDI in several environments
i feel like writing some posts about cdi (contexts and dependency injection). so this is the first one of a series of x posts ( 0 javax.enterprise cdi-api 1.0 provided an empty beans.xml will do to enable cdi you must have a beans.xml file in your project (under the meta-inf or web-inf). that’s because cdi needs to identify the beans in your classpath (this is called bean discovery) and build its internal metamodel. with the beans.xml file cdi knows it has beans to discover. so, for all the following examples i’ll make it simple and will leave this file completely empty. java ee 6 containers let’s start with the easiest possible environment : java ee 6 containers . why is it the simplest ? well, because you don’t have to do anything : cdi is part of java ee 6 as well as the web profile 1.0 so you don’t need to manually bootstrap it. let’s see how to inject a cdi bean within an ejb 3.1 and a servlet 3.0 . ejb 3.1 since ejb 3.1 you can use the ejbcontainer api to get an in-memory embedded ejb container and you can easily unit test your ejbs. so let’s write an ejb and a test class. first let’s have a look at the code of the ejb. as you can see, with version 3.1 an ejb is just a pojo : no inheritance, no interface, just one @stateless annotation. it gets a reference of the hello bean buy using the @inject annotation and uses it in the saysomething() method. @stateless public class mainejb31 { @inject hello hello; public string saysomething() { return hello.sayhelloworld(); } } you can now package the mainejb31, hello and world classes with the empty beans.xml file into a jar, deploy it to glassfish 3.x , and it will work. but if you don’t want to bother deploying it to glassfish and just unit test it, this is what you need to do : public class mainejbtest { private static ejbcontainer ec; private static context ctx; @beforeclass public static void initcontainer() throws exception { map properties = new hashmap(); properties.put(ejbcontainer.modules, new file("target/classes")); ec = ejbcontainer.createejbcontainer(properties); ctx = ec.getcontext(); } @afterclass public static void closecontainer() throws exception { if (ec != null) ec.close(); } @test public void shoulddisplayhelloworld() throws exception { // looks up the ejb mainejb31 mainejb = (mainejb31) ctx.lookup("java:global/classes/mainejb!org.antoniogoncalves.cdi.helloworld.mainejb"); assertequals("should say hello world !!!", "hello world !!!", mainejb.saysomething()); } } in the code above the method initcontainer() initializes the ejbcontainer. the shoulddisplayhelloworld() looks up the ejb (using the new portable jndi name ), invokes it and makes sure the saysomething() method returns hello world !!!. green test. that was pretty easy too. servlet 3.0 servlet 3.0 is part of java ee 6, so again, there is no needed configuration to bootstrap cdi. let’s use the new @webservlet annotation and write a very simple one that injects a reference of hello and displays an html page with hello world !!!. this is what the servlet looks like : @webservlet(urlpatterns = "/mainservlet") public class mainservlet30 extends httpservlet { @inject hello hello; @override protected void service(httpservletrequest req, httpservletresponse resp) throws servletexception, ioexception { resp.setcontenttype("text/html"); printwriter out = resp.getwriter(); out.println(""); out.println(""); out.println(""); out.println(saysomething()); out.println(""); out.println(""); out.close(); } public string saysomething() { return hello.sayhelloworld(); } } thanks to the @webservlet i don’t need any web.xml (it’s optional in servlet 3.0) to map the mainservlet30 to the /mainservlet url. you can now package the mainservlet30, hello and world classes with the empty beans.xml and no web.xml into a war, deploy it to glassfish 3.x , go to http://localhost:8080/bootstrapping-servlet30-1.0/mainservlet and it will work. unfortunately servlet 3.0 doesn’t have an api for the container (such as ejbcontainer). there is no servletcontainer api that would let you use an embedded servlet container in a standard way and, why not, easily unit test it. application client container not many people know it, but java ee (or even older j2ee versions) comes with an application client container (acc). it’s like an ejb or servlet container but for plain pojos. for example you can develop a swing application (yes, i’m sure that some of you still use swing), run it into the acc and get some extra services given by the container (security, naming, certain annotations…). glassfish v3 has an acc that you can launch in a command line : appclient -jar . so i thought, great, i can use cdi with acc the same way i use it within ejb or servlet container, no need to bootstrap anything, it’s all out of the box. i was wrong . as per the cdi specification (section 12.1), cdi is not required to support application client bean archives. so the glassfish application client container doesn’t support it. i haven’t tried the jboss acc , maybe it works. other containers the beauty of cdi is that it doesn’t require java ee 6 . you can use cdi with simple pojos in a java se environment, as well as some servlet 2.5 containers. of course it’s not as easy to bootstrap because you need a bit of configuration. but it then works fine (not always but). java se 6 ok, so until now there was nothing to do to bootstrap cdi. it is already bundled with the ejb 3.1 and servlet 3.0 containers of java ee 6 (and web profile). so the idea here is to use cdi in a simple java se environment. coming back to our hello and world classes, we need a pojo with an entry point that will bootstrap cdi so we can use injection to get those classes. in standard java se when we say entry point , we think of a public static void main(string[] args) method. well, we need something similar… but different. weld is the reference implementation of cdi. that means it implements the specification, the standard apis (mostly found in javax.inject and javax.enterprise.context packages) but also some proprietary code (in org.jboss.weld package). bootstrapping cdi in java se is not specified so you will need to use specific weld features. you can do that in two different flavors: by observing the containerinitialized event or using the programatic bootstrap api consisting of the weld and weldcontainer classes. the following code uses the containerinitialized event. as you can see, it uses the @observes annotation that i’ll explain in a future post. but the idea is that this class is listening to the event and processes the code once the event is triggered. import org.jboss.weld.environment.se.events.containerinitialized; import javax.enterprise.event.observes; import javax.inject.inject; public class mainjavase6 { @inject hello hello; public void saysomething(@observes containerinitialized event) { system.out.println(hello.sayhelloworld()); } } but who trigers the containerinitialized event ? well, it’s the org.jboss.weld.environment.se.startmain class. i’m using maven so a nice trick is to use the exec-maven-plugin to run the startmain class. download the code , have a look at the pom.xml and give it a try. the other possibility is to programmatically bootstrap the weld container. this can be handy in unit testing. the code below initializes the weld container (with new weld().initialize()) and then looks for the hello class (using weld.instance().select(hello.class).get()). import org.jboss.weld.environment.se.weld; import org.jboss.weld.environment.se.weldcontainer; import org.junit.beforeclass; import org.junit.test; import static junit.framework.assert.assertequals; public class hellotest { @test public void shoulddisplayhelloworld() { weldcontainer weld = new weld().initialize(); hello hello = weld.instance().select(hello.class).get(); assertequals("should say hello world !!!", "hello world !!!", hello.sayhelloworld()); } } execute the test with mvn test and it should be green. as you can see, there is a bit more work using cdi in a java se environment, but it’s not that complicated. tomcat 6.x ok, and what about your legacy servlet 2.5 containers ? the first one that comes in mind is tomcat 6.x ( note that tomcat 7.x will implement servlet 3.0 but is still in beta version at the time of writing this post ). weld provides support for tomcat but you need to configure it a bit to make cdi work. first of all, this is a servlet 2.5, not a 3.0. so the code of the servlet is slightly different from the one seen before (no annotation allowed) and of course, you need your good old web.xml file : public class mainservlet25 extends httpservlet { @inject hello hello; @override protected void service(httpservletrequest req, httpservletresponse resp) throws servletexception, ioexception { resp.setcontenttype("text/html"); printwriter out = resp.getwriter(); out.println(""); out.println(""); out.println(""); out.println(saysomething()); out.println(""); out.println(""); out.close(); } public string saysomething() { return hello.sayhelloworld(); } } because we don’t have a @webservlet annotation in servlet 2.5, we need to declare and map it in the web.xml (using the servlet and servlet-mapping tags). then, you need to explicitly specify the servlet listener to boot weld and control its interaction with requests (org.jboss.weld.environment.servlet.listener). tomcat has a read-only jndi, so weld can’t automatically bind the beanmanager extension spi. to bind the beanmanager into jndi, you should populate meta-inf/context.xml and make the beanmanager available to your deployment by adding it to your web.xml: mainservlet25 org.antoniogoncalves.cdi.bootstrapping.servlet.mainservlet25 mainservlet25 /mainservlet org.jboss.weld.environment.servlet.listener beanmanager javax.enterprise.inject.spi.beanmanager the meta-inf/context.xml file is an optional file which contains a context for a single tomcat web application. this can be used to define certain behaviours for your application, jndi resources and other settings. package all the files (mainservlet25, hello, world, meta-inf/context.xml, beans.xml and web.xml) into a war and deploy it into tomcat 6.x. go to http://localhost:8080/bootstrapping-servlet25-tomcat-1.0/mainservlet and you will see your hello world page. jetty 6.x another famous servlet 2.5 containers is jetty 6.x (at codehaus) and jetty 7.x ( note that jetty 8.x will implement servlet 3.0 but it’s still in experimental stage at the time of writing this post ). if you look at the weld documentation, there is actually support for jetty 6.x and 7.x . the code is the same one as tomcat (because it’s a servlet 2.5 container), but the configuration changes. with jetty you need to add two files under web-inf : jetty-env.xml and jetty-web.xml : beanmanager javax.enterprise.inject.spi.beanmanager org.jboss.weld.resources.managerobjectfactory true package all the files (mainservlet25, hello, world, web-inf/jetty-env.xml, web-inf/jetty-web.xml, beans.xml and web.xml) into a war and deploy it into jetty 6.x. go to http://localhost:8080/bootstrapping-servlet25-jetty6/mainservlet and you will see your hello world page. there was a mistake in the weld documentation so i couldn’t make it work. i started a thread on the weld forum and thanks to dan allen , pete muir and all the weld team, this was fixed and i managed to make it work. simple as posting an email to the forum . thanks for your help guys. spring 3.x here is the tricky part. spring 3.x implements the jsr 330 : dependency injection for java , which means that @inject works out of the box. but i didn’t find a way to integrate cdi with spring 3.x . the weld documentation mentions that because of its extension points, “ integration with third-party frameworks such as spring (…) was envisaged by the designers of cdi “. i did find this blog that simulates cdi features by enabling spring ones. what i didn’t find is a clear statement or roadmap on springsource about supporting cdi or not in future releases. the last trace of this topic is a comment on a long tss flaming thread . at that time (16 december 2009), juergen huller said “ with respect to implementing cdi on top of spring (…) trying to hammer it into the semantic frame of another framework such as cdi would be an exercise that is certainly achievable (…) but ultimately pointless “. but if you have any fresh news about it, let me know. conclusion as i said, this post is not about explaining cdi, i’ll do that in future posts. i just wanted to focus on how to bootstrap it in several environments so you can try by yourself. as you saw, it’s much simpler to use cdi within an ejb 3.1 or servlet 3.0 container in java ee 6. i’ve used glassfish 3.x but it should also work with other java ee 6 or web profile containers such as jboss 6 or resin . when you don’t use java ee 6, there is a bit more work to do. depending on your environment or servlet container you need some configuration to bootstrap weld. by the way, i’ve used weld because it’s the reference implementation, the one bunddled with glassfish and jboss. but you could also use openwebbeans , another cdi implementation. download the code , give it a try, and give me some feedback. from http://agoncal.wordpress.com/2011/01/12/bootstrapping-cdi-in-several-environments/
April 28, 2011
by Antonio Goncalves
· 31,384 Views
article thumbnail
Exploring TDD in JavaScript with a small kata
A code kata is an exercise where you focus on your technique instead of on the final product of your mind and fingers. But a kata can also be used as a constant parameter, while other variables change, like in scientific experiments. For example, when learning a new programming language or framework, you can execute an old kata in order to explore it. I decided to perform a small and famous Kata that we used also during interviews to separate programmers from not programmers: the FizzBuzz kata. My goal was to learn how to setup a platform for Test-Driven Development in JavaScript, following the advice of the Test-Driven JavaScript Development book. The parameters that change from my habits are the tools for running tests and the programming language, but my IDE (Unix&Vim) remained fixed along with the Kata: Write a function that returns its numerical argument. But for multiples of three return Fizz instead of the number and for the multiples of five return Buzz. For numbers which are multiples of both three and five return FizzBuzz. Additional requirement: when passed a multiple of 7, return Bang; when passed a multiple of 5 and 7, return BuzzBang; and so on for all the combinations. As my tools for running the tests, I used JsTestDriver and Firefox, as suggested by the book Test-Driven JavaScript Development which I'm currently reading. JsTestDriver JsTestDriver will make you feel the joy of a green bar again. Download its jar, put it somewhere and add an alias in your .bashrc: export JSTESTDRIVER_HOME=~/bin alias jstestdriver="java -jar $JSTESTDRIVER_HOME/JsTestDriver-1.3.2.jar" Start the server: jstestdriver --port 4224 Point an open browser (I used Firefox) to localhost:4224. The browser will ping it via Ajax requests undefinitely to gather tests to run. Now we can use the command line to run tests, like you'll do with PHPUnit if you are a PHPer: jstestdriver --tests all The Kata I started with a simple function, fizzbuzz(), and a single test case. I never wrote a test with JsTestDriver before so I needed to gain some confidence and be sure the configuration file was correct. server: http://localhost:4224 load: - src/*.js - test/*.js In JsTestDriver, a Test case is created by passing to TestCase (global function provided by JsTestDriver) a map containing anonymous functions. TestCase("FizzBuzzTest", { "test should return Fizz when passed 3" : function () { assertEquals("Fizz", fizzbuzz(3)); } }); The functions whose names start with test will be executed; there are some reserved keywords like setUp which are used as hooks for fixture creation. Running the test with the alias command is really simple: jstestdriver --tests all I made the first test pass with fizzbuzz.js, a file containing a first version of the function (with a fake implementation): function fizzbuzz() { return 'Fizz'; } The result? A green bar (metaphorically green; all tests pass.) . Total 1 tests (Passed: 1; Fails: 0; Errors: 0) (0,00 ms) Firefox 4.0 Linux: Run 1 tests (Passed: 1; Fails: 0; Errors 0) (0,00 ms) You can capture more than one browser if you want to run test simultaneously in all of them, but it will probably slow down the TDD basic cycle. You can leave cross-browser testing for later. Going on After this first test, I went on adding new ones and making them pass, until I even converted the function to an object, for the sake of easy configuration (a function returning a function would be the same). Since I also needed to create the object in just one place, I started using setUp for the fixture creation: TestCase("FizzBuzzTest", { setUp : function () { this.fizzbuzz = new FizzBuzz({ 3 : 'Fizz', 5 : 'Buzz', 7 : 'Bang' }); }, "test should return the number when passed 1 or 2" : function () { assertEquals(1, this.fizzbuzz.accept(1)); assertEquals(2, this.fizzbuzz.accept(2)); }, "test should return Fizz when passed 3 or a multiple" : function () { assertEquals("Fizz", this.fizzbuzz.accept(3)); assertEquals("Fizz", this.fizzbuzz.accept(6)); }, "test should return Buzz when passed 5 or a multiple" : function () { assertEquals("Buzz", this.fizzbuzz.accept(5)); assertEquals("Buzz", this.fizzbuzz.accept(10)); }, "test should return FizzBuzz when passed a multiple of both 3 and 5" : function () { assertEquals("FizzBuzz", this.fizzbuzz.accept(15)); assertEquals("FizzBuzz", this.fizzbuzz.accept(30)); }, "test should return Bang when passed a multiple of 7" : function () { assertEquals("Bang", this.fizzbuzz.accept(7)); assertEquals("Bang", this.fizzbuzz.accept(14)); }, "test should return FizzBuzzBang when it is the case" : function () { assertEquals("FizzBuzzBang", this.fizzbuzz.accept(3*5*7)); } }); You can use this to share fixtures between the setUp and the different test methods: the test does not look different from JUnit and PHPUnit ones. Like in all xUnit testing frameworks, the setUp is executed on a brand new object for each test, to preserve isolation. I like a bit the way in which in JavaScript you can tear and put together objects: after all, it's called object-oriented programming, not class-oriented programming. I decided to use a small function constructor as you may infer from the test: function FizzBuzz(correspondences) { this.correspondences = correspondences; this.accept = function (number) { var result = ''; for (var divisor in this.correspondences) { if (number % divisor == 0) { result = result + this.correspondences[divisor]; } } if (result) { return result; } else { return number; } } } All the code is on Github, to see the intermediate steps of the Kata if you need them. You can also use the repository to try out your installation of JsTestDriver: a git pull followed by running the tests will confirm that it's working. Sometimes we don't test code in alien environments like JavaScript console or database queries because we don't know how; but a Kata which takes just two Pomodoros can solve the issue and let you enjoy a green bar even when working with a browser's interpreter.
April 21, 2011
by Giorgio Sironi
· 13,119 Views
article thumbnail
Don't Use JmsTemplate in Spring!
JmsTemplate is easy for simple message sending. What if we want to add headers, intercept or transform the message? Then we have to write more code. So, how do we solve this common task with more configurability in lieu of more code? First, lets review JMS in Spring. Spring JMS Options JmsTemplate – either to send and receive messages inline Use send()/convertAndSend() methods to send messages Use receive()/receiveAndConvert() methods to receive messages. BEWARE: these are blocking methods! If there is no message on the Destination, it will wait until a message is received or times out. MessageListenerContainer – Async JMS message receipt by polling JMS Destinations and directing messages to service methods or MDBs Both JmsTemplate and MessageListenerContainer have been successfully implemented in Spring applications, if we have to do something a little different, we introduce new code. What could possibly go wrong? Future Extensibility? On many projects new use-cases arise, such as: Route messages to different destinations, based on header values or contents? Log the message contents? Add header values? Buffer the messages? Improved response and error handling? Make configuration changes without having to recompile? and more… Now we have to refactor code and introduce new code and test cases, run it through QA, etc. etc. A More Configurable Solution! It is time to graduate Spring JmsTemplate and play with the big kids. We can easily do this with a Spring Integration flow. How it is done with Spring Integration Here we have a diagram illustrating the 3 simple components to Spring Integration replacing the JmsTemplate send. Create a Gateway interface – an interface defining method(s) that accept the type of data you wish to send and any optional header values. Define a Channel – the pipe connecting our endpoints Define an Outbound JMS Adapter – sends the message to your JMS provider (ActiveMQ, RabbitMQ, etc.) Simply inject this into our service classes and invoke the methods. Immediate Gains Add header & header values via the methods defined in the interface Simple invokation of Gateway methods from our service classes Multiple Gateway methods Configure method level or class level destinations Future Gains Change the JMS Adapter (one-way) to a JMS Gateway (two-way) to processes responses from JMS We can change the channel to a queue (buffered) channel We can wire in a transformer for message transformation We can wire in additional destinations, and wire in a “header (key), header value, or content based” router and add another adapter We can wire in other inbound adapters receiving data from another source, such as SMTP, FTP, File, etc. Wiretap the channel to send a copy of the message elsewhere Change the channel to a logging adapter channel which would provide us with logging of the messages coming through Add the “message-history” option to our SI configuration to track the message along its route and more… Optimal JMS Send Solution The Spring Integration Gateway Interface Gateway provides a one or two way communication with Spring Integration. If the method returns void, it is inherently one-way. The interface MyJmsGateway, has one Gateway method declared sendMyMessage(). When this method is invoked by your service class, the first argument will go into a message header field named “myHeaderKey”, the second argument goes into the payload. package com.gordondickens.sijms; import org.springframework.integration.annotation.Gateway;import org.springframework.integration.annotation.Header; public interface MyJmsGateway { @Gateway public void sendMyMessage(@Header("myHeaderKey") String s, Object o);} Spring Integration Configuration Because the interface is proxied at runtime, we need to configure in the Gateway via XML. Sending the Message package com.gordondickens.sijms; import org.junit.Test;import org.junit.runner.RunWith;import org.springframework.beans.factory.annotation.Autowired;import org.springframework.test.context.ContextConfiguration;import org.springframework.test.context.junit4.SpringJUnit4ClassRunner; @ContextConfiguration("classpath:/com/gordondickens/sijms/JmsSenderTests-context.xml")@RunWith(SpringJUnit4ClassRunner.class)public class JmsSenderTests { @Autowired MyJmsGateway myJmsGateway; @Test public void testJmsSend() { myJmsGateway.sendMyMessage("myHeaderValue", "MY PayLoad"); } Summary Simple implementation Invoke a method to send a message to JMS – Very SOA eh? Flexible configuration Reconfigure & restart WITHOUT recompiling – SWEET!
April 21, 2011
by Gordon Dickens
· 84,863 Views
article thumbnail
Eradicating Non-Determinism in Tests
An automated regression suite can play a vital role on a software project, valuable both for reducing defects in production and essential for evolutionary design. In talking with development teams I've often heard about the problem of non-deterministic tests - tests that sometimes pass and sometimes fail. Left uncontrolled, non-deterministic tests can completely destroy the value of an automated regression suite. In this article I outline how to deal with non-deterministic tests. Initially quarantine helps to reduce their damage to other tests, but you still have to fix them soon. Therefore I discuss treatments for the common causes for non-determinism: lack of isolation, asynchronous behavior, remote services, time, and resource leaks. I've enjoyed watching ThoughtWorks tackle many difficult enterprise applications, bringing successful deliveries to many clients who have rarely seen success. Our experiences have been a great demonstration that agile methods, deeply controversial and distrusted when we wrote the manifesto a decade ago, can be used successfully. There are many flavors of agile development out there, but in what we do there is a central role for automated testing. Automated testing was a core approach to Extreme Programming from the beginning, and that philosophy has been the biggest inspiration to our agile work. So we've gained a lot of experience in using automated testing as a core part of software development. Automated testing can look easy when presented in a text book. And indeed the basic ideas are really quite simple. But in the pressure-cooker of a delivery project, trials come up that are often not given much attention in texts. As I know too well, authors have a habit of skimming over many details in order to get a core point across. In my conversations with our delivery teams, one recurring problem that we've run into is tests which have become unreliable, so unreliable that people don't pay much attention to whether they pass or fail. A primary cause of this unreliability is that some tests have become non-deterministic. A test is non-deterministic when it passes sometimes and fails sometimes, without any noticeable change in the code, tests, or environment. Such tests fail, then you re-run them and they pass. Test failures for such tests are seemingly random. Non-determinism can plague any kind of test, but it's particularly prone to affect tests with a broad scope, such as acceptance or functional tests. Why non-deterministic tests are a problem Non-deterministic tests have two problems, firstly they are useless, secondly they are a virulent infection that can completely ruin your entire test suite. As a result they need to be dealt with as soon as you can, before your entire deployment pipeline is compromised. I'll start with expanding on their uselessness. The primary benefit of having automated tests is that they provide bug detection mechanism by acting as regression tests[1]. When a regression test goes red, you know you've got an immediate problem, often because a bug has crept into the system without you realizing. Having such a bug detector has huge benefits. Most obviously it means that you can find and fix bugs just after they are introduced. Not just does this give you the warm fuzzies because you kill bugs quickly, it also makes it easier to remove them since you know the bug got in with the last set of changes that are fresh in your mind. As a result you know where to look for the bug, which is more than half the battle in squashing it. The second level of benefit is that as you gain confidence in your bug detector, you gain the courage to make big changes knowing that when you goof, the bug detector will go off and you can fix the mistake quickly. [2] Without this teams are frightened to make the changes code needs in order to be kept clean, which leads to a rotting of the code base and plummeting development speed. The trouble with non-deterministic tests is that when they go red, you have no idea whether its due to a bug, or just part of the non-deterministic behavior. Usually with these tests a non-deterministic failure is relatively common, so you end up shrugging your shoulders when these tests go red. Once you start ignoring a regression test failure, then that test is useless and you might as well throw it away. Indeed you really ought to throw a non-deterministic test away, since if you don't it has an infectious quality. If you have a suite of 100 tests with 10 non-deterministic tests in them, than that suite will often fail. Initially people will look at the failure report and notice that the failures are in non-deterministic tests, but soon they'll lose the discipline to do that. Once that discipline is lost, then a failure in the healthy deterministic tests will get ignored too. At that point you've lots the whole game and might as well get rid of all the tests. Quarantine My principal aim in this article is to outline common cases of non-deterministic tests and how to eliminate the non-determinism. But before I get there I offer one piece of essential advice: quarantine your non-deterministic tests. If you have non-deterministic tests keep them in a different test suite to your healthy tests. That way you'll you can continue to pay attention to what's going on with your healthy tests and get good feedback from them. Place any non-deterministic test in a quarantined area. (But fix quarantined tests quickly.) Then the question is what to do with the quarantined test suites. They are useless as regression tests, but they do have a future as work items for cleaning up. You should not abandon such tests, since any tests you have in quarantine are not helping you with your regression coverage. A danger here is that tests keep getting thrown into quarantine and forgotten, which means your bug detection system is eroding. As a result it's worthwhile to have a mechanism that ensures that tests don't stay in quarantine too long. I've come across various ways to do this. One is a simple numeric limit: e.g. only allow 8 tests in quarantine. Once you hit the limit you must spend time to clear all the tests out. This has the advantage of batching up your test-cleaning if that's how you like to do things. Another route is to put a time limit on how long a test may be in quarantine, such as no longer than a week. The general approach with quarantine is to take the quarantined tests out of the main deployment pipeline so that you still get your regular build process. However a good team can be more aggressive. Our Mingle team puts its quarantine suite into the deployment pipeline one stage after its healthy tests. That way it can get the feedback from the healthy tests but is also forced to ensure that it sorts out the quarantined tests quickly. [3] Lack of Isolation In order to get tests to run reliably, you must have clear control over the environment in which they run, so you have a well-known state at the beginning of the test. If one test creates some data in the database and leaves it lying around, it can corrupt the run of another test which may rely on a different database state. Therefore I find it's really important to focus on keeping tests isolated. Properly isolated tests can be run in any sequence. As you get to larger operational scope of functional tests, it gets progressively harder to keep tests isolated. When you are tracking down a non-determinism, lack of isolation is a common and frustrating cause. Keep your tests isolated from each other, so that execution of one test will not affect any others. There are a couple of ways to get isolation - either always rebuild your starting state from scratch, or ensure that each test cleans up properly after itself. In general I prefer the former, as it's often easier - and in particular easier to find the source of a problem. If a test fails because it didn't build up the initial state properly, then it's easy to see which test contains the bug. With clean-up, however, one test will contain the bug, but another test will fail - so it's hard to find the real problem. Starting from a blank state is usually easy with unit tests, but can be much harder with functional tests [4] - particularly if you have a lot of data in a database that needs to be there. Rebuilding the database each time can add a lot of time to test runs, so that argues for switching to a clean-up strategy. One trick that's handy when you're using databases, is to conduct your tests inside a transaction, and then to rollback the transaction at the end of the test. That way the transaction manager cleans up for you, reducing the chance of errors[5]. Another approach is to do a single build of a mostly-immutable starting fixture before running a group of tests. Then ensure that the tests don't change that initial state (or if they do, they reverse the changes in tear-down). This tactic is more error-prone than rebuilding the fixture for each test, but it may be worthwhile iff it takes too long to build the fixture each time. Although databases are a common cause for isolation problems, there are plenty of times you can get these in-memory too. In particular be aware with static data and singletons. A good example for this kind of problem is contextual environment, such as the currently logged in user. If you have an explicit tear-down in a test, be wary of exceptions that occur during the tear-down. If this happens the test can pass, but cause isolation failures for subsequent tests. So ensure that if you do get a problem in a tear-down, it makes a loud noise. Some people prefer to put less emphasis on isolation and more on defining clear dependencies to force tests to run in a specified order. I prefer isolation because it gives you more flexibility in running subsets of tests and parallelizing tests. Asynchronous Behavior Asynchrony is a boon that allows you to keep software responsive while taking on long term tasks. Ajax calls allow a browser to stay responsive while going back to the server for more data, asynchronous message allow a server process to communicate with other system without being tied to their laggardly latency. But in testing, asynchrony can be curse. The common mistake here is to throw in a sleep: //pseudo-code makeAsyncCall; sleep(aWhile); readResponse; This can bite you two ways. First off you'll want to set the sleep time to long enough that it gives plenty of time to get the response. But that means that you'll spend a lot of time idly waiting for the response, thus slowing down your tests. The second bite is that, however long you sleep, sometimes it won't be enough. There will be some change in environment that will cause you to exceed the sleep - and you'll get false failure. As a result I strongly urge you to never use bare sleeps like this. Never use bare sleeps to wait for asynchonous responses: use a callback or polling. There are basically two tactics you can do for testing an asynchronous response. The first is for the asynchronous service to take a callback which it can call when done. This is the best since it means you'll never have to wait any longer than you need to [6]. The biggest problem with this is that the environment needs to be able to do this and then the service provider needs to do it. This is one of the advantages of having the development team integrated with testing - if they can provide a callback then they will. The second option is to poll on the answer. This is more than just looking once, but looking regularly, something like this //pseudo-code makeAsyncCall startTime = Time.now; while(! responseReceived) { if (Time.now - startTime > waitLimit) throw new TestTimeoutException; sleep (pollingInterval); } readResponse The point of this approach is that you can set the pollingInterval to a pretty small value, and know that that's the maximum amount of dead time you'll lose to waiting for a response. This means you can set the waitLimit very high, which minimizes the chance of hitting it unless something serious has gone wrong. [7] Make sure you use a clear exception class that indicates this is a test timeout that's failing. This will help make it clear what's gone wrong should it happen, and perhaps allow a more sophisticated test harness to take account of this information in its display. The time values, in particular the waitLimit, should never be literal values. Make sure they are always values that can be easily set in bulk, either by using constants or set through the runtime environment. That way if you need to tweak them (and you will) you can tweak them all quickly. All this advice is handy for async calls where you expect a response from the provider, but how about those where there is no response. These are calls where we invoke a command on something and expect it to happen without any acknowledgment. This is the trickiest case since you can test for your expected response, but there's nothing to do to detect a failure other than timing-out. If the provider is something you're building you can handle this by ensuring the provider implements some way of indicating that it's done - essentially some form of callback. Even if only the testing code uses it, it's worth it - although often you'll find this kind of functionality is valuable for other purposes too[8]. If the provider is someone else's work, you can try persuasion, but otherwise may be stuck. Although this is also a case when using Test Doubles for remote services is worthwhile (which I'll discuss more in the next section). If you have a general failure in something asynchronous, such that it's not responding at all, then you'll always be waiting for timeouts and your test suite will take a long time to fail. To combat this it's a good idea to use a smoke test to check that the asynchronous service is responding at all and stop the test run right away if it isn't. Gerard Meszaros's book, xUnit Test Patterns, contains lots of good patterns for constructing tests. You can also often side-step the asynchrony completely. Gerard Meszaros's Humble Object pattern says that whenever you have some logic that's in a hard-to-test environment, you should isolate the logic you need to test from that environment. In this case it means put most of the logic you need to test in a place where you can test it synchronously. The asynchronous behavior should be as minimal (humble) as possible, that way you don't need that much testing of it. Remote Services Sometimes I'm asked if ThoughtWorks does any integration work, which I find somewhat amusing since there's hardly any project we do that doesn't involve a fair bit of integration. By their nature, enterprise applications involve a great deal of combining data from different systems. These systems are maintained by other teams operating to their own schedules, teams that often use a very different software philosophy to our heavily test-driven agile approach. Testing with such remote systems brings a number of problems, and non-determinism is high on the list. Often remote systems don't have test system we can call, which means hitting a live system. If there is a test system, it may not be stable enough to provide deterministic responses. In this situation it's vital to ensure determinism, so it's time to reach for a Test Double - a component that looks like the remote service, but is really just a pretend version that mimics the remote system's behavior. The double needs to be setup so that provides the right kind of response in interaction with our system, but in a way we control. In this manner we can ensure determinism. Using a double has a downside, in particular when we are testing across a broad scope. How can we be sure that the double behaves in the same way that remote system does? We can tackle this again using tests, a form of test that I call Integration Contract Tests. These run the same interaction with the remote system and the double, and check that the two match. In this case 'match' may not mean coming up with the same result (due to the non-determinisms), but results that share the same essential structure. Integration Contract Tests need to be run frequently, but not part of our system's deployment pipeline. Periodic running based on the rate of the change of the remote system is usually best. For writing these kinds of test doubles, I'm a big fan of Self Initializing Fakes - since these are very simple to manage. Some people are firmly against using Test Doubles in functional tests, believing that you must test with real connection in order to ensure end-to-end behavior. While I sympathize with their argument, automated tests are useless if they are non-deterministic. So any advantage you gain by talking to the real system is overwhelmed by the need to stamp out non-determinism[9]. Time Few things are more non-deterministic than a call to the system clock. Each time you call it, you get a new result, and any tests that depend on it can thus change. Ask for all the todos due in the next hour, and you regularly get a different answer[10]. The most important thing here is to ensure that you always wrap the system clock with routines that can be replaced with a seeded value for testing. A clock stub can be set to particular time and frozen at that time, allowing your tests to have complete control over its movements. That way you can synchronize your test data to the values in the seeded clock.[11][12] Always wrap the system clock, so it can be easily substituted for testing. One thing to watch with this, is that eventually your test data might start having problems because it's too old, and you get conflicts with other time based factors in your application. In this case you can move the data, and your clock seeds to new values. When you do this, ensure that this is the only thing you do. That way you can be sure that any tests that fail are due to time-movement in the test data. Another area where time can be a problem is when you rely on other behaviors from the clock. I once saw a system that generated random keys based on clock values. This systems started failing when it was moved to a faster machine that could allocate multiple ids within a single clock tick.[13] I've heard so many problems due to direct calls to the system clock that I'd argue for finding a way to use code analysis to detect any direct calls to the system clock and failing the build right there. Even a simple regex check might save you a frustrating debugging session after a call at an ungodly hour. Resource Leaks If your application has some kind of resource leak, this will lead to random tests failing, since it's just which test causes the resource leak to go over a limit that gets the failure. This case is awkward because any test can fail intermittently due to this problem. If it isn't a case of one test being non-deterministic then resource leaks are a good candidate to investigate. By resource leak, I mean any resource that the application has to manage by acquiring and releasing. In non-memory-managed environments, the obvious example is memory. Memory-management did much to remove this problem, but other resources still need to be managed, such as database connections. Usually the best way to handle these kind of resources is through a Resource Pool. If you do this then a good tactic is to configure the pool to a size of 1 and make it throw an exception should it get a request for a resource when it has none left to give. That way the first test to request a resource after the leak will fail - which makes it a lot easier to find the problem test. This idea of limiting resource pool sizes, is about increasing constraints to make errors more likely to crop up in tests. This is good because we want errors to show in tests so we can fix them before they manifest themselves in production. This principle can be used in other ways too. One story I heard was of a system which generated randomly named temporary files, didn't clean them up properly, and crashed on a collision. This kind of bug is very hard to find, but one way to manifest it is to stub the randomizer for testing so it always returns the same value. That way you can surface the problem more quickly.
April 14, 2011
by Martin Fowler
· 6,645 Views · 1 Like
article thumbnail
Practical PHP Testing Patterns: Parameterized Test
Sometimes the mechanics of different tests are really the same, and the only thing that changes between them are input and expected output data. As an example, consider testing mathematical calculations functions, or any kind of stateless method: you invoke it always in the same way, and then check the returned value. Another case is that of triangulation in Test-Driven Development. When an implementation is not obvious, triangulating it leads to the creation of multiple tests with different data, but usually the exact same procedure. To eliminate the duplication between these tests, we can code a method which models the whole test, and that accepts input (and possible initial state) and expected output as parameters. This method will perform the arrange, act, assert, and even teardown phases if needed. It will eliminate the need for maintaining the same test code in different places, of course. An advantage of these refactorings that we already cited in Test Utility Method is that adding a new test becomes equivalent to adding one or two lines of code. This pattern is really a specialized, standalone Test Utility Method. Implementation A simple form of Parameterized Test is a private method which is delegated from in the various test*() ones, which are called by the testing framework. The advantage of this first approach is to present more clear names to the code reader, and to support possible additional work to perform after the method is executed (it may return something which we can make further assertions on.) A second way of implementing Parameterized Test is with PHPUnit's @dataProvider annotation. Is it completely automatic, and displays data used in the test instance in case of failure in order to recognize which iteration we are in. You can also prepare input and output for @dataProvider programmatically (read: with code), so with a process as flexible and complex as you want. Variations Some older versions of Parameterized Test exist. A Loop-Driven Test uses nested loops that verify every combination of inputs/outputs. For example, it may check pairs of x and y coordinates in an image. Often exhaustive testing is an overkill and contains logic for the generation of data which may contain bugs. In a Tabular Test, you have a table, represented as code or in some configuration file, where each row corresponds to one test run and contains the necessary input/output relation. One Test Method works on everything here. This mechanism is obsolete as PHPUnit with @dataProvider executes each instance of the method into a different Testcase Object. You won't observe tests influencing each other, no problem with localization of which row is causing a failure. Example The code sample shows you how to go from a strong duplication between tests to a single copy of the code that uses @dataProvider to be executed N times. It also shows, in the third Testcase Class, how easy is to produce data for the tests with PHP code. assertEquals(0, sqrt(0)); } public function testSquareRootIsCalculatedCorrectlyByPHPFor1() { $this->assertEquals(1, sqrt(1)); } public function testSquareRootIsCalculatedCorrectlyByPHPFor4() { $this->assertEquals(2, sqrt(4)); } public function testSquareRootIsCalculatedCorrectlyByPHPFor9() { $this->assertEquals(3, sqrt(9)); } public function testSquareRootIsCalculatedCorrectlyByPHPFor16() { $this->assertEquals(4, sqrt(16)); } public function testSquareRootIsCalculatedCorrectlyByPHPForMinusOne() { $this->assertEquals('i', sqrt(-1)); } } class ParameterizedTest extends PHPUnit_Framework_TestCase { /** * This should be a static method, returning an array of arrays. * Each row of this table-like array is a set of arguments * for the Test Method. */ public static function squaresAndRoots() { return array( array(0, 0), array(1, 1), array(4, 2), array(9, 3), array(16, 4), array(-1, 'i') ); } /** * The annotation requires as unique argument the name of a public method * in this Testcase Class. * @dataProvider squaresAndRoots */ public function testSquareRootIsCalculatedCorrectlyByPHP($square, $root) { $this->assertEquals($root, sqrt($square)); } } class ProgrammaticDataProviderTest extends PHPUnit_Framework_TestCase { private static $width = 5; private static $height = 10; /** * I don't know why these methods are required to be static. It makes * no difference however as we are never going to call it directly. */ public static function everyCoupleOfCoordinates() { $testParameters = array(); for ($x = 1; $x <= self::$width; $x++) { for ($y = 1; $y <= self::$height; $y++) { $testParameters[] = array($x, $y); } } return $testParameters; } /** * I don't advise you to test an image at every pixel, but in some cases * you may need an exhaustive search. The programmatic generation of data * keeps the test code really short, but don't forget that the generation * logic may hide bugs if it becomes too complex. * @dataProvider everyCoupleOfCoordinates */ public function testImageHasCorrectTransparencyValue($x, $y) { $this->fail("This test will be executed in isolation with the x=$x and y=$y values."); } }
April 11, 2011
by Giorgio Sironi
· 5,057 Views
article thumbnail
Java Access to SQL Azure via the JDBC Driver for SQL Server
I’ve written a couple of posts (here and here) about Java and the JDBC Driver for SQL Server with the promise of eventually writing about how to get a Java application running on the Windows Azure platform. In this post, I’ll deliver on that promise. Specifically, I’ll show you two things: 1) how to connect to a SQL Azure Database from a Java application running locally, and 2) how to connect to a SQL Azure database from an application running in Windows Azure. You should consider these as two ordered steps in moving an application from running locally against SQL Server to running in Windows Azure against SQL Azure. In both steps, connection to SQL Azure relies on the JDBC Driver for SQL Server and SQL Azure. The instructions below assume that you already have a Windows Azure subscription. If you don’t already have one, you can create one here: http://www.microsoft.com/windowsazure/offers/. (You’ll need a Windows Live ID to sign up.) I chose the Free Trial Introductory Special, which allows me to get started for free as long as keep my usage limited. (This is a limited offer. For complete pricing details, see http://www.microsoft.com/windowsazure/pricing/.) After you purchase your subscription, you will have to activate it before you can begin using it (activation instructions will be provided in an email after signing up). Connecting to SQL Azure from an application running locally I’m going to assume you already have an application running locally and that it uses the JDBC Driver for SQL Server. If that isn’t the case, then you can start from scratch by following the steps in this post: Getting Started with the SQL Server JDBC Driver. Once you have an application running locally, then the process for running that application with a SQL Azure back-end requires two steps: 1. Migrate your database to SQL Azure. This only takes a couple of minutes (depending on the size of your database) with the SQL Azure Migration Wizard - follow the steps in the Creating a SQL Azure Server and Creating a SQL Azure Database sections of this post. 2. Change the database connection string in your application. Once you have moved your local database to SQL Azure, you only have to change the connection string in your application to use SQL Azure as your data store. In my case (using the Northwind database), this meant changing this… String connectionUrl = "jdbc:sqlserver://serverName\\sqlexpress;" + "database=Northwind;" + "user=UserName;" + "password=Password"; …to this… String connectionUrl = "jdbc:sqlserver://xxxxxxxxxx.database.windows.net;" + "database=Northwind;" + "user=UserName@xxxxxxxxxx;" + "password=Password"; (where xxxxxxxxxx is your SQL Azure server ID). Connecting to SQL Azure from an application running in Windows Azure The heading for this section might be a bit misleading. Once you have a locally running application that is using SQL Azure, then all you have to do is move your application to Windows Azure. The connecting part is easy (see above), but moving your Java application to Windows Azure takes a bit more work. Fortunately, Ben Lobaugh has written a great post that that shows how to use the Windows Azure Starter Kit for Java to get a Java application (a JSP application, actually) running in Windows Azure: Deploying a Java application to Windows Azure with Command-Line Ant. (If you are using Eclipse, see Ben’s related post: Deploying a Java application to Windows Azure with Eclipse.) I won’t repeat his work here, but I will call out the steps I took in modifying his instructions to deploy a simple JSP page that connects to SQL Azure. 1. Add the JDBC Driver for SQL Server to the Java archive. One step in Ben’s tutorial (see the Select the Java Runtime Environment section) requires that you create a .zip file from your local Java installation and add it to your Java/Azure application. Most likely, your local Java installation references the JDBC driver by setting the classpath environment variable. When you create a .zip file from your java installation, the JDBC driver will not be included and the classpath variable will not be set in the Azure environment. I found the easiest way around this was to simply add the sqljdbc4.jar file (probably located in C:\Program Files\Microsoft SQL Server JDBC Driver\sqljdbc_3.0\enu) to the \lib\ext directory of my local Java installation before creating the .zip file. Note: You can put the JDBC driver in a separate directory, include it when you create the .zip folder, and set the classpath environment variable in the startup.bat script. But, I found the above approach to be easier. 2. Modify the JSP page. Instead of the code Ben suggests for the HelloWorld.jsp file (see the Prepare your Java Application section), use code from your locally running application. In my case, I just used the code from this post after changing the connection string and making a couple minor JSP-specific changes: Northwind Customers That’s it!. To summarize the steps… Migrate your database to SQL Azure with the SQL Azure Migration Wizard. Change the database connection in your locally running application. Use the Windows Azure Starter Kit for Java to move your application to Windows Azure. (You’ll need to follow instructions in this post and instructions above.) Thanks. -Brian
March 30, 2011
by Brian Swan
· 18,887 Views
article thumbnail
CDI Dependency Injection - An Introductory Java EE Tutorial Part 1
This article discusses dependency injection in a tutorial format. It covers some of the features of CDI such as type safe annotations configuration, alternatives and more.
March 28, 2011
by Rick Hightower
· 367,191 Views · 17 Likes
article thumbnail
A story about User Stories; Where do you start and what about the planning?
In this multi-part post, I’m going to share my personal experiences while working with user stories for gathering, tracking and planning requirements. It currently consists out of three parts: What are they and why do you need them? Who writes them and how do you control scope? Where do you start and what about the planning? You can also download all parts as one comprehensive PDF for easy printing or e-reading. Where do you start? Suppose that after intensive discussions and tough scoping sessions you ended up with a list of user stories and are about to start building the system. The first story not only needs to realize some particular feature, but also involves building a skeleton implementation of the system’s architecture. How do you avoid spending way too much time on plumbing and other general purpose stuff you need for the rest of the stories? The article Managing the Bootstrap Story by Jennitta Andrea addressed this challenge in more detail and offers some alternative solutions. One of these solutions is to find and define a user story with the product owner that offers minimal functionality yet still has project value. Such a story is often referred to as the backbone story because you realize the backbone of your system in it. It’s quite common to use the backbone story to realize a proof-of-concept (PoC) that verifies the chosen architecture. Since a working PoC can give the product owner confidence that the team is able to build such a product, that fact alone may be enough project value for the product owner. More storyotypes? You might have suspected it already, but that backbone story is just an example of another storyotype. In fact, after I started looking for an approach to capture the non-functional requirements of a project or system, I ran into a slide deck that mentioned a whole set of additional storyotypes. Dan Rawsthorne, the author, tried to define a storyotype for virtually every possible thing you might need to do in a project. Personally I think he went a bit too far, but a small set of additional storyotypes proved to be very useful anyway. Storyotype Description Compound Epic A composite user story that groups a number of stories in a logical sense. Complex Epic A user story whose content and impact must be determined later in the project, but for which it is clear that it involves a significant amount of work. Setup A story that is used to setup the project environment, including a source control environment, a project website, a build server. Technical A story that involves making a technical improvement or adjustment. Examples include introducing a coding standard, refactoring a poor design, executing a performance test. Documentation A story for writing a user manual, installation manual, etc. Training A story for developing and/or hosting a training, or having a workshop with end users. Quality Improvements A story which objective is to fix a collection of related bugs, or spent a fixed amount of time to improve the quality of the code base. Spike A story that aims to do a technical investigation to determine the usability of a specific technology, or for trying an alternative technical solution. When is the story complete? So how do you know that a user story has been successfully realized? Well, if all is good, all stories will conform with INVEST and are associated with a number of acceptance criteria (typically written down as the how-to demo) specified by the product owner. That should be enough to determine if it is functionally sound. But what you still miss is a way of explaining the stakeholders, including the product owner, when the team treats the story as finished. That may differ by team, but usually includes some or more of the following criteria. The code compiles and there are no warnings or errors. The code meets the coding standards setup by the project or the organization. The code is reviewed by a peer developer. All automated unit and integration tests have completed successfully. Visual Studio’s static code analysis tool does not report any violations. ReSharper reports no potential errors (a.k.a. everything is 'green'). The daily integration build has completed successfully. The functionality was tested by another member of the team (anybody but the developer). The feature or functionality has been signed off using the project checklist. The system functionality is tested by a tester. The visual look and feel is has been approved by an employee of the communications department. Together with the story’s how-to demo these criteria are commonly referred to as the definition-of-done. Usually, a team or project will have a default definition-of-done that applies to all stories and only mentions the particulars of that story if necessary. Then what about the planning? User stories are an excellent unit for tracking progress within your project. However, purists within the Agile community will tell you that an Agile project will have no long term plan. Instead, the functionality is realized iteratively according to the priority defined by the product owner. I agree with the latter and believe that its iterative nature is essential for dealing with the changing requirements that are common in all projects. It allows deferring decisions to the last responsible moment, and that’s always a good thing. But in reality you often can’t escape from providing at least a rough schedule to your management. How should you deal with that? What I often do to get all stakeholders to join me in a number of workshops. Using use case diagrams to illustrate the context of the discussions, I try to get enough stories on paper to represent the entire scope of the project. You need to beware though that you don’t write down too much details or have too much in-depth discussions. That would give the stakeholders a false sense of precision, and consequently, will cause them to see the stories as a formal functional design. Also, if you run into some high-level chunk of functionality for which nobody really knows what it will look like, add an epic story for it and include a spike to elaborate on the epic later on in the project. Then organize a number of shorter meetings with the team or, if the team hasn’t been formed yet, with a few experienced developers. Let them discuss every story one by one and then try to estimate the size of each story in so called story points. Some people from the Agile community say you should estimate using relative sizes only. In other words, a story that seems to require twice as much work as another story should also have twice as many story points. The story point as a unit does not have value. It’s the relative differences that are important. What works for me is that every story point corresponds to the ideal day of an experienced senior software developer. In other words, one story point means that an experienced developer familiar with the chosen architecture, technology and project methodology needs to work for 8 hours without being disturbed by telephone, email, coffee breaks, or any other distractions. Mike Cohn, author of User Stories Applied, has dedicated many chapters to this estimation technique. Ideally, each story is between 1 and 8 story points, but at the beginning of the project you still may have some epics to break up. After finishing those meetings you should have an estimate of the total size of the project. Now, in order to get from those story points to a total number of hours you need to estimate the expected productivity of the team. Mike Cohn does this by creating a table with the expected roles, their availability (to deal with part time employees), and the expected productivity compared to the ideal senior developer (as a percentage). By calculating the average productivity and multiplying it with the number of story points you’ll end up with the total number of estimated man-hours. It’s only an estimate and both the productivity can be disappointing as well as the estimate in story points may appear to be wrong. But it still gives you an initial estimate that can be used for global planning and budget discussions. Obviously it is important to ensure that you keep on continuously measuring the actual productivity. Wow, now what? By now, it should be clear that a user story is not an independent concept but something that closely resonates with many of the aspects of our work in the software industry. In this multi-part post I have tried to explain a number of those aspects and to clarify the relationship between them. But even though I’ve not touched everything as detailed as possible, I still hope I've managed to convince you about the power and potential of user stories. Last but not least, if you have any questions or comments, please do not hesitate to email me at [email protected] or tweet me at my Twitter ID ddoomen.
March 17, 2011
by Dennis Doomen
· 7,398 Views
  • Previous
  • ...
  • 584
  • 585
  • 586
  • 587
  • 588
  • 589
  • 590
  • 591
  • 592
  • 593
  • Next
  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook
×