The Latest and Popular SDLC Topics

The Latest Popular Topics

i often get asked about ‘rest to soap’ transformation use cases these days. using an soa gateway like securespan to perform this type of transformation at runtime is trivial to setup. with securespan in front of any existing web service (in the dmz for example), you can virtualize a rest version of this same service. using an example, here is a description of the steps to perform this conversion. imagine the geoloc web service for recording geographical locations. it has two methods, one for setting a location and one for getting a location. see below what this would look like in soap. request: 34802398402 response: 52.37706 4.889721 request: 34802398402 52.37706 4.889721 response: ok here is the equivalent rest target that i want to support at the edge. payloads could be xml, but let’s use json to make it more interesting. get /position/34802398402 http 200 ok content-type: text/json { 'latitude' : 52.37706 'longitude' : 4.889721 } post /position/34802398402 content-type: text/json { 'latitude' : 52.37706 'longitude' : 4.889721 } http 200 ok ok now let’s implement this rest version of the service using securespan. i’m assuming that you already have a securespan gateway deployed between the potential rest requesters and the existing soap web service. first, i will create a new service endpoint on the gateway for this service and assign anything that comes at the uri pattern /position/* to this service. i will also allow the http verbs get and post for this service. rest geoloc service properties next, let’s isolate the resource id from the uri and save this as a context variable named ‘trackerid’. we can use a simple regex assertion to accomplish this. also, i will branch on the incoming http verb using an or statement. i am just focusing on get and post for this example but you could add additional logic for other http verbs that you want to support for this rest service. regex for rest service resource identification policy branching for get vs post for get requests, the transformation is very simple, we just declare a message variable using a soap skeleton into which we refer to the trackerid variable. soap request template this soap message is routed to the existing web service and the essential elements are isolated using xpath assertions. processing soap response the rest response is then constructed back using a template response. template json response a similar logic is performed for the post message. see below for the full policy logic. complete policy you’re done for virtualizing the rest service. setting this up with securespan took less than an hour, did not require any change on the existing soap web service and did not require the deployment of an additional component. from there, you would probably enrich the policy to perform some json schema validation , some url and query parameter validation, perhaps some authentication, authorization , etc.

August 4, 2011

by Francois Lascelles

· 37,556 Views

Java Tools for Source Code Optimization and Analysis

Below is a list of some tools that can help you examine your Java source code for potential problems: 1. PMD from http://pmd.sourceforge.net/ License: PMD is licensed under a “BSD-style” license PMD scans Java source code and looks for potential problems like: * Possible bugs – empty try/catch/finally/switch statements * Dead code – unused local variables, parameters and private methods * Suboptimal code – wasteful String/StringBuffer usage * Overcomplicated expressions – unnecessary if statements, for loops that could be while loops * Duplicate code – copied/pasted code means copied/pasted bugs You can download everything from here, and you can get an overview of all the rules at the rulesets index page. PMD is integrated with JDeveloper, Eclipse, JEdit, JBuilder, BlueJ, CodeGuide, NetBeans/Sun Java Studio Enterprise/Creator, IntelliJ IDEA, TextPad, Maven, Ant, Gel, JCreator, and Emacs. 2. FindBug from http://findbugs.sourceforge.net License: L-GPL FindBugs, a program which uses static analysis to look for bugs in Java code. And since this is a project from my alumni university (IEEE – University of Maryland, College Park – Bill Pugh) , I have to definitely add this contribution to this list. 3. Clover from http://www.cenqua.com/clover/ License: Free for Open Source (more like a GPL) Measures statement, method, and branch coverage and has XML, HTML, and GUI reporting. and comprehensive plug-ins for major IDEs. * Improve Test Quality * Increase Testing Productivity * Keep Team on Track Fully integrated plugins for NetBeans, Eclipse , IntelliJ IDEA, JBuilder and JDeveloper. These plugins allow you to measure and inspect coverage results without leaving the IDE. Seamless Integration with projects using Apache Ant and Maven. * Easy integration into legacy build systems with command line interface and API. Fast, accurate, configurable, detailed coverage reporting of Method, Statement, and Branch coverage. Rich reporting in HTML, PDF, XML or a Swing GUI Precise control over the coverage gathering with source-level filtering. Historical charting of code coverage and other metrics. Fully compatible with JUnit 3.x & 4.x, TestNG, JTiger and other testing frameworks. Can also be used with manual, functional or integration testing. 4. Macker from http://innig.net/macker/ License: GPL Macker is a build-time architectural rule checking utility for Java developers. It’s meant to model the architectural ideals programmers always dream up for their projects, and then break — it helps keep code clean and consistent. You can tailor a rules file to suit a specific project’s structure, or write some general “good practice” rules for your code. Macker doesn’t try to shove anybody else’s rules down your throat; it’s flexible, and writing a rules file is part of the development process for each unique project. 5 EMMA from http://emma.sourceforge.net/ License: EMMA is distributed under the terms of Common Public License v1.0 and is thus free for both open-source and commercial development. Reports on class, method, basic block, and line coverage (text, HTML, and XML). EMMA can instrument classes for coverage either offline (before they are loaded) or on the fly (using an instrumenting application classloader). Supported coverage types: class, method, line, basic block. EMMA can detect when a single source code line is covered only partially. Coverage stats are aggregated at method, class, package, and “all classes” levels. Output report types: plain text, HTML, XML. All report types support drill-down, to a user-controlled detail depth. The HTML report supports source code linking. Output reports can highlight items with coverage levels below user-provided thresholds. Coverage data obtained in different instrumentation or test runs can be merged together. EMMA does not require access to the source code and degrades gracefully with decreasing amount of debug information available in the input classes. EMMA can instrument individial .class files or entire .jars (in place, if desired). Efficient coverage subset filtering is possible, too. Makefile and ANT build integration are supported on equal footing. EMMA is quite fast: the runtime overhead of added instrumentation is small (5-20%) and the bytecode instrumentor itself is very fast (mostly limited by file I/O speed). Memory overhead is a few hundred bytes per Java class. EMMA is 100% pure Java, has no external library dependencies, and works in any Java 2 JVM (even 1.2.x). 6. XRadar from http://xradar.sourceforge.net/ License: BSD (me thinks) The XRadar is an open extensible code report tool currently supporting all Java based systems. The batch-processing framework produces HTML/SVG reports of the systems current state and the development over time – all presented in sexy tables and graphs. The XRadar gives measurements on standard software metrics such as package metrics and dependencies, code size and complexity, code duplications, coding violations and code-style violations. 7. Hammurapi from Hammurapi Group License: (if anyone knows the license for this email me Venkatt.Guhesan at Y! dot com) Hammurapi is a tool for execution of automated inspection of Java program code. Following the example of 282 rules of Hammurabi’s code, we are offered over 120 Java classes, the so-called inspectors, which can, at three levels (source code, packages, repository of Java files), state whether the analysed source code contains violations of commonly accepted standards of coding. Relevant Links: http://en.sdjournal.org/products/articleInfo/93 http://wiki.hammurapi.biz/index.php?title=Hammurapi_4_Quick_Start 8. Relief from http://www.workingfrog.org/ License: GPL Relief is a design tool providing a new look on Java projects. Relying on our ability to deal with real objects by examining their shape, size or relative place in space it gives a “physical” view on java packages, types and fields and their relationships, making them easier to handle. 9. Hudson from http://hudson-ci.org/ License: MIT Hudson is a continuous integration (CI) tool written in Java, which runs in a servlet container, such as Apache Tomcat or the GlassFish application server. It supports SCM tools including CVS, Subversion, Git and Clearcase and can execute Apache Ant and Apache Maven based projects, as well as arbitrary shell scripts and Windows batch commands. 10. Cobertura from http://cobertura.sourceforge.net/ License: GNU GPL Cobertura is a free Java tool that calculates the percentage of code accessed by tests. It can be used to identify which parts of your Java program are lacking test coverage. It is based on jcoverage. 11. SonarSource from http://www.sonarsource.org/ (recommended by Vishwanath Krishnamurthi – thanks) License: LGPL Sonar is an open platform to manage code quality. As such, it covers the 7 axes of code quality: Architecture & Design, Duplications, Unit Tests, Complexity, Potential bugs, Coding rules, Comments. From http://mythinkpond.wordpress.com/2011/07/14/java-tools-for-source-code-optimization-and-analysis/

July 29, 2011

by Venkatt Guhesan

· 64,775 Views

Java: What is the Limit to the Number of Threads You Can Create?

I have seen a number of tests where a JVM has 10K threads. However, what happens if you go beyond this? My recommendation is to consider having more servers once your total reaches 10K. You can get a decent server for $2K and a powerful one for $10K. Creating threads gets slower The time it takes to create a thread increases as you create more thread. For the 32-bit JVM, the stack size appears to limit the number of threads you can create. This may be due to the limited address space. In any case, the memory used by each thread's stack add up. If you have a stack of 128KB and you have 20K threads it will use 2.5 GB of virtual memory. Bitness Stack Size Max threads 32-bit 64K 32,073 32-bit 128K 20,549 32-bit 256K 11,216 64-bit 64K stack too small 64-bit 128K 32,072 64-bit 512K 32,072 Note: in the last case, the thread stacks total 16 GB of virtual memory. Java 6 update 26 32-bit,-XX:ThreadStackSize=64 4,000 threads: Time to create 4,000 threads was 0.522 seconds 8,000 threads: Time to create 4,000 threads was 1.281 seconds 12,000 threads: Time to create 4,000 threads was 1.874 seconds 16,000 threads: Time to create 4,000 threads was 2.725 seconds 20,000 threads: Time to create 4,000 threads was 3.333 seconds 24,000 threads: Time to create 4,000 threads was 4.151 seconds 28,000 threads: Time to create 4,000 threads was 5.293 seconds 32,000 threads: Time to create 4,000 threads was 6.636 seconds After creating 32,073 threads, java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:640) at com.google.code.java.core.threads.MaxThreadsMain.addThread(MaxThreadsMain.java:46) at com.google.code.java.core.threads.MaxThreadsMain.main(MaxThreadsMain.java:16) Java 6 update 26 32-bit,-XX:ThreadStackSize=128 4,000 threads: Time to create 4,000 threads was 0.525 seconds 8,000 threads: Time to create 4,000 threads was 1.239 seconds 12,000 threads: Time to create 4,000 threads was 1.902 seconds 16,000 threads: Time to create 4,000 threads was 2.529 seconds 20,000 threads: Time to create 4,000 threads was 3.165 seconds After creating 20,549 threads, java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:640) at com.google.code.java.core.threads.MaxThreadsMain.addThread(MaxThreadsMain.java:46) at com.google.code.java.core.threads.MaxThreadsMain.main(MaxThreadsMain.java:16) Java 6 update 26 32-bit,-XX:ThreadStackSize=128 4,000 threads: Time to create 4,000 threads was 0.526 seconds 8,000 threads: Time to create 4,000 threads was 1.212 seconds After creating 11,216 threads, java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:640) at com.google.code.java.core.threads.MaxThreadsMain.addThread(MaxThreadsMain.java:46) at com.google.code.java.core.threads.MaxThreadsMain.main(MaxThreadsMain.java:16) Java 6 update 26 64-bit,-XX:ThreadStackSize=128 4,000 threads: Time to create 4,000 threads was 0.577 seconds 8,000 threads: Time to create 4,000 threads was 1.292 seconds 12,000 threads: Time to create 4,000 threads was 1.995 seconds 16,000 threads: Time to create 4,000 threads was 2.653 seconds 20,000 threads: Time to create 4,000 threads was 3.456 seconds 24,000 threads: Time to create 4,000 threads was 4.663 seconds 28,000 threads: Time to create 4,000 threads was 5.818 seconds 32,000 threads: Time to create 4,000 threads was 6.792 seconds After creating 32,072 threads, java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:640) at com.google.code.java.core.threads.MaxThreadsMain.addThread(MaxThreadsMain.java:46) at com.google.code.java.core.threads.MaxThreadsMain.main(MaxThreadsMain.java:16) Java 6 update 26 64-bit,-XX:ThreadStackSize=512 4,000 threads: Time to create 4,000 threads was 0.577 seconds 8,000 threads: Time to create 4,000 threads was 1.292 seconds 12,000 threads: Time to create 4,000 threads was 1.995 seconds 16,000 threads: Time to create 4,000 threads was 2.653 seconds 20,000 threads: Time to create 4,000 threads was 3.456 seconds 24,000 threads: Time to create 4,000 threads was 4.663 seconds 28,000 threads: Time to create 4,000 threads was 5.818 seconds 32,000 threads: Time to create 4,000 threads was 6.792 seconds After creating 32,072 threads, java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:640) at com.google.code.java.core.threads.MaxThreadsMain.addThread(MaxThreadsMain.java:46) at com.google.code.java.core.threads.MaxThreadsMain.main(MaxThreadsMain.java:16) The Code MaxThreadsMain.java From http://vanillajava.blogspot.com/2011/07/java-what-is-limit-to-number-of-threads.html

July 26, 2011

by Peter Lawrey

· 148,484 Views · 1 Like

Five reasons why you should rejoice about Kotlin

As you probably saw by now, JetBrains just announced that they are working on a brand new statically typed JVM language called Kotlin. I am planning to write a post to evaluate how Kotlin compares to the other existing languages, but first, I’d like to take a slightly different angle and try to answer a question I have already seen asked several times: what’s the point? We already have quite a few JVM languages, do we need any more? Here are a few reasons that come to mind. 1) Coolness New languages are exciting! They really are. I’m always looking forward to learning new languages. The more foreign they are, the more curious I am, but the languages I really look forward to discovering are the ones that are close to what I already know but not identical, just to find out what they did differently that I didn’t think of. Allow me to make a small digression in order to clarify my point. Some time ago, I started learning Japanese and it turned out to be the hardest and, more importantly, the most foreign natural language I ever studied. Everything is different from what I’m used to in Japanese. It’s not just that the grammar, the syntax and the alphabets are odd, it’s that they came up with things that I didn’t even think would make any sense. For example, in English (and many other languages), numbers are pretty straightforward and unique: one bag, two cars, three tickets, etc… Now, did it ever occur to you that a language could allow several words to mean “one”, and “two”, and “three”, etc…? And that these words are actually not arbitrary, their usage follows some very specific rules. What could these rules be? Well, in Japanese, what governs the word that you pick is… the shape of the object that you are counting. That’s right, you will use a different way to count if the object is long, flat, a liquid or a building. Mind-boggling, isn’t it? Here is another quick example: in Russian, each verb exists in two different forms, which I’ll call A and B to simplify. Russian doesn’t have a future tense, so when you want to speak at the present tense, you’ll conjugate verb A in the present, and when you want the future, you will use the B form… in the present tense. It’s not just that you need to learn two verbs per verb, you also need to know which one is which if you want to get your tenses right. Oh and these forms also have different meanings when you conjugate them in past tenses. End of digression. The reason why I am mentioning this is because this kind of construct bends your mind, and this goes for natural languages as much as programming languages. It’s tremendously exciting to read new syntaxes this way. For that reason alone, the arrival of new languages should be applauded and welcome. Kotlin comes with a few interesting syntactic innovations of its own, which I’ll try to cover in a separate post, but for now, I’d like to come back to my original point, which was to give you reasons why you should be excited about Kotlin, so let’s keep going down the list. 2) IDE support None of the existing JVM languages (Groovy, Scala, Fantom, Gosu, Ceylon) have really focused much on the tooling aspect. IDE plug-ins exist for each of them, all with varying quality, but they are all an afterthought, and they suffer from this oversight. The plug-ins are very slow to mature, they have to keep up with the internals of a compiler that’s always evolving and which, very often, doesn’t have much regards for the tools built on top of it. It’s a painful and frustrating process for tool creators and tool users alike. With Kotlin, we have good reasons to think that the IDE support will be top notch. JetBrains is basically announcing that they are building the compiler and the IDEA support in lockstep, which is a great way to guarantee that the language will work tremendously well inside IDEA, but also that other tools should integrate nicely with it as well (I’m rooting for a speedy Eclipse plug-in, obviously). 3) Reified generics This is a pretty big deal. Not so much for the functionality (I’ll get back to this in the next paragraph) but because this is probably the very first time that we see a JVM language with true support for reified generics. This innovation needs to be saluted. Correction from the comments: Gosu has reified generics Having said that, I don’t feel extremely excited by this feature because overall, I think that reified generics come at too high a price. I’ll try to dedicate a full blog post to this topic alone, because it deserves a more thorough treatment. 4) Commercial support JetBrains has a very clear financial interest in seeing Kotlin succeed. It’s not just that they are a commercial entity that can put money behind the development of the language, it’s also that the success of the language would most likely mean that they will sell more IDEA licenses, and this can also turn into an additional revenue stream derived from whatever other tools they might come up with that would be part of the Kotlin ecosystem. This kind of commercial support for a language was completely unheard of in the JVM world for fifteen years, and suddenly, we have two instances of it (Typesafe and now JetBrains). This is a good sign for the JVM community. 5) Still no Java successors Finally, the simple truth is that we still haven’t found any credible Java successor. Java still reigns supreme and is showing no sign of giving away any mindshare. Pulling numbers out of thin air, I would say that out of all the code currently running on the JVM today, maybe 94% of it is in Java, 3% is in Groovy and 1% is in Scala. This 94% figure needs to go down, but so far, no language has stepped up to whittle it down significantly. What will it take? Obviously, nothing that the current candidates are offering (closures, modularity, functional features, more concise syntax, etc…) has been enough to move that needle. Something is still missing. Could the missing piece be “stellar IDE support” or “reified generics”? We don’t know yet because no contender offers any of these features, but Kotlin will, so we will soon know. Either way, I am predicting that we will keep seeing new JVM languages pop up at a regular pace until one finally claims the prize. And this should be cause for rejoicing for everyone interested in the JVM ecosystem. So let’s cheer for Kotlin and wish JetBrains the best of luck with their endeavor. I can’t wait to see what will come out of this. Oh, and secretly, I am rooting for Eclipse to start working on their own JVM language too, obviously. From http://beust.com/weblog/2011/07/20/five-reasons-why-should-rejoice-about-kotlin/

July 22, 2011

by Cedric Beust

· 8,483 Views

Java Collection Performance

Learn more about Java collection performance in this post.

July 18, 2011

by Leo Lewis

· 123,358 Views · 10 Likes

Human Readable vs Machine Readable Formats

Most file/serialization formats can be broadly broking into two formats, Human Readable Text and Machine Readble Binary. The Human Readable formats have the advantage of being easily understood by a person reading them. Machine readable formats are easier/faster for a machine to encode/decode. There are formats which attempt to be a little of both. XML, JSon, CSV are examples of these. However these do not achieve close to the performance a binary format can achieve. Myth: Machine Readable Binary is always more compact than a Human Readable Binary can be more compact, however the obscurity of its format makes it difficult to ensure every byte counts. i.e. its usually hard enough getting something work. Making it compact as well is an added complication. However with Human Readable formats, determing how the format can be made more compact is more easily understood. As text: 38 bytes long, [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] As binary: 290 bytes long, ....sr..java.util.ArrayListx.....a....I..sizexp....w.....sr..java.lang.Long; .....#....J..valuexr..java.lang.Number...........xp........sq.~..........sq.~ ..........sq.~..........sq.~..........sq.~..........sq.~..........sq.~ ..........sq.~..........sq.~..........sq.~..........sq.~..........x Even though the first format is more compact, you can immedately see you could drop the [ ] and spaces after the ", " to make it more compact. With the binary formats, it is hard to know where to start. ComparingHumanReadableToBinaryMain.java List longs = new ArrayList(); for(long i=-1;i<=10;i++) longs.add(i); String asText = longs.toString(); byte[] bytes1 = asText.getBytes(); System.out.println("As text: "+ bytes1.length+" bytes long, "+asText); ByteArrayOutputStream baos = new ByteArrayOutputStream(); ObjectOutputStream oos = new ObjectOutputStream(baos); oos.writeObject(longs); oos.close(); byte[] bytes2 = baos.toByteArray(); System.out.println("As binary: "+bytes2.length+" bytes long, " +new String(bytes2, 0).replaceAll("[^\\p{Graph}]", ".")); Myth: Machine Readable Binary is always faster than a Human Readable Its assumed the cost of parsing data in a human readable format always makes it slower, however machine sreadbale formats have to deal with an issue human readbale formats takes for granted, that is byte endianness. For human readable formats the order of digits is fairly obvious, however for machine formats the byte endianess of the data might not match that the natrual byte order of the CPU, leading to a source of overhead (as it has to swap the bytes around) One example of this is using big-endian (e.g. TCP/Network byte order) on a little endian machine e.g. Windows/Linux Intel/AMD. A common class which has this issue is DataInputStream and DataOutputStream which re-arranges the byte order (even if the native byte order matches) For this reason, a fast human readable parse can be as fast or faster. In an earlier article I showed how a Human Readable format could be used to read/write integers 30% faster than using DataInput/DataOuput. Writing human readable data faster than binary. Myth: Using a Human Readable Format makes it easy to read Just using a human readable format doesn't mean it will be easier to read than a machine readable format. Reusing existing tools as much as possible makes human readable format preferrable. However, machine readable formats can come with tools which decode the data and make maintain it easier. If you have data which can only be managed with the use of specialist tools, being human readable is not much advantage. Images are a good example of where a machine readable format is the best option. It is hard to image editing or viewing an image without the need for a specialist tool. A practical human readable format would undoubtably lower the quality of the image. ;) ________/.- ,’_______`-. \ _________\ /`__________\’/ _________ /___’a___a`___\ _________|____,’(_)`.____ | _________\___( ._|_. )___ / __________\___ .__,’___ / __________.-`._______,’-.__ ________,’__,’___`-’___`.__`. _______/____/____V_____\___\_ _____,’____/_____o______\___`.__ ___,’_____|______o_______|_____`. __|_____,’|______o_______|`._____| ___`.__,’_.-\_____o______/-._`.__,’ __________/_`.___o____,’__\_ __.””-._,’_____`._:_,’_____`.,-””._ _/_,-._`_______)___(________’_,-.__\ (_(___`._____,’_____`.______,’___)_) _\_\____\__,’________`.____/.___/_/ On the other hand human readable formats can be almost as obscure. This is a piece of code written in a language I am not worthy of mentioning. ;) Its is descibed as "used to list all of the prime numbers between 1 and R" (!R)@&{&/x!/:2_!x}'!R Conclusion If you are designing a file format, start with a human readable one as its much easier to understand. If this is not compact enough, consider compressing it. If it is not fast enough concider making it a binary format, but make sure it really is faster to use such a format. If you are going to use a binary format make sure you have tools in place to supprot viewing (possibly editing) the data (which you would get for free with a text format) From http://vanillajava.blogspot.com/2011/07/human-readable-vs-machine-readble.html

July 13, 2011

by Peter Lawrey

· 21,563 Views · 5 Likes

Lucene's near-real-time search is fast!

Lucene's near-real-time (NRT) search feature, available since 2.9, enables an application to make index changes visible to a new searcher with fast turnaround time. In some cases, such as modern social/news sites (e.g., LinkedIn, Twitter, Facebook, Stack Overflow, Hacker News, DZone, etc.), fast turnaround time is a hard requirement. Fortunately, it's trivial to use. Just open your initial NRT reader, like this: // w is your IndexWriter IndexReader r = IndexReader.open(w, true); (That's the 3.1+ API; prior to that use w.getReader() instead). The returned reader behaves just like one opened with IndexReader.open: it exposes the point-in-time snapshot of the index as of when it was opened. Wrap it in an IndexSearcher and search away! Once you've made changes to the index, call r.reopen() and you'll get another NRT reader; just be sure to close the old one. What's special about the NRT reader is that it searches uncommitted changes from IndexWriter, enabling your application to decouple fast turnaround time from index durability on crash (i.e., how often commit is called), something not previously possible. Under the hood, when an NRT reader is opened, Lucene flushes indexed documents as a new segment, applies any buffered deletions to in-memory bit-sets, and then opens a new reader showing the changes. The reopen time is in proportion to how many changes you made since last reopening that reader. Lucene's approach is a nice compromise between immediate consistency, where changes are visible after each index change, and eventual consistency, where changes are visible "later" but you don't usually know exactly when. With NRT, your application has controlled consistency: you decide exactly when changes must become visible. Recently there have been some good improvements related to NRT: New default merge policy, TieredMergePolicy, which is able to select more efficient non-contiguous merges, and favors segments with more deletions. NRTCachingDirectory takes load off the IO system by caching small segments in RAM (LUCENE-3092). When you open an NRT reader you can now optionally specify that deletions do not need to be applied, making reopen faster for those cases that can tolerate temporarily seeing deleted documents returned, or have some other means of filtering them out (LUCENE-2900). Segments that are 100% deleted are now dropped instead of inefficiently merged (LUCENE-2010). How fast is NRT search? I created a simple performance test to answer this. I first built a starting index by indexing all of Wikipedia's content (25 GB plain text), broken into 1 KB sized documents. Using this index, the test then reindexes all the documents again, this time at a fixed rate of 1 MB/second plain text. This is a very fast rate compared to the typical NRT application; for example, it's almost twice as fast as Twitter's recent peak during this year's superbowl (4,064 tweets/second), assuming every tweet is 140 bytes, and assuming Twitter indexed all tweets on a single shard. The test uses updateDocument, replacing documents by randomly selected ID, so that Lucene is forced to apply deletes across all segments. In addition, 8 search threads run a fixed TermQuery at the same time. Finally, the NRT reader is reopened once per second. I ran the test on modern hardware, a 24 core machine (dual x5680 Xeon CPUs) with an OCZ Vertex 3 240 GB SSD, using Oracle's 64 bit Java 1.6.0_21 and Linux Fedora 13. I gave Java a 2 GB max heap, and used MMapDirectory. The test ran for 6 hours 25 minutes, since that's how long it takes to re-index all of Wikipedia at a limited rate of 1 MB/sec; here's the resulting QPS and NRT reopen delay (milliseconds) over that time: The search QPS is green and the time to reopen each reader (NRT reopen delay in milliseconds) is blue; the graph is an interactive Dygraph, so if you click through above, you can then zoom in to any interesting region by clicking and dragging. You can also apply smoothing by entering the size of the window into the text box in the bottom left part of the graph. Search QPS dropped substantially with time. While annoying, this is expected, because of how deletions work in Lucene: documents are merely marked as deleted and thus are still visited but then filtered out, during searching. They are only truly deleted when the segments are merged. TermQuery is a worst-case query; harder queries, such as BooleanQuery, should see less slowdown from deleted, but not reclaimed, documents. Since the starting index had no deletions, and then picked up deletions over time, the QPS dropped. It looks like TieredMergePolicy should perhaps be even more aggressive in targeting segments with deletions; however, finally around 5:40 a very large merge (reclaiming many deletions) was kicked off. Once it finished the QPS recovered somewhat. Note that a real NRT application with deletions would see a more stable QPS since the index in "steady state" would always have some number of deletions in it; starting from a fresh index with no deletions is not typical. Reopen delay during merging The reopen delay is mostly around 55-60 milliseconds (mean is 57.0), which is very fast (i.e., only 5.7% "duty cycle" of the every 1.0 second reopen rate). There are random single spikes, which is caused by Java running a full GC cycle. However, large merges can slow down the reopen delay (once around 1:14, again at 3:34, and then the very large merge starting at 5:40). Many small merges (up to a few 100s of MB) were done but don't seem to impact reopen delay. Large merges have been a challenge in Lucene for some time, also causing trouble for ongoing searching. I'm not yet sure why large merges so adversely impact reopen time; there are several possibilities. It could be simple IO contention: a merge keeps the IO system very busy reading and writing many bytes, thus interfering with any IO required during reopen. However, if that were the case, NRTCachingDirectory (used by the test) should have prevented it, but didn't. It's also possible that the OS is [poorly] choosing to evict important process pages, such as the terms index, in favor of IO caching, causing the term lookups required when applying deletes to hit page faults; however, this also shouldn't be happening in my test since I've set Linux's swappiness to 0. Yet another possibility is Linux's write cache becomes temporarily too full, thus stalling all IO in the process until it clears; in this case perhaps tuning some of Linux's pdflush tunables could help, although I'd much rather find a Lucene-only solution so this problem can be fixed without users having to tweak such advanced OS tunables, even swappiness. Fortunately, we have an active Google Summer of Code student, Varun Thacker, working on enabling Directory implementations to pass appropriate flags to the OS when opening files for merging (LUCENE-2793 and LUCENE-2795). From past testing I know that passing O_DIRECT can prevent merges from evicting hot pages, so it's possible this will fix our slow reopen time as well since it bypasses the write cache. Finally, it's always possible other OSs do a better job managing the buffer cache, and wouldn't see such reopen delays during large merges. This issue is still a mystery, as there are many possibilities, but we'll eventually get to the bottom of it. It could be we should simply add our own IO throttling, so we can control net MB/sec read and written by merging activity. This would make a nice addition to Lucene! Except for the slowdown during merging, the performance of NRT is impressive. Most applications will have a required indexing rate far below 1 MB/sec per shard, and for most applications reopening once per second is fast enough. While there are exciting ideas to bring true real-time search to Lucene, by directly searching IndexWriter's RAM buffer as Michael Busch has implemented at Twitter with some cool custom extensions to Lucene, I doubt even the most demanding social apps actually truly need better performance than we see today with NRT. NIOFSDirectory vs MMapDirectory Out of curiosity, I ran the exact same test as above, but this time with NIOFSDirectory instead of MMapDirectory: There are some interesting differences. The search QPS is substantially slower -- starting at 107 QPS vs 151, though part of this could easily be from getting different compilation out of hotspot. For some reason TermQuery, in particular, has high variance from one JVM instance to another. The mean reopen time is slower: 67.7 milliseconds vs 57.0, and the reopen time seems more affected by the number of segments in the index (this is the saw-tooth pattern in the graph, matching when minor merges occur). The takeaway message seems clear: on Linux, use MMapDirectory not NIOFSDirectory! Optimizing your NRT turnaround time My test was just one datapoint, at a fixed fast reopen period (once per second) and at a high indexing rate (1 MB/sec plain text). You should test specifically for your use-case what reopen rate works best. Generally, the more frequently you reopen the faster the turnaround time will be, since fewer changes need to be applied; however, frequent reopening will reduce the maximum indexing rate. Most apps have relatively low required indexing rates compared to what Lucene can handle and can thus pick a reopen rate to suit the application's turnaround time requirements. There are also some simple steps you can take to reduce the turnaround time: Store the index on a fast IO system, ideally a modern SSD. Install a merged segment warmer (see IndexWriter.setMergedSegmentWarmer). This warmer is invoked by IndexWriter to warm up a newly merged segment without blocking the reopen of a new NRT reader. If your application uses Lucene's FieldCache or has its own caches, this is important as otherwise that warming cost will be spent on the first query to hit the new reader. Use only as many indexing threads as needed to achieve your required indexing rate; often 1 thread suffices. The fewer threads used for indexing, the faster the flushing, and the less merging (on trunk). If you are using Lucene's trunk, and your changes include deleting or updating prior documents, then use the Pulsing codec for your id field since this gives faster lookup performance which will make your reopen faster. Use the new NRTCachingDirectory, which buffers small segments in RAM to take load off the IO system (LUCENE-3092). Pass false for applyDeletes when opening an NRT reader, if your application can tolerate seeing deleted doccs from the returned reader. While it's not clear that thread priorities actually work correctly (see this Google Tech Talk), you should still set your thread priorities properly: the thread reopening your readers should be highest; next should be your indexing threads; and finally lowest should be all searching threads. If the machine becomes saturated, ideally only the search threads should take the hit. Happy near-real-time searching!

July 11, 2011

by Michael Mccandless

· 19,019 Views

SEVERE: Error in xpath:java.lang.RuntimeException: solrconfig.xml missing luceneMatchVersion

One of the things that changed from Solr 1.4.1 to 1.5+ was the introduction of a parameter to tell Solr / Lucene which kind of compability version its index files should be created and used in. Solr now refuses to start if you do not provide this setting (if you’re upgrading a previous installation from 1.4.1 or earlier). The fix isn’t really straight forward, and you’ll probably have to recreate your index files if you’re just arriving at the scene with Solr / Lucene 3.2 and 4.0. Solr 3.0 (1.5) might be able to upgrade the files from the 2.9 version, but if you’re jumping from Lucene 2.9 to 4.0, the easiest solution seems to be to delete the current index and reindex (set up replication, disable replication from the master, query the slave while reindexing the master, etc.. and you’ll have no downtime while doing this!). You’ll need to add a parameter to your solrconfig.xml file as well in the section. LUCENE_CURRENT Other valid values are LUCENE_30, LUCENE_31, LUCENE_32 and LUCENE_40. These values represent specific versions of the index structure, while LUCENE_CURRENT will use the version depending on which particular release of Lucene you’re using. The version format will be upgraded automagically between most releases, so you’ll probably be fine by using LUCENE_CURRENT. If you however are trying to load index files that are more than one version older, you may have to use one of the other values.

July 9, 2011

by Mats Lindh

· 10,275 Views

Programmatically Restart a Java Application

Today I'll talk about a famous problem : restarting a Java application. It is especially useful when changing the language of a GUI application, so that we need to restart it to reload the internationalized messages in the new language. Some look and feel also require to relaunch the application to be properly applied. A quick Google search give plenty answers using a simple : Runtime.getRuntime().exec("java -jar myApp.jar"); System.exit(0); This indeed basically works, but this answer that does not convince me for several reasons : 1) What about VM and program arguments ? (this one is secondary in fact, because can be solve it quite easily). 2) What if the main is not a jar (which is usually the case when launching from an IDE) ? 3) Most of all, what about the cleaning of the closing application ? For example if the application save some properties when closing, commit some stuffs etc. 4) We need to change the command line in the code source every time we change a parameter, the name of the jar, etc. Overall, something that works fine for some test, sandbox use, but not a generic and elegant way in my humble opinion. Ok, so my purpose here is to implement a method : public static void restartApplication(Runnable runBeforeRestart) throws IOException { ... } that could be wrapped in some jar library, and could be called, without any code modification, by any Java program, and by solving the 4 points raised previously. Let's start by looking at each point and find a way to answer them in an elegant way (let's say the most elegant way that I found). 1) How to get the program and VM arguments ? Pretty simple, by calling a : ManagementFactory.getRuntimeMXBean().getInputArguments(); Concerning the program arguments, the Java property sun.java.command we'll give us both the main class (or jar) and the program arguments, and both will be useful. String[] mainCommand = System.getProperty("sun.java.command").split(" "); 2) First retrieve the java bin executable given by the java.home property : String java = System.getProperty("java.home") + "/bin/java"; The simple case is when the application is launched from a jar. The jar name is given by a mainCommand[0], and it is in the current path, so we just have to append the application parameters mainCommand[1..n] with a -jar to get the command to execute : String cmd = java + vmArgsOneLine + "-jar " + new File(mainCommand[0]).getPath() + mainCommand[1..n]; We'll suppose here that the Manifest of the jar is well done, and we don't need to specify the main nor the classpath. Second case : when the application is launched from a class. In this case, we'll specify the class path and the main class : String cmd = java + vmArgsOneLine + -cp \"" + System.getProperty("java.class.path") + "\" " + mainCommand[0] + mainCommand[1..n]; 3) Third point, cleaning the old application before launching the new one. To do such a trick, we'll just execute the Runtime.getRuntime().exec(cmd) in a shutdown hook. This way, we'll be sure that everything will be properly clean up before creating the new application instance. Runtime.getRuntime().addShutdownHook(new Thread() { @Override public void run() { ... } }); Run the runBeforeRestart that contains some custom code that we want to be executed before restarting the application : if(runBeforeRestart != null) { runBeforeRestart.run(); } And finally, call the System.exit(0);. And we're done. Here is our generic method : /** * Sun property pointing the main class and its arguments. * Might not be defined on non Hotspot VM implementations. */ public static final String SUN_JAVA_COMMAND = "sun.java.command"; /** * Restart the current Java application * @param runBeforeRestart some custom code to be run before restarting * @throws IOException */ public static void restartApplication(Runnable runBeforeRestart) throws IOException { try { // java binary String java = System.getProperty("java.home") + "/bin/java"; // vm arguments List vmArguments = ManagementFactory.getRuntimeMXBean().getInputArguments(); StringBuffer vmArgsOneLine = new StringBuffer(); for (String arg : vmArguments) { // if it's the agent argument : we ignore it otherwise the // address of the old application and the new one will be in conflict if (!arg.contains("-agentlib")) { vmArgsOneLine.append(arg); vmArgsOneLine.append(" "); } } // init the command to execute, add the vm args final StringBuffer cmd = new StringBuffer("\"" + java + "\" " + vmArgsOneLine); // program main and program arguments String[] mainCommand = System.getProperty(SUN_JAVA_COMMAND).split(" "); // program main is a jar if (mainCommand[0].endsWith(".jar")) { // if it's a jar, add -jar mainJar cmd.append("-jar " + new File(mainCommand[0]).getPath()); } else { // else it's a .class, add the classpath and mainClass cmd.append("-cp \"" + System.getProperty("java.class.path") + "\" " + mainCommand[0]); } // finally add program arguments for (int i = 1; i < mainCommand.length; i++) { cmd.append(" "); cmd.append(mainCommand[i]); } // execute the command in a shutdown hook, to be sure that all the // resources have been disposed before restarting the application Runtime.getRuntime().addShutdownHook(new Thread() { @Override public void run() { try { Runtime.getRuntime().exec(cmd.toString()); } catch (IOException e) { e.printStackTrace(); } } }); // execute some custom code before restarting if (runBeforeRestart!= null) { runBeforeRestart.run(); } // exit System.exit(0); } catch (Exception e) { // something went wrong throw new IOException("Error while trying to restart the application", e); } } From : http://lewisleo.blogspot.jp/2012/08/programmatically-restart-java.html

July 6, 2011

by Leo Lewis

· 135,895 Views · 2 Likes

GET/POST Parameters in Node.js

We look at how to write GET and POST requests into tyour Node.js-based application. It's so easy you won't believe it!

June 28, 2011

by Snippets Manager

· 62,062 Views · 1 Like

How to Tame Java GC Pauses? Surviving 16GiB Heap and Greater

Learn how to survive 16GiB and greater heaps, and control Java GC pauses.

June 28, 2011

by Alexey Ragozin

· 162,207 Views

TechTip: Use of setLenient method on SimpleDateFormat

Sometimes when you are parsing a date string against a pattern(such as MM/dd/yyyy) using java.text.SimpleDateFormat, strange things might happen (for unknown developers) if your date string is dynamic content entered by a user in some input field on the user interface and if it is not entered in the specified format. The parse method in the SimpleDateFormat parses the date string that is in the incorrect format and returns your date object instead of throwing a java.text.ParseException. However, the date returned is not what you expect. The below code-snippet shows you this behaviour. package com.starwood.system.util; import java.text.ParseException; import java.text.SimpleDateFormat; import java.util.Date; public class DateSample { public static void main(String args[]){ SimpleDateFormat sdf = new SimpleDateFormat () ; sdf.applyPattern("MM/dd/yyyy") ; try { Date d = sdf.parse("2011/02/06") ; System.out.println(d) ; } catch (ParseException e) { e.printStackTrace(); } } } Output: Thu Jul 02 00:00:00 MST 173 See the output, that is a date back in the year 173. To avoid this problem, call the setLenient (false) on SimpleDateFormat instance. That will make the parse method throw ParseException when the given input string is not in the specified format. Here is the modified code-snippet. package com.starwood.system.util; import java.text.ParseException; import java.text.SimpleDateFormat; import java.util.Date; public class DateSample { public static void main(String args[]){ SimpleDateFormat sdf = new SimpleDateFormat () ; sdf.applyPattern("MM/dd/yyyy") ; sdf.setLenient(false) ; try { Date d = sdf.parse("2011/02/06") ; System.out.println(d) ; } catch (ParseException e) { System.out.println (e.getMessage()) ; } } } Output: Unparseable date: "2011/02/06" http://accordess.com/wpblog/2011/06/02/techtip-use-of-setlenient-method-on-simpledateformat

June 27, 2011

by Upendra Chintala

· 47,152 Views · 5 Likes

XML unmarshalling benchmark in Java: JAXB vs STax vs Woodstox

towards the end of last week i started thinking how to deal with large amounts of xml data in a resource-friendly way.the main problem that i wanted to solve was how to process large xml files in chunks while at the same time providing upstream/downstream systems with some data to process. of course i've been using jaxb technology for few years now; the main advantage of using jaxb is the quick time-to-market; if one possesses an xml schema, there are tools out there to auto-generate the corresponding java domain model classes automatically (eclipse indigo, maven jaxb plugins in various sauces, ant tasks, to name a few). the jaxb api then offers a marshaller and an unmarshaller to write/read xml data, mapping the java domain model. when thinking of jaxb as solution for my problem i suddendlly realised that jaxb keeps the whole objectification of the xml schema in memory, so the obvious question was: "how would our infrastructure cope with large xml files (e.g. in my case with a number of elements > 100,000) if we were to use jaxb?". i could have simply produced a large xml file, then a client for it and find out about memory consumption. as one probably knows there are mainly two approaches to processing xml data in java: dom and sax. with dom, the xml document is represented into memory as a tree; dom is useful if one needs cherry-pick access to the tree nodes or if one needs to write brief xml documents. on the other side of the spectrum there is sax, an event-driven technology, where the whole document is parsed one xml element at the time, and for each xml significative event, callbacks are "pushed" to a java client which then deals with them (such as start_document, start_element, end_element, etc). since sax does not bring the whole document into memory but it applies a cursor like approach to xml processing it does not consume huge amounts of memory. the drawback with sax is that it processes the whole document start to finish; this might not be necessarily what one wants for large xml documents. in my scenario, for instance, i'd like to be able to pass to downstream systems xml elements as they are available, but at the same time maybe i'd like to pass only 100 elements at the time, implementing some sort of pagination solution. dom seems too demanding from a memory-consumption point of view, whereas sax seems to coarse-grained for my needs. i remembered reading something about stax, a java technology which offered a middle ground between the capability to pull xml elements (as opposed to pushing xml elements, e.g. sax) while being ram-friendly. i then looked into the technology and decided that stax was probably the compromise i was looking for; however i wanted to keep the easy programming model offered by jaxb, so i really needed a combination of the two. while investigating stax, i came across woodstox; this open source project promises to be a faster xml parser than many othrers, so i decided to include it in my benchmark as well. i now had all elements to create a benchmark to give me memory consumption and processing speed metrics when processing large xml documents. the benchmark plan in order to create a benchmark i needed to do the following: create an xml schema which defined my domain model. this would be the input for jaxb to create the java domain model create three large xml files representing the model, with 10,000 / 100,000 / 1,000,000 elements respectively have a pure jaxb client which would unmarshall the large xml files completely in memory have a stax/jaxb client which would combine the low-memory consumption of sax technologies with the ease of programming model offered by jaxb have a woodstox/jaxb client with the same characteristics of the stax/jaxb client (in few words i just wanted to change the underlying parser and see if i could obtain any performance boost) record both memory consumption and speed of processing (e.g. how quickly would each solution make xml chunks available in memory as jaxb domain model classes) make the results available graphically, since, as we know, one picture tells one thousands words. the domain model xml schema i decided for a relatively easy domain model, with xml elements representing people, with their names and addresses. i also wanted to record whether a person was active. using jaxb to create the java model i am a fan of maven and use it as my default tool to build systems. this is the pom i defined for this little benchmark: 4.0.0 uk.co.jemos.tests.xml large-xml-parser 1.0.0-snapshot jar large-xml-parser http://www.jemos.co.uk utf-8 org.apache.maven.plugins maven-compiler-plugin 2.3.2 1.6 1.6 org.jvnet.jaxb2.maven2 maven-jaxb2-plugin 0.7.5 generate ${basedir}/src/main/resources **/*.xsd true -enableintrospection -xtostring -xequals -xhashcode true true org.jvnet.jaxb2_commons jaxb2-basics 0.6.1 org.apache.maven.plugins maven-jar-plugin 2.3.1 true uk.co.jemos.tests.xml.xmlpullbenchmarker org.apache.maven.plugins maven-assembly-plugin 2.2 ${project.build.directory}/site/downloads src/main/assembly/project.xml src/main/assembly/bin.xml junit junit 4.5 test uk.co.jemos.podam podam 2.3.11.release commons-io commons-io 2.0.1 com.sun.xml.bind jaxb-impl 2.1.3 org.jvnet.jaxb2_commons jaxb2-basics-runtime 0.6.0 org.codehaus.woodstox stax2-api 3.0.3 just few things to notice about this pom.xml. i use java 6, since starting from version 6, java contains all the xml libraries for jaxb, dom, sax and stax. to auto-generate the domain model classes from the xsd schema, i used the excellent maven-jaxb2-plugin, which allows, amongst other things, to obtain pojos with tostring, equals and hashcode support. i have also declared the jar plugin, to create an executable jar for the benchmark and the assembly plugin to distribute an executable version of the benchmark. the code for the benchmark is attached to this post, so if you want to build it and run it yourself, just unzip the project file, open a command line and run: $ mvn clean install assembly:assembly this command will place *-bin.* files into the folder target/site/downloads. unzip the one of your preference and to run the benchmark use (-dcreate.xml=true will generate the xml files. don't pass it if you have these files already, e.g. after the first run): $ java -jar -dcreate.xml=true large-xml-parser-1.0.0-snapshot.jar creating the test data to create the test data, i used podam , a java tool to auto-fill pojos and javabeans with data. the code is as simple as: jaxbcontext context = jaxbcontext .newinstance("xml.integration.jemos.co.uk.large_file"); marshaller marshaller = context.createmarshaller(); marshaller.setproperty(marshaller.jaxb_formatted_output, boolean.true); marshaller.setproperty(marshaller.jaxb_encoding, "utf-8"); personstype personstype = new objectfactory().createpersonstype(); list persons = personstype.getperson(); podamfactory factory = new podamfactoryimpl(); for (int i = 0; i < nbrelements; i++) { persons.add(factory.manufacturepojo(persontype.class)); } jaxbelement towrite = new objectfactory() .createpersons(personstype); file file = new file(filename); bufferedoutputstream bos = new bufferedoutputstream( new fileoutputstream(file), 4096); try { marshaller.marshal(towrite, bos); bos.flush(); } finally { ioutils.closequietly(bos); } the xmlpullbenchmarker generates three large xml files under ~/xml-benchmark: large-person-10000.xml (approx 3m) large-person-100000.xml (approx 30m) large-person-1000000.xml (approx 300m) each file looks like the following: ult6yn0d7l u8djoutlk2 dxwlpow6x3 o4ggvximo7 io7kuz0xmz lmiy1uqkxs zhtukbtwti gbc7kex9tn kxmwnlprep 9bibs1m5gr hmtqpxjcpw bhpf1rrldm ydjjillyrw xgstdjcfjc [..etc] each file contains 10,000 / 100,000 / 1,000,000 elements. the running environments i tried the benchmarker on three different environments: ubuntu 10, 64-bit running as virtual machine on a windows 7 ultimate, with cpu i5, 750 @2.67ghz and 2.66ghz, 8gb ram of which 4gb dedicated to the vm. jvm: 1.6.0_25, hotspot windows 7 ultimate , hosting the above vm, therefore with same processor. jvm, 1.6.0_24, hotspot ubuntu 10, 32-bit , 3gb ram, dual core. jvm, 1.6.0_24, openjdk the xml unmarshalling to unmarshall the code i used three different strategies: pure jaxb stax + jaxb woodstox + jaxb pure jaxb unmarshalling the code which i used to unmarshall the large xml files using jaxb follows: private void readlargefilewithjaxb(file file, int nbrrecords) throws exception { jaxbcontext ucontext = jaxbcontext .newinstance("xml.integration.jemos.co.uk.large_file"); unmarshaller unmarshaller = ucontext.createunmarshaller(); bufferedinputstream bis = new bufferedinputstream(new fileinputstream( file)); long start = system.currenttimemillis(); long memstart = runtime.getruntime().freememory(); long memend = 0l; try { jaxbelement root = (jaxbelement) unmarshaller .unmarshal(bis); root.getvalue().getperson().size(); memend = runtime.getruntime().freememory(); long end = system.currenttimemillis(); log.info("jaxb (" + nbrrecords + "): - total memory used: " + (memstart - memend)); log.info("jaxb (" + nbrrecords + "): time taken in ms: " + (end - start)); } finally { ioutils.closequietly(bis); } } the code uses a one-liner to unmarshall each xml file: jaxbelement root = (jaxbelement) unmarshaller .unmarshal(bis); i also accessed the size of the underlying persontype collection to "touch" in memory data. btw, debugging the application showed that all 10,000 elements were indeed available in memory after this line of code. jaxb + stax with stax, i just had to use an xmlstreamreader, iterate through all elements, and pass each in turn to jaxb to unmarshall it into a persontype domain model object. the code follows: // set up a stax reader xmlinputfactory xmlif = xmlinputfactory.newinstance(); xmlstreamreader xmlr = xmlif .createxmlstreamreader(new filereader(file)); jaxbcontext ucontext = jaxbcontext.newinstance(persontype.class); unmarshaller unmarshaller = ucontext.createunmarshaller(); long start = system.currenttimemillis(); long memstart = runtime.getruntime().freememory(); long memend = 0l; try { xmlr.nexttag(); xmlr.require(xmlstreamconstants.start_element, null, "persons"); xmlr.nexttag(); while (xmlr.geteventtype() == xmlstreamconstants.start_element) { jaxbelement pt = unmarshaller.unmarshal(xmlr, persontype.class); if (xmlr.geteventtype() == xmlstreamconstants.characters) { xmlr.next(); } } memend = runtime.getruntime().freememory(); long end = system.currenttimemillis(); log.info("stax - (" + nbrrecords + "): - total memory used: " + (memstart - memend)); log.info("stax - (" + nbrrecords + "): time taken in ms: " + (end - start)); } finally { xmlr.close(); } } note that this time when creating the context, i had to specify that it was for the persontype object, and when invoking the jaxb unmarshalling i had to pass also the desired returned class type, with: jaxbelement pt = unmarshaller.unmarshal(xmlr, persontype.class); note that i don't to anything with the object, just create it, to keep the benchmark as truthful and possible by not introducing any unnecessary steps. jaxb + woodstox with woodstox, the approach is very similar to the one used with stax. in fact woodstox provides a stax2 compatible api, so all i had to do was to provide the correct factory and...bang! i had woodstox under the cover working. private void readlargexmlwithfasterstax(file file, int nbrrecords) throws factoryconfigurationerror, xmlstreamexception, filenotfoundexception, jaxbexception { // set up a woodstox reader xmlinputfactory xmlif = xmlinputfactory2.newinstance(); xmlstreamreader xmlr = xmlif .createxmlstreamreader(new filereader(file)); jaxbcontext ucontext = jaxbcontext.newinstance(persontype.class); unmarshaller unmarshaller = ucontext.createunmarshaller(); long start = system.currenttimemillis(); long memstart = runtime.getruntime().freememory(); long memend = 0l; try { xmlr.nexttag(); xmlr.require(xmlstreamconstants.start_element, null, "persons"); xmlr.nexttag(); while (xmlr.geteventtype() == xmlstreamconstants.start_element) { jaxbelement pt = unmarshaller.unmarshal(xmlr, persontype.class); if (xmlr.geteventtype() == xmlstreamconstants.characters) { xmlr.next(); } } memend = runtime.getruntime().freememory(); long end = system.currenttimemillis(); log.info("woodstox - (" + nbrrecords + "): total memory used: " + (memstart - memend)); log.info("woodstox - (" + nbrrecords + "): time taken in ms: " + (end - start)); } finally { xmlr.close(); } } note the following line: xmlinputfactory xmlif = xmlinputfactory2.newinstance(); where i pass in a stax2 xmlinputfactory. this uses the woodstox implementation. the main loop once the files are in place (you obtain this by passing -dcreate.xml=true), the main performs the following: system.gc(); system.gc(); for (int i = 0; i < 10; i++) { main.readlargefilewithjaxb(new file(output_folder + file.separatorchar + "large-person-10000.xml"), 10000); main.readlargefilewithjaxb(new file(output_folder + file.separatorchar + "large-person-100000.xml"), 100000); main.readlargefilewithjaxb(new file(output_folder + file.separatorchar + "large-person-1000000.xml"), 1000000); main.readlargexmlwithstax(new file(output_folder + file.separatorchar + "large-person-10000.xml"), 10000); main.readlargexmlwithstax(new file(output_folder + file.separatorchar + "large-person-100000.xml"), 100000); main.readlargexmlwithstax(new file(output_folder + file.separatorchar + "large-person-1000000.xml"), 1000000); main.readlargexmlwithfasterstax(new file(output_folder + file.separatorchar + "large-person-10000.xml"), 10000); main.readlargexmlwithfasterstax(new file(output_folder + file.separatorchar + "large-person-100000.xml"), 100000); main.readlargexmlwithfasterstax(new file(output_folder + file.separatorchar + "large-person-1000000.xml"), 1000000); } it invites the gc to run, although as we know this is at the gc thread discretion. it then executes each strategy 10 times, to normalise ram and cpu consumption. the final data are then collected by running an average on the ten runs. the benchmark results for memory consumption here follow some diagrams which show memory consumption across the different running environments, when unmarshalling 10,000 / 100,000 / 1,000,000 files. you will probably notice that memory consumption for stax-related strategies often shows a negative value. this means that there was more free memory after unmarshalling all elements than there was at the beginning of the unmarshalling loop; this, in turn, suggests that the gc ran a lot more with stax than with jaxb. this is logical if one thinks about it; since with stax we don't keep all objects into memory there are more objects available for garbage collection. in this particular case i believe the persontype object created in the while loop gets eligible for gc and enters the young generation area and then it gets reclamed by the gc. this, however, should have a minimum impact on performance, since we know that claiming objects from the young generation space is done very efficiently. summary for 10,000 xml elements summary for 100,000 xml elements summary for 1,000,000 xml elements the benchmark results for processing speed results for 10,000 elements results for 100,000 elements results for 1,000,000 elements conclusions the results on all three different environments, although with some differences, all tell us the same story: if you are looking for performance (e.g. xml unmarshalling speed), choose jaxb if you are looking for low-memory usage (and are ready to sacrifice some performance speed), then use stax. my personal opinion is also that i wouldn't go for woodstox, but i'd choose either jaxb (if i needed processing power and could afford the ram) or stax (if i didn't need top speed and was low on infrastructure resources). both these technologies are java standards and part of the jdk starting from java 6. resources benchmarker source code zip version: download large-xml-parser-1.0.0-snapshot-project tar.gz version: download large-xml-parser-1.0.0-snapshot-project.tar tar.bz2 version: download large-xml-parser-1.0.0-snapshot-project.tar benchmarker executables: zip version: download large-xml-parser-1.0.0-snapshot-bin tar.gz version: download large-xml-parser-1.0.0-snapshot-bin.tar tar.bz2 version: download large-xml-parser-1.0.0-snapshot-bin.tar data files: ubuntu 64-bit vm running environment: download stax-vs-jaxb-ubuntu-64-vm ubuntu 32-bit running environment : download stax-vs-jaxb-ubuntu-32-bit windows 7 ultimate running environment : download stax-vs-jaxb-windows7 from http://tedone.typepad.com/blog/2011/06/unmarshalling-benchmark-in-java-jaxb-vs-stax-vs-woodstox.html

June 27, 2011

by Marco Tedone

· 71,380 Views · 9 Likes

Java EE6 CDI, Named Components and Qualifiers

One of the biggest promises java EE6 made, was to ease the use of dependency injection. They did, using CDI. CDI, which stands for Contexts and Dependency Injection for Java EE, offers a base set to apply dependency injection in your enterprise application. Before CDI, EJB 3 also introduced dependency injection, but this was a bit basic. You could inject an EJB (statefull or stateless) into another EJB or Servlet (if you container supported this). Offcourse not every application needs EJB’s, that is why CDI is gaining so much popularity. To start, I have made this example. There is a Payment interface, and 2 implementations. A cash payment and a visa payment. I want to be able to choose witch type of payment I inject, still using the same interface. public interface Payment { void pay(BigDecimal amount); } and the 2 implementations public class CashPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(CashPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} cash", amount.toString()); } } public class VisaPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(VisaPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} with visa", amount.toString()); } } To inject the interface we use the @Inject annotation. The annotation does basically what it says. It injects a component, that is available in your application. 1 @Inject private Payment payment; Off course, you saw this coming from a mile away, this won’t work. The container has 2 implementations of our Payment interface, so he does not know which one to inject. Unsatisfied dependencies for type [Payment] with qualifiers [@Default] at injection point [[field] @Inject private be.styledideas.blog.qualifier.web.PaymentBackingAction.payment] So we need some sort of qualifier to point out what implementation we want. CDI offers the @Named Annotation, allowing you to give a name to an implementation. @Named("cash") public class CashPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(CashPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} cash", amount.toString()); } } @Named("visa") public class VisaPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(VisaPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} with visa", amount.toString()); } } When we now change our injection code, we can specify wich implementation we need. @Inject private @Named("visa") Payment payment; This works, but the flexibility is limited. When we want to rename our @Named parameter, we have to change it on everyplace where it is used. There is also no refactoring support. There is a beter alternative using Custom made annotations using the @Qualifier annotation. Let us change the code a little bit. First of all, we create new Annotation types. @java.lang.annotation.Documented @java.lang.annotation.Retention(RetentionPolicy.RUNTIME) @javax.inject.Qualifier public @interface CashPayment {} @java.lang.annotation.Documented @java.lang.annotation.Retention(RetentionPolicy.RUNTIME) @javax.inject.Qualifier public @interface VisaPayment {} The @Qualifier annotation that is added to the annotation, makes this annotation discoverable by the container. We can now simply add these annotations to our implementations. @CashPayment public class CashPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(CashPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} cash", amount.toString()); } } @VisaPayment public class VisaPaymentImpl implements Payment { private static final Logger LOGGER = Logger.getLogger(VisaPaymentImpl.class.toString()); @Override public void pay(BigDecimal amount) { LOGGER.log(Level.INFO, "payed {0} with visa", amount.toString()); } } The only thing we now need to do, is change our injection code to @Inject private @VisaPayment Payment payment; When we now change something to our qualifier, we have nice compiler and refactoring support. This also adds extra flexibilty for API or Domain-specific language design. From http://styledideas.be/blog/2011/06/16/java-ee6-cdi-named-components-and-qualifiers/

June 24, 2011

by Jelle Victoor

· 72,933 Views · 5 Likes

Java Web Application Security - Part V: Penetrating with Zed Attack Proxy

web application security is an important part of developing applications. as developers, i think we often forget this, or simply ignore it. in my career, i've learned a lot about web application security. however, i only recently learned and became familiar with the rapidly growing "appsec" industry. i found a disconnect between what appsec consultants were selling and what i was developing. it seemed like appsec consultants were selling me fear, mostly because i thought my apps were secure. so i set out on a mission to learn more about web application security and penetration testing to see if my apps really were secure. this article is part of that mission, as are the previous articles i've written in this series. java web application security - part i: java ee 6 login demo java web application security - part ii: spring security login demo java web application security - part iii: apache shiro login demo java web application security - part iv: programmatic login apis when i first decided i wanted to do a talk on webapp security, i knew it would be more interesting if i showed the audience how to hack and fix an application. that's why i wrote it into my original proposal : webapp security: develop. penetrate. protect. relax. in this session, you'll learn how to implement authentication in your java web applications using spring security, apache shiro and good ol' java ee container managed authentication. you'll also learn how to secure your rest api with oauth and lock it down with ssl. after learning how to develop authentication, i'll introduce you to owasp, the owasp top 10, its testing guide and its code review guide. from there, i'll discuss using webgoat to verify your app is secure and commercial tools like webapp firewalls and accelerators. at the time, i hadn't done much webapp pentesting . you can tell this from the fact that i mentioned webgoat as the pentesting tool. from webgoat's project page : webgoat is a deliberately insecure j2ee web application maintained by owasp designed to teach web application security lessons. in each lesson, users must demonstrate their understanding of a security issue by exploiting a real vulnerability in the webgoat application. for example, in one of the lessons the user must use sql injection to steal fake credit card numbers. the application is a realistic teaching environment, providing users with hints and code to further explain the lesson. what i really meant to say and use was zed attack proxy , also known as owasp zap. zap is a java desktop application that you setup as a proxy for your browser, then use to find vulnerabilities in your application. this article explains how you can use zap to pentest a web applications and fix its vulnerabilities. the application i'll be using in this article is the ajax login application i've been using throughout this series. i think it's great that projects like damn vulnerable web app and webgoat exist, but i wanted to test one that i think is secure, rather than one i know is not secure. in this particular example, i'll be testing the spring security implementation, since that's the framework i most often use in my open source projects. zed attack proxy tutorial download and run the application install and configure zap perform a scan fix vulnerabilities summary download and run the application to begin, download the application and expand it on your hard drive. this app is the completed version of the ajax login application referenced in java web application security - part ii: spring security login demo . you'll need java 6 and maven installed to run the app. run it using mvn jetty:run and open http://localhost:8080 in your browser. you'll see it's a simple crud application for users and you need to login to do anything. install and configure zap the zed attack proxy (zap) is an easy to use integrated penetration testing tool for finding vulnerabilities in web applications. download the latest version (i used 1.3.0) and install it on your system. after installing, launch the app and change the proxy port to 9000 (tools > options > local proxy). next, configure your browser to proxy requests through port 9000 and allow localhost requests to be proxied. i used firefox 4 (preferences > advanced > network > connection settings). when finished, your proxy settings should look like the following screenshot: another option (instead of removing localhost) is to add an entry to your hosts file with your production domain name. this is what i've done for this demo. 127.0.0.1 demo.raibledesigns.com i've also configured apache to proxy requests to jetty with the following mod_proxy settings in my httpd.conf: proxyrequests off proxypreservehost off proxypass / http://localhost:8080/ sslengine on sslproxyengine on sslcertificatefile "/etc/apache2/ssl.key/server.crt" sslcertificatekeyfile "/etc/apache2/ssl.key/server.key" proxypass / https://localhost:8443/ perform a scan now you need to give zap some data to work with. using firefox, i navigated to http://demo.raibledesigns.com and browsed around a bit, listing users, added a new one and deleted an existing one. after doing this, i noticed a number of flags in the zap ui under sites. i then right-clicked on each site (one for http and one for https) and selected attack > active scan site. you should be able to do this from the "active scan" tab at the bottom of zap, but there's a bug when the urls are the same . after doing this, i received a number of alerts, ranging from high (cross-site scripting) to low (password autocomplete). the screenshot below shows the various issues. now let's take a look at how to fix them. fix vulnerabilities one of the things not mentioned by the scan, but #1 in seven security (mis)configurations in java web.xml files , is custom error pages not configured. custom error pages are configured in this app, but error.jsp contains the following code: please check your log files for further information. stack traces can be really useful to an attacker, so it's important to start by removing the above code from src/main/webapp/error.jsp . the rest of the issues have to do with xss, autocomplete, and cookies. let's start with the easy ones. fixing autocomplete is easy enough; simply changed the html in login.jsp and userform.jsp to have autocomplete="off" as part of the tag. then modify web.xml so http-only and secure cookies are used. while you're at it, add session-timeout and tracking-mode as recommended by the aforementioned web.xml misconfigurations article. 15 true true cookie next, modify spring security's remember me configuration so it uses secure cookies. to do this, add use-secure-cookies="true" to the element in security.xml . unfortunately, spring security doesn't support httponly cookies , but will in a future release. the next issue to solve is disabling directory browsing. you can do this by copying jetty's webdefault.xml (from the org.eclipse.jetty:jetty-webapp jar) into src/test/resources and changing its "dirallowed" to false: default org.mortbay.jetty.servlet.defaultservlet acceptranges true dirallowed false you'll also need to modify the plugin's configuration to point to this file by adding it to the section in pom.xml. / src/test/resources/webdefault.xml of course, if you're running in production you'll want to configure this in your server's settings rather than in your pom.xml file. next, i set out to fix secure page browser cache issues . i had the following settings in my sitemesh decorator: however, according to zap, the first meta tag should have "no-cache" instead of "no-store", so i changed it to "no-cache". after making all these changes, i created a new zap session and ran an active scan on both sites again. below are the results: i believe the first issue (parameter tampering) is because i show the error page when a duplicate user exists. to fix this, i changed userformcontroller so it catches a userexistsexception and sends the user back to the form. try { usermanager.saveuser(user); } catch (userexistsexception uex) { result.adderror(new objecterror("user", uex.getmessage())); return "userform"; } however, this still doesn't seem to cause the alert to go away. this is likely because i'm not filtering/escaping html when it's first submitted. i believe the best solution for this would be to use something like owasp's esapi to filter parameter values. however, i was unable to find integration with spring mvc's data binding, so i decided not to try and fix this vulnerability. finally, i tried to disable jsessionid in urls using suggestions from stack overflow . the previous setting in web.xml (cookie) should do this, but it doesn't seem to work with jetty 8. the other issues (secure page browser cache, httponly cookies and secure cookies), i was unable to solve. the last two are issues caused by spring security as far as i can tell. summary in this article, i've shown you how to pentest a web application using firefox and owasp's zed attack proxy (zap). i found zap to be a nice tool for figuring out vulnerabilities, but it'd be nice if it had a "retest" feature to see if you fixed an issue for a particular url. it does have a "resend" feature, but running it didn't seem to clear alerts after i'd fixed them. the issues i wasn't able to solve seemed to be mostly related to frameworks (e.g. spring security and httponly cookies) or servers (jetty not using cookies for tracking). my suspicion is the jetty issues are because it doesn't support servlet 3 as well as it advertises. i believe this is fair; i am using a milestone release after all. i tried scanning http://demo.raibledesigns.com/ajax-login (which runs on tomcat 7 at contegix ) and confirmed that no jsessionid exists. hopefully this article has helped you understand how to figure out security vulnerabilities in your web applications. i believe zap will continue to get more popular as developers become aware of it. if you feel ambitious and want to try and solve all of the issues in my ajax login application, feel free to fork it on github . if you're interested in talking more about webapp security, please leave a comment, meet me at jazoon later this week or let's talk in july at über conf . from http://raibledesigns.com/rd/entry/java_web_application_security_part4

June 22, 2011

by Matt Raible

· 27,750 Views · 2 Likes

Java EE6 Events, a lightweight alternative to JMS

A few weeks ago I attended a bejug meeting about Java EE 6, Building next generation enterprise applications. Having read much about it, I did not expect to see much shocking hidden features. But there was one part of the demo I really found impressive. Due to its loose coupling, Enterprise possibilities and simplicity. The feature I’m going to talk about today is the event mechanism that is in java EE 6. The general idea is to fire an event and let an eventlistener pick it up. I have created this example that is totally useless, but it simplicity helps me to focus on the important stuff. I’m going to fire a LogEvent from my backing action, that will log to the java.util.Logger. The first thing I need is to create a pojo that contains my log message and my LogLevel. public class LogMessage implements Serializable { private final String message; private final Level level; LogMessage(String message, Level level) { this.message = message; this.level = level; } public String getMessage() { return message; } public Level getLevel() { return level; } } easy peasy. Now that I have my data wrapper, I need something to fire the event and something to pick it up. The first thing I create is my method where I fire the event. Due to CDI I can inject an event. @Inject Event event; So we just need to fire it. event.fire(new LogMessage("Log it baby!", Level.INFO)); Now the event is fired, If no one is registerd to pick it up, it disappears into oblivion, thus we create a listener. The listeners needs a method that has one parameter, the generic type that is given to the previous event. LogMessage. public class LogListener { private static final Logger LOGGER = Logger.getAnonymousLogger(); public void process(@Observes LogMessage message){ LOGGER.log(message.getLevel(), message.getMessage()); } } The @Observes annotation listens to all events with a LogMessage. When the event is fired, this method will be triggered. This is a very nice way to create a loosely coupled application, you can separate heavy operations or encapsulate less essential operations in these event listeners. All of this all happens synchronously. When we want to replace the log statement with a slow database call to a logging table, we could make our operation heavier than it should be. What I’m looking for is to create an asynchronous call. As long as we support EJB, we can transform our Listener to an EJB by adding the @Stateless annotation on top of it. Now it’s a statless enterprise bean. This changes nothing to our sync/async problem, but EJB 3.1 support async operations. So if we also add the @Asynchronous annotation on top of it. It will asynchronously execute our logging statement. @Stateless @Asynchronous public class LogListener { private static final Logger LOGGER = Logger.getAnonymousLogger(); public void process(@Observes LogMessage message){ LOGGER.log(message.getLevel(), message.getMessage()); } } If we would want to combine the database logging and the console logging, we can just create multiple methods that listen to the same event. This is a great way to create a lightweight application with a very flexible components. The alternative solution to this problem is to use JMS, but you don’t want a heavyweight configuration for this kind of loosely coupling. Java EE has worked hard to get rid of the stigma of being heavyweight, I think they are getting there From http://styledideas.be/blog/2011/05/22/java-ee6-events-a-lightweight-alternative-to-jms/

June 22, 2011

by Jelle Victoor

· 20,699 Views · 2 Likes

Developing Android Apps with NetBeans, Maven, and VirtualBox

I am an experienced Java developer who has used various IDEs and prefer NetBeans IDE over all others by a long shot. I am also very fond of Maven as the tool to simplify and automate nearly every aspect of the development of my Java project throughout its lifecycle. Recently, I started developing Android applications and naturally I looked for a Maven plugin that would manage my Android projects. Luckily I found the maven-android-plugin which worked like a charm and allowed me to use Maven for developing my Android projects. The Android Emulator from the Android SDK seemed unusably slow. Lucklily, I found a way to use an Android Virtual Machine for VirtualBox that worked nearly as fast as my native computer! This page documents my experiences. Tested Environment Dev machine: Ubuntu 11.04 Linux IDE: NetBeans VirtualBox: 4.0.8 r71778 Android SDK Revision 11, Add on XML Schema #1, Repository XML Schema #3 (from About in SDK and AVD Manager) Android Version: 2.2 Overview of Steps Download and install the Android SDK on your dev machine Attach an Android Device to dev machine Configure and load your device for development and other use Create an initial Android maven project Connect Android Device to Android SDK Debug Android app using NetBeans Graphical Debuger Download and Install Android SDK Download and install the Android SDK on your dev machine as described here. Make sure to set the following in dev machine ~/.bashrc file: export ANDROID_HOME=$HOME/android-sdk-linux_x86 #Change as needed export PATH="$ANDROID_HOME/tools:$ANDROID_HOME/platform-tools:$PATH" Attaching an Android Device to Dev Machine If you have an actual device that is usually always best. If not, you must use a virtual Android device which usually has various limitations (e.g. no GPS, Camera etc.). The Android SDK makes it easy to create a new Virtual Device but the resulting device is painfully slow in my experience and not usable. Do not bother with this. Instead, create a virtual Android device using VirtualBox as described in the following steps: Install virtual box and initial Android VM as described here: http://androidspin.com/2011/01/24/howto-install-android-x86-2-2-in-virtualbox/ http://geeknizer.com/how-to-run-google-android-in-virtualbox-vmware-on-netbooks/ Configure Android VM so it is connected bidirectionally with your dev machine over TCP as described here: http://stackoverflow.com/questions/61156/virtualbox-host-guest-network-setup I used the approach of configuring a HOST ONLY network adapater and a second NAT adapter on the Android VM within virtual box. Configuring your Android Device This section describes various things I did to setup a dev environment for my Android device: Root the device. I used Universal AndRoot Install ConnectBot so you have ssh and related network utilities Creating Initial Android Maven Application Create initial project using instructions here. I found it best to create stub project structure using the maven-archtype-plugin and the archtypes at https://github.com/akquinet/android-archetypes/wiki Connecting Android VM Device to Android SDK In order for your code to be deployed from NetBeans IDE to Android Device and in order for you to monitor your deployed app from the Dalvik Debug Monitor (ddms) you need to connect your android VM device to the android sdk over TCP as described in the following steps. On Android Device open the Terminal Emulator Type su to become root (your device must be rooted for this Type following commands in root shell: setprop service.adb.tcp.port 5555 stop adbd start adbd Type the following commands on dev machine shell. TODO: Note that IP address below is whatever is the ip address associated with the device (see ifconfig on linux for device vboxnet0) adb tcpip 5555 adb connect 192.168.0.101:5555 For details on above steps see: http://stackoverflow.com/questions/2604727/how-can-i-connect-to-android-with-adb-over-tcp Set up port forwarding as described here http://redkrieg.com/2010/10/11/adb-over-ssh-fun-with-port-forwards/ (this is where I am most fuzzy) Build your maven android project using Right-Click / Clean and Build Now for the acid test whether you can deploy your app to the device from NetBeans IDE! Right-click / Custom / Goal to show Run Maven dialog. Enter android:deploy in Goals field. Select Remember As button and enter android:deploy for its text field. If all is well, the app will deploy to the device and will show up in its "Applications" screen. Debugging Android App Using NetBeans Graphical Debugger Once you can build and deploy your app to the real or virtual Android device, here are the steps to debug the app using NetBeans debugger: On Device: Start the app (TODO: determine how to start app on device with JVM options so it can wait for debugger connection. This should be easy) On Dev Machine run Dalvik Debug Monitor (ddms) in background: $ANDROID_HOME/tools/ddms & Lookup your app in ddms and get its debug port. This is described here but does not address NetBeans specifically In NetBeans do: Debug / Attach Debugger and specify the port looked up in ddms in previous step. You may leave rest of the fields with defaults. Click OK

June 18, 2011

by Farrukh Najmi

· 173,470 Views

Method Size Limit in Java

Most people don’t know this, but you cannot have methods of unlimited size in Java. Java has a 64k limit on the size of methods. What happens if I run into this limit? If you run into this limit, the Java compiler will complain with a message which says something like "Code too large to compile". You can also run into this limit at runtime if you had an already large method, just below the 64k limit and some tool or library does bytecode instrumentation on that method, adding to the size of the method and thus making it go beyond the 64k limit. In this case you will get a java.lang.VerifyError at runtime. This is an issue we ran into with the Chronon recorder where most large programs would have atleast a few large methods, and adding instrumentation to them would cause them to blow past the 64k limit, thus causing a runtime error in the program. Before we look into how we went about solving this problem for Chronon, lets look at under what circumstances people write such large methods in the first place. Where do these large methods come from? · Code generators As it turns out, most humans don’t infact write such gigantic methods. We found that most of these large methods were the result of some code generators, eg the ANTLR parser generator generates some very large methods. · Initialization Methods Initialization methods, especially gui initialization methods, where all the layout and attaching listeners, etc to every component in some in one large chunk of code is a common practise and results in a single large method. · Array initializers If you have a large array initialized in your code, eg: static final byte largeArray[] = {10, 20, 30, 40, 50, 60, 70, 80, …}; that is translated by the compiler into a method which uses load/store instructions to initialize the array. Thus an array too large can cause this error too, which may seem very mysterious to those who don’t know about this limit. · Long jsp pages Since most JSP compilers put all the jsp code in one method, large jsp pages can make you run into these errors too. Of course, these are only a few common cases, there can be a lot of other reasons why your method size is too large. How do we get around this issue? If you get this error at compile time, it is usually trivial to split your code into multiple methods. It may be a bit hairy when the method limit is reached due to some automated code generation like ANTLR or JSPs, but usually even these tools have provisions to allow you to split the code into chunks, eg : jsp:include in the case of JSPs. Where things get hairy is the second case I talked about earlier, which is when bytecode instrumentation causes the size of your methods to go beyond the 64k limit, which results in a runtime error. Of course you can still look at the method which is causing the issue, and go back and split it. However, this may not be possible if the method is inside a third party library. Thus, for the Chronon recorder at least, the way we fixed it was to instrument the method, and then check the method's size after instrumentation. If the size is above the 64k limit, we go back and 'deinstrument' the method, thus essentially excluding it from recording. Since both our Recorder and Time Travelling Debugger are already built from the groud up to deal with excluded code, it wasn’t an issue while recording or debugging the rest of the code. That said, the method size limit of 64k is too small and not needed in a world of 64 bit machines. I would urge everyone reading this to go vote on this JVM bug so that this issue can be resolved in some future version of the JVM. From http://eblog.chrononsystems.com/method-size-limit-in-java

June 14, 2011

by Prashant Deva

· 22,323 Views

qcadoo MES - new Java-based Open Source manufacturing management software

Comment: Why did we do it ? For many years we stumbled upon many companies from different industries and with blushes on our cheeks, we saw that many new business software projects were solving the same problems over and over again. Every one is racing to bring new ERP systems, "innovative" CRMs and e-commerce suits. But most companies have already deployed these type of systems and optimized their sales process, finances and logistics. Little is however done with software for the manufacturing industry especially in the SME sector. Flying paper notes and Excel files - that is still the landscape in that domain. So we decided to make manufacturing management more systematic and started to work on qcadoo MES. We managed to acquire financing from the UE in the polish POIG 8.1 program and from a couple of private investors. We also build a team with people that have at least 10 year experience in developing IT solutions like ERP and BI systems for large international companies. What is qcadoo MES qcadoo MES is a manufacturing management system for small and medium companies. It combines many functions you can find in ERP, MRP and MES systems. We don't want to replace systems like SAP, MS Dynamics or Wonderware. We do not target the same groups they do. qcadoo MES lets you replace those paper notes and spreadsheets that are flying around the production line and gain benefit from having them all in one system. From the beginning of our project we had a couple of ambitious goals. First of all we wanted our software to be easy to use. From the first day we work closely with a company that specializes in usability. Thanks to this we have a clear and intuitive interface approved by eye-tracking tests. qcadoo MES should also eliminate the biggest issue with other systems of this type - starting with big and wide deployment which mess with your daily business tasks and give no guarantee of success. We did this by encapsulating every functionality - big or small - in separate modules. Modules can be turned off and on and extend functionality of other modules. Our platform takes care of everything - keeps the systems integrity and module dependencies intact. How does the user benefit from this ? He can start his adventure with qcadoo MES by deploying a simple functionality - a single module - and in time add new ones to solve just the task he needs. Additionally you don't just get access to modules from Qcadoo but mainly from our partners and from the Open Source community. Encouraged by successes of such companies like OpenERP, Open Bravo and SugarCRM we decided to also free our product on an Open Source license. We hope this will encourage developers and engineers from the manufacturing industry to collaborate with us on qcadoo MES. What makes us special You can find some manufacturing managements systems on the web. But what you wont find in them is: fully modular system - you can start from a simple functionality - one module - an add more in time. Just as your companies needs evolve. You can buy modules in the qcooStore - something similar to Apple AppStore or Android Market - but with solutions for the manufacturing industry created from day one with usability in mind - this speeds up deployments and makes user training go swiftly expandable to infinity - modules are the fundamental concept of this system and you can add new features easily without any technical knowledge. Thanks to this you can adjust our system easily to your companies specific needs. open source - our source code is available on the AGPL license which means that every one can use the software for commercial deployments (with having the terms of the AGPL in mind). Users that need more professional support and commercial modules can buy a commercial license available as a service (SaaS) - qcadoo MES is available for any browser and eliminates the need to install it manually, remembering about backups and system upgrades. This really reduces your costs and saves your time global range - from the beginning we are working on making qcadoo MES available worldwide. We are collaborating with partners from every country and believe that a wide choice of modules will fit in to production lines everywhere, Canada, Poland, India, you name it A new Module is a piece of cake - qcadoo Framework The whole system is based on the qcadoo Framework. A great tool for rapid implementation of modular database web application. Every thing in it is a module and therefore can be extended by another module. Yours for example ! While designing the qcadoo Framework we also wanted to make it very easy for new developers. For this purpose we created two simple XML languages: one to define the GUI and the other do define the data model. Thanks to them in a few lines of code you can add a new data entity, a table for it and all the basic CRUD operations (Create, Remove, Update, Delete). Also you can always write more complex business logic in a simple POJO Java class and use the XML languages to hook it to the right events. A new Application is also a piece of cake The qcadoo Framework is not limited to just manufacturing software. It's well suited for any business database application. Everywhere you need a rapid tool for data and GUI modeling, a modular architecture and a repository of business modules which you can use. Our partners ale already getting ready to use the qcadoo Framework in their commercial projects. We also hope that the open source community can benefit from it as well. About Qcadoo Limited Qcadoo Limited is a company which was founded in 2010 with the help of the UE POIG 8.1 financing program and by a couple of private investors. The core team is build from people which have years of experience in developing business software in Poland and the UE. More info on qcadoo MES and the qcadoo Framework at www.qcadoo.com

June 6, 2011

by Adam Walczak

· 6,542 Views

When to Use Apache Camel?

When to use Apache Camel, a popular JVM/Java environment, and when to use other alternatives.

June 5, 2011

by Kai Wähner

CORE

· 153,185 Views · 12 Likes