The Latest Java Topics

Spring Integration with Gateways
This is the second article in the series on Spring Integration. It builds on the first article, where we introduced Spring Integration.

Context setting

In the first article, we created a simple Java application where a message was sent over a channel, intercepted by a service (i.e. a POJO) and modified, then sent over a different channel, and finally the modified message was read from that channel and displayed. However, in doing this - keeping in mind that we were merely introducing the concepts there - we wrote some Spring-specific code in our application, i.e. in the test classes. In this article we will take care of that and make our application code as insulated from the Spring Integration API as possible. This is done by what Spring Integration calls gateways.

Gateways exist for the sole purpose of abstracting messaging-related "plumbing" code away from "business" code. The business logic might really not care whether a piece of functionality is achieved by sending a message over a channel or by making a SOAP call. This abstraction - though logical and desirable - has not been very practical until now. It is probably worth having a quick look at the Spring Integration Reference Manual at this point. However, if you are just getting started with Spring Integration, you are perhaps better off following this article for the moment. I would recommend you get your hands dirty before returning to the reference manual, which is very good but also very exhaustive and hence can be overwhelming for a beginner.

The gateway can be a POJO with annotations (which is convenient but, in my mind, defeats the whole purpose) or configured in XML (which can very quickly turn into a nightmare in any decent-sized application if left unchecked). At the end of the day it is really your choice, but I like to go the XML route. The configuration options for both styles are detailed in this section of the reference manual.

Spring Integration with Gateways

So, let's create another test with a gateway thrown in for our HelloWorld service (refer to the first article of this series for more context). Let's start with the Spring configuration for the test.

File: src/test/resources/org/academy/integration/HelloWorld1Test-context.xml

In this case, all that is different is that we have added a gateway. This is an interface called org.academy.integration.Greetings. It interacts with both "inputChannel" and "outputChannel", to send and read messages respectively. Let's write the interface.

File: /src/main/java/org/academy/integration/Greetings.java

package org.academy.integration;

public interface Greetings {
    public void send(String message);
    public String receive();
}

And then we add the implementation of this interface. Wait. There is no implementation. And we do not need any implementation. Spring uses something called GatewayProxyFactoryBean to inject some basic code into this gateway, which allows it to handle the simple string-based message without us needing to do anything at all. That's right. Nothing at all. Note - you will need to add more code for most of your production scenarios - assuming you are not using the Spring Integration framework to just push strings around. So, don't get used to free lunches. But, while it is here, let's dig in. Now, let's write a new test class using the gateway (and not interact with the channels and messages at all).
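Before the test class, here is a rough sketch of what the gateway wiring in HelloWorld1Test-context.xml could look like. This is a reconstruction, not the original listing: the "int" namespace prefix, the queue on the output channel and the service-activator wiring carried over from the first article are assumptions.

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:int="http://www.springframework.org/schema/integration"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/integration
           http://www.springframework.org/schema/integration/spring-integration.xsd">

    <!-- Channels and the HelloWorld service activator assumed from the first article -->
    <int:channel id="inputChannel"/>
    <int:channel id="outputChannel">
        <int:queue capacity="10"/>
    </int:channel>
    <int:service-activator input-channel="inputChannel" output-channel="outputChannel"
                           ref="helloWorldService" method="sayHello"/>
    <bean id="helloWorldService" class="org.academy.integration.HelloWorld"/>

    <!-- The new piece: a gateway proxying the Greetings interface onto the channels -->
    <int:gateway id="greetings" service-interface="org.academy.integration.Greetings"
                 default-request-channel="inputChannel" default-reply-channel="outputChannel"/>
</beans>

With the gateway declared, the test only needs the Greetings interface.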
File: /src/test/java/org/academy/integration/HelloWorld1Test.java

package org.academy.integration;

import static org.junit.Assert.*;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration
public class HelloWorld1Test {
    private final static Logger logger = LoggerFactory.getLogger(HelloWorld1Test.class);

    @Autowired
    Greetings greetings;

    @Test
    public void test() {
        greetings.send("World");
        assertEquals(greetings.receive(), "Hello World");
        logger.debug("Spring Integration with gateways.");
    }
}

Our test class is much cleaner now. It does not know about channels, or messages, or anything related to Spring Integration at all. It only knows about a greetings instance - to which it gave some data via the .send() method - and from which it got modified data back via the .receive() method. Hence, the business logic is oblivious to the plumbing logic, making for much cleaner code.

Now, simply type "mvn -e clean install" (or use the m2e plugin) and you should be able to run the unit test and confirm that, given the string "World", the HelloWorld service indeed returns "Hello World" over the entire arrangement of channels and messages.

Again, something optional but highly recommended is to run "mvn -e clean install site". This - assuming you have correctly configured some code coverage tool (Cobertura in my case) - will give you a nice HTML report showing the code coverage. In this case it would be 100%. I have blogged a series on code quality which deals with this subject in more detail, but to cut a long story short, it is very important for me to ensure that whatever coding practice / framework I use and recommend complies with some basic code quality standards. Being able to unit test and measure that is one such fundamental check that I do. Needless to say, Spring in general (including Spring Integration) passes that check with flying colours.

Conclusion

That's it for this article. Happy coding.
August 13, 2012
by Partha Bhattacharjee
· 59,695 Views
FXML & JavaFX—Fueled by CDI & JBoss Weld
It has been a while since I first wanted to have CDI running with JavaFX 2. Some people have already blogged on how to get Guice injection [1] working with JavaFX & FXML. Well, now it's my turn to provide a way to empower JavaFX with CDI, using Weld as the implementation. My goal was just to have CDI working, no matter how I was using JavaFX, whether by coding directly in plain Java or by using FXML. Ready? Let's go!!!

Bootstrap JavaFX & Weld/CDI

The launcher class will be the only place where we have Weld-specific code - all the rest will be totally CDI compliant. The only trick here is to make the application parameters available as a CDI-compliant object so we can reuse them afterwards. Notice also that we use the CDI event mechanism to start up our real application code.

public class WeldJavaFXLauncher extends Application {

    /**
     * Nothing special, we just use the JavaFX Application methods to bootstrap JavaFX.
     */
    public static void main(String[] args) {
        Application.launch(WeldJavaFXLauncher.class, args);
    }

    @SuppressWarnings("serial")
    @Override
    public void start(final Stage primaryStage) throws Exception {
        // Let's initialize CDI/Weld.
        WeldContainer weldContainer = new Weld().initialize();
        // Make the application parameters injectable with a standard CDI annotation
        weldContainer.instance().select(ApplicationParametersProvider.class).get().setParameters(getParameters());
        // Now that the JavaFX thread is ready, let's inform whoever cares using the
        // standard CDI notification mechanism: CDI events
        weldContainer.event().select(Stage.class, new AnnotationLiteral<StartupScene>() {}).fire(primaryStage);
    }
}

Start our real JavaFX application

Here we start our real application code. We're just listening to the previously fired event (carrying the Stage to render into) so we can start showing our application. In the following example, we load an FXML GUI, but it might have been any node created in any way.

public class LoginApplicationStarter {

    // Let's have an FXMLLoader injected automatically
    @Inject
    FXMLLoader fxmlLoader;

    // Our CDI entry point; we just listen to an event providing the startup stage
    public void launchJavaFXApplication(@Observes @StartupScene Stage s) {
        InputStream is = null;
        try {
            is = getClass().getResourceAsStream("login.fxml");
            // we just load our FXML form (including controller and so on)
            Parent root = (Parent) fxmlLoader.load(is);
            s.setScene(new Scene(root, 300, 275));
            s.show(); // let's show the scene
        } catch (IOException e) {
            throw new IllegalStateException("cannot load FXML login screen", e);
        } finally {
            // cleanup omitted
        }
    }
}

But what about the FXML controller? First let's have a look at the controller we want to use inside our application. It is a pure POJO class annotated with both JavaFX & CDI annotations.
// Simple application controller that uses injected fields to delegate the login process
// and to get default values from the command line using: --user=SomeUser
public class LoginController implements Initializable {

    // Standard FXML injected fields
    @FXML TextField loginField;
    @FXML PasswordField passwordField;
    @FXML Text feedback;

    // CDI injected service
    @Inject LoginService loginService;

    // Default application parameters retrieved using CDI
    @Inject Parameters applicationParameters;

    @FXML
    protected void handleSubmitButtonAction(ActionEvent event) {
        feedback.setText(loginService.login(loginField.getText(), passwordField.getText()));
    }

    @Override
    public void initialize(URL location, ResourceBundle resources) {
        loginField.setText(applicationParameters.getNamed().get("user"));
    }
}

In order to have injection working inside the FXML controller, we need to set up JavaFX so that controller objects are created by CDI. As we are in a CDI environment, we can also have FXMLLoader instances injected (that's exactly what we did in the previous LoginApplicationStarter class). How can we achieve this? We just have to provide a producer class whose responsibility is to create FXMLLoader instances that are able to load FXML GUIs and instantiate controllers using CDI. The only part that is a little tricky is that the controller instantiation depends on the required class or interface (declared using fx:controller in your FXML file). In order to have such runtime injection/resolution available, we use a CDI Instance object.

public class FXMLLoaderProducer {

    @Inject Instance<Object> instance;

    // Producer method (the method name here is illustrative; generics were lost in the original rendering)
    @Produces
    public FXMLLoader createLoader() {
        FXMLLoader loader = new FXMLLoader();
        // Delegate controller creation to CDI so @Inject works inside controllers
        loader.setControllerFactory(new Callback<Class<?>, Object>() {
            @Override
            public Object call(Class<?> param) {
                return instance.select(param).get();
            }
        });
        return loader;
    }
}

I hope you found the article interesting, and do not hesitate to comment if you see any errors or possible enhancements. Finally, if you are interested, you can find the full source code here. [1] http://andrewtill.blogspot.be/2012/07/creating-javafx-controllers-using-guice.htm
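One piece the listings above use but never show is the @StartupScene CDI qualifier. Its definition is not in this excerpt, but a minimal qualifier along these lines would work (an assumed sketch, not the author's exact code):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import javax.inject.Qualifier;

// Assumed definition of the qualifier used with @Observes and the fired Stage event
@Qualifier
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER, ElementType.TYPE })
public @interface StartupScene {
}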
August 7, 2012
by Matthieu Brouillard
· 15,563 Views · 1 Like
Java Executor Service Types
The ExecutorService feature came with Java 5. It extends the Executor interface and provides a thread pool feature to execute asynchronous short tasks. There are five ways to execute tasks asynchronously using the ExecutorService implementations provided by Java 6; the three most commonly used ones are covered below.

ExecutorService execService = Executors.newCachedThreadPool();

This approach creates a thread pool that creates new threads as needed, but reuses previously constructed threads when they are available. These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks. If no existing thread is available, a new thread is created and added to the pool. Threads that have not been used for 60 seconds are terminated and removed from the cache.

ExecutorService execService = Executors.newFixedThreadPool(10);

This approach creates a thread pool that reuses a fixed number of threads. The created nThreads will be active at runtime. If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available.

ExecutorService execService = Executors.newSingleThreadExecutor();

This approach creates an Executor that uses a single worker thread operating off an unbounded queue. Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time.

Methods of the ExecutorService:

execute(Runnable) : Executes the given command at some time in the future.
submit(Runnable) : Returns a Future object which represents the executed task. The Future object returns null if the task has finished correctly.
shutdown() : Initiates an orderly shutdown in which previously submitted tasks are executed, but no new tasks will be accepted. Invocation has no additional effect if already shut down.
shutdownNow() : Attempts to stop all actively executing tasks, halts the processing of waiting tasks, and returns a list of the tasks that were awaiting execution. There are no guarantees beyond best-effort attempts to stop processing actively executing tasks. For example, typical implementations will cancel via Thread.interrupt, so any task that fails to respond to interrupts may never terminate.

A sample application is below:

STEP 1 : CREATE MAVEN PROJECT

A Maven project is created as below. (It can be created by using Maven or an IDE plug-in.)

STEP 2 : CREATE A NEW TASK

A new task is created by implementing the Runnable interface (creating a Thread) as below. The TestTask class specifies the business logic which will be executed.

package com.otv.task;

import org.apache.log4j.Logger;

/**
 * @author onlinetechvision.com
 * @since 24 Sept 2011
 * @version 1.0.0
 */
public class TestTask implements Runnable {
    private static Logger log = Logger.getLogger(TestTask.class);
    private String taskName;

    public TestTask(String taskName) {
        this.taskName = taskName;
    }

    public void run() {
        try {
            log.debug(this.taskName + " is sleeping...");
            Thread.sleep(3000);
            log.debug(this.taskName + " is running...");
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

STEP 3 : CREATE TestExecutorService by using newCachedThreadPool

TestExecutorService is created by using the method newCachedThreadPool. In this case, the created thread count is determined at runtime.
package com.otv; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import com.otv.task.TestTask; /** * @author onlinetechvision.com * @since 24 Sept 2011 * @version 1.0.0 * */ public class TestExecutorService { public static void main(String[] args) { ExecutorService execService = Executors.newCachedThreadPool(); execService.execute(new TestTask("FirstTestTask")); execService.execute(new TestTask("SecondTestTask")); execService.execute(new TestTask("ThirdTestTask")); execService.shutdown(); } } When TestExecutorService is run, the output will be seen as below : 24.09.2011 17:30:47 DEBUG (TestTask.java:21) - SecondTestTask is sleeping... 24.09.2011 17:30:47 DEBUG (TestTask.java:21) - ThirdTestTask is sleeping... 24.09.2011 17:30:47 DEBUG (TestTask.java:21) - FirstTestTask is sleeping... 24.09.2011 17:30:50 DEBUG (TestTask.java:23) - ThirdTestTask is running... 24.09.2011 17:30:50 DEBUG (TestTask.java:23) - FirstTestTask is running... 24.09.2011 17:30:50 DEBUG (TestTask.java:23) - SecondTestTask is running... STEP 4 : CREATE TestExecutorService by using newFixedThreadPool TestExecutorService is created by using the method newFixedThreadPool. In this case, required thread count has to be set as the following : package com.otv; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import com.otv.task.TestTask; /** * @author onlinetechvision.com * @since 24 Sept 2011 * @version 1.0.0 * */ public class TestExecutorService { public static void main(String[] args) { ExecutorService execService = Executors.newFixedThreadPool(2); execService.execute(new TestTask("FirstTestTask")); execService.execute(new TestTask("SecondTestTask")); execService.execute(new TestTask("ThirdTestTask")); execService.shutdown(); } } When TestExecutorService is run, ThirdTestTask is executed after FirstTestTask and SecondTestTask’ s executions are completed. The output will be seen as below: 24.09.2011 17:33:38 DEBUG (TestTask.java:21) - FirstTestTask is sleeping... 24.09.2011 17:33:38 DEBUG (TestTask.java:21) - SecondTestTask is sleeping... 24.09.2011 17:33:41 DEBUG (TestTask.java:23) - FirstTestTask is running... 24.09.2011 17:33:41 DEBUG (TestTask.java:23) - SecondTestTask is running... 24.09.2011 17:33:41 DEBUG (TestTask.java:21) - ThirdTestTask is sleeping... 24.09.2011 17:33:44 DEBUG (TestTask.java:23) - ThirdTestTask is running... STEP 5 : CREATE TestExecutorService by using newSingleThreadExecutor TestExecutorService is created by using the method newSingleThreadExecutor. In this case, only one thread is created and tasks are executed sequentially. package com.otv; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import com.otv.task.TestTask; /** * @author onlinetechvision.com * @since 24 Sept 2011 * @version 1.0.0 * */ public class TestExecutorService { public static void main(String[] args) { ExecutorService execService = Executors.newSingleThreadExecutor(); execService.execute(new TestTask("FirstTestTask")); execService.execute(new TestTask("SecondTestTask")); execService.execute(new TestTask("ThirdTestTask")); execService.shutdown(); } } When TestExecutorService is run, SecondTestTask and ThirdTestTask is executed after FirstTestTask’ s execution is completed. The output will be seen as below : 24.09.2011 17:38:21 DEBUG (TestTask.java:21) - FirstTestTask is sleeping... 24.09.2011 17:38:24 DEBUG (TestTask.java:23) - FirstTestTask is running... 
24.09.2011 17:38:24 DEBUG (TestTask.java:21) - SecondTestTask is sleeping... 24.09.2011 17:38:27 DEBUG (TestTask.java:23) - SecondTestTask is running... 24.09.2011 17:38:27 DEBUG (TestTask.java:21) - ThirdTestTask is sleeping... 24.09.2011 17:38:30 DEBUG (TestTask.java:23) - ThirdTestTask is running... STEP 6 : REFERENCES http://download.oracle.com/javase/6/docs/api/java/util/concurrent/ExecutorService.html http://tutorials.jenkov.com/java-util-concurrent/executorservice.html
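As a small addendum (not part of the original sample): the submit() method listed above is easiest to understand together with a Future. A minimal sketch in the same Java 6 style, with illustrative class and string names:

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitExample {
    public static void main(String[] args) throws Exception {
        ExecutorService execService = Executors.newFixedThreadPool(2);
        // submit() returns a Future representing the pending result of the task
        Future<String> future = execService.submit(new Callable<String>() {
            public String call() throws Exception {
                Thread.sleep(1000);
                return "task result";
            }
        });
        // get() blocks until the task completes, then returns its result
        System.out.println(future.get());
        execService.shutdown();
    }
}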
August 6, 2012
by Eren Avsarogullari
· 23,418 Views · 2 Likes
Using Multiple Versions of JDK and Eclipse in Single Machine
On my office laptop, I have installed two versions of the JDK. For office work, I need JDK 6 because the internal framework needs it. I'm using JDK 7 for my personal projects and for exploring the latest and greatest in Java. I have two versions of Eclipse too (one for office work and one is the latest Juno). But the tricky thing is to manage these multiple JDKs and IDEs. It's a piece of cake if I just use Eclipse for compiling my code, because the IDE allows me to configure multiple versions of the Java runtime. Unfortunately (or fortunately), I have to use the command line/shell to build my code. So it is important that I have the right version of the JDK present in the PATH and other related environment variables (such as JAVA_HOME). Manually modifying the environment variables every time I want to switch between JDKs isn't a happy task. But, thanks to Windows PowerShell, I'm able to write a scriptlet that can do the heavy lifting for me.

Basically, what I want to achieve is to set the PATH variable to add the Java bin folder, set the JAVA_HOME environment variable, and then launch the correct Eclipse IDE. And I want to do this with a single command. Let's do it.

Open a Windows PowerShell. I prefer writing custom Windows scripts in my profile file so that they are available to run whenever I open the shell. To edit the profile, run this command: notepad.exe $profile - the $profile is a special variable that points to your profile file. Write the below script in the profile file and save it.

function myIDE {
    $env:Path += "C:\vraa\java\jdk7\bin;"
    $env:JAVA_HOME = "C:\vraa\java\jdk7"
    C:\vraa\ide\eclipse\eclipse
    set-location C:\vraa\workspace\myproject
    play
}

function officeIDE {
    $env:Path += "C:\vraa\java\jdk6\bin;"
    $env:JAVA_HOME = "C:\vraa\java\jdk6"
    C:\office\eclipse\eclipse
}

Close and restart the PowerShell. Now you can issue the command myIDE, which will set the proper PATH and environment variables and then launch the Eclipse IDE. As you can see, there are two functions with different configurations. Just call the name of the function that you want to launch from the PowerShell command line (myIDE or officeIDE).
August 4, 2012
by Veera Sundar
· 19,562 Views
Mustaches in the World of Java
Mustache is a templating system with implementations in many languages, including Java and JavaScript. The templates are also supported by various web frameworks and client-side JS libraries. Mustache is built on the simple idea of a "logic-less" system: it lacks any explicit control statements, like if, else or goto, and it does not have a for statement either; however, looping and conditional output can be achieved using tags that work with lists and lambdas. The name unfortunately has less to do with Tom Selleck and more with the heavy use of curly braces that look like a mustache. The similarity is more than comparable. Mustache has implementations for most of the widely used languages: Java, JavaScript, Ruby, .NET and many more.

Client-side templates in JavaScript

Let's say that you have some REST service and you have created a book view object that has an additional function that appends an Amazon Associates id to the book URL:

var book = {
    id : 12,
    title : "A Game of Thrones",
    url : "http://www.amazon.com/gp/product/0553573403/",
    amazonId : "myAwesomeness",
    associateUrl : function() {
        return this.url + '?tag=' + this.amazonId;
    },
    author : {
        name : 'George R. R. Martin',
        imdbUrl : 'http://www.imdb.com/name/nm0552333/',
        wikiUrl : 'https://en.wikipedia.org/wiki/George_R._R._Martin'
    },
    haveInStock : true,
    similarBooks : [{
        id : 13,
        title : "Decision Points"
    }, {
        id : 13,
        title : "Spoken from the Heart"
    }],
    comments : []
};

The standard way of rendering data without using templates would be to create an output variable, append everything into it, and at the end just place the data where it should be:

jQuery(document).ready(function() {
    var out = '<span>' + book.title + ' is awesome book get it on <a href="' + book.associateUrl() + '">Amazon</a></span>';
    jQuery('#content-jquery').html(out);
});

This is fairly simple, but if you, for example, want to change the span element to a div, it takes a little bit of time to figure out where it should be closed, and you can easily get confused about whether an attribute should be in single quotes or double quotes. The bigger issue here is that the content is built from pieces of strings that still need to be styled via CSS and manipulated via JavaScript. As the code gets bigger this becomes unmanageable and changes to anything become slower, especially if you add on top of this jQuery's manipulation functions like appendTo() or prependTo(). This direct out+= style of creating content reminds me a lot of the HttpServlet style of using a print writer and doing out.print(), and for the same reason that approach was all but abandoned, we should not do this in JavaScript. To simplify the work we can add a template engine like Mustache, which is one of many client-side templating engines. So how does a Mustache template look? For the book example above it would look roughly like the following.
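The original template listing is not preserved in this excerpt, so the snippet below is a reconstruction of what it could look like rather than the author's exact code; the container ids and the surrounding markup are assumptions.

<script id="book-template" type="text/html">
  <span>
    {{title}} is awesome book get it on <a href="{{associateUrl}}">Amazon</a>
    {{#author}} by <a href="{{wikiUrl}}">{{name}}</a>{{/author}}
  </span>
</script>

jQuery(document).ready(function() {
    var template = jQuery('#book-template').html();
    // associateUrl is a function on the view; Mustache invokes it and interpolates the result.
    // Mustache.render() in current mustache.js; older versions used Mustache.to_html().
    jQuery('#content-mustache').html(Mustache.render(template, book));
});

The markup lives in one place, the data in another, and the rendering call stays a one-liner.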
August 1, 2012
by Mite Mitreski
· 39,536 Views
Spring Data With Cassandra Using JPA
We recently adopted the use of Spring Data. Spring Data provides a nice pattern/API that you can layer on top of JPA to eliminate boiler-plate code. With that adoption, we started looking at the DAO layer we use against Cassandra for some of our operations. Some of the data we store in Cassandra is simple. It does *not* leverage the flexible nature of NoSQL. In other words, we know all the table names, the column names ahead of time, and we don't anticipate them changing all that often. We could have stored this data in an RDBMs, using hibernate to access it, but standing up another persistence mechanism seemed like overkill. For simplicity's sake, we preferred storing this data in Cassandra. That said, we want the flexibility to move this to an RDBMs if we need to. Enter JPA. JPA would provide us a nice layer of abstraction away from the underlying storage mechanism. Wouldn't it be great if we could annotate the objects with JPA annotations, and persist them to Cassandra? Enter Kundera. Kundera is a JPA implementation that supports Cassandra (among other storage mechanisms). OK -- so JPA is great, and would get us what we want, but we had just adopted the use of Spring Data. Could we use both? The answer is "sort of". I forked off SpringSource's spring-data-cassandra: https://github.com/boneill42/spring-data-cassandra And I started hacking on it. I managed to get an implementation of the PagingAndSortingRepository for which I wrote unit tests that worked, but I was duplicating a lot of what should have come for free in the SimpleJpaRepository. When I tried to substitute my CassandraJpaRepository for the SimpleJpaRepository, I ran into some trouble w/ Kundera. Specifically, the MetaModel implementation appeared to be incomplete. MetaModelImpl was returning null for all managedTypes(). SimpleJpa wasn't too happy with this. Instead of wrangling with Kundera, we punted. We can achieve enough of the value leveraging JPA directly. Perhaps more importantly, there is still an impedance mismatch between JPA and NoSQL. In our case, it would have been nice to get at Cassandra through Spring Data using JPA for a few cases in our app, but for the vast majority of the application, a straight up ORM layer whereby we know the tables, rows and column names ahead of time is insufficient. For those cases where we don't know the schema ahead of time, we're going to need to leverage the converters pattern in Spring Data. So, I started hacking on a proper Spring Data layer using Astyanax as the client. Follow along here: https://github.com/boneill42/spring-data-cassandra More to come on that....
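To make the idea concrete, the kind of fixed-schema object described here is just a plain JPA-annotated class. Below is a hypothetical sketch, not code from this post: the entity, keyspace and persistence-unit names are made up, and the schema="keyspace@persistenceUnit" convention is Kundera-specific.

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical entity: a simple, known-ahead-of-time schema stored in Cassandra via Kundera
@Entity
@Table(name = "users", schema = "demoKeyspace@cassandra_pu")
public class User {

    @Id
    @Column(name = "user_id")
    private String userId;

    @Column(name = "email")
    private String email;

    // getters and setters omitted for brevity
}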
July 31, 2012
by Brian O'Neill
· 30,017 Views
Use Lucene’s MMapDirectory on 64bit Platforms, Please!
Don’t be afraid – some clarification of common misunderstandings

Since version 3.1, Apache Lucene and Solr use MMapDirectory by default on 64-bit Windows and Solaris systems; since version 3.3 also on 64-bit Linux systems. This change led to some confusion among Lucene and Solr users, because suddenly their systems started to behave differently than in previous versions. On the Lucene and Solr mailing lists a lot of posts arrived from users asking why their Java installation was suddenly consuming three times their physical memory, or from system administrators complaining about heavy resource usage. Also, consultants were starting to tell people that they should not use MMapDirectory and should change their solrconfig.xml to work instead with the slow SimpleFSDirectory or NIOFSDirectory (which is much slower on Windows, caused by a JVM bug #6265734). From the point of view of the Lucene committers, who carefully decided that using MMapDirectory is the best choice for those platforms, this is rather annoying, because they know that Lucene/Solr can work with much better performance than before. Common misinformation about the background of this change causes suboptimal installations of this great search engine everywhere.

In this blog post, I will try to explain the basic operating system facts regarding virtual memory handling in the kernel and how this can be used to largely improve the performance of Lucene (“VIRTUAL MEMORY for DUMMIES”). It will also clarify why the blog and mailing list posts by various people are wrong and contradict the purpose of MMapDirectory. In the second part I will show you some configuration details and settings you should take care of to prevent errors like “mmap failed” and suboptimal performance because of stupid Java heap allocation.

Virtual Memory [1]

Let’s start with your operating system’s kernel: the naive approach to do I/O in software is the way you have done it since the 1970s – the pattern is simple: whenever you have to work with data on disk, you execute a syscall to your operating system kernel, passing a pointer to some buffer (e.g. a byte[] array in Java) and transfer some bytes from/to disk. After that you parse the buffer contents and do your program logic. If you don’t want to do too many syscalls (because those may cost a lot of processing power), you generally use large buffers in your software, so synchronizing the data in the buffer with your disk needs to be done less often. This is one reason why some people suggest loading the whole Lucene index into Java heap memory (e.g., by using RAMDirectory).

But all modern operating systems like Linux, Windows (NT+), MacOS X, or Solaris provide a much better approach to this 1970s style of code by using their sophisticated file system caches and memory management features. A feature called “virtual memory” is a good alternative for handling very large and space-intensive data structures like a Lucene index. Virtual memory is an integral part of a computer architecture; implementations require hardware support, typically in the form of a memory management unit (MMU) built into the CPU. The way it works is very simple: every process gets its own virtual address space where all libraries, heap and stack space are mapped into. This address space in most cases also starts at offset zero, which simplifies loading the program code because no relocation of address pointers needs to be done. Every process sees a large, unfragmented, linear address space it can work on.
It is called “virtual memory” because this address space has nothing to do with physical memory; it just looks like that to the process. Software can then access this large address space as if it were real memory, without knowing that there are other processes also consuming memory and having their own virtual address space. The underlying operating system works together with the MMU (memory management unit) in the CPU to map those virtual addresses to real memory once they are accessed for the first time. This is done using so-called page tables, which are backed by TLBs located in the MMU hardware (translation lookaside buffers, which cache frequently accessed pages). By this, the operating system is able to distribute all running processes’ memory requirements across the real available memory, completely transparently to the running programs.

Schematic drawing of virtual memory (image from Wikipedia [1], http://en.wikipedia.org/wiki/File:Virtual_memory.svg, licensed by CC BY-SA 3.0)

By using this virtualization, there is one more thing the operating system can do: if there is not enough physical memory, it can decide to “swap out” pages no longer used by the processes, freeing physical memory for other processes or for caching more important file system operations. Once a process tries to access a virtual address which was paged out, it is reloaded into main memory and made available to the process. The process does not have to do anything; it is completely transparent. This is a good thing for applications because they don’t need to know anything about the amount of memory available, but it also leads to problems for very memory-intensive applications like Lucene.

Lucene & Virtual Memory

Let’s take the example of loading the whole index or large parts of it into “memory” (we already know it is only virtual memory). If we allocate a RAMDirectory and load all index files into it, we are working against the operating system: the operating system tries to optimize disk accesses, so it already caches all disk I/O in physical memory. We copy all these cache contents into our own virtual address space, consuming horrible amounts of physical memory (and we must wait for the copy operation to take place!). As physical memory is limited, the operating system may, of course, decide to swap out our large RAMDirectory, and where does it land? On disk again (in the OS swap file)! In fact, we are fighting against our O/S kernel, which pages out all the stuff we loaded from disk [2]. So RAMDirectory is not a good idea for optimizing index loading times!

Additionally, RAMDirectory has more problems related to garbage collection and concurrency. Because the data resides in swap space, Java’s garbage collector has a hard job freeing the memory in its own heap management. This leads to high disk I/O, slow index access times, and minute-long latency in your searching code caused by the garbage collector going crazy.

On the other hand, if we don’t use RAMDirectory to buffer our index and use NIOFSDirectory or SimpleFSDirectory instead, we have to pay another price: our code has to do a lot of syscalls to the O/S kernel to copy blocks of data between the disk or filesystem cache and our buffers residing in the Java heap. This needs to be done on every search request, over and over again.

Memory Mapping Files

The solution to the above issues is MMapDirectory, which uses virtual memory and a kernel feature called “mmap” [3] to access the disk files.
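In code, switching to it is trivial. A minimal sketch using the Lucene 3.x API (not from the original post; the index path is a placeholder, and note that FSDirectory.open() already picks MMapDirectory automatically on the 64-bit platforms mentioned above):

import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.MMapDirectory;

public class OpenWithMMap {
    public static void main(String[] args) throws Exception {
        // Explicitly request a memory-mapped directory for an existing index
        MMapDirectory dir = new MMapDirectory(new File("/path/to/index"));
        IndexReader reader = IndexReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        System.out.println("Docs in index: " + reader.numDocs());
        searcher.close();
        reader.close();
        dir.close();
    }
}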
In our previous approaches, we were relying on a syscall to copy the data between the file system cache and our local Java heap. How about directly accessing the file system cache? This is what mmap does!

Basically, mmap does the same as handling the Lucene index as a swap file. The mmap() syscall tells the O/S kernel to virtually map our whole index files into the previously described virtual address space, and make them look like RAM available to our Lucene process. We can then access our index file on disk just as if it were a large byte[] array (in Java this is encapsulated by a ByteBuffer interface to make it safe for use by Java code). If we access this virtual address space from the Lucene code, we don’t need to do any syscalls; the processor’s MMU and TLB handle all the mapping for us. If the data is only on disk, the MMU will cause an interrupt and the O/S kernel will load the data into the file system cache. If it is already in cache, the MMU/TLB maps it directly to the physical memory in the file system cache. It is now just a native memory access, nothing more! We don’t have to take care of paging buffers in/out; all this is managed by the O/S kernel. Furthermore, we have no concurrency issues; the only overhead over a standard byte[] array is some wrapping caused by Java’s ByteBuffer interface (it is still slower than a real byte[] array, but that is the only way to use mmap from Java and it is much faster than all other directory implementations shipped with Lucene). We also waste no physical memory, as we operate directly on the O/S cache, avoiding all the Java GC issues described before.

What does this all mean for our Lucene/Solr application? We should not work against the operating system anymore, so allocate as little heap space as possible (-Xmx Java option). Remember, our index accesses rely on data served directly out of the O/S cache! This is also very friendly to the Java garbage collector. Free as much physical memory as possible so it is available to the O/S kernel as file system cache. Remember, our Lucene code works directly on it, reducing the amount of paging/swapping between disk and memory. Allocating too much heap for our Lucene application hurts performance! Lucene does not require it with MMapDirectory.

Why does this only work as expected on operating systems and Java virtual machines with 64 bits? One limitation of 32-bit platforms is the size of pointers: they can refer to any address between 0 and 2^32-1, which is 4 gigabytes. Most operating systems limit that address space to 3 gigabytes because the remaining address space is reserved for use by device hardware and similar things. This means the overall linear address space provided to any process is limited to 3 gigabytes, so you cannot map any file larger than that into this “small” address space to be available as a big byte[] array. And once you have mapped that one large file, there is no virtual space (addresses, like “house numbers”) available anymore. As physical memory sizes in current systems have already gone beyond that size, there is no address space left to make use of for mapping files without wasting resources (in our case “address space”, not physical memory!). On 64-bit platforms this is different: 2^64-1 is a very large number, a number in excess of 18 quintillion bytes, so there is no real limit in address space. Unfortunately, most hardware (the MMU, the CPU’s bus system) and operating systems limit this address space to 47 bits for user-mode applications (Windows: 43 bits) [4].
But there is still plenty of address space available to map terabytes of data.

Common misunderstandings

If you have read carefully what I have told you about virtual memory, you can easily verify that the following is true:

MMapDirectory does not consume additional memory, and the size of mapped index files is not limited by the physical memory available on your server. By mmap()ing files, we only reserve address space, not memory! Remember, address space on 64-bit platforms is for free!

MMapDirectory will not load the whole index into physical memory. Why should it do this? We just ask the operating system to map the file into address space for easy access; by no means are we requesting more. Java and the O/S optionally provide the option to try loading the whole file into RAM (if enough is available), but Lucene does not use that option (we may add this possibility in a later version).

MMapDirectory does not overload the server when “top” reports horrible amounts of memory. “top” (on Linux) has three columns related to memory: “VIRT”, “RES”, and “SHR”. The first one (VIRT, virtual) reports allocated virtual address space (and that one is for free on 64-bit platforms!). This number can be a multiple of your index size or physical memory when merges are running in IndexWriter. If you have only one IndexReader open, it should be approximately equal to the allocated heap space (-Xmx) plus the index size. It does not show the physical memory used by the process. The second column (RES, resident) shows how much (physical) memory the process has allocated for operating and should be in the size of your Java heap space. The last column (SHR, shared) shows how much of the allocated virtual address space is shared with other processes. If you have several Java applications using MMapDirectory to access the same index, you will see this number going up. Generally, you will see the space needed by shared system libraries, JAR files, and the process executable itself (which are also mmapped).

How to configure my operating system and Java VM to make optimal use of MMapDirectory?

First of all, the default settings in Linux distributions and Solaris/Windows are perfectly fine. But there are some paranoid system administrators around that want to control everything (with a lack of understanding). Those limit the maximum amount of virtual address space that can be allocated by applications. So please check that “ulimit -v” and “ulimit -m” both report “unlimited”, otherwise it may happen that MMapDirectory reports “mmap failed” while opening your index. If this error still happens on systems with lots of very large indexes, each of them with many segments, you may need to tune your kernel parameters in /etc/sysctl.conf: the default value of vm.max_map_count is 65530; you may need to raise it. I think for Windows and Solaris systems there are similar settings available, but it is up to the reader to find out how to use them.

For configuring your Java VM, you should rethink your memory requirements: give only the really needed amount of heap space and leave as much as possible to the O/S. As a rule of thumb: don’t use more than ¼ of your physical memory as heap space for Java running Lucene/Solr; keep the remaining memory free for the operating system cache. If you have more applications running on your server, adjust accordingly. As usual, the more physical memory the better, but you don’t need as much physical memory as your index size. The kernel does a good job in paging in frequently used pages from your index.
A good way to check that you have configured your system optimally is to look at both "top" (and correctly interpret it, see above) and the similar command "iotop" (which can be installed, e.g., on Ubuntu Linux by "apt-get install iotop"). If your system does lots of swap in/swap out for the Lucene process, reduce the heap size; you possibly used too much. If you see lots of disk I/O, buy more RUM (Simon Willnauer) so mmapped files don't need to be paged in/out all the time, and finally: buy SSDs.

Happy mmapping!

Bibliography

[1] http://en.wikipedia.org/wiki/Virtual_memory
[2] https://www.varnish-cache.org/trac/wiki/ArchitectNotes
[3] http://en.wikipedia.org/wiki/Memory-mapped_file
[4] http://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details
July 31, 2012
by Uwe Schindler
· 13,549 Views · 1 Like
How Many Java Developers Are There in the World?
Oracle says it’s 9,000,000. Wikipedia claims it’s 10,000,000. And the guys from NumberOf.net seem to be the most precise - they know that there are exactly 9,007,346 Java developers out there. Nice numbers. I have used those articles as reference points while speaking about the potential market size for our memory leak detection tool. But something in these numbers has bothered me for years - there is no trustworthy and public analysis behind them. They are just conjured up from thin air. So I finally thought I would do something about it and try to figure it out for good.

It proved to be a challenging task. After all - with more than seven billion people on our planet I couldn't call everyone and ask them. Well, maybe I could, but if every call took on average 20 seconds I would need at least 4,439 years to complete the survey. If I did not sleep nor eat nor rest. So I had to use other ways of estimation. After playing around with different sources of information, I decided to dig into four of them for a closer look:

Labour statistics provided by different governments
Language popularity sites such as Tiobe and Langpop
Employment portals, using Indeed.com and Monster.com
Download numbers of popular Java tools and libraries - namely Eclipse and Tomcat

Using that information I wanted to estimate the number via three different calculations - based on language popularity indexes, labour statistics and download figures. So, here we go.

How many programmers could there be in total? The world population is currently above seven billion. Out of those seven billion we can leave out sub-Saharan Africa (900M) and rural Asia (about 50% of its 2.2B population) as negligible. This leaves us with approximately 5 billion people living in regions where the overall economic and cultural background can be considered suitable for software industries to spawn. Now, out of those 5,000,000,000, how many could actually be developing software? A good answer at StackExchange gives us some pointers as to where we can find information on the percentage of software developers in different countries. Using the US, Japan, Canada, the EU27 and the UK as a baseline, we can estimate that roughly 0.86% of the population is employed as a software developer or programmer:

Country   Population     Developers   %
Canada    33,476,688     387,000      1.16%
EU27      502,486,499    5,900,000    1.17%
Japan     127,799,000    1,016,929    0.80%
UK        63,162,000     333,000      0.53%
US        313,931,000    1,336,300    0.43%
Weighted average:                     0.86%

0.86% of five billion is 43,000,000. Let's remember this number, as it will be used as a baseline in the following calculations.

Popularity contests

In the popularity contest we will use two channels as the source of data - the TIOBE index and the Langpop one. Other sources such as the Dataist figures were hard to interpret, so we'll stick to just those two. For background - the TIOBE ratings are calculated by counting hits from the most popular search engines. The search query used is +"<language> programming", e.g. +"Java programming" in our case. Langpop uses more sources for input besides search engine queries - in equal weights it traces open job positions, book titles, search engine results, the number of open source projects and other data to calculate its popularity score. Simplifying the TIOBE and Langpop results, we can conclude that according to TIOBE 17% and according to Langpop ~15% of the programmers in the world are using Java. Averaging those numbers we can say that around 16% of the 43,000,000 developers in the world use Java.
This translates to 6,880,000 Java developers out there.

Job portals

Job portals, especially when considering both available positions and uploaded resumes, are definitely a good source of information. The larger ones also provide nice reports on the labour market, which we will dig into next. Note that we used Indeed.com and Monster.com - if you can point us towards more and/or better sources of information, we would be glad to correct our calculations. But using this analysis from Monster.com and the aggregated statistics from Indeed.com we can say that ~18% of Monster.com applicants can program in Java and ~16% of open engineering / programming positions scanned by Indeed.com are looking for Java talent. Averaging those numbers we arrive at 17%. Which, out of 43,000,000 programmers in total, translates to 7,310,000 Java guys and girls in the world.

Software downloads

Every Java developer uses something to build their application. Well, we expect them to use at least a JVM and a compiler. If you happen to know anyone who can get away without those two, please let us know. We would hire him immediately. But most of us tend to use more than just a compiler and a virtual machine. We use IDEs, application servers, build tools, etc. So we figured that we would look into the publicly available download numbers of these tools and try to estimate the number of developers from the download numbers. When calculating the total number of developers from the estimated number of users, we take into account the market share of the corresponding software. To estimate the market share we use ZeroTurnaround's statistics gathered in the spring of 2012.

Eclipse downloads. Eclipse Juno was released on June 27 and was downloaded 1,200,000 times during the first 20 days. Looking at the historical data published by eclipse.org, we can predict that Juno will be downloaded approximately 8,000,000 times in total. The last four major Eclipse releases have all followed a yearly release calendar and all the releases took place in June:

Juno - 8,000,000 (in a year, expecting the trend to continue; currently 1,200,000 downloads in the first 20 days)
Indigo - 6,000,000 downloads
Helios - 4,100,000 downloads
Galileo - 2,200,000 downloads

Averaging the Juno estimate and the Indigo results, we can say that Eclipse is downloaded approximately 7,000,000 times a year. Using ZeroTurnaround's statistics, we expect 68% of Java developers to use Eclipse as a (primary) IDE. If we now make a bold claim that each Java developer on Eclipse will download the IDE exactly once a year, expect the number of downloads per year to be 7,000,000 and consider that 32% of Java developers do not use Eclipse at all, we come to the conclusion that there should be 10,300,000 Java developers in total (7,000,000 / 0.68 ≈ 10,300,000).

Apache Tomcat downloads. Vadim Gritsenko has put together some nice statistics on top of the Apache logs. From there we can see that during the last year Tomcat has been downloaded approximately 550,000 times per month. This gives us a yearly total of 6,600,000 Tomcat downloads. Applying statistics from the same report used for calculating Eclipse's market share, we can estimate that 59% of Java developers are using Tomcat as one of their development platforms. If we now again make a bold claim that each Java developer on Tomcat will download every major release exactly once and consider that 41% of Java developers do not use Tomcat, we reach the conclusion that there should be 11,186,000 Java developers out there (6,600,000 / 0.59 ≈ 11,186,000).
Averaging the numbers from Eclipse and Tomcat downloads, we end up with 10,743,000 Java developers. Conclusions We used three different sources for estimation - popularity contests, job market analysis and download numbers of popular Java development infrastructure products. The numbers varied quite a bit - from 6,880,000 to 10,743,000. Aggressively averaging the three numbers we can conclude that there are 8,311,000 Java developers out there. Not quite as much as Oracle or Wikipedia think, but still enough to build a business that provides developing tools for the Java community. Lies. Damn lies. And statistics.
July 20, 2012
by Nikita Salnikov-Tarnovski
· 23,953 Views
How Changing Java Package Names Transformed my System Architecture
Changing your perspective even a small amount can have profound effects on how you approach your system. Let's say you're writing a web application in Java. In the system you deal with orders, customers and products. As a web application, your classes include staples like PersonController, PersonRepository, CustomerController and OrderService. How do you organize your classes into packages?

There are two fundamental ways to structure your packages. Either you can focus on the logical tiers, like com.brodwall.myapp.controllers, com.brodwall.myapp.domain or perhaps com.brodwall.myapp.services.customer. Or you can focus on the domain contexts, like com.brodwall.myapp.customer, com.brodwall.myapp.orders and com.brodwall.myapp.products. The first approach is by far the most prevalent. In my view, it's also the least helpful. Here are some ways your thinking changes if you structure your packages around domain concepts, rather than technological tiers:

First, and most fundamentally, your mental model will now be aligned with that of the users of your system. If you're asked to implement a typical feature, it is now more likely to be focused around a strict subset of the packages of your system. For example, adding a new field to a form will at least affect the presentation logic, entity and persistence layer for the corresponding domain concept. If your packages are organized around tiers, this change will hit all over your system. In a word: a system organized around features, rather than technologies, has higher coherence. This technical term means that a large percentage of the dependencies of a class are located close to that class.

Secondly, organizing around domain concepts will give you more options when your software grows. When a package contains tens of classes, you may want to split it up into several packages. The discussion can itself be enlightening. "Maybe we should separate out the customer address classes into a com.brodwall.myapp.customer.address package. It seems to have a bit of a life of its own." "Yeah, and maybe we can use the same classes for other places we need addresses, such as suppliers?" "Cool, so com.brodwall.myapp.address, then?" Or maybe you decide that order status codes and payment status codes deserve to be in the "com.brodwall.myapp.order.codes" package. On the other hand, what options do you have for splitting up com.brodwall.myapp.controllers? You could create subpackages for customers, orders and products, but these subpackages may only have one or possibly two classes each.

Finally, and perhaps most intriguingly, using domain concepts for packages allows you to vary the design on a case-by-case basis. Maybe you really need an OrderService which coordinates the payment and shipping of an order, while ProductController only needs basic create-retrieve-update-delete functionality with a repository. A ProductService would just get in the way. If ProductService is missing from the com.brodwall.myapp.services package, this may be confusing or at the very least give you a nagging feeling that something is wrong. On the other hand, if there's no Controller in the com.brodwall.myapp.product package, it doesn't matter much.

Also, most systems have some good parts and some not-so-good parts. If your Services package is not working for you, there's not much you can do. But if the Products package is rotten, you can throw it out and reimplement it without the whole system being thrown into a state of chaos.
By putting the classes needed to implement a feature together with each other and apart from the classes needed to implement other features, developers can be pragmatic and innovative when developing one feature without negatively affecting other features. The flip side of this is that most developers are more comfortable with some technologies in the application and less comfortable with others. Organizing around features instead of technologies forces each developer to consider a larger set of technological challenges. Some programmers take this as a motivating challenge to learn, while others, it seems, would rather not have to learn something new. If it were my money being spent to create features, I know which kind of developer I would want.

Trivial changes can have large effects. By organizing your software around features, you get a more coherent system that allows for growth. It may challenge your developers, but it drives down the number of hand-offs needed to implement a feature and it challenges the developers to improve the parts of the application they are working on. See also my blog post on Architecture as tidying up.
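To make the two layouts concrete, here is a rough sketch; the class placement is illustrative, based on the names used in the article, and is not a listing from the original post.

// Organized by technical tier:
com.brodwall.myapp.controllers   -> CustomerController, OrderController, ProductController
com.brodwall.myapp.services      -> OrderService
com.brodwall.myapp.repositories  -> PersonRepository, ProductRepository

// Organized by domain concept:
com.brodwall.myapp.customer      -> CustomerController, CustomerRepository
com.brodwall.myapp.orders        -> OrderController, OrderService
com.brodwall.myapp.products      -> ProductController, ProductRepository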
July 20, 2012
by Johannes Brodwall
· 17,064 Views
Spring Data - Apache Hadoop
Spring for Apache Hadoop is a Spring project to support writing applications that can benefit from the integration of the Spring Framework and Hadoop. This post describes how to use Spring Data Apache Hadoop in an Amazon EC2 environment using the “Hello World” equivalent of Hadoop programming – a Wordcount application.

1./ Launch an Amazon Web Services EC2 instance.

- Navigate to the AWS EC2 Console (“https://console.aws.amazon.com/ec2/home”):
- Select Launch Instance, then Classic Wizard, and click on Continue. My test environment was a “Basic Amazon Linux AMI 2011.09” 32-bit, instance type Micro (t1.micro, 613 MB), security group quick-start-1 that enables ssh to be used for login. Select your existing key pair (or create a new one). Obviously you can select another AMI and instance type depending on your favourite flavour. (Should you vote for a Windows 2008 based instance, you also need to have cygwin installed as an additional Hadoop prerequisite beside the Java JDK and ssh; see the “Install Apache Hadoop” section.)

2./ Download Apache Hadoop - as of writing this article, 1.0.0 is the latest stable version of Apache Hadoop; that is what was used for testing purposes. I downloaded hadoop-1.0.0.tar.gz and copied it into the /home/ec2-user directory using the pscp command from my PC running Windows:

c:\downloads>pscp -i mykey.ppk hadoop-1.0.0.tar.gz ec2-user@ec2-176-34-201-185.eu-west-1.compute.amazonaws.com:/home/ec2-user

(the computer name above – ec2-ipaddress-region-compute.amazonaws.com – can be found on the AWS EC2 console, Instance Description, public DNS field)

3./ Install Apache Hadoop: As prerequisites, you need to have Java JDK 1.6 and ssh installed; see the Apache Single-Node Setup Guide. (ssh is automatically installed with the Basic Amazon AMI.) Then install hadoop itself:

$ cd ~ # change directory to ec2-user home (/home/ec2-user)
$ tar xvzf hadoop-1.0.0.tar.gz
$ ln -s hadoop-1.0.0 hadoop
$ cd hadoop/conf
$ vi hadoop-env.sh # edit as below
export JAVA_HOME=/opt/jdk1.6.0_29
$ vi core-site.xml # edit as below – this defines the namenode to be running on localhost and listening to port 9000.
fs.default.name hdfs://localhost:9000
$ vi hdfs-site.xml # edit as below – this sets the file system replication factor to 1 (in a production environment it is supposed to be 3 by default)
dfs.replication 1
$ vi mapred-site.xml # edit as below – this defines the jobtracker to be running on localhost and listening to port 9001.
mapred.job.tracker localhost:9001
$ cd ~/hadoop
$ bin/hadoop namenode -format
$ bin/start-all.sh

At this stage all hadoop jobs are running in pseudo-distributed mode; you can verify it by running:

$ ps -ef | grep java

You should see 5 java processes: namenode, secondarynamenode, datanode, jobtracker and tasktracker.

4./ Install Spring Data Hadoop

Download the Spring Data Hadoop package from the SpringSource community download site. As of writing this article, the latest stable version is spring-data-hadoop-1.0.0.M1.zip.

$ cd ~
$ unzip spring-data-hadoop-1.0.0.M1.zip
$ ln -s spring-data-hadoop-1.0.0.M1 spring-data-hadoop

5./ Build and Run the Spring Data Hadoop Wordcount example

$ cd spring-data-hadoop/spring-data-hadoop-1.0.0.M1/samples/wordcount

Spring Data Hadoop uses gradle as its build tool. Check the build.gradle build file. The original version packaged in the archive does not compile; it complains about thrift, version 0.2.0, and jdo2-api, version 2.3-ec. Add the datanucleus.org maven repository to the build.gradle file to support jdo2-api (http://www.datanucleus.org/downloads/maven2/).
Unfortunately, there seems to be no Maven repo for thrift 0.2.0. You should download the thrift-0.2.0.jar and thrift-0.2.0.pom files, e.g. from this repo: "http://people.apache.org/~rawson/repo" and then add them to your local Maven repo. $ mvn install:install-file -DgroupId=org.apache.thrift -DartifactId=thrift -Dversion=0.2.0 -Dfile=thrift-0.2.0.jar -Dpackaging=jar $ vi build.gradle # modify the build file to refer to the datanucleus Maven repo for jdo2-api and the local repo for thrift repositories { // Public Spring artefacts mavenCentral() maven { url "http://repo.springsource.org/libs-release" } maven { url "http://repo.springsource.org/libs-milestone" } maven { url "http://repo.springsource.org/libs-snapshot" } maven { url "http://www.datanucleus.org/downloads/maven2/" } maven { url "file:///home/ec2-user/.m2/repository" } } I also modified the META-INF/spring/context.xml file in order to run Hadoop file system commands manually: $ cd /home/ec2-user/spring-data-hadoop/spring-data-hadoop-1.0.0.M1/samples/wordcount/src/main/resources $ vi META-INF/spring/context.xml # remove clean-script and also the dependency on it for JobRunner. xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:context="http://www.springframework.org/schema/context" xmlns:hdp="http://www.springframework.org/schema/hadoop" xmlns:p="http://www.springframework.org/schema/p" xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd"> fs.default.name=${hd.fs} Copy the sample file – nietzsche-chapter-1.txt – to the Hadoop file system (/user/ec2-user/input directory): $ cd src/main/resources/data $ hadoop fs -mkdir /user/ec2-user/input $ hadoop fs -put nietzsche-chapter-1.txt /user/ec2-user/input/data $ cd ../../../.. # go back to the samples/wordcount directory $ ../gradlew Verify the result: $ hadoop fs -cat /user/ec2-user/output/part-r-00000 | more “AWAY 1 “BY 1 “Beyond 1 “By 2 “Cheers 1 “DE 1 “Everywhere 1 “FROM” 1 “Flatterers 1 “Freedom 1
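For reference, the Java side of such a wordcount job boils down to a mapper and a reducer similar to the sketch below (class and package names are illustrative, not the exact ones shipped with the Spring Data Hadoop sample); the Spring context simply wires classes like these into a Hadoop job and runs it.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Illustrative wordcount mapper/reducer, similar in spirit to the sample job
// that the Spring Data Hadoop context configures and runs.
public class WordCount {

    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);   // emit (word, 1) for every token
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();            // sum the counts for this word
            }
            context.write(key, new IntWritable(sum));
        }
    }
}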
July 19, 2012
by Istvan Szegedi
· 11,649 Views
How to Resolve java.lang.NoClassDefFoundError: Part 3
This article is part 3 of our NoClassDefFoundError troubleshooting series. As I mentioned in my first article, there are many possible issues that can lead to a NoClassDefFoundError. This article will focus on and describe one of the most common causes of this problem: failure of a Java class static initializer block or variable. A sample Java program will be provided and I encourage you to compile and run this example from your workstation in order to properly replicate and understand this type of NoClassDefFoundError problem. Java static initializer revisited The Java programming language provides you with the capability to “statically” initialize variables or a block of code. This is achieved via the “static” variable identifier or the usage of a static {} block at the header of a Java class. Static initializers are guaranteed to be executed only once in the JVM life cycle and are thread-safe by design, which makes their usage quite appealing for static data initialization such as internal object caches, loggers etc. What is the problem? I will repeat again: static initializers are guaranteed to be executed only once in the JVM life cycle… This means that such code is executed at Class loading time and never executed again until you restart your JVM. Now what happens if the code executed at that time (@ Class loading time) terminates with an unhandled Exception? Welcome to the java.lang.NoClassDefFoundError problem case #2! NoClassDefFoundError problem case 2 – static initializer failure This type of problem occurs following the failure of static initializer code combined with successive attempts to create a new instance of the affected (non-loaded) class. Sample Java program The following simple Java program is split as per below: The main Java program NoClassDefFoundErrorSimulator The affected Java class ClassA ClassA provides you with an ON/OFF switch allowing you to replicate the type of problem that you want to study This program is simply attempting to create a new instance of ClassA 3 times (one after the other). It will demonstrate that an initial failure of either a static variable or static block initializer, combined with successive attempts to create a new instance of the affected class, triggers java.lang.NoClassDefFoundError. 
#### NoClassDefFoundErrorSimulator.java package org.ph.javaee.tools.jdk7.training2; /** * NoClassDefFoundErrorSimulator * @author Pierre-Hugues Charbonneau * */ public class NoClassDefFoundErrorSimulator { /** * @param args */ public static void main(String[] args) { System.out.println("java.lang.NoClassDefFoundError Simulator - Training 2"); System.out.println("Author: Pierre-Hugues Charbonneau"); System.out.println("http://javaeesupportpatterns.blogspot.com\n\n"); try { // Create a new instance of ClassA (attempt #1) System.out.println("FIRST attempt to create a new instance of ClassA...\n"); ClassA classA = new ClassA(); } catch (Throwable any) { any.printStackTrace(); } try { // Create a new instance of ClassA (attempt #2) System.out.println("\nSECOND attempt to create a new instance of ClassA...\n"); ClassA classA = new ClassA(); } catch (Throwable any) { any.printStackTrace(); } try { // Create a new instance of ClassA (attempt #3) System.out.println("\nTHIRD attempt to create a new instance of ClassA...\n"); ClassA classA = new ClassA(); } catch (Throwable any) { any.printStackTrace(); } System.out.println("\n\ndone!"); } } #### ClassA.java package org.ph.javaee.tools.jdk7.training2; /** * ClassA * @author Pierre-Hugues Charbonneau * */ public class ClassA { private final static String CLAZZ = ClassA.class.getName(); // Problem replication switch ON/OFF private final static boolean REPLICATE_PROBLEM1 = true; // static variable initializer private final static boolean REPLICATE_PROBLEM2 = false; // static block{} initializer // Static variable executed at Class loading time private static String staticVariable = initStaticVariable(); // Static initializer block executed at Class loading time static { // Static block code execution... if (REPLICATE_PROBLEM2) throw new IllegalStateException("ClassA.static{}: Internal Error!"); } public ClassA() { System.out.println("Creating a new instance of "+ClassA.class.getName()+"..."); } /** * * @return */ private static String initStaticVariable() { String stringData = ""; if (REPLICATE_PROBLEM1) throw new IllegalStateException("ClassA.initStaticVariable(): Internal Error!"); return stringData; } } Problem reproduction In order to replicate the problem, we will simply “voluntary” trigger a failure of the static initializer code. Please simply enable the problem type that you want to study e.g. either static variable or static block initializer failure: // Problem replication switch ON (true) / OFF (false) private final static boolean REPLICATE_PROBLEM1 = true; // static variable initializer private final static boolean REPLICATE_PROBLEM2 = false; // static block{} initializer Now, let’s run the program with both switch at OFF (both boolean values at false) ## Baseline (normal execution) java.lang.NoClassDefFoundError Simulator - Training 2 Author: Pierre-Hugues Charbonneau http://javaeesupportpatterns.blogspot.com FIRST attempt to create a new instance of ClassA... Creating a new instance of org.ph.javaee.tools.jdk7.training2.ClassA... SECOND attempt to create a new instance of ClassA... Creating a new instance of org.ph.javaee.tools.jdk7.training2.ClassA... THIRD attempt to create a new instance of ClassA... Creating a new instance of org.ph.javaee.tools.jdk7.training2.ClassA... done! For the initial run (baseline), the main program was able to create 3 instances of ClassA successfully with no problem. 
## Problem reproduction run (static variable initializer failure) java.lang.NoClassDefFoundError Simulator - Training 2 Author: Pierre-Hugues Charbonneau http://javaeesupportpatterns.blogspot.com FIRST attempt to create a new instance of ClassA... java.lang.ExceptionInInitializerError at org.ph.javaee.tools.jdk7.training2.NoClassDefFoundErrorSimulator.main(NoClassDefFoundErrorSimulator.java:21) Caused by: java.lang.IllegalStateException: ClassA.initStaticVariable(): Internal Error! at org.ph.javaee.tools.jdk7.training2.ClassA.initStaticVariable(ClassA.java:37) at org.ph.javaee.tools.jdk7.training2.ClassA.(ClassA.java:16) ... 1 more SECOND attempt to create a new instance of ClassA... java.lang.NoClassDefFoundError: Could not initialize class org.ph.javaee.tools.jdk7.training2.ClassA at org.ph.javaee.tools.jdk7.training2.NoClassDefFoundErrorSimulator.main(NoClassDefFoundErrorSimulator.java:30) THIRD attempt to create a new instance of ClassA... java.lang.NoClassDefFoundError: Could not initialize class org.ph.javaee.tools.jdk7.training2.ClassA at org.ph.javaee.tools.jdk7.training2.NoClassDefFoundErrorSimulator.main(NoClassDefFoundErrorSimulator.java:39) done! ## Problem reproduction run (static block initializer failure) java.lang.NoClassDefFoundError Simulator - Training 2 Author: Pierre-Hugues Charbonneau http://javaeesupportpatterns.blogspot.com FIRST attempt to create a new instance of ClassA... java.lang.ExceptionInInitializerError at org.ph.javaee.tools.jdk7.training2.NoClassDefFoundErrorSimulator.main(NoClassDefFoundErrorSimulator.java:21) Caused by: java.lang.IllegalStateException: ClassA.static{}: Internal Error! at org.ph.javaee.tools.jdk7.training2.ClassA.(ClassA.java:22) ... 1 more SECOND attempt to create a new instance of ClassA... java.lang.NoClassDefFoundError: Could not initialize class org.ph.javaee.tools.jdk7.training2.ClassA at org.ph.javaee.tools.jdk7.training2.NoClassDefFoundErrorSimulator.main(NoClassDefFoundErrorSimulator.java:30) THIRD attempt to create a new instance of ClassA... java.lang.NoClassDefFoundError: Could not initialize class org.ph.javaee.tools.jdk7.training2.ClassA at org.ph.javaee.tools.jdk7.training2.NoClassDefFoundErrorSimulator.main(NoClassDefFoundErrorSimulator.java:39) done! What happened? As you can see, the first attempt to create a new instance of ClassA did trigger a java.lang.ExceptionInInitializerError. This exception indicates the failure of our static initializer for our static variable & bloc which is exactly what we wanted to achieve. The key point to understand at this point is that this failure did prevent the whole class loading of ClassA. As you can see, attempt #2 and attempt #3 both generated a java.lang.NoClassDefFoundError, why? Well since the first attempt failed, class loading of ClassA was prevented. Successive attempts to create a new instance of ClassA within the current ClassLoader did generate java.lang.NoClassDefFoundError over and over since ClassA was not found within current ClassLoader. As you can see, in this problem context, the NoClassDefFoundError is just a symptom or consequence of another problem. The original problem is the ExceptionInInitializerError triggered following the failure of the static initializer code. This clearly demonstrates the importance of proper error handling and logging when using Java static initializers. 
Recommendations and resolution strategies Find below my recommendations and resolution strategies for NoClassDefFoundError problem case 2: - Review the java.lang.NoClassDefFoundError error and identify the missing Java class - Perform a code walkthrough of the affected class and determine if it contains static initializer code (variables & static block) - Review your server and application logs and determine if any error (e.g. ExceptionInInitializerError) originates from the static initializer code - Once confirmed, analyze the code further and determine the root cause of the initializer code failure. You may need to add some extra logging along with proper error handling to prevent and better handle future failures of your static initializer code Please feel free to post any questions or comments. Part 4 will begin covering NoClassDefFoundError problems related to class loaders.
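To illustrate that last recommendation, a static initializer can be written so that the real root cause is at least logged before the class fails to load. This is only a minimal sketch (the logger usage and the loadConfiguration() helper are assumptions, not code from the article):

import java.util.logging.Level;
import java.util.logging.Logger;

public class SafeInitExample {

    private static final Logger LOGGER = Logger.getLogger(SafeInitExample.class.getName());

    // Initialized in the static block below so any failure can be logged first.
    private static final String CONFIG;

    static {
        String value;
        try {
            value = loadConfiguration();   // may throw at class-loading time
        } catch (RuntimeException e) {
            // Log the real root cause; without this, callers only ever see
            // ExceptionInInitializerError followed by NoClassDefFoundError.
            LOGGER.log(Level.SEVERE, "Static initialization of SafeInitExample failed", e);
            throw e;                        // or fall back to a safe default instead
        }
        CONFIG = value;
    }

    private static String loadConfiguration() {
        // Placeholder for real initialization logic (cache warm-up, config lookup, etc.)
        return "default-config";
    }
}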
July 19, 2012
by Pierre-Hugues Charbonneau
· 90,586 Views · 3 Likes
5 Tips for Proper Java Heap Size
Determination of the proper Java Heap size for a production system is not a straightforward exercise. In my Java EE enterprise experience, I have seen multiple performance problem cases due to inadequate Java Heap capacity and tuning. This article will provide you with 5 tips that can help you determine the optimal Java Heap size, as a starting point, for your current or new production environment. Some of these tips are also very useful regarding the prevention and resolution of java.lang.OutOfMemoryError problems, including memory leaks. Please note that these tips are intended to “help you” determine the proper Java Heap size. Since each IT environment is unique, you are actually in the best position to determine precisely the required Java Heap specifications of your client’s environment. Some of these tips may also not be applicable in the context of a very small Java standalone application, but I still recommend you read the entire article. Future articles will include tips on how to choose the proper Java VM garbage collector type for your environment and applications. #1 – JVM: you always fear what you don't understand How can you expect to configure, tune and troubleshoot something that you don’t understand? You may never have the chance to write and improve Java VM specifications but you are still free to learn its foundation in order to improve your knowledge and troubleshooting skills. Some may disagree, but from my perspective, the thinking that Java programmers are not required to know the internal JVM memory management is an illusion. Java Heap tuning and troubleshooting can especially be a challenge for Java & Java EE beginners. Find below a typical scenario: - Your client production environment is facing OutOfMemoryError on a regular basis and causing a lot of business impact. Your support team is under pressure to resolve this problem - A quick Google search allows you to find examples of similar problems and you now believe (and assume) that you are facing the same problem - You then grab JVM -Xms and -Xmx values from another person's OutOfMemoryError problem case, hoping to quickly resolve your client’s problem - You then proceed and implement the same tuning in your environment. 2 days later you realize the problem is still happening (even worse or a little better)…the struggle continues… What went wrong? - You failed to first acquire a proper understanding of the root cause of your problem - You may also have failed to properly understand your production environment at a deeper level (specifications, load situation etc.). Web searches are a great way to learn and share knowledge but you have to perform your own due diligence and root cause analysis - You may also be lacking some basic knowledge of the JVM and its internal memory management, preventing you from connecting all the dots My #1 tip and recommendation to you is to learn and understand the basic JVM principles along with its different memory spaces. Such knowledge is critical as it will allow you to make valid recommendations to your clients and properly understand the possible impact and risk associated with future tuning considerations. Now find below a quick high-level reference guide for the Java VM: The Java VM memory is split into 3 memory spaces: The Java Heap. Applicable for all JVM vendors, usually split between YoungGen (nursery) & OldGen (tenured) spaces. The PermGen (permanent generation). 
Applicable to the Sun HotSpot VM only (PermGen space will be removed in future Java 7 or Java 8 updates) The Native Heap (C-Heap). Applicable for all JVM vendors. I recommend that you review each article below, including Sun's white paper on HotSpot Java memory management. I also encourage you to download and look at the OpenJDK implementation. ## Sun HotSpot VM http://javaeesupportpatterns.blogspot.com/2011/08/java-heap-space-hotspot-vm.html ## IBM VM http://javaeesupportpatterns.blogspot.com/2012/02/java-heap-space-ibm-vm.html ## Oracle JRockit VM http://javaeesupportpatterns.blogspot.com/2012/02/java-heap-space-jrockit-vm.html ## Sun (Oracle) – Java memory management white paper http://java.sun.com/j2se/reference/whitepapers/memorymanagement_whitepaper.pdf ## OpenJDK – Open-source Java implementation http://openjdk.java.net/ As you can see, Java VM memory management is more complex than just setting up the biggest value possible via –Xmx. You have to look at all angles, including your native and PermGen space requirements along with physical memory availability (and # of CPU cores) from your physical host(s). It can get especially tricky for a 32-bit JVM since the Java Heap and native Heap are in a race. The bigger your Java Heap, the smaller the native Heap. Attempting to set up a large Heap for a 32-bit VM, e.g. 2.5 GB+, increases the risk of native OutOfMemoryError depending on your application(s) footprint, number of Threads etc. A 64-bit JVM resolves this problem but you are still limited by physical resource availability and garbage collection overhead (the cost of major GC collections goes up with size). The bottom line is that bigger is not always better, so please do not assume that you can run all your 20 Java EE applications on a single 16 GB 64-bit JVM process. #2 – Data and application is king: review your static footprint requirement Your application(s) along with their associated data will dictate the Java Heap footprint requirement. By static memory, I mean “predictable” memory requirements as per below. - Determine how many different applications you are planning to deploy to a single JVM process, e.g. number of EAR files, WAR files, jar files etc. The more applications you deploy to a single JVM, the higher the demand on the native Heap - Determine how many Java classes will be potentially loaded at runtime, including third-party APIs. The more class loaders and classes that you load at runtime, the higher the demand on the HotSpot VM PermGen space and internal JIT-related optimization objects - Determine the data cache footprint, e.g. internal cache data structures loaded by your application (and third-party APIs) such as cached data from a database, data read from a file etc. The more data caching that you use, the higher the demand on the Java Heap OldGen space - Determine the number of Threads that your middleware is allowed to create. This is very important since Java threads require enough native memory or OutOfMemoryError will be thrown For example, you will need much more native memory and PermGen space if you are planning to deploy 10 separate EAR applications on a single JVM process vs. only 2 or 3. Data caching not serialized to a disk or database will require extra memory from the OldGen space. Try to come up with reasonable estimates of the static memory footprint requirement. This will be very useful to set up some starting-point JVM capacity figures before your true measurement exercise (e.g. tip #4). 
For a 32-bit JVM, I usually do not recommend a Java Heap size higher than 2 GB (-Xms2048m, -Xmx2048m) since you need enough memory for PermGen and the native Heap for your Java EE applications and threads. This assessment is especially important since too many applications deployed in a single 32-bit JVM process can easily lead to native Heap depletion, especially in a multi-threaded environment. For a 64-bit JVM, a Java Heap size of 3 GB or 4 GB per JVM process is usually my recommended starting point. #3 – Business traffic sets the rules: review your dynamic footprint requirement Your business traffic will typically dictate your dynamic memory footprint. Concurrent users & requests generate the JVM GC “heartbeat” that you can observe from various monitoring tools due to very frequent creation and garbage collection of short & long lived objects. As you saw from the above JVM diagram, a typical ratio of YoungGen vs. OldGen is 1:3 or 33%. For a typical 32-bit JVM, a Java Heap size set up at 2 GB (using the generational & concurrent collector) will typically allocate 500 MB for the YoungGen space and 1.5 GB for the OldGen space. Minimizing the frequency of major GC collections is a key aspect of optimal performance, so it is very important that you understand and estimate how much memory you need during your peak volume. Again, your type of application and data will dictate how much memory you need. Shopping cart types of applications (long lived objects) involving large and non-serialized session data typically need a large Java Heap and a lot of OldGen space. Stateless and XML-processing-heavy applications (lots of short lived objects) require a proper YoungGen space in order to minimize the frequency of major collections. Example: - You have 5 EAR applications (~2,000 Java classes) to deploy (which include middleware code as well…) - Your native heap requirement is estimated at 1 GB (has to be large enough to handle Thread creation etc.) - Your PermGen space is estimated at 512 MB - Your internal static data caching is estimated at 500 MB - Your total forecast traffic is 5000 concurrent users at peak hours - Each user session data footprint is estimated at 500 K - Total footprint requirement for session data alone is 2.5 GB under peak volume As you can see, with such requirements, there is no way you can have all this traffic sent to a single 32-bit JVM process. A typical solution involves splitting (tip #5) traffic across a few JVM processes and / or physical hosts (assuming you have enough hardware and CPU cores available). However, for this example, given the high demand on static memory and to ensure a scalable environment in the long run, I would also recommend a 64-bit VM but with a smaller Java Heap as a starting point, such as 3 GB, to minimize the GC cost. You definitely want to have an extra buffer for the OldGen space, so I typically recommend up to 50% memory footprint post major collection in order to keep the frequency of Full GCs low and enough buffer for fail-over scenarios. Most of the time, your business traffic will drive most of your memory footprint, unless you need a significant amount of data caching to achieve proper performance, which is typical for portal (media) heavy applications. Too much data caching should raise a yellow flag that you may need to revisit some design elements sooner rather than later.
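To make the arithmetic of the example above explicit, the same estimates can be plugged into a few lines of Java (the numbers are the article's example figures, not measurements):

public class HeapEstimate {
    public static void main(String[] args) {
        long nativeHeapMb    = 1024;   // threads, JIT, sockets, etc.
        long permGenMb       = 512;    // classes from the 5 EAR applications
        long staticCacheMb   = 500;    // internal static data caching
        long concurrentUsers = 5000;   // forecast peak traffic
        long sessionKb       = 500;    // estimated session data per user

        long sessionDataMb = concurrentUsers * sessionKb / 1024;   // ~2441 MB, i.e. ~2.5 GB
        long javaHeapMb    = staticCacheMb + sessionDataMb;        // demand on the Java Heap (mostly OldGen)

        System.out.println("Session data at peak        : ~" + sessionDataMb + " MB");
        System.out.println("Java Heap (static + dynamic): ~" + javaHeapMb + " MB");
        System.out.println("Outside the Java Heap       : ~" + (nativeHeapMb + permGenMb) + " MB");
    }
}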
#4 – Don’t guess it, measure it! At this point you should: - Understand the basic JVM principles and memory spaces - Have a deep view and understanding of all applications along with their characteristics (size, type, dynamic traffic, stateless vs. stateful objects, internal memory caches etc.) - Have a very good view or forecast of the business traffic (# of concurrent users etc.) for each application - Some idea of whether you need a 64-bit VM or not and which JVM settings to start with - Some idea of whether you need more than one JVM (middleware) process But wait, your work is not done yet. While the above information is crucial and great for coming up with “best guess” Java Heap settings, it is always best and recommended to simulate your application(s) behaviour and validate the Java Heap memory requirement via proper profiling, load & performance testing. You can learn and take advantage of tools such as JProfiler (future articles will include tutorials on JProfiler). From my perspective, learning how to use a profiler is the best way to properly understand your application memory footprint. Another approach I use for existing production environments is heap dump analysis using the Eclipse MAT tool. Heap dump analysis is very powerful and allows you to view and understand the entire memory footprint of the Java Heap, including class loader related data, and is a must-do exercise in any memory footprint analysis, especially for memory leaks. Java profilers and heap dump analysis tools allow you to understand and validate your application memory footprint, including the detection and resolution of memory leaks. Load and performance testing is also a must since this will allow you to validate your earlier estimates by simulating your forecast concurrent users. It will also expose your application bottlenecks and allow you to further fine-tune your JVM settings. You can use tools such as Apache JMeter, which is very easy to learn and use, or explore other commercial products. Finally, I have quite often seen Java EE environments running perfectly fine until the day one piece of the infrastructure starts to fail, e.g. a hardware failure. Suddenly the environment is running at reduced capacity (reduced # of JVM processes) and the whole environment goes down. What happened? There are many scenarios that can lead to domino effects, but a lack of JVM tuning and capacity to handle fail-over (short-term extra load) is very common. If your JVM processes are running at 80%+ OldGen space capacity with frequent garbage collections, how can you expect to handle any fail-over scenario? Your load and performance testing exercise performed earlier should simulate such a scenario, and you should adjust your tuning settings properly so your Java Heap has enough buffer to handle extra load (extra objects) in the short term. This is mainly applicable for the dynamic memory footprint since fail-over means redirecting a certain % of your concurrent users to the available JVM processes (middleware instances).
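In the same spirit of measuring rather than guessing, the JVM can report its own memory pools (YoungGen, OldGen, PermGen, etc.) at runtime via JMX. Below is a minimal sketch you could run or adapt inside a monitoring thread; pool names vary by JVM vendor and collector:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class MemoryPoolSnapshot {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            // getMax() may return -1 when no maximum is defined for a pool
            System.out.println(pool.getName()
                    + " used=" + (usage.getUsed() / (1024 * 1024)) + " MB"
                    + " committed=" + (usage.getCommitted() / (1024 * 1024)) + " MB"
                    + " max=" + (usage.getMax() < 0 ? "n/a" : (usage.getMax() / (1024 * 1024)) + " MB"));
        }
    }
}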
#5 – Divide and conquer At this point you have performed dozens of load testing iterations. You know that your JVM is not leaking memory. Your application memory footprint cannot be reduced any further. You tried several tuning strategies such as using a large 64-bit Java Heap space of 10 GB+ and multiple GC policies, but you are still not finding your performance level acceptable? In my experience I have found that, with current JVM specifications, proper vertical and horizontal scaling, which involves creating a few JVM processes per physical host and across several hosts, will give you the throughput and capacity that you are looking for. Your IT environment will also be more fault-tolerant if you break your application list into a few logical silos, with their own JVM processes, Threads and tuning values. This “divide and conquer” strategy involves splitting your application(s) traffic across multiple JVM processes and will provide you with: - Reduced Java Heap size per JVM process (both static & dynamic footprint) - Reduced complexity of JVM tuning - Reduced GC elapsed and pause time per JVM process - Increased redundancy and fail-over capabilities - Alignment with the latest Cloud and IT virtualization strategies The bottom line is that when you find yourself spending too much time tuning that single elephant 64-bit JVM process, it is time to revisit your middleware and JVM deployment strategy and take advantage of vertical & horizontal scaling. This implementation strategy is more taxing on the hardware but will really pay off in the long run. Please provide any comments and share your experience on JVM Heap sizing and tuning.
July 19, 2012
by Pierre-Hugues Charbonneau
· 142,728 Views · 7 Likes
Apache Thrift with Java Quickstart
Apache Thrift is an RPC framework originally developed at Facebook and now an Apache project. Thrift lets you define data types and service interfaces in a language-neutral definition file. That definition file is used as the input for the compiler to generate code for building RPC clients and servers that communicate across different programming languages. You can also refer to the Thrift white paper. According to the official web site, Apache Thrift is a "software framework for scalable cross-language services development" that combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages. Image courtesy of Wikipedia Installing Apache Thrift in Windows Installing Thrift can be a tiresome process, but for Windows the compiler is available as a prebuilt exe. Download thrift.exe and add it to your environment variables. Writing the Thrift definition file (.thrift file) Writing the Thrift definition file becomes really easy once you get used to it. I found this tutorial quite useful to begin with. Example definition file (add.thrift) namespace java com.eviac.blog.samples.thrift.server // defines the namespace typedef i32 int //typedefs to get convenient names for your types service AdditionService { // defines the service to add two numbers int add(1:int n1, 2:int n2), //defines a method } Compiling the Thrift definition file To compile the .thrift file, use the following command: thrift --gen <language> For my example the command is: thrift --gen java add.thrift After running the command, inside the gen-java directory you'll find the generated source code, which is useful for building RPC clients and servers. In my example it will create a Java file called AdditionService.java Writing a service handler The service handler class is required to implement the AdditionService.Iface interface. Example service handler (AdditionServiceHandler.java) package com.eviac.blog.samples.thrift.server; import org.apache.thrift.TException; public class AdditionServiceHandler implements AdditionService.Iface { @Override public int add(int n1, int n2) throws TException { return n1 + n2; } } Writing a simple server Following is example code to start a simple Thrift server. To enable the multithreaded server, uncomment the commented parts of the example code. Example server (MyServer.java) package com.eviac.blog.samples.thrift.server; import org.apache.thrift.transport.TServerSocket; import org.apache.thrift.transport.TServerTransport; import org.apache.thrift.server.TServer; import org.apache.thrift.server.TServer.Args; import org.apache.thrift.server.TSimpleServer; public class MyServer { public static void StartsimpleServer(AdditionService.Processor processor) { try { TServerTransport serverTransport = new TServerSocket(9090); TServer server = new TSimpleServer( new Args(serverTransport).processor(processor)); // Use this for a multithreaded server // TServer server = new TThreadPoolServer(new // TThreadPoolServer.Args(serverTransport).processor(processor)); System.out.println("Starting the simple server..."); server.serve(); } catch (Exception e) { e.printStackTrace(); } } public static void main(String[] args) { StartsimpleServer(new AdditionService.Processor(new AdditionServiceHandler())); } } Writing the client Following is example Java client code which consumes the service provided by AdditionService. 
Example client code (AdditionClient.java) package com.eviac.blog.samples.thrift.client; import org.apache.thrift.TException; import org.apache.thrift.protocol.TBinaryProtocol; import org.apache.thrift.protocol.TProtocol; import org.apache.thrift.transport.TSocket; import org.apache.thrift.transport.TTransport; import org.apache.thrift.transport.TTransportException; public class AdditionClient { public static void main(String[] args) { try { TTransport transport; transport = new TSocket("localhost", 9090); transport.open(); TProtocol protocol = new TBinaryProtocol(transport); AdditionService.Client client = new AdditionService.Client(protocol); System.out.println(client.add(100, 200)); transport.close(); } catch (TTransportException e) { e.printStackTrace(); } catch (TException x) { x.printStackTrace(); } } } Run the server code (MyServer.java). It should output the following and will listen for requests: Starting the simple server... Then run the client code (AdditionClient.java). It should output the following: 300
July 16, 2012
by Pavithra Gunasekara
· 43,270 Views · 2 Likes
JMS With ActiveMQ
Java Message Service is a mechanism for integrating applications in a loosely coupled, flexible manner and delivers data asynchronously across applications.
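A minimal sketch of that idea using ActiveMQ's JMS implementation is shown below (the broker URL and queue name are illustrative): one application drops a message on a queue and returns immediately, and a consumer elsewhere picks it up whenever it is ready.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

import org.apache.activemq.ActiveMQConnectionFactory;

// Sends a single text message to a queue; a consumer can receive it asynchronously later.
public class SimpleProducer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue queue = session.createQueue("example.queue");
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage("Hello from JMS");
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}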
July 14, 2012
by Pavithra Gunasekara
· 159,880 Views · 13 Likes
Dependency Convergence in Maven
I was running into a problem with a Java project that occurred only in IntelliJ IDEA, but not on the command line, when running specific test classes in Maven. The exception stack trace had the following in it: Caused by: com.sun.jersey.api.container.ContainerException: No WebApplication provider is present That seems like an easy problem to fix - it is the exception message that is given when Jersey can’t find the provider for JAX-RS. Fixing it is normally just a matter of making sure jersey-core is on the classpath to fulfill SPI requirements for JAX-RS. For some reason though this isn’t happening in IntelliJ IDEA. I inspected the log output of the test run and it is quite clear that all of the Jersey dependencies are on the classpath. Then it dawned on me to try running mvn dependency:tree from inside of IDEA. Here is what I found:
[INFO] +- org.mule.modules:mule-module-jersey:jar:3.2.1:provided
[INFO] | +- com.sun.jersey:jersey-server:jar:1.6:provided
[INFO] | +- com.sun.jersey:jersey-json:jar:1.6:provided
[INFO] | | +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:provided
[INFO] | | \- org.codehaus.jackson:jackson-xc:jar:1.7.1:provided
[INFO] | +- com.sun.jersey:jersey-client:jar:1.6:provided
[INFO] | \- org.codehaus.jackson:jackson-jaxrs:jar:1.8.0:provided
...
[INFO] +- org.jclouds.driver:jclouds-sshj:jar:1.4.0-rc.3:compile
[INFO] | +- org.jclouds:jclouds-compute:jar:1.4.0-rc.3:compile
[INFO] | | \- org.jclouds:jclouds-scriptbuilder:jar:1.4.0-rc.3:compile
[INFO] | +- org.jclouds:jclouds-core:jar:1.4.0-rc.3:compile
[INFO] | | +- net.oauth.core:oauth:jar:20100527:compile
[INFO] | | +- com.sun.jersey:jersey-core:jar:1.11:compile
[INFO] | | +- com.google.inject.extensions:guice-assistedinject:jar:3.0:compile
Notice how I have jersey-core 1.11 coming from jclouds-core but jersey 1.6 everywhere else. That, my friends, is a dependency convergence problem. Maven with its default set of plugins (read: no maven-enforcer-plugin) does not even warn you if something like this happens. In this case, somehow jclouds-core depends directly on jersey-core and happens to resolve the dependency to the version that jclouds-core declared first, before jersey-core can be resolved as a transitive dependency of mule-module-jersey. To fix the symptom, all I had to do was add the jersey-core dependency explicitly as a top-level dependency in my pom: com.sun.jersey jersey-core ${jersey.version} provided But doing so only fixes the symptom, not the problem. The real problem is that the Maven project I’m working on does not presently attempt to detect or resolve dependency convergence problems. This is where the maven-enforcer-plugin comes in handy. You can have the enforcer plugin run the DependencyConvergence rule against your build and have it fail when you have potential conflicts in your transitive dependencies that you haven’t resolved through exclusions or declaring direct dependencies yet. Binding the maven-enforcer-plugin to your build would look something like this: org.apache.maven.plugins maven-enforcer-plugin 1.0.1 enforce enforce validate ... I chose to bind to the validate phase since that is the first phase to be run in the Maven lifecycle. 
Now my build fails immediately and contains very useful output that looks like the following: Dependency convergence error for org.codehaus.jackson:jackson-jaxrs:1.7.1 paths to dependency are: +-com.nodeable:server:1.0-SNAPSHOT +-org.mule.modules:mule-module-jersey:3.2.1 +-com.sun.jersey:jersey-json:1.6 +-org.codehaus.jackson:jackson-jaxrs:1.7.1 and +-com.nodeable:server:1.0-SNAPSHOT +-org.mule.modules:mule-module-jersey:3.2.1 +-org.codehaus.jackson:jackson-jaxrs:1.8.0 There are many rules you can apply besides DependencyConvergence. However, if the output from the DependencyConvergence rule looks anything like mine does presently, it might take you a while before you get around to getting your maven build to pass and conform to other rules.
July 11, 2012
by Jason Whaley
· 24,415 Views · 1 Like
Creating Custom Login Modules In JBoss AS 7 (and Earlier)
JBoss AS 7 is neat but the documentation is still quite lacking (and error messages not as useful as they could be). This post summarizes how you can create your own Java EE-compliant login module for authenticating users of your webapp deployed on JBoss AS. A working elementary username-password module is provided. Why use Java EE standard authentication? Java EE security primer A part of the Java EE specification is security for web and EE applications, which makes it possible both to specify declarative constraints in your web.xml (such as “role X is required to access resources at URLs “/protected/*”) and to control it programmatically, i.e. verifying that the user has a particular role (see HttpServletRequest.isUserInRole). It works as follows: You declare in your web.xml: Login configuration – primarily whether to use a browser prompt (basic) or a custom login form, and a name for the login realm The custom form uses “magic” values for the post action and the fields, starting with j_, which are intercepted and processed by the server The roles used in your application (typically you’d have something like “user” and perhaps “admin”) What roles are required for accessing particular URL patterns (default: none) Whether HTTPS is required for some parts of the application You tell your application server how to authenticate users for that login realm, usually by associating its name with one of the available login modules in the configuration (the modules ranging from a simple file-based user list to LDAP and Kerberos support). Only rarely do you need to create your own login module, the topic of this post. If this is new to you then I strongly recommend reading The Java EE 5 Tutorial – Examples: Securing Web Applications (Form-Based Authentication with a JSP Page incl. security constraint specification, Basic Authentication with JAX-WS, Securing an Enterprise Bean, Using the isCallerInRole and getCallerPrincipal Methods). Why bother? Declarative security is nicely decoupled from the business code It’s easy to propagate security information between a webapp and for example EJBs (where you can protect a complete bean or a particular method declaratively via xml or via annotations such as @RolesAllowed) It’s easy to switch to a different authentication mechanism such as LDAP and it’s more likely that SSO will be supported Custom login module implementation options If one of the login modules (part of a security domain) provided out of the box with JBoss, such as UsersRoles, Ldap, Database, Certificate, isn’t sufficient for you then you can adjust one of them or implement your own. You can: Extend one of the concrete modules, overriding one or some of its methods to adjust them to your needs – see, for example, how to override the DatabaseServerLoginModule to specify your own encryption of the stored passwords. This should be your primary choice, if possible. Subclass UsernamePasswordLoginModule Implement javax.security.auth.spi.LoginModule if you need maximal flexibility and portability (this is a part of Java EE, namely JAAS, and is quite complex) JBoss EAP 5 Security Guide Ch. 12.2. Custom Modules has an excellent description of the basic modules (AbstractServerLoginModule, UsernamePasswordLoginModule) and how to proceed when subclassing them or any other standard module, including a description of the key methods to implement/override. You must read it. (The guide is still perfectly applicable to JBoss AS 7 in this regard.) 
The custom JndiUserAndPass module example, extending UsernamePasswordLoginModule, is also worth reading – it uses module options and JNDI lookup. Example: Custom UsernamePasswordLoginModule subclass See the source code of MySimpleUsernamePasswordLoginModule that extends JBoss’ UsernamePasswordLoginModule. The abstract UsernamePasswordLoginModule (source code) works by comparing the password provided by the user for equality with the password returned from the method getUsersPassword, implemented by a subclass. You can use the method getUsername to obtain the user name of the user attempting login. Implement abstract methods getUsersPassword() Implement getUsersPassword() to look up the user’s password wherever you have it. If you do not store passwords in plain text then read how to customize the behavior via other methods below getRoleSets() Implement getRoleSets() (from AbstractServerLoginModule) to return at least one group named “Roles” and containing 0+ roles assigned to the user, see the implementation in the source code for this post. Usually you’d look up the roles for the user somewhere (instead of returning a hardcoded “user_role” role). Optionally extend initialize(..) to get access to module options etc. Usually you will also want to extend initialize(Subject subject, CallbackHandler callbackHandler, Map sharedState, Map options) (called for each authentication attempt): To get values of properties declared via the element in the security-domain configuration – see the JBoss 5 custom module example To do other initialization, such as looking up a data source via JNDI – see the DatabaseServerLoginModule Optionally override other methods to customize the behavior If you do not store passwords in plain text (a wise choice!) and your hashing method isn’t supported out of the box then you can override createPasswordHash(String username, String password, String digestOption) to hash/encrypt the user-supplied password before comparison with the stored password. Alternatively you could override validatePassword(String inputPassword, String expectedPassword) to do whatever conversion on the password before comparison or even do a different type of comparison than equality. Custom login module deployment options In JBoss AS you can: Deploy your login module class in a JAR as a standalone module, independently of the webapp, under /modules/, together with a module.xml – described at JBossAS7SecurityCustomLoginModules Deploy your login module class as a part of your webapp (no module.xml required) In a JAR inside WEB-INF/lib/ Directly under WEB-INF/classes In each case you have to declare a corresponding security-domain inside the JBoss configuration (standalone/configuration/standalone.xml or domain/configuration/domain.xml): The code attribute should contain the fully qualified name of your login module class and the security-domain’s name must match the declaration in jboss-web.xml: form-auth true The code Download the webapp jboss-custom-login containing the custom login module MySimpleUsernamePasswordLoginModule and follow the deployment instructions in the README.
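For orientation, a custom module of the kind described above tends to look roughly like the minimal sketch below; the package, class name, and the hardcoded password and role are placeholders, and the article's own MySimpleUsernamePasswordLoginModule lives in the linked download.

package example.security;  // illustrative package, not the article's

import java.security.acl.Group;
import javax.security.auth.login.LoginException;

import org.jboss.security.SimpleGroup;
import org.jboss.security.SimplePrincipal;
import org.jboss.security.auth.spi.UsernamePasswordLoginModule;

// Minimal username/password module: every user has the password "secret"
// and gets the single role "user_role". Real modules would look these up.
public class ExampleUsernamePasswordLoginModule extends UsernamePasswordLoginModule {

    @Override
    protected String getUsersPassword() throws LoginException {
        // getUsername() gives the name of the user attempting to log in
        return "secret";
    }

    @Override
    protected Group[] getRoleSets() throws LoginException {
        SimpleGroup roles = new SimpleGroup("Roles");   // the group name must be "Roles"
        roles.addMember(new SimplePrincipal("user_role"));
        return new Group[] { roles };
    }
}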
July 4, 2012
by Jakub Holý
· 31,642 Views
Using the JavaFX AnimationTimer
In retrospect it was probably not a good idea to give the AnimationTimer its name, because it can be used for much more than just animation: measuring the fps-rate, collision detection, calculating the steps of a simulation, the main loop of a game etc. In fact, most of the time I saw AnimationTimer in action it was not related to animation at all. Nevertheless there are cases when you want to consider using an AnimationTimer for your animation. This post will explain the class and show an example where AnimationTimer is used to calculate animations. The AnimationTimer provides an extremely simple, but very useful and flexible feature. It allows you to specify a method that will be called in every frame. What this method is used for is not limited and, as already mentioned, does not have to have anything to do with animation. The only requirement is that it has to return fast, because otherwise it can easily become the bottleneck of a system. To use it, a developer has to extend AnimationTimer and implement the abstract method handle(). This is the method that will be called in every frame while the AnimationTimer is active. A single parameter is passed to handle(). It contains the current time in nanoseconds, the same as what you would get when calling System.nanoTime(). Why should one use the passed-in value instead of calling System.nanoTime() or its little brother System.currentTimeMillis() oneself? There are several reasons, but the most important is probably that it makes your life a lot easier while debugging. If you ever tried to debug code that depended on these two methods, you know that you are basically screwed. But the JavaFX runtime goes into a paused state while it is waiting to execute the next step during debugging, and the internal clock does not proceed during this pause. In other words, no matter if you wait two seconds or two hours before you resume a halted program while debugging, the increment of the parameter will roughly be the same! AnimationTimer has two methods, start() and stop(), to activate and deactivate it. If you override them, it is important that you call these methods in the super class. The Animation API comes with many feature-rich classes that make defining an animation very simple. There are predefined Transition classes, it is possible to define a key-frame based animation using Timeline, and one can even write a custom Transition easily. But in which cases does it make sense to use an AnimationTimer instead? – Almost always you want to use one of the standard classes. But if you want to specify many simple animations, using an AnimationTimer can be the better choice. The feature richness of the standard animation classes comes with a price. Every single animation requires a whole bunch of variables to be tracked – variables that you often do not need for simple animations. Plus these classes are optimized for speed, not for a small memory footprint. Some of the variables are stored twice, once in the format the public API requires and once in a format that helps faster calculation while playing. Below is a simple example that shows a star field. It animates thousands of rectangles flying from the center to the outer edges. Using an AnimationTimer allows you to store only the values that are needed. The calculation is extremely simple compared to the calculation within a Timeline, for example, because no advanced features (loops, animation rate, direction etc.) have to be considered. 
package fxsandbox; import java.util.Random; import javafx.animation.AnimationTimer; import javafx.application.Application; import javafx.scene.Group; import javafx.scene.Node; import javafx.scene.Scene; import javafx.scene.paint.Color; import javafx.scene.shape.Rectangle; import javafx.stage.Stage; public class FXSandbox extends Application { private static final int STAR_COUNT = 20000; private final Rectangle[] nodes = new Rectangle[STAR_COUNT]; private final double[] angles = new double[STAR_COUNT]; private final long[] start = new long[STAR_COUNT]; private final Random random = new Random(); @Override public void start(final Stage primaryStage) { for (int i=0; i
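Separately from the star-field listing above, the fps-measurement use case mentioned at the beginning of the article can be sketched in a few lines (the class name is illustrative):

import javafx.animation.AnimationTimer;

// Counts how many frames are rendered during each one-second window.
public class FpsCounter extends AnimationTimer {

    private long lastSecond = -1;
    private int frames;

    @Override
    public void handle(long now) {   // 'now' is in nanoseconds, as passed by the runtime
        if (lastSecond < 0) {
            lastSecond = now;
        }
        frames++;
        if (now - lastSecond >= 1000000000L) {
            System.out.println("fps: " + frames);
            frames = 0;
            lastSecond = now;
        }
    }
}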
June 27, 2012
by Michael Heinrichs
· 17,680 Views
Using Cookies to implement a RememberMe functionality
Some web applications may need a "Remember Me" functionality. This means that, after a user logs in, the user will have access from the same machine to all their data even after the session expires. This access will be possible until the user logs out. If you are using Spring and its login form, then you should use the "Remember Me" functionality already implemented inside the framework. Some web frameworks also offer a type of SignIn panel which already has "remember me" built in. But in case you have to implement "Remember Me" functionality on your own, this can be easily achieved using cookies. Java has a Cookie class named javax.servlet.http.Cookie. The algorithm is straightforward: - your login panel must contain a "Remember Me" check - after a successful login with the "Remember Me" check selected, you can create two cookies: one to keep the value for rememberMe and one to keep a token which identifies the logged-in user. For the sake of security, this token must never contain the user name or password. The idea is to generate a random id as the token value. The token value, along with the user id, must be saved in your storage (database) - whenever a login is needed, you have to check whether there is any cookie saved by you, and if so and your "rememberMe" value is true, you can take the user from storage based on your token and do an automatic login - when a logout is done, you have to delete the cookie that keeps the token To add a cookie, you have to specify the maximum age of the cookie in seconds: HttpServletResponse servletResponse = ...; Cookie c = new Cookie(COOKIE_NAME, encodeString(uuid)); c.setMaxAge(365 * 24 * 60 * 60); // one year servletResponse.addCookie(c); To delete a cookie, you have to find the cookie by name and set its maximum age to 0, before adding it to the servlet response: HttpServletRequest servletRequest = ...; HttpServletResponse servletResponse = ... ; Cookie[] cookies = servletRequest.getCookies(); for (int i = 0; i < cookies.length; i++) { Cookie c = cookies[i]; if (c.getName().equals(COOKIE_NAME)) { c.setMaxAge(0); c.setValue(null); servletResponse.addCookie(c); } }
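A sketch of the token side of that algorithm could look like the following; the TokenStore interface stands in for whatever table or storage you use to map tokens to users, and all names are illustrative:

import java.util.UUID;
import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class RememberMeSupport {

    private static final String COOKIE_NAME = "REMEMBER_ME_TOKEN";

    // On successful login with "remember me" checked: generate a random token,
    // persist (token -> userId) and hand the token to the browser.
    public static void rememberUser(HttpServletResponse response, String userId, TokenStore store) {
        String token = UUID.randomUUID().toString();
        store.save(token, userId);
        Cookie cookie = new Cookie(COOKIE_NAME, token);
        cookie.setMaxAge(365 * 24 * 60 * 60);   // one year
        response.addCookie(cookie);
    }

    // On a new request without a session: resolve the cookie back to a user, if any.
    public static String findRememberedUser(HttpServletRequest request, TokenStore store) {
        Cookie[] cookies = request.getCookies();
        if (cookies == null) {
            return null;
        }
        for (Cookie cookie : cookies) {
            if (COOKIE_NAME.equals(cookie.getName())) {
                return store.findUserIdByToken(cookie.getValue());
            }
        }
        return null;
    }

    // Placeholder abstraction over the storage holding (token, userId) pairs.
    public interface TokenStore {
        void save(String token, String userId);
        String findUserIdByToken(String token);
    }
}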
June 26, 2012
by Mihai Dinca-Panaitescu
· 57,973 Views · 1 Like
JAX-WS Header: Part 1 the Client Side
Manipulating the JAX-WS header on the client side, such as adding a WSS username token or logging the SOAP message.
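For illustration, one common way to do this on the client is a JAX-WS SOAPHandler that adds a WS-Security UsernameToken header to outgoing requests. The sketch below is a generic example, not necessarily the approach taken in the article; the credentials and handler registration are placeholders.

import java.util.Collections;
import java.util.Set;
import javax.xml.namespace.QName;
import javax.xml.soap.SOAPElement;
import javax.xml.soap.SOAPEnvelope;
import javax.xml.soap.SOAPHeader;
import javax.xml.ws.handler.MessageContext;
import javax.xml.ws.handler.soap.SOAPHandler;
import javax.xml.ws.handler.soap.SOAPMessageContext;

public class UsernameTokenHandler implements SOAPHandler<SOAPMessageContext> {

    private static final String WSSE_NS =
            "http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd";

    @Override
    public boolean handleMessage(SOAPMessageContext context) {
        boolean outbound = (Boolean) context.get(MessageContext.MESSAGE_OUTBOUND_PROPERTY);
        if (!outbound) {
            return true;   // only touch outgoing requests, not responses
        }
        try {
            SOAPEnvelope envelope = context.getMessage().getSOAPPart().getEnvelope();
            SOAPHeader header = envelope.getHeader();
            if (header == null) {
                header = envelope.addHeader();
            }
            SOAPElement security = header.addChildElement("Security", "wsse", WSSE_NS);
            SOAPElement token = security.addChildElement("UsernameToken", "wsse", WSSE_NS);
            token.addChildElement("Username", "wsse", WSSE_NS).addTextNode("demo-user");
            token.addChildElement("Password", "wsse", WSSE_NS).addTextNode("demo-password");
        } catch (Exception e) {
            throw new RuntimeException("Could not add WSS header", e);
        }
        return true;
    }

    @Override
    public boolean handleFault(SOAPMessageContext context) { return true; }

    @Override
    public void close(MessageContext context) { }

    @Override
    public Set<QName> getHeaders() { return Collections.emptySet(); }
}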
June 25, 2012
by Slim Ouertani
· 89,325 Views
Java Volatile Keyword Explained by Example
Check out an example of the volatile Java keyword.
June 21, 2012
by Thibault Delor
· 259,899 Views · 20 Likes