Language Resources

The Latest Languages Topics

Glassfish 4 - Performance Tuning, Monitoring and Troubleshooting

This is the third blog in the C2B2 series looking at Glassfish 4. The previous two are available here:

Part 1 - Getting started with Glassfish 4
Part 2 - Glassfish 4 - Features For High Availability

In this blog I will look at three areas:

Performance Tuning, where I will look at some of the areas to consider when setting up a system for production.
Monitoring, where I will look at some of the tools we use for monitoring a system, both during performance testing and tuning and once a system is up and running.
Troubleshooting, where I will look at some of the tools you can use to help diagnose and detect performance issues.

Performance Tuning

Glassfish out of the box (as with most app servers) is optimised for development purposes. Developers want the ability to deploy and undeploy continuously, create and remove resources, debug, etc. However, this configuration is not suitable for a production system.

When configuring any application server you have to take into account what you are trying to achieve and what is best suited for the applications you intend to run. One size does not fit all! It can be a long and complex process and I'm afraid I can't give you a one-stop solution. However, I can give you some pointers to some of the things you can do to prepare your system for production.

So, what kind of things do we look at when we are looking to performance tune a Glassfish system? Some of the most common things are:

JVM Settings
Garbage Collection
Glassfish Settings
Logging

JVM Settings

The standard JVM defaults are not suitable for a production system. One of the simplest changes that can be made is to use the -server flag, rather than the default -client. Although the Server and Client VMs are similar, the Server VM has been specially tuned to maximise peak operating speed. It is intended for executing long-running server applications, which need the fastest possible operating speed more than a fast start-up time or smaller runtime memory footprint.
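One way to confirm which VM a process actually ended up with is to ask the JVM itself. The following is only a minimal sketch: the class name JvmModeCheck is made up for illustration, it inspects the JVM it runs in (not a remote Glassfish instance), and it assumes a HotSpot-style JVM whose reported name contains "Server" or "Client".

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class, not part of Glassfish or the original post.
public class JvmModeCheck {

    /** Name of the running VM as reported by the standard RuntimeMXBean. */
    static String vmName() {
        return ManagementFactory.getRuntimeMXBean().getVmName();
    }

    /** The flags this JVM was started with, e.g. -server or -Xmx settings. */
    static List<String> startupFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        String name = vmName();
        System.out.println("VM name: " + name);
        // Assumption: HotSpot-based VMs include "Server" in the name when the Server VM is active.
        System.out.println("Looks like a Server VM: " + name.contains("Server"));
        for (String flag : startupFlags()) {
            System.out.println("  started with: " + flag);
        }
    }
}
```

Run from code deployed inside the server, the same two calls would show the flags the server process itself picked up.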
Allocate more memory to the JVM by modifying the value of the -Xmx flag. How much depends on the size and complexity of your enterprise application and how much memory you have available. In addition we also want to make sure we allocate all of the memory on startup. This is done with the -Xms flag, set to the same value as -Xmx. We also set the minimum and maximum perm gen sizes (-XX:PermSize and -XX:MaxPermSize) to the same value in order to avoid allocation failures and subsequent full garbage collections.

Garbage Collection

There are a number of settings that can be tweaked regarding Garbage Collection. I'm not going to cover GC tuning as that is a whole topic all of its own, but here are some of the settings we would always recommend regarding GC in a production environment.

Firstly we want to ensure we log all Garbage Collection information as this can prove extremely useful in diagnosing issues.

    -verbose:gc

Next we want to make sure we log GC information to a file. This will make it easier to separate the GC from other details in the log files.

    -Xloggc:/path_to_log_file/gc.log

We also want to ensure we have as much detail as possible

    -XX:+PrintGCDetails

and that the information is timestamped for easier diagnosis of long running errors and to be able to ascertain what normal levels are over time.

    -XX:+PrintGCDateStamps

Finally, we want to ensure that developers aren't making explicit calls to System.gc(). Hopefully they don't anyway, and if they are you need to look into why (doing so is a bad idea since this forces major collections), but this will disable it just in case.

    -XX:+DisableExplicitGC

Heap Dumps

Heap dumps can be extremely useful for diagnosing memory issues. There are two settings we would definitely recommend. These tell the JVM to generate a heap dump when an allocation from the Java heap or the permanent generation cannot be satisfied. There is no overhead in running with these options but they can be useful for production systems where OutOfMemoryErrors can take a long time to surface.
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:HeapDumpPath=/opt/dumps/glassfish.hprof

Configuring Glassfish

There are three ways to configure Glassfish:

Through the admin console
By directly editing the config files
Using the asadmin tool

Although making changes through the admin console can often be the easiest way to make changes, we'd recommend scripting all changes where possible so you have a repeatable production server build. You should also ensure copies of all config files are kept under configuration control so you know you have a working copy and can roll back to a previous version when needed.

Turn off development features

Turn off auto-deploy and dynamic application reloading. Both of these features are great for development, but can affect performance.

Configure the JSP servlet not to check JSP files for changes on every request. Also, set the parameter genStrAsCharArray to true. This will ensure all String values are declared as static char arrays. One reason for this is that the array has less memory overhead than String.

These changes will mean you cannot change JSP pages on your production server without redeploying the application, but on a production system this is generally what you want.

Acceptor Threads and Request Threads

There are two main thread values we would recommend setting: acceptor threads and request threads.

Acceptor threads are used to accept new connections to the server and to schedule existing connections when a new request comes in. Set this value equal to the number of CPU cores in your server. So, if you have two quad core CPUs, this value should be set to eight.

Request threads run HTTP requests. You want enough of these to keep the machine busy, but not so many that they compete for CPU resources, which would cause your throughput to suffer greatly.

Static resources

By default, GlassFish does not tell the client to cache static resources.
It is recommended to cache static resources, like CSS files and images, particularly if you have a lot of them.

Thread pools

Max thread pool size and min thread pool size should be set to the same value. Specifying the same value will allow GlassFish to use a slightly more optimised thread pool. This configuration should be considered unless the load on the server varies significantly. Increasing this value will reduce HTTP response latency times.

What to set these values to depends heavily on what your application is doing. In order to get this value right you should increase the thread count incrementally and monitor performance after each increase. When performance stops improving, stop increasing the thread count.

Logging

You should look to turn off as much logging as possible. In a production environment we would generally recommend logging at WARNING and above. This includes the logging done by Glassfish as well as your own applications.

Monitoring

The fewer monitoring options that are enabled, the better the server's performance. All Glassfish monitoring is turned off by default. Switching monitoring on can be very useful when diagnosing issues, and during initial system testing and performance tuning to see the effect of each change.

What to monitor

Used Heap Size - Compare this number with the maximum allowed heap size to see what portion of the heap is in use. If the used heap size nears the max heap size, the garbage collector urgently attempts to free memory and this is something that should be avoided where possible.
Number of loaded classes - Useful for detecting performance and application development trends.
JVM Threads - Important for performance tuning and for troubleshooting JVM crashes. Some of the most essential indicators are the current active JVM thread count and the peak values.
Thread pools - You should compare a pool's current usage with the maximum number allowed.
Problems can start to occur when the current count nears the max threads number.

JVM Tools for Monitoring

The following is a list of a few of the tools that come with the JDK that are useful for monitoring information from the JVM.

jstat - This tool displays performance statistics regarding usage of the perm gen, new gen and old gen. It also provides class loading and compilation statistics.
jmap - Gives you visibility of memory usage, can produce a class histogram and can dump the memory to a file.
jconsole/jvisualvm - These tools can display all the previously mentioned monitoring indicators and graph them over time. This allows you to spot trends and to get a better overall picture of your normal performance levels and changes over time.

Note - These should NOT be left running permanently on a production system!

Troubleshooting

Unfortunately, no matter how much tuning and testing you do, all systems WILL go wrong from time to time. So, what should you do when your production server bursts into flames? Well, in that situation you should call the fire service, but for more general problems:

Gather data - Get as much data as you can, there is no such thing as too much!
Analyse that data - Data is worthless when you don't know what it means.
Visualise where possible - Graphs and charts reveal trends and patterns over time.
Make educated decisions - Only make decisions based on data. If you go with your "gut instinct" and what "feels right" you will probably make things worse.

Gathering data

First up, for most of the JVM tools you will need the process ID of the server. You can get this information in various ways. Two of the simplest are:

    jps -v

This will list all currently running Java processes. The -v flag is for verbose output.

    ps aux | grep glassfish

The ps command with the options aux will show all processes from all users.
This will display a LOT of information, so pipe it through grep to filter for the glassfish process.

As mentioned earlier, the jstat tool can be used for gathering info on JVM performance. Other useful tools include:

jstack
This will produce thread stack dumps for all threads running in the JVM. This can be very useful for discovering stuck threads or long running threads.

jmap
This tool can be used to create a heap dump. It outputs to a file in .hprof format which can be read by a number of analysis tools.

jrcmd and jrmc
These tools are only available with the jRockit JDK. I won't go into any detail here as I have previously blogged about jrcmd here: http://blog.c2b2.co.uk/2012/11/troubleshooting-jrockit-using-jrcmd.html and my colleague has blogged about jrmc here: http://blog.c2b2.co.uk/2012/10/weblogic-troubleshooting-with-jrockit.html

Glassfish asadmin
The Glassfish asadmin tool has a built-in command which will provide similar functionality to the above tools but without the need for the PID.

    asadmin generate-jvm-report --type=[type]

Analysing the data

There are various tools available for analysing performance data. The following are some of the most useful:

IBM Support Assistant is a free troubleshooting application that helps you research, analyze, and resolve problems using various support features and tools. It contains a Garbage Collection and Memory Visualiser as well as a Heap Analyser. It will also provide a report telling you where issues might exist, and listing red flags with advice on what to change in your applications.

jRockit Mission Control is a very powerful tool which can be used to monitor live systems or analyse historical data in the form of flight recordings.

JVisualVM GCViewer is an optional plugin for jVisualVM which can transform a tool which is already great for live monitoring into a powerful analysis tool.

jhat is a Java Heap Analysis Tool. It processes heap dump files and produces HTML reports.
There are better analysis tools, but it's always freely available if you're running a JDK.

Others
There are many open source and freely available tools and projects to help you. Here we've covered some very common and widely used ones, but our list is by no means exhaustive!

Conclusion

Remember, Glassfish out of the box (or out of the zip file!) is not designed to be run 'as is'. You should also note that there is no ideal configuration that will work for all systems. It will take time and effort to get the best configuration for what you require. Hopefully in this blog I have given you some useful guidelines and pointers.

You should take time to work out what you want in terms of services, then strip back your config to match that.
You should test, test and test again to ensure that your configuration matches the requirements with regards to the applications you will be running on your server.
You should tune your JVM to ensure you have the best settings for your particular configuration.
You should ensure you have monitoring in place to keep a check on everything and ensure that if your server does crash you have as much information as possible at hand to diagnose what caused it.

The next blog in this series looks at Migrating to Glassfish 4: http://blog.c2b2.co.uk/2013/07/glassfish-4-migrating-to-glassfish.html
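As a companion to the "What to monitor" list earlier in this post, the indicators it names (used heap against maximum heap, loaded classes, current and peak thread counts) can all be read through the standard java.lang.management beans. This is a minimal sketch rather than a monitoring solution: the class name JvmSnapshot is invented for illustration, and it reports on the JVM it runs in, so to watch a Glassfish instance it would have to run inside the server or be adapted to a remote JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper class; it only reads standard platform MXBeans.
public class JvmSnapshot {

    /** Fraction of the maximum heap in use, or -1 if no maximum is defined. */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when the JVM reports no maximum
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    /** Number of classes currently loaded. */
    static int loadedClasses() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    /** Current number of live threads. */
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Peak live thread count since the JVM started (or since the peak was reset). */
    static int peakThreads() {
        return ManagementFactory.getThreadMXBean().getPeakThreadCount();
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %.1f%% of max%n", heapUsedFraction() * 100);
        System.out.println("loaded classes: " + loadedClasses());
        System.out.println("threads: " + liveThreads() + " live, " + peakThreads() + " peak");
    }
}
```

Sampling these values on a schedule and graphing them gives the kind of trend view the post recommends, without leaving jconsole or jvisualvm attached to production.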

July 30, 2014

by Andy Overton

· 24,859 Views

AngularJS + TypeScript – How To Setup a Watch (And 2 Ways to Do it Wrong)

Introduction

After setting up my initial application as described in my previous post, I went about setting up a watch. For those who don't know what that is: it's basically a function that gets triggered when a scope object, or part of it, changes. I have found 4 ways to set it up, and only one seems to be (completely) right.

In JavaScript, you would set up a watch like this sample I nicked from Stack Overflow:

    function MyController($scope) {
        $scope.myVar = 1;

        $scope.$watch('myVar', function() {
            alert('hey, myVar has changed!');
        });

        $scope.buttonClicked = function() {
            $scope.myVar = 2; // This will trigger $watch expression to kick in
        };
    }

So how would you go about it in TypeScript? Turns out there are a couple of ways that compile but don't work, partially work, or have unexpected side effects. For my demonstration, I am going to use the DemoController that I made in my previous post.

Incorrect method #1 - 1:1 translation

    ///
    ///
    module App.Controllers {
        "use strict";
        export class DemoController {
            static $inject = ["$scope"];

            constructor(private $scope: Scope.IDemoScope) {
                if (this.$scope.person === null || this.$scope.person === undefined) {
                    this.$scope.person = new Scope.Person();
                }
                this.$scope.$watch(this.$scope.person.firstName, () => {
                    alert("person.firstName changed to " + this.$scope.person.firstName);
                });
            }

            public clear(): void {
                this.$scope.person.firstName = "";
                this.$scope.person.lastName = "";
            }
        }
    }

The new part is the $watch call in the constructor. Very cool: we even use the inline 'delegate-like' notation to define the handler inline. This seems plausible, but does not work. What it does is, on startup, give the message "person.firstName changed to undefined" and then it never, ever does anything again. The reason is that the first argument is evaluated once, so $watch receives the current value of firstName (undefined) rather than an expression it can re-evaluate. I have spent quite some time looking at this. Don't do the same - read on.
Incorrect method #2 - not catching the first call

To fix the problem above, you need to use the delegate notation at the start as well:

    this.$scope.$watch(() => this.$scope.person.firstName, () => {
        alert("person.firstName changed to " + this.$scope.person.firstName);
    });

See the difference? As you now type a "J" in the top text box, you immediately get a "person.firstName changed to J" alert. Making it almost impossible to type. But you get the drift.

But then we arrive at the next problem. This is still not correct: it goes off initially, when nothing has changed yet. This is undesirable on most occasions.

The correct way

It appears the callback actually takes a couple of parameters, of which I usually only use newValue and oldValue (in that order) to detect a real change. On the initial call Angular passes the same value for both, so comparing them filters that call out. Kinda like you do in an INotifyPropertyChanged property:

    this.$scope.$watch(() => this.$scope.person.firstName, (newValue: string, oldValue: string) => {
        if (oldValue !== newValue) {
            alert("person.firstName changed to " + this.$scope.person.firstName);
        }
    });

Now it only goes off when there's a real change in the watched property.

...and possibly an even better way

I am not really a fan of a lambda calling a lambda in a method call, so I would most probably refactor this to

    constructor(private $scope: Scope.IDemoScope) {
        if (this.$scope.person === null || this.$scope.person === undefined) {
            this.$scope.person = new Scope.Person();
        }
        this.$scope.$watch(() => this.$scope.person.firstName, (newValue: string, oldValue: string) => {
            this.tellmeItChanged(newValue, oldValue);
        });
    }

    private tellmeItChanged(newValue: string, oldValue: string) {
        if (oldValue !== newValue) {
            alert("person.firstName changed to " + this.$scope.person.firstName);
        }
    }

as I think this is just a bit more readable, especially if you are going to do more complex things in the callback.

Demo solution can be found here

July 28, 2014

by Joost van Schaik

· 14,831 Views

Data-driven Unit Testing in Java

Data-driven testing is a powerful way of testing a given scenario with different combinations of values. In this article, we look at several ways to do data-driven unit testing in JUnit.

Suppose, for example, you are implementing a Frequent Flyer application that awards status levels (Bronze, Silver, Gold, Platinum) based on the number of status points you earn. The number of points needed for each level is shown here:

    Initial level   Minimum status points   Result level
    Bronze          0                       Bronze
    Bronze          300                     Silver
    Bronze          700                     Gold
    Bronze          1500                    Platinum

Our unit tests need to check that we can correctly calculate the status level achieved when a frequent flyer earns a certain number of points. This is a classic problem where data-driven tests would provide an elegant, efficient solution.

Data-driven testing is well-supported in modern JVM unit testing libraries such as Spock and Specs2. However, some teams don't have the option of using a language other than Java, or are limited to using JUnit. In this article, we look at a few options for data-driven testing in plain old JUnit.

Parameterized Tests in JUnit

JUnit provides some support for data-driven tests, via the Parameterized test runner.
A simple data-driven test in JUnit using this approach might look like this:

    @RunWith(Parameterized.class)
    public class WhenEarningStatus {

        @Parameters(name = "{index}: {0} initially had {1} points, earns {2} points, should become {3} ")
        public static Iterable<Object[]> data() {
            return Arrays.asList(new Object[][]{
                    {Bronze, 0, 100, Bronze},
                    {Bronze, 0, 300, Silver},
                    {Bronze, 100, 200, Silver},
                    {Bronze, 0, 700, Gold},
                    {Bronze, 0, 1500, Platinum},
            });
        }

        private Status initialStatus;
        private int initialPoints;
        private int earnedPoints;
        private Status finalStatus;

        public WhenEarningStatus(Status initialStatus, int initialPoints, int earnedPoints, Status finalStatus) {
            this.initialStatus = initialStatus;
            this.initialPoints = initialPoints;
            this.earnedPoints = earnedPoints;
            this.finalStatus = finalStatus;
        }

        @Test
        public void shouldUpgradeStatusBasedOnPointsEarned() {
            FrequentFlyer member = FrequentFlyer.withFrequentFlyerNumber("12345678")
                                                .named("Joe", "Jones")
                                                .withStatusPoints(initialPoints)
                                                .withStatus(initialStatus);

            member.earns(earnedPoints).statusPoints();

            assertThat(member.getStatus()).isEqualTo(finalStatus);
        }
    }

You provide the test data in the form of a list of Object arrays, returned by the method marked with the @Parameters annotation. These object arrays contain the rows of test data that you use for your data-driven test. Each row is used to instantiate member variables of the class, via the constructor.

When you run the test, JUnit will instantiate and run a test for each row of data. You can use the name attribute of the @Parameters annotation to provide a more meaningful title for each test.

There are a few limitations to the JUnit parameterized tests. The most important is that, since the test data is defined at a class level and not at a test level, you can only have one set of test data per test class. Not to mention that the code is somewhat cluttered: you need to define member variables, a constructor, and so forth. Fortunately, there is a better option.
Using JUnitParams

A more elegant way to do data-driven testing in JUnit is to use JUnitParams (https://code.google.com/p/junitparams/). JUnitParams (see Maven Central, http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22JUnitParams%22, to find the latest version) is an open source library that makes data-driven testing in JUnit easier and more explicit.

A simple data-driven test using JUnitParams looks like this:

    @RunWith(JUnitParamsRunner.class)
    public class WhenEarningStatusWithJUnitParams {

        @Test
        @Parameters({
                "Bronze, 0, 100, Bronze",
                "Bronze, 0, 300, Silver",
                "Bronze, 100, 200, Silver",
                "Bronze, 0, 700, Gold",
                "Bronze, 0, 1500, Platinum"
        })
        public void shouldUpgradeStatusBasedOnPointsEarned(Status initialStatus, int initialPoints,
                                                           int earnedPoints, Status finalStatus) {
            FrequentFlyer member = FrequentFlyer.withFrequentFlyerNumber("12345678")
                                                .named("Joe", "Jones")
                                                .withStatusPoints(initialPoints)
                                                .withStatus(initialStatus);

            member.earns(earnedPoints).statusPoints();

            assertThat(member.getStatus()).isEqualTo(finalStatus);
        }
    }

Test data is defined in the @Parameters annotation, which is associated with the test itself, not the class, and passed to the test via method parameters. This makes it possible to have different sets of test data for different tests in the same class, or to mix data-driven tests with normal tests in the same class, which is a much more logical way of organizing your classes.
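Both tests above exercise a FrequentFlyer class that the article does not show. To make the data rows easier to read, here is a minimal sketch of the rule they check, assuming the thresholds from the table at the start of the article (0, 300, 700 and 1500 points). The class StatusSketch and the methods forPoints and afterEarning are hypothetical names invented for this illustration; they are not the author's implementation.

```java
// Hypothetical stand-in for the status logic that the tests above exercise.
public class StatusSketch {

    /** Status levels with the minimum points each requires (taken from the article's table). */
    enum Status {
        Bronze(0), Silver(300), Gold(700), Platinum(1500);

        final int minimumPoints;

        Status(int minimumPoints) {
            this.minimumPoints = minimumPoints;
        }

        /** Highest level whose threshold is met by the given point total. */
        static Status forPoints(int points) {
            Status result = Bronze;
            for (Status candidate : values()) {   // values() is in ascending threshold order
                if (points >= candidate.minimumPoints) {
                    result = candidate;
                }
            }
            return result;
        }
    }

    /** Status reached after earning points on top of an existing balance (assumed rule). */
    static Status afterEarning(int initialPoints, int earnedPoints) {
        return Status.forPoints(initialPoints + earnedPoints);
    }

    public static void main(String[] args) {
        // Same rows as the @Parameters data above.
        int[][] rows = { {0, 100}, {0, 300}, {100, 200}, {0, 700}, {0, 1500} };
        for (int[] row : rows) {
            System.out.println(row[0] + " + " + row[1] + " points -> " + afterEarning(row[0], row[1]));
        }
    }
}
```

Each row of test data then reads as "starting balance, points earned, expected level", which is what the parameter names in the test method describe.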
JUnitParams also lets you get test data from other methods, as illustrated here:

    @Test
    @Parameters(method = "sampleData")
    public void shouldUpgradeStatusFromEarnedPoints(Status initialStatus, int initialPoints,
                                                    int earnedPoints, Status finalStatus) {
        FrequentFlyer member = FrequentFlyer.withFrequentFlyerNumber("12345678")
                                            .named("Joe", "Jones")
                                            .withStatusPoints(initialPoints)
                                            .withStatus(initialStatus);

        member.earns(earnedPoints).statusPoints();

        assertThat(member.getStatus()).isEqualTo(finalStatus);
    }

    private Object[] sampleData() {
        return $(
                $(Bronze, 0, 100, Bronze),
                $(Bronze, 0, 300, Silver),
                $(Bronze, 100, 200, Silver)
        );
    }

The $ method provides a convenient short-hand to convert test data to the Object arrays that need to be returned.

You can also externalize the test data to a separate class:

    @Test
    @Parameters(source = StatusTestData.class)
    public void shouldUpgradeStatusFromEarnedPoints(Status initialStatus, int initialPoints,
                                                    int earnedPoints, Status finalStatus) {
        ...
    }

The test data here comes from a method in the StatusTestData class:

    public class StatusTestData {
        public static Object[] provideEarnedPointsTable() {
            return $(
                    $(Bronze, 0, 100, Bronze),
                    $(Bronze, 0, 300, Silver),
                    $(Bronze, 100, 200, Silver)
            );
        }
    }

This method needs to be static, return an object array, and start with the word "provide". Getting test data from external methods or classes in this way opens the way to retrieving test data from external sources such as CSV or Excel files.

JUnitParams provides a simple and clean way to implement data-driven tests in JUnit, without the overhead and limitations of the traditional JUnit parameterized tests.

Testing with non-Java languages

If you are not constrained to Java and/or JUnit, more modern tools such as Spock (https://code.google.com/p/spock/) and Specs2 provide great ways of writing clean, expressive unit tests in Groovy and Scala respectively.
In Groovy, for example, you could write a test like the following:

    class WhenEarningStatus extends Specification {

        def "should earn status based on the number of points earned"() {
            given:
                def member = FrequentFlyer.withFrequentFlyerNumber("12345678")
                                          .named("Joe", "Jones")
                                          .withStatusPoints(initialPoints)
                                          .withStatus(initialStatus);
            when:
                member.earns(earnedPoints).statusPoints()
            then:
                member.status == finalStatus
            where:
                initialStatus | initialPoints | earnedPoints | finalStatus
                Bronze        | 0             | 100          | Bronze
                Bronze        | 0             | 300          | Silver
                Bronze        | 100           | 200          | Silver
                Silver        | 0             | 700          | Gold
                Gold          | 0             | 1500         | Platinum
        }
    }

John Ferguson Smart is a specialist in BDD, automated testing, and software life cycle development optimization, and author of BDD in Action and other books. John runs regular courses in Australia, London and Europe on related topics such as Agile Requirements Gathering, Behaviour Driven Development, Test Driven Development, and Automated Acceptance Testing.

July 27, 2014

by John Ferguson Smart

· 24,714 Views · 1 Like

How to Instantly Improve Your Java Logging With 7 Logback Tweaks

The benchmark tests to help you discover how Logback performs under pressure

Logging is essential for server-side applications but it comes at a cost. It's surprising to see, though, how much impact small changes and configuration tweaks can have on an app's logging throughput. In this post we will benchmark Logback's performance in terms of log entries per minute. We'll find out which appenders perform best, what prudent mode is, and what some of the awesome side effects of async methods, sifting and console logging are. Let's get to it.

The groundwork for the benchmark

At its core, Logback is based on Log4j with tweaks and improvements under Ceki Gülcü's vision. Or as they say, a better Log4j. It features a native SLF4J API, faster implementation, XML configuration, prudent mode, and a set of useful appenders which I will elaborate on shortly.

Having said that, there are quite a few ways to log with the different sets of appenders, patterns and modes available in Logback. We took a set of commonly used combinations and put them to a test on 10 concurrent threads to find out which can run faster. The more log entries written per minute, the more efficient the method is and the more resources are free to serve users. It's not an exact science, but to be more precise we ran each test 5 times, removed the top and bottom outliers and took the average of the results. To try and be fair, all log lines written also had an equal length of 200 characters.

** All code is available on GitHub. The test was run on a Debian Linux machine running on an Intel i7-860 (4 cores @ 2.80 GHz) with 8 GB of RAM.

First benchmark: what's the cost of synchronous log files?

First we took a look at the difference between synchronous and asynchronous logging. Both write to a single log file: the FileAppender writes entries directly to the file, while the AsyncAppender feeds them to a queue which is then written to the file.
The default queue size is 256, and when it's 80% full it stops letting in new entries of lower levels (everything except WARN and ERROR). The benchmark compared the FileAppender with different queue sizes for the AsyncAppender. Async came out on top with the 500 queue size.

Tweak #1: AsyncAppender can be 3.7x faster than the synchronous FileAppender. Actually, it's the fastest way to log across all appenders.

The 500 queue size performed way better than the default configuration, which even trails behind the sync FileAppender that was supposed to finish last. So what might have happened? Since we're writing INFO messages, and doing so from 10 concurrent threads, the default queue size might have been too small and messages could have been lost to the default threshold. Looking at the results of the 500 and 1,000,000 queue sizes, you'll notice that their throughput was similar, so queue size and threshold weren't an issue for them.

Tweak #2: The default AsyncAppender can cause a 5-fold performance cut and even lose messages. Make sure to customize the queueSize and discardingThreshold according to your needs.

    <queueSize>500</queueSize>
    <discardingThreshold>0</discardingThreshold>

** Setting an AsyncAppender's queueSize and discardingThreshold

Second benchmark: do message patterns really make a difference?

Now we want to see the effect of log entry patterns on the speed of writing. To make this fair we kept the log line's length equal (200 characters) even when using different patterns. The default Logback entry includes the date, thread, level, logger name and message; by playing with it we tried to see what the effects on performance might be.

This benchmark demonstrates, and helps you see up close, the benefit of logger naming conventions. Just remember to change the logger's name to match the class you use it in.

Tweak #3: Naming the logger by class name provides a 3x performance boost.

Taking the logger's or the thread's name off added some 40k-50k entries per minute. No need to write information you're not going to use.
Going minimal also proved to be a bit more effective.

Tweak #4: Compared to the default pattern, using only the level and message fields provided 127k more entries per minute.

Third benchmark: dear prudence, won't you come out to play?

In prudent mode a single log file can be accessed from multiple JVMs. This of course takes a hit on performance because of the need to handle another lock. We tested prudent mode on 2 JVMs writing to a single file using the same benchmark we ran earlier. Prudent mode takes a hit as expected, although my first guess was that the impact would be stronger.

Tweak #5: Use prudent mode only when you absolutely need it, to avoid a throughput decrease.

    <file>logs/test.log</file>
    <prudent>true</prudent>

** Configuring prudent mode on a FileAppender

Fourth benchmark: how to speed up synchronous logging?

Let's see how synchronous appenders other than the FileAppender perform. The ConsoleAppender writes to System.out or System.err (defaults to System.out) and of course can also be piped to a file. That's how we were able to count the results. The SocketAppender writes to a specified network resource over a TCP socket. If the target is offline, the message is dropped. Otherwise, it's received as if it was generated locally. For the benchmark, the socket was sending data to the same machine, so we avoided network issues and concerns.

To our surprise, explicit file access through FileAppender is more expensive than writing to the console and piping it to a file. The same result, a different approach, and some 200k more log entries per minute. SocketAppender performed similarly to FileAppender in spite of adding serialization in between; the network resource, if it existed, would have borne most of the overhead.

Tweak #6: Piping ConsoleAppender to a file provided 13% higher throughput than using FileAppender.

Fifth benchmark: now can we kick it up a notch?

Another useful method we have in our toolbelt is the SiftingAppender. Sifting allows you to break the log into multiple files.
Our logic here was to create 4 separate logs, each holding the logs of 2 or 3 out of the 10 threads we run in the test. This is done by indicating a discriminator, in our case logid, which determines the file name of the logs:

    <discriminator>
        <key>logid</key>
        <defaultValue>unknown</defaultValue>
    </discriminator>
    <file>logs/sift-${logid}.log</file>
    <append>false</append>

** Configuring a SiftingAppender

Once again our FileAppender takes a beating. The more output targets, the less stress on the locks and the less context switching. The main bottleneck in logging, same as with the async example, proves to be synchronising on a file.

Tweak #7: Using a SiftingAppender can allow a 3.1x improvement in throughput.

Conclusion

We found that the way to achieve the highest throughput is by using a customized AsyncAppender. If you must use synchronous logging, it's better to sift through the results and use multiple files by some logic. I hope you've found the insights from the Logback benchmark useful, and I look forward to hearing your thoughts in the comments below.

Originally posted in Takipi's blog
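To make the message loss described under the first benchmark more concrete, here is a toy model of a bounded logging queue with a discarding threshold. It is a sketch for illustration only, not Logback code: the class AsyncQueueModel and its methods are invented names, no consumer thread drains the queue, and the default threshold of one fifth of the queue size is an assumption based on the "80% full" behaviour described in the post.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy model only; the real AsyncAppender also has a worker thread that drains the queue.
public class AsyncQueueModel {

    enum Level { TRACE, DEBUG, INFO, WARN, ERROR }

    private final BlockingQueue<String> queue;
    private final int discardingThreshold;

    /** Pass a negative threshold to use the assumed default of queueSize / 5. */
    AsyncQueueModel(int queueSize, int discardingThreshold) {
        this.queue = new ArrayBlockingQueue<>(queueSize);
        this.discardingThreshold = discardingThreshold < 0 ? queueSize / 5 : discardingThreshold;
    }

    /** Returns true if the event was queued, false if it was dropped. */
    boolean append(Level level, String message) {
        boolean nearlyFull = queue.remainingCapacity() < discardingThreshold;
        boolean discardable = level.compareTo(Level.INFO) <= 0; // TRACE, DEBUG, INFO
        if (nearlyFull && discardable) {
            return false; // low-priority event dropped to keep room for WARN and ERROR
        }
        // A full queue is reported as a drop here; this model never blocks the caller.
        return queue.offer(message);
    }

    /** Queues n INFO events with no consumer running and returns how many were accepted. */
    static int acceptedInfoEvents(int queueSize, int threshold, int n) {
        AsyncQueueModel model = new AsyncQueueModel(queueSize, threshold);
        int accepted = 0;
        for (int i = 0; i < n; i++) {
            if (model.append(Level.INFO, "line " + i)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println("default-style queue: " + acceptedInfoEvents(256, -1, 256) + " of 256 INFO events kept");
        System.out.println("queue of 500, threshold 0: " + acceptedInfoEvents(500, 0, 500) + " of 500 INFO events kept");
    }
}
```

With ten threads producing INFO lines faster than a single writer can drain them, a small queue spends much of its time in the nearly-full zone, which is consistent with the lost messages the benchmark observed and with tweak #2's advice to set the threshold to 0 when every message matters.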

July 25, 2014

by Chen Harel

· 20,624 Views

Swiss Java Knife - A useful tool to add to your diagnostic tool-kit?

As a support consultant at C2B2 I am always looking for handy tools that may be able to help me or my team in diagnosing our customers' middleware issues. So, when I came across a project called Swiss Java Knife promising tools for 'JVM monitoring, profiling and tuning' I figured I should take a look. It's basically a single jar file that allows you to run a number of tools, most of which are similar to the ones that come bundled with the JDK. If you're interested in those tools, my colleague Matt Brasier did a good introductory webinar which is available here: http://www.c2b2.co.uk/jvm_webinar_video

Downloading

Firstly I downloaded the latest jar file from GitHub: https://github.com/aragozin/jvm-tools

The source code is also available, but for the purposes of this look into what it can offer the jar will suffice.

What does it offer?

Swiss Java Knife offers a number of commands:

jps - Similar to the jps tool that comes with the JDK.
ttop - Similar to the Linux top command.
hh - Similar to running the jmap tool that comes with the JDK with the -histo option.
gc - Reports information about GC in real time.
mx - Allows you to do basic operations with MBeans from the command line.
mxdump - Dumps all MBeans of the target Java process to JSON.

Testing

In order to test out the commands that are available I set up a WebLogic server and deployed an app containing a number of servlets that have known issues. These are then called via JMeter to show certain server behaviour:

excessive Garbage Collection
high CPU usage
a memory leak

Finding the process ID

Normally to find the process ID I'd use the jps command that comes with the JDK. Swiss Java Knife has its own version of the jps command so I tried that instead.
Running the command:

    java -jar sjk-plus-0.1-2013-09-06.jar jps

gives the following output:

    5402 org.apache.derby.drda.NetworkServerControl start
    3250 weblogic.Server
    4032 ./ApacheJMeter.jar
    3172 weblogic.NodeManager -v
    5427 weblogic.Server
    6523 sjk-plus-0.1-2013-09-06.jar jps

Which is basically the same as running the jps command with the -l option. There are a couple of additions where you can add filter options, allowing you to pass in wild cards to match process descriptions or JVM system properties, but overall it adds very little to the standard jps tool. jps -lv will generally give you everything you need.

OK, so now we've got the process ID of our server we can start to look at what is going on. First of all, let's check garbage collection.

Checking garbage collection

OK. Now this one looks more promising. Swiss Java Knife has a command for collecting real-time GC statistics. Let's give it a go. Running the following command without my dodgy servlet running should give us a 'standard' reading:

    java -jar sjk-plus-0.1-2013-09-06.jar gc -p 3016

    [GC: PS Scavenge#10471 time: 6ms interval: 113738ms mem: PS Survivor Space: 0k+96k->96k[max:128k,rate:0.84kb/s] PS Old Gen: 78099k+0k->78099k[max:349568k,rate:0.00kb/s] PS Eden Space: 1676k-1676k->0k[max:174464k,rate:-14.74kb/s]]
    [GC: PS MarkSweep#10436 time: 192ms interval: 40070ms mem: PS Survivor Space: 96k-96k->0k[max:128k,rate:-2.40kb/s] PS Old Gen: 78099k+7k->78106k[max:349568k,rate:0.19kb/s] PS Eden Space: 0k+0k->0k[max:174400k,rate:0.00kb/s]]
    PS Scavenge[ collections: 31 | avg: 0.0057 secs | total: 0.2 secs ]
    PS MarkSweep[ collections: 9 | avg: 0.1980 secs | total: 1.8 secs ]

OK. Looks good. It is useful to be able to get runtime GC info without having to rely on GC logs, which are often not available.
After running my dodgy servlet (containing a number of System.gc() calls) we see the following:

    [GC: PS Scavenge#9787 time: 5ms interval: 38819ms mem: PS Survivor Space: 0k+64k->64k[max:192k,rate:1.65kb/s] PS Old Gen: 78062k+0k->78062k[max:349568k,rate:0.00kb/s] PS Eden Space: 204k-204k->0k[max:174336k,rate:-5.28kb/s]]
    [GC: PS MarkSweep#10200 time: 155ms interval: 112488ms mem: PS Survivor Space: 64k-64k->0k[max:192k,rate:-0.57kb/s] PS Old Gen: 78071k+0k->78071k[max:349568k,rate:0.00kb/s] PS Eden Space: 0k+0k->0k[max:174336k,rate:0.00kb/s]]
    PS Scavenge[ collections: 666 | avg: 0.0046 secs | total: 3.1 secs ]
    PS MarkSweep[ collections: 689 | avg: 0.1588 secs | total: 109.4 secs ]

A big difference, and although not a particularly realistic scenario it's certainly a useful tool for being able to quickly view runtime GC info. Next up we'll take a look at CPU usage.

Checking CPU usage

Swiss Java Knife has a command that works in a similar way to the Linux top command, which displays the top CPU processes.
Running the following command should give us the top 10 threads by CPU usage when running normally:

    java -jar sjk-plus-0.1-2013-09-06.jar ttop -n 10 -p 5427 -o CPU

    2014-03-11T08:56:33.120-0700 Process summary
      process cpu=2.21%
      application cpu=0.67% (user=0.30% sys=0.37%)
      other: cpu=1.54%
      heap allocation rate 245kb/s
    [000001] user= 0.00% sys= 0.00% alloc= 0b/s - main
    [000002] user= 0.00% sys= 0.00% alloc= 0b/s - Reference Handler
    [000003] user= 0.00% sys= 0.00% alloc= 0b/s - Finalizer
    [000004] user= 0.00% sys= 0.00% alloc= 0b/s - Signal Dispatcher
    [000010] user= 0.00% sys= 0.00% alloc= 0b/s - Timer-0
    [000011] user= 0.00% sys= 0.01% alloc= 96b/s - Timer-1
    [000012] user= 0.00% sys= 0.01% alloc= 20b/s - [ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'
    [000013] user= 0.00% sys= 0.00% alloc= 0b/s - weblogic.time.TimeEventGenerator
    [000014] user= 0.00% sys= 0.04% alloc= 245b/s - weblogic.timers.TimerThread
    [000017] user= 0.00% sys= 0.00% alloc= 0b/s - Thread-7

So far so good, minimal CPU usage. Now I'll start my dodgy servlet and run the command again. Hmmm, not so good:

    Unexpected error: java.lang.IllegalArgumentException: Comparison method violates its general contract!
Try once again and we get the following:

    2014-03-11T09:00:10.625-0700 Process summary
      process cpu=199.14%
      application cpu=189.87% (user=181.57% sys=8.30%)
      other: cpu=9.27%
      heap allocation rate 4945kb/s
    [000040] user=83.95% sys= 2.82% alloc= 0b/s - [ACTIVE] ExecuteThread: '5' for queue: 'weblogic.kernel.Default (self-tuning)'
    [000038] user=93.71% sys=-0.44% alloc= 0b/s - [ACTIVE] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)'
    [000044] user= 3.90% sys= 4.91% alloc= 4855kb/s - RMI TCP Connection(5)-127.0.0.1
    [000001] user= 0.00% sys= 0.00% alloc= 0b/s - main
    [000002] user= 0.00% sys= 0.00% alloc= 0b/s - Reference Handler
    [000003] user= 0.00% sys= 0.00% alloc= 0b/s - Finalizer
    [000004] user= 0.00% sys= 0.00% alloc= 0b/s - Signal Dispatcher
    [000010] user= 0.00% sys= 0.00% alloc= 0b/s - Timer-0
    [000011] user= 0.00% sys= 0.04% alloc= 1124b/s - Timer-1
    [000012] user= 0.00% sys= 0.00% alloc= 0b/s - [STANDBY] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'

So, the CPU usage is now through the roof (as expected). The main issue with this is that, similar to the jps command, it doesn't really offer much more than the top command. It also threw the exception above many times when trying to run commands ordered by CPU. Overall, it doesn't really add much to the command already available, and unexpected errors are never good. Finally, we'll take a look at memory usage.

Checking memory usage

For checking memory usage Swiss Java Knife has a tool called hh which it claims is an extended version of jmap -histo. For those not familiar with jmap, it's another of the tools that comes with the JDK, which prints shared object memory maps or heap memory details for a process. So, first of all I run my JMeter test that repeatedly calls my dodgy servlet. This time it is one that allocates multiple byte arrays each time it's called to simulate a memory leak.
Although it claims to be an extended version of jmap -histo, the only real addition is the ability to state how many buckets to view, but this can be easily achieved by piping the output of jmap -histo through head. Aside from that the output is virtually identical.

Output from jmap:

     num     #instances         #bytes  class name
    ----------------------------------------------
       1:         42124      234260776  [B
       2:        161472       24074512
       3:        161472       21970928
       4:         12853       15416848
       5:         12853       10250656
       6:         84735        9020400  [C
       7:         10896        8943104
       8:         91873        2939936  java.lang.String
       9:         14021        1675576  java.lang.Class
      10:         10311        1563520  [Ljava.lang.Object;

Output from sjk:

    java -jar sjk-plus-0.1-2013-09-06.jar hh -n 10 -p 5427

       1:         56626      386286072  [B
       2:        161493       24076192
       3:        161493       21973784
       4:         12850       15409912
       5:         12850       10249384
       6:         10891        8936672
       7:         83336        8577720  [C
       8:         90525        2896800  java.lang.String
       9:         14018        1675264  java.lang.Class
      10:          9819        1579400  [Ljava.lang.Object;
    Total        996089      500086120

The only other tools available are the commands mxdump and mx, which allow access to MBean attributes and operations. However, trying to run either of these resulted in a NullPointerException. At this point I would generally download the code and start to poke about, but by now I'd seen enough.

Conclusion

Although a nice idea, it's very limited in what it offers. Under the covers it uses the Attach API, so it requires the JDK and not just the JRE in order to run, which means the tools it largely duplicates are already on hand with the standard JDK. There are a few additions to those tools but nothing that really makes it worthwhile using this instead. The only tool I could see myself using would be the real-time GC data gathering tool, but this would only be of use where GC logs were unavailable and no other monitoring tools were available. The number of errors seen when running basic commands was also a concern, although this is just a project on GitHub, not a commercial offering, and doesn't appear to be a particularly active project.
So, a useful tool to add to your diagnostic tool-kit? Not in my opinion. It's certainly an interesting idea and with further work could be useful but for now I'd stick with the tools that are already available.
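For what it's worth, the per-collector counts and times that the gc command summarises are also exposed by the JDK's standard management API. The sketch below reads them for the JVM it is running in (not for a separate process, which is what sjk does by attaching). The class name GcSummary and the output layout are my own and only loosely imitate the summary lines shown earlier.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class GcSummary {

    /** Builds one summary line per garbage collector of the current JVM. */
    static List<String> summarize() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();  // -1 if the collector does not report it
            long millis = gc.getCollectionTime();  // accumulated time in ms, -1 if unknown
            double totalSecs = millis / 1000.0;
            double avgSecs = count > 0 ? totalSecs / count : 0.0;
            lines.add(String.format(Locale.ROOT,
                    "%s[ collections: %d | avg: %.4f secs | total: %.1f secs ]",
                    gc.getName(), count, avgSecs, totalSecs));
        }
        return lines;
    }

    public static void main(String[] args) {
        System.gc(); // request a collection so the counters are likely to be non-zero
        for (String line : summarize()) {
            System.out.println(line);
        }
    }
}
```

Reading the same beans from another process would need a JMX connection to that process, which is outside the scope of this sketch.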

July 24, 2014

by Andy Overton

· 13,545 Views

DocFlex/XML - XML Schema Documentation Generator and Toolkit

A powerful multi-format XML Schema (XSD) documentation generator and a tool for rapid development of custom XSD documentation generators according to user needs.

● About DocFlex/XML
● "XSDDoc" Template Set
● Template Processor
● Template Designer
● Integrations
● Generation of XSD Diagrams
● Apache Ant & Maven
● Links

About DocFlex/XML

DocFlex/XML is a Java-based software system for the development and execution of high-performance template-driven documentation generators from any data stored in XML files. The actual doc/report generators are programmed in the form of special templates using a graphic Template Designer, which represents the templates visually in a form resembling the output they generate. The templates are then interpreted by a Template Processor, which takes the XML files as input and produces the resulting documentation from them.

This article describes an application of DocFlex/XML to the task of generating high-quality XML Schema documentation. It covers the following features of the DocFlex/XML system:

● The "XSDDoc" Template Set, which implements the ready-to-use XML Schema documentation generator itself.
● The Template Processor, which makes the templates work. Currently, it provides three interchangeable output generators for HTML, RTF and TXT (plain text) formats.
● The Template Designer, which provides a high-quality GUI to design/modify templates. If you need a special XML Schema doc generator, the simplest way to create it is to modify the standard XSDDoc templates. The Template Designer enables you to do that.
● Integrations with Altova XMLSpy and Oxygen XML Editor. If you are a user of one of those popular XML editors, you can also turn it into a dynamically linked diagramming engine for DocFlex, so that the XSD diagrams generated by XMLSpy/OxygenXML are automatically included in the XML Schema documentation generated by DocFlex (with full support for hyperlinks).

"XSDDoc" Template Set

This is the implementation of the XML Schema documentation generator itself, which provides the following functionality:

● Generation of a single documentation set from any number of XML Schema (XSD) files together, in particular:
    ● highly navigable framed (Javadoc-like) HTML documentation
    ● single-file HTML documentation
    ● RTF documentation (further convertible to PDF)
● Processing of any referenced XML schemas, in particular:
    ● correct processing of all import, include and redefine elements found across all involved XSD files
    ● automatic loading and processing (i.e. inclusion in the documentation scope) of all directly/indirectly referenced XSD files
● Sophisticated documenting of XSD components, including:
    ● component diagrams (with hyperlinks to everything depicted on them; see also Integrations)
    ● XML representation summary (a textual alternative to diagrams)
    ● lists of related components. For elements this also includes the list of possible containing elements. (Such a list is never present in the output generated by XSLT-based doc generators.)
    ● list of usage locations
● Support of any XML Schema design patterns. This comes down mainly to the following:
    ● special treatment of local elements (see below)
    ● support and documenting of substitution groups
    ● support of importing, inclusion and redefinition of schema files
● Special documenting of local elements. Local elements are those components that are declared locally within other XSD components. The W3C XML Schema spec allows you to declare any number of local elements that may share the same name but have different content. That's because their meaning is local and there will be no collisions with other declarations. That, however, creates a problem for documenting, because in the documentation both global and local elements may appear simultaneously in various lists according to their common properties. If each element component is identified only by its name, you will get lists with multiple repeating names but little clue as to what they mean.
Moreover, some XML schemas may contain lots of identical local element declarations (that is, they have both the same name and the same content). So, you'll get in those lists a mess of repeating names, some of which refer to effectively the same entities, whereas others refer to completely different ones. In XSDDoc, those problems are solved in two ways:

    ● Adding extensions to local element names. The extension provides more information about the element (e.g. where it can be inserted, its global type, or where it is defined). That makes the whole string identifying the element unique.
      [Screenshot: local element names, with the name extension shown in grey]
    ● Unifying local elements by type.
      [Screenshots: on the left, documentation generated with such unification; on the right, all local elements documented straight as they are]
      We believe the first documentation (on the left) is easier to understand and use.

● Processing of XHTML markup. You can format your XML Schema annotations with XHTML tags, which will be recognized and rendered with the appropriate formatting in both HTML and RTF output.
  [Screenshots: on the left, the XML source of an XML schema whose annotations are heavily laden with XHTML markup (including insertion of images); next, the HTML documentation generated from that schema; on the right, a page of RTF documentation also generated from that schema]
● Possibility of unlimited customization: XSDDoc is controlled by more than 400 parameters, which allow you to adjust the generated documentation within a huge range of included details. Template parameters serve the same role as options in traditional doc generators. The difference is that the DocFlex template architecture makes the support/implementation of template parameters very cheap (typically, most of the effort goes into writing their descriptions).
So, there may be hundreds of parameters controlling a large template application. If parameters are not enough, you can modify the templates themselves using the Template Designer. In the case of HTML output, you can also apply your own CSS styles to change how the generated documentation looks.

Template Processor

The Template Processor (also called simply the "generator") makes everything work. It consists of two logical parts:

1. Template interpreter
2. Output generator

The output generator has three different implementations, one for each currently supported output format: HTML, RTF, TXT (plain text). The plain-text output can be used to generate documentation in formats not supported directly by DocFlex.

The Template Processor is started directly from the Java command line with the following arguments:

● main template
● template parameters
● initial XSD files to be processed (documented)
● XML catalogs (to redirect the physical location of input files)
● destination directory/file
● output format (this selects which output generator will be used)
● output format options (specify settings to control the selected output generator)

In practice, the number of settings may be so large that the Template Processor provides a special GUI to specify everything interactively.

Template Designer

Although DocFlex templates are stored as plain-text files (with an XML-like format), they are not meant to be edited manually. Rather, a special graphic Template Designer must be used, which visualizes the templates in the form of the template components they are made of. Those components are the actual constructs of the template language (not some textual statements, operators, blocks etc.)
[Screenshots: templates open in the Template Designer]

That approach has a number of advantages, among them:

● The processing structures represented by template components may be displayed in a way that visually expresses what a component does (for instance, it may resemble the output it generates). That representation may be both expressive and compact (after all, it is not just text), which allows you to easily navigate a template, understand what it does and modify anything you need.
● As template components are visual and interactive, they may have a very complex internal structure, for instance, contain lots of properties and nested components. Even so, you don't need to scroll and navigate some kind of enormous text that encodes all of this (as would be the case with a script). Rather, you just need to invoke some property dialogs and expand/collapse some component sections.
● A template component may be easily copied, pasted and deleted as a whole. You don't need to worry about restoring the template syntax afterwards. The Template Designer will also ensure that each component is created, copied or moved only to an allowed place.
● The highly structured nature of templates eliminates the need for most of the various named identifiers. Many connections between different template components are also maintained by the Template Designer (i.e. modified automatically when necessary).
● As template files are stored and read only programmatically, there is no need to know and understand their syntax. There will be no syntax errors either.
● The actual syntax of template files may be optimized not for human programmers, but for faster loading and processing of templates by the Template Processor. There is no need for a compilation phase.
● The separation of template semantics from the particular structure of template files allows faster and easier evolution of the template language.
The obsolete constructs of older template versions can be automatically converted into new structures. Both old and new templates will look and work up-to-date.

Integrations

Generation of XSD Diagrams

DocFlex/XML is able to work with any kind of diagrams (i.e. inserting them automatically into the generated output). That is supported at the level of templates, along with the generation of hypertext imagemaps.

DocFlex/XML provides no diagramming engine of its own. Instead, it includes integrations with the two most popular XML editors that do generate XSD diagrams:

● Altova XMLSpy
● Oxygen XML Editor

Effectively, the third-party software is used as a dynamically linked diagramming engine. The advantage of such integrations is that when you are a user of one of those XML editors, you will get in the documentation generated by DocFlex the same diagrams as you see in your XML editor.

Apache Ant & Maven

As a pure Java application, DocFlex/XML can be run in any environment that runs Java itself. The Template Processor can be easily integrated with Ant (this can be specified directly in the Ant build file). For Maven, DocFlex/XML includes a simple Maven plugin. It is also possible to use all diagramming integrations with both Ant and Maven.

Links

● DocFlex/XML (home page): http://www.filigris.com/docflex-xml/
● DocFlex/XML XSDDoc: http://www.filigris.com/docflex-xml/xsddoc/
● XSDDoc examples: http://www.filigris.com/docflex-xml/xsddoc/examples/
● XMLSpy integration: http://www.filigris.com/docflex-xml/xmlspy/
● OxygenXML integration: http://www.filigris.com/docflex-xml/oxygenxml/
● Free downloads: http://www.filigris.com/downloads/
● This original article: http://www.filigris.com/ann/docflex-xsd/

July 23, 2014

by Leonid Rudy

· 7,661 Views

Building Extremely Large In-Memory InputStream for Testing Purposes

For some reason I needed an extremely large, possibly even infinite InputStream that would simply return the same byte[] over and over. This way I could produce an insanely big stream of data by repeating a small sample. Somewhat similar functionality can be found in Guava: Iterable<T> Iterables.cycle(Iterable<T>) and Iterator<T> Iterators.cycle(Iterable<T>). For example, if you need an infinite source of 0 and 1, simply say Iterables.cycle(0, 1) and get 0, 1, 0, 1, 0, 1... infinitely. Unfortunately I haven't found such a utility for InputStream, so I jumped into writing my own. This article documents many mistakes I made during that process, mostly due to overcomplicating and overengineering a straightforward solution.

We don't really need an infinite InputStream; being able to create a very large one (say, 32 GiB) is enough. So we are after the following method:

    public static InputStream repeat(byte[] sample, int times)

It basically takes a sample array of bytes and returns an InputStream returning these bytes. However, when sample runs out, it rolls over, returning the same bytes again. This process is repeated the given number of times, until the InputStream signals end. One solution that I haven't really tried but which seems most obvious:

    public static InputStream repeat(byte[] sample, int times) {
        final byte[] allBytes = new byte[sample.length * times];
        for (int i = 0; i < times; i++) {
            System.arraycopy(sample, 0, allBytes, i * sample.length, sample.length);
        }
        return new ByteArrayInputStream(allBytes);
    }

I see you laughing there! If sample is 100 bytes and we need 32 GiB of input repeating these 100 bytes, the generated InputStream shouldn't really allocate 32 GiB of memory; we must be more clever here. As a matter of fact, repeat() above has another subtle bug. Arrays in Java are limited to 2^31 - 1 entries (int), and 32 GiB is way above that. The reason this program compiles and does not fail right away is a silent integer overflow here: sample.length * times. This multiplication doesn't fit in int.
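To make the overflow concrete, here is a small sketch using the numbers from the text: a 100-byte sample and roughly 32 GiB of output. The repeat count 343,597,384 is my own rounding of 32 GiB / 100, and the class and method names are just for illustration.

```java
final class OverflowDemo {

    /** Size computed with int arithmetic, as in the naive repeat(): wraps around silently. */
    static int naiveSize(int sampleLength, int times) {
        return sampleLength * times;
    }

    /** Widening one operand to long before multiplying keeps the full product. */
    static long wideSize(int sampleLength, int times) {
        return (long) sampleLength * times;
    }

    /** Math.multiplyExact throws ArithmeticException instead of wrapping around. */
    static boolean overflows(int sampleLength, int times) {
        try {
            Math.multiplyExact(sampleLength, times);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int sampleLength = 100;
        int times = 343_597_384; // assumed: about 32 GiB / 100 bytes, rounded up

        System.out.println(naiveSize(sampleLength, times)); // 32
        System.out.println(wideSize(sampleLength, times));  // 34359738400
        System.out.println(overflows(sampleLength, times)); // true
    }
}
```

With these inputs the naive version would quietly allocate a 32-byte array rather than fail, which is arguably worse than an exception.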
OK, let's try something that at least theoretically can work. My first idea was as follows: what if I create many ByteArrayInputStreams sharing the same byte[] sample (they don't do an eager copy) and somehow join them together? Thus I needed some InputStream adapter that could take an arbitrary number of underlying InputStreams and chain them together: when the first stream is exhausted, switch to the next one. This awkward moment when you look for something in Apache Commons or Guava and apparently it was in the JDK forever... java.io.SequenceInputStream is almost ideal. However, the constructor I reached for chains precisely two underlying InputStreams (there is also one that takes an Enumeration of streams). Of course, since SequenceInputStream is an InputStream itself, we can use it recursively as an argument to an outer SequenceInputStream. Repeating this process we can chain an arbitrary number of ByteArrayInputStreams together:

    public static InputStream repeat(byte[] sample, int times) {
        if (times <= 1) {
            return new ByteArrayInputStream(sample);
        } else {
            return new SequenceInputStream(
                new ByteArrayInputStream(sample),
                repeat(sample, times - 1)
            );
        }
    }

If times is 1, just wrap sample in a ByteArrayInputStream. Otherwise use SequenceInputStream recursively. I think you can immediately spot what's wrong with this code: too deep recursion. The nesting level is the same as the times argument, which will reach millions or even billions. There must be a better way. Luckily a minor improvement changes the recursion depth from O(n) to O(log n):

    public static InputStream repeat(byte[] sample, int times) {
        if (times <= 1) {
            return new ByteArrayInputStream(sample);
        } else {
            return new SequenceInputStream(
                repeat(sample, times / 2),
                repeat(sample, times - times / 2)
            );
        }
    }

Honestly this was the first implementation I tried. It's a simple application of the divide and conquer principle, where we produce the result by evenly splitting it into two smaller sub-problems.
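As a side note, SequenceInputStream also has a constructor that accepts an Enumeration of streams and asks it for the next stream only when the current one is exhausted. The sketch below is not one of the implementations I actually tried; it just shows how that constructor could be fed lazily, so that only one ByteArrayInputStream is alive at a time. LazySequenceRepeat and the count() helper are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.util.Enumeration;
import java.util.NoSuchElementException;

final class LazySequenceRepeat {

    /** Chains `times` views over the same array, creating each view only on demand. */
    static InputStream repeat(final byte[] sample, final int times) {
        Enumeration<InputStream> views = new Enumeration<InputStream>() {
            private int handedOut = 0;

            @Override
            public boolean hasMoreElements() {
                return handedOut < times;
            }

            @Override
            public InputStream nextElement() {
                if (!hasMoreElements()) {
                    throw new NoSuchElementException();
                }
                handedOut++;
                return new ByteArrayInputStream(sample); // shares `sample`, no copy
            }
        };
        return new SequenceInputStream(views);
    }

    /** Demo helper: drains a stream byte by byte and returns how many bytes it held. */
    static long count(InputStream in) {
        try {
            long n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = {1, 2, 3};
        System.out.println(count(repeat(sample, 1000))); // 3000
    }
}
```

This keeps the int limit on times but avoids both the deep nesting and the upfront allocation of every stream.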
The divide-and-conquer version looks clever, but there is one issue: it's easy to prove we create t (t = times) ByteArrayInputStreams and O(t) SequenceInputStreams. While the sample byte array is shared, millions of various InputStream instances are wasting memory. This leads us to an alternative implementation, creating just one InputStream regardless of the value of times:

    import com.google.common.collect.Iterators;
    import org.apache.commons.lang3.ArrayUtils;

    public static InputStream repeat(byte[] sample, int times) {
        final Byte[] objArray = ArrayUtils.toObject(sample);
        final Iterator<Byte> infinite = Iterators.cycle(objArray);
        final Iterator<Byte> limited = Iterators.limit(infinite, sample.length * times);
        return new InputStream() {
            @Override
            public int read() throws IOException {
                return limited.hasNext() ? limited.next() & 0xFF : -1;
            }
        };
    }

We will use Iterators.cycle() after all. But first we have to translate byte[] into Byte[], since iterators can only work with objects, not primitives. There is no idiomatic way to turn an array of primitives into an array of boxed types, so I use ArrayUtils.toObject(byte[]) from Apache Commons Lang. Having an array of objects, we can create an infinite iterator that cycles through the values of sample. Since we don't want an infinite stream, we cut off the infinite iterator using Iterators.limit(Iterator, int), again from Guava. Now we just have to bridge from Iterator to InputStream; after all, semantically they represent the same thing.

This solution suffers from two problems. First of all, every single byte goes through a boxed Byte and a chain of iterator calls. Boxed Byte values are cached, so this is more about per-byte overhead than about garbage, but it still seems wasteful. The second issue we already faced previously: the sample.length * times multiplication can cause integer overflow. It can't be fixed because Iterators.limit() takes int, not long, for no good reason. BTW we avoided a third problem by doing a bitwise and with 0xFF. Otherwise a byte with value -1 would signal end of stream, which is not the case.
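The sign issue is easy to check in isolation. The following sketch (names are mine) compares returning a raw byte with returning the masked value for the byte 0xFF:

```java
final class ByteMaskDemo {

    /** Returning the raw byte: 0xFF is sign-extended to -1, which callers of read() treat as end of stream. */
    static int unmasked(byte b) {
        return b;
    }

    /** Masking keeps only the low 8 bits, giving a value in 0..255 as InputStream.read() requires. */
    static int masked(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;
        System.out.println(unmasked(b)); // -1
        System.out.println(masked(b));   // 255
    }
}
```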
x & 0xFF is correctly translated to unsigned 255 (int). So even though the implementation above is short and sweet, declarative rather than imperative, it's too slow and limited. If you have a C background, I can imagine how uncomfortable you were seeing me struggle. After all, the most straightforward, painfully simple and low-level implementation was the one I came up with last:

    public static InputStream repeat(final byte[] sample, final int times) {
        return new InputStream() {
            private long pos = 0;
            private final long total = (long) sample.length * times;

            @Override
            public int read() throws IOException {
                // mask with 0xFF so that a byte of value -1 is not mistaken for end of stream
                return pos < total ? sample[(int) (pos++ % sample.length)] & 0xFF : -1;
            }
        };
    }

GC free, pure JDK, fast and simple to understand. Let this be a lesson for you: start with the simplest solution that jumps to your mind, don't overengineer and don't be too smart. My previous solutions (declarative, functional, immutable, etc.) maybe looked clever, but they were neither fast nor easy to understand. The utility we just developed was not just a toy project; it will be used later in a subsequent article.
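One possible refinement, which is not part of the utility described above: most consumers call read(byte[], int, int), and the default implementation inherited from InputStream falls back to one read() call per byte. The sketch below, under the assumption that a named class is acceptable instead of an anonymous one, adds a bulk read that copies whole chunks with System.arraycopy. RepeatingInputStream is a hypothetical name.

```java
import java.io.InputStream;

final class RepeatingInputStream extends InputStream {
    private final byte[] sample;
    private final long total; // total number of bytes to produce
    private long pos = 0;     // number of bytes produced so far

    RepeatingInputStream(byte[] sample, int times) {
        this.sample = sample;
        this.total = (long) sample.length * times; // long arithmetic, no int overflow
    }

    @Override
    public int read() {
        // mask so that a byte of value -1 is returned as 255, not as end of stream
        return pos < total ? sample[(int) (pos++ % sample.length)] & 0xFF : -1;
    }

    @Override
    public int read(byte[] buf, int off, int len) {
        if (off < 0 || len < 0 || len > buf.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (len == 0) {
            return 0;
        }
        if (pos >= total) {
            return -1;
        }
        int toCopy = (int) Math.min(len, total - pos);
        int copied = 0;
        while (copied < toCopy) {
            int from = (int) (pos % sample.length);                     // offset inside sample
            int chunk = Math.min(toCopy - copied, sample.length - from); // up to the end of sample
            System.arraycopy(sample, from, buf, off + copied, chunk);
            copied += chunk;
            pos += chunk;
        }
        return toCopy;
    }
}
```

Whether this is worth the extra lines depends on how the stream is consumed; for byte-at-a-time readers the simple version above is all you need.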

July 23, 2014

by Tomasz Nurkiewicz

· 7,584 Views

VelocityEngine Spring Java Config

This is the first post in a series of short code snippets showing how to move the configuration of Spring beans from XML to Java.

XML:

    resource.loader=class
    class.resource.loader.class=org.apache.velocity.runtime.resource.loader.ClasspathResourceLoader

Java:

    @Bean
    public VelocityEngine velocityEngine() throws VelocityException, IOException {
        VelocityEngineFactoryBean factory = new VelocityEngineFactoryBean();
        Properties props = new Properties();
        props.put("resource.loader", "class");
        props.put("class.resource.loader.class",
                  "org.apache.velocity.runtime.resource.loader." +
                  "ClasspathResourceLoader");
        factory.setVelocityProperties(props);
        return factory.createVelocityEngine();
    }

July 23, 2014

by Adrian Matei

· 13,239 Views

Time - Memory Tradeoff With the Example of Java Maps

This article illustrates the general time-memory tradeoff with the example of different hash table implementations in Java. The more memory a hash table takes, the faster each operation (e.g. getting a value by key or putting an entry) is performed.

Benchmarking method

Hash maps with int keys and int values were tested. The memory measure is relative usage over the theoretical minimum. For example, 1000 entries of int key and value take at least (4 (size of int) + 4) * 1000 = 8000 bytes. If the hash map implementation takes 20 000 bytes, its memory overuse is (20 000 - 8000) / 8000 = 1.5.

Each implementation was benchmarked on 9 different load levels (load factors). On each load level, each map was filled with 10 different numbers of entries, logarithmically evenly distributed between 1000 and 10 000 000 (to study caching effects). Then, for the same implementation and load level, memory metrics and average operation throughputs are averaged independently over the 3 smallest sizes (small sizes), the 3 largest sizes (large sizes) and all 10 sizes from 1000 to 10 000 000 (all sizes).

Implementations:

● Higher Frequency Trading Collections (HFTC)
● High Performance Primitive Collections (HPPC)
● fastutil collections
● Goldman Sachs Collections (GS)
● Trove collections
● Mahout collections
● java.util.HashMap as a reference

Get value by key (successful)

Only looking at these charts, you might suppose that HFTC, Trove and Mahout on the one side, and fastutil, HPPC and GS on the other, use the same hash table algorithm. (In fact, this is not quite true.) A sparser hash table on average performs fewer lookups during key search, therefore fewer memory reads, therefore the operation finishes earlier. Notice that on small sizes the largest maps are the fastest for all implementations, but on large and all sizes there is no visible progress starting from memory overuse ~4. That's because once the total memory taken by the map goes beyond CPU cache capacity, cache misses become more frequent as the map gets larger.
This effect offsets the algorithmic trend.

Update (increment) value by key

The update operation behaves much like get(). fastutil wasn't benchmarked, because there is no reasonably performant method for this task in its API.

Put an entry (key was absent)

In this case, maps were gradually filled with entries from size 0 to the target size (1000 - 10 000 000). Rehashing shouldn't occur, because maps were constructed with the target size provided. For small sizes, the plots still look like hyperbolas, but I can't explain such a dramatic change on large sizes or the differences between HFTC and the other primitive implementations.

Internal iteration (forEach)

Iteration gets slower as memory usage grows. An interesting thing about external iteration: for all open hash table implementations, throughput depends only on memory usage, not even on load factor (which differs between implementations for the same memory usage). Also, forEach throughput doesn't depend on open hash table size.

External iteration (via iterator or cursor)

External iteration performance varies more than internal, because there is more freedom for optimization. HFTC and Trove employ their own iteration interfaces; other libraries use the standard java.util.Iterator.

Footnote

Raw benchmark results, from which the pictures were built, with a link to the benchmarking code and information about the run site in the description.
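The two bits of arithmetic in the benchmarking method can be sketched as follows. The memory overuse formula is taken from the text. The log-spaced sizes are an assumption on my part about what "logarithmically evenly distributed" means here (equal steps in the base-10 exponent), and the class and method names are illustrative, not taken from the benchmark code.

```java
final class BenchmarkMath {

    /** Theoretical minimum for int-int entries: 4 bytes of key plus 4 bytes of value each. */
    static long minimumBytes(long entries) {
        return (4 + 4) * entries;
    }

    /** Relative memory overuse: (actual - minimum) / minimum. */
    static double memoryOveruse(long actualBytes, long entries) {
        long min = minimumBytes(entries);
        return (double) (actualBytes - min) / min;
    }

    /** `count` sizes (count >= 2) in equal exponent steps between min and max, both inclusive. */
    static long[] logSpacedSizes(long min, long max, int count) {
        long[] sizes = new long[count];
        double logMin = Math.log10(min);
        double logMax = Math.log10(max);
        for (int i = 0; i < count; i++) {
            double exponent = logMin + (logMax - logMin) * i / (count - 1);
            sizes[i] = Math.round(Math.pow(10, exponent));
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(memoryOveruse(20_000, 1000)); // 1.5
        for (long size : logSpacedSizes(1000, 10_000_000, 10)) {
            System.out.println(size);
        }
    }
}
```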

July 22, 2014

by Roman Leventov

· 26,018 Views

New in JAX-RS 2.0 – @BeanParam Annotation

JAX-RS 2.0 (JSR 339) is the successor to the JSR 311 specification (JAX-RS 1.x), and it was released along with Java EE 7.

July 22, 2014

by Abhishek Gupta

· 23,404 Views · 2 Likes

5 Reasons to Use a Java Data Grid in Your Application

In this post we explore 5 reasons to use a Java Data Grid for caching Java objects in-memory in your applications. In a later post we will explore some of the other data grid capabilities, beyond data storage, that can revolutionize your Java architectures, like on-grid computation and events.

Memory is Fast

Java Data Grids store Java objects in memory. Memory access is fast with low latency. So if access to data storage, either disk or database, is the primary bottleneck in your application, then using a data grid as an in-memory cache in front of your storage tier will give you a performance boost.

Scale out your Application Shared State

If you need to share state across JVMs to scale out your application, then using a Java Data Grid rather than a database will increase your scalability. In a typical shared state architecture, the application server tier stores shared Java objects in the data grid and these objects are available to all application server nodes in your architecture. Separating the data grid tier from the application server tier has a number of advantages:

● Applications can be redeployed and restarted without losing the shared state.
● Data Grid JVMs and Application JVMs can be tuned separately.
● State can be shared across multiple different applications.
● Each tier can be scaled horizontally and separately depending on work load.

Typical use cases for shared state include: PCI compliant storage of card security codes; in-game state in online games; web session data; prices and catalogues in ecommerce. Anything that needs low latency access can be stored in the shared data grid.

High Availability for In-Memory Data

As well as low latency access and scaling out shared state, Java Data Grids also provide high availability for your in-memory data.
When storing Java objects in a data grid a primary object is stored in one of the Data Grid JVMs and secondary back up copies of the object are stored in different Data Grid JVM node, ensuring that if you lose a node then you don't lose any data. Clients of the data grid do not need to know where data is to access it so high availability is transparent to your application. Scale Out In-Memory Data Volumes Java objects, in data grids, aren't fully replicated across all Data Grid JVMs but are stored as a primary object and a secondary object. This means the more Data Grid JVM nodes we add the more JVM heap we have for storing Java objects in-memory (and remember memory is fast). For example if we build a Data Grid with 20 JVMs each with 4Gb free heap (after per JVM overhead) we could theoretically store 80Gb (4 times 20) of shared Java objects. If we assume we have 1 duplicate for high availability this cuts our storage in half so we can store 40Gb (.5 time 4 times 20 ) of Java Objects in memory. Native Integration with JPA Java Data Grids have native integration with JPA frameworks like TopLink and Hibernate whereby the Data Grid can act as a second level cache between JPA and the database. This can give a large performance boost to your database driven application if latency associated with database access is a key performance bottleneck.
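The capacity arithmetic from the "Scale Out In-Memory Data Volumes" section can be sketched in a few lines. This is only an illustration of the back-of-the-envelope estimate; the class and method names are hypothetical and not part of any data grid product, and real grids lose some additional heap to indexes and serialization overhead.

```java
// Hypothetical helper illustrating the capacity estimate; not a data grid API.
final class GridCapacity {

    /**
     * Usable capacity in GB when every object is stored once as a primary
     * plus `backups` additional copies on other nodes.
     */
    static double usableGb(int nodes, double freeHeapGbPerNode, int backups) {
        double rawGb = nodes * freeHeapGbPerNode; // total free heap across the grid
        return rawGb / (1 + backups);             // each object occupies (1 + backups) copies
    }

    public static void main(String[] args) {
        System.out.println(usableGb(20, 4, 0)); // no backups: 80.0
        System.out.println(usableGb(20, 4, 1)); // one backup copy: 40.0
    }
}
```

Adding nodes raises the raw figure linearly, while each extra backup copy divides the usable figure further.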

July 22, 2014

by Steve Millidge

· 7,437 Views

R: ggplot: Problem automatically picking scale for difftime object

While reading ‘Why The R Programming Language Is Good For Business’ I came across Udacity’s ‘Data Analysis with R’ course, part of which focuses on exploring data sets using visualisations, something I haven’t done much of yet. I thought it’d be interesting to create some visualisations around the times that people RSVP ‘yes’ to the various Neo4j events that we run in London. I started off with the following query, which returns the date time that people replied ‘Yes’ to an event and the date time of the event:

    library(RNeo4j)

    query = "MATCH (e:Event)<-[:TO]-(response {response: 'yes'}) RETURN response.time AS time, e.time + e.utc_offset AS eventTime"
    allYesRSVPs = cypher(graph, query)
    allYesRSVPs$time = timestampToDate(allYesRSVPs$time)
    allYesRSVPs$eventTime = timestampToDate(allYesRSVPs$eventTime)

    > allYesRSVPs[1:10,]
                      time           eventTime
    1  2011-06-05 12:12:27 2011-06-29 18:30:00
    2  2011-06-05 14:49:04 2011-06-29 18:30:00
    3  2011-06-10 11:22:47 2011-06-29 18:30:00
    4  2011-06-07 15:27:07 2011-06-29 18:30:00
    5  2011-06-06 20:21:45 2011-06-29 18:30:00
    6  2011-07-04 19:49:04 2011-07-27 19:00:00
    7  2011-07-05 16:40:10 2011-07-27 19:00:00
    8  2011-08-19 07:41:10 2011-08-31 18:30:00
    9  2011-08-24 12:47:40 2011-08-31 18:30:00
    10 2011-08-18 09:56:53 2011-08-31 18:30:00

I wanted to create a bar chart showing the amount of time in advance of a meetup that people RSVP’d ‘yes’, so I added the following column to my data frame:

    allYesRSVPs$difference = allYesRSVPs$eventTime - allYesRSVPs$time

    > allYesRSVPs[1:10,]
                      time           eventTime    difference
    1  2011-06-05 12:12:27 2011-06-29 18:30:00 34937.55 mins
    2  2011-06-05 14:49:04 2011-06-29 18:30:00 34780.93 mins
    3  2011-06-10 11:22:47 2011-06-29 18:30:00 27787.22 mins
    4  2011-06-07 15:27:07 2011-06-29 18:30:00 31862.88 mins
    5  2011-06-06 20:21:45 2011-06-29 18:30:00 33008.25 mins
    6  2011-07-04 19:49:04 2011-07-27 19:00:00 33070.93 mins
    7  2011-07-05 16:40:10 2011-07-27 19:00:00 31819.83 mins
    8  2011-08-19 07:41:10 2011-08-31 18:30:00 17928.83 mins
    9  2011-08-24 12:47:40 2011-08-31 18:30:00 10422.33 mins
    10 2011-08-18 09:56:53 2011-08-31 18:30:00 19233.12 mins

I then tried to use ggplot to create a bar chart of that data:

    > ggplot(allYesRSVPs, aes(x=difference)) + geom_histogram(binwidth=1, fill="green")

Unfortunately that resulted in this error:

    Don't know how to automatically pick scale for object of type difftime. Defaulting to continuous
    Error: Discrete value supplied to continuous scale

I couldn’t find anyone who had come across this problem before in my search, but I did find the as.numeric function, which seemed like it would put the difference into an appropriate format:

    allYesRSVPs$difference = as.numeric(allYesRSVPs$eventTime - allYesRSVPs$time, units="days")

    > ggplot(allYesRSVPs, aes(x=difference)) + geom_histogram(binwidth=1, fill="green")

That produced a histogram of the number of days in advance that people RSVP’d. There is quite a heavy concentration of people RSVPing yes in the few days before the event, and the rest are scattered across the first 30 days. We usually announce events 3/4 weeks in advance, so I don’t know that it tells us anything interesting other than that it seems like people sign up for events when an email is sent out about them. The date the meetup was announced (by email) isn’t currently exposed by the API but hopefully one day it will be. The code is on github if you want to have a play; any suggestions welcome.

July 21, 2014

by Mark Needham

· 11,913 Views

Grouping, Sampling and Batching - Custom Collectors in Java 8

Continuing from the first article, this time we will write some more useful custom collectors: for grouping by given criteria, sampling input, batching and sliding over with a fixed-size window.

Grouping (counting occurrences, histogram)

Imagine you have a collection of some items and you want to calculate how many times each item (with respect to equals()) appears in this collection. This can be achieved using CollectionUtils.getCardinalityMap() from Apache Commons Collections. This method takes an Iterable<T> and returns Map<T, Integer>, counting how many times each item appeared in the collection. However, sometimes instead of using equals() we would like to group by an arbitrary attribute of input T. For example, say we have a list of Person objects and we would like to compute the number of males vs. females (i.e. Map<Sex, Integer>) or maybe an age distribution. There is a built-in collector Collectors.groupingBy(Function<T, K> classifier); however, it returns a map from key to all items mapped to that key. See:

    import static java.util.stream.Collectors.groupingBy;

    //...

    final List<Person> people = //...
    final Map<Sex, List<Person>> bySex = people
            .stream()
            .collect(groupingBy(Person::getSex));

It's valuable, but in our case it unnecessarily builds two List<Person>. I only want to know the number of people. There is no such collector built-in, but we can compose it in a fairly simple manner:

    import static java.util.stream.Collectors.counting;
    import static java.util.stream.Collectors.groupingBy;

    //...

    final Map<Sex, Long> bySex = people
            .stream()
            .collect(
                    groupingBy(Person::getSex, HashMap::new, counting()));

This overloaded version of groupingBy() takes three parameters. The first one is the key (classifier) function, as previously. The second argument creates a new map; we'll see shortly why it's useful. counting() is a nested collector that takes all people with the same sex and combines them together, in our case simply counting them as they arrive. Being able to choose the map implementation is useful e.g. when building an age histogram.
We would like to know how many people we have at a given age, but the age values should be sorted:

    final TreeMap<Integer, Long> byAge = people
            .stream()
            .collect(
                    groupingBy(Person::getAge, TreeMap::new, counting()));

    byAge
            .forEach((age, count) ->
                    System.out.println(age + ":\t" + count));

We ended up with a TreeMap from age (sorted) to the count of people having that age.

Sampling, batching and sliding window

The IterableLike.sliding() method in Scala allows viewing a collection through a sliding fixed-size window. This window starts at the beginning and in each iteration moves by a given number of items. Such functionality, missing in Java 8, enables several useful operations like computing a moving average, splitting a big collection into batches (compare with Lists.partition() in Guava) or sampling every n-th element. We will implement a collector for Java 8 providing similar behaviour. Let's start from unit tests, which should describe briefly what we want to achieve:

    import static com.nurkiewicz.CustomCollectors.sliding

    @Unroll
    class CustomCollectorsSpec extends Specification {

        def "Sliding window of #input with size #size and step of 1 is #output"() {
            expect:
                input.stream().collect(sliding(size)) == output
            where:
                input  | size | output
                []     | 5    | []
                [1]    | 1    | [[1]]
                [1, 2] | 1    | [[1], [2]]
                [1, 2] | 2    | [[1, 2]]
                [1, 2] | 3    | [[1, 2]]
                1..3   | 3    | [[1, 2, 3]]
                1..4   | 2    | [[1, 2], [2, 3], [3, 4]]
                1..4   | 3    | [[1, 2, 3], [2, 3, 4]]
                1..7   | 3    | [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]]
                1..7   | 6    | [1..6, 2..7]
        }

        def "Sliding window of #input with size #size and no overlapping is #output"() {
            expect:
                input.stream().collect(sliding(size, size)) == output
            where:
                input | size | output
                []    | 5    | []
                1..3  | 2    | [[1, 2], [3]]
                1..4  | 4    | [1..4]
                1..4  | 5    | [1..4]
                1..7  | 3    | [1..3, 4..6, [7]]
                1..6  | 2    | [[1, 2], [3, 4], [5, 6]]
        }

        def "Sliding window of #input with size #size and some overlapping is #output"() {
            expect:
                input.stream().collect(sliding(size, 2)) == output
            where:
                input | size | output
                []    | 5    | []
                1..4  | 5    | [[1, 2, 3, 4]]
                1..7  | 3    | [1..3, 3..5, 5..7]
                1..6  | 4    | [1..4, 3..6]
                1..9  | 4    | [1..4, 3..6, 5..8, 7..9]
                1..10 | 4    | [1..4, 3..6, 5..8, 7..10]
                1..11 | 4    | [1..4, 3..6, 5..8, 7..10, 9..11]
        }

        def "Sliding window of #input with size #size and gap of #gap is #output"() {
            expect:
                input.stream().collect(sliding(size, size + gap)) == output
            where:
                input | size | gap | output
                []    | 5    | 1   | []
                1..9  | 4    | 2   | [1..4, 7..9]
                1..10 | 4    | 2   | [1..4, 7..10]
                1..11 | 4    | 2   | [1..4, 7..10]
                1..12 | 4    | 2   | [1..4, 7..10]
                1..13 | 4    | 2   | [1..4, 7..10, [13]]
                1..13 | 5    | 1   | [1..5, 7..11, [13]]
                1..12 | 5    | 3   | [1..5, 9..12]
                1..13 | 5    | 3   | [1..5, 9..13]
        }

        def "Sampling #input taking every #nth th element is #output"() {
            expect:
                input.stream().collect(sliding(1, nth)) == output
            where:
                input  | nth | output
                []     | 1   | []
                []     | 5   | []
                1..3   | 5   | [[1]]
                1..6   | 2   | [[1], [3], [5]]
                1..10  | 5   | [[1], [6]]
                1..100 | 30  | [[1], [31], [61], [91]]
        }
    }

Using data-driven tests in Spock I managed to write almost 40 test cases in no time, succinctly describing all requirements. I hope these are clear to you, even if you haven't seen this syntax before. I already assumed the existence of handy factory methods:

    public class CustomCollectors {

        public static <T> Collector<T, ?, List<List<T>>> sliding(int size) {
            return new SlidingCollector<>(size, 1);
        }

        public static <T> Collector<T, ?, List<List<T>>> sliding(int size, int step) {
            return new SlidingCollector<>(size, step);
        }
    }

The fact that collectors receive items one after another makes our job harder. Of course, first collecting the whole list and sliding over it would have been easier, but sort of wasteful. Let's build the result iteratively.
I am not even pretending this task can be parallelized in general, so I'll leave combiner() unimplemented:

    import java.util.*;
    import java.util.function.*;
    import java.util.stream.Collector;

    import static java.lang.Math.max;
    import static java.util.stream.Collectors.toList;

    public class SlidingCollector<T> implements Collector<T, List<List<T>>, List<List<T>>> {

        private final int size;
        private final int step;
        private final int window;
        private final Queue<T> buffer = new ArrayDeque<>();
        private int totalIn = 0;

        public SlidingCollector(int size, int step) {
            this.size = size;
            this.step = step;
            this.window = max(size, step);
        }

        @Override
        public Supplier<List<List<T>>> supplier() {
            return ArrayList::new;
        }

        @Override
        public BiConsumer<List<List<T>>, T> accumulator() {
            return (lists, t) -> {
                buffer.offer(t);
                ++totalIn;
                // a full window is available: emit it and slide forward
                if (buffer.size() == window) {
                    dumpCurrent(lists);
                    shiftBy(step);
                }
            };
        }

        @Override
        public Function<List<List<T>>, List<List<T>>> finisher() {
            return lists -> {
                // emit the trailing, possibly shorter, window if one is still expected
                if (!buffer.isEmpty()) {
                    final int totalOut = estimateTotalOut();
                    if (totalOut > lists.size()) {
                        dumpCurrent(lists);
                    }
                }
                return lists;
            };
        }

        private int estimateTotalOut() {
            return max(0, (totalIn + step - size - 1) / step) + 1;
        }

        private void dumpCurrent(List<List<T>> lists) {
            final List<T> batch = buffer.stream().limit(size).collect(toList());
            lists.add(batch);
        }

        private void shiftBy(int by) {
            for (int i = 0; i < by; i++) {
                buffer.remove();
            }
        }

        @Override
        public BinaryOperator<List<List<T>>> combiner() {
            return (l1, l2) -> {
                throw new UnsupportedOperationException("Combining not possible");
            };
        }

        @Override
        public Set<Characteristics> characteristics() {
            return EnumSet.noneOf(Characteristics.class);
        }
    }

I spent quite some time writing this implementation, especially a correct finisher(), so don't be frightened. The crucial part is a buffer that collects items until it can form one sliding window. Then the "oldest" items are discarded and the window slides forward by step. I am not particularly happy with this implementation, but the tests are passing. sliding(N) (synonym to sliding(N, 1)) will allow calculating a moving average of N items. sliding(N, N) splits input into batches of size N. sliding(1, N) takes every N-th element (samples). I hope you'll find this collector useful, enjoy!
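For comparison, the simpler but more wasteful alternative mentioned above (collect the whole list first, then slide over it) can be sketched as follows. This is not the article's collector: the class name and the list-based signature are assumptions made for illustration, and the stopping rule was chosen to reproduce the behaviour described by the Spock tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical list-based sketch of the sliding-window semantics; not the SlidingCollector itself.
final class SlidingSketch {

    static <T> List<List<T>> sliding(List<T> input, int size, int step) {
        if (size <= 0 || step <= 0) {
            throw new IllegalArgumentException("size and step must be positive");
        }
        List<List<T>> windows = new ArrayList<>();
        for (int start = 0; start < input.size(); start += step) {
            int end = Math.min(start + size, input.size());
            windows.add(new ArrayList<>(input.subList(start, end)));
            // once a window reaches the end of the input, later windows would only repeat its tail
            if (start + size >= input.size()) {
                break;
            }
        }
        return windows;
    }

    public static void main(String[] args) {
        List<Integer> oneToSeven = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(sliding(oneToSeven, 3, 1)); // [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]]
        System.out.println(sliding(oneToSeven, 3, 3)); // [[1, 2, 3], [4, 5, 6], [7]]
        System.out.println(sliding(oneToSeven, 1, 3)); // [[1], [4], [7]]
    }
}
```

The three calls correspond to the moving-window, batching and sampling uses listed above; the collector version reaches the same results without materialising the input first.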

July 18, 2014

by Tomasz Nurkiewicz

· 22,004 Views · 2 Likes

Retrieving JMX information programmatically

Retrieving JMX information for a Java process is very easy when using a tool such as JConsole or JVisualVM. These provide an interface that allows viewing of information such as CPU usage, memory usage, active threads and more. This blog post gives an example of how to retrieve such information programmatically. In order to retrieve JMX information from a Java application, the target application must be configured to expose JMX information. As an example, we shall be retrieving CPU and memory usage from a standalone Mule instance. In Mule, the JMX agent may be configured from a Mule configuration file. Through this, one may set the address that a JMX client can use to retrieve information. The following Java code polls the JMX agent to retrieve memory and CPU usage, and also shows how to remotely invoke the garbage collector:

    // create a JMX connection to Mule's JMX agent
    JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:1098/server");
    JMXConnector jmxc = JMXConnectorFactory.connect(url, null);
    jmxc.connect();

    // objects that will hold the memory and operating system MBean values exposed by JMX
    Object memoryMbean = null;
    Object osMbean = null;
    long tempMemory = 0;
    CompositeData cd = null;

    // read the process CPU time (in nanoseconds) before the test
    Object cpuTimeBefore = jmxc.getMBeanServerConnection().getAttribute(new ObjectName("java.lang:type=OperatingSystem"), "ProcessCpuTime");
    long cpuBefore = Long.parseLong(cpuTimeBefore.toString());

    // call the garbage collector before the test using the Memory MBean
    jmxc.getMBeanServerConnection().invoke(new ObjectName("java.lang:type=Memory"), "gc", null, null);

    // loop to get values every second (optional); samplesCount is the number of polls, defined by the caller
    for (int i = 0; i < samplesCount; i++) {
        // get the HeapMemoryUsage attribute of the Memory MBean
        memoryMbean = jmxc.getMBeanServerConnection().getAttribute(new ObjectName("java.lang:type=Memory"), "HeapMemoryUsage");
        cd = (CompositeData) memoryMbean;
        // get the ProcessCpuTime attribute of the OperatingSystem MBean
        osMbean = jmxc.getMBeanServerConnection().getAttribute(new ObjectName("java.lang:type=OperatingSystem"), "ProcessCpuTime");
        // print memory and CPU usage
        System.out.println("Used memory: " + " " + cd.get("used") + " Used cpu: " + osMbean);
        tempMemory = tempMemory + Long.parseLong(cd.get("used").toString());
        Thread.sleep(1000); // delay for one second
    }

    // get the CPU time from the last poll
    long cpuAfter = Long.parseLong(osMbean.toString());
    // CPU time consumed between our first and last JMX poll
    long cpuDiff = cpuAfter - cpuBefore;
    // print CPU time in milliseconds (ProcessCpuTime is reported in nanoseconds)
    System.out.println("Cpu diff in milli seconds: " + cpuDiff / 1000000);
    // print average memory usage
    System.out.println("average memory usage is: " + tempMemory / samplesCount);

The above example prints:

    ...
    Used memory: 23376624 Used cpu: 38060000000
    Used memory: 24020624 Used cpu: 38080000000
    Used memory: 24621920 Used cpu: 38090000000
    Cpu diff in milli seconds: 4230
    average memory usage is: 28028204

When the JMX agent is not enabled for a Java process, it is also possible to retrieve information by looking the process up by its id, for example.
The following code shows how to do this using the Sigar API:

    // create a Sigar object
    Sigar sigar = new Sigar();
    for (int i = 0; i < 100; i++) {
        ProcessFinder find = new ProcessFinder(sigar);
        // get the list of current processes, and optionally query the list to choose which process to monitor
        long[] pidList = sigar.getProcList();
        // assuming we know the process id, we may query the process finder
        long pid = find.findSingleProcess("Pid.Pid.eq=54730");
        // get memory info for the process id
        ProcMem memory = new ProcMem();
        memory.gather(sigar, pid);
        // get CPU info for the process id
        ProcCpu cpu = new ProcCpu();
        cpu.gather(sigar, pid);
        // print the memory used by the process id
        System.out.println("Current memory used: " + Long.toString(memory.getSize()));
        // print all memory info
        System.out.println(memory.toMap());
        // print all CPU info
        System.out.println(cpu.toMap());
        Thread.sleep(1000);
    }

This is displayed when running the above example:

    Current memory used: 3257659392
    {Resident=258789376, PageFaults=109613, Size=3257659392}
    {User=34973, LastTime=1404467787774, Percent=0.0, StartTime=1404467383826, Total=38121, Sys=3148}
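To try the JMX part of this article without a remote agent, the same Memory MBean can be queried on the JVM's own platform MBean server. The sketch below is an assumption-laden stand-in for the remote example: it reads the current process rather than a Mule instance, and the class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

// Hypothetical local variant: queries this JVM's own MBean server instead of a remote JMX agent.
final class LocalJmxSketch {

    /** Returns the "used" field of the Memory MBean's HeapMemoryUsage attribute, in bytes. */
    static long heapUsedBytes() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            CompositeData usage = (CompositeData) server.getAttribute(
                    new ObjectName("java.lang:type=Memory"), "HeapMemoryUsage");
            return (Long) usage.get("used");
        } catch (Exception e) {
            throw new IllegalStateException("Could not read HeapMemoryUsage", e);
        }
    }

    /** Invokes the "gc" operation of the Memory MBean, as the remote example does. */
    static void requestGc() {
        try {
            ManagementFactory.getPlatformMBeanServer()
                    .invoke(new ObjectName("java.lang:type=Memory"), "gc", null, null);
        } catch (Exception e) {
            throw new IllegalStateException("Could not invoke gc", e);
        }
    }

    public static void main(String[] args) {
        requestGc();
        System.out.println("Used memory: " + heapUsedBytes());
    }
}
```

The object names and attribute names are the same ones used against the remote agent, so swapping the local MBean server for an MBeanServerConnection should be the only change needed to go remote.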

July 16, 2014

by Gabriel Dimech

· 31,537 Views

The Observer Pattern in Java

Get an overview of the Java observer pattern using an inventory example.

July 16, 2014

by Roohi Agrawala

· 113,347 Views · 27 Likes

The Java Origins of Angular JS: Angular vs JSF vs GWT

Get familiar with the Angular JS origin story.

July 15, 2014

by Vasco Cavalheiro

· 86,004 Views · 5 Likes

Spring Security Run-As example using annotations and namespace configuration

Spring Security offers an authentication replacement feature, often referred to as Run-As, that can replace the current user's authentication (and thus permissions) during a single secured object invocation. Using this feature makes sense when a backend system invoked during request processing requires different privileges than the current application. For example, an application might want to expose a financial transaction log to the currently logged in user, but the backend system that provides it only permits this action to the members of a special "auditor" role. The application cannot simply assign this role to the user, as that would potentially permit them to execute other restricted actions. Instead, the user can be given this right exclusively for viewing their transaction log.

Only two classes are used to implement this feature. Instances of RunAsManager are tasked with producing the actual replacement authentication tokens. A sensible default implementation is already provided by Spring Security. As with other types of authentication, it is also necessary to register an instance of an appropriate AuthenticationProvider. Tokens produced by runAsManager are signed with the provided key (my_run_as_key in this example) and are later checked against the same key by runAsAuthenticationProvider, in order to mitigate the risk of fake tokens being provided. These keys can have any value, but need to be the same in both objects. Otherwise, runAsAuthenticationProvider will reject the produced tokens as invalid.

If an instance is registered, RunAsManager will be invoked by AbstractSecurityInterceptor for every intercepted object invocation for which the user has already been given access. If RunAsManager returns a token, this token will be used instead of the original one for the duration of the invocation, thus granting the user different privileges. There are two key points here.
In order for the authentication replacement feature to do anything, the call has to actually be secured (and thus intercepted), and the user has to already have been granted access. To register a RunAsManager instance with the method security interceptor, the global-method-security element needs to reference it (through its run-as-manager-ref attribute) with secured annotations enabled. Now, all methods secured by the @Secured annotation will be able to trigger RunAsManager.

One important point here is that global-method-security will only work in the Spring context in which it is defined. In Spring MVC applications, there usually are two Spring contexts: the parent context, attached to ContextLoaderListener, and the child context, attached to DispatcherServlet. To secure Controller methods in this way, global-method-security must be added to DispatcherServlet's context. To secure methods in beans not in this context, global-method-security should also be added to ContextLoaderListener's context. Otherwise, security annotations will be ignored.

The default implementation of RunAsManager (RunAsManagerImpl) will inspect the secured object's configuration and, if it finds any attributes prefixed with RUN_AS_, it will create a token identical to the original, with the addition of one new GrantedAuthority per RUN_AS_ attribute found. The new GrantedAuthority will be a role (prefixed by ROLE_ by default) named like the found attribute without the RUN_AS_ prefix. So, if a user with a role ROLE_REGISTERED_USER invokes a method annotated with @Secured({"ROLE_REGISTERED_USER","RUN_AS_AUDITOR"}), e.g.

    @Controller
    public class TransactionLogController {

        // Authorities needed for method access and authorities added by RunAsManager (prefixed with RUN_AS_)
        @Secured({"ROLE_REGISTERED_USER","RUN_AS_AUDITOR"})
        @RequestMapping(value = "/transactions", method = RequestMethod.GET) // Spring MVC configuration. Not related to security
        @ResponseBody // Spring MVC configuration. Not related to security
        public List getTransactionLog(...) {
            ... // Invoke something in the backend requiring ROLE_AUDITOR
        }

        ... // User does not have ROLE_AUDITOR here
    }

the resulting token created by RunAsManagerImpl will be granted ROLE_REGISTERED_USER and ROLE_AUDITOR. Thus, the user will also be allowed actions normally reserved for ROLE_AUDITOR members during the current invocation, permitting them, in this case, to access the transaction log.

To enable runAsAuthenticationProvider, register it as usual, alongside the other authentication providers used by the application. This is all that is necessary to have the default implementation activated. Still, this setting will not work for methods secured by @PreAuthorize and @PostAuthorize annotations, as their configuration attributes are evaluated differently (they are SpEL expressions and not a simple list of required authorities like with @Secured) and will not be recognized by RunAsManagerImpl. For this scenario to work, a custom RunAsManager implementation is required, as, at least at the time of writing, no applicable implementation is provided by Spring.
A custom RunAsManager implementation for use with @PreAuthorize/@PostAuthorize

A convenient implementation relying on a custom annotation is provided below:

    public class AnnotationDrivenRunAsManager extends RunAsManagerImpl {

        @Override
        public Authentication buildRunAs(Authentication authentication, Object object, Collection<ConfigAttribute> attributes) {
            // fall back to the default RUN_AS_ handling when the method carries no @RunAsRole annotation
            if (!(object instanceof ReflectiveMethodInvocation)
                    || ((ReflectiveMethodInvocation) object).getMethod().getAnnotation(RunAsRole.class) == null) {
                return super.buildRunAs(authentication, object, attributes);
            }

            String roleName = ((ReflectiveMethodInvocation) object).getMethod().getAnnotation(RunAsRole.class).value();

            if (roleName == null || roleName.isEmpty()) {
                return null;
            }

            GrantedAuthority runAsAuthority = new SimpleGrantedAuthority(roleName);
            List<GrantedAuthority> newAuthorities = new ArrayList<GrantedAuthority>();
            // Add existing authorities
            newAuthorities.addAll(authentication.getAuthorities());
            // Add the new run-as authority
            newAuthorities.add(runAsAuthority);

            return new RunAsUserToken(getKey(), authentication.getPrincipal(), authentication.getCredentials(),
                    newAuthorities, authentication.getClass());
        }
    }

This implementation will look for a custom @RunAsRole annotation on a protected method (e.g. @RunAsRole("ROLE_AUDITOR")) and, if found, will add the given authority (ROLE_AUDITOR in this case) to the list of granted authorities. RunAsRole itself is just a simple custom annotation:

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface RunAsRole {
        String value();
    }

This new implementation would be instantiated in the same way as before and registered in a similar fashion, this time with pre-post-annotations enabled and an expression-handler configured. The expression-handler is always required for pre-post-annotations to work. It is a part of the standard Spring Security configuration, and not related to the topic described here. Both pre-post-annotations and secured-annotations can be enabled at the same time, but should never be used in the same class.
The protected controller method from above could now look like this:

    @Controller
    public class TransactionLogController {

        @PreAuthorize("hasRole('ROLE_REGISTERED_USER')") // Authority needed to access the method
        @RunAsRole("ROLE_AUDITOR") // Authority added by RunAsManager
        @RequestMapping(value = "/transactions", method = RequestMethod.GET) // Spring MVC configuration. Not related to security
        @ResponseBody // Spring MVC configuration. Not related to security
        public List getTransactionLog(...) {
            ... // Invoke something in the backend requiring ROLE_AUDITOR
        }

        ... // User does not have ROLE_AUDITOR here
    }
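The RUN_AS_ handling that the article attributes to RunAsManagerImpl can be illustrated without any Spring dependency. The sketch below models authorities and configuration attributes as plain strings, which is a simplification; the class and method names are hypothetical, and it is not Spring Security's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical, Spring-free model of the RUN_AS_ -> ROLE_ mapping described in the article.
final class RunAsSketch {

    private static final String RUN_AS_PREFIX = "RUN_AS_";
    private static final String ROLE_PREFIX = "ROLE_";

    /** Returns the user's authorities plus one ROLE_ authority per RUN_AS_ attribute found. */
    static List<String> effectiveAuthorities(Collection<String> current, Collection<String> attributes) {
        List<String> result = new ArrayList<>(current);
        for (String attribute : attributes) {
            if (attribute.startsWith(RUN_AS_PREFIX)) {
                // RUN_AS_AUDITOR becomes ROLE_AUDITOR
                result.add(ROLE_PREFIX + attribute.substring(RUN_AS_PREFIX.length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> user = Arrays.asList("ROLE_REGISTERED_USER");
        List<String> secured = Arrays.asList("ROLE_REGISTERED_USER", "RUN_AS_AUDITOR");
        System.out.println(effectiveAuthorities(user, secured)); // [ROLE_REGISTERED_USER, ROLE_AUDITOR]
    }
}
```

In the real framework the extra authority lives only in the replacement token, so it disappears again once the secured invocation returns.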

July 7, 2014

by Bojan Tomić

· 23,115 Views · 1 Like

Dynamically Create CSS Classes With SASS

There are many advantages to using CSS pre-processors like SASS. Some of its features let you write less CSS by using inheritance and functions to reuse the same code across your different CSS classes and IDs. To learn more about getting started with SASS you can refer to a previous article, Getting started with SASS.

One of my favourite features of SASS is the ability to use loops to dynamically create your CSS classes. A good example of this is when you want to make a set of classes for changing the text colours and background colours of elements. Normally you would have to write CSS like this:

    .red-background { background: #FF0000; }
    .red-color { color: #FF0000; }
    .blue-background { background: #001EFF; }
    .blue-color { color: #001EFF; }
    .green-background { background: #00FF00; }
    .green-color { color: #00FF00; }
    .yellow-background { background: #F6FF00; }
    .yellow-color { color: #F6FF00; }

If you want to add additional colours to this later, you will have to remember to write both background and colour classes. With SASS we can create a list of our colours and then loop through these to create the CSS classes. To create a list in SASS, all you have to do is create a comma-separated list of name/value pairs like the following:

    $colours: "red" #FF0000, "blue" #001EFF, "green" #00FF00, "yellow" #F6FF00;

Using the @each keyword in SASS we can loop through each of the colours and then use the nth() function to get the name and the value of each colour to dynamically create the classes in our CSS. The following each loop will generate exactly the same colour classes as above with only a few lines of code:

    @each $i in $colours {
        .#{nth($i, 1)}-background { background: nth($i, 2); }
        .#{nth($i, 1)}-color { color: nth($i, 2); }
    }

July 4, 2014

by Paul Underwood

· 16,353 Views

Spring Integration Java DSL sample - Further Simplification With JMS Namespace Factories

In an earlier blog entry I had touched on a fictitious Rube Goldberg flow for capitalizing a string through a complicated series of steps. The premise of the article was to introduce Spring Integration Java DSL as an alternative to defining integration flows through XML configuration files. I learned a few new things after writing that blog entry, thanks to Artem Bilan, and wanted to document those learnings here.

So, first my original sample. Here I have the following flow:

1. Take in a message of this type - "hello from spring integ"
2. Split it up into individual words (hello, from, spring, integ)
3. Send each word to an ActiveMQ queue
4. Pick up the word fragments from the queue and capitalize each word
5. Place the response back into a response queue
6. Pick up the message, re-sequence based on the original sequence of the words
7. Aggregate back into a sentence ("HELLO FROM SPRING INTEG") and
8. Return the sentence back to the calling application.

EchoFlowOutbound.java:

    @Bean
    public DirectChannel sequenceChannel() {
        return new DirectChannel();
    }

    @Bean
    public DirectChannel requestChannel() {
        return new DirectChannel();
    }

    @Bean
    public IntegrationFlow toOutboundQueueFlow() {
        return IntegrationFlows.from(requestChannel())
                .split(s -> s.applySequence(true).get().getT2().setDelimiters("\\s"))
                .handle(jmsOutboundGateway())
                .get();
    }

    @Bean
    public IntegrationFlow flowOnReturnOfMessage() {
        return IntegrationFlows.from(sequenceChannel())
                .resequence()
                .aggregate(aggregate ->
                        aggregate.outputProcessor(g ->
                                Joiner.on(" ").join(g.getMessages()
                                        .stream()
                                        .map(m -> (String) m.getPayload()).collect(toList())))
                        , null)
                .get();
    }

    @Bean
    public JmsOutboundGateway jmsOutboundGateway() {
        JmsOutboundGateway jmsOutboundGateway = new JmsOutboundGateway();
        jmsOutboundGateway.setConnectionFactory(this.connectionFactory);
        jmsOutboundGateway.setRequestDestinationName("amq.outbound");
        jmsOutboundGateway.setReplyChannel(sequenceChannel());
        return jmsOutboundGateway;
    }

It turns out, based on Artem Bilan's feedback, that a few things can be optimized here. First, notice how I have explicitly defined two direct channels: "requestChannel" for starting the flow that takes in the string message, and "sequenceChannel" to handle the message once it returns from the JMS message queue. These can actually be removed entirely and the flow made a little more concise this way:

    @Bean
    public IntegrationFlow toOutboundQueueFlow() {
        return IntegrationFlows.from("requestChannel")
                .split(s -> s.applySequence(true).get().getT2().setDelimiters("\\s"))
                .handle(jmsOutboundGateway())
                .resequence()
                .aggregate(aggregate ->
                        aggregate.outputProcessor(g ->
                                Joiner.on(" ").join(g.getMessages()
                                        .stream()
                                        .map(m -> (String) m.getPayload()).collect(toList())))
                        , null)
                .get();
    }

    @Bean
    public JmsOutboundGateway jmsOutboundGateway() {
        JmsOutboundGateway jmsOutboundGateway = new JmsOutboundGateway();
        jmsOutboundGateway.setConnectionFactory(this.connectionFactory);
        jmsOutboundGateway.setRequestDestinationName("amq.outbound");
        return jmsOutboundGateway;
    }

"requestChannel" is now being implicitly created just by declaring a name for it. The sequence channel is more interesting. Quoting Artem Bilan: "do not specify outputChannel for AbstractReplyProducingMessageHandler and rely on DSL". What it means is that here jmsOutboundGateway is an AbstractReplyProducingMessageHandler and its reply channel is implicitly derived by the DSL. Further, the two methods which were earlier handling the flows for sending out the message to the queue and then continuing once the message is back are collapsed into one. And IMHO it does read a little better because of this change.
The second good change, and the topic of this article, is the introduction of the Jms namespace factories. When I had written the previous blog article, the DSL had support for defining the AMQ inbound/outbound adapter/gateway; now there is support for Jms based inbound/outbound adapters/gateways also. This simplifies the flow even further, and the flow now looks like this:

    @Bean
    public IntegrationFlow toOutboundQueueFlow() {
        return IntegrationFlows.from("requestChannel")
                .split(s -> s.applySequence(true).get().getT2().setDelimiters("\\s"))
                .handle(Jms.outboundGateway(connectionFactory)
                        .requestDestination("amq.outbound"))
                .resequence()
                .aggregate(aggregate ->
                        aggregate.outputProcessor(g ->
                                Joiner.on(" ").join(g.getMessages()
                                        .stream()
                                        .map(m -> (String) m.getPayload()).collect(toList())))
                        , null)
                .get();
    }

The inbound Jms part of the flow also simplifies to the following:

    @Bean
    public IntegrationFlow inboundFlow() {
        return IntegrationFlows.from(Jms.inboundGateway(connectionFactory)
                        .destination("amq.outbound"))
                .transform((String s) -> s.toUpperCase())
                .get();
    }

Thus, to conclude, Spring Integration Java DSL is an exciting new way to concisely configure Spring Integration flows. It is already very impressive in how it simplifies the readability of flows, and the introduction of the Jms namespace factories takes it even further for JMS based flows.
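Stripped of queues and channels, the end-to-end effect of the flow (split on whitespace, capitalize each word, aggregate in the original order) can be sketched with plain Java streams. This is only a model of what the flow computes; it uses no Spring Integration API, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical in-process model of the split -> capitalize -> aggregate flow; no messaging involved.
final class EchoFlowSketch {

    static String capitalize(String sentence) {
        return Arrays.stream(sentence.split("\\s"))  // split into individual words
                .map(String::toUpperCase)            // what the inbound flow's transform does per word
                .collect(Collectors.joining(" "));   // aggregate back into a sentence, in encounter order
    }

    public static void main(String[] args) {
        System.out.println(capitalize("hello from spring integ")); // HELLO FROM SPRING INTEG
    }
}
```

A sequential stream keeps words in order for free, which is the job that applySequence(true) and resequence() perform in the messaging version, where replies may come back from the queue out of order.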

July 2, 2014

by Biju Kunjummen

· 17,890 Views

Reporting Back from MongoDB World 2014, NYC, Planet JSON

Closely approaching the one year mark of when I first joined MongoLab (and the MongoDB community), I had the pleasure of attending the inaugural MongoDB World conference put together by the incredible MongoDB team. Second only to the excitement around major MongoDB feature announcements was the collective disbelief that this was MongoDB’s first multi-day conference ever. A big congratulations to all those that worked hard to put on such a massive (did you see the Intrepid!?) event. All this planning would have been for naught if MongoDB leaders and engineers failed to deliver announcements and features that would meet and exceed expectations. From major public cloud announcements to the reveal of document-level locking in version 2.8, developers and conference goers had plenty to be excited about. There was a lot to digest from the conference… we’ll cover the major highlights in case you missed them.

Big announcements in public cloud

Our time at the MongoLab booth yielded many high-quality conversations, predominantly those about offloading previously internal processes and workloads to the public cloud. It was remarkable to see and hear so many enterprise teams with the exact same message: the public cloud is the future, and the future is now. It’s no surprise then that MongoDB, Inc. released not one, but two press releases around MongoDB solutions for the public cloud.

Fully-managed MongoDB on the Microsoft Azure Store

Nearly one year ago, MongoDB, Inc. chose to partner with the MongoLab team to build a production-ready MongoDB solution for developers on Microsoft Azure. On the first day of World, MongoDB, Inc. announced the product of our collaboration – a fully-managed highly available MongoDB-as-a-Service Add-On offering on the Microsoft Azure Store. This new service runs MongoDB Enterprise and offers replication, monitoring and support from MongoDB, Inc.
It’s also backed by MongoDB Management Service (MMS), allowing for point-in-time recovery of MongoDB deployments. Now, teams without the expertise or resources to manage their MongoDB deployment(s) can outsource all the database operations (monitoring and alerting, backups, performance tuning, etc.) to both MongoLab and MongoDB’s expert support teams. You can check out the MongoDB add-on in the Azure Store: https://azure.microsoft.com/en-us/gallery/store/mongodb/mongodb-inc/

MongoDB solutions on Google Cloud Platform

MongoDB, Inc. also announced new resources to help Google Cloud Platform customers deploy MongoDB on Google Compute Engine. These resources include a “Click to Deploy” feature and a MongoDB on Google Compute Engine Solutions paper covering MongoDB best practices. If you are looking for a fully managed solution, with automated provisioning, backups, integrated monitoring and alerting, along with expert support, MongoLab recently announced the arrival of production-ready replica sets on Google.

Product Roadmap – MongoDB version 2.8

On the second day of MongoDB World, Eliot Horowitz, MongoDB, Inc. CTO & Co-founder, took center stage and announced two huge changes to the MongoDB core project: document-level locking and pluggable storage engines. These features not only reflect improvements to the core project, but also signal to the community that the MongoDB team is listening to its users and is capable of delivering the software needed to power the workloads of tomorrow.

Document-level locking

The slides from Eliot’s keynote point to a current obstacle in MongoDB that limits overall scalability: database-level locking. With database-level locking, any write operation holds the database’s write lock and prevents subsequent writes to that database from executing until the operation holding the lock completes.
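To picture the difference in lock granularity, here is a toy Python model. It is only a sketch under stated assumptions: ToyStore, its granularity argument and the would_block helper are hypothetical names invented for illustration, and nothing here reflects how MongoDB implements locking internally.

```python
import threading
from collections import defaultdict


class ToyStore:
    """Toy in-memory store (hypothetical, not MongoDB code).

    granularity is "database" (one lock for every write) or
    "document" (one lock per document id).
    """

    def __init__(self, granularity):
        if granularity not in ("database", "document"):
            raise ValueError("granularity must be 'database' or 'document'")
        self.granularity = granularity
        self.docs = {}
        self._db_lock = threading.Lock()
        self._doc_locks = defaultdict(threading.Lock)
        # Guards creation of per-document locks.
        self._registry_lock = threading.Lock()

    def _lock_for(self, doc_id):
        """Return the lock a write to doc_id has to acquire."""
        if self.granularity == "database":
            return self._db_lock
        with self._registry_lock:
            return self._doc_locks[doc_id]

    def write(self, doc_id, value):
        """Store value under doc_id while holding the relevant lock."""
        with self._lock_for(doc_id):
            self.docs[doc_id] = value

    def would_block(self, held_doc_id, other_doc_id):
        """True if a write to other_doc_id must wait for one to held_doc_id."""
        return self._lock_for(held_doc_id) is self._lock_for(other_doc_id)


db_level = ToyStore("database")
doc_level = ToyStore("document")

print(db_level.would_block("a", "b"))   # True: every write shares one lock
print(doc_level.would_block("a", "b"))  # False: different documents, different locks
print(doc_level.would_block("a", "a"))  # True: same document still serializes
```

In this model, writers to different documents contend only under the "database" setting, which is the contention that finer-grained locking is meant to remove.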
Eliot’s announcement of document-level locking moves write lock contention from the database level to the document level (a document being the MongoDB equivalent of a SQL “record”). This change will allow users to achieve much higher write throughput (we saw a 10x performance improvement in the live demo) across their MongoDB deployments, improving write scalability. If you’d like to try out document-level locking, the MongoDB team has already pushed the feature to the master branch on GitHub. It should be used only for experimentation, not run in production.

Pluggable storage engine

As MongoDB matures, feature releases like document-level locking will continue to allow developers to build robust systems on top of MongoDB. But as the number of use cases grows, tooling tailored to specific use cases may prove extremely beneficial. For example, if Company X decides to use MongoDB to warehouse some of its data, it would likely want to optimize its database for slow-moving data and storage efficiency (compression).

With the introduction of pluggable storage engines, many new possibilities are open to the community. Teams can now write their own storage engine for a particular use case, configure replica set nodes with different storage engines for specific situations, or collaborate with the open-source community to architect innovative solutions. This feature not only allows for more granular control of the database, but also encourages the MongoDB community to work together.

Takeaways: A maturing and thriving ecosystem

Roughly a year ago, MongoLab CTO Todd Dampier recapped MongoSF 2013 and spoke to the health of the MongoDB ecosystem. How far we’ve come! After attending the inaugural MongoDB World and chatting with MongoDB Masters, interns, hackathon winners, power users and those new to the community, the enthusiasm is still surging and as positive as ever. This enthusiasm is well placed.
Developers and hackers use MongoDB because so much rich data on the web is shared as JSON (think Facebook, Twitter, Google, etc.). As a result, MongoDB is the de facto database for hackathons and bootstrapped projects. Just learn the API for the site you want to mine, throw the JSON into MongoDB and query your data with the rich query language. It’s that easy.

The MongoDB ecosystem is maturing as well. Take a look at the Customer Success Stories and you’ll get a feel for the extent to which enterprises rely on MongoDB and run it in production. To further drive enterprise adoption, MongoDB, Inc.’s public cloud solutions and product roadmap features aim to help teams run MongoDB in production and give them confidence that MongoDB will continue to improve scalability and meet their growing project requirements.

Congratulations again to the MongoDB team on their big announcements and for creating such a fantastic forum at which to learn and meet fellow MongoDB users. Our team at MongoLab had a great time making new friends and talking shop; we look forward to meeting more MongoDB users soon (at a MongoDB Days near you)!

-Chris@MongoLab

July 2, 2014

by Chris Chang

· 6,501 Views