Language Resources

The Latest Languages Topics

Java - Top 5 Exception Handling Coding Practices to Avoid

The best coding practices related with Java exception handling that you may want to watch out for while doing coding for exception handling.

October 1, 2014

by Ajitesh Kumar

· 113,538 Views · 3 Likes

Java 8 Optional - Avoid Null and NullPointerException Altogether - and Keep It Pretty

There have been a couple of articles on null, NPE's and how to avoid them. They make some point, but could stress the easy, safe, beautiful aspects of Java 8's Optional. This article shows some way of dealing with optional values, without additional utility code. The old way Let's consider this code: String unsafeTypeDirName = project.getApplicationType().getTypeDirName(); System.out.println(unsafeTypeDirName); This can obviously break with NullPointerException if any term is null. A typical way of avoiding this: // safe, ugly, omission-prone if (project != null) { ApplicationType applicationType = project.getApplicationType(); if (applicationType != null) { String typeDirName = applicationType.getTypeDirName(); if (typeDirName != null) { System.out.println(typeDirName); } } } This won't explode, but is just ugly, and it's easy to avoid some null check. Java 8 Let's try with Java 8's Optional: // let's assume you will get this from your model in the future; in the meantime... Optional optionalProject = Optional.ofNullable(project); // safe, java 8, but still ugly and omission-prone if (optionalProject.isPresent()) { ApplicationType applicationType = optionalProject.get().getApplicationType(); Optional optionalApplicationType = Optional.ofNullable(applicationType); if (optionalApplicationType.isPresent()) { String typeDirName = optionalApplicationType.get().getTypeDirName(); Optional optionalTypeDirName = Optional.ofNullable(typeDirName); if (optionalTypeDirName.isPresent()) { System.out.println(optionalTypeDirName); } } As noted in a lot of posts, this isn't a lot better than null checks. Some argue that it makes your intent clear. I don't see any big difference, most null checks being pretty obvious on those kind of situations. Ok, let's use the functional interfaces and get more power from Optional: // safe, prettier Optional optionalTypeDirName = optionalProject .flatMap(project -> project.getApplicationTypeOptional()) .flatMap(applicationType -> applicationType.getTypeDirNameOptional()); optionalTypeDirName.ifPresent(typeDirName -> System.out.println(typeDirName)); flatMap() will always return an Optional, so no nulls possible here, and you avoid having to wrap/unwrap to Optional. Please note that I added *Optional() methods in the types for that. There are other ways to do it (map + flatMap to Optional::ofNullable is one). The best one: only return optional value where it makes sense: if you know the value will always be provided, make it non-optional. By the way, this advice works for old style null checks too. ifPresent() will only run the code if it's there. No default or anything. Let's just use member references to express the same in a tight way: // safe, yet prettier optionalProject .flatMap(Project::getApplicationTypeOptional) .flatMap(ApplicationType::getTypeDirNameOptional) .ifPresent(System.out::println); Or if you know that Project has an ApplicationType anyway: // safe, yet prettier optionalProject .map(Project::getApplicationType) .flatMap(ApplicationType::getTypeDirNameOptional) .ifPresent(System.out::println); Conclusion By using Optional, and never working with null, you could avoid null checks altogether. Since they aren't needed, you also avoid omitting a null check leading to NPEs. Still, make sure that values returned from legacy code (Map, ...), which can be null, are wrapped asap in Optional.

September 27, 2014

by Yannick Majoros

· 201,177 Views · 9 Likes

Optional and Objects: Null Pointer Saviours!

No one loves Null Pointer Exceptions ! Is there a way we can get rid of them ? Maybe . . . Couple of techniques have been discussed in this post Optional type (new in Java 8) Objects class (old Java 7 stuff ;-) ) Optional type in Java 8 What is it? A new type (class) introduced in Java 8 Meant to act as a ‘wrapper‘ for an object of a specific type or for scenarios where there is no object (null) In plain words, its a better substitute for handling nulls (warning: it might not be very obvious at first !) Basic Usage It’a a type (a class) – so, how do I create an instance of it? Just use three static methods in the Optional class public static Optional stringOptional(String input) { return Optional.of(input); } Plain and simple – create an Optional wrapper containing the value. Beware – will throw NPE in case the value itself is null ! public static Optional stringNullableOptional(String input) { if (!new Random().nextBoolean()) { input = null; } return Optional.ofNullable(input); } Slightly better in my personal opinion. There is no risk of an NPE here – in case of a null input, an empty Optional would be returned public static Optional emptyOptional() { return Optional.empty(); } In case you want to purposefully return an ‘empty’ value. ‘empty’ does not imply null Alright – what about consuming/using an Optional? public static void consumingOptional() { Optional wrapped = Optional.of("aString"); if (wrapped.isPresent()) { System.out.println("Got string - " + wrapped.get()); } else { System.out.println("Gotcha !"); } } A simple way is to check whether or not the Optional wrapper has an actual value (use theisPresent method) – this will make you wonder if its any better than usingif(myObj!=null) ;-) Don’t worry, I’ll explain that as well public static void consumingNullableOptional() { String input = null; if (new Random().nextBoolean()) { input = "iCanBeNull"; } Optional wrapped = Optional.ofNullable(input); System.out.println(wrapped.orElse("default")); } One can use the orElse which can be used to return a default value in case the wrapped value is null – the advantage is obvious. We get to avoid the the obvious verbosity of invoking ifPresent before extracting the actual value public static void consumingEmptyOptional() { String input = null; if (new Random().nextBoolean()) { input = "iCanBeNull"; } Optional wrapped = Optional.ofNullable(input); System.out.println(wrapped.orElseGet( () -> { return "defaultBySupplier"; } )); } I was a little confused with this. Why two separate methods for similar goals ? orElse andorElseGet could well have been overloaded (same name, different parameter) Anyway, the only obvious difference here is the parameter itself – you have the option of providing a Lambda Expression representing instance of a Supplier (a Functional Interface) How is using Optional better than regular null checks???? By and large, the major benefit of using Optional is to be able to express your intent clearly – simply returning a null from a method leaves the consumer in a sea of doubt (when the actual NPE occurs) as to whether or not it was intentional and requires further introspection into the javadocs (if any). With Optional, its crystal clear ! There are ways in which you can completely avoid NPE with Optional – as mentioned in above examples, the use of Optional.ofNullable (during Optional creation) andorElse and orElseGet (during Optional consumption) shield us from NPEs altogether Another savior! (in case you can’t use Java 8) Look at this code snippet package com.abhirockzz.wordpress.npesaviors; import java.util.Map; import java.util.Objects; public class UsingObjects { String getVal(Map aMap, String key) { return aMap.containsKey(key) ? aMap.get(key) : null; } public static void main(String[] args) { UsingObjects obj = new UsingObjects(); obj.getVal(null, "dummy"); } } What can possibly be null? The Map object The key against which the search is being executed The instance on which the method is being called When a NPE is thrown in this case, we can never be sure as to What is null? Enter The Objects class package com.abhirockzz.wordpress.npesaviors; import java.util.Map; import java.util.Objects; public class UsingObjects { String getValSafe(Map aMap, String key) { Map safeMap = Objects.requireNonNull(aMap, "Map is null"); String safeKey = Objects.requireNonNull(key, "Key is null"); return safeMap.containsKey(safeKey) ? safeMap.get(safeKey) : null; } public static void main(String[] args) { UsingObjects obj = new UsingObjects(); obj.getValSafe(null, "dummy"); } } The requireNonNull method Simply returns the value in case its not null Throws a NPE will the specified message in case the value in null Why is this better than if(myObj!=null) The stack trace which you would see will clearly have theObjects.requireNonNull method call. This, along with your custom error message will help you catch bugs faster . . much faster IMO ! You can write your user defined checks as well e.g. implementing a simple check which enforces non-emptiness import java.util.Collections; import java.util.List; import java.util.Objects; import java.util.function.Predicate; public class RandomGist { public static T requireNonEmpty(T object, Predicate predicate, String msgToCaller){ Objects.requireNonNull(object); Objects.requireNonNull(predicate); if (predicate.test(object)){ throw new IllegalArgumentException(msgToCaller); } return object; } public static void main(String[] args) { //Usage 1: an empty string (intentional) String s = ""; System.out.println(requireNonEmpty(Objects.requireNonNull(s), (s1) -> s1.isEmpty() , "My String is Empty!")); //Usage 2: an empty List (intentional) List list = Collections.emptyList(); System.out.println(requireNonEmpty(Objects.requireNonNull(list), (l) -> l.isEmpty(), "List is Empty!").size()); //Usage 3: an empty User (intentional) User user = new User(""); System.out.println(requireNonEmpty(Objects.requireNonNull(user), (u) -> u.getName().isEmpty(), "User is Empty!")); } private static class User { private String name; public User(String name){ this.name = name; } public String getName(){ return name; } } } Don’t let NPEs be a pain in the wrong place. We have more than a decent set of tools at our disposal to better handle NPEs or eradicate them altogether ! Cheers! :-)

September 25, 2014

by Abhishek Gupta

CORE

· 8,539 Views

Visualizing and Analyzing Java Dependency Graph with Gephi

Gephi comes with tools to analyse properties of a graph.

September 23, 2014

by Peter Huber

· 31,989 Views · 2 Likes

Reduce Boilerplate Code in your Java applications with Project Lombok

One of the most frequently voiced criticisms of the Java programming language is the amount of Boilerplate Code it requires. This is especially true for simple classes that should do nothing more than store a few values. You need getters and setters for these values, maybe you also need a constructor, overridingequals() and hashcode() is often required and maybe you want a more useful toString()implementation. In the end you might have 100 lines of code that could be rewritten with 10 lines of Scala or Groovy code. Java IDEs like Eclipse or IntelliJ try to reduce this problem by providing various types of code generation functionality. However, even if you do not have to write the code yourself, you always see it (and get distracted by it) if you open such a file in your IDE. Project Lombok (don't be frightened by the ugly web page) is a small Java library that can help reducing the amount of Boilerplate Code in Java Applications. Project Lombok provides a set of annotations that are processed at development time to inject code into your Java application. The injected code is immediately available in your development environment. Lets have a look at the following Eclipse Screenshot: The defined class is annotated with Lombok's @Data annotation and does not contain any more than three private fields. @Data automatically injects getters, setters (for non final fields), equals(), hashCode(),toString() and a constructor for initializing the final dateOfBirth field. As you can see the generated methods are directly available in Eclipse and shown in the Outline view. Setup To set up Lombok for your application you have to put lombok.jar to your classpath. If you are using Maven you just have to add to following dependency to your pom.xml: org.projectlombok lombok 1.14.8 provided You also need to set up Lombok in the IDE you are using: NetBeans users just have to enable the Enable Annotation Processing in Editor option in their project properties (see: NetBeans instructions). Eclipse users can install Lombok by double clicking lombok.jar and following a quick installation wizard. For IntelliJ a Lombok Plugin is available. Getting started The @Data annotation shown in the introduction is actually a shortcut for various other Lombok annotations. Sometimes @Data does too much. In this case, you can fall back to more specific Lombok annotations that give you more flexibility. Generating only getters and setters can be achieved with @Getter and @Setter: @Getter @Setter public class Person { private final LocalDate birthday; private String firstName; private String lastName; public Person(LocalDate birthday) { this.birthday = birthday; } } Note that getter methods for boolean fields are prefixed with is instead of get (e.g. isFoo() instead ofgetFoo()). If you only want to generate getters and setters for specific fields you can annotate these fields instead of the class. Generating equals(), hashCode() and toString(): @EqualsAndHashCode @ToString public class Person { ... } @EqualsAndHashCode and @ToString also have various properties that can be used to customize their behaviour: @EqualsAndHashCode(exclude = {"firstName"}) @ToString(callSuper = true, of = {"firstName", "lastName"}) public class Person { ... } Here the field firstName will not be considered by equals() and hashCode(). toString() will call super.toString() first and only consider firstName and lastName. For constructor generation multiple annotations are available: @NoArgsConstructor generates a constructor that takes no arguments (default constructor). @RequiredArgsConstructor generates a constructor with one parameter for all non-initialized final fields. @AllArgsConstructor generates a constructor with one parameter for all fields in the class. The @Data annotation is actually an often used shortcut for @ToString, @EqualsAndHashCode, @Getter,@Setter and @RequiredArgsConstructor. If you prefer immutable classes you can use @Value instead of @Data: @Value public class Person { LocalDate birthday; String firstName; String lastName; } @Value is a shortcut for @ToString, @EqualsAndHashCode, @AllArgsConstructor,@FieldDefaults(makeFinal = true, level = AccessLevel.PRIVATE) and @Getter. So, with @Value you get toString(), equals(), hashCode(), getters and a constructor with one parameter for each field. It also makes all fields private and final by default, so you do not have to addprivate or final modifiers. Looking into Lombok's experimental features Besides the well supported annotations shown so far, Lombok has a couple of experimental features that can be found on the Experimental Features page. One of these features I like in particular is the @Builder annotation, which provides an implementation of the Builder Pattern. @Builder public class Person { private final LocalDate birthday; private String firstName; private String lastName; } @Builder generates a static builder() method that returns a builder instance. This builder instance can be used to build an object of the class annotated with @Builder (here Person): Person p = Person.builder() .birthday(LocalDate.of(1980, 10, 5)) .firstName("John") .lastName("Smith") .build(); By the way, if you wonder what this LocalDate class is, you should have a look at my blog post about the Java 8 date and time API ;-) Conclusion Project Lombok injects generated methods, like getters and setters, based on annotations. It provides an easy way to significantly reduce the amount of Boilerplate code in Java applications. Be aware that there is a downside: According to reddit comments (including a comment of the project author), Lombok has to rely on various hacks to get the job done. So, there is a chance that future JDK or IDE releases will break the functionality of project Lombok. On the other hand, these comments where made 5 years ago and Project Lombok is still actively maintained. You can find the source of Project Lombok on GitHub.

September 22, 2014

by Michael Scharhag

· 25,471 Views · 1 Like

Java - Four Security Vulnerabilities Related Coding Practices to Avoid

This article represents top 4 security vulnerabilities related coding practice to avoid while you are programming with Java language. Recently, I came across few Java projects where these instances were found. Please feel free to comment/suggest if I missed to mention one or more important points. Also, sorry for the typos. Following are the key points described later in this article: Executing a dynamically generated SQL statement Directly writing an Http Parameter to Servlet output Creating an SQL PreparedStatement from dynamic string Array is stored directly Executing a Dynamically Generated SQL Statement This is most common of all. One can find mention of this vulenrability at several places. As a matter of fact, many developers are also aware of this vulnerability, although this is a different thing they end up making mistakes once in a while. In several DAO classes, the instances such as following code were found which could lead to SQL injection attacks. StringBuilder query = new StringBuilder(); query.append( "select * from user u where u.name in (" + namesString + ")" ); try { Connection connection = getConnection(); Statement statement = connection.createStatement(); resultSet = statement.executeQuery(query.toString()); } Instead of above query, one could as well make use of prepared statement such as that demonstrated in the code below. It not only makes code less vulnerable to SQL injection attacks but also makes it more efficient. StringBuilder query = new StringBuilder(); query.append( "select * from user u where u.name in (?)" ); try { Connection connection = getConnection(); PreparedStatement statement = connection.prepareCall(query.toString()); statement.setString( 1, namesString ); resultSet = statement.execute(); } Directly writing an Http Parameter to Servlet Output In Servlet classes, I found instances where the Http request parameter was written as it is, to the output stream, without any validation checks. Following code demonstrate the same: public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { String content = request.getParameter("some_param"); // // .... some code goes here // response.getWriter().print(content); } Note that above code does not persist anything. Code like above may lead to what is called reflected (or non-persistent) cross site scripting (XSS) vulnerability. Reflected XSS occur when an attacker injects browser executable code within a single HTTP response. As it goes by definition (being non-persistent), the injected attack does not get stored within the application; it manifests only users who open a maliciously crafted link or third-party web page. The attack string is included as part of the crafted URI or HTTP parameters, improperly processed by the application, and returned to the victim. You could read greater details on following OWASP page on reflect XSS Creating an SQL PreparedStatement from Dynamic Query String What it essentially means is the fact that although PreparedStatement was used, but the query was generated as a string buffer and not in the way recommended for prepared statement (parametrized). If unchecked, tainted data from a user would create a String where SQL injection could make it behave in unexpected and undesirable manner. One should rather make the query statement parametrized and, use the PreparedStatement appropriately. Take a look at following code to identify the vulnerable code. StringBuilder query = new StringBuilder(); query.append( "select * from user u where u.name in (" + namesString + ")" ); try { Connection connection = getConnection(); PreparedStatement statement = connection.prepareStatement(query.toString()); resultSet = statement.executeQuery(); } Array is Stored Directly Instances of this vulnerability, Array is stored directly, could help the attacker change the objects stored in array outside of program, and the program behave in inconsistent manner as the reference to the array passed to method is held by the caller/invoker. The solution is to make a copy within the object when it gets passed. In this manner, a subsequent modification of the collection won’t affect the array stored within the object. You could read the details on following stackoverflow page. Following code represents the vulnerability: // Note that values is a String array in the code below. // public void setValues(String[] somevalues) { this.values = somevalues; }

September 19, 2014

by Ajitesh Kumar

· 19,502 Views · 1 Like

Sublime: Configure to Open HTML Page in a Web Browser

This article presents steps that is needed to configure Sublime to open the HTML pages you are working, in your preferred web browser. As I started developing AngularJS apps with Sublime, I got stuck at the point where I have to manually go to appropriate folder consisting of HTML file and double-click to open it in browser or, go to existing browser having that page and refresh it. In both the case, it was quite a bit cumbersome. Ideally, I wanted some shortcut keys right from within Sublime which would have helped me open the file in browser. This is where I did some research and found the way out. Following are the steps (for Win platform) to configure your Sublime to open the HTML page in the web browser: Goto Tools > Build System and click on “New Build System”. It opens up a file with default command text such as { “cmd”:["make"] } Copy and paste follow command and save the file as “Chrome.sublime-build { "cmd":["PATH_TO_CHROME_OR_FIREFOX","$file"] } Close Sublime and start again. Goto Tools > Build System and select “Chrome” Write an HTML file and use following shortcut: CTRL + B . The command would open the HTML page that you are working, in a web browser. Happy coding with Sublime.

September 19, 2014

by Ajitesh Kumar

· 100,856 Views · 3 Likes

MySQL 101: Monitor Disk I/O with pt-diskstats

Originally Written by Muhammad Irfan Here on the Percona Support team we often ask customers to retrieve disk stats to monitor disk IO and to measure block devices iops and latency. There are a number of tools available to monitor IO on Linux. iostat is one of the popular tools and Percona Toolkit, which is free, contains the pt-diskstats tool for this purpose. The pt-diskstats tool is similar to iostat but it’s more interactive and contains extended information. pt-diskstats reports current disk activity and shows the statistics for the last second (which by default is 1 second) and will continue until interrupted. The pt-diskstats tool collects samples of /proc/diskstats. In this post, I will share some examples about how to monitor and check to see if the IO subsystem is performing properly or if any disks are a limiting factor – all this by using the pt-diskstats tool. pt-diskstats output consists on number of columns and in order to interpret pt-diskstats output we need to know what each column represents. rd_s tells about number of reads per second while wr_s represents number of writes per second. rd_rt and wr_rt shows average response time in milliseconds for reads & writes respectively, which is similar to iostat tool output await column but pt-diskstats shows individual response time for reads and writes at disk level. Just a note, modern iostat splits read and write latency out, but most distros don’t have the latest iostat in their systat (or equivalent) package. rd_mrg and wr_mrg are other two important columns in pt-diskstats output. *_mrg is telling us how many of the original operations the IO elevator (disk scheduler) was able to merge to reduce IOPS, so *_mrg is telling us a quite important thing by letting us know that the IO scheduler was able to consolidate many or few operations. If rd_mrg/wr_mrg is high% then the IO workload is sequential on the other hand, If rd_mrg/wr_mrg is a low% then IO workload is all random. Binary logs, redo logs (aka ib_logfile*), undo log and doublewrite buffer all need sequential writes. qtime and stime are last two columns in pt-diskstats output where qtime reflects to time spent in disk scheduler queue i.e. average queue time before sending it to physical device and on the other hand stime is average service time which is time accumulated to process the physical device request. Note, that qtime is not discriminated between reads and writes and you can check if response time is higher for qtime than it signal towards disk scheduler. Also note that service time (stime field and svctm field in in pt-diskstats & iostat output respectively) is not reliable on Linux. If you read the iostat manual you will see it is deprecated. Along with that, there are many other parameters for pt-diskstats – you can found full documentation here. Below is an example of pt-disktats in action. I used the –devices-regex option which prints only device information that matches this Perl regex. $ pt-diskstats --devices-regex=sd --interval 5 #ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg io_s qtime stime 1.1 sda 21.6 22.8 0.5 45% 1.2 29.4 275.5 4.0 1.1 0% 40.0 145.1 65% 158 297.1 155.0 2.1 1.1 sdb 15.0 21.0 0.3 33% 0.1 5.2 0.0 0.0 0.0 0% 0.0 0.0 11% 1 15.0 0.5 4.7 1.1 sdc 5.6 10.0 0.1 0% 0.0 5.2 1.9 6.0 0.0 33% 0.0 2.0 3% 0 7.5 0.4 3.6 1.1 sdd 0.0 0.0 0.0 0% 0.0 0.0 0.0 0.0 0.0 0% 0.0 0.0 0% 0 0.0 0.0 0.0 5.0 sda 17.0 14.8 0.2 64% 3.1 66.7 404.9 4.6 1.8 14% 140.9 298.5 100% 111 421.9 277.6 1.9 5.0 sdb 14.0 19.9 0.3 48% 0.1 5.5 0.4 174.0 0.1 98% 0.0 0.0 11% 0 14.4 0.9 2.4 5.0 sdc 3.6 27.1 0.1 61% 0.0 3.5 2.8 5.7 0.0 30% 0.0 2.0 3% 0 6.4 0.7 2.4 5.0 sdd 0.0 0.0 0.0 0% 0.0 0.0 0.0 0.0 0.0 0% 0.0 0.0 0% 0 0.0 0.0 0.0 These are the stats from 7200 RPM SATA disks. As you can see, the write-response time is very high and most of that is made up of IO queue time. This shows the problem exactly. The problem is that the IO subsystem is not able to handle the write workload because the amount of writes that are being performed are way beyond what it can handle. It means the disks cannot service every request concurrently. The workload would actually depend a lot on where the hot data is stored and as we can see in this particular case the workload only hits a single disk out of the 4 disks. A single 7.2K RPM disk can only do about 100 random writes per second which is not a lot considering heavy workload. It’s not particularly a hardware issue but a hardware capacity issue. The kind of workload that is present and the amount of writes that are performed per second are not something that the IO subsystem is able to handle in an efficient manner. Mostly writes are generated on this server as can be seen by the disk stats. Let me show you a second example. Here you can see read latency. rd_rt is consistently between 10ms-30ms. It depends on how fast the disks are spinning and the number of disks. To deal with it possible solutions would be to optimize queries to avoid table scans, use memcached where possible, use SSD’s as it can provide good I/O performance with high concurrency. You will find this post useful on SSD’s from our CEO, Peter Zaitsev. #ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg io_s qtime stime 1.0 sdb 33.0 29.1 0.9 0% 1.1 34.7 7.0 10.3 0.1 61% 0.0 0.4 99% 1 40.0 2.2 19.5 1.0 sdb1 0.0 0.0 0.0 0% 0.0 0.0 7.0 10.3 0.1 61% 0.0 0.4 1% 0 7.0 0.0 0.4 1.0 sdb2 33.0 29.1 0.9 0% 1.1 34.7 0.0 0.0 0.0 0% 0.0 0.0 99% 1 33.0 3.5 30.2 1.0 sdb 81.9 28.5 2.3 0% 1.1 14.0 0.0 0.0 0.0 0% 0.0 0.0 99% 1 81.9 2.0 12.0 1.0 sdb1 0.0 0.0 0.0 0% 0.0 0.0 0.0 0.0 0.0 0% 0.0 0.0 0% 0 0.0 0.0 0.0 1.0 sdb2 81.9 28.5 2.3 0% 1.1 14.0 0.0 0.0 0.0 0% 0.0 0.0 99% 1 81.9 2.0 12.0 1.0 sdb 50.0 25.7 1.3 0% 1.3 25.1 13.0 11.7 0.1 66% 0.0 0.7 99% 1 63.0 3.4 11.3 1.0 sdb1 25.0 21.3 0.5 0% 0.6 25.2 13.0 11.7 0.1 66% 0.0 0.7 46% 1 38.0 3.2 7.3 1.0 sdb2 25.0 30.1 0.7 0% 0.6 25.0 0.0 0.0 0.0 0% 0.0 0.0 56% 0 25.0 3.6 22.2 From the below diskstats output it seems that IO is saturated between both reads and writes. This can be noticed with high value for columns rd_s and wr_s. In this particular case, consider having disks in either RAID 5 (better for read only workload) or RAID 10 array is good option along with battery-backed write cache (BBWC) as single disk can really be bad for performance when you are IO bound. device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg io_s qtime stime sdb1 362.0 27.4 9.7 0% 2.7 7.5 525.2 20.2 10.3 35% 6.4 8.0 100% 0 887.2 7.0 0.9 sdb1 439.9 26.5 11.4 0% 3.4 7.7 545.7 20.8 11.1 34% 9.8 11.9 100% 0 985.6 9.6 0.8 sdb1 576.6 26.5 14.9 0% 4.5 7.8 400.2 19.9 7.8 34% 6.7 10.9 100% 0 976.8 8.6 0.8 sdb1 410.8 24.2 9.7 0% 2.9 7.1 403.1 18.3 7.2 34% 10.8 17.7 100% 0 813.9 12.5 1.0 sdb1 378.4 24.6 9.1 0% 2.7 7.3 506.1 16.5 8.2 33% 5.7 7.6 100% 0 884.4 6.6 0.9 sdb1 572.8 26.1 14.6 0% 4.8 8.4 422.6 17.2 7.1 30% 1.7 2.8 100% 0 995.4 4.7 0.8 sdb1 429.2 23.0 9.6 0% 3.2 7.4 511.9 14.5 7.2 31% 1.2 1.7 100% 0 941.2 3.6 0.9 The following example reflects write heavy activity but write-response time is very good, under 1ms, which shows disks are healthy and capable of handling high number of IOPS. #ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg io_s qtime stime 1.0 dm-0 530.8 16.0 8.3 0% 0.3 0.5 6124.0 5.1 30.7 0% 1.7 0.3 86% 2 6654.8 0.2 0.1 2.0 dm-0 633.1 16.1 10.0 0% 0.3 0.5 6173.0 6.1 36.6 0% 1.7 0.3 88% 1 6806.1 0.2 0.1 3.0 dm-0 731.8 16.0 11.5 0% 0.4 0.5 6064.2 5.8 34.1 0% 1.9 0.3 90% 2 6795.9 0.2 0.1 4.0 dm-0 711.1 16.0 11.1 0% 0.3 0.5 6448.5 5.4 34.3 0% 1.8 0.3 92% 2 7159.6 0.2 0.1 5.0 dm-0 700.1 16.0 10.9 0% 0.4 0.5 5689.4 5.8 32.2 0% 1.9 0.3 88% 0 6389.5 0.2 0.1 6.0 dm-0 774.1 16.0 12.1 0% 0.3 0.4 6409.5 5.5 34.2 0% 1.7 0.3 86% 0 7183.5 0.2 0.1 7.0 dm-0 849.6 16.0 13.3 0% 0.4 0.5 6151.2 5.4 32.3 0% 1.9 0.3 88% 3 7000.8 0.2 0.1 8.0 dm-0 664.2 16.0 10.4 0% 0.3 0.5 6349.2 5.7 35.1 0% 2.0 0.3 90% 2 7013.4 0.2 0.1 9.0 dm-0 951.0 16.0 14.9 0% 0.4 0.4 5807.0 5.3 29.9 0% 1.8 0.3 90% 3 6758.0 0.2 0.1 10.0 dm-0 742.0 16.0 11.6 0% 0.3 0.5 6461.1 5.1 32.2 0% 1.7 0.3 87% 1 7203.2 0.2 0.1 Let me show you a final example. I used –interval and –iterations parameters for pt-diskstats which tells us to wait for a number of seconds before printing the next disk stats and to limit the number of samples respectively. If you notice, you will see in 3rd iteration high latency (rd_rt, wr_rt) mostly for reads. Also, you can notice a high value for queue time (qtime) and service time (stime) where qtime is related to disk IO scheduler settings. For MySQL database servers we usually recommends noop/deadline instead of default cfq. $ pt-diskstats --interval=20 --iterations=3 #ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg io_s qtime stime 10.4 hda 11.7 4.0 0.0 0% 0.0 1.1 40.7 11.7 0.5 26% 0.1 2.1 10% 0 52.5 0.4 1.5 10.4 hda2 0.0 0.0 0.0 0% 0.0 0.0 0.4 7.0 0.0 43% 0.0 0.1 0% 0 0.4 0.0 0.1 10.4 hda3 0.0 0.0 0.0 0% 0.0 0.0 0.4 107.0 0.0 96% 0.0 0.2 0% 0 0.4 0.0 0.2 10.4 hda5 0.0 0.0 0.0 0% 0.0 0.0 0.7 20.0 0.0 80% 0.0 0.3 0% 0 0.7 0.1 0.2 10.4 hda6 0.0 0.0 0.0 0% 0.0 0.0 0.1 4.0 0.0 0% 0.0 4.0 0% 0 0.1 0.0 4.0 10.4 hda9 11.7 4.0 0.0 0% 0.0 1.1 39.2 10.7 0.4 3% 0.1 2.7 9% 0 50.9 0.5 1.8 10.4 drbd1 11.7 4.0 0.0 0% 0.0 1.1 39.1 10.7 0.4 0% 0.1 2.8 9% 0 50.8 0.5 1.7 20.0 hda 14.6 4.0 0.1 0% 0.0 1.4 39.5 12.3 0.5 26% 0.3 6.4 18% 0 54.1 2.6 2.7 20.0 hda2 0.0 0.0 0.0 0% 0.0 0.0 0.4 9.1 0.0 56% 0.0 42.0 3% 0 0.4 0.0 42.0 20.0 hda3 0.0 0.0 0.0 0% 0.0 0.0 1.5 22.3 0.0 82% 0.0 1.5 0% 0 1.5 1.2 0.3 20.0 hda5 0.0 0.0 0.0 0% 0.0 0.0 1.1 18.9 0.0 79% 0.1 21.4 11% 0 1.1 0.1 21.3 20.0 hda6 0.0 0.0 0.0 0% 0.0 0.0 0.8 10.4 0.0 62% 0.0 1.5 0% 0 0.8 1.3 0.2 20.0 hda9 14.6 4.0 0.1 0% 0.0 1.4 35.8 11.7 0.4 3% 0.2 4.9 18% 0 50.4 0.5 3.5 20.0 drbd1 14.6 4.0 0.1 0% 0.0 1.4 36.4 11.6 0.4 0% 0.2 5.1 17% 0 51.0 0.5 3.4 20.0 hda 0.9 4.0 0.0 0% 0.2 251.9 28.8 61.8 1.7 92% 4.5 13.1 31% 2 29.6 12.8 0.9 20.0 hda2 0.0 0.0 0.0 0% 0.0 0.0 0.6 8.3 0.0 52% 0.1 98.2 6% 0 0.6 48.9 49.3 20.0 hda3 0.0 0.0 0.0 0% 0.0 0.0 2.0 23.2 0.0 83% 0.0 1.4 0% 0 2.0 1.2 0.3 20.0 hda5 0.0 0.0 0.0 0% 0.0 0.0 4.9 249.4 1.2 98% 4.0 13.2 9% 0 4.9 12.9 0.3 20.0 hda6 0.0 0.0 0.0 0% 0.0 0.0 0.0 0.0 0.0 0% 0.0 0.0 0% 0 0.0 0.0 0.0 20.0 hda9 0.9 4.0 0.0 0% 0.2 251.9 21.3 24.2 0.5 32% 0.4 12.9 31% 2 22.2 10.2 9.7 20.0 drbd1 0.9 4.0 0.0 0% 0.2 251.9 30.6 17.0 0.5 0% 0.7 24.1 30% 5 31.4 21.0 9.5 You can see the busy column in pt-diskstats output which is the same as the util column in iostat – which points to utilization. Actually, pt-diskstats is quite similar to the iostat tool but pt-diskstats is more interactive and has more information. The busy percentage is only telling us for how long the IO subsystem was busy, but is not indicating capacity. So the only time you care about %busy is when it’s 100% and at the same time latency (await in iostat and rd_rt/wr_rt in diskstats output) increases over -say- 5ms. You can estimate capacity of your IO subsystem and then look at the IOPS being consumed (r/s + w/s columns). Also, the system can process more than one request in parallel (in case of RAID) so %busy can go beyond 100% in pt-diskstats output. If you need to check disk throughput, block device IOPS run the following to capture metrics from your IO subsystem and see if utilization matches other worrisome symptoms. I would suggest capturing disk stats during peak load. Output can be grouped by sample or by disk using the –group-by option. You can use the sysbench benchmark tool for this purpose to measure database server performance. You will find this link useful for sysbench tool details. $ pt-diskstats --group-by=all --iterations=7200 > /tmp/pt-diskstats.out; Conclusion: pt-diskstats is one of the finest tools from Percona Toolkit. By using this tool you can easily spot disk bottlenecks, measure the IO subsystem and identify how much IOPS your drive can handle (i.e. disk capacity).

September 19, 2014

by Peter Zaitsev

· 5,304 Views

15 Tools That Make Life Easy for Java Developers

If you use Java for programming, read on to learn about tools like Eclipse IDE, the Java Development Kit, and other must-know tools.

September 19, 2014

by Michael Georgiou

· 132,471 Views · 3 Likes

5 Error Tracking Tools Java Developers Should Know

Raygun, Stack Hunter, Sentry, Takipi and Airbrake: Modern developer tools to help you crush bugs before bugs crush your app With the Java ecosystem going forward, web applications serving growing numbers of requests and users’ demand for high performance - comes a new breed of modern development tools. A fast paced environment with rapid new deployments requires tracking errors and gaining insight to an application's behavior on a level traditional methods can’t sustain. In this post we’ve decided to gather 5 of those tools, see how they integrate with Java and find out what kind of tricks they have up their sleeves. It’s time to smash some bugs. Raygun Mindscape’s Raygun is a web based error management system that keeps track of exceptions coming from your apps. It supports various desktop, mobile and web programming languages, including Java, Scala, .NET, Python, PHP, and JavaScript. Besides that, sending errors to Raygun is possible through a REST API and a few more Providers (that’s how they call language and framework integrations) came to life thanks to developer community involvement. Key Features: Error grouping - Every occurrence of a bug is presented within one group with access to single instances of it, including its stack trace. Full text search - Error groups and all collected data is searchable. View app activity - Every action on an error group is displayed for all your team to see: status updates, comments and more. Affected users - Counts of affected users appear by each error. External integrations - Github, Bitbucket, Asana, JIRA, HipChat and many more. The Java angle: To use Raygun with Java, you’ll need to add some dependencies to your pom.xml file if you’re using Maven or add the jars manually. The second step would be to add an UncaughtExceptionHandler that would create an instance of RaygunClient and send your exceptions to it. In addition, you can also add custom data fields to your exceptions and send them together to Raygun. The full walkthrough is available here. Behind the curtain: Meet Robie Robot, the certified operator of Raygun. As in, the actual ray gun. Check it out on: https://raygun.io Sentry Started as a side-project, Sentry is an open-source web based solution that serves as a real time event logging and aggregation platform. It monitors errors and displays when, where and to whom they happen, promising to do so without relying solely on user feedback. Supported languages and frameworks include Ruby, Python, JS, Java, Django, iOS, .NET and more. Key Features: See the impact of new deployments in real time Provide support to specific users interrupted by an error Detect and thwart fraud as its attempted - notifications of unusual amounts of failures on purchases, authentication, and other sensitive areas External Integrations - GitHub, HipChat, Heroku, and many more The Java angle: Sentry’s Java client is called Raven and supports major existing logging frameworks like java.util.logging, Log4j, Log4j2 and Logback with Slf4j. An independent method to send events directly to Sentry is also available. To set up Sentry for Java with Logback for example, you’ll need to add the dependencies manually or through Maven, then add a new Sentry appender configuration and you’re good to do. Instructions are available here. Behind the curtain: Sentry was an internal project at Disqus back in 2010 to solve exception logging on a Django application by Chris Jennings and David Cramer Check it out on: https://www.getsentry.com/ Takipi Unlike most of the other tools, Takipi is far more than a stack trace prettifier. It was built with a simple objective in mind: Telling developers exactly when and why production code breaks. Whenever a new exception is thrown or a log error occurs – Takipi captures it and shows you the variable state which caused it, across methods and machines. Takipi will overlay this over the actual code which executed at the moment of error – so you can analyze the exception as if you were there when it happened. Key features: Detect – Caught/uncaught exceptions, Http and logged errors. Prioritize – How often errors happen across your cluster, if they involve new or modified code, and whether that rate is increasing. Analyze – See the actual code and variable state, even across different machines and applications. Easy to install - No code or configuration changes needed. Less than 2% overhead. The Java angle: Takipi was built for production environments in Java and Scala. The installation takes less than 1min, and includes attaching a Java agent to your JVM. Behind the curtain: Each exception type and error has a unique monster that represents it. You can find these monster here. Check it out on: http://www.takipi.com/ Airbrake Another tool that has put exception tracking on its eyesights is Rackspace’s Airbrake, taking on the mission of “No More Searching Log Files”. It provides users with a web based interface that includes a dashboard with error details and an application specific view. Supported languages include Ruby, PHP, Java, .NET, Python and even… Swift. Key Features: Detailed stack traces, grouping by error type, users and environment variables Team productivity - Filter importance errors from the noise Team collaboration - See who’s causing bugs and whose fixing them External Integrations - HipChat, GitHub, JIRA, Pivotal and over 30 more The Java angle: Airbrake officially supports only Log4j, although a Logback library is also available. Log4j2 support is currently lacking. The installation procedure is similar to Sentry, adding a few dependencies manually or through Maven, adding an appender, and you’re ready to start. Similarly, a direct way to send messages to Airbrake is also available with AirbrakeNotice and AirbrakeNotifier. More details are available here. Behind the curtain: Airbrake was acquired by Exceptional, which then got acquired by Rackspace. Check it out on: https://airbrake.io/ StackHunter Currently in beta, Stack Hunter provides a self hosted tool to track your Java exceptions. A change of scenery from the past hosted tools. Other than that, it aims to provide a similar feature set to inform developers of their exceptions and help solve them faster. Key Features: A single self hosted web interface to view all exceptions Collections of stack trace data and context including key metrics such as total exceptions, unique exceptions, users affected, & sessions affected Instant email alerts when exceptions occur Exceptions grouping by root cause The Java angle: Built specifically for Java, StackHunter runs on any servlet container running Java 6 or above. Installation includes running StackHunter on a local servlet, configuring an outgoing mail server for alerts, and configuring the application you’re wishing to log. Full instructions are available here. Behind the curtain: StackHunter is developed by Dele Taylor, who also works on Data Pipeline - a tool for transforming and migrating data in Java. Check it out on: http://stackhunter.com/ Bonus: ABRT Another approach to error tracking worth mentioning is used by ABRT, an automatic bug detection and reporting tool from the Fedora ecosystem, which is a Red Hat sponsored community project. Unlike the 5 tools we covered here, this one is intended to be used not only by app developers - but their users as well. Reporting bugs back to Red Hat with richer context that otherwise would have been harder to understand and debug. The Java angle: Support for Java exceptions is still in its proof of concept stage. A Java connector developed by Jakub Filák is available here. Behind the curtain: ABRT is an open-source project developed by Red Hat. Check it out on: https://github.com/abrt/abrt Did we miss any other tools? How do you keep track of your exceptions? Please let me know in the comments section below.

September 18, 2014

by Chen Harel

· 8,796 Views · 2 Likes

Lambdas and Side Effects

Overview Java 8 has added features such as lambdas and type inference. This makes the language less verbose and cleaner, however it comes with more side effects as you don't have to be as explicit in what you are doing. The return type of a lambda matters Java 8 infers the type of a closure. One way it does this is to look at the return type (or whether anything is returned) This can have a surprising side effect. Consider this code. ExecutorService es = Executors.newSingleThreadExecutor(); es.submit(() -> { try(Scanner scanner = new Scanner(new FileReader("file.txt"))) { String line = scanner.nextLine(); process(line); } return null; }); This code compiles fine. However, the line return null; appears redundant and you might be tempted to remove it. However if you remove the line, you get an error. Error:(12, 39) java: unreported exception java.io.FileNotFoundException; must be caught or declared to be thrown This is complaining about the use of FileReader. What has the return null got to do with catching an uncaught exception !? Type inference. ExecutorService.submit() is an overloaded method. It has two methods which take one argument. ExecutorService.submit(Runnable runnable); ExecutorService.submit(Callable callable); Both these methods take no arguments, so how does the javac compiler infer the type of the lambda? It looks at the return type. If you return null; it is aCallable however if nothing is returned, not even null, it is a Runnable. Callable and Runnable have another important difference. Callable throws checked exceptions, however Runnable doesn't allow checked exceptions to be thrown. The side effect of returning null is that you don't have to handle checked exceptions, these will be stored in the Future submit() returns. If you don't return anything, you have to handle checked exceptions. Conclusion While lambdas and type inference remove significant amounts of boiler plate code, you can find more edge cases, where the hidden details of what the compiler infers can be slightly confusing. Footnote You can be explicit about type inference with a cast. Consider this Callable calls = (Callable & Serializable) () -> { return null; } if (calls instanceof Serializable) // is true This cast has a number of side effects. Not only does the call() method return anInteger and a marker interface added, the code generated for the lambda changes i.e. it adds a writeObject() and readObject() method to support serialization of the lambda. Note: Each call site creates a new class meaning the details of this cast is visible at runtime via reflection.

September 16, 2014

by Peter Lawrey

· 7,991 Views · 1 Like

A Closer Look at the MySQL ibdata1 Disk Space Issue and Big Tables

A recurring customer issue seen by the Percona Support team involves how to make the ibdata1 file “shrink” within MySQL. I'll show you how to handle big tables.

September 16, 2014

by Peter Zaitsev

· 7,878 Views

Python 101: An Intro to Pony ORM

The Pony ORM project is another object relational mapper package for Python. They allow you to query a database using generators. They also have an online ER Diagram Editor that is supposed to help you create a model. They are also one of the only Python packages I’ve seen with a multi-licensing scheme where you can develop using a GNU license or purchase a license for non-open source work. See their website for additional details. In this article, we will spend some time learning the basics of this package. Getting Started Since this project is not included with Python, you will need to download and install it. If you have pip, then you can just do this: pip install pony Otherwise you’ll have to download the source and install it via its setup.py script. Creating the Database We will start out by creating a database to hold some music. We will need two tables: Artist and Album. Let’s get started! import datetime import pony.orm as pny database = pny.Database("sqlite", "music.sqlite", create_db=True) ######################################################################## class Artist(database.Entity): """ Pony ORM model of the Artist table """ name = pny.Required(unicode) albums = pny.Set("Album") ######################################################################## class Album(database.Entity): """ Pony ORM model of album table """ artist = pny.Required(Artist) title = pny.Required(unicode) release_date = pny.Required(datetime.date) publisher = pny.Required(unicode) media_type = pny.Required(unicode) # turn on debug mode pny.sql_debug(True) # map the models to the database # and create the tables, if they don't exist database.generate_mapping(create_tables=True) Pony ORM will create our primary key for us automatically if we don’t specify one. To create a foreign key, all you need to do is pass the model class into a different table, as we did in the Album class. Each Required field takes a Python type. Most of our fields are unicode, with one being a datatime object. Next we turn on debug mode, which will output the SQL that Pony generates when it creates the tables in the last statement. Note that if you run this code multiple times, you won’t recreate the table. Pony will check to see if the tables exist before creating them. If you run the code above, you should see something like this get generated as output: GET CONNECTION FROM THE LOCAL POOL PRAGMA foreign_keys = false BEGIN IMMEDIATE TRANSACTION CREATE TABLE "Artist" ( "id" INTEGER PRIMARY KEY AUTOINCREMENT, "name" TEXT NOT NULL ) CREATE TABLE "Album" ( "id" INTEGER PRIMARY KEY AUTOINCREMENT, "artist" INTEGER NOT NULL REFERENCES "Artist" ("id"), "title" TEXT NOT NULL, "release_date" DATE NOT NULL, "publisher" TEXT NOT NULL, "media_type" TEXT NOT NULL ) CREATE INDEX "idx_album__artist" ON "Album" ("artist") SELECT "Album"."id", "Album"."artist", "Album"."title", "Album"."release_date", "Album"."publisher", "Album"."media_type" FROM "Album" "Album" WHERE 0 = 1 SELECT "Artist"."id", "Artist"."name" FROM "Artist" "Artist" WHERE 0 = 1 COMMIT PRAGMA foreign_keys = true CLOSE CONNECTION Wasn’t that neat? Now we’re ready to learn how to add data to our database. How to Insert / Add Data to Your Tables Pony makes adding data to your tables pretty painless. Let’s take a look at how easy it is: import datetime import pony.orm as pny from models import Album, Artist #---------------------------------------------------------------------- @pny.db_session def add_data(): """""" new_artist = Artist(name=u"Newsboys") bands = [u"MXPX", u"Kutless", u"Thousand Foot Krutch"] for band in bands: artist = Artist(name=band) album = Album(artist=new_artist, title=u"Read All About It", release_date=datetime.date(1988,12,01), publisher=u"Refuge", media_type=u"CD") albums = [{"artist": new_artist, "title": "Hell is for Wimps", "release_date": datetime.date(1990,07,31), "publisher": "Sparrow", "media_type": "CD" }, {"artist": new_artist, "title": "Love Liberty Disco", "release_date": datetime.date(1999,11,16), "publisher": "Sparrow", "media_type": "CD" }, {"artist": new_artist, "title": "Thrive", "release_date": datetime.date(2002,03,26), "publisher": "Sparrow", "media_type": "CD"} ] for album in albums: a = Album(**album) if __name__ == "__main__": add_data() # use db_session as a context manager with pny.db_session: a = Artist(name="Skillet") You will note that we need to use a decorator caled db_session to work with the database. It takes care of opening a connection, committing the data and closing the connection. You can also use it as a context manager, which is demonstrated at the very end of this piece of code. Using Basic Queries to Modify Records with Pony ORM In this section, we will learn how to make some basic queries and modify a few entries in our database. ] import pony.orm as pny from models import Artist, Album with pny.db_session: band = Artist.get(name="Newsboys") print band.name for record in band.albums: print record.title # update a record band_name = Artist.get(name="Kutless") band_name.name = "Beach Boys" Here we use the db_session as a context manager. We make a query to get an artist object from the database and print its name. Then we loop over the artist’s albums that are also contained in the returned object. Finally, we change one of the artist’s names. Let’s try querying the database using a generator: result = pny.select(i.name for i in Artist) result.show() If you run this code, you should see something like the following: i.name -------------------- Newsboys MXPX Beach Boys Thousand Foot Krutch The documentation has several other examples that are worth checking out. Note that Pony also supports using SQL itself via its select_by_sql and get_by_sql methods. How to Delete Records in Pony ORM Deleting records with Pony is also pretty easy. Let’s remove one of the bands from the database: import pony.orm as pny from models import Artist with pny.db_session: band = Artist.get(name="MXPX") band.delete() Once more we use db_session to access the database and commit our changes. We use the band object’s delete method to remove the record. You will need to dig to find out if Pony supports cascading deletes where if you delete the Artist, it will also delete all the Albums that are connected to it. According to the docs, if the field is Required, then cascade is enabled. Wrapping Up Now you know the basics of using the Pony ORM package. I personally think the documentation needs a little work as you have to dig a lot to find some of the functionality that I felt should have been in the tutorials. Overall though, the documentation is still a lot better than most projects. Give it a go and see what you think! Additional Resources Pony ORM’s website Pony documentation SQLAlchemy Tutorial An Intro to peewee

September 12, 2014

by Mike Driscoll

· 8,803 Views

How to Load an Existing Email Message & Modify its Contents in Java Apps

This technical tip shows how to java developers can load and modify an existing email messages inside their java application using Aspose.Email Java API. Aspose.Email API allows developer to load any existing email message and modify its contents before saving back to the disk. One notable point is to specify the MessageFormat while loading the email message from the disk. In addition, it is important to specify the correct MailMessageSaveType while saving the message back to disk. The following sequence of steps lets you modify an existing email message: Create an instance of the MailMessage class. Load an existing message using the MailMessage class' load(), specifying the email' MessageFormat. Get the subject using the getSubject() method, modify it and set it using the MailMessage class' setSubject() method. Get the body using the getHtmlBody() method, modify it and set it using the MailMessage class' setHtmlBody() method. Create an instance of the MailAddressCollection class. Get recipients from the TO field into the MailAddressCollection object using the MailMessage class' getTo() method. Add or remove recipients using the MailAddressCollection collection's add() and remove() methods. Get recipients from the CC field into the MailAddressCollection object using the MailMessage class' getCC() method. Add or remove recipients using the MailAddressCollection collection's add() and remove() methods. Call the MailMessage class' save() method to save the file to disk in MSG format by specifying the correct MailMessageSaveType. //Adding Attachments to a New Email Message public static void main(String[] args) { // Base folder for reading and writing files String strBaseFolder = "D:\\Data\\Aspose\\resources\\"; //Initialize and Load an existing MSG file by specifying the MessageFormat MailMessage email = MailMessage.load(strBaseFolder + "anEmail.msg", MessageFormat.getMsg()); //Initialize a String variable to get the Email Subject String subject = email.getSubject(); //Append some more information to Subject subject = subject + " This text is added to the existing subject"; //Set the Email Subject email.setSubject(subject); //Initialize a String variable to get the Email's HTML Body String body = email.getHtmlBody(); //Apppend some more information to the Body variable body = body + " This text is added to the existing body"; //Set the Email Body email.setHtmlBody(body); //Initialize MailAddressCollection object MailAddressCollection contacts = new MailAddressCollection(); //Retrieve Email's TO list contacts = email.getTo(); //Check if TO list has some values if (contacts.size() > 0) { //Remove the first email address contacts.remove(0); //Add another email address to collection contacts.add("[email protected]"); } //Set the collection as Email's TO list email.setTo(contacts); //Initialize MailAddressCollection contacts = new MailAddressCollection(); //Retrieve Email's CC list contacts = email.getCC(); //Add another email address to collection contacts.add("[email protected]"); //Set the collection as Email's CC list email.setCC(contacts); //Save the Email message to disk by specifying the MessageFormat email.save(strBaseFolder + "message.msg", MailMessageSaveType.getOutlookMessageFormat()); } //Loading a Message with Load Options //To load a message with specific load options, Aspose.Email provides the MessageLoadOptions class that can be used as follow: MesageLoadOptions options = new MesageLoadOptions(); options.PrefferedTextEncoding = Encoding.getEncoding(1252); options.setMessageFormat(MessageFormat.getMsg()); MailMessage eml = MailMessage.Load("EMAIL_497563\\test3.msg", options);

September 10, 2014

by David Zondray

· 1,730 Views

How to Run HTML files in your Browser from GitHub

if you have a .html file in a github repository and want to view that page directly, you would typically download or clone the repo to your local hard drive and run it from there. there is an easier way simply navigate to the repo in your github account that contains a html file as shown below: right-click the index.html file and select copy link address. you should have a url similar to the following structure: https://github.com///blob/master/index.html enter rawgit.com as the name implies, rawgit shows serves the raw files directly from github. to use it simply use the following format: https://rawgit.com///master/index.html if you want to use it in production, you can use: https://cdn.rawgit.com///master/index.html that was easy now, wasn’t it!

September 10, 2014

by Michael Crump

· 11,294 Views

Garbage Collectors - Serial vs. Parallel vs. CMS vs. G1 (and what's new in Java 8)

The 4 Java Garbage Collectors - How the Wrong Choice Dramatically Impacts Performance The year is 2014 and there are two things that still remain a mystery to most developers - Garbage collection and understanding the opposite sex. Since I don’t know much about the latter, I thought I’d take a whack at the former, especially as this is an area that has seen some major changes and improvements with Java 8, especially with the removal of the PermGen and some new and exciting optimizations (more on this towards the end). When we speak about garbage collection, the vast majority of us know the concept and employ it in our everyday programming. Even so, there’s much about it we don’t understand, and that’s when things get painful. One of the biggest misconceptions about the JVM is that it has one garbage collector, where in fact it provides four different ones, each with its own unique advantages and disadvantages. The choice of which one to use isn’t automatic and lies on your shoulders and the differences in throughput and application pauses can be dramatic. What’s common about these four garbage collection algorithms is that they are generational, which means they split the managed heap into different segments, using the age-old assumptions that most objects in the heap are short lived and should be recycled quickly. As this too is a well-covered area, I’m going to jump directly into the different algorithms, along with their pros and their cons. 1. The Serial Collector The serial collector is the simplest one, and the one you probably won’t be using, as it’s mainly designed for single-threaded environments (e.g. 32 bit or Windows) and for small heaps. This collector freezes all application threads whenever it’s working, which disqualifies it for all intents and purposes from being used in a server environment. How to use it: You can use it by turning on the -XX:+UseSerialGC JVM argument, 2. The Parallel / Throughput collector Next off is the Parallel collector. This is the JVM’s default collector. Much like its name, its biggest advantage is that is uses multiple threads to scan through and compact the heap. The downside to the parallel collector is that it will stop application threads when performing either a minor or full GC collection. The parallel collector is best suited for apps that can tolerate application pauses and are trying to optimize for lower CPU overhead caused by the collector. 3. The CMS Collector Following up on the parallel collector is the CMS collector (“concurrent-mark-sweep”). This algorithm uses multiple threads (“concurrent”) to scan through the heap (“mark”) for unused objects that can be recycled (“sweep”). This algorithm will enter “stop the world” (STW) mode in two cases: when initializing the initial marking of roots (objects in the old generation that are reachable from thread entry points or static variables) and when the application has changed the state of the heap while the algorithm was running concurrently, forcing it to go back and do some final touches to make sure it has the right objects marked. The biggest concern when using this collector is encountering promotion failures which are instances where a race condition occurs between collecting the young and old generations. If the collector needs to promote young objects to the old generation, but hasn’t had enough time to make space clear it, it will have to do so first which will result in a full STW collection - the very thing this CMS collector was meant to prevent. To make sure this doesn’t happen you would either increase the size of the old generation (or the entire heap for that matter) or allocate more background threads to the collector for him to compete with the rate of object allocation. Another downside to this algorithm in comparison to the parallel collector is that it uses more CPU in order to provide the application with higher levels of continuous throughput, by using multiple threads to perform scanning and collection. For most long-running server applications which are adverse to application freezes, that’s usually a good trade off to make. Even so, this algorithm is not on by default. You have to specify XX:+USeParNewGC to actually enable it. If you’re willing to allocate more CPU resources to avoid application pauses this is the collector you’ll probably want to use, assuming that your heap is less than 4Gb in size. However, if it’s greater than 4GB, you’ll probably want to use the last algorithm - the G1 Collector. 4. The G1 Collector The Garbage first collector (G1) introduced in JDK 7 update 4 was designed to better support heaps larger than 4GB. The G1 collector utilizes multiple background threads to scan through the heap that it divides into regions, spanning from 1MB to 32MB (depending on the size of your heap). G1 collector is geared towards scanning those regions that contain the most garbage objects first, giving it its name (Garbage first). This collector is turned on using the –XX:+UseG1GC flag. This strategy the chance of the heap being depleted before background threads have finished scanning for unused objects, in which case the collector will have to stop the application which will result in a STW collection. The G1 also has another advantage that is that it compacts the heap on-the-go, something the CMS collector only does during full STW collections. Large heaps have been a fairly contentious area over the past few years with many developers moving away from the single JVM per machine model to more micro-service, componentized architectures with multiple JVMs per machine. This has been driven by many factors including the desire to isolate different application parts, simplifying deployment and avoiding the cost which would usually come with reloading application classes into memory (something which has actually been improved in Java 8). Even so, one of the biggest drivers to do this when it comes to the JVM stems from the desire to avoid those long “stop the world” pauses (which can take many seconds in a large collection) that occur with large heaps. This has also been accelerated by container technologies like Docker that enable you to deploy multiple apps on the same physical machine with relative ease. Java 8 and the G1 Collector Another beautiful optimization which was just out with Java 8 update 20 for is the G1 Collector String deduplication. Since strings (and their internal char[] arrays) takes much of our heap, a new optimization has been made that enables the G1 collector to identify strings which are duplicated more than once across your heap and correct them to point into the same internal char[] array, to avoid multiple copies of the same string from residing inefficiently within the heap. You can use the -XX:+UseStringDeduplicationJVM argument to try this out. Java 8 and PermGen One of the biggest changes made in Java 8 was removing the permgen part of the heap that was traditionally allocated for class meta-data, interned strings and static variables. This would traditionally require developers with applications that would load significant amount of classes (something common with apps using enterprise containers) to optimize and tune for this portion of the heap specifically. This has over the years become the source of many OutOfMemory exceptions, so having the JVM (mostly) take care if it is a very nice addition. Even so, that in itself will probably not reduce the tide of developers decoupling their apps into multiple JVMs. Each of these collectors is configured and tuned differently with a slew of toggles and switches, each with the potential to increase or decrease throughput, all based on the specific behavior of your app. We’ll delve into the key strategies of configuring each of these in our next posts.

September 10, 2014

by Chen Harel

· 55,184 Views · 7 Likes

Creating a Custom SQL Server VM Image in Azure

Recently I had the opportunity to work on a project were I needed to create a custom SQL Server image for use with Azure VMs. The process was a little more challenging than I initially anticipated. I think this is mostly because I was not familiar with the process of preparing a SQL Server image. Perhaps this isn’t much of a challenge for an experienced SQL Server DBA or IT Pro. For me, it was a great learning experience. Why a Custom SQL Server Image? The Azure VM image gallery already contains a SQL Server image. It’s very easy to create a new SQL Server VM using this image. However, doing so has a few important trade-offs to consider: Unable to fully customize the base install of SQL Server. This is a template/image after all – you get a VM configured the way the image was configured. Unable to use your own SQL Server license. If your company has an Enterprise Agreement (EA) with Microsoft, it’s likely there is already some SQL Server licenses built into that agreement. Depending on the details, it may be significantly cheaper to use the licenses from the EA instead of paying the SQL Server VM image upcharge from Azure. The Basic Steps There are 6 basic steps to creating a custom SQL Server VM image for use in Azure. Provision a new base Windows Server VM Download the SQL Server installation media Run SQL Server setup to prepare an image Configure Windows to complete the installation of SQL Server Capture the image and add it to the Azure VM image gallery Create a new VM instance using the custom SQL Server image The basic idea here is to create a base VM, customize it with a SQL Server image, capture the VM to create an image, and then provision new VMs using that captured VM image. Let’s dive into each of these in a little more detail. Note: the terminology here can be a little confusing. When referring to the VM used to create the template/image, I’ll use the term “base VM”. When referring to the VM created from the base VM, I’ll use the term “VM instance”. 1. Provision a new base Windows Server VM There are multiple ways to create a Windows Server VM in Azure. Creating a VM via the Azure management portal and PowerShell are probably the two most popular options. Be sure to check out this tutorial to learn how to do so via the portal. For the purposes of this post, I’ll do so via PowerShell. $img = Get-AzureVMImage ` | where { ( $_.PublisherName -ilike "Microsoft*" -and $_.ImageFamily -ilike "Windows Server 2012 Datacenter" ) } ` | Sort-Object -Unique -Descending -Property ImageFamily ` | sort -Descending -Property PublishDate ` | select -First(1) $vmConfig = New-AzureVMConfig -Name "sql-1" -InstanceSize Small -ImageName $img.ImageName | Add-AzureProvisioningConfig -Windows -AdminUsername "[admin-username-here]" -Password "[admin-password-here]" New-AzureVM -ServiceName "SQLServerVMTemplate" -VMs $vmConfig -Location "East US" -WaitForBoot 2. Download the SQL Server installation media With the base Windows Server 2012 VM created, we can now get ready to prepare (sysprep) the SQL Server installation. To do that, we need to get the SQL Server installation media onto the machine. The easiest way I found to do this was to leverage Azure blob storage. Upload the SQL Server ISO file to Azure blob storage Remote Desktop (RDP) into the base VM From the VM, download the SQL Server ISO file to the local disk Mount the SQL Server ISO file to the VM Copy the ISO contents (not the ISO file itself) to the VM’s C:\ drive. For example, use C:\sql The SQL Server installation media files need to be copied to the local C: drive so it can be used later to complete the SQL Server installation (when provisioning the actual SQL Server VM instance). 3. Run SQL Server setup to prepare an image In order to prepare the (sysprep’d) SQL Server VM image (which we can use as a template for future VMs), we need to run the SQL Server installation and instruct it topreparean image – not run the full installation. An easy way to do this is with a SQL Server configuration file, an example of which I’ve included below. ConfigurationFile.ini ;SQL Server 2012 Configuration File [OPTIONS] ; Specifies a Setup workflow, like INSTALL, UNINSTALL, or UPGRADE. This is a required parameter. ACTION="PrepareImage" ; Detailed help for command line argument ENU has not been defined yet. ENU="True" ; Parameter that controls the user interface behavior. Valid values are Normal for the full UI, AutoAdvance for a simplified UI, and EnableUIOnServerCore for bypassing Server Core setup GUI block. ;UIMODE="Normal" ; Specifies setup not display any user interface. ;QUIET="False" ; Specifies setup to display progress only, without any user interaction. QUIETSIMPLE="True" ; Specifies whether SQL Server Setup should discover and include product updates. The valid values are True and False or 1 and 0. By default SQL Server Setup will include updates that are found. UpdateEnabled="True" ; Specifies features to install, uninstall, or upgrade. The list of top-level features include SQL, AS, RS, IS, MDS, and Tools. The SQL feature will install the Database Engine, Replication, Full-Text, and Data Quality Services (DQS) server. The Tools feature will install Management Tools, Books online components, SQL Server Data Tools, and other shared components. FEATURES=SQLENGINE ; Specifies the location where SQL Server Setup will obtain product updates. The valid values are "MU" to search Microsoft Update, a valid folder path, a relative path such as .\MyUpdates or a UNC share. By default SQL Server Setup will search Microsoft Update or a Windows Update service through the Window Server Update Services. UpdateSource="MU" ; Displays the command line parameters usage HELP="False" ; Specifies that the detailed Setup log should be piped to the console. INDICATEPROGRESS="False" ; Specifies that Setup should install into WOW64. This command line argument is not supported on an IA64 or a 32-bit system. X86="False" ; Specifies the root installation directory for shared components. This directory remains unchanged after shared components are already installed. INSTALLSHAREDDIR="C:\Program Files\Microsoft SQL Server" ; Specifies the root installation directory for the WOW64 shared components. This directory remains unchanged after WOW64 shared components are already installed. INSTALLSHAREDWOWDIR="C:\Program Files (x86)\Microsoft SQL Server" ; Specifies the Instance ID for the SQL Server features you have specified. SQL Server directory structure, registry structure, and service names will incorporate the instance ID of the SQL Server instance. INSTANCEID="MSSQLSERVER" ; Specifies the installation directory. INSTANCEDIR="C:\Program Files\Microsoft SQL Server" There are two steps in this process: Copy the ConfigurationFile.ini file (from your local PC) to the same location as the SQL Server installation media (i.e.c:\sql) on the base VM. Run SQL Server setup to prepare an image. From a command prompt (on the base VM), navigate to theC:\sqlfolder and then execute the following command: Setup.exe /ConfigurationFile=ConfigurationFile.ini /IAcceptSQLServerLicenseTerms=true 4. Configure Windows to complete the installation of SQL Server At this point the base VM should have an “installation” of SQL Server that is not fully completed. The SQL Server bits are in place, but they’re not configured for a full server install . . . at least not yet. The final configuration of SQL Server will take place when the VM instance (of which this template/image is the base) is provisioned and boots up for the first time. This is accomplished by using a CMD file with the following content: @ECHO OFF && SETLOCAL && SETLOCAL ENABLEDELAYEDEXPANSION && SETLOCAL ENABLEEXTENSIONS REM All commands will be executed during first Virtual Machine boot "C:\Program Files\Microsoft SQL Server\110\Setup Bootstrap\SQLServer2012\setup.exe" /QS /ACTION=CompleteImage /INSTANCEID=MSSQLSERVER /INSTANCENAME=MSSQLSERVER /IACCEPTSQLSERVERLICENSETERMS=1 /SQLSYSADMINACCOUNTS=%COMPUTERNAME%\Administrators /BROWSERSVCSTARTUPTYPE=AUTOMATIC /INDICATEPROGRESS /TCPENABLED=1 /PID="[YOUR-SQL-SERVER-PRODUCT-ID-HERE]" On your local PC, save the file as SetupComplete2.cmd RDP / log into the base VM Copy the SetupComplete2.cmd from your local PC file to the c:\Windows\OEM folder on the base VM Change the value for the SQLSYSADMINACCOUNTS value to be that of the administrative account created on the VM (or better yet – the local Administrators group account) If needed, supply the SQL Server product ID (PID) value. When Windows starts on the new VM instance for the first time, the SetupComplete2.cmd file should automatically run. It is invoked by the SetupComplete.cmd file already on the machine. 5. Capture the image and add it to the Azure VM image gallery At this point a base SQL Server VM has been created and the groundwork laid to complete the install. Now it is time to create the VM image from the base VM, and do to that you sysprep and capture the base VM. Please follow the guide on How to Capture a Windows Virtual Machine to Use as a Template. 6. Create a new VM using the custom SQL Server image With a new custom VM image template available in the VM image gallery, you can provision a new VM instance using that custom template. Upon first boot, the newly provisioned VM should complete the full SQL Server installation as laid out in your SetupComplete2.cmd file. Please follow the guide on How to Create a Custom Virtual Machine for more information on creating the VM from the template. Closing Thoughts One of the quirks I noticed when preparing the base SQL Server image is that it was not possible to prepare the image with SQL Server Management Studio (SSMS). I would have to do the install after the newly provisioned VM instance is created. Not hard, but time consuming (an annoying if doing this on multiple VM instances). I later learned that SQL Server 2012 Cumulative Update 1 does allow for preparing a SQL Server image with SSMS installed. I’ve included a link below that describes the process for creating a SQL Server image with CU1. In the end, this process really is not all that hard. Time consuming? Yes! The worst part (at least for me) was really just understanding how the SQL Server installation and sysprep process works. Once I wrapped my head around that, the process was a lot smoother. Helpful Resources While I was learning how to create a custom SQL Server VM image, the following resources were very helpful: How to: Create a Windows Azure Virtual Machine Operating System Image for Microsoft Dynamics NAV. This MSDN article provided the jumping off point on learning how to install SQL Server by using a sysprep image. Install SQL Server 2012 from the Command Prompt Install SQL Server 2012 Using a Configuration File Install SQL Server 2012 Using SysPrep How to create a slipstream SQL Server 2012 and Cumulative Update 1 image –http://sqlperformance.com/2012/12/system-configuration/sql-2012-slipstream I would like to thank Scott Klein for his assistance in verifying these steps. His help was extremely valuable to ensure I was doing this the right way.

September 10, 2014

by Michael Collier

· 6,491 Views

How JSF Works and how to Debug it - is Polyglot an Alternative?

JSF is not what we often think it is. It's also a framework that can be somewhat tricky to debug, especially when first encountered. In this post let's go over on why that is and provide some JSF debugging techniques. We will go through the following topics: JSF is not what we often think The difficulties of JSF debugging How to debug JSF systematically How JSF Works - The JSF lifecycle Debugging an Ajax request from browser to server and back Debugging the JSF frontend Javascript code Final thoughts - alternatives? (questions to the reader) JSF is not what we often think JSF looks on first look like an enterprise Java/XML frontend framework, but under the hood it really isn't. It's really a polyglot Java/Javascript framework, where the client Javascript part is non-neglectable and also important to understand it. It also has good support for direct HTML/CSS use. JSF developers are on ocasion already polyglot developers, whose primary language is Java but still need to use ocasionally Javascript. The difficulties of JSF debugging When comparing JSF to GWT and AngularJS in a previous post, I found that the (most often used) approach that the framework takes of abstracting HTML and CSS from the developer behind XML adds to the difficulty of debugging, because it creates an extra level of indirection. A more direct approach of using HTML/CSS directly is also possible, but it seems enterprise Java developers tend to stick to XML in most cases, because it's a more familiar technology. Also another problem is that the client side Javascript part of the framework/libraries is not very well documented, and it's often important to understand what is going on. The only way to debug JSF systematically When first encountering JSF, I first tried to approach it from a Java, XML and documentation only. While I could do a part of the work that way, there where frequent situations where that approach was really not sufficient. The conclusion that I got to is that in order to be able to debug JSF applications effectively, an understanding of the following is needed: HTML CSS Javascript HTTP Chrome Dev Tools, Firebug or equivalent The JSF Lifecycle This might sound surprising to developers that work mostly in Java/XML, but this web-centric approach to debugging JSF is the only way that I managed to tackle many requirements that needed some significant component customization, or to be able to fix certain bugs. Let’s start by understanding the inner workings of JSF, so that we can debug it better. The JSF take on MVC The way JSF approaches MVC is that the whole 3 components reside on the server side: The Model is a tree of plain Java objects The View is a server side template defined in XML that is read to build an in-memory view definition The Controller is a Java servlet, that receives each request and processes them through a series of steps The browser is assumed to be simply a rendering engine for the HTML generated at server side. Ajax is achieved by submitting parts of the page for server processing, and requesting a server to ‘repaint’ only portions of the screen, without navigating away from the page. The JSF Lifecycle Once an HTTP request reaches the backend, it gets caught by the JSF Controller that will then process it. The request goes through a series of phases known as the JSF lifecycle, which is essential to understand how JSF works: Design Goals of the JSF Lifecycle The whole point of the lifecycle is to manage MVC 100% on the server side, using the browser as a rendering platform only. The initial idea was to decouple the rendering platform from the server-side UI component model, in order to allow to replace HTML with alternative markup languages by swapping the Render Response phase. This was in the early 2000's when HTML could be soon replaced by XML-based alternatives (that never came to be), and then HTML5 came along. Also browsers where much more qwirkier than what they are today, and the idea of cross-browser Javascript libraries was not widespread. So let’s go through each phase and see how to debug it if needed, starting in the browser. Let's base ourselves in a simple example that uses an Ajax request. A JSF 2 Hello World Example The following is a minimal JSF 2 page, that receives an input text from the user, sends the text via an Ajax request to the backend and refreshes only an output label: JSF 2.2 Hello World Example The page looks like this: Following one Ajax request - to the server and back Let’s click submit in order to trigger the Ajax request, and use the Chrome Dev Tools Network tab (right click and inspect any element on the page).What goes over the wire? This is what we see in the Form Data section of the request: j_idt8:input: Hello World javax.faces.ViewState: -2798727343674530263:954565149304692491 javax.faces.source: j_idt8:j_idt9 javax.faces.partial.event: click javax.faces.partial.execute: j_idt8:j_idt9 j_idt8:input javax.faces.partial.render: j_idt8:output javax.faces.behavior.event: action javax.faces.partial.ajax:true This request says: The new value of the input field is "Hello World", send me a new value for the output field only, and don't navigate away from this page. Let's see how this can be read from the request. As we can see, the new values of the form are submitted to the server, namely the “Hello World” value. This is the meaning of the several entries: javax.faces.ViewState identifies the view from which the request was made. The request is an Ajax request, as indicated by the flag javax.faces.partial.ajax, The request was triggered by a click as defined in javax.faces.partial.event. But what are those j_ strings ? Those are space separated generated identifiers of HTML elements. For example this is how we can see what is the page element corresponding to j_idt8:input, using the Chrome Dev Tools: There are also 3 extra form parameters that use these identifiers, that are linked to UI components: javax.faces.source: The identifier of the HTML element that originated this request, in this case the Id of the submit button. javax.faces.execute: The list of identifiers of the elements whose values are sent to the server for processing, in this case the input text field. javax.faces.render: The list of identifiers of the sections of the page that are to be ‘repainted', in this case the output field only. But what happens when the request hits the server ? JSF lifecycle - Restore View Phase Once the request reaches the server, the JSF controller will inspect the javax.faces.ViewState and identify to which view it refers. It will then build or restore a Java representation of the view, that is somehow similar to the document definition in the browser side. The view will be attached to the request and used throughout. There is usually little need to debug this phase during application development. JSF Lifecycle - Apply Request Values The JSF Controller will then apply to the view widgets the new values received via the request. The values might be invalid at this point. Each JSF component gets a call to it’s decode method in this phase. This method will retrieve the submitted value for the widget in question from the HTTP request and store it on the widget itself. To debug this, let’s put a breakpoint in the decode method of the HtmlInputText class, to see the value “Hello World”: Notice the conditional breakpoint using the HTML clientId of the field we want. This would allow to quickly debug only the decoding of the component we want, even in a large page with many other similar widgets. Next after decoding is the validation phase. JSF Lifecycle - Process Validations In this phase, validations are applied and if the value is found to be in error (for example a date is invalid), then the request bypasses Invoke Application and goes directly to Render Response phase. To debug this phase, a similar breakpoint can be put on method processValidators, or in the validators themselves if you happen to know which ones or if they are custom. JSF Lifecycle - Update Model In this phase, we know all the submitted values where correct. JSF can now update the view model by applying the new values received in the requests to the plain Java objects in the view model. This phase can be debugged by putting a breakpoint in the processUpdates method of the component in question, eventually using a similar conditional breakpoint to break only on the component needed. JSF Lifecycle - Invoke Application This is the simplest phase to debug. The application now has an updated view model, and some logic can be applied on it. This is where the action listeners defined in the XML view definition (the 'action' properties and the listener tags) are executed. JSF Lifecycle - Render Response This is the phase that I end up debugging the most: why is the value not being displayed as we expect it, etc, it all can be found here. In this phase the view and the new model values will be transformed from Java objects into HTML, CSS and eventually Javascript and sent back over the wire to the browser. This phase can be debugged using breakpoints in the encodeBegin, encodeChildren and encodeEnd methods of the component in question. The components will either render themselves or delegate rendering to aRenderer class. Back in the browser It was a long trip, but we are back where we started! This is how the response generated by JSF looks once received in the browser: -8188482707773604502:6956126859616189525> What the Javascript part of the framework will do is to take the contents of the partial response, update by update. Using the Id of the update, the client side JSF callback will search for a component with that Id, delete it from the document and replace it with the new updated version. In this case, "Hello World" will show up on the label next to the Input text field! And so thats how JSF works under the hood. But what about if we need to debug the Javascript part of the framework? Debugging the JSF Javascript Code The Chrome Dev Tools can help debug the client part. For example let’s say that we want to halt the client when an Ajax request is triggered. We need to go to the sources tab, add an XHR (Ajax) breakpoint and trigger the browser action. The debugger will stop and the call stack can be examined: For some frameworks like Primefaces, the Javascript sources might be minified (non human-readable) because they are optimized for size. To solve this, download the source code of the library and do a non minified build of the jar. There are usually instructions for this, otherwise check the project poms. This will install in your Maven repository a jar with non minified sources for debugging. The UI Debug tag: The ui:debug tag allows to view a lot of debugging information using a keyboard shortcut, see here for further details. Final Thoughts JSF is very popular in the enterprise Java world, and it handles a lot of problems well, specially if the UI designers take into account the possibilities of the widget library being used. The problem is that there are usually feature requests that force us to dig deeper into the widgets internal implementation in order to customize them, and this requires HTML, CSS, Javascript and HTTP plus JSF lifecycle knowledge. Is polyglot an alternative? We can wonder that if developers have to know a fair amount about web technologies in order to be able to debug JSF effectively, then it would be simpler to build enterprise front ends (just the client part) using those technologies directly instead. It's possible that a polyglot approach of a Java backend plus a Javascript-only frontend could be proved effective in a nearby future, specially using some sort of a client side MVC framework like Angular. This would require learning more Javascript, (have a look at Javascript for Java developers post if curious), but this is already often necessary to do custom widget development in JSF anyway. Conclusions and some questions to the reader Thanks for reading, please take a moment to share your thoughts on these matters on the comments bellow: do you believe polyglot development (Java/Javascript) is a viable alternative in general, and in your workplace in particular? Did you find one of the GWT-based frameworks (plain GWT, Vaadin, Errai), or the Play Framework to be easier to use and of better productivity?

September 10, 2014

by Vasco Cavalheiro

· 44,582 Views · 5 Likes

Spring Batch Tutorial with Spring Boot and Java Configuration

I’ve been working on migrating some batch jobs for Podcastpedia.org to Spring Batch. Before, these jobs were developed in my own kind of way, and I thought it was high time to use a more “standardized” approach. Because I had never used Spring with java configuration before, I thought this were a good opportunity to learn about it, by configuring the Spring Batch jobs in java. And since I am all into trying new things with Spring, why not also throw Spring Boot into the boat… Before you begin with this tutorial I recommend you read first Spring’s Getting started – Creating a Batch Service, because the structure and the code presented here builds on that original. 1. What I’ll build So, as mentioned, in this post I will present Spring Batch in the context of configuring it and developing with it some batch jobs for Podcastpedia.org. Here’s a short description of the two jobs that are currently part of the Podcastpedia-batch project: addNewPodcastJob reads podcast metadata (feed url, identifier, categories etc.) from a flat file transforms (parses and prepares episodes to be inserted with Http Apache Client) the data and in the last step, insert it to the Podcastpedia database and inform the submitter via emailabout it notifyEmailSubscribersJob – people can subscribe to their favorite podcasts on Podcastpedia.orgvia email. For those who did it is checked on a regular basis (DAILY, WEEKLY, MONTHLY) if new episodes are available, and if they are the subscribers are informed via email about those; read from database, expand read data via JPA, re-group it and notify subscriber via email Source code: The source code for this tutorial is available on GitHub – Podcastpedia-batch. Note: Before you start I also highly recommend you read the Domain Language of Batch, so that terms like “Jobs”, “Steps” or “ItemReaders” don’t sound strange to you. 2. What you’ll need A favorite text editor or IDE JDK 1.7 or later Maven 3.0+ 3. Set up the project The project is built with Maven. It uses Spring Boot, which makes it easy to create stand-alone Spring based Applications that you can “just run”. You can learn more about the Spring Boot by visiting theproject’s website. 3.1. Maven build file Because it uses Spring Boot it will have the spring-boot-starter-parent as its parent, and a couple of other spring-boot-starters that will get for us some libraries required in the project: pom.xml of the podcastpedia-batch project 4.0.0 org.podcastpedia.batch podcastpedia-batch 0.1.0 1.1.6.RELEASE 1.7 org.springframework.boot spring-boot-starter-parent 1.1.6.RELEASE org.springframework.boot spring-boot-starter-batch org.springframework.boot spring-boot-starter-data-jpa org.apache.httpcomponents httpclient 4.3.5 org.apache.httpcomponents httpcore 4.3.2 org.apache.velocity velocity 1.7 org.apache.velocity velocity-tools 2.0 org.apache.struts struts-core rome rome 1.0 rome rome-fetcher 1.0 org.jdom jdom 1.1 xerces xercesImpl 2.9.1 mysql mysql-connector-java 5.1.31 org.springframework.boot spring-boot-starter-freemarker org.springframework.boot spring-boot-starter-remote-shell javax.mail mail javax.mail mail 1.4.7 javax.inject javax.inject 1 org.twitter4j twitter4j-core [4.0,) org.springframework.boot spring-boot-starter-test maven-compiler-plugin org.springframework.boot spring-boot-maven-plugin Note: One big advantage of using the spring-boot-starter-parent as the project’s parent is that you only have to upgrade the version of the parent and it will get the “latest” libraries for you. When I started the project spring boot was in version 1.1.3.RELEASE and by the time of finishing to write this post is already at 1.1.6.RELEASE. 3.2. Project directory structure I structured the project in the following way: └── src └── main └── java └── org └── podcastpedia └── batch └── common └── jobs └── addpodcast └── notifysubscribers Note: the org.podcastpedia.batch.jobs package contains sub-packages having specific classes to particular jobs. the org.podcastpedia.batch.jobs.common package contains classes used by all the jobs, like for example the JPA entities that both the current jobs require. 4. Create a batch Job configuration I will start by presenting the Java configuration class for the first batch job: package org.podcastpedia.batch.jobs.addpodcast; import org.podcastpedia.batch.common.configuration.DatabaseAccessConfiguration; import org.podcastpedia.batch.common.listeners.LogProcessListener; import org.podcastpedia.batch.common.listeners.ProtocolListener; import org.podcastpedia.batch.jobs.addpodcast.model.SuggestedPodcast; import org.springframework.batch.core.Job; import org.springframework.batch.core.Step; import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing; import org.springframework.batch.core.configuration.annotation.JobBuilderFactory; import org.springframework.batch.core.configuration.annotation.StepBuilderFactory; import org.springframework.batch.item.ItemProcessor; import org.springframework.batch.item.ItemReader; import org.springframework.batch.item.ItemWriter; import org.springframework.batch.item.file.FlatFileItemReader; import org.springframework.batch.item.file.LineMapper; import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper; import org.springframework.batch.item.file.mapping.DefaultLineMapper; import org.springframework.batch.item.file.transform.DelimitedLineTokenizer; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Configuration; import org.springframework.context.annotation.Import; import org.springframework.core.io.ClassPathResource; import com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException; @Configuration @EnableBatchProcessing @Import({DatabaseAccessConfiguration.class, ServicesConfiguration.class}) public class AddPodcastJobConfiguration { @Autowired private JobBuilderFactory jobs; @Autowired private StepBuilderFactory stepBuilderFactory; // tag::jobstep[] @Bean public Job addNewPodcastJob(){ return jobs.get("addNewPodcastJob") .listener(protocolListener()) .start(step()) .build(); } @Bean public Step step(){ return stepBuilderFactory.get("step") .chunk(1) //important to be one in this case to commit after every line read .reader(reader()) .processor(processor()) .writer(writer()) .listener(logProcessListener()) .faultTolerant() .skipLimit(10) //default is set to 0 .skip(MySQLIntegrityConstraintViolationException.class) .build(); } // end::jobstep[] // tag::readerwriterprocessor[] @Bean public ItemReader reader(){ FlatFileItemReader reader = new FlatFileItemReader(); reader.setLinesToSkip(1);//first line is title definition reader.setResource(new ClassPathResource("suggested-podcasts.txt")); reader.setLineMapper(lineMapper()); return reader; } @Bean public LineMapper lineMapper() { DefaultLineMapper lineMapper = new DefaultLineMapper(); DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer(); lineTokenizer.setDelimiter(";"); lineTokenizer.setStrict(false); lineTokenizer.setNames(new String[]{"FEED_URL", "IDENTIFIER_ON_PODCASTPEDIA", "CATEGORIES", "LANGUAGE", "MEDIA_TYPE", "UPDATE_FREQUENCY", "KEYWORDS", "FB_PAGE", "TWITTER_PAGE", "GPLUS_PAGE", "NAME_SUBMITTER", "EMAIL_SUBMITTER"}); BeanWrapperFieldSetMapper fieldSetMapper = new BeanWrapperFieldSetMapper(); fieldSetMapper.setTargetType(SuggestedPodcast.class); lineMapper.setLineTokenizer(lineTokenizer); lineMapper.setFieldSetMapper(suggestedPodcastFieldSetMapper()); return lineMapper; } @Bean public SuggestedPodcastFieldSetMapper suggestedPodcastFieldSetMapper() { return new SuggestedPodcastFieldSetMapper(); } /** configure the processor related stuff */ @Bean public ItemProcessor processor() { return new SuggestedPodcastItemProcessor(); } @Bean public ItemWriter writer() { return new Writer(); } // end::readerwriterprocessor[] @Bean public ProtocolListener protocolListener(){ return new ProtocolListener(); } @Bean public LogProcessListener logProcessListener(){ return new LogProcessListener(); } } The @EnableBatchProcessing annotation adds many critical beans that support jobs and saves us configuration work. For example you will also be able to @Autowired some useful stuff into your context: a JobRepository (bean name “jobRepository”) a JobLauncher (bean name “jobLauncher”) a JobRegistry (bean name “jobRegistry”) a PlatformTransactionManager (bean name “transactionManager”) a JobBuilderFactory (bean name “jobBuilders”) as a convenience to prevent you from having to inject the job repository into every job, as in the examples above a StepBuilderFactory (bean name “stepBuilders”) as a convenience to prevent you from having to inject the job repository and transaction manager into every step The first part focuses on the actual job configuration: @Bean public Job addNewPodcastJob(){ return jobs.get("addNewPodcastJob") .listener(protocolListener()) .start(step()) .build(); } @Bean public Step step(){ return stepBuilderFactory.get("step") .chunk(1) //important to be one in this case to commit after every line read .reader(reader()) .processor(processor()) .writer(writer()) .listener(logProcessListener()) .faultTolerant() .skipLimit(10) //default is set to 0 .skip(MySQLIntegrityConstraintViolationException.class) .build(); } The first method defines a job and the second one defines a single step. As you’ve read in The Domain Language of Batch, jobs are built from steps, where each step can involve a reader, a processor, and a writer. In the step definition, you define how much data to write at a time (in our case 1 record at a time). Next you specify the reader, processor and writer. 5. Spring Batch processing units Most of the batch processing can be described as reading data, doing some transformation on it and then writing the result out. This mirrors somehow the Extract, Transform, Load (ETL) process, in case you know more about that. Spring Batch provides three key interfaces to help perform bulk reading and writing: ItemReader, ItemProcessor and ItemWriter. 5.1. Readers ItemReader is an abstraction providing the mean to retrieve data from many different types of input: flat files, xml files, database, jms etc., one item at a time. See the Appendix A. List of ItemReaders and ItemWriters for a complete list of available item readers. In the Podcastpedia batch jobs I use the following specialized ItemReaders: 5.1.1. FlatFileItemReader which, as the name implies, reads lines of data from a flat file that typically describe records with fields of data defined by fixed positions in the file or delimited by some special character (e.g. Comma). This type of ItemReader is being used in the first batch job, addNewPodcastJob. The input file used is named suggested-podcasts.in, resides in the classpath (src/main/resources) and looks something like the following: FEED_URL; IDENTIFIER_ON_PODCASTPEDIA; CATEGORIES; LANGUAGE; MEDIA_TYPE; UPDATE_FREQUENCY; KEYWORDS; FB_PAGE; TWITTER_PAGE; GPLUS_PAGE; NAME_SUBMITTER; EMAIL_SUBMITTER http://www.5minutebiographies.com/feed/; 5minutebiographies; people_society, history; en; Audio; WEEKLY; biography, biographies, short biography, short biographies, 5 minute biographies, five minute biographies, 5 minute biography, five minute biography; https://www.facebook.com/5minutebiographies;https://twitter.com/5MinuteBios; ; Adrian Matei; [email protected] http://notanotherpodcast.libsyn.com/rss; NotAnotherPodcast; entertainment; en; Audio; WEEKLY; Comedy, Sports, Cinema, Movies, Pop Culture, Food, Games; https://www.facebook.com/notanotherpodcastusa;https://twitter.com/NAPodcastUSA;https://plus.google.com/u/0/103089891373760354121/posts; Adrian Matei; [email protected] As you can see the first line defines the names of the “columns”, and the following lines contain the actual data (delimited by “;”), that needs translating to domain objects relevant in the context. Let’s see now how to configure the FlatFileItemReader: @Bean public ItemReader reader(){ FlatFileItemReader reader = new FlatFileItemReader(); reader.setLinesToSkip(1);//first line is title definition reader.setResource(new ClassPathResource("suggested-podcasts.in")); reader.setLineMapper(lineMapper()); return reader; } You can specify, among other things, the input resource, the number of lines to skip, and a line mapper. 5.1.1.1. LineMapper The LineMapper is an interface for mapping lines (strings) to domain objects, typically used to map lines read from a file to domain objects on a per line basis. For the Podcastpedia job I used the DefaultLineMapper, which is two-phase implementation consisting of tokenization of the line into a FieldSet followed by mapping to item: @Bean public LineMapper lineMapper() { DefaultLineMapper lineMapper = new DefaultLineMapper(); DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer(); lineTokenizer.setDelimiter(";"); lineTokenizer.setStrict(false); lineTokenizer.setNames(new String[]{"FEED_URL", "IDENTIFIER_ON_PODCASTPEDIA", "CATEGORIES", "LANGUAGE", "MEDIA_TYPE", "UPDATE_FREQUENCY", "KEYWORDS", "FB_PAGE", "TWITTER_PAGE", "GPLUS_PAGE", "NAME_SUBMITTER", "EMAIL_SUBMITTER"}); BeanWrapperFieldSetMapper fieldSetMapper = new BeanWrapperFieldSetMapper(); fieldSetMapper.setTargetType(SuggestedPodcast.class); lineMapper.setLineTokenizer(lineTokenizer); lineMapper.setFieldSetMapper(suggestedPodcastFieldSetMapper()); return lineMapper; } the DelimitedLineTokenizer splits the input String via the “;” delimiter. if you set the strict flag to false then lines with less tokens will be tolerated and padded with empty columns, and lines with more tokens will simply be truncated. the columns names from the first line are set lineTokenizer.setNames(...); and the fieldMapper is set (line 14) Note: The FieldSet is an “interface used by flat file input sources to encapsulate concerns of converting an array of Strings to Java native types. A bit like the role played by ResultSet in JDBC, clients will know the name or position of strongly typed fields that they want to extract.“ 5.1.1.2. FieldSetMapper The FieldSetMapper is an interface that is used to map data obtained from a FieldSet into an object. Here’s my implementation which maps the fieldSet to the SuggestedPodcast domain object that will be further passed to the processor: public class SuggestedPodcastFieldSetMapper implements FieldSetMapper { @Override public SuggestedPodcast mapFieldSet(FieldSet fieldSet) throws BindException { SuggestedPodcast suggestedPodcast = new SuggestedPodcast(); suggestedPodcast.setCategories(fieldSet.readString("CATEGORIES")); suggestedPodcast.setEmail(fieldSet.readString("EMAIL_SUBMITTER")); suggestedPodcast.setName(fieldSet.readString("NAME_SUBMITTER")); suggestedPodcast.setTags(fieldSet.readString("KEYWORDS")); //some of the attributes we can map directly into the Podcast entity that we'll insert later into the database Podcast podcast = new Podcast(); podcast.setUrl(fieldSet.readString("FEED_URL")); podcast.setIdentifier(fieldSet.readString("IDENTIFIER_ON_PODCASTPEDIA")); podcast.setLanguageCode(LanguageCode.valueOf(fieldSet.readString("LANGUAGE"))); podcast.setMediaType(MediaType.valueOf(fieldSet.readString("MEDIA_TYPE"))); podcast.setUpdateFrequency(UpdateFrequency.valueOf(fieldSet.readString("UPDATE_FREQUENCY"))); podcast.setFbPage(fieldSet.readString("FB_PAGE")); podcast.setTwitterPage(fieldSet.readString("TWITTER_PAGE")); podcast.setGplusPage(fieldSet.readString("GPLUS_PAGE")); suggestedPodcast.setPodcast(podcast); return suggestedPodcast; } } 5.2. JdbcCursorItemReader In the second job, notifyEmailSubscribersJob, in the reader, I only read email subscribers from a single database table, but further in the processor a more detailed read(via JPA) is executed to retrieve all the new episodes of the podcasts the user subscribed to. This is a common pattern employed in the batch world. Follow this link for more Common Batch Patterns. For the initial read, I chose the JdbcCursorItemReader, which is a simple reader implementation that opens a JDBC cursor and continually retrieves the next row in the ResultSet: @Bean public ItemReader notifySubscribersReader(){ JdbcCursorItemReader reader = new JdbcCursorItemReader(); String sql = "select * from users where is_email_subscriber is not null"; reader.setSql(sql); reader.setDataSource(dataSource); reader.setRowMapper(rowMapper()); return reader; } Note I had to set the sql, the datasource to read from and a RowMapper. 5.2.1. RowMapper The RowMapper is an interface used by JdbcTemplate for mapping rows of a Result’set on a per-row basis. My implementation of this interface, , performs the actual work of mapping each row to a result object, but I don’t need to worry about exception handling: public class UserRowMapper implements RowMapper { @Override public User mapRow(ResultSet rs, int rowNum) throws SQLException { User user = new User(); user.setEmail(rs.getString("email")); return user; } } 5.2. Writers ItemWriter is an abstraction that represents the output of a Step, one batch or chunk of items at a time. Generally, an item writer has no knowledge of the input it will receive next, only the item that was passed in its current invocation. The writers for the two jobs presented are quite simple. They just use external services to send email notifications and post tweets on Podcastpedia’s account. Here is the implementation of the ItemWriterfor the first job – addNewPodcast: package org.podcastpedia.batch.jobs.addpodcast; import java.util.Date; import java.util.List; import javax.inject.Inject; import javax.persistence.EntityManager; import org.podcastpedia.batch.common.entities.Podcast; import org.podcastpedia.batch.jobs.addpodcast.model.SuggestedPodcast; import org.podcastpedia.batch.jobs.addpodcast.service.EmailNotificationService; import org.podcastpedia.batch.jobs.addpodcast.service.SocialMediaService; import org.springframework.batch.item.ItemWriter; import org.springframework.beans.factory.annotation.Autowired; public class Writer implements ItemWriter{ @Autowired private EntityManager entityManager; @Inject private EmailNotificationService emailNotificationService; @Inject private SocialMediaService socialMediaService; @Override public void write(List items) throws Exception { if(items.get(0) != null){ SuggestedPodcast suggestedPodcast = items.get(0); //first insert the data in the database Podcast podcast = suggestedPodcast.getPodcast(); podcast.setInsertionDate(new Date()); entityManager.persist(podcast); entityManager.flush(); //notify submitter about the insertion and post a twitt about it String url = buildUrlOnPodcastpedia(podcast); emailNotificationService.sendPodcastAdditionConfirmation( suggestedPodcast.getName(), suggestedPodcast.getEmail(), url); if(podcast.getTwitterPage() != null){ socialMediaService.postOnTwitterAboutNewPodcast(podcast, url); } } } private String buildUrlOnPodcastpedia(Podcast podcast) { StringBuffer urlOnPodcastpedia = new StringBuffer( "http://www.podcastpedia.org"); if (podcast.getIdentifier() != null) { urlOnPodcastpedia.append("/" + podcast.getIdentifier()); } else { urlOnPodcastpedia.append("/podcasts/"); urlOnPodcastpedia.append(String.valueOf(podcast.getPodcastId())); urlOnPodcastpedia.append("/" + podcast.getTitleInUrl()); } String url = urlOnPodcastpedia.toString(); return url; } } As you can see there’s nothing special here, except that the write method has to be overriden and this is where the injected external services EmailNotificationService and SocialMediaService are used to inform via email the podcast submitter about the addition to the podcast directory, and if a Twitter page was submitted a tweet will be posted on the Podcastpedia’s wall. You can find detailed explanation on how to send email via Velocity and how to post on Twitter from Java in the following posts: How to compose html emails in Java with Spring and Velocity How to post to Twittter from Java with Twitter4J in 10 minutes 5.3. Processors ItemProcessor is an abstraction that represents the business processing of an item. While theItemReader reads one item, and the ItemWriter writes them, the ItemProcessor provides access to transform or apply other business processing. When using your own Processors you have to implement the ItemProcessor interface, with its only method O process(I item) throws Exception, returning a potentially modified or a new item for continued processing. If the returned result is null, it is assumed that processing of the item should not continue. While the processor of the first job requires a little bit of more logic, because I have to set the etag andlast-modified header attributes, the feed attributes, episodes, categories and keywords of the podcast: public class SuggestedPodcastItemProcessor implements ItemProcessor { private static final int TIMEOUT = 10; @Autowired ReadDao readDao; @Autowired PodcastAndEpisodeAttributesService podcastAndEpisodeAttributesService; @Autowired private PoolingHttpClientConnectionManager poolingHttpClientConnectionManager; @Autowired private SyndFeedService syndFeedService; /** * Method used to build the categories, tags and episodes of the podcast */ @Override public SuggestedPodcast process(SuggestedPodcast item) throws Exception { if(isPodcastAlreadyInTheDirectory(item.getPodcast().getUrl())) { return null; } String[] categories = item.getCategories().trim().split("\\s*,\\s*"); item.getPodcast().setAvailability(org.apache.http.HttpStatus.SC_OK); //set etag and last modified attributes for the podcast setHeaderFieldAttributes(item.getPodcast()); //set the other attributes of the podcast from the feed podcastAndEpisodeAttributesService.setPodcastFeedAttributes(item.getPodcast()); //set the categories List categoriesByNames = readDao.findCategoriesByNames(categories); item.getPodcast().setCategories(categoriesByNames); //set the tags setTagsForPodcast(item); //build the episodes setEpisodesForPodcast(item.getPodcast()); return item; } ...... } the processor from the second job uses the ‘Driving Query’ approach, where I expand the data retrieved from the Reader with another “JPA-read” and I group the items on podcasts with episodes so that it looks nice in the emails that I am sending out to subscribers: @Scope("step") public class NotifySubscribersItemProcessor implements ItemProcessor { @Autowired EntityManager em; @Value("#{jobParameters[updateFrequency]}") String updateFrequency; @Override public User process(User item) throws Exception { String sqlInnerJoinEpisodes = "select e from User u JOIN u.podcasts p JOIN p.episodes e WHERE u.email=?1 AND p.updateFrequency=?2 AND" + " e.isNew IS NOT NULL AND e.availability=200 ORDER BY e.podcast.podcastId ASC, e.publicationDate ASC"; TypedQuery queryInnerJoinepisodes = em.createQuery(sqlInnerJoinEpisodes, Episode.class); queryInnerJoinepisodes.setParameter(1, item.getEmail()); queryInnerJoinepisodes.setParameter(2, UpdateFrequency.valueOf(updateFrequency)); List newEpisodes = queryInnerJoinepisodes.getResultList(); return regroupPodcastsWithEpisodes(item, newEpisodes); } ....... } Note: If you’d like to find out more how to use the Apache Http Client, to get the etag and last-modifiedheaders, you can have a look at my post – How to use the new Apache Http Client to make a HEAD request 6. Execute the batch application Batch processing can be embedded in web applications and WAR files, but I chose in the beginning the simpler approach that creates a standalone application, that can be started by the Java main() method: package org.podcastpedia.batch; //imports ...; @ComponentScan @EnableAutoConfiguration public class Application { private static final String NEW_EPISODES_NOTIFICATION_JOB = "newEpisodesNotificationJob"; private static final String ADD_NEW_PODCAST_JOB = "addNewPodcastJob"; public static void main(String[] args) throws BeansException, JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException, JobParametersInvalidException, InterruptedException { Log log = LogFactory.getLog(Application.class); SpringApplication app = new SpringApplication(Application.class); app.setWebEnvironment(false); ConfigurableApplicationContext ctx= app.run(args); JobLauncher jobLauncher = ctx.getBean(JobLauncher.class); if(ADD_NEW_PODCAST_JOB.equals(args[0])){ //addNewPodcastJob Job addNewPodcastJob = ctx.getBean(ADD_NEW_PODCAST_JOB, Job.class); JobParameters jobParameters = new JobParametersBuilder() .addDate("date", new Date()) .toJobParameters(); JobExecution jobExecution = jobLauncher.run(addNewPodcastJob, jobParameters); BatchStatus batchStatus = jobExecution.getStatus(); while(batchStatus.isRunning()){ log.info("*********** Still running.... **************"); Thread.sleep(1000); } log.info(String.format("*********** Exit status: %s", jobExecution.getExitStatus().getExitCode())); JobInstance jobInstance = jobExecution.getJobInstance(); log.info(String.format("********* Name of the job %s", jobInstance.getJobName())); log.info(String.format("*********** job instance Id: %d", jobInstance.getId())); System.exit(0); } else if(NEW_EPISODES_NOTIFICATION_JOB.equals(args[0])){ JobParameters jobParameters = new JobParametersBuilder() .addDate("date", new Date()) .addString("updateFrequency", args[1]) .toJobParameters(); jobLauncher.run(ctx.getBean(NEW_EPISODES_NOTIFICATION_JOB, Job.class), jobParameters); } else { throw new IllegalArgumentException("Please provide a valid Job name as first application parameter"); } System.exit(0); } } The best explanation for SpringApplication-, @ComponentScan- and @EnableAutoConfiguration-magic you get from the source – Getting Started – Creating a Batch Service: “The main() method defers to the SpringApplication helper class, providing Application.class as an argument to its run() method. This tells Spring to read the annotation metadata from Application and to manage it as a component in the Spring application context. The @ComponentScan annotation tells Spring to search recursively through theorg.podcastpedia.batchpackage and its children for classes marked directly or indirectly with Spring’s @Component annotation. This directive ensures that Spring finds and registers BatchConfiguration, because it is marked with @Configuration, which in turn is a kind of @Component annotation. The @EnableAutoConfiguration annotation switches on reasonable default behaviors based on the content of your classpath. For example, it looks for any class that implements the CommandLineRunner interface and invokes its run() method.” Execution construction steps: the JobLauncher, which is a simple interface for controlling jobs, is retrieved from the ApplicationContext. Remember this is automatically made available via the@EnableBatchProcessing annotation. now based on the first parameter of the application (args[0]), I will retrieve the correspondingJob from the ApplicationContext then the JobParameters are prepared, where I use the current date - .addDate("date", new Date()), so that the job executions are always unique. once everything is in place, the job can be executed: JobExecution jobExecution = jobLauncher.run(addNewPodcastJob, jobParameters); you can use the returned jobExecution to gain access to BatchStatus, exit code, or job name and id. Note: I highly recommend you read and understand the Meta-Data Schema for Spring Batch. It will also help you better understand the Spring Batch Domain objects. 6.1. Running the application on dev and prod environments To be able to run the Spring Batch / Spring Boot application on different environments I make use of the Spring Profiles capability. By default the application runs with development data (database). But if I want the job to use the production database I have to do the following: provide the following environment argument -Dspring.profiles.active=prod have the production database properties configured in the application-prod.properties file in the classpath, right besides the default application.properties file Summary In this tutorial we’ve learned how to configure a Spring Batch project with Spring Boot and Java configuration, how to use some of the most common readers in batch processing, how to configure some simple jobs, and how to start Spring Batch jobs from a main method. Note: As I mentioned, I am fairly new to Spring Batch, and especially to Spring Boot and Spring Configuration with Java, so if you see any potential for improvement (code, job design etc.) please make a pull request or leave a comment below. Thanks a lot.

September 9, 2014

by Adrian Matei

· 146,335 Views · 7 Likes

Name of the Class

In Java every class has a name. Classes are in packages and this lets us programmers work together avoiding name collision. I can name my class A and you can also name your class A so long as long they are in different packages, they work together fine. If you looked at the API of the class Class you certainly noticed that there are three different methods that give you the name of a class: getSimpleName() gives you the name of the class without the package. getName() gives you the name of the class with the full package name in front. getCanonicalName() gives you the canonical name of the class. Simple is it? Well, the first is simple and the second is also meaningful unless there is that disturbing canonical name. That is not evident what that is. And if you do not know what canonical name is, you may feel some disturbance in the force of your Java skills for the second also. What is the difference between the two? If you want a precise explanation, visit the chapter 6.7 of Java Language Specification. Here we go with something simpler, aimed simpler to understand though not so thorough. Let’s see some examples: package pakage.subpackage.evensubberpackage; import org.junit.Assert; import org.junit.Test; public class WhatIsMyName { @Test public void classHasName() { final Class klass = WhatIsMyName.class; final String simpleNameExpected = "WhatIsMyName"; Assert.assertEquals(simpleNameExpected, klass.getSimpleName()); final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName"; Assert.assertEquals(nameExpected, klass.getName()); Assert.assertEquals(nameExpected, klass.getCanonicalName()); } ... This “unit test” just runs fine. But as you can see there is no difference between name and canonical name in this case. (Note that the name of the package is pakage and not package. To test your java lexical skills answer the question why?) Let’s have a look at the next example from the same junit test file: @Test public void arrayHasName() { final Class klass = WhatIsMyName[].class; final String simpleNameExpected = "WhatIsMyName[]"; Assert.assertEquals(simpleNameExpected, klass.getSimpleName()); final String nameExpected = "[Lpakage.subpackage.evensubberpackage.WhatIsMyName;"; Assert.assertEquals(nameExpected, klass.getName()); final String canonicalNameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName[]"; Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName()); } Now there are differences. When we talk about arrays the simple name signals it appending the opening and closing brackets, just like we would do in Java source code. The “normal” name looks a bit weird. It starts with an L and semicolon is appended. This reflects the internal representation of the class names in the JVM. The canonical name changed similar to the simple name: it is the same as before for the class having all the package names as prefix with the brackets appended. Seems that getName() is more the JVM name of the class and getCanonicalName() is more like the fully qualified name on Java source level. Let’s go on with still some other example (we are still in the same file): class NestedClass{} @Test public void nestedClassHasName() { final Class klass = NestedClass.class; final String simpleNameExpected = "NestedClass"; Assert.assertEquals(simpleNameExpected, klass.getSimpleName()); final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName$NestedClass"; Assert.assertEquals(nameExpected, klass.getName()); final String canonicalNameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName.NestedClass"; Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName()); } The difference is the dollar sign in the name of the class. Again the “name” is more what is used by the JVM and canonical name is what is Java source code like. If you compile this code, the Java compiler will generate the files: WhatIsMyName.class and WhatIsMyName$NestedClass.class Even though the class is named nested class it actually is an inner class. However in the naming there is no difference: a static or non-static class inside another class is just named the same. Now let’s see something even more interesting: @Test public void methodClassHasName() { class MethodClass{}; final Class klass = MethodClass.class; final String simpleNameExpected = "MethodClass"; Assert.assertEquals(simpleNameExpected, klass.getSimpleName()); final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName$1MethodClass"; Assert.assertEquals(nameExpected, klass.getName()); final String canonicalNameExpected = null; Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName()); } This time we have a class inside a method. Not a usual scenario, but valid from the Java language point of view. The simple name of the class is just that: the simple name of the class. No much surprise. The “normal” name however is interesting. The Java compiler generates a JVM name for the class and this name contains a number in it. Why? Because nothing would stop me having a class with the same name in another method in our test class and inserting a number is the way to prevent name collisions for the JVM. The JVM does not know or care anything about inner and nested classes or classes defined inside a method. A class is just a class. If you compile the code you will probably see the file WhatIsMyName$1MethodClass.class generated by javac. I had to add “probably” not because I count the possibility of you being blind, but rather because this name is actually the internal matter of the Java compiler. It may choose different name collision avoiding strategy, though I know no compiler that differs from the above. The canonical name is the most interesting. It does not exist! It is null. Why? Because you can not access this class from outside the method defining it. It does not have a canonical name. Let’s go on. What about anonymous classes. They should not have name. After all, that is why they are called anonymous. @Test public void anonymousClassHasName() { final Class klass = new Object(){}.getClass(); final String simpleNameExpected = ""; Assert.assertEquals(simpleNameExpected, klass.getSimpleName()); final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName$1"; Assert.assertEquals(nameExpected, klass.getName()); final String canonicalNameExpected = null; Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName()); } Actually they do not have simple name. The simple name is empty string. They do, however have name, made up by the compiler. Poor javac does not have other choice. It has to make up some name even for the unnamed classes. It has to generate the code for the JVM and it has to save it to some file. Canonical name is again null. Are we ready with the examples? No. We have something simple (a.k.a. primitive) at the end. Java primitives. @Test public void intClassHasName() { final Class klass = int.class; final String intNameExpected = "int"; Assert.assertEquals(intNameExpected, klass.getSimpleName()); Assert.assertEquals(intNameExpected, klass.getName()); Assert.assertEquals(intNameExpected, klass.getCanonicalName()); } If the class represents a primitive, like int (what can be simpler than an int?) then the simple name, “the” name and the canonical names are all int the name of the primitive. Just as well an array of a primitive is very simple is it? @Test public void intArrayClassHasName() { final Class klass = int[].class; final String simpleNameExpected = "int[]"; Assert.assertEquals(simpleNameExpected, klass.getSimpleName()); final String nameExpected = "[I"; Assert.assertEquals(nameExpected, klass.getName()); final String canonicalNameExpected = "int[]"; Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName()); } Well, it is not simple. The name is [I, which is a bit mysterious unless you read the respective chapter of the JVM specification. Perhaps I talk about that another time. Conclusion The simple name of the class is simple. The “name” returned by getName() is the one interesting for JVM level things. The getCanonicalName() is the one that looks most like Java source. You can get the full source code of the example above from the gist e789d700d3c9abc6afa0 from GitHub.

September 8, 2014

by Peter Verhas

CORE

· 7,002 Views · 27 Likes