Language Resources

The Latest Languages Topics

Recently I attended a local rheinJUG meeting in Düsseldorf. While the topic of the session was Eclipse e4, the night’s sponsor itemis provided some handouts on Xtext which got me very interested. The reason is that at work we are currently developing a mobile Java application (J9, CDC/Foundation 1.1 on Windows CE6) for which we needed an easy-to-use and reliable way of configuring navigation through the application. In a previous iteration we had, mostly because of time constraints, hard-coded most of the navigational paths, but this time the app is more complex and doing that again was not really an option. First we thought about an XML-based configuration, but this seemed to be a hassle to write (and read) and would also mean paying the price of parsing it on every application startup.

Enter Xtext: an Eclipse-based framework/library for building text-based DSLs. In short, you just provide a grammar description of a new DSL to suit your needs and with (literally) just a few mouse clicks you are provided with a content-assist, syntax-highlight, outline-view-enabled Eclipse editor and optionally a code generator based on that language.

Getting started: Sample Grammar

There is a nice tutorial provided as part of the Xtext documentation, but I believe it might be beneficial to provide another example of how to put a DSL to good use. I will not go into every step in great detail, because setting up Xtext in Eclipse 3.6 Helios is just a matter of putting an Update Site URL in, and the New Project wizard provided makes the initial setup a snap. I assume you have already set up Eclipse and Xtext and created a new Xtext project including a generator project (activate the corresponding checkbox when going through the wizard). In this post I am assuming a project name of com.danielschneller.navi.dsl and a file extension of .navi.
When finished we will have the infrastructure ready for editing, parsing and generating code based on files like these:

    navigation rules for MyApplication

    mappings {
        map permission AdminPermission to "privAdmin"
        map permission DataAccessPermission to "privData"

        map coordinate Login to "com.danielschneller.myapp.gui.login.LoginController"
            in "com.danielschneller.myapp.login"
        map coordinate LoginFailed to "com.danielschneller.myapp.gui.login.LoginFailedController"
            in "com.danielschneller.myapp.login"
        map coordinate MainMenu to "com.danielschneller.myapp.gui.menu.MainMenuController"
            in "com.danielschneller.myapp.menu"
        map coordinate UserAdministration to "com.danielschneller.myapp.gui.admin.UserAdminController"
            in "com.danielschneller.myapp.admin"
        map coordinate DataLookup to "com.danielschneller.myapp.gui.lookup.LookupController"
            in "com.danielschneller.myapp.lookup"
    }

    navigations {
        define navigation USER_LOGON_FAILED
        define navigation USER_LOGON_SUCCESS
        define navigation OK
        define navigation BACK
        define navigation ADMIN
        define navigation DATA_LOOKUP
    }

    navrules {
        from Login
            on navigation USER_LOGON_FAILED go to LoginFailed
            on navigation USER_LOGON_SUCCESS go to MainMenu

        from LoginFailed
            on navigation OK go to Login

        from MainMenu
            on navigation ADMIN go to UserAdministration with AdminPermission
            on navigation DATA_LOOKUP go to DataLookup with DataAccessPermission

        from UserAdministration
            on navigation BACK go to MainMenu

        from DataLookup
            on navigation BACK go to MainMenu
    }

As you can see it is a nice little language for defining coordinates in an application (meaning a specific GUI for a certain task) and the possible navigation paths between them. Optionally a navigation path can be tagged to require one or more permissions to work.
So for example one possible navigation path shown in the above sample leads from the application's main menu, identified by the identifier MainMenu and represented in code by the com.danielschneller.myapp.gui.menu.MainMenuController class in the com.danielschneller.myapp.menu OSGi bundle, to a GUI identified as DataLookup, implemented by com.danielschneller.myapp.gui.lookup.LookupController in the com.danielschneller.myapp.lookup bundle. For this path to be taken, the application must request the DATA_LOOKUP navigation and the currently logged-in user must be assigned the DataAccessPermission.

What exactly that means is not the focus of this tutorial; suffice it to say that we somehow need to get the information contained in this specialized language into our Java application in some shape or form that can be evaluated at runtime. In the following example all information will be transformed into a HashMap-based data structure. For our little mobile application this has several advantages over the XML option mentioned earlier:

- No XML parsing necessary on application startup, saving some performance
- Validation of the navigation rules ahead of time, preventing parse errors at runtime
- No libraries needed to access the information: by putting everything in a simple HashMap we do not have to rely on any non-standard classes whatsoever

The first thing I did when I started with Xtext was define a sample input file such as the one above. Then, following its general structure, I began to extract a formal grammar for it. Of course, the first draft of the sample data was not perfect; over the course of a few iterations I refined some of the syntax, but in the end this is the grammar definition I came up with.
It is heavily commented to allow you to copy it out and still leave the documentation intact:

    grammar com.danielschneller.navi.NavigationRules with org.eclipse.xtext.common.Terminals

    generate navigationRules "http://com.danielschneller/fw/funkmde/navi/NavigationRules"

    /*
     * The top level entry point for the file.
     * "Root" is just a name as good as any, but
     * makes the meaning quite clear.
     */
    Root:
        // first thing in the file is a "keyword",
        // followed by an attribute that will be
        // accessible as "name" later and allow
        // definition of an ID type of thing.
        'navigation rules for' name=ID
        // after the keyword and "name" attribute
        // three sections follow, each assigned
        // to an attribute for later reference
        // (called "mappingsdefs", "transitiondefs"
        // and "ruledefs").
        // Their types are defined later in the file.
        mappingsdefs=Mappings
        transitiondefs=TransitionDefinitions
        ruledefs=NavigationRules
    // semicolon ends the definition of "Root"
    ;

    // mappings section >>>>>>>>>>>>>>>>>>>>>>>>>>>>>

    /*
     * Definition of the "Mappings" type used in
     * the "Root" type.
     */
    Mappings:
        // first the keyword "mappings" is expected,
        // then an open curly
        'mappings' '{'
        // after that a collection of "Mapping"s is
        // expected. The "+=" means that they will
        // all be collected in a collection type element
        // called "mappings" for future reference.
        // The "+" at the end means "at least one, but
        // more is just fine".
        (mappings+=Mapping)+
        // finally the "Mappings" type requires a closing
        // curly brace.
        '}'
    // semicolon ends the definition of "Mappings"
    ;

    /*
     * Definition of a single "Mapping", those we are
     * collecting in the "mappings" attribute of the
     * "Mappings" type.
     */
    Mapping:
        // each mapping starts with the keyword "map"
        // and is followed by an element of type "MappingSpec"
        'map' MappingSpec
    ;

    /*
     * Definition of a "MappingSpec" element. This is
     * actually just a "parent type" for two more specific
     * kinds of "MappingSpec":
     */
    MappingSpec:
        // no keywords are defined here, a "MappingSpec"
        // can be either a "PermissionMappingSpec" or a
        // "CoordinateMappingSpec". Any of these will be
        // fine where a "MappingSpec" is asked for.
        PermissionMappingSpec | CoordinateMappingSpec
    ;

    /*
     * Definition of a "PermissionMappingSpec" element.
     */
    PermissionMappingSpec:
        // first the keyword "permission" is required.
        // then a "name" attribute is expected of type ID.
        // Following the name the "to" keyword is expected,
        // followed by a string that is stored in the "value"
        // attribute
        'permission' name=ID 'to' value=STRING
    ;

    /*
     * Definition of a "CoordinateMappingSpec" element.
     * The definition is very similar to the "PermissionMappingSpec"
     * but has more attributes.
     */
    CoordinateMappingSpec:
        // first the keyword "coordinate", then an ID stored as "name",
        // the keyword "to", followed by a string stored as "controllername",
        // next the keyword "in" and finally another string, memorized as
        // "bundleid"
        'coordinate' name=ID 'to' controllername=STRING 'in' bundleid=STRING
    ;

    // <<<<<<<<<<<<<<<<<<<<<<<<<<<<< mappings section

    // >>>>>>>>>>>>>>>>>>>>>>>>>>>>> navigations section

    /*
     * Definition of the "TransitionDefinitions" type used in
     * the "Root" type.
     */
    TransitionDefinitions:
        // first, this element is introduced with the "navigations"
        // keyword, followed by an open curly brace.
        'navigations' '{'
        // after that a collection of "TransitionDefinition"s is
        // expected. The "+=" means that they will
        // all be collected in a collection type element
        // called "transitions" for future reference.
        // The "+" at the end means "at least one, but
        // more is just fine".
        (transitions+=TransitionDefinition)+
        // the element ends with a closing curly brace
        '}'
    ;

    /*
     * Definition of a "TransitionDefinition" element. This
     * one is very simple.
     */
    TransitionDefinition:
        // the keyword "define navigation" is required first,
        // then a "name" attribute of type ID is expected.
        'define navigation' name=ID
    ;

    // <<<<<<<<<<<<<<<<<<<<<<<<<<<<< navigations section

    // >>>>>>>>>>>>>>>>>>>>>>>>>>>>> navrules section

    /*
     * Definition of the "NavigationRules" element.
     */
    NavigationRules:
        // Element starts with the keywords "navrules" and
        // open curly.
        'navrules' '{'
        // collection attribute called "rules", consisting
        // of one or more occurrences of a "Rule" element.
        (rules+=Rule)+
        // element finishes with a closing curly keyword
        '}'
    ;

    /*
     * Definition of a "Rule" element as used in the "NavigationRules"
     * element.
     */
    Rule:
        // first the "from" keyword, then a reference to one of the
        // coordinate mappings defined earlier. This time no new
        // definition of a coordinate is required, but one of those
        // that have been listed before. So the type here is put in
        // square brackets
        'from' source=[CoordinateMappingSpec]
        // following the source specification, one or more "Destination"
        // type elements are expected, collected in a collection attribute
        // named "destinations"
        (destinations+=Destination)+
    ;

    /*
     * Definition of a "Destination" type. These are collected
     * in a "Rule".
     */
    Destination:
        // first comes an "on navigation" keyword. After that a
        // reference to one of the Transition elements defined
        // in the "navigations" section is required and stored
        // in the "transition" attribute.
        // after that follows a "go to" keyword and a reference
        // to a coordinate mapping, stored in the "target" attribute.
        // finally - as with the "destinations" collection attribute
        // in the "Rule" element - a "permissions" collection is
        // defined to store none or more (*) "PermissionReference"
        // elements.
        'on navigation' transition=[TransitionDefinition]
        'go to' target=[CoordinateMappingSpec]
        (permissions+=PermissionReference)*
    ;

    /*
     * Definition of a "PermissionReference" type. This is used
     * in the "permissions" collection of a "Destination".
     */
    PermissionReference:
        // first, a "with" keyword is expected. After that a
        // "permission" attribute stores a reference to one of
        // the previously defined permission mappings from the
        // "mappings" section.
        'with' permission=[PermissionMappingSpec]
    ;

    // <<<<<<<<<<<<<<<<<<<<<<<<<<<<< navrules section

This is what Xtext can digest and create an editor plugin and outline view for. Just save this as navigationRules.xtext; when you created the Xtext project in Eclipse using the wizard it should have been prepared for you. Copying and pasting this into a .xtext file in Eclipse will provide you with syntax highlighting, code completion and syntax checking, making it easy to play around with grammar files.

Once done, right click the .mwe2 file lying next to the grammar file in the Package Explorer view and select Run As MWE2 Workflow from the context menu. This will take a moment and generate several classes, both in the current (Xtext) project and the accompanying ...ui project.

Next, right click the Xtext project and select Run As Eclipse Application from the context menu. This will bring up another Eclipse instance with the newly created support for navigation rules files (with a .navi suffix) installed. To try it out, just create a new project and in that a new file. Make sure its name ends in .navi. When asked, make sure to accept adding the Xtext nature to the project.

You will be presented with a new, empty editor that already has an error marker in it. This is because, according to our grammar definition, an empty file does not comply with all the rules we specified. Try hitting the code-completion shortcut (Ctrl-Space) twice and see what happens: the first code-completion fills in the navigation rules for part. According to the grammar this is the only valid text at the beginning of a file, so it is automatically inserted. Hitting Ctrl-Space again will tell you that now you need a Name of type ID. Just go ahead and try out the completion.
It will help you create a syntactically sound navigation rules file. Notice that the Problems View tells you what is currently wrong. Also notice that once you reach a part where references are expected by the grammar (e.g. when defining source and destination coordinates in a navigation rule) you will get suggestions based on what you entered earlier.

(Screenshot: the whole sample from above as it looks in the generated editor.)

While you are still fleshing out and fine-tuning your grammar definitions, you will probably close this Eclipse instance and reopen it once you have repeated the Run As MWE2 Workflow steps in the main instance. In the long run I suggest you create a Feature and an Update Site project to allow easier distribution and updates of the intermediate iterations.

Generating Code

Now that we have a complete Xtext DSL defined and in place, let’s have a look at the code generation side of things. This part is completely optional: you are free to include the necessary Xtext libraries in your application's runtime (although they seem to be numerous) and just use them to dynamically load and parse .navi files on the fly. This would probably be a good idea if you were writing an Eclipse-based application anyway. However, when targeting a very limited platform like JavaME this option is not viable. Instead we will now create a code generator that provides a transformation from the DSL syntax into more classic Java terms: specifically, we will create a HashMap-based data structure that carries all the same information. This is a sample of what the generated output is going to look like:

    public class NaviRules {

        private Map navigationRules = new Hashtable();

        // ...

        public NaviRules() {
            NaviDestination naviDest;

            // ========== From Login (com.danielschneller.myapp.gui.login.LoginController)
            // ========== On USER_LOGON_FAILED
            // ========== To LoginFailed (com.danielschneller.myapp.gui.login.LoginFailedController in com.danielschneller.myapp.login)
            naviDest = new NaviDestination();
            naviDest.action = "USER_LOGON_FAILED";
            naviDest.targetClassname = "com.danielschneller.myapp.gui.login.LoginFailedController";
            naviDest.targetBundleId = "com.danielschneller.myapp.login";
            store("com.danielschneller.myapp.gui.login.LoginController", naviDest);

            // ========== On USER_LOGON_SUCCESS
            // ========== To MainMenu (com.danielschneller.myapp.gui.menu.MainMenuController in com.danielschneller.myapp.menu)
            naviDest = new NaviDestination();
            naviDest.action = "USER_LOGON_SUCCESS";
            naviDest.targetClassname = "com.danielschneller.myapp.gui.menu.MainMenuController";
            naviDest.targetBundleId = "com.danielschneller.myapp.menu";
            store("com.danielschneller.myapp.gui.login.LoginController", naviDest);
            // =============================================================================

            // ========== From LoginFailed (com.danielschneller.myapp.gui.login.LoginFailedController)
            // ========== On OK
            // ========== To Login (com.danielschneller.myapp.gui.login.LoginController in com.danielschneller.myapp.login)
            naviDest = new NaviDestination();
            naviDest.action = "OK";
            naviDest.targetClassname = "com.danielschneller.myapp.gui.login.LoginController";
            naviDest.targetBundleId = "com.danielschneller.myapp.login";
            store("com.danielschneller.myapp.gui.login.LoginFailedController", naviDest);

            // .... and so on ...
        }
    }

The support class NaviDestination is omitted here, but it is just a simple value-holder class. When creating the Xtext project using the wizard earlier we created a third Eclipse project, ending in ...generator. Its src folder contains three subdirectories called model, templates and workflow. Put the sample .navi file into the model directory.
It will serve as the input for the generator.

Create the first template

Code generation is based on templates. Xtext leverages the Xpand template engine. In the templates directory create a new Xpand template using the context menu. Call it NaviRules.xpt, open it and insert the following:

    «REM» import the namespace defined in our DSL model «ENDREM»
    «IMPORT navigationRules»

    «REM»
        Define a template called "main" for elements of type "Root".
        The minus sign at the end takes care of not adding a newline
        at the end of it.
    «ENDREM»
    «DEFINE main FOR Root-»
    «ENDDEFINE»

As there is only one instance of a Root element in a navigation rules file, this will be the main entry point, hence the name. There is no need to call it main, but it seems fitting. Now between the DEFINE and ENDDEFINE insert what is to be generated. As shown above, we need a new Java source file called NaviRules.java:

    ...
    «DEFINE main FOR Root-»
    «FILE "NaviRules.java"-»
    «ENDFILE-»
    «ENDDEFINE»
    ...

Again, the content to be generated is put in between the FILE and ENDFILE brackets. Anything not enclosed in «» will be used verbatim in the output file. So first of all, put in the static parts of the Java file. What I did was first write the source for a single navigation rule by hand, make sure it compiled and then copy the relevant parts into the template piece by piece:

    ...
    «FILE "NaviRules.java"-»
    import java.util.*;

    public class NaviRules {

        public static class NaviDestination {
            String action;
            List requiredPermissions = new ArrayList();
            String targetClassname;
            String targetBundleId;

            NaviDestination() {}

            public final List getRequiredPermissions() {
                return new ArrayList(requiredPermissions);
            }

            // let Eclipse generate getters, setters,
            // equals and hashCode methods for this
        }

        private Map navigationRules = new Hashtable();
    «ENDFILE-»
    ...

Now, this is nothing special so far. To fill in the elements from the navigation rules DSL file put in the following:

    ...
        private Map navigationRules = new Hashtable();

        public NaviRules() {
            NaviDestination naviDest;
            «REM»
                Iterate all elements in the "rules" collection attribute of
                the "ruledefs" attribute of the "Root" element. Call each
                iterated element (which is of type "Rule") "r" and expand
                the "ruletmpl" template for it here.
            «ENDREM»
            «FOREACH ruledefs.rules AS r»«EXPAND ruletmpl FOR r»«ENDFOREACH»
        }
    ...

In the class constructor we first define a local variable naviDest of the previously declared type. Then, as the comment states, the FOREACH instruction will iterate over all Rule type elements. This might not be completely obvious at first. Remember that at this point in the template the current scope is the "Root" element from the navigation rules file. It has an attribute called ruledefs as per the grammar definition. This attribute is of type NavigationRules, which in turn has a collection attribute called rules, containing Rule type objects. Inside the loop the current element can then be addressed by the template variable name r. The loop body (between FOREACH and ENDFOREACH) contains another Xpand instruction to expand a template called ruletmpl, which will be declared next.

Don't worry, even though this is a little difficult at first: switching contexts between the Java and the template scopes is made significantly easier in Eclipse, because the Xpand template editor will syntax color (static parts are blue) and also assist you with code completion inside the Xpand template parts. Ctrl-Spacing your way through it will make things more obvious than they are when reading an example.

Now for the ruletmpl template. Place it below the ENDDEFINE statement belonging to the main template:

    ...
    «ENDFILE-»
    «ENDDEFINE»

    «DEFINE ruletmpl FOR Rule-»
            // ========== From «source.name» («source.controllername»)
            «FOREACH destinations AS d»«EXPAND destTmpl(source) FOR d»«ENDFOREACH»
            // =============================================================================
    «ENDDEFINE»

You see the same idea used again: static parts that get transferred into the output file 1:1 and Xpand statements that fill in data from the navigation rules definition file. In this case you see references to the attributes of the Rule element. As per the FOREACH instruction in the previous template, the one at hand will be repeated for every instance of Rule in our source file. Inside this definition the current scope is that of Rule, so with «source.name» the name attribute of the CoordinateMappingSpec object referenced as source in a Rule is taken first, then the controllername attribute likewise.

Next up, another FOREACH loop iterates over the one or more Destinations of each Rule. Instead of just applying a template (destTmpl) for every Destination we also pass in the corresponding CoordinateMappingSpec stored in the source attribute of the Rule. This is then used in the following template:

    ...
    «DEFINE destTmpl(CoordinateMappingSpec source) FOR Destination-»
            // ========== On «transition.name»
            // ========== To «target.name» («target.controllername» in «target.bundleid»)
            naviDest = new NaviDestination();
            naviDest.action = "«transition.name»";
            naviDest.targetClassname = "«target.controllername»";
            naviDest.targetBundleId = "«target.bundleid»";
            «FOREACH permissions AS p»«EXPAND permTmpl FOR p»«ENDFOREACH»
            store("«source.controllername»", naviDest);
    «ENDDEFINE»

    «DEFINE permTmpl FOR PermissionReference-»
            naviDest.requiredPermissions.add("«permission.value»");
    «ENDDEFINE»

In these innermost templates the attributes of the CoordinateMappingSpec objects source and target are accessed and assigned to the members of a NaviDestination Java object, one instance per Destination.
There is only one more (very simple) template for the PermissionReference elements. With this, the Xpand file is complete.

Set Up The Generator Workflow

The wizard initially created a NavigationRulesGenerator.mwe2 file in the workflow folder. Open it and replace its contents with the following:

    module workflow.NavigationRulesGenerator

    import org.eclipse.emf.mwe.utils.*

    var targetDir = "src-gen"
    var fileEncoding = "Cp1252"
    var modelPath = "src/model"

    Workflow {

        component = org.eclipse.xtext.mwe.Reader {
            path = modelPath

            // this class has been generated by the xtext generator
            register = com.danielschneller.navi.NavigationRulesStandaloneSetup {}
            load = {
                slot = "root"
                type = "Root"
            }
        }

        component = org.eclipse.xpand2.Generator {
            metaModel = org.eclipse.xtend.typesystem.emf.EmfRegistryMetaModel {}
            expand = "templates::NaviRules::main FOREACH root"
            outlet = {
                path = targetDir
            }
            fileEncoding = fileEncoding
        }
    }

The most interesting parts of this workflow file are the load section in the Reader component and the expand and outlet sections in the Generator component. The first one will connect a so-called slot with the Root element from our navigation rules. The second one will trigger the evaluation of the main template in the NaviRules.xpt file in the templates folder and feed any Root instances it finds in the *.navi files from the src/model directory (modelPath) into it. Now it is time for some actual generation.

Run the generator workflow

Right click the MWE2 file you just edited and select the Run As MWE2 Workflow command from the context menu. The Eclipse console will show this output:

    0    [main] DEBUG org.eclipse.xtext.mwe.Reader - Resource Pathes : [src/model]
    431  [main] DEBUG xt.validation.ResourceValidatorImpl - Syntax check OK! Resource: file:/Users/ds/ws/ws36_xtext/com.danielschneller.navi.dsl.generator/src/model/MyApp.navi
    1013 [main] INFO org.eclipse.xpand2.Generator - Written 1 files to outlet [default](src-gen)
    1014 [main] INFO .emf.mwe2.runtime.workflow.Workflow - Done.

Then have a look at the newly generated contents of the src-gen source folder. If everything went alright, you should find a fresh NaviRules.java file placed there, based on the contents of your navigation rules file and the Xpand templates. Try making some changes to the template, then re-run the workflow. You will see the changes reflected in the generated source file.

Generate a second source File

In the templates directory add another Xpand template file Navigation.xpt with the following content:

    «IMPORT navigationRules»;

    «DEFINE main FOR Root-»
    «FILE "Navigation.java"-»
    public final class Navigation {
    «FOREACH ruledefs.rules.destinations.transition.collect(e|e.name).toSet().sortBy(e|e) AS t»«EXPAND actionTmpl FOR t»«ENDFOREACH»

        private final String name;

        private Navigation(String aName) {
            name = aName;
        }

        public String getName() {
            return name;
        }
    }
    «ENDFILE-»
    «ENDDEFINE»

    «DEFINE actionTmpl FOR String-»
        /** Constant for Navigation «this» */
        public static final Navigation «this» = new Navigation("«this»");
    «ENDDEFINE»

This is a template for a type-safe enumeration that can be used in Java 1.4 (remember I had to do this for JavaME). Notice the FOREACH loop in this case. It demonstrates that not only simple iterations are possible, but that Xpand allows more complex operations as well. In this case it will collect the names of all the navigation transitions from all the Destinations in the navigation rules. These are of type String. They are made unique by converting them to a Set data structure and then finally sorted in their natural order. The resulting list of sorted strings is then iterated, and each one (called t) is passed to the actionTmpl template.
It is very simple, just placing the string itself («this») into a single line of Java source code. Of course, strictly speaking this is a rather complicated procedure to get the same information we could also have taken from the TransitionDefinitions element in the rules definition. However I think it serves as a nice example for additional Xpand capabilities. For a full description of its possibilities, have a look at the Xpand Reference in the Eclipse documentation.

To use the new template, add another section to the MWE2 workflow definition:

    component = org.eclipse.xpand2.Generator {
        metaModel = org.eclipse.xtend.typesystem.emf.EmfRegistryMetaModel {}
        expand = "templates::Navigation::main FOREACH root"
        outlet = {
            path = targetDir
        }
        fileEncoding = fileEncoding
    }

Running it again will produce a slightly different output, making clear that two files have been generated. This is what comes out in the src-gen folder as Navigation.java:

    public final class Navigation {
        /** Constant for Navigation ADMIN */
        public static final Navigation ADMIN = new Navigation("ADMIN");
        /** Constant for Navigation BACK */
        public static final Navigation BACK = new Navigation("BACK");
        /** Constant for Navigation DATA_LOOKUP */
        public static final Navigation DATA_LOOKUP = new Navigation("DATA_LOOKUP");
        /** Constant for Navigation OK */
        public static final Navigation OK = new Navigation("OK");
        ...

More...

This was just about my first experiments with Xtext. I am sure there is plenty more to be done with it. For more reading, please have a look at this very nice Getting started with Xtext tutorial by Peter Friese of Itemis.

From http://www.danielschneller.com/2010/08/code-generation-with-xtext.html
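The generated NaviRules constructor in the Xtext article calls a store method that the article elides ("// ..."), and it does not show how the application queries the rules at runtime. The following is a minimal sketch of how both might look, not the author's implementation: the class name NaviRulesSketch, the dest helper, the lookup method and its permission check are all assumptions made for illustration. It also uses generics and lambdas for brevity, which the CDC/Foundation 1.1 target described in the article would not support; there, the raw Map and List types from the generated code would be needed instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: store() and lookup() are NOT shown in the article.
class NaviRulesSketch {

    /** Value holder with the same fields the generated code assigns. */
    static class NaviDestination {
        String action;
        String targetClassname;
        String targetBundleId;
        final List<String> requiredPermissions = new ArrayList<>();
    }

    // source controller class name -> all destinations reachable from it
    private final Map<String, List<NaviDestination>> navigationRules = new HashMap<>();

    /** Illustrative helper that builds one destination entry. */
    static NaviDestination dest(String action, String classname, String bundleId,
                                String... permissions) {
        NaviDestination d = new NaviDestination();
        d.action = action;
        d.targetClassname = classname;
        d.targetBundleId = bundleId;
        d.requiredPermissions.addAll(Arrays.asList(permissions));
        return d;
    }

    /** Appends a destination to the list kept for the given source controller. */
    void store(String sourceClassname, NaviDestination destination) {
        navigationRules.computeIfAbsent(sourceClassname, key -> new ArrayList<>())
                       .add(destination);
    }

    /**
     * Returns the first destination for (source, action) whose required
     * permissions are all granted, or null if no rule matches.
     */
    NaviDestination lookup(String sourceClassname, String action,
                           Set<String> grantedPermissions) {
        List<NaviDestination> candidates =
                navigationRules.getOrDefault(sourceClassname, Collections.emptyList());
        for (NaviDestination candidate : candidates) {
            if (candidate.action.equals(action)
                    && grantedPermissions.containsAll(candidate.requiredPermissions)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String mainMenu = "com.danielschneller.myapp.gui.menu.MainMenuController";
        NaviRulesSketch rules = new NaviRulesSketch();
        rules.store(mainMenu, dest("ADMIN",
                "com.danielschneller.myapp.gui.admin.UserAdminController",
                "com.danielschneller.myapp.admin", "privAdmin"));

        NaviDestination allowed =
                rules.lookup(mainMenu, "ADMIN", Collections.singleton("privAdmin"));
        System.out.println(allowed.targetClassname);
    }
}
```

Keeping a list of destinations per source controller mirrors the generated code, which calls store several times with the same key. A lookup without the required permission simply returns null here; a real application might prefer a dedicated result type or an exception.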

August 7, 2010

by Daniel Schneller


Practical PHP Patterns: Application Controller

The Application Controller pattern is a subpattern used in Web implementations of Model-View-Controller. It prescribes interposing an object between the HTTP-related controllers (like the Action Controllers and the Front Controller) and the rest of the MVC machinery of an application, or substituting it for them. This pattern adds a layer of complexity to an MVC implementation, but it is useful for modelling interactions as a (finite) state machine. When your project transitions from a set of pages to a full-featured web application, you can benefit from an Application Controller over the old page-based controllers. Dynamic pages are produced on the fly and are not always available; they may depend on previous pages, or may not be accessible until other events have happened. An Application Controller takes care of this logic without intruding into the domain model's responsibilities.

User experience

In basic use cases the pages of an application are orthogonal and can be visited in any order. At other times the planned user experience requires the user to follow a particular path, and some pages can be visited and executed only upon particular events. They may be valid only in some cases, or incomplete until their dependencies are satisfied. Consider for example forms filled in over multiple steps, or application wizards. You never visit them out of order, and you must have some sort of mechanism to organize the single responses into a smooth flow. Application Controllers centralize the logic for passing from one screen to another. Further logic should be encapsulated in the underlying domain objects, which the Application Controller drives through composition.
State machine

The interaction of every user with your application is in a certain state, and every state has predefined events that, once they happen, may transition the application (for this particular user) to another state. Every transition is associated with a response, which is presented to the end user with further orthogonal embellishment like a layout. Usually interactions and these ideal [finite] state machines have a session-like scope (related to the current user), but nothing prevents considering the whole application a state machine. Finite State Machines are the simplest model for these interactions, but there can be more than one of them (one for each user, for each notification, for each form...), or a more complex, custom model can be used by the Application Controller.

Implementation

Application Controllers live behind an Action Controller to analyze the requests further, or behind a Front Controller to substitute the classical Page Controllers it forwards to. They hold references as collaborators both to domain classes (on which the events provoke method execution) and to view classes, since they must render a response. Domain objects are juggled around by the Application Controller: created, stored and deleted from persistence. Even when using frameworks, views are usually scripts, driven by a generic View object that accepts the path as a parameter.

Logic, and what to put in it

Domain logic may scatter into the Application Controller. I usually prefer to keep as much as possible in the domain layer, but the Application Controller is ideal for what we call application logic instead of domain logic: single interactions with the domain model. The boundary is not clearly defined here. That said, you can easily decide where to put a bit of business logic by knowing that what you put in this application layer won't be reused: it is transaction-specific logic. What you push down into the domain layer will probably be reused throughout the application.
For instance, you should put in an Application Controller:

- controller or view specific code, which doesn't fit into the domain layer
- particular use cases
- form objects management
- handling of multiple views
- infrastructure aspects.

Multiple Application Controllers

Multiple Application Controllers are the norm, unless your web application is fairly small. For example they are useful for dealing with different clients (mobile ones and ordinary browsers), since the user experience is likely to change between different devices (with more noisy or more optimized interactions). Another use case for multiple Application Controllers is assigning one to each member of a set of interactions (multiple forms, wizards, or any kind of process to accomplish).

Example

Here I'll present a small, high-level sample of an Application Controller that manages filling in a form over multiple steps. In Zend Framework, it can be thought of as homogeneous to Action Controllers (so that it can easily be put into an existing application's infrastructure). To achieve finer control, it usually does not take advantage of automated mechanisms like the ViewRenderer plugin. In fact, it does not map to a single domain action or a single view: the view is chosen on the fly and the domain model is seen through a Facade.

    session->merge($this->request->getPost());
            // this is our View change
            $step = $this->_request->getParam('step', 1);
            if ($step == self::LAST) {
                return $this->render('final.phtml');
            } else {
                $this->view->form = $this->forms[$step];
            }
            // the script is always the same but the form rendered changes
            $this->render('appcontroller.phtml');
        }
    }

You may go further in generalization and code a generic Application Controller for this use case which can be fed metadata about the interaction (number of steps, Zend_Form objects, etc.)
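The state-machine view of a multi-step form described in this article can be sketched independently of any web framework. The following is a rough illustration in Java rather than PHP, and every name in it (WizardFlowSketch, the Step and Event enums, the allow and handle methods) is hypothetical; it only shows the idea of a transition table that decides which screen comes next and ignores events that are not valid in the current state.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch of an Application Controller's flow logic as a
// finite state machine; rendering and domain calls are left out.
class WizardFlowSketch {

    enum Step { PERSONAL_DATA, ADDRESS, CONFIRMATION, DONE }

    enum Event { NEXT, BACK }

    // transition table: current step -> (event -> next step)
    private final Map<Step, Map<Event, Step>> transitions = new EnumMap<>(Step.class);

    private Step current = Step.PERSONAL_DATA;

    WizardFlowSketch() {
        allow(Step.PERSONAL_DATA, Event.NEXT, Step.ADDRESS);
        allow(Step.ADDRESS, Event.BACK, Step.PERSONAL_DATA);
        allow(Step.ADDRESS, Event.NEXT, Step.CONFIRMATION);
        allow(Step.CONFIRMATION, Event.BACK, Step.ADDRESS);
        allow(Step.CONFIRMATION, Event.NEXT, Step.DONE);
    }

    private void allow(Step from, Event on, Step to) {
        transitions.computeIfAbsent(from, key -> new EnumMap<>(Event.class)).put(on, to);
    }

    /** Applies the event if it is valid in the current step; otherwise stays put. */
    Step handle(Event event) {
        Map<Event, Step> outgoing = transitions.get(current);
        Step next = (outgoing == null) ? null : outgoing.get(event);
        if (next != null) {
            current = next;
        }
        return current;
    }

    Step current() {
        return current;
    }

    public static void main(String[] args) {
        WizardFlowSketch flow = new WizardFlowSketch();
        System.out.println(flow.handle(Event.NEXT));
        System.out.println(flow.handle(Event.NEXT));
        System.out.println(flow.handle(Event.NEXT));
    }
}
```

In a web setting one such object would typically be kept per user session, and the step it returns would select the view to render. Feeding the transition table from metadata instead of hard-coding it in the constructor would be one way to approach the generic controller mentioned above.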

August 3, 2010

by Giorgio Sironi


Achieving Immutability with Builder Design Pattern

Immutable classes define objects which, once created, never change their state.
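As a rough illustration of the idea in the title, here is a minimal sketch of an immutable class created through a builder. The Person class and its fields are hypothetical and are not taken from the article itself.

```java
// Hypothetical immutable value class: all fields are final and there are no setters.
final class Person {
    private final String name;
    private final String city;

    private Person(Builder builder) {
        this.name = builder.name;
        this.city = builder.city;
    }

    String name() { return name; }
    String city() { return city; }

    static Builder builder() {
        return new Builder();
    }

    // The builder is mutable; the object it produces is not.
    static final class Builder {
        private String name;
        private String city = "";

        Builder name(String name) { this.name = name; return this; }
        Builder city(String city) { this.city = city; return this; }

        Person build() {
            if (name == null) {
                throw new IllegalStateException("name is required");
            }
            return new Person(this);
        }
    }
}
```

A call such as Person.builder().name("Ada").city("London").build() collects the values step by step, while the resulting Person can never change afterwards.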

August 3, 2010

by Shekhar Gulati


512000 concurrent websockets with Groovy++ and Gretty

We are standing on the threshold of a new world: all major browsers either already support, or plan to support in their next major version, HTML5 (not in the scope of this article) and WebSockets (the main subject of this article). In 6 to 9 months we as application developers will have in our hands extremely powerful client-side tools to build a new generation of the Web. But are we ready on the server side? And if not, what is the point of having a powerful and reliable communication channel between browser and server and not using it? In this article I will talk about my experience of handling 512K concurrent websockets using Groovy++ and Gretty.

Why 512K and not 1M? This is a very fair question, and my initial goal was to handle exactly 1M concurrent websockets on one machine. It seems that, due to my lack of knowledge of tuning TCP/IP on Linux, I was not able to achieve more. I had enough free CPU power and a lot of free memory, but after 524285 open connections (always the same number) the server stopped accepting new connections. The magical number 524285 is so close to 524288 = 512*1024 that I can guess (and only guess) that we are dealing either with some limitation of the Linux setup or with something in the settings of the Amazon network infrastructure (where I ran my tests).

So what was the experiment about? In general it was my way to estimate (or at least get a feeling for) how many (virtual) hardware units we need if we hope to handle A LOT of concurrent users. The server itself is very simple. It does the following:

- On an HTTP request from a client, it responds with a text document containing a Groovy script to be executed by the client.
- It accepts and keeps websocket connections from clients and responds to every received message (just a plain string) with the same string in upper case.

The client program (running on a separate machine) requests a scenario from the server and then compiles and executes it. Why do we need this trick of sending a script to the client? Truly speaking, we don't.
When I started writing the application I had the illusion that it would simplify deployment to many clients. In fact it did not, and I had to rsync all clients with my development machine after every recompilation anyway.

The scenario itself is also very straightforward. Each client program opens 64000 concurrent websocket connections, and approximately every 25 seconds each connection sends the server a short string (approximately 30 characters). So the server needs to handle around 20000 requests per second, or around 600K/s of traffic in and the same amount out. One can argue whether such a traffic structure is realistic or not. I don't want to go into a deep dispute here, as it simulates very well one of the applications under development in my company.

To emulate 512000 concurrent clients I used 8 machines, plus 1 machine for the server. All machines were the same "m1.large" Amazon EC2 instances with 7.5GB of memory and 2 virtual cores, running Fedora 11. 2.5GB of memory was used by the Java heap (around 5K per connection, not including kernel structures, of course) and total CPU utilization was under 30%.

The server is written in Groovy++ using Gretty, a lightweight server based on the brilliant Netty framework and developed as part of the Groovy++ standard library. More information on both can be found at the Groovy++ home.

Gretty is an extremely lightweight and fully non-blocking event-driven web server. Gretty itself is written in Groovy++ and makes full use of its concurrency library. It is not a servlet container or any other relative of Java EE. Right now it supports only:

- static files
- HTTP requests (including modern /param1/param2 REST-like requests)
- web sockets (including a long-polling emulation protocol for old browsers)

Here is essentially the whole server code. Obviously it is statically typed Groovy++ code.

    GrettyServer server = [
        webContexts: [
            "/" : [
                public: {
                    get("/scenario") {
                        response.text = """
    .............. here is client scenario code ...................
    """
                    }

                    websocket("/ws", [
                        onMessage: { msg ->
                            socket.send(msg.toUpperCase())
                        },
                        onConnect: {
                            socket.send("Welcome!")
                        },
                        onDisconnect: {
                        }
                    ])
                }
            ]
        ]
    ]
    server.start()

I don't want to explain more than is written in the code above, because I really hope the code is self-explanatory. I hope it was interesting. Give Gretty and Groovy++ a try and let us know what you think. Till next time.

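The load figures quoted in this article follow from simple arithmetic, which the sketch below reproduces. It is written in Java rather than Groovy++, the LoadEstimate class and its method names are hypothetical, and it counts message payload only (no WebSocket framing or TCP/IP overhead), with 2.5GB taken as 2.5 * 1024^3 bytes.

```java
// Back-of-the-envelope check of the figures quoted in the article.
final class LoadEstimate {
    static long connections(int clientMachines, int socketsPerMachine) {
        return (long) clientMachines * socketsPerMachine;
    }

    // Each connection sends one message every secondsBetweenMessages seconds.
    static double messagesPerSecond(long connections, double secondsBetweenMessages) {
        return connections / secondsBetweenMessages;
    }

    static double payloadBytesPerSecond(double messagesPerSecond, int bytesPerMessage) {
        return messagesPerSecond * bytesPerMessage;
    }

    static long heapBytesPerConnection(long heapBytes, long connections) {
        return heapBytes / connections;
    }

    public static void main(String[] args) {
        long total = connections(8, 64000);                    // 512000 connections
        double rate = messagesPerSecond(total, 25.0);          // 20480 messages per second
        double bytes = payloadBytesPerSecond(rate, 30);        // 614400 bytes/s, i.e. 600 * 1024
        long perConn = heapBytesPerConnection(2684354560L, total); // 5242 bytes, roughly 5K
        System.out.println(total + " connections, " + rate + " msg/s, "
                + bytes + " B/s, " + perConn + " heap bytes per connection");
    }
}
```

The results line up with the "around 20000 requests per second", "around 600K/s" and "around 5K per connection" estimates given in the text.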
July 29, 2010

by Alex Tkachman


Practical PHP Patterns: Transform View

The Transform View pattern involves a view mechanism that processes data structures one element at a time and transforms them into an end-user representation like HTML. The Transform View usually takes the shape of a view class (or function) with these characterizing traits: its input is a domain model data structure, or an infrastructure one that contains domain elements, like a Collection class or an array. Its output is a presentational data structure like an HTML page or a page block. Recently, XML and JSON have become output formats as well. In fact, the simplest example of a natively available Transform View in PHP is the function json_encode(): it takes an array of data as its argument and boils it down to a JSON string. It is not capable of navigating a more complex object graph to render it, but the intent of the pattern is intact.

Implementation

The Transform View can act over multiple objects or a single one. Its main difference from a Template View is the mechanism for expressing how the response is produced: the former is a series of operations to apply to domain elements, the latter is the representation itself with several hooks where dynamic content is inserted. Concretely, we can think of the Template View as a PHP script, maybe driven by a generic class that renders it when a method is called; the Transform View is instead composed of programmatic code: it can be a class or a method.

Since patterns may be made more complex at will, you can introduce implementations that use XSLT to generate HTML or other representations from an XML one (with the XSL extension); but you still have to obtain the original XML, maybe by introspection of the domain structures. XSL is essentially a way to define transformations declaratively, mixing the two approaches and producing a portable description of the transformation which the data undergo; many times you're better off simply defining your transformation in code and covering (or driving) it with tests.
A trade-off is that you can always use this pattern only for some parts of the page (when the output is oriented to humans and not to machines), as in the Two Step View pattern which we'll see in the next article. These particular Transform Views are a subset of View Helpers, which take domain objects as their input.

Data access

Infrastructure classes are usually navigable via their own API, which is very rich. Domain classes are instead focused on modelling and information hiding rather than navigability, and this may raise an issue when displaying them through any kind of view. When a set of properties is exposed by a domain object directly or via accessors, we have two choices in building a Transform View:

- The object can be introspected dynamically, by using reflection or other metadata (e.g. annotations) to list public properties or getters.
- More specific Transform Views can be written which are designed for use only on a particular class. They will be strongly coupled to the original domain data structure but will also offer finer control over the presentation layer.

Advantages

An obvious advantage of this pattern is code reuse, when the implementation is generic enough to be valid for more than one class. A more subtle advantage may be the homogeneity of the different Transform View objects. For example, you may end up swapping different Transform Views or chaining them, or relying on polymorphism to select the right rendering for a use case. In general, promoting the View mechanism to an object yields the advantages of object-orientation (inheritance, polymorphism, encapsulation), and these pros are more easily perceived with a logic-intensive approach like a Transform View.

Disadvantages

The issues of this pattern come up with markup-intensive pages which contain a lot of semi-static HTML code, or code in another presentational format. Such pages are often much easier to produce with a Template View, where the dynamic bindings are submerged in the markup.
Even XML is sometimes easier to print out tag by tag with a Template View than to build by managing a DOM implementation.

Examples

The sample code of this article uses reflection to gather the getters of a domain object and display it. Of course it is not a complete implementation (it does not cover relationships), but within its limited scope it is quite useful. As always, you may prefer more specific incarnations of the pattern, but they will have to be kept in sync with the related domain model and will constitute a larger code footprint. However, they will give you more freedom to tune the result in specific cases.

    [...]_name = $name;
        }

        public function getName()
        {
            return $this->_name;
        }

        public function setCity($city)
        {
            $this->_city = $city;
        }

        public function getCity()
        {
            return $this->_city;
        }
    }

    class TransformView
    {
        /**
         * Iterates through getters and calls them to display $entity.
         * @param object $entity
         * @return string
         */
        public function display($entity)
        {
            $result = "\n";
            $rc = new ReflectionClass(get_class($entity));
            foreach ($rc->getMethods() as $method) {
                $methodName = $method->getName();
                // consider only methods whose name starts with 'get'
                if (strpos($methodName, 'get') === 0) {
                    // strip only the leading 'get', not later occurrences
                    $field = substr($methodName, 3);
                    $result .= "\n";
                    $result .= "{$field}\n";
                    $result .= "" . $entity->$methodName() . "\n";
                    $result .= "\n";
                }
            }
            $result .= "\n";
            return $result;
        }
    }

    $user = new User('Giorgio');
    $user->setCity('Como');
    $view = new TransformView();
    echo $view->display($user);
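The same reflection-driven idea can be sketched in Java. This is only an illustration under stated assumptions: the ReflectiveTransformView and SampleUser classes are hypothetical, the output markup (a definition list) is my own choice rather than the article's, and getters are sorted by name because Java does not guarantee the order in which reflection returns methods.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical domain class used only for the demonstration.
final class SampleUser {
    private final String name;
    private final String city;

    SampleUser(String name, String city) {
        this.name = name;
        this.city = city;
    }

    public String getName() { return name; }
    public String getCity() { return city; }
}

final class ReflectiveTransformView {
    // Calls every public no-argument getter of the entity and renders the values.
    static String display(Object entity) {
        Map<String, Object> fields = new TreeMap<>();
        try {
            for (Method m : entity.getClass().getMethods()) {
                String n = m.getName();
                boolean getter = n.startsWith("get") && n.length() > 3
                        && m.getParameterCount() == 0
                        && m.getDeclaringClass() != Object.class; // skips getClass()
                if (getter) {
                    fields.put(n.substring(3), m.invoke(entity));
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read " + entity.getClass(), e);
        }
        StringBuilder html = new StringBuilder("<dl>\n");
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            html.append("  <dt>").append(e.getKey()).append("</dt><dd>")
                .append(escape(String.valueOf(e.getValue()))).append("</dd>\n");
        }
        return html.append("</dl>\n").toString();
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

Calling ReflectiveTransformView.display(new SampleUser("Giorgio", "Como")) yields one entry per getter, City first and Name second, since the fields are sorted alphabetically.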

July 27, 2010

by Giorgio Sironi


Patterns for Using Custom Annotations

If you happen to create your own annotations, for instance to use with Java 6 pluggable annotation processors, here are some patterns that I collected over time. Nothing new, nothing fancy, just putting everything into one place, with some proposed names.

Local-name annotation

Have your tools accept any annotation as long as its simple name (without the fully-qualified prefix) is the expected one. For example, com.acme.NotNull and net.companyname.NotNull would be considered the same. This enables you to use your own annotations rather than the ones packaged with the tools, so that you do not depend on them. Example in the Guice documentation: Guice recognizes any @Nullable annotation, like edu.umd.cs.findbugs.annotations.Nullable or javax.annotation.Nullable.

Composed annotations

Annotations can have annotations as values. This allows for some complex and tree-like configurations, such as mappings from one format to another (from/to XML, JSON, RDBMS). Here is a rather simple example from the Hibernate Annotations documentation:

    @AssociationOverride(
        name="propulsion",
        joinColumns = @JoinColumn(name="fld_propulsion_fk")
    )

Multiplicity wrapper

Java does not allow using the same annotation several times on a given target. To work around that limitation, you can create a special annotation that expects a collection of values of the desired annotation type. For example, you'd like to apply the annotation @Advantage several times, so you create the multiplicity wrapper annotation: @Advantages(advantages = {@Advantage}). Typically the multiplicity wrapper is named after the plural form of its enclosed elements. Example in the Hibernate Annotations documentation:

    @AttributeOverrides( {
        @AttributeOverride(name="iso2", column = @Column(name="bornIso2") ),
        @AttributeOverride(name="name", column = @Column(name="bornCountryName") )
    } )

Meta-inheritance

It is not possible in Java for annotations to derive from each other.
To work around that, the idea is simply to annotate your new annotation with the "super" annotation, which becomes a meta-annotation. Whenever you use your own annotation carrying a meta-annotation, the tools will actually consider it as if it were the meta-annotation. This kind of meta-inheritance helps centralize the coupling to the external annotation in one place, while making the semantics of your own annotation more precise and meaningful. Example in Spring annotations, with the annotation @Component (it also works with the annotation @Qualifier): create your own custom stereotype annotation that is itself annotated with @Component:

    @Component
    public @interface MyComponent {
        String value() default "";
    }

    @MyComponent
    public class MyClass...

Another example in Guice, with the binding annotation:

    @BindingAnnotation
    @Target({ FIELD, PARAMETER, METHOD })
    @Retention(RUNTIME)
    public @interface PayPal {}

    // then use it
    public class RealBillingService implements BillingService {
        @Inject
        public RealBillingService(@PayPal CreditCardProcessor processor,
                TransactionLog transactionLog) {
            ...
        }

Refactoring-proof values

Prefer values that are robust to refactorings rather than string literals. MyClass.class is better than "com.acme.MyClass", and enums are also encouraged. Example in the Hibernate Annotations documentation:

    @ManyToOne( cascade = {CascadeType.PERSIST, CascadeType.MERGE}, targetEntity=CompanyImpl.class )

And another example in the Guice documentation:

    @ImplementedBy(PayPalCreditCardProcessor.class)

Configuration precedence rule

Convention over configuration and sensible defaults are two existing patterns that make a lot of sense with respect to using annotations as part of a configuration strategy. Having no need to annotate is way better than having to annotate for little value. Annotations are by nature embedded in the code, hence they are not well-suited for every case of configuration, in particular when it comes to deployment-specific configuration.
The solution is of course to mix annotations with other mechanisms and use each of them where it is most appropriate. The following approach, based on a precedence rule where each mechanism overrides the previous one, appears to work well:

    default value < annotation < XML < programmatic configuration

For example, the default values could be suited for unit testing, while the annotations define all the stable configuration, leaving the other options to configure deployments at the various stages, like production or QA environments. This principle is common (Spring and Java EE 6, among others); for example, in JPA the concept of configuration by exception is central to the specification.

Conclusion

This post is mostly a notepad of various patterns on how to use annotations, for instance when creating tools that process annotations, such as the annotation processing tools in Java 5 and the pluggable annotation processors in Java 6. Don't hesitate to contribute better pattern names, additional patterns and other examples of use.

From http://cyrille.martraire.com/2010/07/patterns-for-using-annotations/
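To make the local-name and meta-inheritance patterns more concrete, here is a minimal sketch of how a tool might detect them with plain Java reflection. Everything in it is hypothetical (the Stereotype and MyComponent annotations, the MyService class and the MetaAnnotations helper); it is not how Guice or Spring implement the lookup, and it only inspects one level of meta-annotation.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical "super" annotation, usable on types and on other annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.TYPE, ElementType.ANNOTATION_TYPE })
@interface Stereotype {}

// Hypothetical custom annotation that "inherits" from Stereotype.
@Stereotype
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface MyComponent {}

@MyComponent
final class MyService {}

final class MetaAnnotations {
    // Meta-inheritance: true if the type carries the annotation directly,
    // or carries an annotation that is itself annotated with it.
    static boolean isMarkedWith(Class<?> type, Class<? extends Annotation> meta) {
        if (type.isAnnotationPresent(meta)) {
            return true;
        }
        for (Annotation a : type.getAnnotations()) {
            if (a.annotationType().isAnnotationPresent(meta)) {
                return true;
            }
        }
        return false;
    }

    // Local-name annotation: match on the simple name only, ignoring the package.
    static boolean hasAnnotationNamed(Class<?> type, String simpleName) {
        for (Annotation a : type.getAnnotations()) {
            if (a.annotationType().getSimpleName().equals(simpleName)) {
                return true;
            }
        }
        return false;
    }
}
```

Both lookups rely on runtime retention; annotations with source or class retention would need an annotation processor or bytecode inspection instead.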

July 26, 2010

by Cyrille Martraire


Practical PHP Patterns: Template View

In this series we have seen two Web presentation patterns so far:

- Page Controller, also known as Action Controller, which responds to a specific HTTP request from a client.
- Front Controller, which acts as the single entry point of a web application and routes requests to different Page Controllers.

These two patterns are part of the Controller component of an MVC implementation. But do controllers generate the HTML response directly? The answer is, usually, no. Controllers are mostly glue code which deals with orthogonal concerns such as authentication, not a means of generating a response. In the MVC pattern, the Model is application-specific and the Controller layer is a combination of the two patterns treated earlier, while a View component is needed to complete the picture.

There are different levels of complexity among the various View patterns available in the field of web presentation, but we'll start with the classic Template View. A Template View is a bunch of HTML code with dynamic points submerged in it; again, PHP files can be thought of as an instance of this pattern.

    [...]Hello, {$_GET['name']}."; ?>

Template View as an MVC component

In a web MVC approach, the data is not pulled into the view script by accessing the Model directly, because the Controller layer should be responsible for that. Nor does the view script contain the behavior of the application, as in the example above. Instead, the non-static data used in the Template View is passed to it by the Controller. As a result, the View has no dependencies on any classes other than those of the objects passed to it (plain data structures or Model classes such as Entities and Value Objects). Ideally, a Template View should be declarative.
It is usually implemented as a PHP script, so we have various options to render data, ranging from the almost-declarative:

    [...]

A more complex statement:

    [...]

A versatile approach:

    [...]method(); ?>

PHP and other server-side programming languages originally started using the < and > tags because WYSIWYG editors would ignore them, and let a designer access the Template View for editing without messing up the code. There is also the problem of well-formed XML documents, which these scripts are not (though it is not usually a problem unless you want to generate them). PHP as a templating language even offers some alternative constructs you may use in Template Views to simplify the structure of the code (bracket-less constructs):

    [...]

Popularity

The Template View pattern is at the foundation of dynamic web pages. PHP was born with these ideas (the recursive acronym PHP means PHP: Hypertext Preprocessor). JSP and ASP are similar solutions where the dynamic pieces are put into an HTML context. The problem with all three languages is the quantity of business logic that gets carried into the views, while we would want to separate it out into a testable, reusable object model.

Smarty and some other PHP template engines take the Template View a level further, by defining a syntax for markers and parsing the Template Views instead of executing them directly (thus these views are not .php files but are in some other format like .tpl). Personally I have never used them, since I think PHP is already a template language which has all the flexibility I would want from one; at the same time, template engines eventually replicate the PHP syntax:

    {if $edit_flag} edit {/if}

Advantages and disadvantages

The radical advantage of adopting the Template View pattern is the clean separation of presentation logic from business-related logic. This also gives people with only basic knowledge of PHP (designers, SEO gurus, etc.) the freedom to shape the HTML.
There are also some issues in using Template Views, but they are common to all View-related patterns. Basically, the presentation logic contained in views can be taken too far, to the point that domain-related logic scatters into the view instead of sitting in the Controller or in the Model. This is a problem which occurred very often in legacy PHP applications, built before the current generation of frameworks.

Examples

Zend Framework, our reference for Web presentation patterns, defines a generic View class which takes the path of a view script as an input and renders it in a scope internal to one of its methods (automatically: once the controller has finished, the view is found and run). This way, the context variables set up by the controller are made available to the view script on $this via __get and __set overloading.

    Hello, [...]name; ?>.

Not only variables (scalars and objects) can be put on $this, but also View helpers, via __call() overloading. In this picture, custom methods can factor out presentation logic into external plugin objects which are wired to the generic View object, which forwards calls to them or returns them to the view script itself.

    [...]headTitle($this->object);
    $this->headTitle($this->methodName);
    if ($this->result) {
        echo "Result: $this->result";
    }
    if ($this->form) {
        echo $this->form;
    }

Zend Framework also automatically switches the name of the view to render based on some conventional parameters like format. For example, ?format=xml will cause a comments.xml.phtml to be rendered instead of comments.phtml:

    [...]comments as $c) {
        $guid = $host
              . $this->url(array('slug' => $this->article['slug']), 'article', true)
              . '#comment_' . $c['id'];
        $entries[] = array(
            'title'       => 'Comment #' . $c['id'],
            'link'        => $guid,
            'guid'        => $guid,
            'description' => strip_tags($c['text'])
        );
    }
    $feed = Zend_Feed::importArray(array(
        'title'       => $this->article['title'],
        'description' => $this->article['description'],
        'author'      => $this->article['author'],
        'link'        => $host . $this->url() . '?format=xml',
        'charset'     => 'utf-8',
        'entries'     => $entries
    ));
    echo $feed->saveXml();

Note that all the data resources needed by the view are accessed through $this, the View object (they have to be injected beforehand).
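The core mechanism of a Template View (static markup with a few dynamic points filled in from data that the controller passes along) can be sketched as follows. This is an illustration in Java, not Zend Framework's implementation: the TemplateView class and its {name} placeholder syntax are hypothetical, and values are HTML-escaped before insertion.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateView {
    // Hypothetical placeholder syntax: {variableName}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    private final String template;

    TemplateView(String template) {
        this.template = template;
    }

    // Replaces each placeholder with the escaped value from the model;
    // unknown placeholders render as an empty string.
    String render(Map<String, ?> model) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = model.get(m.group(1));
            String text = (value == null) ? "" : escape(value.toString());
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

The view only sees the model it is given, which mirrors the point made above: the controller decides what data reaches the template, and the template stays close to declarative markup.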

July 21, 2010

by Giorgio Sironi


Optimizing JPA Performance: An EclipseLink, Hibernate, and OpenJPA Comparison

'Impedance mismatch'. No two words better encompass the troubles, headaches and quirks most developers face when attempting to link applications to relational databases (RDBMS). But let's face it, object-oriented designs aren't going away anytime soon from mainstream languages, and neither are the relational storage systems used in most applications. One side works with objects, while the other works with tables. Resolving these differences -- or, as it's technically referred to, the 'object/relational impedance mismatch' -- can result in substantial overhead, which in turn can materialize into poor application performance.

In Java, the Java Persistence API (JPA) is one of the most popular mechanisms used to bridge the gap between objects (i.e. the Java language) and tables (i.e. relational databases). Though there are other mechanisms that allow Java applications to interact with relational databases -- such as JDBC and JDO -- JPA has gained wider adoption due to its underpinnings: Object Relational Mapping (ORM). ORM's gain in popularity is due precisely to its being specifically designed to address the interaction between objects and tables.

In the case of JPA, there is a standards body charged with setting its course, a process which has given way to several JPA implementations. The three most popular are EclipseLink (evolved from TopLink), Hibernate and OpenJPA. But even though all three are based on the same standard, ORM is such a deep and complex topic that, beyond core functionality, each implementation has differences ranging from configuration to optimization techniques.

What I will do next is explain a series of topics related to optimizing an application's use of JPA, using and comparing each of these JPA implementations.
While JPA is capable of automatically creating relational tables and can work with a series of relational database vendors, I will start from pre-existing data deployed on a MySQL relational database, in addition to relying on the Spring framework to facilitate the use of JPA. This will not only make for a fairer comparison, but also make the described techniques appealing to a wider audience, since performance issues become a serious concern once you have a large volume of data, and MySQL and Spring are a common choice due to their community-driven (i.e. open-source) roots. See the source code/application section at the end for instructions on setting up the application code discussed in the remainder of the sections.

Download the Source Code associated with this article (~45 MB)

The basics: Metrics

In order to establish JPA performance levels in an application, it's vital to first obtain a series of metrics related to a JPA implementation's inner workings. These include things like: What are the actual queries being performed against an RDBMS? How long does each query take? Are queries being performed constantly against the RDBMS, or is a cache being used?

These metrics will be critical to our performance analysis, since they will shed light on the underlying operations performed by a JPA implementation and in the process show the effectiveness or ineffectiveness of certain techniques. In this area you will find the first differences among implementations, and I'm not talking about metric results, but about how to obtain these metrics in the first place.

To kick things off, I will first address the topic of logging. By default, all three JPA implementations discussed here -- EclipseLink, Hibernate and OpenJPA -- log the queries performed against an RDBMS, which will be an advantage in determining whether the queries performed by an ORM are optimal for a particular relational data model.
Nevertheless, tweaking the logging level of a JPA implementation further can be helpful for one of two things: getting even more details about the underlying operations made by a JPA implementation, some of which may be turned off by default (e.g. database connection details), or getting no logging information at all, which can benefit a production system's performance.

Logging in JPA implementations is managed through one of several logging frameworks, such as Apache Commons Logging or Log4J. This requires the presence of such libraries in an application. Logging configuration of a JPA implementation is mostly done through a value in an application's persistence.xml file or, in some cases, directly in a logging framework's configuration files. The following table describes JPA logging configuration parameters: Large table, so here's an external link

In addition to the information obtained through logging, there is another set of JPA performance metrics which require different steps to obtain. One of these metrics is the time it takes to perform a query. Some JPA implementations provide this information under certain configurations, and some do not. Even so, I opted to use a separate approach and apply it to all three JPA implementations in question. After all, time metrics measured in milliseconds can be skewed in certain ways depending on start and end time criteria. So to measure query times, I will use Aspects with the aid of the Spring framework.

Aspects will allow us to measure the time it takes a method containing a query to be executed, without mixing the timing logic with the actual query logic -- this separation being the whole purpose of using Aspects. Discussing Aspects further would go beyond the scope of performance, so next I will concentrate on the Aspect itself. I advise you to look over the accompanying source code, Aspects and Spring Aspects for more details on these topics and their configuration.
The following Aspect is used for measuring execution times in query methods.

    package com.webforefront.aop;

    import org.apache.commons.lang.time.StopWatch;
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.aspectj.lang.ProceedingJoinPoint;
    import org.aspectj.lang.annotation.Around;
    import org.aspectj.lang.annotation.Aspect;

    @Aspect
    public class DAOInterceptor {
        private Log log = LogFactory.getLog(DAOInterceptor.class);

        // Wraps every method of every class under com.webforefront.jpa.service
        @Around("execution(* com.webforefront.jpa.service..*.*(..))")
        public Object logQueryTimes(ProceedingJoinPoint pjp) throws Throwable {
            StopWatch stopWatch = new StopWatch();
            stopWatch.start();
            Object retVal = pjp.proceed();
            stopWatch.stop();
            String str = pjp.getTarget().toString();
            log.info(str.substring(str.lastIndexOf(".") + 1, str.lastIndexOf("@"))
                    + " - " + pjp.getSignature().getName()
                    + ": " + stopWatch.getTime() + "ms");
            return retVal;
        }
    }

The main part of the Aspect is the @Around annotation. The value assigned to this annotation indicates that the aspect method -- logQueryTimes -- is to be executed each time a method belonging to a class in the com.webforefront.jpa.service package is executed; this package is where all our application's JPA query methods will reside. The logic performed by the logQueryTimes aspect method calculates the execution time and outputs it as logging information using Apache Commons Logging.

Another set of important JPA metrics is related to statistics beyond those provided by standard logging. The statistics I'm referring to relate to caches, sessions and transactions. Since the JPA standard doesn't dictate any particular approach to statistics, each JPA implementation varies in the type of statistics it collects and the way it collects them. Both Hibernate and OpenJPA have their own statistics class, whereas EclipseLink relies on a Profiler to gather similar metrics.
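For readers who want to try the timing idea behind logQueryTimes without Spring or AspectJ on the classpath, the same "around" behaviour can be approximated with a JDK dynamic proxy. This is a swapped-in technique, not the article's approach, and every name in it (TimingProxy, PlayerService, the log list) is hypothetical; it also only works for targets that are accessed through an interface.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;

// Hypothetical service interface standing in for a JPA query class.
interface PlayerService {
    List<String> findAll();
}

final class TimingProxy {
    // Returns a proxy that times every call made through the interface
    // and appends a "Class - method: Nms" line to the supplied log.
    @SuppressWarnings("unchecked")
    static <T> T wrap(T target, Class<T> iface, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the target's own exception
            } finally {
                long ms = (System.nanoTime() - start) / 1_000_000;
                log.add(target.getClass().getSimpleName() + " - "
                        + method.getName() + ": " + ms + "ms");
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler);
    }
}
```

As with the Aspect, the timing logic stays outside the query code; the trade-off is that the proxy has to be wired in by hand, whereas a pointcut expression applies to a whole package at once.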
Since I'm already relying on Aspects, I will also use an Aspect to obtain statistics both before and after the execution of a JPA query method. The following Aspect obtains statistics for an application relying on Hibernate.

    package com.webforefront.aop;

    import org.hibernate.stat.Statistics;
    import org.hibernate.SessionFactory;
    import org.aspectj.lang.ProceedingJoinPoint;
    import org.aspectj.lang.annotation.Around;
    import org.aspectj.lang.annotation.Aspect;
    import org.springframework.beans.factory.annotation.Autowired;
    import javax.persistence.EntityManagerFactory;
    import org.hibernate.ejb.HibernateEntityManagerFactory;
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;

    @Aspect
    public class CacheHibernateInterceptor {
        private Log log = LogFactory.getLog(DAOInterceptor.class);

        @Autowired
        private EntityManagerFactory entityManagerFactory;

        @Around("execution(* com.webforefront.jpa.service..*.*(..))")
        public Object log(ProceedingJoinPoint pjp) throws Throwable {
            HibernateEntityManagerFactory hbmanagerfactory =
                    (HibernateEntityManagerFactory) entityManagerFactory;
            SessionFactory sessionFactory = hbmanagerfactory.getSessionFactory();
            Statistics statistics = sessionFactory.getStatistics();
            String str = pjp.getTarget().toString();
            statistics.setStatisticsEnabled(true);
            log.info(str.substring(str.lastIndexOf(".") + 1, str.lastIndexOf("@"))
                    + " - " + pjp.getSignature().getName()
                    + ": (Before call) " + statistics);
            Object result = pjp.proceed();
            log.info(str.substring(str.lastIndexOf(".") + 1, str.lastIndexOf("@"))
                    + " - " + pjp.getSignature().getName()
                    + ": (After call) " + statistics);
            return result;
        }
    }

Notice the similar structure to the prior timing Aspect, except that in this case the logging output contains values that belong to the Hibernate Statistics class, obtained via the application's EntityManagerFactory. The next Aspect is used to obtain statistics for an application relying on OpenJPA.
    package com.webforefront.aop;

    import org.apache.openjpa.datacache.CacheStatistics;
    import org.apache.openjpa.persistence.OpenJPAEntityManagerFactory;
    import org.apache.openjpa.persistence.OpenJPAPersistence;
    import org.aspectj.lang.ProceedingJoinPoint;
    import org.aspectj.lang.annotation.Around;
    import org.aspectj.lang.annotation.Aspect;
    import org.springframework.beans.factory.annotation.Autowired;
    import javax.persistence.EntityManagerFactory;
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;

    @Aspect
    public class CacheOpenJPAInterceptor {
        private Log log = LogFactory.getLog(DAOInterceptor.class);

        @Autowired
        private EntityManagerFactory entityManagerFactory;

        @Around("execution(* com.webforefront.jpa.service..*.*(..))")
        public Object log(ProceedingJoinPoint pjp) throws Throwable {
            OpenJPAEntityManagerFactory ojpamanagerfactory =
                    OpenJPAPersistence.cast(entityManagerFactory);
            CacheStatistics statistics = ojpamanagerfactory.getStoreCache().getStatistics();
            String str = pjp.getTarget().toString();
            log.info(str.substring(str.lastIndexOf(".") + 1, str.lastIndexOf("@"))
                    + " - " + pjp.getSignature().getName()
                    + ": (Before call) Statistics [start time=" + statistics.start()
                    + ",read count=" + statistics.getReadCount()
                    + ",hit count=" + statistics.getHitCount()
                    + ",write count=" + statistics.getWriteCount()
                    + ",total read count=" + statistics.getTotalReadCount()
                    + ",total hit count=" + statistics.getTotalHitCount()
                    + ",total write count=" + statistics.getTotalWriteCount());
            Object result = pjp.proceed();
            log.info(str.substring(str.lastIndexOf(".") + 1, str.lastIndexOf("@"))
                    + " - " + pjp.getSignature().getName()
                    + ": (After call) Statistics [start time=" + statistics.start()
                    + ",read count=" + statistics.getReadCount()
                    + ",hit count=" + statistics.getHitCount()
                    + ",write count=" + statistics.getWriteCount()
                    + ",total read count=" + statistics.getTotalReadCount()
                    + ",total hit count=" + statistics.getTotalHitCount()
                    + ",total write count=" + statistics.getTotalWriteCount());
            return result;
        }
    }

Once again, notice the similar structure to the previous Aspect, which also relies on the application's EntityManagerFactory. In this case, the logging output contains values that belong to the OpenJPA CacheStatistics class. Since OpenJPA does not enable statistics by default, you will need to add the following two properties to an application's persistence.xml file:

    [...]

The first property ensures statistics are gathered, while the second property indicates that the gathering of statistics takes place on a single JVM. NOTE: The value "true(EnableStatistics=true)" also enables caching in addition to statistics.

Since EclipseLink doesn't have any particular statistics class and relies on a Profiler to determine advanced metrics, the simplest way to obtain statistics similar to those of Hibernate and OpenJPA is through the Profiler itself. To activate EclipseLink's Profiler you just need to add the following property to an application's persistence.xml file:

    [...]

By doing so, the EclipseLink Profiler outputs several metrics on each JPA query method execution as logging information.

Now that you know how to obtain several metrics from all three JPA implementations, and understand that they will be obtained as fairly as possible for all three providers, it's time to put each JPA implementation to the test along with several performance techniques.

JPQL queries, weaving and class transformations

Let's start by making a query that retrieves data belonging to a pre-existing RDBMS table named "Master". The "Master" table contains over 17,000 records belonging to baseball players. To simplify matters, I will create a Java class named "Player" and map it to the "Master" table in order to retrieve the records as objects.
Next, relying on the Spring framework's JpaTemplate functionality, I will set up a query to retrieve all "Player" objects, with the query taking the following form: getJpaTemplate().find("select e from Player e"); See the accompanying source code for more details on this last process.

Next, I deploy the application using each of the three JPA implementations on Apache Tomcat, doing so separately, and stopping and restarting the server on each deployment to ensure fair results. These are the results of doing so on a 64-bit Ubuntu box with 4 GB of RAM, using Java 1.6:

All player objects - 17,468 records

Implementation | Time | Query
Hibernate | 3558 ms | select player0_.lahmanID as lahmanID0_, player0_.nameFirst as nameFirst0_, player0_.nameLast as nameLast0_ from Master player0_
EclipseLink (Run-time weaver - Spring ReflectiveLoadTimeWeaver weaver) | 3215 ms | SELECT lahmanID, nameLast, nameFirst FROM Master
EclipseLink (Build-time weaving) | 3571 ms | SELECT lahmanID, nameLast, nameFirst FROM Master
EclipseLink (No weaving) | 3996 ms | SELECT lahmanID, nameLast, nameFirst FROM Master
OpenJPA (Build-time enhanced classes) | 5998 ms | SELECT t0.lahmanID, t0.nameFirst, t0.nameLast FROM Master t0
OpenJPA (Run-time enhanced classes - OpenJPA enhancer) | 6136 ms | SELECT t0.lahmanID, t0.nameFirst, t0.nameLast FROM Master t0
OpenJPA (Non enhanced classes) | 7677 ms | SELECT t0.lahmanID, t0.nameFirst, t0.nameLast FROM Master t0

As you can observe, the queries performed by each JPA implementation are fairly similar, with two of them using a shortcut notation (e.g. t0 and player0_ for the table named 'Master'). This syntax variation has minimal impact on performance, since directly querying an RDBMS using any of these notation variations shows identical results. However, the query times across the JPA implementations and their configurations vary considerably. One important factor behind this time difference is how each implementation handles JPA entities.
Let's start with the OpenJPA implementation, which had the poorest times. OpenJPA can execute an enhancement process on Java entities (e.g. in this case the 'Player' class). This enhancement process can be performed when the entities are built, at run-time, or forgone altogether. As you can observe, forgoing entity enhancement altogether in OpenJPA produced the longest query times, whereas enhancing entities at either build-time or run-time produced relatively better results, with the former beating out the latter.

By default, OpenJPA expects entities to be enhanced. This means you will either need to explicitly configure an application to support unenhanced classes by adding the following: ...property to an application's persistence.xml file, or enhance classes at build-time or at run-time relying on the OpenJPA enhancer; otherwise an application relying on OpenJPA will throw an error. Given these OpenJPA results, the remaining OpenJPA tests will be based on build-time enhanced entity classes. For more on the topic of OpenJPA enhancement, refer to the OpenJPA documentation in addition to consulting the accompanying source code for this article.

You may be wondering what exactly constitutes OpenJPA enhancement. OpenJPA entity enhancement is a processing step applied to the bytecode generated by the Java compiler which adds JPA-specific instructions to provide optimal runtime performance. These instructions can include things like flexible lazy loading and dirty read tracking.

So why don't Hibernate or EclipseLink enhance entities? In short, Hibernate and EclipseLink also enhance JPA entities, they just don't outright call it 'enhancement'. EclipseLink calls this 'enhancement' process by the more technical term: weaving. Similar to OpenJPA's enhancement process, weaving in EclipseLink can take place at build-time (a.k.a. static weaving), at run-time, or be forgone altogether.
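To make the idea of enhancement or weaving more concrete, the following is a hand-written sketch of the kind of change-tracking bookkeeping such a step could add to an entity's setters. It is only a conceptual illustration in plain Java: the class TrackedPlayer and its methods are hypothetical, and the bytecode that OpenJPA or EclipseLink actually generates is different and considerably more involved.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical, hand-written stand-in for an "enhanced" entity.
class TrackedPlayer {
    private String nameFirst;
    private String nameLast;
    // Names of the fields changed since the entity was last marked clean.
    private final Set<String> dirtyFields = new LinkedHashSet<>();

    TrackedPlayer(String nameFirst, String nameLast) {
        this.nameFirst = nameFirst;
        this.nameLast = nameLast;
    }

    String getNameFirst() { return nameFirst; }
    String getNameLast() { return nameLast; }

    // Each setter records the field name only when the value really changes.
    void setNameFirst(String value) {
        if (!Objects.equals(this.nameFirst, value)) {
            this.nameFirst = value;
            dirtyFields.add("nameFirst");
        }
    }

    void setNameLast(String value) {
        if (!Objects.equals(this.nameLast, value)) {
            this.nameLast = value;
            dirtyFields.add("nameLast");
        }
    }

    boolean isDirty() { return !dirtyFields.isEmpty(); }

    Set<String> dirtyFields() { return Collections.unmodifiableSet(dirtyFields); }

    // Would be called after pending changes have been written out.
    void markClean() { dirtyFields.clear(); }
}
```

With bookkeeping of this kind in place, a provider can tell which fields changed without comparing every entity against a saved snapshot, which is one plausible reason why unenhanced or unwoven entities tend to be slower.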
As you can observe in the results, all of EclipseLink's tests present smaller variations compared to OpenJPA. The longest EclipseLink time involved not using weaving. If you think about it, this is rather logical given that the purpose of weaving consists of altering Java byte code in order to add optimized JPA instructions that include lazy loading, change tracking, fetch groups and internal optimizations.

For the EclipseLink tests using weaving, both build-time and run-time weaving present better results. For build-time weaving, I used EclipseLink's library along with an Apache Ant task, whereas for run-time weaving, I used the Spring framework's ReflectiveLoadTimeWeaver. I can only assume the slightly better performance of run-time weaving over build-time weaving in EclipseLink was due to using a weaver integrated with the Spring framework, which in turn could result in better JPA optimizations designed for Spring applications. Nevertheless, considering the test result of forgoing weaving altogether, weaving does not appear to have a major performance impact when using EclipseLink, ceteris paribus.

By default, EclipseLink expects run-time weaving to be enabled, otherwise you will receive an error in the form 'Cannot apply class transformer without LoadTimeWeaver specified'. This means that for cases using build-time weaving or no weaving at all, you will need to explicitly indicate this behavior. In order to disable EclipseLink weaving you will need to either configure an application's EntityManagerFactory Spring bean with: ... or add the .... ...property to an application's persistence.xml file. To indicate an application's entities are built using build-time weaving, substitute the previous property's "false" value with "static". To configure the default run-time weaver expected by EclipseLink, add the following: ...property to an application's EntityManagerFactory Spring bean.
Given these EclipseLink results, the remaining EclipseLink tests will be based on run-time weaving provided by the Spring framework. For more on the topic of EclipseLink weaving, refer to the EclipseLink documentation at http://wiki.eclipse.org/Introduction_to_EclipseLink_Application_Development_(ELUG)#Using_Weaving, in addition to consulting the accompanying source code for this article.

Hibernate requires neither entity enhancement nor weaving. For this reason, there is only one test result. This not only makes Hibernate simpler to set up, but judging by its only test result -- which clocks in at second place with respect to all other tests -- Hibernate's performance ranks high compared to its counterparts. However, in what I would consider Hibernate's equivalent to OpenJPA's enhancement process or EclipseLink's weaving, you will find a series of Hibernate properties. For example, Hibernate has properties such as hibernate.default_batch_fetch_size designed to optimize lazy loading. As you might recall, one of the purposes of both OpenJPA's enhancement process and EclipseLink's weaving is the optimization of lazy loading. So whereas OpenJPA and EclipseLink require a separate and monolithic step -- at build-time or run-time -- to achieve JPA optimization techniques, Hibernate falls back to the use of granular properties specified in an application's persistence.xml file. Nevertheless, given that Hibernate's default behavior proved to be on par with the best query times, I didn't feel a need to explore these Hibernate properties further.

To get another sense of the times and mapping procedures of each JPA implementation, I will make more selective queries based on a Player object's first name and last name. These are the results of performing a query for all Player objects whose first name is John and a query for all Player objects whose last name is Smith.
All player objects whose first name is John - 472 records

Implementation | Time | Query
EclipseLink | 1265 ms | SELECT lahmanID, nameLast, nameFirst FROM Master WHERE (nameFirst = ?)
Hibernate | 613 ms | select player0_.lahmanID as lahmanID0_, player0_.nameFirst as nameFirst0_, player0_.nameLast as nameLast0_ from Master player0_ where player0_.nameFirst=?
OpenJPA | 1643 ms | SELECT t0.lahmanID, t0.nameFirst, t0.nameLast FROM Master t0 WHERE (t0.nameFirst = ?) [params=?]

All player objects whose last name is Smith - 146 records

Implementation | Time | Query
EclipseLink | 986 ms | SELECT lahmanID, nameLast, nameFirst FROM Master WHERE (nameLast = ?)
Hibernate | 537 ms | select player0_.lahmanID as lahmanID0_, player0_.nameFirst as nameFirst0_, player0_.nameLast as nameLast0_ from Master player0_ where player0_.nameLast=?
OpenJPA | 1452 ms | SELECT t0.lahmanID, t0.nameFirst, t0.nameLast FROM Master t0 WHERE (t0.nameLast = ?) [params=?]

These test results tell a slightly different story, with all three JPA implementations presenting substantial time differences amongst one another. At a lower record count, Hibernate's out-of-the-box configuration resulted in queries almost twice as fast as its closest competitor and almost three times as fast as its other competitor.

To get an even broader sense of the times and mapping procedures of each JPA implementation, I will make a query on a single Player object based on its id. These are the results of performing such a query.

Single player object whose ID is 777 - 1 record

Implementation | Time | Query
EclipseLink | 521 ms | SELECT lahmanID, nameLast, nameFirst FROM Master WHERE (lahmanID = ?)
Hibernate | 157 ms | select player0_.lahmanID as lahmanID0_0_, player0_.nameFirst as nameFirst0_0_, player0_.nameLast as nameLast0_0_ from Master player0_ where player0_.lahmanID=?
OpenJPA | 1052 ms | SELECT t0.nameFirst, t0.nameLast FROM Master t0 WHERE t0.lahmanID = ? [params=?]
With the exception of the faster query times -- due to it being a query for a single Player object -- the times between JPA implementations are practically in proportion to the queries used for extracting multiple Player objects by first and last name.

This will do it as far as test queries are concerned. However, a word of caution is in order when discussing these topics on optimization/enhancement/weaving. Even though the previous tests consisted of querying over 17,000 records and confirm clear advantages of using one provider and technique over another, they are still one-dimensional, since they're based on read operations performed on a single object type and a single RDBMS table. JPA can perform a large array of operations that also include updating, writing and deleting RDBMS records, not to mention the execution of more elaborate queries that can span multiple objects and tables. In addition, an RDBMS can itself have influencing factors (e.g. indexes) over JPA query times. So all this said, it's not too far-fetched to think the use of OpenJPA entity enhancement, EclipseLink weaving or Hibernate properties could have varying effects -- either beneficial or detrimental -- depending on the queries (i.e. multi-table, multi-object) and type of JPA operation (i.e. read, write, update, delete) involved.

Next, I will describe one of the most popular techniques used to boost performance in JPA applications.

Caches

A cache allows data to remain closer to an application's tier without constantly polling an RDBMS for the same data. I titled the section in plural -- caches -- because there can be several caches involved in an application using JPA. This of course doesn't mean you have to configure or use all the caches provided by an application relying on JPA, but properly configuring caches can go a long way toward enhancing an application's JPA performance.
So let's start by analyzing what each JPA implementation offers in its out-of-the-box state in terms of caching. The following table illustrates tests done by simply invoking the previous JPA queries for a second and third consecutive time, without stopping the server. Note that the same process of deploying a single application at a time was used, in addition to the server being restarted on each set of tests.

Query | EclipseLink | Hibernate | OpenJPA
All records (1st time) | 3215 ms | 3558 ms | 5998 ms
All records (2nd time) | 507 ms | 272 ms | 521 ms
All records (3rd time) | 439 ms | 218 ms | 263 ms
First name (1st time) | 1265 ms | 613 ms | 1643 ms
First name (2nd time) | 151 ms | 115 ms | 239 ms
First name (3rd time) | 154 ms | 101 ms | 227 ms
Last name (1st time) | 986 ms | 537 ms | 1452 ms
Last name (2nd time) | 41 ms | 41 ms | 112 ms
Last name (3rd time) | 65 ms | 38 ms | 117 ms
By ID (1st time) | 521 ms | 157 ms | 1052 ms
By ID (2nd time) | 1 ms | 6 ms | 3 ms
By ID (3rd time) | 1 ms | 3 ms | 3 ms

As you can observe, on both the second and third invocation all the queries show substantial improvements with respect to the first invocation. The primary cause of these improvements is unequivocally the use of a cache. But what type of cache exactly? Could it be an RDBMS's own caching engine? JPA? Spring? Or some other variation? In order to shed some light on cache usage, the following table illustrates the cache statistics generated on each of the previous JPA queries.
Query / Implementation: statistics reported by EclipseLink, Hibernate and OpenJPA

All records (2nd time)
  EclipseLink: number of objects=17468, total time=506, local time=506, row fetch=65, object building=328, cache=112, sql execute=47, objects/second=34521
  Hibernate: sessions opened=2, sessions closed=2, connections obtained=2, statements prepared=2, statements closed=2, second level cache puts=0, second level cache hits=0, second level cache misses=0, entities loaded=34936, queries executed to database=2, query cache puts=0, query cache hits=0, query cache misses=0
  OpenJPA: N/A

All records (3rd time)
  EclipseLink: number of objects=17468, total time=435, local time=435, profiling time=1, row fetch=28, object building=323, cache=106, logging=1, sql execute=27, objects/second=40156
  Hibernate: sessions opened=3, sessions closed=3, connections obtained=3, statements prepared=3, statements closed=3, second level cache puts=0, second level cache hits=0, second level cache misses=0, entities loaded=52404, queries executed to database=3, query cache puts=0, query cache hits=0, query cache misses=0
  OpenJPA: N/A

First name (2nd time)
  EclipseLink: number of objects=472, total time=148, local time=148, row fetch=27, object building=106, cache=7, logging=1, sql execute=3, objects/second=3189
  Hibernate: sessions opened=2, sessions closed=2, connections obtained=2, statements prepared=2, statements closed=2, second level cache puts=0, second level cache hits=0, second level cache misses=0, entities loaded=944, queries executed to database=2, query cache puts=0, query cache hits=0, query cache misses=0
  OpenJPA: N/A

First name (3rd time)
  EclipseLink: number of objects=472, total time=152, local time=152, row fetch=20, object building=121, cache=7, sql execute=3, objects/second=3105
  Hibernate: sessions opened=3, sessions closed=3, connections obtained=3, statements prepared=3, statements closed=3, second level cache puts=0, second level cache hits=0, second level cache misses=0, entities loaded=1416, queries executed to database=3, query cache puts=0, query cache hits=0, query cache misses=0
  OpenJPA: N/A

Last name (2nd time)
  EclipseLink: number of objects=146, total time=40, local time=40, row fetch=7, object building=27, cache=2, logging=1, sql execute=3, objects/second=3650
  Hibernate: sessions opened=2, sessions closed=2, connections obtained=2, statements prepared=2, statements closed=2, second level cache puts=0, second level cache hits=0, second level cache misses=0, entities loaded=292, queries executed to database=2, query cache puts=0, query cache hits=0, query cache misses=0
  OpenJPA: N/A

Last name (3rd time)
  EclipseLink: number of objects=146, total time=63, local time=63, profiling time=1, row fetch=6, object building=19, cache=5, sql prepare=1, sql execute=23, objects/second=2317
  Hibernate: sessions opened=3, sessions closed=3, connections obtained=3, statements prepared=3, statements closed=3, second level cache puts=0, second level cache hits=0, second level cache misses=0, entities loaded=438, queries executed to database=3, query cache puts=0, query cache hits=0, query cache misses=0
  OpenJPA: N/A

By ID (2nd time)
  EclipseLink: number of objects=1, total time=1, local time=1, time/object=1, objects/second=1000
  Hibernate: sessions opened=2, sessions closed=2, connections obtained=2, statements prepared=2, statements closed=2, second level cache puts=0, second level cache hits=0, second level cache misses=0, entities loaded=2, queries executed to database=0, query cache puts=0, query cache hits=0, query cache misses=0
  OpenJPA: N/A

By ID (3rd time)
  EclipseLink: number of objects=1, total time=1, local time=1, time/object=1, objects/second=1000
  Hibernate: sessions opened=3, sessions closed=3, connections obtained=3, statements prepared=3, statements closed=3, second level cache puts=0, second level cache hits=0, second level cache misses=0, entities loaded=3, queries executed to database=0, query cache puts=0, query cache hits=0, query cache misses=0
  OpenJPA: N/A

Notice the statistics generated by each JPA implementation are different.
EclipseLink reports a single cache statistic, OpenJPA doesn't even report statistics unless a cache is enabled -- see the previous section on metrics for details on this behavior -- and Hibernate reports two cache-related statistics: second level cache and query cache.

At this juncture, if you look at the test results and statistics for the second and third invocation, something won't add up. How is it that OpenJPA's test results came out faster when caching is disabled by default? And how about Hibernate returning 0's on its cache-related statistics, even when its test results came out faster? The reason for this performance increase is RDBMS caching. On the first query, the RDBMS needs to read data from its own file system (i.e. perform an I/O operation); on subsequent requests the data is present in RDBMS memory (i.e. its cache), making the entire JPA query much faster. A closer look at the Hibernate statistics field 'queries executed to database' can confirm this. Notice that on every second query it shows 2 and on every third query it shows 3, meaning the data was read directly from the database. NOTE: The only exception to this occurs when a query is made on a single entity (i.e. by id); I will address this shortly.

Next, let's start breaking down the caches you will encounter when using JPA applications. The JPA 2.0 standard defines two types of caches: a first level cache and a second level cache. The first level cache, or EntityManager cache, is used to properly handle JPA transactions. A first level cache only exists for the duration of the EntityManager. With the exception of long-lived operations performed against an RDBMS, JPA EntityManagers are short-lived and are created and destroyed per request or per transaction. In this case, given the nature of the queries, first level caches are cleared on every query. A second level cache on the other hand is a broader cache that can be used across transactions and users.
This makes a JPA second level cache more powerful, since it can avoid constantly polling an RDBMS for the same data. But even though the JPA 2.0 standard now addresses second level cache features, this was not the case in JPA 1.0. In the 1.0 version of the JPA standard only a first level cache was addressed, leaving the door completely open on the topic of a second level cache. This created a fragmented approach to caching in JPA implementations; even now, as JPA 2.0 compliant implementations emerge, some non-standard features continue to be part of certain implementations given the value they provide to JPA caching in general. So as I move forward, bear in mind that just like previous JPA topics, each JPA implementation can have its own particular way of dealing with second level caching.

I will start with OpenJPA, which has the fewest proprietary caching options. To enable OpenJPA caching (i.e. second level caching) you need to declare the following two properties in an application's persistence.xml file: ... The first property ensures caching and statistics are activated, while the second property indicates that caching takes place on a single JVM. The following results and statistics were obtained with OpenJPA's second level cache enabled.
Query with OpenJPA caching | Time | Statistics | Time without statistics
All records (2nd time) | 420 ms | read count=34936, hit count=17468, write count=17468, total read count=34936, total hit count=17468, total write count=17468 | 347 ms
All records (3rd time) | 254 ms | read count=52404, hit count=34936, write count=17468, total read count=52404, total hit count=34936, total write count=17468 | 230 ms
First name (2nd time) | 125 ms | read count=944, hit count=472, write count=472, total read count=944, total hit count=472, total write count=472 | 127 ms
First name (3rd time) | 114 ms | read count=1416, hit count=944, write count=472, total read count=1416, total hit count=944, total write count=472 | 132 ms
Last name (2nd time) | 63 ms | read count=292, hit count=146, write count=146, total read count=292, total hit count=146, total write count=146 | 53 ms
Last name (3rd time) | 49 ms | read count=438, hit count=292, write count=146, total read count=438, total hit count=292, total write count=146 | 50 ms
By ID (2nd time) | 5 ms | read count=2, hit count=1, write count=1, total read count=2, total hit count=1, total write count=1 | 1 ms
By ID (3rd time) | 4 ms | read count=3, hit count=2, write count=1, total read count=3, total hit count=2, total write count=1 | 1 ms

As these test results illustrate, executing subsequent JPA queries with OpenJPA's second level cache produces superior results. Another important behavior illustrated in some of these test cases is that by simply disabling statistics -- and still using the second level cache -- query times improve even more.

The OpenJPA statistics also demonstrate how the cache is being used. Notice that on each subsequent query the statistics field 'hit count' grows by the number of records the query returns, which means data is being read from the cache (i.e. a hit). Also notice the statistics field 'write count' remains static, which means data is only written once from the RDBMS to the cache. This is pretty basic functionality for a second level cache.
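The counter behavior described above can be reproduced with a small in-memory model. This is only a sketch of how read, hit and write counts relate to one another, assuming a cache keyed by entity id and a query that touches 146 entities; the classes CountingCache and CountingCacheDemo and their methods are hypothetical and are not OpenJPA's implementation or API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of a shared (second level) cache with usage counters.
class CountingCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private long readCount;
    private long hitCount;
    private long writeCount;

    // Looks up a key; on a miss the loader stands in for the RDBMS read
    // and the loaded value is written to the cache exactly once.
    V get(K key, Function<K, V> loader) {
        readCount++;
        V cached = entries.get(key);
        if (cached != null) {
            hitCount++;
            return cached;
        }
        V loaded = loader.apply(key);
        entries.put(key, loaded);
        writeCount++;
        return loaded;
    }

    long readCount() { return readCount; }
    long hitCount() { return hitCount; }
    long writeCount() { return writeCount; }
}

class CountingCacheDemo {
    // Simulates one execution of a query that returns the ids 1..records.
    static void runQuery(CountingCache<Integer, String> cache, int records) {
        for (int id = 1; id <= records; id++) {
            cache.get(id, key -> "player-" + key);
        }
    }
}
```

Calling CountingCacheDemo.runQuery(cache, 146) three times on the same cache leaves the counters at read count=438, hit count=292 and write count=146, the same relationship shown in the 'Last name (3rd time)' row: every run adds 146 reads, only the first run writes, and every later run is served entirely from the cache.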
On certain occasions a need may arise to interact directly with a cache. These interactions can range from prohibiting an entity from being cached, assigning a particular amount of memory to a cache, forcing an entity to always be cached, flushing all the data contained in a cache, or even plugging in a third party caching solution to provide a more robust strategy, among other things.

The JPA 2.0 standard provides a very basic feature set in terms of second level caching through javax.persistence.Cache. Upon consulting this interface, you'll realize it only provides four methods charged with verifying the presence of entities and evicting them. This feature set not only proves to be limited, but also cumbersome since it can only be leveraged programmatically (i.e. through an API). In this sense, and as I've already mentioned, JPA implementations have provided a series of features ranging from persistence.xml properties to Java annotations related to second level caching.

OpenJPA offers several of these second level caching features, including a separate and supplemental cache called a 'query cache' which can further improve JPA performance. For such cases, I will point you directly to OpenJPA's cache documentation available at http://openjpa.apache.org/builds/apache-openjpa-2.1.0-SNAPSHOT/docs/manual/ref_guide_caching.html#ref_guide_cache_query so you can try these parameters for yourself on the accompanying application source code.

Hibernate, just like OpenJPA, has its second level cache disabled by default. To enable Hibernate's second level cache you need to add the following properties to an application's persistence.xml file: ... It's worth mentioning that Hibernate has integral support for other second level caches.
The previous properties showed how to enable the HashtableCacheProvider cache -- the simplest of the integral second level caches -- but Hibernate also provides support for five additional caches: EHCache, OSCache, SwarmCache, JBoss cache 1 and JBoss cache 2, all of which provide distinct features, albeit requiring additional configuration. Besides these properties, Hibernate also requires that each JPA entity be declared with a caching strategy. In this case, since the Player entity is read only, a caching strategy like the following would be used: ...

Similar to OpenJPA, Hibernate also offers several second level caching features through proprietary annotations and configurations, as well as support for the separate and supplemental cache called a 'query cache' which can further improve JPA performance. For such cases, I will also point you directly to Hibernate's cache documentation available at http://docs.jboss.org/hibernate/core/3.3/reference/en/html/performance.html#performance-cache so you can try these parameters for yourself on the accompanying application source code.

Unlike OpenJPA and Hibernate, EclipseLink's second level cache is enabled by default, therefore there is no need to provide any additional configuration. However, similar to its counterparts, EclipseLink also has a series of proprietary second level cache features which can enhance JPA performance. You can find more information on these features by consulting EclipseLink's cache documentation available at: http://wiki.eclipse.org/Introduction_to_Cache_(ELUG)

With this we bring our discussion on object relational mapping performance with JPA to a close. I hope you found the various tests and metrics presented here a helpful aid in making decisions about your own JPA applications. In addition, don't forget you can rely on the accompanying source code to try out several JPA variations better suited to your circumstances.
About the author

Daniel Rubio is an independent technology consultant specializing in enterprise and web-based software. He blogs regularly on these and other software areas at http://www.webforefront.com. He's also authored and co-authored three books on Java technology.

Source code/Application installation

* Install MySQL on your workstation (Tested on MySQL 5.1.37, 64-bit) - http://dev.mysql.com/downloads/
* Install data set on MySQL - Go to http://www.baseball-databank.org/ and click on the link titled 'Database in MySQL form'. This will download a zipped file with a series of MySQL data structures containing baseball statistics. First create a MySQL database to host the data using the command: 'mysqladmin -p create jpaperformance'. This will create a database named 'jpaperformance'. Next, load the baseball statistics using the following command: 'mysql -p -D jpaperformance < BDB-sql-2009-11-25.sql' where 'BDB-sql-2009-11-25.sql' represents the unzipped SQL script obtained by extracting the zip file you downloaded.
* Create JPA application WARs - The download includes source code, library dependencies and an Ant build file. This includes all three JPA implementations: Hibernate 3.5.3, EclipseLink 2.1 and OpenJPA 2.1.
  To build the JPA Hibernate WAR - ant hibernate
  To build the JPA EclipseLink WAR - ant eclipselink
  To build the JPA OpenJPA WAR - ant openjpa
  All builds are placed under the dist/ directory.
* Deploy to Tomcat 6.0.26
  - Copy the MySQL Java driver and Spring Tomcat Weaver -- included in the download directory 'tomcat_jar_deps' -- to Apache Tomcat's /lib directory.
  - Copy each JPA application WAR to Apache Tomcat's /webapps directory, as needed.
* Deployment URLs
  http://localhost:8080/hibernate/hibernate/home (Query all Player objects)
  http://localhost:8080/eclipselink/eclipselink/home (Query all Player objects)
  http://localhost:8080/openjpa/openjpa/home (Query all Player objects)
  http://localhost:8080/hibernate/hibernate/firstname/ (Query Player objects by first name)
  http://localhost:8080/eclipselink/eclipselink/firstname/ (Query Player objects by first name)
  http://localhost:8080/openjpa/openjpa/firstname/ (Query Player objects by first name)
  http://localhost:8080/hibernate/hibernate/lastname/ (Query Player objects by last name)
  http://localhost:8080/eclipselink/eclipselink/lastname/ (Query Player objects by last name)
  http://localhost:8080/openjpa/openjpa/lastname/ (Query Player objects by last name)
  http://localhost:8080/hibernate/hibernate/playerid/ (Query Player by id)
  http://localhost:8080/eclipselink/eclipselink/playerid/ (Query Player by id)
  http://localhost:8080/openjpa/openjpa/playerid/ (Query Player by id)

July 20, 2010

by Daniel Rubio

· 153,854 Views · 2 Likes

Reduce Fractions Function Python

A function that reduces/simplifies fractions using the Euclidean algorithm, in Python.

    def reducefract(n, d):
        '''Reduces fractions. n is the numerator and d the denominator.'''
        def gcd(n, d):
            # Euclidean algorithm: replace (n, d) with (d, n % d) until d is 0
            while d != 0:
                t = d
                d = n % d
                n = t
            return n
        assert d != 0, "integer division by zero"
        assert isinstance(d, int), "must be int"
        assert isinstance(n, int), "must be int"
        greatest = gcd(n, d)
        # Floor division keeps the results as ints (plain / yields floats in Python 3)
        n //= greatest
        d //= greatest
        return n, d

July 19, 2010

by Snippets Manager

· 16,469 Views

Practical PHP Patterns: Front Controller

We have seen that Page Controllers (or, with their alternate intent and name, Action Controllers) are the basic units that respond to HTTP requests. But since we have many Page Controllers, how can we distinguish between them and make each one answer a different type of HTTP request? The simplest solution is to associate every controller with a different end point: we do this naturally when we use PHP scripts. Every script is called when its path is the subject of the HTTP request, and the GET and POST parameters plus the various headers are passed to it as predefined variables.

In more complex use cases, however, it can be useful to decouple the routes (types of URL requested by the clients) from the Page Controllers. In this case, a layer that analyzes some parameters in the request has to be interposed. Moreover, this layer can perform every operation that should be concentrated in a single point before the action-specific code is executed. Verifying authentication and authorization to access a given Page Controller is an example of a generic, centralized operation. Initialization of common resources like database connections or caches is another.

You may have noticed that, a while ago, PHP applications started to transition from a set of PHP scripts (which WordPress still uses nowadays) to a single entry point (such as an index.php file, or an invisible index.php file which is called via URL rewriting, making paths like /category/4815162342 possible). This central PHP script, or the class that is instantiated and run in it, is called the Front Controller. The Front Controller handles every HTTP request which is routed to it via configuration of the web server, and decides to which entity the execution should be delegated. It can even be responsible for creating an abstraction over the plain PHP superglobal variables, like a Request object to contain parameters or a Response one to populate.
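The dispatching role described above can be sketched in a few lines. The example is written in Java rather than PHP, and every name in it (FrontController, PageController, register, dispatch, the "id" parameter and the "index" default route) is hypothetical; it is a minimal illustration of a single entry point that maps a request path to a controller, not the code of any particular framework.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// A page (action) controller turns request parameters into a response body.
interface PageController {
    String handle(Map<String, String> params);
}

// Single entry point: picks the controller from the first path segment.
class FrontController {
    // Suppliers let a controller be created only when a request needs it.
    private final Map<String, Supplier<PageController>> routes = new HashMap<>();

    void register(String name, Supplier<PageController> factory) {
        routes.put(name, factory);
    }

    String dispatch(String path, Map<String, String> params) {
        // Centralized work (authentication, shared resources) would go here.
        String[] segments = path.replaceAll("^/+", "").split("/");
        String name = segments[0].isEmpty() ? "index" : segments[0];
        Supplier<PageController> factory = routes.get(name);
        if (factory == null) {
            return "404 Not Found";
        }
        // Assumption for this sketch: a second path segment is exposed as "id".
        Map<String, String> merged = new HashMap<>(params);
        if (segments.length > 1) {
            merged.put("id", segments[1]);
        }
        return factory.get().handle(merged);
    }
}
```

Registering a controller under "category" and dispatching the path /category/4815162342 hands that controller the parameter id=4815162342, while a path with no registered controller yields the "404 Not Found" body. Keeping the lookup in one place is what allows cross-cutting steps to run before any action-specific code.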
Input

Which parameters contained in an HTTP request does a Front Controller analyze? Various implicit and explicit input data can influence the behavior of the object.

- The path of the virtual end point is usually the first parameter considered; for example, the default route mechanism in the Zend Framework uses the schemes /controller/action and /module/controller/action.
- GET and POST parameters can be passed to the Page/Action Controllers, but can also be used to route a request differently or to select a particular view. For example, the Zend Framework's Front Controller switches the format of the response based on the format GET parameter (?format=xml).
- HTTP headers can be analyzed, particularly when the Front Controller is managing a REST-like web service.
- Custom HTTP headers (X-...) can also be extracted from the request. A Zend Framework helper hooked into the Front Controller by default turns off the layout component when the X-Requested-With header, which Ajax libraries add to XMLHttpRequest objects, is present. This way the main segment of a page can be returned for insertion in a page via Ajax without further configuration.

In general, HTTP is a rich protocol and its headers may convey useful information when a convention is adopted, within the limits of the browser's capabilities.

Advantages and disadvantages (in PHP applications)

In PHP the Front Controller object is recreated at every HTTP request, along with every collaborator object it may use. This means there are some peculiarities to the use of this pattern in PHP applications. The main feature of a Front Controller as implemented by PHP frameworks is the lazy loading of helpers and controllers, which are created only if used. The Front Controller chooses the Action Controller to run, and then instantiates only that class for the current request (while in other approaches all the controllers may live on between requests).
An advantage of this practice is that the object-oriented infrastructure weighs lightly on the performance of the application. A disadvantage is the complexity of the Front Controller, which may have to manage the Dependency Injection of the various controllers. In general, in the first generation of PHP frameworks the Action Controllers are created with an empty constructor to ease the work of the Front Controller. The second generation (Zend Framework 2 and Symfony 2) promises to support Dependency Injection.

An issue with the Front Controller pattern in PHP is its overhead, which, as we have said, is incurred on every request. For this reason, an application may provide multiple end points (multiple Facade .php files), to which the client's browser directs frequent, more specific requests. For example, populating form selects via Ajax can skip the Front Controller and refer to a public/ajax.php file. With this trade-off we lose the single entry point, but we bypass the overhead for a large set of requests with very little glue code.

Besides these issues, the Front Controller pattern is the natural evolution of a web application. There are strong pros to using this pattern as the size of the application increases:

- The Front Controller factors common code out of Page Controllers, such as parsing the request and creating a Response object.
- It provides a single entry point for managing every request, so you won't have to change a thousand files if you rename the header.php file they all include.
- It may accept plugins or a decoration process to provide new functionality to the whole application, since it intercepts every single request.

The main disadvantage is the added complexity, but the break-even point is not far away, especially when someone else (a framework) provides you with a functional Front Controller.

Examples

The code sample presented here is a simplified version of the Front Controller of Zend Framework 1.x.
The real Front Controller has many object collaborators, like a configurable Router and a Dispatcher. Unfortunately, this particular Front Controller is implemented as a Singleton, which must be reset in testing environments instead of being recreated. The API it provides, Zend_Controller_Front::getInstance(), is a large violation of the Law of Demeter. This implementation is very versatile, however: you can inject plugins which are executed with parameters like the request object at a certain time (before or after an action is chosen or run), or helpers to provide as collaborators to all the Action Controllers.

<?php
class Zend_Controller_Front
{
    // ... (property declarations omitted)

    protected function __construct()
    {
        $this->_plugins = new Zend_Controller_Plugin_Broker();
    }

    /**
     * Singleton instance
     *
     * @return Zend_Controller_Front
     */
    public static function getInstance()
    {
        if (null === self::$_instance) {
            self::$_instance = new self();
        }

        return self::$_instance;
    }

    /**
     * Resets all object properties of the singleton instance
     *
     * Primarily used for testing; could be used to chain front controllers.
     *
     * Also resets action helper broker, clearing all registered helpers.
     *
     * @return void
     */
    public function resetInstance()
    {
        $reflection = new ReflectionObject($this);
        foreach ($reflection->getProperties() as $property) {
            $name = $property->getName();
            switch ($name) {
                case '_instance':
                    break;
                case '_controllerDir':
                case '_invokeParams':
                    $this->{$name} = array();
                    break;
                case '_plugins':
                    $this->{$name} = new Zend_Controller_Plugin_Broker();
                    break;
                case '_throwExceptions':
                case '_returnResponse':
                    $this->{$name} = false;
                    break;
                case '_moduleControllerDirectoryName':
                    $this->{$name} = 'controllers';
                    break;
                default:
                    $this->{$name} = null;
                    break;
            }
        }
        Zend_Controller_Action_HelperBroker::resetHelpers();
    }

    /**
     * Convenience feature, calls setControllerDirectory()->setRouter()->dispatch()
     *
     * In PHP 5.1.x, a call to a static method never populates $this -- so run()
     * may actually be called after setting up your front controller.
     *
     * @param string|array $controllerDirectory Path to Zend_Controller_Action
     * controller classes or array of such paths
     * @return void
     * @throws Zend_Controller_Exception if called from an object instance
     */
    public static function run($controllerDirectory)
    {
        self::getInstance()
            ->setControllerDirectory($controllerDirectory)
            ->dispatch();
    }

    /**
     * Add a controller directory to the controller directory stack
     *
     * If $args is presented and is a string, uses it for the array key mapping
     * to the directory specified.
     *
     * @param string $directory
     * @param string $module Optional argument; module with which to associate directory. If none provided, assumes 'default'
     * @return Zend_Controller_Front
     * @throws Zend_Controller_Exception if directory not found or readable
     */
    public function addControllerDirectory($directory, $module = null)
    {
        $this->getDispatcher()->addControllerDirectory($directory, $module);
        return $this;
    }

    /**
     * Set request class/object
     *
     * Set the request object. The request holds the request environment.
     *
     * If a class name is provided, it will instantiate it
     *
     * @param string|Zend_Controller_Request_Abstract $request
     * @throws Zend_Controller_Exception if invalid request class
     * @return Zend_Controller_Front
     */
    public function setRequest($request)
    {
        if (is_string($request)) {
            if (!class_exists($request)) {
                require_once 'Zend/Loader.php';
                Zend_Loader::loadClass($request);
            }
            $request = new $request();
        }
        if (!$request instanceof Zend_Controller_Request_Abstract) {
            require_once 'Zend/Controller/Exception.php';
            throw new Zend_Controller_Exception('Invalid request class');
        }

        $this->_request = $request;

        return $this;
    }

    /**
     * Set response class/object
     *
     * Set the response object. The response is a container for action
     * responses and headers. Usage is optional.
     *
     * If a class name is provided, instantiates a response object.
     *
     * @param string|Zend_Controller_Response_Abstract $response
     * @throws Zend_Controller_Exception if invalid response class
     * @return Zend_Controller_Front
     */
    public function setResponse($response)
    {
        if (is_string($response)) {
            if (!class_exists($response)) {
                require_once 'Zend/Loader.php';
                Zend_Loader::loadClass($response);
            }
            $response = new $response();
        }
        if (!$response instanceof Zend_Controller_Response_Abstract) {
            require_once 'Zend/Controller/Exception.php';
            throw new Zend_Controller_Exception('Invalid response class');
        }

        $this->_response = $response;

        return $this;
    }

    /**
     * Dispatch an HTTP request to a controller/action.
     *
     * @param Zend_Controller_Request_Abstract|null $request
     * @param Zend_Controller_Response_Abstract|null $response
     * @return void|Zend_Controller_Response_Abstract Returns response object if returnResponse() is true
     */
    public function dispatch(Zend_Controller_Request_Abstract $request = null, Zend_Controller_Response_Abstract $response = null)
    {
        if (!$this->getParam('noErrorHandler') && !$this->_plugins->hasPlugin('Zend_Controller_Plugin_ErrorHandler')) {
            // Register with stack index of 100
            require_once 'Zend/Controller/Plugin/ErrorHandler.php';
            $this->_plugins->registerPlugin(new Zend_Controller_Plugin_ErrorHandler(), 100);
        }

        if (!$this->getParam('noViewRenderer') && !Zend_Controller_Action_HelperBroker::hasHelper('viewRenderer')) {
            require_once 'Zend/Controller/Action/Helper/ViewRenderer.php';
            Zend_Controller_Action_HelperBroker::getStack()->offsetSet(-80, new Zend_Controller_Action_Helper_ViewRenderer());
        }

        /**
         * Instantiate default request object (HTTP version) if none provided
         */
        if (null !== $request) {
            $this->setRequest($request);
        } elseif ((null === $request) && (null === ($request = $this->getRequest()))) {
            require_once 'Zend/Controller/Request/Http.php';
            $request = new Zend_Controller_Request_Http();
            $this->setRequest($request);
        }

        /**
         * Set base URL of request object, if available
         */
        if (is_callable(array($this->_request, 'setBaseUrl'))) {
            if (null !== $this->_baseUrl) {
                $this->_request->setBaseUrl($this->_baseUrl);
            }
        }

        /**
         * Instantiate default response object (HTTP version) if none provided
         */
        if (null !== $response) {
            $this->setResponse($response);
        } elseif ((null === $this->_response) && (null === ($this->_response = $this->getResponse()))) {
            require_once 'Zend/Controller/Response/Http.php';
            $response = new Zend_Controller_Response_Http();
            $this->setResponse($response);
        }

        /**
         * Register request and response objects with plugin broker
         */
        $this->_plugins
             ->setRequest($this->_request)
             ->setResponse($this->_response);

        /**
         * Initialize router
         */
        $router = $this->getRouter();
        $router->setParams($this->getParams());

        /**
         * Initialize dispatcher
         */
        $dispatcher = $this->getDispatcher();
        $dispatcher->setParams($this->getParams())
                   ->setResponse($this->_response);

        // Begin dispatch
        try {
            /**
             * Route request to controller/action, if a router is provided
             */

            /**
             * Notify plugins of router startup
             */
            $this->_plugins->routeStartup($this->_request);

            try {
                $router->route($this->_request);
            } catch (Exception $e) {
                if ($this->throwExceptions()) {
                    throw $e;
                }

                $this->_response->setException($e);
            }

            /**
             * Notify plugins of router completion
             */
            $this->_plugins->routeShutdown($this->_request);

            /**
             * Notify plugins of dispatch loop startup
             */
            $this->_plugins->dispatchLoopStartup($this->_request);

            /**
             * Attempt to dispatch the controller/action. If the $this->_request
             * indicates that it needs to be dispatched, move to the next
             * action in the request.
             */
            do {
                $this->_request->setDispatched(true);

                /**
                 * Notify plugins of dispatch startup
                 */
                $this->_plugins->preDispatch($this->_request);

                /**
                 * Skip requested action if preDispatch() has reset it
                 */
                if (!$this->_request->isDispatched()) {
                    continue;
                }

                /**
                 * Dispatch request
                 */
                try {
                    $dispatcher->dispatch($this->_request, $this->_response);
                } catch (Exception $e) {
                    if ($this->throwExceptions()) {
                        throw $e;
                    }
                    $this->_response->setException($e);
                }

                /**
                 * Notify plugins of dispatch completion
                 */
                $this->_plugins->postDispatch($this->_request);
            } while (!$this->_request->isDispatched());
        } catch (Exception $e) {
            if ($this->throwExceptions()) {
                throw $e;
            }

            $this->_response->setException($e);
        }

        /**
         * Notify plugins of dispatch loop completion
         */
        try {
            $this->_plugins->dispatchLoopShutdown();
        } catch (Exception $e) {
            if ($this->throwExceptions()) {
                throw $e;
            }

            $this->_response->setException($e);
        }

        if ($this->returnResponse()) {
            return $this->_response;
        }

        $this->_response->sendResponse();
    }
}

July 19, 2010

by Giorgio Sironi

· 10,624 Views

Navigate and Fix Errors and Warnings in a Class With Eclipse Keyboard Shortcuts

I really don’t like it when Eclipse shows error/warning annotations in a class. It’s sometimes nice to jump from one to the next and clean out a class one line at a time, but most of the time they’re just distractions, so I want to be able to find and fix them fast.

So there must be a better way to jump between the errors/warnings than to use the mouse or page down to the next error. These methods are not only slow but often frustrating because you tend to miss the annotation, especially if it’s a big class. And navigating to the Problems view using the keyboard is OK, but sometimes overkill for just clearing out errors/warnings in one class.

A good thing then that Eclipse offers keyboard shortcuts that take you to the next/previous annotation in the class. And it does so in a way that selects the annotation immediately, allowing you to use Quick Fix (Ctrl+1) to fix it fast. So here’s how to use these shortcuts to navigate between the error/warning annotations and fix some of the errors easily.

How to jump between errors and warnings

Below are the keys you can use to go to the next/previous annotation and to initiate Quick Fix.

Shortcut   Description
Ctrl+.     Next Annotation. Moves to the next warning/error in the class.
Ctrl+,     Previous Annotation. Moves to the previous warning/error in the class.
Ctrl+1     Quick Fix. A fast way to resolve certain warnings/errors automatically, but also useful for automating common editing tasks.

To see how fast the Next Annotation/Quick Fix combo works, I’ve set up an example in a video that shows a class with multiple errors/warnings and how using the Next/Previous Annotation and Quick Fix combo can make you work faster. There’s 1 warning (an unused variable, message, that should die) and 2 errors (not wrapping a FileOutputStream call in a try-catch and not initialising a local variable, output).

Obviously not all errors are going to be solvable by Quick Fix, but some are: adding a missing cast, filling in types for generics and adding the @SuppressWarnings or @Override annotations. And for the rest, at least you’ll be able to get to them easily.

Bonus tip: Also see how to clean up some of these warnings automatically every time you save.

You can, of course, remap any of these keyboard shortcuts if they’re inconvenient. Have a look at how to manage your keyboard shortcuts, specifically looking out for the commands Next, Previous (yes, for some reason they’re just called Next and Previous) and Quick Fix.

Only jumping between errors

You might sometimes want to tell Eclipse to only move between errors (and not warnings) when you press the Next/Previous Annotation command. Well, this is controlled by a Next/Previous Annotation dropdown menu in the toolbar. As you press the Next/Previous Annotation key (either the keyboard shortcut or toolbar button), Eclipse will move to whatever annotations are checked in the dropdown.

Eclipse’s default only has Errors and Warnings selected (which is a reasonably good default, for once). You could disable Warnings if you only want to move between errors, or vice versa.

Bonus tip: The other selection you might want to enable is Occurrences. An occurrence is when you stand on a variable/method and Eclipse highlights it and its declaration and usages within the class. If you select Occurrences, pressing the Next/Previous Annotation keys will also jump to the variable’s/method’s declaration and usages. This is nice when you want to quickly move to the variable’s usage in a long class. It’s optional if you want this on, so play around and see if it works for you.

From http://eclipseone.wordpress.com

July 14, 2010

by Byron M

· 12,192 Views

Practical PHP Patterns: Page Controller

This is the first article from the Web Presentation Patterns part of this series. We are going to skip the part on web-based Model-View-Controller, since we have talked about it enough both for PHP's implementation and for other languages. Page Controller is a subpattern of Model-View-Controller, where every logical page has a corresponding Page Controller that produces a response dynamically. Usually there are two collaborators that the Controller calls methods on: a Model and a View. The Model's goal is to execute business logic and provide data, while the View's goal is to display that data, possibly in different formats, and to give the user ways of interacting with the application (such as forms and links).

History

Back when the web was starting out as a platform for consulting remote documents (the 1990s), a path always corresponded to a static HTML document, which was sent as-is to the client that requested it. With the introduction of dynamic pages, a path no longer translated to a single response, but to an object capable of producing a response that can change every time based on parameters in the request, cookies or session variables. This dynamic generation of web pages is the basis of the Web as we know it today.

Implementation

Technically speaking, a Page Controller is an object (in the broader sense of the term: not only an instance of a class but any kind of item) kept on the server, which is called when a certain end point is the target of an HTTP request. For example, plain old PHP scripts, which you call with URLs like /index.php, /list.php or /folder/member.php?id=42, are Page Controllers. But what is happening when you see URLs like /node/25645? Obviously there is no 25645 file stored in the web server's filesystem.
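One common answer, assuming the web server rewrites such paths to a single PHP script, is that the script itself splits the requested path into a controller name and its arguments. The parsePath() helper below is a hypothetical sketch of that step, not code from any framework.

```php
<?php
// Hypothetical helper: turns a rewritten request URI such as /node/25645
// into a controller name plus positional arguments.
function parsePath($requestUri)
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = '/'; // fall back to the root for unparsable URIs
    }

    // Drop the empty segments produced by leading or doubled slashes.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));

    return array(
        'controller' => isset($segments[0]) ? $segments[0] : 'index',
        'arguments'  => array_slice($segments, 1),
    );
}

$route = parsePath('/node/25645?preview=1');
echo $route['controller'], ' ', $route['arguments'][0], "\n"; // prints "node 25645"
```

The resulting controller name can then be mapped to a script or a class, which is what the frameworks discussed next do with considerably more configuration.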
In more modern PHP frameworks, like Zend Framework or Symfony, the Page Controllers are not files but classes, a choice which simplifies the management of scope and the passing of parameters to them (and also makes it possible to define multiple action methods). In this environment, Page Controllers are called Action Controllers or simply Actions, but don't be fooled by the name, since we have seen that even a simple PHP script is a Page Controller.

The reason for the different name can be traced back to the transition from page-oriented websites to full-featured web applications: a controller does not simply display a page anymore, but performs some actions, at the end of which it can redirect the client or forward the execution to a colleague. The responses produced by controllers are not only HTML pages anymore, since they can be produced as HTML fragments, XML or JSON results.

Similarly, the plain old Page Controllers based on a PHP file have greater responsibilities than modern Action Controllers. Page Controllers have to deal with the basic infrastructure of the programming language, like $_GET and $_POST, while Action Controllers are passed a higher-level abstraction like a Zend_Controller_Request_Http object. The insulation layer between the client and Action Controllers will be the subject of another article. In general, this layer is not needed if the translation between URLs and Page Controllers can be performed natively by the web server.

Collaborators

The two collaborators of the Page Controllers have various implementations, which will be treated in detail in the upcoming articles of this series. The Model part is the most application-specific part of the code, and contains as usual the business logic related to the domain.
Ideally it should be testable in isolation, while the Page Controller's job is to translate HTTP requests into method calls on the Model's objects and to add some indirection for checking the client's authentication and authorization. After obtaining the necessary data from the Model, the Controller uses a View component to generate the response. In PHP scripts, this separation is almost non-existent, as the script executes its logic and then starts printing. Action Controllers, which are modelled as classes, actually forward, automatically or not, to a View, which in turn may be modelled as a PHP script (Zend Framework's case) or as an object.

Examples

There are two simple examples we can make about Page Controllers. The first one is a plain old PHP script, a solution widely employed before the advent of web frameworks. The mapping of the request to the file is performed by the web server, which takes the standard output of the script, along with the HTTP headers it produces, and sends it back to the client.

<?php
echo '<p>Hello, ', $_GET['name'], '</p>';

Very simple, and it's why PHP is so widespread and easy to use. But it does not scale.

A more sophisticated example is an Action Controller of Zend Framework. Here the framework, based on configuration, figures out which controller is responsible for a request, and produces Zend_Controller_Request_Http and Zend_Controller_Response_Http objects as an abstraction over standard input and standard output. Part of this abstraction already existed in PHP ($_GET, $_POST) as an advantage over CGI scripts; part of it (matching URLs against predefined formats or regular expressions) is implemented in userland PHP by the framework itself.

class FooController extends Zend_Controller_Action
{
    public function barAction()
    {
        // assign a variable to the view, with the value coming from somewhere
        $this->view->variable1 = ...
        $this->render('viewName'); // executes views/scripts/viewName.php
    }

    public function bazAction()
    {
        // other action related to the former
    }
}

For a full discussion of Zend Framework's implementation of this pattern, refer to the manual.

July 14, 2010

by Giorgio Sironi

· 10,437 Views

Practical PHP Patterns: Query Object

An ORM provides an abstraction of storage as an in-memory object graph, but it is difficult to navigate that graph via object pointers without loading a large part of it. Typical problems of this approach are the performance issues related to loading the various objects, and the transfer of business logic execution from the database side to the client code side, with the resulting duplication.

In any case, when we start navigating an object graph we have to obtain a reference to an entity somehow (an Aggregate Root), from which we can navigate to the other ones. ORMs and, in general, Data Mappers provide different ways to select a subset of objects (or a single one) and reconstitute only that subset from the data storage.

- Custom mapper classes with domain-specific methods are the simplest solution, which is often recommended when not using a generic Data Mapper.
- Custom mapper classes with finder methods are a half-baked solution, which mixes domain-specific mappers with general purpose methods, sometimes needed to allow flexibility on the user side.
- Generic mapper classes with finder methods can be provided as a way to parametrize fields, resulting in methods like findBy($entityName, $field, $value).
- Generic mapper classes with query objects are employed when queries need to be composed and passed around for further elaboration or refinement. Promoting the query to an object helps this use case.

Note that once a mapper implements query objects, they can be effectively used in finder methods, which offer a subset of the functionality provided by query objects.

In fact, query objects are the most versatile way to ask for the objects that satisfy certain conditions, and they are an Interpreter implementation over a query language adapted to an object model. All of us already know a query language: SQL. But SQL pertains to relational databases, while an ORM strives to keep the illusion of an object-only model alive.
As a result, it must adopt a different language which describes object features, like HQL (Hibernate) or DQL (the Doctrine equivalent).

Object query languages

There are several differences between an object query language and SQL in the entities you can refer to within queries:

- SQL refers to tables; object query languages refer to classes, and some tables, like the association tables for many-to-many relationships, simply vanish.
- SQL refers to rows; object query languages to objects.
- SQL refers to other tables for making JOINs; object query languages to object collaborators.
- SQL refers to columns, which also include foreign keys; object query languages only to fields of the objects.

When a full-featured language is involved, there must be a component of the ORM that parses the strings containing language statements into a Query Object. Another way to define such an object (an Interpreter) is to construct it by hand, by calling a series of setter methods or by implementing a Builder pattern.

Advantages

- A Query Object hides the relational model (the schema) from the user, as it can be inferred from the union of the queries and the Metadata Mapping anyway. The information contained in the metadata, like foreign keys and additional tables, does not have to be repeated in the various components of client code.
- It also hides the peculiarities of the particular database vendor, since the generation of SQL can be delegated to a driver.
- It promotes queries to first-class citizens, making them objects that can be passed around, cloned or modified. Database abstraction layers like PDO make statement objects (PDOStatement) one of their primary modelling concepts.

Disadvantages

The implementation of the parser for a query language is a task of great complexity, which makes this pattern feasible only in generic Data Mappers.
Even when using only Query Objects made by hand, it is advisable to employ an external Data Mapper to take advantage of the translation of object-based queries to SQL.

Examples

Doctrine 2 contains a parser for its Doctrine Query Language, which lets you define queries like you would do with PDO, but still referring to an object model. The documentation of the query language itself is pretty complete, so I won't go into details, but I'll give you a feel for what using DQL is like. The language itself is compatible with the Doctrine 1 version, if you happen to have used it.

<?php
$query = $em->createQuery('SELECT u FROM MyProject\Model\User u WHERE u.age > 20');
$users = $query->getResult();

$query = $em->createQuery("SELECT u, a FROM User u JOIN u.address a WHERE a.city = 'Berlin'");
$users = $query->getResult();

$query = $em->createQuery('SELECT u, p FROM CmsUser u JOIN u.phonenumbers p');
$users = $query->getResult(); // array of CmsUser objects with the phonenumbers association loaded
$phonenumbers = $users[0]->getPhonenumbers();

$query = $em->createQuery('SELECT u, a, p, c FROM CmsUser u JOIN u.articles a JOIN u.phonenumbers p JOIN a.comments c');
$users = $query->getResult();

Sometimes there are no fixed queries, but a dynamic query has to be constructed from its various parts, as a union of conditions, joins and sorting parameters; not all the parameters may be available at a certain time, and concatenating strings to compose a DQL statement is error-prone. Doctrine 2 includes a Query Builder whose methods you can call orthogonally, in any order and combination.

<?php
// $qb is an instance of Doctrine's QueryBuilder
$qb->add('select', 'u')
   ->add('from', 'User u')
   ->add('where', 'u.id = :identifier')
   ->add('orderBy', 'u.name ASC')
   ->setParameter('identifier', 100); // Sets :identifier to 100, and thus we will fetch a user with u.id = 100
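Separately from Doctrine, the hand-made variant mentioned earlier (a Query Object built through method calls and translated to SQL with the help of mapping metadata) can be sketched as follows. This is only an illustration: the QueryObject class, its methods and the layout of the $metadata array are all assumptions made for this example, not part of Doctrine or of any other library.

```php
<?php
// Hypothetical hand-built Query Object; names and metadata layout are
// illustrative assumptions, not Doctrine's API.
final class QueryObject
{
    private static $allowedOperators = array('=', '<>', '<', '<=', '>', '>=');

    private $entity;
    private $criteria = array();
    private $orderBy  = null;

    public function __construct($entity)
    {
        $this->entity = $entity;
    }

    public function where($field, $operator, $value)
    {
        if (!in_array($operator, self::$allowedOperators, true)) {
            throw new InvalidArgumentException("Unsupported operator: $operator");
        }
        $this->criteria[] = array($field, $operator, $value);
        return $this;
    }

    public function orderBy($field)
    {
        $this->orderBy = $field;
        return $this;
    }

    // Translates object fields to columns via the metadata and returns
    // the SQL string together with the values for its placeholders.
    public function toSql(array $metadata)
    {
        $mapping = $metadata[$this->entity];
        $sql     = 'SELECT * FROM ' . $mapping['table'];
        $clauses = array();
        $params  = array();

        foreach ($this->criteria as $criterion) {
            list($field, $operator, $value) = $criterion;
            $clauses[] = $mapping['columns'][$field] . ' ' . $operator . ' ?';
            $params[]  = $value;
        }
        if ($clauses) {
            $sql .= ' WHERE ' . implode(' AND ', $clauses);
        }
        if ($this->orderBy !== null) {
            $sql .= ' ORDER BY ' . $mapping['columns'][$this->orderBy];
        }

        return array($sql, $params);
    }
}

$metadata = array(
    'User' => array(
        'table'   => 'users',
        'columns' => array('age' => 'user_age', 'name' => 'user_name'),
    ),
);

$adults = new QueryObject('User');
$adults->where('age', '>', 20);

// The query is a first-class object: clone it and refine the copy.
$sorted = clone $adults;
$sorted->orderBy('name');

list($sql, $params) = $sorted->toSql($metadata);
echo $sql, "\n";
```

The client code mentions only the User class and its fields; table and column names live in the metadata, and the returned SQL and parameter list could be handed to a PDO prepared statement.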

July 7, 2010

by Giorgio Sironi

· 6,502 Views

Refactoring into Scala Type Classes

A couple of weeks back I wrote about type class implementation in Scala using implicits. Type classes allow you to model orthogonal concerns of an abstraction without hardwiring them within the abstraction itself. This takes the bloat away from the core abstraction implementation into separate independent class structures. Very recently I refactored Akka actor serialization and gained some real insights into the benefits of using type classes. This post is a field report of the same.

Inheritance and traits looked good ..

.. but only initially. Jonas Bonér and I had some cool discussions on serializable actors where the design we came up with looked as follows ..

trait SerializableActor extends Actor
trait StatelessSerializableActor extends SerializableActor

trait StatefulSerializerSerializableActor extends SerializableActor {
  val serializer: Serializer
  //..
}

trait StatefulWrappedSerializableActor extends SerializableActor {
  def toBinary: Array[Byte]
  def fromBinary(bytes: Array[Byte])
}

// .. and so on

All these traits make the concerns of serializability just too coupled with the core Actor implementation. And with various forms of serializable actors, clearly we were running out of class names. One of the wisdoms that the GoF Patterns book taught us was that when you struggle naming your classes using inheritance, you're definitely doing it wrong! Look out for other ways that separate the concerns more meaningfully.

With Type Classes ..

We took the serialization stuff out of the core Actor abstraction into a separate type class.

/**
 * Type class definition for Actor Serialization
 */
trait FromBinary[T <: Actor] {
  def fromBinary(bytes: Array[Byte], act: T): T
}

trait ToBinary[T <: Actor] {
  def toBinary(t: T): Array[Byte]
}

// client needs to implement Format[] for the respective actor
trait Format[T <: Actor] extends FromBinary[T] with ToBinary[T]

We define 2 type classes, FromBinary[T <: Actor] and ToBinary[T <: Actor], that the client needs to implement in order to make actors serializable. And we package them together as yet another trait, Format[T <: Actor], that combines both of them.

Next we define a separate module that publishes APIs to serialize actors that use these type class implementations ..

/**
 * Module for actor serialization
 */
object ActorSerialization {

  def fromBinary[T <: Actor](bytes: Array[Byte])
    (implicit format: Format[T]): ActorRef = //..

  def toBinary[T <: Actor](a: ActorRef)
    (implicit format: Format[T]): Array[Byte] = //..

  //.. implementation
}

Note that these type classes are passed as implicit arguments that the Scala compiler will pick up from the surrounding lexical scope. Here's a sample test case which implements the above strategy ..

A sample actor with encapsulated state. Note that we no longer have any incidental complexity of my actor having to inherit from any specialized Actor class ..

class MyActor extends Actor {
  var count = 0

  def receive = {
    case "hello" =>
      count = count + 1
      self.reply("world " + count)
  }
}

and the client implements the type class for protocol buffer based serialization and packages it as a Scala module ..

object BinaryFormatMyActor {
  implicit object MyActorFormat extends Format[MyActor] {
    def fromBinary(bytes: Array[Byte], act: MyActor) = {
      val p = Serializer.Protobuf
                .fromBinary(bytes, Some(classOf[ProtobufProtocol.Counter]))
                .asInstanceOf[ProtobufProtocol.Counter]
      act.count = p.getCount
      act
    }

    def toBinary(ac: MyActor) =
      ProtobufProtocol.Counter.newBuilder.setCount(ac.count).build.toByteArray
  }
}

We have a test snippet that uses the above type class implementation ..

import ActorSerialization._
import BinaryFormatMyActor._

val actor1 = actorOf[MyActor].start
(actor1 !! "hello").getOrElse("_") should equal("world 1")
(actor1 !! "hello").getOrElse("_") should equal("world 2")

val bytes = toBinary(actor1)
val actor2 = fromBinary(bytes)
actor2.start
(actor2 !! "hello").getOrElse("_") should equal("world 3")

Note that the state is correctly serialized by toBinary and then subsequently de-serialized to get the updated value of the Actor state.

This refactoring has made the core actor implementation much cleaner, moving the concerns of serialization away to a separate abstraction. The client code also becomes cleaner in the sense that the client actor definition does not include details of how the actor state is being serialized. Scala's power of implicit arguments and executable modules made this type class based implementation possible.

From http://debasishg.blogspot.com/2010/07/refactoring-into-scala-type-classes.html

July 7, 2010

by Debasish Ghosh

· 7,924 Views

Practical PHP Patterns: Metadata Mapping

The intent of the Metadata Mapping pattern is to express implementation details, related to a particular domain and Domain Model, as metadata for a general purpose library. In the sense intended here, metadata is related to the persistence operations (transferring objects back and forth from a database). This metadata is usually fed to a general purpose object-relational mapper. Technically the term metadata is plural (of metadatum, data about data), but it is commonly used as an uncountable noun.

Why express metadata

Object-relational mapping is a difficult task to automate, prone to lots of potential bugs and undefined behaviors. Expressing the domain-related peculiarities as metadata means that you only have to code one ORM, instead of repeating the same work in many custom Data Mappers, which are very boring to write and can't be transported out of a specific application.

Custom Data Mappers were a cleaner solution for Domain Models than Active Records, and they are advocated for example in Zend Framework books like Keith Pope's. They are finally becoming obsolete thanks to the power of a declarative approach like this pattern, which tools like Doctrine 2 are based on. Historically, Hibernate from JBoss (a Java product) was among the first widely used Data Mappers implemented as a generic ORM. Doctrine 2 is the most famous PHP implementation, and it is in beta at the time of this writing.

The metadata we'd like to give to an ORM include, for example:

- Which classes should be persisted at all.
- Optional names for the tables (it can use the class names).
- Which fields form the primary key.
- The types of the different columns, particularly important in a loosely typed language like PHP.
- Which collaborators have to be persisted and by what means: foreign keys and additional association tables.
The metadata should usually not consist of code: it shouldn't contain non-standard behavior, since in general all the behavior, such as inheritance strategies and the conversion of relationships, is extracted into the generic ORM. Thus there are different formats we can use in place of PHP code: XML, annotations, YAML, INI...

Different approaches

There are two approaches to the Metadata Mapping pattern, both described by Fowler in his original book.

The first one is code generation: the metadata is processed to generate the source code of the mapping classes, for example a Data Mapper for every entity or aggregate root of your model (one for User, one for BlogPost, and so on). The ORM would theoretically not be necessary in production if the generation is complete enough. Doctrine 1 used this approach in part, but it also generated the PHP code of the domain model itself from the YAML mapping, as subclasses of Doctrine_Record. Still, Doctrine 1 was necessary to instantiate those classes and the solution wasn't so clean. Doctrine 2 is very different in architecture and goals.

The second approach is called the reflective program. It consists of interpreting the mapping at runtime in the ORM's code, in order to open up the objects correctly via reflection (or a standard interface) and put them in the database. The converse can also happen: objects can be recreated from the union of metadata and database tables.

How it is used

The reflective solution is the common one nowadays, and Doctrine 2 borrows it from Hibernate in its own design. Reflection is used to access the private fields to persist. Some critics point out speed problems with this technique, but keep in mind that your ORM is communicating with an external process or database machine at the same time as it uses reflection, so reflection probably won't count for much in a benchmark.
Doctrine 2, however, takes optimization seriously, to the point that the internal metadata classes (accessed very often) present an API with public properties instead of methods, to avoid any overhead in a crucial part (the hydration of objects with data retrieved from the database).

An advantage of generated code is that it would be easier to debug, but it is usually a pain to maintain: every time you evolve or refactor a domain class you have to regenerate the Mapper classes. You can't customize this code either, because you would lose your changes at regeneration time.

Advantages and (few) disadvantages

Of course we lose some expressiveness by specifying metadata instead of programmatic behavior like the source code of a custom Data Mapper. But we gain very much: a fully tested ORM, like Doctrine 2 in the PHP case, with only some lines of added metadata to keep in sync with the rest of the code base. Declarative approaches trade off completeness of functionality (the absent features are not used very often anyway) for developer time.

There are other advantages too, such as the generation (and migration) of the database schema based on the metadata, and also of the proxy classes.

Ideally, the metadata mapping is the only point of strong coupling between your Domain Model and an external adapter, the ORM. It is of course part of the infrastructure, so keep it under version control along with the code!

Adding and removing fields or relationships, changing keys or refactoring is much easier because you do it declaratively instead of refactoring a specific mapper class. Note that automated refactoring tools are not to be trusted here: for example, they usually ignore the mapping when you change a field name. So grep is your best ally.

Examples

The sample code of this article will present the different ways of specifying metadata for Doctrine 2, the most high-tech PHP ORM.
The performance of the different methods is equivalent, since the metadata are read only once into native PHP objects and then cached. Metadata is a vast subject, since all the different persistence implementations have to be driven by it. Here we will look at the types of metadata specification we can use rather than at all the different metadata instances, which are best described in conjunction with the individual features (for example, the articles on inheritance patterns contain the description of the metadata related to subclassing).

The simplest way to express metadata mapping in Doctrine 2 is via annotations, embedded in the docblocks and ignored by everything but the ORM:

Don't be alarmed by the size: this mapping does much more than the one in the annotations example.

A third way to specify metadata is via YAML, a format widely used in symfony-related software:

    ---
    # Doctrine.Tests.ORM.Mapping.User.dcm.yml
    Doctrine\Tests\ORM\Mapping\User:
      type: entity
      table: cms_users
      id:
        id:
          type: integer
          generator:
            strategy: AUTO
      fields:
        name:
          type: string
          length: 50
      oneToOne:
        address:
          targetEntity: Address
          joinColumn:
            name: address_id
            referencedColumnName: id
      oneToMany:
        phonenumbers:
          targetEntity: Phonenumber
          mappedBy: user
          cascade: cascadePersist
      manyToMany:
        groups:
          targetEntity: Group
          joinTable:
            name: cms_users_groups
            joinColumns:
              user_id:
                referencedColumnName: id
            inverseJoinColumns:
              group_id:
                referencedColumnName: id
      lifecycleCallbacks:
        prePersist: [ doStuffOnPrePersist, doOtherStuffOnPrePersistToo ]
        postPersist: [ doStuffOnPostPersist ]
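The reflective approach described above can be illustrated outside PHP as well. The following is a minimal sketch in Java (the language of Hibernate, which the article cites as the origin of the design). The @Table and @Column annotations, the User class and the ReflectiveMapper helper are all hypothetical names invented for this illustration; they are not Doctrine's or Hibernate's API. The sketch only shows the core idea: the mapper knows nothing about User and reads everything it needs from metadata, reaching into private fields via reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical metadata annotations, defined here only for the sketch.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Table { String name(); }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Column { String name(); }

// A domain class: no persistence code, only declarative metadata.
@Table(name = "cms_users")
class User {
    @Column(name = "id") private final int id;
    @Column(name = "name") private final String name;
    private final String note = "not mapped"; // no metadata, so it is skipped

    User(int id, String name) { this.id = id; this.name = name; }
}

// A generic mapper driven purely by the metadata (hypothetical helper).
class ReflectiveMapper {
    /** Returns the table name declared on the entity's class. */
    static String tableOf(Object entity) {
        Table table = entity.getClass().getAnnotation(Table.class);
        if (table == null) {
            throw new IllegalArgumentException("class carries no mapping metadata");
        }
        return table.name();
    }

    /** Collects column name -> value pairs from all annotated fields, private ones included. */
    static Map<String, Object> extract(Object entity) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Field field : entity.getClass().getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            if (column == null) continue;   // unmapped field
            field.setAccessible(true);      // open up the private field
            try {
                row.put(column.name(), field.get(entity));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return row;
    }
}
```

A real ORM would cache the result of inspecting each class, which is the role the article attributes to Doctrine 2's internal metadata classes, and would go on to build SQL from the extracted row.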

July 5, 2010

by Giorgio Sironi

· 3,974 Views

Are You A Starter, A Finisher Or An Implementer?

There are three parts to every project: starting, finishing and everything in between. Two of those parts are very difficult to complete: starting and finishing. This is not a tutorial on project management so much as a general guide for people involved in a project. For example, lots of people have ideas. Ideas are easy because they carry very little risk. But what happens after the idea? You are supposed to start the project. However, most people stop with the idea because they “don’t have time” or “wouldn’t know where to begin”. Kat French explains how she does her best creative work:

The super-secret, hush-hush, “I could tell you, but then I’d have to kill you” secret of how I do my best creative work. Ready? It’s called “starting.” The recipe is.. there is no recipe. This isn’t science. It’s more like alchemy. There are ingredients. Usually those ingredients have certain effects. When you put them all together and apply heat…”results may vary,”

Starting does not mean that everything will go well or that you will be successful. Starting just means that you took the initiative to start, and that probably puts you ahead of the majority of workers out there. In order for a project to be successful, you have to start at some point. Most people are not good starters; they need some core foundation or baseline to start with. Some people also need the structure of a formal project management methodology or a detailed task list.

The term “self-starter” has been abused by the whole recruitment/HR industry to mean someone who can do their own work without significant prodding. What do you call someone who can take an idea and start a project? Some people may throw the title “entrepreneur” at that person, but it also has other meanings. The key is that this person can start something. Are you that person? One problem is that the starter may not be very good at filling in the various details of the project or finishing the project.
Starters may be excited by the novelty of a project, but once you get mired in details, the novelty has worn off. By the time you are trying to finish the project, the starter is probably bored or even hates his job. Given that we know that no project is ever really done, you might be able to keep the starter happy by having them begin work on the next phase of the project or a significant new feature. At the other end of the project is completion. Starters typically do not fare well as a finisher of a project. As an example, look at the typical software development project. At the beginning of the project, there is a lot of technology research and foundation or framework code that needs to be completed. Starters love that work. At the end of a project, most of the work is in validating and correcting defects, and working with other departments to ensure deployment goes smoothly. A finisher is the person that works well juggling multiple tasks, fixing defects and managing processes to completion. Obviously, this is a very different person than the starter. The finisher loves a detailed task list as it gives them a goal. If they complete all of these tasks, it is likely that the project has reached its conclusion and the application has been deployed. However, you cannot always be really finishing a project, so how do you keep the finisher happy? Similar to the starter, you can have the finisher move from one project or feature to another. They are a nice complement to the starter in terms of the tasks to be completed. Are you a finisher? But how do you have one project look like several? In project management, a large project is broken into phases, which are really just smaller projects. If you do not have a really large project, you can create smaller projects by looking for milestones in your project. Agile methodologies take this concept to the extreme by ensuring that there is a fixed time for each iteration. 
In some cases, an iteration could be long enough to implement one feature. So, each feature in your product could become an iteration or a small project.

So, we have talked about starting and finishing, but what about the stuff in between? Someone needs to fill in the details. I started by calling this person a filler, but that does not sound like a good name for someone. So, I will call this person an implementer. This person takes the basic infrastructure and puts the application features on top of it. They create the web forms and the code to save the data, using the frameworks provided by the starter. Most people fall into this category because it has the broadest spectrum of work. Each web page or feature may look like a new project for them. They may not require a detailed task list, depending upon experience, but they look at the requirements and fill in the details. Are you an implementer?

Because most projects are full of details, the implementer has plenty of work to do. They can be moved from project to project, filling in the gaps that the starter and finisher do not complete. Given that there are so many details in projects, this is where a project manager will spend the bulk of their time: managing the implementers. Implementers will also be the most diverse group of people, so managing them could be a daunting task as well.

Of course, the next question from people would be who is most valuable. For that question, I give you a quote from a Seth Godin post about linchpins:

A newspaper asked me the following, which practically set my hair on fire: What inherent traits would make it easier for someone to becoming a linchpin? Surely not everyone can be a linchpin? Why not?

Each of these types of people is important. What good is a starter if there is nobody there to finish? If you have a finisher, who starts the project in the right direction?
Once the project is started someone needs to fill in the details, and that is not the starter or the finisher. There are some of those rare people that can take a project from start to finish, and there are others that overlap into two of the three groups. But you should be honest with yourself. What are you good at? Starting? Finishing? The stuff in between? From http://regulargeek.com/2010/06/24/are-you-a-starter-a-finisher-or-an-implementer

July 5, 2010

by Robert Diana

· 18,284 Views

How to Automatically Recover Tomcat From Crashes

Tomcat occasionally crashes if you do frequent hot-deploys or if you are running it on a machine with low memory. Every time tomcat crashes someone has to manually restart it, so I wrote a script which automatically detects that tomcat has crashed and restarts it.

Here’s the pseudo logic:

    every few minutes {
        check tomcat status;
        if (status is "not running") {
            start tomcat;
        }
    }

Here’s a shell script to implement the above logic. It assumes that you are running on a unix/linux system and have an /etc/init.d/tomcat* script set up to manage tomcat. Adjust the path to “/etc/init.d/tomcat” in the script below to reflect the correct path on your computer. Sometimes it is called /etc/init.d/tomcat5 or /etc/init.d/tomcat6 depending on your tomcat version. Also make sure that the message “Tomcat Servlet Container is not running.” matches the message that you get when you run the script with the status argument while tomcat is stopped.

    #!/bin/sh
    SERVICE=/etc/init.d/tomcat
    STOPPED_MESSAGE="Tomcat Servlet Container is not running."

    # start tomcat only when the status output matches the "stopped" message
    if [ "`$SERVICE status`" = "$STOPPED_MESSAGE" ]; then
        $SERVICE start
    fi

To run the script every 10 minutes:

1. Save the above script to “/root/bin/recover-tomcat.sh”.

2. Add execute permission:

    chmod +x /root/bin/recover-tomcat.sh

3. Edit root’s crontab by typing the following as root:

    crontab -e

4. Add the following lines to the crontab:

    # monitor tomcat every 10 minutes
    */10 * * * * /root/bin/recover-tomcat.sh

What if I don’t have /etc/init.d/tomcat* script on my computer?

Tomcat can create a pid file (catalina.sh writes one when the CATALINA_PID environment variable is set), typically in the TOMCAT_HOME/bin directory.
This file contains the process id of the tomcat process running on the machine. The pseudo logic in that case would be:

    if (the PID file does not exist) {
        // conclude that tomcat is not running
        start tomcat
    } else {
        read the process id from the PID file
        if (no process with that id is running) {
            // conclude that tomcat has crashed
            start tomcat
        }
    }

You can implement the above logic as follows. The following is experimental and is merely a suggested way; test it on your computer before using it.

    #!/bin/sh
    # adjust this to reflect tomcat home on your computer
    TOMCAT_HOME=/opt/tomcat5

    if [ -f "$TOMCAT_HOME/bin/tomcat.pid" ]
    then
        echo "PID file exists"
        pid="`cat "$TOMCAT_HOME/bin/tomcat.pid"`"
        # ps -p exits with a non-zero status when no process has that id
        if ps -p "$pid" > /dev/null 2>&1
        then
            echo "Tomcat is running"
        else
            echo "Tomcat has crashed. Restarting..."
            $TOMCAT_HOME/bin/startup.sh
        fi
    else
        echo "PID file does not exist. Restarting..."
        $TOMCAT_HOME/bin/startup.sh
    fi

Why would tomcat crash?

The most common reason is low memory, for example when you have allocated 1024MB of max memory to tomcat and that much memory is not available on the machine. Other reasons include repeated hot-deploys causing memory leaks and rare JVM bugs causing the JVM to crash.

From http://www.vineetmanohar.com/2010/06/howto-auto-recover-tomcat-crashes

June 28, 2010

by Vineet Manohar

· 29,820 Views

16 Tips for Securing Your Admin Page

So you've finished that shiny new website and you want to make sure that you and your buddies are in control. Besides the obvious things such as SSL and logging all access, there are a few best practices for authentication/access that developers recommend. Here are some of the recommendations:

1. Require separate login pages for users and admins, even when they use the same DB table. This helps prevent XSRF and session-stealing, and keeps an attacker from accessing the admin areas. [Thief Master]

2. Use complex passwords for admin accounts. For example, "uvula{:&:>iuJ", not "12345". Of course, you have to remember it. :) [Developer Art]

3. Introduce an artificial pause between each admin password attempt to prevent brute force attacks. [Lo'oris]

4. Blocking a user's IP after a number of failed admin login attempts, or requiring a CAPTCHA after a failed login (but not the first one, because that's really annoying), will also stop brute force attacks. [Thief Master]

5. If the admin section is in a separate subdirectory, you should consider also adding webserver native authentication to that area (e.g. via .htaccess in Apache). Then an attacker would need both the subdirectory password and the user password. [Thief Master]

6. Consider second-level authentication such as client certificates (e.g. x509 certs), smart cards, cardspace, etc. [JoeGeeky]

7. Restrict access to the admin area. Only allow clients from trusted IPs/Domains. [JoeGeeky]

8. Lock down IPrincipal & Principal-based authorization and make rights immutable and non-enumerable. Also make sure that all authorization assessments are based on the Principal. [JoeGeeky]

9. Set up an email notification system that alerts admins when any rights are upgraded. This will help you catch an attacker who elevates his/her rights. [JoeGeeky]

10. Consider fine-grained rights for admins. Typical Role-Based Security (RBS) approaches are not as safe because some roles will end up with more rights than they need. You should distribute rights based on the exact actions that an admin performs. This could cause a lot of overhead with more diverse admin types, but it is safer because rights are issued more sparingly. [JoeGeeky]

11. Restrict the creation of further admins and carefully control what admins can do to other admins. It's best to have a locked-down 'super-admin' client. [JoeGeeky]

12. Consider client-side SSL certificates or RSA-type keyfobs (electronic tokens) for added security. [Daniel Papasian]

13. If you're using cookies for authentication, use separate cookies for admin and normal pages. One way is to put the admin section on a different domain. [Daniel Papasian]

14. One possibility, if it's practical, is to put the admin site on a private subnet instead of the internet. [John Hartsock]

15. Re-issue auth/session tickets when moving between admin and normal usage contexts of the website. [Richard JP Le Guen]

16. Require equally strong mechanisms (using the above techniques) for basic users so that admins aren't the only ones with highly-secure accounts. [Lo'oris]

These tips were gathered in a question by UpTheCreek from StackOverflow.
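Two of the tips above, the artificial pause between password attempts and the block after repeated failures, can be combined in a small bookkeeping class. The following Java sketch is only one possible shape for it: the class name LoginThrottle, the doubling delay and the in-memory map are assumptions made for illustration and are not taken from the StackOverflow answers. A production system would also need to expire old entries and share state across servers.

```java
import java.util.HashMap;
import java.util.Map;

/** Tracks failed admin logins per client key, for example an IP address (hypothetical helper). */
class LoginThrottle {
    private final int maxFailures;       // failures allowed before the client is blocked
    private final long baseDelayMillis;  // pause imposed after the first failure
    private final Map<String, Integer> failures = new HashMap<>();

    LoginThrottle(int maxFailures, long baseDelayMillis) {
        this.maxFailures = maxFailures;
        this.baseDelayMillis = baseDelayMillis;
    }

    /** Pause to impose before the next attempt; it doubles with every recorded failure. */
    synchronized long delayMillisFor(String clientKey) {
        int count = failures.getOrDefault(clientKey, 0);
        if (count == 0) {
            return 0;
        }
        int shift = Math.min(count - 1, 10); // cap the growth of the delay
        return baseDelayMillis << shift;
    }

    /** True once the client has reached the failure limit. */
    synchronized boolean isBlocked(String clientKey) {
        return failures.getOrDefault(clientKey, 0) >= maxFailures;
    }

    /** Call after a wrong password. */
    synchronized void recordFailure(String clientKey) {
        failures.merge(clientKey, 1, Integer::sum);
    }

    /** Call after a successful login; clears the client's history. */
    synchronized void recordSuccess(String clientKey) {
        failures.remove(clientKey);
    }
}
```

The login handler would check isBlocked first, wait for delayMillisFor before verifying the password, and then call recordFailure or recordSuccess depending on the outcome.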

June 21, 2010

by Mitch Pronschinske

· 9,174 Views

NeoLoad 3.1 load tests Java Serialization

Neotys, a leader in easy-to-use, cost-effective load testing tools for web applications, today announced NeoLoad 3.1, the first test solution on the market to incorporate support for new push technologies such as Adobe RTMP or Ajax Push. It now also supports Java serialization: a new Java serialization module has been added to record and replay applications that use Java object serialization over HTTP. This module is fully compatible with the Spring remoting framework.

New features: Push Technologies module, RTMP module, Java Serialization module, advanced variabilization, alert thresholds, customized reports.

Free trial: download the NeoLoad v3.1 demo (30-day free trial).

More information: http://www.neotys.com

June 18, 2010

by Christophe Marton

· 1,334 Views

Builder Pattern Tutorial with Java Examples

Learn the Builder Design Pattern with easy Java source code examples as James Sugrue continues his design patterns tutorial series, Design Patterns Uncovered
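As a taste of what the pattern looks like, here is a minimal Java sketch. It uses the fluent, nested-builder variant popularized by Effective Java, which may differ from the form the linked tutorial presents; the Pizza class and its fields are invented purely for illustration.

```java
/** Immutable value object assembled step by step through a nested builder (illustrative only). */
final class Pizza {
    private final String size;
    private final boolean cheese;
    private final boolean pepperoni;

    // Only the builder can create instances.
    private Pizza(Builder builder) {
        this.size = builder.size;
        this.cheese = builder.cheese;
        this.pepperoni = builder.pepperoni;
    }

    String describe() {
        return size + " pizza" + (cheese ? " +cheese" : "") + (pepperoni ? " +pepperoni" : "");
    }

    static final class Builder {
        private final String size;   // required
        private boolean cheese;      // optional, defaults to false
        private boolean pepperoni;   // optional, defaults to false

        Builder(String size) {
            this.size = size;
        }

        Builder cheese() {
            this.cheese = true;
            return this;             // returning this allows call chaining
        }

        Builder pepperoni() {
            this.pepperoni = true;
            return this;
        }

        Pizza build() {
            return new Pizza(this);
        }
    }
}
```

A caller would write new Pizza.Builder("large").cheese().build(), naming only the options it cares about instead of passing a long list of constructor arguments.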

June 15, 2010

by James Sugrue

· 92,388 Views · 14 Likes