
The Latest Databases Topics

JBoss RichFaces with Spring
This article is going to show you how to build a RichFaces application with Spring.
February 16, 2009
by Max Katz
· 203,783 Views
Generic Repository and DDD - Revisited
Greg Young talks about the generic repository pattern and how to reduce the architectural seam of the contract between the domain layer and the persistence layer. The Repository is the contract of the domain layer with the persistence layer, so it makes sense to keep the contract of the repository as close to the domain as possible. Instead of a contract as opaque as Repository.FindAllMatching(QueryObject o), the domain layer should see something self-revealing like CustomerRepository.getCustomerByName(String name), which explicitly names the participating entities of the domain. +1 on all his suggestions.

However, he suggests using composition instead of inheritance to encourage reuse, along with encapsulation of the implementation details within the repository itself, something like the following (Java-ized):

    public class CustomerRepository implements ICustomerRepository {
        private Repository internalGenericRepository;

        public IEnumerable getCustomersWithFirstNameOf(string _Name) {
            internalGenericRepository.fetchByQueryObject(
                new CustomerFirstNameOfQuery(_Name)); //could be hql or whatever
        }
    }

Quite some time ago, I had a series of blogs on DDD, JPA and how to use generic repositories as an implementation artifact. I had suggested the use of the Bridge pattern to allow independent evolution of the interface and the implementation hierarchies. The interface side of the bridge will model the domain aspect of the repository and will ultimately terminate at the contracts that the domain layer will use. The implementation side of the bridge will allow for multiple implementations of the generic repository, e.g. JPA, native Hibernate or even, with some tweaking, some other storage technologies like CouchDB or the file system. After all, the premise of the Repository is to offer a transparent storage and retrieval engine, so that the domain layer always has the feel that it is operating on an in-memory collection.

    // root of the repository interface
    public interface IRepository {
        List read(String query, Object[] params);
    }

    public class Repository implements IRepository {
        private RepositoryImpl repositoryImpl;

        public List read(String query, Object[] params) {
            return repositoryImpl.read(query, params);
        }
        //..
    }

Base class of the implementation side of the Bridge:

    public abstract class RepositoryImpl {
        public abstract List read(String query, Object[] params);
    }

One concrete implementation using JPA:

    public class JpaRepository extends RepositoryImpl {
        // to be injected through DI in Spring
        private EntityManagerFactory factory;

        @Override
        public List read(String query, Object[] params) {
            //..
        }
    }

Another implementation using Hibernate. We can have similar implementations for a file system based repository as well:

    public class HibernateRepository extends RepositoryImpl {
        @Override
        public List read(String query, Object[] params) {
            // .. hibernate based implementation
        }
    }

Domain contract for the repository of the entity Restaurant. It is not opaque or narrow, uses the Ubiquitous Language and is self-revealing to the domain user:

    public interface IRestaurantRepository {
        List restaurantsByName(final String name);
        //..
    }

A concrete implementation of the above interface, written in terms of the implementation artifacts of the Bridge pattern. At the same time the implementation is not hardwired to any specific concrete repository engine (e.g. JPA or filesystem). This wiring will be done at runtime using dependency injection.

    public class RestaurantRepository extends Repository
        implements IRestaurantRepository {

        public List restaurantsByEntreeName(String entreeName) {
            Object[] params = new Object[1];
            params[0] = entreeName;
            return read(
                "select r from Restaurant r where r.entrees.name like ?1",
                params);
        }
        // .. other methods implemented
    }

One argument could be that the query string passed to the read() method is dependent on the specific engine used.
But it can very easily be abstracted using a factory that returns the appropriate metadata required for the query (e.g. named queries for JPA).

How does this compare with Greg Young's solution? Some of the niceties of the above Bridge-based solution are:

  • The architecture seam exposed to the domain layer is NOT opaque or narrow. The domain layer works with IRestaurantRepository, which is intention-revealing enough. The actual implementation is injected using Dependency Injection.
  • The specific implementation engine is abstracted away and once again injected using DI. So, in the event of using alternative repository engines, the domain layer is NOT impacted.
  • Greg Young suggests using composition instead of inheritance. The above design also uses composition to encapsulate the implementation within the abstract base class Repository.

However, in case you do not want the complexity or flexibility of switching implementations, one leg of the Bridge can be removed and the design simplified.

From http://debasishg.blogspot.com/
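To see the Bridge wiring end to end without a JPA or Hibernate dependency, here is a minimal sketch. It is not the author's code: the Restaurant class, the InMemoryRepositoryImpl engine, and the use of constructor injection in place of a DI container are all assumptions made for illustration, and the in-memory engine simply ignores the query string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical domain entity, reduced to what the sketch needs.
final class Restaurant {
    final String name;
    final List<String> entrees;

    Restaurant(String name, String... entrees) {
        this.name = name;
        this.entrees = Arrays.asList(entrees);
    }
}

// Implementation side of the Bridge: one abstract engine, many backends.
abstract class RepositoryImpl {
    abstract List<Restaurant> read(String query, Object[] params);
}

// Hypothetical engine standing in for JPA or Hibernate. It ignores the
// query text and matches params[0] against entree names held in memory.
final class InMemoryRepositoryImpl extends RepositoryImpl {
    private final List<Restaurant> store;

    InMemoryRepositoryImpl(List<Restaurant> store) {
        this.store = store;
    }

    @Override
    List<Restaurant> read(String query, Object[] params) {
        List<Restaurant> hits = new ArrayList<>();
        for (Restaurant r : store) {
            if (r.entrees.contains(params[0])) {
                hits.add(r);
            }
        }
        return hits;
    }
}

// Interface side of the Bridge: delegates to whichever engine was injected.
class Repository {
    private final RepositoryImpl repositoryImpl;

    Repository(RepositoryImpl repositoryImpl) {
        this.repositoryImpl = repositoryImpl;
    }

    List<Restaurant> read(String query, Object[] params) {
        return repositoryImpl.read(query, params);
    }
}

// Domain contract, phrased in the ubiquitous language.
interface IRestaurantRepository {
    List<Restaurant> restaurantsByEntreeName(String entreeName);
}

final class RestaurantRepository extends Repository implements IRestaurantRepository {
    RestaurantRepository(RepositoryImpl engine) {
        super(engine); // constructor injection stands in for Spring here
    }

    @Override
    public List<Restaurant> restaurantsByEntreeName(String entreeName) {
        return read("select r from Restaurant r where r.entrees.name like ?1",
                    new Object[] { entreeName });
    }
}
```

The domain layer only ever sees IRestaurantRepository; swapping InMemoryRepositoryImpl for another RepositoryImpl subclass would not change any caller.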
January 20, 2009
by Debasish Ghosh
· 40,604 Views · 1 Like
Hello EclipseLink on the NetBeans Platform
Let's use EclipseLink to set up some very basic database interaction in a NetBeans Platform application. Though it will be the ultimate 'Hello World' scenario, it should show how to get started with database interactivity on the NetBeans Platform, while also yet again showing the benefit of the NetBeans Platform's modular architecture. Of course, feel free to adapt these instructions to your needs; for example, instead of EclipseLink, use TopLink, or Hibernate, or whatever. You should also be surprised by how easy it is, once you know how.

We'll simply access a database and display what we find there. The application will consist of 4 separate modules, which will enable us to easily provide alternative database providers, as well as alternative persistence providers, because the UI module uses generic code that could make use of any alternative backing modules. Let's get started.

Create a Java Library and Generate Entity Classes from the Database. Firstly, create a Java Library project. Then use the Entity Classes from Database wizard to generate entity classes from your database. In the wizard, select EclipseLink in the step where you generate a persistence unit. Look at the generated code and notice that, among other things, you have a persistence.xml file in a folder called META-INF, thanks to the wizard. In my case, I chose a Sample database that comes with the IDE, and then I specified I want an entity class for the Customer table, which resulted in the IDE also creating an entity class for the related DiscountCode table. Build the Java Library and you will have a JAR file in the project's "dist" folder. That JAR file needs to be added as a library wrapper module to the application you will create in the next step.

Create a NetBeans Platform Application. In the New Project dialog, specify that you want to create a new NetBeans Platform Application. Once you've created it, right-click the Modules node in the Projects window and choose Add New Library. Then select the JAR you created in the previous step. You now have your first custom module in the new application.

Create Supporting Library Wrappers. Do the same as you did when creating the library wrapper for the entity class JAR, but this time for the EclipseLink JARs. These are in your GlassFish distro; make sure to include the persistence JAR that you find there too. If you don't know which ones to include, go back to the Java Library project and expand its Libraries folder, which will show you which libraries you need. Next, create yet another library wrapper module, this time for the DerbyClient JAR.

Create the UI Module. The final module you will need will provide the UI. So, create a new module (not a Library Wrapper Module, but just a plain NetBeans Module) and add a Window Component via the New Window Component wizard.

Set Dependencies. You now have lots of classes all neatly separated into distinct modules. In order to be able to use code from one module in another module, you'll need to set dependencies, i.e., very explicit contracts (as opposed to accidental reuse of code in one place from another place, resulting in unmaintainable chaos). First, the entity classes module needs to have dependencies on the DerbyClient module, as well as on the EclipseLink module. Then, the UI module needs a dependency on the EclipseLink module as well as the entity classes module. (You could split things further so that the EclipseLink module is not a dependency of the UI module, by putting the persistence JAR in one module, with the other EclipseLink JARs separated in a different module.)

Now, finally, let's do some coding. Not much needed, though. Add a JTextArea to the TopComponent in the UI module. Then add this to the end of the TopComponent constructor:

    EntityManager entityManager = Persistence.createEntityManagerFactory("EntityLibPU").createEntityManager();
    Query query = entityManager.createQuery("SELECT c FROM Customer c");
    List<Customer> resultList = query.getResultList();
    for (Customer c : resultList) {
        jTextArea1.append(c.getName() + " (" + c.getCity() + ")" + "\n");
    }

Above, you can see I am referring to a persistence unit named "EntityLibPU", which is the name set in the persistence.xml file. In addition, I am referring to one of the entity classes, called Customer, which is in the entity classes module. Adapt these bits to your needs.

Deploy the Application. Start your database server and then run the application. You should see each customer's name and city listed in the text area. Congrats, you've just managed to set up JPA via EclipseLink in a modular NetBeans Platform application... and you only typed 6 lines of code.
January 17, 2009
by Geertjan Wielenga
· 34,378 Views
JPA's Nasty "Unknown abstract schema type" Error
I'd been trying to debug the following error for days, and it drove me crazy. The problem, in a nutshell, is that JPA refuses to compile one of my NamedQueries, throwing the following error:

    Error compiling the query [UserVO.findByUserName: SELECT u FROM UserVO u WHERE u.name = :name]. Unknown abstract schema type [UserVO]

After numerous Google searches, I concluded that JPA throws the "Unknown abstract schema type" error when it fails to locate your entity class. Most often, this type of error occurs when:

  • You have provided the database table name instead of the entity class name in the JPA query. For example, if you have an entity class called "UserVO", which maps to the table name "users", the query "SELECT u from users u" will throw the above exception.
  • You are running JPA in standalone mode, or outside a full Java EE container (for example, in Tomcat 5 or 6), and you forgot to explicitly list all entity classes in the persistence.xml file, thus causing JPA to fail to locate the entities when compiling the query.

Neither of the above applied to my case. I had explicitly listed all my entity classes in the persistence.xml and I was sure my JPA query was valid. I tested my code with different JPA implementations, but always saw the same error. Here's my UserVO class:

    @Entity(name = "users")
    @NamedQuery(name = "UserVO.findByUserName",
                query = "SELECT u FROM UserVO u WHERE u.name = :name")
    public class UserVO extends BaseVO implements Serializable {
        ... ...
    }

If I remove the NamedQuery, my JPA works as expected, i.e., I am able to insert, delete, and update the UserVO object. Now, to all my smart readers, can you spot what's wrong in my code? Think about it and then scroll down for the answer...

Answer: The culprit is the Entity annotation. I explicitly named the UserVO entity "users". JPA has no problem mapping the UserVO entity to the users database table. However, JPA has a problem when compiling the JPA query: it can't find the UserVO entity in the JPA context because I have renamed the UserVO entity to users. To resolve this, leave the entity name alone and add a @Table annotation with the table name, as shown in the code below:

    @Entity
    @Table(name = "users")
    @NamedQuery(name = "UserVO.findByUserName",
                query = "SELECT u FROM UserVO u WHERE u.name = :name")
    public class UserVO extends BaseVO implements Serializable {
        ... ...
    }

Haha, stupid me... Anyway, Happy New Year to everyone.
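The naming rule behind this bug can be illustrated without a JPA provider. The sketch below is only a model of the defaulting rules as I understand them from the JPA specification: the Entity and Table annotations declared here are stand-ins for the javax.persistence ones, and SchemaNames and RenamedUserVO are hypothetical names introduced for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stand-ins for javax.persistence.Entity and Table, declared here only so
// the sketch runs without a JPA provider on the classpath.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Entity {
    String name() default "";
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Table {
    String name() default "";
}

// The problematic mapping: the entity itself is renamed to "users".
@Entity(name = "users")
class RenamedUserVO { }

// The fixed mapping: entity name stays "UserVO", only the table is renamed.
@Entity
@Table(name = "users")
class UserVO { }

final class SchemaNames {
    // Name a query must use: the entity name, which defaults to the
    // unqualified class name when @Entity gives none.
    static String entityName(Class<?> type) {
        Entity e = type.getAnnotation(Entity.class);
        if (e == null) {
            throw new IllegalArgumentException(type + " is not an entity");
        }
        return e.name().isEmpty() ? type.getSimpleName() : e.name();
    }

    // Table the entity maps to: the @Table name if given, else the entity name.
    static String tableName(Class<?> type) {
        Table t = type.getAnnotation(Table.class);
        return (t != null && !t.name().isEmpty()) ? t.name() : entityName(type);
    }
}
```

Both classes map to the users table, but only the second can be queried as "FROM UserVO u"; with the first mapping the query would have to say "FROM users u".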
January 13, 2009
by Khoo Chen Shiang
· 54,011 Views
The Three Pillars of Continuous Integration
Continuous Integration, commonly known as CI, is a process that consists of continuously compiling, testing, inspecting, and deploying source code. In any typical CI environment, this means running a new build every time code changes within a version control repository. Martin Fowler describes CI as:

    A software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly.

While CI is actually a process, the term Continuous Integration is often associated with three important tools in particular. The three pillars of CI are:

  1. A version control repository like Subversion or CVS.
  2. A CI server such as Hudson or CruiseControl.
  3. An automated build process like Ant or NAnt.

So, let’s look at each of these in detail.

Version Control Repository: Version control repositories, also known as SCM (source code management), play a crucial role in any software development environment. They also play a very important role in a successful CI process. The SCM is a central place for the team to store every artifact needed for the project. It is mandatory for teams to put everything needed for a successful build into this repository. This includes the build scripts, property files, database scripts, all the libraries required to build the software, and so on.

The CI Server: For CI to function properly, we also need an automated process that monitors a version control repository and runs a build when any changes are detected. There are several CI servers available, both open source and commercial. Most of them are similar in their basic configuration: they monitor a particular version control repository and run builds when any changes are detected. Some of the most commonly used open source CI servers are CruiseControl, Continuum, and Hudson. Hudson is particularly interesting because of its ease of configuration and compelling plug-ins, which make integration with test and static analysis tools much easier.

Automated Build: The process of CI is about building software often, which is accomplished through the use of a build. A sturdy build strategy is by far the most important aspect of a successful CI process. In the absence of a solid build that does more than compile your code, CI withers. With automated builds, teams can reliably perform (in an automated fashion) otherwise manual tasks like compilation and testing, and even more interesting things like software inspection and deployment.

Now that we have seen the important tools in our CI process, let’s see what a typical CI scenario looks like for a developer:

  1. The CI server is configured to poll the version control repository continuously for changes.
  2. A developer commits code to the repository.
  3. The CI server detects this change and retrieves the latest code from the repository.
  4. The CI server then invokes the build script with the given targets and options.
  5. If configured, the CI server sends out an e-mail to the specified recipients when a certain important event occurs.
  6. The CI server continues to poll for changes.

Why is CI Important? This is one of the most frequently asked questions, and here are a few points to note about this powerful technique:

  • Building software often greatly increases the likelihood that you will spot defects early, when they are still relatively manageable. It extends defect visibility.
  • CI ensures that you have production-ready software at every change.
  • CI also reduces the risk of integration issues by building software at every change.
  • A CI server can also be configured to run continuous inspection, which can assist the development team in finding potential bugs and bad programming practices, automatically check coding standards, and provide valuable feedback on the quality of the code being written.

Over the past several months, I have assisted several companies in implementing CI. There was a little bit of resistance from the developers in the early stages when we implemented continuous feedback, but I never heard a single negative comment about this approach. If you already have a version control repository and automated builds, you are very close to the CI process. Download one of the open source CI servers, then configure and set up a simple project. It should take less than an hour if you have automated build scripts. Start adding additional features like code inspections, generating reports, metrics, documentation and so on. Most important, send continuous feedback to your team. Give this process a try; you will surely be surprised to see how effective it is. And, as always, share your thoughts, concerns or questions.
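The poll, build, and notify cycle from the developer scenario above can be sketched in a few lines. This is a toy model and not how Hudson or CruiseControl is implemented: CiServerSketch, its poll method, and the use of a revision string and a Predicate as the "build script" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class CiServerSketch {
    private String lastBuiltRevision; // null until the first build
    final List<String> notifications = new ArrayList<>();

    // One polling cycle. headRevision is what the repository reports as
    // its latest revision; build plays the role of the build script and
    // returns true on success. Returns true if a build was triggered.
    boolean poll(String headRevision, Predicate<String> build) {
        if (headRevision.equals(lastBuiltRevision)) {
            return false; // no change detected, keep polling
        }
        boolean ok = build.test(headRevision); // run the build on new code
        lastBuiltRevision = headRevision;
        // Stand-in for the e-mail sent to the configured recipients.
        notifications.add((ok ? "SUCCESS " : "FAILURE ") + headRevision);
        return true;
    }
}
```

A real server would call something like poll on a timer or react to a commit hook; the point is only that a build runs once per detected change and every result is reported.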
December 15, 2008
by Meera Subbarao
· 23,883 Views
Computing 95 Percentile In MySQL
When doing performance analysis you often want to see the 95th percentile, 99th percentile and similar values. The "average" is the evil of performance optimization and often as helpful as "average patient temperature in the hospital". Let's say you have 10000 page views or queries and an average response time of 1 second. What does it mean? Really nothing: maybe one page view took 10000 seconds and the rest were in low milliseconds, or maybe every single page view took 1 second, which are completely different situations. You also do not really care about average performance: the goal of good user experience is for the majority of users to have a good experience, and the average is not a good fit here.

Defining your response time goal as a 95th or 99th percentile is much better. If you say the 99th percentile response time should be one second, this means only 1 percent of queries/page views are allowed to take more than that. For larger systems, defining (increasing) response times for 99.9 or even 99.99 percentile numbers often makes sense. It also often makes sense to define response time goals separately for different transactions: the AJAX widget response time requirements may be very different from those of the slow search page.

So you have defined your response time in terms of the 95th/99th percentile and have your logs in a table. How do you get the data if MySQL only provides you with the avg?

    mysql> SELECT count(*),avg(wtime) FROM performance_log_081128 WHERE page_type='search';
    +----------+-----------------+
    | count(*) | avg(wtime)      |
    +----------+-----------------+
    |   106859 | 1.4469140766532 |
    +----------+-----------------+
    1 row in set (2.08 sec)

The average response time is shown here just for reference; what we really need is the number of rows that match the given page type. Dividing the count by 100 gives us 1% of the values and dividing by 20 gives us 5% of the values. Now we can get the response times we are concerned about simply by running the following order-by queries:

    mysql> SELECT wtime FROM performance_log_081128 WHERE page_type='search' ORDER BY wtime DESC LIMIT 1068,1;
    +---------+
    | wtime   |
    +---------+
    | 10.1007 |
    +---------+
    1 row in set (2.06 sec)

    mysql> SELECT wtime FROM performance_log_081128 WHERE page_type='search' ORDER BY wtime DESC LIMIT 5342,1;
    +---------+
    | wtime   |
    +---------+
    | 5.09297 |
    +---------+
    1 row in set (2.06 sec)

So for this system the 95th percentile is just over 5 seconds (some 3.5 times the average) and the 99th percentile is just a bit over 10 seconds (about 7 times the average). Both numbers are horrible and the system surely needs to be fixed. These numbers are here to illustrate that percentile numbers can be quite different from average numbers (it is not rare to see the 99th percentile differ from the average by an order of magnitude), and this is what you really need to focus on.

Looking at the numbers from the business standpoint, try to understand what they really are. In some cases I see rather bad percentiles on the backend which are not really a problem for the business because there is a cache up front anyway. If 99% of requests are coming from the cache and you observe a certain 99th percentile response time on the backend, it is effectively the 99.99 percentile response time, which is a lot different. You can often afford 1 in 10000 requests stalling for a few seconds, because things outside of your control (like packet loss on the client side) would be responsible for a larger share of delays. Be careful though: "random" delays, for example when the system was busy and delayed servicing a request, are one thing; "systematic" delays, when response time is always bad under given conditions, can be a much worse problem. You do not want your best client to suffer, for example, even if he is the only one.
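The offset arithmetic used in the LIMIT clauses above can be written out explicitly. The sketch below is an illustration, not part of the original post: Percentiles, descOffset and percentile are hypothetical names, and the in-memory version simply mirrors what the ORDER BY wtime DESC LIMIT offset,1 query does.

```java
import java.util.Arrays;

final class Percentiles {
    // Offset to use in "ORDER BY wtime DESC LIMIT offset,1": skip the
    // slowest (100 - percentile)% of rows. Integer division, matching the
    // "divide the count by 100" (or by 20) step in the text.
    static int descOffset(int rowCount, int percentile) {
        if (percentile < 1 || percentile > 100) {
            throw new IllegalArgumentException("percentile must be in 1..100");
        }
        return (int) ((long) rowCount * (100 - percentile) / 100);
    }

    // Same selection done in memory: sort, then count back from the
    // slowest value by the computed offset.
    static double percentile(double[] times, int percentile) {
        if (times.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = times.clone();
        Arrays.sort(sorted); // ascending, so the slowest value is last
        int offset = descOffset(sorted.length, percentile);
        return sorted[sorted.length - 1 - offset];
    }
}
```

For the 106859 rows in the example, descOffset gives 1068 for the 99th percentile and 5342 for the 95th, the same offsets used in the two queries.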
December 3, 2008
by Peter Zaitsev
· 15,414 Views
Importing XML Data Into A SQLite Table
For inserting into a SQLite table, use the code as follows:

    //DB Connection
    private var dbconn:SQLConnection;
    //Query Statement
    private var sqlQuery:SQLStatement;
    //Create Table Statement
    private var sqlCreateTable:SQLStatement;
    //Insert Statement
    private var sqlInsert:SQLStatement;
    //Import Statement
    private var sqlImport:SQLStatement;

    /**
     * This is for importing xml data to a SQLite table
     * @param node xml node
     * @param user the user whose data this is
     */
    public function importPostXML( node:XMLNode, user:User ):void
    {
        var query:String = "INSERT INTO posts (" +
            "post_url," +
            "post_hash," +
            "post_desc," +
            "post_tags," +
            "post_time," +
            "post_extended," +
            "post_shared," +
            "post_replace," +
            "post_user) " +
            "VALUES ( " +
            ":post_url," +
            ":post_hash," +
            ":post_desc," +
            ":post_tags," +
            ":post_time," +
            ":post_extended," +
            ":post_shared," +
            ":post_replace," +
            ":post_user)";

        sqlImport = new SQLStatement();
        sqlImport.sqlConnection = dbconn;
        sqlImport.addEventListener( SQLEvent.RESULT, onSQLSave );
        sqlImport.addEventListener( SQLErrorEvent.ERROR, onSQLError );
        sqlImport.text = query;

        //Bind each XML attribute to its named parameter
        sqlImport.parameters[":post_url"] = node.attributes.href;
        sqlImport.parameters[":post_hash"] = node.attributes.hash;
        sqlImport.parameters[":post_desc"] = node.attributes.description;
        sqlImport.parameters[":post_tags"] = node.attributes.tag;
        sqlImport.parameters[":post_time"] = node.attributes.time;
        sqlImport.parameters[":post_extended"] = node.attributes.extended;
        sqlImport.parameters[":post_shared"] = node.attributes.shared;
        sqlImport.parameters[":post_replace"] = node.attributes.replace;
        sqlImport.parameters[":post_user"] = user.user_name;

        sqlImport.execute();
        trace( "Importing XML to SQLite Database" );
    }
September 21, 2008
by Jonnie Spratley
· 15,912 Views · 1 Like
The Capability Pattern: Future-Proof Your APIs
Here is a simple pattern which you can use to make your APIs extensible, even by third parties, without sacrificing your ability to keep backward compatibility.

It is very common to create a library which has two “sides”: an API side and an SPI side. The API is what applications call to use the library. The SPI (Service Provider Interface) is how functionality, for example access to different kinds of resources, is provided. One example of this is JavaMail: to read/write email messages, you call JavaMail's API. Under the hood, when you ask for a mail store for, say, an IMAP mail server, the JavaMail library looks up all the providers registered (injected) on the classpath, and tries to find one that supports that protocol. The protocol handler is written to JavaMail's SPI. If it finds one, then you can fetch messages from IMAP servers using it. But your client code only ever calls the JavaMail API; it doesn't need to know anything about the IMAP service provider under the hood.

There is one very big problem with the way this is usually done: API classes really ought to be final in almost all cases. SPI classes ought to be abstract classes unless the problem domain is extremely well-defined, in which case interfaces make sense (you can use either, but in a not-well-defined problem domain you may end up, over time, creating things with awful names like LayoutManager2). I won't go into great detail about why this is true here (my friend Jarda does in his new book and we discuss it somewhat in our book Rich Client Programming). In abbreviated form, the reasons are:

  • You can provably backward compatibly add methods to a final class. And if the class is final, that fact has communication value: it tells the user of that class that it's not something they might need to implement, where an interface would be more confusing.
  • You can backward compatibly remove methods from an SPI interface or abstract class, if your library is the only thing that will ever call the SPI directly. Older implementations will still have the method, it just will never be called (in a modular environment such as the NetBeans module system, OSGi or presumably JSR-277, you would enforce this by putting the API and SPI in separate JAR files, so a client can't even see the SPI classes).

A minor benefit of using abstract classes is that you can semi-compatibly add non-abstract methods to an abstract class later. But do remember that you run the risk that someone will have a subclass with the same method name and arguments and an incompatible return type (the JDK actually did this to us once in NetBeans, by adding Throwable.getCause() in JDK 1.4). So adding methods to a public, non-final class in an API is a backward-incompatible change.

Given those constraints, what happens if you mix API and SPI in the same class (which is what JavaMail and most Java standards do)? Well, you can't add methods compatibly because that could break subclasses. And you can't remove them compatibly, because clients could be calling them. You're stuck. You can't compatibly add or remove anything from the existing classes.

As I've written elsewhere, it is the height of insanity that an application server vendor is supposed to implement interfaces and classes that its clients directly call, for exactly this reason. It would be much cleaner, and allow Java APIs to evolve much faster, if API and SPI were completely separated. But part of the appeal to vendors, for better or worse, of implementing these specifications is that they can extend them in custom ways that will tie developers who use those extensions to their particular implementation. This behavior is not entirely about being evil and locking people in.
There is a genuine case for innovation on top of a standard: that's how standards evolve, and some people will need functionality that the standard doesn't yet support.

Enter the capability pattern. The capability pattern is very, very simple. It looks like this:

    public <T> T getCapability (Class<T> type);

That's it! It's incredibly simple! It has one caveat: any call to getCapability() must be followed by a null-check. But this is much cleaner than either catching UnsupportedOperationExceptions, or if (foo.isAbleToDoX()) foo.doX(), or if (foo instanceof DoerOfX) ((DoerOfX) foo).doX(). A null-check is nice and simple and clean by comparison. It's letting the Java type system work for you instead of getting into a wrestling match with it.

Now, what can you do with it? Here's an example. In my previous blog I introduced an alternative design for how you could do something like SwingWorker. It contains a class called TaskStatus, which abstracts the task status data from the task-performing object itself. It is a simple interface with setters that allow a background thread to inform another object (presumably a UI) about the progress of a task.

In light of what we just discussed, TaskStatus really ought to be a final class. So let's rewrite it a little, to look like this. We will use a mirror-class for the SPI.

    public final class TaskStatus {
        private final StatusImpl impl;

        TaskStatus (StatusImpl impl) {
            this.impl = impl;
        }

        public void setTitle (String title) {
            impl.setTitle (title);
        }

        public void setProgress (String msg, long progress, long min, long max) {
            //We could do argument sanity checks here and make life
            //simpler for anyone implementing StatusImpl
            impl.setProgress (msg, progress, min, max);
        }

        public void setProgress (String msg) {
            //...you get the idea
        }
        //...
    }

    public abstract class StatusImpl {
        public abstract void setTitle (String title);
        public abstract void setProgress (String msg, long progress, long min, long max);
        public abstract void setProgress (String msg); //indeterminate mode
        public abstract void done();
        public abstract void failed (Exception e);
    }

So we have an API that handles basic status display. But people are going to invent new aspects to status display. We can't save the world and solve everybody's task-status problems before they even think of them, and we shouldn't try. We don't want to set things up so that it's up to us to implement everything the world will ever want. Luckily, it doesn't have to be that way. Since we've designed our API so that it can be compatibly added to, we let the rest of the world come up with things they need for displaying task status, and the ones that a lot of people need can be added to our API in the future.

The capability pattern lets us do that. We add two methods to our API and SPI classes:

    public abstract class StatusImpl {
        //...
        public abstract <T> T getCapability (Class<T> type);
    }

    public final class TaskStatus {
        //...
        public <T> T getCapability (Class<T> type) {
            return impl.getCapability (type);
        }
    }

Let's put that to practical use. Someone might want to display how much time remains before the task is done. Our API doesn't handle that. Through the capability pattern, we can add that. We (or anyone implementing StatusImpl) can create the following interface:

    public interface StatusTime {
        public void setTimeRemaining (long milliseconds);
    }

A task that wants to provide this information to the UI, if the UI supports it, simply does this:

    public T runInBackground (TaskStatus status) {
        StatusTime time = status.getCapability (StatusTime.class);
        for (...) {
            //do some slow work...
            if (time != null) {
                long remaining = //estimate the time remaining
                time.setTimeRemaining (remaining);
            }
        }
    }

Even better, our Task API is, right now, not tied specifically to Swing or AWT. It could be used for anything that needs to follow the pattern of computing something on a background thread and then doing work on another one. Why not keep it un-tied to UI toolkits? All we have to do is make the code that actually handles the threading pluggable (I'll talk about how you do this simply using the Java classpath for dependency injection in my next blog). Then the result could be used with SWT or Thinlet as well, or even in a server-side application. Instead of a SwingWorker, we have an AnythingWorker!

But we know we need a UI, and we know we are targeting Swing right now. How can we really keep this code completely un-tied from UI code and still have it be useful? The capability pattern comes to our rescue again, very very simply. An actual application using this UI simply fetches the default factory for StatusImpls (you need such a thing if you want to run multiple simultaneous background tasks and show status for each; my next blog will explain how this can be injected just by putting a JAR on the classpath) and does something like:

    Component statusUi = theFactory.getCapability (Component.class);
    if (statusUi != null) {
        statusBar.add (statusUi);
    }

(Or, if we want to allow only one background task at a time, we can forget the factory and put the Component fetching code directly in our implementation of StatusImpl.)

If you are familiar with the NetBeans Lookup API, the capability pattern is really a simplification of that (minus collection-based results and listening for changes). The point here is that the capability pattern lets you have an API that is composed completely of nice, future-proofed, evolvable, final classes, but the API is extensible even though it is final.
The result is that the API can evolve faster, with fewer worries about breaking anybody's existing code. That reduces the cycle time to improve existing libraries, so all our software evolves and improves faster, which is good for everyone.

It also helps one avoid trying to “save the world”: by allowing for extensibility, it is possible to create an API that is useful without needing to handle every possible thing anyone might ever want to do in that problem domain. Trying to save the world is what leads to scope-creep and never-finished projects. In this tutorial I discuss the “don't try to save the world” principle through a practical example.

Does the mirror-class design seem a bit masochistic? I think it does point up a weakness in the scoping rules of the Java language. It would definitely be nicer to be able to, on the method level, make some methods visible to some kinds of clients, and other methods visible to other kinds of clients. But regardless of this, it's even more masochistic to end up “painted into a corner”[1] and unable to fix bugs or add features without potentially breaking somebody's code. That's how you end up with ten-year-old unfixed bugs.

[1] painted into a corner: an English idiom meaning to leave yourself with no options. You were painting the floor of a room in such a pattern that you end up standing in an unpainted corner of the room, and you can't leave the corner until the paint dries.

From http://weblogs.java.net/blog/timboudreau/
August 29, 2008
by Tim Boudreau
· 20,123 Views
article thumbnail
Using a Hibernate Interceptor To Set Audit Trail Properties
In almost every application I've done, the database tables have some kind of audit trail fields. Sometimes this is a separate "audit log" table where all inserts, updates, deletes, and possibly even queries are logged. Other times there are the four typical audit trail fields in each table; for example, you might have created_by, created_on, updated_by, and updated_on fields in each table. The goal in the latter case is to update those four fields with the appropriate information as to who created or updated a record and when they did it. Using a simple Hibernate Interceptor, this can be accomplished with no changes to your application code (with several assumptions which I'll detail next). In other words, you won't need to, and definitely should not, litter your application code with manual assignments to those audit properties.

The basic assumptions I'll make for this simple audit interceptor are that: (1) model objects contain the four audit properties mentioned above, and (2) there is an easy way to obtain the current user's information from anywhere in the code. The first assumption is needed since you need some way to identify which properties constitute the audit trail properties. The second assumption is required because you need some way to obtain the credentials of the person making the change in order to set the createdBy or updatedBy property in your Hibernate Interceptor class.

So, for reference purposes, assume you have a (Groovy) base entity like this with the four audit properties:

@MappedSuperclass
class BaseEntity implements Serializable {
    String createdBy
    Date createdOn
    String updatedBy
    Date updatedOn
}

I'm using the Hibernate ImprovedNamingStrategy so that camel case names are translated to underscored names, e.g. "createdBy" becomes "created_by".
Next, assume there is a BlogEntry entity class that extends BaseEntity and inherits the audit trail properties:

@Entity
class BlogEntry extends BaseEntity {
    @Id
    @GeneratedValue (strategy = GenerationType.IDENTITY)
    Long id

    @Version
    Long version

    String title

    @Column (name = "entry_text")
    String text

    @Temporal (TemporalType.TIMESTAMP)
    Date publishedOn
}

To implement the interceptor, we need to implement the aforementioned Interceptor interface. We could do this directly, but it is better to extend EmptyInterceptor so we need only implement the methods we actually care about. Without further ado, here's the implementation (excluding package declaration and imports):

class AuditTrailInterceptor extends EmptyInterceptor {

    boolean onFlushDirty(Object entity, Serializable id, Object[] currentState,
                         Object[] previousState, String[] propertyNames, Type[] types) {
        setValue(currentState, propertyNames, "updatedBy", UserUtils.getCurrentUsername())
        setValue(currentState, propertyNames, "updatedOn", new Date())
        true
    }

    boolean onSave(Object entity, Serializable id, Object[] state,
                   String[] propertyNames, Type[] types) {
        setValue(state, propertyNames, "createdBy", UserUtils.getCurrentUsername())
        setValue(state, propertyNames, "createdOn", new Date())
        true
    }

    private void setValue(Object[] currentState, String[] propertyNames,
                          String propertyToSet, Object value) {
        def index = propertyNames.toList().indexOf(propertyToSet)
        if (index >= 0) {
            currentState[index] = value
        }
    }
}

So what did we do? First, we implemented the onFlushDirty and onSave methods because they are called for SQL updates and inserts, respectively. For example, when a new entity is first saved, the onSave method is called, at which point we want to set the createdBy and createdOn properties. And if an existing entity is updated, onFlushDirty is called and we set the updatedBy and updatedOn properties. Both methods return true to tell Hibernate that the state was modified. Second, we are using the setValue helper method to do the real work.
Specifically, the only way to modify the state in a Hibernate Interceptor (that I am aware of anyway) is to dig into the currentState array and change the appropriate value. In order to do that, you first need to trawl through the propertyNames array to find the index of the property you are trying to set. For example, if you are updating a blog entry you need to set the updatedBy and updatedOn properties within the currentState array. For a BlogEntry object, the currentState array might look like this before the update (the updatedBy and updatedOn properties are both null in this case because the entity was created by Bob but has not been updated yet):

{ "Bob", 2008-08-27 10:57:19.0, null, null, 2008-08-27 10:57:19.0, "Lorem ipsum...", "My First Blog Entry", 0 }

You then need to look at the propertyNames array to provide context for what the above data represents:

{ "createdBy", "createdOn", "updatedBy", "updatedOn", "publishedOn", "text", "title", "version" }

So in the above, updatedBy is at index 2 and updatedOn is at index 3. setValue() works by finding the index of the property it needs to set, e.g. "updatedBy", and if the property was found, it changes the value at that index in the currentState array. So for updatedBy at index 2, the following is the equivalent code if we had actually hardcoded the implementation to always expect the audit fields as the first four properties (which is obviously not a great idea):

// Equivalent hard-coded code to change "updatedBy" in above example
// Don't use in production!
currentState[2] = UserUtils.getCurrentUsername()

To actually make your interceptor do something, you need to enable it on the Hibernate Session. You can do this in one of several ways. If you are using plain Hibernate (i.e. not with Spring or another framework) you can set the interceptor globally on the SessionFactory, or you can enable it for each Session as in the following example code:

// Configure interceptor globally (applies to all Sessions)
sessionFactory = new AnnotationConfiguration()
    .configure()
    .setNamingStrategy(ImprovedNamingStrategy.INSTANCE)
    .setInterceptor(new AuditTrailInterceptor())
    .buildSessionFactory()

// Enable per Session
Session session = getSessionFactory().openSession(new AuditTrailInterceptor())

If you enable the interceptor globally, it must be thread-safe. If you are using Spring, you can easily configure a global interceptor on your session factory bean, for example through the entityInterceptor property of Spring's LocalSessionFactoryBean. On the other hand, if you would rather enable the interceptor per session, you either need to use the openSession(Interceptor) method to open your sessions or alternatively implement your own version of CurrentSessionContext to use the getCurrentSession() method in order to set the interceptor. Using getCurrentSession() is preferable anyway since it allows several different classes (e.g. DAOs) to use the same session without needing to explicitly pass the Session object around to each object that needs it.

At this point we're done. But, if you know about the Hibernate eventing system (e.g. you can listen for events such as inserts and updates and define event listener classes to respond to those events), you might be wondering why I didn't use that mechanism rather than the Interceptor. The reason is that, to the best of my current knowledge, you cannot alter the state of objects in event listeners. So, for example, you would not be able to change an entity's state in a PreInsertEventListener implementation class. If anyone knows this is incorrect or has implemented it, I'd love to hear about it. Until next time, happy auditing!

Originally posted on Scott Leberknight's blog
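For readers who want to try the index lookup in isolation, the work done by the setValue helper can be sketched in plain Java with no Hibernate on the classpath. This is only an illustration: the class name AuditStateSketch, the boolean return value, and the user name "Alice" are assumptions made for the example and are not part of the original interceptor.

```java
import java.util.Arrays;

final class AuditStateSketch {

    // Finds propertyToSet in propertyNames and overwrites the matching slot
    // in the parallel state array. Returns true if the property was found.
    // (The original Groovy helper returns nothing; the boolean is added here
    // only so the effect is easy to check.)
    static boolean setValue(Object[] state, String[] propertyNames,
                            String propertyToSet, Object value) {
        int index = Arrays.asList(propertyNames).indexOf(propertyToSet);
        if (index >= 0) {
            state[index] = value;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Property names in the order shown in the article's example.
        String[] names = { "createdBy", "createdOn", "updatedBy", "updatedOn",
                           "publishedOn", "text", "title", "version" };
        Object[] state = { "Bob", null, null, null,
                           null, "Lorem ipsum...", "My First Blog Entry", 0 };

        setValue(state, names, "updatedBy", "Alice"); // hypothetical user
        System.out.println(Arrays.toString(state));
    }
}
```

Because the lookup is by name, entities that lack a given audit property are simply left untouched, which is why the interceptor can be registered globally.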
August 27, 2008
by Scott Leberknight
· 103,425 Views · 2 Likes
article thumbnail
SelectMany: Probably The Most Powerful LINQ Operator
Hi there, back again. Hope everyone is already exploiting the power of LINQ on a fairly regular basis. Okay, everyone knows by now how simple LINQ queries with a where and select (and orderby, and Take and Skip and Sum, etc.) are translated from a query comprehension into an equivalent expression for further translation:

from p in products
where p.Price > 100
select p.Name

becomes

products.Where(p => p.Price > 100).Select(p => p.Name)

All blue syntax highlighting has gone; the compiler is happy with what remains and takes it from there in a left-to-right fashion (so, it depends on the signature of the found Where method whether or not we take the route of anonymous methods or, in case of an Expression<…> signature, the route of expression trees).

But let's make things slightly more complicated and abstract:

from i in a
from j in b
where i > j
select i + j

It's more complicated because we have two from clauses; it's more abstract because we're using names with no intrinsic meaning. Let's assume a and b are IEnumerable<int> sequences in what follows. Actually, what the above query means in abstract terms is:

(a X b).Where((i, j) => i > j).Select((i, j) => i + j)

where X is a hypothetical Cartesian product operator, i.e. given a = { 1, 4, 7 } and b = { 2, 5, 8 }, it produces { (1,2), (1,5), (1,8), (4,2), (4,5), (4,8), (7,2), (7,5), (7,8) }, that is, all possible pairs of an element from the first sequence with an element from the second sequence. For the record, the generalized form of such a pair, having any number of elements, would be a tuple. If we had this capability, Where would get a sequence of such tuples, and it could identify a tuple in its lambda expression as a set of parameters (i, j). Similarly, Select would do the same and everyone would be happy. You can verify the result would be { 6, 9, 12 }.

Back to reality now: we don't have the direct equivalent of Cartesian product in a form that produces tuples.
In addition to this, the Where operator in LINQ has a signature like this:

IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate)

where the predicate parameter is a function of one, and only one, argument. The lambda (i, j) => i > j isn't compatible with this since it has two arguments. A similar remark holds for Select. So, how can we get around this restriction? SelectMany is the answer.

Demystifying SelectMany

What's the magic SelectMany all about? Where could we better start our investigation than by looking at one of its signatures?

IEnumerable<TResult> SelectMany<TSource, TCollection, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, IEnumerable<TCollection>> collectionSelector,
    Func<TSource, TCollection, TResult> resultSelector)

Wow, might be a little overwhelming at first. What does it do? Given a sequence of elements (called source) of type TSource, it asks every such element (using collectionSelector) for a sequence of (in some way related) elements of type TCollection. Next, it combines the currently selected TSource element with each of the TCollection elements in the returned sequence and feeds each pair into resultSelector to produce a TResult that is returned. Still not clear? The implementation says it all and is barely three lines:

foreach (TSource item in source)
    foreach (TCollection subItem in collectionSelector(item))
        yield return resultSelector(item, subItem);

This already gives us a tremendous amount of power. Here's a sample:

products.SelectMany(p => p.Categories, (p, c) => p.Name + " has category " + c.Name)

How can we use this construct to translate multiple from clauses, you might wonder? Well, nothing requires the function passed in as the first argument (really the second after rewriting the extension method, i.e. the collectionSelector) to use its TSource argument to determine the IEnumerable<TCollection> result.
For example:

products.SelectMany(p => new int[] { 1, 2, 3 }, (p, i) => p.Name + " with irrelevant number " + i)

will produce a sequence of strings like "Chai with irrelevant number 1", "Chai with irrelevant number 2", "Chai with irrelevant number 3", and similar for all subsequent products. This sample doesn't make sense but it illustrates that SelectMany can be used to form a Cartesian product-like sequence. Let's focus on our initial sample:

var a = new [] { 1, 4, 7 };
var b = new [] { 2, 5, 8 };

from i in a
from j in b
select i + j;

I've dropped the where clause for now to simplify things a bit. With our knowledge of SelectMany above we can now translate the LINQ query into:

a.SelectMany(i => b, …)

This means: for every i in a, "extract" the sequence b and feed it into …. What's the …'s signature? Something from a (i.e. an int) and something from the result of the collectionSelector (i.e. an int from b) is mapped onto some result. Well, in this case we can combine those two values by summing them, therefore translating the select clause in one go:

a.SelectMany(i => b, (i, j) => i + j)

What happens when we introduce a seemingly innocent where clause in between?

from i in a
from j in b
where i > j
select i + j;

The first two lines again look like:

a.SelectMany(i => b, …)

However, going forward from there we'll need to be able to reference i (from a) and j (from b) in both the where and select clauses that follow, but the corresponding Where and Select methods only take in "single values":

IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate);
IEnumerable<R> Select<T, R>(this IEnumerable<T> source, Func<T, R> projection);

So what can we do to combine the values i and j into one single object?
Right, use an anonymous type:

a.SelectMany(i => b, (i, j) => new { i = i, j = j })

This produces a sequence of objects that have two public properties "i" and "j". Since the type is anonymous we don't care much about casing, and indeed the type never bubbles up to the surface in the query above, because of what follows:

a.SelectMany(i => b, (i, j) => new { i = i, j = j })
 .Where(anon => anon.i > anon.j)
 .Select(anon => anon.i + anon.j)

In other words, all references to i and j in the where and select clauses in the original query expression have been replaced by references to the corresponding properties in the anonymous type spawned by SelectMany.

Lost in translation

This whole translation of the little query above puts quite some work on the shoulders of the compiler (assuming a and b are IEnumerable<int> and nothing more, i.e. no IQueryable<int>):

  • The lambda expression i => b captures variable b, hence a closure is needed.
  • That same lambda expression acts as a parameter to SelectMany, so an anonymous method will be created inside the closure class.
  • For new { i = i, j = j } an anonymous type needs to be generated.
  • SelectMany's second argument, Where's first argument and Select's first argument are all lambda expressions that generate anonymous methods as well.

As a little hot summer evening exercise, I wrote all of this plumbing manually to show how much code would be needed in C# 2.0 minus closures and anonymous methods (more or less C# 1.0 plus generics).
Here's where we start from:

class Q
{
    IEnumerable<int> GetData(IEnumerable<int> a, IEnumerable<int> b)
    {
        return from i in a
               from j in b
               where i > j
               select i + j;
    }
}

This translates into:

class Q
{
    IEnumerable<int> GetData(IEnumerable<int> a, IEnumerable<int> b)
    {
        Closure0 __closure = new Closure0();
        __closure.b = b;

        return Enumerable.Select(
            Enumerable.Where(
                Enumerable.SelectMany(
                    a,
                    new Func<int, IEnumerable<int>>(__closure.__selectMany1),
                    new Func<int, int, Anon0<int, int>>(__selectMany2)
                ),
                new Func<Anon0<int, int>, bool>(__where1)
            ),
            new Func<Anon0<int, int>, int>(__select1)
        );
    }

    private class Closure0
    {
        public IEnumerable<int> b;

        public IEnumerable<int> __selectMany1(int i) { return b; }
    }

    private static Anon0<int, int> __selectMany2(int i, int j) { return new Anon0<int, int>(i, j); }

    private static bool __where1(Anon0<int, int> anon) { return anon.i > anon.j; }

    private static int __select1(Anon0<int, int> anon) { return anon.i + anon.j; }

    // generics allow reuse of type for all anonymous types with 2 properties,
    // hence the use of EqualityComparers in the implementation
    private class Anon0<TI, TJ>
    {
        private readonly TI _i;
        private readonly TJ _j;

        public Anon0(TI i, TJ j) { _i = i; _j = j; }

        public TI i { get { return _i; } }
        public TJ j { get { return _j; } }

        public override bool Equals(object o)
        {
            Anon0<TI, TJ> anonO = o as Anon0<TI, TJ>;
            return anonO != null
                && EqualityComparer<TI>.Default.Equals(_i, anonO._i)
                && EqualityComparer<TJ>.Default.Equals(_j, anonO._j);
        }

        public override int GetHashCode()
        {
            // lame quick-and-dirty hash code
            return EqualityComparer<TI>.Default.GetHashCode(_i) ^ EqualityComparer<TJ>.Default.GetHashCode(_j);
        }

        public override string ToString()
        {
            // lame without StringBuilder
            return "{ i = " + i + ", j = " + j + " }";
        }
    }
}

Just a little thought… Would you like to go through this burden to write a query? "Syntactical sugar" might have a bad connotation to some, but it can be oh so sweet, baby!

Bind in disguise

Fans of "monads", a term from category theory that has yielded great results in the domain of functional programming as a way to make side-effects explicit through the type system (e.g. the IO monad in Haskell), will recognize that SelectMany's (limited) signature matches the one of bind:

IEnumerable<TResult> SelectMany<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, IEnumerable<TResult>> collectionSelector)

corresponds to:

(>>=) :: M x -> (x -> M y) -> M y

which is Haskell's bind operator. For those familiar with Haskell, the "do" notation, which allows the visual illusion of embedding the semi-colon and curly brace style of "imperative programming" in Haskell code, is syntactical sugar on top of this operator, defined (recursively) as follows:

do { e }            = e
do { e; s }         = e >>= \_ -> do { s }
do { x <- e; s }    = e >>= (\x -> do { s })
do { let x = e; s } = let x = e in do { s }

Rename >>= to SelectMany, replace M x by IEnumerable<x> and assume a non-curried form, and you end up with:

SelectMany :: (IEnumerable<x>, x -> IEnumerable<y>) -> IEnumerable<y>

Identifying x with TSource, y with TResult and turning a -> b into Func<a, b> yields:

SelectMany :: Func<IEnumerable<TSource>, Func<TSource, IEnumerable<TResult>>, IEnumerable<TResult>>

and you get exactly the same signature as the SelectMany we started from. For the curious, M in the original form acts as a type constructor, something the CLR doesn't support since it lacks higher-kinded polymorphism; it's yet another abstraction one level higher than generics that math freaks love to use in category theory. The idea is that if you can prove laws to be true in some "structure" and you can map that structure onto another "target structure" by means of some mapping function, corresponding laws will hold true in the "target structure" as well. For instance, ({ even, odd }, +) and ({ pos, neg }, *) can be mapped onto each other pairwise and recursively, making it possible to map laws from the first one to the second one, e.g.

even + odd -> odd
pos * neg -> neg

This is a largely simplified sample of course; I'd recommend everyone who's interested to get a decent book on category theory to get into the gory details.
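The same bind shape shows up outside .NET. As a rough analogy only (not something the original post covers), Java's Stream.flatMap plays the role of SelectMany, and the desugared query with the where clause can be sketched as below. The class names SelectManySketch and Pair are hypothetical; Pair stands in for the compiler-generated anonymous type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class SelectManySketch {

    // Stand-in for the anonymous type new { i = i, j = j }.
    static final class Pair {
        final int i;
        final int j;

        Pair(int i, int j) {
            this.i = i;
            this.j = j;
        }
    }

    // Mirrors: from i in a from j in b where i > j select i + j
    static List<Integer> query(List<Integer> a, List<Integer> b) {
        return a.stream()
                // SelectMany(i => b, (i, j) => new { i, j })
                .flatMap(i -> b.stream().map(j -> new Pair(i, j)))
                // Where(anon => anon.i > anon.j)
                .filter(anon -> anon.i > anon.j)
                // Select(anon => anon.i + anon.j)
                .map(anon -> anon.i + anon.j)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 4, 7);
        List<Integer> b = Arrays.asList(2, 5, 8);
        System.out.println(query(a, b));
    }
}
```

With a = { 1, 4, 7 } and b = { 2, 5, 8 } only the pairs (4,2), (7,2) and (7,5) survive the filter, which gives the { 6, 9, 12 } result mentioned earlier.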
A word of caution

Now that you know how SelectMany works, can you think of a possible implication when selecting from multiple sources? Let me give you a tip: nested foreachs. This is an uninteresting sentence that acts as a placeholder in the time space while you're thinking about the question. Got it? Indeed, order matters. Writing the following two lines of code produces different queries with radically different execution patterns:

from i in a from j in b …
from j in b from i in a …

Those roughly correspond to:

foreach (var i in a)
    foreach (var j in b)
        …

versus

foreach (var j in b)
    foreach (var i in a)
        …

But isn't this much ado about nothing? No, not really. What if iterating over b is much more costly than iterating over a? For example,

from p in localCollectionOfProducts
from c in sqlTableOfCategories
…

This means that for every product iterated locally, we'll reach out to the database to iterate over the (retrieved) categories. If both were local, there wouldn't be a problem of course; if both were remote, the (e.g.) SQL translation would take care of it to keep the heavy work on the remote machine. If you want to see the difference yourself, you can use the following simulation:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

class Q
{
    static void Main()
    {
        Stopwatch sw = new Stopwatch();

        Console.WriteLine("Slow first");
        sw.Start();
        foreach (var s in Perf(Slow(), Fast()))
            Console.WriteLine(s);
        sw.Stop();
        Console.WriteLine(sw.Elapsed);

        sw.Reset();

        Console.WriteLine("Fast first");
        sw.Start();
        foreach (var s in Perf(Fast(), Slow()))
            Console.WriteLine(s);
        sw.Stop();
        Console.WriteLine(sw.Elapsed);
    }

    static IEnumerable<string> Perf<T1, T2>(IEnumerable<T1> a, IEnumerable<T2> b)
    {
        return from i in a
               from j in b
               select i + "," + j;
    }

    static IEnumerable<int> Slow()
    {
        Console.Write("Connecting... ");
        Thread.Sleep(2000); // mimic query overhead (e.g. remote server)
        Console.WriteLine("Done!");
        yield return 1;
        yield return 2;
        yield return 3;
    }

    static IEnumerable<char> Fast()
    {
        return new [] { 'a', 'b', 'c' };
    }
}

This produces:

[Screenshot of the console output for the "Slow first" and "Fast first" runs]

Obviously, it might be the case that you're constructing a query that can only execute by reaching out to the server multiple times, e.g. because the order of the result matters (see the screenshot above for an illustration of the ordering influence, although some local sorting operation might help to satisfy such a requirement) or because the second query source depends on the first one (from i in a from j in b(i) …). There's no silver bullet for a solution, but knowing what happens underneath the covers certainly provides the necessary insights to come up with scenario-specific solutions. Happy binding!
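The cost asymmetry can also be shown without timers by counting how often the expensive source is opened. The following is a small sketch in Java rather than C#, written only to illustrate the nested-loop behaviour; OrderingSketch, its connections counter, and the slow and fast suppliers are hypothetical stand-ins for the Slow() and Fast() iterators above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

final class OrderingSketch {

    // Number of times the "remote" source was opened.
    static int connections = 0;

    // Stand-in for the expensive source: every call counts as one connection.
    static List<Integer> slow() {
        connections++;
        return Arrays.asList(1, 2, 3);
    }

    // Stand-in for the cheap local source.
    static List<Character> fast() {
        return Arrays.asList('a', 'b', 'c');
    }

    // Mirrors "from i in outer from j in inner select i + "," + j":
    // the inner source is requested again for every outer element.
    static <A, B> List<String> cross(Supplier<List<A>> outer, Supplier<List<B>> inner) {
        List<String> out = new ArrayList<>();
        for (A i : outer.get()) {
            for (B j : inner.get()) {
                out.add(i + "," + j);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        connections = 0;
        cross(OrderingSketch::slow, OrderingSketch::fast);
        System.out.println("Slow first: " + connections + " connection(s)");

        connections = 0;
        cross(OrderingSketch::fast, OrderingSketch::slow);
        System.out.println("Fast first: " + connections + " connection(s)");
    }
}
```

Putting the expensive source in the outer position opens it once; putting it in the inner position opens it once per outer element, at the price of a different result order.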
August 20, 2008
by Bart De Smet
· 135,300 Views · 1 Like
article thumbnail
ASP.NET - Query Strings - Client Side State Management
Continuing the tour of ASP.NET client side state management, our current stop is the query string technique. You can read my previous posts on the state management subject at the following links:

  • Client side state management introduction
  • ViewState technique
  • Hidden fields technique

What are Query Strings?

Query strings are data appended to the end of a page URL. They are commonly used to hold data like page numbers, search terms, or other data that isn't confidential. Unlike ViewState and hidden fields, the values held in a query string are visible to the user without special operations like View Source. An example of a query string can look like http://www.srl.co.il?a=1&b=2. Query strings are included in bookmarks and in URLs that you pass in an e-mail. They are the only way to save a page state when copying and pasting a URL.

The Query String Structure

As written earlier, query strings are appended to the end of a URL. First a question mark is appended to the URL's end, and then every parameter that we want to hold in the query string. Each parameter consists of the parameter name, followed by the = symbol, which is followed by the data to hold. Parameters are separated from each other with the ampersand symbol. You should always use the HttpUtility.UrlEncode method on the data itself before appending it.

Query String Limitations

You can use the query string technique when passing from one page to another, but that is all. If the first page needs to pass non-secure data to the other page, it can build a URL with a query string and then redirect. You should always keep in mind that a query string isn't secure, and therefore always validate the data you receive. There are a few browser limitations when using query strings. For example, some browsers impose a length limitation on the query string. Another limitation is that query strings are passed only in the HTTP GET command.
How To Use Query Strings

When you need to use query string data, you do it in the following way:

string queryStringData = Request.QueryString["data"];

In the example I extract a query string parameter named data. The structure of the URL can look like url?data=something. After getting the data parameter value, you should validate it in order to avoid security breaches. The next example is code that helps inject a query string into a URL:

using System.Collections;
using System.Collections.Specialized;
using System.Text;
using System.Web;

public string BuildQueryString(string url, NameValueCollection parameters)
{
    StringBuilder sb = new StringBuilder(url);
    sb.Append("?");

    // enumerating a NameValueCollection yields its keys
    IEnumerator enumerator = parameters.GetEnumerator();
    while (enumerator.MoveNext())
    {
        // get the current query parameter
        string key = enumerator.Current.ToString();

        // insert the parameter into the url
        sb.Append(string.Format("{0}={1}&", key, HttpUtility.UrlEncode(parameters[key])));
    }

    // remove the last ampersand (or the question mark if there were no parameters)
    sb.Remove(sb.Length - 1, 1);
    return sb.ToString();
}

Summary

To sum up the post, the query string is another ASP.NET client side state management technique. It is most helpful for page number state or search terms. The technique isn't secure, so avoid using it with confidential data. In the next post in this series I'll explain how to use cookies.
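The structure described above (a question mark, then name=value pairs joined by ampersands, with the data URL-encoded) is not specific to ASP.NET. As a rough cross-check, here is the same builder sketched in Java using the standard URLEncoder class. The class name QueryStringSketch and the example.com URL are placeholders invented for the illustration, and unlike the C# version this sketch also encodes the parameter names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryStringSketch {

    // Appends "?name=value&name=value" to url, URL-encoding names and values.
    static String buildQueryString(String url, Map<String, String> parameters) {
        if (parameters.isEmpty()) {
            return url; // nothing to append, so leave the URL untouched
        }
        StringBuilder sb = new StringBuilder(url).append('?');
        for (Map.Entry<String, String> e : parameters.entrySet()) {
            sb.append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue()))
              .append('&');
        }
        sb.setLength(sb.length() - 1); // remove the last ampersand
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on the Java platform
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> parameters = new LinkedHashMap<>();
        parameters.put("page", "2");
        parameters.put("q", "state management");
        System.out.println(buildQueryString("http://example.com/search", parameters));
    }
}
```

A LinkedHashMap is used so that the parameters appear in insertion order; whichever language is used, the receiving page still has to validate every value it reads back.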
July 20, 2008
by Gil Fink
· 77,653 Views
article thumbnail
GWT Basic Project Structure And Components
The core of every GWT project is the project layout and the basic components required: host pages, entry points, and modules. To begin a GWT project, you need to create the default layout and generate the initial files. The easiest way to do this is to use the provided ApplicationCreator tool.

Generating a project

ApplicationCreator is provided by GWT to create the default starting points and layout for a GWT project. ApplicationCreator, like the GWT shell, supports several command-line parameters, which are listed in table 1.

ApplicationCreator [-eclipse projectName] [-out dir] [-overwrite] [-ignore] className

Table 1 ApplicationCreator command-line parameters

    Parameter    Description
    -eclipse     Creates a debug launch configuration for the named eclipse project
    -out         The directory to which output files will be written (defaults to the current directory)
    -overwrite   Overwrites any existing files
    -ignore      Ignores any existing files; does not overwrite
    className    The fully qualified name of the application class to be created

To stub out an example calculator project, we'll use ApplicationCreator based on a relative GWT_HOME path, and a className of com.manning.gwtip.calculator.client.Calculator, as follows:

mkdir [PROJECT_HOME]
cd [PROJECT_HOME]
[GWT_HOME]/applicationCreator com.manning.gwtip.calculator.client.Calculator

GWT_HOME
It is recommended that you establish GWT_HOME as an environment variable referring to the filesystem location where you have unpacked GWT. Additionally, you may want to add GWT_HOME to your PATH for further convenience. We use GWT_HOME when referencing the location where GWT is installed and PROJECT_HOME to refer to the location of the current project.

PATH SEPARATORS
For convenience, when referring to filesystem paths, we'll use forward slashes, which work for two-thirds of supported GWT platforms.
If you are using Windows, please adjust the path separators to use backward slashes.

Running ApplicationCreator as described creates the default src directory structure and the starting-point GWT file resources.

The standard directory structure

Even though it's quite simple, the GWT layout is very important because the toolkit can operate in keeping with a Convention over Configuration design approach. As we'll see, several parts of the GWT compilation process make assumptions about the default layout. Because of this, not everything has to be explicitly defined in every instance (which cuts down on the amount of configuration required).

Taking a look at the output of the ApplicationCreator script execution, you will see a specific structure and related contents, as shown in listing 1. This represents the default configuration for a GWT project.

Listing 1 ApplicationCreator output, showing the default GWT project structure:

src
src/com
src/com/manning
src/com/manning/gwtip
src/com/manning/gwtip/calculator
src/com/manning/gwtip/calculator/Calculator.gwt.xml
src/com/manning/gwtip/calculator/client
src/com/manning/gwtip/calculator/client/Calculator.java
src/com/manning/gwtip/calculator/public
src/com/manning/gwtip/calculator/public/Calculator.html
Calculator-shell.sh
Calculator-compile.sh

The package name, com.manning.gwtip.calculator, is represented in the structure as a series of subdirectories in the src tree. This is the standard Java convention, and there are notably separate client and public subdirectories within.

The client directory is intended for resources that will be compiled into JavaScript. Client items are translatable, or serializable, and will ultimately be downloaded to a client browser; these are Java resources in the source. The client package is known in GWT terminology as the source path.

The public directory denotes files that will also be distributed to the client, but that do not require compilation and translation to JavaScript.
This typically includes CSS, images, static HTML, and any other such assets that should not be translated, including existing JavaScript. The public package is known as the public path. Note that our client-side example does not use any server resources, but GWT does include the concept of a server path/package for server-side resources. Figure 1 illustrates this default GWT project layout.

[Figure 1: the default GWT project layout]

ApplicationCreator generates the structure and a required set of minimal files for a GWT project. The generated files include the XML configuration module definition, the entry point Java class, and the HTML host page. These are some of the basic GWT project concepts. Along with the module definition, entry point, and host page, some shortcut scripts have also been created for use with the GWTShell and GWTCompiler tools. These scripts run the shell and compiler for the project. Table 2 lists all of the files created by ApplicationCreator: the basic resources and shortcut scripts needed for a GWT project.

Table 2 ApplicationCreator-generated initial project files that serve as a starting point for GWT applications

    File                                  Name                     Purpose
    GWT module file                       ProjectName.gwt.xml      Defines the project configuration
    Entry point class                     ProjectName.java         Starting class invoked by the module
    Host page                             ProjectName.html         Initial HTML page that loads the module
    GWTShell shortcut invoker script      ProjectName-shell.sh     Invokes GWTShell for the project
    GWTCompiler shortcut invoker script   ProjectName-compile.sh   Invokes GWTCompiler for the project

The starting points ApplicationCreator provides essentially wire up all the moving parts for you and stub out your project. You take it from there and modify these generated files to begin building a GWT application. If the toolkit did not provide these files via ApplicationCreator, getting a project started, at least initially, would be much more time consuming and confusing.
Once you are experienced in the GWT ways, you may wind up using other tools to kick off a project: an IDE plugin, a Maven “archetype,” or your own scripts. ApplicationCreator, though, is the helpful default.

The contents and structure that ApplicationCreator provides are themselves a working GWT “hello world” example. You get “hello world” for free, out of the box. "Hello world", however, is not that interesting. The connection of all the moving parts is what is really important: how a host page includes a module, how a module describes project resources, and how an entry point invokes project code. These concepts are applicable to all levels of GWT projects, the basic ones and beyond. Understanding these parts is key to gaining an overall understanding of GWT. Next, we'll take a closer look at each of these concepts, beginning with the host page.

Host pages

A host page is the initial HTML page that invokes a GWT application. A host page contains a script tag that references a special GWT JavaScript file, Module.nocache.js. This JavaScript file, which the toolkit provides when you compile your project, kicks off the GWT application loading process.

Along with the script reference that loads the project resources, you can also specify several GWT-related tags in the host page. These tag options are not present in the default host page created by ApplicationCreator, but it's still important to be aware of them. The GWT tags that are supported in a host page are listed in table 3, as a reference.

Table 3 GWT tags supported in host pages

    Meta tag:  gwt:module
    Syntax:    <meta name="gwt:module" content="_qualified-module-name_">
    Purpose:   (Legacy, pre GWT 1.4.) Specifies the module to be loaded

    Meta tag:  gwt:property
    Syntax:    <meta name="gwt:property" content="_name_=_value_">
    Purpose:   Statically defines a deferred binding client property

    Meta tag:  gwt:onPropertyErrorFn
    Syntax:    <meta name="gwt:onPropertyErrorFn" content="_fnName_">
    Purpose:   Specifies the name of a function to call if a client property is set to an invalid value (meaning that no matching compilation will be found)

    Meta tag:  gwt:onLoadErrorFn
    Syntax:    <meta name="gwt:onLoadErrorFn" content="_fnName_">
    Purpose:   Specifies the name of a function to call if an exception happens during bootstrapping or if a module throws an exception out of onModuleLoad(); the function should take a message parameter

Thus, a host page includes a script reference that gets the GWT process started and refers to all the required project resources. The required resources for a project are assembled by the GWT compilation process, and are based on the module configuration.

Modules

GWT applications inhabit a challenging environment. This is partly because of the scope of responsibility GWT has elected to take on and partly because of the Internet landscape. Being a rich Internet-based platform and using only the basic built-in browser support for HTML, CSS, and JavaScript makes GWT quite elegant and impressive, but this combination is tough to achieve. Browsers that are “guided” by standards, but that don't always stick to them, add to the pressure. Couple that environment with an approach that aims to bring static types, code standards, profiling and debugging, inheritance, and reuse to the web tier, and you have a tall order.

To help with this large task, GWT uses modules as configuration and execution units that handle discrete areas of responsibility. Modules enable the GWT compiler to optimize the Java code it gets fed, create variants for all possible situations from a single code base, and make inheritance and property support possible.

One of the most important resources generated by the ApplicationCreator is the Module.gwt.xml module descriptor for your project. This file exists in the top-level directory of your project's package and provides a means to define resource locations and structure.
In a default generated module file, there are only two elements: <inherits> and <entry-point>. An <inherits> element simply includes the configuration for another named GWT module in the current definition, and <entry-point> defines a class that kicks things off and moves from configuration to code. Table 4 provides an overview of the most common GWT module descriptor elements.

Table 4 A summary of the most common elements supported by the GWT module descriptor

Module element | Description
<inherits> | Identifies additional GWT modules that should be inherited into the current module
<entry-point> | Specifies which EntryPoint class should be invoked when starting a GWT project
<source> | Identifies where the source code that should be translated into JavaScript by the GWT compiler is located
<public> | Identifies where assets that are not translatable source code, such as images and CSS files, are located
July 14, 2008
by Schalk Neethling
· 32,407 Views
Glimmer - Using Ruby to Build SWT User Interfaces
Glimmer is a JRuby DSL that enables easy and efficient authoring of user-interfaces using the robust platform-independent Eclipse SWT library. Glimmer comes with built-in data-binding support to greatly facilitate synchronizing UI with domain models. The goal of the Glimmer project is to create a JRuby framework on top of Eclipse technologies to enable easy and efficient authoring of desktop applications by taking advantage of the Ruby language. With Glimmer having just become an Eclipse project, it's a good time to find out more. Philosophy Glimmer's design philosophy can be summarized as follows: Concise and DRY Asks for minimum info needed to accomplish task Convention over configuration As predictable as possible for existing SWT developers Conventions Since Glimmer relies on Ruby, it is different in its syntax and conventions from what typical Java SWT developers would expect: Method parentheses are optional Java-vs-Ruby example: show() => show Method names follow underscored syntax Java-vs-Ruby example: addListener => add_listener Classes are constructed using the new(...) method (as opposed to new keyword): Java-vs-Ruby example: new GridLayout() => GridLayout.new Download Please download Glimmer from RubyForge: https://rubyforge.org/projects/glimmer/ NOTE: Glimmer is moving to Eclipse.org. Please visit http://andymaleh.blogspot.com for up-to-date news on the move and the upcoming download location on the Eclipse website. Installation Extract the Glimmer zip file and follow the installation instructions in the README file. NOTE: While Glimmer is platform-independent, its functionality has only been verified on Windows. Feedback from Mac and Linux users would be greatly appreciated. 
Tutorial

Let's start with a very simple Glimmer Hello World example:

    shell {
      label {
        text "Hello World!"
      }
    }

This will render the following: [screenshot]

In the SWT library, a shell represents an application's window. It acts as a frame around the application widgets, which are visual components that display information and/or enable interaction with the user. One widget that was used in the Hello World example is the label widget, which simply displays text on the screen. Shell is also considered a widget, except it is a special kind of widget called a composite.

The shell keyword, which declared the application's shell, was followed by a block of code encased in curly braces. This block contains the shell content declarations, such as the Hello World label. The label keyword was also followed by a block of code. However, this block contained a property declaration for the label, stating that the text value is “Hello World!”.

So, to declare a widget, simply state its name followed by a block of code. The block may specify property values or nest other widget declarations for composite widgets.
Now, let's move on to a more advanced example: shell { text "User Profile" composite { layout GridLayout.new(2, false) group { text "Name" layout GridLayout.new(2, false) layout_data GridData.new(fill, fill, true, true) label {text "First"}; text {text "Bullet"} label {text "Last"}; text {text "Tooth"} } group { layout_data GridData.new(fill, fill, true, true) text "Gender" button(radio) {text "Male"; selection true} button(radio) {text "Female"} } group { layout_data GridData.new(fill, fill, true, true) text "Role" button(check) {text "Student"; selection true} button(check) {text "Employee"; selection true} } group { text "Experience" layout RowLayout.new layout_data GridData.new(fill, fill, true, true) spinner {selection 5}; label {text "years"} } button { text "save" layout_data GridData.new(right, center, true, true) } button { text "close" layout_data GridData.new(left, center, true, true) } } }.open This will render the following: [img_assist|nid=3587|title=|desc=|link=none|align=undefined|width=195|height=209] The example contains a variation of widgets from SWT: Composite: a widget that can simply contain other widgets and manage their layout Group: Similar to Composite except that it usually has a border and a title. 
  • Text Field: Enables the user to type in text information
  • Checkbox Button: Allows the user to make a selection from different options
  • Radio Button: Allows the user to make a selection between options that are mutually exclusive
  • Spinner: Enables the user to type in numeric information or spin a number selection with the mouse
  • Push Button: Enables the user to initiate actions

Given that Glimmer relies on the Eclipse SWT library, developers may consult the SWT API as a reference on all the widgets, including their properties and layout options: http://help.eclipse.org/stable/nftopic/org.eclipse.platform.doc.isv/reference/api/index.html

Keep in mind the following rules when reading the SWT API:

  • Any widget available in SWT, including custom widgets written by developers, can be accessed from Glimmer by downcasing/underscoring the widget's name (e.g. Composite -> composite, LabeledText -> labeled_text).
  • Properties available on SWT widgets are specified by listing them followed by their values, each on a line or separated by semicolons within the widget's block (e.g. label {text "Username:"; font some_font}).
  • Property names are also downcased/underscored in Glimmer.

SWT widgets must have a style value specified, which is a constant available on the “SWT” class. Glimmer generally hides that by relying on smart defaults. Here is a listing of the defaults configured in Glimmer:

  • text: SWT::BORDER
  • table: SWT::BORDER
  • spinner: SWT::BORDER
  • button: SWT::PUSH

Nonetheless, to customize a widget, a style value may optionally be specified within parentheses after the widget name. For example, “button(SWT::RADIO)” renders a radio button and “button(SWT::CHECK)” renders a checkbox button.

Glimmer also has syntactic sugar for specifying the style. Simply state the name of the style in the standard Ruby downcased/underscored format without the “SWT::” prefix. For example, button(SWT::RADIO) becomes button(radio).
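For readers coming from the Java side, the downcasing/underscoring rule described above can be sketched in a few lines of plain Java. This is only an illustration of the naming convention: GlimmerNames and underscore are hypothetical names, not part of Glimmer or SWT, and Glimmer's own lookup may handle edge cases differently.

```java
import java.util.Locale;

// Hypothetical helper (not part of Glimmer or SWT): it only mimics the
// naming rule described in the text.
final class GlimmerNames {

    // Put an underscore at every lower-to-upper boundary, then downcase.
    static String underscore(String javaName) {
        return javaName
                .replaceAll("([a-z0-9])([A-Z])", "$1_$2")
                .toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String[] names = {"Composite", "GridLayout", "addListener", "RADIO"};
        for (String name : names) {
            System.out.println(name + " -> " + underscore(name));
        }
    }
}
```

Under this rule a widget class such as GridLayout maps to grid_layout, a method such as addListener maps to add_listener, and an all-caps style constant such as RADIO simply becomes radio.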
SWT composite widgets, such as shell, composite, and group can have a layout manager that lays out child widgets according to a certain pattern without the need to specify the (x, y) position of each child widget explicitly. Layout managers come in many flavors, such as GridLayout, offering a grid-like layout; FillLayout, allowing child widgets to fill the whole available area; and RowLayout, rendering child widgets one after the other in a row by default. Glimmer is configured with smart defaults for layout managers too: shell: FillLayout composite: GridLayout with one column group: GridLayout with one column GridLayout is a particularly useful SWT layout, so I will go over it in a little more detail here. GridLayout allows you to lay widgets out in a grid similar to HTML tables. To instantiate a custom GridLayout, you must specify the number of columns and whether they are of equal width or not. Here is a block of code demonstrating a group box having a GridLayout with 2 columns of unequal width: group { layout GridLayout.new(2, false) } Now, suppose we add four elements to that group box: group { layout GridLayout.new(2, false) label {text "First"}; text {text "Bullet"} label {text "Last"}; text {text "Tooth"} } The specified GridLayout will lay out the child widgets in the grid from left to right and top to bottom: The label with the text “First” will go into the 1st column of the 1st row. The text box with the text “Bullet” will go into the 2nd column of the 1st row. The label with the text “Last” will go into the 1st column of the 2nd row. The text box with the text “Tooth” will go into the 2nd column of the 2nd row. The group was actually a part of the advanced example illustrated earlier. It was given a title (by specifying the text attribute,) and the widget declarations were written in a way that maps visually to how they appear on the screen. 
Notice how text box declarations are on the same line as the label declarations since both the label and text box go under the same row, which helps improve code readability and maintainability: group { text "Name" layout GridLayout.new(2, false) layout_data GridData.new(fill, fill, true, true) label {text "First"}; text {text "Bullet"} label {text "Last"}; text {text "Tooth"} } That renders the following: [img_assist|nid=3588|title=|desc=|link=none|align=undefined|width=89|height=73] Layout of specific widgets may be further customized by specifying layout data. For GridLayout, layout data is specified through GridData objects. For example, we may decide to have the text boxes in the previous example have a greater width: group { layout GridLayout.new(2, false) label {text "First"}; text { text "Bullet" layout_data GridData.new(100, default) } label {text "Last"}; text { text "Tooth" layout_data GridData.new(100, default) } } This renders the following: [img_assist|nid=3589|title=|desc=|link=none|align=undefined|width=125|height=73] The used GridData constructor takes two parameters: width hint and height hint. The width was set to 100 pixels for both text boxes. The height was kept at the default value (SWT::DEFAULT) For more details about GridLayout, GridData, and other layout managers, please refer to the SWT API documentation. So far we have covered how to construct user-interfaces that can display data and gather input from the user. Next, we will demonstrate how to perform work based on actions taken by the user. SWT widgets can be monitored for certain user-interface events, such as mouse clicks, focus gain and loss, and key presses. With the original SWT API, events can be monitored by adding listeners to widgets. For example, to monitor the push of a button, you would add a SelectionListener that does some work in its widgetSelected event method. 
With Glimmer, events can be monitored by declaring their name (following Ruby conventions) prefixed by “on”. Here is an example of how to monitor button selection:

    import org.eclipse.swt.widgets.MessageBox

    @shell1 = shell {
      composite {
        button {
          text 'Save'
          on_widget_selected {
            message_box = MessageBox.new(@shell1.widget, SWT::NULL)
            message_box.text = 'Information'
            message_box.message = 'Saved!'
            message_box.open
          }
        }
      }
    }
    @shell1.open

This renders the following: [screenshot]

On click of the button, a message box is opened to let the user know that the information entered is saved.

MessageBox is a class from SWT that represents message dialogs. It was imported using the JRuby import method. Its constructor takes a parent and a style. To obtain the parent, we assigned the shell object to a Ruby instance variable, @shell1. Since Glimmer wraps all SWT constructed objects with Glimmer decorators (e.g. Shell is wrapped with RShell), the widget method was called (e.g. @shell1.widget) to obtain the underlying SWT Shell object and pass it as the parent to the MessageBox constructor.

In the original SWT API, MessageBox has setter methods to set its text and message attributes. However, in JRuby the developer has the option to set them following the Ruby attribute conventions (e.g. message_box.text = 'value') because JRuby automatically enhances all Java objects with methods that follow the Ruby convention.

Another example that benefits from event monitoring is field validation on loss of focus. For example, let's say we are validating the ZIP code on an address form, and we would like to display an error message if its value does not have a valid ZIP code format (e.g.
12345 or 12345-1234). Here is how we would do it with Glimmer (the new label and text widgets go before the button from the previous example):

    import org.eclipse.swt.widgets.MessageBox

    @shell1 = shell {
      composite {
        label { text "ZIP Code" }
        text {
          on_focus_lost { |focus_event|
            zip_code = focus_event.widget.text
            unless zip_code =~ /^\d{5}([-]\d{4})?$/
              message_box = MessageBox.new(@shell1.widget, SWT::NULL)
              message_box.text = 'Validation Error'
              message_box.message = 'Format must match ##### or #####-####'
              message_box.open
              focus_event.widget.set_focus
            end
          }
        }
        button {
          text 'Save'
          on_widget_selected {
            message_box = MessageBox.new(@shell1.widget, SWT::NULL)
            message_box.message = 'Saved!'
            message_box.open
          }
        }
      }
    }
    @shell1.open

Here is what it produces: [screenshot]

Notice how the on_focus_lost block has a FocusEvent object as a parameter. This parameter may be specified optionally whenever some information is needed from the event object. Again, this maps to the focusLost method on the FocusListener interface in the original SWT API, which also takes a FocusEvent object as a parameter.

While widgets in the original SWT API have a setFocus method to grab the user interface focus, in JRuby set_focus may be used instead, following the Ruby naming conventions.

Now, in order to cleanly separate event-driven behavior from user-interface code, we can rely on Glimmer's data-binding support. Stay tuned for the next tutorial, which will cover data-binding and how to achieve clean code separation with the Model-View-Presenter pattern.

References:

  • Glimmer Eclipse Technology Project Proposal: http://www.eclipse.org/proposals/glimmer/
  • Glimmer Newsgroup: http://www.eclipse.org/newsportal/thread.php?group=eclipse.technology.glimmer
  • Glimmer at RubyForge: http://rubyforge.org/projects/glimmer/
  • Author Blog: http://andymaleh.blogspot.com

Andy Maleh (andy at obtiva.com), Senior Consultant, Obtiva Corp.
June 19, 2008
by Andy Maleh
· 60,474 Views
Running the Table With JMesa
Shhhh. I’ll tell you a secret. I don’t like tables.

I know. Shocking, isn't it? Don't get me wrong: I don't dislike tables per se. They're great for displaying tabular material. (For page organization, not so much.) But I so dislike the code needed to build a table within a JSP. It usually comes down to something like this:

    User ID  Name  Email  ${row.userID}  ${row.name}  ${row.email}

All that iterative logic simply looks incomprehensible to me. It's still better than scriptlets or custom tag libraries (both of which were, to be sure, phenomenal in their time), but it's an indigestible mass, and even if I do step through it line by line and understand what it does, I'm still left with just a table. Users accustomed to active, JavaScript-assisted widgets don't respond to tables that just lie there. Many more lines of code will be needed to enable them to do useful things like paginating through long lists of items, sorting by column values, and the like. It'll be an unholy mix of HTML, JSP directives, JSP tags, EL, JavaScript, Java, XML, properties files, and so forth. The whole thing seems so error-prone (note to self: more code + more languages = more "opportunities" for bugs).

But recently I discovered an open-source Java library called JMesa that provides another way. I'm going to share with you some of the things I've found in JMesa, building up an HTML page containing a table from nothing to, well, considerably more than nothing. There's a good deal of code here, to give you a sense of the JMesa API; hopefully, you'll come away with some ideas about how you can use JMesa in your own projects. I won't bother with package declarations, imports, or code not relevant to the point at hand; the complete code is available for download in the form of an Eclipse project. Installation instructions will be found at the end of this article.

Join me in exploring JMesa!
Preparation A Page to Show Before we can get to JMesa, though, we'll need a few things: a page within which to display our table, for instance. In fact, we'll learn even more if we put this page in a context. I have recently fallen in like with Spring MVC and so will use that to build a simple site with a few pages. Just to be clear, while Spring dependency injection and utilities are woven into the code below, JMesa does not depend upon Spring. The pages are not fancy, and I am going to skip most of the setup. Everything is included in the download, of course. One thing I shouldn't skip is the controller for the search results page, the page within which we will build our table. We'll start with pretty much the simplest functionality we can: public class SimpleSearchController extends AbstractController { @Override protected ModelAndView handleRequestInternal(HttpServletRequest request, HttpServletResponse response) throws Exception { return new ModelAndView("simple-results", "results", "Here we will display search results"); } } For those not familiar with Spring MVC, the ModelAndView return value contains a string that will be resolved to a view (in this project, it is resolved to "/WEB-INF/jsp/simple-results.jsp"), and a key-value pair (the second and third constructor arguments) that can be accessed using EL on the JSP page: ${results} Finally, we use the Spring jmesa-servlet.xml configuration file to create and associate a URL with our controller: welcomeController simpleSearchController Clicking on the "Search" link in the menu now produces: [img_assist|nid=3678|title=Figure 1.|desc=A simple page for our table|link=none|align=left|width=757|height=240] All right, not much. But it's the page we need. Something to Display Another thing we need before we can build a table is something to show in it. 
This "domain" object should be pretty easy to display: public class HelloWorld implements Comparable { private int pk; private String hello = "Hello"; private String world = "world"; private String from = "from"; private String firstName; private String lastName; private String format = "{0}, {1}! {2} {3} {4}"; // ... accessors and mutators public String toString() { return MessageFormat.format(getFormat(), hello, world, from, getFirstName(), getLastName()); } // ... implementations of equals, hashCode, and compareTo } Persistence Service Of course, we need instances of this domain object. Normally, we'd get them from a persistence service; for now, we'll just create them in memory: public class HelloWorldService { private int nextId; private Set helloWorlds = new TreeSet(); public HelloWorldService() { nextId = 1; helloWorlds.add(newInstance("Albert", "Einstein")); helloWorlds.add(newInstance("Grazia", "Deledda")); helloWorlds.add(newInstance("Francis", "Crick")); helloWorlds.add(newInstance("Linus", "Pauling")); helloWorlds.add(newInstance("Theodore", "Roosevelt")); helloWorlds.add(newInstance("Hideki", "Yukawa")); helloWorlds.add(newInstance("Harold", "Urey")); helloWorlds.add(newInstance("Barbara", "McClintock")); helloWorlds.add(newInstance("Hermann", "Hesse")); helloWorlds.add(newInstance("Mikhail", "Gorbachev")); helloWorlds.add(newInstance("Amartya", "Sen")); helloWorlds.add(newInstance("Albert", "Gore")); helloWorlds.add(newInstance("Amnesty", "International")); helloWorlds.add(newInstance("Daniel", "Bovet")); helloWorlds.add(newInstance("William", "Faulkner")); helloWorlds.add(newInstance("Otto", "Diels")); helloWorlds.add(newInstance("Marie", "Curie")); } public Set findAll() { return helloWorlds; } private HelloWorld newInstance(String firstName, String lastName) { HelloWorld hw = new HelloWorld(); hw.setPk(nextId++); hw.setFirstName(firstName); hw.setLastName(lastName); return hw; } } That's that. Now we're ready to focus on JMesa. 
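The listing above leaves out the equals, hashCode, and compareTo implementations. Because HelloWorldService keeps its items in a TreeSet, compareTo also decides what counts as a duplicate, so the choice of ordering matters: the sample data contains two people named Albert. Here is a minimal sketch of one way to do it. Person is a simplified stand-in for HelloWorld, and the ordering (last name, first name, then pk) is my assumption, not the code from the download.

```java
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

final class TreeSetDemo {
    static Set<Person> sample() {
        Set<Person> people = new TreeSet<>();
        people.add(new Person(1, "Albert", "Einstein"));
        people.add(new Person(2, "Albert", "Gore"));
        people.add(new Person(3, "Marie", "Curie"));
        return people;
    }

    public static void main(String[] args) {
        for (Person p : sample()) {
            System.out.println(p.lastName + ", " + p.firstName);
        }
    }
}

// Simplified stand-in for the article's HelloWorld class.
final class Person implements Comparable<Person> {
    final int pk;
    final String firstName;
    final String lastName;

    Person(int pk, String firstName, String lastName) {
        this.pk = pk;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    // Assumed ordering: last name, then first name, then pk as a tie-breaker.
    @Override
    public int compareTo(Person other) {
        int byLast = lastName.compareTo(other.lastName);
        if (byLast != 0) {
            return byLast;
        }
        int byFirst = firstName.compareTo(other.firstName);
        if (byFirst != 0) {
            return byFirst;
        }
        return Integer.compare(pk, other.pk);
    }

    // Kept consistent with compareTo, as TreeSet expects.
    @Override
    public boolean equals(Object o) {
        return o instanceof Person && compareTo((Person) o) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(pk, firstName, lastName);
    }
}
```

Had compareTo looked at the first name only, the TreeSet would have treated the second Albert as a duplicate and silently dropped him.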
JMesa Let's start with something extremely simple. On the very first page of the JMesa web site we find four lines of code that we can appropriate and refashion for a Spring controller: public class BasicJMesaSearchController extends AbstractController { private HelloWorldService helloWorldService; public void setHelloWorldService(HelloWorldService helloWorldService) { this.helloWorldService = helloWorldService; } @Override protected ModelAndView handleRequestInternal(HttpServletRequest request, HttpServletResponse response) throws Exception { Set results = helloWorldService.findAll(); TableFacade tableFacade = new TableFacadeImpl("results",request); tableFacade.setItems(results); tableFacade.setColumnProperties("pk", "firstName", "lastName", "format"); return new ModelAndView("results", "results", tableFacade.render()); } } We let Spring inject the HelloWorldService, which we use to retrieve a set of items to display. Then we create and configure the JMesa TableFacade class. This class takes an HTTP request in its constructor: TableFacade is going to send itself messages passed as parameters in the request (more on this in a moment). We supply it with the set of items and with which JavaBean property of those items we want displayed in each column. We'll also need a bit of new code in the search results page (in the project, this is actually a different search results page, as you, oh sharp-eyed reader, have already noticed): ${results} And we'll need to create and point to the new controller in jmesa-servlet.xml: ... basicSearchController Redeploy, and the results look like magic. How did we get them? [img_assist|nid=3679|title=Figure 2.|desc=Using JMesa "out of the box"|link=none|align=left|width=757|height=405] The key is in the variable results, which now holds the entire text of the table generated by the JMesa TableFacade when we called its render method. 
We also put a self-submitting HTML form around the JMesa table that it will use to send itself messages about how to alter itself. This makes possible many amazing features. The table automagically paginates itself. It allows the user to change the number of rows displayed. It allows sorting on any column or combination of columns. It provides color striping of table rows and onMouseOver row highlighting. And every bit of this came for free: we did nothing to enable it but what you have already seen. (OK, we played around with some of JMesa's images and CSS style sheets to make it fit in with our color scheme, but that really shouldn't count.) To demonstrate, we'll use the select at the top of the form to change the number of rows displayed to 16, sort by first name ascending and last name descending (by clicking on the first column header once and the second twice), and mouse over the third row to see the highlighting: [img_assist|nid=3681|title=Figure 3.|desc=JMesa search results sorted and highlighted|link=none|align=left|width=757|height=565] Now Al Gore and Einstein appear in the order we asked for. You will have noticed the images in the table toolbar. Those on the left are standard first, previous, next, and last navigation icons. The select we've already mentioned. But there are two other images as well: these turn filtering, another amazing feature of JMesa that is active by default, on and off. Filtering allows the user to apply expressions to a column in order to display only rows having matching values in that column. While filtering can take setup beyond the scope of this article, even by default it's astonishing. Try typing "Einstein" in the text field that appears above the last-name column header and clicking on the filter icon (the magnifying glass). The results show only the row containing Einstein's name in the last name column. And we didn't have to do a thing! 
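For comparison, the sort order used in the demonstration above (first name ascending, then last name descending) corresponds to a composed comparator in plain Java. This is not how JMesa implements sorting internally, which I have not inspected; it is only a sketch of the ordering the user asked for, and SortDemo and Name are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SortDemo {

    // Minimal row type for the sketch.
    static final class Name {
        final String first;
        final String last;

        Name(String first, String last) {
            this.first = first;
            this.last = last;
        }

        @Override
        public String toString() {
            return first + " " + last;
        }
    }

    // First name ascending, then last name descending.
    static final Comparator<Name> FIRST_ASC_LAST_DESC =
            Comparator.comparing((Name n) -> n.first)
                      .thenComparing((Name n) -> n.last, Comparator.reverseOrder());

    static List<Name> sorted() {
        List<Name> rows = new ArrayList<>(Arrays.asList(
                new Name("Albert", "Einstein"),
                new Name("Marie", "Curie"),
                new Name("Albert", "Gore")));
        rows.sort(FIRST_ASC_LAST_DESC);
        return rows;
    }

    public static void main(String[] args) {
        sorted().forEach(System.out::println);
    }
}
```

With this comparator the two Alberts come first, Gore ahead of Einstein, which matches the order described for Figure 3.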
Figure 4. JMesa search results filtered

See the JMesa web site for details about filtering, editable tables that keep track of your changes for you, and much, much more: it's impressive stuff.

Customizing

And now, to business. The JMesa default is astounding, but no default is ever exactly what you want. The ability to customize is critical. Also, defaults rarely exercise every feature, and this one is no exception. Let's start with some requirements:

  • We will display the value of each HelloWorld item's toString method in an additional column
  • We will display more user-friendly values in the format column
  • We will ensure that columns that cannot be reasonably sorted are made unsortable
  • We will add columns containing links to edit and delete pages for the HelloWorld items
  • We will display images in the edit and delete columns
  • We will not display the pk property of each item, but will pass its value to edit and delete pages as needed
  • We will enable the user to retrieve a comma-separated-values (CSV) copy of the table contents
  • We will enable the user to retrieve an Excel spreadsheet copy of the table contents
  • We will disable filtering and highlighting
  • We will reorganize the toolbar items in a different order

Believe it or not, implementing each of these features will be quite easy, and you'll begin to get a sense for the possibilities of JMesa.

ToString Column

Each HelloWorld item produces a formatted string within its toString method. This is not a JavaBean property method, so we cannot directly point the TableFacade at it. We want this value to be rendered (to use JMesa terminology) as the contents of a <td> (a cell) in each HTML row. Cell contents are produced by implementations of the CellEditor interface. Its getValue method is passed the item to be displayed, the property to be called, and the current row count.
Since only the item itself is actually needed for our purpose, the implementation is simple: public class ToStringCellEditor implements CellEditor { @Override public Object getValue(Object item, String property, int rowcount) { if (item == null) { return ""; } return item.toString(); } } Of course, we'll need a column into which to put the results. All we need do is add an arbitrary value to the column properties list: tableFacade.setColumnProperties("firstName", "lastName", "format", "toString"); This value is used to retrieve the column: Row row = tableFacade.getTable().getRow(); Column column = row.getColumn("toString"); column.getCellRenderer().setCellEditor(new ToStringCellEditor()); Of course, this means that the getValue method of the ToStringCellEditor will always be passed a bogus property value, but since the editor doesn't use it, that's no problem. (Note that we've also left off the pk column as per requirements.) User-Friendly Format Column We continue by introducing a more user-friendly value into the format column. The format string "{0}, {1}! {2} {3} {4}" looks ugly and most likely won't be understood by an end user. The only real information it conveys is that it is the default value. We'll use a Spring MessageSource to supply something a little easier on the eyes at runtime. First, we'll add a property to the messages.properties file loaded by Spring at application startup: format.{0},\ {1}!\ {2}\ {3}\ {4}=Default format (The backslashes are needed to escape the white space in the key.) As we have already seen, a CellEditor is needed to change a cell's displayed value. 
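Before moving on to the editor, the backslash escaping in that messages.properties entry can be checked with nothing but java.util.Properties. This is a minimal sketch: EscapedKeyDemo and lookup are made-up names, and it assumes the message source reads the file with the same key-escaping rules as Properties.load.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class EscapedKeyDemo {

    // Parses the given properties text and returns the value stored under key.
    static String lookup(String fileContent, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The entry from messages.properties; each backslash is doubled in Java source.
        String escaped = "format.{0},\\ {1}!\\ {2}\\ {3}\\ {4}=Default format\n";
        // The same entry with the backslashes left out.
        String unescaped = "format.{0}, {1}! {2} {3} {4}=Default format\n";

        String key = "format.{0}, {1}! {2} {3} {4}";
        System.out.println(lookup(escaped, key));
        System.out.println(lookup(unescaped, key));
    }
}
```

With the backslashes in place the whole format string is the key and the lookup returns "Default format". Without them, the key ends at the first space (leaving just "format.{0},"), so the lookup by the full format string finds nothing.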
Using MessageSource to produce the display value at runtime requires a few more lines than the ToStringCellEditor:

    public class SpringMessageCellEditor implements CellEditor {
        MessageSource source;
        String prefix;
        Locale locale;

        public SpringMessageCellEditor(MessageSource source, String prefix, Locale locale) {
            this.source = source;
            this.prefix = prefix;
            this.locale = locale;
        }

        public Object getValue(Object item, String property, int rowcount) {
            if (item != null) {
                try {
                    // The message key is the prefix plus the property's current value
                    return source.getMessage(prefix + "." + PropertyUtils.getProperty(item, property), null, locale);
                } catch (IllegalAccessException ignore) {
                } catch (InvocationTargetException ignore) {
                } catch (NoSuchMethodException ignore) {
                }
            }
            return null;
        }
    }

We still have to add this editor to the column displaying the format property:

    Column column = row.getColumn("format");
    column.getCellRenderer().setCellEditor(new SpringMessageCellEditor(messageSource, "format", locale));

Unsortable Columns

Next, we want the table to know that some columns are unsortable. Columns are typically sorted by property value, but we just added a column that corresponds to no property and instead displays the output of the toString method. If the user clicked on the header of that column, he or she would wind up with a very ugly NullPointerException message. Making a column unsortable is very simple (actually, we need to have an HtmlColumn, but most columns qualify):

    htmlColumn.setSortable(false);

With this, no onClick handler will be generated for the column header, preventing users from accidentally causing a mess.

Edit and Delete Columns

Now we'll add columns containing links to edit and delete pages for HelloWorld items. I prefer using icons to buttons saying "Edit" and "Delete", as it reduces the amount of textual information the user must process. Tables typically present a lot of information in a compact space, making user overload a problem worthy of attention.
To do this, we'll need a CellEditor (by now, you knew that was coming!). Since this is functionality I use a lot, let's design it for reuse, refactoring out reusable code into one class, and code tailored to this project into another. ImageCellEditor encapsulates the general process of setting up an image with a link, and includes a method that will let subclasses override the default processing of the link: public class ImageCellEditor extends AbstractContextSupport implements CellEditor { private String image; private String alt; private String link; public ImageCellEditor(String image, String alt, String link) { this.image = image; this.alt = alt; this.link = link; } public Object getValue(Object item, String property, int rowcount) { CoreContext context = getCoreContext(); String imagePath = context.getPreference("html.imagesPath"); StringBuilder img = new StringBuilder(); if (link != null && link.trim().length() != 0) { img.append(""); } img.append(""); if (link != null && link.trim().length() != 0) { img.append(""); } return img.toString(); } /** * This method can be overridden by subclasses to handle specific * HTML link needs. */ public String processLink(Object item, String property, int rowcount, String link) { return link; } } This is our opportunity to introduce CoreContext and WebContext, two important classes that plug our code into the JMesa infrastructure. Extending AbstractContextSupport gets us JavaBean property methods for these objects (just a convenience; I could have implemented the interface ContextSupport, but then I would have had to write the property methods myself). The CoreContext has many uses; our immediate purpose for it is to retrieve a value configured in the jmesa.properties file. 
This was pointed to in web.xml: jmesaPreferencesLocation WEB-INF/jmesa.properties It contains a preference called "html.imagesPath" that replaces the default path from which JMesa retrieves images: html.imagesPath=/images/ This means we won't have to hard-code a part of the image URL. (There are a lot more configurable preferences: for details, see the JMesa web site.) The WebContext provides us with the servlet context path, again letting us avoid hard-coding the image URL: getWebContext().getContextPath() Getting back to the two image columns, we have a requirement to pass the Pk property of the appropriate HelloWorld to the edit or delete pages when the images are clicked. Adding this property to the link is easy, using the MessageFormat class to process the link argument of the application-specific subclass: public class HelloWorldImageCellEditor extends ImageCellEditor { public String processLink(Object item, String property, int rowcount, String link) { return MessageFormat.format(link, ((HelloWorld) item).getPk()); } } After creating the editor, we can retrieve the context objects for it from the TableFacade: ImageCellEditor editor = new HelloWorldImageCellEditor("edit.gif", messageSource.getMessage("image.edit.alt", null, locale), "edit.html?pk={0,number,integer}"); editor.setWebContext(tableFacade.getWebContext()); editor.setCoreContext(tableFacade.getCoreContext()); Now we have the images and the links. But it would be awfully nice if the images could be centered within the column, something notoriously difficult to achieve with CSS style sheets. What would work would be to use the align and valign attributes of the cell. How can we do that? The cell itself, as opposed to its contents, is rendered by the interface CellRenderer. Unfortunately, the HtmlCellRenderer sub-interface that comes with JMesa has no method for adding attributes. The Decorator and Template patterns, however, come to the rescue. 
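A brief aside on the link pattern used for the edit and delete links: the sketch below shows what MessageFormat does with "edit.html?pk={0,number,integer}". It is plain JDK code, independent of JMesa; the class name LinkPatternSketch and the explicit Locale parameter are assumptions made for the illustration (the article's processLink calls the static MessageFormat.format, which uses the default locale).

```java
import java.text.MessageFormat;
import java.util.Locale;

public class LinkPatternSketch {

    /** Fills the {0} placeholder of a link pattern with the primary key. */
    static String processLink(String link, long pk, Locale locale) {
        // An explicit locale keeps the result independent of the server's default.
        return new MessageFormat(link, locale).format(new Object[] { pk });
    }

    public static void main(String[] args) {
        // Small keys come out as expected.
        System.out.println(processLink("edit.html?pk={0,number,integer}", 42L, Locale.US));
        // Keys of 1000 and above pick up a locale-specific grouping separator.
        System.out.println(processLink("edit.html?pk={0,number,integer}", 1234L, Locale.US));
        // A pattern without grouping avoids that.
        System.out.println(processLink("edit.html?pk={0,number,#}", 1234L, Locale.US));
    }
}
```

With Locale.US this prints edit.html?pk=42, then edit.html?pk=1,234, then edit.html?pk=1234. If primary keys can reach four digits, a pattern such as {0,number,#} may be the safer choice for a URL parameter.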
Again, we implement the functionality for reuse as two classes, the first a generic decorator with an additional template method: public abstract class AttributedHtmlCellRendererDecorator implements HtmlCellRenderer { // all other methods will be delegated to this renderer protected HtmlCellRenderer renderer; public AttributedHtmlCellRendererDecorator(HtmlCellRenderer renderer) { this.renderer = renderer; } public Object render(Object item, int rowcount) { HtmlBuilder html = new HtmlBuilder(); html.td(2); html.width(getColumn().getWidth()); addAttributes(html); html.style(getStyle()); html.styleClass(getStyleClass()); html.close(); String property = getColumn().getProperty(); Object value = getCellEditor().getValue(item, property, rowcount); if (value != null) { html.append(value.toString()); } html.tdEnd(); return html.toString(); } /** * Subclasses will add attributes. */ public abstract void addAttributes(HtmlBuilder html); } The second will be a subclass that adds the specific attributes we need: public class AlignedHtmlCellRendererDecorator extends AttributedHtmlCellRendererDecorator { private String align; private String valign; public AlignedHtmlCellRendererDecorator(HtmlCellRenderer renderer, String align, String valign) { super(renderer); this.align = align; this.valign = valign; } @Override public void addAttributes(HtmlBuilder html) { html.align(align); html.valign(valign); } } Whew, that was a mouthful! However, our images will come out nicely centered in the column, and we've learned a good deal more about how the JMesa API works. There will be edit and delete pages to link to, of course, but these are not of interest here and are completely trivial in the Eclipse project. CSV and Excel Output In JMesa terminology, output other than HTML is called exporting the table. As complex as it might seem, it's actually the easiest part of the process. 
Again, a single line of code will do all we need: tableFacade.setExportTypes(response, org.jmesa.limit.ExportType.CSV, org.jmesa.limit.ExportType.EXCEL); That's really all there is to it! (OK, you have to include some JAR files in the library, but what did you expect, magic?) Filtering and Highlighting Making a row (we need an HtmlRow) unfilterable and unhighlighted is just as simple as making a column unsortable: htmlRow.setFilterable(false); htmlRow.setHighlighter(false); With this, no filtering row or icons will be generated above the column header and the highlighting feature will be turned off. Toolbar The code to reorganize the toolbar is quite straightforward; while we're at it, we need to include icons for the various output formats: public class ReorderedToolbar extends AbstractToolbar { @Override public String render() { if (ViewUtils.isExportable(getExportTypes())) { addExportToolbarItems(getExportTypes()); addToolbarItem(ToolbarItemType.SEPARATOR); } MaxRowsItem maxRowsItem = (MaxRowsItem) addToolbarItem(ToolbarItemType.MAX_ROWS_ITEM); if (getMaxRowsIncrements() != null) { maxRowsItem.setIncrements(getMaxRowsIncrements()); } addToolbarItem(ToolbarItemType.SEPARATOR); addToolbarItem(ToolbarItemType.FIRST_PAGE_ITEM); addToolbarItem(ToolbarItemType.PREV_PAGE_ITEM); addToolbarItem(ToolbarItemType.NEXT_PAGE_ITEM); addToolbarItem(ToolbarItemType.LAST_PAGE_ITEM); return super.render(); } } I arranged the icons by simply specifying the order in which they are added to the toolbar. They look more natural to me this way; your mileage may vary. Note that we delegate the messy work of actually rendering the toolbar to the JMesa superclass. 
Putting It All Together We'll refactor out reusable code once more in writing a Factory to encapsulate building our customized table, starting with an abstract class: public abstract class AbstractTableFactory { protected abstract String getTableName(); protected abstract void configureColumns(TableFacade tableFacade, Locale locale); protected abstract void configureUnexportedTable(TableFacade tableFacade, Locale locale); protected abstract ImageCellEditor getEditImageCellEditor(Locale locale); protected abstract ImageCellEditor getDeleteImageCellEditor( Locale locale); public TableFacade createTable(HttpServletRequest request, HttpServletResponse response, Collection items) { TableFacade tableFacade = new TableFacadeImpl(getTableName(), request); tableFacade.setItems(items); tableFacade.setStateAttr("return"); configureTableFacade(response, tableFacade); Locale locale = request.getLocale(); configureColumns(tableFacade, locale); if (! tableFacade.getLimit().isExported()) { configureUnexportedTable(tableFacade, locale); } return tableFacade; } public void configureTableFacade(HttpServletResponse response, TableFacade tableFacade) { tableFacade.setExportTypes(response, getExportTypes()); tableFacade.setToolbar(new ReorderedToolbar()); Row row = tableFacade.getTable().getRow(); if (row instanceof HtmlRow) { HtmlRow htmlRow = (HtmlRow) row; htmlRow.setFilterable(false); htmlRow.setHighlighter(false); } } protected ExportType[] getExportTypes() { return null; } protected void configureColumn(Column column, String title, CellEditor editor) { configureColumn(column, title, editor, false, true); } protected void configureColumn(Column column, String title, CellEditor editor, boolean filterable, boolean sortable) { column.setTitle(title); if (editor != null) { column.getCellRenderer().setCellEditor(editor); } if (column instanceof HtmlColumn) { HtmlColumn htmlColumn = (HtmlColumn) column; htmlColumn.setFilterable(filterable); htmlColumn.setSortable(sortable); } } protected 
void configureEditAndDelete(Row row, WebContext webContext, CoreContext coreContext, Locale locale) { HtmlComponentFactory factory = new HtmlComponentFactory(webContext, coreContext); HtmlColumn col = factory.createColumn((String) null); col.setFilterable(false); col.setSortable(false); CellRenderer renderer = col.getCellRenderer(); ImageCellEditor editor = getEditImageCellEditor(locale); editor.setWebContext(webContext); editor.setCoreContext(coreContext); renderer.setCellEditor(editor); col.setCellRenderer(new AlignedHtmlCellRendererDecorator((HtmlCellRenderer) renderer, "center", "middle")); row.addColumn(col); col = factory.createColumn((String) null); col.setFilterable(false); col.setSortable(false); renderer = col.getCellRenderer(); editor = getDeleteImageCellEditor(locale); editor.setWebContext(webContext); editor.setCoreContext(coreContext); renderer.setCellEditor(editor); col.setCellRenderer(new AlignedHtmlCellRendererDecorator((HtmlCellRenderer) renderer, "center", "middle")); row.addColumn(col); } } This has a lot of code (note the abstract methods ), in part because I know I usually want edit and delete columns. One line that might pass by unnoticed in all this, however, is really quite something: tableFacade.setStateAttr("return"); When this attribute is set, JMesa uses the Memento design pattern to save the state of its tables. When you return to a table page and include the attribute you specify here in the URL, you return to the exact place you left: the page number to which you had moved before leaving the table, the number of values displayed per page, and so forth. 
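For readers unfamiliar with the Memento pattern, here is a minimal, generic sketch of the idea. It is not JMesa's implementation: the class TableStateSketch, the map standing in for the HTTP session, and the default values are all hypothetical, chosen only to show a snapshot being saved and then restored when the request carries the state attribute.

```java
import java.util.HashMap;
import java.util.Map;

public class TableStateSketch {

    /** Immutable snapshot (the memento) of where the user was in the table. */
    static final class TableState {
        final int page;
        final int maxRows;

        TableState(int page, int maxRows) {
            this.page = page;
            this.maxRows = maxRows;
        }
    }

    /** Stand-in for the HTTP session: table id mapped to its last saved state. */
    private final Map<String, TableState> session = new HashMap<>();

    /** Saves a snapshot of the table's current position. */
    void save(String tableId, int page, int maxRows) {
        session.put(tableId, new TableState(page, maxRows));
    }

    /** Returns the saved state only if the request carries the state attribute. */
    TableState restore(String tableId, Map<String, String> requestParams, String stateAttr) {
        TableState saved = session.get(tableId);
        if (saved != null && requestParams.containsKey(stateAttr)) {
            return saved;
        }
        return new TableState(1, 15); // hypothetical defaults: first page, 15 rows
    }
}
```

In this sketch, a request for the table page with a "return" parameter gets the saved page and row count back, while a request without it starts again from the defaults.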
The application-specific concrete class, after all this, can be pretty simple: public class HelloWorldTableFactory extends AbstractTableFactory { protected MessageSource messageSource; public void setMessageSource(MessageSource messageSource) { this.messageSource = messageSource; } @Override protected String getTableName() { return "results"; } @Override protected ExportType[] getExportTypes() { return new ExportType[] { CSV, EXCEL }; } @Override protected void configureColumns(TableFacade tableFacade, Locale locale) { tableFacade.setColumnProperties("firstName", "lastName", "format", "toString"); Row row = tableFacade.getTable().getRow(); configureColumn(row.getColumn("firstName"), messageSource.getMessage("column.firstName", null, locale), null); configureColumn(row.getColumn("lastName"), messageSource.getMessage("column.lastName", null, locale), null); configureColumn(row.getColumn("format"), messageSource.getMessage("column.format", null, locale), new SpringMessageCellEditor(messageSource, "format", locale), false, false); configureColumn(row.getColumn("toString"), messageSource.getMessage("column.toString", null, locale), new ToStringCellEditor(), false, false); } @Override protected void configureUnexportedTable(TableFacade tableFacade, Locale locale) { HtmlTable table = (HtmlTable) tableFacade.getTable(); table.setCaption(messageSource.getMessage("table.caption", null, locale)); configureEditAndDelete(table.getRow(), tableFacade.getWebContext(), tableFacade.getCoreContext(), locale); } @Override protected ImageCellEditor getEditImageCellEditor(Locale locale) { return new HelloWorldImageCellEditor("edit.gif", messageSource.getMessage("image.edit.alt", null, locale), "edit.html?pk={0,number,integer}"); } @Override protected ImageCellEditor getDeleteImageCellEditor(Locale locale) { return new HelloWorldImageCellEditor("delete.gif", messageSource.getMessage("image.delete.alt", null, locale), "delete.html?pk={0,number,integer}"); } } Controller We end as we 
began, with a Spring MVC Controller to launch all this infrastructure. Since the details of table creation are encapsulated in a factory, this is uncluttered: the only decision to be made is whether or not the table is to be exported. If it is exported, the results will be written directly to the output stream of the response; if not, they'll be rendered as a string containing our HTML table:

public class CustomJMesaSearchController extends AbstractController { private HelloWorldService helloWorldService; private HelloWorldTableFactory tableFactory; public void setHelloWorldService(HelloWorldService helloWorldService) { this.helloWorldService = helloWorldService; } public void setTableFactory(HelloWorldTableFactory tableFactory) { this.tableFactory = tableFactory; } @Override protected ModelAndView handleRequestInternal(HttpServletRequest request, HttpServletResponse response) throws Exception { Set results = helloWorldService.findAll(); TableFacade tableFacade = tableFactory.createTable(request, response, results); if (tableFacade.getLimit().isExported()) { tableFacade.render(); return null; } return new ModelAndView("results", "results", tableFacade.render()); } }

We are actually reusing the same JSP page as in the basic JMesa setup: the only difference is in the Java code that generates the table. One more change in jmesa-servlet.xml to create everything and tie it all together: ... customSearchController

And how different the display looks!

[Figure 5. A customized search result]

Ajax

Finally, the table looks like we want it to, but it's irritating having to resubmit the form each time we want to make a change. Isn't that the sort of thing Ajax is supposed to help us avoid? The answer is, of course, yes! So how do we leverage Ajax to help us? Fortunately, the JMesa folks have already worked that out. There are two parts to the solution: changes to the controller and changes to the JSP page.
${results} In our previous solution, the onInvokeAction Javascript method called createHiddenInputFieldsForLimitAndSubmit, which submitted the form. In the Ajax solution, it assembles parameters for the TableFacade class and sends a request for the HTML for table display, adding a parameter to indicate that it's an Ajax request. Then a callback Javascript function substitutes the returned HTML for the contents of the that now holds the table. The simplicity and unusual syntax of the latter code come courtesy of the jQuery Ajax library, which is thoughtfully used by JMesa: The controller, of course, needs to interpret this new request correctly. This is just one more branch on the decision tree we saw in the previous controller: public class AjaxJMesaSearchController extends AbstractController { @Override protected ModelAndView handleRequestInternal(HttpServletRequest request, HttpServletResponse response) throws Exception { Set results = helloWorldService.findAll(); TableFacade tableFacade = tableFactory.createTable(request, response, results); if (tableFacade.getLimit().isExported()) { tableFacade.render(); return null; } else if ("true".equals(request.getParameter("ajax"))) { String encoding = response.getCharacterEncoding(); byte[] contents = tableFacade.render() .getBytes(encoding); response.getOutputStream().write(contents); return null; } return new ModelAndView("ajax-results", "results", tableFacade.render()); } } Of course, we have to make Spring aware of the controller change in jmesa-servlet.xml: ... ajaxSearchController That's all there is to it! The table looks and acts just as it did, except now it refreshes without resubmitting the form each time. Conclusion Now I don't have to like tables: I can program them in Java and not worry about them on a display JSP. This makes the page cleaner, gives me more functionality out-of-box, and enables me to nix at least some of the languages I'd otherwise have to fuss with. What's not to like? 
I hope you'll take a good look at JMesa and see if it can make your life easier, and that this article helps you decide. Good luck!

Installation of the Eclipse Project

Installing the Eclipse project is not difficult; the included Ant build file and these instructions assume Tomcat as the deployment target (I'm using version 6.0.14 with JDK 6.0_03). If you want to use another servlet container, though, feel free to modify the instructions and the Ant file as needed:
  • download the ZIP archive
  • unzip the archive to any directory; it will create its own top-level subdirectory
  • open the project as a Java project in Eclipse
  • the project must use the Java 6 compiler (available from "http://java.sun.com/javase/6/")
  • the Tomcat installation must be version 6 (available from "http://tomcat.apache.org/download-60.cgi")
  • open the build file and modify the path to the Tomcat root
  • add an external JAR file to the Eclipse project build path from the Tomcat installation: lib/servlet-api.jar
  • run the Ant "deploy" target, which will build automatically
  • open a browser and point it to "http://localhost:8080/running-jmesa-examples/" or to an equivalent URL for your setup

(N.B. Some code in the project has been refactored from the way it appears in the article.)
June 18, 2008
by David Sills
· 56,469 Views
ASP.NET - Preventing SQL Injection Attacks
Consider a simple web application that requires user input in some fields, let's say a search box. Suppose a user types the following string in that text box:

'; DROP DATABASE pubs --

On submit, our application executes the following dynamic SQL statement:

SqlDataAdapter myCommand = new SqlDataAdapter("SELECT OrderId, OrderNumber FROM Orders WHERE OrderNumber = '" + OrderNumberTextBox.Text + "'", myConnection);

Or calls a stored procedure:

SqlDataAdapter myCommand = new SqlDataAdapter("uspGetOrderList '" + OrderNumberTextBox.Text + "'", myConnection);

The intention is that the user input would be run as:

SELECT OrderId, OrderNumber FROM Orders WHERE OrderNumber = 'PO123'

However, the code inserts the user's malicious input and generates the following query:

SELECT OrderId, OrderNumber FROM Orders WHERE OrderNumber = ''; DROP DATABASE pubs --'

In this case, the ' (single quotation mark) character that starts the rogue input terminates the current string literal in the SQL statement. As a result, the statement up to that point reads:

SELECT OrderId, OrderNumber FROM Orders WHERE OrderNumber = ''

The ; (semicolon) character tells SQL that this is the end of the current statement, which is then followed by the malicious SQL code:

; DROP DATABASE pubs

Finally, the -- (double dash) sequence of characters is a SQL comment that tells SQL to ignore the rest of the text. In this case, SQL ignores the closing ' (single quotation mark) character, which would otherwise cause a SQL parser error:

--'

Using stored procedures doesn't solve the problem either, because the generated query would be:

uspGetOrderList ''; DROP DATABASE pubs--'

Or perhaps this was your login page, with the query:

SELECT UserId FROM Users WHERE LoginId = '<login id>' AND Password = '<password>' AND IsActive = 1

Someone could easily log in by typing the following in your login text box:

' OR 1 = 1; --

Which makes our query:

SELECT UserId FROM Users WHERE LoginId = '' OR 1 = 1; --' AND Password = '' AND IsActive = 1

Voilà, the attacker has now successfully logged in to your site using a SQL injection attack.

SQL injection can occur, as demonstrated above, when an application uses input to construct dynamic SQL statements or when it uses stored procedures to connect to the database. Conventional security measures, such as the use of SSL and IPSec, do not protect your application from SQL injection attacks. Successful SQL injection attacks enable malicious users to execute commands in an application's database. Common vulnerabilities that make your data access code susceptible to SQL injection attacks include:
  • Weak input validation.
  • Dynamic construction of SQL statements without the use of type-safe parameters.
  • Use of over-privileged database logins.

So what can we do to help protect our application from such attacks? To counter SQL injection attacks, we need to:

Constrain and sanitize input data

Check for known good data by validating for type, length, format, and range, and by using a list of acceptable characters to constrain input. Create a list of acceptable characters and use regular expressions to reject any characters that are not on the list. Using a list of unacceptable characters instead is impractical because it is very difficult to anticipate all possible variations of bad input. Start by constraining input in the server-side code for your ASP.NET Web pages. Do not rely on client-side validation because it can be easily bypassed.
Use client-side validation only to reduce round trips and to improve the user experience. See my other blog post on the Validation Application Block for server-side validation. If, in the previous code example, the Order Number value is captured by an ASP.NET TextBox control, you can constrain its input by using a RegularExpressionValidator control. If the Order Number input is from another source, such as an HTML control, a query string parameter, or a cookie, you can constrain it by using the Regex class from the System.Text.RegularExpressions namespace. The following example assumes that the input is obtained from a cookie.

using System.Text.RegularExpressions; HttpCookie cookie = Request.Cookies["OrderNumber"]; if (cookie != null && Regex.IsMatch(cookie.Value, @"^PO\d{3}-\d{2}$")) { // access the database } else { // handle the bad input }

Performing input validation is essential because almost all application-level attacks contain malicious input. You should validate all input, including form fields, query string parameters, and cookies, to protect your application against malicious command injection. Assume all input to your Web application is malicious, and make sure that you use server validation for all sources of input. Use client-side validation to reduce round trips to the server and to improve the user experience, but do not rely on it because it is easily bypassed.

Apply ASP.NET request validation during development to identify injection attacks

ASP.NET request validation detects any HTML elements and reserved characters in data posted to the server. This helps prevent users from inserting script into your application. Request validation checks all input data against a hard-coded list of potentially dangerous values. If a match occurs, it throws an exception of type HttpRequestValidationException. Request validation is enabled by ASP.NET by default; the default setting is recorded in the Machine.config.comments file. Confirm that you have not disabled request validation by overriding the default settings in your server's Machine.config file or your application's Web.config file. You can disable request validation in your Web.config application configuration file by adding a <pages> element with validateRequest="false", or on an individual page by setting ValidateRequest="false" on the @ Page directive.

NOTE: You should disable request validation only on a page with a free-format text field that accepts HTML-formatted input.

You can test the effects of request validation. To do this, create an ASP.NET page that disables request validation by setting ValidateRequest="false". When you run the page, "Hello" is displayed in a message box because the script in txtString is passed through and rendered as client-side script in your browser. If you set ValidateRequest="true" or remove the ValidateRequest page attribute, ASP.NET request validation rejects the script input and produces an error similar to the following.

A potentially dangerous Request.Form value was detected from the client (txtString="

Use type-safe SQL parameters for data access

Parameter collections such as SqlParameterCollection provide type checking and length validation. If you use a parameters collection, input is treated as a literal value, and SQL Server does not treat it as executable code. An additional benefit of using a parameters collection is that you can enforce type and length checks. Values outside of the range trigger an exception. You can use these parameters with stored procedures or dynamically constructed SQL command strings. Using stored procedures does not necessarily prevent SQL injection. The important thing to do is use parameters with stored procedures. If you do not use parameters, your stored procedures can be susceptible to SQL injection if they use unfiltered input.
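To make that last point concrete, here is a small sketch of how concatenating unfiltered input into a stored-procedure call produces the injected command shown earlier, and how the whitelist pattern from the Regex example rejects the same input. It is written in Java purely as a self-contained illustration (the article's own code is C#); the class and method names are hypothetical and no database is involved.

```java
import java.util.regex.Pattern;

public class InjectionSketch {

    // Whitelist from the article's Regex example: "PO", three digits, a dash, two digits.
    private static final Pattern ORDER_NUMBER = Pattern.compile("^PO\\d{3}-\\d{2}$");

    /** Unsafe: splices raw input into the command text, as the vulnerable code does. */
    static String buildByConcatenation(String orderNumber) {
        return "uspGetOrderList '" + orderNumber + "'";
    }

    /** Accepts only input that matches the known-good order number format. */
    static boolean isValidOrderNumber(String input) {
        return input != null && ORDER_NUMBER.matcher(input).matches();
    }

    public static void main(String[] args) {
        String rogue = "'; DROP DATABASE pubs --";
        // The quote in the input closes the literal, so the DROP becomes a second statement.
        System.out.println(buildByConcatenation(rogue));
        // The whitelist rejects the rogue input and accepts a well-formed order number.
        System.out.println(isValidOrderNumber(rogue));
        System.out.println(isValidOrderNumber("PO123-45"));
    }
}
```

Validation of this kind narrows what can reach the database, but it complements parameters and does not replace them.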
The following code shows how to use SqlParameterCollection when calling a stored procedure:

using System.Data; using System.Data.SqlClient; using (SqlConnection connection = new SqlConnection(connectionString)) { DataSet userDataset = new DataSet(); SqlDataAdapter myCommand = new SqlDataAdapter("uspGetOrderList", connection); myCommand.SelectCommand.CommandType = CommandType.StoredProcedure; myCommand.SelectCommand.Parameters.Add("@OrderNumber", SqlDbType.VarChar, 11); myCommand.SelectCommand.Parameters["@OrderNumber"].Value = OrderNumberTextBox.Text; myCommand.Fill(userDataset); }

The @OrderNumber parameter is treated as a literal value and not as executable code. Also, the parameter is checked for type and length. In the preceding code example, the input value cannot be longer than 11 characters. If the data does not conform to the type or length defined by the parameter, the SqlParameter class throws an exception. You should review your application's use of stored procedures because simply using stored procedures with parameters does not necessarily prevent SQL injection. For example, the following parameterized stored procedure has several security vulnerabilities.

CREATE PROCEDURE dbo.uspRunQuery @var ntext AS exec sp_executesql @var GO

The stored procedure executes whatever statement is passed to it. Consider the @var variable being set to:

DROP TABLE ORDERS;

If you cannot use stored procedures, you should still use parameters when constructing dynamic SQL statements. The following code shows how to use SqlParameterCollection with dynamic SQL.

using System.Data; using System.Data.SqlClient; using (SqlConnection connection = new SqlConnection(connectionString)) { DataSet userDataset = new DataSet(); SqlDataAdapter myDataAdapter = new SqlDataAdapter("SELECT OrderId, OrderNumber FROM Orders WHERE OrderNumber = @OrderNumber", connection); myDataAdapter.SelectCommand.Parameters.Add("@OrderNumber", SqlDbType.VarChar, 11); myDataAdapter.SelectCommand.Parameters["@OrderNumber"].Value = OrderNumberTextBox.Text; myDataAdapter.Fill(userDataset); }

If you concatenate several SQL statements to send a batch of statements to the server in a single round trip, you can still use parameters, provided that parameter names are not repeated; that is, use unique parameter names during SQL text concatenation.

using System.Data; using System.Data.SqlClient; using (SqlConnection oConn = new SqlConnection(connectionString)) { SqlDataAdapter oAdapter = new SqlDataAdapter( "SELECT CustomerID INTO #Temp1 FROM Customers " + "WHERE CustomerID > @custIDParm; " + "SELECT CompanyName FROM Customers " + "WHERE Country = @countryParm and CustomerID IN " + "(SELECT CustomerID FROM #Temp1);", oConn); SqlParameter custIDParm = oAdapter.SelectCommand.Parameters.Add("@custIDParm", SqlDbType.NChar, 5); custIDParm.Value = customerID.Text; SqlParameter countryParm = oAdapter.SelectCommand.Parameters.Add("@countryParm", SqlDbType.NVarChar, 15); countryParm.Value = country.Text; oConn.Open(); DataSet dataSet = new DataSet(); oAdapter.Fill(dataSet); }

Use a least privileged account that has restricted permissions in the database

Ideally, you should only grant execute permissions to selected stored procedures in the database and provide no direct table access. The problem is more severe if your application uses an over-privileged account to connect to the database.
For example, if your application's login has privileges to eliminate a database, then without adequate safeguards, an attacker might be able to perform this operation. If you use Windows authentication to connect, the Windows account should be least-privileged from an operating system perspective and should have limited privileges and limited ability to access Windows resources. Additionally, whether or not you use Windows authentication or SQL authentication, the corresponding SQL Server login should be restricted by permissions in the database. Consider the example of an ASP.NET application running on Microsoft Windows Server 2003 that accesses a database on a different server in the same domain. By default, the ASP.NET application runs in an application pool that runs under the Network Service account. This account is a least privileged account. Create a SQL Server login for the Web server's Network Service account. The Network Service account has network credentials that are presented at the database server as the identity DOMAIN\WEBSERVERNAME$. For example, if your domain is called XYZ and the Web server is called 123, you create a database login for XYZ\123$. Grant the new login access to the required database by creating a database user and adding the user to a database role. Establish permissions to let this database role call the required stored procedures or access the required tables in the database. Only grant access to stored procedures the application needs to use, and only grant sufficient access to tables based on the application's minimum requirements. If the ASP.NET application only performs database lookups and does not update any data, you only need to grant read access to the tables. This limits the damage that an attacker can cause if the attacker succeeds in a SQL injection attack. Use Character Escaping Techniques In situations where parameterized SQL cannot be used, consider using character escaping techniques. 
If you are forced to use dynamic SQL and parameterized SQL cannot be used, you need to safeguard against input characters that have special meaning to SQL Server (such as the single quote character). If not handled, special characters such as the single quote character in the input can be utilized to cause SQL injection. Escape routines add an escape character to characters that have special meaning to SQL Server, thereby making them harmless. private static string GetStringForSQL(string inputSQL) { return inputSQL.Replace("'", "''"); } Special input characters pose a threat only with dynamic SQL and not when using parameterized SQL. Your first line of defense should always be to use parameterized SQL. Avoid disclosing database error information In the event of database errors, make sure you do not disclose detailed error messages to the user. Use structured exception handling to catch errors and prevent them from propagating back to the client. Log detailed error information locally, but return limited error details to the client. If errors occur while the user is connecting to the database, be sure that you provide only limited information about the nature of the error to the user. If you disclose information related to data access and database errors, you could provide a malicious user with useful information that he or she can use to compromise your database security. Attackers use the information in detailed error messages to help deconstruct a SQL query that they are trying to inject with malicious code. A detailed error message may reveal valuable information such as the connection string, SQL server name, or table and database naming conventions. See my other post on Exception Handling - Do's and Dont's. You can use the element to configure custom, generic error messages that should be returned to the client in the event of an application exception condition. 
Make sure that the mode attribute of the customErrors element is set to "RemoteOnly" in the Web.config file. After installing an ASP.NET application, you can configure the setting to point to your custom error page.

Conclusion

The above list is just some points found on MSDN on how you can make your site more secure by effectively preventing SQL injection attacks. You should always be reviewing your code to find these or other security vulnerabilities; remember that all types of attack start with some input, and your first line of defense should be input validation, using both client-side and server-side validation.

Original Author

Original article written by Misbah Arefin
June 18, 2008
by Schalk Neethling
· 90,725 Views
Hibernate - Dynamic Table Routing
I have been searching for a method to dynamically route objects to databases at runtime using Hibernate and recently I found a solution which fit the bill.
June 13, 2008
by alvin sd
· 60,074 Views · 3 Likes
Hibernate - Tuning Queries Using Paging, Batch Size, and Fetch Joins
This article covers queries - in particular a tuning test case and the relations between simple queries, join fetch queries, paging query results, and batch size.

Paging the Query Results

I will start with a short introduction to paging in EJB3. To support paging, the EJB3 Query interface defines the following two methods:

  • setMaxResults - sets the maximum number of rows to retrieve from the database
  • setFirstResult - sets the first row to retrieve

For example, if our GUI displays a list of customers and we have 500,000 customers (database rows) in our database, we wouldn't want to display all 500,000 records in one view (even if we put performance considerations aside, nobody can do anything with a list of 500,000 rows). The GUI design would usually include paging: we break the list of records to display into logical pages (for example, 100 records per page) and the user can navigate between pages (just like Google's results navigator at the bottom of the search page).

When using the paging support it is important to remember that the query has to be sorted; otherwise we can't be sure that fetching the "next page" will really return the next page (in the absence of an 'order by' clause in a SQL query, the order in which rows are fetched is unpredictable). Here is a sample use, fetching the first two pages of 100 rows each:

    Query q = entityManager.createQuery("select c from Customer c order by c.id");
    q.setFirstResult(0).setMaxResults(100);

    // ... next page ...

    q = entityManager.createQuery("select c from Customer c order by c.id");
    q.setFirstResult(100).setMaxResults(100);

This is a simple API, and it's important (for performance) to remember to use it when we need to fetch only part of the results.

Test Case Description

This test case is based on real tuning I did for an application; I just changed the class names to Customer and Order.
Let's assume that I have a Customer entity with a set of orders (lazily fetched - but the same happens with eager fetching as well) and we need to:

  • Fetch customers and their orders
  • Do it in a "paging mode" - 100 customers per page

Tuning Requirement #1 - Fetch Customers and Their Orders

There are two possibilities to perform this kind of fetch:

  • Simple select: select c from Customer c order by c.id
  • Join fetch: select distinct c from Customer c left outer join fetch c.orders order by c.id

The simple select is as simple as it can be: we load a list of customers with a proxy collection in their orders field. The orders collection will be filled with data once I access it (for example c.getOrders().size()).

The 'join fetch' means that we want to fetch an association as an integral part of the query execution. The join-fetched entities (in the example above: c.orders) must be part of an association that is referenced by an entity returned from the query (in the example above: c). The 'join fetch' is one of the tools used for improving query performance (see more in here). The Hibernate core documentation explains that "a 'fetch' join allows associations or collections of values to be initialized along with their parent objects, using a single select" (see here).

I have 18,998 customer records in my database, each with a few orders. Let's compare execution time for the two queries. My code looks the same for both queries (except for the query itself): I execute the query, then I iterate over the results, checking the size of each customer's orders collection, and print the execution time and the number of records fetched (as a sanity check for the query syntax):

    Query q = entityManager.createQuery(queryStr);
    long a = System.currentTimeMillis();
    List<Customer> l = q.getResultList();
    for (Customer c : l) {
        c.getOrders().size();   // forces the lazy collection to load
    }
    long b = System.currentTimeMillis();
    System.out.println("Execution time: " + (b - a) + "; Number of records fetch: " + l.size());

And now to the numbers (avg. 3 executions):

  • Simple select: 24,984 millis
  • Join fetch: 1,219 millis

The join fetch query was 20 times faster(!) than the simple query. The reason is obvious: using the join fetch select I had only one round trip to the database, while using a simple select I had to fetch the customers (1 round trip to the database) and each time I accessed a collection I had another round trip (that's 18,998 additional round trips!). The winner is 'join fetch'. Or is it? Wait for the next one - the paging...

Tuning Requirement #2 - Use Paging

The second requirement was to do it with paging - each page will have 100 customers (so we will have 18,900/100 + 1 = 190 pages; the last page has 98 customers). So let's change the code above a little bit:

    Query q = entityManager.createQuery(queryStr);
    q.setFirstResult(pageNum * 100).setMaxResults(100);
    long a = System.currentTimeMillis();
    List<Customer> l = q.getResultList();
    for (Customer c : l) {
        c.getOrders().size();
    }
    long b = System.currentTimeMillis();
    System.out.println("Execution time: " + (b - a) + "; Number of records fetch: " + l.size());

I added the second line, which limits the query result to a specific page with up to 100 records per page. And the numbers are (avg. 3 executions):

  • Simple select: 328 millis
  • Join fetch: 1,660 millis

The tables have turned. Why? First, a quote from the EJB3 Persistence specification: "The effect of applying setMaxResults or setFirstResult to a query involving fetch joins over collections is undefined" (section 3.6.1 - Query Interface).

We could have stopped here, but it is interesting to understand the issue and to see what Hibernate does. To implement the paging features, Hibernate delegates the work to the database, using its syntax to limit the number of records fetched by the query.
Each database has its own proprietary syntax for limiting the number of fetched records. Some examples:

  • Postgres uses LIMIT and OFFSET
  • Oracle has rownum
  • MySQL uses its version of LIMIT and OFFSET
  • MSSQL has the TOP keyword in the select
  • and so on

The important thing to remember here is the meaning of such a limit: the database returns a subset of the query result. So if we asked for the first 100 customers whose names contain 'Eyal', the outcome is logically the same as building a table in memory out of all customers that match the criteria and taking the first 100 rows from there.

And here is the catch: if the query with the limit includes a join clause for a collection, then the first 100 rows in the "logical table" will not necessarily be the first 100 customers. The outcome of the join might duplicate customers in the "logical table", but the database isn't aware of that and doesn't care - it performs operations on tables, not on objects! For example, think of the extreme case where the customer 'Eyal' has 100 orders. The query will return 100 rows, Hibernate will identify that they all belong to the same customer and return only one Customer as the query result - this is not what we were asking for.

This also works, of course, the other way around. If a customer had more than 100 orders and the result set size was limited to 100 rows, the orders collection would not contain all of the customer's orders.

To deal with that limitation, Hibernate actually doesn't issue an SQL statement with a LIMIT clause. Instead it fetches all of the records and performs the paging in memory. This explains why using the 'join fetch' statement with paging took longer than the one without paging - the delta is the in-memory paging done by Hibernate. If you look at the Hibernate logs you will find the following warning issued by Hibernate:

    WARNING: firstResult/maxResults specified with collection fetch; applying in memory!
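The catch described above can be reproduced without a database. The following is a minimal sketch in plain Python, not Hibernate's implementation: the order_counts data and both function names are made up for illustration. It builds the "logical table" a customer-orders left outer join would produce, and compares limiting rows with limiting customers.

```python
from itertools import islice


def joined_rows(order_counts):
    """Yield (customer_id, order_no) rows in customer id order, like a left outer join."""
    for customer_id in sorted(order_counts):
        count = order_counts[customer_id]
        if count == 0:
            # A left outer join still produces one row for a customer without orders.
            yield (customer_id, None)
        for order_no in range(count):
            yield (customer_id, order_no)


def customers_from_row_limit(order_counts, first_result, max_results):
    """What a database-side row limit applied to the joined rows would give us."""
    rows = islice(joined_rows(order_counts), first_result, first_result + max_results)
    # De-duplicate while preserving order, as 'select distinct c' would.
    return list(dict.fromkeys(customer_id for customer_id, _ in rows))


def customers_from_entity_limit(order_counts, first_result, max_results):
    """What we actually asked for: a page of customers."""
    return sorted(order_counts)[first_result:first_result + max_results]


# Hypothetical data: customer id -> number of orders.
order_counts = {1: 3, 2: 0, 3: 2, 4: 1}
by_rows = customers_from_row_limit(order_counts, 0, 2)
by_entities = customers_from_entity_limit(order_counts, 0, 2)
```

With a page size of 2, the row-based limit yields a single customer (customer 1 fills both rows with its orders), while the entity-based page contains customers 1 and 2. That mismatch is why the limit cannot simply be pushed down to SQL when a collection is join fetched.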
Final Tuning - BatchSize

Does it mean that in the case of paging we shouldn't use a join fetch? Usually it does (unless your page size is very close to the actual number of records). But even if you use a simple select, this is a classic case for using the @BatchSize annotation. If my session/entity manager has 100 customers attached to it then, by default, for each first access to one of the customers' orders collections Hibernate will issue a SQL statement to fill that collection. In the end I will execute 100 statements to fetch 100 collections. You can see it in the log:

Hibernate: /* select c from Customer c order by c.id */ select customer0_.id as id0_, customer0_.ccNumber as ccNumber0_, customer0_.name as name0_, customer0_.fixedDiscount as fixedDis5_0_, customer0_.DTYPE as DTYPE0_ from CUSTOMERS customer0_ order by customer0_.id limit ? offset ?
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?
............
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?

The @BatchSize annotation can be used to define how many identical associations to populate in a single database query. If the session has 100 customers attached to it and the mapping of the 'orders' collection is annotated with @BatchSize of size n, then whenever Hibernate needs to populate a lazy orders collection it checks the session, and if there are more customers whose orders collections need to be populated it fetches up to n collections. Example: if we had 100 customers and the batch size was set to 16, then when iterating over the customers to get their number of orders Hibernate would go to the database only 7 times (6 times to fetch 16 collections and one more time to fetch the 4 remaining collections - see the sample below). If our batch size was set to 50 it would go only twice.
@OneToMany(mappedBy="customer",cascade=CascadeType.ALL, fetch=FetchType.LAZY) @BatchSize(size=16) private Set orders = new HashSet(); And in the log: Hibernate: /* select c from Customer c order by c.id */ select customer0_.id as id0_, customer0_.ccNumber as ccNumber0_, customer0_.name as name0_, customer0_.fixedDiscount as fixedDis5_0_, customer0_.DTYPE as DTYPE0_ from CUSTOMERS customer0_ order by customer0_.id limit ? offset ? Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?)

Back to our test case. In my example, setting the batch size to 100 looks like a nice tuning opportunity. And indeed, when setting it to 100 the total execution time dropped to 188 millis (that's 132 (!!!) times faster than the worst result we had). The batch size can also be set globally by setting the hibernate.default_batch_fetch_size property for the session factory.

From http://www.jroller.com/eyallupu/
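The round-trip arithmetic in the @BatchSize discussion above (7 queries for a batch size of 16, 2 for a batch size of 50) can be checked with a short sketch. It only models the counting described in the article, assuming the pending collections are split into consecutive chunks of at most the batch size; the function names are made up, and Hibernate's own batch selection may differ between versions.

```python
import math


def collection_queries(num_customers, batch_size):
    """Number of SQL statements needed to load every customer's orders collection."""
    return math.ceil(num_customers / batch_size)


def batches(customer_ids, batch_size):
    """Split the pending customer ids into the groups bound to each IN (...) clause."""
    return [
        customer_ids[start:start + batch_size]
        for start in range(0, len(customer_ids), batch_size)
    ]


page = list(range(1, 101))  # one page of 100 hypothetical customer ids
in_clause_sizes = [len(batch) for batch in batches(page, 16)]
```

Here in_clause_sizes comes out as six groups of 16 followed by one group of 4, matching the seven statements in the log, and a batch size equal to the page size (100) needs a single statement for the whole page.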
June 9, 2008
by Eyal Lupu
· 256,239 Views · 7 Likes
Taking the New Swing Tree Table for a Spin
Announcing the new Swing Tree Table yesterday, Tim Boudreau writes:

    Usage is incredibly easy - you just provide a standard Swing TreeModel of whatever sort you like, and an additional RowModel that can be queried for the other columns contents, editability and so forth.

I found an example from some time ago, by Tim, and have been playing with it to get used to this new development. The result is as follows:

To get started, I simply downloaded the latest NetBeans IDE development build from netbeans.org and then attached the platform8/org-netbeans-swing-outline.jar to my Java SE project. For the rest, I wasn't required to do anything with NetBeans, necessarily. I could have attached the JAR to a project in Eclipse or anywhere else.

Then I created a JFrame. To work with this Swing tree table, you need to provide the new "org.netbeans.swing.outline.Outline" class with the new "org.netbeans.swing.outline.OutlineModel" which, in turn, is built from a plain old javax.swing.tree.TreeModel, together with the new "org.netbeans.swing.outline.RowModel". Optionally, to change the default rendering, you can use the new "org.netbeans.swing.outline.RenderDataProvider".

Let's first create a TreeModel for accessing files on disk.
We will receive the root of the file system as a starting point:

    private static class FileTreeModel implements TreeModel {

        private File root;

        public FileTreeModel(File root) {
            this.root = root;
        }

        @Override
        public void addTreeModelListener(javax.swing.event.TreeModelListener l) {
            //do nothing
        }

        @Override
        public Object getChild(Object parent, int index) {
            File f = (File) parent;
            return f.listFiles()[index];
        }

        @Override
        public int getChildCount(Object parent) {
            File f = (File) parent;
            if (!f.isDirectory()) {
                return 0;
            } else {
                return f.list().length;
            }
        }

        @Override
        public int getIndexOfChild(Object parent, Object child) {
            File par = (File) parent;
            File ch = (File) child;
            return Arrays.asList(par.listFiles()).indexOf(ch);
        }

        @Override
        public Object getRoot() {
            return root;
        }

        @Override
        public boolean isLeaf(Object node) {
            File f = (File) node;
            return !f.isDirectory();
        }

        @Override
        public void removeTreeModelListener(javax.swing.event.TreeModelListener l) {
            //do nothing
        }

        @Override
        public void valueForPathChanged(javax.swing.tree.TreePath path, Object newValue) {
            //do nothing
        }
    }

The above could simply be set as a JTree's model and then you'd have a plain old standard JTree. It would work, no problems, it would be a normal JTree. However, it wouldn't be a tree table since you'd only have a tree, without a table. Therefore, let's now add two extra columns, via the new "org.netbeans.swing.outline.RowModel" class, which will enable the creation of a tree table instead of a tree:

    private class FileRowModel implements RowModel {

        @Override
        public Class getColumnClass(int column) {
            switch (column) {
                case 0:
                    return Date.class;
                case 1:
                    return Long.class;
                default:
                    assert false;
            }
            return null;
        }

        @Override
        public int getColumnCount() {
            return 2;
        }

        @Override
        public String getColumnName(int column) {
            return column == 0 ? "Date" : "Size";
        }

        @Override
        public Object getValueFor(Object node, int column) {
            File f = (File) node;
            switch (column) {
                case 0:
                    return new Date(f.lastModified());
                case 1:
                    return new Long(f.length());
                default:
                    assert false;
            }
            return null;
        }

        @Override
        public boolean isCellEditable(Object node, int column) {
            return false;
        }

        @Override
        public void setValueFor(Object node, int column, Object value) {
            //do nothing for now
        }
    }

Now, after dragging-and-dropping an Outline object onto your JFrame (which is possible after adding the beans from the JAR to the NetBeans IDE Palette Manager) which, in turn, automatically creates a JScrollPane as well, this is how you could code the JFrame's constructor:

    public NewJFrame() {
        //Initialize the ui generated by the Matisse GUI Builder, which,
        //for example, adds the JScrollPane to the JFrame ContentPane:
        initComponents();
        //Here I am assuming we are not on Windows,
        //otherwise use Utilities.isWindows() ? 1 : 0
        //from the NetBeans Utilities API:
        TreeModel treeMdl = new FileTreeModel(File.listRoots()[0]);
        //Create the Outline's model, consisting of the TreeModel and the RowModel,
        //together with two optional values: a boolean for something or other,
        //and the display name for the first column:
        OutlineModel mdl = DefaultOutlineModel.createOutlineModel(
                treeMdl, new FileRowModel(), true, "File System");
        //Initialize the Outline object:
        outline1 = new Outline();
        //By default, the root is shown, while here that isn't necessary:
        outline1.setRootVisible(false);
        //Assign the model to the Outline object:
        outline1.setModel(mdl);
        //Add the Outline object to the JScrollPane:
        jScrollPane1.setViewportView(outline1);
    }

Alternatively, without the NetBeans Matisse GUI Builder and NetBeans Palette Manager, i.e., simply using a standard Java class, you could do something like this:

    private Outline outline;

    public NewJFrame() {
        setDefaultCloseOperation(EXIT_ON_CLOSE);
        getContentPane().setLayout(new BorderLayout());
        TreeModel treeMdl = new FileTreeModel(File.listRoots()[0]);
        OutlineModel mdl = DefaultOutlineModel.createOutlineModel(
                treeMdl, new FileRowModel(), true);
        outline = new Outline();
        outline.setRootVisible(false);
        outline.setModel(mdl);
        getContentPane().add(new JScrollPane(outline), BorderLayout.CENTER);
        setBounds(20, 20, 700, 400);
    }

At this point, you can run the JFrame, with this result:

So, we see a lot of superfluous info that doesn't look very nice. Let's implement "org.netbeans.swing.outline.RenderDataProvider", as follows:

    private class RenderData implements RenderDataProvider {

        @Override
        public java.awt.Color getBackground(Object o) {
            return null;
        }

        @Override
        public String getDisplayName(Object o) {
            return ((File) o).getName();
        }

        @Override
        public java.awt.Color getForeground(Object o) {
            File f = (File) o;
            if (!f.isDirectory() && !f.canWrite()) {
                return UIManager.getColor("controlShadow");
            }
            return null;
        }

        @Override
        public javax.swing.Icon getIcon(Object o) {
            return null;
        }

        @Override
        public String getTooltipText(Object o) {
            File f = (File) o;
            return f.getAbsolutePath();
        }

        @Override
        public boolean isHtmlDisplayName(Object o) {
            return false;
        }
    }

Now, back in the constructor, add the renderer to the outline:

    outline1.setRenderDataProvider(new RenderData());

Run the JFrame again and the result should be the same as in the first screenshot above. Look again at the rendering code and note that, for example, you have tooltips:
June 4, 2008
by Geertjan Wielenga
· 83,982 Views
Understanding HBase and BigTable
The hardest part about learning HBase (the open source implementation of Google's BigTable) is just wrapping your mind around the concept of what it actually is. I find it rather unfortunate that these two great systems contain the words table and base in their names, which tend to cause confusion among RDBMS indoctrinated individuals (like myself).

This article aims to describe these distributed data storage systems from a conceptual standpoint. After reading it, you should be better able to make an educated decision regarding when you might want to use HBase vs when you'd be better off with a "traditional" database.

It's all in the terminology

Fortunately, Google's BigTable Paper clearly explains what BigTable actually is. Here is the first sentence of the "Data Model" section:

    A Bigtable is a sparse, distributed, persistent multidimensional sorted map.

Note: At this juncture I like to give readers the opportunity to collect any brain matter which may have left their skulls upon reading that last line.

The BigTable paper continues, explaining that:

    The map is indexed by a row key, column key, and a timestamp; each value in the map is an uninterpreted array of bytes.

Along those lines, the HbaseArchitecture page of the Hadoop wiki posits that:

    HBase uses a data model very similar to that of Bigtable. Users store data rows in labelled tables. A data row has a sortable key and an arbitrary number of columns. The table is stored sparsely, so that rows in the same table can have crazily-varying columns, if the user likes.

Although all of that may seem rather cryptic, it makes sense once you break it down a word at a time. I like to discuss them in this sequence: map, persistent, distributed, sorted, multidimensional, and sparse. Rather than trying to picture a complete system all at once, I find it easier to build up a mental framework piecemeal, to ease into it...

map

At its core, HBase/BigTable is a map.
Depending on your programming language background, you may be more familiar with the terms associative array (PHP), dictionary (Python), Hash (Ruby), or Object (JavaScript). From the wikipedia article, a map is "an abstract data type composed of a collection of keys and a collection of values, where each key is associated with one value." Using JavaScript Object Notation, here's an example of a simple map where all the values are just strings:

    {
      "zzzzz" : "woot",
      "xyz" : "hello",
      "aaaab" : "world",
      "1" : "x",
      "aaaaa" : "y"
    }

persistent

Persistence merely means that the data you put in this special map "persists" after the program that created or accessed it is finished. This is no different in concept than any other kind of persistent storage such as a file on a filesystem. Moving along...

distributed

HBase and BigTable are built upon distributed filesystems so that the underlying file storage can be spread out among an array of independent machines. HBase sits atop either Hadoop's Distributed File System (HDFS) or Amazon's Simple Storage Service (S3), while a BigTable makes use of the Google File System (GFS). Data is replicated across a number of participating nodes in an analogous manner to how data is striped across disks in a RAID system.

For the purpose of this article, we don't really care which distributed filesystem implementation is being used. The important thing to understand is that it is distributed, which provides a layer of protection against, say, a node within the cluster failing.

sorted

Unlike most map implementations, in HBase/BigTable the key/value pairs are kept in strict alphabetical order. That is to say that the row for the key "aaaaa" should be right next to the row with key "aaaab" and very far from the row with key "zzzzz".
Continuing our JSON example, the sorted version looks like this:

    {
      "1" : "x",
      "aaaaa" : "y",
      "aaaab" : "world",
      "xyz" : "hello",
      "zzzzz" : "woot"
    }

Because these systems tend to be so huge and distributed, this sorting feature is actually very important. The spatial propinquity of rows with like keys ensures that when you must scan the table, the items of greatest interest to you are near each other.

This is important when choosing a row key convention. For example, consider a table whose keys are domain names. It makes the most sense to list them in reverse notation (so "com.jimbojw.www" rather than "www.jimbojw.com") so that rows about a subdomain will be near the parent domain row. Continuing the domain example, the row for the domain "mail.jimbojw.com" would be right next to the row for "www.jimbojw.com" rather than, say, "mail.xyz.com", which is what would happen if the keys were in regular domain notation.

It's important to note that the term "sorted" when applied to HBase/BigTable does not mean that "values" are sorted. There is no automatic indexing of anything other than the keys, just as it would be in a plain-old map implementation.

multidimensional

Up to this point, we haven't mentioned any concept of "columns", treating the "table" instead as a regular-old hash/map in concept. This is entirely intentional. The word "column" is another loaded word like "table" and "base" which carries the emotional baggage of years of RDBMS experience. Instead, I find it easier to think about this like a multidimensional map - a map of maps if you will. Adding one dimension to our running JSON example gives us this:

    {
      "1" : { "A" : "x", "B" : "z" },
      "aaaaa" : { "A" : "y", "B" : "w" },
      "aaaab" : { "A" : "world", "B" : "ocean" },
      "xyz" : { "A" : "hello", "B" : "there" },
      "zzzzz" : { "A" : "woot", "B" : "1337" }
    }

In the above example, you'll notice now that each key points to a map with exactly two keys: "A" and "B".
From here forward, we'll refer to the top-level key/map pair as a "row". Also, in BigTable/HBase nomenclature, the "A" and "B" mappings would be called "Column Families".

A table's column families are specified when the table is created, and are difficult or impossible to modify later. It can also be expensive to add new column families, so it's a good idea to specify all the ones you'll need up front. Fortunately, a column family may have any number of columns, denoted by a column "qualifier" or "label". Here's a subset of our JSON example again, this time with the column qualifier dimension built in:

    {
      // ...
      "aaaaa" : {
        "A" : { "foo" : "y", "bar" : "d" },
        "B" : { "" : "w" }
      },
      "aaaab" : {
        "A" : { "foo" : "world", "bar" : "domination" },
        "B" : { "" : "ocean" }
      },
      // ...
    }

Notice that in the two rows shown, the "A" column family has two columns: "foo" and "bar", and the "B" column family has just one column whose qualifier is the empty string (""). When asking HBase/BigTable for data, you must provide the full column name in the form "family:qualifier". So for example, both rows in the above example have three columns: "A:foo", "A:bar" and "B:".

Note that although the column families are static, the columns themselves are not. Consider this expanded row:

    {
      // ...
      "zzzzz" : {
        "A" : { "catch_phrase" : "woot" }
      }
    }

In this case, the "zzzzz" row has exactly one column, "A:catch_phrase". Because each row may have any number of different columns, there's no built-in way to query for a list of all columns in all rows. To get that information, you'd have to do a full table scan. You can however query for a list of all column families since these are immutable (more-or-less).

The final dimension represented in HBase/BigTable is time. All data is versioned either using an integer timestamp (seconds since the epoch), or another integer of your choice. The client may specify the timestamp when inserting data. Consider this updated example utilizing arbitrary integral timestamps:

    {
      // ...
      "aaaaa" : {
        "A" : {
          "foo" : { 15 : "y", 4 : "m" },
          "bar" : { 15 : "d" }
        },
        "B" : {
          "" : { 6 : "w", 3 : "o", 1 : "w" }
        }
      },
      // ...
    }

Each column family may have its own rules regarding how many versions of a given cell to keep (a cell is identified by its rowkey/column pair). In most cases, applications will simply ask for a given cell's data, without specifying a timestamp. In that common case, HBase/BigTable will return the most recent version (the one with the highest timestamp) since it stores these in reverse chronological order. If an application asks for a given row at a given timestamp, HBase will return cell data where the timestamp is less than or equal to the one provided.

Using our imaginary HBase table, querying for the row/column of "aaaaa"/"A:foo" will return "y" while querying for the row/column/timestamp of "aaaaa"/"A:foo"/10 will return "m". Querying for a row/column/timestamp of "aaaaa"/"A:foo"/2 will return a null result.

sparse

The last keyword is sparse. As already mentioned, a given row can have any number of columns in each column family, or none at all. The other type of sparseness is row-based gaps, which merely means that there may be gaps between keys. This, of course, makes perfect sense if you've been thinking about HBase/BigTable in the map-based terms of this article rather than in terms of seemingly similar RDBMS concepts.

And that's about it

Well, I hope that helps you understand conceptually what the HBase data model feels like. As always, I look forward to your thoughts, comments and suggestions.
May 22, 2008
by Jim Wilson
· 84,618 Views · 5 Likes
Python and the Star Schema
The star schema represents data as a table of facts (measurable values) that are associated with the various dimensions of the fact. Common dimensions include time, geography, organization, product and the like. I'm working with some folks whose facts are a bunch of medical test results, and the dimensions are patient, date, and a facility in which the tests were performed.

I got an email with the following situation: "a client who is processing gigs of incoming fact data each day and they use a host of C/C++, Perl, mainframe and other tools for their incoming fact processing and I've seriously considered pushing Python in their organization."

Here are my thoughts on using Python for data warehousing when you've got GB of data daily.

Small Dimensions

The pure Python approach only works when your dimension will comfortably fit into memory -- not a terribly big problem with most dimensions. Specifically, it doesn't work well for those dimensions which are so huge that the dimensional model becomes a snowflake instead of a simple star. When dealing with a large number of individuals (public utilities, banks, medical management, etc.) the "customer" (or "patient") dimension gets too big to fit into memory. Special bridge-table techniques must be used. I don't think Python would be perfect for this, since this involves slogging through a lot of data one record at a time. However, Python is considerably faster than PL/SQL. I don't know how it compares with Perl. Any programming language will be faster than any SQL procedure, because there's no RDBMS overhead.

For all small dimensions:

  • Load the dimension values from the RDBMS into a dict with a single query.
  • Read all source data records (ideally from a flat file); conform the dimension, tracking changes; write a result record with the dimension FK information to a flat file.
  • Iterate through the dimension dictionary and persist the dimension changes.
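The steps above can be sketched in a few lines of Python. This is a minimal in-memory sketch under stated assumptions, not the author's code: the conform function, the facility example, the fk_field parameter and the integer surrogate keys are all invented for illustration, flat-file and database I/O are replaced by plain lists and dicts, and Slowly Changing Dimension rules are ignored entirely.

```python
def conform(source_rows, dimension, key_fields, fk_field):
    """Attach a dimension FK to every source row, adding unseen dimension values.

    `dimension` maps an identifying tuple to a surrogate key (hypothetical layout).
    Returns the conformed rows and the list of identifying tuples that were added.
    """
    next_key = max(dimension.values(), default=0) + 1
    added = []
    conformed = []
    for row in source_rows:
        ident = tuple(row[field] for field in key_fields)
        if ident not in dimension:
            # New dimension value: remember it so it can be persisted afterwards.
            dimension[ident] = next_key
            next_key += 1
            added.append(ident)
        conformed.append({**row, fk_field: dimension[ident]})
    return conformed, added


# In practice this dict would be loaded from the RDBMS with a single query.
facility_dim = {("F1",): 1}
# In practice these rows would be read from a flat file.
source = [
    {"facility": "F1", "result": 4.2},
    {"facility": "F2", "result": 5.0},
    {"facility": "F1", "result": 3.9},
]
rows, new_values = conform(source, facility_dim, ["facility"], "facility_fk")
```

After the pass, rows carry the facility_fk values 1, 2 and 1, and new_values lists the single facility that still has to be written back to the dimension table.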
The details vary with the Slowly Changing Dimension (SCD) rules you're using. The conformance algorithm is essentially the following:

    row = Dimension(...)
    ident = ( row.field, row.field, row.field, ... )
    dimension.setdefault( ident, row )

In some cases (like the Django ORM) this is called the get-or-create query.

The Dimension Bus

For BIG dimensions, I think you still have to implement the "dimension bus" outlined in The Data Warehouse Toolkit. To do this in Python, you should probably design things to look something like the following.

For any big dimensions, use an external sort-merge utility. Seriously. They're way fast for data sets too large to fit into memory. Use CSV format files and the resulting program is very tidy.

The outline is as follows:

  • First, sort the source data file into order by the identifying fields of the big dimension (customer number, patient number, whatever).
  • Second, query the big dimension into a data file and sort it into the same order as the source file. (Using the SQL ORDER BY may be slower than an external sort; only measurements can tell which is faster.)
  • Third, do a "match merge" to locate the differences between the dimension and the source. Don't use a utility like diff; it's too slow. This is a simple key matching between two files.

The match-merge loop looks something like this.

    src = next(sourceFile)
    dim = next(dimensionFile)
    try:
        while True:
            src_key = ( src['field'], src['field'], ... )
            dim_key = ( dim['field'], dim['field'], ... )
            if src_key < dim_key:
                # missing some dimension values
                update_dimension( src )
                src = next(sourceFile)
            elif dim_key < src_key:
                # extra dimension values
                dim = next(dimensionFile)
            else:
                # src and dim keys match
                # check non-key attributes for dimension change.
                src = next(sourceFile)
    except StopIteration:
        # if source is at end-of-file, that's good, we're done.
        # if dim is at end of file, all remaining src rows are dimension updates
        # (including the src row already in hand, which this sketch does not handle).
        for src in sourceFile:
            update_dimension( src )

At the end of this pass, you'll accumulate a file of customer dimension adds and changes, which is then persisted into the actual customer dimension in the database. This pass will also write new source records with the customer FK. You can also handle demographic or bridge tables at this time.

Fact Loading

The first step in DW loading is dimensional conformance. With a little cleverness the above processing can all be done in parallel, hogging a lot of CPU time. To do this in parallel, each conformance algorithm forms part of a large OS-level pipeline. The source file must be reformatted to leave empty columns for each dimension's FK reference. Each conformance process reads in the source file and writes out the same format file with one dimension FK filled in. If all of these conformance algorithms form a simple OS pipe, they all run in parallel. It looks something like this.

    src2csv source | conform1 | conform2 | conform3 | load

At the end, you use the RDBMS's bulk loader (or write your own in Python, it's easy) to pick the actual fact values and the dimension FKs out of the source records that are fully populated with all dimension FKs and load these into the fact table.

I've written conformance processing in Java (which is faster than Python) and had to give up on SQL-based conformance for large dimensions. Instead, we did the above flat-file algorithm to merge large dimensions. The killer isn't the language speed, it's the RDBMS overhead. Once you're out of the database, things blaze. Indeed, products like the Syncsort data sort can do portions of the dimension conformance at amazing speeds for large datasets.

Hand Wringing

"But," the hand-wringers say, "aren't you defeating the value of the RDBMS by working outside it?" The answer is NO. We're not doing incremental, transactional processing here. There aren't multiple update transactions in a warehouse. There are queries and there are bulk loads.
Doing the prep-work for a bulk load outside the database is simply more efficient. We don't need locks, rollback segments, memory management, threading, concurrency, ACID rules or anything. We just need to match-merge the large dimension and the incoming facts.
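
To make the match-merge pass from The Dimension Bus concrete, here is one possible runnable sketch. It is a different formulation from the loop in the article: a generator that walks the sorted source and advances the sorted dimension alongside it, so that running out of dimension rows needs no special case. The name `match_merge` and the `key` callable are assumptions for illustration, and both inputs are assumed to be sorted ascending by the same key with the same collation.

```python
def match_merge(source_rows, dimension_rows, key):
    """Yield (src, dim) pairs for two inputs sorted ascending by key(row).

    dim is the matching dimension row, or None when the source key is
    missing from the dimension (a candidate dimension add).
    """
    dims = iter(dimension_rows)
    dim = next(dims, None)
    for src in source_rows:
        src_key = key(src)
        # Skip dimension rows that sort before this source row ("extra" values).
        while dim is not None and key(dim) < src_key:
            dim = next(dims, None)
        if dim is not None and key(dim) == src_key:
            # Keys match: the caller can compare non-key attributes for changes.
            yield src, dim
        else:
            # No dimension row for this key; repeated source keys are reported
            # each time, so the caller should de-duplicate the adds.
            yield src, None


if __name__ == "__main__":
    source = [{"cust": "A1"}, {"cust": "A1"}, {"cust": "B7"}, {"cust": "Z9"}]
    dimension = [{"cust": "A1", "id": 10}, {"cust": "C3", "id": 11}]
    for src, dim in match_merge(source, dimension, key=lambda r: r["cust"]):
        print(src["cust"], "matched" if dim else "new")
```

With CSV files, `source_rows` and `dimension_rows` could be `csv.DictReader` objects over the externally sorted files, so neither file is ever loaded fully into memory.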
May 20, 2008
by Steven Lott
· 11,292 Views · 1 Like