
Google App Engine - Where Does It Fit?

This is yet another blog post about the recent release of Java support on Google App Engine (GAE). Naturally, as an interested party, I rushed to deploy my sample GridGain app on it, but quickly realized that you cannot deploy any sort of clustering or grid application on the Google App Engine. This is simply not something they wish to support, happily leaving this portion of the market to Amazon EC2.

A good way to think about GAE for Java is as plain vanilla servlet container hosting. If you have a standard J2EE app that simply processes web requests and accesses DB data, then GAE is an ideal solution for you and will be much easier to use than Amazon EC2. But if your application needs to do anything beyond retrieving and showing DB data, then with GAE you will run into a brick wall. So, if GAE fits, it fits like a glove, but if it doesn't, there is nothing you can do to make it fit.

Here is the list of some limitations you may wish to consider when starting with GAE:

1. You have no control over the number of deployment instances
This may be a non-issue for trivial use cases, but you immediately hit a wall if your application requires any knowledge of clustering or even of threads. For example, let's say you need to generate a web report that would normally take about 1 minute to generate in a single thread, which is too long for a web request (GAE will automatically time it out after 30 seconds). Ideally, to make it run faster, you would split it into multiple sub-parts, run them in parallel on several nodes, aggregate the results and produce a response to the user within, say, 5 seconds.

However, there is nothing you can do to speed it up on GAE. Firstly, you have no idea how many parts to split the report into, and even if you did, there is no way to tell GAE that every part should run on a separate CPU or on a separate server instance altogether. On top of that, what if 9 out of 10 sub-parts completed successfully and 1 failed? Again, you cannot instruct GAE to retry just the part that failed - the whole request will have to be retried.
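To make the report example concrete, here is a minimal sketch of the split / run in parallel / aggregate / retry-only-the-failed-part pattern, written with plain java.util.concurrent on a single JVM. It is an illustration of what you would want to do, not something GAE lets you control: the class name ParallelReport, the generate method and the partBuilder callback are all hypothetical, and threads in a local pool stand in for the separate nodes a real grid would use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

// Hypothetical helper: builds a report from independent sub-parts.
public class ParallelReport {

    // Runs all parts in parallel and re-submits only the parts that failed,
    // up to maxAttempts rounds. Returns the parts joined in their original order.
    public static String generate(int parts, int maxAttempts, IntFunction<String> partBuilder) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            String[] results = new String[parts];
            List<Integer> pending = new ArrayList<>();
            for (int i = 0; i < parts; i++) {
                pending.add(i);
            }

            for (int attempt = 1; attempt <= maxAttempts && !pending.isEmpty(); attempt++) {
                // Submit every part that has not succeeded yet.
                List<Future<String>> futures = new ArrayList<>();
                for (int part : pending) {
                    Callable<String> task = () -> partBuilder.apply(part);
                    futures.add(pool.submit(task));
                }

                // Collect results; remember which parts need another round.
                List<Integer> failed = new ArrayList<>();
                for (int k = 0; k < pending.size(); k++) {
                    int part = pending.get(k);
                    try {
                        results[part] = futures.get(k).get();
                    } catch (ExecutionException e) {
                        failed.add(part);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("Interrupted while waiting for part " + part, e);
                    }
                }
                pending = failed;
            }

            if (!pending.isEmpty()) {
                throw new IllegalStateException("Parts still failing after retries: " + pending);
            }
            // Aggregate the sub-results into one response.
            return String.join("\n", results);
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as ParallelReport.generate(10, 3, i -> buildPart(i)), where buildPart is whatever produces one slice of the report, would re-run only the slices that threw an exception instead of the whole request.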

2. You have no control over load balancing
Not every application is a website, and not every application can be measured in number of page hits. The underlying notion here is that if you are not a website, then GAE is not for you. However, even if you are a website, not being able to control load balancing can be quite limiting. Take the report generation example above: GAE gives you no way to spread the sub-parts of a single request across instances.

3. You cannot use any of the existing clustering infrastructure you have
If you already make use of clustering within your application (discovery of different app server nodes, exchanging messages between app servers, using existing caching or compute grid products, etc.), you cannot do it in GAE. You are limited to the set of caching and data storage libraries provided by GAE.

4. You cannot use the full set of JDK classes or services
Google has published a JRE White List of the JDK classes supported by the App Engine. The main limitation there is the lack of AWT and Swing packages. Although not often used in web apps, this may be quite limiting for apps that use AWT for purposes other than thick UI, such as dynamic image generation. I should mention that GAE does offer its own image manipulation service to compensate for this.

5. You cannot access or store files
The only way to access a file is to put it in the WAR and then read it off the classpath. Creation of files is prohibited - you can only store data using GAE datastore services. This alone can become a show-stopper for many applications: even if your application does not access or store any files itself, it may very well depend on libraries that do.
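As a rough sketch of what "reading it off the classpath" looks like in practice, the helper below loads a text resource through the class loader instead of java.io.File. The class name ClasspathFiles and the resource name in the usage note are made up for illustration; only the standard ClassLoader.getResourceAsStream call is assumed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper: reads a text resource packaged with the application.
public class ClasspathFiles {

    // Returns the resource content as UTF-8 text, or Optional.empty()
    // if no resource with that name is on the classpath.
    public static Optional<String> read(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathFiles.class.getClassLoader();
        }
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in == null) {
                return Optional.empty();
            }
            // Copy the stream into memory; nothing is written to disk.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return Optional.of(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, ClasspathFiles.read("report-template.txt") would return the content of a (hypothetical) report-template.txt placed under WEB-INF/classes, and an empty Optional if the file was never packaged.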

6. You don't get any access to the box your app is deployed on
From GAE's standpoint, this limitation is justified. GAE runs multiple WARs from different users on the same app server, so by giving you direct access to the box, they would be giving you access to code that does not belong to you. On the other hand, not being able to access the box will scare many traditional sys admins who generally like to poke around production environments, look at OS-specific logs, do tuning, and other geeky environment and runtime stuff.

I will stop listing limitations here. I think it is quite clear that GAE is not targeting enterprises as its customers. On the contrary, it looks like they are willingly leaving this portion of the market to Amazon EC2 and concentrating on simple web application hosting, which is a huge market on its own. The way they are beating Amazon here is by being elegantly simpler to use and deploy - there is no image creation, nor does the user need to be concerned about the number of images that are started. GAE will automatically manage web load and add CPUs to your app as load changes.

So, if you are a small business owner and need a quick and cheap hosting solution - go with GAE!

From http://gridgain.blogspot.com/
