Deployment Resources

The Latest Deployment Topics

Bridging between JMS and RabbitMQ (AMQP) using Spring Integration

An old customer recently asked me if I had a solution for how to integrate between their existing JMS infrastructure on Websphere MQ with RabbitMQ. Although I know that RabbitMQ has the shovel plugin which can bridge between Rabbit instances I've yet not found a good plugin for JMS <-> AMQP forwarding. The first thing that came to my mind was to utilize a Spring Integration mediation as SI has excellent support for both JMS and Rabbit. Curious as I am I started a PoC and this is the result. It takes messages of a JMS queue and forwards to an AMQP exchange that is bound to a queue the consumer application is supposed to listen to. I used an external HornetQ instance in JBoss 6.1 as the JMS Provider, but I am 100% secure that the same setup would work for Websphere MQ as they both implement JMS. Be aware that I've done no performance tweaking or QoS setup yet as this is just a proof-of-concept. For a real setup you'd probably have to think about delivery guarantees versus performance and etc... The code will be available at a GitHub repository near you soon.. SpringContext in XML: org.jnp.interfaces.NamingContextFactory jnp://localhost:1099 org.jnp.interfaces:org.jboss.naming ConnectionFactory Maven POM: 4.0.0 org.rl si.jmstorabbit 0.0.1-SNAPSHOT jar si.jmstorabbit http://maven.apache.org UTF-8 2.2.5.Final 2.1.0.RELEASE springsource-release http://repository.springsource.com/maven/bundles/release false springsource-external http://repository.springsource.com/maven/bundles/external false org.springframework.integration spring-integration-core ${spring.integration.version} org.springframework.integration spring-integration-file ${spring.integration.version} org.springframework.integration spring-integration-amqp ${spring.integration.version} org.springframework.integration spring-integration-jms ${spring.integration.version} junit junit 3.8.1 test org.springframework spring-context 3.0.7.RELEASE jboss jnp-client 4.2.2.GA org.hornetq hornetq-core-client ${hornet.version} org.hornetq hornetq-jms-client ${hornet.version} org.hornetq hornetq-jms ${hornet.version} jboss jboss-common-client 3.2.3 org.jboss.netty netty 3.2.7.Final javax.jms jms 1.1

April 24, 2012

by Billy Sjöberg

· 30,074 Views

Amazon EMR Tutorial: Running a Hadoop MapReduce Job Using Custom JAR

See original post at https://muhammadkhojaye.blogspot.com/2012/04/how-to-run-amazon-elastic-mapreduce-job.html Introduction Amazon EMR is a web service which can be used to easily and efficiently process enormous amounts of data. It uses a hosted Hadoop framework running on the web-scale infrastructure of Amazon EC2 and Amazon S3. Amazon EMR removes most of the cumbersome details of Hadoop while taking care of provisioning of Hadoop, running the job flow, terminating the job flow, moving the data between Amazon EC2 and Amazon S3, and optimizing Hadoop. In this tutorial, we will use a developed WordCount Java example using Hadoop and thereafter, we execute our program on Amazon Elastic MapReduce. Prerequisites You must have valid AWS account credentials. You should also have a general familiarity with using the Eclipse IDE before you begin. The reader can also use any other IDE of their choice. Step 1 – Develop MapReduce WordCount Java Program In this section, we are first going to develop a WordCount application. A WordCount program will determine how many times different words appear in a set of files. In Eclipse (or whatever the IDE you are using), Create simple Java Project with the name "WordCount". Create a java class name Map and override the map method as follow, public class Map extends Mapper { private final static IntWritable one = new IntWritable(1); private Text word = new Text(); @Override public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException { String line = value.toString(); StringTokenizer tokenizer = new StringTokenizer(line); while (tokenizer.hasMoreTokens()) { word.set(tokenizer.nextToken()); context.write(word, one); } } } Create a java class named Reduce and override the reduce method as shown below, public class Reduce extends Reducer { @Override protected void reduce(Text key, java.lang.Iterable values, org.apache.hadoop.mapreduce.Reducer.Context context) throws IOException, InterruptedException { int sum = 0; for (IntWritable value : values) { sum += value.get(); } context.write(key, new IntWritable(sum)); } } Create a java class named WordCount and defined the main method as below, public static void main(String[] args) throws Exception { Configuration conf = new Configuration(); Job job = new Job(conf, "wordcount"); job.setJarByClass(WordCount.class); job.setOutputKeyClass(Text.class); job.setOutputValueClass(IntWritable.class); job.setMapperClass(Map.class); job.setReducerClass(Reduce.class); job.setInputFormatClass(TextInputFormat.class); job.setOutputFormatClass(TextOutputFormat.class); FileInputFormat.addInputPath(job, new Path(args[0])); FileOutputFormat.setOutputPath(job, new Path(args[1])); job.waitForCompletion(true); } Export the WordCount program in a jar using eclipse and save it to some location on disk. Make sure that you have provided the Main Class (WordCount.jar) during extraction ofu8u the jar file as shown below. Our jar is ready!!! Step 2 – Upload the WordCount JAR and Input Files to Amazon S3 Now we are going to upload the WordCount jar to Amazon S3. First, go to the following URL: https://console.aws.amazon.com/s3/home Next, click “Create Bucket”, give your bucket a name, and click the “Create” button. Select your new S3 bucket in the left-hand pane. Upload the WordCount JAR and sample input file for counting the words. Step 3 – Running an Elastic MapReduce job Now that the JAR is uploaded into S3, all we need to do is to create a new Job flow. let's execute the steps below. (I encourage readers to check out the following link for details regarding each step, How to Create a Job Flow Using a Custom JAR ) Sign in to the AWS Management Console and open the Amazon Elastic MapReduce console at https://console.aws.amazon.com/elasticmapreduce/ Click Create New Job Flow. In the DEFINE JOB FLOW page, enter the following details, a) Job Flow Name = WordCountJob b) Select Run your own applications) Select Custom JAR in the drop-down list) Click Continue In the SPECIFY PARAMETERS page, enter values in the boxes using the following table as a guide, and then click Continue.JAR Location = bucketName/jarFileLocationJAR Arguments =s3n://bucketName/inputFileLocations3n://bucketName/outputpath Please note that the output path must be unique each time we execute the job. The Hadoop always create a folder with the same name specified here. After executing the job, just wait and monitor your job that runs through the Hadoop flow. You can also look for errors by using the Debug button. The job should be complete within 10 to 15 minutes (can also depend on the size of the input). After completing the job, You can view results in the S3 Browser panel. You can also download the files from S3 and can analyze the outcome of the job. Amazon Elastic MapReduce Resources Amazon Elastic MapReduce Documentation,http://aws.amazon.com/documentation/elasticmapreduce/ Amazon Elastic MapReduce Getting Started Guide,http://docs.amazonwebservices.com/ElasticMapReduce/latest/GettingStartedGuide/ Amazon Elastic MapReduce Developer Guide,http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/ Apache Hadoop,http://hadoop.apache.org/ See more at https://muhammadkhojaye.blogspot.com/2012/04/how-to-run-amazon-elastic-mapreduce-job.html

April 23, 2012

by Muhammad Ali Khojaye

· 59,039 Views

Playing Sounds in Android

Let's take a closer look at how to play sounds on an Android device with SoundPool and MediaPlayer.

April 13, 2012

by Tony Siciliani

· 95,480 Views · 2 Likes

How to Use Sigma.js with Neo4j

i’ve done a few posts recently using d3.js and now i want to show you how to use two other great javascript libraries to visualize your graphs. we’ll start with sigma.js and soon i’ll do another post with three.js . we’re going to create our graph and group our nodes into five clusters. you’ll notice later on that we’re going to give our clustered nodes colors using rgb values so we’ll be able to see them move around until they find their right place in our layout. we’ll be using two sigma.js plugins, the gefx (graph exchange xml format) parser and the forceatlas2 layout. you can see what a gefx file looks like below. notice it comes from gephi which is an interactive visualization and exploration platform, which runs on all major operating systems, is open source, and is free. ... ... in order to build this file, we will need to get the nodes and edges from the graph and create an xml file. get '/graph.xml' do @nodes = nodes @edges = edges builder :graph end we’ll use cypher to get our nodes and edges: def nodes neo = neography::rest.new cypher_query = " start node = node:nodes_index(type='user')" cypher_query << " return id(node), node" neo.execute_query(cypher_query)["data"].collect{|n| {"id" => n[0]}.merge(n[1]["data"])} end we need the node and relationship ids, so notice i’m using the id() function in both cases. def edges neo = neography::rest.new cypher_query = " start source = node:nodes_index(type='user')" cypher_query << " match source -[rel]-> target" cypher_query << " return id(rel), id(source), id(target)" neo.execute_query(cypher_query)["data"].collect{|n| {"id" => n[0], "source" => n[1], "target" => n[2]} } end so far we have seen graphs represented as json, and we’ve built these manually. today we’ll take advantage of the builder ruby gem to build our graph in xml. xml.instruct! :xml xml.gexf 'xmlns' => "http://www.gephi.org/gexf", 'xmlns:viz' => "http://www.gephi.org/gexf/viz" do xml.graph 'defaultedgetype' => "directed", 'idtype' => "string", 'type' => "static" do xml.nodes :count => @nodes.size do @nodes.each do |n| xml.node :id => n["id"], :label => n["name"] do xml.tag!("viz:size", :value => n["size"]) xml.tag!("viz:color", :b => n["b"], :g => n["g"], :r => n["r"]) xml.tag!("viz:position", :x => n["x"], :y => n["y"]) end end end xml.edges :count => @edges.size do @edges.each do |e| xml.edge:id => e["id"], :source => e["source"], :target => e["target"] end end end end you can get the code on github as usual and see it running live on heroku. you will want to see it live on heroku so you can see the nodes in random positions and then move to form clusters. use your mouse wheel to zoom in, and click and drag to move around. credit goes out to alexis jacomy and mathieu jacomy . you’ve seen me create numerous random graphs, but for completeness here is the code for this graph. notice how i create 5 clusters and for each node i assign half its relationships to other nodes in their cluster and half to random nodes? this is so the forceatlas2 layout plugin clusters our nodes neatly. def create_graph neo = neography::rest.new graph_exists = neo.get_node_properties(1) return if graph_exists && graph_exists['name'] names = 500.times.collect{|x| generate_text} clusters = 5.times.collect{|x| {:r => rand(256), :g => rand(256), :b => rand(256)} } commands = [] names.each_index do |n| cluster = clusters[n % clusters.size] commands << [:create_node, {:name => names[n], :size => 5.0 + rand(20.0), :r => cluster[:r], :g => cluster[:g], :b => cluster[:b], :x => rand(600) - 300, :y => rand(150) - 150 }] end names.each_index do |from| commands << [:add_node_to_index, "nodes_index", "type", "user", "{#{from}"] connected = [] # create clustered relationships members = 20.times.collect{|x| x * 10 + (from % clusters.size)} members.delete(from) rels = 3 rels.times do |x| to = members[x] connected << to commands << [:create_relationship, "follows", "{#{from}", "{#{to}"] unless to == from end # create random relationships rels = 3 rels.times do |x| to = rand(names.size) commands << [:create_relationship, "follows", "{#{from}", "{#{to}"] unless (to == from) || connected.include?(to) end end batch_result = neo.batch *commands end

April 12, 2012

by Max De Marzi

· 15,385 Views

Using Maven's -U Command Line Option

My prefered solution was to use the Maven ‘update snapshots’ command line argument.

March 11, 2012

by Roger Hughes

· 106,928 Views · 1 Like

Why You Need a Git Pre-Commit Hook and Why Most Are Wrong

a pre-commit hook is a piece of code that runs before every commit and determines whether or not the commit should be accepted. think of it as the gatekeeper to your codebase. want to ensure you didn’t accidentally leave any pdb s in your code? pre-commit hook. want to make sure your javascript is jshint approved? pre-commit hook. want to guarantee clean, readable pep8 -compliant code? pre-commit hook. want to pipe all of the comments in your codebase through strunk & white ? please don’t. the pre-commit hook is just an executable file that runs before every commit. if it exits with zero status, the commit is accepted. if it exits with a non-zero status, the commit is rejected. (note: a pre-commit hook can be bypassed by passing the --no-verify argument.) along with the pre-commit hook there are numerous other git hooks that are available: post-commit, post-merge, pre-receive, and others that can be found here . why most pre-commit hooks are wrong be wary of the above’s example as the majority of pre-commit hooks you’ll see on the web are wrong. most test against whatever files are currently on disk, not what is in the staging area (the files actually being committed). we avoid this in our hook by stashing all changes that are not part of the staging area before running our checks and then popping the changes afterwards. this is very important because a file could be fine on disk while the changes that are being committed are wrong. the code below is the pre-commit hook we use at yipit. our hook is simply a set of checks to be run against any files that have been modified in this commit. each check can be configured to include/exclude particular types of files. it is designed for a django environment, but should be adaptable to other environments with minor changes. note that you need git 1.7.7+ #!/usr/bin/env python import os import re import subprocess import sys modified = re.compile('^(?:m|a)(\s+)(?p.*)') checks = [ { 'output': 'checking for pdbs...', 'command': 'grep -n "import pdb" %s', 'ignore_files': ['.*pre-commit'], 'print_filename': true, }, { 'output': 'checking for ipdbs...', 'command': 'grep -n "import ipdb" %s', 'ignore_files': ['.*pre-commit'], 'print_filename': true, }, { 'output': 'checking for print statements...', 'command': 'grep -n print %s', 'match_files': ['.*\.py$'], 'ignore_files': ['.*migrations.*', '.*management/commands.*', '.*manage.py', '.*/scripts/.*'], 'print_filename': true, }, { 'output': 'checking for console.log()...', 'command': 'grep -n console.log %s', 'match_files': ['.*yipit/.*\.js$'], 'print_filename': true, }, { 'output': 'checking for debugger...', 'command': 'grep -n debugger %s', 'match_files': ['.*\.js$'], 'print_filename': true, }, { 'output': 'running jshint...', # by default, jshint prints 'lint free!' upon success. we want to filter this out. 'command': 'jshint %s | grep -v "lint free!"', 'match_files': ['.*yipit/.*\.js$'], 'print_filename': false, }, { 'output': 'running pyflakes...', 'command': 'pyflakes %s', 'match_files': ['.*\.py$'], 'ignore_files': ['.*settings/.*', '.*manage.py', '.*migrations.*', '.*/terrain/.*'], 'print_filename': false, }, { 'output': 'running pep8...', 'command': 'pep8 -r --ignore=e501,w293 %s', 'match_files': ['.*\.py$'], 'ignore_files': ['.*migrations.*'], 'print_filename': false, }, { 'output': 'checking for sass changes...', 'command': 'sass --quiet --update %s', 'match_files': ['.*\.scss$'], 'print_filename': true, }, ] def matches_file(file_name, match_files): return any(re.compile(match_file).match(file_name) for match_file in match_files) def check_files(files, check): result = 0 print check['output'] for file_name in files: if not 'match_files' in check or matches_file(file_name, check['match_files']): if not 'ignore_files' in check or not matches_file(file_name, check['ignore_files']): process = subprocess.popen(check['command'] % file_name, stdout=subprocess.pipe, stderr=subprocess.pipe, shell=true) out, err = process.communicate() if out or err: if check['print_filename']: prefix = '\t%s:' % file_name else: prefix = '\t' output_lines = ['%s%s' % (prefix, line) for line in out.splitlines()] print '\n'.join(output_lines) if err: print err result = 1 return result def main(all_files): # stash any changes to the working tree that are not going to be committed subprocess.call(['git', 'stash', '-u', '--keep-index'], stdout=subprocess.pipe) files = [] if all_files: for root, dirs, file_names in os.walk('.'): for file_name in file_names: files.append(os.path.join(root, file_name)) else: p = subprocess.popen(['git', 'status', '--porcelain'], stdout=subprocess.pipe) out, err = p.communicate() for line in out.splitlines(): match = modified.match(line) if match: files.append(match.group('name')) result = 0 print 'running django code validator...' return_code = subprocess.call('$virtual_env/bin/python manage.py validate', shell=true) result = return_code or result for check in checks: result = check_files(files, check) or result # unstash changes to the working tree that we had stashed subprocess.call(['git', 'reset', '--hard'], stdout=subprocess.pipe, stderr=subprocess.pipe) subprocess.call(['git', 'stash', 'pop', '-q'], stdout=subprocess.pipe, stderr=subprocess.pipe) sys.exit(result) if __name__ == '__main__': all_files = false if len(sys.argv) > 1 and sys.argv[1] == '--all-files': all_files = true main(all_files) to use this hook or a hook that you create yourself, simply copy the file to .git/hooks/pre-commit inside of your project and make sure that it is executable or add in to your git repo and setup a symlink.

March 3, 2012

by Steve Pulec

· 23,490 Views · 1 Like

Creating a build pipeline using Maven, Jenkins, Subversion and Nexus.

for a while now, we had been operating in the wild west when it comes to building our applications and deploying to production. builds were typically done straight from the developer’s ide and manually deployed to one of our app servers. we had a manual process in place, where the developer would do the following steps. check all project code into subversion and tag build the application. archive the application binary to a network drive deploy to production update our deployment wiki with the date and version number of the app that was just deployed. the problem is that there were occasionally times where one of these steps were missed, and it always seemed to be at a time when we needed to either rollback to the previous version, or branch from the tag to do a bugfix. sometimes the previous version had not been archived to the network, or the developer forgot to tag svn. we were already using jenkins to perform automated builds, so we wanted to look at extending it further to perform release builds. the maven release plug-in provides a good starting point for creating an automated release process. we have also just started using the nexus maven repository and wanted to incorporate that as well to archive our binaries to, rather than archiving them to a network drive. the first step is to set up the project’s pom file with the deploy plugin as well as include configuration information about our nexus and subversion repositories. org.apache.maven.plugins maven-release-plugin 2.2.2 http://mks:8080/svn/jrepo/tags/frameworks/siestaframework the release plugin configuration is pretty straightforward. the configuration takes the subversion url of the location where the tags will reside for this project. the next step is to configure the svn location where the code will be checked out from. scm:svn:http://mks:8080/svn/jrepo/trunk/frameworks/siestaframework http://mks:8080/svn the last step in configuring the project is to set up the location where the binaries will be archived to. in our case, the nexus repository. lynden-java-release lynden release repository http://cisunwk:8081/nexus/content/repositories/lynden-java-release the project is now ready to use the maven release plug-in. the release plugin provides a number of useful goals. release:clean – cleans the workspace in the event the last release process was not successful. release: prepare – performs a number of operations checks to make sure that there are no uncommitted changes. ensures that there are no snapshot dependencies in the pom file, changes the version of the application and removes snapshot from the version. ie 1.0.3-snapshot becomes 1.0.3 run project tests against modified poms commit the modified pom tag the code in subersion increment the version number and append snapshot. ie 1.0.3 becomes 1.0.4-snapshot commit modified pom release: perform – performs the release process checks out the code using the previously defined tag runs the deploy maven goal to move the resulting binary to the repository. putting it all together the last step in this process is to configure jenkins to allow release builds on-demand, meaning we want the user to have to explicitly kick off a release build for this process to take place. we have download and installed the release jenkins plug-in in order to allow developers to kick off release builds from jenkins. the release plug-in will execute tasks after the normal build has finished. below is a screenshot of the configuration of one of our projects. the release build option for the project is enabled by selecting the “configure release build” option in the “build environment” section. the maven release plug-in is activated by adding the goals to the “after successful release build” section. (the –b option enables batch mode so that the release plug-in will not ask the user for input, but use defaults instead.) once the release option has been configured for a project there will be a “release” icon on the left navigation menu for the project. selecting this will kick off a build and then the maven release process, assuming the build succeeds. finally a look at svn and nexus verifies that the build for version 1.0.4 of the siesta-framework project has been tagged in svn and uploaded to nexus. the next steps for this project will be to generate release notes for release builds, and also to automate a deployment pipeline, so that developers can deploy to our test, staging and production servers via jenkins rather than manually from their development workstations. twitter: @robterp blog: http://rterp.wordpress.com

February 29, 2012

by Rob Terpilowski

· 86,823 Views

Separating Integration and Unit Tests with Maven, Sonar, Failsafe, and JaCoCo

Execute the slow integration tests separately from unit tests and show as much information about them as possible in Sonar.

February 8, 2012

by Jakub Holý

· 57,011 Views · 1 Like

Java.lang.VerifyError: Expecting a stackmap frame at branch target – JDK 7

Right now, when I try to persist an object in Google App Engine, I’m facing the error “Java.lang.VerifyError: Expecting a stackmap frame at branch target“. I’m using JDK 7 and it seems like the problem lies with this JDK. After googling a bit, I found that there seems to be two solutions to fix this problem. Solution 1: Change to JDK 6 As simple as is, change your JDK to version 6 and you won’t be bugged by this exception anymore. Well, in my case, I have to use JDK 7. So, moving on to the solution 2. Solution 2: Configure JVM Go to Windows -> Preferences -> Installed JREs. Select the default JVM and click edit. Then add this parameter as VM argument “-XX:-UseSplitVerifier” as seen below. This should solve the issue. From http://veerasundar.com/blog/2012/01/java-lang-verifyerror-expecting-a-stackmap-frame-at-branch-target-jdk-7/

February 6, 2012

by Veera Sundar

· 47,680 Views

Using the Android Parcel

A short definition of an Android Parcel would be that of a message container for lightweight, high-performance Inter-process communication (IPC). On Android, a "process" is a standard Linux one, and one process cannot normally access the memory of another process, so with Parcels, the Android system decomposes objects into primitives that can be marshaled/unmarshaled across process boundaries. But Parcels can also be used within the same process, to pass data across different components of a same application. As an example, a typical Android application has several screens, called "Activities" , and needs to communicate data or action from one Activity to the next. To write an object than can be passed through, we can implement the Parcelable interface. Android itself provides a built-in Parcelable object called an Intent which is used to pass information from one component to another. Using an Intent is pretty straightforward. Let's say we're collecting user data from our initial screen called CollectDataActivity. // inside CollectDataActivity, construct intent to pass along the next Activity, i.e. screen Intent in = new Intent(this, ProcessDataActivity.class); in.putExtra("userid", id); // (key,value) pairs in.putExtra("age", age); in.putExtra("phone", phone); in.putExtra("is_registered", true); // call next Activity --> next screen comes up startActivity(in); We need to collect that information from our data collection screen to process it. So all we do is the following: // inside ProcessDataActivity, get the info needed from previous Activity Intent in = this.getIntent(); in.getLongExtra("userid", 0L); in.getIntExtra("age", 0); in.getStringExtra("phone"); in.getBooleanExtra("is_registered", false); // false = default value overridden by user input Again, pretty straightforward. We retrieve the data using the same keys used to send it, and using our Intent's corresponding methods for each data type. But even when communicating with Intents, we can still use Parcels to pass data within the intent. For instance, we can do the above in a more elegant way using a custom, Parcelable User class: In the first Activity: // in CollectDataActivity, populate the Parcelable User object using its setter methods User usr = new User(); usr.setId(id); // collected from user input// etc.. // pass it to another component Intent in = new Intent(this, ProcessDataActivity.class); in.putExtra("user", usr); startActivity(in); In the second Activity: // in ProcessDataActivity retrieve User Intent intent = getIntent(); User usr = (User) intent.getParcelableExtra("user"); And this is what a Parcelable User class looks like: import android.os.Parcel; import android.os.Parcelable; public class User implements Parcelable { private long id; private int age; private String phone; private boolean registered; // No-arg Ctor public User(){} // all getters and setters go here //... /** Used to give additional hints on how to process the received parcel.*/ @Override public int describeContents() { // ignore for now return 0; } @Override public void writeToParcel(Parcel pc, int flags) { pc.writeLong(id); pc.writeInt(age); pc.writeString(phone); pc.writeInt( registered ? 1 :0 ); } /** Static field used to regenerate object, individually or as arrays */ public static final Parcelable.Creator CREATOR = new Parcelable.Creator() { public User createFromParcel(Parcel pc) { return new User(pc); } public User[] newArray(int size) { return new User[size]; } }; /**Ctor from Parcel, reads back fields IN THE ORDER they were written */ public User(Parcel pc){ id = pc.readLong(); age = pc.readInt(); phone = pc.readString(); registered = ( pc.readInt() == 1 ); } } What we did was: Make our User class implement the Parcelable interface. Parcelable is not a marker interface, hence what follows: Implement its describeContents method, which in this case does nothing. Implement its abstract method writeToParcel, which takes the current state of the object and writes it to a Parcel Add a static field called CREATOR to our class, which is an object implementing the Parcelable.Creator interface Add a Constructor that takes a Parcel as parameter. The CREATOR calls that constructor to rebuild our object. This looks like a lot of extra code at first, but bear in mind that, as in most cases, our application might evolve into incorporating more data from the user... Sometimes we need to pass complex objects from one component to another, and passing an object yields a cleaner design. The same logic applies for communicating between an Activity (foreground UI) and a background Service. We would just call the startService method instead of startActivity and pass it our Parcelable User object. Note that a Service is not running in a separate process by default. At this point, there are a couple of questions that may be raised: Isn't using an IPC-friendly, custom object for in-process communication simply overkill? Why would we want to use Parcelable, when we already have built-in Java serialization? The answer to the first concern is...maybe. But communicating through a custom object than through a list of key-value pairs is more OO, and it has no noticeable negative performance impact. As for the second question, why not simply have User implement Serializable, a theoretically simpler, marker interface? In one word, performance. Using Parcels is more efficient than serializing, at the price of some added complexity. That extra efficiency has in turn its limits: passing an image ( Bitmap) using Parcelable is generally not a good idea (although Bitmap does in fact implement Parcelable). A much more memory-efficient way would be to pass only its URI or Resource ID, so that other Android components in your application can have access to it. Another limitation of Parcelable is that it must not be used for general-purpose serialization to storage, since the underlying implementation may vary with different versions of the Android OS. So yes, Parcels are faster by design, but as high-performance transport, not as a replacement for general-purpose serialization mechanism. Having said all that, since our User object is Parcelable, it can now be sent from this application to another one running in another process, in particular through an interface implementing a remote service. In an upcoming post, we'll look at IPC and Android's Interface Definition Language (AIDL). from Tony's Blog

February 4, 2012

by Tony Siciliani

· 59,042 Views · 1 Like

Accessing Local Name-Based Virtual Hosts From the Android Emulator

To test mobile versions of websites, it is useful to be able to connect to a web server on your local machine from a web browser on an Android emulator without having to expose the web server to the Internet. You can’t use the normal loop-back IP address of 127.0.0.1 because that refers to the emulated Android device itself. Instead you have to use 10.0.2.2 to connect to the host machine. That’s fine if your local web server is serving a single site, but if you are using name-based virtual hosting to serve different sites depending on the host name of the request (with aliases for localhost defined in your machine’s hosts file), then you need to be making requests from the browser using the correct host name, not the IP address. The Android emulator does not use the host machine’s hosts file for name resolution so attempting to access http://myvirtualhost in the emulator’s browser will not work. This is because the emulated Android device has it’s own hosts file, so you have to update this to map the virtual host names to the local machine. The first step is to start the AVD with an increased partition size otherwise you may get an out of memory error when you try to save the modified hosts file: emulator -avd MyAVD -partition-size 128 You then have to remount the system partition so that it is writeable: adb remount Then copy the hosts file from the emulated device to the host machine: adb pull /etc/hosts Edit the hosts file so that it includes mappings for all relevant virtual host names: 127.0.0.1 localhost 10.0.2.2 myvirtualhost1 myvirtualhost2 Then copy the updated file back to the emulated device: adb push hosts /etc/hosts You should then be able to visit http://myvirtualhost1 in the emulator’s browser and see the correct site. From http://blog.uncommons.org/2012/01/12/accessing-local-name-based-virtual-hosts-from-the-android-emulator/

January 23, 2012

by Dan Dyer

· 15,398 Views

Which Integration Framework Should You Use – Spring Integration, Mule ESB or Apache Camel?

Data exchanges between companies are increasing a lot. The number of applications that must be integrated is increasing, too. The interfaces use different technologies, protocols and data formats. Nevertheless, the integration of these applications must be modeled in a standardized way, realized efficiently and supported by automatic tests. Three integration frameworks are available in the JVM environment, which fulfil these requirements: Spring Integration, Mule ESB and Apache Camel. They implement the well-known Enteprise Integration Patterns (EIP, http://www.eaipatterns.com) and therefore offer a standardized, domain-specific language to integrate applications. These integration frameworks can be used in almost every integration project within the JVM environment – no matter which technologies, transport protocols or data formats are used. All integration projects can be realized in a consistent way without redundant boilerplate code. This article compares all three alternatives and discusses their pros and cons. If you want to know, when to use a more powerful Enterprise Service Bus (ESB) instead of one of these lightweight integration frameworks, then you should read this blog post: http://www.kai-waehner.de/blog/2011/06/02/when-to-use-apache-camel/ (it explains when to use Apache Camel, but the title could also be „When to use a lightweight integration framework“). Comparison Criteria Several criteria can be used to compare these three integration frameworks: Open source Basic concepts / architecture Testability Deployment Popularity Commercial support IDE-Support Errorhandling Monitoring Enterprise readiness Domain specific language (DSL) Number of components for interfaces, technologies and protocols Expandability Similarities All three frameworks have many similarities. Therefore, many of the above comparison criteria are even! All implement the EIPs and offer a consistent model and messaging architecture to integrate several technologies. No matter which technologies you have to use, you always do it the same way, i.e. same syntax, same API, same automatic tests. The only difference is the the configuration of each endpoint (e.g. JMS needs a queue name while JDBC needs a database connection url). IMO, this is the most significant feature. Each framework uses different names, but the idea is the same. For instance, „Camel routes“ are equivalent to „Mule flows“, „Camel components“ are called „adapters“ in Spring Integration. Besides, several other similarities exists, which differ from heavyweight ESBs. You just have to add some libraries to your classpath. Therefore, you can use each framework everywhere in the JVM environment. No matter if your project is a Java SE standalone application, or if you want to deploy it to a web container (e.g. Tomcat), JEE application server (e.g. Glassfish), OSGi container or even to the cloud. Just add the libraries, do some simple configuration, and you are done. Then you can start implementing your integration stuff (routing, transformation, and so on). All three frameworks are open source and offer familiar, public features such as source code, forums, mailing lists, issue tracking and voting for new features. Good communities write documentation, blogs and tutorials (IMO Apache Camel has the most noticeable community). Only the number of released books could be better for all three. Commercial support is available via different vendors: Spring Integration: SpringSource (http://www.springsource.com) Mule ESB: MuleSoft (http://www.mulesoft.org) Apache Camel: FuseSource (http://fusesource.com) and Talend (http://www.talend.com) IDE support is very good, even visual designers are available for all three alternatives to model integration problems (and let them generate the code). Each of the frameworks is enterprise ready, because all offer required features such as error handling, automatic testing, transactions, multithreading, scalability and monitoring. Differences If you know one of these frameworks, you can learn the others very easily due to their same concepts and many other similarities. Next, let’s discuss their differences to be able to decide when to use which one. The two most important differences are the number of supported technologies and the used DSL(s). Thus, I will concentrate especially on these two criteria in the following. I will use code snippets implementing the well-known EIP „Content-based Router“ in all examples. Judge for yourself, which one you prefer. Spring Integration Spring Integration is based on the well-known Spring project and extends the programming model with integration support. You can use Spring features such as dependency injection, transactions or security as you do in other Spring projects. Spring Integration is awesome, if you already have got a Spring project and need to add some integration stuff. It is almost no effort to learn Spring Integration if you know Spring itself. Nevertheless, Spring Integration only offers very rudimenary support for technologies – just „basic stuff“ such as File, FTP, JMS, TCP, HTTP or Web Services. Mule and Apache Camel offer many, many further components! Integrations are implemented by writing a lot of XML code (without a real DSL), as you can see in the following code snippet: You can also use Java code and annotations for some stuff, but in the end, you need a lot of XML. Honestly, I do not like too much XML declaration. It is fine for configuration (such as JMS connection factories), but not for complex integration logic. At least, it should be a DSL with better readability, but more complex Spring Integration examples are really tough to read. Besides, the visual designer for Eclipse (called integration graph) is ok, but not as good and intuitive as its competitors. Therefore, I would only use Spring Integration if I already have got an existing Spring project and must just add some integration logic requiring only „basic technologies“ such as File, FTP, JMS or JDBC. Mule ESB Mule ESB is – as the name suggests – a full ESB including several additional features instead of just an integration framework (you can compare it to Apache ServiceMix which is an ESB based on Apache Camel). Nevertheless, Mule can be use as lightweight integration framework, too – by just not adding and using any additional features besides the EIP integration stuff. As Spring Integration, Mule only offers a XML DSL. At least, it is much easier to read than Spring Integration, in my opinion. Mule Studio offers a very good and intuitive visual designer. Compare the following code snippet to the Spring integration code from above. It is more like a DSL than Spring Integration. This matters if the integration logic is more complex. The major advantage of Mule is some very interesting connectors to important proprietary interfaces such as SAP, Tibco Rendevous, Oracle Siebel CRM, Paypal or IBM’s CICS Transaction Gateway. If your integration project requires some of these connectors, then I would probably choose Mule! A disadvantage for some projects might be that Mule says no to OSGi: http://blogs.mulesoft.org/osgi-no-thanks/ Apache Camel Apache Camel is almost identical to Mule. It offers many, many components (even more than Mule) for almost every technology you could think of. If there is no component available, you can create your own component very easily starting with a Maven archetype! If you are a Spring guy: Camel has awesome Spring integration, too. As the other two, it offers a XML DSL: ${in.header.type} is ‘com.kw.DvdOrder’ ${in.header.type} is ‘com.kw.VideogameOrder’ Readability is better than Spring Integration and almost identical to Mule. Besides, a very good (but commercial) visual designer called Fuse IDE is available by FuseSource – generating XML DSL code. Nevertheless, it is a lot of XML, no matter if you use a visual designer or just your xml editor. Personally, I do not like this. Therefore, let’s show you another awesome feature: Apache Camel also offers DSLs for Java, Groovy and Scala. You do not have to write so much ugly XML. Personally, I prefer using one of these fluent DSLs instead XML for integration logic. I only do configuration stuff such as JMS connection factories or JDBC properties using XML. Here you can see the same example using a Java DSL code snippet: from(“file:incomingOrders “) .choice() .when(body().isInstanceOf(com.kw.DvdOrder.class)) .to(“file:incoming/dvdOrders”) .when(body().isInstanceOf(com.kw.VideogameOrder.class)) .to(“jms:videogameOrdersQueue “) .otherwise() .to(“mock:OtherOrders “); The fluent programming DSLs are very easy to read (even in more complex examples). Besides, these programming DSLs have better IDE support than XML (code completion, refactoring, etc.). Due to these awesome fluent DSLs, I would always use Apache Camel, if I do not need some of Mule’s excellent connectors to proprietary products. Due to its very good integration to Spring, I would even prefer Apache Camel to Spring Integration in most use cases. By the way: Talend offers a visual designer generating Java DSL code, but it generates a lot of boilerplate code and does not allow vice-versa editing (i.e. you cannot edit the generated code). This is a no-go criteria and has to be fixed soon (hopefully)! And the winner is… … all three integration frameworks, because they are all lightweight and easy to use – even for complex integration projects. It is awesome to integrate several different technologies by always using the same syntax and concepts – including very good testing support. My personal favorite is Apache Camel due to its awesome Java, Groovy and Scala DSLs, combined with many supported technologies. I would only use Mule if I need some of its unique connectors to proprietary products. I would only use Spring Integration in an existing Spring project and if I only need to integrate „basic technologies“ such as FTP or JMS. Nevertheless: No matter which of these lightweight integration frameworks you choose, you will have much fun realizing complex integration projects easily with low efforts. Remember: Often, a fat ESB has too much functionality, and therefore too much, unnecessary complexity and efforts. Use the right tool for the right job! Best regards, Kai Wähner (Twitter: @KaiWaehner) http://www.kai-waehner.de/blog/2012/01/10/spoilt-for-choice-which-integration-framework-to-use-spring-integration-mule-esb-or-apache-camel/

January 19, 2012

by Kai Wähner

CORE

· 110,881 Views · 9 Likes

How to deploy a neo4j instance in Amazon EC2 in 10 minutes

Neo4j is a high-performance, NOSQL graph database with all the features of a mature and robust database. In this post I will explain how to deploy a neo4j instance in Amazon EC2 web service. For this tutorial to take you no more than 10 minutes you should be able to execute properly some bash commands like mv, tar, ssh and scp (secure copy). I also assume that you have an account in Amazon Web Services and you are familiar to the process of launching instances. If not, I strongly recommend you to follow this starting guide and complete it till you manage to connect to your instance with ssh. Start downloading the latest stable version of neo4j. Which you can find here. The “Community Edition” fits well for development purposes. Do not forget to select the Unix version of the server. This will download a tar.gz file which you will copy to your EC2 instance later. While you download the neo4j server open the AWS Management Console and launch a Basic 32-bit Amazon Linux AMI. If you want to launch an Ubuntu AMI please notice that it doesn’t ship with Java, which is required for running neo4j. If you are not familiar with key pairs, pem files or security groups I insist you to follow the EC2 starting guide I mentioned above. You can either create a new security group or use the default, but you will need to configure a new security rule for the neo4j server port. After launching the instance, create a TCP rule on port 7474 with source 0.0.0.0/0. Here you are opening port 7474 for anyone. If you are planning to use the neo4j REST API and remotely call it from another server, for example a Rails application hosted in Heroku, for security reasons, you may want to change the source field to the address of your Heroku server. Do not forget to open port 22 (SSH), this is typically the first rule normal people create after launching an instance. You are almost done! You should now install neo4j in your instance. Open a terminal in your localhost and navigate to the path where you downloaded neo4j. Copy the file to your Amazon instance by using the scp command: scp -i your_pem_file.pem neo4j-community-1.6.M01-unix.tar.gz ec2-user@YOUR_PUBLIC_INSTANCE_DNS:/home/ec2-user Please notice that you will need to change the path to your pem file, typically placed in ~/.ssh, the filename of the neo4j server you just downloaded and the plublic DNS of your instance. Now connect to your instance with SSH: ssh -i your_pem_file.pem ec2-user@YOUR_PUBLIC_INSTANCE_DNS Untar the neo4j server: tar xvfz neo4j-community-1.6.M01-unix.tar.gz.tar.gz Move it to /usr/local and rename the folder to neo4j: sudo mv neo4j-community-1.6.M01 /usr/local/neo4j Almost done!!! You should now open neo4j-server.properties under the conf directory and add the following line: org.neo4j.server.webserver.address=0.0.0.0 This lines allows anyone to connect remotely to your neo4j database server. Now run the start script. From the neo4j server folder. sudo ./bin/neo4j start Finally, open a browser and access the webadmin interface of your neo4j database by typing http://YOUR_PUBLIC_INSTANCE_DNS:7474. You should see the Neo4j Monitoring and Management Tool, pretty cool! If not, ask me You can now try using the REST API and the curl bash command to insert nodes and relationships. I hope this post helped you, good luck! Follow me on Twitter @negarnil Source: http://www.cloudtmp.com/java/how-to-deploy-a-neo4j-instance-in-amazon-ec2-in-10-minutes/

December 27, 2011

by Nicolas Garnil

· 27,403 Views · 1 Like

Maven + JavaScript Unit Test: Running in a Continuous Integration Environment

So you're still interested in unit testing JavaScript (good). This post is an extension of my much more indepth first posting on how to unit test JavaScript using JS Test Driver. Please check it out here. Recap Last Posting In the last posting we successfully unit tested JavaScript using Maven and JsTest Driver. This allowes us to test JavaScript when on an environment that has a modern browser installed and can be run. Problem with typical CI environments So what happens when the test are passing on your local box, but you go to check in your code and the Continuous Integration (CI) server pukes on the new tests becasue there is no "screen" to run chrome or firefox? As of this posting, none of the top-tier browsers have a "headless" or an in-memory only browser window. There are alternatives to running JavaScript in a browser, such as rhino.js, env.js or HtmlUnit, however, these are just ports of browsers and the JavaScript and DOM representation are not 100% accurate which can lead to problems with your code when rendered in a client's browser. Approach What we need to do is to run JSTestDriver's browser in a Virtual X Framebuffer (Xvfb) which is possible on nearly all Linux based systems. The example below uses a Solaris version of Linux, however, Debian and RedHat linux distrubutions come with the simplified bash script to easily run an appliation in a virtual framebuffer. This solution was derived from one posted solution on the JS Test Driver wiki. The given example is also a full working example that is in use at my current client. Here is the quick list of what we will accomplish. Note, several of these steps are discussed in depth in the previous post and are not covered in depth here. Create a profile to run Js Unit-Tests Copy JsTestDriver library to a known location for Maven to use Copy JavaScript main and test files to known locations Use ANT to start JsTestDriver and pipe the screen into xvfb Here is a sample profile to use. You will need to adjust the properties at the top of the profile to match your system. ci-jstests /opt/swf/bin/firefox 1.3.2 /opt/X11R6/xvfb-run org.apache.maven.plugins maven-dependency-plugin 2.1 copy generate-resources copy com.google.jstestdriver jstestdriver ${js-test-driver.version} jar true jsTestDriver.jar ${project.build.directory}/jstestdriver false true maven-resources-plugin 2.4.3 copy-main-files generate-test-resources copy-resources ${project.build.directory}/test-classes/main-js src/main/webapp/scripts false copy-test-files generate-test-resources copy-resources ${project.build.directory}/test-classes/test-js src/test/webapp/scripts false org.apache.maven.plugins maven-antrun-plugin 1.6 test run Possible problems Although I cannot predict or fix all problems, I can share the one major problem I ran into with Solaris and the script used to fix that. In Solaris (and could happen to other distros) the xvfb-run script was not available and several of the other libraries did not exist. I first had to download the latest X libraries and place them in their appropriate locations on the CI server. Next, I had to re-engineer the xvfb-run script. Here is a copy of my script (NOTE: This is the solution for my server and this may not work for you) I created a script that contains: /usr/openwin/bin/Xvfb :1 screen 0 1280x1024x8 pixdepths 8 24 fbdir /tmp/.X11-vbf & From http://www.ensor.cc/2011/08/maven-javascript-unit-test-running-in.html

December 23, 2011

by Mike Ensor

· 12,256 Views

Just in Time Compiler (JIT) in Hotspot

What is JIT Compiler? The Just In Time Compiler (JIT) concept and more generally adaptive optimization is well known concept in many languages besides Java (.Net, Lua, JRuby). In order to explain what is JIT Compiler I want to start with a definition of compiler concept. According to wikipedia compiler is "a computer program that transforms the source language into another computer language (the target language)". We are all familiar with static java compiler (javac) that compiles human readable .java files to a byte code that can be interpreted by JVM - .class files. Then what does JIT compile? The answer will given a moment later after explanation of what is "Just in Time". According to most researches, 80% of execution time is spent in executing 20% of code. That would be great if there was a way to determine those 20% of code and to optimize them. That's exactly what JIT does - during runtime it gathers statistics, finds the "hot" code compiles it from JVM interpreted bytecode (that is stored in .class files) to a native code that is executed directly by Operating System and heavily optimizes it. Smallest compilation unit is single method. Compilation and statistics gathering is done in parallel to program execution by special threads. During statistics gathering the compiler makes hypotheses about code function and as the time passes tries to prove or to disprove them. If the hypothesis is dis-proven the code is deoptimized and recompiled again. The name "Hotspot" of Sun (Oracle) JVM is chosen because of the ability of this Virtual Machine to find "hot" spots in code. What optimizations does JIT? Let's look closely at more optimizations done by JIT. Inline methods - instead of calling method on an instance of the object it copies the method to caller code. The hot methods should be located as close to the caller as possible to prevent any overhead. Eliminate locks if monitor is not reachable from other threads Replace interface with direct method calls for method implemented only once to eliminate calling of virtual functions overhead Join adjacent synchronized blocks on the same object Eliminate dead code Drop memory write for non-volatile variables Remove prechecking NullPointerException and IndexOutOfBoundsException Et cetera When the Java VM invokes a Java method, it uses an invoker method as specified in the method block of the loaded class object. The Java VM has several invoker methods, for example, a different invoker is used if the method is synchronized or if it is a native method. The JIT compiler uses its own invoker. Sun production releases check the method access bit for value ACC_MACHINE_COMPILED to notify the interpreter that the code for this method has already been compiled and stored in the loaded class. JIT compiler compiles the method block into native code for this method and stores that in the code block for that method. Once the code has been compiled the ACC_MACHINE_COMPILED bit, which is used on the Sun platform, is set. How do we know what JIT is doing in our program and how can it be controlled? First of all to disable JIT Djava.compiler=NONE parameter can be used. There are 2 types of JIT compilers in Hotspot - one is used for client program and one for server (-server option in VM parameters). Program, running on server enjoys usually from more resources than program running on client and to server program top throughput is usually more important. Hence JIT in server is more resource consuming and gathering statistics takes more time to make the statistics more accurate. For client program gathering statics for a method lasts 1500 method calls, for server 15000. These default values can be changed by -XX:CompileThreshold=XXX VM parameter. In order to find out whether default value is good for you try enabling "XX:+PrintCompilation" and "-XX:-CITime" parameters that print JIT statistics and time CPU spent by JIT. Benchmarks Most of the benchmarks show that JITed code runs 10 to 20 times faster than interpreted code. There are many benchmarks done. Below given result graphs of two of them: Its worth to mention that programs that run in JIT mode, but are still in "learning mode" run much slower than non JITed programs. Drawbacks of JIT JIT Increases level of unpredictability and complexity in Java program. It adds another layer that developers don't really understand. Example of possible bugs - 'happens before relations" in concurrency. JIT can easily reorder code if the change is safe for a program running in single thread. To solve this problem developers make hints to JIT using "synchronized" word or explicit locking. Increases non heap memory footprint - JITed code is stored in "Code Cache" generation. Advanced JIT JIT and garbage collection. For GC to occur program must reach safe points. For this purpose JIT injects yieldpoints at regular intervals in native code. In addition to scanning of stack to find root references, registers must be scanned as they may hold objects created by JIT Comments are appresiated. The article can be found also at: artiomg.blogspot.com/2011/10/just-in-time-compiler-jit-in-hotspot.html

October 29, 2011

by Artiom Gourevitch

· 39,243 Views · 1 Like

JDepend design metrics in CI

this article is intended to give the reader enough information to understand what jdepend is, what it does, and how to use it in a maven build. it’s a kind of cheat sheet, if you like. what is it? jdepend is more of a design metric than a code metric, it gives you information about your classes with regards to how they’re related to each other. using this information you should be able to identify any unwanted or dubious dependencies. how does it do that? it traverses java class files and generates design quality metrics, such as: number of classes and interfaces afferent couplings (ca) – what is this?? someone probably feels very proud of themselves for coming up with this phrase. afferent coupling means the number of other packages which depend on the package being measured, in a nutshell. jdepend define this as a measure of a package’s “responsibility” efferent couplings (ce) – sort of the opposite of ca. it’s a measure of the number of other packages that your package depends on abstractness (a) – the ratio of abstract classes to total classes. instability (i) – the ratio of efferent coupling (ce) to total coupling (ce + ca) distance from the main sequence (d) – this sounds fairly wishy-washy and i’ve never paid any attention to it. it’s defined as: “an indicator of the package’s balance between abstractness and stability”. meh. to use jdepend with maven you’ll need maven 2.0 or higher and jdk 1.4 or higher. you don’t need to install anything, as maven will sort this out for you by downloading it at build time. here’s a snippet from one of my project poms, it comes from in the section: org.codehaus.mojo jdepend-maven-plugin 1.6 build/maven/${pom.artifactid}/target/jdepend-reports what you’ll get is a jdepend entry under the project reports section of your maven site, like this: maven project reports page and this is what the actual report looks like (well, some of it): jdepend report summary: jdepend isn’t something i personally use very heavily, but i can understand how it could be used to good effect as a general measure of how closely related your classes are, which, in certain circumstances could prompt you to redesign or refactor your code. i don’t think this sort of information is required on a per commit basis, so i’d be tempted to only include it in my nightly reports. however, i also use sonar, and that has a built-in measure of afferent coupling, so if you’re only interested in that measurement and you’re already running sonar, then jdepend is probably a bit of an unnecessary overhead. also, sonar itself has some good plugins which can provide architectural and design governance features, at least one of which i know implemented jdepend. source: http://jamesbetteley.wordpress.com/2011/04/06/jdepend-design-metrics-in-ci/

October 21, 2011

by James Betteley

· 13,666 Views · 1 Like

Handling PHP Sessions in Windows Azure

One of the challenges in building a distributed web application is in handling sessions. When you have multiple instances of an application running and session data is written to local files (as is the default behavior for the session handling functions in PHP) a user session can be lost when a session is started on one instance but subsequent requests are directed (via a load balancer) to other instances. To successfully manage sessions across multiple instances, you need a common data store. In this post I’ll show you how the Windows Azure SDK for PHP makes this easy by storing session data in Windows Azure Table storage. In the 4.0 release of the Windows Azure SDK for PHP, session handling via Windows Azure Table and Blob storage was included in the newly added SessionHandler class. Note: The SessionHandler class supports storing session data in Table storage or Blob storage. I will focus on using Table storage in this post largely because I haven’t been able to come up with a scenario in which using Blob storage would be better (or even necessary). If you have ideas about how/why Blob storage would be better, I’d love to hear them. The SessionHandler class makes it possible to write code for handling sessions in the same way you always have, but the session data is stored on a Windows Azure Table instead of local files. To accomplish this, precede your usual session handling code with these lines: require_once 'Microsoft/WindowsAzure/Storage/Table.php'; require_once 'Microsoft/WindowsAzure/SessionHandler.php'; $storageClient = new Microsoft_WindowsAzure_Storage_Table('table.core.windows.net', 'your storage account name', 'your storage account key'); $sessionHandler = new Microsoft_WindowsAzure_SessionHandler($storageClient , 'sessionstable'); $sessionHandler->register(); Now you can call session_start() and other session functions as you normally would. Nicely, it just works. Really, that’s all there is to using the SessionHandler, but I found it interesting to take a look at how it works. The first interesting thing to note is that the register method is simply calling the session_set_save_handler function to essentially map the session handling functionality to custom functions. Here’s what the method looks like from the source code: public function register() { return session_set_save_handler(array($this, 'open'), array($this, 'close'), array($this, 'read'), array($this, 'write'), array($this, 'destroy'), array($this, 'gc') ); } The reading, writing, and deleting of session data is only slightly more complicated. When writing session data, the key-value pairs that make up the data are first serialized and then base64 encoded. The serialization of the data allows for lots of flexibility in the data you want to store (i.e. you don’t have to worry about matching some schema in the data store). When storing data in a table, each entry must have a partition key and row key that uniquely identify it. The partition key is a string (“sessions” by default, but this is changeable in the class constructor) and the the row key is the session ID. (For more information about the structure of Tables, see this post.) Finally, the data is either updated (it it already exists in the Table) or a new entry is inserted. Here’s a portion of the write function: $serializedData = base64_encode(serialize($serializedData)); $sessionRecord = new Microsoft_WindowsAzure_Storage_DynamicTableEntity($this->_sessionContainerPartition, $id); $sessionRecord->sessionExpires = time(); $sessionRecord->serializedData = $serializedData; try { $this->_storage->updateEntity($this->_sessionContainer, $sessionRecord); } catch (Microsoft_WindowsAzure_Exception $unknownRecord) { $this->_storage->insertEntity($this->_sessionContainer, $sessionRecord); } Not surprisingly, when session data is read from the table, it is retrieved by session ID, base64 decoded, and unserialized. Again, here’s a snippet that show’s what is happening: $sessionRecord = $this->_storage->retrieveEntityById( $this->_sessionContainer, $this->_sessionContainerPartition, $id ); return unserialize(base64_decode($sessionRecord->serializedData)); As you can see, the SessionHandler class makes good use of the storage APIs in the SDK. To learn more about the SessionHandler class (and the storage APIs), check out the documentation on Codeplex. You can, of course, get the complete source code here: http://phpazure.codeplex.com/SourceControl/list/changesets. As I investigated the session handling in the Windows Azure SDK for PHP, I noticed that the absence of support for SQL Azure as a session store was conspicuous. I’m curious about how many people would prefer to use SQL Azure over Azure Tables as a session store. If you have an opinion on this, please let me know in the comments.

October 19, 2011

by Brian Swan

· 7,889 Views

You can’t be Agile in Maintenance?

I’ve been going over a couple of posts by Steve Kilner that question whether Agile methods can be used effectively in software maintenance. It’s a surprising question really. There are a lot of maintenance teams who have had success following Agile methods like Scrum and Extreme Programming (XP) for some time now. We’ve been doing it for almost 5 years, enhancing and maintaining and supporting enterprise systems, and I know that it works. Agile development naturally leads into maintenance – the goal of incremental Agile development is to get working software out to customers as soon as possible, and get customers using it. At some point, when customers are relying on the software to get real business done and need support and help to keep the system running, teams cross from development over to maintenance. But there’s no reason for Agile development teams to fundamentally change the way that they work when this happens. It is harder to introduce Agile practices into a legacy maintenance team – there are a lot of technical requirements and some cultural changes that need to be made. But most maintenance teams have little to lose and lots to gain from borrowing from what Agile development teams are doing. Agile methods are designed to help small teams deal with a lot of change and uncertainty, and to deliver software quickly – all things that are at least as important in maintenance as they are in development. Technical practices in Extreme Programming especially help ensure that the code is always working – which is even more important in maintenance than it is in development, because the code has to work the first time in production. Agile methods have to be adapted to maintenance, but most teams have found it necessary to adapt these methods to fit their situations anyways. Let’s look at what works and what has to be changed to make Agile methods like Scrum and XP work in maintenance. What works well and what doesn’t Planning Game Managing maintenance isn’t the same as managing a development project – even an Agile development project. Although Agile development teams expect to deal with ambiguity and constant change, maintenance teams need to be even more flexible and responsive, to manage conflicts and unpredictable resourcing problems. Work has to be continuously reviewed and prioritized as it comes in – the customer can’t wait for 2 weeks for you to look at a production bug. The team needs a fast path for urgent changes and especially for hot fixes. You have to be prepared for support demands and interruptions. Structure the team so that some people can take care of second-level support, firefighting and emergency bug fixing and the rest of the team can keep moving forward and get something done. Build slack into schedules to allow for last-minute changes and support escalation. You will also have to be more careful in planning out maintenance work, to take into account technical and operational dependencies and constraints and risks. You’re working in the real world now, not the virtual reality of a project. Standups Standups play an important role in Agile projects to help teams come up to speed and bond. But most maintenance teams work fine without standups – since a lot of maintenance work can be done by one person working on their own, team members don’t need to listen to each other each morning talking about what they did yesterday and what they’re going to do – unless the team is working together on major changes. If someone has a question or runs into a problem, they can ask for help without waiting until the next day. Small releases Most changes and fixes that maintenance teams need to make are small, and there is almost always pressure from the business to get the code out as soon as it is ready, so an Agile approach with small and frequent releases makes a lot of sense. If the time boxes are short enough, the customer is less likely to interrupt and re-prioritize work in progress – most businesses can wait a few days or a couple of weeks to get something changed. Time boxing gives teams a way to control and structure their work, an opportunity to batch up related work to reduce development and testing costs, and natural opportunities to add in security controls and reviews and other gates. It also makes maintenance work more like a project, giving the team a chance to set goals and to see something get done. But time boxing comes with overhead – the planning and setup at the start, then deployment and reviews at the end – all of which adds up over time. Maintenance teams need to be ruthless with ceremonies and meetings, pare them down, keep only what’s necessary and what works. It’s even more important in maintenance than in development to remember that the goal is to deliver working code at the end of each time box. If some code is not working, or you’re not sure if it is working, then extend the deadline, back some of the changes out, or pull the plug on this release and start over. Don’t risk a production failure in order to hit an arbitrary deadline. If the team is having problems fitting work into time boxes, then stop and figure out what you’re doing wrong – the team is trying to do too much too fast, or the code is too unstable, or people don’t understand the code enough – and fix it and move on. Reviews and Retrospectives Retrospectives are important in maintenance to keep the team moving forward, to find better ways of working, and to solve problems. But like many practices, regular reviews reach a point of diminishing returns over time – people end up going through the motions. Once the team is setup, reviews don’t need to be done in each iteration unless the team runs into problems. Schedule reviews when you or the team need them. Collect data on how the team is working, on cycle time and bug report/fix ratios, correlate problems in production with changes, and get the team together to review if the numbers move off track. If the team runs into a serious problem like a major production failure, then get to the bottom of it through Root Cause Analysis. Sustainable pace / 40-hour week It’s not always possible to work a 40-hour week in maintenance. There are times when the team will be pushed to make urgent changes, spend late nights firefighting, releasing after hours and testing on weekends. But if this happens too often or goes on too long the team will burn out. It’s critical to establish a sustainable pace over the long term, to treat people fairly and give them a chance to do a good job. Pairing Pairing is hard to do in small teams where people are working on many different things. Pairing does make sense in some cases – people naturally pair-up when trying to debug a nasty problem or walking through a complicated change – but it’s not necessary to force it on people, and there are good reasons not to. Some teams (like mine) rely more on code reviews instead of pairing, or try to get developers to pair when first looking at a problem or change, and at the end again to review the code and tests. The important thing is to ensure that changes get looked at by at least one other person if possible, however this gets done. Collective Code Ownership Because maintenance teams are usually small and have to deal with a lot of different kinds of work, sooner or later different people will end up working on different parts of the code. It’s necessary, and it’s a good thing because people get a chance to learn more about the system and work with different technologies and on different problems. But there’s still a place for specialists in maintenance. You want the people who know the code the best to make emergency fixes or high-risk changes – or at least have them review the changes – because it has to work the first time. And sometimes you have no choice – sometimes there is only one person who understands a framework or language or technical problem well enough to get something done. Coding Guidelines – follow the rules Getting the team to follow coding guidelines is important in maintenance to help ensure the consistency and integrity of the code base over time – and to help ensure software security. Of course teams may have to compromise on coding standards and style conventions, depending on what they have inherited in the code base; and teams that maintain multiple systems will have to follow different guidelines for each system. Metaphor In XP, teams are supposed to share a Metaphor: a simple high-level expression of the system architecture (the system is a production line, or a bill of materials) and common names and patterns that can be used to describe the system. It’s a fuzzy concept at best, a weak substitute for more detailed architecture or design, and it’s not of much practical value in maintenance. Maintenance teams have to work with the architecture and patterns that are already in place in the system. What is important is making sure that the team has a common understanding of these patterns and the basic architecture so that the integrity isn’t lost – if it hasn’t been lost already. Getting the team together and reviewing the architecture, or reverse-engineering it, making sure that they all agree on it and documenting it in a simple way is important especially when taking over maintenance of a new system and when you are planning major changes. Simple Design Agile development teams start with simple designs and try to keep them simple. Maintenance teams have to work with whatever design and architecture that they inherit, which can be overwhelmingly complex, especially in bigger and older systems. But the driving principle should still be to design changes and new features as simple as the existing system lets you – and to simplify the system’s design further whenever you can. Especially when making small changes, simple, just-enough design is good – it means less documentation and less time and less cost. But maintenance teams need to be more risk adverse than development teams – even small mistakes can break compatibility or cause a run-time failure or open a security hole. This means that maintainers can’t be as iterative and free to take chances, and they need to spend more time upfront doing analysis, understanding the existing design and working through dependencies, as well as reviewing and testing their changes for regressions afterwards. Refactoring Refactoring takes on a lot of importance in maintenance. Every time a developer makes a change or fix they should consider how much refactoring work they should do and can do to make the code and design clearer and simpler, and to pay off technical debt. What and how much to refactor depends on what kind of work they are doing (making a well-thought-out isolated change, or doing shotgun surgery, or pushing out an emergency hot fix) and the time and risks involved, how well they understand the code, how good their tools are (development IDEs for Java and .NET at least have good built-in tools that make many refactorings simple and safe) and what kind of safety net they have in place to catch mistakes – automated tests, code reviews, static analysis. Some maintenance teams don’t refactor because they are too afraid of making mistakes. It’s a vicious circle – over time the code will get harder and harder to understand and change, and they will have more reasons to be more afraid. Others claim that a maintenance team is not working correctly if they don’t spend at least 50% of their time refactoring. The real answer is somewhere in between – enough refactoring to make changes and fixes safe. There are cases where extensive refactoring, restructuring or rewriting code is the right thing to do. Some code is too dangerous to change or too full of bugs to leave the way it is – studies show that in most systems, especially big systems, 80% of the bugs can cluster in 20% of the code. Restructuring or rewriting this code can pay off quickly, reducing problems in production, and significantly reducing the time needed to make changes and test them as you go forward. Continuous Testing Testing is even more important and necessary in maintenance than it is in development. And it’s a major part of maintenance costs. Most maintenance teams rely on developers to test their own changes and fixes by hand to make sure that the change worked and that they didn’t break anything as a side effect. Of course this makes testing expensive and inefficient and it limits how much work the team can do. In order to move fast, to make incremental changes and refactoring safe, the team needs a better safety net, by automating unit and functional tests and acceptance tests. It can take a long time to put in test scaffolding and tools and write a good set of automated tests. But even a simple test framework and a small set of core fat tests can pay back quickly in maintenance, because a lot changes (and bugs) tend to be concentrated in the same parts of the code – the same features, framework code and APIs get changed over and over again, and will need to be tested over and over again. You can start small, get these tests running quickly and reliably and get the team to rely on them, fill in the gaps with manual tests and reviews, and then fill out the tests over time. Once you have a basic test framework in place, developers can take advantage of TFD/TDD especially for bug fixes – the fix has to be tested anyways, so why not write the test first and make sure that you fixed what you were supposed to? Continuous Integration To get Continuous Testing to work, you need a Continuous Integration environment. Understanding, automating and streamlining the build and getting the CI server up and running and wiring in tests and static analysis checks and reporting can take a lot of work in an enterprise system, especially if you have to deal with multiple languages and platforms and dependencies between systems. But doing this work is also the foundation for simplifying release and deployment – frequent short releases means that release and deployment has to be made as simple as possible. Onsite Customer / Product Owner Working closely with the customer to make sure that the team is delivering what the customer needs when the customer needs it is as important in maintenance as it is in developing a new system. Getting a talented and committed Customer engaged is hard enough on a high-profile development project – but it’s even harder in maintenance. You may end up with too many customers with conflicting agendas competing for the team’s attention, or nobody who has the time or ability to answer questions and make decisions. Maintenance teams often have to make compromises and help fill in this role on their own. But it doesn’t all fit…. Kilner’s main point of concern isn’t really with Agile methods in maintenance. It’s with incremental design and development in general – that some work doesn’t fit nicely into short time boxes. Short iterations might work ok for bug fixes and small enhancements (they do), but sometimes you need to make bigger changes that have lots of dependencies. He argues that while Agile teams building new systems can stub out incomplete work and keep going in steps, maintenance teams have to get everything working all at once – it’s all or nothing. It’s not easy to see how big changes can be broken down into small steps that can be fit into short time boxes. I agree that this is harder in maintenance because you have to be more careful in understanding and untangling dependencies before you make changes, and you have to be more careful not to break things. The code and design will sometimes fight the kinds of changes that you need to make, because you need to do something that was never anticipated in the original design, or whatever design there was has been lost over time and any kind of change is hard to make. It’s not easy – but teams solve these problems all the time. You can use tools to figure out how much of a dependency mess you have in the code and what kind of changes you need to make to get out of this mess. If you are going to spend “weeks, months, or even years” to make changes to a system, then it makes sense to take time upfront to understand and break down build dependencies and isolate run-time dependencies, and put in test scaffolding and tests to protect the team from making mistakes as they go along. All of this can be done in time boxed steps. Just because you are following time boxes and simple, incremental design doesn’t mean that you start making changes without thinking them through. Read Working With Legacy Code – Michael Feathers walks through how to deal with these problems in detail, in both object oriented and procedural languages. What to do if it takes forever to make a change. How to break dependencies. How to find interception points and pinch points. How to find structure in the design and the code. What tests to write and how to get automated tests to work. Changing data in a production system, especially data shared with other systems, isn’t easy either. You need to plan out API changes and data structure changes as carefully as possible, but you can still make data and database changes in small, structured steps. To make code changes in steps you can use Branching by Abstraction where it makes sense (like making back-end changes) and you can protect customers from changes through Feature Flags and Dark Launching like Facebook and Twitter and Flickr do to continuously roll out changes – although you need to be careful, because if taken too far these practices can make code more fragile and harder to work with. Agile development teams follow incremental design and development to help them discover an optimal solution through trial-and-error. Maintenance teams work this way for a different reason – to manage technical risks by breaking big changes down and making small bets instead of big ones. Working this way means that you have to put in scaffolding (and remember to take it out afterwards) and plan out intermediate steps and review and test everything as you make each change. Sometimes it might feel like you are running in place, that it is taking longer and costing more. But getting there in small steps is much safer, and gives you a lot more control. Teams working on large legacy code bases and old technology platforms will have a harder time taking on these ideas and succeeding with them. But that doesn’t mean that they won’t work. Yes, you can be Agile in maintenance.

October 14, 2011

by Mitch Pronschinske

· 17,525 Views

Git: Getting the history of a deleted file

We recently wanted to get the Git history of a file which we knew existed but had now been deleted so we could find out what had happened to it. Using a simple git log didn’t work: git log deletedFile.txt fatal: ambiguous argument 'deletedFile.txt': unknown revision or path not in the working tree. We eventually came across Francois Marier’s blog post which points out that you need to use the following command instead: git log -- deletedFile.txt I’ve tried reading through the man page but I’m still not entirely sure what the distinction between using – and not using it is supposed to be. If someone could explain it that’d be cool… From http://www.markhneedham.com/blog/2011/10/04/git-getting-the-history-of-a-deleted-file

October 10, 2011

by Mark Needham

· 48,935 Views · 1 Like

The Goal of software development

The Goal by Eli Goldratt is a business book in the form of a novel, where the protagonist must save his factory from closing due to very low productivity. The Goal is not limited to the management of a large organization (not even to for-profit companies): you simply have to define different units of measurement, like goal units instead of making money, the default goal. In fact, from the applications of the Theory of Constraints in our field I think it applies to software development too. What follows is my translation of the themes of The Goal to our field. The Goal is... Mking money, of course. For the ones of you with knowledge in accounting, the original goal is: raising throghput, the amounts of items sold (not produced) in the unit of time. Lowering investment/inventory, all the money tied up in the system in the form of assets that could be sold or products that stay in a warehouse. Lowering operational expense, all the money that we have spent as support and that cannot be recovered. How does these measurements apply to software development? A team does not always have an impact on contract negotiation, so often talking about money is far from everyday reality (kudos to you if you can apply that point of the Agile Manifesto.) The goal for software development can be translated, in my opinion, to: raising the throughput, the amount of features delivered (deployed, not implemented or tested) in the unit of time. You can measure this amount in story points, since feature vary in size. lowering investment/inventory, all the time tied up in the system in the form of undeployed or untested features that clutter the code base. In a minor part, also investment in the form of hardware, but that's by far less important than the team's time. lower operational expense, the time spent by developers every day in order to support the development. Automation is a kind of time investment that will bring more time (and quality) in the future, lowering operational expense. Like for material products, WIP has storage and opportunity costs which goes into the operational expense. Kanban is a tool that tries to reduce WIP in order to foster the two latter points. Throughput accounting This kind of throughput accounting is emphasized in the novel, over the use of cost accounting, where each developer (ehm, factory worker) has to be occupied all the time, even if the work he is doing isn't moving towards the goal: neverending refactoring. Specification of a feature which cannot be implemented until two months are gone, and will have to be rewritten. Implementation of features which won't be merged with the main branch any time soon. With Test-Driven Development, we are getting good at moving a feature from implemented to tested directly in the same commit. Yet the missing step is getting the feature to the users: maybe that's also what Continuous Deployment is all about... Dependent events Dependent events and statistical fluctuations are production systems topics that make a balanced plant close to bankruptcy: however, we're not at the point in which we can model our team as precisely as a factory. The basic point is that a plant in which everyone is working all the time is inefficient: when an early stage (like defining a specification or implementing a feature) gets delayed, downstream step such as deployment are dalayed too. Converely, when an upstream step finish earlier, the downstream stage is already at maximum efficiency and cannot process the intermediate result faster. I wonder if this applies to software development too. In a factory, workers are specialized and can do just a few jobs across the plant. Since workers and machines have different production rates, there will be just one bottleneck: the slowest one. If products have to pass from the bottleneck, anyone producing faster than the bottleneck will just accumulate WIP in front of him. Continuing with our example, if the analyst or domain expert is churning out specifications for new features every day, most of them are just WIP in front of the development team. Once there is an established buffer, any additional specification won't raise throughput any faster; instead, it will raise the inventory (partial features) and the time spent in managing it. I think this is not always true in the most technical phases of development instead. For example, in a small team a developer may be moved to testing or refactoring, or setting up Continuous Integration or evaluation of a new library. Unless you have a DBA which can just manage databases, your developer is not fixed into a stage of the system. The bottleneck The previous example featured a bottleneck, the most famous concept of the Theory of Constraints introduced in the book. This translation to the software development case is mine and could be incomplete. A feature (or a user story) has to pass in a series of stations where different people will work on it to make it real: specification from a domain expert, implementation from a technical team, extended testing with optional customer validation, deployment which should be fast but at the same time must not kill the current version of the application. Each station has an average velocity. By definition, there is a station which is the slowest and can process fewest story points in the unit of time. This is the bottleneck. (This is not always true, as velocity may vary greatly in time with the addition of new people or hidden lines discovered in a feature. Becoming good at estimation and stabilizing a team are two objectives that help reach the assumption.) You can identify a bottleneck by looking at where is the WIP: it will accumulate in front of it. If you already have a kanban board, this phase is simpler... Once identified, the throughput of the system can only be improved by raising the bottleneck's capacity enough so that it is no more a bottleneck. You can move people to it (keeping an eye on communication costs); ensure it is used at maximum efficiency by freeing the specialized developers from other mundane tasks. Now you can restart and find a new bottleneck... Conclusions This is just a little introduction to the themes of the Theory of Constraints and Goldratt's teachings; I don't pretend to explain the whole book in an article. It is also against the Socratic method: you should reach the answers yourself, and these are just examples from my experience. There is more to Goldratt and The Goal than bottlenecks and throughput, such as continuous improvement. If you're working or managing in a software development team, I suggest you to read this book if you have the opportunity. Even when freelancing, it is an eye-opener in moving towards a Goal instead of busyworking; and it's written with a never boring teaching method.

October 4, 2011

by Giorgio Sironi

· 21,621 Views