DZone
The Latest Career Development Topics

Build Flow Jenkins Plugin
With the advent of Continuous Integration and Continuous Delivery, our builds are split into different steps that together form the deployment pipeline. These steps might be compiling and running fast tests, running slow tests, running automated acceptance tests, or releasing the application, to cite a few.

Most of us use Jenkins/Hudson to implement Continuous Integration/Delivery, and we manage job orchestration by combining Jenkins plugins such as build pipeline, parameterized-build, join or downstream-ext. All of them have to be configured, which spreads the orchestration rules across the configuration of multiple jobs and makes the system very complex to maintain.

Build Flow enables us to define an upper-level flow item that manages job orchestration and link-up rules, using a dedicated DSL. Let's see a very simple example.

The first step is installing the plugin. Go to Jenkins -> Manage Jenkins -> Plugin Manager -> Available and search for the CloudBees Build Flow plugin. Then go to Jenkins -> New Job and you will see a new kind of job called Build Flow. In this example we are going to name it build-all-yy.

Now you only have to program, using the flow DSL, how this job should orchestrate the other jobs. In the "Define build flow using flow DSL" text input you can specify the sequence of commands to execute.

For this example I have already created two jobs, one executing the clean compile goals (job name yy-compile) and the other executing the javadoc goal (job name yy-javadoc). I know that this deployment pipeline is not what you would find in a real environment, but for now it is enough.

We want the javadoc job to run after the project is compiled. To configure this we don't have to create any upstream or downstream actions; simply add the next lines to the DSL text area:

    build("yy-compile");
    build("yy-javadoc");

Save and execute the build-all-yy job and both projects will be built sequentially.

Now suppose that we add a third job called yy-sonar, which runs the sonar goal to generate a code quality report. In this case it seems obvious that, after the project is compiled, the javadoc and code quality jobs can run in parallel. So the script is changed to:

    build("yy-compile")
    parallel (
        { build("yy-javadoc") },
        { build("yy-sonar") }
    )

The plugin also supports more operations, such as retry (similar in behaviour to the retry-failed-job plugin) and guard-rescue, which works mostly like a try+finally block. You can also create parameterized builds, access the build execution, or print to the Jenkins console. The next example prints the build number of the yy-compile job execution:

    b = build("yy-compile")
    out.println b.build.number

Finally, you can also get a quick graphical overview of the execution in the Status section. It could certainly be improved, but for now it is acceptable and can be used without any problem.

The Build Flow plugin is in its early stages; in fact it is only at version 0.4. But it is a plugin to keep in mind for the future, and I think it is good to know that it exists. Moreover, it is being developed by CloudBees folks, which suggests it will be well supported in Jenkins.

We Keep Learning.
Alex.

Warning: In order to run parallel tasks with the plugin, Anonymous users must have Read Job access (Jenkins -> Manage Jenkins -> Configure System). There is already an issue filed in Jira to fix this problem.
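To make the control flow of the second script easier to picture, here is a minimal Python sketch of the same shape: one step that must finish first, followed by two steps that run concurrently. It is only a conceptual model, not the Build Flow DSL and not how the plugin is implemented; the `build` function is a hypothetical stand-in that merely records the job name instead of triggering Jenkins.

```python
from concurrent.futures import ThreadPoolExecutor


def build(job_name, log):
    """Hypothetical stand-in for triggering a job; it only records the name."""
    log.append(job_name)
    return job_name


def run_flow():
    log = []
    # Step 1: compile must finish before anything else starts.
    build("yy-compile", log)
    # Step 2: javadoc and sonar are independent, so submit both at once
    # and wait for both results before the flow is considered finished.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(build, name, log)
                   for name in ("yy-javadoc", "yy-sonar")]
        for future in futures:
            future.result()
    return log


if __name__ == "__main__":
    print(run_flow())
```

The order of the two parallel entries in the log is not guaranteed, which mirrors the fact that the DSL makes no promise about which parallel job finishes first.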
August 2, 2012
by Alex Soto
· 37,664 Views · 1 Like
Bringing Order to Your Jenkins Jobs
Once you’ve been working with Jenkins and uberSVN for a while, you may find yourself in a situation where you have several jobs that need to run in a specific order, for example:

  • Job 1 and Job 3 can run simultaneously.
  • Job 2 should only start when Job 1 and Job 3 have finished running.
  • Job 4 should only start when Job 2 has finished.

How can you implement this setup? This is where Jenkins’ ‘Advanced Project Options’ and build triggers come in handy. In this tutorial, we’ll walk through the different options for scheduling jobs using Jenkins and uberSVN, the free ALM platform for Apache Subversion.

Note: this tutorial assumes you have already created a job and configured it to automatically poll your Subversion repository.

1) Open the Jenkins tab of your uberSVN installation and select a job.
2) Click the ‘Configure’ option from the left-hand menu.
3) In the ‘Advanced Project Options’ tab, select the ‘Advanced…’ button.
4) This reveals two options that are useful for ordering your jobs:

  • Block build when upstream project is building: blocks builds when a dependency is in the queue, or building. These dependencies include both direct and transitive dependencies.
  • Block build when downstream project is building: blocks builds when a child of the project is in the queue, or building. This applies to both direct and transitive children.

If these options don’t meet your needs, you can explicitly name a project (or projects) that must be built before your job is allowed to run. To set this:

1) Scroll down to the ‘Build triggers’ tab on the configure page.
2) Select the ‘Build after other projects are built’ checkbox. This will bring up a text box where you can list any number of projects.

Used properly, the build triggers and advanced project options should allow you to organize your jobs into a schedule.

Tip: if you need even more control over your build schedule, there are plenty of scheduling plugins available. To add plugins to Jenkins:

1) Open the ‘Manage Jenkins’ screen.
2) Click the ‘Manage Plugins’ link.
3) Open the ‘Available’ tab and select the appropriate plugins from the list.
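Before configuring triggers, it can help to write the ordering rules down as a dependency graph and check which jobs may run together. The sketch below does this for the four-job example from the start of the article, using Python's standard `graphlib` module (Python 3.9 or later). It is only a planning aid under that assumption; it does not talk to Jenkins, and the `build_waves` helper is a name made up for this illustration.

```python
from graphlib import TopologicalSorter

# Map each job to the set of jobs that must finish before it may start.
dependencies = {
    "Job 1": set(),
    "Job 3": set(),
    "Job 2": {"Job 1", "Job 3"},
    "Job 4": {"Job 2"},
}


def build_waves(deps):
    """Group jobs into waves; jobs in the same wave can run simultaneously."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the rules contradict each other
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(build_waves(dependencies), start=1):
        print(number, wave)
```

Each wave corresponds to a set of jobs whose ‘Build after other projects are built’ prerequisites are all satisfied at that point.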
July 28, 2012
by Jessica Thornsby
· 21,051 Views
Set up a Nightly Build Process with Jenkins, SVN and Nexus
We wanted to set up a nightly integration build for our projects so that we could run unit and integration tests on the latest version of our applications and their underlying libraries. We have a number of libraries that are shared across multiple projects, and we wanted this build to run every night and use the latest versions of those libraries, even if our applications had a specific release version defined in their Maven POM file. In this way we would be alerted early if someone added a change to one of the dependency libraries that could break an application when a developer later upgraded that dependency in a future version of the application.

The chart below illustrates the dependencies between our libraries and our applications.

Updating Versions Nightly

Both the crossdock-shared and messaging-shared libraries depend on the siestaframework library. The crossdock web service and crossdockmessaging applications both depend on the crossdock-shared and messaging-shared libraries. Because of this dependency structure, we wanted the siestaframework library built first. The crossdock-shared and messaging-shared libraries could be built in parallel, but we didn’t want the builds for the crossdock web service and crossdockmessaging applications to begin until all the libraries had finished building.

We also wanted the nightly build to tag Subversion with the build date and to upload the artifact to our Nexus “nightly build” repository. The resulting artifact would look something like siestaframework-20120720.jar.

Also, as mentioned, even though the crossdockmessaging app may specify in its POM file that it depends on version 5.0.4 of the siestaframework library, for the purposes of the nightly build we wanted it to use the freshly built siestaframework-nightly-20120720.jar version of the library.

The first problem to tackle was getting the current date into the project’s version number. For this I started with the Jenkins Zentimestamp plugin. With this plugin the format of Jenkins’ BUILD_ID timestamp can be changed. I used this to specify the format yyyyMMdd for the timestamp.

The next step was to get the timestamp into the version number of the project. I was able to accomplish this by using the Maven Versions plugin. One thing the Versions plugin can do is override the version number in the POM file at build time. The code snippet from the siestaframework POM file is below.

    <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>versions-maven-plugin</artifactId>
        <version>1.3.1</version>
    </plugin>

At this point the Jenkins job can be configured to invoke the “versions:set” goal, passing in the new version string to use. The ${BUILD_ID} Jenkins variable will hold the newly formatted date string. This will produce an artifact with the name siestaframework-nightly-20120720.jar.

Uploading Artifacts to a Nightly Repository

Since this job needed to upload the artifact to a different repository from the release repository defined in our project POM files, the “altDeploymentRepository” property was used to pass in the location of the nightly repository. The deployment portion of the siestaframework job specifies the location of the nightly repository, where ${lynden_nightly_repo} is a Jenkins variable containing the nightly repo URL.

Tagging Subversion

Finally, the Jenkins Subversion Tagging plugin was used to tag SVN if the project was successfully built. The plugin provides a post-build action for the job with the configuration section shown below.

Dynamically Updating Dependencies

Now that the main project is set up, the dependent projects are set up in a similar way, but they need to be configured to use the siestaframework-nightly-20120720 version of the dependency rather than whatever version they currently have specified in their POM file. This can be accomplished by changing the POM to use a property for the version number of the dependency. For example, if the snippet below was the original POM file:

    <dependency>
        <groupId>com.lynden</groupId>
        <artifactId>siestaframework</artifactId>
        <version>5.0.1</version>
    </dependency>

changing it to the following would allow the siestaframework version to be set dynamically:

    <properties>
        <siesta.version>5.0.1</siesta.version>
    </properties>

    <dependency>
        <groupId>com.lynden</groupId>
        <artifactId>siestaframework</artifactId>
        <version>${siesta.version}</version>
    </dependency>

This version can then be overridden by the Jenkins job. The example below shows the Jenkins configuration for the crossdock-shared build.

Enforcing Build Order

The final step in this process is setting up a structure to enforce the build order of the projects. The dependencies are set up in such a way that siestaframework needs to be built first, and the crossdock-shared and messaging-shared libraries can run concurrently once siestaframework finishes. The crossdock web service and crossdockmessaging application jobs can run concurrently too, but not until both shared libraries have finished.

Setting up the crossdock-shared and messaging-shared jobs to build after siestaframework finishes is pretty straightforward. In the Jenkins job configuration for both shared libraries, a build trigger is added that starts the job after siestaframework has been built.

To satisfy the requirement that the apps build only after all libraries have built, I enlisted the help of the Join plugin. The Join plugin can be used to execute a job once all “downstream” jobs have completed. What does this mean exactly? Looking at the diagram below, the crossdock-shared and messaging-shared jobs are “downstream” from the siestaframework job. Once both of these jobs complete, a join trigger can be used to start other jobs.

In this case, rather than having the join trigger kick off the app jobs directly, I created a dummy join job. In this way, as we add more application builds, we don’t need to keep modifying the siestaframework job with each new application job. To illustrate the configuration, siestaframework has a new post-build action (below):

join-build is a Jenkins job I configured that does not do anything when executed. Our crossdock web service and crossdockmessaging applications then define their builds to trigger as soon as join-build has completed.

In this way we are able to run builds each night that update to the latest version of our dependencies, tag SVN, and archive the binaries to Nexus. I’d love to hear feedback from anyone who is handling nightly builds via Jenkins, and how they have handled the configuration and build issues.
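The naming scheme used throughout this article (a date stamp in yyyyMMdd form embedded in the version, which then becomes part of the artifact file name) can be sketched in a few lines of Python. This is only an illustration of the string format, assuming Maven's usual artifactId-version.packaging layout; the function names are made up here and nothing below is part of Jenkins, Maven or the plugins mentioned.

```python
from datetime import date


def nightly_version(build_date, prefix="nightly"):
    """Return a version string such as 'nightly-20120720' for the given date."""
    # %Y%m%d is the strftime equivalent of the yyyyMMdd timestamp format.
    return f"{prefix}-{build_date.strftime('%Y%m%d')}"


def artifact_name(artifact_id, version, packaging="jar"):
    """Compose a file name in the '<artifactId>-<version>.<packaging>' layout."""
    return f"{artifact_id}-{version}.{packaging}"


if __name__ == "__main__":
    version = nightly_version(date(2012, 7, 20))
    print(artifact_name("siestaframework", version))
```

Keeping the date format fixed-width and zero-padded means the nightly artifacts sort chronologically by file name.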
July 25, 2012
by Rob Terpilowski
· 22,837 Views
20 Subjects Every Software Engineer Should Know
Here are the most important subjects for software engineering, with brief explanations:

1. Object oriented analysis & design: For better maintainability, reusability and faster development, the most widely accepted approach is OOAD. It and its SOLID principles are very important for software engineering.
2. Software quality factors: Software engineering depends on some very important quality factors. Understanding and applying them is crucial.
3. Data structures & algorithms: Basic data structures like array, list, stack, tree, map, set etc. and useful algorithms are vital for software development. Their logical structure should be known.
4. Big-O notation: Big-O notation describes how the cost of an algorithm or code section grows with the size of its input. Understanding it is very important for comparing performance.
5. UML notation: UML is the universal and complete language for software design & analysis. If UML is missing from a development process, it feels as if there is no engineering.
6. Software processes and metrics: Software engineering is not a random process. It requires highly systematic techniques and some numbers to monitor them. So, processes and metrics are essential.
7. Design patterns: Design patterns are standard, effective solutions for specific problems. If you don't want to reinvent the wheel, you should learn them.
8. Operating systems basics: Learning OS basics is very important because all applications run on an OS. By learning them, we gain better vision, viewpoints and performance for our applications.
9. Computer organization basics: All applications, including the OS, require hardware for physical interaction. So, learning computer organization basics is again vital for better vision, viewpoints and performance.
10. Network basics: Networking is related to computer organization, the OS and the whole information transfer process. We will face it sooner or later during software development. So, it is important to learn network basics.
11. Requirement analysis: Requirement analysis is the starting point and one of the most important parts of software engineering. Performing it correctly and practically takes experience, but it is essential.
12. Software testing: Testing is another important part of software engineering. Unit testing, its best practices, and techniques like black box, white box, mocking, TDD, integration testing etc. are subjects which must be known.
13. Dependency management: Library (JAR, DLL etc.) management and widely known tools (Maven, Ant, Ivy etc.) are essential for large projects. Otherwise, antipatterns like Jar Hell are inevitable.
14. Continuous integration: Continuous integration makes testing large modules and components easier and automatic, and can also perform auto-versioning. Its aim and tools (like Hudson etc.) should be known.
15. ORM (Object relational mapping): ORM, and its widely known implementation the Hibernate framework, is an important technique for mapping objects to database tables. It reduces code length and maintenance time.
16. DI (Dependency Injection): DI or IoC (Inversion of Control), and its widely known implementation the Spring framework, makes object creation and lifetime management easier in big enterprise applications.
17. Version control systems: VCS tools (SVN, TFS, CVS etc.) are very important because they save so much time in collaborative work and versioning. Their logical viewpoint and standard commands should be known.
18. Internationalization (i18n): Extracting strings into external files is the best way of supporting multiple languages in our applications. How to do this in different IDEs and technologies must be known.
19. Architectural patterns: Understanding architectural design patterns (like MVC, MVP, MVVM etc.) is essential for producing maintainable, clean, extendable and testable source code.
20. Writing clean code: Working code is not enough; it must also be readable and maintainable. So, code formatting and techniques for writing readable code need to be known and applied.
July 2, 2012
by Cagdas Basaraner
· 108,559 Views · 5 Likes
Reportlab: Mixing Fixed Content and Flowables
Recently I needed the ability to use Reportlab’s flowables, but place them in fixed locations. Some of you are probably wondering why I would want to do that. The nice thing about flowables, like the Paragraph, is that they’re easily styled. If I could bold something or center something AND put it in a fixed location, then that would rock! It took a lot of Googling and trial and error, but I finally got a decent template put together that I could use for mailings. In this article, I’m going to show you how to do this too.

Getting Started

You’ll need to make sure you have Reportlab or you’ll end up with a whole lot of nothing. You can go here to grab it. While you wait for it to download you can continue reading this article or go do something else productive. Are you ready now? Then let’s get this show on the road!

Now we just need to come up with an example. Fortunately I was working on something at my job that I’ve been able to dummy up into the following silly and incomplete form letter. Study the code closely because you never know when there will be a test.

    from reportlab.lib.pagesizes import letter
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.lib.units import mm, inch
    from reportlab.pdfgen import canvas
    from reportlab.platypus import Image, Paragraph, Table


    class LetterMaker(object):
        """Draws a form letter by placing flowables at fixed positions."""

        def __init__(self, pdf_file, org, seconds):
            self.c = canvas.Canvas(pdf_file, pagesize=letter)
            self.styles = getSampleStyleSheet()
            self.width, self.height = letter
            self.organization = org
            self.seconds = seconds

        def createDocument(self):
            """Lay out the return address, logo and letter body."""
            voffset = 65

            # create return address (Paragraph accepts HTML-like markup
            # such as <br/> for line breaks)
            address = """
            Jack Spratt<br/>
            222 Ioway Blvd, Suite 100<br/>
            Galls, TX 75081-4016
            """
            p = Paragraph(address, self.styles["Normal"])

            # add a logo and size it
            logo = Image("snakehead.jpg")
            logo.drawHeight = 2*inch
            logo.drawWidth = 2*inch
            ## logo.wrapOn(self.c, self.width, self.height)
            ## logo.drawOn(self.c, *self.coord(140, 60, mm))

            # put the address and logo side by side in a table
            data = [[p, logo]]
            table = Table(data, colWidths=4*inch)
            table.setStyle([("VALIGN", (0,0), (0,0), "TOP")])
            table.wrapOn(self.c, self.width, self.height)
            table.drawOn(self.c, *self.coord(18, 60, mm))

            # insert body of letter
            ptext = "Dear Sir or Madam:"
            self.createParagraph(ptext, 20, voffset+35)

            ptext = """
            The document you are holding is a set of requirements for your next mission, should you
            choose to accept it. In any event, this document will self-destruct %s seconds after you
            read it. Yes, %s can tell when you're done...usually.
            """ % (self.seconds, self.organization)
            p = Paragraph(ptext, self.styles["Normal"])
            p.wrapOn(self.c, self.width-70, self.height)
            p.drawOn(self.c, *self.coord(20, voffset+48, mm))

        def coord(self, x, y, unit=1):
            """
            # http://stackoverflow.com/questions/4726011/wrap-text-in-a-table-reportlab
            Helper method to help position flowables in Canvas objects
            """
            x, y = x * unit, self.height - y * unit
            return x, y

        def createParagraph(self, ptext, x, y, style=None):
            """Draw a single Paragraph at the given position (in mm)."""
            if not style:
                style = self.styles["Normal"]
            p = Paragraph(ptext, style=style)
            p.wrapOn(self.c, self.width, self.height)
            p.drawOn(self.c, *self.coord(x, y, mm))

        def savePDF(self):
            """Write the PDF to disk."""
            self.c.save()


    if __name__ == "__main__":
        doc = LetterMaker("example.pdf", "The MVP", 10)
        doc.createDocument()
        doc.savePDF()

Now you’ve seen the code, so we’ll spend a little time going over how it works. First off we create a Canvas object that we can use throughout our LetterMaker class. We also create a stylesheet and set up a few other instance attributes. In the createDocument method, we create a Paragraph (an address) using some HTML-like tags to control the font and line breaking behavior. Then we create a logo and size it before putting both items into a Reportlab Table object. You’ll note that I’ve left in a couple of commented-out lines that show how to place the logo without the table. We use the coord method to help position the flowable. I found it on StackOverflow and thought it was pretty handy.

The body of the letter uses a little string substitution and puts the result into another Paragraph. We also use a stored offset to help us position things. I find that storing a couple of offsets for certain portions of the code is very helpful. If you use them carefully then you can just change a couple of offsets to move the content around on the document rather than having to edit the position of each element. If you need to draw lines or shapes, you can do them in the usual way with your canvas object.

Wrapping Up

I hope this code will help you in your PDF creation endeavors. I have to admit that I’m posting it on here as much for my own future benefit as for your own. I’m a little sad I had to strip out so much from it, but my organization wouldn’t like it very much if I posted the original. Regardless, you now have the tools to create some pretty fancy PDF documents with Python. Now you just have to get out there and do it!
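The coordinate flip performed by the coord method is worth seeing on its own: a PDF canvas measures from the bottom-left corner, while the letter layout above is written as distances from the top-left. The sketch below reproduces that arithmetic without importing Reportlab. The constants are assumptions written out by hand to mirror Reportlab's units (72 points per inch, US letter 11 inches tall), and this standalone `coord` takes the page height as a parameter instead of reading it from an instance.

```python
# Hand-written constants mirroring reportlab.lib.units and the letter page size.
INCH = 72.0                # points per inch
MM = INCH / 25.4           # points per millimetre
LETTER_HEIGHT = 11 * INCH  # US letter page height in points


def coord(x, y, page_height=LETTER_HEIGHT, unit=1.0):
    """Convert a position measured from the top-left corner into the
    bottom-left-origin coordinates that a PDF canvas expects."""
    return x * unit, page_height - y * unit


if __name__ == "__main__":
    # 18 mm from the left edge and 60 mm down from the top of the page.
    print(coord(18, 60, unit=MM))
```

Because every element goes through the same helper, changing one offset or the unit moves content consistently, which is the point made about stored offsets above.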
June 29, 2012
by Mike Driscoll
· 19,871 Views
Amazon EMR Tutorial: Running a Hadoop MapReduce Job Using Custom JAR
See original post at https://muhammadkhojaye.blogspot.com/2012/04/how-to-run-amazon-elastic-mapreduce-job.html

Introduction

Amazon EMR is a web service which can be used to easily and efficiently process enormous amounts of data. It uses a hosted Hadoop framework running on the web-scale infrastructure of Amazon EC2 and Amazon S3. Amazon EMR removes most of the cumbersome details of Hadoop by taking care of provisioning Hadoop, running the job flow, terminating the job flow, moving the data between Amazon EC2 and Amazon S3, and optimizing Hadoop.

In this tutorial, we will develop a WordCount example in Java using Hadoop and then execute our program on Amazon Elastic MapReduce.

Prerequisites

You must have valid AWS account credentials. You should also have a general familiarity with using the Eclipse IDE before you begin. The reader can also use any other IDE of their choice.

Step 1 – Develop MapReduce WordCount Java Program

In this section, we are first going to develop a WordCount application. A WordCount program determines how many times different words appear in a set of files.

In Eclipse (or whatever IDE you are using), create a simple Java project with the name "WordCount".

Create a Java class named Map and override the map method as follows:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit (word, 1) for every token in the input line.
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

Create a Java class named Reduce and override the reduce method as shown below:

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum all the counts emitted for this word.
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

Create a Java class named WordCount and define the main method as below:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class WordCount {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = new Job(conf, "wordcount");
            job.setJarByClass(WordCount.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            job.setMapperClass(Map.class);
            job.setReducerClass(Reduce.class);
            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);
            // args[0] is the input path, args[1] is the output path.
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            job.waitForCompletion(true);
        }
    }

Export the WordCount program as a JAR using Eclipse and save it to some location on disk. Make sure that you have provided the main class (WordCount) during the export of the JAR file, as shown below.

Our JAR is ready!

Step 2 – Upload the WordCount JAR and Input Files to Amazon S3

Now we are going to upload the WordCount JAR to Amazon S3. First, go to the following URL: https://console.aws.amazon.com/s3/home

Next, click “Create Bucket”, give your bucket a name, and click the “Create” button. Select your new S3 bucket in the left-hand pane. Upload the WordCount JAR and a sample input file for counting the words.

Step 3 – Running an Elastic MapReduce job

Now that the JAR is uploaded into S3, all we need to do is create a new job flow. Let's execute the steps below. (I encourage readers to check out the following link for details regarding each step: How to Create a Job Flow Using a Custom JAR.)

Sign in to the AWS Management Console and open the Amazon Elastic MapReduce console at https://console.aws.amazon.com/elasticmapreduce/

Click Create New Job Flow.

In the DEFINE JOB FLOW page, enter the following details:

a) Job Flow Name = WordCountJob
b) Select Run your own application
c) Select Custom JAR in the drop-down list
d) Click Continue

In the SPECIFY PARAMETERS page, enter values in the boxes using the following as a guide, and then click Continue.

JAR Location = bucketName/jarFileLocation
JAR Arguments = s3n://bucketName/inputFileLocation s3n://bucketName/outputpath

Please note that the output path must be unique each time we execute the job. Hadoop always creates a folder with the name specified here.

After starting the job, just wait and monitor it as it runs through the Hadoop flow. You can also look for errors by using the Debug button. The job should complete within 10 to 15 minutes (this can also depend on the size of the input). After the job completes, you can view the results in the S3 Browser panel. You can also download the files from S3 and analyze the outcome of the job.
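To check what output to expect from the job before paying for a cluster, the map, shuffle and reduce phases of the WordCount program can be imitated in memory. The following Python sketch is only a model of the data flow under that assumption; it does not use Hadoop or EMR, and the function names are invented for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token, like Map.map."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts for one word, like Reduce.reduce."""
    return key, sum(values)


def word_count(lines):
    """Run all three phases over an iterable of input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["to be or not", "to be"]))
```

Running the same small input through the real job and comparing the part files in the output folder against this result is a cheap sanity check.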
Amazon Elastic MapReduce Resources

  • Amazon Elastic MapReduce Documentation, http://aws.amazon.com/documentation/elasticmapreduce/
  • Amazon Elastic MapReduce Getting Started Guide, http://docs.amazonwebservices.com/ElasticMapReduce/latest/GettingStartedGuide/
  • Amazon Elastic MapReduce Developer Guide, http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/
  • Apache Hadoop, http://hadoop.apache.org/
April 23, 2012
by Muhammad Ali Khojaye
· 59,026 Views
Face Detection using HTML5, Javascript, Webrtc, Websockets, Jetty and OpenCV
How to create a real-time face detection system using HTML5, JavaScript, and OpenCV, leveraging WebRTC for webcam access and WebSockets for client-server communication.
April 23, 2012
by Jos Dirksen
· 53,100 Views
Scheduling a Job Using The NCron Library
Introduction

NCron is a .NET scheduling framework. It is a .NET counterpart of cron, the time-based job scheduler found on Unix-like operating systems, and of Cron4j, a scheduling library for Java. NCron is lightweight and easy to use, with little learning curve. One advantage is that you can use it from C#, VB.NET or any other .NET programming language. It takes your mind off the details of scheduling so you can focus on implementing the business logic of your application or the job to be scheduled. Details such as threading and timers are taken care of.

NCron Library

You can point your browser to http://code.google.com/p/ncron/downloads/detail?name=ncron-2.1.zip to download the NCron library. You need to add a reference to the NCron library in your project to be able to access the classes and functionality of the NCron scheduling framework.

Scheduling a Job

When creating a job to be scheduled using NCron, the job is wrapped in a class which must extend the class NCron.CronJob and override the void method Execute:

    public class MyJob : NCron.CronJob
    {
        public override void Execute()
        {
            System.IO.File.Copy(@"c:\output.out", @"f:\output.out");
        }
    }

The work to be scheduled is placed in the Execute method. The next thing to do is to give NCron control over the job execution by calling the static method Bootstrap.Init() at the entry point of your application; for example, this can be put in the Main method. You should have a static setup method, which I called JobSetup, that is passed into the Bootstrap.Init() method.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using NCron.Fluent.Crontab;
    using NCron.Fluent.Generics;
    using NCron.Service;

    namespace NcronExample
    {
        public class Program
        {
            private static void Main(string[] args)
            {
                Bootstrap.Init(args, JobSetup);
            }

            private static void JobSetup(SchedulingService schedulingService)
            {
                schedulingService.At("* * * * *").Run<MyJob>();
            }
        }
    }

The line of code inside the JobSetup method specifies how the job is going to be run, and the parameter of the schedulingService.At() method is known as a crontab expression, which I will discuss shortly. The SchedulingService class has a number of other methods of interest:

    service.Daily().Run<MyJob>();   // runs the scheduled job once every day
    service.Hourly().Run<MyJob>();  // runs the scheduled job once every hour
    service.Weekly().Run<MyJob>();  // runs the scheduled job once every week

Crontab Expression

A crontab expression is a string made up of five fields separated by spaces. When parsed, the expression produces occurrences of time based on the schedule expressed in the crontab format. NCron parses crontab expressions using NCrontab (Crontab for .NET), an open source library for parsing crontab expressions.

A regular crontab expression is of the form

    * * * * *

where the first * is the minute, from 0-59. The second * is the hour, from 0-23. The third * is the day of the month, from 1-31. The fourth * is the month, from 1-12. The last * is the day of the week, from 0-6, where 0 represents Sunday. An asterisk (wildcard) left in the expression stands for all valid values of that field.

If you want the scheduled job to run every minute, the expression will be in the form below.

    * * * * *

The expression below causes the scheduler to run the job at 09:05 every day (minute 5 of hour 9).

    5 9 * * *

To run a job at the tenth minute of every hour from Monday to Friday only, the expression will be in the form below.

    10 * * * 1,2,3,4,5

You can read more on crontab expressions at http://code.google.com/p/ncrontab/wiki/CrontabExamples

Deploying the Scheduled Job

After the application has been built, you can deploy the scheduled job as a service by opening a command prompt, changing directory to where the executable of the application is, and running the following command to install the scheduled job as a service:

    ncronexample install

And that is it!
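To make the five-field format concrete, here is a small Python sketch that checks whether a given moment matches a crontab expression. It is a simplified illustration and not NCrontab's implementation: it only understands `*`, single numbers and comma-separated lists (no ranges or step values), and it assumes all five fields must match at once. The function names are made up for this example.

```python
from datetime import datetime


def parse_field(field, low, high):
    """Expand one crontab field ('*', '5' or '1,2,3') into a set of integers."""
    if field == "*":
        return set(range(low, high + 1))
    values = {int(part) for part in field.split(",")}
    if not all(low <= value <= high for value in values):
        raise ValueError(f"value out of range {low}-{high}: {field}")
    return values


def matches(expression, moment):
    """Return True if the datetime falls on the schedule described by the expression."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    minute, hour, day, month, weekday = fields
    return (
        moment.minute in parse_field(minute, 0, 59)
        and moment.hour in parse_field(hour, 0, 23)
        and moment.day in parse_field(day, 1, 31)
        and moment.month in parse_field(month, 1, 12)
        # isoweekday() is 1-7 with Sunday = 7; modulo 7 maps Sunday to 0.
        and moment.isoweekday() % 7 in parse_field(weekday, 0, 6)
    )


if __name__ == "__main__":
    wednesday = datetime(2012, 4, 18, 9, 5)
    print(matches("5 9 * * *", wednesday))
    print(matches("10 * * * 1,2,3,4,5", wednesday))
```

Testing an expression against a few known dates like this is a quick way to catch a swapped minute and hour field before deploying the service.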
April 18, 2012
by Ayobami Adewole
· 17,476 Views
Quartz Scheduler Misfire Instructions Explained
Sometimes Quartz is not capable of running your job at the time you wanted. There are three reasons for that:
  • all worker threads were busy running other jobs (probably with higher priority)
  • the scheduler itself was down
  • the job was scheduled with a start time in the past (probably a coding error)

You can increase the number of worker threads by customizing org.quartz.threadPool.threadCount in quartz.properties (default is 10). But you cannot really do anything when the whole application/server/scheduler was down. The situation in which Quartz was incapable of firing a given trigger is called a misfire. Do you know what Quartz does when it happens? It turns out there are various strategies (called misfire instructions) Quartz can take, and there are also some defaults if you haven't thought about it. But in order to make your application robust and predictable (especially under heavy load or maintenance) you should really make sure your triggers and jobs are configured consciously.

There are different configuration options (available misfire instructions) depending on the trigger chosen. Quartz also behaves differently depending on the trigger setup (the so-called smart policy). Although the misfire instructions are described in the documentation, I found it hard to understand what they really mean. So I created this small summary article.

Before I dive into the details, there is yet another configuration option that should be described. It is org.quartz.jobStore.misfireThreshold (in milliseconds), defaulting to 60000 (a minute). It defines how late the trigger should be to be considered misfired. With the default setup, if a trigger was supposed to fire 30 seconds ago, Quartz will happily just run it. Such a delay is not considered misfiring. However, if the trigger is discovered 61 seconds after the scheduled time, the special misfire handler thread takes care of it, obeying the misfire instruction.
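The threshold rule just described boils down to a single comparison. The following is a minimal sketch of that rule only, not Quartz code: the class MisfireThresholdSketch and its method are hypothetical names, and treating a delay exactly equal to the threshold as "not misfired" is an assumption.

```java
import java.time.Duration;
import java.time.Instant;

final class MisfireThresholdSketch {
    // Default of org.quartz.jobStore.misfireThreshold as stated in the text, in milliseconds.
    static final long DEFAULT_THRESHOLD_MS = 60_000L;

    // True when the trigger is later than the threshold allows (assumption: strictly later).
    static boolean isMisfired(Instant scheduled, Instant now, long thresholdMs) {
        long lateMs = Duration.between(scheduled, now).toMillis();
        return lateMs > thresholdMs;
    }
}
```

With the default threshold a trigger that is 30 seconds late is simply run, while one that is 61 seconds late is handed to the misfire instruction.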
For test purposes we will set this parameter to 1000 (1 second) so that we can test misfiring quickly.

Simple trigger without repeating

In our first example we will see how misfiring is handled by simple triggers scheduled to run only once:

val trigger = newTrigger().
    startAt(DateUtils.addSeconds(new Date(), -10)).
    build()

The same trigger but with an explicitly set misfire instruction handler:

val trigger = newTrigger().
    startAt(DateUtils.addSeconds(new Date(), -10)).
    withSchedule(
        simpleSchedule().
            withMisfireHandlingInstructionFireNow() //MISFIRE_INSTRUCTION_FIRE_NOW
    ).
    build()

For the purpose of testing I am simply scheduling the trigger to run 10 seconds ago (so it is 10 seconds late by the time it is created!). In the real world you would normally never schedule triggers like that. Instead imagine the trigger was set correctly, but by the time it was scheduled the scheduler was down or didn't have any free worker threads. Nevertheless, how will Quartz handle this extraordinary situation? In the first code snippet above no misfire handling instruction is set (the so-called smart policy is used in that case). The second code snippet explicitly defines what kind of behaviour we expect when misfiring occurs. See the table:

Instruction | Meaning
smart policy - default | See: withMisfireHandlingInstructionFireNow
withMisfireHandlingInstructionFireNow / MISFIRE_INSTRUCTION_FIRE_NOW | The job is executed immediately after the scheduler discovers the misfire situation. This is the smart policy. Example scenario: you have scheduled some system clean up at 2 AM. Unfortunately the application was down due to maintenance at that time and was brought back at 3 AM. So the trigger misfired and the scheduler tries to save the situation by running it as soon as it can, at 3 AM.
withMisfireHandlingInstructionIgnoreMisfires / MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY (see the QTZ-283 note) | See: withMisfireHandlingInstructionFireNow
withMisfireHandlingInstructionNextWithExistingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_EXISTING_COUNT | See: withMisfireHandlingInstructionNextWithRemainingCount
withMisfireHandlingInstructionNextWithRemainingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_REMAINING_COUNT | Does nothing: the misfired execution is ignored and there is no next execution. Use this instruction when you want to completely discard the misfired execution. Example scenario: the trigger was supposed to start recording a TV program. There is no point in starting the recording when the trigger misfired and is already 2 hours late.
withMisfireHandlingInstructionNowWithExistingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_EXISTING_REPEAT_COUNT | See: withMisfireHandlingInstructionFireNow
withMisfireHandlingInstructionNowWithRemainingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_REMAINING_REPEAT_COUNT | See: withMisfireHandlingInstructionFireNow

Simple trigger repeating fixed number of times

This scenario is much more complicated. Imagine we have scheduled some job to repeat a fixed number of times:

val trigger = newTrigger().
    startAt(dateOf(9, 0, 0)).
    withSchedule(
        simpleSchedule().
            withRepeatCount(7).
            withIntervalInHours(1).
            withMisfireHandlingInstructionFireNow() //or other
    ).
    build()

In this example the trigger is supposed to fire 8 times (first execution + 7 repetitions) every hour, beginning at 9 AM today (startAt(dateOf(9, 0, 0))). Thus the last execution should occur at 4 PM. However, assume that for some reason the scheduler was not capable of running jobs at 9 and 10 AM and it discovered that fact at 10:15 AM, i.e. 2 firings misfired. How will the scheduler behave in this situation?
Instruction | Meaning
smart policy - default | See: withMisfireHandlingInstructionNowWithExistingCount
withMisfireHandlingInstructionFireNow / MISFIRE_INSTRUCTION_FIRE_NOW | See: withMisfireHandlingInstructionNowWithRemainingCount
withMisfireHandlingInstructionIgnoreMisfires / MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY (see the QTZ-283 note) | Fires all triggers that were missed as soon as possible and then goes back to the ordinary schedule. Example scenario: with this strategy the scheduler in our example will fire the jobs scheduled at 9 and 10 AM immediately. Then it will wait until 11 AM and go back to the ordinary schedule. Note: when handling misfires it is equally important to realize that the actual job execution time might be way after the scheduled time. This means you cannot simply rely on the current system date, but need to use JobExecutionContext.getScheduledFireTime(): def execute(context: JobExecutionContext) { val date = context.getScheduledFireTime //... }
withMisfireHandlingInstructionNextWithExistingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_EXISTING_COUNT | The scheduler won't do anything immediately. Instead it will wait for the next scheduled time and then run all executions at the scheduled intervals. See also: withMisfireHandlingInstructionNextWithRemainingCount. Example scenario: at 10:15 the scheduler discovers 2 misfired executions. It waits until the next scheduled time (11 AM) and fires all 8 scheduled executions every hour, stopping at 6 PM (the trigger should have stopped at 4 PM).
withMisfireHandlingInstructionNextWithRemainingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_REMAINING_COUNT | The scheduler discards misfired executions and waits for the next scheduled time. The total number of trigger executions will be fewer than configured. Example scenario: at 10:15 two misfired executions are discarded. The scheduler waits for the next scheduled time (11 AM) and fires the remaining triggers up to 4 PM. Effectively it behaves as if the misfire never occurred.
withMisfireHandlingInstructionNowWithExistingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_EXISTING_REPEAT_COUNT | The first misfired trigger is executed immediately. Then the scheduler waits the desired interval and executes all remaining triggers. Effectively the first fire time of the misfired trigger is moved to the current time with no other changes. Example scenario: at 10:15 the scheduler runs the first misfired execution. Then it waits 1 hour and fires the second one at 11:15 AM. All 8 executions are performed, the last one at 5:15 PM.
withMisfireHandlingInstructionNowWithRemainingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_REMAINING_REPEAT_COUNT | The first misfired execution runs immediately. Remaining misfired executions are discarded. Triggers that were not misfired are executed with the desired interval. Example scenario: at 10:15 the scheduler runs the first misfired execution (from 9 AM). It discards the remaining misfired executions (the one from 10 AM) and waits 1 hour to execute six more triggers: 11:15, 12:15, … 4:15 PM.

Simple trigger repeating infinitely

In this scenario the trigger repeats an infinite number of times at a given interval:

val trigger = newTrigger().
    startAt(dateOf(9, 0, 0)).
    withSchedule(
        simpleSchedule().
            withRepeatCount(SimpleTrigger.REPEAT_INDEFINITELY).
            withIntervalInHours(1).
            withMisfireHandlingInstructionFireNow() //or other
    ).
    build()

Once again the trigger should fire every hour, beginning at 9 AM today (startAt(dateOf(9, 0, 0))). However, the scheduler was not capable of running jobs at 9 and 10 AM and it discovered that fact at 10:15 AM, i.e. 2 firings misfired. This is a more general situation compared to a simple trigger running a fixed number of times.
Instruction | Meaning
smart policy - default | See: withMisfireHandlingInstructionNextWithRemainingCount
withMisfireHandlingInstructionFireNow / MISFIRE_INSTRUCTION_FIRE_NOW | See: withMisfireHandlingInstructionNowWithRemainingCount
withMisfireHandlingInstructionIgnoreMisfires / MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY (see the QTZ-283 note) | The scheduler will immediately run all misfired triggers, then continue on schedule. Example scenario: the triggers scheduled at 9 and 10 AM are executed immediately. Future invocations (next scheduled at 11 AM) are executed according to the plan.
withMisfireHandlingInstructionNextWithExistingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_EXISTING_COUNT | See: withMisfireHandlingInstructionNextWithRemainingCount
withMisfireHandlingInstructionNextWithRemainingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_REMAINING_COUNT | Does nothing: misfired executions are discarded. Then the scheduler waits for the next scheduled interval and goes back to schedule. Example scenario: the misfired executions at 9 and 10 AM are discarded. The first execution occurs at 11 AM.
withMisfireHandlingInstructionNowWithExistingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_EXISTING_REPEAT_COUNT | See: withMisfireHandlingInstructionNowWithRemainingCount
withMisfireHandlingInstructionNowWithRemainingCount / MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_REMAINING_REPEAT_COUNT | The first misfired execution is run immediately, the remaining ones are discarded. The next execution happens after the desired interval. Effectively the first execution time is moved to the current time. Example scenario: the scheduler fires the misfired trigger immediately at 10:15 AM. Then it waits an hour, runs the second one at 11:15 AM and continues with a 1 hour interval.

CRON triggers

CRON triggers are the most popular ones amongst Quartz users. However there are also two other available triggers: DailyTimeIntervalTrigger (e.g. fire every 25 minutes) and CalendarIntervalTrigger (e.g. fire every 5 months).
They support triggering policies not possible in both CRON and simple triggers. However they understand the same misfire handling instructions as the CRON trigger.

val trigger = newTrigger().
    withSchedule(
        cronSchedule("0 0 9-17 ? * MON-FRI").
            withMisfireHandlingInstructionFireAndProceed() //or other
    ).
    build()

In this example the trigger should fire every hour between 9 AM and 5 PM, from Monday to Friday. But once again the first two invocations were missed (so the trigger misfired) and this situation was discovered at 10:15 AM. Note that the available misfire instructions are different compared to simple triggers:

Instruction | Meaning
smart policy - default | See: withMisfireHandlingInstructionFireAndProceed
withMisfireHandlingInstructionIgnoreMisfires / MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY (see the QTZ-283 note) | All misfired executions are immediately executed, then the trigger runs back on schedule. Example scenario: the executions scheduled at 9 and 10 AM are executed immediately. The next scheduled execution (at 11 AM) runs on time.
withMisfireHandlingInstructionFireAndProceed / MISFIRE_INSTRUCTION_FIRE_ONCE_NOW | Immediately executes the first misfired execution and discards the others (i.e. all misfired executions are merged together). Then back to schedule. No matter how many trigger executions were missed, only a single immediate execution is performed. Example scenario: the executions scheduled at 9 and 10 AM are merged and executed only once (in other words: the execution scheduled at 10 AM is discarded). The next scheduled execution (at 11 AM) runs on time.
withMisfireHandlingInstructionDoNothing / MISFIRE_INSTRUCTION_DO_NOTHING | All misfired executions are discarded, the scheduler simply waits for the next scheduled time. Example scenario: the executions scheduled at 9 and 10 AM are discarded, so basically nothing happens. The next scheduled execution (at 11 AM) runs on time.
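The three CRON misfire instructions in the table above differ only in which of the missed executions run at the moment the misfire is discovered. The following is a minimal sketch of that decision in plain Java, not Quartz's implementation: CronMisfireSketch, its Policy enum and firedOnDiscovery are hypothetical names used for illustration.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

final class CronMisfireSketch {
    // Stand-ins for the three CRON misfire instructions named in the table.
    enum Policy { IGNORE_MISFIRES, FIRE_AND_PROCEED, DO_NOTHING }

    // Returns the missed scheduled times that are executed immediately on discovery.
    static List<LocalTime> firedOnDiscovery(List<LocalTime> missed, Policy policy) {
        switch (policy) {
            case IGNORE_MISFIRES:
                // Every missed execution is run.
                return new ArrayList<>(missed);
            case FIRE_AND_PROCEED:
                // Missed executions are merged into a single immediate run.
                return missed.isEmpty()
                    ? new ArrayList<>()
                    : new ArrayList<>(missed.subList(0, 1));
            default:
                // DO_NOTHING: all missed executions are discarded.
                return new ArrayList<>();
        }
    }
}
```

For the scenario in the text (9 and 10 AM missed), the three policies yield two runs, one run and no runs respectively; in every case the 11 AM execution then fires on schedule.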
Note on QTZ-283: MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY is reported as not working with JDBCJobStore. Apparently there is a bug when JDBCJobStore is used, so keep an eye on that issue.

As you can see, various triggers behave differently based on the actual setup. Moreover, even though the so-called smart policy is provided, often the decision is based on business requirements. Essentially there are three major strategies: ignore; run immediately and continue; and discard and wait for next. They all have different use-cases:

  • Use ignore policies when you want to make sure all scheduled executions are triggered, even if it means multiple misfired triggers will fire. Think about a job that generates a report every hour based on orders placed during the last hour. If the server was down for 8 hours, you still want to have those reports generated as soon as you can. In this case the ignore policies will simply run all triggers scheduled during those 8 hours as fast as the scheduler can. They will be several hours late, but will eventually be executed.
  • Use now* policies when there are jobs executing periodically and upon a misfire they should run as soon as possible, but only once. Think of a job that cleans the /tmp directory every minute. If the scheduler was busy for 20 minutes and can finally run this job, you don't want to run it 20 times! One is enough, but make sure it runs as soon as it can. Then back to your normal one-minute intervals.
  • Finally, next* policies are good when you want to make sure your job runs at particular points in time. For example you need to fetch stock prices at quarter past every hour. They change rapidly, so if your job misfired and it is already 20 minutes past the full hour, don't bother. You missed the correct time by 5 minutes and now you don't really care. It is better to have a gap rather than an inaccurate value. In this case Quartz will skip all misfired executions and simply wait for the next one.
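The fixed-count scenario from earlier in the article (8 hourly firings from 9 AM, misfire discovered at 10:15 AM) can be replayed with a little arithmetic. This is a sketch that reproduces the outcomes stated in the tables, not Quartz's algorithm: SimpleMisfireSketch, its Policy enum and reschedule are hypothetical names, and it assumes no execution ran before the misfire was discovered.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

final class SimpleMisfireSketch {
    // Stand-ins for the four reschedule instructions of simple triggers.
    enum Policy {
        NOW_WITH_EXISTING_COUNT, NOW_WITH_REMAINING_COUNT,
        NEXT_WITH_EXISTING_COUNT, NEXT_WITH_REMAINING_COUNT
    }

    // Returns the times at which the job runs after the misfire is handled.
    static List<LocalTime> reschedule(LocalTime start, Duration interval, int totalFirings,
                                      LocalTime discovered, Policy policy) {
        // Count the scheduled firings that were already due at discovery time.
        int missed = 0;
        LocalTime next = start;
        while (missed < totalFirings && !next.isAfter(discovered)) {
            missed++;
            next = next.plus(interval);
        }
        boolean fireNow = policy == Policy.NOW_WITH_EXISTING_COUNT
            || policy == Policy.NOW_WITH_REMAINING_COUNT;
        LocalTime first = fireNow ? discovered : next;

        int count;
        switch (policy) {
            case NOW_WITH_REMAINING_COUNT:
                // One immediate run replaces all misfires, then the untouched firings follow.
                count = totalFirings - missed + 1;
                break;
            case NEXT_WITH_REMAINING_COUNT:
                // Misfires are dropped; only the untouched firings run.
                count = totalFirings - missed;
                break;
            default:
                // *_WITH_EXISTING_COUNT keeps the configured number of firings.
                count = totalFirings;
        }

        List<LocalTime> runs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            runs.add(first.plus(interval.multipliedBy(i)));
        }
        return runs;
    }
}
```

Calling reschedule(LocalTime.of(9, 0), Duration.ofHours(1), 8, LocalTime.of(10, 15), policy) gives runs ending at 5:15 PM, 4:15 PM, 6 PM and 4 PM for the four policies in the order listed in the enum, matching the example scenarios in the tables.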
April 13, 2012
by Tomasz Nurkiewicz
· 109,069 Views · 13 Likes
Configuring Quartz With JDBCJobStore in Spring
I am starting a little series about Quartz scheduler internals, tips and tricks, this is chapter 0 - how to configure persistent job store.
April 7, 2012
by Tomasz Nurkiewicz
· 37,702 Views
Why Having "DevOps" in a Job Title Makes Sense
We’ve been trying to grow our team for a few months now and the title we’re hiring for is Devops Engineer. One of the candidates our recruiters reached out to, let’s call him John, came back to us with a bunch of questions including: How do you feel about hiring someone with a devops title? It’s a very legitimate question. Devops is a cultural and professional movement, so how could it be a job title? What I argued in my reply to this fella is that Devops isn’t the job title, Devops Engineer is, and in this sense Devops is just a qualifier, and I strongly believe a very useful one.

I really sympathise with those who are fighting hard to keep Devops real and avoid the same fate that some refer to as the sad commercialisation of Agile. My campaign to make devops a job title isn’t a campaign to come up with a set of bullet points that define Devops as a job so that I can put it on a resume or build it into a product. My argument here is that the guy I’m trying to hire, John, I want him to be a certain kind of guy, and the best way I have to describe what I want is Devops Engineer. I’m looking for an operations guy, but I want him to be open to developers, consider engineering and the company as a whole, be focused on delivering value and not rathole into fights about technology or claim root access only on principle. I want that guy to have great communication skills and the interest to explore what’s beyond his infrastructure, and to want to borrow as much good as he can find in other disciplines across the organisation. And then of course there is the practical part: the desire to automate and escape a boring manual routine, the familiarity with cloud that, willing or not, has powered the movement, and even more specific things like configuration management. You may argue that this is just a good engineer or what systems engineers are becoming, in other words nothing new under the sun.
And you may be right, but job titles are in many ways just another way to communicate, to broadcast an intent and a need. So you know what I told John about hiring Devops Engineers? That I felt pretty damn proud about it. The true ones, not the ones slapping it on their CV to get a job, are fantastic engineers and I can’t but encourage them to start to respond to that qualifier. Likewise the companies and individuals seeking them out are likely the ones building great groups those people will want to be members of. Yes, the moment it becomes a keyword recruiters start to match against we’re likely to see a spur of fakes trying to land a job, but that’s nothing new under the sun. Signed, a Devops manager Source: http://www.spikelab.org/devops-job-title/
March 5, 2012
by Spike Morelli
· 10,695 Views
Why You Shouldn't Use Quartz Scheduler
If you need to schedule jobs in Java, it is fairly common in the industry to use Quartz directly or via Spring integration, but you might want to think twice.
January 30, 2012
by Craig Flichel
· 303,501 Views · 5 Likes
EC2 Interview – AWS Interview – Cloud Interview – 8 Questions
If you're looking for a cloud expert, specifically someone who knows Amazon Web Services and EC2, you'll want to have a battery of questions to assess their knowledge.
September 15, 2011
by Sean Hull
· 111,875 Views · 1 Like
Watermelon Reporting
This is what Wikipedia writes about the watermelon: Watermelon (Citrullus lanatus (Thunb.), family Cucurbitaceae) can be both the fruit and the plant of a vine-like (scrambler and trailer) plant originally from southern Africa, and is one of the most common types of melon. This flowering plant produces a special type of fruit known by botanists as a pepo, a berry which has a thick rind (exocarp) and fleshy center (mesocarp and endocarp); pepos are derived from an inferior ovary, and are characteristic of the Cucurbitaceae. The watermelon fruit, loosely considered a type of melon (although not in the genus Cucumis), has a smooth exterior rind (green, yellow and sometimes white) and a juicy, sweet interior flesh (usually pink, but sometimes orange, yellow, red and sometimes green if not ripe).

For my metaphor, I’ll use the one with red flesh, but orange and yellow would work too. I think most of us have experienced the phenomenon where the project status is red but gets greener and greener as it climbs the management ladder. The project’s core is red, but for the management it has a nice green rind, so it looks like a watermelon. This is why I call this phenomenon Watermelon Reporting. But why are we creating such reports and how can we avoid it?

Why?

The bearer of bad news already had a bad time in the ancient world. If he was lucky, they gave him the chop, but in other cases they simply chopped his head off.
This hasn’t changed to this day, but fortunately it now applies only in a figurative sense. Some bosses don’t want to hear that there are problems with a project they are responsible for, because once they know about it, they are in charge. So what do they do to avoid incurring the wrath of their boss? They tweak the project status just a bit and the melon starts growing. Another reason could be that nobody wants to be in the focus of management, so people embellish the project status in the hope that everything turns out for the better. And as we all know, hope is the last to die. In the end the result is the same. Eventually the overripe melon bursts and there is no rescue for the project anymore.

How to avoid it?

The answer is easy: transparency, transparency and transparency. If there is no way to hide the current status, the watermelon can’t grow. Fortunately Scrum and other agile frameworks provide tools like burndown charts and backlogs to help the team with their transparency. There are also tools like dashboards or kanban boards to do this job, but this will be the subject of one of my next blog posts.

Conclusion

The nuts and bolts of any project are transparency. If the project status is transparent, watermelons can’t arise. If anybody is able to get the information, it will be difficult to hide something.
August 8, 2011
by Marc Löffler
· 9,309 Views
Eclipse Indigo Highlights: Five Reasons to Check Out ECF
The Eclipse Communication Framework has been a steady participant in the Eclipse release trains, continuously adding to its impressive list of features. This year’s inclusion of ECF 3.5 in the Indigo release train is no exception. In this article, I'll take a look at five key features of the release:

OSGi 4.2 Remote Services/RSA Standards Support

ECF Indigo implements two recently-completed OSGi standards: OSGi remote services and OSGi Remote Service Admin (RSA). The OSGi Remote Services spec provides a simple, standardized way to expose OSGi services for network discovery and remote access. ECF Indigo also implements the Enterprise specification for remote services management known as Remote Services Admin (RSA). The RSA specification defines a management agent to allow for enterprise-application control of the discovery and distribution of remote services via a standardized API. Also included in the RSA specification is a standardized format for communicating meta-data about remote services, advanced handling of security, discovery and distribution event notification, and advanced handling of remote service versioning. ECF has run its implementation of RS/RSA through the OSGi Test Compatibility Kit to ensure that it is compliant with the OSGi specification.

Extensibility through Provider Architecture

ECF has a provider architecture that allows major components of the OSGi remote services/RSA implementation to be extended, enhanced, or replaced as needed. For example, for interoperability with existing services and applications, it’s frequently desirable to be able to substitute the wire protocol/transport with one that is already being used. With the ECF provider architecture, it’s possible to substitute the underlying protocol and use other frameworks based upon REST, SOAP, JMS, XML-RPC, XMPP, and/or others. If you wish, you can even define a proprietary provider and use it to expose your remote services.
Or you can use one provider for remote services development and testing, and another for deployment.

Asynchronous Proxies

ECF has support for remote service access via asynchronous proxies. This allows client consumers of remote services to avoid the reliability problems that are frequent when synchronous proxies are used over a relatively slow and unreliable network. The choice of whether to use synchronous or asynchronous proxies is up to the programmer, and can be made at runtime. More information about this feature of ECF’s remote services implementation is listed with the links at the end of this article.

XML-RPC provider

ECF Indigo has an XML-RPC-based provider, which implements the remote services API. Remote service invocation through a proxy and/or async proxy is supported too. In addition to being usable for interoperability with existing XML-RPC-based services, it can also be used as an example of how to easily use an existing framework to create a remote service provider.

Google wave provider

Although discontinued by Google, Wave is an open protocol with an open source implementation of the Wave server available. This means you can still build applications that take advantage of the real time shared editing functionality from within your Eclipse environment using this provider. ECF already provides real time shared editing using cola. This is limited to two users on a document at a time; using the Wave provider, you could have multiple authors collaborating on the same document. Mustafa and Sebastian created a multiplayer Android phone game for EclipseCon this year, using the Wave protocol for concurrency control.

ECF on Other OSGi Frameworks

You're not limited to running ECF on Equinox anymore: ECF4Felix allows ECF to run on the Felix OSGi framework. So far testing has only been done on Felix. But if you are willing to help with testing ECF Remote Services/RSA on another framework, please send an email to the ecf-dev mailing list.
ECF Documentation Project

ECF recently started the ECF Documentation Project. This project is an effort to improve the amount and quality of the ECF documentation with the help of the committer, contributor, and consumer communities. It also aims to ease the use of ECF for new and existing consumers. Currently this includes a Users Guide and an Integrators Guide. For users of ECF, the documentation effort is a huge help in getting ECF to work right within an application. Great credit is due to the ECF team for this, and for all the other features listed here.

ECF wiki: http://wiki.eclipse.org/ECF
Remote services section of ECF wiki: http://wiki.eclipse.org/ECF#OSGi_Remote_Services
OSGi compendium specification (Chap 13 is Remote Services): http://www.osgi.org/download/r4v42/r4.cmpn.pdf
OSGi Enterprise Specification (Chap 122 is RSA): http://www.osgi.org/download/r4v42/r4.enterprise.pdf
RSA wiki pages: http://wiki.eclipse.org/Remote_Services_Admin
Getting Started with Remote Services: http://wiki.eclipse.org/EIG:Getting_Started_with_OSGi_Remote_Services
Asynchronous Proxies (examples): http://wiki.eclipse.org/Asynchronous_Proxies_for_Remote_Services
ECF Builder: https://build.ecf-project.org/jenkins/
ECF Github site (other providers, examples, Wave, and Newsreader): https://github.com/ECF
ECF4Felix: https://github.com/ECF/ECF4Felix
June 22, 2011
by James Sugrue
· 15,505 Views
10 Tricky Java Interview Questions
Here are some Java interview questions which are uncommon.

What is the performance effect of a large number of import statements which are not used?
Answer: None at run time. Imports are resolved by the compiler, and they are ignored if the corresponding class is not used.

Give a scenario where hotspot will optimize your code?
Answer: If we have defined a variable as static and then initialized this variable in a static block, then Hotspot will merge the variable and the initialization into a single statement and hence reduce the code.

What will happen if an exception is thrown from the finally block?
Answer: If the exception is not caught inside the finally block, it propagates to the caller and replaces any exception that was pending from the try block. If nothing catches it further up, the thread terminates, which for the main thread usually means the program exits.

How does the decorator design pattern work in I/O classes?
Answer: Classes like BufferedReader and BufferedWriter work on the underlying stream classes. A Buffered* class wraps a Reader/Writer and adds a buffer to it.

If I give you an assignment to design a shopping cart web application, how will you define the architecture of this application? You are free to choose any framework, tool or server.
Answer: Usually I will choose an MVC framework, which will make me use other design patterns like Front Controller, Business Delegate, Service Locator, DAO, DTO, Loose Coupling etc. Struts 2 is very easy to configure and comes with other plugins like Tiles, Velocity and Validator. The architecture of Struts becomes the architecture of my application, with various actions and corresponding JSP pages in place.

What is a deadlock in Java? How will you detect and get rid of deadlocks?
Answer: A deadlock exists when two or more threads each hold a lock that another one of them needs, so none of them can proceed. A thread dump (for example from jstack) reports deadlocked threads, and acquiring locks in a consistent order avoids the problem.

Why is it better to use Hibernate than JDBC for database interaction in various Java applications?
Answer: Hibernate provides an OO view of the database by mapping classes to database tables. This helps in thinking in terms of the OO language rather than in RDBMS terms, and hence increases productivity.
How can one call one constructor from another constructor in a class?
Answer: Use this(...) as the first statement of a constructor to invoke another constructor of the same class.

What is the purpose of the intern() method in the String class?
Answer: It returns the canonical copy of the string from the String literal pool, adding the string to the pool if it is not there yet, so that equal strings can share one instance.

How will you make your web application use the https protocol?
Answer: This has more to do with the particular server being used than with the application itself. Here is how it can be done on Tomcat: http://tomcat.apache.org/tomcat-4.1-doc/ssl-howto.html

From http://extreme-java.blogspot.com/2011/05/10-tricky-java-interview-questions.html
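Three of the answers above (constructor chaining, an exception thrown from finally, and intern()) can be checked with a few lines of code. The class below is a hypothetical illustration written for this purpose; InterviewSketch and its members are not taken from the original article.

```java
final class InterviewSketch {
    private final String name;
    private final int quantity;

    // Constructor chaining: this(...) must be the first statement.
    InterviewSketch(String name) {
        this(name, 1);
    }

    InterviewSketch(String name, int quantity) {
        this.name = name;
        this.quantity = quantity;
    }

    String name() {
        return name;
    }

    int quantity() {
        return quantity;
    }

    // The exception thrown from finally replaces the one thrown from try.
    static String whichExceptionEscapes() {
        try {
            try {
                throw new IllegalStateException("from try");
            } finally {
                throw new IllegalArgumentException("from finally");
            }
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    // new String(...) creates a separate object; intern() returns the pooled literal.
    static boolean internReturnsPooledInstance() {
        String built = new String("cart");
        return built != "cart" && built.intern() == "cart";
    }
}
```

Here new InterviewSketch("book") ends up with a quantity of 1 through the chained constructor, whichExceptionEscapes() reports the message from the finally block, and internReturnsPooledInstance() is true because the literal "cart" lives in the pool while the object created with new does not.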
May 10, 2011
by Sandeep Bhandari
· 101,764 Views · 1 Like
Eradicating Non-Determinism in Tests
An automated regression suite can play a vital role on a software project, valuable both for reducing defects in production and essential for evolutionary design. In talking with development teams I've often heard about the problem of non-deterministic tests - tests that sometimes pass and sometimes fail. Left uncontrolled, non-deterministic tests can completely destroy the value of an automated regression suite. In this article I outline how to deal with non-deterministic tests. Initially quarantine helps to reduce their damage to other tests, but you still have to fix them soon. Therefore I discuss treatments for the common causes for non-determinism: lack of isolation, asynchronous behavior, remote services, time, and resource leaks. I've enjoyed watching ThoughtWorks tackle many difficult enterprise applications, bringing successful deliveries to many clients who have rarely seen success. Our experiences have been a great demonstration that agile methods, deeply controversial and distrusted when we wrote the manifesto a decade ago, can be used successfully. There are many flavors of agile development out there, but in what we do there is a central role for automated testing. Automated testing was a core approach to Extreme Programming from the beginning, and that philosophy has been the biggest inspiration to our agile work. So we've gained a lot of experience in using automated testing as a core part of software development. Automated testing can look easy when presented in a text book. And indeed the basic ideas are really quite simple. But in the pressure-cooker of a delivery project, trials come up that are often not given much attention in texts. As I know too well, authors have a habit of skimming over many details in order to get a core point across. In my conversations with our delivery teams, one recurring problem that we've run into is tests which have become unreliable, so unreliable that people don't pay much attention to whether they pass or fail. 
A primary cause of this unreliability is that some tests have become non-deterministic. A test is non-deterministic when it passes sometimes and fails sometimes, without any noticeable change in the code, tests, or environment. Such tests fail, then you re-run them and they pass. Test failures for such tests are seemingly random. Non-determinism can plague any kind of test, but it's particularly prone to affect tests with a broad scope, such as acceptance or functional tests.

Why non-deterministic tests are a problem

Non-deterministic tests have two problems: firstly they are useless, and secondly they are a virulent infection that can completely ruin your entire test suite. As a result they need to be dealt with as soon as you can, before your entire deployment pipeline is compromised.

I'll start with expanding on their uselessness. The primary benefit of having automated tests is that they provide a bug detection mechanism by acting as regression tests[1]. When a regression test goes red, you know you've got an immediate problem, often because a bug has crept into the system without you realizing.

Having such a bug detector has huge benefits. Most obviously it means that you can find and fix bugs just after they are introduced. Not only does this give you the warm fuzzies because you kill bugs quickly, it also makes it easier to remove them, since you know the bug got in with the last set of changes that are fresh in your mind. As a result you know where to look for the bug, which is more than half the battle in squashing it.

The second level of benefit is that as you gain confidence in your bug detector, you gain the courage to make big changes knowing that when you goof, the bug detector will go off and you can fix the mistake quickly. [2] Without this, teams are frightened to make the changes code needs in order to be kept clean, which leads to a rotting of the code base and plummeting development speed.
The trouble with non-deterministic tests is that when they go red, you have no idea whether it's due to a bug, or just part of the non-deterministic behavior. Usually with these tests a non-deterministic failure is relatively common, so you end up shrugging your shoulders when these tests go red. Once you start ignoring a regression test failure, then that test is useless and you might as well throw it away.

Indeed you really ought to throw a non-deterministic test away, since if you don't it has an infectious quality. If you have a suite of 100 tests with 10 non-deterministic tests in them, then that suite will often fail. Initially people will look at the failure report and notice that the failures are in non-deterministic tests, but soon they'll lose the discipline to do that. Once that discipline is lost, then a failure in the healthy deterministic tests will get ignored too. At that point you've lost the whole game and might as well get rid of all the tests.

Quarantine

My principal aim in this article is to outline common cases of non-deterministic tests and how to eliminate the non-determinism. But before I get there I offer one piece of essential advice: quarantine your non-deterministic tests. If you have non-deterministic tests, keep them in a different test suite to your healthy tests. That way you can continue to pay attention to what's going on with your healthy tests and get good feedback from them.

Place any non-deterministic test in a quarantined area. (But fix quarantined tests quickly.)

Then the question is what to do with the quarantined test suites. They are useless as regression tests, but they do have a future as work items for cleaning up. You should not abandon such tests, since any tests you have in quarantine are not helping you with your regression coverage. A danger here is that tests keep getting thrown into quarantine and forgotten, which means your bug detection system is eroding.
As a result it's worthwhile to have a mechanism that ensures that tests don't stay in quarantine too long. I've come across various ways to do this. One is a simple numeric limit: e.g. only allow 8 tests in quarantine. Once you hit the limit you must spend time to clear all the tests out. This has the advantage of batching up your test-cleaning if that's how you like to do things. Another route is to put a time limit on how long a test may be in quarantine, such as no longer than a week. The general approach with quarantine is to take the quarantined tests out of the main deployment pipeline so that you still get your regular build process. However a good team can be more aggressive. Our Mingle team puts its quarantine suite into the deployment pipeline one stage after its healthy tests. That way it can get the feedback from the healthy tests but is also forced to ensure that it sorts out the quarantined tests quickly. [3] Lack of Isolation In order to get tests to run reliably, you must have clear control over the environment in which they run, so you have a well-known state at the beginning of the test. If one test creates some data in the database and leaves it lying around, it can corrupt the run of another test which may rely on a different database state. Therefore I find it's really important to focus on keeping tests isolated. Properly isolated tests can be run in any sequence. As you get to larger operational scope of functional tests, it gets progressively harder to keep tests isolated. When you are tracking down a non-determinism, lack of isolation is a common and frustrating cause. Keep your tests isolated from each other, so that execution of one test will not affect any others. There are a couple of ways to get isolation - either always rebuild your starting state from scratch, or ensure that each test cleans up properly after itself. In general I prefer the former, as it's often easier - and in particular easier to find the source of a problem. 
If a test fails because it didn't build up the initial state properly, then it's easy to see which test contains the bug. With clean-up, however, one test will contain the bug, but another test will fail - so it's hard to find the real problem.

Starting from a blank state is usually easy with unit tests, but can be much harder with functional tests [4] - particularly if you have a lot of data in a database that needs to be there. Rebuilding the database each time can add a lot of time to test runs, so that argues for switching to a clean-up strategy.

One trick that's handy when you're using databases is to conduct your tests inside a transaction, and then to roll back the transaction at the end of the test. That way the transaction manager cleans up for you, reducing the chance of errors[5].

Another approach is to do a single build of a mostly-immutable starting fixture before running a group of tests. Then ensure that the tests don't change that initial state (or if they do, they reverse the changes in tear-down). This tactic is more error-prone than rebuilding the fixture for each test, but it may be worthwhile if it takes too long to build the fixture each time.

Although databases are a common cause for isolation problems, there are plenty of times you can get these in-memory too. In particular be careful with static data and singletons. A good example of this kind of problem is contextual environment, such as the currently logged in user.

If you have an explicit tear-down in a test, be wary of exceptions that occur during the tear-down. If this happens the test can pass, but cause isolation failures for subsequent tests. So ensure that if you do get a problem in a tear-down, it makes a loud noise.

Some people prefer to put less emphasis on isolation and more on defining clear dependencies to force tests to run in a specified order. I prefer isolation because it gives you more flexibility in running subsets of tests and parallelizing tests.
Asynchronous Behavior

Asynchrony is a boon that allows you to keep software responsive while taking on long term tasks. Ajax calls allow a browser to stay responsive while going back to the server for more data, and asynchronous messages allow a server process to communicate with other systems without being tied to their laggardly latency.

But in testing, asynchrony can be a curse. The common mistake here is to throw in a sleep:

    //pseudo-code
    makeAsyncCall;
    sleep(aWhile);
    readResponse;

This can bite you two ways. First off you'll want to set the sleep time long enough that it gives plenty of time to get the response. But that means that you'll spend a lot of time idly waiting for the response, thus slowing down your tests. The second bite is that, however long you sleep, sometimes it won't be enough. There will be some change in environment that will cause you to exceed the sleep - and you'll get a false failure. As a result I strongly urge you to never use bare sleeps like this.

Never use bare sleeps to wait for asynchronous responses: use a callback or polling.

There are basically two tactics you can use for testing an asynchronous response. The first is for the asynchronous service to take a callback which it can call when done. This is the best since it means you'll never have to wait any longer than you need to [6]. The biggest problem with this is that the environment needs to be able to do this and then the service provider needs to do it. This is one of the advantages of having the development team integrated with testing - if they can provide a callback then they will.

The second option is to poll on the answer. This is more than just looking once, but looking regularly, something like this:

    //pseudo-code
    makeAsyncCall
    startTime = Time.now;
    while(! responseReceived) {
      if (Time.now - startTime > waitLimit)
        throw new TestTimeoutException;
      sleep (pollingInterval);
    }
    readResponse

The point of this approach is that you can set the pollingInterval to a pretty small value, and know that that's the maximum amount of dead time you'll lose to waiting for a response. This means you can set the waitLimit very high, which minimizes the chance of hitting it unless something serious has gone wrong. [7]

Make sure you use a clear exception class that indicates this is a test timeout that's failing. This will help make it clear what's gone wrong should it happen, and perhaps allow a more sophisticated test harness to take account of this information in its display.

The time values, in particular the waitLimit, should never be literal values. Make sure they are always values that can be easily set in bulk, either by using constants or set through the runtime environment. That way if you need to tweak them (and you will) you can tweak them all quickly.

All this advice is handy for async calls where you expect a response from the provider, but how about those where there is no response? These are calls where we invoke a command on something and expect it to happen without any acknowledgment. This is the trickiest case since you can test for your expected response, but there's nothing to do to detect a failure other than timing-out. If the provider is something you're building you can handle this by ensuring the provider implements some way of indicating that it's done - essentially some form of callback. Even if only the testing code uses it, it's worth it - although often you'll find this kind of functionality is valuable for other purposes too[8]. If the provider is someone else's work, you can try persuasion, but otherwise may be stuck. This is also a case where using Test Doubles for remote services is worthwhile (which I'll discuss more in the next section).
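The polling tactic described above might look something like this in Python. It is a sketch under stated assumptions rather than a definitive helper: wait_until, PollingTimeoutError, and the two default constants are hypothetical names, and the default values are placeholders meant to be tuned in one place.

```python
import time


class PollingTimeoutError(Exception):
    """Raised when a polled condition is not met within the wait limit."""


# Hypothetical defaults, kept as named constants so they can be tweaked in bulk.
DEFAULT_WAIT_LIMIT_SECONDS = 30.0
DEFAULT_POLLING_INTERVAL_SECONDS = 0.05


def wait_until(condition,
               wait_limit=DEFAULT_WAIT_LIMIT_SECONDS,
               polling_interval=DEFAULT_POLLING_INTERVAL_SECONDS):
    """Poll condition() until it returns a truthy value or wait_limit elapses."""
    # time.monotonic() is unaffected by system clock adjustments,
    # which keeps the timeout itself deterministic.
    start = time.monotonic()
    while not condition():
        if time.monotonic() - start > wait_limit:
            raise PollingTimeoutError(
                f"condition not met within {wait_limit} seconds")
        # The polling interval bounds the dead time lost per check.
        time.sleep(polling_interval)
```

A test would make its asynchronous call and then call wait_until with a function that reports whether the response has arrived, for example wait_until(lambda: "value" in response), before reading the response.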
If you have a general failure in something asynchronous, such that it's not responding at all, then you'll always be waiting for timeouts and your test suite will take a long time to fail. To combat this it's a good idea to use a smoke test to check that the asynchronous service is responding at all and stop the test run right away if it isn't.

Gerard Meszaros's book, xUnit Test Patterns, contains lots of good patterns for constructing tests.

You can also often side-step the asynchrony completely. Gerard Meszaros's Humble Object pattern says that whenever you have some logic that's in a hard-to-test environment, you should isolate the logic you need to test from that environment. In this case it means putting most of the logic you need to test in a place where you can test it synchronously. The asynchronous behavior should be as minimal (humble) as possible, so that you don't need that much testing of it.

Remote Services

Sometimes I'm asked if ThoughtWorks does any integration work, which I find somewhat amusing since there's hardly any project we do that doesn't involve a fair bit of integration. By their nature, enterprise applications involve a great deal of combining data from different systems. These systems are maintained by other teams operating to their own schedules, teams that often use a very different software philosophy to our heavily test-driven agile approach.

Testing with such remote systems brings a number of problems, and non-determinism is high on the list. Often remote systems don't have a test system we can call, which means hitting a live system. If there is a test system, it may not be stable enough to provide deterministic responses.

In this situation it's vital to ensure determinism, so it's time to reach for a Test Double - a component that looks like the remote service, but is really just a pretend version that mimics the remote system's behavior.
The double needs to be set up so that it provides the right kind of response in interaction with our system, but in a way we control. In this manner we can ensure determinism.

Using a double has a downside, in particular when we are testing across a broad scope. How can we be sure that the double behaves in the same way that the remote system does? We can tackle this again using tests, a form of test that I call Integration Contract Tests. These run the same interaction with the remote system and the double, and check that the two match. In this case 'match' may not mean coming up with the same result (due to the non-determinisms), but results that share the same essential structure. Integration Contract Tests need to be run frequently, but not as part of our system's deployment pipeline. Periodic running based on the rate of change of the remote system is usually best.

For writing these kinds of test doubles, I'm a big fan of Self Initializing Fakes - since these are very simple to manage.

Some people are firmly against using Test Doubles in functional tests, believing that you must test with a real connection in order to ensure end-to-end behavior. While I sympathize with their argument, automated tests are useless if they are non-deterministic. So any advantage you gain by talking to the real system is overwhelmed by the need to stamp out non-determinism[9].

Time

Few things are more non-deterministic than a call to the system clock. Each time you call it, you get a new result, and any tests that depend on it can thus change. Ask for all the todos due in the next hour, and you regularly get a different answer[10].

The most important thing here is to ensure that you always wrap the system clock with routines that can be replaced with a seeded value for testing. A clock stub can be set to a particular time and frozen at that time, allowing your tests to have complete control over its movements.
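A clock wrapper of this kind can be sketched in a few lines of Python. The class and function names here (SystemClock, FrozenClock, todos_due_within) are hypothetical, and the todo example is only an illustration of the "due in the next hour" query mentioned above.

```python
from datetime import datetime, timedelta


class SystemClock:
    """Production clock: the only place that touches the real system clock."""

    def now(self):
        return datetime.now()


class FrozenClock:
    """Test clock: stays at a seeded time until the test moves it."""

    def __init__(self, start):
        self._now = start

    def now(self):
        return self._now

    def advance(self, delta):
        self._now += delta


def todos_due_within(todos, clock, window):
    """Return titles of (title, due) pairs due between now and now + window."""
    now = clock.now()
    return [title for title, due in todos if now <= due <= now + window]
```

Application code takes a clock object instead of calling datetime.now() directly, so a test can pass FrozenClock(datetime(2011, 4, 14, 9, 0)) and get the same answer on every run.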
That way you can synchronize your test data to the values in the seeded clock.[11][12]

Always wrap the system clock, so it can be easily substituted for testing.

One thing to watch with this is that eventually your test data might start having problems because it's too old, and you get conflicts with other time-based factors in your application. In this case you can move the data and your clock seeds to new values. When you do this, ensure that this is the only thing you do. That way you can be sure that any tests that fail are due to time-movement in the test data.

Another area where time can be a problem is when you rely on other behaviors from the clock. I once saw a system that generated random keys based on clock values. This system started failing when it was moved to a faster machine that could allocate multiple ids within a single clock tick.[13]

I've heard of so many problems due to direct calls to the system clock that I'd argue for finding a way to use code analysis to detect any direct calls to the system clock and failing the build right there. Even a simple regex check might save you a frustrating debugging session after a call at an ungodly hour.

Resource Leaks

If your application has some kind of resource leak, this will lead to random tests failing, since the failure lands on whichever test happens to push the leak over a limit. This case is awkward because any test can fail intermittently due to this problem. If it isn't a case of one test being non-deterministic then resource leaks are a good candidate to investigate.

By resource leak, I mean any resource that the application has to manage by acquiring and releasing. In non-memory-managed environments, the obvious example is memory. Memory-management did much to remove this problem, but other resources still need to be managed, such as database connections.

Usually the best way to handle these kinds of resources is through a Resource Pool.
If you do this then a good tactic is to configure the pool to a size of 1 and make it throw an exception should it get a request for a resource when it has none left to give. That way the first test to request a resource after the leak will fail - which makes it a lot easier to find the problem test. This idea of limiting resource pool sizes, is about increasing constraints to make errors more likely to crop up in tests. This is good because we want errors to show in tests so we can fix them before they manifest themselves in production. This principle can be used in other ways too. One story I heard was of a system which generated randomly named temporary files, didn't clean them up properly, and crashed on a collision. This kind of bug is very hard to find, but one way to manifest it is to stub the randomizer for testing so it always returns the same value. That way you can surface the problem more quickly.
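The pool-of-one tactic described above could be sketched like this. It is a toy illustration, not a real connection pool: StrictPool and PoolExhaustedError are hypothetical names, and a real project would more likely configure the pool library it already uses.

```python
class PoolExhaustedError(Exception):
    """Raised when a resource is requested and the pool has none left."""


class StrictPool:
    """A deliberately tiny pool that fails loudly instead of waiting."""

    def __init__(self, factory, size=1):
        # Create all resources up front; the pool never grows.
        self._idle = [factory() for _ in range(size)]

    def acquire(self):
        if not self._idle:
            # Fail immediately so the first request after a leak is the
            # one that breaks, pointing close to the leaking test.
            raise PoolExhaustedError(
                "no resource left; an earlier test probably leaked one")
        return self._idle.pop()

    def release(self, resource):
        self._idle.append(resource)
```

With size=1, a test that acquires a resource and forgets to release it causes the very next acquire to raise, rather than letting the leak surface many tests later.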
April 14, 2011
by Martin Fowler
· 6,652 Views · 1 Like
A story about User Stories; Where do you start and what about the planning?
In this multi-part post, I’m going to share my personal experiences while working with user stories for gathering, tracking and planning requirements. It currently consists of three parts:
  • What are they and why do you need them?
  • Who writes them and how do you control scope?
  • Where do you start and what about the planning?

You can also download all parts as one comprehensive PDF for easy printing or e-reading.

Where do you start?

Suppose that after intensive discussions and tough scoping sessions you ended up with a list of user stories and are about to start building the system. The first story not only needs to realize some particular feature, but also involves building a skeleton implementation of the system’s architecture. How do you avoid spending way too much time on plumbing and other general purpose stuff you need for the rest of the stories? The article Managing the Bootstrap Story by Jennitta Andrea addresses this challenge in more detail and offers some alternative solutions. One of these solutions is to find and define a user story with the product owner that offers minimal functionality yet still has project value. Such a story is often referred to as the backbone story because you realize the backbone of your system in it. It’s quite common to use the backbone story to realize a proof-of-concept (PoC) that verifies the chosen architecture. Since a working PoC can give the product owner confidence that the team is able to build such a product, that fact alone may be enough project value for the product owner.

More storyotypes?

You might have suspected it already, but that backbone story is just an example of another storyotype. In fact, after I started looking for an approach to capture the non-functional requirements of a project or system, I ran into a slide deck that mentioned a whole set of additional storyotypes. Dan Rawsthorne, the author, tried to define a storyotype for virtually every possible thing you might need to do in a project.
Personally I think he went a bit too far, but a small set of additional storyotypes proved to be very useful anyway.

Storyotype | Description
Compound Epic | A composite user story that groups a number of stories in a logical sense.
Complex Epic | A user story whose content and impact must be determined later in the project, but for which it is clear that it involves a significant amount of work.
Setup | A story that is used to set up the project environment, including a source control environment, a project website, a build server.
Technical | A story that involves making a technical improvement or adjustment. Examples include introducing a coding standard, refactoring a poor design, executing a performance test.
Documentation | A story for writing a user manual, installation manual, etc.
Training | A story for developing and/or hosting a training, or having a workshop with end users.
Quality Improvements | A story whose objective is to fix a collection of related bugs, or to spend a fixed amount of time improving the quality of the code base.
Spike | A story that aims to do a technical investigation to determine the usability of a specific technology, or for trying an alternative technical solution.

When is the story complete?

So how do you know that a user story has been successfully realized? Well, if all is good, all stories will conform with INVEST and are associated with a number of acceptance criteria (typically written down as the how-to demo) specified by the product owner. That should be enough to determine if it is functionally sound. But what you still miss is a way of explaining to the stakeholders, including the product owner, when the team treats the story as finished. That may differ by team, but usually includes some or all of the following criteria.
  • The code compiles and there are no warnings or errors.
  • The code meets the coding standards set up by the project or the organization.
  • The code is reviewed by a peer developer.
  • All automated unit and integration tests have completed successfully.
  • Visual Studio’s static code analysis tool does not report any violations.
  • ReSharper reports no potential errors (a.k.a. everything is 'green').
  • The daily integration build has completed successfully.
  • The functionality was tested by another member of the team (anybody but the developer).
  • The feature or functionality has been signed off using the project checklist.
  • The system functionality is tested by a tester.
  • The visual look and feel has been approved by an employee of the communications department.

Together with the story’s how-to demo these criteria are commonly referred to as the definition-of-done. Usually, a team or project will have a default definition-of-done that applies to all stories and only mentions the particulars of that story if necessary.

Then what about the planning?

User stories are an excellent unit for tracking progress within your project. However, purists within the Agile community will tell you that an Agile project will have no long term plan. Instead, the functionality is realized iteratively according to the priority defined by the product owner. I agree with the latter and believe that its iterative nature is essential for dealing with the changing requirements that are common in all projects. It allows deferring decisions to the last responsible moment, and that’s always a good thing. But in reality you often can’t escape from providing at least a rough schedule to your management. How should you deal with that?

What I often do is get all stakeholders to join me in a number of workshops. Using use case diagrams to illustrate the context of the discussions, I try to get enough stories on paper to represent the entire scope of the project. Beware, though, of writing down too many details or having too many in-depth discussions.
That would give the stakeholders a false sense of precision, and consequently, will cause them to see the stories as a formal functional design. Also, if you run into some high-level chunk of functionality for which nobody really knows what it will look like, add an epic story for it and include a spike to elaborate on the epic later on in the project. Then organize a number of shorter meetings with the team or, if the team hasn’t been formed yet, with a few experienced developers. Let them discuss every story one by one and then try to estimate the size of each story in so called story points. Some people from the Agile community say you should estimate using relative sizes only. In other words, a story that seems to require twice as much work as another story should also have twice as many story points. The story point as a unit does not have value. It’s the relative differences that are important. What works for me is that every story point corresponds to the ideal day of an experienced senior software developer. In other words, one story point means that an experienced developer familiar with the chosen architecture, technology and project methodology needs to work for 8 hours without being disturbed by telephone, email, coffee breaks, or any other distractions. Mike Cohn, author of User Stories Applied, has dedicated many chapters to this estimation technique. Ideally, each story is between 1 and 8 story points, but at the beginning of the project you still may have some epics to break up. After finishing those meetings you should have an estimate of the total size of the project. Now, in order to get from those story points to a total number of hours you need to estimate the expected productivity of the team. Mike Cohn does this by creating a table with the expected roles, their availability (to deal with part time employees), and the expected productivity compared to the ideal senior developer (as a percentage). 
By calculating the average productivity and multiplying it with the number of story points you’ll end up with the total number of estimated man-hours. It’s only an estimate: the productivity can turn out to be disappointing, and the estimate in story points may prove to be wrong. But it still gives you an initial estimate that can be used for global planning and budget discussions. Obviously it is important to keep measuring the actual productivity continuously.

Wow, now what?

By now, it should be clear that a user story is not an independent concept but something that closely resonates with many of the aspects of our work in the software industry. In this multi-part post I have tried to explain a number of those aspects and to clarify the relationship between them. Even though I’ve not covered everything in as much detail as possible, I still hope I've managed to convince you of the power and potential of user stories. Last but not least, if you have any questions or comments, please do not hesitate to email me at [email protected] or tweet me at my Twitter ID ddoomen.
March 17, 2011
by Dennis Doomen
· 7,402 Views
HOWTO: Partially Clone an SVN Repo to Git, and Work With Branches
I've blogged a few times now about Git (which I pronounce with a hard 'g' a la "get", as it's supposed to be named for Linus Torvalds, a self-described git, but which I've also heard pronounced with a soft 'g' like "jet"). Either way, I'm finding it way more efficient and less painful than either CVS or SVN. So, to continue this series ([1], [2], [3]), here is how (and why) to pull an SVN repo down as a Git repo, but with the omission of old (irrelevant) revisions and branches.

Using SVN for SVN repos

In days of yore when working with the JBoss Tools and JBoss Developer Studio SVN repos, I would keep a copy of everything in trunk on disk, plus the current active branch (most recent milestone or stable branch maintenance). With all the SVN metadata, this would eat up substantial amounts of disk space but still require network access to pull any old history of files. The two repos were about 2G of space on disk, for each branch. Sure, there's tooling to be able to diff and merge between branches w/o having both branches physically checked out, but nothing beats the ability to place two folders side by side OFFLINE for deep comparisons. So, at times, I would burn as much as 6-8G of disk simply to have a few branches of source for comparison and merging. With my painfully slow IDE drive, this would grind my machine to a halt, especially when doing any SVN operation or counting files / disk usage.

Using Git for SVN repos naively

Recently, I started using git-svn to pull the whole JBDS repo into a local Git repo, but it was slow to create and still unwieldy. And the JBoss Tools repo was too large to even create as a Git repo - the operation would run out of memory while processing old revisions of code to play forward. At this point, I was stuck having individual Git repos for each JBoss Tools component (major source folder) in SVN: archives, as, birt, bpel, build, etc.
It worked, but replicating it when I needed to create a matching repo-collection for a branch was painful and time-consuming. As well, all the old revision information was eating even more disk than before:
  • jbosstools' trunk as multiple git-svn clones: 6.1G
  • devstudio's trunk as single git-svn clone: 1.3G

So, now, instead of a couple of gigabytes per branch, I was at nearly 4x as much disk usage. But at least I could work offline and not deal w/ network-intense activity just to check history or commit a change. Still, far from ideal.

Cloning SVN with standard layout & partial history

This past week, I discovered two ways to make the git-svn experience at least an order of magnitude better:
  • Standard layout (-s) - this allows your generated Git repo to contain the usual trunk, branches/* and tags/* layout that's present in the source SVN repo. This is a win because it means your repo will contain the branch information so you can easily switch between branches within the same repo on disk. No more remote network access needed!
  • Revision filter (-r) - this allows your generated Git repo to start from a known revision number instead of starting at its birth. Now instead of taking hours to generate, you can get a repo in minutes by excluding irrelevant (ancient) revisions.

So, why is this cool? Because now, instead of having 2G of source+metadata to copy when I want to do a local comparison between branches, the size on disk is merely:
  • jbosstools' trunk as single git-svn clone w/ trunk and single branch: 1.3G
  • devstudio's trunk as single git-svn clone w/ trunk and single branch: 0.13G

So, not only is the footprint smaller, but the performance is better and I need never do a full clone (or svn checkout) again - instead, I can just copy the existing Git repo, and rebase it to a different branch. Instead of hours, this operation takes seconds (or minutes) and happens without the need for a network connection.

Okay, enough blather. Show me the code!
Check out the repo, including only the trunk & most recent branch

    # Figure out the revision number based on when a branch was created, then
    # from r28571, returns -r28571:HEAD
    rev=$(svn log --stop-on-copy \
      http://svn.jboss.org/repos/jbosstools/branches/jbosstools-3.2.x \
      | egrep "r[0-9]+" | tail -1 | sed -e "s#\(r[0-9]\+\).\+#-\1:HEAD#")
    # now, fetch repo starting from the branch's initial commit
    git svn clone -s $rev http://svn.jboss.org/repos/jbosstools jbosstools_GIT

Now you have a repo which contains trunk & a single branch

    git branch -a # list local (Git) and remote (SVN) branches
      * master
        remotes/jbosstools-3.2.x
        remotes/trunk

Switch to the branch

    git checkout -b local/jbosstools-3.2.x jbosstools-3.2.x # connect a new local branch to remote one
      Checking out files: 100% (609/609), done.
      Switched to a new branch 'local/jbosstools-3.2.x'
    git svn info # verify now working in branch
      URL: http://svn.jboss.org/repos/jbosstools/branches/jbosstools-3.2.x
      Repository Root: http://svn.jboss.org/repos/jbosstools

Switch back to trunk

    git checkout -b local/trunk trunk # connect a new local branch to remote trunk
      Switched to a new branch 'local/trunk'
    git svn info # verify now working in branch
      URL: http://svn.jboss.org/repos/jbosstools/trunk
      Repository Root: http://svn.jboss.org/repos/jbosstools

Rewind your changes, pull updates from SVN repo, apply your changes; won't work if you have local uncommitted changes

    git svn rebase

Fetch updates from SVN repo (ignoring local changes?)

    git svn fetch

Create a new branch (remotely with SVN)

    svn copy \
      http://svn.jboss.org/repos/jbosstools/branches/jbosstools-3.2.x \
      http://svn.jboss.org/repos/jbosstools/branches/some-new-branch

From http://divby0.blogspot.com/2011/01/howto-partially-clone-svn-repo-to-git.html
January 28, 2011
by Nick Boldt
· 35,542 Views
Interview: Troy Giunipero, Author of NetBeans E-commerce Tutorial
Troy Giunipero is a student at the University of Edinburgh studying toward an MSc in Computer Science. Formerly, he was one of the NetBeans Docs writers based in Prague, Czech Republic, where he spent most of his time writing Java web tutorials.

In this interview, Troy introduces you to The NetBeans E-commerce Tutorial. This is a very detailed tutorial describing just about everything you need to know when creating an e-commerce web application in Java. It has received a lot of very positive feedback. Let's find out about the background of this tutorial and what Troy learnt in writing it.

Hi Troy! During your time on the NetBeans team, you wrote a very large tutorial describing how to create an e-commerce site. How and why did you start writing it?

Well, there’s a short answer and a long answer to this. The short answer is that I was lucky to take part in Sun’s SEED (Sun Engineering Enrichment and Development) program. I wanted to focus on technical aspects, so I based my curriculum on developing an e-commerce application using Java technologies. I documented my efforts and applied them toward deliverables for the IDE’s 6.8 and 6.9 releases, resulting in the 13-part NetBeans E-commerce Tutorial.

The long answer is that I had previously been tasked with creating an e-commerce application for my degree project (I was studying toward a BSc in IT and Computing), and ran into loads of trouble trying to integrate the various technologies into a cohesive, functioning application. I was coming from a non-technical background and found there was a steep learning curve involved in web development. My work was fraught with problems which I can now attribute to poor time management and a lack of good, practical, hands-on learning resources. So in a way, working on the AffableBean project (this is the project used in the NetBeans E-commerce Tutorial) was a way for me to go back and attempt to do the whole thing right.
With the tutorial, I had two goals in mind: one, I wanted to consolidate my understanding of everything by writing about it, and two, I wanted to help others avoid the problems and pitfalls that I’d earlier run into by designing a piece of documentation that puts everything together.

Can you run us through the basic parts and what they provide?

Certainly. First I want to point out that there’s a live demo application (http://dot.netbeans.org:8080/AffableBean/), which I managed to get up and running with help from Honza Pirek from the NetBeans Web Team (thanks Honza!).

The application is modeled on the well-known MVC architecture. The tutorial refers to a diagram of this architecture at various points, and covers a bunch of different concepts and technologies along the way, including:

  • Project design and planning (unit 2)
  • Designing a data model (using MySQL WorkBench) (unit 4)
  • Forward-engineering the data model into a database schema (unit 4)
  • Database connectivity (units 3, 4, 6)
  • Servlet, JSP/EL and JSTL technologies (units 3, 5, 6)
  • EJB 3 and JPA 2 technologies (unit 7), and transactional support (unit 9)
  • Session management (i.e., for the shopping cart mechanism) (unit 8)
  • Form validation and data conversion (unit 9)
  • Multilingual support (unit 10)
  • Security (i.e., using form-based authentication and encrypted communication) (unit 11)
  • Load testing with JMeter (unit 12)
  • Monitoring the application with the IDE’s Profiler (unit 12)
  • Tips for deployment to a production server (units 12, 13)

Also, the tutorial aims to provide ample background information on the whole “Java specifications” concept, with an introduction to the Java Community Process, how final releases include reference implementations, and how these relate to the tutorial application using the IDE’s bundled GlassFish server (units 1, 7).

Finally, the tutorial is as much about the above concepts and technologies as it is about learning to make best use of the IDE.
I really tried to squeeze as much IDE-centric information in there as possible. So for example you’ll find:

  • An introduction to the IDE’s main windows and their functions (unit 3)
  • A section dedicated to editor tips and tricks (unit 5), and abundant usage of keyboard shortcuts in steps throughout the tutorial
  • Use of the debugger (unit 8)
  • Special “tip boxes” that discuss IDE functionality that is sometimes difficult to fit into conventional documentation. For example, there are tips on using the IDE’s Template Manager (unit 5), GUI support for database tables (unit 6), Javadoc support (unit 8), and support for code templates (unit 9).

Did you learn any new things yourself while writing it?

Yes! Three things immediately come to mind:

  • EJB 3 technology. Initially this was a big hurdle for me. Using EJB 3 effectively seems to be something of an art form. If you know what you’re doing and understand exactly how to use the EntityManager to handle persistence operations on a database, EJB lets you do some amazingly smart things with just a few lines of code. But there seems to be a lack of good free documentation online, especially since EJB 3 is a significant departure from EJB 2. Therefore, almost all of the tutorial’s information on EJB comes from the very excellent book, EJB 3 in Action by Debu Panda and Reza Rahman.
  • Interpreting the NetBeans Profiler. The final hands-on unit, Testing and Profiling, was the most difficult for me to write, primarily because I just wasn’t familiar with the Profiler at all. I spent an unhealthy amount of time just watching the Telemetry graph run against my JMeter test plan, which is only slightly more stimulating than watching water come to a boil. That being said, I feel that by just examining the graphs and other windows over time, critical logical associations start to jump out at you after a while. Likewise with JMeter. Hopefully unit 12 was able to capture and relay some of these.
  • How to search online for decent articles and learning materials. The old Sun Technical Articles site was a great resource. Many of the links in the See Also sections at the bottom of tutorial units were found by adding site:java.sun.com/developer/technicalArticles/ to a Google search. Also, the official forums (found at forums.sun.com) became a good place for questions I couldn’t find ready answers to. I had both the Java EE 5 and 6 Tutorials bookmarked. And Marty Hall’s Core Servlets and JavaServer Pages became an invaluable resource for the first half of the tutorial.

What are your personal favorite features of the technologies discussed in the tutorial?

I particularly liked learning about session management: using the HttpSession object to carry user-specific data between requests, and working with JSP’s corresponding implicit objects in the front-end pages. Session management is a defining aspect for e-commerce applications, as they need to provide some sort of shopping cart mechanism, and so the Managing Sessions unit (unit 8) was a key chapter in the tutorial. It’s extremely useful to be able to suspend the debugger on a portion of code that includes session-scoped variables, then hover the mouse over a given variable in the editor to determine its current value. I used the debugger continuously during this phase, and so I went so far as to incorporate use of the debugger throughout the Managing Sessions unit.

What kind of background does someone starting the tutorial need to have?

Someone can come to the tutorial with little or absolutely no experience using NetBeans. I’ve tried to be particularly careful in providing clear and easy-to-follow instructions in this respect. But one would be best off having some background or knowledge in basic web technologies, and at least some exposure to relational databases.
With this foundation, I think that the topics covered in the second half of the tutorial, like applying entity classes and session beans, language support and security, won’t seem too daunting. I’ve noticed that the vast majority of feedback that comes in relates to the first half of the tutorial, and I sometimes get the impression that people feel they need to follow the tutorial units consecutively. Not so. The units are 90% modular. In other words, if somebody just wants to run through the security unit (unit 11), they can do so by downloading the associated project snapshot, following the setup instructions, and then just following along without needing to even look at other parts of the tutorial.

What will they be able to do at the end of it?

Naturally, anybody who completes individual tutorial units will be able to apply the concepts and technologies to their own work. But anyone who completes the tutorial in its entirety will gain an insight into the development process as a whole, and I think will also get a certain confidence that comes with knowing how “all the pieces fit together”, from gathering customer requirements all the way to deployment of the completed app to a production server. They’ll also have gained a solid familiarity with the NetBeans IDE, and be in a good position to explore popular Java web frameworks that work on top of servlet technology or impose an MVC architecture on their own, such as JSF, Spring, Struts, or Wicket.

Do you see any problems in the technologies discussed and what would be your suggestions for enhancements?

Well, there’s one thing that comes to mind. When I started working on this project, I was studying the Duke’s BookStore example from the Java EE 5 Tutorial. It’s a wonderful example that demonstrates how to progressively implement the same application using various technologies and combinations thereof. So for example you start out with an all-servlet implementation, then move on to a JSP/servlet version. Then there’s a JSTL implementation and ultimately, a version using JavaServer Faces. It’s great learning material, but also terrifically outdated.

Right around this time, Sun was gearing up for the big Java EE 6 release (Dec. 2009), and I was also trying to learn about the new upcoming technologies, namely CDI, JSF 2, and EJB 3, for my regular NetBeans documentation work. I was getting the definite sense that JSP and JSTL were slowly being pushed aside; in the case of JavaServer Faces, Facelets templating was the new page authoring technology. So really, the E-commerce Tutorial application has become a sort of EE 5/EE 6 hybrid by combining JSP/JSTL with EJB 3 and JPA 2.

Now the problem I see, from the perspective of a student trying to learn this stuff from scratch, is that the leap from basic servlet technology to a full-blown JSF/EJB/JPA solution is tremendous, and cannot readily be taught through a single tutorial. Naturally, others may disagree with me here. I’m not sure if there’s a solution other than to compensate by producing a lot of quality learning material that covers lots of different use-cases. I’d suggest that the E-commerce Tutorial puts one in a very advantageous position to begin learning about Java-based frameworks, such as GWT, Spring, and JSF, which is a natural course of action for people looking to get a job with this knowledge.

Planning any more parts to the tutorial or a new one?

No more parts. The E-commerce Tutorial is done. Upon committing the final installments and changes last November, I rejoiced. However, I’m still actively responding to feedback [the ‘Send Us Your Feedback’ links at the bottom of tutorials] and plan to maintain it indefinitely, so if anyone spots any typos, has questions or comments, recommendations for improvement, etc., please write in! :-)
January 9, 2011
by Geertjan Wielenga
· 30,160 Views