Maintenance Resources

The Latest Maintenance Topics

Option 1: JMX

Many people have asked whether they can manage Quartz via JMX, and said that the documentation is not clear enough to help them get started. So let me highlight a couple of ways you can do this. Yes, you can enable JMX in Quartz by adding the following to quartz.properties:

    org.quartz.scheduler.jmx.export = true

After this, you can use a standard JMX client such as $JAVA_HOME/bin/jconsole to connect and manage the scheduler remotely.

Option 2: RMI

Another way to manage Quartz remotely is to enable RMI. With this option you run one Quartz instance as the RMI server and create a second Quartz instance as the RMI client. The two talk to each other over a TCP port.

For the server scheduler instance, add these to quartz.properties:

    org.quartz.scheduler.rmi.export = true
    org.quartz.scheduler.rmi.createRegistry = true
    org.quartz.scheduler.rmi.registryHost = localhost
    org.quartz.scheduler.rmi.registryPort = 1099
    org.quartz.scheduler.rmi.serverPort = 1100

And for the client scheduler instance, add these to quartz.properties:

    org.quartz.scheduler.rmi.proxy = true
    org.quartz.scheduler.rmi.registryHost = localhost
    org.quartz.scheduler.rmi.registryPort = 1099

The RMI feature is mentioned in the Quartz documentation. Quartz doesn't have a separate client API; it uses the same org.quartz.Scheduler for both server and client. Only the configuration differs, and with it the behavior changes a lot. On the server, your scheduler runs all the jobs, while on the client it is simply a proxy. Your client scheduler instance will not run any jobs! You must be really careful when shutting down the client, because it does allow you to bring down the server!

These configurations are highlighted in the MySchedule project. If you run the webapp, you will see that it provides many sample Quartz configurations with these remote management properties.

If you configure the RMI option, you can still use the MySchedule web UI to manage Quartz as a proxy. You can view and drill down into jobs, and you can even stop or shut down the remote server! In my experience there is one downside to the Quartz RMI feature, though: it creates a single point of failure. There is no failover if your RMI server port is down!
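Because the whole RMI setup depends on the registry port being reachable, a client could probe that port before it creates the proxy scheduler. Below is a minimal sketch in Python using only the standard library. The function name `port_is_open` and the 2 second timeout are assumptions made for illustration; they are not part of Quartz. The check only tells you that something accepts TCP connections on the port, not that a Quartz scheduler is actually registered there.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; not part of Quartz.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

With the sample configuration above, the call would be `port_is_open("localhost", 1099)`. A `False` result suggests the RMI server side is down, which is exactly the single point of failure described.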

September 14, 2012

by Zemian Deng


Quartz Scheduler Misfire Instructions Explained

Sometimes Quartz is not able to run your job at the time you wanted. There are three reasons for that:

- all worker threads were busy running other jobs (probably with higher priority)
- the scheduler itself was down
- the job was scheduled with a start time in the past (probably a coding error)

You can increase the number of worker threads by customizing org.quartz.threadPool.threadCount in quartz.properties (default is 10). But you cannot really do anything when the whole application/server/scheduler was down. The situation when Quartz was incapable of firing a given trigger is called a misfire. Do you know what Quartz does when it happens? It turns out there are various strategies (called misfire instructions) Quartz can take, and there are also some defaults if you haven't thought about it. But in order to make your application robust and predictable (especially under heavy load or maintenance) you should really make sure your triggers and jobs are configured consciously.

The available misfire instructions depend on the trigger chosen. Quartz also behaves differently depending on the trigger setup (the so-called smart policy). Although the misfire instructions are described in the documentation, I found it hard to understand what they really mean. So I created this small summary article.

Before I dive into the details, there is yet another configuration option that should be described: org.quartz.jobStore.misfireThreshold (in milliseconds), defaulting to 60000 (a minute). It defines how late the trigger should be to be considered misfired. With the default setup, if a trigger was supposed to fire 30 seconds ago, Quartz will happily just run it. Such a delay is not considered a misfire. However, if the trigger is discovered 61 seconds after the scheduled time, the special misfire handler thread takes care of it, obeying the misfire instruction.
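The threshold rule can be sketched in a few lines. This is a Python model of the arithmetic only, not Quartz code: the function name `is_misfire` is made up for illustration, and treating a delay exactly equal to the threshold as "not misfired" is an assumption.

```python
from datetime import datetime, timedelta

DEFAULT_MISFIRE_THRESHOLD_MS = 60_000  # default value quoted in the text


def is_misfire(scheduled: datetime, now: datetime,
               threshold_ms: int = DEFAULT_MISFIRE_THRESHOLD_MS) -> bool:
    """Return True when the trigger is late by more than the threshold.

    Hypothetical helper; the strict comparison is an assumption.
    """
    lateness = now - scheduled
    return lateness > timedelta(milliseconds=threshold_ms)


scheduled = datetime(2012, 4, 13, 9, 0, 0)
# 30 seconds late with the default threshold: just run it.
print(is_misfire(scheduled, scheduled + timedelta(seconds=30)))  # False
# 61 seconds late: handled according to the misfire instruction.
print(is_misfire(scheduled, scheduled + timedelta(seconds=61)))  # True
# 10 seconds late with the threshold lowered to 1000 ms.
print(is_misfire(scheduled, scheduled + timedelta(seconds=10), 1000))  # True
```

The last call shows why lowering the threshold to one second makes misfires easy to provoke in tests.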
For test purposes we will set this parameter to 1000 (1 second) so that we can test misfiring quickly.

Simple trigger without repeating

In our first example we will see how misfiring is handled by simple triggers scheduled to run only once:

    val trigger = newTrigger().
        startAt(DateUtils.addSeconds(new Date(), -10)).
        build()

The same trigger, but with an explicitly set misfire instruction:

    val trigger = newTrigger().
        startAt(DateUtils.addSeconds(new Date(), -10)).
        withSchedule(
            simpleSchedule().
                withMisfireHandlingInstructionFireNow()  //MISFIRE_INSTRUCTION_FIRE_NOW
        ).
        build()

For the purpose of testing I am simply scheduling the trigger to run 10 seconds ago (so it is 10 seconds late by the time it is created!). In the real world you would normally never schedule triggers like that. Instead, imagine the trigger was set correctly, but by the time it was supposed to fire the scheduler was down or didn't have any free worker threads. How will Quartz handle this extraordinary situation? In the first code snippet above no misfire handling instruction is set (the so-called smart policy is used in that case). The second code snippet explicitly defines what kind of behaviour we expect when a misfire occurs. See the table:

Instruction: smart policy (default)
Meaning: See withMisfireHandlingInstructionFireNow.

Instruction: withMisfireHandlingInstructionFireNow (MISFIRE_INSTRUCTION_FIRE_NOW)
Meaning: The job is executed immediately after the scheduler discovers the misfire. This is the smart policy. Example scenario: you have scheduled some system clean up at 2 AM. Unfortunately the application was down for maintenance at that time and was brought back at 3 AM. So the trigger misfired, and the scheduler tries to save the situation by running it as soon as it can, at 3 AM.

Instruction: withMisfireHandlingInstructionIgnoreMisfires (MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY; see the note on QTZ-283)
Meaning: See withMisfireHandlingInstructionFireNow.

Instruction: withMisfireHandlingInstructionNextWithExistingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_EXISTING_COUNT)
Meaning: See withMisfireHandlingInstructionNextWithRemainingCount.

Instruction: withMisfireHandlingInstructionNextWithRemainingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_REMAINING_COUNT)
Meaning: Does nothing; the misfired execution is ignored and there is no next execution. Use this instruction when you want to completely discard the misfired execution. Example scenario: the trigger was supposed to start recording a TV program. There is no point in starting the recording when the trigger misfired and is already 2 hours late.

Instruction: withMisfireHandlingInstructionNowWithExistingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_EXISTING_REPEAT_COUNT)
Meaning: See withMisfireHandlingInstructionFireNow.

Instruction: withMisfireHandlingInstructionNowWithRemainingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_REMAINING_REPEAT_COUNT)
Meaning: See withMisfireHandlingInstructionFireNow.

Simple trigger repeating fixed number of times

This scenario is much more complicated. Imagine we have scheduled some job to repeat a fixed number of times:

    val trigger = newTrigger().
        startAt(dateOf(9, 0, 0)).
        withSchedule(
            simpleSchedule().
                withRepeatCount(7).
                withIntervalInHours(1).
                withMisfireHandlingInstructionFireNow()  //or other
        ).
        build()

In this example the trigger is supposed to fire 8 times (first execution + 7 repetitions) every hour, beginning at 9 AM today (startAt(dateOf(9, 0, 0))). Thus the last execution should occur at 4 PM. However, assume that for some reason the scheduler was not capable of running jobs at 9 and 10 AM and discovered that fact at 10:15 AM, i.e. 2 firings misfired. How will the scheduler behave in this situation?
Instruction: smart policy (default)
Meaning: See withMisfireHandlingInstructionNowWithExistingCount.

Instruction: withMisfireHandlingInstructionFireNow (MISFIRE_INSTRUCTION_FIRE_NOW)
Meaning: See withMisfireHandlingInstructionNowWithRemainingCount.

Instruction: withMisfireHandlingInstructionIgnoreMisfires (MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY; see the note on QTZ-283)
Meaning: Fires all triggers that were missed as soon as possible and then goes back to the ordinary schedule. Example scenario: with this strategy, in our example the scheduler will fire the jobs scheduled at 9 and 10 AM immediately. Then it will wait until 11 AM and go back to the ordinary schedule. Note: when handling misfires it is important to realize that the actual job execution time might be way after the scheduled time. This means you cannot simply rely on the current system date; you need to use JobExecutionContext.getScheduledFireTime():

    def execute(context: JobExecutionContext) {
        val date = context.getScheduledFireTime
        //...
    }

Instruction: withMisfireHandlingInstructionNextWithExistingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_EXISTING_COUNT)
Meaning: The scheduler won't do anything immediately. Instead it will wait for the next scheduled time and run all triggers at the scheduled intervals. See also: withMisfireHandlingInstructionNextWithRemainingCount. Example scenario: at 10:15 the scheduler discovers 2 misfired executions. It waits until the next scheduled time (11 AM) and fires all 8 scheduled executions every hour, stopping at 6 PM (the trigger should have stopped at 4 PM).

Instruction: withMisfireHandlingInstructionNextWithRemainingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_REMAINING_COUNT)
Meaning: The scheduler discards misfired executions and waits for the next scheduled time. The total number of trigger executions will be lower than configured. Example scenario: at 10:15 the two misfired executions are discarded. The scheduler waits for the next scheduled time (11 AM) and fires the remaining triggers up to 4 PM. Effectively it behaves as if the misfire never occurred.

Instruction: withMisfireHandlingInstructionNowWithExistingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_EXISTING_REPEAT_COUNT)
Meaning: The first misfired trigger is executed immediately. Then the scheduler waits the desired interval and executes all remaining triggers. Effectively the first fire time of the misfired trigger is moved to the current time with no other changes. Example scenario: at 10:15 the scheduler runs the first misfired execution. Then it waits 1 hour and fires the second one at 11:15 AM. All 8 executions are performed, the last one at 5:15 PM.

Instruction: withMisfireHandlingInstructionNowWithRemainingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_REMAINING_REPEAT_COUNT)
Meaning: The first misfired execution runs immediately. Remaining misfired executions are discarded. Triggers that were not misfired are executed at the desired interval. Example scenario: at 10:15 the scheduler runs the first misfired execution (from 9 AM). It discards the remaining misfired executions (the one from 10 AM) and waits 1 hour to execute six more triggers: 11:15, 12:15, … 4:15 PM.

Simple trigger repeating infinitely

In this scenario the trigger repeats an infinite number of times at a given interval:

    val trigger = newTrigger().
        startAt(dateOf(9, 0, 0)).
        withSchedule(
            simpleSchedule().
                withRepeatCount(SimpleTrigger.REPEAT_INDEFINITELY).
                withIntervalInHours(1).
                withMisfireHandlingInstructionFireNow()  //or other
        ).
        build()

Once again the trigger should fire every hour, beginning at 9 AM today (startAt(dateOf(9, 0, 0))). However, the scheduler was not capable of running jobs at 9 and 10 AM and discovered that fact at 10:15 AM, i.e. 2 firings misfired. This is a more general situation compared to a simple trigger running a fixed number of times.
Instruction: smart policy (default)
Meaning: See withMisfireHandlingInstructionNextWithRemainingCount.

Instruction: withMisfireHandlingInstructionFireNow (MISFIRE_INSTRUCTION_FIRE_NOW)
Meaning: See withMisfireHandlingInstructionNowWithRemainingCount.

Instruction: withMisfireHandlingInstructionIgnoreMisfires (MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY; see the note on QTZ-283)
Meaning: The scheduler will immediately run all misfired triggers, then continue on schedule. Example scenario: the triggers scheduled at 9 and 10 AM are executed immediately. Future invocations (next scheduled at 11 AM) are executed according to plan.

Instruction: withMisfireHandlingInstructionNextWithExistingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_EXISTING_COUNT)
Meaning: See withMisfireHandlingInstructionNextWithRemainingCount.

Instruction: withMisfireHandlingInstructionNextWithRemainingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NEXT_WITH_REMAINING_COUNT)
Meaning: Does nothing; misfired executions are discarded. Then the scheduler waits for the next scheduled interval and goes back to schedule. Example scenario: the misfired executions at 9 and 10 AM are discarded. The first execution occurs at 11 AM.

Instruction: withMisfireHandlingInstructionNowWithExistingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_EXISTING_REPEAT_COUNT)
Meaning: See withMisfireHandlingInstructionNowWithRemainingCount.

Instruction: withMisfireHandlingInstructionNowWithRemainingCount (MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_REMAINING_REPEAT_COUNT)
Meaning: The first misfired execution is run immediately, the remaining ones are discarded. The next execution happens after the desired interval. Effectively the first execution time is moved to the current time. Example scenario: the scheduler fires the misfired trigger immediately at 10:15 AM. Then it waits an hour, runs the second one at 11:15 AM and continues with a 1 hour interval.

CRON triggers

CRON triggers are the most popular ones amongst Quartz users. However, there are also two other available triggers: DailyTimeIntervalTrigger (e.g. fire every 25 minutes) and CalendarIntervalTrigger (e.g. fire every 5 months). They support triggering policies that are not possible with either CRON or simple triggers. However, they understand the same misfire handling instructions as the CRON trigger.

    val trigger = newTrigger().
        withSchedule(
            cronSchedule("0 0 9-17 ? * MON-FRI").
                withMisfireHandlingInstructionFireAndProceed()  //or other
        ).
        build()

In this example the trigger should fire every hour between 9 AM and 5 PM, from Monday to Friday. But once again the first two invocations were missed (so the trigger misfired) and this situation was discovered at 10:15 AM. Note that the available misfire instructions are different compared to simple triggers:

Instruction: smart policy (default)
Meaning: See withMisfireHandlingInstructionFireAndProceed.

Instruction: withMisfireHandlingInstructionIgnoreMisfires (MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY; see the note on QTZ-283)
Meaning: All misfired executions are executed immediately, then the trigger goes back on schedule. Example scenario: the executions scheduled at 9 and 10 AM are executed immediately. The next scheduled execution (at 11 AM) runs on time.

Instruction: withMisfireHandlingInstructionFireAndProceed (MISFIRE_INSTRUCTION_FIRE_ONCE_NOW)
Meaning: Immediately executes the first misfired execution and discards the others (i.e. all misfired executions are merged together). Then back to schedule. No matter how many trigger executions were missed, only a single immediate execution is performed. Example scenario: the executions scheduled at 9 and 10 AM are merged and executed only once (in other words, the execution scheduled at 10 AM is discarded). The next scheduled execution (at 11 AM) runs on time.

Instruction: withMisfireHandlingInstructionDoNothing (MISFIRE_INSTRUCTION_DO_NOTHING)
Meaning: All misfired executions are discarded; the scheduler simply waits for the next scheduled time. Example scenario: the executions scheduled at 9 and 10 AM are discarded, so basically nothing happens. The next scheduled execution (at 11 AM) runs on time.
Note on QTZ-283: MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY not working with JDBCJobStore. Apparently there is a bug when JDBCJobStore is used, so keep an eye on that issue.

As you can see, various triggers behave differently based on the actual setup. Moreover, even though the so-called smart policy is provided, often the decision is based on business requirements. Essentially there are three major strategies: ignore, run immediately and continue, and discard and wait for next. They all have different use-cases:

- Use ignore policies when you want to make sure all scheduled executions were triggered, even if it means multiple misfired triggers will fire. Think about a job that generates a report every hour based on orders placed during the last hour. If the server was down for 8 hours, you still want to have those reports generated as soon as you can. In this case the ignore policies will simply run all triggers scheduled during those 8 hours as fast as the scheduler can. They will be several hours late, but will eventually be executed.

- Use now* policies when there are jobs executing periodically and upon a misfire they should run as soon as possible, but only once. Think of a job that cleans the /tmp directory every minute. If the scheduler was busy for 20 minutes and can finally run this job, you don't want to run it 20 times! Once is enough, but make sure it runs as soon as it can. Then back to your normal one-minute intervals.

- Finally, next* policies are good when you want to make sure your job runs at particular points in time. For example, you need to fetch stock prices at quarter past every hour. They change rapidly, so if your job misfired and it is already 20 minutes past the full hour, don't bother. You missed the correct time by 5 minutes and now you don't really care. It is better to have a gap than an inaccurate value. In this case Quartz will skip all misfired executions and simply wait for the next one.
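To make the fixed-count scenario (start at 9 AM, repeat count 7, hourly, misfire discovered at 10:15 AM) easier to follow, here is a small Python model of the behaviour as the article describes it. It is not Quartz code and does not call Quartz: the function `fire_times` and the policy names are invented for this sketch, the misfire threshold is ignored, and the real scheduler may differ in details.

```python
from datetime import datetime, timedelta
from typing import List


def fire_times(policy: str, start: datetime, interval: timedelta,
               repeat_count: int, now: datetime) -> List[datetime]:
    """Model the article's fixed-count simple trigger scenarios (hypothetical)."""
    total = repeat_count + 1  # first execution + repetitions
    schedule = [start + i * interval for i in range(total)]
    missed = [t for t in schedule if t < now]
    upcoming = [t for t in schedule if t >= now]
    if not missed:
        return schedule
    # First slot on the original grid at or after `now` (ceiling division).
    slots_elapsed = -((start - now) // interval)
    next_slot = start + slots_elapsed * interval

    if policy == "ignore_misfires":
        # Every missed execution fires right away, then the schedule resumes.
        return [now] * len(missed) + upcoming
    if policy == "next_with_existing_count":
        # Wait for the next slot, then still run the full count.
        return [next_slot + i * interval for i in range(total)]
    if policy == "next_with_remaining_count":
        # Missed executions are dropped; the rest run as planned.
        return upcoming
    if policy == "now_with_existing_count":
        # Shift the whole series so that it starts now.
        return [now + i * interval for i in range(total)]
    if policy == "now_with_remaining_count":
        # One execution now, then only the ones that were never missed.
        return [now + i * interval for i in range(1 + len(upcoming))]
    raise ValueError(f"unknown policy: {policy}")


START = datetime(2012, 4, 13, 9, 0)
NOW = datetime(2012, 4, 13, 10, 15)
for name in ("ignore_misfires", "next_with_existing_count",
             "next_with_remaining_count", "now_with_existing_count",
             "now_with_remaining_count"):
    times = fire_times(name, START, timedelta(hours=1), 7, NOW)
    print(name, len(times), times[-1].strftime("%H:%M"))
```

For the article's numbers, the last execution lands at 16:00 for the ignore and next-with-remaining variants, 18:00 for next-with-existing, 17:15 for now-with-existing and 16:15 for now-with-remaining, matching the example scenarios in the table.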

April 13, 2012

by Tomasz Nurkiewicz


The Hidden Treasure of Quartz Scheduler Plugins

Although briefly described in the official documentation, I believe Quartz plugins aren't known well enough, considering how useful they are. Essentially, plugins in Quartz are convenience classes wrapping the registration of underlying listeners. You are free to write your own plugins, but we will focus on the existing ones shipped with Quartz.

LoggingTriggerHistoryPlugin

First some background. The two main abstractions in Quartz are jobs and triggers. A job is a piece of code that we would like to schedule. A trigger instructs the scheduler when this code should run. CRON (e.g. run every Friday between 9 AM and 5 PM until November) and simple (run 100 times every 2 hours) triggers are most commonly used. You can associate any number of triggers with a single job.

Believe it or not, Quartz by default provides no logging or monitoring whatsoever of executed jobs and triggers. There is an API, but no built-in logging is implemented. It won't show you that it is now executing this particular job due to this trigger firing. So the first thing you should do is add the following lines to your quartz.properties:

    org.quartz.plugin.triggerHistory.class=org.quartz.plugins.history.LoggingTriggerHistoryPlugin
    org.quartz.plugin.triggerHistory.triggerFiredMessage=Trigger [{1}.{0}] fired job [{6}.{5}] scheduled at: {2, date, dd-MM-yyyy HH:mm:ss.SSS}, next scheduled at: {3, date, dd-MM-yyyy HH:mm:ss.SSS}
    org.quartz.plugin.triggerHistory.triggerCompleteMessage=Trigger [{1}.{0}] completed firing job [{6}.{5}] with resulting trigger instruction code: {9}. Next scheduled at: {3, date, dd-MM-yyyy HH:mm:ss.SSS}
    org.quartz.plugin.triggerHistory.triggerMisfiredMessage=Trigger [{1}.{0}] misfired job [{6}.{5}]. Should have fired at: {3, date, dd-MM-yyyy HH:mm:ss.SSS}

The first line (the only required one) loads the plugin class LoggingTriggerHistoryPlugin. The remaining lines configure the plugin, customizing the logging messages. I found the built-in defaults not very well thought out, e.g. they display the current time, which is already part of the logging framework message. You are free to construct any logging message; see the API for details. Adding these few extra lines makes debugging and monitoring much easier:

    LoggingTriggerHistoryPlugin | Trigger [Demo.Every-few-seconds] fired job [Demo.Print-message] scheduled at: 04-04-2012 23:23:47.036, next scheduled at: 04-04-2012 23:23:51.036
    //...job output
    LoggingTriggerHistoryPlugin | Trigger [Demo.Every-few-seconds] completed firing job [Demo.Print-message] with resulting trigger instruction code: DO NOTHING. Next scheduled at: 04-04-2012 23:23:51.036

You see now why naming your triggers (Demo.Every-few-seconds) and jobs (Demo.Print-message) is so important.

LoggingJobHistoryPlugin

There is another handy plugin related to logging:

    org.quartz.plugin.jobHistory.class=org.quartz.plugins.history.LoggingJobHistoryPlugin
    org.quartz.plugin.jobHistory.jobToBeFiredMessage=Job [{1}.{0}] to be fired by trigger [{4}.{3}], re-fire: {7}
    org.quartz.plugin.jobHistory.jobSuccessMessage=Job [{1}.{0}] execution complete and reports: {8}
    org.quartz.plugin.jobHistory.jobFailedMessage=Job [{1}.{0}] execution failed with exception: {8}
    org.quartz.plugin.jobHistory.jobWasVetoedMessage=Job [{1}.{0}] was vetoed. It was to be fired by trigger [{4}.{3}] at: {2, date, dd-MM-yyyy HH:mm:ss.SSS}

The rule is the same: plugin + extra configuration. See the JavaDoc of LoggingJobHistoryPlugin for details and possible placeholders. A quick look at the logs reveals very descriptive output:

    Trigger [Demo.Every-few-seconds] fired job [Demo.Print-message] scheduled at: 04-04-2012 23:34:53.739, next scheduled at: 04-04-2012 23:34:57.739
    Job [Demo.Print-message] to be fired by trigger [Demo.Every-few-seconds], re-fire: 0
    //...job output
    Job [Demo.Print-message] execution complete and reports: null
    Trigger [Demo.Every-few-seconds] completed firing job [Demo.Print-message] with resulting trigger instruction code: DO NOTHING. Next scheduled at: 04-04-2012 23:34:57.739

I have no idea why these plugins aren't enabled by default. After all, if you don't want such verbose output, you can turn it off in your logging framework. Never mind, I think it is a good idea to have them in place when troubleshooting Quartz execution.

XMLSchedulingDataProcessorPlugin

This is a pretty comprehensive plugin. It reads an XML file (by default named quartz_data.xml) containing job and trigger definitions and adds them to the scheduler. This is especially useful when you have a global job that you need to add once. The plugin can either update the existing jobs/triggers or ignore the XML file if they already exist, which is very useful when JDBCJobStore is used.

    org.quartz.plugin.xmlScheduling.class=org.quartz.plugins.xml.XMLSchedulingDataProcessorPlugin

In the aforementioned article we were adding the job to the scheduler manually:

    val trigger = newTrigger().
        withIdentity("Every-few-seconds", "Demo").
        withSchedule(
            simpleSchedule().
                withIntervalInSeconds(4).
                repeatForever()
        ).
        build()

    val job = newJob(classOf[PrintMessageJob]).
        withIdentity("Print-message", "Demo").
        usingJobData("msg", "Hello, world!").
        build()

    scheduler.scheduleJob(job, trigger)

The same can be achieved with XML configuration; just place the following quartz_data.xml in your CLASSPATH:

    <job-scheduling-data xmlns="http://www.quartz-scheduler.org/xml/JobSchedulingData" version="2.0">
        <processing-directives>
            <overwrite-existing-data>false</overwrite-existing-data>
            <ignore-duplicates>true</ignore-duplicates>
        </processing-directives>
        <schedule>
            <trigger>
                <simple>
                    <name>Every-few-seconds</name>
                    <group>Demo</group>
                    <job-name>Print-message</job-name>
                    <job-group>Demo</job-group>
                    <repeat-count>-1</repeat-count>
                    <repeat-interval>4000</repeat-interval>
                </simple>
            </trigger>
            <job>
                <name>Print-message</name>
                <group>Demo</group>
                <job-class>com.blogspot.nurkiewicz.quartz.demo.PrintMessageJob</job-class>
                <job-data-map>
                    <entry>
                        <key>msg</key>
                        <value>Hello, World!</value>
                    </entry>
                </job-data-map>
            </job>
        </schedule>
    </job-scheduling-data>

The file supports both simple and CRON triggers and is well described using XML Schema. It is even possible to point to XML files somewhere in the file system and periodically scan them for changes (!) (see: XMLSchedulingDataProcessorPlugin.setScanInterval()). Guess what Quartz uses to schedule the periodic scanning?

    org.quartz.plugin.xmlScheduling.fileNames=/etc/quartz/system-jobs.xml,/home/johnny/my-jobs.xml
    org.quartz.plugin.xmlScheduling.scanInterval=60

ShutdownHookPlugin

Last but not least, ShutdownHookPlugin.
A small but probably useful plugin that registers a shutdown hook in the JVM in order to gently stop the scheduler. However, I recommend turning cleanShutdown off. If the system is already trying to abruptly stop the application (typically scheduler shutdown is called by Spring via SchedulerFactoryBean) or the user hit Ctrl+C, waiting for currently running jobs seems like a bad idea. After all, maybe we are killing the application because some jobs are running for too long or hanging?

    org.quartz.plugin.shutdownHook.class=org.quartz.plugins.management.ShutdownHookPlugin
    org.quartz.plugin.shutdownHook.cleanShutdown=false

As you can see, Quartz ships with a few quite interesting plugins. For some reason they aren't described in detail in the official documentation, but they work pretty well and are a valuable addition to the scheduler. The source code with the plugins applied is available on GitHub.
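The numbered placeholders in the logging templates are positional arguments in the style of Java's MessageFormat. As a rough illustration, the sketch below fills the simple placeholders of the triggerFiredMessage template using Python's str.format, which happens to accept the same `{n}` syntax (it does not understand the `{2, date, ...}` form, so that part is left out). The argument positions used here (0 = trigger name, 1 = trigger group, 5 = job name, 6 = job group) are inferred from the sample log output, and the `render` helper is made up for this example.

```python
# Shortened template: only the plain positional placeholders are kept.
TEMPLATE = "Trigger [{1}.{0}] fired job [{6}.{5}]"


def render(template: str, args: list) -> str:
    """Fill positional placeholders; a stand-in for MessageFormat (hypothetical)."""
    return template.format(*args)


# Positions 2-4 would hold dates in the real plugin; unused in this sketch.
args = ["Every-few-seconds", "Demo", None, None, None, "Print-message", "Demo"]
print(render(TEMPLATE, args))
# prints: Trigger [Demo.Every-few-seconds] fired job [Demo.Print-message]
```

Seeing the group printed before the name makes it clear why `{1}.{0}` appears in that order in the templates.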

April 9, 2012

by Tomasz Nurkiewicz


Configuring Quartz With JDBCJobStore in Spring

I am starting a little series about Quartz scheduler internals, tips and tricks. This is chapter 0: how to configure a persistent job store.

April 7, 2012

by Tomasz Nurkiewicz


Why You Shouldn't Use Quartz Scheduler

If you need to schedule jobs in Java, it is fairly common in the industry to use Quartz directly or via Spring integration, but you might want to think twice.

January 30, 2012

by Craig Flichel


Low-level Infrastructure: Puppet, DNS and DHCP

Right. Let’s have a look at the massive technical implications of the Fix Puppet idea. As I mentioned in my earlier blogpost, in order to fix puppet in a sensible way, we’ll have to review all, and overhaul some, of the underlying infrastructure that allows it all to run. The interlinks and dependencies between all the parts are a little tricky to visualise. So, here’s a picture. Anything in red needs attention, and the stuff in green *just works*. Things in blue are install stages, and these are what we’re working on making perfect.

Right, so we’ve basically got a directed graph, representing the steps and stages that have to happen to a new machine before users can log in. The steps taken to build a machine roughly look like this:

1. Unbox.
2. Plug in.
3. Configure Netboot.
4. Hand MAC Address to DHCP server and assign a hostname.
5. Client PXEBoots.
6. Client downloads a preseed file.
7. Client installs itself.
8. Client Reboots.
9. Puppet runs on First Boot.
10. Puppet completes.
11. Client Reboots again.
12. Users login.

That’s about it, really. The first 4 steps are a hell of a lot easier with the support and co-operation of the supplier. It’s nice to have systems preconfigured to PXE boot as the BIOS default, and even cooler if they can send the MAC addresses as labels on each physical machine.

If we’re going to build out a new infrastructure, we’re going to need to review and reinstall the servers that provide this infrastructure before we can build any workstations. I’m a massive massive fan of puppet, and believe that it should be used for the configuration of all servers and workstations. As such, I didn’t want to rebuild anything without using puppet, so the first step had to be getting puppet working again.

So, without further ado, let’s take a look at the Puppet portion of this, well, one of them. My predecessor saw fit that all nodes should be defined with puppet-dashboard, which is itself a fine piece of software, but I think more for reporting than specification.
Initially, at least, I rebuilt the puppet manifest from a known-good configuration. Namely the base configs I wrote for a blogpost about a year ago; base configs that I’m going to update soon. I’m a bit of an old fashioned puppet user. I like my nodes defined in nodes.pp, not some External Node Classifier service. Reason being, I like to be able to look in one place and find exactly what I want. It’s not a massive ballache to clone down the puppet git repo, make a change and push it back up. In fact, it’s better than having a web interface for your node classifications, because git provides you with an intrinsic log of what was changed, and it’s easy to revert to an old version, because everything’s stored in source control. You can also test what you’re about to do, because again, it’s just a source control repo. I’m a fan of having Jenkins run a few sanity checks on your puppet repo, but that’s a digression for another blogpost. I’m not going to go into great depth about how to install DHCP and DNS, and how to make it work with puppet, at least, not here. What I will say, though is that Puppet Module Tool is the most fantastically easy way to generate boilerplate modules for puppet. All you need to do is run puppet-module generate tomoconnor-dhcp and you get a full puppet module folder called tomoconnor-dhcp which contains all the structure according to the best practice guidelines. Excellent. As part of the review process, it became quite apparent that Bind9 has no sensible admin/management interface, or at least, there wasn’t one installed, and frankly, anything that has such horrific config files should be shot. Having had good experience and results using PowerDNS in the past, we decided that this would be a valid upgrade from BIND. PowerDNS relies on a SQL backend for storing the record data in. You can use either MySQL or PostgreSQL, or possibly some others. 
Since MySQL can be a bitch, and is, to all serious purposes, a toy database, Postgres seems like a better choice. 9.1 is stable, and there are deb packages available for it. 9.1 also does hot-standby replication, which is a miracle, because Postgres replication used to be a massive pain in the testicles.

There were, initially, some mysterious problems with the TFTPd server being generally crappy, mostly regarding timeouts, which was because the TFTP data was stored on a painfully slow disk. Moving it from there to the NFS mount dramatically increased performance and stopped TFTP going crazy.

In the TFTPd config, there’s a block for configuring the boot options of the preseed install. This is how PXE hands over the details of the preseed server, and the classes of preseed file to run (basically, which modules):

    label lucid_ws
        menu label ^2) Auto Install Ubuntu Lucid WorkStation
        text help
        Start hands off install of a workstation.
        endtext
        menu default
        kernel ubuntu-1004-installer/amd64/linux
        append tasks=standard pkgsel/language-pack-patterns= pkgsel/install-language-support=false vga=normal initrd=ubuntu-1004-installer/amd64/initrd.gz -- quiet auto debian-installer/country=GB debian-installer/language=en debian-installer/keymap=us debian-installer/locale=en_GB.UTF8 netcfg/choose_interface=eth0 netcfg/get_hostname=ubuntu netcfg/get_domain=installdomain.wibblesplat.com url=http://autoserver/d-i/lucid/preseed.cfg classes=wibblesplat;workstation DEBCONF_DEBUG=1

Initially, the Preseed files contained all sorts of crazy hacky shit in the d-i late-command setting. late-command is cool. It’s basically the last thing to run before the first reboot when you build a new debian/ubuntu system. You can tell it to do all sorts of stuff in there. You probably shouldn’t, though. Especially when what you’re doing in there is better done elsewhere.
The previous Preseed file contained a whole bunch of “inject these source files into /etc/apt/sources.list”, which is utter bullshit, because you can do exactly the same thing with d-i local repositories, only far far cleaner.

That’s not to say that my refactored preseed files don’t use late-command at all. I’ve chosen to insert some lines into /etc/rc.local on the freshly built system that ensure a puppet run at first boot. On the preseed server, there’s a file called “firstboot.sh” which gets dropped into /usr/local/bin by way of a wget command in late-command. The next thing that happens in late-command is a line to remove “exit 0” from /etc/rc.local and replace it with a thing that calls “/usr/local/bin/firstboot.sh”. When firstboot runs, it runs puppet, checks for sanity, and then removes itself from /etc/rc.local. The code to actually do that looks like this:

    d-i preseed/late_command string \
        wget -q -O /target/root/firstboot.sh http://autoserver/d-i/bin/firstboot.sh && \
        chmod +x /target/root/firstboot.sh && \
        sed -i 's_exit 0_sh /root/firstboot.sh_' /target/etc/rc.local

This relies on having something on http://autoserver that is basically just apache hosting some files for the preseeder to retrieve during installation. Cool huh? That ensures that the first thing that happens once the new machine has been built and rebooted is a puppet run.

Some stuff we do here relies on our hand-rolled deb packages, which are stored in our own internal APT repo. We’ve also got an APT cache, created and maintained by apt-cacher-ng, which at least means that when you’re rebuilding systems frequently, all the packages you would otherwise download from archive.ubuntu.com come straight over the LAN. The major problem initially with this was the speed, or lack of it. It certainly wasn’t performing anywhere near the speeds you’d expect from a 1GE LAN, and the reason was, again, slow disks.
Moving the apt-cache files to the NFS highspeed storage again helped performance. If we struggle in future, I'm going to look at an SSD cache for this, but I think that the performance of the SAS/SATA disks on massively parallel storage provided by our NFS servers will be adequate for the foreseeable future.

Next up, the Puppetmaster. Again, I was pretty keen on building this from scratch, but using puppet itself to configure its own master. Sounds pretty counter-intuitive, right? But the puppet client can bootstrap the master quite easily by using files as its source. The first step is to clone down the latest puppet manifests from git, so you either need to git export elsewhere, or install git-core. Your choice. Once you've got those, all you need to do is install puppet-client, and run:

    puppet apply /path/to/your/manifests/site.pp

If you've written the manifests right, and you've got your master defined as a node, you should find that puppet will install puppetmaster, and so on, and then you get a ready and working puppetmaster that just configured itself.

I used the puppet-module tool to generate modules for the following services/items: "applications", which actually contains a bunch of custom/proprietary application install rules (a declassified example is a googlechrome.pp file that installs chrome from a PPA). Other modules: dhcp, kernel, ldap, network, nfs, nscd, ntp, nvidia, postgres, powerdns and ssmtp.

As is the trend with puppet, and modern DevOps, a vast majority of the code in the entire manifest repository has been gleaned and researched from other puppet modules on github. Acknowledgement is in place where it's due, and the working copies we're using are frequently forked on github from the original. It's great, this, actually. If you search on PuppetForge http://forge.puppetlabs.com/ the array of modules available is staggering. It makes bootstrapping a new manifest set remarkably quick and easy.
The NFS module contains a bunch of requirements for mounting NFS shares, and the definitions for an NFS share to be mounted. All pretty simple stuff, but modularised for ease of use. I'm particularly proud of the postgres module, which has a master class and a slave class that install and configure the required files and packages to enable streaming hot-standby replication on Postgres 9.1. I will release the declassified fork of this soon.

I'm going to wrap this post up here. It's a massively long one, and there's still lots more left to write.

Source: tomoconnor.eu/blogish/low-level-infrastructure-puppet-dns-and-dhcp/

January 29, 2012

by Tom O'connor


Enabling JMX in Hibernate, Ehcache, Quartz, DBCP and Spring

A collection of short how-to's for enabling JMX in several popular Java technologies.

Continuing our journey with JMX (see: ...JMX for human beings) we will learn how to enable JMX support (typically statistics and monitoring capabilities) in some popular frameworks. Most of this information can be found on the projects' home pages, but I decided to collect it here with the addition of a few useful tips.

Hibernate (with Spring support)

Exposing Hibernate statistics with JMX is pretty simple, however some nasty workarounds are required when the JPA API is used to obtain the underlying SessionFactory:

    class JmxLocalContainerEntityManagerFactoryBean() extends LocalContainerEntityManagerFactoryBean {

        override def createNativeEntityManagerFactory() = {
            val managerFactory = super.createNativeEntityManagerFactory()
            registerStatisticsMBean(managerFactory)
            managerFactory
        }

        def registerStatisticsMBean(managerFactory: EntityManagerFactory) {
            managerFactory match {
                case impl: EntityManagerFactoryImpl =>
                    val mBean = new StatisticsService();
                    mBean.setStatisticsEnabled(true)
                    mBean.setSessionFactory(impl.getSessionFactory);
                    val name = new ObjectName("org.hibernate:type=Statistics,application=spring-pitfalls")
                    ManagementFactory.getPlatformMBeanServer.registerMBean(mBean, name);
                case _ =>
            }
        }
    }

Note that I have created a subclass of Spring's built-in LocalContainerEntityManagerFactoryBean. By overriding the createNativeEntityManagerFactory() method I can access the EntityManagerFactory, and by downcasting it to org.hibernate.ejb.EntityManagerFactoryImpl I was able to register the Hibernate MBean. One more thing is left. Obviously we have to use our custom subclass instead of org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean. Also, in order to collect the actual statistics instead of just seeing zeroes all the way down, we must set the hibernate.generate_statistics flag.
    @Bean
    def entityManagerFactoryBean() = {
        val entityManagerFactoryBean = new JmxLocalContainerEntityManagerFactoryBean()
        entityManagerFactoryBean.setDataSource(dataSource())
        entityManagerFactoryBean.setJpaVendorAdapter(jpaVendorAdapter())
        entityManagerFactoryBean.setPackagesToScan("com.blogspot.nurkiewicz")
        entityManagerFactoryBean.setJpaPropertyMap(
            Map(
                "hibernate.hbm2ddl.auto" -> "create",
                "hibernate.format_sql" -> "true",
                "hibernate.ejb.naming_strategy" -> classOf[ImprovedNamingStrategy].getName,
                "hibernate.generate_statistics" -> true.toString
            ).asJava
        )
        entityManagerFactoryBean
    }

Here is a sample of what we can expect to see in JVisualVM (don't forget to install all plugins!). In addition we get some nice Hibernate logging:

    HQL: select generatedAlias0 from Book as generatedAlias0, time: 10ms, rows: 20

EhCache

Monitoring caches is very important, especially in applications where you expect values to generally be present in the cache. I tend to query the database as often as needed to avoid unnecessary method arguments or local caching. Everything to make code as simple as possible. However, this approach only works when caching on the database layer works correctly. Similar to Hibernate, enabling JMX monitoring in EhCache is a two-step process. First you need to expose the provided MBean in the MBeanServer:

    @Bean(initMethod = "init", destroyMethod = "dispose")
    def managementService = new ManagementService(ehCacheManager(), platformMBeanServer(), true, true, true, true, true)

    @Bean
    def platformMBeanServer() = ManagementFactory.getPlatformMBeanServer

    def ehCacheManager() = ehCacheManagerFactoryBean.getObject

    @Bean
    def ehCacheManagerFactoryBean = {
        val ehCacheManagerFactoryBean = new EhCacheManagerFactoryBean
        ehCacheManagerFactoryBean.setShared(true)
        ehCacheManagerFactoryBean.setCacheManagerName("spring-pitfalls")
        ehCacheManagerFactoryBean
    }

Note that I explicitly set the CacheManager name.
This is not required, but this name is used as part of the MBean name, and the default one contains a hashCode value, which is not very pleasant. The final touch is to enable statistics on a per-cache basis. Now we can happily monitor various caching characteristics of every cache separately. As we can see, the percentage of cache misses increases. Never a good thing. Even if we don't enable cache statistics, enabling JMX is still a good idea since we get a lot of management operations for free, including flushing and clearing caches (useful during debugging and testing).

Quartz scheduler

In my humble opinion the Quartz scheduler is a very underestimated library, but I will write an article about it on its own. This time we will only learn how to monitor it via JMX. Fortunately it's as simple as adding:

    org.quartz.scheduler.jmx.export=true

to the quartz.properties file. The JMX support in Quartz could have been slightly broader, but one can still query e.g. which jobs are currently running. By the way, the new major version of Quartz (2.x) brings very nice DSL-like support for scheduling:

    val job = newJob(classOf[MyJob])
    val trigger = newTrigger().
        withSchedule(
            repeatSecondlyForever()
        ).
        startAt(
            futureDate(30, SECOND)
        )
    scheduler.scheduleJob(job.build(), trigger.build())

Apache Commons DBCP

Apache Commons DBCP is the most reasonable JDBC pooling library I have come across. There is also c3p0, but it doesn't seem to be actively developed any more. Tomcat JDBC Connection Pool looked promising, but since it's bundled in Tomcat, your JDBC drivers can no longer be packaged in the WAR. The only problem with DBCP is that it does not support JMX. At all (see this two and a half year old issue). Fortunately this can be easily worked around. Besides, we will learn how to use Spring's built-in JMX support. It looks like the standard BasicDataSource has all we need; all we have to do is expose the existing metrics via JMX.
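For comparison, this is roughly what exposing a few metrics looks like with nothing but the JDK's javax.management API. It is a minimal sketch rather than anything from DBCP or Spring: the names JmxSketch, PoolStats, PoolStatsMBean and the object name example:type=PoolStats are hypothetical and chosen only for illustration. It relies on the Standard MBean convention that a class Foo is managed through a public interface named FooMBean.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxSketch {

    // Management interface: must be public and named <ImplClass>MBean.
    // (Hypothetical names, not part of DBCP.)
    public interface PoolStatsMBean {
        int getNumActive();
        int getNumIdle();
        int getNumOpen();
    }

    // Plain object holding the numbers we want to see in a JMX client.
    public static class PoolStats implements PoolStatsMBean {
        private final int numActive;
        private final int numIdle;

        public PoolStats(int numActive, int numIdle) {
            this.numActive = numActive;
            this.numIdle = numIdle;
        }

        @Override public int getNumActive() { return numActive; }
        @Override public int getNumIdle() { return numIdle; }
        @Override public int getNumOpen() { return numActive + numIdle; }
    }

    // Registers the bean with the platform MBeanServer, then reads the
    // derived "NumOpen" attribute back through JMX, as a client would.
    static int registerAndReadNumOpen(int active, int idle) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("example:type=PoolStats");
            if (server.isRegistered(name)) {
                server.unregisterMBean(name); // allow repeated runs in one JVM
            }
            server.registerMBean(new PoolStats(active, idle), name);
            return (Integer) server.getAttribute(name, "NumOpen");
        } catch (Exception e) {
            throw new IllegalStateException("JMX registration failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("NumOpen via JMX: " + registerAndReadNumOpen(3, 2));
    }
}
```

The interface-per-class boilerplate is the part that an annotation-driven exporter saves you from writing.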
With Spring it is dead-simple: just subclass BasicDataSource and add the @ManagedAttribute annotation over the desired attributes:

    @ManagedResource
    class ManagedBasicDataSource extends BasicDataSource {

        @ManagedAttribute override def getNumActive = super.getNumActive
        @ManagedAttribute override def getNumIdle = super.getNumIdle
        @ManagedAttribute def getNumOpen = getNumActive + getNumIdle

        @ManagedAttribute override def getMaxActive: Int= super.getMaxActive
        @ManagedAttribute override def setMaxActive(maxActive: Int) {
            super.setMaxActive(maxActive)
        }

        @ManagedAttribute override def getMaxIdle = super.getMaxIdle
        @ManagedAttribute override def setMaxIdle(maxIdle: Int) {
            super.setMaxIdle(maxIdle)
        }

        @ManagedAttribute override def getMinIdle = super.getMinIdle
        @ManagedAttribute override def setMinIdle(minIdle: Int) {
            super.setMinIdle(minIdle)
        }

        @ManagedAttribute override def getMaxWait = super.getMaxWait
        @ManagedAttribute override def setMaxWait(maxWait: Long) {
            super.setMaxWait(maxWait)
        }

        @ManagedAttribute override def getUrl = super.getUrl
        @ManagedAttribute override def getUsername = super.getUsername
    }

Here are a few data source metrics going crazy during a load test.

JMX support in the Spring framework itself is pretty simple. As you have seen above, exposing an arbitrary attribute or operation is just a matter of adding an annotation. You only have to remember to enable JMX support using either XML or Java (also see: SPR-8943 : Annotation equivalent to with @Configuration):

    @Bean
    def annotationMBeanExporter() = new AnnotationMBeanExporter()

This article wasn't particularly exciting. However, the knowledge of JMX metrics will enable us to write simple yet fancy dashboards in no time. Stay tuned!

From http://nurkiewicz.blogspot.com/2011/12/enabling-jmx-in-hibernate-ehcache-qurtz.html

December 22, 2011

by Tomasz Nurkiewicz


Zero Downtime – What is it and why is it important?

For most large web applications, uptime is of foremost importance. Customers can see any outage as a frustration, or as an opportunity to move to a competitor. What's more, for a site that also includes e-commerce, it can mean real lost sales.

Zero Downtime describes a site without service interruption. To achieve such lofty goals, redundancy becomes a critical requirement at every level of your infrastructure. If you're using cloud hosting, are you redundant to alternate availability zones and regions? Are you using geographically distributed load balancing? Do you have multiple clustered databases on the backend, and multiple load-balanced webservers?

All of these requirements will increase uptime, but may not bring you close to zero downtime. For that you'll need thorough testing. The solution is to pull the trigger on sections of your infrastructure, and prove that they fail over quickly without noticeable outage. The ultimate test is the outage itself.

Sean Hull on Quora: What is zero downtime and why is it important?

Source: http://www.iheavy.com/2011/06/23/zero-downtime-what-is-it-and-why-is-it-important/

November 23, 2011

by Sean Hull


Mocking JMS infrastructure with MockRunner to favour testing

This article shows *one* way to mock the JMS infrastructure in a Spring JMS application. This allows us to test our JMS infrastructure without actually having to depend on a physical connection being available. If you are reading this article, chances are that you are also frustrated with failing tests in your continuous integration environment due to a JMS server being (temporarily) unavailable. By mocking the JMS provider, developers are left free to test not only the functionality of their API (unit tests) but also the plumbing of the different components, e.g. in a Spring container.

In this article I show how a Spring JMS Hello World application can be fully tested without the need of a physical JMS connection. I would like to stress the fact that the code in this article is by no means meant for production and that the approach shown is just one of many.

The infrastructure

For this article I use the following infrastructure:

- Apache ActiveMQ, an open source JMS provider, running on an Ubuntu installation
- Spring 3
- Java 6
- MockRunner
- Eclipse as development environment, running on Windows 7

The Spring configuration

It's my belief that using what I define as the Spring Configuration Strategy Pattern (SCSP) is the right solution in almost all cases when there is the need for a sound testing infrastructure. I will dedicate an entire article to SCSP; for now this is how it looks:

The Spring application context

Here follows the content of jemosJms-appContext.xml

The only important thing to note here is that there are some services which rely on an existing bean named jmsConnectionFactory, but that such a bean is not defined in this file. This is key to the SCSP and I will illustrate this in one of my future articles.
The Spring application context implementation

Here follows the content of jemosJms-appContextImpl.xml, which could be seen as an implementation of the Spring application context defined above

This Spring context file imports the Spring application context defined above, and it is this application context which declares the connection factory. This decoupling of the bean requirement (in the super context) from its actual declaration (the Spring application context implementation) represents the cornerstone of SCSP.

Mocking the JMS provider - The Spring Test application context and MockRunner

Following the same approach I used above, I can now declare a fake connection factory which does not require a physical connection to a JMS provider. Here follows the content of jemosJmsTest-appContext.xml. Please note that this file should reside in the test resources of your project, i.e. it should never make it to production.

Here the Spring test application context file imports the Spring application context (not its implementation) and declares a fake connection factory, thanks to the MockRunner MockQueueConnectionFactory class.

A POJO listener

The job of handling the message is delegated to a simple POJO, which happens to be declared also as a bean:

    package uk.co.jemos.experiments;

    public class HelloWorldHandler {

        /** The application logger */
        private static final org.apache.log4j.Logger LOG = org.apache.log4j.Logger
                .getLogger(HelloWorldHandler.class);

        public void handleHelloWorld(String msg) {
            LOG.info("Received message: " + msg);
        }
    }

There is nothing glamorous about this class. In real life this should probably have been the implementation of an interface, but here I wanted to keep things simple.
A simple JMS message producer

Here follows an example of a JMS message producer, which would use the real JMS infrastructure to send messages:

    package uk.co.jemos.experiments;

    import org.springframework.context.ApplicationContext;
    import org.springframework.context.support.ClassPathXmlApplicationContext;
    import org.springframework.jms.core.JmsTemplate;

    public class JmsTest {

        /** The application logger */
        private static final org.apache.log4j.Logger LOG = org.apache.log4j.Logger
                .getLogger(JmsTest.class);

        /**
         * @param args
         */
        public static void main(String[] args) {
            ApplicationContext ctx = new ClassPathXmlApplicationContext(
                    "classpath:jemosJms-appContextImpl.xml");
            JmsTemplate jmsTemplate = ctx.getBean(JmsTemplate.class);
            jmsTemplate.send("jemos.tests", new HelloWorldMessageCreator());
            LOG.info("Message sent successfully");
        }
    }

The only thing of interest here is that this class retrieves the real JmsTemplate to send a message to the queue. Now if I were to run this class as is, I would obtain the following:

    2011-07-31 17:09:46 ClassPathXmlApplicationContext [INFO] Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@19e0ff2f: startup date [Sun Jul 31 17:09:46 BST 2011]; root of context hierarchy
    2011-07-31 17:09:46 XmlBeanDefinitionReader [INFO] Loading XML bean definitions from class path resource [jemosJms-appContextImpl.xml]
    2011-07-31 17:09:46 XmlBeanDefinitionReader [INFO] Loading XML bean definitions from class path resource [jemosJms-appContext.xml]
    2011-07-31 17:09:46 DefaultListableBeanFactory [INFO] Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@3479e304: defining beans [helloWorldConsumer,jmsTemplate,org.springframework.jms.listener.DefaultMessageListenerContainer#0,jmsConnectionFactory]; root of factory hierarchy
    2011-07-31 17:09:46 DefaultLifecycleProcessor [INFO] Starting beans in phase 2147483647
    2011-07-31 17:09:47 HelloWorldHandler [INFO] Received message: Hello World
    2011-07-31 17:09:47 JmsTest [INFO] Message sent successfully

Writing the integration test

There are various interpretations as to what different types of tests mean and I don't pretend to have the only answer; my interpretation is that an integration test is a functional test which also wires up different components together but which does not interact with real external infrastructure (e.g. a Dao integration test fakes data, a JMS integration test fakes the JMS physical connection, an HTTP integration test fakes the remote Web host, etc). Whereas, in my opinion, the main purpose of a unit (aka functional) test is to let the API emerge from the tests, the main goal of an integration test is to test that the plumbing amongst components works as expected, so as to avoid surprises in a production environment.

Both unit (functional) and integration tests should run very fast (e.g. under 10 minutes) as they constitute what can be considered the "development token". If unit and integration tests are green one should feel pretty confident that 90% of the functionality works as expected; in my projects, when both unit and integration tests are green, I let developers release the token. This does not mean that the other 10% (e.g. the interaction with the real infrastructure) should not be tested, but this can be delegated to system tests which run nightly and don't require the development token. Because unit and integration tests need to run fast, interaction with external infrastructure should be mocked whenever possible.
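The principle of faking the transport can be shown without any JMS library at all. The following is a hand-rolled sketch and deliberately not MockRunner's API: MessageChannel, InMemoryChannel, GreetingService and FakeChannelDemo are hypothetical names introduced only for this illustration. The component under test depends on a small interface, so a test can wire in an in-memory queue where production would wire in the real broker.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical abstraction over the messaging transport
// (stands in for the JMS plumbing in this sketch).
interface MessageChannel {
    void send(String destination, String text);

    /** Returns the oldest queued message, or null if none is waiting. */
    String receive(String destination);
}

// Test double: messages never leave the JVM, so no broker is required.
class InMemoryChannel implements MessageChannel {
    private final Map<String, Queue<String>> queues = new HashMap<>();

    @Override
    public void send(String destination, String text) {
        queues.computeIfAbsent(destination, d -> new ArrayDeque<>()).add(text);
    }

    @Override
    public String receive(String destination) {
        Queue<String> queue = queues.get(destination);
        return queue == null ? null : queue.poll();
    }
}

// Component under test: it only sees the interface, not the implementation.
class GreetingService {
    private final MessageChannel channel;

    GreetingService(MessageChannel channel) {
        this.channel = channel;
    }

    void greet() {
        channel.send("jemos.tests", "Hello World");
    }
}

class FakeChannelDemo {
    // Sends through the service and reads the message back from the fake.
    static String roundTrip() {
        MessageChannel channel = new InMemoryChannel();
        new GreetingService(channel).greet();
        return channel.receive("jemos.tests");
    }

    public static void main(String[] args) {
        System.out.println("Received message: " + roundTrip());
    }
}
```

A library such as MockRunner saves you from writing the fake yourself and lets the unmodified JmsTemplate talk to it, which is what the test that follows relies on.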
Here follows an integration test for the Hello World handler:

    package uk.co.jemos.experiments.test.integration;

    import javax.annotation.Resource;
    import javax.jms.TextMessage;

    import junit.framework.Assert;

    import org.junit.Before;
    import org.junit.Test;
    import org.springframework.jms.core.JmsTemplate;
    import org.springframework.test.context.ContextConfiguration;
    import org.springframework.test.context.junit4.AbstractJUnit4SpringContextTests;

    import uk.co.jemos.experiments.HelloWorldHandler;
    import uk.co.jemos.experiments.HelloWorldMessageCreator;

    import com.mockrunner.jms.DestinationManager;
    import com.mockrunner.mock.jms.MockQueue;

    /**
     * @author mtedone
     *
     */
    @ContextConfiguration(locations = { "classpath:jemosJmsTest-appContextImpl.xml" })
    public class HelloWorldHandlerIntegrationTest extends AbstractJUnit4SpringContextTests {

        @Resource
        private JmsTemplate jmsTemplate;

        @Resource
        private DestinationManager mockDestinationManager;

        @Resource
        private HelloWorldHandler helloWorldHandler;

        @Before
        public void init() {
            Assert.assertNotNull(jmsTemplate);
            Assert.assertNotNull(mockDestinationManager);
            Assert.assertNotNull(helloWorldHandler);
        }

        @Test
        public void helloWorld() throws Exception {
            MockQueue mockQueue = mockDestinationManager.createQueue("jemos.tests");
            jmsTemplate.send(mockQueue, new HelloWorldMessageCreator());
            TextMessage message = (TextMessage) jmsTemplate.receive(mockQueue);
            Assert.assertNotNull("The text message cannot be null!", message.getText());
            helloWorldHandler.handleHelloWorld(message.getText());
        }
    }

And here follows the output:

    2011-07-31 17:17:26 XmlBeanDefinitionReader [INFO] Loading XML bean definitions from class path resource [jemosJmsTest-appContextImpl.xml]
    2011-07-31 17:17:26 XmlBeanDefinitionReader [INFO] Loading XML bean definitions from class path resource [jemosJms-appContext.xml]
    2011-07-31 17:17:26 GenericApplicationContext [INFO] Refreshing org.springframework.context.support.GenericApplicationContext@f01a1e: startup date [Sun Jul 31 17:17:26 BST 2011]; root of context hierarchy
    2011-07-31 17:17:27 DefaultListableBeanFactory [INFO] Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@39478a43: defining beans [helloWorldConsumer,jmsTemplate,org.springframework.jms.listener.DefaultMessageListenerContainer#0,destinationManager,configurationManager,jmsConnectionFactory,org.springframework.context.annotation.internalConfigurationAnnotationProcessor,org.springframework.context.annotation.internalAutowiredAnnotationProcessor,org.springframework.context.annotation.internalRequiredAnnotationProcessor,org.springframework.context.annotation.internalCommonAnnotationProcessor]; root of factory hierarchy
    2011-07-31 17:17:27 DefaultLifecycleProcessor [INFO] Starting beans in phase 2147483647
    2011-07-31 17:17:27 HelloWorldHandler [INFO] Received message: Hello World
    2011-07-31 17:17:27 GenericApplicationContext [INFO] Closing org.springframework.context.support.GenericApplicationContext@f01a1e: startup date [Sun Jul 31 17:17:26 BST 2011]; root of context hierarchy
    2011-07-31 17:17:27 DefaultLifecycleProcessor [INFO] Stopping beans in phase 2147483647
    2011-07-31 17:17:32 DefaultMessageListenerContainer [WARN] Setup of JMS message listener invoker failed for destination 'jemos.tests' - trying to recover. Cause: Queue with name jemos.tests not found
    2011-07-31 17:17:32 DefaultListableBeanFactory [INFO] Destroying singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@39478a43: defining beans [helloWorldConsumer,jmsTemplate,org.springframework.jms.listener.DefaultMessageListenerContainer#0,destinationManager,configurationManager,jmsConnectionFactory,org.springframework.context.annotation.internalConfigurationAnnotationProcessor,org.springframework.context.annotation.internalAutowiredAnnotationProcessor,org.springframework.context.annotation.internalRequiredAnnotationProcessor,org.springframework.context.annotation.internalCommonAnnotationProcessor]; root of factory hierarchy

In this test, although we simulated a message roundtrip to a JMS queue, the message never left the current JVM and the whole execution did not depend on a JMS infrastructure being up. This gives us the power to simulate the JMS infrastructure and to test the integration of our business components without having to fear a red build from time to time due to the JMS infrastructure being down or inaccessible. Please note that in the output there are some warnings because the JMS listener container declared in jemosJms-appContext.xml does not find a queue named "jemos.tests" in the fake connection factory, but this is fine; it's a warning and does not impede the test from running successfully.

The Maven configuration

Here follows the content of the Maven pom.xml used to compile the example (model version 4.0.0), listed as groupId:artifactId:version with the scope where one is given:

    Project: uk.co.jemos.experiments:jmx-experiments:0.0.1-SNAPSHOT
    Name:    Jemos JMS experiments

    Dependencies:
    junit:junit:4.8.2 (test)
    com.mockrunner:mockrunner:0.3.1 (test)
    log4j:log4j:1.2.16 (compile)
    org.slf4j:slf4j-api:1.6.1 (compile)
    org.slf4j:slf4j-simple:1.6.1 (compile)
    org.apache.activemq:activemq-all:5.5.0 (compile)
    org.springframework:spring-beans:3.0.5.RELEASE
    org.springframework:spring-context:3.0.5.RELEASE
    org.springframework:spring-core:3.0.5.RELEASE
    org.springframework:spring-jms:3.0.5.RELEASE
    org.springframework:spring-test:3.0.5.RELEASE (test)

From http://tedone.typepad.com/blog/2011/07/mocking-spring-jms-with-mockrunner.html

August 5, 2011

by Marco Tedone


Infrastructure Provisioning – What is it and why is it important?

In the old days... You would have a closet in your startup company with a rack of computers. Provisioning involved:

1. Deciding on your architectural direction: what, where & how
2. Ordering the new hardware
3. Waiting weeks for the packages to arrive
4. Setting up the hardware, wiring things together, powering up
5. Discovering some component is missing or has failed, and ordering a replacement
6. Waiting longer...
7. Finally getting all the pieces set up
8. Configuring software components and going live

Along came some industrious folks who realized power and data to your physical location weren't reliable. So datacenters sprang up. With datacenters, most of the above steps didn't change, except that between steps 3 & 4 you would send your engineers out to the datacenter location. Trips back and forth ate up time and energy.

Then along came managed hosting. Managed hosting saved companies a lot of headaches, wasted man hours, and other resources. It allowed your company to spend more time on what it does well, running the business, and less on managing hardware and infrastructure. Provisioning now became:

1. Decide on architecture direction
2. Call hosting provider and talk to sales person
3. Wait a day or two
4. Set up & configure software components and go

Obviously this new state of affairs improved infrastructure provisioning dramatically. It simplified the process and sped it up as well. What's more, a managed hosting provider could keep spare parts and standard components on hand in much greater volume than a small firm. That's a big plus. This evolution continued because it was a win-win for everyone. The only downside was when engineers made mistakes, and finger pointing began. But despite all of that, a managed hosting provider which does only that can do it better, and more reliably, than you can yourself.

So where are we in the present day? We are all either doing, or looking at, cloud provisioning of infrastructure. What's cloud provisioning? It is a complete paradigm shift, but along the same trajectory as what we've described above.
Now you have removed all the waiting. No waiting for the sales team, or the ordering process. That's automatic. No waiting for engineers to set up the servers; they're already set up. They are allocated by your software and scripts. Even the setup and configuration of software components, the operating system and the services to run on that server are all automatic.

This is such a dramatic shift that we are still feeling the effects of it. Traditional operations teams have little experience with this arrangement, and perhaps little trust in virtual servers. Business units are also not used to handing the trigger for infrastructure spending over to ops teams or to scripts and software. However, huge economic pressures, as well as new operational flexibility, continue to push firms to this new model. Gartner predicts this trend will only continue.

The advantages of cloud infrastructure provisioning include:

- Metered payment - no huge outlay of cash for new infrastructure
- Infrastructure as a service - scripted components automate & reduce manual processes
- Devops - manage infrastructure like code, with version control and reproducibility
- Take unused capacity offline easily & save on those costs
- Disaster recovery is free - reuse scripts to build standard components
- Easily meet seasonal traffic requirements - spin up additional servers instantly

On Quora Sean Hull asks - What is infrastructure provisioning and why is it important?

July 11, 2011

by Sean Hull


REST API: for Infrastructure, Domain or Application Layer?

It seems that lots of projects/products/services want to expose a REST API these days. But I have found very few that actually follow the REST constraints, and in a lot of the cases it doesn't even make sense for them to follow REST constraints in the first place. One of the main constraints that is commonly violated is the hypertext constraint. Basically, all state changes have to be done by following links, starting from a bookmarked URL. But almost no one does that. However, should they? This article will outline the various layers that REST API's can be implemented in, and when it makes sense, and when not.

To begin with, in a typical enterprise app there are three options for layers that you might want to expose using a REST API. These are the infrastructure layer, the domain layer, and the application layer.

Infrastructure layer

If we start with the infrastructure layer, we are typically talking about a database vendor that wants to allow developers to access it using "REST". The API would allow you to create/remove databases, and then insert/update/delete data. Typically it's pretty normal stuff, and the API doesn't change all that much between versions. Accessing this over HTTP maybe makes sense, but is it RESTful?

I'll give you an example. I installed CouchDB, and given the hypermedia constraint I should then be able to go to "http://localhost:5984/", and it will tell me what I can do next (like create a database). But when I do a GET on that URL I get this:

    {"couchdb":"Welcome","version":"1.0.1"}

So now what? The hypermedia doesn't tell me what I can do, so as a REST client I will assume there's nothing I can do. This very simple test shows that the HTTP API for CouchDB isn't really RESTful at all. The question is: should it be? That is obviously up to the developers to decide. But if I were the architect I would maybe say, no, it shouldn't be RESTful. Why?
Because I want to allow URL templates to be used, so that the client, given the server URL and a document id, is allowed to construct a URL on its own and GET the document. If this was truly RESTful the client would have to do a query in a form first, with the id, in order to get the URL of the document to be retrieved. That might be inefficient for a database, so I might opt not to do this. Which is, in effect, what they already have done. The only problem is that they call it RESTful, when it isn't, so it gives me as a developer the wrong impression of what I can expect from it.

This line of reasoning could be applied to pretty much all infrastructure layer API's. They're not RESTful, though many say they are, and most likely they shouldn't try to be! IT'S OK! Just say "Accessible over HTTP, see docs for URL templates and whatnot", and be done with it.

Domain layer

The next potential layer to be exposed over REST is the domain layer. This typically means that you take your domain entities and expose their data straight on the web, through CRUD operations. Very straightforward. There are tons of articles and blogs that show how to do this. But is it RESTful? Or is it even a good idea in the first place?

The first test, again, would be to see if the app follows the hypermedia constraint. In this option it is technically possible to allow queries that will list the various URL's to entities in your domain, which you then can update/delete. So on the surface it might seem like you are following the hypermedia constraint. The problem usually comes with the fact that you are exposing domain state rather than application state.

Let me explain through a simple example. Let's say you are building an issue tracker. You can access individual issues through links like:

    /issue/123

which on GET gives you documents such as:

    {"status":"OPEN","description":"Some issue"}

Awesome. Now a client can change the status to "CLOSED" and PUT that. Tada! Case closed. Or is it?
What if a client then decides to reopen it, by simply posting a new status of "OPEN" to it? Ok, that worked. But should it? Maybe your domain model really would have wanted it to only go to "REOPENED" from the "CLOSED" state. But how do you express that? How is the client to know that this is the only valid transition? And what happens when we have many versions of clients, each of which has a slightly different set of rules for what you are allowed to do when? Basically, chaos is ensured.

And this is the problem with exposing your domain model using a REST API. The client has to own the application logic, and there's no way the server can be sure that it has the "right" logic. And the client, even if it *wants* to play nice (if code ever wants anything is debatable), will have a hard time knowing whether it is playing by the rules or not. It might even get a bit neurotic, trying to do the right thing, whatever that means.

In summary, exposing your domain model does not help the client know what the valid state transitions are, and makes it very hard to do other things like role-based security authorization (maybe only an admin is allowed to REOPEN a CLOSED case?). I would therefore recommend that no one exposes their domain models using a REST API.

Application layer

Finally we come to the application layer. The application layer is designed to implement use cases of the domain model, and has all the context and logic needed to ensure that only valid state transitions are made. In short, it seems especially well suited to being exposed through a REST API, as it can at all points tell the client what it can do (either based on state or authorization rules or any other type of rules it might have). If we go back to the issue tracker, what would this mean in practice?
It could mean that when you do a GET on /issue/123 you get something like this back:

    {"data":{"status":"OPEN","description":"Some issue"},"links":[{"close":"/issue/123/close.json"}]}

Instead of referring to viewing the domain state of an issue, this now refers to the use case of viewing an issue with the intent of working on it. There might be other URLs and other queries that only return the data, or maybe a table of the data, or some such. But this one, specifically, refers to the use case of working with the issue.

So, as a REST client I can now inspect the data, and then look at what links are available. If the client has a UI it can enable a button that says "Close issue" based on the available link, since it detected a link relation "close" that it understands. The client can then do a GET on that link, find out whether the server expects any form to be filled in, and then submit it using POST, thereby letting the server's application layer logic transition the issue to the "CLOSED" state.

We are no longer relying on the client to contain the logic of knowing when to allow what, and the client also does not have to know how to construct the URL. As long as it can parse the hypertext (and we might use a custom JSON media type to indicate what "data" and "links" mean) and do something with it, we're fine. If in the future we change the domain model to also allow the "resolve" link relation for "OPEN" issues, old clients can ignore it, and new clients can enable new actions in the UI that use it.

In summary, the application layer is a very good candidate to be exposed through a REST API. It encapsulates the application rules for when the various state transitions are allowed, and can make use of user authorization to further enable/disable actions.
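To make the issue tracker example a bit more concrete, here is a minimal Python sketch of the idea, not the Streamflow implementation. The `TRANSITIONS` table, the function names, the role names, and the `/issue/123/reopen.json` URL are all assumptions made up for illustration; only the "close" link and the document shape come from the example above.

```python
import json

# Hypothetical rules table: for each state, the link relations the server
# offers and the roles allowed to follow them. The admin-only "reopen" rule
# is an assumption based on the role-based example in the text.
TRANSITIONS = {
    "OPEN": [("close", {"user", "admin"})],
    "CLOSED": [("reopen", {"admin"})],
    "REOPENED": [("close", {"user", "admin"})],
}


def issue_resource(issue_id, issue, role):
    """Server side: build the hypermedia document for GET /issue/<id>."""
    links = [
        {rel: f"/issue/{issue_id}/{rel}.json"}
        for rel, roles in TRANSITIONS.get(issue["status"], [])
        if role in roles
    ]
    return {"data": dict(issue), "links": links}


def available_actions(document, understood=("close", "reopen")):
    """Client side: keep only link relations that are present and understood."""
    return [rel for link in document["links"] for rel in link if rel in understood]


issue = {"status": "OPEN", "description": "Some issue"}
doc = issue_resource(123, issue, role="user")
print(json.dumps(doc))
print(available_actions(doc))
```

The client never decides on its own whether "close" is allowed; it only checks whether the server offered the link. A relation it does not recognize, such as a future "resolve", is simply filtered out by `available_actions`.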
Exposing the application layer this way takes away a lot of responsibility from the client, which can now also be "dynamic" in the sense that it can easily react to which state changes are available, simply by checking link availability in the hypermedia returned from the server.

The main issue with exposing the application layer through a REST API is that there are pretty much no available frameworks that help you do all this in an easy way. But this is not REST's "fault", obviously; rather, the "REST" community hasn't yet matured enough to understand what it should and should not do. In the Streamflow project we rolled our own simple framework for doing the above, and I'm very happy with that, but unfortunately most other frameworks seem to be in the "expose your domain model" camp, which means that a lot of this link management is non-trivial to do. This is a fixable situation though.

I hope that this post has somewhat clarified what the issues are with exposing infrastructure and domain models through REST APIs, why it's not really a good idea in the first place, and why exposing the application layer really is the logical and simpler option.

From http://www.jroller.com/rickard/entry/rest_api_for_infrastructure_domain

October 18, 2010

by Rickard Oberg

· 21,342 Views

The Three Pillars of Continuous Integration

Continuous Integration, commonly known as CI, is a process that consists of continuously compiling, testing, inspecting, and deploying source code. In any typical CI environment, this means running a new build every time code changes within a version control repository. Martin Fowler describes CI as:

"A software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly."

While CI is actually a process, the term Continuous Integration is often associated with three important tools in particular. As shown in the image, the three pillars of CI are:

1. A version control repository like Subversion or CVS
2. A CI server such as Hudson or Cruise Control
3. An automated build process like Ant or NAnt

So, let’s look at each of these in detail:

Version Control Repository: Version control repositories, also known as SCM (source code management), play a crucial role in any software development environment. They also play a very important role in a successful CI process. The SCM is a central place for the team to store every artifact needed for the project. It is mandatory for teams to put everything needed for a successful build into this repository. This includes the build scripts, property files, database scripts, all the libraries required to build the software, and so on.

The CI Server: For CI to function properly, we also need an automated process that monitors a version control repository and runs a build when any changes are detected. There are several CI servers available, both open source and commercial.
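As a rough illustration of what "monitor the repository and build on change" means, here is a minimal Python sketch of such a polling loop. It is not how Hudson or Cruise Control are implemented. The names `poll_once` and `watch` are made up for this example, and the repository and the build are passed in as plain callables so the sketch runs without a real version control system; in practice they would ask the repository for its latest revision and invoke the build script.

```python
import time
from typing import Callable, Optional


def poll_once(latest_revision: Callable[[], str],
              run_build: Callable[[str], bool],
              last_seen: Optional[str]) -> str:
    """Check the repository once and build only if the revision changed."""
    revision = latest_revision()
    if revision != last_seen:
        passed = run_build(revision)
        # A real CI server would send e-mail or other feedback here.
        print(f"build of {revision}: {'passed' if passed else 'FAILED'}")
    return revision


def watch(latest_revision, run_build, interval_seconds=60.0, max_polls=None):
    """Poll repeatedly; max_polls=None means poll forever."""
    last_seen = None  # nothing built yet, so the first poll triggers a build
    polls = 0
    while max_polls is None or polls < max_polls:
        last_seen = poll_once(latest_revision, run_build, last_seen)
        polls += 1
        time.sleep(interval_seconds)
    return last_seen


# Stand-ins for a real repository and build script (assumptions for the demo).
fake_revisions = iter(["r1", "r1", "r2"])
built_revisions = []


def fake_build(revision: str) -> bool:
    built_revisions.append(revision)
    return True


watch(lambda: next(fake_revisions), fake_build, interval_seconds=0, max_polls=3)
print(built_revisions)
```

The second poll sees the same revision and does nothing, which is the whole point of polling: builds happen only when something was committed.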
Most CI servers are similar in their basic configuration: they monitor a particular version control repository and run builds when any changes are detected. Some of the most commonly used open source CI servers are Cruise Control, Continuum, and Hudson. Hudson is particularly interesting because of its ease of configuration and compelling plug-ins, which make integration with test and static analysis tools much easier.

Automated Build: The process of CI is about building software often, which is accomplished through the use of an automated build. A sturdy build strategy is by far the most important aspect of a successful CI process. In the absence of a solid build that does more than compile your code, CI withers. With automated builds, teams can reliably perform otherwise manual tasks like compilation and testing, and even more interesting things like software inspection and deployment.

Now that we have seen the important tools in our CI process, let’s see what a typical CI scenario looks like for a developer:

1. The CI server is configured to poll the version control repository continuously for changes.
2. A developer commits code to the repository.
3. The CI server detects this change and retrieves the latest code from the repository.
4. This causes the CI server to invoke the build script with the given targets and options.
5. If configured, the CI server sends an e-mail to the specified recipients when a certain important event occurs.
6. The CI server continues to poll for changes.

Why is CI Important?

This is one of the most frequently asked questions, and here are a few points to note about this powerful technique:

Building software often greatly increases the likelihood that you will spot defects early, when they are still relatively manageable.
It extends defect visibility.
CI ensures that you have production-ready software at every change.
CI also reduces the risk of integration issues by building software at every change.
The CI server can also be configured to run continuous inspection, which can assist the development team in finding potential bugs and bad programming practices, automatically check coding standards, and provide valuable feedback on the quality of the code being written.

Over the past several months, I have assisted several companies in implementing CI. There was a little bit of resistance from the developers in the early stages when we implemented continuous feedback. But I have never heard a single negative comment about this approach.

If you already have a version control repository and automated builds, you are very close to the CI process. Download one of the open source CI servers, configure it, and set up a simple project. It should take less than an hour if you have automated build scripts. Then start adding features like code inspections, generated reports, metrics, documentation, and so on. Most important, send continuous feedback to your team.

Give this process a try; you will surely be surprised to see how effective it is. And, as always, share your thoughts, concerns, or questions.

December 15, 2008

by Meera Subbarao

· 23,859 Views

Configuring Logging in JBoss

Learn how to properly configure logs in JBoss.

November 19, 2008

by Meera Subbarao

· 223,085 Views

Patching from Local History

Using patches is a popular way to share changes between teammates or to supply software updates to customers. With IntelliJ IDEA, creating and applying versioned patches is quite simple and intuitive: you can do it from the main Version Control menu, or from the Changes tool window.

However, IntelliJ IDEA suggests an additional way to create and apply your “personal” patches. As we have discussed earlier, numerous changes pass unnoticed by version control systems, because you just do not check in every change you make to your files while working. You know that IntelliJ IDEA keeps your own “personal version control”: the local history. Besides the possibility to roll back to a certain revision, you can also create a patch based on a revision or action, share it with your colleagues, and apply it when necessary.

Local history applies to folders, files, members, and fragments of text, but the technique of creating a patch is the same in all cases. Let’s see how it’s done.

Select a folder in the Project tool window, and choose Local History from its context menu. In the Local History view, right-click the desired revision, and choose Create Patch.

In the dialog box that opens, specify the name and location of the patch file.

An interesting possibility is offered by the Reverse patch checkbox. If you check this option, IntelliJ IDEA will create a patch that rolls back the selected action. For example, if you have created a file, the patch will delete it.

Applying your “personal” patch is done as usual, using the Apply Patch command on the main Version Control menu. If a patch file is stored in the project, you can invoke this command from the context menu of the patch file in the Project tool window.

February 21, 2008

by Irina Megorskaya

· 9,792 Views