The Latest Data Engineering Topics

How to Use NodeManager to Control WebLogic Servers
In my previous post, you saw how to start a WebLogic admin server and multiple managed servers. One downside of those instructions is that the processes run in the foreground and their STDOUT is printed to the terminal. If you intend to run these servers as background services, you might want to try the WebLogic Node Manager and the wlscontrol.sh tool. I will show you here how to get Node Manager started.

The easiest way is still to create the domain directory with the admin server running temporarily, and then create all your servers through the /console application as described in the last post. Once these are created, you can shut down all the processes and start them with Node Manager.

1. cd $WL_HOME/server/bin && startNodeManager.sh &
2. $WL_HOME/common/bin/wlscontrol.sh -d mydomain -r $HOME/domains/mydomain -c -f startWebLogic.sh -s myserver START
3. $WL_HOME/common/bin/wlscontrol.sh -d mydomain -r $HOME/domains/mydomain -c -f startManagedWebLogic.sh -s appserver1 START

The first step starts Node Manager itself. It is recommended that you run it as a full daemon service so that it restarts even after an OS reboot, but for this demo you can simply run it and send it to the background. Using Node Manager we then start the admin server in step 2, and the managed server in step 3. Node Manager can not only start WebLogic servers for you, it can also monitor them and automatically restart them if they are terminated for any reason.

If you want to shut down a server manually, you can do that through Node Manager as well:

$WL_HOME/common/bin/wlscontrol.sh -d mydomain -s appserver1 KILL

Node Manager can also be used to start servers remotely through SSH on multiple machines. Using this tool effectively can help you manage servers across your network. You may read more details here: http://docs.oracle.com/cd/E23943_01/web.1111/e13740/toc.htm

TIP 1: If there is a problem when starting a server, you may want to look into the log files. One log file is the /servers//logs/.out of the server you are trying to start. You can also look into the Node Manager log itself at $WL_HOME/common/nodemanager/nodemanager.log

TIP 2: You can add startup JVM arguments to each server started with Node Manager. Create a file under /servers//data/nodemanager/startup.properties and add this key-value pair: Arguments=-Dmyapp=/foo/bar

TIP 3: If you want to explore the Windows version of Node Manager, you may want to start it without the native library to save yourself some trouble. Try adding NativeVersionEnabled=false to the $WL_HOME/common/nodemanager/nodemanager.properties file.
March 24, 2014
by Zemian Deng
· 13,924 Views
Clearing the Database with Django Commands
In a previous post, I presented a method of loading initial data into a Django database by using a custom management command. An accompanying task is cleaning the database up. Here I want to discuss a few options for doing that. First, some general design notes on Django management commands. If you run manage.py help you’ll see a whole bunch of commands starting with sql. These all share a common idiom – print SQL statements to the standard output. Almost all DB engines have means to pipe commands from the standard input, so this plays great with the Unix philosophy of building pipes of single-task programs. Django even provides a convenient shortcut for us to access the actual DB that’s being used with a given project – the dbshell command. As an example, we have the sqlflush command, which returns a list of the SQL statements required to return all tables in the database to the state they were in just after they were installed. In a simple blog-like application with "post" and "tag" models, it may return something like: $ python manage.py sqlflush BEGIN; DELETE FROM "auth_permission"; DELETE FROM "auth_group"; DELETE FROM "django_content_type"; DELETE FROM "django_session"; DELETE FROM "blogapp_tag"; DELETE FROM "auth_user_groups"; DELETE FROM "auth_group_permissions"; DELETE FROM "auth_user_user_permissions"; DELETE FROM "blogapp_post"; DELETE FROM "blogapp_post_tags"; DELETE FROM "auth_user"; DELETE FROM "django_admin_log"; COMMIT; Note there’s a lot of tables here, because the project also installed the admin and auth applications from django.contrib. We can actually execute these SQL statements, and thus wipe out all the DB tables in our database, by running: $ python manage.py sqlflush | python manage.py dbshell For this particular sequence, since it’s so useful, Django has a special built-in command named flush. But there’s a problem with running flush that may or may not bother you, depending on what your goals are. It wipes out all tables, and this means authentication data as well. So if you’ve created a default admin user when jump-starting the application, you’ll have to re-create it now. Perhaps there’s a more gentle way to delete just your app’s data, without messing with the other apps? Yes. In fact, I’m going to show a number of ways. First, let’s see what other existing management commands have to offer. sqlclear will emit the commands needed to drop all tables in a given app. For example: $ python manage.py sqlclear blogapp BEGIN; DROP TABLE "blogapp_tag"; DROP TABLE "blogapp_post"; DROP TABLE "blogapp_post_tags"; COMMIT; So we can use it to target a specific app, rather than using the kill-all approach of flush. There’s a catch, though. While flush runs delete to wipe all data from the tables, sqlclear removes the actual tables. So in order to be able to work with the database, these tables have to be re-created. Worry not, there’s a command for that: $ python manage.py sql blogapp BEGIN; CREATE TABLE "blogapp_post_tags" ( "id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "post_id" integer NOT NULL REFERENCES "blogapp_post" ("id"), "tag_id" varchar(50) NOT NULL REFERENCES "blogapp_tag" ("name"), UNIQUE ("post_id", "tag_id") ) ; CREATE TABLE "blogapp_post" ( "id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, <.......> ) ; CREATE TABLE "blogapp_tag" ( <.......> ) ; COMMIT; So here’s a first way to do a DB cleanup: pipe sqlclear appname into dbshell. Then pipe sql appname to dbshell. 
An alternative way, which I like less, is to take the subset of DELETE statements generated by sqlflush, save them in a text file, and pipe it through to dbshell when needed. For example, for the blog app discussed above, these statements should do it:

BEGIN;
DELETE FROM "blogapp_tag";
DELETE FROM "blogapp_post";
DELETE FROM "blogapp_post_tags";
COMMIT;

The reason I don’t like it is that it forces you to keep explicit table names stored somewhere, which duplicates the existing models. If you happen to change some of your foreign keys, for example, the tables will change and this file will have to be regenerated.

The approach I like best is more programmatic. Django’s model API is flexible and convenient, and we can just use it in a custom management command:

from django.core.management.base import BaseCommand
from blogapp.models import Post, Tag

class Command(BaseCommand):
    def handle(self, *args, **options):
        Tag.objects.all().delete()
        Post.objects.all().delete()

Save this code as blogapp/management/commands/clear_models.py, and now it can be invoked with:

$ python manage.py clear_models
March 24, 2014
by Eli Bendersky
· 19,025 Views
Estimating Statistics via Bootstrapping and Monte Carlo Simulation
We want to estimate some statistic (e.g., average income, 95th-percentile height, variance of weight, etc.) of a population. It would be too tedious to enumerate all members of the whole population, so for efficiency we randomly pick a number of samples from the population and compute the statistic of the sample set to estimate the corresponding statistic of the population. We understand that an estimate obtained this way (via random sampling) can deviate from the true population value. Therefore, in addition to the estimated statistic, we also report a "standard error" (how far our estimate may deviate from the actual population statistic) or a "confidence interval" (a lower and upper bound of the statistic which we are confident contains the true value).

The challenge is how to estimate the standard error or the confidence interval. A straightforward way is to repeat the sampling exercise many times; each time we create a different sample set from which we compute one estimate. Then we look across the estimates from the different sample sets to estimate the standard error and confidence interval. But what if collecting data for another sample set is expensive, or the population is no longer accessible after we collected our first sample set? Bootstrapping provides a way to address this.

Bootstrapping

Instead of creating additional sample sets from the population, we create additional sample sets by re-sampling data (with replacement) from the original sample set. Each created sample set follows the same data distribution as the original sample set, which in turn follows the population. R provides a nice "boot" library to do this.

> library(boot)
> # Generate a population
> population.weight <- rnorm(100000, 160, 60)
> # Let's say we care about the ninetieth percentile
> quantile(population.weight, 0.9)
     90%
236.8105
> # We create our first sample set of 500 samples
> sample_set1 <- sample(population.weight, 500)
> # Here is our sample statistic of the ninetieth percentile
> quantile(sample_set1, 0.9)
     90%
232.3641
> # Notice that the sample statistic deviates from the population statistic
> # We want to estimate how big this deviation is by using bootstrapping
> # First, define the function that computes the statistic
> ninety_percentile <- function(x, idx) {return(quantile(x[idx], 0.9))}
> # Bootstrapping will call this function many times with different idx
> boot_result <- boot(data=sample_set1, statistic=ninety_percentile, R=1000)
> boot_result

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = sample_set1, statistic = ninety_percentile, R = 1000)

Bootstrap Statistics :
    original     bias    std. error
t1* 232.3641   2.379859     5.43342

> plot(boot_result)
> boot.ci(boot_result, type="bca")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL :
boot.ci(boot.out = boot_result, type = "bca")

Intervals :
Level       BCa
95%   (227.2, 248.1 )
Calculations and Intervals on Original Scale

Here is the visual output of the bootstrap plot.

Bootstrapping is a powerful simulation technique for estimating any statistic in an empirical way. It is also non-parametric, because it doesn't assume any model or parameters and just uses the original sample set to estimate the statistic. If, on the other hand, we assume a certain distribution model and want to see the distribution of a certain statistic under it, Monte Carlo simulation provides a powerful way to do that.
Monte Carlo Simulation

The idea is pretty simple: based on a particular distribution function (defined by specific model parameters), we generate many sets of samples. We compute the statistic of each sample set and see how that statistic is distributed across the different sample sets. For example, given a normally distributed population, what is the probability distribution of the maximum value of 5 randomly chosen samples?

> sample_stats <- rep(0, 1000)
> for (i in 1:1000) {
+     sample_stats[i] <- max(rnorm(5))
+ }
> mean(sample_stats)
[1] 1.153008
> sd(sample_stats)
[1] 0.6584022
> par(mfrow=c(1,2))
> hist(sample_stats, breaks=30)
> qqnorm(sample_stats)
> qqline(sample_stats)

Here is the distribution of the "max(5)" statistic, which shows some right skewness.

Bootstrapping and Monte Carlo simulation are powerful tools for estimating statistics in an empirical manner, especially when we don't have an analytic solution.
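For readers outside R, the same Monte Carlo experiment is easy to reproduce in other languages. Below is a minimal Java sketch (the class and variable names are mine, not from the article) that draws 1,000 sample sets of five standard-normal values, takes the max of each set, and reports the mean and standard deviation of that "max(5)" statistic, mirroring the R code above.

import java.util.Random;

public class MaxOfFiveSimulation {

    public static void main(String[] args) {
        int runs = 1000;        // number of simulated sample sets
        int sampleSize = 5;     // size of each sample set
        Random rng = new Random();
        double[] stats = new double[runs];

        for (int i = 0; i < runs; i++) {
            double max = Double.NEGATIVE_INFINITY;
            for (int j = 0; j < sampleSize; j++) {
                max = Math.max(max, rng.nextGaussian()); // standard normal draw
            }
            stats[i] = max;
        }

        // Mean and (sample) standard deviation of the simulated statistic
        double sum = 0;
        for (double s : stats) sum += s;
        double mean = sum / runs;

        double sq = 0;
        for (double s : stats) sq += (s - mean) * (s - mean);
        double sd = Math.sqrt(sq / (runs - 1));

        System.out.printf("mean=%.4f sd=%.4f%n", mean, sd);
    }
}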
March 21, 2014
by Ricky Ho
· 4,914 Views
WSO2 DSS: Batch Insert Sample (End to End)
WSO2 DSS wraps the data services layer and provides a simple GUI to define a data service with zero Java code. With this, a change to the data source is just a simple click away, and no other party needs to be aware of it. In this sample we will see how to do a batch insert into a table.

Batch insert is useful when you want to insert data in a sequential manner, running the same insert query many times. It also means that if at least one insertion in the batch fails, all the other queries run so far in the batch are rolled back as well: if one insertion fails, the whole batch fails. With batch insert, all the data is sent in one call, which reduces the number of calls you have to make to get the data inserted. It comes with one condition: the query must not produce results back (we will only be notified whether the query was successful or not).

Prerequisites:
WSO2 Data Services Server - http://wso2.com/products/data-services-server/ (current latest 3.1.1)
MySQL connector (JDBC) - https://www.mysql.com/products/connector/

If we already have a data service running which does not send back a result set, then it's just a matter of adding the following property to the service declaration:

enableBatchRequests="true"

Anyway, I will demonstrate creating the service from scratch.

1. Create a service as follows, going through the wizard.
2. Create the data source.
3. Create the query. (This is an insert query. Also note the input mapping we have added as relevant to the query. To learn more about input mapping and using validation, refer to the documentation.)
4. Create the operation. Select the query to be executed once the operation is called. By enabling "return request status", we will be notified whether the operation was a success or not.
5. Try it! When we list the services we will see this new service. On the right there is an option to try it, giving the input parameters. Here I have tried two insertions in a batch.

Now if we go to the XML view of the service, which is saved on the server as a .dbs file, it will contain the data source settings and the query, similar to the following:

com.mysql.jdbc.Driver
jdbc:mysql://localhost:3306/json_array
root
root
1
10
SELECT 1
insert into flights (flight_no, number_of_cases, created_by, description, trips) values (:flight_no,:number_of_cases,:created_by,:description,:trips)

If we click the service name in the list of services, we will be directed to the Service Dashboard, where we can see several other options for the service. It provides the option to generate an Axis2 client for the service. Once we get the client, it's a matter of calling the methods on the stub as follows.
private static BatchRequestSampleOldStub.AddFlight_type0 createFlight(int cases, String creator, String description, int trips) { BatchRequestSampleOldStub.AddFlight_type0 val = new BatchRequestSampleOldStub.AddFlight_type0(); val.setNumber_of_cases(cases); val.setCreated_by(creator); val.setDescription(description); val.setTrips(trips); printFlightInfo(cases, creator, description, trips); return val; } public static void main(String[] args) throws Exception { String epr = "http://localhost:9763" + "/services/BatchInsertSample"; BatchRequestSampleOldStub stub = new BatchRequestSampleOldStub(epr); BatchRequestSampleOldStub.AddFlight_batch_req vals1 = new BatchRequestSampleOldStub.AddFlight_batch_req(); vals1.addAddFlight(createFlight(1, "Pushpalanka", "test", 2)); vals1.addAddFlight(createFlight(2, "Jayawardhana", "test", 2)); vals1.addAddFlight(createFlight(3, "lanka@gmail.com", "test", 2)); try { System.out.println("Executing Add Flights.."); stub.addFlight_batch_req(vals1); } catch (Exception e) { System.out.println("Error in Add Flights!"); } Complete client code can be found here. Cheers! Ref: http://docs.wso2.org/display/DSS311/Batch+Processing+Sample
March 21, 2014
by Pushpalanka Jayawardhana
· 9,852 Views
Grails Goodness: Using Hibernate Native SQL Queries
Sometimes we want to use Hibernate native SQL in our code. For example, we might need to invoke a selectable stored procedure that we cannot invoke in another way. To invoke a native SQL query we use the method createSQLQuery(), which is available from the Hibernate session object. In our Grails code we must first get access to the current Hibernate session. Luckily we only have to inject the sessionFactory bean into our Grails service or controller. To get the current session we invoke the getCurrentSession() method, and we are ready to execute a native SQL query. The query itself is defined as a String value, and we can use placeholders for variables, just like with other Hibernate queries.

In the following sample we create a new Grails service and use a Hibernate native SQL query to execute a selectable stored procedure with the name organisation_breadcrumbs. This stored procedure takes one argument, startId, and returns a list of results with an id, name and level column.

// File: grails-app/services/com/mrhaki/grails/OrganisationService.groovy
package com.mrhaki.grails

import com.mrhaki.grails.Organisation

class OrganisationService {

    // Auto inject SessionFactory we can use
    // to get the current Hibernate session.
    def sessionFactory

    List breadcrumbs(final Long startOrganisationId) {
        // Get the current Hibernate session.
        final session = sessionFactory.currentSession

        // Query string with :startId as parameter placeholder.
        final String query = 'select id, name, level from organisation_breadcrumbs(:startId) order by level desc'

        // Create native SQL query.
        final sqlQuery = session.createSQLQuery(query)

        // Use Groovy with() method to invoke multiple methods
        // on the sqlQuery object.
        final results = sqlQuery.with {
            // Set domain class as entity.
            // Properties in domain class id, name, level will
            // be automatically filled.
            addEntity(Organisation)

            // Set value for parameter startId.
            setLong('startId', startOrganisationId)

            // Get all results.
            list()
        }

        results
    }
}

In the sample code we use the addEntity() method to map the query results to the domain class Organisation. To transform the results of a query to other objects we can use the setResultTransformer() method. Hibernate (and therefore Grails, if we use the Hibernate plugin) already has a set of transformers we can use. For example, with the org.hibernate.transform.AliasToEntityMapResultTransformer each result row is transformed into a Map where the column aliases are the keys of the map.

// File: grails-app/services/com/mrhaki/grails/OrganisationService.groovy
package com.mrhaki.grails

import org.hibernate.transform.AliasToEntityMapResultTransformer

class OrganisationService {

    def sessionFactory

    List<Map> breadcrumbs(final Long startOrganisationId) {
        final session = sessionFactory.currentSession
        final String query = 'select id, name, level from organisation_breadcrumbs(:startId) order by level desc'
        final sqlQuery = session.createSQLQuery(query)

        final results = sqlQuery.with {
            // Assign result transformer.
            // This transformer will map columns to keys in a map for each row.
            resultTransformer = AliasToEntityMapResultTransformer.INSTANCE

            setLong('startId', startOrganisationId)

            list()
        }

        results
    }
}

Finally we can execute a native SQL query and handle the raw results ourselves using the Groovy Collection API enhancements. The result of the list() method is a List of Object[] objects.
In the following sample we use Groovy syntax to handle the results:

// File: grails-app/services/com/mrhaki/grails/OrganisationService.groovy
package com.mrhaki.grails

class OrganisationService {

    def sessionFactory

    List<Map> breadcrumbs(final Long startOrganisationId) {
        final session = sessionFactory.currentSession
        final String query = 'select id, name, level from organisation_breadcrumbs(:startId) order by level desc'
        final sqlQuery = session.createSQLQuery(query)

        final queryResults = sqlQuery.with {
            setLong('startId', startOrganisationId)
            list()
        }

        // Transform resulting rows to a map with key organisationName.
        final results = queryResults.collect { resultRow ->
            [organisationName: resultRow[1]]
        }

        // Or to only get a list of names.
        //final List names = queryResults.collect { it[1] }

        results
    }
}

Code written with Grails 2.3.7.
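For comparison, a similar call can be made from plain Java Hibernate outside Grails. The following is a small sketch under the same assumptions as the article (the organisation_breadcrumbs procedure and its columns); the DAO class itself is hypothetical.

import java.util.List;
import java.util.Map;
import org.hibernate.SQLQuery;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.transform.AliasToEntityMapResultTransformer;

public class OrganisationDao {

    private final SessionFactory sessionFactory;

    public OrganisationDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    @SuppressWarnings("unchecked")
    public List<Map<String, Object>> breadcrumbs(long startOrganisationId) {
        Session session = sessionFactory.getCurrentSession();
        SQLQuery query = session.createSQLQuery(
                "select id, name, level from organisation_breadcrumbs(:startId) order by level desc");
        // Map each row to a Map keyed by column alias, as in the Grails example.
        query.setResultTransformer(AliasToEntityMapResultTransformer.INSTANCE);
        query.setLong("startId", startOrganisationId);
        return query.list();
    }
}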
March 20, 2014
by Hubert Klein Ikkink
· 22,875 Views · 1 Like
Cloud Automation with WinRM vs SSH
[Article originally written by Barak Merimovich.] Automation the Linux Way In the Linux world SSH, secure shell, is the de facto standard for remote connectivity and automation for the purpose of logging into a remote machine to install tools and run commands. It's pretty much ubiquitous, runs across multiple Linux versions and distributions, and every Linux admin worth their salt knows SSH and how to configure it. What's more, it's even the default enabled port on most clouds - port 22. An important feature available with SSH is support for file transfer via its secure copy protocol - AKA SCP, and secure file transfer protocol - AKA SFTP. These are a built-in part of the tool or exist as add-ons to the protocol that are almost always available. Therefore, using SSH for file transfer and remote execution is basically a given with Linux, and there are even tools to support SSH clients available for virtually every major programming language and operating system. WinRM in a Linux World So what comes out-of-the-box with Linux, is less of a given with Windows. SSH, obviously, is not built in with Windows; over the years there have been different protocols attempting to achieve the same functionality, such as Secure Telnet and others, however to date, none have really caught on. From Windows Server 2003, a new tool called WinRM - windows remote management, was introduced. WinRM is a SOAP-based protocol built on web services that among other things, allows you to connect to a remote system, providing a shell, essentially offering similar functionality to SSH. WinRM is currently the Windows world alternative to SSH. The Pros The advantage with WinRM is that you can use a vanilla VM with nothing pre-configured on it, with the only prerequisite being that the WinRM service needs to be running. EC2, the largest cloud provider today, supports this out-of-the-box, so if you want to run a standard Amazon machine image (AMI) for Windows, WinRM is enabled by default. This makes it possible to quickly start working with a cloud, all that needs to be done is bring up a standard Windows VM, and then it's possible to remotely configure it - and start using it. This is very useful in cloud environments where you are sometimes unable to create a custom Windows image or are limited to a very small number of images and want to limit your resource usage. The Challenges Where SSH has become the de facto protocol with Linux, WinRM is far less known tool in the Windows world, although it does offer comparable features as far as security, as well as connecting and executing commands to a remote machine. The standard tool for using WinRM is usually PowerShell, the new Windows shell that is intended to supersede the standard command prompt. To date though, there are still relatively few programming languages with built-in support for WinRM, making automation and remote execution of tasks over WinRM much more complex. To achieve these tasks, Cloudify employs PowerShell itself, as an external process to act as a client library for accessing WinRM. The primary issue with this, however, is that the client-side also needs to be running Windows, as PowerShell cannot run on Linux. Another aspect where WinRM differs from SSH is that it does not really have built-in file transfer. There is no direct equivalent for secure copy in SSH for WinRM. That said, it is possible to implement file transfer through PowerShell scripts. 
There are currently several open source initiatives looking to build a WinRM client for Linux - or specifically for some programming languages, such as Java, however, these are in different levels of maturity, where none of them are fully featured yet. Hence, PowerShell remains the default tool for Cloudify, which essentially provides the same level of functionality you would expect for running remote commands on a Linux machine with Windows. WinRM & Security Another interesting point to consider about WinRM is its support for encryption. WinRM supports three types of transfer protocols, HTTP, HTTPS, and encrypted HTTP. With HTTP, inevitably your wire protocol is unencrypted. It is only a good idea to use HTTP inside your own data center in the event that you are completely convinced that no one can monitor anything going over the wire. HTTPS is commonly used instead of HTTP, however with WinRM there's a chicken and egg issue. If you want to work with HTTPS you are required to set up an SSL certificate on the remote machine. The challenge here is when you're starting with a vanilla Windows VM that will not have the certificate installed, there is a need to automate the insertion of that certificate, however this often cannot be done, as WinRM is not running. Encrypted HTTP, which is also the default in EC2, basically uses your login credentials as your encryption key and it works. From a security perspective this is the recommended secure transfer protocol to use. It is worth noting that most attempts to create a WinRM client library tend to encounter problems around the encrypted HTTP protocol, as implementing MS' encrypted HTTP system - credSSP - is challenging. However, there are various projects working on achieving this, so it will hopefully be solved in the near future. Where Cloudify Comes Into the Mix Where WinRM comes into play with Cloudify, is during the cloud bootstrapping process. By using WinRM Cloudify is able to remotely connect to a vanilla VM provided by the cloud, and set up the Cloudify manager or agent to run on the machine. In addition to traditional cloud environments, WinRM also works on non-cloud and non-virtualized environments, such as a standard data center with multiple Windows servers running. All that needs to be done is provide Cloudify with the credentials, and it will use WinRM to connect and set up the machine remotely. Since WinRM is pre-packaged with Windows, there is no need to install anything. The only thing requirement, as mentioned above, is to have the WinRM service running, as not all Windows images will have this service running. Conclusion In short WinRM is the Window's world alternative to SSHD that allows you to remotely login securely and execute commands on Windows machines. From a cloud automation perspective, it provides virtually all the necessary functionality requirements, and thus it is recommended to have WinRM running in your Windows environment.
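To make the "PowerShell as an external process" approach concrete, here is a rough Java sketch along those lines. It is not Cloudify's implementation; the host name and command are placeholders of mine, and it assumes a Windows client with PowerShell available and WinRM remoting already enabled on the target machine.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class PowerShellRemoteCommand {

    public static void main(String[] args) throws IOException, InterruptedException {
        // Hypothetical target host; Invoke-Command talks to it over WinRM.
        String remoteCommand =
                "Invoke-Command -ComputerName some-windows-host "
              + "-ScriptBlock { Get-Service WinRM }";

        ProcessBuilder pb = new ProcessBuilder(
                "powershell.exe", "-NonInteractive", "-Command", remoteCommand);
        pb.redirectErrorStream(true);

        Process process = pb.start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);   // output of the remote command
            }
        }
        int exitCode = process.waitFor();
        System.out.println("PowerShell exited with code " + exitCode);
    }
}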
March 19, 2014
by Sharone Zitzman
· 25,375 Views
Time Series Feature Design: The Consensus has dRafted a Decision
So, after reaching the conclusion that replication is going to be hard, I went back to the office and discussed those challenges and was in general pretty annoyed by it. Then Michael made a really interesting suggestion. Why not put it on RAFT? And once he explained what he meant, I really couldn’t hold my excitement. We now have a major feature for 4.0. But before I get excited about that (we’ll only be able to actually start working on that in a few months, anyway), let us talk about what the actual suggestion was. Raft is a consensus algorithm. It allows a distributed set of computers to arrive into a mutually agreed upon set of sequential log records. Hm… I wonder where else we can find sequential log records, and yes, I am looking at you Voron.Journal. The basic idea is that we can take the concept of log shipping, but instead of having a single master/slave relationship, we change things so we can put Raft in the middle. When committing a transaction, we’ll hold off committing the transaction until we have a Raft consensus that it should be committed. The advantage here is that we won’t be constrained any longer by the master/slave issue. If there is a server down, we can still process requests (maybe need to elect a new cluster leader, but that is about it). That means that from an architectural standpoint, we’ll have the ability to process write requests for any quorum (N/2+1). That is a pretty standard requirement for distributed databases, so that is perfectly fine. That is a pretty awesome thing to have, to be honest, and more importantly, this is happening at the low level storage layer. That means that we can apply this behavior not just to a single database solution, but to many database solutions. I’m pretty excited about this.
March 19, 2014
by Oren Eini
· 2,001 Views
Change Font Terminal Tool Window in IntelliJ IDEA
IntelliJ IDEA 13 added the Terminal tool window to the IDE. We can open a terminal window with Tools | Open Terminal.... To change the font of the terminal we must open the preferences and select IDE Settings | Editor | Colors & Fonts | Console Font. Here we can choose a font and change the font size:
March 18, 2014
by Hubert Klein Ikkink
· 35,460 Views · 1 Like
Shrink Your Time Machine Backups and Free Disk Space
Time Machine is a backup and restore tool from Apple which is very well integrated into OS X. In my personal opinion Time Machine is not yet awesome.
March 18, 2014
by Enrico Maria Crisostomo
· 160,851 Views · 1 Like
ActiveMQ - Network of Brokers Explained
Objective

This 7-part blog series shares how to create a network of ActiveMQ brokers in order to achieve high availability and scalability.

Why a network of brokers?

The ActiveMQ message broker is a core component of the messaging infrastructure in an enterprise. It needs to be highly available and dynamically scalable to facilitate communication between dynamic, heterogeneous, distributed applications that have varying capacity needs. Scaling enterprise applications on commodity hardware is all the rage nowadays, and ActiveMQ caters to that very well by being able to create a network of brokers to share the load.

Many times, applications running across geographically distributed data centers need to coordinate messages. Running message producers and consumers across geographic regions/data centers can be architected better using a network of brokers.

ActiveMQ uses transport connectors over which it communicates with message producers and consumers. However, in order to facilitate broker-to-broker communication, ActiveMQ uses network connectors. A network connector is a bridge between two brokers which allows on-demand message forwarding. In other words, if Broker B1 initiates a network connector to Broker B2, then messages on a channel (queue/topic) on B1 get forwarded to B2 if there is at least one consumer on B2 for the same channel. If the network connector is configured to be duplex, messages get forwarded from B2 to B1 on demand as well. This is very interesting because it makes it possible for brokers to communicate with each other dynamically.

In this 7-part blog series, we will look into the following topics to gain an understanding of this very powerful ActiveMQ feature:

Network Connector Basics - Part 1
Duplex network connectors - Part 2
Load balancing consumers on local/remote brokers - Part 3
Load-balance consumers/subscribers on remote brokers
    Queue: Load balance remote concurrent consumers - Part 4
    Topic: Load Balance Durable Subscriptions on Remote Brokers - Part 5
Store/Forward messages and consumer failover - Part 6
    How to prevent stuck messages
Virtual Destinations - Part 7

To give credit where it is due, the following resources have helped me in creating this blog post series:

Advanced Messaging with ActiveMQ by Dejan Bosanac [Slides 32-36]
Understanding ActiveMQ Broker Networks by Jakub Korab

Prerequisites

ActiveMQ 5.8.0 – To create broker instances
Apache Ant – To run the ActiveMQ sample producer and consumers for the demo

We will use multiple ActiveMQ broker instances on the same machine for ease of demonstration.

Network Connector Basics - Part 1

The following diagram shows how a network connector functions. It bridges two brokers and is used to forward messages from Broker-1 to Broker-2 on demand, if the connector was established by Broker-1 to Broker-2. A network connector can be duplex, so messages could also be forwarded in the opposite direction (from Broker-2 to Broker-1) once there is a consumer on Broker-1 for a channel which exists on Broker-2. More on this in Part 2.

Setup network connector between broker-1 and broker-2

Create two broker instances, say broker-1 and broker-2:

Ashwinis-MacBook-Pro:bin akuntamukkala$ pwd
/Users/akuntamukkala/apache-activemq-5.8.0/bin
Ashwinis-MacBook-Pro:bin akuntamukkala$ ./activemq-admin create ../bridge-demo/broker-1
Ashwinis-MacBook-Pro:bin akuntamukkala$ ./activemq-admin create ../bridge-demo/broker-2

Since we will be running both brokers on the same machine, let's configure broker-2 such that there are no port conflicts.
Edit /Users/akuntamukkala/apache-activemq-5.8.0/bridge-demo/broker-2/conf/activemq.xml Change transport connector to 61626 from 61616 Change AMQP port from 5672 to 6672 (won't be using it for this blog) Edit /Users/akuntamukkala/apache-activemq-5.8.0/bridge-demo/broker-2/conf/jetty.xml Change web console port to 9161 from 8161 Configure Network Connector from broker-1 to broker-2 Add the following XML snippet to/Users/akuntamukkala/apache-activemq-5.8.0/bridge-demo/broker-1/conf/activemq.xml The above XML snippet configures two network connectors "T:broker1->broker2" (only topics as queues are excluded) and "Q:broker1->broker2" (only queues as topics are excluded). This allows for nice separation between network connectors used for topics and queues. The name can be arbitrary although I prefer to specify the [type]:[source broker]->[destination broker]. The URI attribute specifies how to connect to broker-2 Start broker-2 Ashwinis-MacBook-Pro:bin akuntamukkala$ pwd /Users/akuntamukkala/apache-activemq-5.8.0/bridge-demo/broker-2/bin Ashwinis-MacBook-Pro:bin akuntamukkala$ ./broker-2 console Start broker-1 Ashwinis-MacBook-Pro:bin akuntamukkala$ pwd /Users/akuntamukkala/apache-activemq-5.8.0/bridge-demo/broker-1/bin Ashwinis-MacBook-Pro:bin akuntamukkala$ ./broker-1 console Logs on broker-1 show 2 network connectors being established with broker-2 INFO | Establishing network connection from vm://broker-1?async=false&network=true to tcp://localhost:61626 INFO | Connector vm://broker-1 Started INFO | Establishing network connection from vm://broker-1?async=false&network=true to tcp://localhost:61626 INFO | Network connection between vm://broker-1#24 and tcp://localhost/127.0.0.1:61626@52132(broker-2) has been established. INFO | Network connection between vm://broker-1#26 and tcp://localhost/127.0.0.1:61626@52133(broker-2) has been established. Web Console on broker-1 @ http://localhost:8161/admin/connections.jsp shows the two network connectors established to broker-2 The same on broker-2 does not show any network connectors since no network connectors were initiated by broker-2 Let's see this in action Let's produce 100 persistent messages on a queue called "foo.bar" on broker-1. Ashwinis-MacBook-Pro:example akuntamukkala$ pwd /Users/akuntamukkala/apache-activemq-5.8.0/example Ashwinis-MacBook-Pro:example akuntamukkala$ ant producer -Durl=tcp://localhost:61616 -Dtopic=false -Ddurable=true -Dsubject=foo.bar -Dmax=100 broker-1 web console shows that 100 messages have been enqueued in queue "foo.bar" http://localhost:8161/admin/queues.jsp Let's start a consumer on a queue called "foo.bar" on broker-2. The important thing to note here is that the destination name "foo.bar" should match exactly. Ashwinis-MacBook-Pro:example akuntamukkala$ ant consumer -Durl=tcp://localhost:61626 -Dtopic=false -Dsubject=foo.bar We find that all the 100 messages from broker-1's foo.bar queue get forwarded to broker-2's foo.bar queue consumer. broker-1 admin console at http://localhost:8161/admin/queues.jsp broker-2 admin console @ http://localhost:9161/admin/queues.jspshows that the consumer we had started has consumed all 100 messages which were forwarded on-demand from broker-1 broker-2 consumer details on foo.bar queue broker-1 admin console shows that all 100 messages have been dequeued [forwarded to broker-2 via the network connector]. 
The broker-1 consumer details on the "foo.bar" queue show that the consumer is created on demand and named [name of connector]_[destination broker]_inbound_[source broker]. Thus we have seen the basics of the network connector in ActiveMQ. As always, please feel free to comment about anything that can be improved. Your inputs are welcome! Stay tuned for Part 2.
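The activemq.xml network-connector snippet referenced above did not survive the formatting of this page. As a hedged illustration of the same idea, the bridge can also be set up through ActiveMQ's Java broker API; the sketch below follows the article's broker names and ports and mirrors the queue-only "Q:broker1->broker2" connector, but it is my own example rather than the author's configuration.

import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.command.ActiveMQTopic;
import org.apache.activemq.network.NetworkConnector;

public class Broker1WithBridge {

    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName("broker-1");
        broker.addConnector("tcp://localhost:61616");   // transport connector for clients

        // Queue-only network connector to broker-2, like the "Q:broker1->broker2" example.
        NetworkConnector queues =
                broker.addNetworkConnector("static:(tcp://localhost:61626)");
        queues.setName("Q:broker1->broker2");
        queues.addExcludedDestination(new ActiveMQTopic(">"));  // exclude all topics

        broker.start();
        broker.waitUntilStopped();
    }
}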
March 12, 2014
by Ashwini Kuntamukkala
· 39,579 Views · 2 Likes
Exporting Spring Data JPA Repositories as REST Services using Spring Data REST
Spring Data provides various modules to work with different types of data sources, such as RDBMS and NoSQL stores, in a unified way. In my previous article, SpringMVC4 + Spring Data JPA + SpringSecurity configuration using JavaConfig, I explained how to configure Spring Data JPA using JavaConfig. In this post, let us see how we can use Spring Data JPA repositories and export JPA entities as REST endpoints using Spring Data REST.

First, let us configure the spring-data-jpa and spring-data-rest-webmvc dependencies in our pom.xml:

<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-jpa</artifactId>
    <version>1.5.0.RELEASE</version>
</dependency>
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-rest-webmvc</artifactId>
    <version>2.0.0.RELEASE</version>
</dependency>

Make sure you have the latest released versions configured correctly; otherwise you will encounter the following error:

java.lang.ClassNotFoundException: org.springframework.data.mapping.SimplePropertyHandler

Create the JPA entities:

@Entity
@Table(name = "USERS")
public class User implements Serializable {
    private static final long serialVersionUID = 1L;
    @Id @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "user_id")
    private Integer id;
    @Column(name = "username", nullable = false, unique = true, length = 50)
    private String userName;
    @Column(name = "password", nullable = false, length = 50)
    private String password;
    @Column(name = "firstname", nullable = false, length = 50)
    private String firstName;
    @Column(name = "lastname", length = 50)
    private String lastName;
    @Column(name = "email", nullable = false, unique = true, length = 50)
    private String email;
    @Temporal(TemporalType.DATE)
    private Date dob;
    private boolean enabled = true;
    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL)
    @JoinColumn(name = "user_id")
    private Set<Role> roles = new HashSet<>();
    @OneToMany(mappedBy = "user")
    private List<Contact> contacts = new ArrayList<>();
    //setters and getters
}

@Entity
@Table(name = "ROLES")
public class Role implements Serializable {
    private static final long serialVersionUID = 1L;
    @Id @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "role_id")
    private Integer id;
    @Column(name = "role_name", nullable = false)
    private String roleName;
    //setters and getters
}

@Entity
@Table(name = "CONTACTS")
public class Contact implements Serializable {
    private static final long serialVersionUID = 1L;
    @Id @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "contact_id")
    private Integer id;
    @Column(name = "firstname", nullable = false, length = 50)
    private String firstName;
    @Column(name = "lastname", length = 50)
    private String lastName;
    @Column(name = "email", nullable = false, unique = true, length = 50)
    private String email;
    @Temporal(TemporalType.DATE)
    private Date dob;
    @ManyToOne
    @JoinColumn(name = "user_id")
    private User user;
    //setters and getters
}

Configure the DispatcherServlet using AbstractAnnotationConfigDispatcherServletInitializer. Observe that we have added RepositoryRestMvcConfiguration.class to the getServletConfigClasses() method. RepositoryRestMvcConfiguration is the class which does the heavy lifting of looking for Spring Data repositories and exporting them as REST endpoints.
package com.sivalabs.springdatarest.web.config; import javax.servlet.Filter; import org.springframework.data.rest.webmvc.config.RepositoryRestMvcConfiguration; import org.springframework.orm.jpa.support.OpenEntityManagerInViewFilter; import org.springframework.web.servlet.support.AbstractAnnotationConfigDispatcherServletInitializer; import com.sivalabs.springdatarest.config.AppConfig; public class SpringWebAppInitializer extends AbstractAnnotationConfigDispatcherServletInitializer { @Override protected Class[] getRootConfigClasses() { return new Class[] { AppConfig.class}; } @Override protected Class[] getServletConfigClasses() { return new Class[] { WebMvcConfig.class, RepositoryRestMvcConfiguration.class }; } @Override protected String[] getServletMappings() { return new String[] { "/rest/*" }; } @Override protected Filter[] getServletFilters() { return new Filter[]{ new OpenEntityManagerInViewFilter() }; } } Create Spring Data JPA repositories for JPA entities. public interface UserRepository extends JpaRepository { } public interface RoleRepository extends JpaRepository { } public interface ContactRepository extends JpaRepository { } That's it. Spring Data REST will take care of rest of the things. You can use spring Rest Shell https://github.com/spring-projects/rest-shell or Chrome's Postman Addon to test the exported REST services. D:\rest-shell-1.2.1.RELEASE\bin>rest-shell http://localhost:8080:> Now we can change the baseUri using baseUri command as follows: http://localhost:8080:>baseUri http://localhost:8080/spring-data-rest-demo/rest/ http://localhost:8080/spring-data-rest-demo/rest/> http://localhost:8080/spring-data-rest-demo/rest/>list rel href ====================================================================================== users http://localhost:8080/spring-data-rest-demo/rest/users{?page,size,sort} roles http://localhost:8080/spring-data-rest-demo/rest/roles{?page,size,sort} contacts http://localhost:8080/spring-data-rest-demo/rest/contacts{?page,size,sort} Note: It seems there is an issue with rest-shell when the DispatcherServlet url mapped to "/" and issue list command it responds with "No resources found". 
http://localhost:8080/spring-data-rest-demo/rest/>get users/ { "_links": { "self": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/{?page,size,sort}", "templated": true }, "search": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/search" } }, "_embedded": { "users": [ { "userName": "admin", "password": "admin", "firstName": "Administrator", "lastName": null, "email": "admin@gmail.com", "dob": null, "enabled": true, "_links": { "self": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/1" }, "roles": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/1/roles" }, "contacts": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/1/contacts" } } }, { "userName": "siva", "password": "siva", "firstName": "Siva", "lastName": null, "email": "sivaprasadreddy.k@gmail.com", "dob": null, "enabled": true, "_links": { "self": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/2" }, "roles": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/2/roles" }, "contacts": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/2/contacts" } } } ] }, "page": { "size": 20, "totalElements": 2, "totalPages": 1, "number": 0 } } You can find the source code at https://github.com/sivaprasadreddy/sivalabs-blog-samples-code/tree/master/spring-data-rest-demo For more Info on Spring Rest Shell: https://github.com/spring-projects/rest-shell
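Beyond rest-shell, any HTTP client can consume the exported endpoints. As a small illustrative sketch (the URL comes from the article; the class itself is mine), Spring's RestTemplate can fetch the users resource directly:

import org.springframework.web.client.RestTemplate;

public class UsersClient {

    public static void main(String[] args) {
        RestTemplate restTemplate = new RestTemplate();
        // HAL+JSON listing of the exported User entities, fetched here as raw text.
        String body = restTemplate.getForObject(
                "http://localhost:8080/spring-data-rest-demo/rest/users", String.class);
        System.out.println(body);
    }
}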
March 7, 2014
by Siva Prasad Reddy Katamreddy
· 29,690 Views
Convert CSV Data to Avro Data
In one of my previous posts I explained how we can convert JSON data to Avro data and vice versa using the avro-tools command-line options. Today I was trying to see what options we have for converting CSV data to Avro format; as of now there is no avro-tools option to accomplish this. We can either write our own Java program (a MapReduce program or a simple Java program), or we can use the various SerDes available with Hive to do it quickly and without writing any code. :)

To convert CSV data to Avro data using Hive we need to follow these steps:

1. Create a Hive table stored as textfile and specify your CSV delimiter.
2. Load the CSV file into the above table using the "load data" command.
3. Create another Hive table using AvroSerDe.
4. Insert data from the former table into the new Avro Hive table using the "insert overwrite" command.

To demonstrate this I will use the data below (student.csv):

0,38,91
0,65,28
0,78,16
1,34,96
1,78,14
1,11,43

Now execute the queries below in Hive:

--1. Create a Hive table stored as textfile
USE test;
CREATE TABLE csv_table (
  student_id INT,
  subject_id INT,
  marks INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

--2. Load csv_table with student.csv data
LOAD DATA LOCAL INPATH "/path/to/student.csv" OVERWRITE INTO TABLE test.csv_table;

--3. Create another Hive table using AvroSerDe
CREATE TABLE avro_table
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
  'avro.schema.literal'='{
    "namespace": "com.rishav.avro",
    "name": "student_marks",
    "type": "record",
    "fields": [
      { "name":"student_id","type":"int"},
      { "name":"subject_id","type":"int"},
      { "name":"marks","type":"int"}]
  }');

--4. Load avro_table with data from csv_table
INSERT OVERWRITE TABLE avro_table
SELECT student_id, subject_id, marks
FROM csv_table;

Now you can get the data in Avro format from the Hive warehouse folder. To dump this file to the local file system, use the command below:

hadoop fs -cat /path/to/warehouse/test.db/avro_table/* > student.avro

If you want to get JSON data from this Avro file you can use the avro-tools command:

java -jar avro-tools-1.7.5.jar tojson student.avro > student.json

So we can easily convert CSV to Avro, and CSV to JSON as well, by just writing 4 HQLs.
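The article notes that writing a small Java program is the alternative to the Hive route. As a rough sketch of that option (file paths are placeholders; the schema literal matches the student_marks record above), the Avro Java API can write the CSV rows straight into an Avro container file:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class CsvToAvro {

    private static final String SCHEMA_JSON =
            "{\"namespace\": \"com.rishav.avro\", \"name\": \"student_marks\", \"type\": \"record\","
          + " \"fields\": [{\"name\":\"student_id\",\"type\":\"int\"},"
          + "              {\"name\":\"subject_id\",\"type\":\"int\"},"
          + "              {\"name\":\"marks\",\"type\":\"int\"}]}";

    public static void main(String[] args) throws IOException {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);

        try (BufferedReader csv = new BufferedReader(new FileReader("student.csv"));
             DataFileWriter<GenericRecord> writer =
                     new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {

            writer.create(schema, new File("student.avro"));

            String line;
            while ((line = csv.readLine()) != null) {
                // Each CSV row becomes one Avro record with the three int fields.
                String[] fields = line.split(",");
                GenericRecord record = new GenericData.Record(schema);
                record.put("student_id", Integer.parseInt(fields[0].trim()));
                record.put("subject_id", Integer.parseInt(fields[1].trim()));
                record.put("marks", Integer.parseInt(fields[2].trim()));
                writer.append(record);
            }
        }
    }
}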
March 5, 2014
by Rishav Rohit
· 39,488 Views · 1 Like
When to Use MongoDB Rather than MySQL (or Other RDBMS): The Billing Example
NoSQL has been a hot topic for a pretty long time (well, it's not just a buzzword anymore). However, when should we really use it instead of an RDBMS?
March 3, 2014
by Moshe Kaplan
· 378,067 Views · 12 Likes
Python CSV Files: Reading and Writing
Learn to parse CSV (Comma Separated Values) files with Python examples using the csv module's reader function and DictReader class.
March 3, 2014
by Mike Driscoll
· 375,145 Views · 6 Likes
Jersey: Ignoring SSL certificate – javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException
Last week Alistair and I were working on an internal application and we needed to make a HTTPS request directly to an AWS machine using a certificate signed to a different host. We use jersey-client so our code looked something like this: Client client = Client.create(); client.resource("https://some-aws-host.compute-1.amazonaws.com").post(); // and so on When we ran this we predictably ran into trouble: com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No subject alternative DNS name matching some-aws-host.compute-1.amazonaws.com found. at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149) at com.sun.jersey.api.client.Client.handle(Client.java:648) at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670) at com.sun.jersey.api.client.WebResource.post(WebResource.java:241) at com.neotechnology.testlab.manager.bootstrap.ManagerAdmin.takeBackup(ManagerAdmin.java:33) at com.neotechnology.testlab.manager.bootstrap.ManagerAdminTest.foo(ManagerAdminTest.java:11) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.junit.runner.JUnitCore.run(JUnitCore.java:157) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74) at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:202) at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:65) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120) Caused by: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No subject alternative DNS name matching some-aws-host.compute-1.amazonaws.com found. 
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1884) at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:276) at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:270) at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1341) at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:153) at sun.security.ssl.Handshaker.processLoop(Handshaker.java:868) at sun.security.ssl.Handshaker.process_record(Handshaker.java:804) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1016) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323) at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:563) at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1300) at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468) at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338) at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:240) at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147) ... 31 more Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching some-aws-host.compute-1.amazonaws.com found. at sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:191) at sun.security.util.HostnameChecker.match(HostnameChecker.java:93) at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:347) at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:203) at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:126) at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1323) ... 45 more We figured that we needed to get our client to ignore the certificate and came across this Stack Overflow thread which had some suggestions on how to do this. None of the suggestions worked on their own but we ended up with a combination of a couple of the suggestions which did the trick: public Client hostIgnoringClient() { try { SSLContext sslcontext = SSLContext.getInstance( "TLS" ); sslcontext.init( null, null, null ); DefaultClientConfig config = new DefaultClientConfig(); Map properties = config.getProperties(); HTTPSProperties httpsProperties = new HTTPSProperties( new HostnameVerifier() { @Override public boolean verify( String s, SSLSession sslSession ) { return true; } }, sslcontext ); properties.put( HTTPSProperties.PROPERTY_HTTPS_PROPERTIES, httpsProperties ); config.getClasses().add( JacksonJsonProvider.class ); return Client.create( config ); } catch ( KeyManagementException | NoSuchAlgorithmException e ) { throw new RuntimeException( e ); } } You’re welcome Future Mark.
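For completeness, a short usage sketch: the helper above can replace the original client creation, using the same AWS host placeholder from earlier in the post.

Client client = hostIgnoringClient();
String response = client.resource("https://some-aws-host.compute-1.amazonaws.com")
                        .post(String.class);
// and so on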
March 2, 2014
by Mark Needham
· 42,544 Views · 3 Likes
How to "Backcast" a Time Series in R
Sometimes it is useful to "backcast" a time series, that is, to forecast in reverse time. Although there are no built-in R functions to do this, it is very easy to implement. Suppose x is our time series and we want to backcast for h periods. Here is some code that should work for most univariate time series. The example is non-seasonal, but the code will also work with seasonal data.

library(forecast)

x <- WWWusage
h <- 20
f <- frequency(x)

# Reverse time
revx <- ts(rev(x), frequency=f)

# Forecast
fc <- forecast(auto.arima(revx), h)
plot(fc)

# Reverse time again
fc$mean <- ts(rev(fc$mean), end=tsp(x)[1] - 1/f, frequency=f)
fc$upper <- fc$upper[h:1,]
fc$lower <- fc$lower[h:1,]
fc$x <- x

# Plot result
plot(fc, xlim=c(tsp(x)[1]-h/f, tsp(x)[2]))
February 28, 2014
by Rob J Hyndman
· 5,493 Views
Hibernate Query by Example (QBE)
What is It Query by example is an alternative querying technique supported by the main JPA vendors but not by the JPA specification itself. QBE returns a result set depending on the properties that were set on an instance of the queried class. So if I create an Address entity and fill in the city field then the query will select all the Address entities having the same city field as the given Address entity. The typical use case of QBE is evaluating a search form where the user can fill in any search fields and gets the results based on the given search fields. In this case QBE can reduce code size significantly. When to Use · Using many fields of an entity in a query · User selects which fields of an Entity to use in a query · We are refactoring the entities frequently and don’t want to worry about breaking the queries that rely on them Limitations · QBE is not available in JPA 1.0 or 2.0 · Version properties, identifiers and associations are ignored · The query object should be annotated with @Entity Test Data I used the following entities to test the QBE feature of Hibernate: · Address (long id, String city, String street, String countryISO2Code, AddressType addressType) · AddressType (Integer type, String description) Imports The examples will refer to the following classes: import org.hibernate.Criteria; import org.hibernate.Session; import org.hibernate.criterion.Example; import org.hibernate.criterion.Restrictions; import org.junit.Test; import java.util.List; Utility Methods I also made two utility methods to present a list of the two entity types: private void listAddresses(List addresses) { for (Address address : addresses) { System.out.println(address.getId() + ", " + address.getCountryISO2Code() + ", " + address.getCity() + ", " + address.getStreet() + ", " + address.getAddressType().getType() + ", " + address.getAddressType().getDescription()); } } private void listAddressTypes(List addressTypes) { for (AddressType addressType : addressTypes) { System.out.println(addressType.getType() + ", " + addressType.getDescription()); } } Example 1: Equals This example code returns the Address entities matching the given CountryISO2Code and City. Method: @Test public void testEquals() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("CHICAGO"); Example addressExample = Example.create(address); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 75, US, CHICAGO, Los Angeles Way2, 6, Customer 170, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 63, US, CHICAGO, Main Avenue 1, 5, Bill to 37, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 36, US, CHICAGO, Jackson Blvd 33a, 4, Delivery Example 2: Id Limitation This example presents that id fields in the query object are ignored. 
Method: @Test public void testIdLimitation() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("CHICAGO"); address.setId(100); // setting id is ignored Example addressExample = Example.create(address); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 75, US, CHICAGO, Los Angeles Way2, 6, Customer 170, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 63, US, CHICAGO, Main Avenue 1, 5, Bill to 37, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 36, US, CHICAGO, Jackson Blvd 33a, 4, Delivery Example 3: Association Limitation Associations of the query object are ignored, too. Method: @Test public void testAssociationLimitation() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("CHICAGO"); AddressType addressType = new AddressType(); addressType.setType(5); address.setAddressType(addressType); // setting an association is ignored Example addressExample = Example.create(address); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 75, US, CHICAGO, Los Angeles Way2, 6, Customer 170, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 63, US, CHICAGO, Main Avenue 1, 5, Bill to 37, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 36, US, CHICAGO, Jackson Blvd 33a, 4, Delivery Example 4: Like QBE supports like in the query object if we enable it with Example.enableLike(). Method: @Test public void testLike() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("AT%"); Example addressExample = Example.create(address).enableLike(); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 83, US, ATLANTA, null, 6, Customer 184, US, ATLANTA, null, 1, Shipper 25, US, ATLANTA, null, 1, Shipper Example 5: ExcludeProperty We can exclude a property with Example.excludeProperty(String propertyName). Method: @Test public void testExcludeProperty() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("AT%"); Example addressExample = Example.create(address).enableLike() .excludeProperty("countryISO2Code"); // countryISO2Code is a property of Address Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 154, GR, ATHENS, BETA ALPHA Street 5, 2, Consignee 83, US, ATLANTA, null, 6, Customer 25, US, ATLANTA, null, 1, Shipper 184, US, ATLANTA, null, 1, Shipper Example 6: IgnoreCase Case-insensitive search is supported by Example.ignoreCase(). Method: @Test public void testIgnoreCase() throws Exception { Session session = (Session) entityManager.getDelegate(); AddressType addressType = new AddressType(); addressType.setDescription("customer"); Example addressTypeExample = Example.create(addressType).ignoreCase(); Criteria criteria = session.createCriteria(AddressType.class) .add(addressTypeExample); listAddressTypes(criteria.list()); } Result: 6, Customer Example 7: ExcludeZeroes We can ignore 0 values of the query object by Example.excludeZeroes(). 
Method: @Test public void testExcludeZeroes() throws Exception { Session session = (Session) entityManager.getDelegate(); AddressType addressType = new AddressType(); addressType.setType(0); addressType.setDescription("Customer"); Example addressTypeExample = Example.create(addressType) .excludeZeroes(); Criteria criteria = session.createCriteria(AddressType.class) .add(addressTypeExample); listAddressTypes(criteria.list()); } Result: 6, Customer Example 8: Combining with Criteria QBE can be combined with a criteria query. In this example we add a further restriction to the query object using the Criteria API. Method: @Test public void testCombiningWithCriteria() throws Exception { Session session = (Session) entityManager.getDelegate(); AddressType addressType = new AddressType(); addressType.setDescription("Customer"); Example addressTypeExample = Example.create(addressType); Criteria criteria = session .createCriteria(AddressType.class).add(addressTypeExample) .add(Restrictions.eq("type", 6)); listAddressTypes(criteria.list()); } Result: 6, Customer Example 9: Association With a criteria query we can filter both sides of an association, using two query objects. Method: @Test public void testAssociation() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); AddressType addressType = new AddressType(); addressType.setType(6); Example addressExample = Example.create(address); Example addressTypeExample = Example.create(addressType); Criteria criteria = session.createCriteria(Address.class).add(addressExample) .createCriteria("addressType").add(addressTypeExample); // addressType is a property of Address listAddresses(criteria.list()); } Result: 84, US, BOSTON, null, 6, Customer 83, US, ATLANTA, null, 6, Customer 82, US, SAN FRANCISCO, null, 6, Customer 75, US, CHICAGO, Los Angeles Way2, 6, Customer EclipseLink EclipseLink QBE uses QueryByExamplePolicy, ReadObjectQuery and JpaHelper: QueryByExamplePolicy qbePolicy = new QueryByExamplePolicy(); qbePolicy.excludeDefaultPrimitiveValues(); Address address = new Address(); address.setCity("CHICAGO"); ReadObjectQuery roq = new ReadObjectQuery(address, qbePolicy); Query query = JpaHelper.createQuery(roq, entityManager); OpenJPA OpenJPA uses OpenJPAQueryBuilder: CriteriaQuery<Address> cq = openJPAQueryBuilder.createQuery(Address.class); Address address = new Address(); address.setCity("CHICAGO"); cq.where(openJPAQueryBuilder.qbe(cq.from(Address.class), address)); References Hibernate: · Srinivas Guruzu and Gary Mak: Hibernate Recipes: A Problem-Solution Approach (Apress) · http://docs.jboss.org/hibernate/core/3.3/reference/en/html/querycriteria.html#querycriteria-examples · http://www.java2s.com/Code/Java/Hibernate/CriteriaQBEQueryByExampleCriteria.htm · http://www.dzone.com/snippets/hibernate-query-example · http://gal-levinsky.blogspot.de/2012/01/qbe-pattern.html Hibernate associations: · http://stackoverflow.com/questions/9309884/query-by-example-on-associations · http://stackoverflow.com/questions/8236596/hibernate-query-by-example-equivalent-of-association-criteria-query JPA: · http://stackoverflow.com/questions/2880209/jpa-findbyexample EclipseLink: · http://www.coderanch.com/t/486528/ORM/databases/findByExample-JPA-book OpenJPA: · http://www.ibm.com/developerworks/java/library/j-typesafejpa/#N10C18
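As a concrete illustration of the search-form use case mentioned at the beginning of this article, the following helper is a minimal sketch rather than part of the original examples; the method name and its parameters are hypothetical, and it reuses the Address entity and the Hibernate Example API shown above.

// Sketch: whichever search fields the user filled in are copied onto a query object,
// and QBE turns the populated properties into restrictions.
public List<Address> findAddresses(Session session, String city, String countryISO2Code) {
    Address probe = new Address();              // the query ("example") object
    probe.setCity(city);                        // null properties are simply ignored
    probe.setCountryISO2Code(countryISO2Code);
    Example example = Example.create(probe)
            .ignoreCase()                       // optional: case-insensitive matching
            .enableLike();                      // optional: values such as "AT%" act as LIKE patterns
    @SuppressWarnings("unchecked")
    List<Address> result = session.createCriteria(Address.class)
            .add(example)
            .list();
    return result;
}

Because identifiers and associations are ignored (see Examples 2 and 3), only the simple properties set on the probe end up in the generated restrictions.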
February 27, 2014
by Donat Szilagyi
· 62,177 Views · 3 Likes
A Deeper Look into the Java 8 Date and Time API
Within this post we will have a deeper look into the new Date/Time API we get with Java 8 (JSR 310). Please note that this post is mainly driven by code examples that show the new API functionality. I think the examples are self-explanatory so I did not spend much time writing text around them :-) Let's get started! Working with Date and Time Objects All classes of the Java 8 Date/Time API are located within the java.time package. The first class we want to look at is java.time.LocalDate. A LocalDate represents a year-month-day date without time. We start with creating new LocalDate instances: // the current date LocalDate currentDate = LocalDate.now(); // 2014-02-10 LocalDate tenthFeb2014 = LocalDate.of(2014, Month.FEBRUARY, 10); // month values start at 1 (2014-08-01) LocalDate firstAug2014 = LocalDate.of(2014, 8, 1); // the 65th day of 2010 (2010-03-06) LocalDate sixtyFifthDayOf2010 = LocalDate.ofYearDay(2010, 65); LocalTime and LocalDateTime are the next classes we look at. Both work similarly to LocalDate. A LocalTime works with time (without dates) while LocalDateTime combines date and time in one class: LocalTime currentTime = LocalTime.now(); // current time LocalTime midday = LocalTime.of(12, 0); // 12:00 LocalTime afterMidday = LocalTime.of(13, 30, 15); // 13:30:15 // 12345th second of day (03:25:45) LocalTime fromSecondsOfDay = LocalTime.ofSecondOfDay(12345); // dates with times, e.g. 2014-02-18 19:08:37.950 LocalDateTime currentDateTime = LocalDateTime.now(); // 2014-10-02 12:30 LocalDateTime secondOct2014 = LocalDateTime.of(2014, 10, 2, 12, 30); // 2014-12-24 12:00 LocalDateTime christmas2014 = LocalDateTime.of(2014, Month.DECEMBER, 24, 12, 0); By default LocalDate/Time classes will use the system clock in the default time zone. We can change this by providing a time zone or an alternative Clock implementation: // current (local) time in Los Angeles LocalTime currentTimeInLosAngeles = LocalTime.now(ZoneId.of("America/Los_Angeles")); // current time in UTC time zone LocalTime nowInUtc = LocalTime.now(Clock.systemUTC()); From LocalDate/Time objects we can get all sorts of useful information we might need. Some examples: LocalDate date = LocalDate.of(2014, 2, 15); // 2014-02-15 boolean isBefore = LocalDate.now().isBefore(date); // false // information about the month Month february = date.getMonth(); // FEBRUARY int februaryIntValue = february.getValue(); // 2 int minLength = february.minLength(); // 28 int maxLength = february.maxLength(); // 29 Month firstMonthOfQuarter = february.firstMonthOfQuarter(); // JANUARY // information about the year int year = date.getYear(); // 2014 int dayOfYear = date.getDayOfYear(); // 46 int lengthOfYear = date.lengthOfYear(); // 365 boolean isLeapYear = date.isLeapYear(); // false DayOfWeek dayOfWeek = date.getDayOfWeek(); int dayOfWeekIntValue = dayOfWeek.getValue(); // 6 String dayOfWeekName = dayOfWeek.name(); // SATURDAY int dayOfMonth = date.getDayOfMonth(); // 15 LocalDateTime startOfDay = date.atStartOfDay(); // 2014-02-15 00:00 // time information LocalTime time = LocalTime.of(15, 30); // 15:30:00 int hour = time.getHour(); // 15 int second = time.getSecond(); // 0 int minute = time.getMinute(); // 30 int secondOfDay = time.toSecondOfDay(); // 55800 Some information can be obtained without providing a specific date.
For example, we can use the Year class if we need information about a specific year: Year currentYear = Year.now(); Year twoThousand = Year.of(2000); boolean isLeap = currentYear.isLeap(); // false int length = currentYear.length(); // 365 // the 64th day of 2014 (2014-03-05) LocalDate date = Year.of(2014).atDay(64); We can use the plus and minus methods to add or subtract specific amounts of time. Note that these methods always return a new instance (Java 8 date/time classes are immutable). LocalDate tomorrow = LocalDate.now().plusDays(1); // 5 hours and 30 minutes ago LocalDateTime dateTime = LocalDateTime.now().minusHours(5).minusMinutes(30); TemporalAdjusters are another nice way to manipulate dates. TemporalAdjuster is a single-method interface that is used to separate the process of adjustment from actual date/time objects. A set of common TemporalAdjusters can be accessed using static methods of the TemporalAdjusters class. LocalDate date = LocalDate.of(2014, Month.FEBRUARY, 25); // 2014-02-25 // first day of february 2014 (2014-02-01) LocalDate firstDayOfMonth = date.with(TemporalAdjusters.firstDayOfMonth()); // last day of february 2014 (2014-02-28) LocalDate lastDayOfMonth = date.with(TemporalAdjusters.lastDayOfMonth()); Static imports make this more fluent to read: import static java.time.temporal.TemporalAdjusters.*; ... // last day of 2014 (2014-12-31) LocalDate lastDayOfYear = date.with(lastDayOfYear()); // first day of next month (2014-03-01) LocalDate firstDayOfNextMonth = date.with(firstDayOfNextMonth()); // next sunday (2014-03-02) LocalDate nextSunday = date.with(next(DayOfWeek.SUNDAY)); Time Zones Working with time zones is another big topic that is simplified by the new API. The LocalDate/Time classes we have seen so far do not contain information about a time zone. If we want to work with a date/time in a certain time zone we can use ZonedDateTime or OffsetDateTime: ZoneId losAngeles = ZoneId.of("America/Los_Angeles"); ZoneId berlin = ZoneId.of("Europe/Berlin"); // 2014-02-20 12:00 LocalDateTime dateTime = LocalDateTime.of(2014, 02, 20, 12, 0); // 2014-02-20 12:00, Europe/Berlin (+01:00) ZonedDateTime berlinDateTime = ZonedDateTime.of(dateTime, berlin); // 2014-02-20 03:00, America/Los_Angeles (-08:00) ZonedDateTime losAngelesDateTime = berlinDateTime.withZoneSameInstant(losAngeles); int offsetInSeconds = losAngelesDateTime.getOffset().getTotalSeconds(); // -28800 // a collection of all available zones Set<String> allZoneIds = ZoneId.getAvailableZoneIds(); // using offsets LocalDateTime date = LocalDateTime.of(2013, Month.JULY, 20, 3, 30); ZoneOffset offset = ZoneOffset.of("+05:00"); // 2013-07-20 03:30 +05:00 OffsetDateTime plusFive = OffsetDateTime.of(date, offset); // 2013-07-19 20:30 -02:00 OffsetDateTime minusTwo = plusFive.withOffsetSameInstant(ZoneOffset.ofHours(-2)); Timestamps Classes like LocalDate and ZonedDateTime provide a human view of time. However, often we need to work with time viewed from a machine perspective. For this we can use the Instant class which represents timestamps. An Instant counts the time beginning from the first second of January 1, 1970 (1970-01-01 00:00:00), also called the EPOCH. Instant values can be negative if they occurred before the epoch. They follow ISO 8601, the standard for representing date and time.
// current time Instant now = Instant.now(); // from unix timestamp, 2010-01-01 12:00:00 Instant fromUnixTimestamp = Instant.ofEpochSecond(1262347200); // same time in millis Instant fromEpochMilli = Instant.ofEpochMilli(1262347200000L); // parsing from ISO 8601 Instant fromIso8601 = Instant.parse("2010-01-01T12:00:00Z"); // toString() returns ISO 8601 format, e.g. 2014-02-15T01:02:03Z String toIso8601 = now.toString(); // as unix timestamp long toUnixTimestamp = now.getEpochSecond(); // in millis long toEpochMillis = now.toEpochMilli(); // plus/minus methods are available too Instant nowPlusTenSeconds = now.plusSeconds(10); Periods and Durations Period and Duration are two other important classes. As the names suggest, they represent a quantity or amount of time. A Period uses date-based values (years, months, days) while a Duration uses seconds or nanoseconds to define an amount of time. Duration is most suitable when working with Instants and machine time. Periods and Durations can contain negative values if the end point occurs before the starting point. // periods LocalDate firstDate = LocalDate.of(2010, 5, 17); // 2010-05-17 LocalDate secondDate = LocalDate.of(2015, 3, 7); // 2015-03-07 Period period = Period.between(firstDate, secondDate); int days = period.getDays(); // 18 int months = period.getMonths(); // 9 int years = period.getYears(); // 4 boolean isNegative = period.isNegative(); // false Period twoMonthsAndFiveDays = Period.ofMonths(2).plusDays(5); LocalDate sixthOfJanuary = LocalDate.of(2014, 1, 6); // add two months and five days to 2014-01-06, result is 2014-03-11 LocalDate eleventhOfMarch = sixthOfJanuary.plus(twoMonthsAndFiveDays); // durations Instant firstInstant = Instant.ofEpochSecond(1294881180); // 2011-01-13 01:13 Instant secondInstant = Instant.ofEpochSecond(1294708260); // 2011-01-11 01:11 Duration between = Duration.between(firstInstant, secondInstant); // negative because firstInstant is after secondInstant (-172920) long seconds = between.getSeconds(); // get absolute result in minutes (2882) long absoluteResult = between.abs().toMinutes(); // two hours in seconds (7200) long twoHoursInSeconds = Duration.ofHours(2).getSeconds(); Formatting and Parsing Formatting and parsing are another big topic when working with dates and times. In Java 8 this can be accomplished by using the format() and parse() methods: // 2014-04-01 10:45 LocalDateTime dateTime = LocalDateTime.of(2014, Month.APRIL, 1, 10, 45); // format as basic ISO date format (20140401) String asBasicIsoDate = dateTime.format(DateTimeFormatter.BASIC_ISO_DATE); // format as ISO week date (2014-W14-2) String asIsoWeekDate = dateTime.format(DateTimeFormatter.ISO_WEEK_DATE); // format ISO date time (2014-04-01T10:45:00) String asIsoDateTime = dateTime.format(DateTimeFormatter.ISO_DATE_TIME); // using a custom pattern (01/04/2014) String asCustomPattern = dateTime.format(DateTimeFormatter.ofPattern("dd/MM/yyyy")); // french date formatting (1. avril 2014) String frenchDate = dateTime.format(DateTimeFormatter.ofPattern("d.
MMMM yyyy", new Locale("fr"))); // using short german date/time formatting (01.04.14 10:45) DateTimeFormatter formatter = DateTimeFormatter.ofLocalizedDateTime(FormatStyle.SHORT) .withLocale(new Locale("de")); String germanDateTime = dateTime.format(formatter); // parsing date strings LocalDate fromIsoDate = LocalDate.parse("2014-01-20"); LocalDate fromIsoWeekDate = LocalDate.parse("2014-W14-2", DateTimeFormatter.ISO_WEEK_DATE); LocalDate fromCustomPattern = LocalDate.parse("20.01.2014", DateTimeFormatter.ofPattern("dd.MM.yyyy")); Conversion Of course we do not always have objects of the type we need. Therefore, we need an option to convert different date/time related objects between each other. The following examples show some of the possible conversion options: // LocalDate/LocalTime <-> LocalDateTime LocalDate date = LocalDate.now(); LocalTime time = LocalTime.now(); LocalDateTime dateTimeFromDateAndTime = LocalDateTime.of(date, time); LocalDate dateFromDateTime = LocalDateTime.now().toLocalDate(); LocalTime timeFromDateTime = LocalDateTime.now().toLocalTime(); // Instant <-> LocalDateTime Instant instant = Instant.now(); LocalDateTime dateTimeFromInstant = LocalDateTime.ofInstant(instant, ZoneId.of("America/Los_Angeles")); Instant instantFromDateTime = LocalDateTime.now().toInstant(ZoneOffset.ofHours(-2)); // convert old date/calendar/timezone classes Instant instantFromDate = new Date().toInstant(); Instant instantFromCalendar = Calendar.getInstance().toInstant(); ZoneId zoneId = TimeZone.getDefault().toZoneId(); ZonedDateTime zonedDateTimeFromGregorianCalendar = new GregorianCalendar().toZonedDateTime(); // convert to old classes Date dateFromInstant = Date.from(Instant.now()); TimeZone timeZone = TimeZone.getTimeZone(ZoneId.of("America/Los_Angeles")); GregorianCalendar gregorianCalendar = GregorianCalendar.from(ZonedDateTime.now()); Conclusion With Java 8 we get a very rich API for working with date and time located in the java.time package. The API can completely replace old classes like java.util.Date or java.util.Calendar with newer, more flexible classes. Due to mostly immutable classes the new API helps in building thread safe systems. The source of the examples can be found on GitHub.
February 27, 2014
by Michael Scharhag
· 208,897 Views · 18 Likes
Getting Started with Mocking in Java using Mockito
We all write unit tests but the challenge we face at times is that the unit under test might be dependent on other components. And configuring other components for unit testing is definitely overkill. Instead we can make use of mocks in place of the other components and continue with the unit testing. To show how one can use mocks, I have a Data Access Layer (DAL), basically a class which provides an API for the application to access and modify the data in the data repository. I then unit test the DAL without actually needing to connect to the data repository. The data repository can be a local database, a remote database, a file system, or any place where we can store and retrieve the data. The use of a DAL class helps us in keeping the data mappers separate from the application code. Let's create a Java project using Maven. mvn archetype:generate -DgroupId=info.sanaulla -DartifactId=MockitoDemo -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false The above creates a folder MockitoDemo and then creates the entire directory structure for source and test files. Consider the model class below for this example: package info.sanaulla.models; import java.util.List; /** * Model class for the book details. */ public class Book { private String isbn; private String title; private List<String> authors; private String publication; private Integer yearOfPublication; private Integer numberOfPages; private String image; public Book(String isbn, String title, List<String> authors, String publication, Integer yearOfPublication, Integer numberOfPages, String image){ this.isbn = isbn; this.title = title; this.authors = authors; this.publication = publication; this.yearOfPublication = yearOfPublication; this.numberOfPages = numberOfPages; this.image = image; } public String getIsbn() { return isbn; } public String getTitle() { return title; } public List<String> getAuthors() { return authors; } public String getPublication() { return publication; } public Integer getYearOfPublication() { return yearOfPublication; } public Integer getNumberOfPages() { return numberOfPages; } public String getImage() { return image; } } The DAL class for operating on the Book model class is: package info.sanaulla.dal; import info.sanaulla.models.Book; import java.util.ArrayList; import java.util.Arrays; import java.util.Collections; import java.util.List; /** * API layer for persisting and retrieving the Book objects. */ public class BookDAL { private static BookDAL bookDAL = new BookDAL(); public List<Book> getAllBooks(){ return Collections.emptyList(); } public Book getBook(String isbn){ return null; } public String addBook(Book book){ return book.getIsbn(); } public String updateBook(Book book){ return book.getIsbn(); } public static BookDAL getInstance(){ return bookDAL; } } The DAL layer above currently has no functionality and we are going to unit test that piece of code (TDD). The DAL layer might communicate with an ORM mapper or a database API, which we are not concerned with while designing this API. Test Driving the DAL Layer There are a lot of frameworks for unit testing and mocking in Java, but for this example I will be picking JUnit for unit testing and Mockito for mocking. We would have to update the dependencies in Maven's pom.xml: <project> <modelVersion>4.0.0</modelVersion> <groupId>info.sanaulla</groupId> <artifactId>MockitoDemo</artifactId> <packaging>jar</packaging> <version>1.0-SNAPSHOT</version> <name>MockitoDemo</name> <url>http://maven.apache.org</url> <dependencies> <dependency> <groupId>junit</groupId> <artifactId>junit</artifactId> <version>4.10</version> <scope>test</scope> </dependency> <dependency> <groupId>org.mockito</groupId> <artifactId>mockito-all</artifactId> <version>1.9.5</version> <scope>test</scope> </dependency> </dependencies> </project> Now let's unit test the BookDAL.
During the unit testing we will inject mock data into the BookDAL so that we can complete the testing of the API without depending on the data source. Initially we will have an empty test class: public class BookDALTest { public void setUp() throws Exception { } public void testGetAllBooks() throws Exception { } public void testGetBook() throws Exception { } public void testAddBook() throws Exception { } public void testUpdateBook() throws Exception { } } We will inject the mock BookDAL and mock data in the setUp() as shown below: public class BookDALTest { private static BookDAL mockedBookDAL; private static Book book1; private static Book book2; @BeforeClass public static void setUp(){ //Create mock object of BookDAL mockedBookDAL = mock(BookDAL.class); //Create few instances of Book class. book1 = new Book("8131721019","Compilers Principles", Arrays.asList("D. Jeffrey Ulman","Ravi Sethi", "Alfred V. Aho", "Monica S. Lam"), "Pearson Education Singapore Pte Ltd", 2008,1009,"BOOK_IMAGE"); book2 = new Book("9788183331630","Let Us C 13th Edition", Arrays.asList("Yashavant Kanetkar"),"BPB PUBLICATIONS", 2012,675,"BOOK_IMAGE"); //Stubbing the methods of mocked BookDAL with mocked data. when(mockedBookDAL.getAllBooks()).thenReturn(Arrays.asList(book1, book2)); when(mockedBookDAL.getBook("8131721019")).thenReturn(book1); when(mockedBookDAL.addBook(book1)).thenReturn(book1.getIsbn()); when(mockedBookDAL.updateBook(book1)).thenReturn(book1.getIsbn()); } public void testGetAllBooks() throws Exception {} public void testGetBook() throws Exception {} public void testAddBook() throws Exception {} public void testUpdateBook() throws Exception {} } In the above setUp() method I have: Created a mock object of BookDAL BookDAL mockedBookDAL = mock(BookDAL.class); Stubbed the API of BookDAL with mock data, such that when ever the API is invoked the mocked data is returned. //When getAllBooks() is invoked then return the given data and so on for the other methods. when(mockedBookDAL.getAllBooks()).thenReturn(Arrays.asList(book1, book2)); when(mockedBookDAL.getBook("8131721019")).thenReturn(book1); when(mockedBookDAL.addBook(book1)).thenReturn(book1.getIsbn()); when(mockedBookDAL.updateBook(book1)).thenReturn(book1.getIsbn()); Populating the rest of the tests we get: package info.sanaulla.dal; import info.sanaulla.models.Book; import org.junit.BeforeClass; import org.junit.Test; import static org.junit.Assert.*; import static org.mockito.Mockito.mock; import static org.mockito.Mockito.when; import java.util.Arrays; import java.util.List; public class BookDALTest { private static BookDAL mockedBookDAL; private static Book book1; private static Book book2; @BeforeClass public static void setUp(){ mockedBookDAL = mock(BookDAL.class); book1 = new Book("8131721019","Compilers Principles", Arrays.asList("D. Jeffrey Ulman","Ravi Sethi", "Alfred V. Aho", "Monica S. 
Lam"), "Pearson Education Singapore Pte Ltd", 2008,1009,"BOOK_IMAGE"); book2 = new Book("9788183331630","Let Us C 13th Edition", Arrays.asList("Yashavant Kanetkar"),"BPB PUBLICATIONS", 2012,675,"BOOK_IMAGE"); when(mockedBookDAL.getAllBooks()).thenReturn(Arrays.asList(book1, book2)); when(mockedBookDAL.getBook("8131721019")).thenReturn(book1); when(mockedBookDAL.addBook(book1)).thenReturn(book1.getIsbn()); when(mockedBookDAL.updateBook(book1)).thenReturn(book1.getIsbn()); } @Test public void testGetAllBooks() throws Exception { List allBooks = mockedBookDAL.getAllBooks(); assertEquals(2, allBooks.size()); Book myBook = allBooks.get(0); assertEquals("8131721019", myBook.getIsbn()); assertEquals("Compilers Principles", myBook.getTitle()); assertEquals(4, myBook.getAuthors().size()); assertEquals((Integer)2008, myBook.getYearOfPublication()); assertEquals((Integer) 1009, myBook.getNumberOfPages()); assertEquals("Pearson Education Singapore Pte Ltd", myBook.getPublication()); assertEquals("BOOK_IMAGE", myBook.getImage()); } @Test public void testGetBook(){ String isbn = "8131721019"; Book myBook = mockedBookDAL.getBook(isbn); assertNotNull(myBook); assertEquals(isbn, myBook.getIsbn()); assertEquals("Compilers Principles", myBook.getTitle()); assertEquals(4, myBook.getAuthors().size()); assertEquals("Pearson Education Singapore Pte Ltd", myBook.getPublication()); assertEquals((Integer)2008, myBook.getYearOfPublication()); assertEquals((Integer)1009, myBook.getNumberOfPages()); } @Test public void testAddBook(){ String isbn = mockedBookDAL.addBook(book1); assertNotNull(isbn); assertEquals(book1.getIsbn(), isbn); } @Test public void testUpdateBook(){ String isbn = mockedBookDAL.updateBook(book1); assertNotNull(isbn); assertEquals(book1.getIsbn(), isbn); } } One can run the test by using maven command: mvn test. The output is: ------------------------------------------------------- T E S T S ------------------------------------------------------- Running info.sanaulla.AppTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.029 sec Running info.sanaulla.dal.BookDALTest Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.209 sec Results : Tests run: 5, Failures: 0, Errors: 0, Skipped: 0 So we have been able to test the DAL class without actually configuring the data source by using mocks.
February 26, 2014
by Mohamed Sanaulla
· 233,101 Views · 18 Likes
How to Estimate Memory Consumption
This story goes back at least a decade, when I was first approached by a PHB with the question, “How big servers are we going to need to buy for our production deployment?” The new and shiny system we had been building was nine months from production rollout and apparently the company had promised to deliver the whole solution, including hardware. Oh boy, was I in trouble. With just a few years of experience under my belt, I could have pretty much just rolled a die. Even though I am sure my complete lack of confidence was clearly visible, I still had to come up with the answer. Four hours of googling later I recall sitting there with the same question still hovering in front of my bedazzled face: “How to estimate the need for computing power?” In this post I start to open up the subject by giving you rough guidelines on how to estimate memory requirements for your brand new Java application. For the impatient ones – the answer will be to start with memory equal to approximately 5 x [amount of memory consumed by Live Data] and start the fine-tuning from there. For those more curious about the logic behind it, stay with me and I will walk you through the reasoning. First and foremost, I can only recommend that you avoid answering a question phrased like this without detailed information being available. Your answer has to be based upon the performance requirements, so do not even start without clarifying those first. And I do not mean the way-too-ambiguous “The system needs to support 700 concurrent users”, but much more specific ones about latency and throughput, taking into account the amount of data and the usage patterns. One should not forget about the budget either – we can all dream about sub-millisecond latencies, but for those without HFT banking backbone budgets it will unfortunately remain only a dream. For now, let's assume you have those requirements in place. The next stop would be to create the load test scripts emulating user behaviour. If you are now able to launch those scripts concurrently you have built a foundation for the answer. As you might also have guessed, the next step involves our usual advice of measuring, not guessing. But with a caveat. Live Data Size Namely, our quest for the optimal memory configuration requires capturing the Live Data Size. Having captured this, we have the baseline configuration in place for the fine-tuning. How does one define live data size? Charlie Hunt and Binu John in their “Java Performance” book have given it the following definition: Live data size is the heap size consumed by the set of long-lived objects required to run the application in its steady state. Equipped with the definition, we are ready to run the load tests against the application with GC logging turned on (-XX:+PrintGCTimeStamps -Xloggc:/tmp/gc.log -XX:+PrintGCDetails) and visualize the logs (with the help of gcviewer for example) to determine the moment when the application has reached the steady state. What you are after looks similar to the following: We can see the GC doing its job both with minor and Full GC runs in a familiar double-saw-toothed graph. This particular application seems to have achieved a steady state already after the first full GC run on the 21st second. In most cases, however, it takes 10-20 Full GC runs to spot the change in trends. After four full GC runs we can estimate that the Live Data Size is equal to approximately 100MB.
The aforementioned Java Performance book indicates that there is a strong correlation between the Live Data Size and the optimal memory configuration parameters in a typical Java EE application. Evidence from the field also backs up their recommendation: set the maximum heap size to 3-4 x [Live Data Size]. So, for our application at hand, we should set -Xmx to between 300m and 400m for the initial performance tests and take it from there. We have mixed feelings about the other recommendations given in the book, namely setting the maximum permanent generation size to 1.2-1.5 x [Live Data Size of the Permanent Generation] and setting -XX:NewRatio to 1-1.5 x [Live Data Size]. We are currently gathering more data to determine whether the positive correlation exists, but until then I recommend basing your survivor and eden configuration decisions on monitoring your allocation rate instead. Why should one bother, you might now ask? Indeed, two reasons for not caring immediately surface: an 8G memory chip is in sub-$100 territory at the time of writing this article, and virtualization, especially when using large providers such as Amazon AWS, makes adjusting capacity easy. Both of the reasons are partially valid and have definitely reduced the need for provisioning to be precise. But both of them still put you in the danger zone. When tossing in huge amounts of memory “just in case”, you are most likely going to significantly affect latency – with heaps above 8G it is darn easy to introduce Full GC pauses spanning tens of seconds. When over-provisioning with the mindset of “let's tune it later”, the “later” part has a tendency of never arriving. I have faced numerous applications running on vastly over-provisioned environments just because of this. For example, the aforementioned application, which I discovered running on an Amazon EC2 m1.xlarge instance, was costing the company $4,200 per instance per year. Converting it to m1.small reduced the bill to just $520 for the instance. An 8-fold cost reduction will be visible in your operations budget if your deployments are large, trust me on this. Summary Unfortunately I still see way too many decisions made exactly like I was forced to make a decade ago. This leads to under- and over-planning of capacity, both of which can be equally poor choices, especially if you cannot enjoy the benefits of virtualization. I got lucky with mine, but you might not get away with your guesstimate, so I can only recommend actually planning ahead using the simple framework described in this post. If you enjoyed the content, I can only recommend following our performance tuning advice on Twitter.
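To make the rule of thumb above concrete, here is a small sketch that turns a measured Live Data Size into a starting -Xmx range. It is not from the original article: the 3-4 x multiplier is the book's recommendation quoted above, while the class name and the hard-coded 100MB measurement are only illustrative.

public class HeapSizingSketch {
    public static void main(String[] args) {
        // Live Data Size as read from the GC logs once the application reaches steady state
        long liveDataSizeMb = 100;
        long xmxLowerMb = 3 * liveDataSizeMb;   // 300 MB
        long xmxUpperMb = 4 * liveDataSizeMb;   // 400 MB
        System.out.printf("Start the performance tests with -Xmx between %dm and %dm%n",
                xmxLowerMb, xmxUpperMb);
        // GC logging flags used above to capture the live data size:
        // -XX:+PrintGCTimeStamps -Xloggc:/tmp/gc.log -XX:+PrintGCDetails
    }
}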
February 25, 2014
by Nikita Salnikov-Tarnovski
· 11,207 Views