Data Engineering Resources

The Latest Data Engineering Topics

WSO2 DSS: Batch Insert Sample (End to End)

WSO2 Data Services Server (DSS) wraps the data services layer and provides a simple GUI for defining a data service with zero Java code. With this, a change to the data source is just a click away, and no other party needs to be aware of it. In this sample we will see how to do a batch insert into a table.

Batch insert is useful when you want to insert data sequentially, running the same insert query many times. All the data is sent in one call, which reduces the number of calls needed to get the data inserted. A batch is also all-or-nothing: if at least one insertion query fails, all the other queries run so far in the batch are rolled back, so one failed insertion means the whole batch has failed. This comes with one condition: the query must not produce results. (We are only notified whether the query was successful or not.)

Prerequisites:

WSO2 Data Services Server - http://wso2.com/products/data-services-server/ (current latest 3.1.1)
MySQL connector (JDBC) - https://www.mysql.com/products/connector/

If we already have a running data service that does not send back a result set, it is just a matter of adding the following property to the service declaration:

    enableBatchRequests="true"

Anyway, I will demonstrate creating the service from scratch.

1. Create a service by going through the wizard.
2. Create the data source.
3. Create the query. This is an insert query. Also note the input mappings we have added for the query. To learn more about input mappings and validation, refer to the documentation.
4. Create the operation. Select the query to be executed when the operation is called. By enabling "return request status", we will be notified whether the operation succeeded.
5. Try it! When we list the services, we will see the new service. On the right there is an option to try it.

Here we can see the option to try the service by giving the input parameters. I have tried two insertions in a batch.

If we now go to the XML view of the service, it will be similar to the following, which is saved on the server as a .dbs file.

    com.mysql.jdbc.Driver
    jdbc:mysql://localhost:3306/json_array
    root
    root
    1
    10
    SELECT 1
    insert into flights (flight_no, number_of_cases, created_by, description, trips)
    values (:flight_no,:number_of_cases,:created_by,:description,:trips)

If we click on the service name in the list of services, we are directed to the Service Dashboard, where we can see several other options for the service. One of them generates an Axis2 client for the service. Once we have the client, it is a matter of calling the methods in the stub as follows.

    private static BatchRequestSampleOldStub.AddFlight_type0 createFlight(int cases, String creator,
            String description, int trips) {
        // Build one entry of the batch.
        BatchRequestSampleOldStub.AddFlight_type0 val = new BatchRequestSampleOldStub.AddFlight_type0();
        val.setNumber_of_cases(cases);
        val.setCreated_by(creator);
        val.setDescription(description);
        val.setTrips(trips);
        printFlightInfo(cases, creator, description, trips);
        return val;
    }

    public static void main(String[] args) throws Exception {
        String epr = "http://localhost:9763" + "/services/BatchInsertSample";
        BatchRequestSampleOldStub stub = new BatchRequestSampleOldStub(epr);

        // Collect all entries into a single batch request.
        BatchRequestSampleOldStub.AddFlight_batch_req vals1 = new BatchRequestSampleOldStub.AddFlight_batch_req();
        vals1.addAddFlight(createFlight(1, "Pushpalanka", "test", 2));
        vals1.addAddFlight(createFlight(2, "Jayawardhana", "test", 2));
        vals1.addAddFlight(createFlight(3, "[email protected]", "test", 2));

        try {
            System.out.println("Executing Add Flights..");
            // One call sends the whole batch.
            stub.addFlight_batch_req(vals1);
        } catch (Exception e) {
            System.out.println("Error in Add Flights!");
        }
    }

Complete client code can be found here.

Cheers!

Ref: http://docs.wso2.org/display/DSS311/Batch+Processing+Sample
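To make the all-or-nothing behaviour of a batch concrete, here is a minimal in-memory sketch. It is not how DSS implements batching; the class name, the list standing in for the table, and the rule that an empty row counts as a failed insert are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for a table (hypothetical); shows all-or-nothing batch semantics only. */
class BatchInsertSketch {
    private final List<String> rows = new ArrayList<>();

    /** Applies every row of the batch or none of them. */
    boolean insertBatch(List<String> batch) {
        // Work on a copy so that a failure leaves the real rows untouched.
        List<String> staged = new ArrayList<>(rows);
        for (String row : batch) {
            // Assumption for the sketch: an empty row plays the role of a failing insert.
            if (row == null || row.isEmpty()) {
                return false; // one failure fails the whole batch
            }
            staged.add(row);
        }
        // All inserts succeeded: publish the staged rows.
        rows.clear();
        rows.addAll(staged);
        return true; // only a success/failure status is reported, never a result set
    }

    int size() {
        return rows.size();
    }
}
```

Calling insertBatch with two valid rows leaves two rows in the table; a following batch that contains one empty row returns false and the table still holds only the first two rows.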

March 21, 2014

by Pushpalanka Jayawardhana

· 10,105 Views

Grails Goodness: Using Hibernate Native SQL Queries

Sometimes we want to use Hibernate native SQL in our code. For example, we might need to invoke a selectable stored procedure that we cannot invoke in another way. To invoke a native SQL query we use the method createSQLQuery(), which is available from the Hibernate session object. In our Grails code we must first get access to the current Hibernate session. Luckily we only have to inject the sessionFactory bean in our Grails service or controller. To get the current session we invoke the getCurrentSession() method and we are ready to execute a native SQL query. The query itself is defined as a String value and we can use placeholders for variables, just like with other Hibernate queries.

In the following sample we create a new Grails service and use a Hibernate native SQL query to execute a selectable stored procedure with the name organisation_breadcrumbs. This stored procedure takes one argument startId and returns a list of results with an id, name and level column.

    // File: grails-app/services/com/mrhaki/grails/OrganisationService.groovy
    package com.mrhaki.grails

    import com.mrhaki.grails.Organisation

    class OrganisationService {

        // Auto inject SessionFactory we can use
        // to get the current Hibernate session.
        def sessionFactory

        List<Organisation> breadcrumbs(final Long startOrganisationId) {

            // Get the current Hibernate session.
            final session = sessionFactory.currentSession

            // Query string with :startId as parameter placeholder.
            final String query = 'select id, name, level from organisation_breadcrumbs(:startId) order by level desc'

            // Create native SQL query.
            final sqlQuery = session.createSQLQuery(query)

            // Use Groovy with() method to invoke multiple methods
            // on the sqlQuery object.
            final results = sqlQuery.with {
                // Set domain class as entity.
                // Properties in domain class id, name, level will
                // be automatically filled.
                addEntity(Organisation)

                // Set value for parameter startId.
                setLong('startId', startOrganisationId)

                // Get all results.
                list()
            }

            results
        }
    }

In the sample code we use the addEntity() method to map the query results to the domain class Organisation. To transform the results from a query to other objects we can use the setResultTransformer() method. Hibernate (and therefore Grails if we use the Hibernate plugin) already has a set of transformers we can use. For example with the org.hibernate.transform.AliasToEntityMapResultTransformer each result row is transformed into a Map where the column aliases are the keys of the map.

    // File: grails-app/services/com/mrhaki/grails/OrganisationService.groovy
    package com.mrhaki.grails

    import org.hibernate.transform.AliasToEntityMapResultTransformer

    class OrganisationService {

        def sessionFactory

        List<Map<String, Object>> breadcrumbs(final Long startOrganisationId) {
            final session = sessionFactory.currentSession

            final String query = 'select id, name, level from organisation_breadcrumbs(:startId) order by level desc'

            final sqlQuery = session.createSQLQuery(query)

            final results = sqlQuery.with {
                // Assign result transformer.
                // This transformer will map columns to keys in a map for each row.
                resultTransformer = AliasToEntityMapResultTransformer.INSTANCE

                setLong('startId', startOrganisationId)

                list()
            }

            results
        }
    }

Finally we can execute a native SQL query and handle the raw results ourselves using the Groovy Collection API enhancements. The result of the list() method is a List of Object[] objects. In the following sample we use Groovy syntax to handle the results:

    // File: grails-app/services/com/mrhaki/grails/OrganisationService.groovy
    package com.mrhaki.grails

    class OrganisationService {

        def sessionFactory

        List<Map<String, Object>> breadcrumbs(final Long startOrganisationId) {
            final session = sessionFactory.currentSession

            final String query = 'select id, name, level from organisation_breadcrumbs(:startId) order by level desc'

            final sqlQuery = session.createSQLQuery(query)

            final queryResults = sqlQuery.with {
                setLong('startId', startOrganisationId)
                list()
            }

            // Transform resulting rows to a map with key organisationName.
            final results = queryResults.collect { resultRow -> [organisationName: resultRow[1]] }

            // Or to only get a list of names.
            //final List<String> names = queryResults.collect { it[1] }

            results
        }
    }

Code written with Grails 2.3.7.
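For readers who want to see what mapping raw rows to alias-keyed maps amounts to, here is a small plain-Java sketch. It does not use Hibernate and is not the transformer's actual implementation; the class name, the method name and the explicit aliases array are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: turns raw Object[] rows into maps keyed by column alias. */
class RowMapperSketch {

    /** Pairs each alias with the value at the same index of every row. */
    static List<Map<String, Object>> toMaps(String[] aliases, List<Object[]> rows) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Object[] row : rows) {
            // LinkedHashMap keeps the column order of the query.
            Map<String, Object> map = new LinkedHashMap<>();
            for (int i = 0; i < aliases.length && i < row.length; i++) {
                map.put(aliases[i], row[i]);
            }
            result.add(map);
        }
        return result;
    }
}
```

With the aliases id, name and level, each row of the breadcrumbs query becomes a map from which the name can be read with get("name") instead of by index.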

March 20, 2014

by Hubert Klein Ikkink

· 23,269 Views · 1 Like

Cloud Automation with WinRM vs SSH

[Article originally written by Barak Merimovich.]

Automation the Linux Way

In the Linux world, SSH (secure shell) is the de facto standard for remote connectivity and automation, that is, for logging into a remote machine to install tools and run commands. It's pretty much ubiquitous, runs across multiple Linux versions and distributions, and every Linux admin worth their salt knows SSH and how to configure it. What's more, its port, 22, is even enabled by default on most clouds.

An important feature available with SSH is support for file transfer via the secure copy protocol (SCP) and the secure file transfer protocol (SFTP). These are a built-in part of the tool or exist as add-ons to the protocol that are almost always available. Therefore, using SSH for file transfer and remote execution is basically a given with Linux, and there are even tools to support SSH clients available for virtually every major programming language and operating system.

WinRM in a Linux World

What comes out-of-the-box with Linux is less of a given with Windows. SSH, obviously, is not built into Windows. Over the years different protocols have attempted to achieve the same functionality, such as Secure Telnet and others; to date, however, none have really caught on. Starting with Windows Server 2003, a new tool called WinRM (Windows Remote Management) was introduced. WinRM is a SOAP-based protocol built on web services that, among other things, allows you to connect to a remote system and get a shell, essentially offering functionality similar to SSH. WinRM is currently the Windows-world alternative to SSH.

The Pros

The advantage of WinRM is that you can use a vanilla VM with nothing pre-configured on it, the only prerequisite being that the WinRM service is running. EC2, the largest cloud provider today, supports this out-of-the-box, so if you run a standard Amazon machine image (AMI) for Windows, WinRM is enabled by default. This makes it possible to start working with a cloud quickly: all that needs to be done is bring up a standard Windows VM, and then it's possible to configure it remotely and start using it. This is very useful in cloud environments where you are sometimes unable to create a custom Windows image, or are limited to a very small number of images and want to limit your resource usage.

The Challenges

Where SSH has become the de facto protocol with Linux, WinRM is a far less well-known tool in the Windows world, although it does offer comparable features as far as security goes, as well as for connecting to a remote machine and executing commands on it. The standard tool for using WinRM is usually PowerShell, the new Windows shell that is intended to supersede the standard command prompt.

To date, though, there are still relatively few programming languages with built-in support for WinRM, making automation and remote execution of tasks over WinRM much more complex. To achieve these tasks, Cloudify employs PowerShell itself, as an external process that acts as a client library for accessing WinRM. The primary issue with this, however, is that the client side also needs to be running Windows, as PowerShell cannot run on Linux.

Another aspect where WinRM differs from SSH is that it does not really have built-in file transfer. WinRM has no direct equivalent of SSH's secure copy. That said, it is possible to implement file transfer through PowerShell scripts.

There are currently several open source initiatives looking to build a WinRM client for Linux, or for specific programming languages such as Java. However, these are at different levels of maturity, and none of them is fully featured yet. Hence, PowerShell remains the default tool for Cloudify, and it essentially provides for Windows the same level of functionality you would expect when running remote commands on a Linux machine.

WinRM & Security

Another interesting point to consider about WinRM is its support for encryption. WinRM supports three types of transfer protocols: HTTP, HTTPS, and encrypted HTTP.

With HTTP, inevitably, your wire protocol is unencrypted. It is only a good idea to use HTTP inside your own data center, and only if you are completely convinced that no one can monitor anything going over the wire.

HTTPS is commonly used instead of HTTP; however, with WinRM there's a chicken-and-egg issue. If you want to work with HTTPS you are required to set up an SSL certificate on the remote machine. The challenge is that a vanilla Windows VM will not have the certificate installed, so the insertion of that certificate needs to be automated, and this often cannot be done, as WinRM is not running.

Encrypted HTTP, which is also the default in EC2, basically uses your login credentials as your encryption key, and it works. From a security perspective this is the recommended secure transfer protocol to use. It is worth noting that most attempts to create a WinRM client library tend to encounter problems around the encrypted HTTP protocol, as implementing Microsoft's encrypted HTTP system, CredSSP, is challenging. However, there are various projects working on achieving this, so it will hopefully be solved in the near future.

Where Cloudify Comes Into the Mix

WinRM comes into play with Cloudify during the cloud bootstrapping process. By using WinRM, Cloudify is able to connect remotely to a vanilla VM provided by the cloud and set up the Cloudify manager or agent to run on the machine. In addition to traditional cloud environments, WinRM also works in non-cloud and non-virtualized environments, such as a standard data center with multiple Windows servers running. All that needs to be done is to provide Cloudify with the credentials, and it will use WinRM to connect and set up the machine remotely. Since WinRM is pre-packaged with Windows, there is no need to install anything. The only requirement, as mentioned above, is to have the WinRM service running, as not all Windows images will have this service running.

Conclusion

In short, WinRM is the Windows-world alternative to SSH that allows you to log in remotely and securely and execute commands on Windows machines. From a cloud automation perspective, it provides virtually all the necessary functionality, and thus it is recommended to have WinRM running in your Windows environment.
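As a small illustration of the transport choice discussed above, the sketch below builds the URL a client would connect to. It assumes the default listener ports used by WinRM 2.0 and later (5985 for HTTP, 5986 for HTTPS) and the /wsman path; the class name, method name and host are hypothetical, and the code only builds a string, it does not open a connection or implement the protocol.

```java
/** Hypothetical helper that builds a WinRM endpoint URL under default-port assumptions. */
class WinRmEndpointSketch {

    /** Returns the WS-Management URL for a host: HTTPS on 5986, plain HTTP on 5985. */
    static String endpoint(String host, boolean useHttps) {
        String scheme = useHttps ? "https" : "http";
        int port = useHttps ? 5986 : 5985; // assumed defaults; a listener can be configured differently
        return scheme + "://" + host + ":" + port + "/wsman";
    }
}
```

For example, endpoint("10.0.0.5", false) gives http://10.0.0.5:5985/wsman, which is the address a WinRM client library or PowerShell session would target on an unmodified listener.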

March 19, 2014

by Sharone Zitzman

· 26,004 Views

Time Series Feature Design: The Consensus has dRafted a Decision

So, after reaching the conclusion that replication is going to be hard, I went back to the office, discussed those challenges, and was in general pretty annoyed by it. Then Michael made a really interesting suggestion: why not put it on Raft? And once he explained what he meant, I really couldn’t hold my excitement. We now have a major feature for 4.0. But before I get excited about that (we’ll only be able to actually start working on that in a few months, anyway), let us talk about what the actual suggestion was.

Raft is a consensus algorithm. It allows a distributed set of computers to arrive at a mutually agreed upon set of sequential log records. Hm… I wonder where else we can find sequential log records, and yes, I am looking at you, Voron.Journal. The basic idea is that we can take the concept of log shipping, but instead of having a single master/slave relationship, we put Raft in the middle. When committing a transaction, we’ll hold off committing it until we have a Raft consensus that it should be committed.

The advantage here is that we won’t be constrained any longer by the master/slave issue. If a server is down, we can still process requests (we may need to elect a new cluster leader, but that is about it). That means that from an architectural standpoint, we’ll have the ability to process write requests as long as a quorum (N/2+1) of the servers is available. That is a pretty standard requirement for distributed databases, so that is perfectly fine.

That is a pretty awesome thing to have, to be honest, and more importantly, this is happening at the low level storage layer. That means that we can apply this behavior not just to a single database solution, but to many database solutions. I’m pretty excited about this.
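The quorum rule mentioned above (N/2+1) is easy to get wrong by one, so here is a tiny sketch of the arithmetic. The class and method names are made up for illustration, and this is only the counting rule, not a Raft implementation.

```java
/** Hypothetical helper showing the majority-quorum arithmetic (N/2+1). */
class QuorumSketch {

    /** Smallest number of servers that forms a majority of the cluster. */
    static int quorumSize(int clusterSize) {
        if (clusterSize <= 0) {
            throw new IllegalArgumentException("cluster must have at least one server");
        }
        return clusterSize / 2 + 1; // integer division, so 4 -> 3 and 5 -> 3
    }

    /** A write can be committed once a majority has acknowledged it. */
    static boolean canCommit(int clusterSize, int acknowledgements) {
        return acknowledgements >= quorumSize(clusterSize);
    }
}
```

With five servers the quorum is three, so the cluster keeps accepting writes with two servers down; with four servers the quorum is also three, so it tolerates only one failure.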

March 19, 2014

by Oren Eini

· 2,167 Views

Change Font Terminal Tool Window in IntelliJ IDEA

IntelliJ IDEA 13 added the Terminal tool window to the IDE. We can open a terminal window with Tools | Open Terminal.... To change the font of the terminal we must open the preferences and select IDE Settings | Editor | Colors & Fonts | Console Font. Here we can choose a font and change the font size:

March 18, 2014

by Hubert Klein Ikkink

· 36,164 Views · 1 Like

Shrink Your Time Machine Backups and Free Disk Space

Time Machine is a backup and restore tool from Apple which is very well integrated into OS X. In my personal opinion Time Machine is not yet awesome.

March 18, 2014

by Enrico Maria Crisostomo

· 162,247 Views · 1 Like

ActiveMQ - Network of Brokers Explained

Objective

This 7 part blog series shows how to create a network of ActiveMQ brokers in order to achieve high availability and scalability.

Why network of brokers?

The ActiveMQ message broker is a core component of the messaging infrastructure in an enterprise. It needs to be highly available and dynamically scalable to facilitate communication between dynamic, heterogeneous, distributed applications which have varying capacity needs.

Scaling enterprise applications on commodity hardware is all the rage nowadays. ActiveMQ caters to that very well by being able to create a network of brokers to share the load.

Many times applications running across geographically distributed data centers need to coordinate messages. Running message producers and consumers across geographic regions/data centers can be architected better using a network of brokers.

ActiveMQ uses transport connectors to communicate with message producers and consumers. However, in order to facilitate broker to broker communication, ActiveMQ uses network connectors.

A network connector is a bridge between two brokers which allows on-demand message forwarding. In other words, if Broker B1 initiates a network connector to Broker B2, then the messages on a channel (queue/topic) on B1 get forwarded to B2 if there is at least one consumer on B2 for the same channel. If the network connector is configured to be duplex, messages also get forwarded from B2 to B1 on demand.

This is very interesting because it is now possible for brokers to communicate with each other dynamically.
In this 7 part blog series, we will look into the following topics to gain an understanding of this very powerful ActiveMQ feature:

Network Connector Basics - Part 1
Duplex network connectors - Part 2
Load balancing consumers on local/remote brokers - Part 3
Load-balance consumers/subscribers on remote brokers
    Queue: Load balance remote concurrent consumers - Part 4
    Topic: Load Balance Durable Subscriptions on Remote Brokers - Part 5
Store/Forward messages and consumer failover - Part 6
    How to prevent stuck messages
Virtual Destinations - Part 7

To give credit where it is due, the following resources have helped me in creating this blog post series.

Advanced Messaging with ActiveMQ by Dejan Bosanac [Slides 32-36]
Understanding ActiveMQ Broker Networks by Jakub Korab

Prerequisites

ActiveMQ 5.8.0 – To create broker instances
Apache Ant – To run the ActiveMQ sample producer and consumers for the demo

We will use multiple ActiveMQ broker instances on the same machine for ease of demonstration.

Network Connector Basics - Part 1

The following diagram shows how a network connector functions. It bridges two brokers and, if established by Broker-1 to Broker-2, is used to forward messages from Broker-1 to Broker-2 on demand. A network connector can be duplex, so messages can also be forwarded in the opposite direction, from Broker-2 to Broker-1, once there is a consumer on Broker-1 for a channel which exists in Broker-2. More on this in Part 2.

Setup network connector between broker-1 and broker-2

Create two broker instances, say broker-1 and broker-2:

    Ashwinis-MacBook-Pro:bin akuntamukkala$ pwd
    /Users/akuntamukkala/apache-activemq-5.8.0/bin
    Ashwinis-MacBook-Pro:bin akuntamukkala$ ./activemq-admin create ../bridge-demo/broker-1
    Ashwinis-MacBook-Pro:bin akuntamukkala$ ./activemq-admin create ../bridge-demo/broker-2

Since we will be running both brokers on the same machine, let's configure broker-2 such that there are no port conflicts.

Edit /Users/akuntamukkala/apache-activemq-5.8.0/bridge-demo/broker-2/conf/activemq.xml
    Change the transport connector port from 61616 to 61626
    Change the AMQP port from 5672 to 6672 (we won't be using it for this blog)
Edit /Users/akuntamukkala/apache-activemq-5.8.0/bridge-demo/broker-2/conf/jetty.xml
    Change the web console port from 8161 to 9161

Configure Network Connector from broker-1 to broker-2

Add the following XML snippet to /Users/akuntamukkala/apache-activemq-5.8.0/bridge-demo/broker-1/conf/activemq.xml

The above XML snippet configures two network connectors, "T:broker1->broker2" (only topics, as queues are excluded) and "Q:broker1->broker2" (only queues, as topics are excluded). This allows for a nice separation between the network connectors used for topics and queues. The name can be arbitrary, although I prefer the form [type]:[source broker]->[destination broker]. The URI attribute specifies how to connect to broker-2.

Start broker-2

    Ashwinis-MacBook-Pro:bin akuntamukkala$ pwd
    /Users/akuntamukkala/apache-activemq-5.8.0/bridge-demo/broker-2/bin
    Ashwinis-MacBook-Pro:bin akuntamukkala$ ./broker-2 console

Start broker-1

    Ashwinis-MacBook-Pro:bin akuntamukkala$ pwd
    /Users/akuntamukkala/apache-activemq-5.8.0/bridge-demo/broker-1/bin
    Ashwinis-MacBook-Pro:bin akuntamukkala$ ./broker-1 console

Logs on broker-1 show 2 network connectors being established with broker-2:

    INFO | Establishing network connection from vm://broker-1?async=false&network=true to tcp://localhost:61626
    INFO | Connector vm://broker-1 Started
    INFO | Establishing network connection from vm://broker-1?async=false&network=true to tcp://localhost:61626
    INFO | Network connection between vm://broker-1#24 and tcp://localhost/127.0.0.1:61626@52132(broker-2) has been established.
    INFO | Network connection between vm://broker-1#26 and tcp://localhost/127.0.0.1:61626@52133(broker-2) has been established.

The web console on broker-1 @ http://localhost:8161/admin/connections.jsp shows the two network connectors established to broker-2.

The same page on broker-2 does not show any network connectors, since no network connectors were initiated by broker-2.

Let's see this in action

Let's produce 100 persistent messages on a queue called "foo.bar" on broker-1.

    Ashwinis-MacBook-Pro:example akuntamukkala$ pwd
    /Users/akuntamukkala/apache-activemq-5.8.0/example
    Ashwinis-MacBook-Pro:example akuntamukkala$ ant producer -Durl=tcp://localhost:61616 -Dtopic=false -Ddurable=true -Dsubject=foo.bar -Dmax=100

The broker-1 web console shows that 100 messages have been enqueued in queue "foo.bar": http://localhost:8161/admin/queues.jsp

Let's start a consumer on a queue called "foo.bar" on broker-2. The important thing to note here is that the destination name "foo.bar" should match exactly.

    Ashwinis-MacBook-Pro:example akuntamukkala$ ant consumer -Durl=tcp://localhost:61626 -Dtopic=false -Dsubject=foo.bar

We find that all 100 messages from broker-1's foo.bar queue get forwarded to the foo.bar queue consumer on broker-2.

broker-1 admin console at http://localhost:8161/admin/queues.jsp

The broker-2 admin console @ http://localhost:9161/admin/queues.jsp shows that the consumer we started has consumed all 100 messages, which were forwarded on demand from broker-1.

broker-2 consumer details on foo.bar queue

The broker-1 admin console shows that all 100 messages have been dequeued [forwarded to broker-2 via the network connector].

The broker-1 consumer details on the "foo.bar" queue show that the consumer is created on demand: [name of connector]_[destination broker]_inbound_[source broker]

Thus we have seen the basics of the network connector in ActiveMQ.

As always, please feel free to comment about anything that can be improved. Your inputs are welcome!

Stay tuned for Part 2.
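The on-demand forwarding rule seen in the demo can be modelled in a few lines of plain Java. This is only a toy model of the rule "forward when the remote broker has a consumer for the same queue"; it does not use the ActiveMQ API, and the class, field and method names are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model (hypothetical) of one queue bridged from a source broker to a remote broker. */
class DemandForwardingSketch {
    final Deque<String> pendingOnSource = new ArrayDeque<>(); // messages still held by broker-1
    int remoteConsumers = 0;                                   // consumers on broker-2 for the same queue
    int forwarded = 0;                                         // messages sent across the bridge

    /** A producer enqueues a message on the source broker. */
    void produce(String message) {
        pendingOnSource.add(message);
        forwardIfDemand();
    }

    /** A consumer subscribes to the same queue on the remote broker. */
    void addRemoteConsumer() {
        remoteConsumers++;
        forwardIfDemand();
    }

    /** Messages cross the bridge only while the remote side has demand. */
    private void forwardIfDemand() {
        while (remoteConsumers > 0 && !pendingOnSource.isEmpty()) {
            pendingOnSource.poll();
            forwarded++;
        }
    }
}
```

Producing 100 messages with no remote consumer leaves all 100 pending on the source; as soon as addRemoteConsumer() is called, all 100 are forwarded, which mirrors the enqueue and dequeue counts seen in the two web consoles.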

March 12, 2014

by Ashwini Kuntamukkala

· 40,115 Views · 2 Likes

Exporting Spring Data JPA Repositories as REST Services using Spring Data REST

Spring Data provides various modules to work with various types of data sources, such as RDBMS and NoSQL stores, in a unified way. In my previous article, SpringMVC4 + Spring Data JPA + SpringSecurity configuration using JavaConfig, I explained how to configure Spring Data JPA using JavaConfig. In this post let us see how we can use Spring Data JPA repositories and export JPA entities as REST endpoints using Spring Data REST.

First let us configure the spring-data-jpa and spring-data-rest-webmvc dependencies in our pom.xml.

    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-jpa</artifactId>
        <version>1.5.0.RELEASE</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-rest-webmvc</artifactId>
        <version>2.0.0.RELEASE</version>
    </dependency>

Make sure you have the latest released versions configured correctly, otherwise you will encounter the following error:

    java.lang.ClassNotFoundException: org.springframework.data.mapping.SimplePropertyHandler

Create JPA entities.

    @Entity
    @Table(name = "USERS")
    public class User implements Serializable {
        private static final long serialVersionUID = 1L;

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        @Column(name = "user_id")
        private Integer id;

        @Column(name = "username", nullable = false, unique = true, length = 50)
        private String userName;

        @Column(name = "password", nullable = false, length = 50)
        private String password;

        @Column(name = "firstname", nullable = false, length = 50)
        private String firstName;

        @Column(name = "lastname", length = 50)
        private String lastName;

        @Column(name = "email", nullable = false, unique = true, length = 50)
        private String email;

        @Temporal(TemporalType.DATE)
        private Date dob;

        private boolean enabled = true;

        @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL)
        @JoinColumn(name = "user_id")
        private Set<Role> roles = new HashSet<>();

        @OneToMany(mappedBy = "user")
        private List<Contact> contacts = new ArrayList<>();

        //setters and getters
    }

    @Entity
    @Table(name = "ROLES")
    public class Role implements Serializable {
        private static final long serialVersionUID = 1L;

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        @Column(name = "role_id")
        private Integer id;

        @Column(name = "role_name", nullable = false)
        private String roleName;

        //setters and getters
    }

    @Entity
    @Table(name = "CONTACTS")
    public class Contact implements Serializable {
        private static final long serialVersionUID = 1L;

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        @Column(name = "contact_id")
        private Integer id;

        @Column(name = "firstname", nullable = false, length = 50)
        private String firstName;

        @Column(name = "lastname", length = 50)
        private String lastName;

        @Column(name = "email", nullable = false, unique = true, length = 50)
        private String email;

        @Temporal(TemporalType.DATE)
        private Date dob;

        @ManyToOne
        @JoinColumn(name = "user_id")
        private User user;

        //setters and getters
    }

Configure DispatcherServlet using AbstractAnnotationConfigDispatcherServletInitializer. Observe that we have added RepositoryRestMvcConfiguration.class to the getServletConfigClasses() method. RepositoryRestMvcConfiguration is the one which does the heavy lifting of looking for Spring Data repositories and exporting them as REST endpoints.

    package com.sivalabs.springdatarest.web.config;

    import javax.servlet.Filter;
    import org.springframework.data.rest.webmvc.config.RepositoryRestMvcConfiguration;
    import org.springframework.orm.jpa.support.OpenEntityManagerInViewFilter;
    import org.springframework.web.servlet.support.AbstractAnnotationConfigDispatcherServletInitializer;
    import com.sivalabs.springdatarest.config.AppConfig;

    public class SpringWebAppInitializer extends AbstractAnnotationConfigDispatcherServletInitializer {

        @Override
        protected Class<?>[] getRootConfigClasses() {
            return new Class<?>[] { AppConfig.class };
        }

        @Override
        protected Class<?>[] getServletConfigClasses() {
            return new Class<?>[] { WebMvcConfig.class, RepositoryRestMvcConfiguration.class };
        }

        @Override
        protected String[] getServletMappings() {
            return new String[] { "/rest/*" };
        }

        @Override
        protected Filter[] getServletFilters() {
            return new Filter[] { new OpenEntityManagerInViewFilter() };
        }
    }

Create Spring Data JPA repositories for the JPA entities.

    public interface UserRepository extends JpaRepository<User, Integer> {
    }

    public interface RoleRepository extends JpaRepository<Role, Integer> {
    }

    public interface ContactRepository extends JpaRepository<Contact, Integer> {
    }

That's it. Spring Data REST will take care of the rest. You can use Spring Rest Shell (https://github.com/spring-projects/rest-shell) or Chrome's Postman add-on to test the exported REST services.

    D:\rest-shell-1.2.1.RELEASE\bin>rest-shell
    http://localhost:8080:>

Now we can change the baseUri using the baseUri command as follows:

    http://localhost:8080:>baseUri http://localhost:8080/spring-data-rest-demo/rest/
    http://localhost:8080/spring-data-rest-demo/rest/>

    http://localhost:8080/spring-data-rest-demo/rest/>list
    rel         href
    ======================================================================================
    users       http://localhost:8080/spring-data-rest-demo/rest/users{?page,size,sort}
    roles       http://localhost:8080/spring-data-rest-demo/rest/roles{?page,size,sort}
    contacts    http://localhost:8080/spring-data-rest-demo/rest/contacts{?page,size,sort}

Note: There seems to be an issue with rest-shell when the DispatcherServlet URL is mapped to "/": the list command then responds with "No resources found".

    http://localhost:8080/spring-data-rest-demo/rest/>get users/
    {
      "_links": {
        "self": {
          "href": "http://localhost:8080/spring-data-rest-demo/rest/users/{?page,size,sort}",
          "templated": true
        },
        "search": {
          "href": "http://localhost:8080/spring-data-rest-demo/rest/users/search"
        }
      },
      "_embedded": {
        "users": [
          {
            "userName": "admin",
            "password": "admin",
            "firstName": "Administrator",
            "lastName": null,
            "email": "[email protected]",
            "dob": null,
            "enabled": true,
            "_links": {
              "self": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/1" },
              "roles": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/1/roles" },
              "contacts": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/1/contacts" }
            }
          },
          {
            "userName": "siva",
            "password": "siva",
            "firstName": "Siva",
            "lastName": null,
            "email": "[email protected]",
            "dob": null,
            "enabled": true,
            "_links": {
              "self": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/2" },
              "roles": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/2/roles" },
              "contacts": { "href": "http://localhost:8080/spring-data-rest-demo/rest/users/2/contacts" }
            }
          }
        ]
      },
      "page": {
        "size": 20,
        "totalElements": 2,
        "totalPages": 1,
        "number": 0
      }
    }

You can find the source code at https://github.com/sivaprasadreddy/sivalabs-blog-samples-code/tree/master/spring-data-rest-demo

For more info on Spring Rest Shell: https://github.com/spring-projects/rest-shell

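The {?page,size,sort} suffix in the links above is a URI template for the paging parameters. As a rough sketch of how a client might fill it in, the helper below builds a concrete URL. It assumes the "property,direction" form for the sort value used by Spring Data's web support and a zero-based page index (the response above reports "number": 0); the class and method names are hypothetical and no HTTP request is made.

```java
/** Hypothetical helper that expands the {?page,size,sort} template by hand. */
class PagedUriSketch {

    /** Builds a collection URL with zero-based page, page size and a single sort property. */
    static String expand(String collectionUri, int page, int size, String sortProperty, String direction) {
        // Assumption: sort is passed as "property,direction", for example "userName,asc".
        return collectionUri + "?page=" + page + "&size=" + size + "&sort=" + sortProperty + "," + direction;
    }
}
```

For example, expanding the users collection for the first page of 20 entries sorted by userName gives http://localhost:8080/spring-data-rest-demo/rest/users?page=0&size=20&sort=userName,asc, which can be passed to rest-shell's get command or to Postman.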
March 7, 2014

by Siva Prasad Reddy Katamreddy

· 30,000 Views

Convert CSV Data to Avro Data

In one of my previous posts I explained how we can convert JSON data to Avro data and vice versa using the avro-tools command line. Today I was looking at the options for converting CSV data to Avro format; as of now there is no avro-tools option to accomplish this. We can either write our own Java program (a MapReduce program or a simple Java program), or we can use the various SerDes available with Hive to do this quickly and without writing any code :)

To convert CSV data to Avro data using Hive we need to follow the steps below:

1. Create a Hive table stored as textfile and specify your CSV delimiter.
2. Load the CSV file into that table using the "load data" command.
3. Create another Hive table using AvroSerDe.
4. Insert data from the first table into the new Avro Hive table using the "insert overwrite" command.

To demonstrate this I will use the data below (student.csv):

0,38,91
0,65,28
0,78,16
1,34,96
1,78,14
1,11,43

Now execute the queries below in Hive:

--1. Create a Hive table stored as textfile
USE test;
CREATE TABLE csv_table (
  student_id INT,
  subject_id INT,
  marks INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

--2. Load csv_table with student.csv data
LOAD DATA LOCAL INPATH "/path/to/student.csv" OVERWRITE INTO TABLE test.csv_table;

--3. Create another Hive table using AvroSerDe
CREATE TABLE avro_table
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
  'avro.schema.literal'='{
    "namespace": "com.rishav.avro",
    "name": "student_marks",
    "type": "record",
    "fields": [ { "name":"student_id","type":"int"}, { "name":"subject_id","type":"int"}, { "name":"marks","type":"int"}]
  }');

--4. Load avro_table with data from csv_table
INSERT OVERWRITE TABLE avro_table SELECT student_id, subject_id, marks FROM csv_table;

Now you can get the data in Avro format from the Hive warehouse folder. To dump this file to the local file system use the command below:

hadoop fs -cat /path/to/warehouse/test.db/avro_table/* > student.avro

If you want to get JSON data from this Avro file you can use the avro-tools command:

java -jar avro-tools-1.7.5.jar tojson student.avro > student.json

So we can easily convert CSV to Avro, and CSV to JSON as well, by writing just 4 HQLs.
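For readers curious about the "simple Java program" route mentioned at the start, here is a minimal sketch that uses only the JDK. It does not write Avro files; it only parses rows shaped like student.csv and emits one JSON object per row, using the field names from the schema literal. The class and method names are hypothetical, and a real converter would rely on the Avro library rather than hand-built strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Hive or Avro): turns student.csv rows into JSON lines.
class StudentCsvToJson {

    // Field names copied from the avro.schema.literal used for avro_table.
    private static final String[] FIELDS = {"student_id", "subject_id", "marks"};

    // Converts one CSV row such as "0,38,91" into a JSON object string.
    static String toJson(String csvRow) {
        String[] parts = csvRow.trim().split(",");
        if (parts.length != FIELDS.length) {
            throw new IllegalArgumentException("Expected " + FIELDS.length + " columns: " + csvRow);
        }
        StringBuilder json = new StringBuilder("{");
        for (int i = 0; i < FIELDS.length; i++) {
            if (i > 0) {
                json.append(',');
            }
            // Integer.parseInt rejects non-numeric values, mirroring the "int" type in the schema.
            int value = Integer.parseInt(parts[i].trim());
            json.append('"').append(FIELDS[i]).append("\":").append(value);
        }
        return json.append('}').toString();
    }

    // Converts every row, preserving input order.
    static List<String> convert(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            out.add(toJson(row));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : convert(Arrays.asList("0,38,91", "0,65,28", "1,11,43"))) {
            System.out.println(line);
        }
    }
}
```

The Hive approach stays the simpler option when a cluster is already available; a hand-written program like this mainly makes sense for small files or for validating rows before loading them.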

March 5, 2014

by Rishav Rohit

· 39,690 Views · 1 Like

When to Use MongoDB Rather than MySQL (or Other RDBMS): The Billing Example

NoSQL has been a hot topic for quite a long time (and by now it is more than just buzz). However, when should we really use it instead of an RDBMS?

March 3, 2014

by Moshe Kaplan

· 378,853 Views · 12 Likes

Python CSV Files: Reading and Writing

Learn to parse CSV (Comma Separated Values) files with Python examples using the csv module's reader function and DictReader class.

March 3, 2014

by Mike Driscoll

· 375,607 Views · 6 Likes

Jersey: Ignoring SSL certificate – javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException

Last week Alistair and I were working on an internal application and we needed to make a HTTPS request directly to an AWS machine using a certificate signed to a different host. We use jersey-client so our code looked something like this: Client client = Client.create(); client.resource("https://some-aws-host.compute-1.amazonaws.com").post(); // and so on When we ran this we predictably ran into trouble: com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No subject alternative DNS name matching some-aws-host.compute-1.amazonaws.com found. at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149) at com.sun.jersey.api.client.Client.handle(Client.java:648) at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670) at com.sun.jersey.api.client.WebResource.post(WebResource.java:241) at com.neotechnology.testlab.manager.bootstrap.ManagerAdmin.takeBackup(ManagerAdmin.java:33) at com.neotechnology.testlab.manager.bootstrap.ManagerAdminTest.foo(ManagerAdminTest.java:11) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.junit.runner.JUnitCore.run(JUnitCore.java:157) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74) at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:202) at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:65) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120) Caused by: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No subject alternative DNS name matching some-aws-host.compute-1.amazonaws.com found. 
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1884) at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:276) at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:270) at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1341) at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:153) at sun.security.ssl.Handshaker.processLoop(Handshaker.java:868) at sun.security.ssl.Handshaker.process_record(Handshaker.java:804) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1016) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323) at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:563) at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1300) at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468) at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338) at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:240) at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147) ... 31 more Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching some-aws-host.compute-1.amazonaws.com found. 
at sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:191) at sun.security.util.HostnameChecker.match(HostnameChecker.java:93) at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:347) at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:203) at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:126) at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1323) ... 45 more We figured that we needed to get our client to ignore the hostname mismatch in the certificate and came across a Stack Overflow thread which had some suggestions on how to do this. None of the suggestions worked on their own, but we ended up with a combination of a couple of them which did the trick: public Client hostIgnoringClient() { try { SSLContext sslcontext = SSLContext.getInstance( "TLS" ); sslcontext.init( null, null, null ); DefaultClientConfig config = new DefaultClientConfig(); Map<String, Object> properties = config.getProperties(); HTTPSProperties httpsProperties = new HTTPSProperties( new HostnameVerifier() { @Override public boolean verify( String s, SSLSession sslSession ) { return true; } }, sslcontext ); properties.put( HTTPSProperties.PROPERTY_HTTPS_PROPERTIES, httpsProperties ); config.getClasses().add( JacksonJsonProvider.class ); return Client.create( config ); } catch ( KeyManagementException | NoSuchAlgorithmException e ) { throw new RuntimeException( e ); } } You’re welcome, Future Mark.
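A verifier that returns true for every host switches hostname checking off entirely. A narrower variant is sketched below, on the assumption that only one known host presents the mismatched certificate: it accepts exactly that host and rejects everything else. The class name SingleHostVerifier is hypothetical; HostnameVerifier and SSLSession are the standard javax.net.ssl types used in the snippet above.

```java
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLSession;

// Hypothetical verifier: tolerates a hostname mismatch for one explicitly named host only.
class SingleHostVerifier implements HostnameVerifier {

    private final String allowedHost;

    SingleHostVerifier(String allowedHost) {
        if (allowedHost == null || allowedHost.isEmpty()) {
            throw new IllegalArgumentException("allowedHost must be set");
        }
        this.allowedHost = allowedHost;
    }

    @Override
    public boolean verify(String hostname, SSLSession session) {
        // Host names are case-insensitive; a null hostname never matches.
        return allowedHost.equalsIgnoreCase(hostname);
    }

    public static void main(String[] args) {
        HostnameVerifier verifier = new SingleHostVerifier("some-aws-host.compute-1.amazonaws.com");
        System.out.println(verifier.verify("some-aws-host.compute-1.amazonaws.com", null));
        System.out.println(verifier.verify("another-host.example.com", null));
    }
}
```

An instance of this class could be passed to HTTPSProperties in place of the anonymous verifier, which keeps the workaround limited to the single machine that needs it.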

March 2, 2014

by Mark Needham

· 43,054 Views · 8 Likes

How to "Backcast" a Time Series in R

Sometimes it is useful to “backcast” a time series, that is, to forecast in reverse time. Although there are no built-in R functions to do this, it is very easy to implement. Suppose x is our time series and we want to backcast for h periods. Here is some code that should work for most univariate time series. The example is non-seasonal, but the code will also work with seasonal data.

library(forecast)
x <- WWWusage
h <- 20
f <- frequency(x)
# Reverse time
revx <- ts(rev(x), frequency=f)
# Forecast
fc <- forecast(auto.arima(revx), h)
plot(fc)
# Reverse time again
fc$mean <- ts(rev(fc$mean),end=tsp(x)[1] - 1/f, frequency=f)
fc$upper <- fc$upper[h:1,]
fc$lower <- fc$lower[h:1,]
fc$x <- x
# Plot result
plot(fc, xlim=c(tsp(x)[1]-h/f, tsp(x)[2]))
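The reverse, forecast, reverse-again idea is independent of R. The sketch below shows the same three steps in plain Java, with one important substitution: instead of auto.arima it uses a simple drift forecast (the last value plus the average step), purely to keep the example self-contained. The class and method names are hypothetical and no prediction intervals are produced.

```java
import java.util.Arrays;

// Hypothetical sketch of backcasting: reverse the series, forecast forward, reverse the result.
class BackcastSketch {

    // Returns a reversed copy of the input array.
    static double[] reverse(double[] values) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[values.length - 1 - i];
        }
        return out;
    }

    // Drift forecast (a stand-in for auto.arima): last value plus k times the average step.
    static double[] driftForecast(double[] series, int h) {
        int n = series.length;
        if (n < 2) {
            throw new IllegalArgumentException("Need at least two observations");
        }
        double slope = (series[n - 1] - series[0]) / (n - 1);
        double[] forecast = new double[h];
        for (int k = 1; k <= h; k++) {
            forecast[k - 1] = series[n - 1] + k * slope;
        }
        return forecast;
    }

    // Backcast h periods; the result is in chronological order, ending just before series[0].
    static double[] backcast(double[] series, int h) {
        return reverse(driftForecast(reverse(series), h));
    }

    public static void main(String[] args) {
        double[] x = {10, 12, 14, 16};
        System.out.println(Arrays.toString(backcast(x, 2))); // prints [6.0, 8.0]
    }
}
```

The final reversal matters for the same reason as in the R code: the forecasts of the reversed series come out nearest-first, so they must be flipped to sit in time order before the original data.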

February 28, 2014

by Rob J Hyndman

· 5,736 Views

Hibernate Query by Example (QBE)

What is It Query by example is an alternative querying technique supported by the main JPA vendors but not by the JPA specification itself. QBE returns a result set depending on the properties that were set on an instance of the queried class. So if I create an Address entity and fill in the city field then the query will select all the Address entities having the same city field as the given Address entity. The typical use case of QBE is evaluating a search form where the user can fill in any search fields and gets the results based on the given search fields. In this case QBE can reduce code size significantly. When to Use · Using many fields of an entity in a query · User selects which fields of an Entity to use in a query · We are refactoring the entities frequently and don’t want to worry about breaking the queries that rely on them Limitations · QBE is not available in JPA 1.0 or 2.0 · Version properties, identifiers and associations are ignored · The query object should be annotated with @Entity Test Data I used the following entities to test the QBE feature of Hibernate: · Address (long id, String city, String street, String countryISO2Code, AddressType addressType) · AddressType (Integer type, String description) Imports The examples will refer to the following classes: import org.hibernate.Criteria; import org.hibernate.Session; import org.hibernate.criterion.Example; import org.hibernate.criterion.Restrictions; import org.junit.Test; import java.util.List; Utility Methods I also made two utility methods to present a list of the two entity types: private void listAddresses(List<Address> addresses) { for (Address address : addresses) { System.out.println(address.getId() + ", " + address.getCountryISO2Code() + ", " + address.getCity() + ", " + address.getStreet() + ", " + address.getAddressType().getType() + ", " + address.getAddressType().getDescription()); } } private void listAddressTypes(List<AddressType> addressTypes) { for (AddressType addressType : addressTypes) { 
System.out.println(addressType.getType() + ", " + addressType.getDescription()); } } Example 1: Equals This example code returns the Address entities matching the given CountryISO2Code and City. Method: @Test public void testEquals() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("CHICAGO"); Example addressExample = Example.create(address); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 75, US, CHICAGO, Los Angeles Way2, 6, Customer 170, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 63, US, CHICAGO, Main Avenue 1, 5, Bill to 37, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 36, US, CHICAGO, Jackson Blvd 33a, 4, Delivery Example 2: Id Limitation This example presents that id fields in the query object are ignored. Method: @Test public void testIdLimitation() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("CHICAGO"); address.setId(100); // setting id is ignored Example addressExample = Example.create(address); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 75, US, CHICAGO, Los Angeles Way2, 6, Customer 170, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 63, US, CHICAGO, Main Avenue 1, 5, Bill to 37, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 36, US, CHICAGO, Jackson Blvd 33a, 4, Delivery Example 3: Association Limitation Associations of the query object are ignored, too. 
Method: @Test public void testAssociationLimitation() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("CHICAGO"); AddressType addressType = new AddressType(); addressType.setType(5); address.setAddressType(addressType); // setting an association is ignored Example addressExample = Example.create(address); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 75, US, CHICAGO, Los Angeles Way2, 6, Customer 170, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 63, US, CHICAGO, Main Avenue 1, 5, Bill to 37, US, CHICAGO, Jackson Blvd 33a, 4, Delivery 36, US, CHICAGO, Jackson Blvd 33a, 4, Delivery Example 4: Like QBE supports like in the query object if we enable it with Example.enableLike(). Method: @Test public void testLike() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("AT%"); Example addressExample = Example.create(address).enableLike(); Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 83, US, ATLANTA, null, 6, Customer 184, US, ATLANTA, null, 1, Shipper 25, US, ATLANTA, null, 1, Shipper Example 5: ExcludeProperty We can exclude a property with Example.excludeProperty(String propertyName). 
Method: @Test public void testExcludeProperty() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); address.setCity("AT%"); Example addressExample = Example.create(address).enableLike() .excludeProperty("countryISO2Code"); // countryISO2Code is a property of Address Criteria criteria = session.createCriteria(Address.class).add(addressExample); listAddresses(criteria.list()); } Result: 154, GR, ATHENS, BETA ALPHA Street 5, 2, Consignee 83, US, ATLANTA, null, 6, Customer 25, US, ATLANTA, null, 1, Shipper 184, US, ATLANTA, null, 1, Shipper Example 6: IgnoreCase Case-insensitive search is supported by Example.ignoreCase(). Method: @Test public void testIgnoreCase() throws Exception { Session session = (Session) entityManager.getDelegate(); AddressType addressType = new AddressType(); addressType.setDescription("customer"); Example addressTypeExample = Example.create(addressType).ignoreCase(); Criteria criteria = session.createCriteria(AddressType.class) .add(addressTypeExample); listAddressTypes(criteria.list()); } Result: 6, Customer Example 7: ExcludeZeroes We can ignore 0 values of the query object by Example.excludeZeroes(). Method: @Test public void testExcludeZeroes() throws Exception { Session session = (Session) entityManager.getDelegate(); AddressType addressType = new AddressType(); addressType.setType(0); addressType.setDescription("Customer"); Example addressTypeExample = Example.create(addressType) .excludeZeroes(); Criteria criteria = session.createCriteria(AddressType.class) .add(addressTypeExample); listAddressTypes(criteria.list()); } Result: 6, Customer Example 8: Combining with Criteria QBE can be combined with criteria query. In this example we add further restriction to the query object using criteria query. 
Method: @Test public void testCombiningWithCriteria() throws Exception { Session session = (Session) entityManager.getDelegate(); AddressType addressType = new AddressType(); addressType.setDescription("Customer"); Example addressTypeExample = Example.create(addressType); Criteria criteria = session .createCriteria(AddressType.class).add(addressTypeExample) .add(Restrictions.eq("type", 6)); listAddressTypes(criteria.list()); } Result: 6, Customer Example 9: Association With criteria query we can filter both sides of an association, using two query objects. Method: @Test public void testAssociation() throws Exception { Session session = (Session) entityManager.getDelegate(); Address address = new Address(); address.setCountryISO2Code("US"); AddressType addressType = new AddressType(); addressType.setType(6); Example addressExample = Example.create(address); Example addressTypeExample = Example.create(addressType); Criteria criteria = session.createCriteria(Address.class).add(addressExample) .createCriteria("addressType").add(addressTypeExample); // addressType is a property of Address listAddresses(criteria.list()); } Result: 84, US, BOSTON, null, 6, Customer 83, US, ATLANTA, null, 6, Customer 82, US, SAN FRANCISCO, null, 6, Customer 75, US, CHICAGO, Los Angeles Way2, 6, Customer EclipseLink EclipseLink QBE uses QueryByExamplePolicy, ReadObjectQuery and JpaHelper: QueryByExamplePolicy qbePolicy = new QueryByExamplePolicy(); qbePolicy.excludeDefaultPrimitiveValues(); Address address = new Address(); address.setCity("CHICAGO"); ReadObjectQuery roq = new ReadObjectQuery(address, qbePolicy); Query query = JpaHelper.createQuery(roq, entityManager); OpenJPA OpenJPA uses OpenJPAQueryBuilder: CriteriaQuery cq = openJPAQueryBuilder.createQuery(Address.class); Address address = new Address(); address.setCity("CHICAGO"); cq.where(openJPAQueryBuilder.qbe(cq.from(Address.class), address)); References Hibernate: · Srinivas Guruzu and Gary Mak: Hibernate Recipes: A Problem-Solution Approach 
(Apress) · http://docs.jboss.org/hibernate/core/3.3/reference/en/html/querycriteria.html#querycriteria-examples · http://www.java2s.com/Code/Java/Hibernate/CriteriaQBEQueryByExampleCriteria.htm · http://www.dzone.com/snippets/hibernate-query-example · http://gal-levinsky.blogspot.de/2012/01/qbe-pattern.html Hibernate associations: · http://stackoverflow.com/questions/9309884/query-by-example-on-associations · http://stackoverflow.com/questions/8236596/hibernate-query-by-example-equivalent-of-association-criteria-query JPA: · http://stackoverflow.com/questions/2880209/jpa-findbyexample EclipseLink: · http://www.coderanch.com/t/486528/ORM/databases/findByExample-JPA-book OpenJPA: · http://www.ibm.com/developerworks/java/library/j-typesafejpa/#N10C18
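To make the core idea of QBE concrete (properties left unset on the example object are ignored, properties that are set must match), here is a small in-memory sketch based on reflection. It is not how Hibernate works internally, since Hibernate translates the example into SQL criteria; the QbeSketch class, its nested Address type and the sample rows are all hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory illustration of query by example; not Hibernate's implementation.
class QbeSketch {

    // Simplified stand-in for the Address entity used in the article.
    static class Address {
        String countryISO2Code;
        String city;
        String street;

        Address(String countryISO2Code, String city, String street) {
            this.countryISO2Code = countryISO2Code;
            this.city = city;
            this.street = street;
        }
    }

    // Returns the candidates that equal the example on every non-null field of the example.
    static <T> List<T> findByExample(T example, List<T> candidates) {
        List<T> result = new ArrayList<>();
        for (T candidate : candidates) {
            if (matches(example, candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    private static boolean matches(Object example, Object candidate) {
        for (Field field : example.getClass().getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue; // only instance state takes part in the comparison
            }
            try {
                field.setAccessible(true);
                Object wanted = field.get(example);
                // A null property on the example means "do not filter on this field".
                if (wanted != null && !wanted.equals(field.get(candidate))) {
                    return false;
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Address> all = Arrays.asList(
                new Address("US", "CHICAGO", "Main Avenue 1"),
                new Address("US", "ATLANTA", null),
                new Address("GR", "ATHENS", "BETA ALPHA Street 5"));
        System.out.println(findByExample(new Address("US", "CHICAGO", null), all).size()); // prints 1
        System.out.println(findByExample(new Address("US", null, null), all).size());      // prints 2
    }
}
```

Features such as enableLike(), ignoreCase() and excludeZeroes() would correspond to swapping the equals check for a different comparison or skipping more values, which is why they are exposed as options on the Example object.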

February 27, 2014

by Donat Szilagyi

· 62,503 Views · 3 Likes

A Deeper Look into the Java 8 Date and Time API

Within this post we will have a deeper look into the new Date/Time API we get with Java 8 (JSR 310). Please note that this post is mainly driven by code examples that show the new API functionality. I think the examples are self-explanatory so I did not spend much time writing text around them :-) Let's get started!

Working with Date and Time Objects

All classes of the Java 8 Date/Time API are located within the java.time package. The first class we want to look at is java.time.LocalDate. A LocalDate represents a year-month-day date without time. We start with creating new LocalDate instances:

// the current date
LocalDate currentDate = LocalDate.now();
// 2014-02-10
LocalDate tenthFeb2014 = LocalDate.of(2014, Month.FEBRUARY, 10);
// month values start at 1 (2014-08-01)
LocalDate firstAug2014 = LocalDate.of(2014, 8, 1);
// the 65th day of 2010 (2010-03-06)
LocalDate sixtyFifthDayOf2010 = LocalDate.ofYearDay(2010, 65);

LocalTime and LocalDateTime are the next classes we look at. Both work similarly to LocalDate. A LocalTime works with time (without dates) while LocalDateTime combines date and time in one class:

LocalTime currentTime = LocalTime.now(); // current time
LocalTime midday = LocalTime.of(12, 0); // 12:00
LocalTime afterMidday = LocalTime.of(13, 30, 15); // 13:30:15
// 12345th second of day (03:25:45)
LocalTime fromSecondsOfDay = LocalTime.ofSecondOfDay(12345);
// dates with times, e.g. 2014-02-18 19:08:37.950
LocalDateTime currentDateTime = LocalDateTime.now();
// 2014-10-02 12:30
LocalDateTime secondOct2014 = LocalDateTime.of(2014, 10, 2, 12, 30);
// 2014-12-24 12:00
LocalDateTime christmas2014 = LocalDateTime.of(2014, Month.DECEMBER, 24, 12, 0);

By default LocalDate/Time classes will use the system clock in the default time zone.
We can change this by providing a time zone or an alternative Clock implementation:

// current (local) time in Los Angeles
LocalTime currentTimeInLosAngeles = LocalTime.now(ZoneId.of("America/Los_Angeles"));
// current time in UTC time zone
LocalTime nowInUtc = LocalTime.now(Clock.systemUTC());

From LocalDate/Time objects we can get all sorts of useful information we might need. Some examples:

LocalDate date = LocalDate.of(2014, 2, 15); // 2014-02-15
boolean isBefore = LocalDate.now().isBefore(date); // false
// information about the month
Month february = date.getMonth(); // FEBRUARY
int februaryIntValue = february.getValue(); // 2
int minLength = february.minLength(); // 28
int maxLength = february.maxLength(); // 29
Month firstMonthOfQuarter = february.firstMonthOfQuarter(); // JANUARY
// information about the year
int year = date.getYear(); // 2014
int dayOfYear = date.getDayOfYear(); // 46
int lengthOfYear = date.lengthOfYear(); // 365
boolean isLeapYear = date.isLeapYear(); // false
DayOfWeek dayOfWeek = date.getDayOfWeek();
int dayOfWeekIntValue = dayOfWeek.getValue(); // 6
String dayOfWeekName = dayOfWeek.name(); // SATURDAY
int dayOfMonth = date.getDayOfMonth(); // 15
LocalDateTime startOfDay = date.atStartOfDay(); // 2014-02-15 00:00
// time information
LocalTime time = LocalTime.of(15, 30); // 15:30:00
int hour = time.getHour(); // 15
int second = time.getSecond(); // 0
int minute = time.getMinute(); // 30
int secondOfDay = time.toSecondOfDay(); // 55800

Some information can be obtained without providing a specific date. For example, we can use the Year class if we need information about a specific year:

Year currentYear = Year.now();
Year twoThousand = Year.of(2000);
boolean isLeap = currentYear.isLeap(); // false (in 2014)
int length = currentYear.length(); // 365 (in 2014)
// sixty-fourth day of 2014 (2014-03-05)
LocalDate date = Year.of(2014).atDay(64);

We can use the plus and minus methods to add or subtract specific amounts of time.
Note that these methods always return a new instance (Java 8 date/time classes are immutable).

LocalDate tomorrow = LocalDate.now().plusDays(1);
// 5 hours and 30 minutes ago
LocalDateTime dateTime = LocalDateTime.now().minusHours(5).minusMinutes(30);

TemporalAdjusters are another nice way to manipulate dates. TemporalAdjuster is a single-method interface that is used to separate the process of adjustment from actual date/time objects. A set of common TemporalAdjusters can be accessed using static methods of the TemporalAdjusters class.

LocalDate date = LocalDate.of(2014, Month.FEBRUARY, 25); // 2014-02-25
// first day of february 2014 (2014-02-01)
LocalDate firstDayOfMonth = date.with(TemporalAdjusters.firstDayOfMonth());
// last day of february 2014 (2014-02-28)
LocalDate lastDayOfMonth = date.with(TemporalAdjusters.lastDayOfMonth());

Static imports make this more fluent to read:

import static java.time.temporal.TemporalAdjusters.*;
...
// last day of 2014 (2014-12-31)
LocalDate lastDayOfYear = date.with(lastDayOfYear());
// first day of next month (2014-03-01)
LocalDate firstDayOfNextMonth = date.with(firstDayOfNextMonth());
// next sunday (2014-03-02)
LocalDate nextSunday = date.with(next(DayOfWeek.SUNDAY));

Time Zones

Working with time zones is another big topic that is simplified by the new API. The LocalDate/Time classes we have seen so far do not contain information about a time zone.
If we want to work with a date/time in a certain time zone we can use ZonedDateTime or OffsetDateTime:

ZoneId losAngeles = ZoneId.of("America/Los_Angeles");
ZoneId berlin = ZoneId.of("Europe/Berlin");
// 2014-02-20 12:00
LocalDateTime dateTime = LocalDateTime.of(2014, 2, 20, 12, 0);
// 2014-02-20 12:00, Europe/Berlin (+01:00)
ZonedDateTime berlinDateTime = ZonedDateTime.of(dateTime, berlin);
// 2014-02-20 03:00, America/Los_Angeles (-08:00)
ZonedDateTime losAngelesDateTime = berlinDateTime.withZoneSameInstant(losAngeles);
int offsetInSeconds = losAngelesDateTime.getOffset().getTotalSeconds(); // -28800
// a collection of all available zones
Set<String> allZoneIds = ZoneId.getAvailableZoneIds();
// using offsets
LocalDateTime date = LocalDateTime.of(2013, Month.JULY, 20, 3, 30);
ZoneOffset offset = ZoneOffset.of("+05:00");
// 2013-07-20 03:30 +05:00
OffsetDateTime plusFive = OffsetDateTime.of(date, offset);
// 2013-07-19 20:30 -02:00
OffsetDateTime minusTwo = plusFive.withOffsetSameInstant(ZoneOffset.ofHours(-2));

Timestamps

Classes like LocalDate and ZonedDateTime provide a human view on time. However, often we need to work with time viewed from a machine perspective. For this we can use the Instant class, which represents timestamps. An Instant counts the time beginning from the first second of January 1, 1970 (1970-01-01 00:00:00 UTC), also called the epoch. Instant values can be negative if they occurred before the epoch. They follow ISO 8601, the standard for representing date and time.

// current time
Instant now = Instant.now();
// from unix timestamp, 2010-01-01 12:00:00
Instant fromUnixTimestamp = Instant.ofEpochSecond(1262347200);
// same time in millis
Instant fromEpochMilli = Instant.ofEpochMilli(1262347200000L);
// parsing from ISO 8601
Instant fromIso8601 = Instant.parse("2010-01-01T12:00:00Z");
// toString() returns ISO 8601 format, e.g. 2014-02-15T01:02:03Z
String toIso8601 = now.toString();
// as unix timestamp
long toUnixTimestamp = now.getEpochSecond();
// in millis
long toEpochMillis = now.toEpochMilli();
// plus/minus methods are available too
Instant nowPlusTenSeconds = now.plusSeconds(10);

Periods and Durations

Period and Duration are two other important classes. As the names suggest, they represent a quantity or amount of time. A Period uses date-based values (years, months, days) while a Duration uses seconds or nanoseconds to define an amount of time. Duration is most suitable when working with Instants and machine time. Periods and Durations can contain negative values if the end point occurs before the starting point.

// periods
LocalDate firstDate = LocalDate.of(2010, 5, 17); // 2010-05-17
LocalDate secondDate = LocalDate.of(2015, 3, 7); // 2015-03-07
Period period = Period.between(firstDate, secondDate);
int days = period.getDays(); // 18
int months = period.getMonths(); // 9
int years = period.getYears(); // 4
boolean isNegative = period.isNegative(); // false
Period twoMonthsAndFiveDays = Period.ofMonths(2).plusDays(5);
LocalDate sixthOfJanuary = LocalDate.of(2014, 1, 6);
// add two months and five days to 2014-01-06, result is 2014-03-11
LocalDate eleventhOfMarch = sixthOfJanuary.plus(twoMonthsAndFiveDays);
// durations
Instant firstInstant = Instant.ofEpochSecond(1294881180); // 2011-01-13 01:13
Instant secondInstant = Instant.ofEpochSecond(1294708260); // 2011-01-11 01:11
Duration between = Duration.between(firstInstant, secondInstant);
// negative because firstInstant is after secondInstant (-172920)
long seconds = between.getSeconds();
// get absolute result in minutes (2882)
long absoluteResult = between.abs().toMinutes();
// two hours in seconds (7200)
long twoHoursInSeconds = Duration.ofHours(2).getSeconds();

Formatting and Parsing

Formatting and parsing is another big topic when working with dates and times.
In Java 8 this can be accomplished by using the format() and parse() methods:

// 2014-04-01 10:45
LocalDateTime dateTime = LocalDateTime.of(2014, Month.APRIL, 1, 10, 45);
// format as basic ISO date format (20140401)
String asBasicIsoDate = dateTime.format(DateTimeFormatter.BASIC_ISO_DATE);
// format as ISO week date (2014-W14-2)
String asIsoWeekDate = dateTime.format(DateTimeFormatter.ISO_WEEK_DATE);
// format ISO date time (2014-04-01T10:45:00)
String asIsoDateTime = dateTime.format(DateTimeFormatter.ISO_DATE_TIME);
// using a custom pattern (01/04/2014)
String asCustomPattern = dateTime.format(DateTimeFormatter.ofPattern("dd/MM/yyyy"));
// french date formatting (1. avril 2014)
String frenchDate = dateTime.format(DateTimeFormatter.ofPattern("d. MMMM yyyy", new Locale("fr")));
// using short german date/time formatting (01.04.14 10:45)
DateTimeFormatter formatter = DateTimeFormatter.ofLocalizedDateTime(FormatStyle.SHORT)
    .withLocale(new Locale("de"));
String germanDateTime = dateTime.format(formatter);
// parsing date strings
LocalDate fromIsoDate = LocalDate.parse("2014-01-20");
LocalDate fromIsoWeekDate = LocalDate.parse("2014-W14-2", DateTimeFormatter.ISO_WEEK_DATE);
LocalDate fromCustomPattern = LocalDate.parse("20.01.2014", DateTimeFormatter.ofPattern("dd.MM.yyyy"));

Conversion

Of course we do not always have objects of the type we need. Therefore, we need a way to convert between the different date/time related objects.
The following examples show some of the possible conversion options:

// LocalDate/LocalTime <-> LocalDateTime
LocalDate date = LocalDate.now();
LocalTime time = LocalTime.now();
LocalDateTime dateTimeFromDateAndTime = LocalDateTime.of(date, time);
LocalDate dateFromDateTime = LocalDateTime.now().toLocalDate();
LocalTime timeFromDateTime = LocalDateTime.now().toLocalTime();
// Instant <-> LocalDateTime
Instant instant = Instant.now();
LocalDateTime dateTimeFromInstant = LocalDateTime.ofInstant(instant, ZoneId.of("America/Los_Angeles"));
Instant instantFromDateTime = LocalDateTime.now().toInstant(ZoneOffset.ofHours(-2));
// convert old date/calendar/timezone classes
Instant instantFromDate = new Date().toInstant();
Instant instantFromCalendar = Calendar.getInstance().toInstant();
ZoneId zoneId = TimeZone.getDefault().toZoneId();
ZonedDateTime zonedDateTimeFromGregorianCalendar = new GregorianCalendar().toZonedDateTime();
// convert to old classes
Date dateFromInstant = Date.from(Instant.now());
TimeZone timeZone = TimeZone.getTimeZone(ZoneId.of("America/Los_Angeles"));
GregorianCalendar gregorianCalendar = GregorianCalendar.from(ZonedDateTime.now());

Conclusion

With Java 8 we get a very rich API for working with date and time located in the java.time package. The API can completely replace old classes like java.util.Date or java.util.Calendar with newer, more flexible classes. Due to mostly immutable classes the new API helps in building thread safe systems. The source of the examples can be found on GitHub.
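Since the snippets in this post are fragments, a compact runnable class can help when checking the commented values. The sketch below re-runs the Period and Duration examples from the post with only java.time; the class name DateMathCheck and its helper methods are hypothetical wrappers added for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.Period;

// Hypothetical wrapper class that re-runs the Period and Duration examples from the post.
class DateMathCheck {

    // Date-based difference (years, months, days) between two dates.
    static Period periodBetween(LocalDate start, LocalDate end) {
        return Period.between(start, end);
    }

    // Time-based difference in seconds between two unix timestamps; negative if the second is earlier.
    static long secondsBetween(long firstEpochSecond, long secondEpochSecond) {
        Instant first = Instant.ofEpochSecond(firstEpochSecond);
        Instant second = Instant.ofEpochSecond(secondEpochSecond);
        return Duration.between(first, second).getSeconds();
    }

    public static void main(String[] args) {
        Period period = periodBetween(LocalDate.of(2010, 5, 17), LocalDate.of(2015, 3, 7));
        System.out.println(period); // prints P4Y9M18D

        long seconds = secondsBetween(1294881180L, 1294708260L);
        System.out.println(seconds); // prints -172920
        System.out.println(Duration.ofSeconds(seconds).abs().toMinutes()); // prints 2882

        // Adding a Period applies the months first, then the days.
        LocalDate result = LocalDate.of(2014, 1, 6).plus(Period.ofMonths(2).plusDays(5));
        System.out.println(result); // prints 2014-03-11
    }
}
```

Only values that do not depend on the current date or locale are checked here, so the output stays the same whenever and wherever the class is run.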

February 27, 2014

by Michael Scharhag

· 209,448 Views · 18 Likes

Getting Started with Mocking in Java using Mockito

We all write unit tests, but the challenge we face at times is that the unit under test might depend on other components. Configuring those components just for unit testing is definitely overkill. Instead we can use mocks in place of the other components and continue with the unit testing. To show how one can use mocks, I have a data access layer (DAL), basically a class which provides an API for the application to access and modify the data in the data repository. I then unit test the DAL without actually needing to connect to the data repository. The data repository can be a local database, a remote database, a file system, or any place where we can store and retrieve the data. The use of a DAL class helps us keep the data mappers separate from the application code. Let's create a Java project using Maven. mvn archetype:generate -DgroupId=info.sanaulla -DartifactId=MockitoDemo -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false The above creates a folder MockitoDemo and then creates the entire directory structure for source and test files. Consider the below model class for this example: package info.sanaulla.models; import java.util.List; /** * Model class for the book details. 
*/ public class Book { private String isbn; private String title; private List<String> authors; private String publication; private Integer yearOfPublication; private Integer numberOfPages; private String image; public Book(String isbn, String title, List<String> authors, String publication, Integer yearOfPublication, Integer numberOfPages, String image){ this.isbn = isbn; this.title = title; this.authors = authors; this.publication = publication; this.yearOfPublication = yearOfPublication; this.numberOfPages = numberOfPages; this.image = image; } public String getIsbn() { return isbn; } public String getTitle() { return title; } public List<String> getAuthors() { return authors; } public String getPublication() { return publication; } public Integer getYearOfPublication() { return yearOfPublication; } public Integer getNumberOfPages() { return numberOfPages; } public String getImage() { return image; } } The DAL class for operating on the Book model class is: package info.sanaulla.dal; import info.sanaulla.models.Book; import java.util.ArrayList; import java.util.Arrays; import java.util.Collections; import java.util.List; /** * API layer for persisting and retrieving the Book objects. */ public class BookDAL { private static BookDAL bookDAL = new BookDAL(); public List<Book> getAllBooks(){ return Collections.emptyList(); } public Book getBook(String isbn){ return null; } public String addBook(Book book){ return book.getIsbn(); } public String updateBook(Book book){ return book.getIsbn(); } public static BookDAL getInstance(){ return bookDAL; } } The DAL layer above currently has no functionality and we are going to unit test that piece of code (TDD). The DAL layer might communicate with an ORM mapper or a database API, which we are not concerned with while designing the API. Test Driving the DAL Layer There are a lot of frameworks for unit testing and mocking in Java, but for this example I will pick JUnit for unit testing and Mockito for mocking. 
We have to update the dependencies in Maven’s pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>info.sanaulla</groupId>
  <artifactId>MockitoDemo</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>MockitoDemo</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.10</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.mockito</groupId>
      <artifactId>mockito-all</artifactId>
      <version>1.9.5</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>

Now let's unit test the BookDAL. During the unit testing we will inject mock data into the BookDAL so that we can complete the testing of the API without depending on the data source. Initially we will have an empty test class:

public class BookDALTest {

    public void setUp() throws Exception { }

    public void testGetAllBooks() throws Exception { }

    public void testGetBook() throws Exception { }

    public void testAddBook() throws Exception { }

    public void testUpdateBook() throws Exception { }
}

We will inject the mock BookDAL and mock data in setUp() as shown below:

public class BookDALTest {

    private static BookDAL mockedBookDAL;
    private static Book book1;
    private static Book book2;

    @BeforeClass
    public static void setUp() {
        //Create mock object of BookDAL
        mockedBookDAL = mock(BookDAL.class);

        //Create a few instances of the Book class.
        book1 = new Book("8131721019", "Compilers Principles",
                Arrays.asList("D. Jeffrey Ulman", "Ravi Sethi", "Alfred V. Aho", "Monica S. Lam"),
                "Pearson Education Singapore Pte Ltd", 2008, 1009, "BOOK_IMAGE");
        book2 = new Book("9788183331630", "Let Us C 13th Edition",
                Arrays.asList("Yashavant Kanetkar"), "BPB PUBLICATIONS", 2012, 675, "BOOK_IMAGE");

        //Stubbing the methods of mocked BookDAL with mocked data.
        when(mockedBookDAL.getAllBooks()).thenReturn(Arrays.asList(book1, book2));
        when(mockedBookDAL.getBook("8131721019")).thenReturn(book1);
        when(mockedBookDAL.addBook(book1)).thenReturn(book1.getIsbn());
        when(mockedBookDAL.updateBook(book1)).thenReturn(book1.getIsbn());
    }

    public void testGetAllBooks() throws Exception {}

    public void testGetBook() throws Exception {}

    public void testAddBook() throws Exception {}

    public void testUpdateBook() throws Exception {}
}

In the above setUp() method I have:

Created a mock object of BookDAL:

BookDAL mockedBookDAL = mock(BookDAL.class);

Stubbed the API of BookDAL with mock data, such that whenever the API is invoked the mocked data is returned:

//When getAllBooks() is invoked then return the given data and so on for the other methods.
when(mockedBookDAL.getAllBooks()).thenReturn(Arrays.asList(book1, book2));
when(mockedBookDAL.getBook("8131721019")).thenReturn(book1);
when(mockedBookDAL.addBook(book1)).thenReturn(book1.getIsbn());
when(mockedBookDAL.updateBook(book1)).thenReturn(book1.getIsbn());

Populating the rest of the tests we get:

package info.sanaulla.dal;

import info.sanaulla.models.Book;
import org.junit.BeforeClass;
import org.junit.Test;

import static org.junit.Assert.*;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.util.Arrays;
import java.util.List;

public class BookDALTest {

    private static BookDAL mockedBookDAL;
    private static Book book1;
    private static Book book2;

    @BeforeClass
    public static void setUp() {
        mockedBookDAL = mock(BookDAL.class);
        book1 = new Book("8131721019", "Compilers Principles",
                Arrays.asList("D. Jeffrey Ulman", "Ravi Sethi", "Alfred V. Aho", "Monica S. Lam"),
                "Pearson Education Singapore Pte Ltd", 2008, 1009, "BOOK_IMAGE");
        book2 = new Book("9788183331630", "Let Us C 13th Edition",
                Arrays.asList("Yashavant Kanetkar"), "BPB PUBLICATIONS", 2012, 675, "BOOK_IMAGE");

        when(mockedBookDAL.getAllBooks()).thenReturn(Arrays.asList(book1, book2));
        when(mockedBookDAL.getBook("8131721019")).thenReturn(book1);
        when(mockedBookDAL.addBook(book1)).thenReturn(book1.getIsbn());
        when(mockedBookDAL.updateBook(book1)).thenReturn(book1.getIsbn());
    }

    @Test
    public void testGetAllBooks() throws Exception {
        List<Book> allBooks = mockedBookDAL.getAllBooks();
        assertEquals(2, allBooks.size());
        Book myBook = allBooks.get(0);
        assertEquals("8131721019", myBook.getIsbn());
        assertEquals("Compilers Principles", myBook.getTitle());
        assertEquals(4, myBook.getAuthors().size());
        assertEquals((Integer) 2008, myBook.getYearOfPublication());
        assertEquals((Integer) 1009, myBook.getNumberOfPages());
        assertEquals("Pearson Education Singapore Pte Ltd", myBook.getPublication());
        assertEquals("BOOK_IMAGE", myBook.getImage());
    }

    @Test
    public void testGetBook() {
        String isbn = "8131721019";
        Book myBook = mockedBookDAL.getBook(isbn);
        assertNotNull(myBook);
        assertEquals(isbn, myBook.getIsbn());
        assertEquals("Compilers Principles", myBook.getTitle());
        assertEquals(4, myBook.getAuthors().size());
        assertEquals("Pearson Education Singapore Pte Ltd", myBook.getPublication());
        assertEquals((Integer) 2008, myBook.getYearOfPublication());
        assertEquals((Integer) 1009, myBook.getNumberOfPages());
    }

    @Test
    public void testAddBook() {
        String isbn = mockedBookDAL.addBook(book1);
        assertNotNull(isbn);
        assertEquals(book1.getIsbn(), isbn);
    }

    @Test
    public void testUpdateBook() {
        String isbn = mockedBookDAL.updateBook(book1);
        assertNotNull(isbn);
        assertEquals(book1.getIsbn(), isbn);
    }
}

One can run the tests with the Maven command: mvn test.
The output is:

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running info.sanaulla.AppTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.029 sec
Running info.sanaulla.dal.BookDALTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.209 sec

Results :

Tests run: 5, Failures: 0, Errors: 0, Skipped: 0

So we have been able to test the DAL class without actually configuring the data source by using mocks.
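What mock() and when(...).thenReturn(...) provide can be approximated by hand, which may help to see where canned data is most useful: in a test of code that depends on the DAL. The sketch below is only an illustration under assumptions. BookLite, BookSource, TitleFinder and StubDemo are hypothetical names that are not part of the example project, and no Mockito calls are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, trimmed-down stand-in for the Book model (not the article's class).
class BookLite {
    final String isbn;
    final String title;
    final int year;

    BookLite(String isbn, String title, int year) {
        this.isbn = isbn;
        this.title = title;
        this.year = year;
    }
}

// Hypothetical dependency that we want to replace in tests.
interface BookSource {
    List<BookLite> getAllBooks();
}

// Hypothetical unit under test: it only talks to the BookSource interface.
class TitleFinder {
    private final BookSource source;

    TitleFinder(BookSource source) {
        this.source = source;
    }

    // Returns the titles of all books published in or after the given year.
    List<String> titlesSince(int year) {
        List<String> titles = new ArrayList<>();
        for (BookLite book : source.getAllBooks()) {
            if (book.year >= year) {
                titles.add(book.title);
            }
        }
        return titles;
    }
}

class StubDemo {
    // Hand-written stub: always returns the same canned data,
    // which is roughly the role thenReturn(...) plays for a mock.
    static BookSource cannedSource() {
        final List<BookLite> canned = Arrays.asList(
                new BookLite("8131721019", "Compilers Principles", 2008),
                new BookLite("9788183331630", "Let Us C 13th Edition", 2012));
        return () -> canned;
    }

    public static void main(String[] args) {
        TitleFinder finder = new TitleFinder(cannedSource());
        System.out.println(finder.titlesSince(2010));
    }
}
```

Because TitleFinder receives its dependency through the constructor, the test never needs a real data source. A mocking library removes the need to write the stub by hand.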

February 26, 2014

by Mohamed Sanaulla

· 233,435 Views · 18 Likes

How to Estimate Memory Consumption

This story goes back at least a decade, to when I was first approached by a PHB with the question: “How big are the servers we need to buy for our production deployment?” The new and shiny system we had been building was nine months from production rollout, and apparently the company had promised to deliver the whole solution, including hardware. Oh boy, was I in trouble. With just a few years of experience under my belt, I might as well have rolled a die. Even though I am sure my complete lack of confidence was clearly visible, I still had to come up with an answer. Four hours of googling later, I recall sitting there with the same question still hovering in front of my bedazzled face: “How to estimate the need for computing power?”

In this post I start to open up the subject by giving you rough guidelines on how to estimate memory requirements for your brand new Java application. For the impatient ones: the answer is to start with memory equal to approximately 5 x [amount of memory consumed by Live Data] and fine-tune from there. For those more curious about the logic behind it, stay with me and I will walk you through the reasoning.

First and foremost, I recommend not answering a question phrased like this without detailed information. Your answer has to be based on the performance requirements, so do not even start without clarifying those first. And I do not mean the way-too-ambiguous “The system needs to support 700 concurrent users”, but far more specific requirements about latency and throughput, taking into account the amount of data and the usage patterns. Do not forget about the budget either: we can all dream about sub-millisecond latencies, but for those without HFT banking backbone budgets it will unfortunately remain a dream.

For now, let's assume you have those requirements in place. The next stop is to create the load test scripts emulating user behaviour.
If you are now able to launch those scripts concurrently, you have built a foundation for the answer. As you might also have guessed, the next step involves our usual advice of measuring, not guessing. But with a caveat.

Live Data Size

Namely, our quest for the optimal memory configuration requires capturing the Live Data Size. Having captured this, we have the baseline configuration in place for the fine-tuning. How does one define live data size? Charlie Hunt and Binu John, in their “Java Performance” book, give it the following definition:

Live data size is the heap size consumed by the set of long-lived objects required to run the application in its steady state.

Equipped with the definition, we are ready to run the load tests against the application with GC logging turned on (-XX:+PrintGCTimeStamps -Xloggc:/tmp/gc.log -XX:+PrintGCDetails) and visualize the logs (with the help of gcviewer, for example) to determine the moment when the application has reached its steady state. What you are after is a graph in which the GC can be seen doing its job, with both minor and Full GC runs, in the familiar double-sawtooth pattern. This particular application seems to have reached a steady state right after the first Full GC run, at the 21st second. In most cases, however, it takes 10-20 Full GC runs to spot the change in trends. After four Full GC runs we can estimate that the Live Data Size is approximately 100MB.

The aforementioned Java Performance book indicates that there is a strong correlation between the Live Data Size and the optimal memory configuration parameters in a typical Java EE application. Evidence from the field also backs up its recommendation:

Set the maximum heap size to 3-4 x [Live Data Size]

So, for the application at hand, we should set -Xmx to between 300m and 400m for the initial performance tests and take it from there.
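The arithmetic behind that starting point can be sketched as follows. This is a minimal illustration, not a tuning tool: the class and method names are made up, and the heap occupancy readings after each Full GC are hypothetical values assumed to have been read off the GC log by hand.

```java
import java.util.Arrays;

class HeapSizing {
    // Estimates the live data size as the average heap occupancy (in MB)
    // observed right after Full GC runs in the steady state.
    static long liveDataSizeMb(long[] occupancyAfterFullGcMb) {
        if (occupancyAfterFullGcMb.length == 0) {
            throw new IllegalArgumentException("need at least one Full GC sample");
        }
        return Math.round(Arrays.stream(occupancyAfterFullGcMb).average().getAsDouble());
    }

    // Rule of thumb quoted in the text: maximum heap of 3-4 x live data size.
    static String xmxRange(long liveDataMb) {
        return "-Xmx" + (3 * liveDataMb) + "m to -Xmx" + (4 * liveDataMb) + "m";
    }

    public static void main(String[] args) {
        long[] samples = {98, 101, 100, 101}; // hypothetical readings, in MB
        long live = liveDataSizeMb(samples);
        System.out.println("Live data size ~" + live + "MB, start with " + xmxRange(live));
    }
}
```

With the four hypothetical samples above the estimate comes out at 100MB, which gives the same 300m to 400m range as in the text.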
We have mixed feelings about the other recommendations given in the book: setting the maximum permanent generation size to 1.2-1.5 x [Live Data Size of the Permanent Generation], and sizing the young generation (for example via -Xmn or -XX:NewRatio) at 1-1.5 x [Live Data Size]. We are currently gathering more data to determine whether the positive correlation exists, but until then I recommend basing your survivor and eden configuration decisions on monitoring your allocation rate instead.

Why should one bother, you might now ask. Indeed, two reasons for not caring immediately surface:

An 8G memory chip is in sub-$100 territory at the time of writing this article.
Virtualization, especially when using large providers such as Amazon AWS, makes adjusting the capacity easy.

Both reasons are partially valid and have definitely reduced the need for precise provisioning. But both of them still put you in the danger zone:

When tossing in huge amounts of memory “just in case”, you are most likely going to significantly affect latency: with heaps above 8G it is darn easy to introduce Full GC pauses spanning tens of seconds.
When over-provisioning with the mindset of “let's tune it later”, the “later” part has a tendency of never arriving. I have faced numerous applications running on vastly over-provisioned environments just because of this. For example, the aforementioned application, which I discovered running on an Amazon EC2 m1.xlarge instance, was costing the company $4,200 per instance per year. Converting it to m1.small reduced the bill to just $520 for the instance. An 8-fold cost reduction will be visible in your operations budget if your deployments are large, trust me on this.

Summary

Unfortunately, I still see way too many decisions made exactly the way I was forced to make mine a decade ago. This leads to under- and over-planning of capacity, both of which can be equally poor choices, especially if you cannot enjoy the benefits of virtualization.
I got lucky with mine, but you might not get away with your guesstimate, so I recommend actually planning ahead using the simple framework described in this post. If you enjoyed the content, I recommend following our performance tuning advice on Twitter.

February 25, 2014

by Nikita Salnikov-Tarnovski

· 11,737 Views

Choosing Columns for Agile Team Boards

"And let Reform her columns roll. With thunder peal, and lightning flash..." - Ignis, "The Genius of Liberty" Vol III No. 2

Introduction

In the past couple of articles we've seen how a Kanban board is able to help in the attainment of transparency and the stabilization of an agile team. Today we'll see if we can resolve one of the most common queries that result from this usage: how does a team decide which columns should appear on the board for tracking the progress of work items?

The simplest case...and why it may not be enough

When we set up a Kanban board in the last article, there were only three columns - or to use the correct term, "stations". These were a "Backlog" station (essentially a "To Do" list of work that has not yet been started), a station for showing which work is "In Progress", and finally a station for representing the work that has been completed. You can't get much simpler than that, and it raises the question of why you would want to make it more complicated. In practice however, there are at least two situations in which this minimalist approach will be found wanting:

A team isn't cross-trained, and its members effectively work in skill silos. Consequently we can expect dependencies between team members, some of whom may become blocked while waiting for others to complete their part of the work. This wasted time will not be apparent if it is all considered to be "work in progress". For example, the team may be split into developers and testers, and bottlenecks may arise as work passes between them. We may need to break Work In Progress down into further stations in order to expose this waste more fully.

Bottlenecks arise due to constraints in the workflow, which is a different problem. In this situation a team might be fully cross-trained, and none of its members become blocked waiting for another. Rather, waste arises because the work itself is inefficiently staged.
This often happens with activities like development, review, and test. For example if two people are required for a review, but only one is needed for development and test, then a bottleneck may well occur. Work will build-up awaiting review due to contention for these resources and the value of the investment in effort will start to depreciate. Again the incurral of this waste will not be apparent if it is all considered to be "work in progress". More stations are needed to expose it. Adding further stations These then are the two key things to consider when choosing additional stations. We're out to expose waste caused by work silos, or by the inappropriate staging of activities. Either of these can introduce constraints and become the source of bottlenecks in a value stream. Sometimes blockages can occur due to a dependency on something that must be done outside of the team. When this happens, it implies that the team are not fully in control of their own process, and consequently are unable to meet their own Definition of Done. They don't have all of the skills or resources needed. This is a problem and a contra-indication to agile practice. If it happens it's essential to make the dependency clear so that it can be challenged and removed. We may therefore choose to have an "externally blocked" column on the board to expose problems of this nature. It isn't really a station, because it doesn't represent a state in which value is added. Rather, it shows that an item has stalled within the value stream and that the team are not in a position to provide remedy. Another option is to place a red, day-glo sticky note on the ticket highlighting the seriousness of the problem. This is a clear signal that an impediment has occurred...that is to say, in Lean-Kanban terminology, it is an andon flag. In this case the flag shows that a major blockage has arisen and needs resolving. Challenging the boundaries Now we need to turn our attention to the boundaries of the board. 
There are two principal areas we should look at. Firstly, on the leftmost side of the board, we can see the work that a team inducts into its "Backlog" prior to actioning it. Secondly, on the rightmost side, we can see the work that the team considers to be "Done". These two boundaries are very often a source of waste. To understand why, just consider how backlogs are often allowed to grow without effective limit, and at how completed work may be permitted to accumulate in a "Done" column. These stations may not represent work in progress as far as the team is concerned, but it would be foolish to deny that they are batches too. After all, they are still part of the value stream. They represent inventory that is depreciating in value, or relevance, until something useful is done with it all. It behooves us to query the waste that is incurred, and to ask how the size of these batches may be constrained. Specifically: How can work be inducted into a backlog with minimal accumulation and delay? How can value be delivered to consumers as soon as work is completed? In short, what can be done to "lean" these process boundaries, so that inventory in the team's part of the value stream enters and exits in a "just-in-time" fashion? We can answer these questions by improving transparency still further. This can mean the refinement of the "Backlog" and "Done" columns into other, more finely-grained stations. For example, work might be building up in a Product Backlog because it is not being triaged appropriately, or perhaps because acceptance criteria are insufficiently well defined. We might be able to expose these problems by replacing a backlog with "Triaged", "Accepted", and "Ready" stations. At the other side of the board, completed work may be building up in the "Done" column because a release cannot yet be made. Additional stations such as "System Integrated", "In User Acceptance", and "Awaiting Release" could add clarity here. 
Removing stations

The simple three-column board we started with has now exploded into a behemoth of perhaps ten columns or more. This may seem like an excessively complex structure for a workflow, and a casual observer may criticize it for being fundamentally unagile. After all, inventory should either be work in progress by an agile team, or it will be awaiting their attention or have already been completed. The criticism is a valid one, but we need to bear one thing in mind: these stations are there to expose problems. Only once transparency has been attained can we hope to provide remedy. The bottlenecks, along with the diagnostic stations we added to reveal them, can then be removed.

Conclusion

Knowing how many "columns" to include on an agile board, and what they should be, is something of a black art to many agile teams. In this article we've looked at the issues involved in making this decision. The board of a fully cross-trained team should be elegant in its simplicity, but when problems arise we must be prepared to do some digging in order to root out their causes.

February 25, 2014

by $$anonymous$$

· 12,508 Views · 2 Likes

Voron & Time Series Data: Getting Real Data Outputs

So far, we have just put the data in and out, and we have had a pretty good track record doing so. However, what do we do with the data now that we have it? As you can expect, we need to read it out, usually by specific date ranges. The interesting thing is that we usually are not interested in just a single channel; we care about multiple channels. And for fun, those channels might be synchronized or not. An example of the first might be the current speed and the current engine temperature in a car; they generally share the exact same timestamps. An example of out-of-sync channels is a sensor on a rooftop measuring rainfall and another sensor in the sewer measuring water flow rates. (Again, thanks to Dan for helping me with the domain.)

This is interesting, because it presents quite a few problems:

We need to merge different streams into a unified view.
We need to handle both matching and non-matching sequences.
We need to handle erroneous data: what happens when we have two readings for the same time for the same sensor? Yes, that shouldn’t happen, but it does.

I solved this with the following API:

public class RangeEntry
{
    public DateTime Timestamp;
    public double?[] Values;
}

IEnumerable<RangeEntry> results = dts.ScanRanges(DateTime.MinValue, DateTime.MaxValue, new[]
{
    "6febe146-e893-4f64-89f8-527f2dbaae9b",
    "707dcb42-c551-4f1a-9203-e4b0852516cf",
    "74d5bee8-9a7b-4d4e-bd85-5f92dfc22edb",
    "7ae29feb-6178-4930-bc38-a90adf99cfd3",
});

This API gives me the results in time order, with the values in the same positions as the ids requested, and with nulls where a particular sensor channel has no value for that time. The actual implementation relies on this method:

// Entry (type name assumed here): a single reading exposing Timestamp and Value
IEnumerable<Entry> ScanRange(DateTime start, DateTime end, string id)

All this does is provide all the entries in a particular date range for a particular channel.
Let us see how we implement multi-channel scanning on top of this:

// Entry (type name assumed here): a single reading exposing Timestamp and Value
private class PendingEnumerator
{
    public IEnumerator<Entry> Enumerator;
    public int Index;
}

private class PendingEnumerators
{
    private readonly SortedDictionary<DateTime, List<PendingEnumerator>> _values =
        new SortedDictionary<DateTime, List<PendingEnumerator>>();

    public void Enqueue(PendingEnumerator entry)
    {
        List<PendingEnumerator> list;
        var dateTime = entry.Enumerator.Current.Timestamp;
        if (_values.TryGetValue(dateTime, out list) == false)
        {
            _values.Add(dateTime, list = new List<PendingEnumerator>());
        }
        list.Add(entry);
    }

    public bool IsEmpty
    {
        get { return _values.Count == 0; }
    }

    public List<PendingEnumerator> Dequeue()
    {
        if (_values.Count == 0)
            return new List<PendingEnumerator>();

        // The dictionary is sorted, so the first pair holds the earliest timestamp.
        var kvp = _values.First();
        _values.Remove(kvp.Key);
        return kvp.Value;
    }
}

public IEnumerable<RangeEntry> ScanRanges(DateTime start, DateTime end, string[] ids)
{
    if (ids == null || ids.Length == 0)
        yield break;

    var pending = new PendingEnumerators();
    for (int i = 0; i < ids.Length; i++)
    {
        var enumerator = ScanRange(start, end, ids[i]).GetEnumerator();
        if (enumerator.MoveNext() == false)
            continue;
        pending.Enqueue(new PendingEnumerator
        {
            Enumerator = enumerator,
            Index = i
        });
    }

    // Note: the same RangeEntry instance is reused for every yielded result.
    var result = new RangeEntry
    {
        Values = new double?[ids.Length]
    };
    while (pending.IsEmpty == false)
    {
        Array.Clear(result.Values, 0, result.Values.Length);
        var entries = pending.Dequeue();
        if (entries.Count == 0)
            break;
        foreach (var entry in entries)
        {
            var current = entry.Enumerator.Current;
            result.Timestamp = current.Timestamp;
            result.Values[entry.Index] = current.Value;
            if (entry.Enumerator.MoveNext())
                pending.Enqueue(entry);
        }
        yield return result;
    }
}

We start by getting a single entry from each channel into the pending enumerators. Then, we collate all the entries that share the same time into a single entry. We use the Index property to track the expected index of the entry in the output. And we handle duplicate times in the same channel by outputting multiple entries. Testing this on my 1.1 million record data set, we can get 185 thousand records back in 0.15 seconds.
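The same merge idea can be sketched outside of Voron. The Java version below is an illustration under assumptions rather than a port of the actual implementation: Reading, Row and ChannelMerge are hypothetical names, timestamps are plain long values, each channel is an in-memory list already sorted by time, and every output row gets its own array instead of reusing one buffer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical single reading of one channel.
class Reading {
    final long timestamp;
    final double value;
    Reading(long timestamp, double value) { this.timestamp = timestamp; this.value = value; }
}

// Hypothetical merged row: one slot per channel, null where the channel has no reading.
class Row {
    final long timestamp;
    final Double[] values;
    Row(long timestamp, Double[] values) { this.timestamp = timestamp; this.values = values; }
}

class ChannelMerge {
    // An in-flight iterator, its current reading, and its output column.
    private static final class Pending {
        final Iterator<Reading> it;
        final int index;
        Reading current;
        Pending(Iterator<Reading> it, int index) {
            this.it = it;
            this.index = index;
            this.current = it.next(); // caller guarantees the iterator is not empty
        }
    }

    // Merges channels that are each sorted by timestamp into time-ordered rows.
    static List<Row> merge(List<List<Reading>> channels) {
        TreeMap<Long, List<Pending>> pending = new TreeMap<>();
        for (int i = 0; i < channels.size(); i++) {
            Iterator<Reading> it = channels.get(i).iterator();
            if (it.hasNext()) {
                enqueue(pending, new Pending(it, i));
            }
        }
        List<Row> rows = new ArrayList<>();
        while (!pending.isEmpty()) {
            // Remove every channel whose current reading has the smallest timestamp.
            List<Pending> batch = pending.pollFirstEntry().getValue();
            long timestamp = batch.get(0).current.timestamp;
            Double[] values = new Double[channels.size()];
            for (Pending p : batch) {
                values[p.index] = p.current.value;
                if (p.it.hasNext()) {
                    p.current = p.it.next();
                    enqueue(pending, p); // a repeated timestamp lands in a new batch
                }
            }
            rows.add(new Row(timestamp, values));
        }
        return rows;
    }

    private static void enqueue(TreeMap<Long, List<Pending>> pending, Pending p) {
        pending.computeIfAbsent(p.current.timestamp, k -> new ArrayList<>()).add(p);
    }

    public static void main(String[] args) {
        List<Row> rows = merge(Arrays.asList(
                Arrays.asList(new Reading(1, 10.0), new Reading(2, 20.0)),
                Arrays.asList(new Reading(2, 5.0), new Reading(3, 7.0))));
        for (Row row : rows) {
            System.out.println(row.timestamp + " " + Arrays.toString(row.values));
        }
    }
}
```

Allocating a fresh array per row costs more memory than reusing one result object, but it lets callers keep rows around after the merge has moved on.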

February 25, 2014

by Oren Eini

· 5,378 Views

Brief comparison of BDD frameworks

JDave, Concordion, Easyb, JBehave, Cucumber are all compared here briefly for your convenience.

February 24, 2014

by Sebastian Laskawiec

· 129,846 Views · 16 Likes