Over a million developers have joined DZone.

SQL or ETL Isn’t Ideal Option for Structured Data Computing in Java

· Java Zone

Learn more about how the Java language, tools and frameworks have been the foundation of countless enterprise systems, brought to you in partnership with Salesforce.

When developing Java programs, sometimes, we may find ourselves facing such a challenge: Performing the massive structured data computation according to the data from text or Excel. For example, with the policy data details, how to find out the salesman who sold the most or the least insurance products during a certain period?

How about importing the above mentioned data into database manually and then computing with
SQL database? Bad idea! Importing all the data to database just for once cannot ensure the live update of data. Once data changed, users will have to import over again. This will lead to cumbersome procedures and bring about great workload.

How about automatically importing data to database at a specified time with ETL tool? Still bad! This practice will usually require PHP/Perl/VBScript/JavaScript knowledge and massive data update algorithms, which is very costly in time and money. Moreover, the changes of demand cause difficult maintenance and also have great impact on the performance and stability of the existing database. To meet a certain computing demand, it is not worthy to build a system that is similar to the data warehouse.

In addition, not all Java applications are incorporated with a database. What if there is no database? It is obviously a bad idea to install a set of databases for just meeting a computation demand.

Even if the database and the ETL regular update system are established successfully. It is still inconvenient in most cases, for example, the cross-database computation, and sometimes it is hard to reduce the coupling between data computing scripts and Java codes. Take the SQL statements in this case for an example, it is already quite hard to understand and maintain them, let alone composing such complex scripts, as shown below:

SELECT salesMan 

FROM (SELECT salesMan,

  row_number() OVER (ORDER BY isrCount DESC) descOrder, 

  row_number() OVER (ORDER BY isrCount ASC) ascOrder 

  FROM (SELECT salesMan,

   COUNT(*) isrCount

  FROM insurance  where salesDate>= ? and salesDate<=?

  GROUP BY salesMan

WHERE descOrder=1 OR ascOrder=1 ORDER BY descOrder

How about using the Java language directly? Even worse! Java does not offer the functions to implement query, group, sort, and summarizing directly and straightforwardly, while these are basic requirements for the massive structured data computation. You will have to implement all these details by yourself.

Then, no way out? No. esProc is just an ideal tool to accomplish the objective. This is a professional
Java development tool for database computing and really good at simplifying the complex computation to integrate with Java easily. For esProc, the scripts are shown below:

esProc can retrieve data directly from and compute on multiple databases\txt files\Excel sheets. esProc offers a grid style and agile syntax specially tailored for massive structured data computation. The result can be exported to JDBC to boost the Java computational capability dramatically. In addition, it enables the cross-database computation and supports code reuse by nature. Even the debug functionality is also quite perfect. Considering all these advantages, it is clear that esProc is more efficient than SQL.

Discover how the Force.com Web Services Connector (WSC) is a code-generation tool and runtime library for use with Force.com Web services, brought to you in partnership with Salesforce.


The best of DZone straight to your inbox.

Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}