Big Data/Analytics Zone
Veeresham Kardas 10/06/14
0 replies

CSV Operations using OpenCSV

OpenCSV is one of the best tools for CSV operations. We will see how to use OpenCSV for basic reading and writing operations.

Adam Diaz 10/02/14
0 replies

The Evolution of MapReduce and Hadoop

Recently I authored a section of the DZone Guide for Big Data 2014. I wrote about MapReduce and the evolution of Hadoop.

Sander Mak 10/01/14
0 replies

The Developer’s Guide to Data Science

When developers talk about using data, they are usually concerned with ACID, scalability, and other operational aspects of managing data. But data science is not just about making fancy business intelligence reports for management. Data drives the user experience directly, not after the fact.

Isaac Sacolick 10/01/14
0 replies

Solving the Data Scientist Shortfall by Deploying a Self Service BI Program

Want to learn more about "self-service" BI programs, and why many organizations are looking to leverage these technologies and programs on their quest to become more data-driven?

Mark Needham 09/29/14
0 replies

R: ggplot - Plotting multiple variables on a line chart

In my continued playing around with meetup data I wanted to plot the number of members who join the Neo4j group over time. I wanted to plot the actual count alongside a rolling average for which I created the following data frame:

Linda Gimmeson 09/28/14
0 replies

10 Big Data Tools

Hadoop isn't the only big data tool out there. Check out this list of big data tools available.

Armel Nene 09/27/14
0 replies

Big Data Architecture Best Practices

The marketing departments of software vendors have done a good job of making Big Data go mainstream, whatever that means. The promise is that we can achieve anything if we make use of Big Data: business insight and beating our competition into submission. Yet there is no well-publicised successful Big Data implementation. The question is: why not?

Adam Diaz 09/26/14
0 replies

The Evolution of MapReduce and Hadoop

With MapReduce, companies no longer need to delete old logs that are ripe with insights—or dump them onto unmanageable tape storage—before they’ve had a chance to analyze them. Today, the Apache Hadoop project is the most widely used implementation of MapReduce.

Evert Pot 09/25/14
0 replies

Accessing protected properties from objects that share the same ancestry.

I realized something odd about accessing protected properties the other day. It's possible in PHP to access protected properties from other objects, as long as they are from the same class, as illustrated here:

Benjamin Ball 09/25/14
0 replies

The No Fluff Introduction to Big Data

Due to the obstacles presented by large-scale data management, the goal for developers and data scientists is two-fold: first, systems must be created to handle large-scale data, and second, business intelligence and insights should be acquired from analysis of the data.

Alec Noller 09/24/14
0 replies

Dev of the Week: Sander Mak

This week we're talking to Sander Mak, Senior Software Engineer at Luminis Technologies, JavaOne Rockstar, and featured author in DZone's 2014 Guide to Big Data.

Benjamin Ball 09/24/14
0 replies

Addressing the Problems of Big Data: Answers from 6 Experts on Twitter

Monday was the official launch of DZone's 2014 Guide to Big Data and to kick off this event, we gathered a panel of six Big Data specialists to participate in DZone's first Twitter Q&A, which appeared under the #DZBigData hashtag.

David Mai 09/24/14
1 reply

Is ETL on Life Support?

Two years ago, Phil Shelley, the former Chief Technology Officer of Sears Holdings and CEO of Metascale, predicted the death of ETL. Now, two years later, let’s take a look at what has happened to ETL.

David Mai 09/24/14
0 replies

2014 Magic Quadrant for Data Integration Tools: Déjà vu All Over Again

A brief review that summarizes the few changes in the Magic Quadrant for Data Integration Tools between 2013 and 2014.

Raymond Camden 09/23/14
0 replies

Using the New York Times API to Chart Occurrences in Headlines

This weekend I discovered that the New York Times has a pretty deep developer API. I thought I'd try to build a little experiment. What if we could use the API to map the number of times a keyword appeared in headlines over time?

Alec Noller 09/22/14
1 reply

Introducing DZone's 2014 Guide to Big Data

DZone's 2014 Guide to Big Data was produced to help you discover emerging information about the Big Data landscape and learn about how the shifting needs of data scientists and developers are influencing new tools and technologies.

Trevor Parsons 09/19/14
0 replies

How to Avoid the Big Data Black Hole

Data collection should be synthesized into meaningful events. The goal is to get users addicted to a platform through the quality and frequency of decisions, rather than encouraging them to spin the wheel to see what happens, so that the platform becomes a fifth limb.

Steve Hanov 09/19/14
1 reply

A Quick Measure of Sortedness

How do you measure the "sortedness" of a list? Here, I propose another measure for sortedness. The procedure is to sum the difference between the position of each element in the sorted list, x, and where it ends up in the unsorted list, f(x). We divide by the square of the length of the list and multiply by two, because this gives us a nice number between 0 and 1. Subtracting from 1 makes it range from 0, for completely unsorted, to 1, for completely sorted.
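
The procedure described above can be sketched in Python. This is only an illustration under stated assumptions, not the author's code: the "difference" is taken as the absolute difference in positions (signed differences would cancel out), ties are broken by a stable sort, an empty list is treated as sorted, and the function name `sortedness` is hypothetical.

```python
def sortedness(items):
    """Return a sortedness score; 1.0 means the list is already sorted.

    Assumptions (not stated in the teaser): absolute position differences
    are summed, equal elements keep their relative order (stable sort),
    and an empty list counts as sorted.
    """
    n = len(items)
    if n == 0:
        return 1.0

    # order[x] is the index in the given list of the element that
    # belongs at position x in the sorted list, i.e. f(x).
    order = sorted(range(n), key=lambda i: items[i])

    # Sum of |x - f(x)| over every sorted position x.
    displacement = sum(abs(x - fx) for x, fx in enumerate(order))

    # Divide by n squared, multiply by two, then subtract from 1.
    return 1.0 - 2.0 * displacement / (n * n)


if __name__ == "__main__":
    print(sortedness([1, 2, 3, 4]))
    print(sortedness([2, 1, 3, 4]))
    print(sortedness([4, 3, 2, 1]))
```

For a reversed list of even length the total displacement is exactly half the square of the length, which is why the factor of two brings the score down to 0 in that case; for odd lengths the reversed list scores slightly above 0.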

Mihai Dinca - P... 09/19/14
2 replies

The Java Versions War

Do not take for granted that if your application works with Java version X it will automatically and flawlessly work with any Java version Y > X.

Mark Needham 09/19/14
0 replies

R: Calculating rolling or moving averages

I’ve been playing around with some time series data in R and since there’s a bit of variation between consecutive points I wanted to smooth the data out by calculating the moving average.
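
As a rough illustration of the smoothing idea, here is a simple moving average sketched in Python rather than R. The function name `moving_average` and the choice of a trailing window that only emits values once the window is full are assumptions made for this sketch, not details from the article.

```python
def moving_average(values, window):
    """Simple moving average over a fixed-size window.

    Returns one average per full window, so the result has
    len(values) - window + 1 entries (or none if the window
    is longer than the data).
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    if window > len(values):
        return []

    # Keep a running total so each step costs O(1) instead of O(window).
    total = sum(values[:window])
    averages = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        averages.append(total / window)
    return averages


if __name__ == "__main__":
    print(moving_average([1, 2, 3, 4, 5], 3))
```

A larger window smooths more aggressively but also produces fewer output points and lags further behind recent changes.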

Mark Needham 09/19/14
0 replies

R: ggplot – Plotting a single variable line chart

I’ve been learning how to do moving averages in R and having done that calculation I wanted to plot these variables on a line chart using ggplot.

Alec Noller 09/17/14
0 replies

Dev of the Week: Chanwit Kaewkasi

This week we're talking to Chanwit Kaewkasi, Assistant Professor at the Suranaree University of Technology’s School of Computer Engineering in Thailand, co-developer of a series of low-cost Big Data clusters, and featured author in DZone's upcoming 2014 Guide to Big Data.

Chris Odell 09/17/14
0 replies

Why I will Always Try And Find A Ready-Built Library

By the time you have developed something and fixed any issues with it, your version is simply not going to be as well tested as a ready-built component that is used by thousands of people.

Alec Noller 09/15/14
1 reply

Join Us For Our Big Data Twitter Q&A! #DZBigData

In anticipation of our 2014 Guide to Big Data, we have arranged for a panel of experts - Kirk Borne and Jonathan Ellis - to answer your Big Data questions on Twitter on Monday, September 22, 2014. To participate, simply ask a question using the hashtag #DZBigData.

Drew Harvey 09/15/14
0 replies

DB2 CONCAT (Concatenate) Function

The DB2 CONCAT function combines two separate expressions to form a single string expression. It can use database fields or explicitly defined strings as one or both expressions when concatenating the values together.