Big Data Resources

This Kafka tutorial aims to help you understand failover for brokers and consumers; you'll run a consumer to test and prove Kafka's consumer failover.

August 23, 2017

by Jean-Paul Azar

· 56,887 Views · 12 Likes

Concept Learning: The Stepping Stone Toward Machine Learning With Find-S

Learn about one of the simplest algorithms in artificial intelligence, the Find-S algorithm, to help you get started with machine learning.

August 21, 2017

by Girish Bharti

· 25,699 Views · 5 Likes

Data Science for Java Developers With Tablesaw

Tablesaw is like an open-source Java power tool for data manipulation with hooks for interactive visualization, analytics, and machine learning. Come learn all about it!

August 20, 2017

by Larry White

· 28,753 Views · 38 Likes

What Is a Data Science Workbench and Why Do Data Scientists Need One?

The factors described in this article make data scientists self-sufficient, improve the effectiveness of their models, and accelerate the time to insight.

August 19, 2017

by Syed Mahmood

· 17,148 Views · 3 Likes

Kafka Consumer Architecture - Consumer Groups and Subscriptions

In this installment, learn about Kafka consumer architecture, consumer groups, how record processing is shared, and failover for consumers.

August 19, 2017

by Jean-Paul Azar

· 57,945 Views · 12 Likes

Kafka Producer Architecture - Picking the Partition of Records

This article covers Kafka Producer Architecture, including how a partition is chosen, producer cadence, partitioning strategies, and consumers.

August 18, 2017

by Jean-Paul Azar

· 69,740 Views · 12 Likes

Kafka Topic Architecture - Replication, Failover, and Parallel Processing

Digging deeper into Kafka architecture, this article covers the details of replication, failover, and parallel processing in this data pipeline software.

August 17, 2017

by Jean-Paul Azar

· 98,348 Views · 29 Likes

Analyst, Scientist, or Specialist? Choosing Your Data Job Title

There are tons of data job titles, including data scientist, data analyst, and data specialist. It’s important to pick one that matches your capabilities and aspirations.

August 15, 2017

by Shelby Blitz

· 10,399 Views · 7 Likes

Hadoop Distributions: Past, Present, and Future

In a world where open-source software can avoid vendor lock-in, are major Hadoop distributors discarding some of that benefit to the detriment of Hadoop users?

August 15, 2017

by Mark Chopping

· 10,662 Views · 3 Likes

Solving a Clustering Problem Using the k-Means Algorithm With Oracle

Clustering algorithms let machines group data points or items into groups with similar characteristics. See how to use the k-means algorithm with Oracle to do clustering.

August 15, 2017

by Emrah Mete

· 13,410 Views · 7 Likes

How to Order Streamed DataFrames

Many of the solutions you try for ordering streamed DataFrames will end in disappointment. Luckily, there's a light at the end of the table!

August 15, 2017

by Mahesh Chand Kandpal

· 6,929 Views · 1 Like

Testing MQTT Messaging Brokers

If you're looking to test your IoT app's communication, here's how JMeter can load test the popular MQTT protocol, with an overview of the protocol itself.

August 13, 2017

by Roman Aladev

· 18,079 Views · 7 Likes

Kafka Producer in Java

This detailed tutorial will help you create a simple Kafka producer, which allows you to publish records to the Kafka cluster.

August 10, 2017

by Jean-Paul Azar

· 108,741 Views · 14 Likes

Kafka Architecture

Learn about the architecture and functionality of Kafka, the software for building real-time streaming data pipelines, in this comprehensive primer.

August 9, 2017

by Jean-Paul Azar

· 136,758 Views · 111 Likes

The Role of Predictive Analytics in DevOps

Learn how data and predictive analysis can be used by DevOps engineers to further develop and optimize the DevOps workflow.

August 5, 2017

by Badri Srinivasan

· 11,682 Views · 4 Likes

Coffee With a Data Scientist: Tuhin Chattopadhyay, Ph.D.

In the fourth issue of DZone's Coffee With a Data Scientist, we chatted with business analytics evangelist Tuhin Chattopadhyay to glean some of his expert insights and opinions on the Big Data space.

August 4, 2017

by Michael Tharrington

· 9,346 Views · 4 Likes

Event Driven Microservices Patterns

Read about the motivation behind the switch to microservices, and some of the patterns that make these applications more scalable.

August 4, 2017

by Carol McDonald

· 67,997 Views · 75 Likes

Using Airflow to Manage Talend ETL Jobs

Learn how to schedule and execute Talend jobs with Airflow, an open-source platform that programmatically orchestrates workflows as directed acyclic graphs of tasks.

August 3, 2017

by Rathnadevi Manivannan

· 22,242 Views · 4 Likes

Where Are the Biggest Opportunities for AI?

Here's what 22 executives who are familiar with AI said when we asked them, "What are the most common issues you see preventing companies from realizing the benefits of AI?"

July 31, 2017

by Tom Smith

· 10,545 Views · 3 Likes

Don't Use Apache Kafka Consumer Groups the Wrong Way!

Apache Kafka is great, but if you're going to use it, you have to be very careful not to break things. Here's how you can avoid the pain!

July 30, 2017

by Paolo Patierno

· 275,484 Views · 20 Likes

The Latest Big Data Topics