
Efficient Duplicate Detection Over Massive Data Sets


This is the fourth presentation in the Data Quality module that I am presenting today.



Topics:
bigdata, big data, duplication detection

Published at DZone with permission of Pradeeban Kathiravelu, DZone MVB. See the original article here.

