
Master Data Belongs on the Blockchain


Like all big data, master data offers important opportunities for machine learning analytics. Embedded analytics of anonymized master data can yield powerful insights.


It's easy to think of large organizations as centralized and monolithic, but the reality is often the opposite. In banking, healthcare, transportation, energy, manufacturing, and other sectors, the trend is toward decentralized locations and teams that manage their own local data. That trend carries the potential for chaos, especially for master data, where accuracy, security, and conformity are essential.

Building Master Data Management capabilities on the blockchain offers the benefits of traditional MDM while also taking advantage of powerful new paradigms for flexibility, consensus, and embedded analytics.

Let's look at the details.

Master Data Management (MDM) depends on creating consensus truth for the enterprise. If Hospital A merges with Hospital B, their big stores of master data need to merge, as well. It's critical that the process reliably matches patient records when it should, while carefully avoiding false matches. Real lives can depend on the accuracy of the master data matching process.

Traditionally, matching has meant linking the records within the two different databases, based on identifiers like Social Security number, date of birth, driver's license information, and so on. The MDM system could write the linkage information to a central database accessible from different locations. But having a single copy of the linkage data in a single location has meant that admins need to take special care to ensure that the data is highly available and secure. Private blockchain networks (also called permissioned networks) offer an intriguing alternative.
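As a rough illustration of the traditional identifier-based matching described above, here is a minimal Python sketch. The `PatientRecord` fields, the choice of Social Security number plus date of birth as the match key, and the rule of linking only unambiguous pairs are all assumptions made for the example, not a description of any particular MDM product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical record layout, chosen only for this example.
    record_id: str
    ssn: str
    date_of_birth: str  # ISO format, e.g. "1980-02-01"


def normalize(value: str) -> str:
    """Strip punctuation and spacing so '123-45-6789' equals '123456789'."""
    return "".join(ch for ch in value if ch.isalnum()).upper()


def match_key(rec: PatientRecord) -> tuple:
    """Deterministic key built from the identifiers used for matching."""
    return (normalize(rec.ssn), normalize(rec.date_of_birth))


def link_records(a_records, b_records):
    """Return (a_id, b_id) pairs whose keys agree.

    A pair is linked only when exactly one record on each side carries
    the key; anything ambiguous is left unlinked for manual review, to
    avoid false matches.
    """
    index_a, index_b = {}, {}
    for rec in a_records:
        index_a.setdefault(match_key(rec), []).append(rec)
    for rec in b_records:
        index_b.setdefault(match_key(rec), []).append(rec)

    links = []
    for key, b_group in index_b.items():
        a_group = index_a.get(key, [])
        if len(a_group) == 1 and len(b_group) == 1:
            links.append((a_group[0].record_id, b_group[0].record_id))
    return links
```

In a traditional deployment, the resulting list of pairs is the linkage information that would be written to the central database.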

A New Role for Blockchain

What started as a digital ledger for assets and currency is expanding into new realms. Over time, large enterprises will adopt distributed ledger models to record and manage biographic and biometric data. For example, imagine hospitals, banks, and governments all wanting to maintain their master data on the blockchain. But those organizations will need ways to match and link that data across private networks.

Consider Hospital A and Hospital B. If they each maintain their patients' records, how will they combine those records in the event of a merger?

The hospitals could first create a business network using blockchain technology. That offers an advantage because data sharing then happens across the blockchain network rather than through a central store. Once the teams create the network and begin sharing data on it, matching algorithms run to link the records, and the resulting linkage information is also stored natively on the blockchain.

Teams could also choose whether each node should maintain its own copy of the linkage information on the ledger. If not, the node can simply consume the linkage information that's maintained elsewhere on the network. That option keeps transaction activity from swamping any nodes that might have less compute power or connectivity while helping to ensure that the linkage data is stored redundantly across multiple nodes.
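To make the idea of linkage information stored on a ledger more concrete, here is a small Python sketch of an append-only, hash-chained list of linkage entries, plus a lightweight node that keeps no copy of its own and reads from a peer. It is a toy model under stated assumptions: the class names `LinkageLedger` and `LightNode` and the entry layout are hypothetical, and a real permissioned blockchain platform would add networking, identity, and consensus on top.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash an entry together with its predecessor's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class LinkageLedger:
    """A node that maintains its own full copy of the linkage entries."""

    def __init__(self):
        self.entries = []

    def append_link(self, record_a: str, record_b: str) -> dict:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        payload = {"a": record_a, "b": record_b}
        entry = {"prev": prev, "payload": payload,
                 "hash": entry_hash(prev, payload)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; any edit to an earlier entry breaks the chain."""
        prev = GENESIS_HASH
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != entry_hash(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True

    def linked_to(self, record_id: str) -> list:
        """Return the IDs linked to record_id, in ledger order."""
        result = []
        for entry in self.entries:
            a, b = entry["payload"]["a"], entry["payload"]["b"]
            if a == record_id:
                result.append(b)
            elif b == record_id:
                result.append(a)
        return result


class LightNode:
    """A node that stores nothing and consumes linkage data from a peer."""

    def __init__(self, peer: LinkageLedger):
        self.peer = peer

    def linked_to(self, record_id: str) -> list:
        return self.peer.linked_to(record_id)
```

The design choice mirrors the option described above: a node with limited compute or connectivity can act like `LightNode`, while better-provisioned nodes each hold a verifiable copy.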

Hopefully, the hospital example helps paint a compelling picture of the potential advantages of MDM on the blockchain, but the gains don't stop there. Consider...

  • Data reconciliation: When every participating business unit is part of the blockchain network, there's no longer a need to move data between the business units. With traditional MDM, data movement can consume an enormous amount of time and energy.
  • Cost and trust: Maintaining a central infrastructure is expensive, and a single central store is an attractive target for attackers. On a blockchain network, a transaction is committed only after the participating nodes reach consensus on it.
  • Organizational efficiency: The blockchain eliminates the need for complex reconciliation between different nodes, whether the nodes are branch banks, health clinics, distribution centers, or other peers in the system.
  • Disintermediation: Eliminates central intermediaries and reduces the fear of arbitrage within the ecosystem.
  • Transparency: Enables audit trails to be established for assets and transactions, minimizing disputes.
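The consensus and audit-trail points in the list above can be sketched in a few lines of Python. This is a simplified voting model, not a real consensus protocol: the two-thirds approval threshold, the validator names, and the transaction fields are assumptions chosen for illustration.

```python
def required_approvals(validator_count: int) -> int:
    """Smallest number of approvals that is at least two-thirds of the nodes."""
    return (2 * validator_count + 2) // 3


def propose(tx: dict, validators: dict, audit_log: list) -> bool:
    """Ask every validator to vote on tx and record the outcome.

    validators maps a node name to a function that returns True to approve.
    Every proposal is appended to audit_log, whether or not it commits,
    so that disputes can later be resolved from the recorded votes.
    """
    if not validators:
        raise ValueError("at least one validator is required")
    votes = {name: bool(check(tx)) for name, check in validators.items()}
    approvals = sum(votes.values())
    committed = approvals >= required_approvals(len(validators))
    audit_log.append({"tx": tx, "votes": votes, "committed": committed})
    return committed


# Hypothetical validation rules for three participating nodes.
example_validators = {
    "hospital_a": lambda tx: tx["a"].startswith("A"),
    "hospital_b": lambda tx: tx["b"].startswith("B"),
    "auditor": lambda tx: tx["confidence"] >= 0.9,
}
```

With three validators, a proposed link needs at least two approvals to commit, and the audit log keeps the individual votes for both accepted and rejected proposals.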

Looking Forward

Like all big data, master data offers important opportunities for machine learning analytics. Embedded analytics of anonymized master data can yield powerful insights, but machine learning can also play a role further upstream. Innovative firms will find ways to apply machine learning to the matching process itself to ensure even higher confidence for the linkages between records.
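One way to picture machine learning in the matching step is a model that turns field-level similarities into a match probability. The Python sketch below uses a logistic function over three features. The feature set, the field names, and especially the weights are hypothetical and hand-picked for illustration; in practice the weights would be learned from labeled pairs of matching and non-matching records.

```python
import math
from difflib import SequenceMatcher

# Illustrative, hand-picked weights; a real system would fit these
# to labeled match / non-match pairs rather than hard-code them.
WEIGHTS = {"name": 4.0, "dob": 3.0, "ssn": 3.0}
BIAS = -5.0


def features(rec_a: dict, rec_b: dict) -> dict:
    """Compare two records field by field, each score in [0, 1]."""
    name_sim = SequenceMatcher(
        None, rec_a["name"].lower(), rec_b["name"].lower()
    ).ratio()
    return {
        "name": name_sim,  # fuzzy, tolerates small spelling differences
        "dob": 1.0 if rec_a["dob"] == rec_b["dob"] else 0.0,
        "ssn": 1.0 if rec_a["ssn"] == rec_b["ssn"] else 0.0,
    }


def match_probability(rec_a: dict, rec_b: dict) -> float:
    """Logistic score: values near 1.0 suggest the same person."""
    feats = features(rec_a, rec_b)
    z = BIAS + sum(WEIGHTS[name] * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))
```

Unlike a strict identifier match, a score of this kind can still link "Jon Smith" to "John Smith" when the other identifiers agree, and a threshold on the probability decides which links are written to the ledger and which go to manual review.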

Ultimately, the goal is to make Master Data Management as easy and intuitive as possible. New tools will give non-technical users across industries the ability to manage master data flexibly, efficiently, and securely, with high confidence in the results.


Topics:
big data, blockchain, data security, master data management

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
