
Take Big Data to the Next Level With Blockchain Networks


Blockchains securely store transaction data. See how they can be used by vertical markets to streamline transactions across ecosystems.


This article is featured in the new DZone Guide to Big Data. Get your free copy for more insightful articles, industry statistics, and more!

Bitcoin has taken the financial markets by storm, and the world's first decentralized digital currency could have a similar effect on the big data market as well. It’s not Bitcoin’s financial significance that I’m referring to, but rather its technological significance. Specifically, Bitcoin was developed to use a peer-to-peer network technology called a blockchain. A blockchain is an electronic ledger containing blocks of related records that are distributed across a peer-to-peer computer network, where every computer in the network has a copy of the ledger. A new entry can be added to the ledger only after a set of computers agrees that the transaction is valid. For example, in a Bitcoin application where person A sells a Bitcoin to person B, the validation process confirms that person A actually has a Bitcoin to sell by auditing person A’s previous transactions in the ledger. Most importantly, multiple peers in the blockchain network execute this validation process independently, so the transaction is approved by consensus rather than by a single entity.
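The validation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Transaction` type, the `mint` source account, and the simple majority vote in `submit` are hypothetical stand-ins, and a real network such as Bitcoin reaches consensus through a far more involved protocol (proof of work) rather than a head count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    sender: str
    receiver: str
    amount: int


def balance(ledger, account):
    """Audit the ledger: everything received minus everything sent."""
    received = sum(t.amount for t in ledger if t.receiver == account)
    sent = sum(t.amount for t in ledger if t.sender == account)
    return received - sent


def peer_approves(ledger, tx):
    """One peer's independent check that the sender can cover the transfer."""
    return tx.amount > 0 and balance(ledger, tx.sender) >= tx.amount


def submit(peer_ledgers, tx):
    """Append tx to every peer's copy only if a strict majority approves.

    Assumption: majority voting is a simplification of real consensus.
    """
    votes = sum(peer_approves(ledger, tx) for ledger in peer_ledgers)
    if votes * 2 <= len(peer_ledgers):
        return False
    for ledger in peer_ledgers:
        ledger.append(tx)
    return True


# Three peers, each holding its own copy of the ledger.
# "mint" is a hypothetical source account that gives A a starting coin.
genesis = Transaction("mint", "A", 1)
peers = [[genesis] for _ in range(3)]

first = submit(peers, Transaction("A", "B", 1))   # A owns one coin
second = submit(peers, Transaction("A", "B", 1))  # A has nothing left
print(first, second)
```

The second transfer is rejected because every peer, auditing its own copy, finds that A's earlier transactions leave nothing to sell.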

Blockchains can also automate much of the work typically handled by middlemen or escrow services. Middlemen (usually lawyers, notaries, or financial institutions) serve as trusted third parties that review and enforce the terms of a contract or agreement between two parties. While they provide a valuable service that helps ensure the validity of a transaction, a middleman’s responsibility for verifying information can delay the closing of a deal. For example, an escrow company will hold the funds for a real estate purchase until it can verify that funds are indeed available and that the property title can be transferred from the seller to the purchaser; it’s a process that can add days or weeks to the sale. But if the purchase were made via a blockchain, the services of an escrow company wouldn’t be needed. Since the blockchain can validate the transaction independently, both parties involved can rest assured that the transaction won’t be authorized until the network achieves consensus that all terms of the transaction have been met. This greatly accelerates the transaction’s closing time and eliminates the cost associated with having a middleman review the transaction. In essence, blockchains enable “smart contracts,” which are digital agreements with terms and penalties agreed upon in advance so the blockchain can then enforce them automatically.
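To make the escrow example concrete, here is a minimal Python sketch of a contract that settles only once every agreed condition has been confirmed. The `EscrowContract` class, its condition names, and the price are hypothetical; in a real deployment the confirmations would come from network consensus, and the contract would be written for a specific blockchain platform rather than as a plain Python object.

```python
from dataclasses import dataclass, field


@dataclass
class EscrowContract:
    buyer: str
    seller: str
    price: int
    # Terms agreed in advance; each must be confirmed before settlement.
    conditions: dict = field(default_factory=lambda: {
        "funds_deposited": False,
        "title_clear": False,
    })
    settled: bool = False

    def confirm(self, condition):
        """Mark one agreed condition as met (assumed to follow consensus)."""
        if condition not in self.conditions:
            raise KeyError(f"unknown condition: {condition}")
        self.conditions[condition] = True

    def settle(self):
        """Settle only when every agreed condition holds."""
        if not self.settled and all(self.conditions.values()):
            self.settled = True
        return self.settled


contract = EscrowContract(buyer="B", seller="A", price=250_000)
contract.confirm("funds_deposited")
early = contract.settle()        # title not yet verified, so no settlement
contract.confirm("title_clear")
final = contract.settle()        # all terms met
print(early, final)
```

The point of the design is that no party decides when to release the funds; the terms themselves do.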

In addition to providing high availability through a peer-to-peer architecture, blockchains also provide strong data security. Security in a blockchain can be a challenging concept to grasp at first: exactly how secure can a system be if everyone involved in it has access to the complete transaction records of every other participant? First, the fact that all participants in the blockchain have access to every transaction record means that in order to commit fraud, every participant’s copy of the transaction log would need to be compromised and altered. In a peer-to-peer blockchain network that could potentially include millions of client computers spread across the globe, that kind of record tampering is practically infeasible. Combine the sheer number of peers in a blockchain with the fact that the technology uses public/private key pairs to digitally sign transaction records, and it becomes readily apparent why blockchains are positioned to be the new standard for performing sophisticated, transaction-based workflows with extremely high levels of integrity.
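One mechanism behind this tamper resistance, which the paragraph above does not spell out, is that each block typically stores a cryptographic hash of its predecessor, so altering an old record invalidates the chain from that point on. The Python sketch below shows the idea with SHA-256 from the standard library; the block layout and the helper names `block_hash`, `add_block`, and `is_valid` are assumptions made for illustration, not the format of any particular blockchain.

```python
import hashlib
import json


def block_hash(index, records, prev_hash):
    """Hash a block's contents together with its predecessor's hash."""
    payload = json.dumps(
        {"index": index, "records": records, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, records):
    """Append a block that points at the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "records": records,
        "prev_hash": prev_hash,
        "hash": block_hash(index, records, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["records"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = []
add_block(chain, ["A pays B 1"])
add_block(chain, ["B pays C 1"])
before = is_valid(chain)

chain[0]["records"][0] = "A pays B 100"   # tamper with an old record
after = is_valid(chain)
print(before, after)
```

Any peer holding an honest copy can run the same check, which is why a forger would have to rewrite the copies held across the network rather than just one.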

That kind of technology standard can provide significant benefits to an entire industry, not just one company. But it will require a leap of faith among players in vertical markets, as it calls for companies (even competitors) to transact in a shared environment. While at first glance that concept might appear counterintuitive, companies can benefit by pooling and sharing access to their data. Financial organizations are already sharing data about fraudulent transactions in order to better defend against them, and cybersecurity companies are doing the same with the data they gather from successful network breaches. Essentially, it’s a “rising tide lifts all boats” philosophy that allows companies to leverage each other’s data to improve the reliability and veracity of transactions between all players in a supply chain.

Let me illustrate this with a vertical market use case: agriculture. Like nearly every industry, agriculture companies are adopting big data systems to store and analyze the massive amounts of data their organizations generate in order to find ways to improve their products and processes. What if that operational data were uploaded to a blockchain network that included data from other agricultural companies? Vendors could track raw materials and products as they move through the supply chain, and the benefits flow from there. Say a restaurant chain serving organic food wants to make sure all of its vegetable suppliers use organic fertilizer; it could check the transaction blockchain to confirm its suppliers are actually purchasing and applying organic fertilizer on the crops they grow, harvest, and sell. Or what if a pest or fungus begins to attack a certain crop? With the data available through a blockchain, analytics could quickly trace the spread of the pest through the ecosystem to find the source and take appropriate action. It could also help streamline compliance, as industry watchdogs (government agencies, for example) could access the blockchain to verify that participants are complying with industry standards and regulations.
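The restaurant's check could look something like the following Python sketch, which walks a shared ledger backwards from the restaurant to the farms and then inspects each farm's fertilizer purchases. Everything here is hypothetical: the party names, the four-field record layout, and the helpers `upstream_sources` and `uses_only_organic` are illustrative assumptions, and the trace assumes shipments never form a cycle.

```python
# Hypothetical shared ledger entries: (kind, from_party, to_party, item)
ledger = [
    ("purchase", "FertilizerCo", "FarmA", "organic fertilizer"),
    ("purchase", "ChemCo", "FarmB", "synthetic fertilizer"),
    ("shipment", "FarmA", "Distributor", "lettuce"),
    ("shipment", "FarmB", "Distributor", "tomatoes"),
    ("shipment", "Distributor", "Restaurant", "lettuce"),
    ("shipment", "Distributor", "Restaurant", "tomatoes"),
]


def upstream_sources(ledger, party, item):
    """Follow shipments of `item` backwards from `party` to its origin(s).

    Assumes the shipment records contain no cycles.
    """
    senders = {src for kind, src, dst, it in ledger
               if kind == "shipment" and dst == party and it == item}
    if not senders:
        return {party}  # nobody shipped it here, so this party is the origin
    origins = set()
    for sender in senders:
        origins |= upstream_sources(ledger, sender, item)
    return origins


def uses_only_organic(ledger, farm):
    """True if the farm bought fertilizer and all of it was organic."""
    bought = [it for kind, _, dst, it in ledger
              if kind == "purchase" and dst == farm and "fertilizer" in it]
    return bool(bought) and all(it.startswith("organic") for it in bought)


lettuce_farms = upstream_sources(ledger, "Restaurant", "lettuce")
tomato_farms = upstream_sources(ledger, "Restaurant", "tomatoes")
lettuce_ok = all(uses_only_organic(ledger, farm) for farm in lettuce_farms)
tomatoes_ok = all(uses_only_organic(ledger, farm) for farm in tomato_farms)
print(lettuce_farms, lettuce_ok, tomato_farms, tomatoes_ok)
```

The same backward walk is what a pest-outbreak analysis would rely on: starting from the affected party, it follows recorded shipments until it reaches the source.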


The diagram illustrates how a blockchain network can verify all participants in a food supply chain are providing required information and meeting their responsibilities to other participants in the network.

Setting up a blockchain transaction network will take some effort among industry players. All parties using the blockchain need to reach consensus on what kind of data is shared (for instance, data on pricing or proprietary IP would likely be excluded from the blockchain), as well as what format the data is in when shared. But once that’s established, no one player in the blockchain would be responsible for maintaining a server on behalf of the whole industry (not to mention taking on the op-ex costs and liability concerns involved in maintaining a big data network). Furthermore, blockchain technology could be relatively easy to implement on top of existing IT infrastructures thanks to the development of cloud technology. Cloud computing by definition uses a distributed network of resources, much like the distributed network of peer computers used to form a blockchain. Companies already using the cloud will be less likely to struggle with the fear, confusion, and doubt that often accompany the adoption of new technologies like blockchain.

Indeed, the blockchain model could serve as a catalyst for increased adoption of cloud and big data platforms. In industries where a blockchain network is used to store transaction data, companies that are not connected to the blockchain or not using big data analytics will be at a competitive disadvantage to those that are. As market trends change and alter the dynamics of the supply chain, non-connected companies won’t be able to spot those trends and adjust their strategies as quickly as connected competitors can. Moreover, they won’t be able to process transactions as quickly and accurately as companies connected to the blockchain, a fact that’s bound to be exploited by competitors that do use blockchain.

Time and time again, we’ve seen how the adoption of big data analytics can reveal business benefits and efficiencies that were previously unknowable. And if that big data analytics platform has access to even more data thanks to a blockchain, the potential of big data analytics to positively change business operations grows even more compelling.



Topics:
big data, blockchain, security, transactions, cloud computing

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
