Big Data on a Budget in Oil and Gas

Using NiFi, Tableau, and Hadoop for innovating and analyzing data in the Oil and Gas space.

The recent decline in oil prices has put pressure on the IT budgets of exploration and production (E&P) companies across the globe. However, these macroeconomic challenges also represent an opportunity to rethink legacy IT platforms for storing and processing data.

At Hortonworks, we partner with our E&P customers to renovate their data systems by adding Hortonworks Data Platform (powered by Apache™ Hadoop®) and Hortonworks DataFlow (powered by Apache NiFi).

But this post is not about cost savings for their own sake. We also help Hortonworks subscribers in oil and gas identify use cases that redirect those savings toward initiatives that grow the top line.

Renovation with Connected Data Platforms from Hortonworks allows E&P companies to supplement legacy platforms, such as an enterprise data warehouse (EDW), without losing functionality. This in turn frees up budget that can be redirected towards innovation.

That innovation delivers actionable intelligence through modern data applications, helping E&P companies address the challenges they’re facing in the current economic and technological climate. These include:

  • Historic commodity pricing pressures
  • Aging assets
  • An aging workforce that will soon retire with a wealth of industry knowledge
  • Skyrocketing volume and variety of data that needs to be collected, processed, stored, integrated, and analyzed
  • The desire to “democratize” data science by breaking down legacy data silos to enable analytic initiatives

The Challenge: Renovating Legacy Data Storage Platforms

Every E&P company that I’ve ever worked with has some sort of old, outdated hardware gathering dust in the corner. It sits alongside the newer, shinier servers. It still costs money to maintain, but it fills some need that forces the IT team to keep it around, whether for compliance reasons, concern about downtime, or the high cost of migrating data.

Hortonworks Data Platform (HDP) helps you free up those data storage resources that are holding part of your IT budget hostage, allowing you to use those resources to fund innovation where you need it most.

Most of our customers begin renovation by optimizing their data architecture with an Active Archive, ETL Onboard or Data Enrichment use case:

  • Active Archive relocates cold data from higher-cost storage platforms into HDP, without harming data availability
  • ETL Onboard moves ETL processing into HDP and frees processing capacity in higher-cost systems of record
  • Data Enrichment makes data already under management more valuable by combining it with additional data sources. For example, data scientists can join well log data with data on production and historical project costs
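The Data Enrichment join described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how HDP performs joins: the sample rows, the field names (`well_id`, `oil_bbl_per_day`, `drilling_cost_usd`), and the helper functions are all hypothetical.

```python
# Hypothetical sample rows; field names are illustrative, not a real schema.
well_logs = [
    {"well_id": "W-001", "depth_ft": 5000.0, "gamma_ray_api": 85.2},
    {"well_id": "W-001", "depth_ft": 5000.5, "gamma_ray_api": 90.1},
    {"well_id": "W-002", "depth_ft": 4200.0, "gamma_ray_api": 60.4},
]
production = [
    {"well_id": "W-001", "oil_bbl_per_day": 320.0},
    {"well_id": "W-002", "oil_bbl_per_day": 145.0},
]
project_costs = [
    {"well_id": "W-001", "drilling_cost_usd": 4_200_000},
    {"well_id": "W-002", "drilling_cost_usd": 3_100_000},
]


def index_by_well(rows):
    """Build a lookup from well_id to its row (assumes one row per well)."""
    return {row["well_id"]: row for row in rows}


def enrich(logs, *lookups):
    """Left-join each log reading with every lookup table on well_id."""
    enriched = []
    for reading in logs:
        merged = dict(reading)
        for lookup in lookups:
            # Wells missing from a lookup keep only their original fields.
            merged.update(lookup.get(reading["well_id"], {}))
        enriched.append(merged)
    return enriched


enriched_logs = enrich(
    well_logs, index_by_well(production), index_by_well(project_costs)
)
for row in enriched_logs:
    print(row)
```

Each log reading keeps its own depth and sensor value and gains the production and cost fields for its well, which is the shape an analyst would need to compare log signatures against well economics.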

E&P firms also face a challenge bringing data from the edge of their data footprint into storage, and Hortonworks DataFlow (HDF) addresses that challenge. HDF captures data and conducts it to HDP, where it becomes available for real-time decisions and historical analysis. Because the core technology behind HDF was originally developed at the US National Security Agency, it was designed with security in mind and delivers data with a fine-grained audit trail showing how each piece of data arrived at its destination.

Converting Renovation into Innovation: The LAS Files Use Case

Before we jump into a detailed description of LAS Analytics with Connected Data Platforms, watch the 5-minute LAS Analytics demo on the subject.

Reference Architecture for Hortonworks LAS Analytics Solution

Hortonworks E&P customers are already using HDP for well log analysis, as portrayed in the video. As far as HDP is concerned, a log is a log is a log. But of course analysts do data discovery on semi-structured LAS Files for a very specific reason: they want to analyze well logs to find more oil and gas.

A well log contains two basic types of information:

  1. The metadata on each log, such as the sensors that it contains, the sensor mnemonics and their units of measurement partitioned by well ID
  2. The actual readings of the well curves
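To make the two parts of a well log concrete, here is a minimal sketch that splits a small LAS file into metadata and curve readings. It assumes the unwrapped LAS 2.0 layout (sections introduced by `~`, header lines of the form `MNEM.UNIT DATA : DESCRIPTION`) and well-formed input; the sample file contents and the function names are invented for illustration, and a production pipeline would more likely rely on a dedicated LAS library.

```python
SAMPLE_LAS = """\
~Version Information
 VERS.   2.0 : CWLS LOG ASCII STANDARD - VERSION 2.0
 WRAP.   NO  : ONE LINE PER DEPTH STEP
~Well Information
 WELL.   EXAMPLE-1 : WELL
 NULL.   -999.25   : NULL VALUE
~Curve Information
 DEPT.M        : DEPTH
 GR  .GAPI     : GAMMA RAY
 RHOB.K/M3     : BULK DENSITY
~ASCII Log Data
 1670.000   75.30   2550.0
 1669.875 -999.25   2548.0
"""


def parse_header_line(line):
    """Split 'MNEM.UNIT DATA : DESCRIPTION' into its four parts."""
    mnemonic, rest = line.split(".", 1)          # first dot ends the mnemonic
    body, _, description = rest.rpartition(":")  # last colon starts the description
    unit, _, value = body.partition(" ")         # first space ends the unit
    return mnemonic.strip(), unit.strip(), value.strip(), description.strip()


def parse_las(text):
    """Return (metadata, readings) from unwrapped LAS 2.0 text."""
    metadata = {"well": {}, "curves": []}
    readings = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("~"):
            section = line[1:2].upper()  # V, W, C, A, ...
            continue
        if section == "W":
            mnemonic, _, value, _ = parse_header_line(line)
            metadata["well"][mnemonic] = value
        elif section == "C":
            mnemonic, unit, _, description = parse_header_line(line)
            metadata["curves"].append(
                {"mnemonic": mnemonic, "unit": unit, "description": description}
            )
        elif section == "A":
            readings.append([float(v) for v in line.split()])
    return metadata, readings


def readings_as_rows(metadata, readings):
    """Label each reading with its curve mnemonic; map the null marker to None."""
    names = [curve["mnemonic"] for curve in metadata["curves"]]
    null = float(metadata["well"].get("NULL", "-999.25"))
    return [
        {name: (None if value == null else value) for name, value in zip(names, row)}
        for row in readings
    ]


metadata, readings = parse_las(SAMPLE_LAS)
rows = readings_as_rows(metadata, readings)
print(metadata["curves"])
print(rows)
```

The curve section supplies the sensor mnemonics and units (item 1 above), and the data section supplies the readings (item 2), which only become meaningful once they are labeled with that metadata.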

Hortonworks DataFlow is the ideal tool to ingest this raw well log data-in-motion, along with its relevant metadata. In fact, as it transmits those logs to HDP for long-term storage, HDF can add more metadata describing any in-flight transformation or combination that it does.
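The idea of recording each in-flight transformation alongside the data can be illustrated with a small, self-contained sketch. This is plain Python, not the NiFi or HDF API: the `FlowRecord` class, its fields, and the attribute names are hypothetical stand-ins chosen only to show the pattern.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FlowRecord:
    """Hypothetical stand-in for a piece of data moving through a pipeline."""

    content: str
    attributes: dict = field(default_factory=dict)  # descriptive metadata
    lineage: list = field(default_factory=list)     # one entry per transformation

    def transform(self, step_name, func):
        """Apply func to the content and append a lineage entry for the step."""
        chars_before = len(self.content)
        self.content = func(self.content)
        self.lineage.append(
            {
                "step": step_name,
                "at": datetime.now(timezone.utc).isoformat(),
                "chars_before": chars_before,
                "chars_after": len(self.content),
            }
        )
        return self  # allow chaining


record = FlowRecord(
    content=" dept.m : depth \n",
    attributes={"well_id": "W-001", "source": "rig-sensor-7"},  # illustrative values
)
record.transform("strip_whitespace", str.strip).transform("uppercase", str.upper)

print(record.content)
for entry in record.lineage:
    print(entry)
```

Because every step appends to the lineage list, whoever reads the record at rest can see which transformations were applied to it and in what order.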

Once the data is at rest in HDP, the platform’s multi-tenant processing capabilities let engineers query the entire LAS dataset, with the log information from all of the wells. Data scientists can then overlay a schema on the data as it suits their needs.
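Overlaying schema at query time, rather than at load time, is often called schema-on-read. The sketch below shows the idea in plain Python under simplifying assumptions: the raw rows, the two schemas, and the column names are invented for illustration, and on HDP this would typically be done with a SQL-on-Hadoop engine rather than hand-written code.

```python
import csv
import io

# Raw, untyped rows as they might sit in storage (hypothetical values).
RAW_ROWS = """W-001,5000.0,85.2,2.45
W-001,5000.5,90.1,2.47
W-002,4200.0,60.4,2.31
"""

# Two hypothetical schemas over the same raw columns: (name, type) per column.
PETROPHYSICS_SCHEMA = [
    ("well_id", str),
    ("depth_ft", float),
    ("gamma_ray_api", float),
    ("bulk_density_gcc", float),
]
INVENTORY_SCHEMA = [("well_id", str), ("depth_ft", float)]


def read_with_schema(raw_text, schema):
    """Yield one typed dict per row; columns beyond the schema are ignored."""
    for fields in csv.reader(io.StringIO(raw_text)):
        yield {name: cast(value) for (name, cast), value in zip(schema, fields)}


petro_rows = list(read_with_schema(RAW_ROWS, PETROPHYSICS_SCHEMA))
inventory_rows = list(read_with_schema(RAW_ROWS, INVENTORY_SCHEMA))

print(petro_rows[0])
print(inventory_rows[0])
```

The stored data never changes; each team applies the schema that fits its question, which is why the same raw logs can serve a petrophysicist and an asset manager at the same time.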

Moreover, those data scientists can create a master table that includes LAS log data, along with other data from disparate datasets describing a well in formats like HTML, XML and CSV. When combined in HDP, these diverse sources build a universal history of the well that can include auction history, production data, rock formations, geolocation data and perforations.

LAS Data Visualization Powered by LAS Log Data Stored in Hortonworks Data Platform

Data scientists and analysts then use visualization tools already on their desktops to turn queries on that data lake into compelling, intuitive graphs. Here’s an example of that using technology from Hortonworks partner Tableau:

LAS Graphing Using Tableau Over Data Stored in HDP

Oil and gas data scientists may begin their LAS log data exploration with a specific research agenda. They can also discover new insights by taking a “random walk” through a comprehensive data lake storing years and years of detailed well logs.

Connected Data Platforms from Hortonworks power a new type of well analysis that was never before possible at scale.

Published at DZone with permission of Kenneth Smith, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
