Design Considerations for Real-Time Operational Data Architectures for Industry 4.0

We examine the rise of the Industry 4.0 era and how IoT and big data processes can help us meet its challenges.

By mahendraprabu sundarraj · Jun. 21, 19 · Analysis


[Image: Data Architecture Design]

The fourth Industrial Revolution brought cyber-physical systems to the manufacturing floor, leading to the production of data at an unforeseen volume. Most current statistical process control (SPC) systems were designed in the Industry 3.0 era, and they use only a fraction of the data produced on production lines to monitor production quality.

However, to reduce waste and increase yield, new-age manufacturing systems need real-time operational data replication and analysis. This article lists the key design considerations for a real-time operational data architecture.

Separate Storage and Compute

Building a centralized data lake or data warehouse to consolidate all machine and application data may be an easy first decision. However, careful consideration is needed to separate storage and compute in order to provide greater scalability and performance. Below are some of the benefits.

Ability to store diverse data types – IoT in Industry 4.0 keeps introducing new data formats, and companies often realize the data's potential only months or years after it is generated. Separating storage from compute allows organizations to onboard new data formats quickly without having to worry about how to process the data.

Fast data replication – Most packaged data marts, where storage and compute are coupled together, expect the data to be prepared before ingestion. Choosing platforms such as data lakes, where storage is decoupled from compute, allows companies to store any data format without preparation. Sometimes, data replication can be avoided altogether if distributed query engines can work directly against distributed data sources or data marts.

Fast data retrieval – Once compute is separated from storage, it can be scaled up or down according to need. New-age distributed query engines come with sophisticated autoscaling features.
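As a toy illustration of the schema-on-read idea behind decoupling storage from compute (all class and function names below are hypothetical, and real systems would use object storage and a distributed query engine), the sketch stores raw payloads untouched and defers parsing to query time:

```python
import csv
import io
import json

class ObjectStore:
    """Storage layer: keeps raw payloads by key and never inspects or
    prepares the data, so any new format can be onboarded immediately."""
    def __init__(self):
        self._blobs = {}

    def put(self, key, payload: bytes, fmt: str):
        self._blobs[key] = (payload, fmt)

    def get(self, key):
        return self._blobs[key]

def read_records(store, key):
    """Compute layer: interprets the payload only at query time
    (schema-on-read), independently of how it was stored."""
    payload, fmt = store.get(key)
    if fmt == "json":
        return json.loads(payload)
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(payload.decode())))
    raise ValueError(f"no reader registered for format {fmt!r}")

store = ObjectStore()
store.put("line1/temps", b'[{"sensor": "t1", "c": 71.2}]', "json")
store.put("line2/temps", b"sensor,c\nt2,69.8\n", "csv")
print(read_records(store, "line1/temps"))
print(read_records(store, "line2/temps"))
```

Because the storage layer never parses anything, adding a new machine format is a `put` call today and a new reader whenever the data is actually needed.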

Leverage Edge Computing

Unlimited computing is available in the cloud, which makes it tempting to send all the data there and let the cloud prepare it for consumption. At the outset, this looks like the right decision, since data replication speed is at its best in the cloud. However, replicated data is not immediately useful for consumption, as it needs further preparation. Moreover, continuous data preparation in the cloud is not cheap, and its cost grows in direct proportion to data volume.

Edge computing layers attached to the machines can solve this problem. Data can be curated and structured at the edge, closer to the source, reducing the compute needed in the cloud and making the data ready for consumption much faster. Whenever new intelligent machines are added to the production floor or the supply chain, computing cost goes up; hence, making the most of edge computing helps create a sustainable, cost-effective architecture for the industrial world.
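One common form of edge-side curation is windowed aggregation: raw sensor readings are collapsed into compact summaries before anything leaves the machine. The sketch below is a minimal, hypothetical example of that pattern (the function name and window size are illustrative, not from any particular edge platform):

```python
from statistics import mean

def summarize_window(readings, window_size=4):
    """Edge-side curation: collapse raw sensor readings into per-window
    summaries so only compact, prepared records travel to the cloud."""
    summaries = []
    for start in range(0, len(readings), window_size):
        window = readings[start:start + window_size]
        summaries.append({
            "count": len(window),
            "min": min(window),
            "max": max(window),
            "mean": round(mean(window), 2),
        })
    return summaries

raw = [70.1, 70.4, 71.0, 70.8, 72.3, 72.1, 71.9, 72.0]
# Eight raw readings shrink to two summary records before leaving the edge.
print(summarize_window(raw))
```

The cloud then ingests a few summary records per machine per interval instead of the full raw stream, which is where the cost savings come from.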

Metadata Management

There are multiple well-documented advantages to maintaining metadata. Below are some of the advantages of centrally maintaining metadata in an industrial setting.

Distributed query systems – If metadata is centrally maintained, it can present users with the experience of querying a single database even though the data is stored across distributed data marts. Frequently queried operational data can be kept in a database, while everything else is preserved in inexpensive object storage.

Data quality – Metadata can help validate the data before it is ingested into data marts, leading to high data quality and reduced data-cleansing effort. Centralized metadata repositories, combined with API engines, can help ensure data quality throughout the enterprise.

The above advantages — distributed query engines and data quality — also help to increase data retrieval speed.
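The data-quality check can be sketched as a minimal, hypothetical metadata registry: each dataset's expected fields and types are declared once, and every record is checked against that declaration before ingestion (the dataset and field names here are invented for illustration):

```python
# Hypothetical central metadata registry: one declaration per dataset,
# consulted by every ingestion path.
METADATA = {
    "press_line_telemetry": {
        "machine_id": str,
        "temperature_c": float,
        "cycle_count": int,
    }
}

def validate(dataset, record):
    """Return a list of quality issues; an empty list means the record
    may be ingested into the data mart."""
    schema = METADATA[dataset]
    issues = []
    for field, expected in schema.items():
        if field not in record:
            issues.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            issues.append(f"bad type for {field}: expected {expected.__name__}")
    return issues

good = {"machine_id": "P-07", "temperature_c": 81.5, "cycle_count": 1042}
bad = {"machine_id": "P-07", "temperature_c": "hot"}
print(validate("press_line_telemetry", good))  # []
print(validate("press_line_telemetry", bad))
```

Exposing `validate` behind an API engine, as the article suggests, lets every producer in the enterprise reuse the same central definition instead of duplicating checks.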

Optimized File Formats

Optimized file formats are hugely popular in big data systems. However, their usefulness goes beyond big data storage: they can enable low-latency data transfers as well. For example, data can be converted to a columnar format close to the edge and transmitted to the destination in that form, rather than in traditional formats such as CSV, XML, or JSON.

Using a columnar format, one can often reduce a data file to roughly one-fourth of its original size, which expedites the data transfer process. If the destination is a data lake or a big data file system such as HDFS, these files can be stored without further formatting and consumed immediately.
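The size effect can be illustrated with the standard library alone (a real pipeline would use a format such as Parquet or ORC): the same records are serialized row-wise and column-wise, then compressed. In the column layout each field name appears once and similar values sit next to each other, which is what the optimized formats exploit.

```python
import json
import zlib

# Illustrative telemetry records; the field names are invented.
rows = [{"sensor": f"s{i % 3}", "temp_c": 70 + i % 5, "status": "OK"}
        for i in range(1000)]

# Row-wise serialization repeats every field name in every record.
row_json = json.dumps(rows).encode()

# Column-wise layout: field names appear once; values are grouped by field.
columns = {field: [r[field] for r in rows] for field in rows[0]}
col_json = json.dumps(columns).encode()

print(len(row_json), len(col_json))  # the columnar form is much smaller
print(len(zlib.compress(row_json)), len(zlib.compress(col_json)))
```

Formats like Parquet go further than this sketch, with per-column encodings and compression, but the grouping shown here is the core of the transfer-size advantage.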

The points above are critical design considerations whether you are creating a data system from scratch or implementing a packaged solution.


Opinions expressed by DZone contributors are their own.
