Data Protection for Operational Reporting Platforms

As enterprises onboard business-critical applications on this next-generation technology stack, they need to prepare for failures or errors that may disrupt customer-facing applications.

by Jeannie Liou · May. 04, 16 · Opinion

Making real-time business decisions based on data from social, mobile, and cloud platforms is a requirement in today’s world. That’s why an ecosystem of next-generation technologies for operational reporting, including Spark, Apache Cassandra, Kafka, Docker, Mesos, and Marathon, has come to life. Enterprises are using these technologies to re-architect their data storage and processing frameworks to achieve lower costs, scalability, and faster response times. However, as enterprises onboard business-critical applications onto this next-generation technology stack, they need to prepare for failures or errors that may disrupt customer-facing applications.

One of the key benefits that operational reporting platforms provide is real-time visibility, which allows businesses to make strategic decisions based on actionable insights. However, the new applications generate a large amount of data, and archaic operational reporting platforms (those based on traditional data warehouses, data marts, and ETL logic) are unable to keep up. As a result:

  • Business decisions are based on stale data
  • Fragmented data increases the risk of errors
  • Multiple copies of data increase storage costs

These issues have driven many progressive, data-centric enterprises to re-architect their operational reporting platforms. In this new architecture, Kafka serves as a persistent message bus that ingests data from relational stores and provides failure resiliency and message buffering. Spark or Spark Streaming may be used for transformation, aggregation, and other lightweight stream processing. These processing nodes may be deployed in Docker containers. Finally, Apache Cassandra serves as a distributed data storage layer that provides linear scalability and high availability. Kafka also makes it possible to add further data consumers, such as Hadoop/HBase through Flafka.
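The flow described above can be pictured with a small in-memory sketch. It uses only the Python standard library and does not call any Kafka, Spark, or Cassandra API: `MessageBus`, `aggregate`, `ReportStore`, and `run_pipeline` are hypothetical names invented for illustration, and the order records are made-up sample data. The point it tries to show is that a retained log lets the reporting job and a second consumer read the same messages independently.

```python
from collections import defaultdict


class MessageBus:
    """Toy stand-in for a Kafka topic: an append-only log that buffers messages."""

    def __init__(self):
        self.log = []                    # messages stay in the log after being read
        self.offsets = defaultdict(int)  # next unread position, tracked per consumer

    def publish(self, message):
        self.log.append(message)

    def poll(self, consumer):
        """Return the messages this consumer has not seen yet and advance its offset."""
        start = self.offsets[consumer]
        self.offsets[consumer] = len(self.log)
        return self.log[start:]


def aggregate(messages):
    """Toy stand-in for a Spark job: total order amount per region."""
    totals = defaultdict(float)
    for message in messages:
        totals[message["region"]] += message["amount"]
    return dict(totals)


class ReportStore:
    """Toy stand-in for a Cassandra table keyed by region."""

    def __init__(self):
        self.rows = {}

    def upsert(self, totals):
        for region, amount in totals.items():
            self.rows[region] = self.rows.get(region, 0.0) + amount


def run_pipeline(bus, store):
    """Read one batch from the bus, aggregate it, and write it to the store."""
    batch = bus.poll("reporting")
    store.upsert(aggregate(batch))
    return len(batch)


bus = MessageBus()
store = ReportStore()
for row in [
    {"region": "emea", "amount": 10.0},
    {"region": "apac", "amount": 5.0},
    {"region": "emea", "amount": 2.5},
]:
    bus.publish(row)

run_pipeline(bus, store)

# A second, independent consumer (for example a Hadoop loader) still sees every message.
archived = bus.poll("archive")
```

In a real deployment each stage would be a separate distributed service; the sketch only mirrors the division of responsibilities between buffering, processing, and storage.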
[Figure: Datos IO - operational reporting platforms]

However, as enterprises adopt operational reporting platforms, they must ensure that bad data feeds and human error do not result in data loss. Apache Cassandra natively replicates data, but replication alone does not provide the capability to go back in time and recover data efficiently and at scale. That’s where Datos IO RecoverX comes in. RecoverX, scale-out data protection software, is an important element of this next-generation distributed data services architecture.
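To see why replication by itself does not protect against a bad feed, consider the toy model below. It is an assumption-laden illustration in plain Python, not Cassandra's replication protocol and not how RecoverX works: `ReplicatedTable`, its methods, and the sample keys are all hypothetical. A bad write is copied to every replica just as faithfully as a good one, so recovery needs a saved earlier version of the data.

```python
import copy


class ReplicatedTable:
    """Toy table that copies every write to all replicas (hypothetical model)."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.snapshots = []  # saved point-in-time copies, indexed by version

    def write(self, key, value):
        # Replication applies good and bad writes alike to every copy.
        for replica in self.replicas:
            replica[key] = value

    def snapshot(self):
        """Save the current state and return its version number."""
        self.snapshots.append(copy.deepcopy(self.replicas[0]))
        return len(self.snapshots) - 1

    def restore(self, version):
        """Reset every replica to the state saved under the given version."""
        saved = self.snapshots[version]
        self.replicas = [copy.deepcopy(saved) for _ in self.replicas]


table = ReplicatedTable()
table.write("order-1", "shipped")
good_version = table.snapshot()

# A bad data feed overwrites the row; every replica now holds the bad value.
table.write("order-1", "CORRUPT")
corrupted_everywhere = all(r["order-1"] == "CORRUPT" for r in table.replicas)

# Only a saved earlier version lets us go back in time.
table.restore(good_version)
```

The sketch ignores everything that makes this hard in practice, such as taking consistent versions across many nodes and storing them efficiently, which is the gap the article says dedicated data protection software is meant to fill.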

As enterprises move their critical applications onto this stack to gain development agility, scalability, and lower operating costs, they are also investing, rightly, in protecting those applications from data loss. If you are building your next-generation infrastructure around distributed databases, please get in touch with our experts, who can guide you through different techniques to ensure data protection.

Shalabh Goyal

Shalabh Goyal is a Product Manager at Datos IO. Before Datos IO, Shalabh Goyal worked for EMC as a Principal Product Manager, where he led backup and recovery products for virtualized and cloud environments. He holds an MBA from UC Berkeley and a Ph.D. from Georgia Tech.

Published at DZone with permission of Jeannie Liou, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
