What Is Data Redundancy?

When you start compiling true big data sets, having redundant records is a massive headache. Read on to learn how to avoid redundancy.

By Garrett Alley · Feb. 04, 19 · Analysis

Data Redundancy Explained

Data redundancy occurs when the same piece of data is stored in two or more separate places. Suppose you create a database to store sales records, and you enter the customer's address in the record for each sale. If you make multiple sales to the same customer, the same address is entered multiple times. Each repeated copy of that address is redundant data.

How Does Data Redundancy Occur?

Data redundancy can be intentional. For example, suppose you back up your company's data nightly; each backup is a deliberate redundant copy. Data redundancy can also occur by mistake. For example, a database designer who creates a new record for each sale may not realize that the design causes the same address to be entered repeatedly. You may also end up with redundant data when you store the same information in multiple systems, for instance when the same basic employee information is kept both in Human Resources records and in records maintained for your local site office.

Why Data Redundancy Can Be a Problem

When data redundancy is unplanned, it can be a problem. For example, suppose you have a Customers table that includes the address as one of its fields, and several members of the Doe family, who all live at the same address, do business with you. You will have multiple entries of the same address in your database. If the family moves, you'll need to update the address for each family member, which is time-consuming and introduces the possibility of a mistake or typo in one of the entries. In addition, each unnecessary copy of the address takes up storage space, which becomes costly over time. Lastly, the more redundancy there is, the harder the data is to maintain. These problems (inconsistent data, wasted space, and extra maintenance effort) can become a major headache for companies with lots of data.
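The inconsistency described above can be sketched in a few lines of Python. This is only an illustration: the names, addresses, and the helper functions `update_address` and `distinct_addresses` are hypothetical and do not come from any particular system.

```python
# Hypothetical denormalized customer records: every family member
# carries a separate copy of the same address.
customers = [
    {"name": "John Doe", "address": "12 Elm St"},
    {"name": "Jane Doe", "address": "12 Elm St"},
    {"name": "Jimmy Doe", "address": "12 Elm St"},
]


def update_address(records, name, new_address):
    """Update the address on the record(s) matching one exact name."""
    for record in records:
        if record["name"] == name:
            record["address"] = new_address


def distinct_addresses(records, surname):
    """Return the set of addresses stored for everyone with this surname."""
    return {r["address"] for r in records if r["name"].endswith(surname)}


# The family moves, but only one of the three copies gets updated.
update_address(customers, "John Doe", "98 Oak Ave")

# One household now has two different addresses on file.
inconsistent = distinct_addresses(customers, "Doe")
print(inconsistent)
```

Because the address lives in three places, a single missed update is enough to leave the data contradicting itself.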

How Data Redundancy Is Resolved for a Database

It's rarely possible or practical to have zero data redundancy, and many database administrators consider a certain amount of redundancy acceptable as long as there is a central master record. Master data is a single source of common business data used across multiple systems or applications. It is usually non-transactional data, such as a list of customers and their contact information. With master data in place, a change to a piece of data needs to be made only once, which prevents data inconsistencies.

In addition, the process of normalization is commonly used to remove redundancies. When you normalize the data, you organize the columns (attributes) and tables (relations) of a database so that their dependencies are correctly enforced by database integrity constraints. Each set of rules for normalizing data is called a normal form, and a database is commonly considered "normalized" if it meets third normal form, at which point it is largely free of insert, delete, and update anomalies.
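As a minimal sketch of what normalization buys you, the following example uses Python's built-in `sqlite3` module with an in-memory database. The table and column names (`addresses`, `customers`, `address_id`, `street`) are assumptions made for illustration; the point is only that the address is stored once and referenced by key.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The address is stored exactly once...
    CREATE TABLE addresses (
        address_id INTEGER PRIMARY KEY,
        street     TEXT NOT NULL
    );
    -- ...and each customer refers to it by key instead of copying it.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address_id  INTEGER NOT NULL REFERENCES addresses(address_id)
    );
    INSERT INTO addresses (address_id, street) VALUES (1, '12 Elm St');
    INSERT INTO customers (name, address_id) VALUES
        ('John Doe', 1), ('Jane Doe', 1), ('Jimmy Doe', 1);
""")

# The family moves: a single UPDATE changes the one stored copy.
conn.execute(
    "UPDATE addresses SET street = ? WHERE address_id = ?",
    ("98 Oak Ave", 1),
)

# Every customer sees the new address through the join.
rows = conn.execute("""
    SELECT c.name, a.street
    FROM customers AS c
    JOIN addresses AS a ON a.address_id = c.address_id
    ORDER BY c.name
""").fetchall()
print(rows)
```

With this layout there is no second copy of the address that could be forgotten during an update.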

How Data Redundancy Is Resolved Between Different Systems

But what if you have data redundancy between multiple systems? For example, what if employee information is stored in a departmental database, a Human Resources database, and a database for your site office? Many companies handle this by integrating their data and removing the redundancies as they do so. However, this can be a time-intensive process involving multiple steps, including designing the data warehouse and cleansing the data. If you're interested, see What is Data Integration? for more details about the process of data integration.
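A very small sketch of the idea, assuming every system identifies employees by a shared `employee_id` (real integrations often have to match records without such a key, which is a large part of the cleansing work). The sample records and the `integrate` function are hypothetical.

```python
# Hypothetical extracts from three systems holding overlapping employee data.
hr_records = [
    {"employee_id": 1, "name": "Ana Ruiz", "email": "ana@example.com"},
    {"employee_id": 2, "name": "Ben Cole", "email": None},
]
department_records = [
    {"employee_id": 2, "name": "Ben Cole", "email": "ben@example.com"},
]
site_office_records = [
    {"employee_id": 1, "name": "Ana Ruiz", "email": "ana@example.com"},
    {"employee_id": 3, "name": "Cy Park", "email": "cy@example.com"},
]


def integrate(*sources):
    """Merge records from several sources into one record per employee_id.

    Earlier sources take priority; later sources only fill in fields
    that are still missing (None) in the merged record.
    """
    merged = {}
    for source in sources:
        for record in source:
            target = merged.setdefault(record["employee_id"], {})
            for field, value in record.items():
                if target.get(field) is None:
                    target[field] = value
    return merged


employees = integrate(hr_records, department_records, site_office_records)
for employee_id, record in sorted(employees.items()):
    print(employee_id, record)
```

Five source records collapse into three, one per employee, and the email missing from the Human Resources extract is filled in from the departmental one.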

Opinions expressed by DZone contributors are their own.
