[DZone Research] The Three Vs of Big Data


A look at how the volume, variety, and velocity of big data sets impact the work of software developers and data scientists.


This article is part of the Key Research Findings from the 2018 DZone Guide to Big Data: Stream Processing, Statistics, and Scalability.


For the 2018 DZone Guide to Big Data, we surveyed 540 software and data professionals to get their thoughts on various topics surrounding the field of big data and the practice of data science. In this article, we focus on how respondents told us their work is affected by the volume, variety, and velocity of data.

The Three Vs

The concept of “Big Data” has been a difficult one to define. The sheer amount (volume) of data that can be stored on a single hard disk drive, solid state drive, or SD card continues to grow, and hardware has advanced so quickly that I still remember buying a computer with a 10-gigabyte hard drive and being told by the salesperson at the electronics store, “you’ll never need more computer storage again.” With newer options such as cloud and hybrid storage, storing large volumes of data is no longer a major obstacle, though it still takes planning. The complications added by “Big Data” include dealing not only with data volume but also data variety (how many different types of data you have to deal with) and data velocity (how fast the data is being added).

Beyond that, “Big Data” is complicated by the fact that simply storing this data is not enough; to get value from the data being collected, it needs to be analyzed as well as stored. 76% of our survey respondents said they have to deal with large quantities of data (volume), while 46% said they have to work with high-velocity data, and 45% said they have to work with highly variable data.
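To make the velocity dimension concrete, here is a minimal sketch (not from the survey; the class and field names are illustrative) of a common stream-processing primitive: counting how many events arrived within a recent time window, evicting events as they age out.

```python
from collections import deque

class SlidingWindowCounter:
    """Count events seen in the last `window_seconds` of a stream."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.timestamps = deque()

    def record(self, ts):
        # Add the new event, then evict anything older than the window.
        self.timestamps.append(ts)
        cutoff = ts - self.window
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()

    def count(self):
        return len(self.timestamps)

counter = SlidingWindowCounter(window_seconds=60)
for ts in [0, 10, 30, 65, 70]:  # event arrival times in seconds
    counter.record(ts)
print(counter.count())  # events at 0 and 10 have aged out, leaving 3
```

Production stream processors (Flink, Kafka Streams, Spark Streaming, and the like) generalize this idea with distributed state and fault tolerance, but the core windowing logic is the same.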

Each of these “Big Data” categories comes with its own set of challenges. The most challenging data sources for those dealing with high-volume data were files (47%) and server logs (46%), and the most challenging data types were relational (51%) and semi-structured (e.g., JSON, XML; 39%). For those dealing with high-velocity data, server logs and sensors/remote hardware (both 42%) were the most challenging data sources, and semi-structured (36%) and complex data (e.g., graph, hierarchical; 30%) were the most challenging data types. Finally, regarding data variety, the biggest data source challenges came from files (56%). Server logs, sensor/remote hardware data, ERP and other enterprise systems, user-generated data, and supply-chain/logistics/other procurement data all fell between 28% and 32% of responses labeling these sources as “challenging.”
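Part of why semi-structured sources like JSON server logs rank as challenging is that analysis tools generally expect flat, tabular rows. As a rough illustration (the log fields here are hypothetical, not from the survey), a small sketch that flattens nested JSON records into dot-separated columns:

```python
import json

def flatten(record, prefix=""):
    """Recursively flatten a nested dict into dot-separated keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat

raw = '{"ts": "2018-07-01T12:00:00Z", "request": {"path": "/api", "status": 200}}'
row = flatten(json.loads(raw))
print(row)
# {'ts': '2018-07-01T12:00:00Z', 'request.path': '/api', 'request.status': 200}
```

Rows produced this way can be loaded into a relational table or a dataframe, which is one common way teams bridge the gap between variable source formats and conventional analytics.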


The Three Vs are a great way to conceptualize the basic building blocks of big data. But as data volume continues to grow due to the increased adoption of IoT devices, increased access to the internet across the world, and more, the variety and velocity of data will also continue to increase, potentially compounding the issues outlined above. 

What are your preferred methods and tools for working with the Three Vs?



Opinions expressed by DZone contributors are their own.
