Content provided by Couchbase

When Big Data is Slow

by Don Pinto · May. 01, 14 · Interview

The key to success in big data initiatives is managing the speed, scale, and structure of data at sub-millisecond latency.

Big Data is a big term. It encompasses concepts about data types, dozens of different technologies to manage those data types, and the ecosystem around all those technologies. And everything in it moves fast!

Big data is quickly evolving. A classic big data solution, the most common big data architecture in use today, relies on importing and exporting data (typically into Hadoop) via batch processes. While this has yielded tremendous business results in the form of better customer insight and predictive analysis, it is not a real-time solution. It is slow.

As technology advances at an ever-increasing rate, so do best practices for big data solutions. A modern big data solution relies on real-time data processing via stream processing and integrates with tools such as Elasticsearch and Storm. It enables real-time analysis and search while meeting operational requirements. To do so, it requires a scalable, high-performance NoSQL database that fulfills those operational requirements at the speed that real-time analysis and search demand.
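The contrast between batch and stream processing can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the event names and both function names are hypothetical, and an in-memory list stands in for what would be Hadoop jobs or a stream processor such as Storm in a real deployment.

```python
from collections import Counter
from typing import Iterable, Iterator


def batch_counts(events: Iterable[str]) -> Counter:
    """Batch style: a result exists only after the whole data set is processed."""
    return Counter(events)


def streaming_counts(events: Iterable[str]) -> Iterator[Counter]:
    """Stream style: yield an up-to-date snapshot after every single event."""
    running = Counter()
    for event in events:
        running[event] += 1
        # Copy so each snapshot is independent of later updates.
        yield running.copy()


# Hypothetical event feed used only for illustration.
events = ["click", "view", "click", "purchase"]

final_batch = batch_counts(events)
snapshots = list(streaming_counts(events))
```

Both approaches arrive at the same final counts. The difference is that the streaming version has a usable answer after the first event, while the batch version has nothing to report until the full import completes.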

A modern big data solution is only as fast as its slowest component. That brings us to a recent announcement by MongoDB and Cloudera. While we applaud every effort to help customers understand best practices for big data architecture, we must also address which NoSQL solution is the right piece to enable a truly fast big data architecture. A scalable, high-performance NoSQL database ensures that the operational database will not be the slowest component. A NoSQL database that is difficult to scale and relies on database-wide locks will fail to realize the potential of a modern big data solution. This is the difference between MongoDB and Couchbase Server. Sure, MongoDB can be a part of classic big data solutions: these were not designed for real-time analytics and don’t need the speed that a modern big data solution requires. Couchbase Server can be a part of both classic big data solutions and modern big data solutions.

A classic big data solution, as mentioned earlier, is in use at many organizations today. It typically relies on integration with Hadoop. Couchbase Server integrates with Hadoop via a Cloudera-certified Sqoop connector (link).

Matt Asay cited a classic big data use case where Hadoop analyzes the crowd and a NoSQL database interacts with the individuals. The individual interactions are fed to Hadoop and the crowd analysis is fed to the NoSQL database. For Couchbase, this isn’t just a use case. It’s a customer reference. AOL leverages Hadoop and Couchbase Server in a classic big data solution to enable intelligent advertising (link).
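The feedback loop in this use case can be sketched as follows. This is a minimal, hedged illustration rather than anyone's actual implementation: OperationalStore and analyze_crowd are hypothetical in-memory stand-ins for the NoSQL database and the Hadoop batch job, and they do not correspond to any Couchbase or Hadoop API.

```python
from collections import Counter, defaultdict


class OperationalStore:
    """Hypothetical in-memory stand-in for the NoSQL database."""

    def __init__(self):
        self.profiles = defaultdict(list)  # user_id -> items the user interacted with
        self.crowd_insights = {}           # results loaded back from the batch job

    def record_interaction(self, user_id, item):
        """Serve the individual: store each interaction as it happens."""
        self.profiles[user_id].append(item)

    def export_interactions(self):
        """Feed individual interactions to the analytics side."""
        return [(user, item) for user, items in self.profiles.items() for item in items]

    def load_insights(self, insights):
        """Feed crowd analysis back into the operational database."""
        self.crowd_insights = dict(insights)

    def recommend(self, user_id):
        """Suggest the crowd favorite unless this user has already seen it."""
        favorite = self.crowd_insights.get("most_popular")
        if favorite is None or favorite in self.profiles.get(user_id, []):
            return None
        return favorite


def analyze_crowd(interactions):
    """Hypothetical stand-in for the Hadoop job: find the most popular item."""
    counts = Counter(item for _, item in interactions)
    if not counts:
        return {}
    return {"most_popular": counts.most_common(1)[0][0]}


store = OperationalStore()
store.record_interaction("alice", "shoes")
store.record_interaction("bob", "shoes")
store.record_interaction("bob", "hats")

# Individual interactions go out to the batch analysis; crowd insight comes back.
store.load_insights(analyze_crowd(store.export_interactions()))
```

After the round trip, a new visitor can be served the crowd-level insight straight from the operational store, without waiting on the analytics system at request time.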

LivePerson leverages Hadoop, Storm, and Couchbase Server in its modern big data solution (link). The LivePerson architecture combines batch-oriented processing with real-time processing. LivePerson considered NoSQL databases from Couchbase, MongoDB, and DataStax. However, only Couchbase Server was able to meet its high-throughput requirements.

