Jumping the Database S-Curve

MemSQL envisions a future of AI-augmented datastores replacing distributed datastores. See how databases have evolved and where they might be going.

Gary Orenstein
·
Jan. 27, 17 · Opinion

Adaptation and Reinvention

Long-term success hinges on adaptation and reinvention, especially in our dynamic world where nothing lasts forever. In business in particular, we routinely see the rise and fall of products and companies.

The long game mandates change, and the database ecosystem is no different. Today, the megatrends of social, mobile, cloud, big data, analytics, IoT, and machine learning place us at a generational intersection.

Data drives our digital world, and the systems that shepherd it underpin much of our technology infrastructure. But data systems are morphing rapidly, and companies reliant on data infrastructure must keep pace with change. In other words, winning companies will need to jump the database S-curve.

The S-Curve Concept

In 2011, Paul Nunes and Tim Breene of the Accenture Institute for High Performance published Jumping the S-Curve: How to Beat the Growth Cycle, Get on Top, and Stay There.

In the world of innovation, an S-curve explains the common evolution of a successful new technology or product. At first, early adopters provide the momentum behind uptake. A steep ascent follows as the masses swiftly catch up. Finally, the curve levels off sharply as adoption approaches saturation.

The book details a common dilemma: too many businesses manage to only a single S-curve of revenue growth, in which a business starts out slowly, grows rapidly until it approaches market saturation, and then levels off.

The authors contrast this with stand-out, high-performance businesses that manage growth across multiple S-curves. These companies continually find ways to invent new products and services that drive long-term revenue and efficiency.
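The S-curve shape is often approximated with a logistic function. The sketch below is illustrative only and does not come from the book: the function names, the parameters (`ceiling`, `growth_rate`, `midpoint`), and the idea of summing curves to represent successive products are all assumptions made for this example.

```python
import math


def s_curve(t, ceiling=1.0, growth_rate=1.0, midpoint=0.0):
    """Logistic approximation of an S-curve (hypothetical parameters).

    ceiling     -- the saturation level the curve approaches
    growth_rate -- how steep the middle ascent is
    midpoint    -- the time at which adoption reaches half the ceiling
    """
    return ceiling / (1.0 + math.exp(-growth_rate * (t - midpoint)))


def stacked_s_curves(t, curves):
    """Total growth at time t for a business riding several S-curves.

    `curves` is a list of keyword-argument dicts, one per product line.
    """
    return sum(s_curve(t, **params) for params in curves)


if __name__ == "__main__":
    # One curve saturates; a second curve started later keeps total growth going.
    single = [{"midpoint": 5.0}]
    multiple = [{"midpoint": 5.0}, {"midpoint": 15.0}]
    for year in range(0, 21, 5):
        print(year,
              round(stacked_s_curves(year, single), 3),
              round(stacked_s_curves(year, multiple), 3))
```

Under these assumptions, a single curve flattens once it passes its midpoint, while the stacked version resumes climbing when the second curve reaches its own steep phase.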

Authors Nunes and Breene outline three traits of high-performance companies that successfully scale multiple S-curves:

  1. A big enough market insight: Companies must identify “a substantial market change on the horizon that heralds the chance to build a major business for the company that identifies and seizes the opportunity.”
  2. Threshold competence before scaling: Companies “understand exactly how they must be distinctive in order to create the value the market demands.”
  3. Worthy of serious talent: “High performers attract and keep the ‘serious talent’ they need — the people with the abilities and the attitude to drive the creation of successful businesses.”

Applying the S-Curve to Data Infrastructure

The S-curve applies to new technologies and products, and for the rest of this piece, we apply the concept to the evolving database and data management world.

The Monolithic Era

The initial era of databases and data warehouses tracked the path of monolithic, single-node, scale-up systems. As CPU, memory, and storage density increased, data volumes and performance requirements were typically satisfied by a single server’s capabilities. Over time, this expanded to very expensive servers and often more expensive storage area networks (SANs).

Architects recognized the simplicity of single-server systems and the advantages they provided for building robust architectures for mission-critical applications.

For most of the last 30 to 40 years, since the development of SQL in the 1970s, this model served the industry well.

However, when data volumes and performance requirements increase dramatically, as we see with digital megatrends, the limits of a single-server system become readily apparent.

The Distributed Era

Starting around 2007, distributed datastores like Hadoop began to take hold. Distributed architectures use clusters of low-cost servers in concert to achieve scale and economic efficiencies not possible with monolithic systems. In the past ten years, a range of distributed systems has emerged to power a new S-curve of business progress.

Examples of prominent technologies in the distributed era include, but are certainly not limited to:

  • Message queues like Apache Kafka and AWS Kinesis.
  • Transformation tiers like Apache Spark.
  • Orchestration systems like ZooKeeper and Kubernetes.

More specifically, in the datastore arena are:

  • Key-value stores like Cassandra.
  • Document stores like MongoDB.
  • Relational datastores like MemSQL.

Advantages of Distributed Datastores

Distributed datastores provide numerous advantages over monolithic systems, including:

  • Scale: Aggregating servers together enables larger capacities than single-node systems.
  • Performance: The power of many far outpaces the power of one.
  • Alignment with CPU trends: While CPUs are gaining more cores, processing power per core has not grown nearly as much. Distributed systems are designed from the start to scale out to more CPUs and cores.

Numerous economic efficiencies also come into play with distributed datastores, including:

  • No SANs: Distributed systems can store data locally to make use of low-cost server resources.
  • No sharding: Scaling monolithic systems requires attention to sharding. Distributed systems remove this need.
  • Deployment flexibility: Well-designed distributed systems will run across bare metal, containers, virtual machines, and the cloud.
  • Common core team for numerous configurations: With one type of distributed system, IT teams can configure a range of clusters for different capacities and performance requirements.
  • Industry-standard servers: Low-cost hardware or cloud instances provide ample resources for distributed systems. No appliances required.
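To make the "no sharding" point concrete, here is a minimal sketch of the hand-rolled routing an application often needs when it spreads data across several monolithic databases. Everything in it is an assumption for illustration: the server names, the `shard_for` helper, and the hash-modulo scheme are hypothetical and do not describe how any particular product partitions data.

```python
import hashlib

# Hypothetical names for four independent single-server databases.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]


def shard_for(key, shards=SHARDS):
    """Pick the server responsible for a key using hash-modulo routing.

    A stable hash (SHA-256) is used so the same key always maps to the
    same server, regardless of process or machine.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for user_id in ["user:1", "user:2", "user:3"]:
        print(user_id, "->", shard_for(user_id))
```

With this scheme the application owns the routing logic, and changing the number of servers remaps most keys. A distributed datastore typically handles partitioning of this kind internally, which is the burden the bullet above refers to.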

Together, these architectural and economic advantages form the rationale for jumping the database S-curve.

The Future of AI-Augmented Datastores

Beyond distributed datastores, the future includes more artificial intelligence (AI), used to streamline data management and improve performance.

AI will appear in many ways, including:

  • Natural language queries: Sophisticated queries expressed in business terminology, for example through voice recognition.
  • Efficient data storage: Identifying more logical patterns, compressing effectively, and creating indexes without requiring a trained database administrator.
  • New pattern recognition: Discerning new trends in the data without the user having to specify a query.

Of course, AI will likely expand data management performance far beyond these examples. In fact, in a recent news release, Gartner predicts:

“More than 40 percent of data science tasks will be automated by 2020, resulting in increased productivity and broader usage of data and analytics by citizen data scientists.”

Transcending Database S-Curves

The industry currently sits at the intersection of the monolithic and distributed datastore eras, and jumping S-curves is no easy feat. By definition, a new S-curve is significantly different from the prior one, and many technologies, skill sets, and mindsets likely do not transfer forward.

Addressing this transformation, Paul Nunes advises:

“Pay attention to insights that arise from the periphery of the organization.”

Perhaps Nunes is suggesting that successful companies already have the talent and mindset needed to jump, but need to be creative to discover them.

Adopting new products and technologies mandates change, which is rarely easy and often challenging at first. But if there is one universal truth about business, it is that staying in place rarely leads to long-term success. Companies that move forward, however, have a chance to jump the S-curve and lead another wave of growth.

Published at DZone with permission of Gary Orenstein, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
