How VAST Data’s Platform Is Removing Barriers To AI Innovation

Faster access to more data regardless of where the data resides will accelerate the adoption and success of AI-driven applications, solutions, and discoveries.

By Tom Smith · Sep. 08, 23 · Analysis

I recently had the opportunity to speak with Renen Hallak, Founder and CEO of VAST Data, about the company's new unified data platform for AI. VAST made waves in 2019 with the release of its VAST DataStore, a highly performant and scalable all-flash storage system. However, as I learned from Renen, storage was only the opening act in VAST's grander vision to become an AI data platform.

With the hype and investment around AI reaching astronomical levels, the demands on infrastructure are greater than ever. VAST aims to eliminate common compromises around performance, scale, geography, and ease of use to unlock AI's potential. On August 1st, VAST unveiled its expanded data platform, comprising a new database and compute capabilities alongside its flagship VAST DataStore.

The VAST Data Journey Started With a Revolutionary Architecture

VAST's journey began in 2016 with the creation of an innovative architecture called Disaggregated Shared Everything (DASE). According to Renen, VAST's goal from the outset was to provide AI algorithms with unfettered access to more data more quickly.

DASE completely reimagines data center design by separating storage and computing into independent resource pools that can scale in parallel. This eliminates bottlenecks like cache coherence and metadata management that restrict scale-out architectures. VAST also developed new shared data structures and protocols enabling consistent, efficient data access across the disaggregated environment.

As a result, DASE delivers previously unattainable performance at scale. It empowers AI workloads to rapidly analyze immense datasets in ways not possible with traditional infrastructure. By merging more data, faster access, and direct connectivity to analog and digital data sources, VAST believes DASE will unlock new algorithm breakthroughs.

VAST DataStore: High-Speed Unstructured Data Repository

Built on DASE, VAST's flagship product is the VAST DataStore, released in 2019. The VAST DataStore condenses SAN and NAS capabilities into a unified all-flash system specialized for unstructured data.

Leveraging the parallelism of DASE, the VAST DataStore cost-effectively offers file, object, and HPC storage using only flash memory. There is no need for a separate flash performance tier backed by slower disks that handle capacity in the background. All data enjoys rapid, random access.

The VAST DataStore efficiently handles unstructured data at exabyte scale through standard interfaces like NFS, SMB, and S3. Behind the scenes, DASE stores data in tiny elements accessed in parallel by compute resources. Features like deduplication, compression, snapshots, and QoS are implemented in real time via DASE's persistent write buffer.
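
To make the data-reduction idea concrete, here is a minimal Python sketch of content-addressed deduplication combined with compression. It is a toy illustration, not VAST's implementation: the fixed 4 KiB chunk size, SHA-256 fingerprints, zlib, and the ChunkStore class are all assumptions made for the example.

```python
import hashlib
import zlib


class ChunkStore:
    """Toy content-addressed store: identical chunks are kept once, compressed."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # fingerprint -> compressed chunk bytes
        self.files = {}    # file name -> ordered list of fingerprints

    def write(self, name: str, data: bytes) -> None:
        fingerprints = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            # Deduplication: store the chunk only if this content is new.
            if fp not in self.chunks:
                self.chunks[fp] = zlib.compress(chunk)
            fingerprints.append(fp)
        self.files[name] = fingerprints

    def read(self, name: str) -> bytes:
        # Reassemble the file from its chunk fingerprints.
        return b"".join(zlib.decompress(self.chunks[fp]) for fp in self.files[name])

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore()
payload = b"A" * 4096 * 3 + b"B" * 4096   # three identical chunks plus one distinct
store.write("a.bin", payload)
store.write("b.bin", payload)             # a second copy adds no new chunks
```

Two logical files totaling eight chunks end up as two stored chunks, which is the effect deduplication aims for; a production system would do this inline and at far finer granularity.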

New VAST DataBase and VAST DataEngine Expand Capabilities

Building on the VAST DataStore's success, VAST Data recently announced its expanded platform, introducing the VAST DataBase and VAST DataEngine. Together with the VAST DataStore, these form a unified environment for data-centric AI spanning ingestion, storage, processing, and querying.

The VAST DataBase leverages DASE to deliver a hyperscale database for both transactional and analytical workloads. Using an innovative columnar format, the VAST DataBase reduces data sizes for lightning-fast query performance at scale. DASE allows simultaneous OLTP inserts and OLAP queries with no tradeoffs. The database also serves as a metadata catalog across unstructured data in the VAST DataStore.
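
The article does not describe the VAST DataBase's on-disk format, so the following Python sketch only illustrates the general reason columnar layouts shrink data and speed up analytical queries. The sample rows, the to_columns and run_length_encode helpers, and the choice of run-length encoding are all hypothetical.

```python
from itertools import groupby

# Hypothetical sample rows, as a transactional insert path might receive them.
rows = [
    {"region": "eu", "status": "ok", "latency_ms": 12},
    {"region": "eu", "status": "ok", "latency_ms": 15},
    {"region": "eu", "status": "ok", "latency_ms": 11},
    {"region": "us", "status": "ok", "latency_ms": 40},
    {"region": "us", "status": "error", "latency_ms": 38},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    return {key: [row[key] for row in rows] for key in rows[0]}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)

# Values within one column tend to repeat, so they encode compactly.
encoded_region = run_length_encode(columns["region"])

# An analytical query reads only the column it needs.
avg_latency = sum(columns["latency_ms"]) / len(columns["latency_ms"])
```

Five region values collapse into two (value, count) pairs, and the average-latency query never touches the region or status columns.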

The VAST DataEngine enables processing data workloads directly within the global data fabric. It can optimize task placement based on factors like data locality and cost. Developers can create recursive compute loops triggered by data events anywhere in the fabric. This continuous processing paradigm supercharges data-driven AI workflows.
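
As a rough sketch of what placement based on data locality and cost could look like, the Python example below picks the site with the lowest combined compute and data-movement cost. The Site fields, the prices, and the cost model are assumptions for illustration and do not reflect the VAST DataEngine's actual scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    compute_cost_per_hour: float   # hypothetical price of running the task here
    transfer_cost_per_gb: float    # hypothetical price of pulling remote data in


def placement_cost(site: Site, data_gb_by_site: dict, hours: float) -> float:
    """Compute cost plus the cost of moving every gigabyte that is not already local."""
    remote_gb = sum(gb for name, gb in data_gb_by_site.items() if name != site.name)
    return site.compute_cost_per_hour * hours + site.transfer_cost_per_gb * remote_gb


def choose_site(sites, data_gb_by_site, hours):
    """Pick the site with the lowest total estimated cost."""
    return min(sites, key=lambda s: placement_cost(s, data_gb_by_site, hours))


sites = [
    Site("on_prem", compute_cost_per_hour=4.0, transfer_cost_per_gb=0.02),
    Site("cloud", compute_cost_per_hour=2.0, transfer_cost_per_gb=0.09),
]
data_gb_by_site = {"on_prem": 900.0, "cloud": 100.0}   # where the input data lives
best = choose_site(sites, data_gb_by_site, hours=2.0)
```

Even though the cloud site has cheaper compute in this made-up scenario, most of the data sits on-prem, so running the task there avoids the larger transfer bill.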

VAST DataSpace: Limitless Data Fabric Powering AI Innovation

Tying everything together is VAST DataSpace, a global namespace unifying data silos across on-prem, cloud, and edge locations. This groundbreaking data accessibility allows apps to harness data without central ownership. Instead of moving data to compute, compute comes to the data for optimal efficiency.
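
One simple way to picture a global namespace is a single path tree whose subtrees are served from different locations. The Python sketch below resolves a path to its serving location by longest matching prefix; the MOUNTS table, the location names, and the resolve function are hypothetical and are not part of any VAST interface.

```python
# Hypothetical mapping of namespace prefixes to the location that serves them.
MOUNTS = {
    "/global/render": "on_prem_dc",
    "/global/render/archive": "cloud_bucket",
    "/global/sensors": "edge_site",
}


def resolve(path: str) -> str:
    """Return the location serving `path`, using the longest matching prefix."""
    matches = [p for p in MOUNTS if path == p or path.startswith(p + "/")]
    if not matches:
        raise KeyError(f"no location serves {path}")
    return MOUNTS[max(matches, key=len)]
```

An application asks for /global/render/shot1.exr and never needs to know which site holds it; the namespace layer answers that question.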

With a unified data fabric removing traditional limitations, exciting new AI use cases emerge. VAST customer Pixar revolutionized animated film production through globally shared datasets. Online travel giant Agoda uses VAST to power its entire big data and machine learning pipeline.

By eliminating compromises around data access, VAST Data is pioneering the next evolution of AI infrastructure. Performance, scale, geography, and ease-of-use barriers are collapsing, allowing enterprises to focus on innovations rather than infrastructure. VAST Data is unlocking a new era where ideas, not technology constraints, determine the boundaries of AI innovation.

The Possibilities With the Unified VAST Data Platform

The capabilities enabled by VAST Data's unified platform are diverse, spanning real-time analytics, model training, database applications, and more. Let's explore some use cases:

Real-Time Analytics

For real-time analytics, the VAST DataStore offers ultra-fast access to vast amounts of unstructured data. The VAST DataBase facilitates ad hoc analytical queries across billions of rows of structured data. Bringing these together in VAST DataSpace allows for rapid analysis correlating unstructured and structured data streams.

Continuous Model Training

The VAST DataEngine enables continuous model training workflows. As new unstructured data lands in the VAST DataStore, events trigger model training jobs to execute in VAST DataSpace using the latest data. Results get written for immediate inference access.
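
The event-triggered pattern described here can be sketched in a few lines of Python. This is a generic illustration, not the VAST DataEngine API: the EventBus class, the object_created event name, and the retrain handler are all hypothetical, and the "training" step is a placeholder.

```python
from collections import defaultdict


class EventBus:
    """Minimal publish/subscribe bus: handlers run when a matching event is emitted."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def emit(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)


bus = EventBus()
dataset = []          # stands in for objects landing in a data store
model_versions = []   # stands in for results published for inference


def retrain(payload):
    dataset.append(payload["path"])
    # Placeholder "training": record how many objects this version has seen.
    model_versions.append(
        {"version": len(model_versions) + 1, "trained_on": len(dataset)}
    )


bus.on("object_created", retrain)
bus.emit("object_created", {"path": "/ingest/frame-0001.png"})
bus.emit("object_created", {"path": "/ingest/frame-0002.png"})
```

Each new object produces a new model version trained on everything seen so far; a real pipeline would likely batch events rather than retrain on every arrival.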

Cloudbursting

To scale analytics or training workloads, VAST DataSpace can burst into the public cloud while maintaining a unified global namespace. This allows leveraging cloud resources for extra capacity without data migration.

Hyperscale Database

The VAST DataBase's simultaneous OLTP and OLAP support at extreme scale provides an ideal foundation for large-scale transactional applications that also require analytical insights.

Data Lakes

For data lake needs, the VAST DataStore offers a centralized repository for all enterprise data. The VAST DataBase provides a metadata catalog of data assets. VAST DataSpace ties everything together into a cohesive environment.

In summary, the unified nature of the VAST Data platform lends itself to an array of data-intensive use cases. By removing infrastructure limitations, the possibilities are endless.

The Road Ahead for VAST Data

VAST shows no signs of slowing down. The company recently raised $210 million at a $3.7 billion valuation. VAST is aggressively expanding, including the launch of a new R&D facility focused on advancing DASE technologies.

Some areas VAST is innovating on include:

  • Making DASE accessible as a composable data services fabric
  • Expanding global file system capabilities
  • New data reduction techniques like DNA compression
  • Optimizations for AI/ML and GPGPU workloads
  • Zone storage tiering for low-latency data access
  • Hybrid and multi-cloud data management

Additionally, Renen hinted at expanding VAST's market focus beyond AI and analytics into emerging areas like ML Ops, the metaverse, and Web 3.0.

It's an exciting time to watch how pioneers like VAST Data reshape the limits of what's possible with data. As innovations in AI and next-generation applications create immense data demands, the companies fulfilling these infrastructure needs will power the most groundbreaking advancements.

Opinions expressed by DZone contributors are their own.
