Data Platform Update for AI and Analytics

Platform lowers TCO and integrates storage and security across on-premises, edge, and cloud deployments.

Thanks to Anoop Dawar, SVP, Product Management and Marketing, and Bill Peterson, VP, Industry Solutions, at MapR® Technologies, Inc., for taking me through changes to the MapR Data Platform that speed the operational impact of automated analytics, improve the productivity of developers and data scientists, lower TCO, and streamline security and storage across on-premises data centers, clouds

“Customers have made it clear that traditional approaches to managing and processing data for AI and Analytics leave critical gaps. In response, MapR’s newest innovations focused on enabling data scientists and developers to leverage all data for more impactful results,” said Dawar.  “The continual evolution of the MapR platform is evident in this release of new capabilities, built in close collaboration with leading customers, that powers distributed AI and analytics spanning multi-temperature, multi-protocol, on-premises, edge and cloud deployments.” 

The platform update covers four major areas of innovation: object tiering, which extends the data fabric to cloud storage; fast ingest erasure coding for more cost-effective, long-term data retention; security innovations that automatically enable security across the environment; and a new S3 API that supports next-gen applications and increases application portability.
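
To see why erasure coding can make long-term retention cheaper, it helps to compare raw storage overhead against plain replication. The sketch below is illustrative arithmetic only: the 3x replication factor, the 4+2 and 10+2 schemes, and the 100 TB figure are assumed example parameters rather than numbers from MapR, and the function names are hypothetical.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when keeping full copies."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies)


def erasure_coding_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per byte of user data with k data + m parity fragments."""
    if data_fragments < 1 or parity_fragments < 0:
        raise ValueError("need at least 1 data fragment and 0 or more parity fragments")
    return (data_fragments + parity_fragments) / data_fragments


def raw_capacity_tb(user_data_tb: float, overhead: float) -> float:
    """Raw capacity needed to hold the given amount of user data."""
    return user_data_tb * overhead


if __name__ == "__main__":
    user_tb = 100.0  # assumed example data set size
    schemes = [
        ("3x replication", replication_overhead(3)),
        ("4+2 erasure coding", erasure_coding_overhead(4, 2)),
        ("10+2 erasure coding", erasure_coding_overhead(10, 2)),
    ]
    for label, overhead in schemes:
        print(f"{label}: {raw_capacity_tb(user_tb, overhead):.0f} TB raw")
```

Under these assumptions, 100 TB of user data needs 300 TB of raw capacity with three full copies but 150 TB with a 4+2 scheme. The trade-off is extra compute and network work on writes and rebuilds, which is why erasure coding is usually aimed at colder, capacity-oriented tiers.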

A recent Gartner research note makes the case for a new approach, stating that “Data growth has far outstripped compute growth, resulting in an imbalance in system architectures. Emerging data-intensive workloads that require data-centric processing — such as AI, high-performance computing (HPC) and IoT — will expose the system imbalance, especially in data movement, resulting in new architectural innovations to address this gap.”1

New Capabilities

  • Core Data Services Innovations to Speed AI and Analytics and Lower TCO

    • Policy-driven automatic data placement across performance-optimized, capacity-optimized, and cost-optimized tiers, on-prem or in

    • Fast ingest erasure coding that can now be used for capacity-optimized tiers or with

    • Native S3 interface for next-generation applications, enabling direct analytics on operational data and transparent application portability across on-premises and multi-cloud environments

    • Advanced Secure File-based services to ensure corporate security compliance with NFSv4.

  • Simplified Development and Deployment of AI and Analytic Applications

    • High performance, continuous processing with Spark 2.3 for structured streaming and machine learning

    • Analytics toolkit support with Hive 2.3 that has over 800 JIRAs resolved

    • Non-programmer enablement to create streaming applications with KSQL

    • Simplified streaming analytics application development with Change Data Capture (CDC) and

    • Apache Drill 1.13 with expanded SQL support, high performance at scale and query experience with Hue,

    • Native language bindings (Python and Node.js) and efficient queries directly on JSON datatypes without ETL, for faster and easier database application development.

  • Streamlined Security and Critical Data Asset Protection

    • Volume-based data encryption at rest provides an additional means to prevent unauthorized access to sensitive data. Encryption is also used to avoid exposure to breaches such as packet sniffing and theft of storage devices.

    • Secure by Default enables data platform security out of the box, including core and ecosystem services, for new installations with a single click. All data can be stored encrypted, and all network connections are encrypted with authentication enabled.
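
The policy-driven data placement listed under the core data services can be pictured as a rule that maps how long data has been idle to one of the three tiers. This is a minimal sketch of that idea and not MapR's implementation or API: the `Tier` and `TieringPolicy` names, the use of last-access time as the signal, and the 30-day and 365-day thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class Tier(Enum):
    PERFORMANCE = "performance-optimized"
    CAPACITY = "capacity-optimized"
    COST = "cost-optimized"


@dataclass(frozen=True)
class TieringPolicy:
    """Hypothetical age-based policy; thresholds are illustrative only."""
    capacity_after: timedelta  # idle time before moving to the capacity tier
    cost_after: timedelta      # idle time before moving to the cost tier

    def tier_for(self, last_access: datetime, now: datetime) -> Tier:
        """Pick a tier from how long the data has gone without access."""
        idle = now - last_access
        if idle >= self.cost_after:
            return Tier.COST
        if idle >= self.capacity_after:
            return Tier.CAPACITY
        return Tier.PERFORMANCE


if __name__ == "__main__":
    policy = TieringPolicy(capacity_after=timedelta(days=30),
                           cost_after=timedelta(days=365))
    now = datetime(2018, 6, 1)
    for last_access in (datetime(2018, 5, 25),
                        datetime(2018, 1, 1),
                        datetime(2016, 1, 1)):
        tier = policy.tier_for(last_access, now)
        print(f"last access {last_access.date()}: {tier.value}")
```

A real policy would likely weigh more than idle time, such as volume, data type, or cost targets, but the shape is the same: the administrator states the rule once and the platform applies it automatically.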

1 “2018 Strategic Roadmap for Compute Infrastructure,” April 10, 2018. Copyright (c) 2018, Gartner, Inc.

Opinions expressed by DZone contributors are their own.
