Capitalising further on the $1 billion investment announced in January 2014 around the launch of the Watson Group to develop future big data, analytics, and cognitive computing capabilities, IBM has brought together its entire big data and analytics portfolio under one logical roof – the IBM Analytics Platform.
The IBM Analytics Platform includes many of the Big Data capability clusters identified in MWD Advisors’ vendor landscape report, and adds others focused on specific data sources like content management platforms, along with a framework for data governance and management:
- Hadoop – IBM Hadoop system software, comprising capabilities from the BigInsights and InfoSphere product families, combines an open source distribution of Apache Hadoop with IBM analytics tools, including integrated SQL-on-Hadoop as well as connectors to IBM and other vendors’ analytics platforms higher up the stack. Products are available both for on-premise deployment and as SaaS subscription offerings.
- NoSQL – NoSQL database technology is not currently a big marketing focus for IBM within its Analytics Platform. However, in addition to IBM BigInsights for Apache Hadoop’s Big SQL (touted as a means to invoke SQL queries against data in HDFS), the company does have products and services outside the platform which sport NoSQL capabilities: products from the Informix, DB2 and Cloudant portfolios.
- Streaming analytics – IBM’s stream computing platform, InfoSphere Streams, consists of an Eclipse-based integrated development environment, analytic toolkits, and a high-speed, distributed, scaled-out runtime architecture. The InfoSphere Streams roadmap can be traced back to 2003, though in its first five years releases were confined to the US Government (and deployed for national security use cases). The first edition of a generally-available product came in May 2009, with the latest version (v4.0) launched in March 2015.
- Enterprise data warehouse – As well as offerings in relatively new capability clusters like Hadoop and stream processing, IBM has a mature portfolio of data warehousing software, products, tools, and models – some of which can trace their lineage back to 1980s mainframe computing. The big data portfolio now touches the DB2 BLU Acceleration offering as well as some of the Informix Warehouse products, PureData appliances, and SoftLayer-hosted dashDB.
- Data governance and management – The data governance and management capabilities in the IBM Analytics Platform come in the shape of the Information Integration and Governance and Data Refinement families of tools core to IBM’s InfoSphere portfolio. They’re used here to underpin how the components of the Analytics Platform increase the value of data earmarked for analytics workloads: bringing it together from diverse sources; managing its quality and consistency and determining its provenance (and hence enabling confidence in the insights derived from it); securing and protecting it whilst in play and at rest; and maintaining the enterprise’s master data that drives operational business processes.
There’s little denying IBM’s breadth of ambition in covering, end to end, all that most organisations would conceive of needing in order to “harness all their data” and “deploy a full range of analytics”. What’s currently unclear, though, is the extent to which this coming-together of a raft of technical capabilities (available through multiple modes of delivery) can deliver as much cohesion in real-world deployments as it does logically on paper.
This is an extract from a new report in our Discovering and Acting on Insights research theme, by Principal Analyst Craig Wentworth.