Automating Over the Data Skills Gap

We're all asked to do more and more with less and less: less money and less time. DARPA has started a project to help with this by automatically extracting models from data. Learn what they're up to here.


A few weeks ago I chaired a session at EdTechX in London on the skills gap and what might be the crucial skills for the future. Certainly one of the key skills of the present is data science, with a recent report from CrowdFlower revealing the ongoing shortage of people with data skills. It found that a staggering 83% of respondents were struggling to find people to fill vacancies in data science-related roles.

Of course, this is not a new trend, with Gartner highlighting the issue way back in 2012, but the CrowdFlower data suggests things are getting worse, not better.

Automating the Problem Away

To try to rectify matters, DARPA has recently set up a new program called Data-Driven Discovery of Models (D3M). D3M aims to help develop automated means of crossing the data skills gap, allowing non-experts to build their own complex models. They will be able to do this because much of the back-end work behind such models will be automated.

In many ways, therefore, it’s doing for data science what WYSIWYG editors did for web development 20 years ago and visual programming environments have done for coding.  DARPA believe that it will be akin to allowing relative novices to behave like virtual data scientists.

“We have an urgent need to develop machine-based modeling for users with no data-science background. We believe it’s possible to automate certain aspects of data science, and specifically to have machines learn from prior example how to construct new models,” DARPA say.

The overall aim is to open up to non-specialists the ability to create complex empirical models in areas where they have subject matter expertise but little in the way of data science capabilities.

“This capability will enable subject matter experts to create empirical models without the need for data scientists, and will increase the productivity of expert data scientists via automation. The automated model discovery systems developed by the D3M Program will be tested on real-world problems that will progressively get harder during the course of the program. Toward the end of the program, D3M will target problems that are both unsolved and underspecified in terms of data and instances of outcomes available for modeling,” they conclude.
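To make the idea of automated model discovery a little more concrete, here is a minimal sketch in Python. It is not DARPA's method or any part of D3M; it simply illustrates one basic form of automation, in which the machine tries several candidate models and keeps whichever predicts held-out data best. The candidate models, the function names (such as discover_model), and the choice of four-fold cross-validation are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical candidate models. Each "fit" function takes training data
# and returns a predictor function. None of this comes from D3M itself.

def fit_constant(xs, ys):
    """Always predict the mean of the training outcomes."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Ordinary least squares for a single input variable."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_nearest(xs, ys):
    """Predict the outcome of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

CANDIDATES = {
    "constant": fit_constant,
    "linear": fit_linear,
    "nearest_neighbour": fit_nearest,
}

def cv_error(fit, xs, ys, folds=4):
    """Mean squared error on held-out points, using k-fold cross-validation."""
    data = list(zip(xs, ys))
    errors = []
    for k in range(folds):
        train = [p for i, p in enumerate(data) if i % folds != k]
        test = [p for i, p in enumerate(data) if i % folds == k]
        model = fit([x for x, _ in train], [y for _, y in train])
        errors.extend((model(x) - y) ** 2 for x, y in test)
    return mean(errors)

def discover_model(xs, ys):
    """Score every candidate and refit the best one on all of the data."""
    scores = {name: cv_error(fit, xs, ys) for name, fit in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    return best, CANDIDATES[best](xs, ys), scores

if __name__ == "__main__":
    xs = list(range(12))
    ys = [2 * x + 1 for x in xs]  # toy data with a known linear relationship
    name, model, scores = discover_model(xs, ys)
    print("selected:", name)
    print("prediction for x=20:", model(20))
```

In this sketch a subject matter expert would supply only the data; the comparison and selection of models happens without them needing to know how any individual model works. A real system would presumably search a far larger space of models and data preparation steps.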

Topics: automation, data mining

Published at DZone with permission of Adi Gaskell. See the original article here.
