
Solving the Data Scientist Shortfall by Deploying a Self Service BI Program


Those of us who have seen the business impact know that there is a competitive race to make the organization smarter, more capable, and faster at leveraging its data. Get there too slowly and a competitor may capture the benefits first and potentially steal margin. Get there too late and a startup or a business working in a different market will emerge and disrupt your business.

Researching, selecting, and implementing new technologies is often the starting point for CIOs and technologists. Is Hadoop more important, or a NoSQL database? Which commercial build should you use, deployed to which cloud, and managed by which team? How much storage, and of what type? Predictive analytics, real-time data processing, data visualization, or semantic processing?

As hard as getting the right technologies in place may be, the consensus among the CIOs I speak to, the media, and industry analysts is that there is, and will continue to be, a shortfall of data scientists for organizations looking to become data driven. Data scientist: is that a new term, or a new job function or role? There are many definitions of data scientists, such as here, here, and here, but for simplicity's sake, let's just say they are the users of big data technologies. Depending on their backgrounds, they need analytics skills to ask the right questions, data visualization abilities, coding skills to work with analytic engines, statistical backgrounds, machine learning skills, and other capabilities that enable them to pick the right tool for the job, find insights, and present the results.

CIOs and their teams may be part of the internal debate on how to fill data scientist positions and where to align the role in the organization. In my experience, this debate can trickle down to the entire IT organization. After all, as a technologist, don't you want to be the one delivering value from new technologies, and not just designing, architecting, and supporting a solution? Isn't this a golden opportunity for the CIO to lay claim to a critical business function and deliver it with expertise and scale?

There isn't a uniform answer, and a lot depends on the organization's current structure, skills, and leadership capabilities. My opinion on this matter:
  • Organizations that already have a centralized analytical team, quants, data scientists, BI experts, or reporting teams will most likely introduce Big Data as a new technology or practice with these teams. If the leader of this team is not producing results, then it is likely that the organization will look to make changes.
  • Other organizations that have not invested in or succeeded with data technologies or business intelligence (and I think this accounts for the majority of organizations) are more likely to benefit from a decentralized model. More importantly, these organizations probably have to train people in data science skills as much as, if not more than, they try to hire for this skill set.

Want to learn more? Have a look at What technologies work best for decentralized data scientists or the 10 principles of what data scientists need from self-service BI programs.


Opinions expressed by DZone contributors are their own.
