The Data Structures and Algorithms Learning Problem

Some handy book recommendations on where to start with learning about the fundamental issues of data structures.

Here's a snippet of an email:

In big data / data science, the curse of dimensionality keeps showing up over and over. A good place to start is the wiki article “curse of dimensionality.” The issue seems to be that a lot of these big data / data science people have not taken the time to study fundamental data structures.

There was more about Foundations of Multidimensional and Metric Data Structures by Hanan Samet being too detailed, Stack Overflow being too high-level, and more hand-wringing after that, too.

The email was pleading for some book or series of blog posts that would somehow educate data science folks on the more fundamental issues of data structures and algorithms. Perhaps the goal was to get them to drop some dimensions when doing k-NN problems, or to exploit some other data structure that didn't involve hundreds of columns.

I think.

I'm guessing because — like a lot of hand-waving emails — it didn't involve code. And yes, I'm very bigoted about the distinction between code and hand-waving.
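For what it's worth, the effect the email is worried about is easy to show in a few lines. What follows is a minimal sketch, not anything taken from the email or from either book. It assumes uniformly random points in a unit hypercube, and the function name `nearest_to_farthest_ratio`, the point count, and the seed are all made up for illustration.

```python
import math
import random


def nearest_to_farthest_ratio(dimensions, n_points=200, seed=42):
    """Ratio of the nearest to the farthest distance from one random query point.

    A value close to 1.0 means the "nearest" and "farthest" points are almost
    equally far away, so a k-NN ranking carries very little information.
    """
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    query = [rng.random() for _ in range(dimensions)]
    points = [[rng.random() for _ in range(dimensions)] for _ in range(n_points)]
    # Euclidean distance from the query to every point (math.dist needs Python 3.8+).
    distances = [math.dist(query, p) for p in points]
    return min(distances) / max(distances)


if __name__ == "__main__":
    # Hypothetical column counts, chosen only to show the trend.
    for d in (2, 10, 100, 1000):
        ratio = nearest_to_farthest_ratio(d)
        print(f"{d:>5} columns: nearest/farthest = {ratio:.3f}")
```

With a couple of columns the ratio should be small. With hundreds of columns it should drift toward 1.0, which is the practical argument for dropping dimensions before doing a k-NN search.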

If there is a lack of awareness of appropriate data structures, the real place to start is The Algorithm Design Manual by Steven Skiena.

I harbor my doubts that this is the real problem, however. I think that the broad spectrum of computing applications leads to a lot of specialization. I don't think it's prudent to expect generalists who can handle deep data science issues as well as algorithm design and performance issues. No one expects them to write JavaScript and tinker with CSS so that the website which presents the results looks good.

I actually think the real problem is that some folks expect too much from their data scientists.

In fantasy land, the rock stars are full-stack developers who can span the entire spectrum from OS to CSS. In the real world, developers have different strengths and interests. In some cases, "full stack" means mediocre skills in a lot of areas.

Here's a more useful response: Bridging the Gap Between Data Science and DevOps. I don't think the problem is "big data / data science people have not taken the time to study fundamental data structures". I think the problem is that big data is a cooperative venture. It takes a team to solve a problem.

Topics:
software eng, data science, algorithm, dimensional data

Published at DZone with permission of Steven Lott, DZone MVB. See the original article here.
