It’s been three years since Harvard Business Review claimed that data scientist was “the sexiest job of the 21st century.” Since then, many other blogs and media outlets have echoed this sentiment. Right around the time the original article came out, I began my path as a data scientist. Throughout my three years in the field, I have routinely been asked, “So how do you become a data scientist?”
As a former newcomer to data science at a company without an experienced mentor, I understand the difficulty of wading through an excess of information. If I were to start all over again, I would want a better understanding of the following:
- What is the day-to-day life of a data scientist?
- What does the current landscape of the industry look like?
- What is the history of data science as a discipline?
- How can I realistically get started?
- What does a data science analysis really look like?
Below, I’ve collected a handful of blog posts that I find helpful in answering the above questions. Some of them are also good for learning skills that are immediately useful on the job.
This article hits close to home for me. It ranges from the different types of data scientists to what to do if you are on the “wrong” side of those types. (Hint: there is no wrong side.) Chang explains well how the role of a data scientist morphs as a company grows from a start-up into an established company.
Finding the type of company you fit into is as important as figuring out what kind of data scientist you are. I was one of the first data scientists at Workday. My career grew with the team, and my work changed as the company grew. To my own surprise, I started feeling hunger pangs to move to a young company where I could help them start really using the data they were collecting. It took me three years to figure out what kind of data scientist I was and what type of work I enjoyed doing.
Compared to Chang, DataCamp showcases a more traditional segmentation of the data science industry. While some of the job titles explored here might be seen as “old school,” I believe it offers a more realistic view of the industry as it stands today. Those of us within the bubble of Silicon Valley forget that outside of the Bay Area, companies operate very differently. The gap is slowly closing, but budding data scientists may want to look beyond that dreamy job title to find a fulfilling role.
Also, be sure to check out the rest of the DataCamp blog. It’s useful for those looking to start learning R. It has easy-to-follow tutorials on key skills in R as well as comparative articles on other commonly used tools. These are great for beginners who need a place to start as well as for those who are rusty with the R language.
Rafael Irizarry / SimplyStatistics
The saying “Those who do not learn from history are doomed to repeat it” is especially true in the story presented here. In this quick history lesson about data science, Irizarry reminds us that data science, at its core, has existed for decades. For people interested in data science, learning the history of not only data science but also computer science, statistics, and mathematics offers an eye-opening view of what these disciplines are, what they are not, and how they interact.
Eli Bressert at Insight Data Science
Once you have an understanding of what data science is and how it can be used to create insights, it’s time to start getting your hands dirty. Just as a new runner wouldn’t start with the Boston Marathon, your first data science project shouldn’t be too specialized or complex. Before any large data exploration project, you have to start by gaining a basic understanding of the data. In his essay, Bressert clearly shows the steps of this process and explains why they are important. Exploratory data analysis often surfaces important insights and clarifications before you move ahead to statistical analysis and machine learning.
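To make "gaining a basic understanding of the data" a bit more concrete, here is a minimal sketch of a first pass over a dataset. It is not taken from Bressert's essay; the sample records and the `summarize` function are hypothetical names made up for illustration, and it uses only the Python standard library.

```python
import statistics

# Hypothetical sample records (an assumption for illustration).
# None marks a missing value.
rows = [
    {"age": 34, "income": 52000.0, "city": "Austin"},
    {"age": 29, "income": None, "city": "Boston"},
    {"age": 41, "income": 61000.0, "city": "Austin"},
    {"age": None, "income": 48000.0, "city": "Denver"},
]


def summarize(rows):
    """Return per-column counts, missing counts, and basic statistics."""
    summary = {}
    columns = sorted({key for row in rows for key in row})
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        info = {"count": len(present), "missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            # Numeric column: report range and central tendency.
            info.update(
                min=min(present),
                max=max(present),
                mean=statistics.mean(present),
                median=statistics.median(present),
            )
        else:
            # Non-numeric column: report how many distinct values appear.
            info["distinct"] = len(set(present))
        summary[col] = info
    return summary


if __name__ == "__main__":
    for col, info in summarize(rows).items():
        print(col, info)
```

Even a summary this small raises the questions worth asking early: which columns have missing values, whether the ranges look plausible, and how many categories a field really has.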
Finally, here is an example of how even a simple analysis can create a big impact. I like Chen’s work here; he explains his findings at a level anyone can understand. His graphics are clear and concise. They enhance his report rather than overwhelm it, which is a common abuse. For aspiring data scientists, this is a good example of the type of methodological approach you can use to get your feet wet.
My list is in no way exhaustive of the great blogs to read as an aspiring data scientist. I only gathered a handful that I found particularly interesting for those new to the field. There is a plethora of online materials for learning the more advanced topics of the trade, as well as other blogs to keep an eye on for upcoming research. I also recommend reading the results of the various ongoing Kaggle competitions. This will allow you to peek into the thought process of data scientists and see how they get to their results. And of course, keep following the blog here at Treasure Data for tips and how-tos across a variety of subjects.
This article was originally written by Diana Shealy at Treasure Data, and syndicated here with permission.