Practicing Data Science
A collection of use cases.
It was great speaking with Rosaria Silipo, Principal Data Scientist at KNIME, during their fall summit. Rosaria is the editor of Practicing Data Science, a new book highlighting the many different types of data science projects across multiple vertical industries.
There are many different types of data science projects: with or without labeled data; stopping at data wrangling or involving machine learning algorithms; predicting classes or predicting numbers; with unevenly distributed classes, with binary classes, or even with no examples of one of the classes; with structured data and with unstructured data; using past samples or just remaining in the present; with real-time or close to real-time execution requirements and with acceptably slower performance; showing the results in shiny reports or hiding the nitty-gritty behind a neutral IT architecture; and, last but not least, with large budgets or no budget at all.
Rosaria has seen many of these projects and their data science nuances. With so much experience, and the mistakes that come with it, she wanted to share what she and her colleagues have learned. The book is therefore a collection of data science case studies drawn from past projects.
Use cases help to establish best practices for data science projects. Which algorithm to use depends on the problem you are trying to solve and the data available to solve it. If you have no labeled data set, for example, you need an unsupervised model. Different use cases call for different models; no single model works for everything.
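As a rough illustration of that reasoning, here is a minimal Python sketch that maps two properties of a project (whether labels exist, and whether the target is a class or a number) to a broad family of models. The function name `suggest_model_family` and its parameters are hypothetical and do not come from the book or from KNIME; the mapping is a simplification, not a rule.

```python
from typing import Optional


def suggest_model_family(has_labels: bool, target_kind: Optional[str] = None) -> str:
    """Suggest a broad model family for a project (hypothetical helper).

    has_labels  -- True if the data set comes with labeled examples
    target_kind -- "class" or "number"; only consulted when labels exist
    """
    # Without labels there is nothing to supervise the training with.
    if not has_labels:
        return "unsupervised"
    # With labels, the kind of target decides the supervised family.
    if target_kind == "class":
        return "supervised classification"
    if target_kind == "number":
        return "supervised regression"
    raise ValueError("target_kind must be 'class' or 'number' when labels exist")


print(suggest_model_family(False))
print(suggest_model_family(True, "class"))
print(suggest_model_family(True, "number"))
```

A real project would weigh many more factors, such as class balance, data structure, and execution requirements, which is why the case studies matter.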
This book includes project reviews from IoT, the financial industry, customer intelligence, social media, cybersecurity, and more. The use cases cover unbalanced, infrequent, and even non-existent classes. Published case studies often offer no actionable next steps. This ebook includes actionable workflow examples, available on the KNIME EXAMPLES server, which are listed at the beginning of each section.
A complimentary download of the ebook is available to DZone readers using promo code DZONE-2018. The code expires December 31, 2019. KNIME plans to add more use cases over time to extend the lessons and best practices.