Today, DZone released its Data Mining Refcard, authored by Giuseppe Vettigli. We had a chance to sit down with Giuseppe to learn about his background and his experience authoring this Refcard. You can download the Refcard below.
European and American, big companies as Software Engineer for the development of solutions based on Machine Learning and Data Mining techniques.
DZone: What types of developers would find this card the most useful?
Giuseppe Vettigli: This card is for programmers, engineers and scientists who want a reference for their data mining tasks or want to discover the capabilities of Python for data analysis.
DZone: In two sentences, summarize the contents of this Refcard.
Giuseppe Vettigli: This Refcard is a collection of code snippets that can help you in the development of complex Data Mining applications. It also contains insights about how to explain the results of the techniques used in order to support the data analysis process.
DZone: In the introduction to the Refcard, you claim that "Python has become more and more used for the development of data centric applications" in recent years. Can you elaborate on why you believe Python has been embraced for data mining?
Giuseppe Vettigli: In my opinion, Python has become popular for Data Mining applications, and for Scientific Computing in general, because:
- it makes integration with C, C++, and Fortran code easy, giving access to a wide range of scientific computing libraries.
- it has a clean syntax, which makes it easy to learn, even for people without a programming background, and fast to develop with.
- it is free to use, even in commercial products.
- it is a language with a long and rich history.
DZone: Tell us about an interesting project you are working on now or will be working on in the near future.
Giuseppe Vettigli: At the moment I am building an application that classifies and extracts information from sentences taken from dialogues, in order to support a complex dialogue model. In this project I am using various Data Mining techniques to analyse a huge dataset of sentences and extract useful patterns that can improve the classification process and be used in the model.