Introduction to data mining
Academic year 2017/2018
- Course ID
- Teaching staff
- Roberto Esposito
Prof. Rosa Meo
- 2nd year
- Teaching period
- First semester
- D.M. 270 TAF C - Related or integrative
- Course disciplinary sector (SSD)
- INF/01 - informatica
- Formal authority
- Type of examination
- Databases and Algorithms, Programming
Course summary
The objective of the course is to introduce students to the fields of Data Mining and Machine Learning, which combine competencies from statistics and computer science.
The course will teach the difference between tasks and models and will introduce students to some popular Machine Learning tasks and models: binary classification and related tasks, transformation of a binary classification model into a multi-class model, concept learning by means of logical formulas, tree models and their purposes, rule models, subgroup discovery, linear models (least squares, regression), the perceptron, Support Vector Machines, and kernel methods.
The course will also introduce the algorithms used to train these models.
The laboratory part of the course will introduce students to a practical open-source software suite that implements the learning algorithms for the models seen during the course (and much more).
Learning outcomes
Students will master some of the main concepts in Data Mining and Machine Learning and will be able to apply them using a practical open-source software suite for data analysis and machine learning.
The course lessons will be both theoretical and practical.
Learning assessment methods
The final exam is oral: students will be asked to show that they have mastered the theoretical lessons (knowledge of the models and of their purposes) and that they can use the practical software suite (Weka) for data analysis in some use cases.
Machine learning experiments in the laboratory with a software suite for Data Mining.
The laboratory will provide practical support for the theoretical lessons by means of data analysis assignments on public datasets.
Tasks and models; binary classification and related tasks; beyond binary classification (transformation of a binary classification model into a multi-class model); concept learning by means of logical formulas; version space; learning hypotheses by means of Horn clauses; tree models (decision trees, regression trees, feature trees, ranking trees); rule models (rule lists and rule sets); subgroup discovery; linear models (least squares, regression); perceptron; Support Vector Machines; kernel methods.
Suggested readings and bibliography
Peter Flach, Machine Learning - The Art and Science of Algorithms that Make Sense of Data, Cambridge University Press, 2012.
The material for the course is available in Moodle.