Statistical machine learning
Academic year 2016/2017
- Course ID
- Prof. Antonio Canale
- 2nd year
- Teaching period: First semester
- D.M. 270 TAF C - Related or integrative
- Course disciplinary sector (SSD): SECS-S/01 - Statistics
- Class Lecture
- Type of examination
Course summary
The course introduces methods and models for extracting important patterns and trends from large amounts of data, and presents the basic concepts of machine learning and data mining from a statistical perspective. All methods will be introduced from a theoretical point of view and implemented on real datasets in the R language.
Learning outcomes
Knowledge and understanding
- Advanced knowledge of parametric and nonparametric models for prediction and classification.
Applying knowledge and understanding
- Ability to translate a variety of problems and data into statistical models in order to perform several types of prediction and classification.
- Students will be able to discern the different aspects of statistical learning in modern settings.
- Students will use statistical language properly to communicate their findings.
- The skills acquired will give students the opportunity to improve and deepen their knowledge of statistical modeling.
Half of the lectures are devoted to the theoretical aspects of statistical machine learning and the remaining half to their practical implementation in R, considering the related numerical and computational issues. Exercises will be assigned during lectures and lab sessions.
Learning assessment methods
The exam consists of three parts: the first part is a written exam on theory; the second part is a practical session with R; the last part is an oral discussion.
- Context and motivations;
- Trade-off between goodness of fit and model complexity (i.e. bias and variance);
- Model selection techniques (AIC, BIC, cross-validation);
- Training and test sets;
- Variable selection and shrinkage;
- Elements of nonparametric regression;
- Structured nonparametric regression;
- Logistic and multilogit regression;
- Elements of nonparametric classification;
- Ensemble techniques (bagging, boosting, random forests);
- Tools for data visualization;
- Computational tools (parallel computing, recursive estimation).
Suggested readings and bibliography
- AZZALINI AND SCARPA. Data analysis and data mining. Oxford University Press.
- HASTIE, TIBSHIRANI AND FRIEDMAN. The elements of statistical learning: data mining, inference and prediction. Springer-Verlag.
Days, time and classroom:
- Wednesday, 16:00 - 18:00, Aula 12 - Edificio Storico Polo di Management ed Economia
- Thursday, 11:15 - 13:15, Aula 13 - Edificio Storico Polo di Management ed Economia
- Thursday, 16:00 - 18:00, Aula 10 - Edificio Storico Polo di Management ed Economia
Lessons: from 27/09/2016 to 09/12/2016
This course will be delivered at the ESOMAS Department.