Multivariate statistical analysis
Academic year 2017/2018
- Course ID: MAT0041
- Teacher: Pierpaolo De Blasi
- Year: 1st year
- Teaching period: Second semester
- Type: D.M. 270 TAF C - Related or integrative
- Credits/Recognition: 6
- Course disciplinary sector (SSD): SECS-S/01 - Statistics
- Delivery: Formal authority
- Language: English
- Attendance: Mandatory
- Type of examination: Written
- Prerequisites: Probability Theory
- Prerequisite for: Statistical Machine Learning
Course summary
Course objectives
The course introduces multivariate analysis in statistical modeling. All methods will be applied to real datasets using the R language.
Learning outcomes
Students will learn the basic techniques for analyzing and visualizing multi-dimensional data, study multivariate distributions and their properties, and discuss various methods for dimension reduction.
Course delivery
The course consists of 48 hours of lectures. Examples and exercises will be worked through in class using the R language.
Learning assessment methods
Problem Sets:
There will be 2 problem sets assigned throughout the course. They will be posted in due time, each with its deadline, on
https://sites.google.com/a/carloalberto.org/pdeblasi/teaching
Problem sets must be submitted by the deadline; late submissions are not accepted. They are an essential part of the course, giving students a guide to how well they are grasping the material on a "real time" basis. They consist of exercises whose solution may require the use of statistical software. Students are encouraged to work in groups on the problem sets. However, students should understand the material on their own, and hand in their own problem sets.
Exam:
There will be a final exam; check the dates on
http://www.master-sds.unito.it
The final examination consists of a written test, either short or long, depending on the exam date and on whether the problem sets were submitted. Specifically:
(1) First 2 exam dates: the course grade is determined by the problem sets and the final exam. The final exam consists of a short written test (1h) on the part of the program not covered by the problem sets, followed by an oral examination. The final grade is a combination of the problem set grades (66%) and the final exam grade (33%). Students who have not submitted solutions to the problem sets fall under case (2) below.
(2) From the 3rd exam date on: the final exam consists of a long written test (3h) on the whole program, and the final grade is determined solely by it (100%).
Program
- Introduction:
  - summary statistics for multivariate data
  - multivariate data visualization
  - multivariate Gaussian distributions
- Principal Component Analysis (PCA):
  - geometric and algebraic basics of PCA
  - calculation and choice of components
  - plotting PCs, interpretation
- Factor Analysis (FA):
  - model definition and assumptions
  - estimation of loadings and communalities
  - choice of the number of factors
  - factor rotation
- Canonical Correlation Analysis:
  - computation and interpretation
  - relationship with multiple regression
- Discriminant Analysis and Classification:
  - classification rules
  - linear and quadratic discrimination
  - error rates
- Cluster Analysis:
  - measures of similarity
  - hierarchical clustering
  - K-means clustering
  - model-based clustering
Suggested readings and bibliography
The bibliography, to be confirmed at the beginning of the course, is:
- R.A. Johnson and D.W. Wichern (2007). Applied Multivariate Statistical Analysis. Prentice-Hall, 6th Ed.
Suggested readings:
- Hastie T., Tibshirani R., Friedman J. (2009). The Elements of Statistical Learning, 2nd ed., Springer
- Afifi A., May S., Clark V.A. (2012). Practical Multivariate Analysis, 5th ed., Chapman & Hall/CRC
- Everitt B. (2005). An R and S-PLUS Companion to Multivariate Analysis. Springer
- Rencher A.C., Christensen W.F. (2012). Methods of Multivariate Analysis, 3rd ed., Wiley
- Rencher A.C. (1992). Interpretation of canonical discriminant functions, canonical variates and principal components. The American Statistician 46, 217-225.
Class schedule