Multivariate statistical analysis

Academic year 2017/2018

Course ID: MAT0041
Teacher: Pierpaolo De Blasi
Year: 1st year
Teaching period: Second semester
Type: D.M. 270 TAF C - Related or integrative
Credits/Recognition: 6
Course disciplinary sector (SSD): SECS-S/01 - Statistics
Delivery: Formal authority
Language: English
Attendance: Mandatory
Type of examination: Written
Prerequisites: Probability Theory
Preparatory for: Statistical Machine Learning

Course summary

Course objectives

The course introduces multivariate analysis in statistical modeling. All methods will be implemented on real datasets in R.

Learning outcomes

The student will learn the basic techniques for analyzing multi-dimensional data (including visualization), study multivariate distributions and their properties, and discuss various methods for dimension reduction.

Course delivery

The course consists of 48 hours of class lectures. Examples and exercises will be worked through in class using R.

Learning assessment methods

Problem Sets:
There will be 2 problem sets assigned throughout the course. They will be posted in due time on
https://sites.google.com/a/carloalberto.org/pdeblasi/teaching
together with their deadlines.
Problem sets must be submitted by the deadline; late submissions are not accepted. They are an essential part of the course, giving students a "real time" indication of how well they are grasping the material. They consist of exercises whose solution may require statistical software. Students are encouraged to work in groups on the problem sets. However, each student should understand the material independently and hand in their own problem set.

Exam:
There will be a final exam; check the dates on
http://www.master-sds.unito.it
The final examination consists of a written test, either short or long depending on the exam date and on whether the problem sets were submitted. Specifically,

(1) First 2 exam dates: the course grade is determined by the problem sets and the final exam. The final exam consists of a short written test (1h) on the part of the program not covered by the problem sets, followed by an oral examination. The final grade will be a combination of the problem set grades (66%) and the final exam grade (33%). For students who have not submitted solutions to the problem sets, case (2) below applies.

(2) From the 3rd exam date onward: the final exam consists of a long written test (3h) on the whole program, and the final grade will be determined solely by it (100%).

Program

- Introduction
       - summary statistics for multivariate data
       - multivariate data visualization
       - multivariate Gaussian distributions
- Principal Component Analysis (PCA):
       - geometric and algebraic basics of PCA
       - calculation and choice of components
       - plotting PCs, interpretation
- Factor Analysis (FA):
       - model definition and assumptions
       - estimation of loadings and communalities
       - choice of the number of factors
       - factor rotation
- Canonical Correlation Analysis:
       - computation and interpretation
       - relationship with multiple regression
- Discriminant Analysis and Classification:
       - classification rules
       - linear and quadratic discrimination
       - error rates
- Cluster Analysis:
       - measures of similarity
       - hierarchical clustering
       - K-means clustering
       - model-based clustering

Suggested readings and bibliography

The bibliography, to be confirmed at the beginning of the course, is:

- Johnson R.A., Wichern D.W. (2007). Applied Multivariate Statistical Analysis, 6th ed., Prentice Hall

Suggested readings:
- Hastie T., Tibshirani R., Friedman J. (2009). The Elements of Statistical Learning, 2nd ed., Springer
- Afifi A., May S., Clark V.A. (2012). Practical Multivariate Analysis, 5th ed., Chapman & Hall/CRC
- Everitt B. (2005). An R and S-PLUS Companion to Multivariate Analysis, Springer
- Rencher A.C., Christensen W.F. (2012). Methods of Multivariate Analysis, 3rd ed., Wiley
- Rencher A.C. (1992). Interpretation of canonical discriminant functions, canonical variates and principal components. The American Statistician 46, 217-225

Class schedule

Last update: 22/02/2018 10:30