## Topic outline

### General

Please note that all the lectures and exercises of this course are given in Zoom. Neither the lectures nor the exercises are recorded, but the lecture slides and some additional notes are posted under "Lecture materials". Please note also that homework assignments are part of the grading.

PLEASE FIND BELOW THE ZOOM LINKS TO THE LECTURES. JUST SCROLL DOWN THE PAGE.

This course is an introduction to multivariate statistical analysis. The goal is to learn the basics of common multivariate data analysis techniques and to use the methods in practice. The R software is used in the exercises of this course. The topics of the course are multivariate location and scatter, principal component analysis, bivariate correspondence analysis, multiple correspondence analysis, canonical correlation analysis, discriminant analysis, classification, and clustering.

Before the course starts, make sure that you know how to calculate univariate means, medians, variances, and maximum and minimum values. Familiarize yourself with correlation coefficients and common graphical presentations of data (boxplots, scatter plots, histograms, bar plots, pie charts). Make sure that you know what a cumulative distribution function, a probability density function, and a probability mass function are, and what the expected value of a random variable is. Read about univariate and multivariate normal distributions and elliptical distributions. Make sure that you know what is meant by centrally symmetric distributions and skewed distributions.

How to pass this course?

You are expected to:

-Attend the lectures and be active - not compulsory, no points, but highly recommended.

-Submit your project work on time - THIS IS COMPULSORY - max 6 points.

-Take the exam - max 24 points.

-Participate in the weekly exercises (group 1, group 2, group 3 OR group 4) - not compulsory, but highly recommended - max 3 points.

-Be ready to present your homework solutions in the exercise group - not compulsory, but highly recommended - max 3 points.

Max total points = 6 + 24 + 3 + 3 = 36. You need at least 16 points in order to pass the course.

How to get a good grade?

-Attend the lectures and be active!

-Work hard on your project work.

-Be active in the exercises!

-Study for the exam!

Grading is based on the total points as follows: 16p -> 1, 20p -> 2, 24p -> 3, 28p -> 4, 32p -> 5.
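The mapping from total points to grade can be sketched in a few lines of Python. This is only an illustration: it assumes that each listed value is the minimum number of points for that grade (so that, for example, 19 points still gives grade 1), and the function name `course_grade` and the return value 0 for a failed course are hypothetical choices, not part of the official rules.

```python
# Minimum total points required for each grade, taken from the grading rule.
# Assumption: each value is a lower bound for its grade.
GRADE_THRESHOLDS = [(32, 5), (28, 4), (24, 3), (20, 2), (16, 1)]


def course_grade(total_points: float) -> int:
    """Return the grade 1-5, or 0 (a hypothetical marker) for a fail."""
    for minimum, grade in GRADE_THRESHOLDS:
        if total_points >= minimum:
            return grade
    return 0  # fewer than 16 points: the course is not passed


if __name__ == "__main__":
    # Maximum total: project 6 + exam 24 + exercises 3 + homework 3.
    print(course_grade(6 + 24 + 3 + 3))
```

The thresholds are checked from the highest grade downwards, so the first matching minimum determines the grade.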

### Assignments

Zoom link to H01 (Thu 12:15 - 14:00): https://aalto.zoom.us/j/63226493878

Zoom link to H02 (Thu 16:15 - 18:00): https://aalto.zoom.us/j/68661083944

Zoom link to H03 (Fri 10:15 - 12:00): https://aalto.zoom.us/j/66124217667

Zoom link to H04 (Fri 12:15 - 14:00): https://aalto.zoom.us/j/63586072571

Assistants' office hours (Fri 14:15 - 15:15): https://aalto.zoom.us/j/65111800008

Exercises

Participate in the weekly Zoom exercises (group 1, group 2, group 3 OR group 4) - not compulsory, but highly recommended - max 3 points. If you attend 3-5 times, you get 1 point. If you attend 6-8 times, you get 2 points. If you attend at least 9 times (out of 11 times), you get 3 points.
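The attendance tiers above can be written as a small Python function. This is a rough sketch: the name `participation_points` is hypothetical, and the rule that fewer than 3 attendances give 0 points is an assumption, since that case is not stated explicitly.

```python
def participation_points(times_attended: int) -> int:
    """Map the number of attended exercise sessions (0-11) to points."""
    if times_attended >= 9:
        return 3
    if times_attended >= 6:
        return 2
    if times_attended >= 3:
        return 1
    return 0  # assumption: fewer than 3 attendances give no points


if __name__ == "__main__":
    for attended in (2, 4, 7, 11):
        print(attended, participation_points(attended))
```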

In order to earn the exercise points, you have to arrive at the Zoom exercise session on time and write your name on the participation list. You cannot get any exercise points without attending the exercises.

Exercise session 11 is reserved for the project work and for summarizing the contents of the course.

Attending all the exercise sessions, including the last one, is highly recommended.

Note that all the exercise groups are online groups.

Homework

Solve the homework problems and be ready to present your solutions in the Zoom exercise group - not compulsory, but highly recommended - max 3 points. Note that your solution does not have to be perfect or even correct; trying your very best is enough!

If you solve your homework assignments 3-5 times, you get 1 point. If you solve your homework assignments 6-8 times, you get 2 points. If you solve your homework assignments at least 9 times (out of 10 times), you get 3 points.

In order to earn the homework points, you have to arrive at the Zoom exercise session on time and write your name on the homework list. You cannot get any homework points without attending the exercises.

The exercise points are valid until the end of December 2022.

Project Work

Submit your project work on time as one single PDF file - THIS IS COMPULSORY - max 6 points.

Find a multivariate (at least 3-variate) dataset (Statistics Finland (=Tilastokeskus), OECD, data you collect yourself, ...), set a research question, and perform a multivariate analysis. Write a report (max 10 pages), and submit it below before Monday 11.4.2022 at 12.00! Note that the deadline is at noon, not midnight!

Note that the project work has to be conducted individually. Group work is not allowed.

Goals of the project work:

-Description of the research questions

-Description of the dataset

-Univariate and bivariate statistical analysis to present the variables

-Application of your chosen multivariate statistical methods to answer research questions (justification and output)

-Conclusions and answers to the question raised at the beginning

-Critical evaluation of the analysis

Remember that finding nothing is also a finding!

Note that you will automatically get 0 points for the exam if you do not submit your project work on time!

The maximum is 6 points, divided as follows.

Intro (description of the research question and of the data source or data collection) --- max 0.5 p.

Univariate analysis (description of the variables, summary statistics, visualization) --- max 1p.

Bivariate analysis (analysis of bivariate dependencies, visualization) --- max 1 p.

Multivariate analysis --- max 3 p. This is divided into selection of the method (max 1 p.), technical implementation (max 1 p.), and presentation and interpretation of the results (max 1 p.).

Critical evaluations (report about possible sources of biases etc.) --- max 0.5 p.

If the report is not polished (blurry images, text in the margins, etc.), this may lead to a deduction of 1 p.
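As a quick check of how the project points add up, the breakdown can be sketched in Python. This is an illustration only: the dictionary keys and the function name `project_points` are hypothetical, the polish deduction is applied as a fixed 1 point even though the rule only says it may be applied, and flooring the result at 0 is an assumption.

```python
# Maximum points per part of the report, as listed in the breakdown.
RUBRIC_MAX = {
    "intro": 0.5,
    "univariate": 1.0,
    "bivariate": 1.0,
    "method_selection": 1.0,
    "implementation": 1.0,
    "interpretation": 1.0,
    "critical_evaluation": 0.5,
}


def project_points(scores: dict, polished: bool = True) -> float:
    """Sum the part scores, capping each part at its maximum."""
    total = sum(min(scores.get(part, 0.0), cap) for part, cap in RUBRIC_MAX.items())
    if not polished:
        total -= 1.0  # assumption: the possible deduction is always applied
    return max(total, 0.0)  # assumption: the total cannot go below zero


if __name__ == "__main__":
    # A report that earns full marks in every part.
    print(project_points(RUBRIC_MAX))
```

The three multivariate sub-parts are listed separately here so that their sum matches the 3 points reserved for the multivariate analysis.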

Note that you don't have to attach any R code to your project work.