Topic outline

  • Basic information

    • Credits: 5 ECTS
    • Level: M.Sc. and Doctoral Studies
    • Teaching: Periods III-IV (12 x 2h lectures), 9.1.2024 – 10.4.2024
    • Exercise sessions: Periods III-IV (6 x 2h)
    • Language: English

    Course personnel

    • Profs. Sergiy Vorobyov, Esa Ollila and Visa Koivunen
    • Teaching assistants (TA):  Eeli Susan and Xinjue Wang
    • Guest Lecturer: TBD

    You may contact us by email using the format firstname.lastname@aalto.fi.

    Lectures are on Tue at 10-12 and exercise sessions on Wed at 12-14:

    • Tue 9.1: Sergiy, Kide big hall Sklodowska-Curie - 1501
    • Tue 16.1: Eeli, Maarintie 8, AS6 (recap on vector calculus and basic matrices, using numpy)
    • Wed 17.1: Esa, Kide smaller classroom Meitner - 1571 (recap on matrix algebra)
    • Tue 23.1: Sergiy, Kide big hall Sklodowska-Curie - 1501
    • Tue 30.1: Sergiy, Kide big hall Sklodowska-Curie - 1501
    • Wed 31.1: Kide smaller classroom Meitner - 1571
    • Tue 6.2: Sergiy, Maarintie 8, AS6
    • Tue 13.2: Sergiy, Kide big hall Sklodowska-Curie - 1501
    • Wed 14.2: Kide smaller classroom Meitner - 1571

    Exam week

    • Tue 27.2: Esa, Kide big hall Sklodowska-Curie - 1501
    • Tue 5.3: Esa, Kide big hall Sklodowska-Curie - 1501
    • Wed 6.3: Kide smaller classroom Meitner - 1571
    • Tue 12.3: Esa, Kide big hall Sklodowska-Curie - 1501
    • Tue 19.3: Esa, Kide big hall Sklodowska-Curie - 1501
    • Wed 20.3: Kide smaller classroom Meitner - 1571
    • Tue 26.3: Visa, Kide big hall Sklodowska-Curie - 1501
    • Tue 9.4: Sergiy, Kide big hall Sklodowska-Curie - 1501
    • Wed 10.4: Kide smaller classroom Meitner - 1571

    NOTE: The lecture hall varies on some dates:

    • Lectures on Tue at 10:15-12:00 (see the schedule above for the lecturer and location of each lecture)
    • Exercises on Wed at 12:15-14:00 (see the schedule above for the location of each exercise session)

    Objectives: 

    • to give students the tools and training to recognize the problems of processing large scale data that arise in engineering and computer science 
    • to present the basic theory of such problems, concentrating on results that are useful in computation 
    • to give students a thorough understanding of how such problems are thought of, modeled and addressed, and some experience in solving them 
    • to give students the background required to use the methods in their own research work 
    • to give students sufficient background information in linear algebra, statistical and machine learning tools, optimization, and sparse reconstruction for processing large scale data 
    • to give students a number of examples of successful applications of these techniques to signal processing of large scale data.

    Materials and text books

    • Course slides, lecture and exercise session notes, videos and codes

    There are also several useful textbooks available, with PDFs freely available online.

    Assessment and grading

    • Altogether 3-4 homework assignments. Homework assignments are computer programming assignments, which should preferably be done using either Python or MATLAB.

    Homework assignments will be made available on MyCourses and returned via MyCourses (follow the instructions for each homework).

    Grading will be based on the homework assignments.

    Bonus points may be awarded for active participation in lectures and exercise sessions.

    Contents (tentative, subject to change): 

    Sergiy:

    1. Optimization Methods for Large Scale Data Analysis (First-Order Accelerated Methods for Smooth Functions, Extension to Non-Smooth Functions - Sub-gradient Methods)
    2. Optimization Methods for Huge Scale Data Analysis (Stochastic Gradient Methods, Proximal Gradient Method, Mirror Descent, Frank-Wolfe, ADMM, Block-Coordinate Descent)
    3. Applications to Data Analysis with Structured Sparsity and Machine Learning

    Esa: 

    1. Classification and regression tasks, basic principles
    2. Lasso and its generalisations
    3. Decision trees, bagging, random forests
    4. Boosting and its variants

    Visa: TBD


    • Final Grades File
    • Introduction and Overview File

      Introduction

      Motivation

      History

      Encompassing Model

      Basic Data Analysis Problems

      Example: PCA
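      As a quick taster for the PCA example, here is a minimal NumPy sketch (the synthetic data below is illustrative, not from the course materials): principal directions are computed via the SVD of the centered data matrix.

```python
import numpy as np

# PCA via the SVD of the centered data matrix (illustrative synthetic data:
# 200 samples in R^5 with most variance along the first coordinate axis).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X[:, 0] *= 5.0                        # inflate variance along one direction

Xc = X - X.mean(axis=0)               # center the data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt                       # principal directions (orthonormal rows)
scores = Xc @ Vt.T                    # coordinates in the principal basis
explained_var = s**2 / (len(X) - 1)   # variance along each direction
```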

    • Sergiy's Summary Notes (Pony) File

      Short summary notes ("pony") covering the computational (optimization) aspects of large-scale data analysis.

    • Gradient Descent File

      Quadratic minimization problems

      Strongly convex and smooth problems

      Convex and smooth problems

      Nonconvex problems
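      For orientation, a minimal NumPy sketch of gradient descent on a quadratic minimization problem (illustrative data, not from the course materials); the step size 1/L uses the smoothness constant L, here the largest eigenvalue of A.

```python
import numpy as np

# Gradient descent with step size 1/L on the quadratic
# f(x) = 0.5 x^T A x - b^T x, with A symmetric positive definite.
rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)          # well-conditioned SPD matrix
b = rng.standard_normal(5)
x_star = np.linalg.solve(A, b)       # closed-form minimizer for reference

L = np.linalg.eigvalsh(A).max()      # smoothness constant (largest eigenvalue)
x = np.zeros(5)
for _ in range(500):
    x = x - (A @ x - b) / L          # gradient step: grad f(x) = A x - b
```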

    • Subgradient Methods File

      Steepest descent

      Subgradients

      Projected subgradient descent:

      (i) Convex and Lipschitz problems

      (ii) Strongly convex and Lipschitz problems

      Convex-concave saddle point problems
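      A minimal sketch of the projected subgradient method on an illustrative convex and Lipschitz problem (not from the course materials): an l1 objective over the Euclidean unit ball, with a diminishing 1/sqrt(k) step size.

```python
import numpy as np

# Projected subgradient method for the convex, Lipschitz problem
#   minimize ||x - c||_1  subject to  ||x||_2 <= 1,
# with c outside the ball, so the optimum is x* = (1, 0) with f* = 1.
c = np.array([2.0, 0.0])
x = np.zeros(2)
best = np.inf
for k in range(1, 5001):
    g = np.sign(x - c)               # a subgradient of the l1 objective
    x = x - g / np.sqrt(k)           # diminishing step size ~ 1/sqrt(k)
    nrm = np.linalg.norm(x)
    if nrm > 1.0:
        x = x / nrm                  # Euclidean projection onto the unit ball
    best = min(best, np.abs(x - c).sum())   # track the best objective value
```

      Tracking the best iterate is standard here, since subgradient methods are not descent methods.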

    • Mirror Descent File

      Mirror descent

      Bregman divergence

      Alternative forms of mirror descent

      Convergence analysis
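      A minimal sketch of mirror descent with the entropy mirror map (exponentiated gradient) on an illustrative linear objective over the probability simplex (not from the course materials); with this mirror map, the Bregman projection onto the simplex reduces to renormalization.

```python
import numpy as np

# Mirror descent with the entropy mirror map (exponentiated gradient),
# minimizing the linear objective f(x) = <c, x> over the probability simplex.
# The optimum puts all mass on the coordinate with the smallest c_i.
c = np.array([0.3, 0.1, 0.5])
x = np.full(3, 1.0 / 3.0)            # start at the simplex center
eta = 0.5
for _ in range(200):
    x = x * np.exp(-eta * c)         # multiplicative (mirror) update
    x = x / x.sum()                  # Bregman projection = renormalization
```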

    • Proximal Gradient Descent File

      Proximal gradient descent for composite functions

      Proximal mapping / operator

      Convergence analysis
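      A minimal sketch of proximal gradient descent (ISTA) on an illustrative Lasso instance (data below is synthetic, not from the course materials); the proximal mapping of the l1 norm is the soft-thresholding operator.

```python
import numpy as np

# Proximal gradient descent (ISTA) for the Lasso
#   minimize 0.5 ||Ax - b||^2 + lam ||x||_1
# with a sparse ground truth and noiseless measurements.
rng = np.random.default_rng(2)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[:3] = [3.0, -2.0, 1.5]
b = A @ x_true
lam = 0.1

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of grad 0.5||Ax-b||^2
x = np.zeros(10)
for _ in range(500):
    grad = A.T @ (A @ x - b)
    x = soft_threshold(x - grad / L, lam / L)   # gradient step, then prox
```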

    • Accelerated Gradient Methods File

      Heavy-ball methods

      Nesterov’s accelerated gradient methods

      Accelerated proximal gradient methods (FISTA)

      Convergence analysis

      Lower bounds
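      A minimal sketch of Nesterov-style acceleration (the FISTA momentum sequence) on an illustrative strongly convex quadratic (not from the course materials).

```python
import numpy as np

# Nesterov/FISTA momentum on the strongly convex quadratic
# f(x) = 0.5 x^T H x - c^T x, whose minimizer solves H x = c.
rng = np.random.default_rng(3)
M = rng.standard_normal((30, 30))
H = M @ M.T + np.eye(30)             # symmetric positive definite Hessian
c = rng.standard_normal(30)
x_star = np.linalg.solve(H, c)       # exact minimizer for reference
f = lambda x: 0.5 * x @ H @ x - c @ x
L = np.linalg.eigvalsh(H).max()      # smoothness constant

x = np.zeros(30)
y = x.copy()
t = 1.0
for _ in range(5000):
    x_new = y - (H @ y - c) / L                   # gradient step at y
    t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2      # FISTA momentum sequence
    y = x_new + ((t - 1) / t_new) * (x_new - x)   # extrapolation
    x, t = x_new, t_new
```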

    • Dual and Primal-Dual Proximal Gradient Methods File

      Dual proximal gradient method

      Primal-dual proximal gradient method

    • ADMM File

      Augmented Lagrangian method

      Alternating direction method of multipliers
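      A minimal sketch of ADMM on an illustrative Lasso instance (synthetic data, not from the course materials), split as minimize 0.5||Ax - b||^2 + lam||z||_1 subject to x = z.

```python
import numpy as np

# ADMM for the Lasso, split as
#   minimize 0.5 ||Ax - b||^2 + lam ||z||_1   subject to  x = z,
# with a sparse ground truth and noiseless measurements.
rng = np.random.default_rng(4)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[:2] = [2.0, -3.0]
b = A @ x_true
lam, rho = 0.1, 1.0

def soft(v, t):
    """Soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

Q = np.linalg.inv(A.T @ A + rho * np.eye(10))   # factor once for the x-update
x = np.zeros(10)
z = np.zeros(10)
u = np.zeros(10)                     # scaled dual variable
for _ in range(300):
    x = Q @ (A.T @ b + rho * (z - u))    # x-update: ridge-type solve
    z = soft(x + u, lam / rho)           # z-update: shrinkage
    u = u + x - z                        # dual update on the constraint x = z
```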

    • Stochastic Gradient Descent (SGD) File

      Stochastic gradient descent (stochastic approximation)

      Convergence analysis

      Reducing variance via iterate averaging
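      A minimal sketch of SGD with Polyak-Ruppert iterate averaging on an illustrative (noiseless) least-squares problem (synthetic data, not from the course materials).

```python
import numpy as np

# SGD with Polyak-Ruppert iterate averaging on the noiseless least-squares
# problem min_x (1/2n) ||Ax - b||^2; since b = A x_true, the optimum is x_true.
rng = np.random.default_rng(5)
n, d = 1000, 5
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true

x = np.zeros(d)
avg = np.zeros(d)
eta = 0.02                           # small constant step size
for k in range(1, 20001):
    i = rng.integers(n)              # sample one data point uniformly
    g = (A[i] @ x - b[i]) * A[i]     # unbiased stochastic gradient
    x = x - eta * g
    avg += (x - avg) / k             # running (Polyak-Ruppert) average
```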

    • Variance Reduction for SGD File

      Stochastic variance reduced gradient (SVRG): Convergence analysis for strongly convex problems

      Stochastic recursive gradient algorithm (SARAH): Convergence analysis for nonconvex problems

      Other variance reduced stochastic methods:

      (i) Stochastic dual coordinate ascent (SDCA)

      (ii) SAGA
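      A minimal sketch of SVRG on an illustrative least-squares problem (synthetic data and a deliberately conservative step size, not from the course materials): each epoch stores a snapshot and its full gradient, which then serves as a control variate for the stochastic steps.

```python
import numpy as np

# SVRG on the least-squares problem f(x) = (1/2n) sum_i (a_i^T x - b_i)^2,
# with noiseless data so the optimum is x_true.
rng = np.random.default_rng(6)
n, d = 200, 5
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true
full_grad = lambda x: A.T @ (A @ x - b) / n

x_ref = np.zeros(d)
eta = 0.002                              # conservative step size
for epoch in range(30):
    mu = full_grad(x_ref)                # full gradient at the snapshot
    x = x_ref.copy()
    for _ in range(2000):                # inner stochastic loop
        i = rng.integers(n)
        gi = (A[i] @ x - b[i]) * A[i]            # component gradient at x
        gi_ref = (A[i] @ x_ref - b[i]) * A[i]    # same component at snapshot
        x = x - eta * (gi - gi_ref + mu)         # variance-reduced update
    x_ref = x                            # take the new snapshot
```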

    • Adam File

      Introduction to the use of optimization for machine learning: from GD to SGD to Momentum to Adam.
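      A minimal sketch of Adam on an illustrative quadratic (not from the course materials), using the standard bias-corrected first- and second-moment estimates; alpha = 0.001 is the usual default step size.

```python
import numpy as np

# Adam on the simple quadratic f(x) = 0.5 ||x - target||^2.
target = np.array([3.0, -1.0, 0.5])
x = np.zeros(3)
m = np.zeros(3)                       # first-moment (momentum) estimate
v = np.zeros(3)                       # second-moment estimate
alpha, beta1, beta2, eps = 0.001, 0.9, 0.999, 1e-8
for k in range(1, 5001):
    g = x - target                    # gradient of the quadratic
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** k)      # bias corrections
    v_hat = v / (1 - beta2 ** k)
    x = x - alpha * m_hat / (np.sqrt(v_hat) + eps)
```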

    • Homework Assignment 1

      Due February 15, 2024.

    • Homework Assignment 2
    • Lecture on multiple hypothesis testing and sequential inference File

      Introduction to Multiple Hypothesis Testing (MHT), sequential detection, and change-point detection for large scale and streaming data, by Visa Koivunen (DICE/ELEC).