CS-C3240 - Machine Learning D, Lecture, 10.1.2022-8.4.2022
This course space end date is set to 08.04.2022.
Coordinator: Sanna Lun (tin.lun@aalto.fi)
The best place to ask questions about the ML project is through our course Slack channel.
Objectives
A main component of this course is an individual student project. The student project lets you
- practice modelling a real-life application as an ML problem
- apply basic ML methods discussed in the lectures and assignments
- practice the writing of a scientific report ("paper")
- practice the ultimate tool of science: peer-review
The project will be completed incrementally in three stages (40 points in total):
Stage 1: Problem formulation
- Points: 5 points
- Submission opens: 27 Jan, 20:00
- Submission closes: 10 Feb, 20:00
- Peer review closes: 17 Feb, 20:00
- Grading criteria: Click here (last updated 3 Feb)
Stage 2: Problem formulation + One ML method
- Points: 10 points
- Submission opens: 3 Mar, 20:00
- Submission closes: 10 Mar, 20:00
- Peer review closes: 17 Mar, 20:00
- Grading criteria: Click here (last updated 7 Mar)
Stage 3: Full report
- Points: 25 points
- Submission opens: 24 Mar, 20:00
- Submission closes: 4 Apr, 12:00
- Peer review closes: 11 Apr, 20:00
- Grading criteria: Click here (last updated 31 Mar)
The points achieved during each stage consist of two components:
- quality of your report, assessed by peer graders (students or course staff)
- quality of your review (e.g., gradings are well-justified)
NOTE! To get maximum points you must complete all the peer evaluations that will be assigned to you.
Adding your code to the report
You are expected to include your code as an appendix in your stage 2 and 3 submissions. Please choose one of the following two methods.
1. Attach as an appendix in your report in PDF format
- Create a new notebook on https://jupyter.cs.aalto.fi/
- Start coding there, or open and paste your code there
- Click File -> Download as -> PDF via LaTeX (.pdf)
- Export your written report as PDF and join the two PDF files together (written report first, then code)
2. GitHub and Kaggle notebook
Do not choose this option if your identity is obvious based on your username or profile. Make your code available on GitHub or in a Kaggle notebook, and include the link to the repository/notebook in your report. Make sure that the repository/notebook is public, so that all reviewers can access it.
Scope
You can freely choose a topic or domain of interest. Your project topic could be related to your favourite hobby (reading, music, sports). You are also welcome to work on a topic related to your current studies, research or even (BSc, MSc, or PhD) thesis.
However, the ML methods that you use to solve the problem must be chosen from the list of ML methods here. This restriction is meant to support the peer-grading process (peer graders should have a sufficient level of expertise).
Peer-Grading
- You must justify your grading for each grading criterion. Unjustified peer gradings will be penalized.
- Around 100 submissions will be randomly selected and graded by TAs after the peer review.
- You are welcome to contact course staff if you consider the received peer grading to be inappropriate.
Source of data
Here are a few places to start if you are unsure about where to find data. However, you are encouraged to use any data that you can get your hands on through legitimate means. If you are collecting data for research or a thesis, we encourage you to use it for this project.
- World Bank Open Data: https://data.worldbank.org/
- Kaggle: https://www.kaggle.com/datasets
- Helsinki Region Infoshare: https://hri.fi/
- Numbeo: https://www.numbeo.com/
- UCI ML repo: https://archive.ics.uci.edu/ml/datasets.php
- European Statistical Database (Eurostat): https://ec.europa.eu/eurostat/data/database
- scikit-learn toy datasets: https://scikit-learn.org/stable/datasets/toy_dataset.html
- Open database of Helsinki: https://www2.helsinki.fi/en/research-stations/varrio-subarctic-research-station/links/open-data-repositories
- Towards AI suggestions: https://pub.towardsai.net/best-datasets-for-machine-learning-data-science-computer-vision-nlp-ai-c9541058cf4f
Please let us know if you have further suggestions on other great sources of data!
Originality
Other resources
Tips for Good Writing: click me
Tips for Good Reviewing: reviewer guide of ICLR conference; criticizing with kindness; mistakes of reviewers; last-minute reviewing advice