Overview

  • Lecture schedule 2022:

    • 11 jan 1 Introduction & Project groups / Mikko Kurimo
    • 18 jan 2 Statistical language models / Mikko Kurimo
    • 25 jan 3 Word2vec / Tiina Lindh-Knuutila
    • 01 feb 4 Sentence level processing / Mikko Kurimo
    • 08 feb 5 Speech recognition / Janne Pylkkönen
    • 15 feb 6 Morpheme-level processing / Mathias Creutz
    • 22 feb Exam week, no lecture
    • 01 mar 7 Chatbots and dialogue agents / Mikko Kurimo
    • 08 mar 8 Neural language modeling and BERT / Mittul Singh
    • 15 mar 9 Statistical machine translation / Jaakko Väyrynen
    • 22 mar 10 Neural machine translation / Stig-Arne Grönroos
    • 29 mar 11 Societal impacts and course conclusion / Krista Lagus and Mikko Kurimo

    Below you can find the 2021 lecture slides until they are replaced by the 2022 versions as the course progresses. Lecture recordings will also be added here.

    • Zoom link to participate in the lectures (available if you belong to a course group)
    • 11 Jan 2022 1: Introduction & Course content / Mikko Kurimo

    • Lecture slides:

      Introduction to Statistical Natural Language Processing

      Course practicalities in 2022

      Lecture 1 in the course text books:

      • Manning-Schutze: Chapters 1-2 pp. 1-80

    • Lecture 1 recording (2022), MP4 (available if you belong to a course group)
    • Assignment:

      - What kind of Natural Language Processing applications have you used?
      - What is working well? What does not work?
      - What kind of future applications would be useful in your daily life?

      Please type or upload the notes from your breakout group discussion here, e.g. as a photo, text or pdf file to earn a lecture activity point.

    • 18 Jan 2022 2: Statistical language models / Mikko Kurimo


    • Lecture slides:
      • statistical language models and their applications
      • maximum likelihood estimation of n-grams
      • class-based n-grams
      • the main smoothing methods for n-grams
      • introduction to other statistical and neural language models

      Lecture 2 in the course text books:

      • Manning-Schutze: Chapter 6 pp. 191-228
      • Jurafsky-Martin 3rd (online) edition: Chapter 3 pp. 37-62 (and Chapter 7 pp. 131-150 for simple NNLMs)
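
      To make the maximum likelihood estimation and smoothing topics above concrete, here is a minimal sketch (not course material; the toy corpus is invented) of bigram estimation with Laplace (add-one) smoothing as one simple representative of the smoothing methods covered in the lecture:

        from collections import Counter

        # Toy corpus with sentence-boundary markers.
        corpus = ["<s> the cat sat </s>", "<s> the dog sat </s>", "<s> the cat ran </s>"]

        unigrams, bigrams = Counter(), Counter()
        for sent in corpus:
            words = sent.split()
            unigrams.update(words)
            bigrams.update(zip(words, words[1:]))
        vocab_size = len(unigrams)

        def p_mle(w, prev):
            # Maximum likelihood estimate: count(prev, w) / count(prev).
            return bigrams[(prev, w)] / unigrams[prev]

        def p_laplace(w, prev):
            # Add-one smoothing reserves probability mass for unseen bigrams.
            return (bigrams[(prev, w)] + 1) / (unigrams[prev] + vocab_size)

        print(p_mle("cat", "the"), p_laplace("cat", "the"))
        print(p_mle("dog", "cat"), p_laplace("dog", "cat"))  # unseen bigram: MLE gives 0

      Add-one smoothing is only the simplest option; the main smoothing methods discussed in the lecture redistribute the reserved probability mass in more principled ways.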

    • Lecture 2 recording (2022), MP4 (available if you belong to a course group)
    • Assignment:

      List as many potential applications for statistical language models as you can!
      Typically these are tasks where you need the probability of a word or sentence, or where you need to find the most probable word or sentence, given some background information.
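
      Most such tasks can be written as the same decision rule (a sketch, where "context" stands for whatever background information the task provides, e.g. the preceding words, an acoustic signal, or a source-language sentence):

      $\hat{W} = \arg\max_{W} P(W \mid \text{context}) = \arg\max_{W} P(\text{context} \mid W)\,P(W)$

      The second form follows from Bayes' rule (the constant $P(\text{context})$ is dropped) and shows how the language model prior $P(W)$ combines with a task-specific model such as an acoustic or translation model.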

      Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

    • Assignment:
      Watch the video in which Prof. Jurafsky (Stanford) explains Good-Turing smoothing (between 02:00 and 08:45)
      • Click the link, or search for: "Good Turing video Jurafsky"
      • Answer these 3 questions briefly in a single file or text field (the standard Good-Turing formulas are sketched below for reference):
      1. Estimate the probability that the next fish you catch is a new species, if you have already caught 5 perch, 2 pike, 1 trout, 1 zander and 1 salmon.
      2. Estimate the probability that the next fish you catch is a salmon.
      3. What may cause practical problems when applying Good-Turing smoothing to rare words in large text corpora?

      Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.
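
      For reference, the standard Good-Turing quantities used in the video (a sketch, with $N_c$ the number of species or word types observed exactly $c$ times and $N$ the total number of observations):

      $P_{GT}(\text{any unseen type}) = \frac{N_1}{N}, \qquad c^{*} = (c+1)\,\frac{N_{c+1}}{N_c}, \qquad P_{GT}(w \text{ observed } c \text{ times}) = \frac{c^{*}}{N}$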

    • 25 Jan 2022 3: Word2vec / Tiina Lindh-Knuutila


    • Lecture slides:
      • distributional semantics
      • vector space models
      • word2vec
      • information retrieval

      Lecture 3 in the course text books:

      • Jurafsky-Martin 3rd (online) edition: Chapter 6
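
      As a minimal illustration of the vector space idea above: words are represented as vectors and their similarity is measured with the cosine of the angle between them. The three-dimensional vectors below are made up for illustration; real word2vec embeddings are learned from large corpora and typically have hundreds of dimensions.

        import numpy as np

        # Made-up "embeddings"; real word2vec vectors are learned, not hand-written.
        vectors = {
            "cat": np.array([0.9, 0.1, 0.0]),
            "dog": np.array([0.8, 0.2, 0.1]),
            "car": np.array([0.1, 0.9, 0.3]),
        }

        def cosine(u, v):
            # Cosine similarity: close to 1.0 for vectors pointing in the same direction.
            return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

        print(cosine(vectors["cat"], vectors["dog"]))  # high: distributionally similar words
        print(cosine(vectors["cat"], vectors["car"]))  # lower

      The same similarity measure underlies the information retrieval use case, where documents and queries are compared as vectors.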

    • Lecture 3 recording (2022), MP4 (available if you belong to a course group)
    • Assignment:
      1. What are the benefits of distributional semantics?
      2. What kind of problems might there be?
      3. What kind of applications can you come up with using these models?

      Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

    • 1 Feb 2022 4: Sentence level processing / Mikko Kurimo


    • Lecture slides:

      Part-of-Speech and Named Entity tagging

      Hidden Markov models and Viterbi algorithm

      Advanced tagging methods


      Lecture 4 in the course text books:

      • Manning-Schütze (1999), MIT Press: Chapters 9-12
      • Jurafsky-Martin 3rd (online) edition: Chapters 8-9
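
      Since the assignment below asks you to run Viterbi search by hand, here is a hedged sketch of the same dynamic programming in code. The two-tag model and all probabilities are toy numbers invented for illustration, not the values from the lecture example.

        # Viterbi decoding for a toy HMM POS tagger with two tags.
        tags = ["NOUN", "VERB"]
        start_p = {"NOUN": 0.6, "VERB": 0.4}
        trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
                   "VERB": {"NOUN": 0.8, "VERB": 0.2}}
        emit_p = {"NOUN": {"fish": 0.6, "swim": 0.2, "fast": 0.2},
                  "VERB": {"fish": 0.3, "swim": 0.5, "fast": 0.2}}

        def viterbi(words):
            # trellis[i][tag] = (probability of the best path ending in `tag` at word i, previous tag)
            trellis = [{t: (start_p[t] * emit_p[t][words[0]], None) for t in tags}]
            for word in words[1:]:
                column = {}
                for t in tags:
                    prob, prev = max(
                        (trellis[-1][p][0] * trans_p[p][t] * emit_p[t][word], p) for p in tags
                    )
                    column[t] = (prob, prev)
                trellis.append(column)
            # Backtrace from the most probable final tag.
            best = max(tags, key=lambda t: trellis[-1][t][0])
            path = [best]
            for column in reversed(trellis[1:]):
                path.append(column[path[-1]][1])
            return list(reversed(path))

        print(viterbi(["fish", "swim", "fast"]))

      Each trellis cell stores the best score and a backpointer, which is exactly what the boxes in the hand-worked example hold.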

    • Lecture 4 recording (2022), MP4 (available if you belong to a course group)
    • Assignment:

      Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

      Discuss with each other in breakout rooms and propose answers for these 3 questions:

      1. Finish the POS tagging by Viterbi search example by hand.
      - Return the values of the boxes and the final tag sequence. Either take a photo of your drawing, fill in the given ppt, or just type the values into the text box  
      2. Did everyone get the same tags? Is the result correct? Why / why not?
      3. What are the pros and cons of an HMM tagger?

      All submissions, even incorrect or incomplete ones, will be awarded one activity point.

    • 08 Feb 2022 5: Speech recognition / Janne Pylkkönen


    • Lecture slides:

      Hybrid DNN-HMM architecture

      End-to-end architectures

      Applications


      Lecture 5 in the course text books:

      • Jurafsky-Martin 3rd (online) edition: Chapter 26
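
      As a small, hedged illustration of the end-to-end side (a generic sketch of CTC-style greedy decoding, not the specific systems presented in the lecture): the network outputs a distribution over characters plus a blank symbol for every frame, and the decoder keeps the best symbol per frame, collapses repeats and removes blanks. The frame posteriors below are invented numbers.

        import numpy as np

        # Symbol inventory: index 0 is the CTC blank, the rest are characters.
        symbols = ["<blank>", "c", "a", "t"]

        # Made-up per-frame posteriors (rows = frames, columns = symbols).
        frame_posteriors = np.array([
            [0.10, 0.80, 0.05, 0.05],  # "c"
            [0.10, 0.70, 0.10, 0.10],  # "c" again (collapsed as a repeat)
            [0.70, 0.10, 0.10, 0.10],  # blank
            [0.10, 0.10, 0.70, 0.10],  # "a"
            [0.10, 0.10, 0.10, 0.70],  # "t"
            [0.80, 0.05, 0.05, 0.10],  # blank
        ])

        def ctc_greedy_decode(posteriors):
            best = posteriors.argmax(axis=1)       # most likely symbol per frame
            output, prev = [], None
            for idx in best:
                if idx != prev and idx != 0:       # collapse repeats, drop blanks
                    output.append(symbols[idx])
                prev = idx
            return "".join(output)

        print(ctc_greedy_decode(frame_posteriors))  # -> "cat"

      In the hybrid DNN-HMM approach, the per-frame network outputs are instead combined with HMM state transitions and a language model during decoding.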

    • Lecture 5 recording (2022), MP4 (available if you belong to a course group)
    • Assignment:

      Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

      Discuss with each other in breakout rooms and propose answers for these questions:

        Think about an application where ASR would be useful, but where
        it is not yet commonly used. How would ASR change the user
        experience? What are the biggest challenges for ASR in that use
        case?

        All submissions, even incorrect or incomplete ones, will be awarded one activity point.

      • 15 Feb 2022 6: Morpheme-level processing / Mathias Creutz


      • Lecture slides:

        Not all of these slides will be discussed during the lecture, but all of them are still useful reading.

        NOTE: No text book yet covers this material well, so read the slides carefully!

      • Lecture 6 recording (2022), MP4 (available if you belong to a course group)
      • Assignment:

        Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

      • 22 Feb 2022: Exam week, no lecture


      • 01 Mar 2022 7: Chatbots and dialogue agents / Mikko Kurimo


      • Lecture slides:

        Rule-based and Corpus-based chatbots

        Retrieval and Machine Learning based chatbots

        Evaluation of chatbots


        Lecture 7 in the course text books:

        • Jurafsky-Martin 3rd (online) edition: Chapter 24
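
        As a tiny sketch of the rule-based approach (the patterns and responses below are made up and far simpler than the real ELIZA script you will try in the assignment): match the user's input against regular-expression rules and reflect parts of it back, with a fallback when nothing matches.

          import re

          # A handful of invented ELIZA-style rules: (pattern, response template).
          rules = [
              (re.compile(r"\bi need (.+)", re.I), "Why do you need {0}?"),
              (re.compile(r"\bi am (.+)", re.I), "How long have you been {0}?"),
              (re.compile(r"\bmy (\w+)", re.I), "Tell me more about your {0}."),
          ]
          fallback = "Please go on."

          def respond(utterance):
              for pattern, template in rules:
                  match = pattern.search(utterance)
                  if match:
                      return template.format(*match.groups())
              return fallback

          print(respond("I am worried about my exam"))  # first matching rule wins
          print(respond("The weather is nice"))         # no rule matches -> fallback

        Corpus-based and retrieval chatbots replace the hand-written rules with responses selected or generated from data.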

      • Lecture 7 recording (2022), MP4 (available if you belong to a course group)
      • Assignment:

        Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

        Discuss with each other in breakout rooms and propose answers for these 6 questions:

          1. Which chatbots and dialogue agents have you used? What can they do, and what can they not do?
          2. Try ELIZA, e.g. https://www.eclecticenergies.com/ego/eliza or http://psych.fullerton.edu/mbirnbaum/psych101/Eliza.htm When does it fail? How to improve it?
          3. Try PARRY, e.g. https://www.chatbots.org/chatbot/parry/ or https://www.botlibre.com/browse?id=857177 When does it fail? How to improve it?
          4. Try more chatbots or dialogue agents, e.g. a transformer-based one: https://convai.huggingface.co/ or any from: https://www.chatbots.org/
          5. What do you think: How to make better chatbots? How to automatically evaluate chatbots?
          6. What ethical issues do chatbots have? Any suggestions how to solve them?


        • 8 Mar 2022 8: Neural language modeling / Mittul Singh


        • Lecture slides:

          NNLMs are discussed in Chapter 7 of the 2020 online version of the Jurafsky-Martin book.
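
          As a hedged companion to that chapter, here is a minimal sketch of a feedforward neural network language model (in the spirit of the classic Bengio-style NNLM, not the lecture's own implementation; the toy corpus and hyperparameters are invented): embed the previous words, concatenate the embeddings, pass them through a hidden layer, and take a softmax over the vocabulary to predict the next word.

            import torch
            import torch.nn as nn

            corpus = "the cat sat on the mat the dog sat on the rug".split()
            vocab = sorted(set(corpus))
            word2id = {w: i for i, w in enumerate(vocab)}
            context_size = 2  # predict the next word from the two previous words

            # (context, target) training pairs from the toy corpus.
            data = [
                (torch.tensor([word2id[corpus[i - 2]], word2id[corpus[i - 1]]]),
                 torch.tensor(word2id[corpus[i]]))
                for i in range(context_size, len(corpus))
            ]

            class FeedforwardNNLM(nn.Module):
                def __init__(self, vocab_size, embed_dim=16, hidden_dim=32):
                    super().__init__()
                    self.embed = nn.Embedding(vocab_size, embed_dim)
                    self.hidden = nn.Linear(context_size * embed_dim, hidden_dim)
                    self.out = nn.Linear(hidden_dim, vocab_size)

                def forward(self, context_ids):
                    e = self.embed(context_ids).view(1, -1)       # concatenated context embeddings
                    return self.out(torch.tanh(self.hidden(e)))   # logits over the vocabulary

            model = FeedforwardNNLM(len(vocab))
            optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
            loss_fn = nn.CrossEntropyLoss()

            for _ in range(100):
                for context, target in data:
                    optimizer.zero_grad()
                    loss_fn(model(context), target.unsqueeze(0)).backward()
                    optimizer.step()

            # Distribution over the word following "sat on".
            probs = torch.softmax(model(torch.tensor([word2id["sat"], word2id["on"]])), dim=-1)
            print(vocab[int(probs.argmax())])

          Recurrent and Transformer language models replace the fixed-size context window with representations of longer histories.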

        • Lecture 8 recording (2022), MP4 (available if you belong to a course group)
        • Assignment:

          Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

        • Assignment:

          Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

        • 15 Mar 2022 9: Statistical machine translation / Jaakko Väyrynen


        • Lecture slides:

          Lecture based on:

          • Chapter 13.2-13.4 in Manning & Schutze
          • Chapter 21 in the OLD Jurafsky & Martin: Speech and Language Processing
          • Chapter 11 in the NEW Jurafsky & Martin: Speech and Language Processing
          • Koehn: "Statistical Machine Translation", http://www.statmt.org/book/
        • Lecture 9 recording (2022), MP4 (available if you belong to a course group)
        • Assignment:

          Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

          Discuss with each other in breakout rooms and propose answers for these 2 questions:

          Consider different levels of language and different kinds of source-target pairs:

            1. What would be easy/hard to translate with MT?
            2. Have you seen failed/successful usage or applications of MT?


          • 22 Mar 2022 10: Neural machine translation / Stig-Arne Grönroos


          • Lecture slides:

            NMT is discussed in Chapter 11 of the 2020 online version of the Jurafsky-Martin book.

          • Lecture 10 recording (2022), MP4 (available if you belong to a course group)
          • Assignment:

            What other tasks could you use an NMT architecture for?
            Same form, different semantics.

            Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

          • 29 Mar 2022 11: Societal impacts and course conclusion / Krista Lagus and Mikko Kurimo


          • Lecture slides:

            The first part of Krista's presentation

          • Lecture slides:
            • This material is not included in the text books
            • Check the slides and any reading material mentioned there
          • Lecture slides:
            • The contents of the course
            • Info about passing the course and grading
            • Info about the exam
            • Quick recap of previous lectures
          • Lecture 11 recording, part 1 (2022), MP4 (available if you belong to a course group)
          • Lecture 11 recording, part 2 (2022), MP4 (available if you belong to a course group)
          • Assignment:

            Discuss with your group:
            1. Do you think it is possible to detect a speaker's emotions from text? Explain!
            2. What good might come of it if we could create a “WORRY-O-METER”?
            3. What problems do you foresee?

            Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.

          • Assignment:

            1. What principles are important to you that you would like to see more of in the world, in discussions, or in social media?
            - Write down at least one, and describe it to your group.
            2. What would the world be like if your chosen principle were adopted or became stronger in the world, or in some particular context or forum? Describe concretely, if possible.

            Please type or upload your answer here, e.g. as a photo, text or pdf file and earn a lecture activity point.