ELEC-E5531 - Speech and Language Processing Seminar V D, Lecture, 18.1.2022-5.4.2022
According to the course settings, the course ended on 5.4.2022.
PART I: FOUNDATIONS
Week 1 (18.1.) Introduction, Challenges & Inspirations
- Why conversational and speech/audio interaction?
- Showcasing examples, challenges and opportunities
- Course Intro: Goals & Expectations
Week 2 (25.1.) Audio Perception, Phonetics & Speech Production
Introduction: Designing Voice Assistants for use by Children and Vocal Sketching, by Prof. Nitin Sawhney and Koray Tahiroğlu.
Lab Session: Introduction to Pure Data and Arduino (Lecture Recording)
- Sound Perception & Action
- Phonetics
- Speech Production
Readings:
- Warren, R. M. (1982). The Relation of Hearing to Other Senses. In Auditory perception: A new synthesis (pp. 188–195). Pergamon Press.
- Linguistic Structure of Speech wiki
Week 3 (1.2.) Semantics, Dialogue & Conversational Design
- Semantics & Computational Semantics
- Chatbots & Dialogue Systems
- Conversational Design
Presentations on Semantics & Computational Semantics by Kaisla Kajava, Chatbots & Dialogue Systems by Nitin Sawhney, and Conversational Design by Jade Roberts. (Presentation PDF)
Readings:
- Jurafsky, D., & Martin, J. H. (2021). Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. Chapter 6: Vector Semantics & Embeddings (Pages 1-9).
- Jurafsky, D., & Martin, J. H. (2021). Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. Chapter 24: Chatbots & Dialogue Systems (Pages 1-18).
- Conversational Design by Erika Hall, Chapter 2 (+ skim Introduction)
Week 4 (8.2.) Speech & Auditory Processing for Voice Interaction
- Speech Recognition & Synthesis
- Auditory Processing
- Integrating Speech & Auditory Interaction in Wearable Devices
Presentations on Speech Recognition & Synthesis by Tom Bäckström, Audio Processing by Koray Tahiroğlu & Camilo Sanchez, Integrating Speech & Auditory Interaction in Wearable Devices by Nitin Sawhney. (Presentation PDF)
Readings:
- Overview of Speech Recognition and Speech Synthesis (Aalto wiki pages). For in-depth review, you may refer to An Overview of Modern Speech Recognition by Xuedong Huang and Li Deng, Microsoft Corporation, Indurkhya/Handbook of Natural Language Processing, 2009, or An Overview of Speech Synthesis and Recognition, Pierre Nugues, 2010.
- Sawhney, N. and Schmandt, C., 2000. Nomadic Radio: Speech and Audio Interaction for Contextual Messaging in Nomadic Environments. ACM Transactions on Computer-Human interaction (TOCHI), 7(3), pp.353-383.
- Clarkson, B., Sawhney, N. and Pentland, A., 1998. Auditory Context Awareness via Wearable Computing. In Proceedings of the Workshop on Perceptual User Interfaces (PUI '98).
Week 5 (15.2.) Project Concept Presentations
- Project Concept Presentations for feedback & discussion
Lab Session: The future wheel (futures and foresight methods for VAI) (Lecture Recording)
PART II: APPLIED RESEARCH & DESIGN
Week 6 (22.2.) Designing Multimodal Conversational Systems
- Presentation by Speechly
- Dialogue, Conversational Design and Flow
- Creating & Testing Voice & Multimodal Content
Speechly demos: https://demos.speechly.com/
Source code: https://github.com/speechly/speechly
Speechly Dashboard for creating NLU configurations: https://api.speechly.com/dashboard
Web Components documentation: https://docs.speechly.com/client-libraries/usage/
CodePen: https://codepen.io/arzga/pen/VwrxEPb
Week 7 (1.3.) Human-AI Interaction & Participatory Design
- Understanding challenges of Human-AI Interaction
- Participatory & Co-Design Principles with Users
- How Human-AI Interaction Is Uniquely Difficult to Design (Kaisla) (Presentation file)
- Age-appropriate Participatory Design of a Storytelling Voice Input in the Context of Historytelling (Aigul) (Presentation file)
- Potential and Pitfalls of Digital Voice Assistants in Older Adults With and Without Intellectual Disabilities: Relevance of Participatory Design Elements and Ecologically Valid Field Studies (Mohammad) (Presentation file)
Week 8 (8.3.) Personalities & Perceptions in Voice Assistants
- Understanding Perceptions of Voice User Interfaces
- Designing Voice Interaction with older adults
- At Your Service: Designing Voice Assistant Personalities to Improve Automotive User Interfaces (Presentation file)
- Speed Dating with Voice User Interfaces: Understanding How Families Interact and Perceive Voice User Interfaces in a Group Setting (Presentation file)
- Exploring older adults’ perception and use of smart speaker-based voice assistants: A longitudinal study (Presentation file)
Week 9 (15.3.) Sound Architectures of Trust, Privacy & Security
- Trust, Privacy & Security in Society
- Understanding the role of biases in Voice Assistants
- ‘Okay google, what about my privacy?’: User’s privacy perceptions and acceptance of voice based digital assistants (Tim) (Presentation file)
- Adoption of smart voice assistants technology among Airbnb guests: A revised self-efficacy-based value adoption model (SVAM) (Presentation file)
- Voice Recognition Still Has Significant Race and Gender Biases (Tim) (Presentation file)
Week 10 (22.3.) Designing Physical & Multimodal Interaction
- Physical Interaction
- Multimodal & Playful Interaction
- Plausible Auditory Augmentation of Physical Interaction. (Presentation file)
- Multimodal interaction: A review (Presentation file)
Week 11 (29.3.) Computational Creativity & Immersive Environments
- Computational Creativity
- Designing for Immersive Audio Environments
- Sound in AR/VR Applications
- Wearable / Ubiquitous Computing etc.
- Voices and Voids: Subverting Voice Assistant Systems through Performative Experiments (Presentation file)
Week 12 (5.4.) Final Team Presentations
- Project Presentations and Feedback (extended 3-hour class 14:00 – 17:00)
- Pre-diagnosis voice chatbot by Si Zuo (presentation file)
- Talking paintings: learning about art using voice and auditory methods by Aigul Agisheva, Jade Roberts, and Lucy Truong (presentation file)