Topic outline

------------------------------- WEEK 1 ------------------

    Tue 20/04/2021  09:00 – 12:45  

    • Introduction to Deep Learning with Audio
      • Audio domain vs. symbolic domain applications: different models, different methods
    • Deep Learning Models Applied to Musical Projects
      • Examples

    • What tools do we have available? 
      • Pure Data, Python, Conda, Magenta, PyExt
      • Installation of the Required Tools
    • Getting started with AI-Duet
      • Symbolic Domain
      • Setting up Melody RNN
      • Course Exercise on AI-Duet
      • Brief discussion on the exercise outcomes
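Before the AI-Duet exercise it can help to see what a symbolic (MIDI-level) melody representation looks like. The sketch below is an illustrative stand-in for the kind of event vocabulary used by models like Melody RNN (a sequence of MIDI note-on pitches plus special "note off" and "no event" tokens); the token values and pitch range here are assumptions, not Magenta's actual implementation.

```python
# Hypothetical sketch of a symbolic melody encoding in the spirit of
# Melody RNN: one event per time step, where MIDI pitches are note-ons
# and two special tokens mean "note off" and "no event" (sustain).
NOTE_OFF = -1   # end the current note
NO_EVENT = -2   # sustain whatever is sounding (or keep silence)

def encode_melody(events, min_pitch=48, max_pitch=84):
    """Map events to a compact integer vocabulary:
    0 -> no event, 1 -> note off, 2.. -> pitches min_pitch..max_pitch."""
    encoded = []
    for e in events:
        if e == NO_EVENT:
            encoded.append(0)
        elif e == NOTE_OFF:
            encoded.append(1)
        else:
            if not (min_pitch <= e <= max_pitch):
                raise ValueError(f"pitch {e} out of range")
            encoded.append(2 + e - min_pitch)
    return encoded

# C major arpeggio: C4, E4, G4, each held for one extra time step
melody = [60, NO_EVENT, 64, NO_EVENT, 67, NO_EVENT, NOTE_OFF]
print(encode_melody(melody))  # [14, 0, 18, 0, 21, 0, 1]
```

A model trained on such sequences predicts the next event token given the history, which is what lets AI-Duet answer a played phrase with a continuation.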

    Wed 21/04/2021  09:00 – 12:45 

    • DDSP (Differentiable Digital Signal Processing)  
      • Timbre Transfer
      • Setup (macOS and Linux)
      • Features in timbre_transfer.pd
      • Checkpoints
    • Exercises
      • Try a few different combinations of input audio and checkpoint. What kind of observations can you make about how the inputs' characteristics affect the output?
      • Experiment with the f₀ octave shift, f₀ confidence threshold and loudness dB shift parameters. How does the algorithm respond to extreme values of these?
      • Brief discussion on the exercise outcomes
      • Group training - we will train a few checkpoints overnight with students' audio (takes ~5 hours per checkpoint)
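The f₀ octave shift and loudness dB shift parameters in the exercise above correspond to simple scalings, which may help in predicting what "extreme values" will do. A minimal sketch (function names are ours, not timbre_transfer.pd's):

```python
import numpy as np

def shift_f0(f0_hz, octaves):
    """Shift a fundamental-frequency track by a (possibly fractional)
    number of octaves: each octave doubles the frequency."""
    return np.asarray(f0_hz) * (2.0 ** octaves)

def shift_loudness(loudness_db, db_shift):
    """Loudness in dB shifts additively; the equivalent linear
    amplitude scaling is 10^(dB/20)."""
    return np.asarray(loudness_db) + db_shift

f0 = np.array([220.0, 440.0])
print(shift_f0(f0, 1.0))       # [440. 880.]: one octave up
print(shift_loudness(np.array([-30.0]), 6.0))  # [-24.]
print(10 ** (6.0 / 20))        # ~2x amplitude for a +6 dB shift
```

Extreme octave shifts can push f₀ far outside the range the checkpoint was trained on, which is one way to provoke the interesting failure modes the exercise asks about.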

    Thu 22/04/2021  09:00 – 12:45
    • Nsynth and GANSynth
      • GANSynth: adversarial neural audio synthesis
      • Architecture of GANSynth
      • Other audio/music applications of GANs
      • Checkpoints
      • Setup (macOS and Linux)
      • Training GANSynth

    • Exercises
      • Try generating some random latent vectors and synthesizing sounds from them using gansynth.pd and the all_instruments checkpoint. What kind of timbres does the neural network generate? How does the acoustic_only checkpoint compare?
      • Try manually drawing in the latent vector (z) array and then synthesizing. GANSynth expects z to be normalized such that its magnitude is 1, but drawing in arbitrary values breaks this. What happens to the generated sounds?
      • Try interpolating between different latent vectors using gansynth_multi.pd. How does the resulting synthesized sound compare to the sounds from the original latent vectors? By default, the synthesise message in this patch is set up to generate four different pitches, but it may be easier to compare sounds by using the same pitch for each.
      • Brief discussion on the exercise outcomes
      • Group training
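The normalization and interpolation behaviour probed in the exercises above can be sketched in a few lines. This is an illustration, not the code inside gansynth.pd; the latent size below is an assumption (check your checkpoint's configuration), and the interpolation shown is plain linear interpolation followed by renormalization, which is one simple way to stay on the unit sphere.

```python
import numpy as np

Z_DIM = 256  # assumed latent size; depends on the checkpoint

def random_z(rng):
    """Sample a latent vector and normalize it to unit magnitude,
    matching the normalization GANSynth expects."""
    z = rng.standard_normal(Z_DIM)
    return z / np.linalg.norm(z)

def interpolate_z(z0, z1, t):
    """Linearly interpolate between two latents, then renormalize so
    the result keeps magnitude 1 (a simple stand-in for slerp)."""
    z = (1 - t) * z0 + t * z1
    return z / np.linalg.norm(z)

rng = np.random.default_rng(0)
z0, z1 = random_z(rng), random_z(rng)
zm = interpolate_z(z0, z1, 0.5)
print(np.linalg.norm(zm))  # 1.0: still unit magnitude
```

Drawing arbitrary values into the z array (as in the second exercise) skips this normalization step, which is exactly why the generated sounds change character.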
    • NSynth: neural audio synthesis
      • WaveNet
      • Open NSynth Super
      • nsynth.pd

    • Exercise
      • Load some sounds into nsynth.pd and explore how they change by moving the position on the X/Y pad. If you don't have a MIDI input, you can manually send note_on <pitch> messages to the second inlet of the subpatch containing the X/Y pad. Investigate the structure of the patch. What kind of alternative ways of interacting with the sounds can you come up with?
      • Brief discussion on the exercise outcomes
      • Group Training: Using any kind of instrument you prefer, record 4-second samples of each of the following notes: C2, E2, G#2, C3, E3, G#3, C4 (MIDI notes 24, 28, 32, 36, 40, 44, 48).
        • Convert the samples to 16000 Hz sample rate, 16-bit signed integer. Make sure they're exactly 4 seconds long (64000 samples). Note that the low sample rate means your sounds will lose all frequencies above 8000 Hz, so don't waste time on making super detailed highs!
        • Name your samples with the instrument name and note number separated by an underscore, e.g. sandstormlead_24.wav.
        • We will collect the samples in groups of four and run the audio generation scripts on Aalto Science-IT's Triton cluster. This will take a few days, after which we will load the samples onto Open NSynth Super devices and explore the generated sounds.
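Samples that don't exactly match the spec will fail the generation step, so it's worth checking them before submitting. A small self-check using only Python's standard library (it writes a silent test file in the required format, assuming mono audio, then verifies it; use the same checks on your own recordings):

```python
import wave
import struct

def check_sample(path):
    """Verify a WAV file matches the spec: 16000 Hz, 16-bit signed
    integer, exactly 64000 frames (4 seconds). Mono is assumed here."""
    with wave.open(path, "rb") as w:
        assert w.getframerate() == 16000, "sample rate must be 16000 Hz"
        assert w.getsampwidth() == 2, "samples must be 16-bit (2 bytes)"
        assert w.getnframes() == 64000, "must be exactly 64000 frames"

# Write 4 seconds of silence in the required format, then check it.
with wave.open("sandstormlead_24.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)       # 2 bytes = 16-bit
    w.setframerate(16000)
    w.writeframes(struct.pack("<h", 0) * 64000)

check_sample("sandstormlead_24.wav")
print("ok")
```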

    Fri 23/04/2021  09:00 – 12:45
    • GANSpaceSynth
      • Conditional GANs
      • GANSpaceSynth Architecture
      • Setup (macOS and Linux)
      • Checkpoints
      • ganspacesynth.pd

    • Hallucinations
      • Conditional GANs
      • ganspacesynth_halluseq.pd
      • Exercise: We will compare the audio features extracted via PCA along 3 dimensions, describing their semantic meanings and relative differences. Two different checkpoints will be used in this exercise.
      • Brief discussion on the exercise outcomes
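The PCA step behind the exercise above can be illustrated with a minimal numpy sketch: given a matrix of intermediate GAN activations, the top principal directions are the axes of greatest variance, in the spirit of the GANSpace approach that GANSpaceSynth builds on. The shapes and the toy data here are assumptions for illustration only.

```python
import numpy as np

def top_pca_directions(activations, n_components=3):
    """Find the top principal directions of a (n_samples, n_features)
    matrix of GAN activations via SVD on the mean-centered data."""
    centered = activations - activations.mean(axis=0)
    # Rows of vt are the principal directions, ordered by variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components], s[:n_components]

rng = np.random.default_rng(1)
# Toy activations with most variance along the first feature axis
acts = rng.standard_normal((500, 8)) * np.array([5, 1, 1, 1, 1, 1, 1, 1])
dirs, variances = top_pca_directions(acts)
print(dirs.shape)                # (3, 8): three directions
print(abs(dirs[0]).argmax())     # 0: top component follows feature 0
```

Moving a latent vector along each of these directions and listening to the result is how one assigns the "semantic meanings" the exercise asks for.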
    • Project ideas pitch (1 min / student)

    ------------------------------- WEEK 2 ------------------

    Tue 27/04/2021  09:00 – 12:00

    • SampleRNN
      • Generating sequences of audio similar to the training data
      • Albums generated using SampleRNN (DADABOTS)
      • Setup (macOS and Linux)
      • Checkpoints
    • Exercise
      • Try generating some sounds with different values for the sampling temperature parameter. How does it affect the results?
      • Brief discussion on the exercise outcomes

    Wed 28/04/2021  09:00 – 12:00
    • Project work and Tutoring

    Thu 29/04/2021  09:00 – 12:00
    • Project work and Tutoring

    Fri 30/04/2021  09:00 – 12:00

    • Project work and Tutoring

    ------------------------------- WEEK 3 ------------------

    Tue 04/05/2021  09:00 – 12:00
    • Project work and Tutoring

    Wed 05/05/2021  09:00 – 12:00
    • Project work and Tutoring

    Thu 06/05/2021  09:00 – 12:00
    • Project work and Tutoring


    Fri 07/05/2021  09:00 – 12:00
    • Project work and Tutoring