Topic outline

  • Assignment 1: crafting adversarial examples

    Deadline April 15th, 23:59 (Helsinki time). Submit through JupyterHub.

    Introduction

    Evasion attacks, also known as adversarial examples, are a common threat to the integrity of machine learning models.

    In this mini project, you're going to choose a method of your choosing, implement and evaluate it, and provide some analysis.

    The method that you choose must be a black-box method.

    Within this task, the adversary model is as follows:

    • black-box access to an MNIST model trained by us
    • the adversary doesn't know the exact architecture of the model, but can assume it's a simple CNN
      • the adversary doesn't have access to the victim's training data
      • but you can use the MNIST test set (already loaded) for crafting and testing your adversarial examples
    • the adversary doesn't have access to the gradients (since it's a black-box setting)
      • you interact with the victim model only through the predict function in the PerturbationCrafter
      • you can implement techniques that rely on local surrogate models or pseudo gradients; see the sketch after this list
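
    A minimal sketch of the kind of pseudo-gradient technique the last bullet refers to, using only prediction queries (NES-style finite differences). The signature of predict is an assumption here; check the actual PerturbationCrafter in the notebook:

        import torch

        # Assumption: crafter.predict(x) returns a tensor of class probabilities
        # for a batch of images x of shape (N, 1, 28, 28); adjust to the
        # notebook's actual API.
        def estimate_gradient(crafter, x, label, sigma=0.01, n_samples=50):
            """Estimate d(loss)/dx with random-direction finite differences."""
            grad = torch.zeros_like(x)
            for _ in range(n_samples):
                u = torch.randn_like(x)                  # random probe direction
                p_plus = crafter.predict(x + sigma * u)  # two queries per probe
                p_minus = crafter.predict(x - sigma * u)
                # loss = negative log-probability of the true class
                loss_plus = -torch.log(p_plus[0, label] + 1e-12)
                loss_minus = -torch.log(p_minus[0, label] + 1e-12)
                grad += (loss_plus - loss_minus) / (2 * sigma) * u
            return grad / n_samples

    Note that each call costs 2 * n_samples queries, which counts against any query budget you analyse later.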

    Try to limit yourself to PyTorch, NumPy, and Matplotlib. We cannot guarantee the availability of other (or the latest) packages on Aalto's JupyterHub.

    Some references to get you started:

    Explaining and Harnessing Adversarial Examples

    Boosting Adversarial Attacks with Momentum

    Towards Deep Models Resistant to Adversarial Attacks
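
    The first reference introduces FGSM, \( x' = x + \epsilon \cdot \mathrm{sign}(\nabla_x J(\theta, x, y)) \), which is a white-box method; in this assignment's black-box setting you could, for instance, run it on a locally trained surrogate model and transfer the result to the victim. A minimal sketch, where surrogate is a placeholder for any CNN you train yourself:

        import torch
        import torch.nn.functional as F

        # FGSM on a local surrogate model (transfer attack sketch).
        # `surrogate` stands for a CNN you train on the MNIST test split; only
        # the finished adversarial example is sent to the victim's predict.
        def fgsm_on_surrogate(surrogate, x, y, eps=0.1):
            x = x.clone().detach().requires_grad_(True)
            loss = F.cross_entropy(surrogate(x), y)  # J(theta, x, y)
            loss.backward()
            x_adv = x + eps * x.grad.sign()          # x' = x + eps * sign(grad)
            return x_adv.clamp(0.0, 1.0).detach()    # keep pixels in [0, 1]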

    Tasks

    You'll complete the assignment using Aalto's JupyterHub.

    The server for this course is CS-E4001 - Research Seminar in Computer Science D: Research Seminar on Security and Privacy of Machine Learning.

    To access your notebook, start the server, go to the Assignments tab, and fetch the notebook.

    Once you're done, submit the notebook.

    Presentation of the Method

    This is the first part of your written report.

    Choose an adversarial attack that you'd like to evaluate. Part of this task is the literature review: finding an attack that you find interesting. Once you've selected the paper, introduce the main ideas behind the attack: motivation/intuition and the relevant equations.

    Implementation

    This is the coding part.

    Implement the attack and test it using several hundred samples to make sure that it works.

    Write your code such that the implementation is contained within the PerturbationCrafter class; you can add other methods to this class as needed.

    DISCLAIMER: choose/optimize your method such that generating 100 adversarial samples doesn't take longer than 5 minutes (timed on JupyterHub); see the timing sketch below.
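
    A quick way to check the budget; crafter, craft, and the (xs, ys) test samples below are placeholders for whatever your notebook ends up defining:

        import time

        # Time the crafting of 100 adversarial examples against the 5-minute
        # budget. Assumption: `craft` is a method you added to the crafter.
        start = time.time()
        advs = [crafter.craft(x, y) for x, y in zip(xs, ys)]  # 100 samples
        elapsed = time.time() - start
        print(f"Crafted {len(advs)} samples in {elapsed:.1f}s"
              + (" -- OK" if elapsed < 300 else " -- too slow, optimize!"))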

    Analysis and Takeaways

    This is the second part of your written report + code for your analysis.

    We want you to perform a detailed analysis of the method, tweaking various knobs. This part is quite open-ended, but here are some things you can consider depending on the method you choose:

    • \( \epsilon \) value
    • number of iterations/steps for iterative methods or query budget
    • misclassification confusion matrix (see the sketch after this list)
      • relative difficulty of getting misclassified as a particular class given starting class (for targeted methods)
      • most common misclassification class given starting class (for untargeted methods)
    • choice of the pseudo gradient
    • choice of surrogate model
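
    A minimal sketch for the confusion-matrix bullet, assuming true_labels holds the original classes and adv_preds the victim's predictions on your adversarial examples (both numpy arrays):

        import numpy as np
        import matplotlib.pyplot as plt

        # Count how often original class t ends up predicted as class p.
        conf = np.zeros((10, 10), dtype=int)
        for t, p in zip(true_labels, adv_preds):
            conf[t, p] += 1

        plt.imshow(conf, cmap="viridis")
        plt.xlabel("predicted class after attack")
        plt.ylabel("original class")
        plt.colorbar(label="count")
        plt.show()

    For untargeted methods, the largest off-diagonal entry in each row is the most common misclassification for that starting class.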

    The goal of this part is for you to show that you are capable of thorough analysis of attack methods.

    Grading

    Recall that this project constitutes 15% of the grade.

    Furthermore, the project is graded as follows:

    • Presentation of the Method max 3 points
      • intuitive explanation and key ideas max 1
      • technical explanation, equations (maybe figures) max 2
    • Implementation max 3 points
      • implementation max 2
      • testing max 1
    • Analysis and Takeaways max 4 points
      • analysis of the overall effectiveness of the method max 1
      • analysis specific to your method max 2
      • conclusion, takeaways, observations max 1



    Assignment 2: watermarking a model

    Deadline May 20th, 23:59 (Helsinki time). Submit through JupyterHub.

    Introduction

    Watermarking your model is a common way to prove that you're its rightful owner. The technique draws heavily from media watermarking and steganography.

    In this mini project, you're going to choose a method of your choosing, implement and evaluate it, and provide some analysis.

    The method that you choose can be either a white-box or a black-box method.

    Within this task, the adversary model is as follows:

    • you're a model vendor that sells models
    • the adversary is a client who wants to remove the watermark from a model they bought from you
      • the adversary doesn't have access to the training data
      • but they can use other datasets to remove the watermark
      • also, in your analysis, you can use the MNIST test set (already loaded) to evaluate the impact of watermark removal techniques on the primary classification task; the adversary, however, cannot use this set to, e.g., fine-tune their model
    • the adversary has white-box access to the model, including:
      • the architecture
      • the weights
      • the gradients

    Try to limit yourself to PyTorch, NumPy, and Matplotlib. We cannot guarantee the availability of other (or the latest) packages on Aalto's JupyterHub.

    Some references to get you started:

    Embedding Watermarks Into Deep Neural Networks

    Protecting Intellectual Property of Deep Neural Networks with Watermarking

    Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring
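
    The third reference embeds the watermark as a backdoor: a small "trigger set" of out-of-distribution images with fixed (essentially random) labels is mixed into training, and ownership is later demonstrated by the model's accuracy on that set. A minimal sketch of the embedding idea, with all names below placeholders for your own notebook code:

        import torch
        import torch.nn.functional as F

        # Train on the primary task and the trigger set jointly (backdoor-style
        # watermark). `trigger_x`/`trigger_y` are secret trigger images/labels.
        def train_with_watermark(model, opt, train_loader, trigger_x, trigger_y,
                                 epochs=5):
            for _ in range(epochs):
                for x, y in train_loader:
                    opt.zero_grad()
                    loss = F.cross_entropy(model(x), y)  # primary task
                    loss = loss + F.cross_entropy(model(trigger_x), trigger_y)
                    loss.backward()
                    opt.step()
            return model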

    Tasks

    You'll complete the assignment using Aalto's JupyterHub.

    The server for this course is CS-E4001 - Research Seminar in Computer Science D: Research Seminar on Security and Privacy of Machine Learning.

    To access your notebook, start the server, go to the Assignments tab, and fetch the notebook.

    Once you're done, submit the notebook.

    Presentation of the Method

    This is the first part of your written report.

    Choose a watermarking scheme that you'd like to evaluate. Part of this task is the literature review: finding a scheme that you find interesting. Once you've selected the paper, introduce the main ideas behind the scheme: motivation/intuition and the relevant equations.

    Implementation

    This is the coding part.

    Implement the scheme and make sure that it works for some default settings (i.e., reasonably high test and watermark accuracy / a low p-value); see the verification sketch below.

    Use the code in mnist_stuff.py as the base for your implementation (copy what you need into your notebook, but don't modify anything in that file). It provides basic training and testing code.

    In general, you don't need to implement and analyse complicated cryptographic verification schemes (such as in Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring). Nevertheless, if you aren't sure, message us on Teams and we'll let you know.
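
    For the "low p-value" criterion, a minimal verification sketch using only the standard library (given the package constraints above): the p-value is the probability that a non-watermarked model matches at least n_correct of n_total trigger labels by chance (1/10 for ten MNIST classes):

        import math

        # Exact binomial tail: P(X >= n_correct) for X ~ Binomial(n_total, chance).
        def watermark_p_value(n_correct, n_total, chance=0.1):
            return sum(
                math.comb(n_total, k) * chance**k * (1 - chance) ** (n_total - k)
                for k in range(n_correct, n_total + 1)
            )

        print(watermark_p_value(95, 100))  # e.g., 95/100 triggers verified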

    Analysis and Takeaways

    This is the second part of your written report + code for your analysis.

    We want you to perform a detailed analysis of the method, tweaking various knobs. This part is quite open-ended, but here are some things you can consider depending on the method you choose:

    • trigger size / watermark size
    • difficulty of embedding a watermark (e.g., number of epochs) and its impact on the loss landscape
    • watermark removal and robustness of the verification (accuracy or p-value), e.g. (see the pruning sketch after this list):
      • pruning
      • fine-tuning
      • injecting noise
      • transfer learning
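
    A minimal sketch of the pruning attack mentioned above: zero out a fraction of the smallest-magnitude weights, then re-measure both test accuracy and watermark accuracy (or the p-value) on the pruned copy:

        import torch

        # Magnitude pruning: zero the smallest `fraction` of each weight tensor.
        # Run this on a copy of the watermarked model, then re-run verification.
        def prune_smallest(model, fraction=0.3):
            with torch.no_grad():
                for p in model.parameters():
                    if p.dim() > 1:  # prune weights, leave biases alone
                        k = int(p.numel() * fraction)
                        if k == 0:
                            continue
                        threshold = p.abs().flatten().kthvalue(k).values
                        p.mul_((p.abs() > threshold).float())
            return model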

    The goal of this part is for you to show that you are capable of thorough analysis of the robustness of a watermarking scheme.

    Even though MNIST is quick to train, you will likely have to train multiple models, so plan your experiments accordingly.

    Grading

    Recall that this project constitutes 15% of the grade.

    Furthermore, the project is graded as follows:

    • Presentation of the Method max 3 points
      • intuitive explanation and key ideas max 1
      • technical explanation, equations (maybe figures) max 2
    • Implementation max 3 points
      • implementation max 2
      • testing max 1
    • Analysis and Takeaways max 4 points
      • analysis of the overall effectiveness of the method max 1
      • analysis specific to your method max 2
      • conclusion, takeaways, observations max 1