CS-E4000 - Seminar in Computer Science D, Lecture, 4.9.2023-1.12.2023
Available topics
SS1: Privacy and Dark Patterns
Tutor: Sanna Suoranta (sanna.suoranta@aalto.fi)
Users' privacy should have improved after the EU's GDPR regulation. However, users can be lured into granting more rights to their data than they initially intended. How does this happen, and how could we prevent it?
References:
- Christoph Bösch, Benjamin Erb, Frank Kargl, Henning Kopp, and Stefan Pfattheicher (2016) Tales from the Dark Side: Privacy Dark Strategies and Privacy Dark Patterns. Proceedings on Privacy Enhancing Technologies, pp. 237-254. DOI 10.1515/popets-2016-003
- Waldman, A. E. (2020) Cognitive biases, dark patterns, and the 'privacy paradox'. Current opinion in psychology. vol 31, feb 2020, pp. 105-109, https://doi.org/10.1016/j.copsyc.2019.08.025
SS2: What do users really think when using something that requires security?
Tutor: Sanna Suoranta (sanna.suoranta@aalto.fi)
Nowadays, brain imaging technology can tell us what users are really seeing, thinking, and feeling while they use a user interface, based on which brain areas are activated. What happens when a user sees a phishing message? How could we help users act in secure ways by providing better user interfaces?
References:
- Zhepeng Rui and Zhenyu Gu (2021) A Review of EEG and fMRI Measuring Aesthetic Processing in Visual User Experience Research. Computational Intelligence and Neuroscience. Vol 2021. https://doi.org/10.1155/2021/2070209
- Ajaya Neupane, Nitesh Saxena, Jose Omar Maximo, and Rajesh Kana (2016) Neural Markers of Cybersecurity: An fMRI Study of Phishing and Malware Warnings. IEEE Transactions on Information Forensics and Security, vol. 11, no. 9, September 2016. DOI 10.1109/TIFS.2016.2566265
- Ajaya Neupane, Nitesh Saxena, Keya Kuruvilla, Michael Georgescu, and Rajesh Kana (2014), NDSS'14, https://nsaxena.engr.tamu.edu/wp-content/uploads/sites/238/2019/12/nskgk-ndss14.pdf
- Vance, A., Jenkins, J. L., Anderson, B. B., Bjornn, D. K., & Kirwan, C. B. (2018). Tuning out security warnings: A longitudinal examination of habituation through fMRI, eye tracking, and field experiments. Management Information Systems Quarterly, 43(2), 1-26. https://doi.org/10.25300/MISQ/2018/14124
SS3: Cookie consent forms
Tutor: Sanna Suoranta (sanna.suoranta@aalto.fi)
All webpages ask their users to accept cookies. Does this really protect users who do not want to share their actions with web services?
References:
- Dino Bollinger, Karel Kubicek, Carlos Cotrini, and David Basin (2022) Automating Cookie Consent and GDPR Violation Detection. Proceedings of the 31st USENIX Security Symposium, August 10–12, 2022, Boston, MA, USA. https://www.usenix.org/system/files/sec22-bollinger.pdf
- Cristiana Santos, Nataliia Bielova, Célestin Matte. Are cookie banners indeed compliant with the law? cs.CR arXiv:1912.07144 https://doi.org/10.48550/arXiv.1912.07144
- Chiara Krisam, Heike Dietmann, Melanie Volkamer, Oksana Kulyk. Dark Patterns in the Wild: Review of Cookie Disclaimer Designs on Top 500 German Websites. EuroUSEC '21: Proceedings of the 2021 European Symposium on Usable Security, October 2021, pp. 1–8. https://doi.org/10.1145/3481357.3481516
XW: Model-Based Learning to Optimize (L2O): Applications Survey
Tutor: Xinjue Wang (xinjue.wang@aalto.fi)
Learning to Optimize (L2O) stands at the intersection of machine learning and optimization. While traditional optimization techniques are rooted in theory, L2O uses machine learning to craft optimization methods, aiming to reduce the tedious manual iteration needed to engineer them by hand. By training on specific problem sets, L2O creates optimization methods tailored to solve similar problems effectively. Its success hinges on factors such as the type of optimization problem, the architecture of the learned method, and the training process. A variety of modern challenges have been tackled using L2O, such as:
- Image restoration and reconstruction: L2O techniques have transformed tasks like image denoising, deblurring, super-resolution, and inpainting.
- Medical imaging: This field demands precision, especially in methods like MRI and CT imaging. L2O offers accurate image reconstructions, even for complex-valued inputs.
- Wireless communication: Here, L2O helps in areas such as resource management, signal detection, and LDPC coding. Notably, in MIMO detection, L2O methods have shown resilience to low signal-to-noise ratios and other challenges.
This student project aims to conduct a comprehensive survey of the applications of model-based L2O techniques. Beyond the survey, there is an opportunity for further in-depth research by introducing a novel L2O framework tailored to image datasets with distinct patterns. Prerequisite: basic understanding of classical optimization methods, deep learning, and PyTorch programming.
References:
- Chen, T., Chen, X., Chen, W., Wang, Z., Heaton, H., Liu, J., & Yin, W. (2022). Learning to optimize: A primer and a benchmark. The Journal of Machine Learning Research, 23(1), 8562-8620.
- Monga, V., Li, Y., & Eldar, Y. C. (2021). Algorithm unrolling: Interpretable, efficient deep learning for signal and image processing. IEEE Signal Processing Magazine, 38(2), 18-44.
- Zhang, J., & Ghanem, B. (2018). ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1828-1837).
- Xiang, J., Dong, Y., & Yang, Y. (2021). FISTA-Net: Learning a fast iterative shrinkage thresholding network for inverse problems in imaging. IEEE Transactions on Medical Imaging, 40(5), 1329-1339.
- Ma, J., Liu, X. Y., Shou, Z., & Yuan, X. (2019). Deep tensor ADMM-Net for snapshot compressive imaging. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 10223-10232).
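To make the unrolling idea behind model-based L2O concrete, here is a minimal, illustrative PyTorch sketch of a LISTA-style unrolled network for sparse recovery, in the spirit of the ISTA-Net and FISTA-Net references above. The layer count, dimensions, and synthetic training data are arbitrary assumptions for illustration only, not part of the assignment.
```python
import torch
import torch.nn as nn


class LearnedISTA(nn.Module):
    """Unrolled ISTA: each 'layer' is one learned iteration of a sparse solver."""

    def __init__(self, m: int, n: int, n_layers: int = 5):
        super().__init__()
        self.n = n
        self.n_layers = n_layers
        # Learned counterparts of the fixed matrices used by classical ISTA.
        self.W = nn.Linear(m, n, bias=False)   # maps measurement y into signal space
        self.S = nn.Linear(n, n, bias=False)   # propagates the current estimate
        self.theta = nn.Parameter(torch.full((n_layers,), 0.1))  # per-layer thresholds

    @staticmethod
    def soft_threshold(x, theta):
        return torch.sign(x) * torch.clamp(torch.abs(x) - theta, min=0.0)

    def forward(self, y):
        x = torch.zeros(y.shape[0], self.n, device=y.device)
        for k in range(self.n_layers):
            x = self.soft_threshold(self.W(y) + self.S(x), self.theta[k])
        return x


# Train on a family of synthetic sparse-recovery problems y = A x with sparse x.
m, n = 64, 128
A = torch.randn(m, n) / m ** 0.5
x_true = torch.randn(256, n) * (torch.rand(256, n) < 0.1)   # ~10% nonzero entries
y = x_true @ A.T

model = LearnedISTA(m, n)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):
    loss = ((model(y) - x_true) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```
In a real L2O application, the training instances would come from the target problem family (e.g. an imaging modality) rather than this synthetic data.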
BL1: Adversarial perturbation attacks
Tutor: Blerta Lindqvist (blerta.lindqvist@aalto.fi)
The topic is about evasion attacks, and it is flexible depending on student interest. It can be theoretical or involve code experiments, for example to reproduce a paper's results, or perhaps to try a new attack. Pairs of students can also work together, provided they show that it is a team effort with collaboration.
Reference:
- A list with a focus on evasion attacks and defenses https://nicholas.carlini.com/writing/2018/adversarial-machine-learning-reading-list.html
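As a concrete example of the kind of evasion attack this topic covers, below is a minimal sketch of the classic fast gradient sign method (FGSM). Here `model` stands for any trained PyTorch classifier, and the epsilon value is an arbitrary illustrative choice.
```python
import torch
import torch.nn.functional as F


def fgsm_attack(model, x, y, epsilon=0.03):
    """Return an adversarial version of x within an L-infinity ball of radius epsilon."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # One step in the sign of the gradient, then clip back to a valid pixel range.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```
Stronger iterative attacks (e.g. PGD) repeat this step several times with projection back into the epsilon-ball.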
BL2: Adversarial defenses against adversarial perturbation attacks
Tutor: Blerta Lindqvist (blerta.lindqvist@aalto.fi)
The topic is about adversarial defenses against evasion attacks, and it is flexible depending on student interest. It can be theoretical or involve code experiments, for example to reproduce a paper's results, or perhaps to try a new defense. Pairs of students can also work together, provided they show that it is a team effort with collaboration.
References:
- A list with a focus on evasion attacks and defenses https://nicholas.carlini.com/writing/2018/adversarial-machine-learning-reading-list.html
SR: Microservices - when and how to use them
Tutor: Sara Ranjbaran (sara.ranjbaran@aalto.fi)
Microservice architecture is a modern approach to software system design in which the functionality of the system is divided into small, independent units. Microservice systems differ from more traditional monolithic systems in many ways, some of which are unexpected. The cost of migrating from a monolithic system to a system based on microservices is often substantial, so this decision needs to be carefully evaluated. Microservices have become very popular in recent years, and an increasing number of companies (e.g., Amazon, Netflix, LinkedIn) are dismantling their existing monolithic applications in favor of distributed microservice systems. As with any big software project, migrating to a microservice architecture often requires considerable investment. In this project work, you will discuss the benefits and drawbacks of adopting a microservice architecture in comparison to a monolithic architecture. Depending on your interests, possible viewpoints on this generic topic include, for example, security challenges, scalability, serverless computing, service meshes, and big data platforms.
References:
- Pooyan Jamshidi, Claus Pahl, Nabor C. Mendonça, James Lewis, and Stefan Tilkov. Microservices: The journey so far and challenges ahead. IEEE Software, 35(3):24–35, 2018.
- Nicola Dragoni, Saverio Giallorenzo, Alberto Lluch Lafuente, Manuel Mazzara, Fabrizio Montesi, Ruslan Mustafin, and Larisa Safina. Microservices: Yesterday, Today, and Tomorrow, pages 195–216. Springer International Publishing, Cham, 2017.
RM: Understanding prompt generators for AI
Tutor: Rongjun Ma (rongjun.ma@aalto.fi)
A prompt generator is a tool that generates prompts for interacting with AI-powered natural language processing models. These prompts are essential for instructing the AI model on what task or response you want it to generate. Nowadays, prompt generators are used in many settings, such as drafting instructions for Midjourney. This project aims to explore how prompt generators work and how they help human-machine communication. Students can choose one use case of a prompt generator, such as a Midjourney text generator or information retrieval prompts. During the course, students will conduct a literature review on one use case to understand how different types of prompts work, analyze the existing frameworks, and propose design implications.
Reference:
- Ruskov, M. (2023). Grimm in Wonderland: Prompt Engineering with Midjourney to Illustrate Fairytales. arXiv preprint arXiv:2302.08961.
- Liu, V., & Chilton, L. B. (2022, April). Design guidelines for prompt engineering text-to-image generative models. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (pp. 1-23)
AZ1: Generating a realistic visual crowdsourced dataset using Carla simulator for autonomous driving
Tutor: Aziza Zhanabatyrova (zhanabatyrova.aziza@aalto.fi)
The task is to write a script that can generate a dataset with some of the characteristics of crowdsourced data, for example multiple vehicles with different or random camera placements and camera parameters, and varied scenarios and scenes. Provide a document explaining your findings.
References:
- https://carla.org/
- https://github.com/carla-simulator/carla
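A rough, untested sketch of how such a script could start with the CARLA Python API is given below: it spawns a vehicle, attaches an RGB camera with randomized placement and intrinsics, and saves frames to disk. It assumes a CARLA server running on the default local port; the sampling ranges and output path are illustrative choices only.
```python
import random
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()
blueprints = world.get_blueprint_library()

# Spawn one vehicle at a random spawn point and let the traffic manager drive it.
vehicle_bp = random.choice(blueprints.filter("vehicle.*"))
spawn_point = random.choice(world.get_map().get_spawn_points())
vehicle = world.spawn_actor(vehicle_bp, spawn_point)
vehicle.set_autopilot(True)

# Randomize camera intrinsics and mounting pose to mimic crowdsourced dashcams.
camera_bp = blueprints.find("sensor.camera.rgb")
camera_bp.set_attribute("image_size_x", str(random.choice([1280, 1920])))
camera_bp.set_attribute("image_size_y", str(random.choice([720, 1080])))
camera_bp.set_attribute("fov", str(random.uniform(70, 110)))
mount = carla.Transform(
    carla.Location(x=random.uniform(0.5, 1.5), z=random.uniform(1.2, 2.0)),
    carla.Rotation(pitch=random.uniform(-10, 5), yaw=random.uniform(-5, 5)),
)
camera = world.spawn_actor(camera_bp, mount, attach_to=vehicle)
camera.listen(lambda image: image.save_to_disk("out/%06d.png" % image.frame))
```
A full solution would repeat this for several vehicles, scenes, and weather settings, and record the sampled camera parameters alongside the images.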
AZ2: Survey on robust camera pose estimation methods
Tutor: Aziza Zhanabatyrova (zhanabatyrova.aziza@aalto.fi)
Accurate camera relocalization plays an important role in autonomous driving. Given query images, the camera relocalization task aims at estimating the camera poses from which the images were taken. The task is to write a survey on recent methods that are robust to noisy or incomplete data, or to other challenges.
Reference:
- https://ieeexplore.ieee.org/document/4676964
- https://arxiv.org/abs/2211.11238
MG: Enhancing the usability of mobile health apps (or websites) designed for older adults
Tutor: Maedeh Ghorbanian Zolbin (maedeh.ghorbanianzolbin@aalto.fi)
This topic focuses on designing mobile health apps (or websites) according to the specific characteristics of older adults, to meet their unique needs. The topic will answer the following questions: (1) What challenges do older adults face when using current mobile health apps (or websites)? (2) How can the user interface and overall design of mobile health apps (or websites) be improved to better meet the needs of older adults and accommodate their cognitive and physical abilities?
Reference:
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8837196/
- https://journals.sagepub.com/doi/full/10.1177/1064804619840731
ES1: Calibration in quantized LLMs
Tutor: Erik Schultheis (erik.schultheis@aalto.fi)
Recently, heavily quantized large language models (LLMs) have become very popular, as they enable high-quality inference on low-cost hardware. However, quantization comes with a noticeable drop in model quality.
The idea of the project is to look at this problem from the point of view of model calibration; that is, for each context X with predicted next token T in the vocabulary V, a calibrated model phi satisfies P[T = v | phi(X)_v = p] = p for each v in V and probability p.
The survey part would look at LLMs in general (a very deep understanding is not necessary, though), and at model calibration (measures of calibration and algorithms for calibrating).
For the hands-on part, the student would run several quantized versions with different weight resolutions and report calibration measures, and then run some off-the-shelf calibration algorithms and see if (part of) the performance drop can be restored. If the results are positive, this might also be an interesting topic for a subsequent master's thesis.
Reference:
- Llama 2 paper: https://arxiv.org/abs/2307.09288
- Calibration in scikit-learn: https://scikit-learn.org/stable/modules/calibration.html
- Multiclass calibration: https://arxiv.org/pdf/1902.06977.pdf (might not be necessary for the project, but could be looked at if the student is interested)
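As one possible starting point for the hands-on part, the sketch below computes a standard top-label expected calibration error (ECE) from next-token predictions; the classwise notion in the topic description would bin probabilities per token v instead. The arrays here are random placeholders standing in for the outputs of a quantized LLM on held-out text, and the binning scheme is just one common choice.
```python
import numpy as np


def expected_calibration_error(probs, targets, n_bins=15):
    """probs: (N, V) predicted next-token distributions; targets: (N,) true token ids."""
    conf = probs.max(axis=1)                     # confidence of the top prediction
    correct = probs.argmax(axis=1) == targets    # was the top prediction right?
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            # Gap between average confidence and empirical accuracy in this bin.
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap
    return ece


# Placeholder usage: random "predictions" just to show the call shapes.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(100), size=1000)
targets = rng.integers(0, 100, size=1000)
print(expected_calibration_error(probs, targets))
```
The same measurement could then be repeated for each weight resolution, before and after applying an off-the-shelf recalibration method.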
ES2: Adaptive repeat penalties for LLMs
Tutor: Erik Schultheis (erik.schultheis@aalto.fi)
Text generation with large language models is based on a large neural network that predicts scores for each possible next token, but in practice, selecting the next token is more complicated than just sampling according to the induced probability distribution. The main part of the project is to survey several existing heuristics for next-token selection. For the hands-on part, some experimentation with an adaptive repetition penalty could be an interesting topic: current heuristics penalize repeating the same token within a given time frame, which improves the generated text but also often leads to quite long sentences (as the "." token is penalized). An extension of this heuristic that takes into account the repetition frequency might be able to improve this.
References:
- Llama 2: https://arxiv.org/abs/2307.09288
- Locally typical sampling: https://arxiv.org/abs/2202.00666
- llama.cpp implementation: https://github.com/ggerganov/llama.cpp (the command-line options reveal several tunable heuristics for next-token selection)
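To illustrate the proposed direction, here is a small sketch of a frequency-aware repetition penalty applied to the logits before sampling: the penalty grows with how often a token already appeared in the recent context, so a token repeated once (like ".") is penalized far less than one repeated many times. The functional form, parameter names, and values are assumptions for illustration, not an existing llama.cpp option.
```python
from collections import Counter

import torch


def frequency_aware_repeat_penalty(logits, recent_tokens, base_penalty=1.1):
    """Scale each seen token's logit by base_penalty ** count, following the usual
    repetition-penalty convention (divide positive logits, multiply negative ones)."""
    counts = Counter(recent_tokens)
    out = logits.clone()
    for token, count in counts.items():
        factor = base_penalty ** count          # grows with repetition frequency
        if out[token] > 0:
            out[token] = out[token] / factor
        else:
            out[token] = out[token] * factor
    return out


# Usage: adjust logits over a toy vocabulary given the last few generated tokens.
vocab_size = 10
logits = torch.randn(vocab_size)
recent = [3, 3, 3, 7, 9]                        # token 3 repeated, 7 and 9 seen once
adjusted = frequency_aware_repeat_penalty(logits, recent)
next_token = torch.multinomial(torch.softmax(adjusted, dim=-1), num_samples=1)
```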
VH: Distributed Artificial Intelligence for IIoT
Tutor: Vesa Hirvisalo (vesa.hirvisalo@aalto.fi)
Distributed Artificial Intelligence (DAI) is needed in large systems, such as future IIoT systems. In addition to distributed inference, there may also be a need for distributed decision making. For example, a fleet of autonomous vehicles often must make decisions that affect each other. This kind of multi-agent system needs to share information between nodes, causing further load on the communication infrastructure. Also, training is often a very time-demanding and computationally heavy process that can benefit significantly from distributed training methods. The task is to make a review of DAI, especially the branches that include service structures within their research (DAI as a Service, DAIaaS).
Reference:
- N. Janbi, I. Katib, and R. Mehmood. Distributed artificial intelligence: Taxonomy, review, framework, and reference architecture. Intelligent Systems with Applications, 18, 2023. DOI: 10.1016/j.iswa.2023.200231
DH: Amortized Meta Bayesian Optimization
Tutor: Daolang Huang (daolang.huang@aalto.fi)
Bayesian optimization (BO) is a well-established method for optimizing black-box functions whose direct evaluations are costly. In recent years, several meta-BO architectures have been proposed that learn the surrogate model and the acquisition function end-to-end on a set of BO tasks using neural networks. In this project, our target is to read through several recent works on this topic and build our own pipeline for meta-BO. Prerequisites: some knowledge of deep learning and probability, and a basic idea of Bayesian optimization. PyTorch programming experience is expected.
Reference:
- MONGOOSE: Path-wise Smooth Bayesian Optimisation via Meta-learning https://arxiv.org/abs/2302.11533
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes https://arxiv.org/abs/2305.15930
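For orientation before diving into the meta-BO papers, the following is a tiny, self-contained classical BO loop (NumPy GP surrogate with a fixed RBF kernel plus expected improvement maximized over a grid). In the meta-BO works above, the surrogate and/or acquisition function would instead be neural networks trained end-to-end across tasks; all hyperparameters here are arbitrary illustrative choices.
```python
import numpy as np
from scipy.stats import norm


def rbf(a, b, ls=0.2):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_query)
    Kss = rbf(x_query, x_query)
    mean = Ks.T @ np.linalg.solve(K, y_train)
    var = np.clip(np.diag(Kss - Ks.T @ np.linalg.solve(K, Ks)), 1e-12, None)
    return mean, np.sqrt(var)


def expected_improvement(mean, std, best):
    z = (best - mean) / std            # minimization: improvement below current best
    return (best - mean) * norm.cdf(z) + std * norm.pdf(z)


f = lambda x: np.sin(3 * x) + 0.5 * x          # black-box objective (to minimize)
x_obs = np.array([0.1, 0.9])
y_obs = f(x_obs)
grid = np.linspace(0.0, 2.0, 200)

for _ in range(10):
    mean, std = gp_posterior(x_obs, y_obs, grid)
    x_next = grid[np.argmax(expected_improvement(mean, std, y_obs.min()))]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, f(x_next))

print("best x:", x_obs[np.argmin(y_obs)], "best f:", y_obs.min())
```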
KH: Large language models for modeling of physical systems
Tutor: Katsiaryna Haitsiukevich (katsiaryna.haitsiukevich@aalto.fi)
Large language models (LLMs) have changed the landscape of natural language processing (NLP) over the last few years. However, the capabilities of LLMs can be utilized in a number of applications beyond NLP tasks [1]. One can see LLMs as general pattern machines that can be applied to the extraction of more abstract, non-linguistic patterns from data (see, e.g., [2]). In this project, we will study different approaches to how LLMs can be utilized for modeling physical systems. For example, large language models can potentially accelerate the extraction of symbolic equations from a system description, similar to [3], or LLMs can be used to augment the training dataset to boost classifier/regressor performance, similar to [4].
References:
- Jablonka, K.M., Ai, Q., Al-Feghali, A., Badhwar, S., Bocarsly, J.D., Bran, A.M., Bringuier, S., Brinson, L.C., Choudhary, K., Circi, D. and Cox, S., 2023. 14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon. Digital Discovery.
- Mirchandani, S., Xia, F., Florence, P., Ichter, B., Driess, D., Arenas, M.G., Rao, K., Sadigh, D. and Zeng, A., 2023. Large language models as general pattern machines. arXiv preprint arXiv:2307.04721.
- Landajuela, M., Lee, C.S., Yang, J., Glatt, R., Santiago, C.P., Aravena, I., Mundhenk, T., Mulcahy, G. and Petersen, B.K., 2022. A unified framework for deep symbolic regression. Advances in Neural Information Processing Systems, 35, pp.33985-33998.
- Hegselmann, S., Buendia, A., Lang, H., Agrawal, M., Jiang, X. and Sontag, D., 2023, April. Tabllm: Few-shot classification of tabular data with large language models. In International Conference on Artificial Intelligence and Statistics (pp. 5549-5581). PMLR.
AJ1: Where is the Beef? The Optimal Selection of Training Data
Tutor: Alexander Jung (alex.jung@aalto.fi)
Assume you have a powerful model, such as a deep neural network with billions of tunable parameters, but far too little training data. Which other datasets, freely available on the internet (such as Wikipedia articles), are the most useful additions to your local training set? This project studies principled methods and fundamental limits for this training data selection problem.
Reference:
- Werner, M., He, L., Praneeth Karimireddy, S., Jordan, M., and Jaggi, M., "Provably Personalized and Robust Federated Learning", arXiv e-prints, 2023. doi:10.48550/arXiv.2306.08393
- Laurent Jacob, Jean-Philippe Vert, and Francis Bach, "Clustered Multi-Task Learning: A Convex Formulation", Advances in Neural Information Processing Systems 21 (NIPS), 2008.
- SarcheshmehPour, Y., Tian, Y., Zhang, L., and Jung, A., "Clustered Federated Learning via Generalized Total Variation Minimization", arXiv e-prints, 2021. doi:10.48550/arXiv.2105.12769
AJ2: Trade-offs between Explainability and Accuracy of Machine Learning
Tutor: Alexander Jung (alex.jung@aalto.fi)
Many ML application domains require more than just a high accuracy (or statistical power) of trained models. The trained ML models should also be interpretable or explainable. This project studies the intrinsic trade-off between a precise measure of subjective explainability and the achievable accuracy in important ML settings.
References:
- A. Jung and P. H. J. Nardelli, "An Information-Theoretic Approach to Personalized Explainable Machine Learning," in IEEE Signal Processing Letters, vol. 27, pp. 825-829, 2020, doi: 10.1109/LSP.2020.2993176
- Zhang, L., Karakasidis, G., Odnoblyudova, A., Dogruel, L., and Jung, A., "Explainable Empirical Risk Minimization", arXiv e-prints, 2020. doi:10.48550/arXiv.2009.01492
MK1: Certification of machine learning robustness
Tutor: Mikko Kiviharju (mikko.kiviharju@aalto.fi)
Some types of machine learning are vulnerable to data-level attacks, such as data poisoning, backdooring, or evasion. There are, however, methods to make implementations more robust, and authorities are becoming more aware of the need to check ML implementations specifically against data-level attacks. This seminar topic can be approached either comparatively, where existing national schemes (e.g. [BSI] and [NIST]) are compared to schemes that do not yet require such testing [KATAKRI], or by weighing those schemes against the latest research.
References:
- BSI, Federal Office for Information Security: "Security of AI-Systems:Fundamentals - Adversarial Deep Learning" https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/KI/Security-of-AI-systems_fundamentals.html
- NIST: "Adversarial Machine Learning A Taxonomy and Terminology of Attacks and Mitigations", https://doi.org/10.6028/NIST.AI.100-2e2023.ipd
- Formin / Kansallinen turvallisuusviranomainen (Finnish National Security Authority): "KATAKRI 2020: Tietoturvallisuuden auditointityökalu viranomaisille" (information security auditing tool for authorities), https://um.fi/katakri-tietoturvallisuuden-auditointityokalu-viranomaisille
MK2: OT vs IT cybersecurity management
Tutor: Mikko Kiviharju (mikko.kiviharju@aalto.fi)
Managing the cybersecurity of operational technology (OT) systems (industrial control and automation, among others) can be challenging due to differing priorities between safety and security. This study will examine the recommended OT-specific issues via NIST guidance for OT [NIST1] and the NIST Cybersecurity Framework (CSF) [NIST2].
Reference:
- [NIST1] NIST: "NIST Special Publication SP 800-82r3: Guide to Operational Technology (OT) Security", https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-82r3.ipd.pdf
- [NIST2] NIST. "Cybersecurity framework", https://www.nist.gov/cyberframework
MSA1: Effective and efficient knowledge distillation approaches
Tutor: Maryam Sabzevari (maryam.sabzevari@nokia-bell-labs.com)
Knowledge distillation is a technique in deep learning aimed at model compression, where a compact model, referred to as the student, is trained to replicate the behavior of a larger, more complex model known as the teacher. This research area focuses on refining knowledge distillation methodologies by exploring novel distillation architectures, multi-modal knowledge transfer, adaptive distillation strategies, and self-distillation techniques.
References:
- Huang, Tao, et al. "Knowledge distillation from a stronger teacher." Advances in Neural Information Processing Systems 35 (2022): 33716-33727.
- Phuong, Mary, and Christoph Lampert. "Towards understanding knowledge distillation." International conference on machine learning. PMLR, 2019.
- Cho, Jang Hyun, and Bharath Hariharan. "On the efficacy of knowledge distillation." Proceedings of the IEEE/CVF international conference on computer vision. 2019.
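As a concrete anchor for the topic, below is a minimal sketch of the standard response-based (Hinton-style) distillation loss that most of the referenced refinements build on. The temperature and weighting values are illustrative defaults, and `teacher`, `student`, `x`, and `y` in the usage comment are assumed to be defined elsewhere.
```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.7):
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # KL between temperature-softened distributions; T^2 keeps gradients comparable.
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1.0 - alpha) * ce


# Sketch of a training step (teacher frozen, student being optimized):
#   with torch.no_grad():
#       teacher_logits = teacher(x)
#   loss = distillation_loss(student(x), teacher_logits, y)
```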
MSA2: Continual learning approaches and challenges
Tutor: Maryam Sabzevari (maryam.sabzevari@nokia-bell-labs.com)
Continual learning, also known as lifelong learning, is a critical research area in machine learning that addresses the challenge of enabling AI systems to learn and adapt continuously over time, much like how humans learn from new experiences. In continual learning, models strive to accumulate knowledge from various data sources or tasks sequentially while retaining the knowledge learned from previous experiences. This presents a unique set of challenges, including catastrophic forgetting, where new information can disrupt performance on previously learned tasks. Researchers in this field explore methods for preserving and consolidating past knowledge, adapting to changing environments, and efficiently incorporating new information. Continual learning is vital for building AI systems that can evolve and improve over time, making it highly relevant in diverse applications.
References:
- Guo, Yiduo, Bing Liu, and Dongyan Zhao. "Online continual learning through mutual information maximization." International Conference on Machine Learning. PMLR, 2022.
- Arani, Elahe, Fahad Sarfraz, and Bahram Zonooz. "Learning fast, learning slow: A general continual learning method based on complementary learning system." arXiv preprint arXiv:2201.12604 (2022).
- Wang, Liyuan, et al. "Memory replay with data compression for continual learning." arXiv preprint arXiv:2202.06592 (2022).
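To make the replay idea mentioned in the references concrete, here is a small sketch of an experience-replay buffer with reservoir sampling, one simple baseline against catastrophic forgetting. The capacity, the batch mixing, and the commented training loop (with `model`, `optimizer`, and `current_task_loader` assumed given) are illustrative simplifications rather than any of the referenced methods.
```python
import random


class ReplayBuffer:
    def __init__(self, capacity=2000):
        self.capacity = capacity
        self.data = []          # list of (x, y) pairs from past batches/tasks
        self.seen = 0

    def add(self, x, y):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append((x, y))
        else:                   # reservoir sampling keeps a uniform subsample
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = (x, y)

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))


# Sketch of use inside the training loop for the current task:
#   buffer = ReplayBuffer()
#   for x, y in current_task_loader:
#       for xi, yi in zip(x, y):              # store current examples for later tasks
#           buffer.add(xi, yi)
#       replay = buffer.sample(32)            # mix in examples from earlier tasks
#       if replay:
#           xs, ys = zip(*replay)
#           x = torch.cat([x, torch.stack(xs)])
#           y = torch.cat([y, torch.stack(ys)])
#       loss = torch.nn.functional.cross_entropy(model(x), y)
#       loss.backward(); optimizer.step(); optimizer.zero_grad()
```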
JM: CDN attacks
Tutor: Jose Luis Martin Navarro (jose.martinnavarro@aalto.fi)
Content Delivery Networks (CDNs) are an essential Internet infrastructure that improves the performance and scalability of content requests such as webpages and media. They work as a geographically distributed proxy platform, caching and forwarding content on a massive scale. Many CDN providers include Distributed Denial-of-Service (DDoS) protection as an additional feature of their services, usually through tools such as web application firewalls (WAFs). Several studies have tried to develop a general model for CDN attacks [1][2]. However, the CDN landscape is not static: new variants of attacks emerge every year, and vendors fix and develop different countermeasures. In this project, we ask students to conduct a literature review on the latest DDoS attacks [3][4][5] and validate the results with an ethical approach. With this understanding, we aim to identify new potential threats for CDNs.
Reference:
- M. Ghaznavi, E. Jalalpour, M. A. Salahuddin, R. Boutaba, D. Migault and S. Preda, "Content Delivery Network Security: A Survey," IEEE Communications Surveys & Tutorials, vol. 23, no. 4, pp. 2166-2190, Fourthquarter 2021, doi: 10.1109/COMST.2021.3093492.
- Sharafaldin, Iman, Arash Habibi Lashkari, Saqib Hakak, and Ali A. Ghorbani. "Developing realistic distributed denial of service (DDoS) attack dataset and taxonomy." In 2019 International Carnahan Conference on Security Technology (ICCST), pp. 1-8. IEEE, 2019.
- CDN Backfired: Amplification Attacks Based on HTTP Range Requests
- Guo, Run, Jianjun Chen, Yihang Wang, Keran Mu, Baojun Liu, Xiang Li, Chao Zhang, Haixin Duan, and Jianping Wu. "Temporal CDN-Convex Lens: A CDN-Assisted Practical Pulsing DDoS Attack." In 32nd USENIX Security Symposium (USENIX Security 23), pp. 6185-6202. 2023.
- Li, Zihao, and Weizhi Meng. "Mind the amplification: cracking content delivery networks via DDoS attacks." In Wireless Algorithms, Systems, and Applications: 16th International Conference, WASA 2021, Nanjing, China, June 25–27, 2021, Proceedings, Part II 16, pp. 186-197. Springer International Publishing, 2021.
MS: Security for future factories: overview of tools and technologies
Tutor: Mohit Sethi (mohit.sethi@aalto.fi)
Operational technology (OT) security for factories is currently receiving quite a lot of attention from industry as well as from regulators. The student is expected to provide an overview of different tools and technologies, such as unidirectional gateways and data diodes, that are available for ensuring the security of factories.
Reference:
- https://csrc.nist.gov/pubs/sp/800/82/r3/ipd
- https://www.ideals.illinois.edu/items/124965
- https://www.opswat.com/products/unidirectional-security-gateway-guide
- https://www.opswat.com/blog/how-we-test-unidirectional-gateway
HF: Prospects of WebAssembly to address performance and energy challenges of mobile clients
Tutor: Hannu Flinck (hannu.flinck@nokia.com)
WebAssembly (WASM) is a binary instruction format that allows execution of code in a safe and efficient manner across different platforms and programming languages. It was designed as a low-level virtual machine that runs code at near-native speed, making it ideal for performance-intensive applications. The motivation behind WASM is to enable developers to build web applications with high performance and portability, breaking the barriers imposed by traditional browser-based JavaScript execution. The run-time performance of WASM against that of JavaScript has been studied extensively. Recently, the energy consumption of WASM versus JavaScript has also been addressed. One study found that using WASM may reduce battery consumption by 39% in comparison to JavaScript.
References:
- W. Wang, "Empowering Web Applications with WebAssembly: Are We There Yet?," 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), Melbourne, Australia, 2021, pp. 1301-1305, doi: 10.1109/ASE51524.2021.9678831.
- J. De Macedo, R. Abreu, R. Pereira and J. Saraiva, "On the Runtime and Energy Performance of WebAssembly: Is WebAssembly superior to JavaScript yet?," 2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW), Melbourne, Australia, 2021, pp. 255-262, doi: 10.1109/ASEW52652.2021.00056.
- J. De Macedo, R. Abreu, R. Pereira and J. Saraiva, "WebAssembly versus JavaScript: Energy and Runtime Performance," 2022 International Conference on ICT for Sustainability (ICT4S), Plovdiv, Bulgaria, 2022, pp. 24-34, doi: 10.1109/ICT4S55073.2022.00014.
- Jangda, Abhinav, et al. "Not so fast: Analyzing the performance of WebAssembly vs. native code." 2019 USENIX Annual Technical Conference (USENIX ATC 19). 2019.
- Kjorveziroski, V., Filiposka, S. WebAssembly as an Enabler for Next Generation Serverless Computing. J Grid Computing 21, 34 (2023). https://doi.org/10.1007/s10723-023-09669-8
JB1: Software supply chain security: the SLSA specification
Tutor: Jacopo Bufalino (jacopo.bufalino@aalto.fi)
Vulnerable code can leak into a production system at every step of the software supply chain. The SLSA specification has been proposed to address the key supply chain threats by verifying source, build, and dependency integrity. SLSA provides three different levels of supply chain security guarantees, depending on the level of assurance needed for a given context. For this topic, you have to build a project that is SLSA level 3 compliant and present its related security guarantees.
References:
- https://slsa.dev/
- https://dl.acm.org/doi/abs/10.1145/3372297.3420015
- https://arxiv.org/abs/2002.01139
- https://ieeexplore.ieee.org/abstract/document/9740718
- https://owasp.org/www-project-software-component-verification-standard/
JB2: Testing network policies with eBPF
Tutor: Jacopo Bufalino (jacopo.bufalino@aalto.fi)
eBPF is an in-kernel virtual machine designed to run sandboxed programs within the OS kernel. Its usage extends across a wide array of applications in both server environments and the cloud. eBPF applications include resource observability, network filtering and monitoring, and network security. In this seminar topic, the student will survey the existing eBPF applications in the field of network security and create an eBPF program to test the firewall policies of a Linux host.
References:
- https://ebpf.io/
- https://cilium.io/
- https://github.com/lizrice/ebpf-beginners
- https://dl.acm.org/doi/abs/10.1145/3371038
- https://ieeexplore.ieee.org/abstract/document/9335808
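To show the general workflow the hands-on part builds on, here is a minimal sketch using the bcc Python bindings, following the pattern of the bcc beginners examples referenced above: compile a tiny eBPF program, attach it to a kernel hook, and read its output from user space. It only traces clone() syscalls; a firewall-policy tester would instead attach to networking hooks (e.g. socket filters, tc, or XDP). Running it requires root and an installed bcc, and the exact probe naming may vary by kernel version.
```python
from bcc import BPF

# A tiny eBPF program in restricted C, compiled at runtime by bcc.
prog = r"""
int trace_clone(struct pt_regs *ctx) {
    bpf_trace_printk("clone() observed\n");
    return 0;
}
"""

b = BPF(text=prog)
# Attach the function above as a kprobe on the clone syscall entry point.
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="trace_clone")
print("Tracing clone() ... Ctrl-C to stop")
b.trace_print()  # stream kernel trace output to stdout
```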
MSI: Neural radiance fields (NeRF)
Tutor: Matti Siekkinen (matti.siekkinen@aalto.fi)
NeRF provides a means to synthesize novel views using a neural-network-based model that has been trained to learn a specific 3D scene from a set of images. NeRF essentially enables 6DoF navigation within the learned scene and has applications, for instance, in XR. Many improvements have been made by the research community since the idea was first introduced a few years ago. The student's task is to survey the state of the art and identify still-open problems.
Reference:
- https://www.matthewtancik.com/nerf
- https://paperswithcode.com/method/nerf
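As a pointer to the core mechanism behind this topic, the sketch below implements the standard volume-rendering (alpha compositing) step that turns per-sample densities and colors along a camera ray into a pixel color. The NeRF MLP, positional encoding, and ray sampling are omitted, and the values here are random placeholders.
```python
import torch


def composite_ray(sigmas, colors, deltas):
    """sigmas: (S,) densities, colors: (S, 3) RGB, deltas: (S,) sample spacings."""
    alpha = 1.0 - torch.exp(-sigmas * deltas)                  # per-sample opacity
    # Transmittance: probability the ray reaches sample i without being absorbed earlier.
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = alpha * trans
    return (weights[:, None] * colors).sum(dim=0)              # composited pixel RGB


S = 64
sigmas = torch.rand(S) * 5.0          # would come from the NeRF MLP
colors = torch.rand(S, 3)             # would come from the NeRF MLP
deltas = torch.full((S,), 1.0 / S)    # spacing between samples along the ray
pixel_rgb = composite_ray(sigmas, colors, deltas)
```
Much of the recent work surveyed in this topic (e.g. faster or higher-quality NeRF variants) changes how the samples and the per-sample predictions are produced, while this compositing step stays essentially the same.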