AA: Aalto startup ecosystem (and/or university ecosystem for student, research-based, and deep-tech startups)
Tutor: Antti Ainamo

Who holds the key roles in student startups (or other startups) in the entrepreneurial ecosystem in and around Aalto University? Why do the various actors (outside entrepreneurs and investors, faculty and staff, students, public policy makers, large firms, anyone else) bother to participate, and why are some even very eager to do so? How is the ecosystem organized (emergently and/or deliberately)? What appears to work well, and what could be improved? Distinctions between different kinds of startups (e.g. purely online startups vs. startups with key offline components; Aalto vs. other universities; Finland vs. other countries; student startups vs. other startups) are welcome.

References:

  • Ainamo, Pikas & Mikkelä (2021). "University ecosystem for student startups: A 'platform of trust' perspective"
  • Ainamo, Dell'Era & Verganti (2021). "Radical circles and visionary innovation: Angry Birds and the transformation of video games", Creativity and Innovation Management. [Both available online, e.g. on ResearchGate under "Antti Ainamo".]


TA1: Optimizing firewall policies

Tutor: Tuomas Aura

Firewall policies are typically defined as a linear table of rules that is evaluated in top-down order. Some firewalls allow the rules to branch into multiple chains or tables. The firewall policy is essentially a function from a multi-dimensional space (e.g., the connection 5-tuple) to a binary decision (pass or drop). There is quite a bit of literature about optimizing firewall policy implementations. The typical goal is to minimize the latency and CPU power used for processing each packet. In this topic, the student will learn about algorithmic optimization methods for firewall rules and implement one of them. It is possible to work in pairs on the hands-on part, i.e. the problem definition and test data. Each student should implement a different optimization algorithm or version and write the seminar paper independently.
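
To make the rule-table model concrete, below is a minimal Python sketch of top-down, first-match evaluation over a (simplified) connection tuple. The rule format and all names are illustrative assumptions, not any particular firewall's syntax; real rules also support ranges, subnets, and chains.

```python
# Minimal sketch of a linear firewall policy: first matching rule wins.
# The rule format is hypothetical and simplified (no source port, no
# subnets or ranges); None acts as a wildcard that matches anything.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    proto: Optional[str]   # e.g. "tcp", or None for any protocol
    src: Optional[str]     # exact source address, or None for any
    dst: Optional[str]     # exact destination address, or None for any
    dport: Optional[int]   # destination port, or None for any
    action: str            # "pass" or "drop"

def evaluate(rules, proto, src, dst, dport):
    """Scan the table top-down; return the action of the first match."""
    for r in rules:
        if ((r.proto is None or r.proto == proto)
                and (r.src is None or r.src == src)
                and (r.dst is None or r.dst == dst)
                and (r.dport is None or r.dport == dport)):
            return r.action
    return "drop"  # default-deny if nothing matched

policy = [
    Rule("tcp", None, "10.0.0.5", 22, "drop"),    # block SSH to the server
    Rule("tcp", None, "10.0.0.5", None, "pass"),  # allow other TCP to it
]
print(evaluate(policy, "tcp", "192.0.2.1", "10.0.0.5", 443))  # -> pass
```

The referenced EffiCuts and HyperCuts papers replace this linear O(n) scan with decision trees built by cutting the multi-dimensional rule space.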

References:

  • Balajee Vamanan, Gwendolyn Voskuilen, T. N. Vijaykumar. EffiCuts: optimizing packet classification for memory and throughput, ACM SIGCOMM 2010. https://doi.org/10.1145/1851275.1851208 
  • Sumeet Singh, Florin Baboescu, George Varghese, Jia Wang. Packet Classification Using Multidimensional Cutting. ACM SIGCOMM 2003. https://dl.acm.org/doi/10.1145/863955.863980


TA2: Detecting anomalies in firewall configurations

Tutor: Tuomas Aura

Firewall policies tend to be fragile, and network administrators are reluctant to make any changes that could cause unexpected failures or open new vulnerabilities. One approach to improving the robustness of the firewall configuration is automated anomaly detection, which aims to find internal inconsistencies in the policy. These anomalies may indicate vulnerabilities or overly strict policies that cause communication failures. In this topic, the student should summarize the types of anomalies described in the literature and implement and demonstrate a detection algorithm for one or more types of anomaly. The goals could be extended to defining new types of anomalies and bad smells in the filtering policies (some ideas: comparing the policy to observed packet counts, or matching related protocols and communication directions).
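
As a concrete starting point, the sketch below flags one classic anomaly type, shadowing, in a simplified wildcard-only rule format (the tuple fields and names are hypothetical; real detectors must also handle subnets, ranges, and partial overlaps).

```python
# Sketch: detect shadowing anomalies in a simplified rule list. A rule is
# shadowed if an earlier rule with a different action matches every packet
# the later rule matches, so the later rule can never take effect.
# Each rule is (proto, src, dst, dport, action); None is a wildcard.

def covers(a, b):
    """Does field value a match at least everything that b matches?"""
    return a is None or a == b

def shadows(earlier, later):
    *match_a, action_a = earlier
    *match_b, action_b = later
    return action_a != action_b and all(
        covers(a, b) for a, b in zip(match_a, match_b))

def find_shadowed(rules):
    return [(i, j) for j in range(len(rules))
            for i in range(j) if shadows(rules[i], rules[j])]

rules = [
    ("tcp", None, "10.0.0.5", None, "drop"),  # rule 0: drop all TCP to host
    ("tcp", None, "10.0.0.5", 80, "pass"),    # rule 1: shadowed by rule 0
]
print(find_shadowed(rules))  # -> [(0, 1)]
```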

It is possible to work in pairs on the problem definition, test data, and optionally also the anomaly detection implementation. Each student will write the seminar paper independently.

References:

  • Tarek Abbes, Adel Bouhoula, Michaël Rusinowitch. Detection of firewall configuration errors with updatable tree, International Journal of Information Security, Springer Verlag, 2016. https://hal.inria.fr/hal-01320646/document 
  • Hongxin Hu, Gail-Joon Ahn, Ketan Kulkarni. Detecting and Resolving Firewall Policy Anomalies, IEEE Transactions on Dependable and Secure Computing, 2012. https://ieeexplore-ieee-org.libproxy.aalto.fi/stamp/stamp.jsp?tp=&arnumber=6143955
  • Avishai Wool. Trends in Firewall Configuration Errors: Measuring the Holes in Swiss Cheese. IEEE Internet Computing, 2010. https://ieeexplore.ieee.org/abstract/document/5440153

TA3: Layer-2 integrity protection

Tutor: Tuomas Aura

In embedded Ethernet networks, it is desirable to implement end-to-end authentication on the data link layer. IEEE 802.1AE is an extension of the Ethernet specification that adds a message authentication code to each frame. In the AUTOSAR specification for automotive networks, there is a similar extension called SecOC, which tries to minimize the overhead of the integrity protection in terms of the number of bits per frame. Some difficulties in such security mechanisms arise from replay attacks and freshness indicators. Moreover, the authentication is based on symmetric keys, and there is a big difference between pairwise authentication and group authentication. The student should explore the design options based on the standard protocols. It is possible to analyze the tradeoffs made with respect to replay protection and overhead. Experimental implementation or deployment in a virtual network environment is one possible direction for the project. In that case, the hands-on work can be done in pairs. Each student will write the seminar paper independently.
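
As a rough illustration of the mechanism, here is a sketch of SecOC-style frame protection using Python's standard hmac module. The field widths and key are arbitrary placeholders, not the AUTOSAR-specified values; the point is the structure: payload plus truncated freshness value plus truncated MAC.

```python
# Sketch of MAC-plus-freshness frame protection (SecOC-like structure).
# Field widths and the key are illustrative, not standard-mandated values.
import hmac, hashlib

KEY = b"16-byte-demo-key"   # symmetric key shared by sender and receiver
MAC_LEN = 4                 # truncated MAC: lower overhead, weaker tag
CTR_LEN = 2                 # truncated freshness counter

def protect(payload, counter):
    fresh = counter.to_bytes(CTR_LEN, "big")
    tag = hmac.new(KEY, payload + fresh, hashlib.sha256).digest()[:MAC_LEN]
    return payload + fresh + tag

def verify(frame, lowest_acceptable_counter):
    payload = frame[:-CTR_LEN - MAC_LEN]
    fresh = frame[-CTR_LEN - MAC_LEN:-MAC_LEN]
    tag = frame[-MAC_LEN:]
    if int.from_bytes(fresh, "big") < lowest_acceptable_counter:
        return None   # replay protection: reject stale freshness values
    good = hmac.new(KEY, payload + fresh, hashlib.sha256).digest()[:MAC_LEN]
    return payload if hmac.compare_digest(tag, good) else None

frame = protect(b"\x01\x02\x03", counter=7)
print(verify(frame, lowest_acceptable_counter=7))  # -> b'\x01\x02\x03'
```

Truncating the counter and the tag is exactly what creates the synchronization, replay-window, and overhead tradeoffs mentioned above.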

References:

  • 802.1AE: MAC Security (MACsec), https://1.ieee802.org/security/802-1ae/ 
  •  AUTOSAR, Specification of Secure Onboard Communication Protocol, https://www.autosar.org/fileadmin/user_upload/standards/foundation/20-11/AUTOSAR_PRS_SecOcProtocol.pdf

TA4: Firewalls and filtering policies for small networks

Tutor: Tuomas Aura

This broad topic allows students to explore filtering and isolation defenses in networks used by small businesses or at home. Firewalls that filter network traffic are an old and relatively stable area of network security. The networks and their applications, however, are changing. Home and small-business networks are used for connecting embedded (IoT) devices, which are not fully trusted and need to be isolated from each other and from the rest of the network. IPv6 deployment means that it is no longer possible to rely on private address spaces and NAT for security; meanwhile, network segmentation with VLANs, the migration of many local services to cloud networks, and the increased use of VPN connections further complicate the network structure. Since these networks have no professional administration, it is difficult to be sure that the firewall configuration is correct. In this topic, the students may: (1) build a virtual testbed network with a firewall and a reasonable policy, then explore ways to gain assurance of its correctness, e.g., by probing the firewall (see the sketch below); (2) students who have completed CS-E4300 Network Security can implement automated testing for the firewall and VPN policies created in the course projects; (3) students who have access to a real-world or testbed firewall configuration from their workplace or home can analyze that (only with permission and in anonymized form). To be successful in this topic, the students should have some idea of their own about what kind of testbed to build or target network to analyze. Otherwise, it can be difficult to come up with sufficiently interesting firewall rules. It is possible to work in pairs on the hands-on part of the project. Each student will write the seminar paper independently.
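
For direction (1), the sketch below shows the simplest possible black-box probe: timing out or being refused on TCP connects and comparing the observation to the written policy. The target address and the expected results are placeholders; probe only networks you are authorized to test.

```python
# Minimal sketch: black-box probing of a firewall with plain TCP connects.
# Host, ports, and expectations are placeholders for a testbed setup.
import socket

def probe(host, port, timeout=2.0):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "open"          # packet passed and a service answered
    except socket.timeout:
        return "filtered"      # no reply at all: typically a silent drop
    except ConnectionRefusedError:
        return "closed"        # RST came back: packet passed, no listener
    finally:
        s.close()

expected = {22: "filtered", 80: "open", 443: "open"}  # from the written policy
for port, want in expected.items():
    got = probe("192.0.2.10", port)
    print(f"port {port}: expected {want}, observed {got}")
```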

References:

  • Firewall experiments could be conducted with the pfSense community edition https://www.pfsense.org/download/ and a virtual network environment.
  •  Firewall Builder Cookbook, chapter 14, has various examples of firewall rules: http://fwbuilder.sourceforge.net/4.0/docs/users_guide5/cookbook.shtml

AB: Discrepancy measures in Approximate Bayesian Computation

Tutor: Ayush Bharti

Parameter inference for simulator-based, or generative, models is a challenging task due to the unavailability of the likelihood function. Approximate Bayesian computation (ABC) has emerged as a popular technique for inferring the parameters of such models over the past couple of decades. ABC relies on sampling from the model in order to approximate the posterior distribution of the parameters. Parameter values that yield simulated data "close" to the observed data in some distance metric are selected as samples from the desired approximate posterior. In the case of high-dimensional data, the distances are computed between summary statistics of the data. However, selecting appropriate summaries is non-trivial and introduces further approximation error in the posterior. A number of ABC methods have been proposed recently in which the distributions of the simulated and observed data are compared directly instead of their summaries. In this project, you will review the literature on the distances used in ABC methods, then implement and compare them on some toy examples.
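
To fix ideas, here is a minimal rejection-ABC sketch for a toy Gaussian simulator, using the absolute difference of sample means as the distance; the summary-free methods to be reviewed replace exactly this distance with discrepancies between full empirical distributions (e.g. MMD or Wasserstein).

```python
# Minimal rejection-ABC sketch: infer the mean of a Gaussian simulator.
# The distance compares one summary statistic (the sample mean); the ABC
# methods in this topic instead compare the full simulated/observed samples.
import numpy as np

rng = np.random.default_rng(0)
observed = rng.normal(2.0, 1.0, size=100)       # data with true theta = 2.0

def simulate(theta, size=100):
    return rng.normal(theta, 1.0, size=size)

def distance(x, y):
    return abs(x.mean() - y.mean())

eps, posterior = 0.05, []
for _ in range(20000):
    theta = rng.uniform(-5, 5)                  # draw from the prior
    if distance(simulate(theta), observed) < eps:
        posterior.append(theta)                 # accept: simulation is "close"

print(len(posterior), np.mean(posterior))       # posterior centred near 2.0
```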

References:
  • http://proceedings.mlr.press/v84/jiang18a/jiang18a.pdf and the references therein


CB: Identity-hiding in secure communication

Tutor: Christopher Brzuska

In secure communication between two (or more) parties, each party usually seeks to establish the authenticity of the other. At the same time, it might be desirable to hide one's identity from eavesdroppers and other attackers on the network layer. These goals seem somewhat contradictory. What can be achieved here? Possible topics include:

  • protocols for identity-hiding key exchange
  • attacks on identity-hiding key exchange protocols
  • security models for identity-hiding key exchange
  • impossibility results
  • any other relevant scientific question in this domain (to be discussed; you can contact Chris ahead of time if your preference for this topic depends on it)

Topics not included (this year) are TOR, MixNets, and onion routing, since these are not cryptographic and the goal of this topic is a cryptographic one.
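
To illustrate the core trick used by several such protocols (e.g. the Noise XX pattern), the sketch below runs an ephemeral Diffie-Hellman first and only then transmits a long-term identity key, encrypted under the ephemeral secret, so a passive eavesdropper never sees it. This is a toy fragment assuming the 'cryptography' Python package; the authentication steps of a real handshake are omitted.

```python
# Sketch: hide identities from eavesdroppers by sending long-term public
# keys encrypted under an ephemeral-ephemeral Diffie-Hellman secret.
# Toy fragment only: the remaining handshake (authenticating the static
# keys with further DH operations) is omitted.
import os
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def raw(pub):
    return pub.public_bytes(serialization.Encoding.Raw,
                            serialization.PublicFormat.Raw)

# Ephemeral keys are the only public values that cross the wire in clear.
e_alice, e_bob = X25519PrivateKey.generate(), X25519PrivateKey.generate()
shared = e_alice.exchange(e_bob.public_key())   # Bob derives the same secret
k = HKDF(hashes.SHA256(), 32, salt=None, info=b"id-hiding-demo").derive(shared)

# Alice's long-term identity key travels encrypted, hidden from eavesdroppers.
s_alice = X25519PrivateKey.generate()
nonce = os.urandom(12)
ciphertext = AESGCM(k).encrypt(nonce, raw(s_alice.public_key()), None)
assert AESGCM(k).decrypt(nonce, ciphertext, None) == raw(s_alice.public_key())
```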

References:

We will research the topic together, but here are a couple of starting points for research: 

Protocols:

  • The NOISE protocol framework: https://noiseprotocol.org/noise.pdf

Security models (and more protocols):

This paper was one of the earlier papers introducing identity-hiding in key exchange:
  • https://www.cypherpunks.ca/~iang/pubs/ntor.pdf
This article also introduced a definition of identity-hiding and analyzed the OPACITY protocol (the paper was followed by an interesting controversy, since not everyone agreed with the conclusions of the article):
  • https://link.springer.com/content/pdf/10.1007/978-3-642-40203-6_20.pdf
Here are three more recent papers on the topic:
  • https://link.springer.com/chapter/10.1007/978-3-030-88428-4_33
  • https://eprint.iacr.org/2020/1519
  • https://link.springer.com/chapter/10.1007/978-3-319-14054-4_6

LC: Theoretical Framework for Cloud Computing Network Measurements

Tutor: Lorenzo Corneo

This is a literature study concerning the most-used techniques for performing network measurements towards cloud datacenters at a global scale. The student will learn about common methodologies used to validate network measurements by reviewing papers from the leading conferences in computer networking, e.g., ACM SIGCOMM, ACM IMC, ACM WWW, etc.
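
As a taste of the measurement primitives such papers build on, here is a sketch that times TCP connection establishment (roughly one round trip) to an endpoint; the hostname is a placeholder, and real studies use many vantage points and dedicated measurement platforms.

```python
# Sketch of a basic latency primitive: time TCP connection establishment
# (~1 RTT) to a remote endpoint. Hosts below are placeholders.
import socket, statistics, time

def tcp_rtt_ms(host, port=443, samples=5):
    rtts = []
    for _ in range(samples):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(3.0)
        t0 = time.perf_counter()
        try:
            s.connect((host, port))
            rtts.append((time.perf_counter() - t0) * 1000)
        finally:
            s.close()
    return statistics.median(rtts)   # median resists transient spikes

for host in ["example.com"]:         # replace with cloud datacenter endpoints
    print(host, f"{tcp_rtt_ms(host):.1f} ms")
```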

References:
The following papers show two different approaches for performing cloud measurements.
  •  https://lorenzocorneo.github.io/papers/2021-www.pdf 
  • https://lorenzocorneo.github.io/papers/2021-networking.pdf

AD: Deep Reinforcement Learning and Games

Tutor: Anton Debner
Computer games offer an excellent platform for researching Deep Reinforcement Learning (DRL) methods. In 2015, Mnih et al. [0] showed that DRL can be used to play Atari games with human-like performance. In 2019, AlphaStar [1] learned to beat the best StarCraft 2 players, and OpenAI research showed [2] that DRL agents can learn to collaboratively use unintended game mechanics to complete tasks. There are several publicly available environments and frameworks, including robotic arms, multi-legged creatures, classic 2D Atari games, early 3D games (e.g., Doom 1993) and modern competitive games. Tools for using DRL techniques are also being integrated into popular game engines (e.g., [3]). While games are entertaining research subjects, they are often seen as a platform for creating generalizable techniques that solve problems in other areas as well. The task is to perform a literature review of recent papers, focusing either on how game-related research generalizes to non-gaming domains or on how DRL can be used in game development. The task can be adjusted based on the interests of the student. A general interest in AI / deep learning and video games is preferred.
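
For readers new to RL, the toy sketch below shows the basic agent-environment loop with tabular Q-learning on a one-dimensional gridworld; DRL methods such as DQN replace the table with a deep network and the gridworld with, e.g., Atari frames. Everything here is an illustrative stand-in.

```python
# Tabular Q-learning on a toy 1-D gridworld (illustrative stand-in for DRL).
import random

N_STATES, GOAL, ACTIONS = 6, 5, (-1, +1)    # walk left/right, reach state 5
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.95, 0.1

for episode in range(500):
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection (trial-and-error exploration)
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda a: Q[(s, a)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else -0.01    # reward signal with a step cost
        target = r + gamma * max(Q[(s2, b)] for b in ACTIONS) * (s2 != GOAL)
        Q[(s, a)] += alpha * (target - Q[(s, a)])  # temporal-difference update
        s = s2

print([max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)])  # all +1
```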

References:
[0] Mnih, V., Kavukcuoglu, K., Silver, D. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015). https://doi.org/10.1038/nature14236
 [1] Vinyals, O., Babuschkin, I., Czarnecki, W.M. et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575, 350–354 (2019). https://doi.org/10.1038/s41586-019-1724-z
 [2] OpenAI: Emergent Tool Use from Multi-Agent Interaction, https://openai.com/blog/emergent-tool-use/
[3] Unity ML-Agents Toolkit, https://github.com/Unity-Technologies/ml-agents


JH: Asymmetric multi-core scheduling

Most modern desktop multi-core processors are homogeneous: all of the processor cores are identical in instruction set architecture and features (i.e. micro-architecture). However, heterogeneous or asymmetric processors with multiple different core designs on the same chip offer benefits for specific use cases. These asymmetric architectures present the operating system scheduler with more options to exploit the characteristics of the different processor core types to achieve runtime goals, such as energy efficiency for mobile devices. Practical examples of this design paradigm are ARM big.LITTLE chips with 'big' cores designed for performance and 'LITTLE' cores prioritizing energy efficiency. Nearly all modern smartphones have such CPUs. Intel's upcoming Alder Lake desktop CPUs bring a similar hybrid design with distinct 'Performance' and 'Efficiency' cores on the same chip to the desktop. For this topic, the task is to review the literature on asymmetric multi-core scheduling. [1] offers a survey on asymmetric multicore processors, from which a sub-section can be chosen as the focus for the seminar paper. Two examples of more specific topics are given below:

  • How does the scheduler profile processes in order to decide which type of core is optimal for them? (A toy sketch of this idea follows the list.)
  • Compare how two or more scheduling goals (e.g. performance, fairness, energy consumption) are realized in proposed scheduling policies.

Having completed a course on operating systems, such as CS-C3140 Operating Systems at Aalto, is beneficial for this topic.
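
A toy sketch of the first sub-topic, assuming a utilisation threshold as the (deliberately crude) profiling signal; real schedulers such as Linux EAS use richer per-task load tracking and migrate tasks dynamically.

```python
# Toy sketch: place demanding tasks on 'big' cores and light tasks on
# 'LITTLE' cores based on recent utilisation. All numbers are illustrative.
def assign_cores(tasks, big_cores, little_cores, threshold=0.5):
    """tasks: dict of name -> recent CPU utilisation in [0, 1]."""
    placement = {}
    big, little = iter(big_cores), iter(little_cores)
    for name, util in sorted(tasks.items(), key=lambda t: -t[1]):
        pool = big if util >= threshold else little
        placement[name] = next(pool, "runqueue")  # fall back when cores run out
    return placement

tasks = {"game-render": 0.9, "mail-sync": 0.1, "video-decode": 0.7}
print(assign_cores(tasks, ["big0", "big1"], ["little0", "little1"]))
```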

References:
[1] S. Mittal, A Survey of Techniques for Architecting and Managing Asymmetric Multicore Processors, ACM Computing Surveys, 2016, https://doi.org/10.1145/2856125


VH: Co-inference techniques for Edge and Fog computing

Tutor: Vesa Hirvisalo

Edge and fog computing [1] are emerging paradigms for organizing computing services for the Internet of Things. They extend the concepts and practices of cloud computing toward the rapidly increasing number of connected devices. Many aspects of both edge and fog computing are currently under intense research. One of these is the utilization of artificial intelligence based on deep learning. However, the edge and fog computing domains have their inherent requirements and restrictions, which calls for efficient inference methods. Various co-inference methods (e.g., [2], [3]) are one approach to this. The task is to make an overview of co-inference techniques in edge and fog computing.
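
To make the idea concrete, below is a sketch of the early-exit logic behind BranchyNet [2]: a cheap intermediate classifier answers immediately when its prediction entropy is low, and only hard inputs pay for the full network. The two "networks" are toy stand-in functions.

```python
# Sketch of early-exit co-inference (the BranchyNet idea): exit at a cheap
# intermediate branch when it is confident (low entropy), else run the
# full network. The models here are stand-ins returning class probabilities.
import numpy as np

def entropy(p):
    return -np.sum(p * np.log(p + 1e-12))

def early_exit_infer(x, early_branch, full_network, threshold=0.3):
    p = early_branch(x)
    if entropy(p) < threshold:               # confident: stop on the device
        return p, "early exit"
    return full_network(x), "full network"   # hard input: pay the full cost

early = lambda x: np.array([0.96, 0.02, 0.02])   # confident toy output
full = lambda x: np.array([0.50, 0.30, 0.20])
print(early_exit_infer(None, early, full))
```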

References:
[1] Yousefpour et al. All One Needs to Know about Fog Computing and Related Edge Computing Paradigms - A Complete Survey. Journal of Systems Architecture. DOI: 10.1016/j.sysarc.2019.02.009
[2] S. Teerapittayanon, B. McDanel and H. T. Kung, BranchyNet: Fast inference via early exiting from deep neural networks, 2016 23rd International Conference on Pattern Recognition (ICPR), 2016, pp. 2464-2469, doi: 10.1109/ICPR.2016.7900006.
[3] J. Shao and J. Zhang, BottleNet++: An End-to-End Approach for Feature Compression in Device-Edge Co-Inference Systems, 2020 IEEE International Conference on Communications Workshops (ICC Workshops), 2020, pp. 1-6, doi: 10.1109/ICCWorkshops49005.2020.9145068.


AJ1: Implicit Bayesian neural networks with normalizing flows

Tutor: Anirudh Jain

Recent work on implicit Bayesian neural networks is a promising new direction for scalable deep networks. Existing work uses a limited Gaussian or mixture-of-Gaussians prior. We plan to use normalizing flows to obtain more expressive priors for Bayesian neural networks.
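
The mechanism that makes flows usable as priors is the change-of-variables formula, sketched here for a single affine layer (expressive flows stack many nonlinear invertible layers; scipy is used only for the sanity check).

```python
# Change-of-variables sketch behind normalizing flows: an invertible map f
# reshapes a simple base density while keeping the log-density exact.
# Here f(z) = a*z + b, so x ~ N(b, a^2) when z ~ N(0, 1).
import numpy as np
from scipy.stats import norm

a, b = 2.0, -1.0                       # flow parameters

def log_prob_x(x):
    z = (x - b) / a                    # inverse map f^{-1}(x)
    log_base = -0.5 * (z**2 + np.log(2 * np.pi))   # standard-normal log-pdf
    return log_base - np.log(abs(a))   # minus log |det Jacobian of f|

x = 0.7
print(log_prob_x(x), norm.logpdf(x, loc=b, scale=abs(a)))  # identical values
```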

References:

  • https://arxiv.org/abs/2010.13498

AJ2: Marginal normalizing flows for discrete data

Tutor: Anirudh Jain

Dequantization is widely used as a necessary step for applying normalizing flows to discrete data. Variational dequantization has led to state-of-the-art density-estimation results with normalizing flows on data such as images. We want to investigate learning a joint likelihood over the discrete and dequantized data representations.
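
For concreteness, here is the standard uniform-dequantization step for 8-bit data in a few lines of NumPy; variational dequantization replaces the uniform noise with a learned distribution q(u|x).

```python
# Uniform dequantization sketch: add u ~ U[0,1) to integer pixel values so
# a continuous density model (a flow) can be trained on discrete data; the
# continuous likelihood lower-bounds the discrete one.
import numpy as np

rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(4, 4))      # toy 8-bit "image"

u = rng.uniform(0.0, 1.0, size=pixels.shape)    # dequantization noise
x_continuous = (pixels + u) / 256.0             # continuous values in [0, 1)

print(pixels[0, 0], x_continuous[0, 0])
```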

References:

  • https://proceedings.neurips.cc/paper/2019/file/e046ede63264b10130007afca077877f-Paper.pdf

TJ1: Active Learning

Tutor: Ti John

In machine learning, we often simply assume that we have a large enough training set to fit complex models such as deep neural networks. In practice, we may have a large set of unlabelled examples, but it can be very costly to determine the true labels, and not just in image classification: for example, in drug discovery, we study the interaction between a potential drug molecule and a target receptor, requiring computationally expensive simulations or even more expensive lab experiments. In active learning, the AI model decides for which examples to request labels, so as to maximise prediction performance whilst minimising the number of labels required. In this topic, following on from the overview in Settles (2012), you will explore the current state of the art in the active learning literature.
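
The simplest strategy covered by Settles is uncertainty sampling; the sketch below implements it with scikit-learn on synthetic data, querying the pool point whose predicted class probability is closest to 0.5.

```python
# Uncertainty-sampling sketch: always request the label of the unlabelled
# point the current model is least sure about. Data here is synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)
labelled = list(range(10))                  # pretend only 10 labels exist
pool = [i for i in range(len(X)) if i not in labelled]

for _ in range(20):                         # budget: 20 label requests
    model = LogisticRegression(max_iter=1000).fit(X[labelled], y[labelled])
    probs = model.predict_proba(X[pool])
    margin = np.abs(probs[:, 1] - 0.5)      # small margin = high uncertainty
    query = pool.pop(int(np.argmin(margin)))
    labelled.append(query)                  # the "oracle" reveals y[query]

print("accuracy:", model.score(X, y))
```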

References:
  • Settles (2012) [http://active-learning.net/, see http://burrsettles.com/pub/settles.activelearning.pdf for a freely accessible prequel]

TJ2: Best of both worlds? Combinations of Gaussian processes and neural networks

Tutor: Ti John

Gaussian processes (GPs) are good at accurately quantifying uncertainty, allow for incorporating prior knowledge, and can be comparatively interpretable, but they struggle with large datasets and in high dimensions. This is exactly what neural networks (NNs) are great at. Can we get the best of both worlds? This motivates combining NNs and GPs, for example in "deep kernel learning". In this topic we explore the recent literature on such combinations, where they work well, and what their limitations are.
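
The core construction is simple: compute the kernel on learned features, k(x, x') = k_base(g(x), g(x')). The sketch below uses a random, untrained feature map just to show the plumbing; in deep kernel learning, g's weights are trained jointly with the GP hyperparameters.

```python
# Deep-kernel sketch: an RBF kernel applied to neural-network features.
# The two-layer feature map is random here, purely to show the structure.
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(10, 32)), rng.normal(size=(32, 4))

def g(X):                                    # tiny feature extractor
    return np.tanh(np.tanh(X @ W1) @ W2)

def deep_rbf_kernel(X, Y, lengthscale=1.0):
    F, G = g(X), g(Y)
    sq = ((F[:, None, :] - G[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / lengthscale**2)

X = rng.normal(size=(5, 10))
K = deep_rbf_kernel(X, X)
print(K.shape, np.allclose(K, K.T))          # a valid symmetric Gram matrix
```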

References:

  • https://proceedings.mlr.press/v51/wilson16.html 
  • https://proceedings.mlr.press/v161/ober21a.html

TJ3: Approximate inference: how accurate is it really?

Tutor: Ti John

A common problem in probabilistic machine learning is to infer functional relationships, and Gaussian processes are popular because they also quantify the uncertainty over such relationships. To handle different observation models, such as regression in the presence of outliers, count observations, or preference rankings, we need to resort to approximate inference methods. For binary classification, Nickisch & Rasmussen (2008) provide a thorough comparison of different methods, but for other observation models, it is not clear what the trade-offs are. In this topic you will study different approximate inference methods proposed in the literature and develop an empirical comparison, which could help guide practitioners in the future.

References:

  • https://www.jmlr.org/papers/v9/nickisch08a.html

ALJ1: Explainable Empirical Risk Minimization

Tutor: Alexander Jung

Study novel methods for making machine learning more explainable. The idea is to regularize existing methods, such as linear regression or deep learning, using a penalty term that measures the amount of surprise a prediction causes for the user.

References:

  • Chapter 10 "Explainable ML" of https://github.com/alexjungaalto/MachineLearningTheBasics/blob/master/MLBasicsBook.pdf
  • Jung, A., "Explainable Empirical Risk Minimization", arXiv e-prints, 2020. https://arxiv.org/abs/2009.01492
  • A. Jung and P. H. J. Nardelli, "An Information-Theoretic Approach to Personalized Explainable Machine Learning," in IEEE Signal Processing Letters, vol. 27, pp. 825-829, 2020, doi: 10.1109/LSP.2020.2993176.

ALJ2: Networked Federated Learning

Tutor: Alexander Jung

Some application domains generate distributed collections of local datasets (such as audio recordings of smartphones during a pandemic). This project studies theory and algorithms for learning a tailored model for each local dataset by leveraging similarities between the statistical properties of the local datasets (nearby people are more likely to have the same Covid-19 infection status).

References:
  • A. Jung, "Federated Learning over Networks for Pandemics", LiveProject Manning.com, 2021 https://www.manning.com/liveprojectseries/federated-learning-ser
  •  A. Jung, "Networked Exponential Families for Big Data Over Networks," in IEEE Access, vol. 8, pp. 202897-202909, 2020, doi: 10.1109/ACCESS.2020.3033817. 

ALJ3: Federated Learning in Block Models  

Tutor: Alexander Jung

Most work on federated learning focuses on methodology and basic convergence properties. However, a fine-grained analysis of computational and statistical aspects of federated learning is missing. This seminar topic aims at exploring these aspects for simple toy models such as stochastic block models.

References:
  • Y. Sarcheshmehpour, M. Leinonen and A. Jung, "Federated Learning from Big Data Over Networks," ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 3055-3059, doi: 10.1109/ICASSP39728.2021.9414903. 
  •  A. Jung, "Clustering in Partially Labeled Stochastic Block Models via Total Variation Minimization," 2020 54th Asilomar Conference on Signals, Systems, and Computers, 2020, pp. 731-735, doi: 10.1109/IEEECONF51394.2020.9443311.
  •  A. Jung and N. Tran, "Localized Linear Regression in Networked Data," in IEEE Signal Processing Letters, vol. 26, no. 7, pp. 1090-1094, July 2019, doi: 10.1109/LSP.2019.2918933.


ALJ4: Explainable Empirical Risk Minimization

Tutor: Alexander Jung

A key challenge in ML applications is the transparency or explainability of ML predictions. We have recently proposed to model explanations as a form of communication between the ML method and the user. The effect of providing an explanation is to reduce the user's uncertainty about the prediction. This research topic aims at developing explainable ML methods that use the reduction in uncertainty as a regularization term.

References:

  • A. Jung and P. H. J. Nardelli, "An Information-Theoretic Approach to Personalized Explainable Machine Learning," in IEEE Signal Processing Letters, vol. 27, pp. 825-829, 2020, doi: 10.1109/LSP.2020.2993176. 
  • Jung, A., "Explainable Empirical Risk Minimization", arXiv e-prints, 2020. https://arxiv.org/abs/2009.01492


KKH: Approximation Algorithms for Clustering Problems

Tutor: Kamyar Khodamoradi 

Clustering is a central problem in many disciplines of science in which the high-level task is to group similar items together. Variations of this problem, such as k-means clustering, have received special attention in machine learning applications in recent years. From the theoretical point of view, a vast and rich literature has developed over the past decades that tackles clustering as an optimization problem. Naturally, since in most settings solving the clustering problem exactly is computationally expensive (say, NP-hard), many have looked at the approximation version of the question: finding algorithms that can efficiently compute provably good solutions, with mathematical guarantees on the quality of the approximation. In this seminar topic, we will explore some of the seminal works in this line of research.
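
As a taste of the first reference, here is a sketch of single-swap local search for k-median (with centers restricted to input points): repeatedly swap one center for one candidate whenever the swap lowers the cost. This simple loop already achieves a constant-factor approximation.

```python
# Single-swap local search for k-median on points in the plane.
import numpy as np

def cost(points, centers):
    d = np.linalg.norm(points[:, None, :] - points[centers][None, :, :], axis=2)
    return d.min(axis=1).sum()            # sum of distances to nearest center

def local_search(points, k, seed=0):
    rng = np.random.default_rng(seed)
    centers = list(rng.choice(len(points), size=k, replace=False))
    improved = True
    while improved:                       # repeat until no swap helps
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in centers:
                    continue
                trial = centers[:i] + [cand] + centers[i + 1:]
                if cost(points, trial) < cost(points, centers):
                    centers, improved = trial, True
    return centers

pts = np.random.default_rng(1).normal(size=(60, 2))
C = local_search(pts, k=3)
print(C, cost(pts, C))
```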

References:
  • Local Search Heuristics for k-Median and Facility Location Problems. https://epubs.siam.org/doi/10.1137/S0097539702416402
  • Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and Lagrangian relaxation. https://dl.acm.org/doi/10.1145/375827.375845
  • Approximation schemes for Euclidean k-medians and related problems. https://dl.acm.org/doi/abs/10.1145/276698.276718
  • Approximating k-median by pseudo-approximation. https://epubs.siam.org/doi/10.1137/130938645


MK: A survey of deterministic networking

Tutor: Miika Komu

The task is to survey deterministic networking (DetNet) in the academic and standardization literature.

References:

  • https://datatracker.ietf.org/doc/html/rfc8557
  • https://datatracker.ietf.org/doc/html/rfc8655
  • https://datatracker.ietf.org/doc/html/draft-ietf-detnet-bounded-latency
  • Last Spring, the course included a seminar paper on real-time networking that focused on Time Sensitive Networking (TSN). The paper is good background reading material.




NHK: Animating Virtual Characters with Deep Neural Networks

Tutor: Nam Hee Kim

Neural networks have revolutionized computer animation. In this project, we dive deep into recent advances in character animation and robotic control, namely deep reinforcement learning (DRL) for physics-based character animation and data-driven kinematic control. The main objectives of this project are (1) to review the challenges in character animation and (2) to understand the fundamental components of modern, deep learning-based character control systems. Bonus: the student may get hands-on experience implementing one of the systems studied.

References:

  • http://www.theorangeduck.co.uk/media/uploads/motioncnn.pdf
  •  Holden et al., A deep learning framework for character motion synthesis and editing: http://www.daniel-holden.com/media/uploads/motionsynthesis.pdf 
  •  Holden et al., Phase-functioned neural networks for character control: https://core.ac.uk/download/pdf/131072916.pdf 
  • Bergamin et al., DReCon: data-driven responsive control of physics-based characters: https://static-wordpress.akamaized.net/montreal.ubisoft.com/wp-content/uploads/2019/11/13214229/DReCon.pdf 
  •  Fussell et al., SuperTrack: motion tracking for physically simulated characters using supervised learning: https://dl.acm.org/doi/pdf/10.1145/3478513.3480527 
  •  Starke et al., Mode-adaptive neural networks for quadruped motion control: https://core.ac.uk/download/pdf/160483483.pdf 
  •  Starke et al., Neural state machine for character-scene interactions: http://www.ipab.inf.ed.ac.uk/cgvu/nsm.pdf 
  •  Won and Lee, Learning body shape variation in physics-based characters: https://mrl.snu.ac.kr/publications/ProjectMorphCon/MorphCon.pdf 
  •  Won et al., A scalable approach to control diverse behaviors for physically simulated characters: https://research.fb.com/wp-content/uploads/2020/06/A-Scalable-Approach-to-Control-Diverse-Behaviors-for-Physically-Simulated-Characters.pdf 
  •  Won et al., Control strategies for physically simulated characters performing two-player competitive sports: https://dl.acm.org/doi/pdf/10.1145/3450626.3459761
  • Xie et al., ALLSTEPS: Curriculum-driven Learning of Stepping Stone Skills: https://arxiv.org/pdf/2005.04323
  • Ling et al., Character controllers using motion VAEs: https://arxiv.org/pdf/2103.14274
  •  Reda and Tao et al., Learning to locomote: Understanding how environment design matters for deep reinforcement learning: https://arxiv.org/pdf/2010.04304

ZRY: Non-stationary Human Preferences in Human-AI teams

Tutor: Zeinab R. Yousefi

One main challenge in human-AI teams is predicting the intents of a human team member through observations of the human's behavior. Inverse Reinforcement Learning (IRL) is an approach to imitation learning whose core idea is to extract a reward function from observed, optimal behavior. The underlying motivation of IRL is that a reward function is a very compact and informative representation of expert behavior and usually generalizes well to new situations. However, such approaches typically assume that the human's intent is stationary and that its dynamics do not change over time. In practice, the environment in which the data collection process is carried out might change over time, as might the policy demonstrated by the expert. Thus, a viable IRL method must identify the time points at which the agent's intention changes and deal with them appropriately. In this topic, you will carry out a comprehensive literature survey of the current state of the art in this area. Furthermore, you can engage in a small implementation in the aforementioned context.

Prerequisites:

  • Basic understanding of reinforcement learning and probabilities

VR1: Explaining deep neural networks using Neuro Symbolic Integration

Tutor: Vishnu Raj

Even though deep learning has proven itself to be a great tool for solving complex machine learning problems, concerns about trust, safety, interpretability, and accountability hinder its adoption for critical problems, including autonomous systems, healthcare solutions, etc. Multiple experts have pointed out that in order to build a rich AI system, which is semantically sound, explainable, and ultimately trustworthy, one needs to include a sound reasoning layer in combination with deep learning. Neuro-Symbolic Integration (NSI) is the area of research which aims to tie together the robust learning capabilities of deep neural networks with the excellent reasoning capabilities of symbolic representations. Deep NSI builds on the idea that predictions from deep neural network models can be explained via symbolic rules constructed from the intermediate layer activations and/or by mapping the outputs to the inputs using simple rules. This can help in interpreting the model output as well as in understanding model risks. This project aims at exploring the latest literature in deep Neuro-Symbolic Integration, applying these methods to interpret existing models' predictions, and finally exploring the idea of combining existing feature-visualization methods such as Grad-CAM with derived symbolic rules. Preferred background: interest in deep learning and interpretability, and experience in using one of the popular deep learning frameworks (TensorFlow/PyTorch).

References:
[1] Townsend, J., Chaton, T. and Monteiro, J.M., 2019. Extracting relational explanations from deep neural networks: A survey from a neural-symbolic perspective. IEEE Transactions on Neural Networks and Learning Systems, 31(9), pp. 3456-3470.
[2] Sarker, M.K., Zhou, L., Eberhart, A. and Hitzler, P., 2021. Neuro-Symbolic Artificial Intelligence: Current Trends. arXiv preprint arXiv:2105.05330.
[3] Garcez, Artur d'Avila, and Luis C. Lamb. Neurosymbolic AI: the 3rd Wave. arXiv preprint arXiv:2012.05876 (2020).
[4] Jang, S.I., Girard, M.J. and Thiery, A.H., 2021. Explainable Diabetic Retinopathy Classification Based on Neural-Symbolic Learning.
[5] Cingillioglu, N. and Russo, A., 2021. pix2rule: End-to-end Neuro-symbolic Rule Learning. arXiv preprint arXiv:2106.07487.
[6] Bennetot, A., Laurent, J.L., Chatila, R. and Díaz-Rodríguez, N., 2019. Towards explainable neural-symbolic visual reasoning. arXiv preprint arXiv:1909.09065.

VR2: Gradient estimators for Implicit probabilistic models

Tutor: Vishnu Raj

Implicit models allow for the generation of samples but not for pointwise evaluation of probabilities. These models include data simulators, generative adversarial networks, and approximate inference techniques relying on implicit distributions. However, training such models from data is challenging due to the intractable posterior distribution and the associated problems with estimating gradients. Recently, a new set of gradient estimators has been proposed that can compute the gradients of model parameters based only on samples. Based on Stein's identity, these methods use kernel methods to approximate the gradients and have been found to be successful in multiple real-world problems. This project aims to conduct a thorough literature survey on Stein's gradient estimator and its recent variants. Simulation studies will be conducted to evaluate the effect of different kernels on the quality of the gradient approximation. Preferred background: interest in Bayesian deep learning models, and interest in implementing models in one of the popular deep learning frameworks.
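
As a preview, the sketch below implements what is arguably the simplest version of the Stein gradient estimator from reference [1] with an RBF kernel, recovering the score from samples alone; it is checked on a standard normal, whose true score is -x. The linear-algebra form here is my reading of the paper and should be treated as illustrative.

```python
# Stein gradient estimator sketch (RBF kernel): estimate the score
# g(x) = d/dx log q(x) from samples X by solving (K + eta*I) G = -b,
# where b_j = sum_i grad_{x_i} k(x_i, x_j).
import numpy as np

def stein_score(X, sigma=1.0, eta=0.01):
    diff = X[:, None, :] - X[None, :, :]                 # (n, n, d)
    K = np.exp(-(diff ** 2).sum(-1) / (2 * sigma**2))    # RBF Gram matrix
    # grad of k(x_i, x_j) w.r.t. x_i is -K_ij * (x_i - x_j) / sigma^2
    b = (-K[:, :, None] * diff / sigma**2).sum(axis=0)   # sum over i: (n, d)
    return -np.linalg.solve(K + eta * np.eye(len(X)), b)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))                # samples from N(0, 1)
G = stein_score(X)
print(np.corrcoef(G[:, 0], -X[:, 0])[0, 1])  # close to 1: estimate tracks -x
```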

References:
[1] Li, Y. and Turner, R.E., 2017. Gradient estimators for implicit models. arXiv preprint arXiv:1705.07107.
[2] Shi, J., Sun, S. and Zhu, J., 2018. A spectral approach to gradient estimation for implicit distributions. In International Conference on Machine Learning (pp. 4644-4653). PMLR.
[3] Shi, J., Sun, S. and Zhu, J., 2017. Kernel implicit variational inference. arXiv preprint arXiv:1705.10119.
[4] Zhou, Y., Shi, J. and Zhu, J., 2020. Nonparametric score estimators. In International Conference on Machine Learning (pp. 11513-11522). PMLR.


SS: CNN over unordered sets

Tutor: Stephan Sigg

Traditional convolutional neural networks assume an implicit ordering over their inputs. This seminar topic is to study and report about approaches to apply CNN structures to unordered sets of points.

Point clouds are often referred to as unordered sets of points and mark one important class of unordered inputs. However, point clouds still inherit an order through their 3D coordinates.

The student is to give a concise presentation on point-cloud processing approaches (e.g. point-based, voxel-based, graph-based) and, in addition, to conceptually discuss CNNs for 'bag of ...' types of inputs.
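
The standard permutation-invariant construction (popularized by PointNet) is worth seeing once: apply a shared map to each point independently, then pool with a symmetric function. A minimal NumPy sketch:

```python
# Permutation-invariance sketch: shared per-point map + symmetric (max)
# pooling, so any reordering of the points yields the same set feature.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))                 # shared one-layer "MLP"

def set_feature(points):                     # points: (n_points, 3)
    h = np.maximum(points @ W, 0.0)          # same map applied to every point
    return h.max(axis=0)                     # order-independent pooling

cloud = rng.normal(size=(128, 3))
shuffled = cloud[rng.permutation(len(cloud))]
print(np.allclose(set_feature(cloud), set_feature(shuffled)))  # True
```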

References:

  • https://arxiv.org/pdf/1905.08705


YT: Image generation with generative models

Tutor: Yu Tian

Images can be generated by a model trained to produce new examples that plausibly come from an existing distribution of samples, such as new images that are similar to, but specifically different from, a dataset of existing images. Images can also be generated semantically, according to text descriptions.

References:
  • https://arxiv.org/pdf/2103.04922.pdf 
  •  https://link.springer.com/chapter/10.1007/978-3-030-22885-9_1 
  • https://towardsdatascience.com/image-generation-in-10-minutes-with-generative-adversarial-networks-c2afc56bfa3b

VTB1: Modeling CSMA/CA

Tutor: Verónica Toro-Betancur

A channel access method determines the transmission dynamics in a wireless network. That is, it defines the steps that a device has to take before transmitting a data packet, e.g., listening to the channel first and, if it is idle, transmitting the packet. These methods aim to minimize collisions between packets in the network. Several channel access methods have been proposed and adopted in real-world deployments. One of the most popular is CSMA/CA (carrier-sense multiple access with collision avoidance), as it is the default method used by WiFi and ZigBee. Research involving any of these technologies benefits from a complete model of the CSMA/CA technique. The goal of this topic is to review existing CSMA/CA models and highlight their complexity and accuracy. Students interested in this topic should be willing to dig into (sometimes) complex mathematical formulations.
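
To convey what the models have to capture, here is a toy slotted simulation of CSMA/CA-style binary exponential backoff (all constants are illustrative; analytical models such as Bianchi's Markov chain for 802.11 derive throughput from exactly this behaviour).

```python
# Toy slotted CSMA/CA simulation: nodes count down random backoffs; a slot
# with exactly one transmitter succeeds, simultaneous transmitters collide
# and double their contention windows (binary exponential backoff).
import random

def simulate(n_nodes=5, packets_per_node=20, cw_min=8, cw_max=256, seed=0):
    random.seed(seed)
    queue = [packets_per_node] * n_nodes
    cw = [cw_min] * n_nodes
    backoff = [random.randrange(cw_min) for _ in range(n_nodes)]
    slots = collisions = 0
    while any(queue):
        slots += 1
        ready = [i for i in range(n_nodes) if queue[i] and backoff[i] == 0]
        if len(ready) == 1:                       # success
            i = ready[0]
            queue[i] -= 1
            cw[i] = cw_min
            backoff[i] = random.randrange(cw[i])
        elif len(ready) > 1:                      # collision
            collisions += 1
            for i in ready:
                cw[i] = min(2 * cw[i], cw_max)
                backoff[i] = random.randrange(cw[i])
        for i in range(n_nodes):                  # everyone else counts down
            if i not in ready and queue[i] and backoff[i] > 0:
                backoff[i] -= 1
    return slots, collisions

print(simulate())   # (slots used, collision slots)
```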

References:

  • Xin Wang and K. Kar, "Throughput modelling and fairness issues in CSMA/CA based ad-hoc networks," Proceedings IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies., 2005, pp. 23-34 vol. 1, doi: 10.1109/INFCOM.2005.1497875. 
  •  Gamal, M., Sadek, N., Rizk, M. R., & Ahmed, M. A. E. (2020). Optimization and modeling of modified unslotted CSMA/CA for wireless sensor networks. Alexandria Engineering Journal, 59(2), 681-691. 
  • Peregudov, M. A. E., Steshkovoy, A. S., & Boyko, A. A. (2018). Probabilistic random multiple access procedure model to the CSMA/CA type medium. Informatics and Automation, 59, 92-114. 
  • Busson, A., & Chelius, G. (2009, October). Point processes for interference modeling in csma/ca ad-hoc networks. In Proceedings of the 6th ACM symposium on Performance evaluation of wireless ad hoc, sensor, and ubiquitous networks (pp. 33-40). 
  •  Wang, F., Li, D., & Zhao, Y. (2011). Analysis of csma/ca in IEEE 802.15. 4. IET communications, 5(15), 2187-2195.

VTB2: How to improve ADR?

Tutor: Verónica Toro-Betancur

LoRa is a very popular wireless communication technology that has been widely deployed around the world. LoRaWAN is the specification for LoRa networks; it defines the Adaptive Data Rate (ADR) mechanism for assigning, to the LoRa devices, the communication parameters that increase network capacity and decrease energy consumption. This mechanism is widely used in real-world networks; however, it is known that such networks do not achieve satisfactory performance under dynamic scenarios, for instance under wireless channels with high variance. The goal of this topic is to review the many variants of ADR that have been proposed in the literature. Moreover, the student should propose ways to improve the performance of ADR under varying channel conditions.
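
For orientation, the sketch below shows the SNR-margin logic at the heart of the network-server side of ADR: spare link margin is converted into spreading-factor steps first, then into transmit-power reductions. The demodulation-floor and margin constants are rough illustrations, not taken from any particular implementation.

```python
# Sketch of ADR's SNR-margin logic (constants are rough illustrations).
def adr_decision(snr_max_db, sf, tx_power_dbm, device_margin_db=10.0):
    snr_floor = {7: -7.5, 8: -10.0, 9: -12.5, 10: -15.0, 11: -17.5, 12: -20.0}
    margin = snr_max_db - snr_floor[sf] - device_margin_db
    steps = int(margin // 3)                  # one step per ~3 dB of margin
    while steps > 0 and sf > 7:               # first: lower SF (faster rate)
        sf -= 1
        steps -= 1
    while steps > 0 and tx_power_dbm > 2:     # then: lower transmit power
        tx_power_dbm -= 2
        steps -= 1
    return sf, tx_power_dbm

# A device heard with 0 dB max SNR at SF12 can move to SF9 at full power:
print(adr_decision(snr_max_db=0.0, sf=12, tx_power_dbm=14))   # -> (9, 14)
```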

References:
  •  Kim, D. Y., Kim, S., Hassan, H., & Park, J. H. (2017). Adaptive data rate control in low power wide area networks for long range IoT services. Journal of computational science, 22, 171-178. 
  •  Slabicki, M., Premsankar, G., & Di Francesco, M. (2018, April). Adaptive configuration of LoRa networks for dense IoT deployments. In NOMS 2018-2018 IEEE/IFIP Network Operations and Management Symposium (pp. 1-9). IEEE. 
  • Benkahla, N., Tounsi, H., Ye-Qiong, S. O. N. G., & Frikha, M. (2019, June). Enhanced ADR for LoRaWAN networks with mobility. In 2019 15th International Wireless Communications & Mobile Computing Conference (IWCMC) (pp. 1-6). IEEE. 
  • Kufakunesu, R., Hancke, G. P., & Abu-Mahfouz, A. M. (2020). A survey on Adaptive Data Rate optimization in LoRaWAN: Recent solutions and major challenges. Sensors, 20(18), 5044.


AV1: Object tracking for mobile augmented reality

Tutor: Ashutosh Vaishnav


Real-time object tracking is a key feature required for realizing immersive augmented reality (AR) applications. A wide array of methods have been proposed to solve this problem, including deep learning and motion vector-based tracking. The key bottlenecks in utilizing these methods for mobile AR are tight latency constraints and limited hardware capabilities.

References:

  • http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.112.8588&rep=rep1&type=pdf 
  • https://arxiv.org/pdf/2006.13194.pdf
  •  https://ieeexplore-ieee-org.libproxy.aalto.fi/stamp/stamp.jsp?tp=&arnumber=1046629&tag=1

AV2: Compressing deep neural networks for efficient inference

Tutor: Ashutosh Vaishnav

Deep neural networks (DNNs) today give the best performance for a wide array of prediction and classification problems. However, their large size and over-parameterization limit their usability for use cases involving lightweight hardware and tight latency constraints, e.g. mobile cross reality. Several methods have been proposed to reduce their inference latency and storage footprint by compressing DNNs.
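
One of the classic techniques from the first reference (Deep Compression) is magnitude pruning, which the sketch below applies to a toy weight matrix; real pipelines retrain after pruning and combine it with quantization and encoding.

```python
# Magnitude-pruning sketch: zero out the smallest-magnitude weights.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256))               # a toy dense layer

def prune(weights, sparsity=0.9):
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold       # keep only the largest 10%
    return weights * mask, mask

W_pruned, mask = prune(W)
print(f"kept {mask.mean():.0%} of the weights")
```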

References:

  • https://arxiv.org/abs/1510.00149 
  • https://openaccess.thecvf.com/content_iccv_2017/html/He_Channel_Pruning_for_ICCV_2017_paper.html 
  • https://openaccess.thecvf.com/content_cvpr_2016/papers/Wu_Quantized_Convolutional_Neural_CVPR_2016_paper.pdf


AYJ: Microservices - when and how to use them

Tutor: Antti Ylä-Jääski

Microservice architecture is a modern approach to software system design in which the functionality of the system is divided into small independent units. Microservice systems differ from more traditional monolithic systems in many ways, some of which are unexpected. The cost of a migration from a monolithic system to a system based on microservices is often substantial, so this decision needs to be carefully evaluated. Microservices have become very popular in recent years. An increasing number of companies (e.g., Amazon, Netflix, LinkedIn) are moving towards dismantling their existing monolithic applications in favor of distributed microservice systems. As with any big software project, migrating to a microservice architecture often requires considerable investment. In this project work, you will discuss the benefits and drawbacks of adopting a microservice architecture in comparison to a monolithic architecture. Another option is to describe and discuss how a service mesh provides containers and microservice-based applications with services within the compute cluster.

SZ: Data-efficient Imitation Learning

Tutor: Shibei Zhu

Reinforcement learning (RL) has proven to be effective in many domains, including games, robotics, and even natural language processing. It relies on a reward mechanism, whereby an RL agent learns how to act by maximising its reward signal via trial and error. In reality, designing a good reward function is not straightforward and is often task-specific, which may require some prior domain knowledge. Imitation learning (IL) provides an alternative to this by using human demonstrations instead of a reward function. In this project, you will first conduct a survey of current IL (or behaviour cloning) methods. Then, you will design a data-efficient IL algorithm that can be tested on some current benchmarks.
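
The baseline every IL survey starts from is behaviour cloning, i.e. plain supervised learning from demonstrated state-action pairs; the sketch below uses a toy "expert" and scikit-learn (everything here is illustrative, and this approach's hunger for demonstrations is exactly what data-efficient methods attack).

```python
# Behaviour-cloning sketch: fit a classifier from states to expert actions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
states = rng.uniform(-1, 1, size=(500, 2))        # demonstrated states
actions = (states[:, 0] > 0).astype(int)          # toy expert's actions

policy = LogisticRegression().fit(states, actions)   # supervised imitation
test = rng.uniform(-1, 1, size=(100, 2))
print("agreement with expert:",
      policy.score(test, (test[:, 0] > 0).astype(int)))
```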

References:

  • https://arxiv.org/abs/1703.07326
  •  https://dl.acm.org/doi/abs/10.1145/3054912 
  • https://papers.nips.cc/paper/2016/hash/cc7e2b878868cbae992d1fb743995d8f-Abstract.html


LG: Tools and techniques for formal verification

Tutor: Lachlan Gunn

The extreme complexity of software systems means that not only will implementation errors slip in, but seemingly minor errors can build up, causing emergent behaviour in the larger system that significantly deviates from the expectations of its designers. One way to mitigate this phenomenon is to use formal methods to mathematically prove the important high-level properties of the system, ensuring that such an accumulation of faults cannot occur. Tools exist to perform formal verification of many parts of practical systems: pure software, cryptographic protocols, and functional specifications. In this topic, you will investigate the state of the art in formal verification, identifying practical approaches to verifying one or several varieties of systems.

References:

Cryptographic verification tools: 

  • Tamarin: https://tamarin-prover.github.io/ 
  • Easycrypt: https://github.com/EasyCrypt/easycrypt
 Software specification/verification tools:
  •  F-Star: https://www.fstar-lang.org/ 
  •  B-method: https://www.atelierb.eu/en/presentation-of-the-b-method/ 
Generic proof tools: 
  • Isabelle: https://isabelle.in.tum.de/ 
  • Coq: https://coq.inria.fr/ 
Applications: 
  • HACL* cryptographic library: https://doi.org/10.1145/3133956.3134043
  • Paris metro signalling system: https://doi.org/10.1109/MS.1994.1279941 
  • seL4 microkernel: https://sel4.systems/


SS1: Biometric authentication beyond fingers and eyes

Tutor: Sanna Suoranta

The first biometric authentication method that comes to mind is usually based on fingerprints or the iris of an eye. However, there are a lot of other biometric methods that can be used for authentication. Which of them would be handiest to use with a computer and/or mobile device, and most acceptable for everyday use or some specific use?

References:

  • Yuxin Chen, Zhuolin Yang, Ruben Abbou, Pedro Lopes, Ben Y. Zhao, and Haitao Zheng. User Authentication via Electrical Muscle Stimulation. CHI'21. https://doi.org/10.1145/3411764.3445441
  • Zhang Rui and Zheng Yan. A Survey on Biometric Authentication: Toward Secure and Privacy-Preserving Identification. IEEE Access, vol 7, 2018. DOI: 10.1109/ACCESS.2018.2889996

SS2: Continuous Behavioral Authentication

Tutor: Sanna Suoranta

Some techniques allow user authentication as a byproduct of the normal usage of a device. These technologies can be based on, e.g., keystrokes, swipe gestures, or walking habits. The aim of this article is to describe the current state of the art in continuous authentication for a specific method and a specific use case, chosen by the student (and tutor).

References:

  • Soumik Mondal and Patrick Bours. A study on continuous authentication using a combination of keystroke and mouse biometrics. Neurocomputing, vol 230, 22 March 2017, pages 1-22. 
  • Ioannis C. Stylios, Olga Thanou, Iosif Androulidakis and Elena Zaitseva. A Review of Continuous Authentication Using Behavioral Biometrics. SEEDA-CECNSM '16: Proceedings of the SouthEast European Design Automation, Computer Engineering, Computer Networks and Social Media Conference, September 2016, pages 72–79. https://doi.org/10.1145/2984393.2984403

BL: Adversarial attacks and defenses

Tutor: Blerta Lindqvist

An overview of the strongest current adversarial attacks and the best current defenses.
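
To fix terminology, here is the fast gradient sign method (FGSM), one of the earliest attacks, sketched on a logistic-regression model where the input gradient has a closed form; attacks on deep networks obtain the same gradient by backpropagation. All parameters are toy values.

```python
# FGSM sketch on logistic regression: perturb the input a small step in the
# sign of the loss gradient to push the prediction toward the wrong class.
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=5), 0.1            # a "trained" model's parameters
x, y = rng.normal(size=5), 1              # an input with true label 1

def predict(x):
    return 1 / (1 + np.exp(-(w @ x + b)))  # P(class = 1)

grad_x = (predict(x) - y) * w             # d(cross-entropy loss)/dx
x_adv = x + 0.5 * np.sign(grad_x)         # epsilon = 0.5 perturbation

print(f"clean: {predict(x):.2f}  adversarial: {predict(x_adv):.2f}")
```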


References:

  • https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html


WM: Task Allocation for Vehicular Fog Computing

Tutor: Wencan Mao

In the fog/edge computing environment, task allocation is about deciding where to offload the tasks generated by users, so as to improve the quality of service (QoS) received by the users and achieve better techno-economic performance. Vehicular fog computing (VFC) has been proposed to complement stationary fog nodes with mobile ones carried by vehicles, such as buses, taxis, and drones. Task allocation for VFC becomes challenging due to the mobility of the vehicles, including both the users and the vehicular fog nodes. Both classic methods (e.g., optimization) and new methods (e.g., reinforcement learning) have been used for task allocation in the VFC environment. Please review these papers and compare them based on their characteristics (e.g., scenario, methods, objective, constraints, inputs, outputs, time complexity, etc.).
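
As a baseline to compare the surveyed methods against, the sketch below does greedy latency-aware allocation with capacities (the node names, latencies, and capacities are invented); the papers below add mobility, deadlines, and learned policies on top of this kind of heuristic.

```python
# Greedy task-allocation sketch: offload each task to the feasible fog node
# with the lowest estimated latency, falling back to the remote cloud.
def allocate(tasks, nodes):
    """tasks: [(name, demand)]; nodes: {name: {"cap": c, "latency_ms": l}}"""
    placement = {}
    for task, demand in sorted(tasks, key=lambda t: -t[1]):  # big tasks first
        feasible = [n for n, v in nodes.items() if v["cap"] >= demand]
        if not feasible:
            placement[task] = "cloud"
            continue
        best = min(feasible, key=lambda n: nodes[n]["latency_ms"])
        nodes[best]["cap"] -= demand
        placement[task] = best
    return placement

nodes = {"bus-17": {"cap": 4, "latency_ms": 12},
         "rsu-2": {"cap": 2, "latency_ms": 8}}
print(allocate([("hazard-detect", 2), ("map-update", 3)], nodes))
```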

References:

[1] C. Zhu, J. Tao, G. Pastor, Y. Xiao, Y. Ji, Q. Zhou, Y. Li, and A. Ylä-Jääski, “Folo: Latency and quality optimized task allocation in vehicular fog computing,” IEEE Internet of Things Journal, vol. 6, no. 3, pp. 4150–4161, 2019. 

[2] X. Hou, Z. Ren, J. Wang, W. Cheng, Y. Ren, K.-C. Chen, and H. Zhang, “Reliable computation offloading for edge-computing enabled software-defined IoV,” IEEE Internet of Things Journal, vol. 7, no. 8, pp. 7097–7111, 2020.

 [3] Z. Zhou, H. Liao, X. Wang, S. Mumtaz, and J. Rodriguez, “When vehicular fog computing meets autonomous driving: Computational resource management and task offloading,” IEEE Network, vol. 34, no. 6, pp. 70–76, 2020.

 [4] C. Zhu, Y. Chiang, A. Mehrabi, Y. Xiao, A. Ylä-Jääski, and Y. Ji, “Chameleon: Latency and resolution aware task offloading for visual based assisted driving,” IEEE Transactions on Vehicular Technology, vol. 68, no. 9, pp. 9038–9048, 2019. 

[5] C. Zhu, Y.-H. Chiang, Y. Xiao, and Y. Ji, “Flexsensing: A QoI and latency-aware task allocation scheme for vehicle-based visual crowdsourcing via deep Q-Network,” IEEE Internet of Things Journal, vol. 8, no. 9, pp. 7625–7637, 2021. 

[6] J. Shi, J. Du, J. Wang, J. Wang, and J. Yuan, “Priority-aware task offloading in vehicular fog computing based on deep reinforcement learning,” IEEE Transactions on Vehicular Technology, pp. 1–1, 2020. 

[7] J. Wang, C. Jiang, K. Zhang, T. Q. S. Quek, Y. Ren, and L. Hanzo, “Vehicular sensing networks in a smart city: Principles, technologies and applications,” IEEE Wireless Communications, vol. 25, no. 1, pp.122–132, 2018.





