
...

Other systems security research themes  OTHER


Table of Contents


...

Locality-Sensitive Neural-Based Perceptual Hashing and Contextual Hashing ML & SEC

Note: this is actually two topics but in different domains (Computer Vision and Natural Language Processing).

Locality-sensitive hashing (LSH) is an embedding function that, for two similar inputs A and B, produces hashes h(A) and h(B) that are similar. This is different from hashing in the cryptographic sense, where the distribution of hashes should not reveal information about the inputs. LSH is extremely useful for efficient search and storage. However, in many domains the similarity of two inputs may be straightforward for a human to determine but not for a hashing function that is unaware of context, meaning or representation. Hence, we need solutions that can compensate for these deficiencies.
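
To make the idea concrete, here is a purely illustrative random-hyperplane LSH sketch in Python/NumPy (not part of the project description; the hash length and the random projection are arbitrary choices). Inputs that are close in cosine distance tend to receive bit strings that differ in only a few positions.

```python
import numpy as np

def make_lsh(dim, n_bits, seed=0):
    """Return a random-hyperplane LSH function for dim-dimensional vectors."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, dim))  # one hyperplane per hash bit
    def lsh(x):
        # Each bit records on which side of a hyperplane the input falls.
        return ''.join('1' if side else '0' for side in (planes @ x > 0))
    return lsh

lsh = make_lsh(dim=3, n_bits=16)
a = np.array([1.0, 0.9, 0.1])
b = np.array([1.1, 1.0, 0.0])   # similar to a -> few differing bits
c = np.array([-1.0, 0.2, 5.0])  # dissimilar   -> many differing bits
print(lsh(a), lsh(b), lsh(c))
```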

In these projects, we are going to investigate and design novel locally sensitive hash functions based on neural networks. Our focus is on two domains:

1) Perceptual hashing of images: an image that has been subjected to changes in contrast, shifting, rotation or inversion should produce a hash similar to that of the original (a simple block-mean baseline is sketched after this list).

2) Contextual hashing of text: sentences that have roughly the same meaning and similar grammatical structure should have similar hashes, e.g. "I think it was a great movie" vs. "In my opinion, this was a nice film".
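
As a point of reference for 1), below is a rough sketch in the spirit of the block-mean-value approach of [4]; the 8x8 grid, the greyscale conversion and the median threshold are illustrative assumptions rather than project requirements.

```python
import numpy as np
from PIL import Image

def block_mean_hash(path, grid=8):
    """Block-mean perceptual hash (cf. [4]): one bit per block, set if the
    block mean exceeds the median of all block means."""
    img = Image.open(path).convert('L').resize((grid * 8, grid * 8))
    pixels = np.asarray(img, dtype=np.float64)
    blocks = pixels.reshape(grid, 8, grid, 8).mean(axis=(1, 3))  # 8x8 block means
    bits = (blocks > np.median(blocks)).astype(np.uint8).ravel()
    return ''.join(map(str, bits))

def hamming(h1, h2):
    return sum(a != b for a, b in zip(h1, h2))

# Small contrast changes or shifts should give a small Hamming distance;
# unrelated images should give a distance close to half the hash length.
```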

Note: 1) There's going to be a programming pre-assignment. 2) Literature review can be done as a special assignment.

Requirements

  • MSc students in security, computer science, machine learning
  • Familiarity with both standard ML techniques and deep neural networks (NLP and image classification)
  • Good math and algorithmic skills (hashing in particular)
  • Strong programming skills in Python/Java/Scala/C/C++/JavaScript (Python preferred as the de facto language)

Nice to have

  • Industry experience in software engineering or a related field
  • Research experience
  • Familiarity with adversarial machine learning

Some references:
[1]: Markus Kettunen, Erik Härkönen, Jaakko Lehtinen. E-LPIPS: Robust Perceptual Image Similarity via Random Transformation Ensembles.

[2]: Lucas Bourtoule, Varun Chandrasekaran, Christopher Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, Nicolas Papernot. Machine Unlearning.

[3]: A Visual Survey of Data Augmentation in NLP

[4]: Bian Yang, Fan Gu, Xiamu Niu. Block Mean Value Based Image Perceptual Hashing. In IIH-MSP '06: Proceedings of the 2006 International Conference on Intelligent Information Hiding and Multimedia Signal Processing.

[5]: Locally Sensitive Hashing

For further information: Sebastian Szyller (sebastian.szyller@aalto.fi) and Prof. N. Asokan.

Stealing GANs and Other Generative Models ML & SEC

In recent years, the field of black-box model extraction has been growing in popularity [1-5]. During a black-box model extraction attack, the adversary queries the victim model sitting behind an API and obtains its predictions. It then uses the queries together with the predictions to reconstruct the model locally. Model extraction attacks are a threat to the Model-as-a-Service business model that is becoming the ubiquitous choice for ML offerings. Unfortunately, existing defense mechanisms are not sufficient and it is likely that model extraction will always be a threat [5].

So far, model extraction attacks have been limited to a) simple models for various applications or b) complex DNN classification models, in both the image and text domains. In this work, we are going to explore techniques that could steal generative models. Generative models differ from regression and classification models in that the input-output interaction between the user and the model is vastly different (e.g. style transfer).
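
For intuition, the classification-setting attack described above boils down to the loop sketched below (a hypothetical sketch: victim_predict stands in for the prediction API of some deployed model, and the query budget and surrogate architecture are arbitrary). Extending this loop to generative models, where there is no simple label to collect, is precisely the open question of this topic.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def victim_predict(x_batch):
    """Placeholder for the black-box prediction API of the victim model."""
    raise NotImplementedError

def extract_surrogate(query_pool, n_queries=10000, seed=0):
    rng = np.random.default_rng(seed)
    # 1) Pick queries from whatever data the adversary has access to.
    idx = rng.choice(len(query_pool), size=n_queries, replace=False)
    queries = query_pool[idx]
    # 2) Obtain the victim's predictions for those queries.
    labels = victim_predict(queries)
    # 3) Train a local surrogate on the (query, prediction) pairs.
    surrogate = MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=50)
    surrogate.fit(queries, labels)
    return surrogate
```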


Note: 1) There's going to be a programming pre-assignment. 2) Literature review can be done as a special assignment.

Requirements

  • MSc students in security, computer science, machine learning
  • Familiarity with both standard ML techniques and deep neural networks
  • Good math and algorithmic skills
  • Strong programming skills in Python/Java/Scala/C/C++/JavaScript (Python preferred as the de facto language)

Nice to have

  • Industry experience in software engineering or a related field
  • Research experience
  • Familiarity with adversarial machine learning

Some references:
[1]: Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. 2017. Practical black-box attacks against machine learning. In ACM Symposium on Information, Computer and Communications Security. ACM, 506–519.

[2]: Florian Tramèr, Fan Zhang, Ari Juels, Michael K Reiter, and Thomas Ristenpart. 2016. Stealing machine learning models via prediction apis. In 25th USENIX Security Symposium. 601–618.

[3]: Mika Juuti, Sebastian Szyller, Samuel Marchal, and N. Asokan. 2019. PRADA: Protecting against DNN Model Stealing Attacks. In IEEE European Symposium on Security & Privacy. IEEE, 1–16.

[4]: Tribhuvanesh Orekondy, Bernt Schiele, and Mario Fritz. 2019. Knockoff Nets: Stealing Functionality of Black-Box Models. In CVPR. 4954–4963.

[5]: Buse Atli, Sebastian Szyller, Mika Juuti, Samuel Marchal and N. Asokan 2019. Extraction of Complex DNN Models: Real Threat or Boogeyman? To appear in AAAI-20 Workshop on Engineering Dependable and Secure Machine Learning Systems

For further information: Sebastian Szyller (sebastian.szyller@aalto.fi) and Prof. N. Asokan.

Ineffectiveness of Non-label-flipping Defenses and Watermarking Schemes Against Model Extraction Attacks ML & SEC

In recent years, the field of black-box model extraction has been growing in popularity [1-5]. During a black-box model extraction attack, the adversary queries the victim model sitting behind an API and obtains its predictions. It then uses the queries together with the predictions to reconstruct the model locally. Model extraction attacks are a threat to the Model-as-a-Service business model that is becoming the ubiquitous choice for ML offerings.


Note: 1) There's going to be a programming pre-assignment. 2) Literature review can be done as a special assignment.

Requirements

  • MSc students in security, computer science, machine learning
  • Familiarity with both standard ML techniques and deep neural networks
  • Good math and algorithmic skills
  • Strong programming skills in Python/Java/Scala/C/C++/JavaScript (Python preferred as the de facto language)

Nice to have

  • Industry experience in software engineering or a related field
  • Research experience
  • Familiarity with adversarial machine learning

Some references:
[1]: Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. 2017. Practical black-box attacks against machine learning. In ACM Symposium on Information, Computer and Communications Security. ACM, 506–519.

[2]: Florian Tramèr, Fan Zhang, Ari Juels, Michael K Reiter, and Thomas Ristenpart. 2016. Stealing machine learning models via prediction apis. In 25th USENIX Security Symposium. 601–618.

[3]: Mika Juuti, Sebastian Szyller, Samuel Marchal, and N. Asokan. 2019. PRADA: Protecting against DNN Model Stealing Attacks. In IEEE European Symposium on Security & Privacy. IEEE, 1–16.

[4]: Tribhuvanesh Orekondy, Bernt Schiele, and Mario Fritz. 2019. Knockoff Nets: Stealing Functionality of Black-Box Models. In CVPR. 4954–4963.

[5]: Buse Atli, Sebastian Szyller, Mika Juuti, Samuel Marchal and N. Asokan 2019. Extraction of Complex DNN Models: Real Threat or Boogeyman? To appear in AAAI-20 Workshop on Engineering Dependable and Secure Machine Learning Systems

For further information: Sebastian Szyller (sebastian.szyller@aalto.fi) and Prof. N. Asokan.

Real-Time Universal Adversarial Attacks on Deep Policies ML & SEC

It is well known that machine learning models are vulnerable to inputs which are constructed by adversaries to force misclassification. Adversarial examples [1] have been extensively studied in image classification models using deep neural networks (DNNs). Recent work [2] shows that reinforcement learning agents using deep reinforcement learning (DRL) algorithms are also susceptible to adversarial examples, since DRL uses neural networks to train a policy for the decision-making process [3]. Existing adversarial attacks [4-8] try to corrupt the observation of the agent by perturbing the sensor readings so that the corresponding observation differs from the true environment the agent interacts with.

Current adversarial attacks against DRL agents [4-8] are generated per observation, which is infeasible in practice due to the dynamic nature of RL: creating observation-specific adversarial samples is usually slower than the agent's sampling rate, introducing a delay between old and new observations. The goal of this work is to design an approach for creating universal, additive adversarial perturbations that require almost no computation at run time. This work will focus on different pre-computed attack strategies that are capable of changing the input before it is observed by the RL agent. We will analyze the effect of universal adversarial perturbations on Atari games, continuous-control simulators and self-driving car simulators trained with different DRL algorithms.
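
To illustrate the attack model (a sketch under stated assumptions, not a prescribed implementation: it assumes the classic Gym observation-wrapper API, observations scaled to [0, 1], and a perturbation pre-computed offline and loaded from a placeholder file), the per-step cost of a universal perturbation is a single addition and clip:

```python
import gym
import numpy as np

class UniversalPerturbationWrapper(gym.ObservationWrapper):
    """Adds a fixed, pre-computed perturbation to every observation the agent
    sees; since delta is computed offline, no optimisation happens at run time."""

    def __init__(self, env, delta, epsilon=8 / 255, low=0.0, high=1.0):
        super().__init__(env)
        # Enforce the attacker's L-infinity budget on the stored perturbation.
        self.delta = np.clip(delta, -epsilon, epsilon)
        self.low, self.high = low, high

    def observation(self, obs):
        return np.clip(obs + self.delta, self.low, self.high)

# Hypothetical usage:
# env = UniversalPerturbationWrapper(gym.make("PongNoFrameskip-v4"),
#                                    delta=np.load("universal_delta.npy"))
```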

Required skills:

  • MSc students in security, computer science, machine learning, robotics or automated systems

  • Basic understanding of standard ML techniques and deep neural networks (you should have completed at least the Machine Learning: Basic Principles course from the SCI department or a similar course)

  • Good knowledge of mathematical methods and algorithmic skills

  • Strong programming skills in Python/Java/Scala/C/C++/JavaScript (Python preferred as the de facto language)

Nice to have:

  • Familiarity with one of the topics: adversarial machine learning, reinforcement learning, statistics, control theory, artificial intelligence

  • Familiarity with deep learning frameworks (PyTorch, TensorFlow, Keras, etc.)

  • Sufficient skills to work and interact in English

  • An interest in doing research and in arcade games

Some references for reading:

[1] Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. 2014. "Explaining and harnessing adversarial examples." 

[2] Huang, Sandy, et al. 2017. "Adversarial attacks on neural network policies." 

[3] Mnih, Volodymyr, et al. 2015. "Human-level control through deep reinforcement learning." Nature 518.7540: 529–533.

[4] Lin, Yen-Chen, et al. 2017. "Tactics of adversarial attack on deep reinforcement learning agents." In IJCAI 2017.

[5] Behzadan, Vahid, and Arslan Munir. 2017. "Vulnerability of deep reinforcement learning to policy induction attacks." Springer, Cham.

[6] Kos, Jernej, and Dawn Song. 2017.  "Delving into adversarial attacks on deep policies." 

[7] Tretschk, Edgar, Seong Joon Oh, and Mario Fritz. 2018. "Sequential attacks on agents for long-term adversarial goals." 

[8] Xiao, Chaowei, et al. 2019. "Characterizing Attacks on Deep Reinforcement Learning." 

For further information: Please contact Buse G. A. Tekgul (buse.atli@aalto.fi) and Prof. N. Asokan.

Guarantees of Differential Privacy in Overparameterised Models ML & SEC

...

Some references

[1]: Cynthia Dwork, Aaron Roth. 2014. Algorithmic Foundations of Differential Privacy

[2]: Reza Shokri, Vitaly Shmatikov. 2015. Privacy-Preserving Deep Learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 1310–1321.

[3]: Congzheng Song, Vitaly Shmatikov. 2019. Overlearning Reveals Sensitive Attributes

[4]: Lecuyer, M., Atlidakis, V., Geambasu, R., Hsu, D., and Jana, S. 2019. Certified Robustness to Adversarial Examples with Differential Privacy. In IEEE Symposium on Security and Privacy (SP) 2019.

[5]: M. Abadi, A. Chu, I. Goodfellow, H. Brendan McMahan, I. Mironov, K. Talwar, and L. Zhang. 2016. Deep Learning with Differential Privacy. ArXiv e-prints.

For further information: Sebastian Szyller (sebastian.szyller@aalto.fi) and Prof. N. Asokan.

...

Latent Representations in Model Extraction Attacks ML & SEC

In recent years, the field of black-box model extraction has been growing in popularity [1-5]. During a black-box model extraction attack, the adversary queries the victim model sitting behind an API and obtains its predictions. It then uses the queries together with the predictions to reconstruct the model locally. Model extraction attacks are a threat to the Model-as-a-Service business model that is becoming the ubiquitous choice for ML offerings. Unfortunately, existing defense mechanisms are not sufficient and it is likely that model extraction will always be a threat [5]. However, despite the threat, the research community does not fully understand why model extraction works and what its current shortcomings and limitations are.

In this work, we are going to explore the quality of the latent representations learned during model extraction attacks and study the relationship between the victim and the stolen models. We will investigate the impact of robustness-increasing techniques (e.g. adversarial training) on the effectiveness of model extraction and, finally, formalise the field of model extraction attacks through the lens of transfer learning.
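
As one possible starting point (our assumption, not a fixed methodology), the similarity between victim and surrogate representations could be quantified with a measure such as linear centered kernel alignment (CKA); a minimal NumPy sketch:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representation matrices of shape
    (n_examples, n_features); values near 1 indicate similar representations."""
    X = X - X.mean(axis=0)  # center each feature
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(Y.T @ X, 'fro') ** 2
    return cross / (np.linalg.norm(X.T @ X, 'fro') * np.linalg.norm(Y.T @ Y, 'fro'))

# Feed the same probe inputs through the victim and the extracted model,
# collect the activations of a chosen layer, and compare:
# similarity = linear_cka(victim_activations, surrogate_activations)
```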

Note: 1) There's going to be a programming pre-assignment. 2) Literature review can be done as a special assignment.

Requirements

  • MSc students in security, computer science, machine learning
  • familiarity with both standard ML techniques as well as deep neural networks
  • Good math and algorithmic skills
  • Strong programming skills in Python/Java/Scala/C/C++/Javascript (Python preferred as de facto language)

...

  • Industry experience in software engineering or a related field
  • Research experience
  • Familiarity with adversarial machine learning

Some references:
[1]: Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. 2017. Practical black-box attacks against machine learning. In ACM Symposium on Information, Computer and Communications Security. ACM, 506–519.

[2]: Florian Tramèr, Fan Zhang, Ari Juels, Michael K Reiter, and Thomas Ristenpart. 2016. Stealing machine learning models via prediction apis. In 25th USENIX Security Symposium. 601–618.

[3]: Mika Juuti, Sebastian Szyller, Samuel Marchal, and N. Asokan. 2019. PRADA: Protecting against DNN Model Stealing Attacks. In IEEE European Symposium on Security & Privacy. IEEE, 1–16.

[4]: Tribhuvanesh Orekondy, Bernt Schiele, and Mario Fritz. 2019. Knockoff Nets: Stealing Functionality of Black-Box Models. In CVPR. 4954–4963.

[5]: Buse Atli, Sebastian Szyller, Mika Juuti, Samuel Marchal and N. Asokan 2019. Extraction of Complex DNN Models: Real Threat or Boogeyman? To appear in AAAI-20 Workshop on Engineering Dependable and Secure Machine Learning Systems

[6]: Tero Karras, Samuli Laine, Timo Aila. A Style-Based Generator Architecture for Generative Adversarial Networks. In CVPR 2019.

For further information: Sebastian Szyller (sebastian.szyller@aalto.fi) and Prof. N. Asokan.

ROP-gadget finder for PA-protected binaries  PLATSEC

Return-oriented programming (ROP) [1,2] is an exploitation technique that enables run-time attacks to execute code in a vulnerable process in the presence of security defenses such as W⊕X memory protection policies (e.g. a NX bit). In ROP, the adversary exploits a memory vulnerability to manipulate return addresses stored on the stack, thereby altering the program's backward-edge control flow. ROP allows Turing-complete attacks by chaining together multiple gadgets, i.e., adversary-chosen sequences of pre-existing program instructions ending in a return instruction that together perform the desired operations. Identifying ROP-gadget chains useful in attacks can be automated using gadget-finding tools such as ROPGadget [3] or ROPgenerator [4].

More advanced defenses that are effective against ROP and other control-flow hijacking attacks are becoming commonplace, and modern processor architectures even deploy hardware-assisted defenses designed to thwart ROP attacks. An example is the recently added support for pointer authentication (PA) in the ARMv8-A processor architecture [5], commonly used in devices like smartphones. PA is a low-cost technique to authenticate pointers so as to resist memory vulnerabilities. It has been shown to enable practical protection against memory vulnerabilities that corrupt return addresses or function pointers. However, current PA-based defenses are vulnerable to reuse attacks, where the adversary can reuse previously observed valid protected pointers. Current implementations of PA-based return address protection in GCC and LLVM mitigate reuse attacks, but cannot completely prevent them [6].

The objective of this topic is to design and implement a ROP-gadget finder that takes into account PA-based defenses such as GCC's and LLVM's -msign-return-address (GCC < 9.0) [7] / -mbranch-protection=pac-ret[+leaf] (GCC 9.0 and newer) [8]. These defenses cryptographically bind the return addresses stored on the stack to the stack pointer value at the time the address is pushed to the stack. To exploit PA-protected return addresses in a ROP chain, the adversary must obtain signed return addresses that correspond to the value of the stack pointer at the time the ROP-gadget executes its return instruction using the reused protected address.
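
As a rough illustration of the starting point, the sketch below scans an AArch64 code blob for gadget candidates using the Capstone Python bindings [9]. It is a naive sketch only: whether PAC instructions such as retaa or autiasp are decoded by name depends on the Capstone version, and handling the stack-pointer binding described above is the actual research problem, which this snippet does not address.

```python
from capstone import Cs, CS_ARCH_ARM64, CS_MODE_ARM

def find_gadgets(code, base_addr, max_len=6):
    """Collect instruction windows ending in a return and flag those that end
    in (or contain) a PA-related instruction."""
    md = Cs(CS_ARCH_ARM64, CS_MODE_ARM)
    insns = list(md.disasm(code, base_addr))
    gadgets = []
    for i, insn in enumerate(insns):
        if insn.mnemonic in ('ret', 'retaa', 'retab'):
            window = insns[max(0, i - max_len + 1):i + 1]
            gadgets.append({
                'address': window[0].address,
                'asm': '; '.join(f'{x.mnemonic} {x.op_str}'.strip() for x in window),
                # Gadgets ending in retaa/retab or containing auti* require a
                # correctly signed return address to be usable in a chain.
                'pa_protected': insn.mnemonic != 'ret'
                                or any(x.mnemonic.startswith('auti') for x in window),
            })
    return gadgets

# Hypothetical usage on a raw .text section dump:
# with open('text_section.bin', 'rb') as f:
#     for g in find_gadgets(f.read(), base_addr=0x400000):
#         print(hex(g['address']), g['pa_protected'], g['asm'])
```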

NOTE: Part of this topic will be performed as a special assignment, which is a prerequisite for an eventual thesis topic.

Required skills:

  • Basic understanding of control-flow hijacking and ROP attacks
  • Basic understanding of the C runtime environment, assembler programming and debugging tools (GDB).
  • Strong programming skills with one or more of the following programming languages: C/C++, Python, Ruby, Rust, Go, Java, Perl (C and/or Python preferred)

Nice to have:

  • Prior experience with ARMv8-A assembler programming (AArch64 instruction set).
  • Prior experience with programming using the Capstone disassembly framework [9].
  • Basic understanding of ARM Pointer Authentication.

References:

[1]: Shacham. The geometry of innocent flesh on the bone: return-into-libc without function calls (on the x86).
      In Proceedings of the 14th ACM conference on Computer and communications security (CCS '07). ACM, New York, NY, USA, 552–561, 2007.
      DOI: https://doi.org/10.1145/1315245.1315313
[2]: Kornau. Return Oriented Programming for the ARM Architecture. MSc thesis. Ruhr-Universität Bochum. 2009.
[3]: https://github.com/JonathanSalwan/ROPgadget
[4]: https://github.com/Boyan-MILANOV/ropgenerator
[5]: Qualcomm. Pointer Authentication on ARMv8.3. Whitepaper. 2017.
[6]: Liljestrand et al. PAC it up: Towards pointer integrity using ARM pointer authentication.
      In 28th USENIX Security Symposium, USENIX Security 2019, Santa Clara, CA, USA, August 14-16, pages 177–194, 2019
[7]: Using the GNU Compiler Collection (GCC 7.10): 3.18.1 AArch64 Options. [Retrieved 2019-09-10]
[8]: Using the GNU Compiler Collection (GCC 9.10): 3.18.1 AArch64 Options. [Retrieved 2019-09-10]
[9]: https://www.capstone-engine.org/

For further information: Please contact Thomas Nyman (thomas.nyman@aalto.fi), Hans Liljestrand (hans.liljestrand@aalto.fi) and Prof. N. Asokan.

...

Byzantine fault tolerance with rich fault models  OTHER

...

For further information: Please contact Lachlan Gunn (lachlan.gunn@aalto.fi) and Prof. N. Asokan.

...

Tor hidden service geolocation (special assignment only)  OTHER

Tor is the most well-known and most-used anonymity system, based on onion routing: data is relayed through several nodes with multiple layers of encryption. Each node strips a layer of encryption and routes the message to the next node in the chain until it reaches its destination.
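
Purely as a conceptual illustration of this layering (not Tor's actual protocol: the three-hop circuit and the use of Fernet symmetric encryption are simplifying assumptions), each hop removes exactly one layer:

```python
from cryptography.fernet import Fernet

# One symmetric key per relay in a three-hop circuit (entry, middle, exit).
keys = [Fernet.generate_key() for _ in range(3)]

def wrap(message: bytes, circuit_keys):
    """Encrypt for the exit node first (innermost layer), then add one layer
    per earlier hop, so the entry node's layer ends up outermost."""
    for key in reversed(circuit_keys):
        message = Fernet(key).encrypt(message)
    return message

def relay(onion: bytes, circuit_keys):
    """Each node peels exactly one layer and forwards the remainder."""
    for hop, key in enumerate(circuit_keys):
        onion = Fernet(key).decrypt(onion)
        print(f"hop {hop}: peeled one layer, {len(onion)} bytes remain")
    return onion

assert relay(wrap(b"hello hidden service", keys), keys) == b"hello hidden service"
```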

...