We use cookies to analyze the browsing and usage of our website and to personalize your experience. You can disable these technologies at any time, but this may limit certain functionalities of the site. Read our Privacy Policy for more information.
Setting cookies
You can enable and disable the types of cookies you wish to accept. However certain choices you make could affect the services offered on our sites (e.g. suggestions, personalised ads, etc.).
Essential cookies
These cookies are necessary for the operation of the site and cannot be deactivated. (Still active)
Analytics cookies
Do you accept the use of cookies to measure the audience of our sites?
Multimedia Player
Do you accept the use of cookies to display and allow you to watch the video content hosted by our partners (YouTube, etc.)?
It is widely known that it is possible to implant backdoors into neural networks,
by which an attacker can choose an input to produce a part… (see more)icular undesirable output
(e.g.\ misclassify an image).
We propose to use \emph{meta-models}, neural networks that take another network's parameters
as input, to detect backdoors directly from model weights.
To this end we present a meta-model architecture and train it on a dataset of approx.\ 4000 clean and backdoored CNNs trained on CIFAR-10.
Our approach is simple and scalable, and is able to detect the presence of a backdoor with
It is widely known that it is possible to implant backdoors into neural networks,
by which an attacker can choose an input to produce a part… (see more)icular undesirable output
(e.g.\ misclassify an image).
We propose to use \emph{meta-models}, neural networks that take another network's parameters
as input, to detect backdoors directly from model weights.
To this end we present a meta-model architecture and train it on a dataset of approx.\ 4000 clean and backdoored CNNs trained on CIFAR-10.
Our approach is simple and scalable, and is able to detect the presence of a backdoor with
Zero-shot coordination (ZSC) is a popular setting for studying the ability of AI agents to coordinate with novel partners. Prior formulation… (see more)s of ZSC make the assumption that the problem setting is common knowledge i.e. each agent has the knowledge of the underlying Dec-POMDP, every agent knows the others have this knowledge, and so on ad infinitum. However, in most real-world situations, different agents are likely to have different models of the (real world) environment, thus breaking this assumption. To address this limitation, we formulate the _noisy zero-shot coordination_ (NZSC) problem, where agents observe different noisy versions of the ground truth Dec-POMDP generated by passing the true Dec-POMDP through a noise model. Only the distribution of the ground truth Dec-POMDPs and the noise model are common knowledge. We show that any noisy ZSC problem can be reformulated as a ZSC problem by designing a meta-Dec-POMDP with an augmented state space consisting of both the ground truth Dec-POMDP and its corresponding state. In our experiments, we analyze various aspects of NZSC and show that achieving good performance in NZSC requires agents to make use of both the noisy observations of ground truth Dec-POMDP, knowledge of each other's noise models and their interactions with the ground truth Dec-POMDP. Through experimental results, we further establish that ignoring the noise in problem specification can result in sub-par ZSC coordination performance, especially in iterated scenarios. On the whole, our work highlights that NZSC adds an orthogonal challenge to traditional ZSC in tackling the uncertainty about the true problem.
Zero-shot coordination (ZSC) is a popular setting for studying the ability of AI agents to coordinate with novel partners. Prior formulation… (see more)s of ZSC make the assumption that the problem setting is common knowledge i.e. each agent has the knowledge of the underlying Dec-POMDP, every agent knows the others have this knowledge, and so on ad infinitum. However, in most real-world situations, different agents are likely to have different models of the (real world) environment, thus breaking this assumption. To address this limitation, we formulate the _noisy zero-shot coordination_ (NZSC) problem, where agents observe different noisy versions of the ground truth Dec-POMDP generated by passing the true Dec-POMDP through a noise model. Only the distribution of the ground truth Dec-POMDPs and the noise model are common knowledge. We show that any noisy ZSC problem can be reformulated as a ZSC problem by designing a meta-Dec-POMDP with an augmented state space consisting of both the ground truth Dec-POMDP and its corresponding state. In our experiments, we analyze various aspects of NZSC and show that achieving good performance in NZSC requires agents to make use of both the noisy observations of ground truth Dec-POMDP, knowledge of each other's noise models and their interactions with the ground truth Dec-POMDP. Through experimental results, we further establish that ignoring the noise in problem specification can result in sub-par ZSC coordination performance, especially in iterated scenarios. On the whole, our work highlights that NZSC adds an orthogonal challenge to traditional ZSC in tackling the uncertainty about the true problem.
Over the past decade, there has been extensive research aimed at enhancing the robustness of neural networks, yet this problem remains vastl… (see more)y unsolved. Here, one major impediment has been the overestimation of the robustness of new defense approaches due to faulty defense evaluations. Flawed robustness evaluations necessitate rectifications in subsequent works, dangerously slowing down the research and providing a false sense of security. In this context, we will face substantial challenges associated with an impending adversarial arms race in natural language processing, specifically with closed-source Large Language Models (LLMs), such as ChatGPT, Google Bard, or Anthropic's Claude. We provide a first set of prerequisites to improve the robustness assessment of new approaches and reduce the amount of faulty evaluations. Additionally, we identify embedding space attacks on LLMs as another viable threat model for the purposes of generating malicious content in open-sourced models. Finally, we demonstrate on a recently proposed defense that, without LLM-specific best practices in place, it is easy to overestimate the robustness of a new approach.
Attention has become a common ingredient in deep learning architectures. It adds a dynamical selection of information on top of the static s… (see more)election of information supported by weights. In the same way, we can imagine a higher-order informational filter built on top of attention: an Attention Schema (AS), namely, a descriptive and predictive model of attention. In cognitive neuroscience, Attention Schema Theory (AST) supports this idea of distinguishing attention from AS. A strong prediction of this theory is that an agent can use its own AS to also infer the states of other agents' attention and consequently enhance coordination with other agents. As such, multi-agent reinforcement learning would be an ideal setting to experimentally test the validity of AST. We explore different ways in which attention and AS interact with each other. Our preliminary results indicate that agents that implement the AS as a recurrent internal control achieve the best performance. In general, these exploratory experiments suggest that equipping artificial agents with a model of attention can enhance their social intelligence.
GFlowNets have exhibited promising performance in generating diverse candidates with high rewards. These networks generate objects increment… (see more)ally and aim to learn a policy that assigns probability of sampling objects in proportion to rewards. However, the current training pipelines of GFlowNets do not consider the presence of isomorphic actions, which are actions resulting in symmetric or isomorphic states. This lack of symmetry increases the amount of samples required for training GFlowNets and can result in inefficient and potentially incorrect flow functions. As a consequence, the reward and diversity of the generated objects decrease. In this study, our objective is to integrate symmetries into GFlowNets by identifying equivalent actions during the generation process. Experimental results using synthetic data demonstrate the promising performance of our proposed approaches.
GFlowNets have exhibited promising performance in generating diverse candidates with high rewards. These networks generate objects increment… (see more)ally and aim to learn a policy that assigns probability of sampling objects in proportion to rewards. However, the current training pipelines of GFlowNets do not consider the presence of isomorphic actions, which are actions resulting in symmetric or isomorphic states. This lack of symmetry increases the amount of samples required for training GFlowNets and can result in inefficient and potentially incorrect flow functions. As a consequence, the reward and diversity of the generated objects decrease. In this study, our objective is to integrate symmetries into GFlowNets by identifying equivalent actions during the generation process. Experimental results using synthetic data demonstrate the promising performance of our proposed approaches.
Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular proc… (see more)esses. However, causal discovery in GRNs is a challenging problem for multiple reasons including the existence of cyclic feedback loops and uncertainty that yields diverse possible causal structures. Previous works in this area either ignore cyclic dynamics (assume acyclic structure) or struggle with scalability. We introduce Swift-DynGFN as a novel framework that enhances causal structure learning in GRNs while addressing scalability concerns. Specifically, Swift-DynGFN exploits gene-wise independence to boost parallelization and to lower computational cost. Experiments on real single-cell RNA velocity and synthetic GRN datasets showcase the advancement in learning causal structure in GRNs and scalability in larger systems.