Développez des compétences fondamentales en intelligence artificielle (IA) responsable grâce à des cours autodirigés, animés par des expert·e·s de Mila reconnu·e·s à l’échelle internationale.
Le Fellowship Mila en politiques de l'IA transforme l'expertise approfondie en IA en politiques rigoureuses d'intérêt public. Découvrez la dernière publication Combler la disparité en matière d’expertise : mécanismes de transfert des connaissances pour la réglementation de l’IA par Moritz von Knebel.
Ce programme soutient les startups spécialisées en IA à tout moment de l'année. Bénéficiez de ressources de pointe et d'un accompagnement sur mesure pour accélérer le développement de votre technologie.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Lecteur Multimédia
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Publications
QRelScore: Better Evaluating Generated Questions with Deeper Understanding of Context-aware Relevance
Existing metrics for assessing question generation not only require costly human reference but also fail to take into account the input cont… (voir plus)ext of generation, rendering the lack of deep understanding of the relevance between the generated questions and input contexts. As a result, they may wrongly penalize a legitimate and reasonable candidate question when it (1) involves complicated reasoning with the context or (2) can be grounded by multiple evidences in the context.In this paper, we propose QRelScore, a context-aware Relevance evaluation metric for Question Generation.Based on off-the-shelf language models such as BERT and GPT2, QRelScore employs both word-level hierarchical matching and sentence-level prompt-based generation to cope with the complicated reasoning and diverse generation from multiple evidences, respectively.Compared with existing metrics, our experiments demonstrate that QRelScore is able to achieve a higher correlation with human judgments while being much more robust to adversarial samples.
2022-11-30
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (publié)
Mobile Edge Computing (MEC) involves placing computational capability and applications at the edge of the network, providing benefits such a… (voir plus)s reduced latency, reduced network congestion, and improved performance of applications. The performance and reliability of MEC degrades significantly when the edge server(s) in the cluster are overloaded. In this work, an adaptive admission control policy to prevent edge node from getting overloaded is presented. This approach is based on a recently-proposed low complexity RL (Reinforcement Learning) algorithm called SALMUT (Structure-Aware Learning for Multiple Thresholds), which exploits the structure of the optimal admission control policy in multi-class queues for an average-cost setting. We extend the framework to work for node overload-protection problem in a discounted-cost setting. The proposed solution is validated using several scenarios mimicking real-world deployments in two different settings — computer simulations and a docker testbed. Our empirical evaluations show that the total discounted cost incurred by SALMUT is similar to state-of-the-art deep RL algorithms such as PPO (Proximal Policy Optimization) and A2C (Advantage Actor Critic) but requires an order of magnitude less time to train, outputs easily interpretable policy, and can be deployed in an online manner.
2022-11-30
IEEE Transactions on Cognitive Communications and Networking (publié)
Computational approaches to the study of language emergence can help us understand how natural languages are shaped by cognitive and sociocu… (voir plus)ltural factors. Previous work focused on tasks where agents refer to a single entity. In contrast, we study how agents predicate, that is, how they express that some relation holds between several entities. We introduce a setup where agents talk about a variable number of entities that can be partially observed by the listener. In the presence of a least-effort pressure, they tend to discuss only entities that are not observed by the listener. Thus we can obtain artificial phrases that denote a single entity, as well as artificial sentences that denote several entities. In natural languages, if we ignore the verb, phrases are usually concatenated, either in a specific order or by adding case markers to form sentences. Our setup allows us to quantify how much this holds in emergent languages using a metric we call concatenability. We also measure transitivity, which quantifies the importance of word order. We demonstrate the usefulness of this new setup and metrics for studying factors that influence argument structure. We compare agents having access to input representations structured into pre-segmented objects with properties, versus unstructured representations. Our results indicate that the awareness of object structure yields a more natural sentence organization.
2022-11-30
Transactions of the Association for Computational Linguistics (publié)
VDGraph2Vec: Vulnerability Detection in Assembly Code using Message Passing Neural Networks
Ashita Diwan
Miles Q. Li
Benjamin C. M. Fung
Software vulnerability detection is one of the most challenging tasks faced by reverse engineers. Recently, vulnerability detection has rece… (voir plus)ived a lot of attention due to a drastic increase in the volume and complexity of software. Reverse engineering is a time-consuming and labor-intensive process for detecting malware and software vulnerabilities. However, with the advent of deep learning and machine learning, it has become possible for researchers to automate the process of identifying potential security breaches in software by developing more intelligent technologies. In this research, we propose VDGraph2Vec, an automated deep learning method to generate representations of assembly code for the task of vulnerability detection. Previous approaches failed to attend to topological characteristics of assembly code while discovering the weakness in the software. VDGraph2Vec embeds the control flow and semantic information of assembly code effectively using the expressive capabilities of message passing neural networks and the RoBERTa model. Our model is able to learn the important features that help distinguish between vulnerable and non-vulnerable software. We carry out our experimental analysis for performance benchmark on three of the most common weaknesses and demonstrate that our model can identify vulnerabilities with high accuracy and outperforms the current state-of-the-art binary vulnerability detection models.
2022-11-30
International Conference on Machine Learning and Applications (publié)
Learning the causal structure of observable variables is a central focus for scientific discovery. Bayesian causal discovery methods tackle … (voir plus)this problem by learning a posterior over the set of admissible graphs that are equally likely given our priors and observations. Existing methods primarily consider observations from static systems and assume the underlying causal structure takes the form of a directed acyclic graph (DAG). In settings with dynamic feedback mechanisms that regulate the trajectories of individual variables, this acyclicity assumption fails unless we account for time. We treat causal discovery in the unrolled causal graph as a problem of sparse identification of a dynamical system. This imposes a natural temporal causal order between variables and captures cyclic feedback loops through time. Under this lens, we propose a new framework for Bayesian causal discovery for dynamical systems and present a novel generative flow network architecture (Dyn-GFN) tailored for this task. Dyn-GFN imposes an edge-wise sparse prior to sequentially build a k -sparse causal graph. Through evaluation on temporal data, our results show that the posterior learned with Dyn-GFN yields improved Bayes coverage of admissible causal structures relative to state of the art Bayesian causal discovery methods.
We present a technique for zero-shot generation of a 3D model using only a target text prompt. Without any 3D supervision our method deforms… (voir plus) the control shape of a limit subdivided surface along with its texture map and normal map to obtain a 3D asset that corresponds to the input text prompt and can be easily deployed into games or modeling applications. We rely only on a pre-trained CLIP model that compares the input text prompt with differentiably rendered images of our 3D model. While previous works have focused on stylization or required training of generative models we perform optimization on mesh parameters directly to generate shape, texture or both. To constrain the optimization to produce plausible meshes and textures we introduce a number of techniques using image augmentations and the use of a pretrained prior that generates CLIP image embeddings given a text embedding.
Different types of mental rotation tests have been used extensively in psychology to understand human visual reasoning and perception. Under… (voir plus)standing what an object or visual scene would look like from another viewpoint is a challenging problem that is made even harder if it must be performed from a single image. We explore a controlled setting whereby questions are posed about the properties of a scene if that scene was observed from another viewpoint. To do this we have created a new version of the CLEVR dataset that we call CLEVR Mental Rotation Tests (CLEVR-MRT). Using CLEVR-MRT we examine standard methods, show how they fall short, then explore novel neural architectures that involve inferring volumetric representations of a scene. These volumes can be manipulated via camera-conditioned transformations to answer the question. We examine the efficacy of different model variants through rigorous ablations and demonstrate the efficacy of volumetric representations.
We propose a geometric scattering-based graph neural network (GNN) for approximating solutions of the NP-hard maximum clique (MC) problem. W… (voir plus)e construct a loss function with two terms, one which encourages the network to find highly connected nodes and the other which acts as a surrogate for the constraint that the nodes form a clique. We then use this loss to train an efficient GNN architecture that outputs a vector representing the probability for each node to be part of the MC and apply a rule-based decoder to make our final prediction. The incorporation of the scattering transform alleviates the so-called oversmoothing problem that is often encountered in GNNs and would degrade the performance of our proposed setup. Our empirical results demonstrate that our method outperforms representative GNN baselines in terms of solution accuracy and inference speed as well as conventional solvers like Gurobi with limited time budgets. Furthermore, our scattering model is very parameter efficient with only
2022-11-28
Conference on Neural Information Processing Systems (Accept)
Most Graph Neural Networks (GNNs) predict the labels of unseen graphs by learning the correlation between the input graphs and labels. Howev… (voir plus)er, by presenting a graph classification investigation on the training graphs with severe bias, surprisingly, we discover that GNNs always tend to explore the spurious correlations to make decision, even if the causal correlation always exists. This implies that existing GNNs trained on such biased datasets will suffer from poor generalization capability. By analyzing this problem in a causal view, we find that disentangling and decorrelating the causal and bias latent variables from the biased graphs are both crucial for debiasing. Inspiring by this, we propose a general disentangled GNN framework to learn the causal substructure and bias substructure, respectively. Particularly, we design a parameterized edge mask generator to explicitly split the input graph into causal and bias subgraphs. Then two GNN modules supervised by causal/bias-aware loss functions respectively are trained to encode causal and bias subgraphs into their corresponding representations. With the disentangled representations, we synthesize the counterfactual unbiased training samples to further decorrelate causal and bias variables. Moreover, to better benchmark the severe bias problem, we construct three new graph datasets, which have controllable bias degrees and are easier to visualize and explain. Experimental results well demonstrate that our approach achieves superior generalization performance over existing baselines. Furthermore, owing to the learned edge mask, the proposed model has appealing interpretability and transferability. Code and data are available at: https://github.com/googlebaba/DisC.
2022-11-28
Conference on Neural Information Processing Systems (Accept)
Functional connectivity subtypes associate robustly with ASD diagnosis
Sebastian G. W. Urchs
Angela Tam
Pierre Orban
Clara A. Moreau
Yassine Benhajali
Hien Duy Nguyen
Alan C. Evans
Lune P Bellec
Our understanding of the changes in functional brain organization in autism is hampered by the extensive heterogeneity that characterizes th… (voir plus)is neurodevelopmental disorder. Data driven clustering offers a straightforward way to decompose autism heterogeneity into subtypes of connectivity and promises an unbiased framework to investigate behavioral symptoms and causative genetic factors. Yet, the robustness and generalizability of functional connectivity subtypes is unknown. Here, we show that a simple hierarchical cluster analysis can robustly relate a given individual and brain network to a connectivity subtype, but that continuous assignments are more robust than discrete ones. We also found that functional connectivity subtypes are moderately associated with the clinical diagnosis of autism, and these associations generalize to independent replication data. We explored systematically 18 different brain networks as we expected them to associate with different behavioral profiles as well as different key regions. Contrary to this prediction, autism functional connectivity subtypes converged on a common topography across different networks, consistent with a compression of the primary gradient of functional brain organization, as previously reported in the literature. Our results support the use of data driven clustering as a reliable data dimensionality reduction technique, where any given dimension only associates moderately with clinical manifestations.
The white matter is organized into “tracts” or “bundles,” which connect different parts of the central nervous system. Knowing where… (voir plus) these tracts are located in each individual is important for understanding the cause of potential sensorial, motor or cognitive deficits and for developing appropriate treatments. Traditionally, tracts are found using tracer injection, which is a difficult, slow and poorly scalable technique. However, axon populations from a given tract exhibit specific characteristics in terms of morphometrics and myelination. Hence, the delineation of tracts could, in principle, be done based on their morphometry. The objective of this study was to generate automatic parcellation of the rat spinal white matter tracts using the manifold information from scanning electron microscopy images of the entire spinal cord. The axon morphometrics (axon density, axon diameter, myelin thickness and g-ratio) were computed pixelwise following automatic axon segmentation using AxonSeg. The parcellation was based on an agglomerative clustering algorithm to group the tracts. Results show that axon morphometrics provide sufficient information to automatically identify some white matter tracts in the spinal cord, however, not all tracts were correctly identified. Future developments of microstructure quantitative MRI even bring hope for a personalized clustering of white matter tracts in each individual patient. The generated atlas and the associated code can be found at https://github.com/neuropoly/tract-clustering.