Publications
Linear Weight Interpolation Leads to Transient Performance Gains
Across the primate cortex, neurons that perform similar functions tend to be spatially grouped together. In high-level visual cortex, this widely observed biological rule manifests itself as a modular organization of neuronal clusters, each tuned to a specific object category. The tendency toward short connections is one of the most widely accepted views of why such an organization exists in the brains of many animals. Yet, how such a feat is implemented at the neural level remains unclear.
Here, using artificial deep neural networks as test beds, we demonstrate that a topographical organization similar to that in the primary, intermediate, and high-level human visual cortex emerges when units in these models are laterally connected and their weight parameters are tuned by top-down credit assignment. Importantly, the emergence of the modular organization in the absence of explicit topography-inducing learning rules and objectives questions their necessity and suggests that local lateral connectivity alone may be sufficient for the formation of the topographic organization across the cortex.
The issue of bias in Machine Learning (ML) models is a significant challenge for the machine learning community. Real-world biases can be embedded in the data used to train models, and prior studies have shown that ML models can learn and even amplify these biases. This can result in unfair treatment of individuals based on their inherent characteristics or sensitive attributes such as gender, race, or age. Ensuring fairness is crucial with the increasing use of ML models in high-stakes scenarios and has gained significant attention from researchers in recent years. However, the challenge of ensuring fairness becomes much greater when the assumption of full access to sensitive attributes does not hold. Settings where this assumption does not hold include cases where (1) only limited or noisy demographic information is available or (2) demographic information is entirely unobserved due to privacy restrictions. This survey reviews recent research efforts to enforce fairness when sensitive attributes are missing. We propose a taxonomy of existing works and, more importantly, highlight current challenges and future research directions to stimulate research in ML fairness in the setting of missing sensitive attributes.
Neural network training begins with a chaotic phase in which the network is sensitive to small perturbations, such as those caused by stochastic gradient descent (SGD). This sensitivity can cause identically initialized networks to diverge both in parameter space and functional similarity.
However, the exact degree to which networks are sensitive to perturbation, and the sensitivity of networks as they transition out of the chaotic phase, is unclear.
To address this uncertainty, we apply a controlled perturbation at a single point in training time and measure its effect on otherwise identical training trajectories.
We find that both the
Building world models that accurately and comprehensively represent the real world is the utmost aspiration for conditional image generative models as it would enable their use as world simulators. For these models to be successful world models, they should not only excel at image quality and prompt-image consistency but also ensure high representation diversity. However, current research in generative models mostly focuses on creative applications that are predominantly concerned with human preferences of image quality and aesthetics. We note that generative models have inference time mechanisms - or knobs - that allow the control of generation consistency, quality, and diversity. In this paper, we use state-of-the-art text-to-image and image-and-text-to-image models and their knobs to draw consistency-diversity-realism Pareto fronts that provide a holistic view of the consistency-diversity-realism multi-objective problem. Our experiments suggest that realism and consistency can both be improved simultaneously; however, there exists a clear tradeoff between realism/consistency and diversity. By looking at Pareto optimal points, we note that earlier models are better at representation diversity and worse in consistency/realism, and more recent models excel in consistency/realism while significantly decreasing representation diversity. By computing Pareto fronts on a geodiverse dataset, we find that the first version of latent diffusion models tends to perform better than more recent models in all axes of evaluation, and there exist pronounced consistency-diversity-realism disparities between geographical regions. Overall, our analysis clearly shows that there is no best model and the choice of model should be determined by the downstream application. With this analysis, we invite the research community to consider Pareto fronts as an analytical tool to measure progress towards world models.
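To make the notion of a Pareto front in the abstract above concrete, here is a minimal sketch of how non-dominated points could be selected from a set of (consistency, diversity, realism) scores. It is not the authors' code: the function names, the example scores, and the assumption that higher is better on every axis are all illustrative.

```python
from typing import List, Tuple

# Hypothetical score triple: (consistency, diversity, realism), higher is better.
Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Return True if `a` is at least as good as `b` on every axis
    and strictly better on at least one axis."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative scores for three model/knob settings (made-up numbers).
scores = [
    (0.9, 0.2, 0.8),  # strong consistency and realism, low diversity
    (0.5, 0.7, 0.5),  # more diverse, weaker on the other two axes
    (0.4, 0.6, 0.4),  # worse than the second setting on every axis
]
front = pareto_front(scores)
```

In this toy example the third setting is dominated by the second and drops out, while the first two both remain on the front because each is better on a different axis. This mirrors the abstract's point that no single setting is best on all objectives.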
This work presents a novel approach that synergistically integrates convolutional neural networks (CNNs) and Transformer models for decoding continuous fine finger motions from surface electromyography (sEMG) signals. This integration capitalizes on CNNs’ proficiency in extracting rich temporal and spatial features from multichannel sEMG data and the Transformer’s superior capability in recognizing complex patterns and long-range dependencies. A significant advancement in this field is the use of a custom-developed Epidermal Electrode Array Sleeve (EEAS) for capturing high-fidelity sEMG signals, enabling more accurate and reliable signal acquisition than traditional methods. The decoded joint angles could be used in seamless and intuitive human-machine interaction in various applications, such as virtual reality, augmented reality, robotic control, and prosthetic control. Evaluations demonstrate the superior performance of the proposed CNN-Transformer hybrid architecture in decoding continuous fine finger motions, outperforming individual CNN and Transformer models. The synergistic integration of CNNs and Transformers presents a powerful framework for sEMG decoding, offering exciting opportunities for naturalistic and intuitive human-machine interaction applications. Its robustness and efficiency make it an ideal choice for real-world applications, promising to enhance the interface between humans and machines significantly. The implications of this research extend to advancing the understanding of human neuromuscular signals and their application in computing interfaces.
2024-06-14
2024 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA) (published)
The field of temporal graph learning aims to learn from evolving network data to forecast future interactions. Given a collection of observed temporal graphs, is it possible to predict the evolution of an unseen network from the same domain? To answer this question, we first present the Temporal Graph Scaling (TGS) dataset, a large collection of temporal graphs consisting of eighty-four ERC20 token transaction networks collected from 2017 to 2023. Next, we evaluate the transferability of Temporal Graph Neural Networks (TGNNs) for the temporal graph property prediction task by pre-training on a collection of up to sixty-four token transaction networks and then evaluating the downstream performance on twenty unseen token networks. We find that the neural scaling law observed in NLP and Computer Vision also applies in temporal graph learning, where pre-training on a greater number of networks leads to improved downstream performance. To the best of our knowledge, this is the first empirical demonstration of the transferability of temporal graph learning. On downstream token networks, the largest pre-trained model outperforms single model TGNNs on thirteen unseen test networks. Therefore, we believe that this is a promising first step towards building foundation models for temporal graphs.