Publications

Policy composition in reinforcement learning via multi-objective policy optimization
Shruti Mishra
Ankit Anand
Jordan Hoffmann
Nicolas Heess
Martin A. Riedmiller
Abbas Abdolmaleki
We enable reinforcement learning agents to learn successful behavior policies by utilizing relevant pre-existing teacher policies. The teach… (see more)er policies are introduced as objectives, in addition to the task objective, in a multi-objective policy optimization setting. Using the Multi-Objective Maximum a Posteriori Policy Optimization algorithm (Abdolmaleki et al. 2020), we show that teacher policies can help speed up learning, particularly in the absence of shaping rewards. In two domains with continuous observation and action spaces, our agents successfully compose teacher policies in sequence and in parallel, and are also able to further extend the policies of the teachers in order to solve the task. Depending on the specified combination of task and teacher(s), teacher(s) may naturally act to limit the final performance of an agent. The extent to which agents are required to adhere to teacher policies are determined by hyperparameters which determine both the effect of teachers on learning speed and the eventual performance of the agent on the task. In the humanoid domain (Tassa et al. 2018), we also equip agents with the ability to control the selection of teachers. With this ability, agents are able to meaningfully compose from the teacher policies to achieve a superior task reward on the walk task than in cases without access to the teacher policies. We show the resemblance of composed task policies with the corresponding teacher policies through videos.
Sociotechnical Harms of Algorithmic Systems: Scoping a Taxonomy for Harm Reduction
Renee Shelby
Shalaleh Rismani
Kathryn Henne
Paul Nicholas
N'Mah Yilla-Akbari
Jess Gallegos
Andrew J Smart
Emilio Garcia
Gurleen Virk
What does it mean to be a responsible AI practitioner: An ontology of roles and skills
Shalaleh Rismani
With the growing need to regulate AI systems across a wide variety of application domains, a new set of occupations has emerged in the indus… (see more)try. The so-called responsible Artificial Intelligence (AI) practitioners or AI ethicists are generally tasked with interpreting and operationalizing best practices for ethical and safe design of AI systems. Due to the nascent nature of these roles, however, it is unclear to future employers and aspiring AI ethicists what specific function these roles serve and what skills are necessary to serve the functions. Without clarity on these, we cannot train future AI ethicists with meaningful learning objectives. In this work, we examine what responsible AI practitioners do in the industry and what skills they employ on the job. We propose an ontology of existing roles alongside skills and competencies that serve each role. We created this ontology by examining the job postings for such roles over a two-year period (2020-2022) and conducting expert interviews with fourteen individuals who currently hold such a role in the industry. Our ontology contributes to business leaders looking to build responsible AI teams and provides educators with a set of competencies that an AI ethics curriculum can prioritize.
Beyond performance: the role of task demand, effort, and individual differences in ab initio pilots
Mohammad Javad Darvishi Bayazi
Andrew Law
Sergio Mejia Romero
Sion Jennings
Jocelyn Faubert
From Assistive Devices to Manufacturing Cobot Swarms
Monica Li
Bruno Belzile
Ali Imran
Lionel Birglen
David St-Onge
This paper provides an overview of the latest trends in robotics research and development, with a particular focus on applications in manufa… (see more)cturing and industrial settings. We highlight recent advances in robot design, including cutting-edge collaborative robot mechanics and advanced safety features, as well as exciting developments in perception and human-swarm interaction. By examining recent contributions from Kinova, a leading robotics company, we illustrate the differences between industry and academia in their approaches to developing innovative robotic systems and technologies that enhance productivity and safety in the workplace. Ultimately, this paper demonstrates the tremendous potential of robotics to revolutionize manufacturing and industrial operations, and underscores the crucial role of companies like Kinova in driving this transformation forward.
Motion In-Betweening via Deep <inline-formula><tex-math notation="LaTeX">$\Delta$</tex-math><alternatives><mml:math><mml:mi>Δ</mml:mi></mml:math><inline-graphic xlink:href="oreshkin-ieq1-3309107.gif"/></alternatives></inline-formula>-Interpolator
Boris Oreshkin
Antonios Valkanas
Félix Harvey
Louis-Simon Ménard
Florent Bocquelet
We show that the task of synthesizing human motion conditioned on a set of key frames can be solved more accurately and effectively if a dee… (see more)p learning based interpolator operates in the delta mode using the spherical linear interpolator as a baseline. We empirically demonstrate the strength of our approach on publicly available datasets achieving state-of-the-art performance. We further generalize these results by showing that the
Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads
Salah Zaiem
Youcef Kemiche
Titouan Parcollet
Slim Essid
Efficient Epistemic Uncertainty Estimation in Regression Ensemble Models Using Pairwise-Distance Estimators
Lucas Berry
This work introduces an efficient novel approach for epistemic uncertainty estimation for ensemble models for regression tasks using pairwis… (see more)e-distance estimators (PaiDEs). Utilizing the pairwise-distance between model components, these estimators establish bounds on entropy. We leverage this capability to enhance the performance of Bayesian Active Learning by Disagreement (BALD). Notably, unlike sample-based Monte Carlo estimators, PaiDEs exhibit a remarkable capability to estimate epistemic uncertainty at speeds up to 100 times faster while covering a significantly larger number of inputs at once and demonstrating superior performance in higher dimensions. To validate our approach, we conducted a varied series of regression experiments on commonly used benchmarks: 1D sinusoidal data,
Party Prediction for Twitter
Kellin Pelrine
Anne Imouza
Zachary Yang
Jacob-Junqi Tian
Sacha Lévy
Gabrielle Desrosiers-Brisebois
Aarash Feizi
C'ecile Amadoro
André Blais
Jean-François Godbout
A comparison of reinforcement learning frameworks for software testing tasks
Paulina Stevia Nouwou Mindom
Amin Nikanjam
Multivariate Time-Series Anomaly Detection with Contaminated Data: Application to Physiological Signals
Thi Kieu Khanh Ho
Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?
Salah Zaiem
Youcef Kemiche
Titouan Parcollet
Slim Essid
Self-supervised learning (SSL) has recently allowed leveraging large datasets of unlabeled speech signals to reach impressive performance on… (see more) speech tasks using only small amounts of annotated data. The high number of proposed approaches fostered the need and rise of extended benchmarks that evaluate their performance on a set of downstream tasks exploring various aspects of the speech signal. However, and while the number of considered tasks has been growing, most rely upon a single decoding architecture that maps the frozen SSL representations to the downstream labels. This work investigates the robustness of such benchmarking results to changes in the decoder architecture. Interestingly, it appears that varying the architecture of the downstream decoder leads to significant variations in the leaderboards of most tasks. Concerningly, our study reveals that benchmarking using limited decoders may cause a counterproductive increase in the sizes of the developed SSL models.