Publications

WOODS: Benchmarks for Out-of-Distribution Generalization in Time Series

Jean-Christophe Gagnon-Audet

Kartik Ahuja

Mohammad Javad Darvishi Bayazi

Pooneh Mousavi

Guillaume Dumas

Irina Rish

2023-09-01

TMLR (accepted)

openreview.net

Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing

Arghavan Moradi Dakhel

Amin Nikanjam

Vahid Majdinasab

Foutse Khomh

Michel C. Desmarais

One of the critical phases in software development is software testing. Testing helps with identifying potential bugs and reducing maintenance costs. The goal of automated test generation tools is to ease the development of tests by suggesting efficient bug-revealing tests. Recently, researchers have leveraged Large Language Models (LLMs) of code to generate unit tests. While the code coverage of generated tests was usually assessed, the literature has acknowledged that the coverage is weakly correlated with the efficiency of tests in bug detection. To improve over this limitation, in this paper, we introduce MuTAP for improving the effectiveness of test cases generated by LLMs in terms of revealing bugs by leveraging mutation testing. Our goal is achieved by augmenting prompts with surviving mutants, as those mutants highlight the limitations of test cases in detecting bugs. MuTAP is capable of generating effective test cases in the absence of natural language descriptions of the Program Under Test (PUTs). We employ different LLMs within MuTAP and evaluate their performance on different benchmarks. Our results show that our proposed method is able to detect up to 28% more faulty human-written code snippets. Among these, 17% remained undetected by both the current state-of-the-art fully automated test generation tool (i.e., Pynguin) and zero-shot/few-shot learning approaches on LLMs. Furthermore, MuTAP achieves a Mutation Score (MS) of 93.57% on synthetic buggy code, outperforming all other approaches in our evaluation. Our findings suggest that although LLMs can serve as a useful tool to generate test cases, they require specific post-processing steps to enhance the effectiveness of the generated test cases which may suffer from syntactic or functional errors and may be ineffective in detecting certain types of bugs and testing corner cases PUTs.

2023-08-31

ArXiv (preprint)

doi.org

arxiv.org

Learning Lyapunov-Stable Polynomial Dynamical Systems Through Imitation

Amin Abyaneh

Hsiu-Chin Lin

Imitation learning is a paradigm to address complex motion planning problems by learning a policy to imitate an expert's behavior. However, relying solely on the expert's data might lead to unsafe actions when the robot deviates from the demonstrated trajectories. Stability guarantees have previously been provided utilizing nonlinear dynamical systems, acting as high-level motion planners, in conjunction with the Lyapunov stability theorem. Yet, these methods are prone to inaccurate policies, high computational cost, sample inefficiency, or quasi stability when replicating complex and highly nonlinear trajectories. To mitigate this problem, we present an approach for learning a globally stable nonlinear dynamical system as a motion planning policy. We model the nonlinear dynamical system as a parametric polynomial and learn the polynomial's coefficients jointly with a Lyapunov candidate. To showcase its success, we compare our method against the state of the art in simulation and conduct real-world experiments with the Kinova Gen3 Lite manipulator arm. Our experiments demonstrate the sample efficiency and reproduction accuracy of our method for various expert trajectories, while remaining stable in the face of perturbations.

2023-08-30

robot-learning.org/CoRL/2023/Conference (poster)

doi.org

openreview.net

Beyond the ML Model: Applying Safety Engineering Frameworks to Text-to-Image Development

Shalaleh Rismani

Renee Shelby

Andrew J Smart

Renelito Delos Santos

AJung Moon

Negar Rostamzadeh

Identifying potential social and ethical risks in emerging machine learning (ML) models and their applications remains challenging. In this work, we applied two well-established safety engineering frameworks (FMEA, STPA) to a case study involving text-to-image models at three stages of the ML product development pipeline: data processing, integration of a T2I model with other models, and use. Results of our analysis demonstrate that the safety frameworks, neither of which is explicitly designed to examine social and ethical risks, can uncover failures and hazards that pose social and ethical risks. We discovered a broad range of failures and hazards (i.e., functional, social, and ethical) by analyzing interactions (i.e., between different ML models in the product, between the ML product and user, and between development teams) and processes (i.e., preparation of training data or workflows for using an ML service/product). Our findings underscore the value and importance of looking beyond an ML model when examining social and ethical risks, especially when we have minimal information about an ML model.

2023-08-29

Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (published)

doi.org

arxiv.org

Policy composition in reinforcement learning via multi-objective policy optimization

Shruti Mishra

Ankit Anand

Jordan Hoffmann

Nicolas Heess

Martin A. Riedmiller

Abbas Abdolmaleki

Doina Precup

We enable reinforcement learning agents to learn successful behavior policies by utilizing relevant pre-existing teacher policies. The teacher policies are introduced as objectives, in addition to the task objective, in a multi-objective policy optimization setting. Using the Multi-Objective Maximum a Posteriori Policy Optimization algorithm (Abdolmaleki et al. 2020), we show that teacher policies can help speed up learning, particularly in the absence of shaping rewards. In two domains with continuous observation and action spaces, our agents successfully compose teacher policies in sequence and in parallel, and are also able to further extend the policies of the teachers in order to solve the task. Depending on the specified combination of task and teacher(s), teacher(s) may naturally act to limit the final performance of an agent. The extent to which agents are required to adhere to teacher policies is determined by hyperparameters which determine both the effect of teachers on learning speed and the eventual performance of the agent on the task. In the humanoid domain (Tassa et al. 2018), we also equip agents with the ability to control the selection of teachers. With this ability, agents are able to meaningfully compose from the teacher policies to achieve a superior task reward on the walk task than in cases without access to the teacher policies. We show the resemblance of composed task policies with the corresponding teacher policies through videos.

2023-08-29

ArXiv (preprint)

doi.org

arxiv.org

Sociotechnical Harms of Algorithmic Systems: Scoping a Taxonomy for Harm Reduction

Renee Shelby

Shalaleh Rismani

Kathryn Henne

AJung Moon

Negar Rostamzadeh

Paul Nicholas

N'Mah Yilla-Akbari

Jess Gallegos

Andrew J Smart

Emilio Garcia

Gurleen Virk

2023-08-29

Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (published)

doi.org

arxiv.org

What does it mean to be a responsible AI practitioner: An ontology of roles and skills

Shalaleh Rismani

AJung Moon

With the growing need to regulate AI systems across a wide variety of application domains, a new set of occupations has emerged in the industry. The so-called responsible Artificial Intelligence (AI) practitioners or AI ethicists are generally tasked with interpreting and operationalizing best practices for ethical and safe design of AI systems. Due to the nascent nature of these roles, however, it is unclear to future employers and aspiring AI ethicists what specific function these roles serve and what skills are necessary to serve the functions. Without clarity on these, we cannot train future AI ethicists with meaningful learning objectives. In this work, we examine what responsible AI practitioners do in the industry and what skills they employ on the job. We propose an ontology of existing roles alongside skills and competencies that serve each role. We created this ontology by examining the job postings for such roles over a two-year period (2020-2022) and conducting expert interviews with fourteen individuals who currently hold such a role in the industry. Our ontology contributes to business leaders looking to build responsible AI teams and provides educators with a set of competencies that an AI ethics curriculum can prioritize.

2023-08-29

Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (published)

doi.org

arxiv.org

Beyond performance: the role of task demand, effort, and individual differences in ab initio pilots

Mohammad-Javad Darvishi-Bayazi

Andrew Law

Sergio Mejia Romero

Sion Jennings

Irina Rish

Jocelyn Faubert

2023-08-28

Scientific Reports (published)

doi.org

From Assistive Devices to Manufacturing Cobot Swarms

Monica Li

Bruno Belzile

Ali Imran

Lionel Birglen

Giovanni Beltrame

David St-Onge

This paper provides an overview of the latest trends in robotics research and development, with a particular focus on applications in manufacturing and industrial settings. We highlight recent advances in robot design, including cutting-edge collaborative robot mechanics and advanced safety features, as well as exciting developments in perception and human-swarm interaction. By examining recent contributions from Kinova, a leading robotics company, we illustrate the differences between industry and academia in their approaches to developing innovative robotic systems and technologies that enhance productivity and safety in the workplace. Ultimately, this paper demonstrates the tremendous potential of robotics to revolutionize manufacturing and industrial operations, and underscores the crucial role of companies like Kinova in driving this transformation forward.

2023-08-28

IEEE International Symposium on Robot and Human Interactive Communication (published)

doi.org

Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads

Salah Zaiem

Youcef Kemiche

Titouan Parcollet

Slim Essid

Mirco Ravanelli

2023-08-28

ArXiv (preprint)

doi.org

arxiv.org

Efficient Epistemic Uncertainty Estimation in Regression Ensemble Models Using Pairwise-Distance Estimators

Lucas Berry

David Meger

This work introduces an efficient novel approach for epistemic uncertainty estimation for ensemble models for regression tasks using pairwise-distance estimators (PaiDEs). Utilizing the pairwise-distance between model components, these estimators establish bounds on entropy. We leverage this capability to enhance the performance of Bayesian Active Learning by Disagreement (BALD). Notably, unlike sample-based Monte Carlo estimators, PaiDEs exhibit a remarkable capability to estimate epistemic uncertainty at speeds up to 100 times faster while covering a significantly larger number of inputs at once and demonstrating superior performance in higher dimensions. To validate our approach, we conducted a varied series of regression experiments on commonly used benchmarks: 1D sinusoidal data,

2023-08-25

ArXiv (preprint)

arxiv.org

Party Prediction for Twitter

Kellin Pelrine

Anne Imouza

Zachary Yang

Jacob-Junqi Tian

Sacha Lévy

Gabrielle Desrosiers-Brisebois

Aarash Feizi

Cécile Amadoro

André Blais

Jean-François Godbout

Reihaneh Rabbany

2023-08-25

ArXiv (preprint)

doi.org

arxiv.org
