Publications

Deep Learning Model for Multi-Step Ahead Prediction of Solar Irradiance: Case of Study of Morocco
Saad Benbrahim
Ismail Belhaj
Abdelali Djdiaa
Hicham Bouzekri
Abdelaziz Berrado
Accurate solar irradiance forecasting is crucial for managing energy generation and consumption in the rapidly evolving landscape of renewab… (see more)le energy. It enables renewable energy operators to make informed decisions and maximize their output. This study employs deep learning-based forecasting models to predict the Global Horizontal Irradiance (GHI) of the R&D platform situated in Ouarzazate, Morocco. A sensitivity analysis was conducted on multiple scenarios for a one day-ahead horizon. Moreover, a forecasting technique that encompasses numerous horizons, ranging from one day to three days in advance, was evaluated. The study's findings suggest that the encoder-decoder model we proposed exhibited superior performance compared to the other models tested and produced dependable predictions.
Towards an Effective Electrical Market Design: Identifying and Defining Key Criteria for Decision-Making
Souhaila Chiguer
Ismail Belhaj
Abdelali Djdiaa
Hicham Bouzekri
Abdelaziz Berrado
In our changing energy landscape, electricity is taking a major role in achieving decarbonization goals. Electricity can be a clean and effi… (see more)cient source of energy, and it is well-suited to help countries meet their climate goals. However, the electrical market is complex and constantly evolving, and it is important to carefully choose the design elements of the market to ensure that it is meeting its objectives. In this context, evaluating an electrical market's effectiveness requires a multifaceted approach that takes into account a range of elements, from environmental impact to economic viability. This paper provides an overview of several evaluation methods for different objectives to finally select the key criteria to consider in assisting decision-makers, regulators, and stakeholders in developing an electricity market that is not only effective but also reliable and sustainable.
Responses to pattern-violating visual stimuli evolve differently over days in somata and distal apical dendrites
Colleen J. Gillon
Jason E. Pina
Jérôme A. Lecoq
Ruweida Ahmed
Yazan N. Billeh
Shiella Caldejon
Peter Groblewski
Timothy M. Henley
India Kato
Eric Lee
Jennifer Luviano
Kyla Mace
Chelsea Nayan
Thuyanh V. Nguyen
Kat North
Jed Perkins
Sam Seid
Matthew T. Valley
Ali Williford
Timothy P. Lillicrap
Blake A. Richards
Scientists have long conjectured that the neocortex learns patterns in sensory data to generate top-down predictions of upcoming stimuli. In… (see more) line with this conjecture, different responses to pattern-matching vs pattern-violating visual stimuli have been observed in both spiking and somatic calcium imaging data. However, it remains unknown whether these pattern-violation signals are different between the distal apical dendrites, which are heavily targeted by top-down signals, and the somata, where bottom-up information is primarily integrated. Furthermore, it is unknown how responses to pattern-violating stimuli evolve over time as an animal gains more experience with them. Here, we address these unanswered questions by analyzing responses of individual somata and dendritic branches of layer 2/3 and layer 5 pyramidal neurons tracked over multiple days in primary visual cortex of awake, behaving female and male mice. We use sequences of Gabor patches with patterns in their orientations to create pattern-matching and pattern-violating stimuli, and two-photon calcium imaging to record neuronal responses. Many neurons in both layers show large differences between their responses to pattern-matching and pattern-violating stimuli. Interestingly, these responses evolve in opposite directions in the somata and distal apical dendrites, with somata becoming less sensitive to pattern-violating stimuli and distal apical dendrites more sensitive. These differences between the somata and distal apical dendrites may be important for hierarchical computation of sensory predictions and learning, since these two compartments tend to receive bottom-up and top-down information, respectively.
A Study of Human-Robot Handover through Human-Human Object Transfer
Charlotte Morissette
Bobak H. Baghi
Francois Hogan
In this preliminary study, we investigate changes in handover behaviour when transferring hazardous objects with the help of a high-resoluti… (see more)on touch sensor. Participants were asked to hand over a safe and hazardous object (a full cup and an empty cup) while instrumented with a modified STS sensor. Our data shows a clear distinction in the length of handover for the full cup vs the empty one, with the former being slower. Sensor data further suggests a change in tactile behaviour dependent on the object's risk factor. The results of this paper motivate a deeper study of tactile factors which could characterize a risky handover, allowing for safer human-robot interactions in the future.
Challenging Common Assumptions About Catastrophic Forgetting and Knowledge Accumulation
Timothee LESORT
Pau Rodríguez
Md Rifat Arefin
Building learning agents that can progressively learn and accumulate knowledge is the core goal of the continual learning (CL) research fiel… (see more)d. Unfortunately, training a model on new data usually compromises the performance on past data. In the CL literature, this effect is referred to as catastrophic forgetting (CF). CF has been largely studied, and a plethora of methods have been proposed to address it on short sequences of non-overlapping tasks. In such setups, CF always leads to a quick and significant drop in performance in past tasks. Nevertheless, despite CF, recent work showed that SGD training on linear models accumulates knowledge in a CL regression setup. This phenomenon becomes especially visible when tasks reoccur. We might then wonder if DNNs trained with SGD or any standard gradient-based optimization accumulate knowledge in such a way. Such phenomena would have interesting consequences for applying DNNs to real continual scenarios. Indeed, standard gradient-based optimization methods are significantly less computationally expensive than existing CL algorithms. In this paper, we study the progressive knowledge accumulation (KA) in DNNs trained with gradient-based algorithms in long sequences of tasks with data re-occurrence. We propose a new framework, SCoLe (Scaling Continual Learning), to investigate KA and discover that catastrophic forgetting has a limited effect on DNNs trained with SGD. When trained on long sequences with data sparsely re-occurring, the overall accuracy improves, which might be counter-intuitive given the CF phenomenon. We empirically investigate KA in DNNs under various data occurrence frequencies and propose simple and scalable strategies to increase knowledge accumulation in DNNs.
Dealing with Non-Stationarity in Decentralized Cooperative Multi-Agent Deep Reinforcement Learning via Multi-Timescale Learning
Decentralized cooperative multi-agent deep reinforcement learning (MARL) can be a versatile learning framework, particularly in scenarios wh… (see more)ere centralized training is either not possible or not practical. One of the critical challenges in decentralized deep MARL is the non-stationarity of the learning environment when multiple agents are learning concurrently. A commonly used and efficient scheme for decentralized MARL is independent learning in which agents concurrently update their policies independently of each other. We first show that independent learning does not always converge, while sequential learning where agents update their policies one after another in a sequence is guaranteed to converge to an agent-by-agent optimal solution. In sequential learning, when one agent updates its policy, all other agent's policies are kept fixed, alleviating the challenge of non-stationarity due to simultaneous updates in other agents' policies. However, it can be slow because only one agent is learning at any time. Therefore it might also not always be practical. In this work, we propose a decentralized cooperative MARL algorithm based on multi-timescale learning. In multi-timescale learning, all agents learn simultaneously, but at different learning rates. In our proposed method, when one agent updates its policy, other agents are allowed to update their policies as well, but at a slower rate. This speeds up sequential learning, while also minimizing non-stationarity caused by other agents updating concurrently. Multi-timescale learning outperforms state-of-the-art decentralized learning methods on a set of challenging multi-agent cooperative tasks in the epymarl(Papoudakis et al., 2020) benchmark. This can be seen as a first step towards more general decentralized cooperative deep MARL methods based on multi-timescale learning.
Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning
Learning models of the environment from pure interaction is often considered an essential component of building lifelong reinforcement learn… (see more)ing agents. However, the common practice in model-based reinforcement learning is to learn models that model every aspect of the agent’s environment, regardless of whether they are important in coming up with optimal decisions or not. In this paper, we argue that such models are not particularly well-suited for performing scalable and robust planning in lifelong reinforcement learning scenarios and we propose new kinds of models that only model the relevant aspects of the environment, which we call \emph{minimal value-equivalent partial models}. After providing a formal definition for these models, we provide theoretical results demonstrating the scalability advantages of performing planning with such models and then perform experiments to empirically illustrate our theoretical results. Then, we provide some useful heuristics on how to learn these kinds of models with deep learning architectures and empirically demonstrate that models learned in such a way can allow for performing planning that is robust to distribution shifts and compounding model errors. Overall, both our theoretical and empirical results suggest that minimal value-equivalent partial models can provide significant benefits to performing scalable and robust planning in lifelong reinforcement learning scenarios.
Responsible AI Research Needs Impact Statements Too
A.R. Olteanu
Michael Ekstrand
Carlos Castillo
Jina Suh
All types of research, development, and policy work can have unintended, adverse consequences - work in responsible artificial intelligence … (see more)(RAI), ethical AI, or ethics in AI is no exception.
Substituting Data Annotation with Balanced Neighbourhoods and Collective Loss in Multi-label Text Classification
Muberra Ozmen
Mark J. Coates
Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and has a wide range of application domains… (see more). Most existing approaches require an enormous amount of annotated data to learn a classifier and/or a set of well-defined constraints on the label space structure, such as hierarchical relations which may be complicated to provide as the number of labels increases. In this paper, we study the MLTC problem in annotation-free and scarce-annotation settings in which the magnitude of available supervision signals is linear to the number of labels. Our method follows three steps, (1) mapping input text into a set of preliminary label likelihoods by natural language inference using a pre-trained language model, (2) calculating a signed label dependency graph by label descriptions, and (3) updating the preliminary label likelihoods with message passing along the label dependency graph, driven with a collective loss function that injects the information of expected label frequency and average multi-label cardinality of predictions. The experiments show that the proposed framework achieves effective performance under low supervision settings with almost imperceptible computational and memory overheads added to the usage of pre-trained language model outperforming its initial performance by 70% in terms of example-based F1 score.
Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges
Massimo Caccia
Jonas Mueller
Rasool Fakoor
Towards Few-Shot Coordination: Revisiting Ad-Hoc Teamplay Challenge in the Game of Hanabi
Cooperative Multi-agent Reinforcement Learning (MARL) algorithms with Zero-Shot Coordination (ZSC) have gained significant attention in rece… (see more)nt years. ZSC refers to the ability of agents to coordinate zero-shot (without additional interaction experience) with independently trained agents. While ZSC is crucial for cooperative MARL agents, it might not be possible for complex tasks and changing environments. Agents also need to adapt and improve their performance with minimal interaction with other agents. In this work, we show empirically that state-of-the-art ZSC algorithms have poor performance when paired with agents trained with different learning methods, and they require millions of interaction samples to adapt to these new partners. To investigate this issue, we formally defined a framework based on a popular cooperative multi-agent game called Hanabi to evaluate the adaptability of MARL methods. In particular, we created a diverse set of pre-trained agents and defined a new metric called adaptation regret that measures the agent's ability to efficiently adapt and improve its coordination performance when paired with some held-out pool of partners on top of its ZSC performance. After evaluating several SOTA algorithms using our framework, our experiments reveal that naive Independent Q-Learning (IQL) agents in most cases adapt as quickly as the SOTA ZSC algorithm Off-Belief Learning (OBL). This finding raises an interesting research question: How to design MARL algorithms with high ZSC performance and capability of fast adaptation to unseen partners. As a first step, we studied the role of different hyper-parameters and design choices on the adaptability of current MARL algorithms. Our experiments show that two categories of hyper-parameters controlling the training data diversity and optimization process have a significant impact on the adaptability of Hanabi agents.
Assessing the Security of GitHub Copilot's Generated Code - A Targeted Replication Study
Vahid Majdinasab
Michael Joshua Bishop
Shawn Rasheed
Amjed Tahir