Code as Reward: Empowering Reinforcement Learning with VLMs
David Venuto
Mohammad Sami Nur Islam
Martin Klissarov
Sherry Yang
A Distributional Analogue to the Successor Representation
Harley Wiltzer
Jesse Farebrother
Arthur Gretton
Yunhao Tang
Andre Barreto
Will Dabney
Mark Rowland
This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure … (see more)and reward in the learning process. Analogous to how the successor representation (SR) describes the expected consequences of behaving according to a given policy, our distributional successor measure (SM) describes the distributional consequences of this behaviour. We formulate the distributional SM as a distribution over distributions and provide theory connecting it with distributional and model-based reinforcement learning. Moreover, we propose an algorithm that learns the distributional SM from data by minimizing a two-level maximum mean discrepancy. Key to our method are a number of algorithmic techniques that are independently valuable for learning generative models of state. As an illustration of the usefulness of the distributional SM, we show that it enables zero-shot risk-sensitive policy evaluation in a way that was not previously possible.
Dynamic System Modeling Using a Multisource Transfer Learning-Based Modular Neural Network for Industrial Application
Haoshan Duan
Xi Meng
JunFei Qiao
Establishing an accurate model of dynamic systems poses a challenge for complex industrial processes. Due to the ability to handle complex t… (see more)asks, modular neural networks (MNN) have been widely applied to industrial process modeling. However, the phenomenon of domain drift caused by operating conditions may lead to a cold start of the model, which affects the performance of MNN. For this reason, a multisource transfer learning-based MNN (MSTL-MNN) is proposed in this study. First, the knowledge-driven transfer learning process is performed with domain similarity evaluation, knowledge extraction, and fusion, aiming to form an initial subnetwork in the target domain. Then, the positive transfer process of effective knowledge can avoid the cold start problem of MNN. Second, during the data-driven fine-tuning process, a regularized self-organizing long short-term memory algorithm is designed to fine-tune the structure and parameters of the initial subnetwork, which can improve the prediction performance of MNN. Meanwhile, relevant theoretical analysis is given to ensure the feasibility of MSTL-MNN. Finally, the effectiveness of the proposed method is confirmed by two benchmark simulations and a real industrial dataset of a municipal solid waste incineration process. Experimental results demonstrate the merits of MSTL-MNN for industrial applications.
Fairness-aware data-driven-based model predictive controller: A study on thermal energy storage in a residential building
Ying Sun
Fariborz Haghighat
Fairness-aware data-driven-based model predictive controller: A study on thermal energy storage in a residential building
Ying Sun
Fariborz Haghighat
Faithfulness Measurable Masked Language Models
Andreas Madsen
Generative AI in Software Engineering Must Be Human-Centered: The Copenhagen Manifesto
Daniel Russo
Sebastian Baltes
Niels van Berkel
Paris Avgeriou
Fabio Calefato
Beatriz Cabrero-Daniel
Gemma Catolino
Jürgen Cito
Neil Ernst
Thomas Fritz
Hideaki Hata
Reid Holmes
Maliheh Izadi
Mikkel Baun Kjærgaard
Grischa Liebel
Alberto Lluch Lafuente
Stefano Lambiase
Walid Maalej
Gail Murphy … (see 15 more)
Nils Brede Moe
Gabrielle O'Brien
Elda Paja
Mauro Pezzè
John Stouby Persson
Rafael Prikladnicki
Paul Ralph
Martin P. Robillard
Thiago Rocha Silva
Klaas-Jan Stol
Margaret-Anne Storey
Viktoria Stray
Paolo Tell
Christoph Treude
Bogdan Vasilescu
Generative AI in Software Engineering Must Be Human-Centered: The Copenhagen Manifesto
Daniel Russo
Sebastian Baltes
Niels van Berkel
Paris Avgeriou
Fabio Calefato
Beatriz Cabrero-Daniel
Gemma Catolino
Jürgen Cito
Neil Ernst
Thomas Fritz
Hideaki Hata
Reid Holmes
Maliheh Izadi
Mikkel Baun Kjærgaard
Grischa Liebel
Alberto Lluch Lafuente
Stefano Lambiase
Walid Maalej
Gail Murphy … (see 15 more)
Nils Brede Moe
Gabrielle O'Brien
Elda Paja
Mauro Pezzè
John Stouby Persson
Rafael Prikladnicki
Paul Ralph
Martin P. Robillard
Thiago Rocha Silva
Klaas-Jan Stol
Margaret-Anne Storey
Viktoria Stray
Paolo Tell
Christoph Treude
Bogdan Vasilescu
Generative AI in Software Engineering Must Be Human-Centered: The Copenhagen Manifesto
Daniel Russo
Sebastian Baltes
Niels van Berkel
Paris Avgeriou
Fabio Calefato
Beatriz Cabrero-Daniel
Gemma Catolino
Jürgen Cito
Neil Ernst
Thomas Fritz
Hideaki Hata
Reid Holmes
Maliheh Izadi
Mikkel Baun Kjærgaard
Grischa Liebel
Alberto Lluch Lafuente
Stefano Lambiase
Walid Maalej
Gail Murphy … (see 15 more)
Nils Brede Moe
Gabrielle O'Brien
Elda Paja
Mauro Pezzè
John Stouby Persson
Rafael Prikladnicki
Paul Ralph
Martin P. Robillard
Thiago Rocha Silva
Klaas-Jan Stol
Margaret-Anne Storey
Viktoria Stray
Paolo Tell
Christoph Treude
Bogdan Vasilescu
Generative AI in Software Engineering Must Be Human-Centered: The Copenhagen Manifesto
Daniel Russo
Sebastian Baltes
Niels van Berkel
Paris Avgeriou
Fabio Calefato
Beatriz Cabrero-Daniel
Gemma Catolino
Jürgen Cito
Neil Ernst
Thomas Fritz
Hideaki Hata
Reid Holmes
Maliheh Izadi
Mikkel Baun Kjærgaard
Grischa Liebel
Alberto Lluch Lafuente
Stefano Lambiase
Walid Maalej
Gail Murphy … (see 15 more)
Nils Brede Moe
Gabrielle O'Brien
Elda Paja
Mauro Pezzè
John Stouby Persson
Rafael Prikladnicki
Paul Ralph
Martin P. Robillard
Thiago Rocha Silva
Klaas-Jan Stol
Margaret-Anne Storey
Viktoria Stray
Paolo Tell
Christoph Treude
Bogdan Vasilescu
Generative AI in Software Engineering Must Be Human-Centered: The Copenhagen Manifesto
Daniel Russo
Sebastian Baltes
Niels van Berkel
Paris Avgeriou
Fabio Calefato
Beatriz Cabrero-Daniel
Gemma Catolino
Jürgen Cito
Neil Ernst
Thomas Fritz
Hideaki Hata
Reid Holmes
Maliheh Izadi
Mikkel Baun Kjærgaard
Grischa Liebel
Alberto Lluch Lafuente
Stefano Lambiase
Walid Maalej
Gail Murphy … (see 15 more)
Nils Brede Moe
Gabrielle O'Brien
Elda Paja
Mauro Pezzè
John Stouby Persson
Rafael Prikladnicki
Paul Ralph
Martin P. Robillard
Thiago Rocha Silva
Klaas-Jan Stol
Margaret-Anne Storey
Viktoria Stray
Paolo Tell
Christoph Treude
Bogdan Vasilescu
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
Stefan Horoi
Albert Manuel Orozco Camacho
Ensembling multiple models enhances predictive performance by utilizing the varied learned features of the different models but incurs signi… (see more)ficant computational and storage costs. Model fusion, which combines parameters from multiple models into one, aims to mitigate these costs but faces practical challenges due to the complex, non-convex nature of neural network loss landscapes, where learned minima are often separated by high loss barriers. Recent works have explored using permutations to align network features, reducing the loss barrier in parameter space. However, permutations are restrictive since they assume a one-to-one mapping between the different models' neurons exists. We propose a new model merging algorithm, CCA Merge, which is based on Canonical Correlation Analysis and aims to maximize the correlations between linear combinations of the model features. We show that our method of aligning models leads to better performances than past methods when averaging models trained on the same, or differing data splits. We also extend this analysis into the harder many models setting where more than 2 models are merged, and we find that CCA Merge works significantly better in this setting than past methods.