Learn how to leverage generative AI to support and improve your productivity at work. The next cohort will take place online on April 28 and 30, 2026, in French.
We use cookies to analyze the browsing and usage of our website and to personalize your experience. You can disable these technologies at any time, but this may limit certain functionalities of the site. Read our Privacy Policy for more information.
Setting cookies
You can enable and disable the types of cookies you wish to accept. However certain choices you make could affect the services offered on our sites (e.g. suggestions, personalised ads, etc.).
Essential cookies
These cookies are necessary for the operation of the site and cannot be deactivated. (Still active)
Analytics cookies
Do you accept the use of cookies to measure the audience of our sites?
Multimedia Player
Do you accept the use of cookies to display and allow you to watch the video content hosted by our partners (YouTube, etc.)?
In Deep Reinforcement Learning (RL), it is a challenge to learn representations that do not exhibit catastrophic forgetting or interference … (see more)in non-stationary environments. Successor Features (SFs) offer a potential solution to this challenge. However, canonical techniques for learning SFs from pixel-level observations often lead to representation collapse, wherein representations degenerate and fail to capture meaningful variations in the data. More recent methods for learning SFs can avoid representation collapse, but they often involve complex losses and multiple learning phases, reducing their efficiency. We introduce a novel, simple method for learning SFs directly from pixels. Our approach uses a combination of a Temporal-difference (TD) loss and a reward prediction loss, which together capture the basic mathematical definition of SFs. We show that our approach matches or outperforms existing SF learning techniques in both 2D (Minigrid), 3D (Miniworld) mazes and Mujoco, for both single and continual learning scenarios. As well, our technique is efficient, and can reach higher levels of performance in less time than other approaches. Our work provides a new, streamlined technique for learning SFs directly from pixel observations, with no pretraining required.
2023-12-31
Advances in Neural Information Processing Systems 37 (published)
Learning Tabu Search Algorithms: A Scheduling Application
Nazgol Niroumandrad
Nadia Lahrichi
Andrea Lodi
. Metaheuristics are widely recognized as efficient approaches for many combinatorial problems. Studies to improve the performance of metahe… (see more)uristics have increasingly relied on the use of various methods either combining different metaheuristics or methods originating outside of the metaheuristic field. This paper presents a learning algorithm to improve tabu search by reducing its search space and the evaluation effort. We study the performance of a learning tabu search algorithm using classification methods in an attempt to select moves through the search space more wisely. The experimental results demonstrate the benefit of using a learning mechanism under deterministic and stochastic conditions.
In recent years, video games have surged in popularity, attracting millions of players across platforms. Citizen science games (CSGs) levera… (see more)ge the processing power of gamers to solve computational and scientific problems. Borderlands Science (BLS) is a mini-game within the mass market game Borderlands 3 that turns multiple sequence alignment (MSA) problems into puzzles. Parallel research demonstrated that BLS players outperformed classical approaches solving small sequence alignment tasks. This study aims to analyze the strategical differences in player solutions in BLS as they gain experience. Through the many collected player solutions from players of different experience level, we gained insights into players’ strategies, differences between expert and non-expert players, and how strategies evolve. We developed a Markov chain trained on solutions from players of different experience levels to understand their actions and outcomes. Results indicate that expert players utilize more gaps and achieve more matches, gradually improving and converging toward unique strategies. Our findings reveal distinct and evolving player strategies. For future citizen science projects, it will be important to consider the identification of player strategies and their evolution over time to improve the game design and data processing.
Reinforcement learning is an attractive approach to learn good resource allocation and scheduling policies based on data when the system mod… (see more)el is unknown. However, the cumulative regret of most RL algorithms scales as ˜ O(S
2023-12-31
IEEE Transactions on Control of Network Systems (published)
Knowledge distillation aims to train a compact student network using soft supervision from a larger teacher network and hard supervision fro… (see more)m ground truths. However, determining an optimal knowledge fusion ratio that balances these supervisory signals remains challenging. Prior methods generally resort to a constant or heuristic-based fusion ratio, which often falls short of a proper balance. In this study, we introduce a novel adaptive method for learning a sample-wise knowledge fusion ratio, exploiting both the correctness of teacher and student, as well as how well the student mimics the teacher on each sample. Our method naturally leads to the intra-sample trilateral geometric relations among the student prediction (
List comprehensions are a Pythonic functional construct allowing developers to express in a concise way loops to build and manipulate lists.… (see more) Previous studies point to a gain in speed when list comprehensions are adopted. This paper reports the results of a study that compares the execution time performance of Python code written using list comprehensions as opposed to equivalent imperative programming. To this aim, we have developed a set of transformation rules to map Python for loops into list comprehensions. On the one hand, on artificial code snippets, we found list comprehensions to be faster than procedural code, with differences becoming evident if amplifying the tests, i.e., executing the code fragment thousands of times. On the other hand, this does not happen when executing real-world Python projects, where the performance may or may not improve, depending on the projects' features and the nature of the manipulated objects.
Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their re… (see more)wards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration on wide sample space. This paper proposes to train GFlowNets with local search, which focuses on exploiting high-rewarded sample space to resolve this issue. Our main idea is to explore the local neighborhood via backtracking and reconstruction guided by backward and forward policies, respectively. This allows biasing the samples toward high-reward solutions, which is not possible for a typical GFlowNet solution generation scheme, which uses the forward policy to generate the solution from scratch. Extensive experiments demonstrate a remarkable performance improvement in several biochemical tasks. Source code is available: https://github.com/dbsxodud-11/ls_gfn.
The current state-of-the-art in artificial intelligence is impressive, especially in terms of mastery of language, but not so much in terms … (see more)of mathematical reasoning. What could be missing? Can we learn something useful about that gap from how the brains of mathematicians go about their craft? This essay builds on the idea that current deep learning mostly succeeds at system 1 abilities -- which correspond to our intuition and habitual behaviors -- but still lacks something important regarding system 2 abilities -- which include reasoning and robust uncertainty estimation. It takes an information-theoretical posture to ask questions about what constitutes an interesting mathematical statement, which could guide future work in crafting an AI mathematician. The focus is not on proving a given theorem but on discovering new and interesting conjectures. The central hypothesis is that a desirable body of theorems better summarizes the set of all provable statements, for example by having a small description length while at the same time being close (in terms of number of derivation steps) to many provable statements.
The recent developments in neural fields have brought phenomenal capabilities to the field of shape generation, but they lack crucial proper… (see more)ties, such as incremental control - a fundamental requirement for artistic work. Triangular meshes, on the other hand, are the representation of choice for most geometry related tasks, offering efficiency and intuitive control, but do not lend themselves to neural optimization. To support downstream tasks, previous art typically proposes a two-step approach, where first a shape is generated using neural fields, and then a mesh is extracted for further processing. Instead, in this paper we introduce a hybrid approach that maintains both a mesh and a Signed Distance Field (SDF) representations consistently. Using this representation, we introduce MagicClay - an artist friendly tool for sculpting regions of a mesh according to textual prompts while keeping other regions untouched. Our framework carefully and efficiently balances consistency between the representations and regularizations in every step of the shape optimization; Relying on the mesh representation, we show how to render the SDF at higher resolutions and faster. In addition, we employ recent work in differentiable mesh reconstruction to adaptively allocate triangles in the mesh where required, as indicated by the SDF. Using an implemented prototype, we demonstrate superior generated geometry compared to the state-of-the-art, and novel consistent control, allowing sequential prompt-based edits to the same mesh for the first time.