Publications
Explaining by Analogy: Case-based Abductive Natural Language Inference
Existing accounts of explanation emphasise the role of prior experience and analogy in the solution of new problems. However, most of the contemporary models for multi-hop textual inference construct explanations considering each test case in isolation. This paradigm is known to suffer from semantic drift, which causes the construction of spurious explanations leading to wrong predictions. In contrast, we propose an abductive framework for multi-hop inference that adopts the retrieve-reuse-revise paradigm largely studied in case-based reasoning. Specifically, we present ETNA (Explanation by Analogy), a novel model that addresses unseen inference problems by retrieving and adapting prior explanations from similar training examples. We empirically evaluate the case-based abductive framework on downstream commonsense and scientific reasoning tasks. Our experiments demonstrate that ETNA can be effectively integrated with sparse and dense encoding mechanisms or downstream transformers, achieving strong performance when compared to existing explainable approaches. Moreover, we study the impact of the retrieve-reuse-revise paradigm on explainability and semantic drift, showing that it boosts the quality of the constructed explanations, resulting in improved downstream inference performance.
Exploring the Wasserstein metric for time-to-event analysis.
Survival analysis is a type of semi-supervised task where the target output (the survival time) is often right-censored. Utilizing this information is a challenge because it is not obvious how to correctly incorporate these censored examples into a model. We study how three categories of loss functions can take advantage of this information: partial likelihood methods, rank methods, and our own classification method based on a Wasserstein metric (WM) and the non-parametric Kaplan-Meier (KM) estimate of the probability density to impute the labels of censored examples. The proposed method predicts the probability distribution of an event, letting us compute survival curves and expected times of survival that are easier to interpret than the rank. We also demonstrate that this approach directly optimizes the expected C-index, which is the most common evaluation metric for survival models.
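As a point of reference for the Kaplan-Meier estimate mentioned above, here is a minimal sketch of the standard product-limit estimator in plain Python. It is not the paper's imputation scheme or its Wasserstein loss; the function name `kaplan_meier` and the toy data are assumptions made for illustration only.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times:  observed durations (event time or censoring time)
    events: 1 if the event was observed, 0 if the example is right-censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Events (not censorings) that occur exactly at time t.
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy data, made up for illustration: five subjects, two of them censored.
times = [2, 3, 3, 5, 8]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"S({t}) = {s:.2f}")
```

Censored subjects never count as events, but they stay in the at-risk set until their censoring time, which is how the estimator uses partial information from them.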
Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments
Decomposing knowledge into interchangeable pieces promises a generalization advantage when there are changes in distribution. A learning agent interacting with its environment is likely to be faced with situations requiring novel combinations of existing pieces of knowledge. We hypothesize that such a decomposition of knowledge is particularly relevant for being able to generalize in a systematic manner to out-of-distribution changes. To study these ideas, we propose a particular training framework in which we assume that the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks. An attention mechanism dynamically selects which modules can be adapted to the current task, and the parameters of the selected modules are allowed to change quickly as the learner is confronted with variations in what it experiences, while the parameters of the attention mechanisms act as stable, slowly changing, meta-parameters. We focus on pieces of knowledge captured by an ensemble of modules sparsely communicating with each other via a bottleneck of attention. We find that meta-learning the modular aspects of the proposed system greatly helps in achieving faster adaptation in a reinforcement learning setup involving navigation in a partially observed grid world with image-level input. We also find that reversing the role of parameters and meta-parameters does not work nearly as well, suggesting a particular role for fast adaptation of the dynamically selected modules.
In this paper, we study the finite-time behaviour of temporal difference (TD) learning algorithms when combined with tail-averaging, and present instance-dependent bounds on the parameter error of the tail-averaged TD iterate. Our error bounds hold in expectation as well as with high probability, exhibit a sharper rate of decay for the initial error (bias), and are comparable with existing bounds in the literature.
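To make the notion of a tail-averaged TD iterate concrete, the sketch below runs tabular TD(0) on a two-state deterministic chain and averages only the iterates after a burn-in index `k`. The tabular setting (equivalent to one-hot features), the toy chain, the step size, and the names `td0_iterates` and `tail_average` are all assumptions for illustration; the paper's exact algorithm and step-size choices are not reproduced here.

```python
def td0_iterates(transitions, rewards, gamma, step_size, n_steps, start=0):
    """Run tabular TD(0) along one trajectory and return every iterate.

    transitions[s] is the (deterministic) successor of state s.
    """
    values = [0.0] * len(rewards)
    iterates = []
    state = start
    for _ in range(n_steps):
        next_state = transitions[state]
        td_error = rewards[state] + gamma * values[next_state] - values[state]
        values[state] += step_size * td_error
        iterates.append(list(values))  # store a copy of the current iterate
        state = next_state
    return iterates


def tail_average(iterates, k):
    """Average the iterates after discarding the first k of them."""
    tail = iterates[k:]
    dim = len(tail[0])
    return [sum(v[i] for v in tail) / len(tail) for i in range(dim)]


# Toy chain 0 -> 1 -> 0 with reward 1 in state 0 and discount 0.5.
# Its exact values solve V0 = 1 + 0.5 * V1 and V1 = 0.5 * V0.
iterates = td0_iterates(transitions=[1, 0], rewards=[1.0, 0.0],
                        gamma=0.5, step_size=0.5, n_steps=2000)
v_tail = tail_average(iterates, k=1000)
print(v_tail)  # close to [4/3, 2/3], the exact values of this chain
```

Dropping the first `k` iterates removes the part of the trajectory that is still dominated by the initial error, which is the bias term the abstract refers to.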
This paper is about the problem of learning a stochastic policy for generating an object (like a molecular graph) from a sequence of actions, such that the probability of generating an object is proportional to a given positive reward for that object. Whereas standard return maximization tends to converge to a single return-maximizing sequence, there are cases where we would like to sample a diverse set of high-return solutions. These arise, for example, in black-box function optimization when few rounds are possible, each with large batches of queries, where the batches should be diverse, e.g., in the design of new molecules. One can also see this as a problem of approximately converting an energy function to a generative distribution. While MCMC methods can achieve that, they are expensive and generally only perform local exploration. Instead, training a generative policy amortizes the cost of search during training and yields fast generation. Using insights from Temporal Difference learning, we propose GFlowNet, based on a view of the generative process as a flow network, making it possible to handle the tricky case where different trajectories can yield the same final state, e.g., there are many ways to sequentially add atoms to generate some molecular graph. We cast the set of trajectories as a flow and convert the flow consistency equations into a learning objective, akin to the casting of the Bellman equations into Temporal Difference methods. We prove that any global minimum of the proposed objectives yields a policy which samples from the desired distribution, and demonstrate the improved performance and diversity of GFlowNet on a simple domain where there are many modes to the reward function, and on a molecule synthesis task.
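The flow view can be illustrated on a hand-built toy graph in which two trajectories reach the same terminal state. The sketch below checks flow consistency (in-flow equals reward plus out-flow at every non-initial state) and then verifies that the policy induced by the edge flows reaches each terminal state with probability proportional to its reward. The graph, the flow values, and all names are hypothetical and chosen by hand; no learning is performed, and this is not the paper's training objective.

```python
from collections import defaultdict

# Edge flows on a toy DAG (hypothetical numbers chosen by hand).
# Terminal state "x" is reachable through both "a" and "b".
edge_flow = {
    ("s0", "a"): 2.0, ("s0", "b"): 2.0,
    ("a", "x"): 1.0, ("a", "y"): 1.0,
    ("b", "x"): 2.0,
}
reward = {"x": 3.0, "y": 1.0}  # terminal rewards; interior states have reward 0


def flow_mismatch(state):
    """In-flow minus (reward + out-flow); zero when the flow is consistent."""
    inflow = sum(f for (u, v), f in edge_flow.items() if v == state)
    outflow = sum(f for (u, v), f in edge_flow.items() if u == state)
    return inflow - reward.get(state, 0.0) - outflow


def terminal_distribution(order=("s0", "a", "b", "x", "y")):
    """Propagate probability from s0 with pi(s'|s) = F(s->s') / sum_s'' F(s->s'')."""
    mass = defaultdict(float)
    mass["s0"] = 1.0
    for s in order:  # states listed in topological order
        out = {v: f for (u, v), f in edge_flow.items() if u == s}
        total = sum(out.values())
        for v, f in out.items():
            mass[v] += mass[s] * f / total
    return {s: mass[s] for s in reward}


mismatches = {s: flow_mismatch(s) for s in ("a", "b", "x", "y")}
distribution = terminal_distribution()
print(mismatches)    # every entry is 0.0 for this consistent flow
print(distribution)  # {'x': 0.75, 'y': 0.25}, proportional to the rewards 3 and 1
```

A learned model would parameterize the edge flows and be trained to drive these mismatches toward zero on sampled trajectories, which is the analogy with Temporal Difference methods drawn in the abstract.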
Haze removal techniques employed to increase the visibility level of an image play an important role in many vision-based systems. Several traditional dark channel prior-based methods have been proposed to remove haze and thereby enhance the robustness of these systems. However, when the captured images contain disproportionate haze distributions, these methods usually fail to restore the image effectively. Specifically, disproportionate haze distribution in an image means that the background region possesses heavy haze density and the foreground region possesses little haze density. This phenomenon usually occurs in a hazy image with a deep depth of field. In response, a novel hybrid transmission map-based haze removal method that specifically targets this situation is proposed in this work to achieve clear visibility restoration and effective information maintenance. Experimental results via both qualitative and quantitative evaluations demonstrate that the proposed method is capable of performing with higher efficacy when compared with other state-of-the-art methods, with respect to both background regions and foreground regions of restored test images captured in real-world environments.
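For readers unfamiliar with the dark channel prior that the traditional methods build on, the following minimal sketch computes a dark channel: the per-pixel minimum over colour channels, followed by a minimum over a local patch. It illustrates only this baseline ingredient, not the hybrid transmission map proposed in the paper; the function name, the `radius` parameter, and the tiny test image are assumptions.

```python
def dark_channel(image, radius=1):
    """Compute the dark channel of an RGB image.

    image:  H x W nested list of (r, g, b) tuples with values in [0, 1]
    radius: half-width of the square patch used for the local minimum
    """
    height, width = len(image), len(image[0])
    # Step 1: minimum across the three colour channels at every pixel.
    channel_min = [[min(px) for px in row] for row in image]
    # Step 2: minimum of that map over a (2 * radius + 1)-wide patch,
    # clipped at the image borders.
    dark = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            dark[y][x] = min(channel_min[j][i] for j in ys for i in xs)
    return dark


# A 1 x 4 toy image, made up for illustration.
image = [[(0.9, 0.8, 0.7), (0.5, 0.6, 0.4), (0.9, 0.9, 0.9), (0.8, 0.9, 0.95)]]
dark = dark_channel(image, radius=1)
print(dark)  # [[0.4, 0.4, 0.4, 0.8]]
```

In haze-free regions the dark channel tends to be close to zero, so large values are read as evidence of haze; a region with heavy background haze and a clear foreground is the disproportionate case the abstract describes.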
We study session-based recommendation scenarios where we want to recommend items to users during sequential interactions to improve their long-term utility. Optimizing a long-term metric is challenging because the learning signal (whether the recommendations achieved their desired goals) is delayed and confounded by other user interactions with the system. Immediately measurable proxies such as clicks can lead to suboptimal recommendations due to misalignment with the long-term metric. Many works have applied episodic reinforcement learning (RL) techniques for session-based recommendation but these methods do not account for policy-induced drift in user intent across sessions. We develop a new batch RL algorithm called Short Horizon Policy Improvement (SHPI) that approximates policy-induced distribution shifts across sessions. By varying the horizon hyper-parameter in SHPI, we recover well-known policy improvement schemes in the RL literature. Empirical results on four recommendation tasks show that SHPI can outperform matrix factorization, offline bandits, and offline RL baselines. We also provide a stable and computationally efficient implementation using weighted regression oracles.