Robust prior-biased acquisition function for human-in-the-loop Bayesian optimization
Rose Guay-Hottin
Lison Kardassevitch
Hugo Pham
Round and Round We Go! What makes Rotary Positional Encodings useful?
Federico Barbero
Alex Vitvitskyi
Christos Perivolaropoulos
Petar Veličković
Positional Encodings (PEs) are a critical component of Transformer-based Large Language Models (LLMs), providing the attention mechanism with important sequence-position information. One of the most popular types of encoding used today in LLMs is Rotary Positional Encodings (RoPE), which rotate the queries and keys based on their relative distance. A common belief is that RoPE is useful because it helps to decay token dependency as relative distance increases. In this work, we argue that this is unlikely to be the core reason. We study the internals of a trained Gemma 7B model to understand how RoPE is being used at a mechanical level. We find that Gemma learns to use RoPE to construct robust "positional" attention patterns by exploiting the highest frequencies. We also find that, in general, Gemma greatly prefers to use the lowest frequencies of RoPE, which we suspect are used to carry semantic information. We mathematically prove interesting behaviours of RoPE and conduct experiments to verify our findings, proposing a modification of RoPE that fixes some highlighted issues and improves performance. We believe that this work represents an interesting step in better understanding PEs in LLMs, which we believe holds crucial value for scaling LLMs to large sizes and context lengths.
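As a rough illustration of the rotation the abstract describes, the sketch below applies a textbook form of RoPE to a query and a key and checks that their dot product depends only on the difference between the two positions. It is not taken from the paper or from Gemma: the helper names `rope` and `dot`, the base of 10000, and the choice to pair adjacent coordinates are all assumptions made for this example.

```python
import math

def rope(x, pos, base=10000.0):
    """Rotate consecutive coordinate pairs of x by position-dependent angles.

    Pair i uses the angle pos * base**(-2*i/d), so pair 0 has the highest
    frequency and the last pair the lowest. (Assumed textbook convention.)
    """
    d = len(x)
    assert d % 2 == 0, "RoPE needs an even number of dimensions"
    out = []
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        # Standard 2D rotation of the pair (a, b) by theta.
        out.extend([a * c - b * s, a * s + b * c])
    return out

def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical query and key vectors.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Same relative offset (3), very different absolute positions.
score_near = dot(rope(q, 5), rope(k, 2))
score_far = dot(rope(q, 105), rope(k, 102))
```

Under these assumptions `score_near` and `score_far` agree up to floating-point error, which is the relative-position property the abstract refers to.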
Is sharing always caring? Entropy, boundaries and the plurality of psychotherapeutic process.
Lena Adel
Ana Gómez-Carrillo
Jonas Mago
Michael Lifshitz
Spinal cord demyelination predicts neurological deterioration in patients with mild degenerative cervical myelopathy
Abdul Al-Shawwa
Michael Craig
Kalum Ost
David Anderson
Steven Casha
W. Bradley Jacobs
Nathan Evaniew
Saswati Tripathy
Jacques Bouchard
Peter Lewkonia
Fred Nicholls
Alex Soroceanu
Ganesh Swamy
Kenneth C. Thomas
Stephan duPlessis
Michael M.H. Yang
Nicholas Dea
Jefferson R. Wilson
David W. Cadotte
A stochastic integer programming approach to reserve staff scheduling with preferences
Carl Perreault‐Lafleur
Guy Desaulniers
Strong Model Collapse.
Yunzhen Feng
Arjun Subramonian
Julia Kempe
SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints
M. Cretu
Charles Harris
Ilia Igashov
Arne Schneuing
Marwin Segler
Bruno Correia
Julien Roy
Pietro Lio
Generative models see increasing use in computer-aided drug design. However, while performing well at capturing distributions of molecular motifs, they often produce synthetically inaccessible molecules. To address this, we introduce SynFlowNet, a GFlowNet model whose action space uses chemical reactions and buyable reactants to sequentially build new molecules. By incorporating forward synthesis as an explicit constraint of the generative mechanism, we aim at bridging the gap between in silico molecular generation and real world synthesis capabilities. We evaluate our approach using synthetic accessibility scores and an independent retrosynthesis tool to assess the synthesizability of our compounds, and motivate the choice of GFlowNets through considerable improvement in sample diversity compared to baselines. Additionally, we identify challenges with reaction encodings that can complicate traversal of the MDP in the backward direction. To address this, we introduce various strategies for learning the GFlowNet backward policy and thus demonstrate how additional constraints can be integrated into the GFlowNet MDP framework. This approach enables our model to successfully identify synthesis pathways for previously unseen molecules.
The Normative Leadership of the World Health Organization: a quantitative analysis
Gaelle Foucault
Jean-Louis Denis
Pierre Larouche
Miriam Cohen
The role of AI for MRI-analysis in multiple sclerosis—A brief overview
Jean-Pierre R. Falet
Steven Nobile
Aliya Szpindel
Berardino Barile
Amar Kumar
Joshua D. Durso-Finley
Douglas Arnold
The Superposition of Diffusion Models Using the Itô Density Estimator
Marta Skreta
Lazar Atanackovic
Alexander Tong
The Cambrian explosion of easily accessible pre-trained diffusion models suggests a demand for methods that combine multiple different pre-trained diffusion models without incurring the significant computational burden of re-training a larger combined model. In this paper, we cast the problem of combining multiple pre-trained diffusion models at the generation stage under a novel proposed framework termed superposition. Theoretically, we derive superposition from rigorous first principles stemming from the celebrated continuity equation and design two novel algorithms tailor-made for combining diffusion models in SuperDiff. SuperDiff leverages a new scalable Itô density estimator for the log likelihood of the diffusion SDE which incurs no additional overhead compared to the well-known Hutchinson's estimator needed for divergence calculations. We demonstrate that SuperDiff is scalable to large pre-trained diffusion models as superposition is performed solely through composition during inference, and also enjoys painless implementation as it combines different pre-trained vector fields through an automated re-weighting scheme. Notably, we show that SuperDiff is efficient during inference time, and mimics traditional composition operators such as the logical OR and the logical AND. We empirically demonstrate the utility of using SuperDiff for generating more diverse images on CIFAR-10, more faithful prompt conditioned image editing using Stable Diffusion, as well as improved conditional molecule generation and unconditional de novo structure design of proteins. https://github.com/necludov/super-diffusion
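For readers unfamiliar with the baseline the abstract mentions, the sketch below shows Hutchinson's trace estimator on a small explicit matrix. It is only the classical estimator, not the Itô density estimator proposed in the paper. In divergence calculations the matrix would be the Jacobian of a vector field and would be accessed through matrix-vector products; here it is written out as a list of lists for simplicity. The function name `hutchinson_trace`, the Rademacher probe vectors, and the example matrices are assumptions made for illustration.

```python
import random

def hutchinson_trace(matrix, num_samples, rng):
    """Estimate trace(matrix) as the average of v^T A v over random probes v.

    Each probe v has independent +1/-1 entries (Rademacher), for which
    E[v^T A v] = trace(A).
    """
    n = len(matrix)
    total = 0.0
    for _ in range(num_samples):
        v = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # Matrix-vector product A v, the only access to A the estimator needs.
        av = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        total += sum(v[i] * av[i] for i in range(n))
    return total / num_samples

# Hypothetical symmetric test matrix with trace 2 + 3 + 1 = 6.
A = [
    [2.0, 0.5, -0.5],
    [0.5, 3.0, 0.5],
    [-0.5, 0.5, 1.0],
]
exact_trace = sum(A[i][i] for i in range(3))
estimate = hutchinson_trace(A, num_samples=20000, rng=random.Random(0))

# For a diagonal matrix every probe gives the exact trace.
D = [[2.0, 0.0], [0.0, 5.0]]
diag_estimate = hutchinson_trace(D, num_samples=10, rng=random.Random(1))
```

The estimate is unbiased but noisy: its variance comes from the off-diagonal entries, which is why the diagonal example is exact while the first one only converges as the number of probes grows.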