Gemma 2: Improving Open Language Models at a Practical Size
Gemma Team Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
Surya Bhupatiraju
Léonard Hussenot
Thomas Mesnard
Bobak Shahriari
Alexandre Ramé
Johan Ferret
Peter Liu
Pouya Dehghani Tafti
Abe Friesen
Michelle Casbon
Sabela Ramos
Ravin Kumar
Charline Le Lan
Sammy Jerome
Anton Tsitsulin
Nino Vieillard
Piotr Stańczyk
Sertan Girgin
Nikola Momchev
Matt Hoffman
Shantanu Thakoor
Jean-Bastien Grill
Behnam Neyshabur
Alanna Walton
Aliaksei Severyn
Alicia Parrish
Aliya Ahmad
Allen Hutchison
Alvin Abdagic
Amanda Carl
Amy Shen
Andy Brock
Andy Coenen
Anthony Laforge
Antonia Paterson
Ben Bastian
Bilal Piot
Boxi Wu
Brandon Royal
Charlie Chen
Chintu Kumar
Chris Perry
Christopher A. Welty
Christopher A. Choquette-Choo
Danila Sinopalnikov
David Weinberger
Dimple Vijaykumar
Dominika Rogozińska
D. Herbison
Elisa Bandy
Emma Wang
Eric Noland
Erica Moreira
Evan Senter
Evgenii Eltyshev
Francesco Visin
Gabriel Rasskin
Gary Wei
Glenn Cameron
Gus Martins
Hadi Hashemi
Hanna Klimczak-Plucińska
Harleen Batra
Harsh Dhand
Ivan Nardini
Jacinda Mein
Jack Zhou
James Svensson
Jeff Stanway
Jetha Chan
Jin Zhou
Joana Carrasqueira
Joana Iljazi
Jocelyn Becker
Joe Fernandez
Joost Van Amersfoort
Josh Gordon
Josh Lipschultz
Joshua Newlan
Junsong Ji
Kareem Mohamed
Kartikeya Badola
Kat Black
Katie Millican
Keelin McDonell
Kelvin Nguyen
Kiranbir Sodhia
Kish Greene
Lars Lowe Sjoesund
Lauren Usui
Laurent Sifre
L. Heuermann
Leticia Lago
Lilly McNealus
Livio Baldini Soares
Logan Kilpatrick
Lucas Dixon
Luciano Martins
Machel Reid
Manvinder Singh
Mark Iverson
Martin Gorner
Mat Velloso
Mateo Wirth
Matt Davidow
Matt Miller
Matthew Rahtz
Matthew Watson
Meg Risdal
Mehran Kazemi
Michael Moynihan
Ming Zhang
Minsuk Kahng
Minwoo Park
Mofi Rahman
Mohit Khatwani
Natalie Dao
Nenshad Bardoliwalla
N. Devanathan
Neta Dumai
Nilay Chauhan
O. Wahltinez
Pankil Botarda
Parker Barnes
Paul R. Barham
Paul Michel
Pengchong Jin
Petko Georgiev
Phil Culliton
Pradeep Kuppala
Ramona Comanescu
Ramona Merhej
Reena Jana
R. Rokni
Ryan Mullins
Samaneh Saadat
S. M. Carthy
Sarah Perrin
Sébastien M. R. Arnold
Sebastian Krause
Shengyang Dai
S. Garg
Shruti Sheth
S. Ronstrom
Susan Chan
Timothy Jordan
Ting Yu
Tom Eccles
Tom Hennigan
Tomas Kocisky
Tulsee Doshi
Vihan Jain
Vikas Yadav
Vilobh Meshram
Vishal Dharmadhikari
Warren Barkley
Wei Wei
Wenming Ye
Woohyun Han
Woosuk Kwon
Xiang Xu
Zhe Shen
Zhitao Gong
Zichuan Wei
Victor Cotruta
Phoebe Kirk
Anand Rao
Minh Giang
Ludovic Peran
Tris Brian Warkentin
Eli Collins
Joelle Barral
Zoubin Ghahramani
Raia Hadsell
D. Sculley
Jeanine Banks
Anca Dragan
Slav Petrov
Oriol Vinyals
Jeffrey Dean
Demis Hassabis
Koray Kavukcuoglu
Clément Farabet
Elena Buchatskaya
Sebastian Borgeaud
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We also train the 2B and 9B models with knowledge distillation (Hinton et al., 2015) instead of next token prediction. The resulting models deliver the best performance for their size, and even offer competitive alternatives to models that are 2-3 times bigger. We release all our models to the community.
scHiCyclePred: a deep learning framework for predicting cell cycle phases from single-cell Hi-C data using multi-scale interaction information
Yingfu Wu
Zhenqi Shi
Xiangfei Zhou
Pengyu Zhang
Xiuhui Yang
Hao Wu
Assessing Programming Task Difficulty for Efficient Evaluation of Large Language Models
Florian Tambon
Amin Nikanjam
Giuliano Antoniol
Strong gravitational lensing as a probe of dark matter
Simona Vegetti
Simon Birrer
Giulia Despali
C. Fassnacht
Daniel A. Gilman
L.
J. McKean
D. Powell
Conor M. O'Riordan
G. Vernardos
Dark matter structures within strong gravitational lens galaxies and along their line of sight leave a gravitational imprint on the multiple images of lensed sources. Strong gravitational lensing provides, therefore, a key test of different dark matter models in a way that is independent of the baryonic content of matter structures on subgalactic scales. In this chapter, we describe how galaxy-scale strong gravitational lensing observations are sensitive to the physical nature of dark matter. We provide a historical perspective of the field, and review its current status. We discuss the challenges and advances in terms of data, treatment of systematic errors and theoretical predictions, that will enable one to deliver a stringent and robust test of different dark matter models in the near future. With the advent of the next generation of sky surveys, the number of known strong gravitational lens systems is expected to increase by several orders of magnitude. Coupled with high-resolution follow-up observations, these data will provide a key opportunity to constrain the properties of dark matter with strong gravitational lensing.
TaskEval: Assessing Difficulty of Code Generation Tasks for Large Language Models
Florian Tambon
Amin Nikanjam
Cyrine Zid
Giuliano Antoniol
AAPM task group report 288: Recommendations for guiding radiotherapy event narratives
Bruce Thomadsen
Ajay Kapur
Bette Blankenship
Barrett Caldwell
Lindsey Claps
Joanne Cunningham
Jennifer Elee
Suzanne Evans
Eric Ford
Debbie Gilley
Sandra Hayden
Kathleen Hintenlang
Rishabh Kapoor
Linda Kroger
Ksenija Kujundzic
Qing Liang
Sasa Mutic
Anita O'Donovan
Michael O'Hara
Zoubir Ouhib
Jatinder Palta
Todd Pawlicki
William Salter
Stacey Schmidt
Sugata Tripathi
Development of AI-assisted microscopy frameworks through realistic simulation with pySTED
Anthony Bilodeau
Albert Michaud-Gagnon
Julia Chabbert
Benoit Turcotte
Jörn Heine
Flavie Lavoie-Cardinal
The integration of artificial intelligence (AI) into microscopy systems significantly enhances performance, optimizing both the image acquisition and analysis phases. Development of AI-assisted super-resolution microscopy is often limited by the access to large biological datasets, as well as by the difficulties to benchmark and compare approaches on heterogeneous samples. We demonstrate the benefits of a realistic STED simulation platform, pySTED, for the development and deployment of AI-strategies for super-resolution microscopy. The simulation environment provided by pySTED allows the augmentation of data for the training of deep neural networks, the development of online optimization strategies, and the training of reinforcement learning models, that can be deployed successfully on a real microscope.
Implicitly Bayesian Prediction Rules in Deep Learning
Bruno Mlodozeniec
Richard Turner
The Bayesian approach leads to coherent updates of predictions under new data, which makes adhering to Bayesian principles appealing in decision-making contexts. Traditionally, integrating Bayesian principles into models like deep neural networks involves setting priors on parameters and approximating posteriors. This is done despite the fact that, typically, priors on parameters reflect any prior beliefs only insofar as they dictate function space behaviour. In this paper, we rethink this approach and consider what properties characterise a prediction rule as being Bayesian. Algorithms meeting such criteria can be deemed implicitly Bayesian: they make the same predictions as some Bayesian model, without explicitly manifesting priors and posteriors. We argue this might be a more fruitful approach towards integrating Bayesian principles into deep learning. In this paper, we propose how to measure how close a general prediction rule is to being implicitly Bayesian, and empirically evaluate multiple prediction strategies using our approach. We also show theoretically that agents relying on non-implicitly Bayesian prediction rules can be easily exploited in adversarial betting settings.
Long-term plasticity induces sparse and specific synaptic changes in a biophysically detailed cortical model
András Ecker
Daniela Egas Santander
Marwan Abdellah
Jorge Blanco Alonso
Sirio Bolaños-Puchet
Giuseppe Chindemi
Dhuruva Priyan Gowri Mariyappan
James B. Isbister
James King
Pramod Kumbhar
Ioannis Magkanaris
Michael W. Reimann