
Francesco Visin

Alumni

Publications

Gemma 3 Technical Report
Gemma Team
Aishwarya Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
Ramona Merhej
Tatiana Matejovicova
Alexandre Ramé
Morgane Rivière
Louis Rouillard
Geoffrey Cideron
Jean-Bastien Grill
Sabela Ramos
Edouard Yvinec
Michelle Casbon
Etienne Pot
Ivo Penchev
Gael Liu
Kathleen Kenealy
Lucas Beyer
Xiaohua Zhai
Anton Tsitsulin
Róbert Busa-Fekete
Alex Feng
Noveen Sachdeva
Benjamin Coleman
Yi Gao
Basil Mustafa
Iain Barr
Emilio Parisotto
David Tian
Matan Eyal
Colin Cherry
Jan-Thorsten Peter
Danila Sinopalnikov
Surya Bhupatiraju
Mehran Kazemi
Dan Malkin
Ravin Kumar
David Vilar
Idan Brusilovsky
Jiaming Luo
Andreas Steiner
Abe Friesen
Abhanshu Sharma
Abheesht Sharma
Adi Mayrav Gilady
Adrian Goedeckemeyer
Alaa Saade
Alexander Kolesnikov
Alexei Bendebury
Alvin Abdagic
Amit Vadi
András György
André Susano Pinto
Anil Das
Ankur Bapna
Antoine Miech
Antoine Yang
Antonia Paterson
Ashish Shenoy
Ayan Chakrabarti
Bilal Piot
Boxi Wu
Bobak Shahriari
Bryce Petrini
Charlie Chen
Christopher A. Choquette-Choo
CJ Carey
Cormac Brick
Daniel Deutsch
Danielle Eisenbud
Dee Cattle
Derek Cheng
Dimitris Paparas
Divyashree Shivakumar Sreepathihalli
Doug Reid
Dustin Tran
Dustin Zelle
Eric Noland
Erwin Huizenga
Eugene Kharitonov
Frederick Liu
Gagik Amirkhanyan
Glenn Cameron
Hadi Hashemi
Hanna Klimczak-Plucińska
Harman Singh
Harsh Mehta
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
Jetha Chan
Joe Stanton
J. Michael Wieting
Jonathan Lai
Jordi Orbay
Joe Fernandez
Joshua Newlan
Junsong Ji
Jyotinder Singh
Kat Black
Kathy Yu
Kevin Hui
Kiran N. Vodrahalli
Klaus Greff
Linhai Qiu
Marcella Valentine
Marina Coelho
Marvin Ritter
Matt Hoffman
Matthew Watson
Mayank Chaturvedi
Michael Moynihan
Min Ma
Nabila Babar
Natasha Noy
Nathan Byrd
Nick Roy
Nikola Momchev
Nilay Chauhan
Oskar Bunyan
Pankil Botarda
Paul Caron
Paul Kishan Rubenstein
Phil Culliton
Philipp Schmid
Pier Giuseppe Sessa
Pingmei Xu
Piotr Stańczyk
Pouya Dehghani Tafti
Rakesh Shivanna
Renjie Wu
Renke Pan
R. Rokni
Rob Willoughby
Rohith Vallu
Ryan Mullins
Sammy Jerome
Sara Smoot
Sertan Girgin
Shariq Iqbal
Shashir Reddy
Shruti Sheth
Siim Põder
Sijal Bhatnagar
S. Panyam
Sivan Eiger
Susan Zhang
Tianqi Liu
Trevor Yacovone
T. Liechty
Uday Kalra
Utku Evci
Vedant Misra
Vincent Roseberry
Vladimir Feinberg
Vlad Kolesnikov
Woohyun Han
Woosuk Kwon
X. T. Chen
Yinlam Chow
Yuvein Zhu
Zichuan Wei
Z. Egyed
Victor Cotruta
Minh Giang
Phoebe Kirk
Anand Rao
Jessica Lo
Erica Moreira
Luiz Gustavo Martins
Omar Sanseviero
Lucas Gonzalez
Zach Gleicher
Tris Brian Warkentin
Seyed Vahab Mirrokni
Evan Senter
Eli Collins
Joelle Barral
Zoubin Ghahramani
Raia Hadsell
Yossi Matias
D. Sculley
Slav Petrov
Noah Fiedel
Noam M. Shazeer
Oriol Vinyals
Jeffrey Dean
Demis Hassabis
Koray Kavukcuoglu
Clément Farabet
Elena Buchatskaya
Jean-Baptiste Alayrac
Rohan Anil
Dmitry Lepikhin
Sebastian Borgeaud
Olivier Bachem
Armand Joulin
Alek Andreev
Cassidy Hardin
Robert Dadashi
Léonard Hussenot
We introduce Gemma 3, a multimodal addition to the Gemma family of lightweight open models, ranging in scale from 1 to 27 billion parameters. This version introduces vision understanding abilities, a wider coverage of languages and longer context - at least 128K tokens. We also change the architecture of the model to reduce the KV-cache memory that tends to explode with long context. This is achieved by increasing the ratio of local to global attention layers, and keeping the span of local attention short. The Gemma 3 models are trained with distillation and achieve superior performance to Gemma 2 for both pre-trained and instruction-finetuned versions. In particular, our novel post-training recipe significantly improves the math, chat, instruction-following and multilingual abilities, making Gemma3-4B-IT competitive with Gemma2-27B-IT and Gemma3-27B-IT comparable to Gemini-1.5-Pro across benchmarks. We release all our models to the community.
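The KV-cache saving described in the abstract can be illustrated with a back-of-the-envelope sketch. This is not the Gemma 3 implementation; the layer count, interleaving ratio, and window length below are hypothetical numbers chosen only to show why mixing short-window local layers with a few full-context global layers shrinks the cache.

```python
# Illustrative sketch (not the Gemma 3 implementation): estimate the total
# number of cached key/value positions across layers, assuming local layers
# only cache a short sliding window while global layers cache the full context.

def kv_cache_positions(n_layers, locals_per_global, context_len, window_len):
    """Total cached positions, with `locals_per_global` local attention
    layers interleaved between consecutive global attention layers."""
    n_global = n_layers // (locals_per_global + 1)
    n_local = n_layers - n_global
    return n_global * context_len + n_local * window_len

# Hypothetical configuration: 32 layers, 128K-token context, 1K local window.
all_global = kv_cache_positions(32, 0, 128_000, 128_000)  # every layer global
mixed = kv_cache_positions(32, 5, 128_000, 1_000)         # 5 local per global

print(f"cache shrinks to {mixed / all_global:.1%} of the all-global baseline")
```

Raising the local:global ratio or shortening the window shrinks the cache further, at the cost of fewer layers that can attend over the whole context.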
PixelVAE: A Latent Variable Model for Natural Images
Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and model global structure well but have difficulty capturing small details. PixelCNN models details very well, but lacks a latent code and is difficult to scale for capturing large structures. We present PixelVAE, a VAE model with an autoregressive decoder based on PixelCNN. Our model requires very few expensive autoregressive layers compared to PixelCNN and learns latent codes that are more compressed than a standard VAE while still capturing most non-trivial structure. Finally, we extend our model to a hierarchy of latent variables at different scales. Our model achieves state-of-the-art performance on binarized MNIST, competitive performance on 64 × 64 ImageNet, and high-quality samples on the LSUN bedrooms dataset.
ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation
Adriana Romero
Matteo Matteucci
Marco Ciccone
We propose a structured prediction architecture, which exploits the local generic features extracted by Convolutional Neural Networks and the capacity of Recurrent Neural Networks (RNNs) to retrieve distant dependencies. The proposed architecture, called ReSeg, is based on the recently introduced ReNet model for image classification. We modify and extend it to perform the more challenging task of semantic segmentation. Each ReNet layer is composed of four RNNs that sweep the image horizontally and vertically in both directions, encoding patches or activations, and providing relevant global information. Moreover, ReNet layers are stacked on top of pre-trained convolutional layers, benefiting from generic local features. Upsampling layers follow ReNet layers to recover the original image resolution in the final predictions. The proposed ReSeg architecture is efficient, flexible and suitable for a variety of semantic segmentation tasks. We evaluate ReSeg on several widely-used semantic segmentation datasets: Weizmann Horse, Oxford Flower, and CamVid; achieving state-of-the-art performance. Results show that ReSeg can act as a suitable architecture for semantic segmentation tasks, and may have further applications in other structured prediction problems. The source code and model hyperparameters are available on https://github.com/fvisin/reseg.
Theano: A Python framework for fast computation of mathematical expressions
Rami Al-Rfou
Amjad Almahairi
Christof Angermueller
Frédéric Bastien
Justin Bayer
Anatoly Belikov
Alexander Belopolsky
Josh Bleecher Snyder
Pierre-Luc Carrier
Paul Christiano
Myriam Côté
Yann N. Dauphin
Julien Demouth
Sander Dieleman
Ziye Fan
Mathieu Germain
Matt Graham
Balázs Hidasi
Arjun Jain
Kai Jia
Mikhail Korobov
Vivek Kulkarni
Pascal Lamblin
Eric Larsen
Sean Lee
Simon Lefrancois
Jesse A. Livezey
Cory Lorenz
Jeremiah Lowin
Qianli Ma
Robert T. McGibbon
Mehdi Mirza
Alberto Orlandi
Christopher Pal
Colin Raffel
Daniel Renshaw
Matthew Rocklin
Adriana Romero
Markus Roth
Peter Sadowski
John Salvatier
Jan Schlüter
John Schulman
Gabriel Schwartz
Iulian Vlad Serban
Samira Shabanian
Sigurd Spieckermann
S. Ramana Subramanyam
Gijs van Tulder
Sebastian Urban
Dustin J. Webb
Matthew Willson
Lijun Xue
Theano is a Python library that allows one to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.