Publications

Bernt Schiele

2016-05-12

ArXiv (preprint)

Movie Description

Anna Rohrbach

Atousa Torabi

Marcus Rohrbach

Niket Tandon

Bernt Schiele

Audio description (AD) provides linguistic descriptions of movies and allows visually impaired people to follow a movie along with their peers. Such descriptions are by design mainly visual and thus naturally form an interesting data source for computer vision and computational linguistics. In this work we propose a novel dataset which contains transcribed ADs, which are temporally aligned to full length movies. In addition we also collected and aligned movie scripts used in prior work and compare the two sources of descriptions. We introduce the Large Scale Movie Description Challenge (LSMDC) which contains a parallel corpus of 128,118 sentences aligned to video clips from 200 movies (around 150 h of video in total). The goal of the challenge is to automatically generate descriptions for the movie clips. First we characterize the dataset by benchmarking different approaches for generating video descriptions. Comparing ADs to scripts, we find that ADs are more visual and describe precisely what is shown rather than what should happen according to the scripts created prior to movie production. Furthermore, we present and compare the results of several teams who participated in the challenges organized in the context of two workshops at ICCV 2015 and ECCV 2016.

2016-05-12

ArXiv (preprint)

Theano: A Python framework for fast computation of mathematical expressions

Rami Al-Rfou

Guillaume Alain

Amjad Almahairi

Christof Angermüller

Dzmitry Bahdanau

Nicolas Ballas

Frédéric Bastien

Justin S. Bayer

A. Belikov

A. Belopolsky

Yoshua Bengio

Arnaud Bergeron

J. Bergstra

Valentin Bisson

Josh Bleecher Snyder

Nicolas Bouchard

Nicolas Boulanger-Lewandowski

Xavier Bouthillier

Alexandre De Brébisson

Olivier Breuleux

Pierre Luc Carrier

Kyunghyun Cho

Jan Chorowski

Paul F. Christiano

Tim Cooijmans

Marc-Alexandre Côté

Myriam Côté

Yann Dauphin

Olivier Delalleau

Julien Demouth

Guillaume Desjardins

Sander Dieleman

Laurent Dinh

Mélanie Ducoffe

Vincent Dumoulin

Samira Ebrahimi Kahou

Dumitru Erhan

Ziye Fan

Orhan Firat

Mathieu Germain

Xavier Glorot

Ian J. Goodfellow

Matthew Graham

Caglar Gulcehre

Philippe Hamel

Iban Harlouchet

Jean-Philippe Heng

Balázs Hidasi

Sina Honari

Arjun Jain

Sébastien Jean

Kai Jia

Mikhail V. Korobov

Vivek Kulkarni

Alex Lamb

Pascal Lamblin

Eric P. Larsen

César Laurent

S. Lee

Simon-mark Lefrancois

Simon Lemieux

Nicholas Léonard

Zhouhan Lin

J. Livezey

Cory R. Lorenz

Jeremiah L. Lowin

Qianli M. Ma

Pierre-Antoine Manzagol

Olivier Mastropietro

R. McGibbon

Roland Memisevic

Bart van Merriënboer

Vincent Michalski

Mehdi Mirza

Alberto Orlandi

Razvan Pascanu

Mohammad Pezeshki

Colin Raffel

Daniel Renshaw

Matthew David Rocklin

Adriana Romero Soriano

Markus Roth

Peter Sadowski

John Salvatier

François Savard

Jan Schlüter

John D. Schulman

Gabriel Schwartz

Iulian V. Serban

Dmitriy Serdyuk

Samira Shabanian

Etienne Simon

Sigurd Spieckermann

S. Subramanyam

Jakub Sygnowski

Jérémie Tanguay

Gijs van Tulder

Joseph P. Turian

Sebastian Urban

Pascal Vincent

Francesco Visin

Harm de Vries

David Warde-Farley

Dustin J. Webb

M. Willson

Kelvin Xu

Lijun Xue

Li Yao

Saizheng Zhang

Ying Zhang

Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.

2016-05-09

ArXiv (preprint)

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues

Iulian V. Serban

Ryan Lowe

Sequential data often possesses hierarchical structures with complex dependencies between sub-sequences, such as found between the utterances in a dialogue. To model these dependencies in a generative framework, we propose a neural network-based generative architecture, with stochastic latent variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with other recent neural-network architectures. We evaluate the model performance through a human evaluation study. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate both the generation of meaningful, long and diverse responses and maintaining dialogue state.

2016-05-01

ArXiv (preprint)

Nash equilibria in the two-player kidney exchange game

Margarida Carvalho

Andrea Lodi

João Pedro Pedroso

Ana Luiza D'ávila Viana

2016-04-21

Mathematical Programming (published)