Publications

PhotoBot: Reference-Guided Interactive Photography via Natural Language

Oliver Limoyo

Jimmy Li

Dmitriy Rivkin

Jonathan Kelly

Gregory Dudek

We introduce PhotoBot, a framework for fully automated photo acquisition based on an interplay between high-level human language guidance and a robot photographer. We propose to communicate photography suggestions to the user via reference images that are selected from a curated gallery. We leverage a visual language model (VLM) and an object detector to characterize the reference images via textual descriptions and then use a large language model (LLM) to retrieve relevant reference images based on a user’s language query through text-based reasoning. To correspond the reference image and the observed scene, we exploit pretrained features from a vision transformer capable of capturing semantic similarity across marked appearance variations. Using these features, we compute suggested pose adjustments for an RGB-D camera by solving a perspective-n-point (PnP) problem. We demonstrate our approach using a manipulator equipped with a wrist camera. Our user studies show that photos taken by PhotoBot are often more aesthetically pleasing than those taken by users themselves, as measured by human feedback. We also show that PhotoBot can generalize to other reference sources such as paintings.

2024-01-19

ArXiv (preprint)

Bridging State and History Representations: Understanding Self-Predictive RL

Benjamin Eysenbach

Representations are at the core of all deep reinforcement learning (RL) methods for both Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). Many representation learning methods and theoretical frameworks have been developed to understand what constitutes an effective representation. However, the relationships between these methods and the shared properties among them remain unclear. In this paper, we show that many of these seemingly distinct methods and frameworks for state and history abstractions are, in fact, based on a common idea of self-predictive abstraction. Furthermore, we provide theoretical insights into the widely adopted objectives and optimization, such as the stop-gradient technique, in learning self-predictive representations. These findings together yield a minimalist algorithm to learn self-predictive representations for states and histories. We validate our theories by applying our algorithm to standard MDPs, MDPs with distractors, and POMDPs with sparse rewards. These findings culminate in a set of preliminary guidelines for RL practitioners.

2024-01-17

ArXiv (preprint)

Deployable Reinforcement Learning with Variable Control Rate

Yong Wang

Giovanni Beltrame

2024-01-17

ArXiv (preprint)

METhodological RadiomICs Score (METRICS): a quality scoring tool for radiomics research endorsed by EuSoMII

Burak Kocak

Tugba Akinci D’Antonoli

Nathaniel Mercaldo

Angel Alberich-Bayarri

Bettina Baessler

Ilaria Ambrosini

Anna E. Andreychenko

Spyridon Bakas

Regina G. H. Beets-Tan

Keno Bressem

Irene Buvat

Roberto Cannella

Luca Alessandro Cappellini

Armando Ugo Cavallo

Leonid L. Chepelev

Linda Chi Hang Chu

Aydin Demircioglu

Nandita M. deSouza

Matthias Dietzel

Salvatore Claudio Fanni

Andrey Fedorov

Laure S. Fournier

Valentina Giannini

Rossano Girometti

Kevin B. W. Groot Lipman

Georgios Kalarakis

Brendan S. Kelly

Michail E. Klontzas

Dow-Mu Koh

Elmar Kotter

Ho Yun Lee

Mario Maas

Luis Marti-Bonmati

Henning Müller

Nancy Obuchowski

Fanny Orlhac

Nikolaos Papanikolaou

Ekaterina Petrash

Elisabeth Pfaehler

Daniel Pinto dos Santos

Andrea Ponsiglione

Sebastià Sabater

Francesco Sardanelli

Philipp Seeböck

Nanna M. Sijtsema

Arnaldo Stanzione

Alberto Traverso

Lorenzo Ugga

Lisanne V. van Dijk

Joost J. M. van Griethuysen

Robbert W. van Hamersvelt

Peter van Ooijen

Federica Vernuccio

Alan Wang

Stuart Williams

Jan Witowski

Zhongyi Zhang

Alex Zwanenburg

Renato Cuocolo

2024-01-17

Insights into Imaging (published)
