Publications

VisPaD: Visualization and Pattern Discovery for Fighting Human Trafficking

Pratheeksha Nair

Yifei Li

Catalina Vajiac

Andreas Olligschlaeger

Meng-Chieh Lee

Namyong Park

Duen Horng Chau

Christos Faloutsos

Reihaneh Rabbany

Human trafficking analysts investigate groups of related online escort advertisements (called micro-clusters) to detect suspicious activities and identify various modus operandi. This task is complex as it requires finding patterns and linked meta-data across micro-clusters such as the geographical spread of ads, cluster sizes, etc. Additionally, drawing insights from the data is challenging without visualizing these micro-clusters. To address this, in close collaboration with domain experts, we built VisPaD, a novel interactive way for characterizing and visualizing micro-clusters and their associated meta-data, all in one place. VisPaD helps discover underlying patterns in the data by projecting micro-clusters in a lower dimensional space. It also allows the user to select micro-clusters involved in suspicious patterns and interactively examine them leading to faster detection and identification of trends in the data. A demo of VisPaD is also released.

2022-04-25

The Web Conference (published)

doi.org

Local Learning with Neuron Groups

Adeetya Patel

Michael Eickenberg

Eugene Belilovsky

2022-04-21

ICLR.cc/2022/Workshop/Cells2Societies (poster)

doi.org

openreview.net

Summarizing Societies: Agent Abstraction in Multi-Agent Reinforcement Learning

Amin Memarian

Maximilian Puelma Touzel

Matthew D Riemer

Rupali Bhati

Irina Rish

Agents cannot make sense of many-agent societies through direct consideration of small-scale, low-level agent identities, but instead must recognize emergent collective identities. Here, we take a first step towards a framework for recognizing this structure in large groups of low-level agents so that they can be modeled as a much smaller number of high-level agents, a process that we call agent abstraction. We illustrate this process by extending bisimulation metrics for state abstraction in reinforcement learning to the setting of multi-agent reinforcement learning and analyze a straightforward, if crude, abstraction based on experienced joint actions. It addresses non-stationarity due to other learning agents by improving minimax regret by an intuitive factor. To test if this compression factor provides signal for higher-level agency, we applied it to a large dataset of human play of the popular social dilemma game Diplomacy. We find that it correlates strongly with the degree of ground-truth abstraction of low-level units into the human players.

2022-04-21

ICLR.cc/2022/Workshop/Cells2Societies (poster)

openreview.net

A Strong Node Classification Baseline for Temporal Graphs

Farimah Poursafaei

Željko Žilić

Reihaneh Rabbany

2022-04-20

Proceedings of the 2022 SIAM International Conference on Data Mining (SDM) (published)

doi.org

Microscopy-BIDS: An Extension to the Brain Imaging Data Structure for Microscopy Data

Marie-Hélène Bourget

Lee Kamentsky

Satrajit S. Ghosh

Giacomo Mazzamuto

Alberto Lazari

Christopher J. Markiewicz

Robert Oostenveld

Guiomar Niso

Yaroslav O. Halchenko

Ilona Lipp

Sylvain Takerkart

Paule-Joanne Toussaint

Ali Raza Khan

Gustav Nilsonne

Filippo Maria Castelli

Julien Cohen-Adad

The Brain Imaging Data Structure (BIDS) is a specification for organizing, sharing, and archiving neuroimaging data and metadata in a reusable way. First developed for magnetic resonance imaging (MRI) datasets, the community-led specification evolved rapidly to include other modalities such as magnetoencephalography, positron emission tomography, and quantitative MRI (qMRI). In this work, we present an extension to BIDS for microscopy imaging data, along with example datasets. Microscopy-BIDS supports common imaging methods, including 2D/3D, ex/in vivo, micro-CT, and optical and electron microscopy. Microscopy-BIDS also includes comprehensible metadata definitions for hardware, image acquisition, and sample properties. This extension will facilitate future harmonization efforts in the context of multi-modal, multi-scale imaging such as the characterization of tissue microstructure with qMRI.

2022-04-19

Frontiers in Neuroscience (published)

doi.org

On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?

Nouha Dziri

Sivan Milton

Mo Yu

Osmar R Zaiane

Siva Reddy

Knowledge-grounded conversational models are known to suffer from producing factually invalid statements, a phenomenon commonly called hallucination. In this work, we investigate the underlying causes of this phenomenon: is hallucination due to the training data, or to the models? We conduct a comprehensive human study on both existing knowledge-grounded conversational benchmarks and several state-of-the-art models. Our study reveals that the standard benchmarks consist of > 60% hallucinated responses, leading to models that not only hallucinate but even amplify hallucinations. Our findings raise important questions on the quality of existing datasets and models trained using them. We make our annotations publicly available for future research.

2022-04-17

ArXiv (preprint)

doi.org

arxiv.org

Improving Passage Retrieval with Zero-Shot Question Generation

Devendra Singh Sachan

Mike Lewis

Mandar S. Joshi

Armen Aghajanyan

Wen-tau Yih

Joelle Pineau

Luke Zettlemoyer

We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. This approach can be applied on top of any retrieval method (e.g. neural or keyword-based), does not require any domain- or task-specific training (and therefore is expected to generalize better to data distribution shifts), and provides rich cross-attention between query and passage (i.e. it must explain every token in the question). When evaluated on a number of open-domain retrieval datasets, our re-ranker improves strong unsupervised retrieval models by 6%-18% absolute and strong supervised models by up to 12% in terms of top-20 passage retrieval accuracy. We also obtain new state-of-the-art results on full open-domain question answering by simply adding the new re-ranker to existing models with no further changes.

2022-04-15

ArXiv (preprint)

doi.org

arxiv.org

Evolution of cell size control is canalized towards adders or sizers by cell cycle structure and selective pressures

Felix Proulx-Giraldeau

J. Skotheim

Paul François

Cell size is controlled to be within a specific range to support physiological function. To control their size, cells use diverse mechanisms ranging from ‘sizers’, in which differences in cell size are compensated for in a single cell division cycle, to ‘adders’, in which a constant amount of cell growth occurs in each cell cycle. This diversity raises the question of why a particular cell would implement one mechanism rather than another. To address this question, we performed a series of simulations evolving cell size control networks. The size control mechanism that evolved was influenced by both cell cycle structure and specific selection pressures. Moreover, evolved networks recapitulated known size control properties of naturally occurring networks. If the mechanism is based on a G1 size control and an S/G2/M timer, as found for budding yeast and some human cells, adders likely evolve. But, if the G1 phase is significantly longer than the S/G2/M phase, as is often the case in mammalian cells in vivo, sizers become more likely. Sizers also evolve when the cell cycle structure is inverted so that G1 is a timer, while S/G2/M performs size control, as is the case for the fission yeast S. pombe. For some size control networks, cell size consistently decreases in each cycle until a burst of cell cycle inhibitor drives an extended G1 phase much like the cell division cycle of the green algae Chlamydomonas. That these size control networks evolved such self-organized criticality shows how the evolution of complex systems can drive the emergence of critical processes.

2022-04-14

bioRxiv (preprint)

doi.org

Masked Siamese Networks for Label-Efficient Learning

Mahmoud Assran

Mathilde Caron

Ishan Misra

Piotr Bojanowski

Florian Bordes

Pascal Vincent

Armand Joulin

Michael Rabbat

Nicolas Ballas

We propose Masked Siamese Networks (MSN), a self-supervised learning framework for learning image representations. Our approach matches the representation of an image view containing randomly masked patches to the representation of the original unmasked image. This self-supervised pre-training strategy is particularly scalable when applied to Vision Transformers since only the unmasked patches are processed by the network. As a result, MSNs improve the scalability of joint-embedding architectures, while producing representations of a high semantic level that perform competitively on low-shot image classification. For instance, on ImageNet-1K, with only 5,000 annotated images, our base MSN model achieves 72.4% top-1 accuracy, and with 1% of ImageNet-1K labels, we achieve 75.7% top-1 accuracy, setting a new state-of-the-art for self-supervised learning on this benchmark. Our code is publicly available.

2022-04-14

ArXiv (preprint)

doi.org

arxiv.org
