Publications

GFlowNet Foundations

Edward J Hu

Mo Tiwari

2021-11-17

ArXiv (preprint)

arxiv.org

Splitting, Renaming, Removing: A Study of Common Cleaning Activities in Jupyter Notebooks

Helen Dong

Shurui Zhou

Jin Guo

Christian Kästner

Data scientists commonly use computational notebooks because they provide a good environment for testing multiple models. However, once the scientist completes the code and finds the ideal model, he or she will have to dedicate time to clean up the code in order for others to easily understand it. In this paper, we perform a qualitative study on how scientists clean their code in hopes of being able to suggest a tool to automate this process. Our end goal is for tool builders to address possible gaps and provide additional aid to data scientists, who then can focus more on their actual work rather than the routine and tedious cleaning work. By sampling notebooks from GitHub and analyzing changes between subsequent commits, we identified common cleaning activities, such as changes to markdown (e.g., adding headers sections or descriptions) or comments (both deleting dead code and adding descriptions) as well as reordering cells. We also find that common cleaning activities differ depending on the intended purpose of the notebook. Our results provide a valuable foundation for tool builders and notebook users, as many identified cleaning activities could benefit from codification of best practices and dedicated tool support, possibly tailored depending on intended use.

2021-11-15

2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW) (published)

doi.org

Subtle Bugs Everywhere: Generating Documentation for Data Wrangling Code

Chenyang Yang

Shurui Zhou

Jin Guo

Christian Kästner

Data scientists reportedly spend a significant amount of their time in their daily routines on data wrangling, i.e. cleaning data and extracting features. However, data wrangling code is often repetitive and error-prone to write. Moreover, it is easy to introduce subtle bugs when reusing and adopting existing code, which results in reduced model quality. To support data scientists with data wrangling, we present a technique to generate documentation for data wrangling code. We use (1) program synthesis techniques to automatically summarize data transformations and (2) test case selection techniques to purposefully select representative examples from the data based on execution information collected with tailored dynamic program analysis. We demonstrate that a JupyterLab extension with our technique can provide on-demand documentation for many cells in popular notebooks and find in a user study that users with our plugin are faster and more effective at finding realistic bugs in data wrangling code.

2021-11-15

2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE) (published)

doi.org

ZERO: Playing Mathematical Programming Games

Gabriele Dragotto

S. Sankaranarayanan

Margarida Carvalho

Andrea Lodi

2021-11-15

ArXiv (preprint)

arxiv.org

Hidden Hypergraphs, Error-Correcting Codes, and Critical Learning in Hopfield Networks

Christopher Hillar

Tenzin Chan

Rachel Taubman

David Rolnick

In 1943, McCulloch and Pitts introduced a discrete recurrent neural network as a model for computation in brains. The work inspired breakthroughs such as the first computer design and the theory of finite automata. We focus on learning in Hopfield networks, a special case with symmetric weights and fixed-point attractor dynamics. Specifically, we explore minimum energy flow (MEF) as a scalable convex objective for determining network parameters. We catalog various properties of MEF, such as biological plausibility, and then compare to classical approaches in the theory of learning. Trained Hopfield networks can perform unsupervised clustering and define novel error-correcting coding schemes. They also efficiently find hidden structures (cliques) in graph theory. We extend this known connection from graphs to hypergraphs and discover n-node networks with robust storage of $2^{\Omega(n^{1-\epsilon})}$ memories for any $\epsilon > 0$. In the case of graphs, we also determine a critical ratio of training samples at which networks generalize completely.

2021-11-11

Entropy (published)

doi.org

OSSEM: one-shot speaker adaptive speech enhancement using meta learning

Cheng Yu

Szu‐wei Fu

Tsun-An Hsieh

Yu-shan Tsao

Mirco Ravanelli

Although deep learning (DL) has achieved notable progress in speech enhancement (SE), further research is still required for a DL-based SE system to adapt effectively and efficiently to particular speakers. In this study, we propose a novel meta-learning-based speaker-adaptive SE approach (called OSSEM) that aims to achieve SE model adaptation in a one-shot manner. OSSEM consists of a modified transformer SE network and a speaker-specific masking (SSM) network. In practice, the SSM network takes an enrolled speaker embedding extracted using ECAPA-TDNN to adjust the input noisy feature through masking. To evaluate OSSEM, we designed a modified Voice Bank-DEMAND dataset, in which one utterance from the testing set was used for model adaptation, and the remaining utterances were used for testing the performance. Moreover, we set restrictions allowing the enhancement process to be conducted in real time, and thus designed OSSEM to be a causal SE system. Experimental results first show that OSSEM can effectively adapt a pretrained SE model to a particular speaker with only one utterance, thus yielding improved SE results. Meanwhile, OSSEM exhibits a competitive performance compared to state-of-the-art causal SE systems.

2021-11-10

ArXiv (preprint)

doi.org

arxiv.org

The Cut and Play Algorithm: Computing Nash Equilibria via Outer Approximations

Margarida Carvalho

Gabriele Dragotto

Andrea Lodi

Sriram Sankaranarayanan

We introduce the Cut-and-Play, an efficient algorithm for computing equilibria in simultaneous non-cooperative games where players solve nonconvex and possibly unbounded optimization problems. Our algorithm exploits an intrinsic relationship between the equilibria of the original nonconvex game and the ones of a convexified counterpart. In practice, Cut-and-Play formulates a series of convex approximations of the original game and refines them with techniques from integer programming, for instance, cutting planes and branching operations. We test our algorithm on two families of challenging nonconvex games involving discrete decisions and bilevel programs, and we empirically demonstrate that it efficiently computes equilibria and outperforms existing game-specific algorithms.

2021-11-10

ArXiv (preprint)

arxiv.org

S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks

Xinlin Li

Bang Liu

Yaoliang Yu

Wulong Liu

Chunjing Xu

Vahid Partovi Nia

openreview.net

Active 3D Shape Reconstruction from Vision and Touch

Edward J. Smith

David Meger

Luis Pineda

Roberto Calandra

Jitendra Malik

Adriana Romero Soriano

Michal Drozdzal

Humans build 3D understandings of the world through active object exploration, using jointly their senses of vision and touch. However, in 3D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings, leaving the active exploration of the shape largely unexplored. In active touch sensing for 3D reconstruction, the goal is to actively select the tactile readings that maximize the improvement in shape reconstruction accuracy. However, the development of deep learning-based active touch models is largely limited by the lack of frameworks for shape exploration. In this paper, we focus on this problem and introduce a system composed of: 1) a haptic simulator leveraging high spatial resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile signals; and 3) a set of data-driven solutions with either tactile or visuotactile priors to guide the shape exploration. Our framework enables the development of the first fully data-driven solutions to active touch on top of learned models for object understanding. Our experiments show the benefits of such solutions in the task of 3D shape understanding where our models consistently outperform natural baselines. We provide our framework as a tool to foster future research in this direction.

openreview.net

Discrete-Valued Neural Communication

Dianbo Liu

Alex Lamb

Kenji Kawaguchi

Anirudh Goyal

Chen Sun

Michael Curtis Mozer

Yoshua Bengio

Deep learning has advanced from fully connected architectures to structured models organized into components, e.g., the transformer composed of positional elements, modular architectures divided into slots, and graph neural nets made up of nodes. In structured models, an interesting question is how to conduct dynamic and possibly sparse communication among the separate components. Here, we explore the hypothesis that restricting the transmitted information among components to discrete representations is a beneficial bottleneck. The motivating intuition is human language in which communication occurs through discrete symbols. Even though individuals have different understandings of what a "cat" is based on their specific experiences, the shared discrete token makes it possible for communication among individuals to be unimpeded by individual differences in internal representation. To discretize the values of concepts dynamically communicated among specialist components, we extend the quantization mechanism from the Vector-Quantized Variational Autoencoder to multi-headed discretization with shared codebooks and use it for discrete-valued neural communication (DVNC). Our experiments show that DVNC substantially improves systematic generalization in a variety of architectures -- transformers, modular architectures, and graph neural networks. We also show that the DVNC is robust to the choice of hyperparameters, making the method very useful in practice. Moreover, we establish a theoretical justification of our discretization process, proving that it has the ability to increase noise robustness and reduce the underlying dimensionality of the model.

openreview.net
