Publications

Hard-Constrained Deep Learning for Climate Downscaling
Prasanna Sattegeri
D. Szwarcman
Campbell Watson
The availability of reliable, high-resolution climate and weather data is important to inform long-term decisions on climate adaptation and mitigation and to guide rapid responses to extreme events. Forecasting models are limited by computational costs and, therefore, often generate coarse-resolution predictions. Statistical downscaling, including super-resolution methods from deep learning, can provide an efficient method of upsampling low-resolution data. However, despite achieving visually compelling results in some cases, such models frequently violate conservation laws when predicting physical variables. In order to conserve physical quantities, here we introduce methods that guarantee statistical constraints are satisfied by a deep learning downscaling model, while also improving its performance according to traditional metrics. We compare different constraining approaches and demonstrate their applicability across different neural architectures as well as a variety of climate and weather data sets. Besides enabling faster and more accurate climate predictions through downscaling, we also show that our novel methodologies can improve super-resolution for satellite data and natural image data sets.
A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction
Edward De Brouwer
Yanlei Zhang
Ian Adelstein
Diffusion-based manifold learning methods have proven useful in representation learning and dimensionality reduction of modern high dimensional, high throughput, noisy datasets. Such datasets are especially present in fields like biology and physics. While it is thought that these methods preserve underlying manifold structure of data by learning a proxy for geodesic distances, no specific theoretical links have been established. Here, we establish such a link via results in Riemannian geometry explicitly connecting heat diffusion to manifold distances. In this process, we also formulate a more general heat kernel based manifold embedding method that we call heat geodesic embeddings. This novel perspective makes clearer the choices available in manifold learning and denoising. Results show that our method outperforms existing state of the art in preserving ground truth manifold distances, and preserving cluster structure in toy datasets. We also showcase our method on single cell RNA-sequencing datasets with both continuum and cluster structure, where our method enables interpolation of withheld timepoints of data. Finally, we show that parameters of our more general method can be configured to give results similar to PHATE (a state-of-the-art diffusion based manifold learning method) as well as SNE (an attraction/repulsion neighborhood based method that forms the basis of t-SNE).
Hierarchical Distributed Energy Management Framework for Multiple Greenhouses Considering Demand Response
Ehsan Rezaei
Kianoosh Ojand
Greenhouses are a key component of modernised agriculture, aiming to produce high-quality crops and plants. Furthermore, a network of greenhouses has enormous potential as part of demand response programs. Saving energy during off-peak time, reducing power consumption and delaying the start time of subsystems during on-peak time are some strategies that can be used to limit power exchanged with the main grid. In this work, a hierarchical distributed alternating direction method of multipliers-based model predictive control framework is proposed that has two main objectives: 1) providing appropriate conditions for greenhouses' crops and plants to grow, and 2) limiting the total power exchanged with the main grid. At each time step in the framework, an aggregator coordinates the greenhouses to reach a consensus and limit the total electric power exchanged while managing shared resources, e.g., reservoir water. The proposed framework's performance is investigated through a case study.
High-Throughput Edge Inference for BERT Models via Neural Architecture Search and Pipeline
Hung-Yang Chang
Seyyed Hasan Mozafari
James J. Clark
Brett H. Meyer
Warren J. Gross
There has been growing interest in improving the BERT inference throughput on resource-constrained edge devices for a satisfactory user experience. One methodology is to employ heterogeneous computing, which utilizes multiple processing elements to accelerate inference. Another methodology is to deploy Neural Architecture Search (NAS) to find optimal solutions in the accuracy-throughput design space. In this paper, for the first time, we incorporate NAS with pipelining for BERT models. We show that performing NAS with pipelining achieves on average 53% higher throughput, compared to NAS with a homogeneous system. Additionally, we propose a NAS algorithm that incorporates hardware performance feedback to accelerate the NAS process. Our proposed NAS algorithm speeds up the search process by ~4x and 5.5x on the design spaces of BERT and CNNs, respectively. Also, by exploring the accuracy-throughput design space of BERT models, we demonstrate that performing pipelining then NAS (Pipeline-then-NAS) can lead to solutions with up to 9x higher inference throughput, compared to running homogeneous inference on the BERT-base model, with only a 1.3% decrease in accuracy.
Home alone: A population neuroscience investigation of brain morphology substrates
MaryAnn Noonan
Chris Zajner
As a social species, ready exchange with peers is a pivotal asset - our “social capital”. Yet, single-person households have come to pervade metropolitan cities worldwide, with unknown consequences in the long run. Here, we systematically explore the morphological manifestations associated with singular living in ∼40,000 UK Biobank participants. The uncovered population-level signature spotlights the highly associative default mode network, in addition to findings such as in the amygdala central, cortical and corticoamygdaloid nuclei groups, as well as the hippocampal fimbria and dentate gyrus. Sex-stratified analyses revealed male-specific neural substrates, including somatomotor, saliency and visual systems, while female-specific neural substrates centred on the dorsomedial prefrontal cortex. In line with our demographic profiling results, the discovered neural imprint of living alone is potentially linked to alcohol and tobacco consumption, anxiety, sleep quality as well as daily TV watching. The secular trend for solitary living will require new answers from public-health decision makers.
Homotopic local-global parcellation of the human cerebral cortex from resting-state functional connectivity
Xiaoxuan Yan
Ru Kong
Aihuiping Xue
Qing Yang
Csaba Orban
Lijun An
Avram J. Holmes
Xing Qian
Jianzhong Chen
Xi-Nian Zuo
Juan Helen Zhou
Marielle V Fortier
Ai Peng Tan
Peter Gluckman
Yap Seng Chong
Michael J Meaney
Simon B. Eickhoff
B.T. Thomas Yeo
Resting-state fMRI is commonly used to derive brain parcellations, which are widely used for dimensionality reduction and interpreting human neuroscience studies. We previously developed a model that integrates local and global approaches for estimating areal-level cortical parcellations. The resulting local-global parcellations are often referred to as the Schaefer parcellations. However, the lack of homotopic correspondence between left and right Schaefer parcels has limited their use for brain lateralization studies. Here, we extend our previous model to derive homotopic areal-level parcellations. Using resting-fMRI and task-fMRI across diverse scanners, acquisition protocols, preprocessing and demographics, we show that the resulting homotopic parcellations are as homogeneous as the Schaefer parcellations, while being more homogeneous than five publicly available parcellations. Furthermore, weaker correlations between homotopic parcels are associated with greater lateralization in resting network organization, as well as lateralization in language and motor task activation. Finally, the homotopic parcellations agree with the boundaries of a number of cortical areas estimated from histology and visuotopic fMRI, while capturing sub-areal (e.g., somatotopic and visuotopic) features. Overall, these results suggest that the homotopic local-global parcellations represent neurobiologically meaningful subdivisions of the human cerebral cortex and will be a useful resource for future studies. Multi-resolution parcellations estimated from 1479 participants are publicly available (https://github.com/ThomasYeoLab/CBIG/tree/master/stable_projects/brain_parcellation/Yan2023_homotopic).
How can intelligent systems revolutionise health care?
How Useful Are Educational Questions Generated by Large Language Models?
Sabina Elkins
Ekaterina Kochmar
Jackie CK Cheung
Iulian V. Serban
Human-Centered Responsible Artificial Intelligence: Current & Future Trends
Mohammad Tahaei
Marios Constantinides
Daniele Quercia
Sean Kennedy
Michael Muller
Simone Stumpf
Q. Vera Liao
Ricardo Baeza-Yates
Lora Aroyo
Jess Holbrook
Ewa Luger
Michael Madaio
Ilana Golbin Blumenfeld
Maria De-Arteaga
Jessica Vitak
A.R. Olteanu
HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution
Eric Nguyen
Michael Poli
Marjan Faizi
Armin W Thomas
Callum Birch-Sykes
Michael Wornow
Aman Patel
Clayton M. Rabideau
Stefano Ermon
Stephen Baccus
Christopher Re
Identification of Substitutable Context-Free Languages over Infinite Alphabets from Positive Data
Yutaro Numaya
Diptarama Hendrian
Ryo Yoshinaka
Ayumi Shinohara
François Coste
Faissal Ouardi
This paper is concerned with the identification in the limit from positive data of substitutable context-free languages (cfls) over infinite alphabets. Clark and Eyraud (2007) showed that substitutable cfls over finite alphabets are learnable in this learning paradigm. We show that substitutable cfls generated by grammars whose production rules may have predicates that represent sets of potentially infinitely many terminal symbols in a compact manner are learnable if the terminal symbol sets represented by those predicates are learnable, under a certain condition. This can be seen as a result parallel to Argyros and D’Antoni’s work (2018) that amplifies the query learnability of predicate classes to that of symbolic automata classes. Our result is the first that shows such amplification is possible for identifying some cfls in the limit from positive data.
Impact in Software Engineering Activities After One Year of COVID-19 Restrictions for Startups and Established Companies
Hosna Hooshyar
Eduardo Guerra
Jorge Melegati
Dron Khanna
Abdullah Aldaeej
Gerardo Matturro
Luciana Zaina
Des Greer
Usman Rafiq
Rafael Chanin
Xiaofeng Wang
Juan Garbajosa
Pekka Abrahamsson
Anh Nguyen-Duc
The restrictions imposed by the COVID-19 pandemic required software development teams to adapt, being forced to work remotely and adjust the software engineering activities accordingly. Among the studies evaluating these effects, few have assessed the impact on software engineering activities from a broader perspective and after a period of time when teams had time to adjust to the changes. No studies have been found comparing software startups and established companies either. This paper aims to investigate the impacts of COVID-19 on software development activities after one year of the pandemic restrictions, comparing the results between startups and established companies. Our approach was to design a cross-sectional survey and distribute it online among software development companies worldwide. The participants were asked about their perception of the COVID-19 pandemic's impact on different software engineering activities: requirements engineering, software architecture, user experience design, software implementation, and software quality assurance. The survey received 170 valid answers from 29 countries, and for all the software engineering activities, we found that most respondents did not observe a significant impact. The results also showed that software startups and established companies were affected differently since, in some activities, we found a negative impact in the former and a positive impact in the latter. Regarding the time spent on each software engineering activity, most of the answers reported no change, but on those that did, the result points to an increase in time. Thus, we cannot find any relation between the change in time of effort and the reported positive or negative impact.