Adrien Aumon

Random Forest Autoencoders for Guided Representation Learning

Kevin R. Moon

Jake S. Rhodes

Decades of research have produced robust methods for unsupervised data visualization, yet supervised visualization…

2025-10-22

logconference.io/LOG/2025/Conference (poster)

doi.org

openreview.net

Random Forest Autoencoders for Guided Representation Learning

Kevin R. Moon

Jake S. Rhodes

Decades of research have produced robust methods for unsupervised data visualization, yet supervised visualization…

2025-02-18

ArXiv (preprint)

arxiv.org

Gaining Biological Insights through Supervised Data Visualization

Jake S. Rhodes

Adrien Aumon

Sacha Morin

Marc Girard

Catherine Larochelle

Elsa Brunet-Ratnasingham

Amélie Pagliuzza

Lorie Marchitto

Wei Zhang

Adele Cutler

F. Grand'Maison

Anhong Zhou

Andrés Finzi

Nicolas Chomont

Daniel E. Kaufmann

Stephanie Zandee

Alexandre Prat

Guy Wolf

Kevin R. Moon

Dimensionality reduction-based data visualization is pivotal in comprehending complex biological data. The most common methods, such as PHATE, t-SNE, and UMAP, are unsupervised and therefore reflect the dominant structure in the data, which may be independent of expert-provided labels. Here we introduce a supervised data visualization method called RF-PHATE, which integrates expert knowledge for further exploration of the data. RF-PHATE leverages random forests to capture intricate feature-label relationships. Extracting information from the forest, RF-PHATE generates low-dimensional visualizations that highlight relevant data relationships while disregarding extraneous features. This approach scales to large datasets and applies to classification and regression. We illustrate RF-PHATE’s prowess through three case studies. In a multiple sclerosis study using longitudinal clinical and imaging data, RF-PHATE unveils a sub-group of patients with non-benign relapsing-remitting Multiple Sclerosis, demonstrating its aptitude for time-series data. In the context of Raman spectral data, RF-PHATE effectively showcases the impact of antioxidants on diesel exhaust-exposed lung cells, highlighting its proficiency in noisy environments. Furthermore, RF-PHATE aligns established geometric structures with COVID-19 patient outcomes, enriching interpretability in a hierarchical manner. RF-PHATE bridges expert insights and visualizations, promising knowledge generation. Its adaptability, scalability, and noise tolerance underscore its potential for widespread adoption.

2024-01-21

bioRxiv (preprint)

doi.org

Gaining Biological Insights through Supervised Data Visualization

Jake S. Rhodes

Adrien Aumon

Sacha Morin

Marc Girard

Catherine Larochelle

Boaz Lahav

Elsa Brunet-Ratnasingham

Amélie Pagliuzza

Lorie Marchitto

Wei Zhang

Adele Cutler

F. Grand'Maison

Anhong Zhou

Andrés Finzi

Nicolas Chomont

Daniel E. Kaufmann

Stephanie Zandee

Alexandre Prat

Guy Wolf

Kevin R. Moon

Dimensionality reduction-based data visualization is pivotal in comprehending complex biological data. The most common methods, such as PHATE, t-SNE, and UMAP, are unsupervised and therefore reflect the dominant structure in the data, which may be independent of expert-provided labels. Here we introduce a supervised data visualization method called RF-PHATE, which integrates expert knowledge for further exploration of the data. RF-PHATE leverages random forests to capture intricate feature-label relationships. Extracting information from the forest, RF-PHATE generates low-dimensional visualizations that highlight relevant data relationships while disregarding extraneous features. This approach scales to large datasets and applies to classification and regression. We illustrate RF-PHATE’s prowess through three case studies. In a multiple sclerosis study using longitudinal clinical and imaging data, RF-PHATE unveils a sub-group of patients with non-benign relapsing-remitting Multiple Sclerosis, demonstrating its aptitude for time-series data. In the context of Raman spectral data, RF-PHATE effectively showcases the impact of antioxidants on diesel exhaust-exposed lung cells, highlighting its proficiency in noisy environments. Furthermore, RF-PHATE aligns established geometric structures with COVID-19 patient outcomes, enriching interpretability in a hierarchical manner. RF-PHATE bridges expert insights and visualizations, promising knowledge generation. Its adaptability, scalability, and noise tolerance underscore its potential for widespread adoption.

2024-01-21

bioRxiv (preprint)

doi.org

Enhancing Supervised Visualization through Autoencoder and Random Forest Proximities for Out-of-Sample Extension

Shuang Ni

Adrien Aumon

Guy Wolf

Kevin R. Moon

Jake S. Rhodes

The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels. Common dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set. In this paper, we provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction method, RF-PHATE, combining information learned from the random forest model with the function-learning capabilities of autoencoders. Through quantitative assessment of various autoencoder architectures, we identify that networks that reconstruct random forest proximities are more robust for the embedding extension problem. Furthermore, by leveraging proximity-based prototypes, we achieve a 40% reduction in training time without compromising extension quality. Our method does not require label information for out-of-sample points, thus serving as a semi-supervised method, and can achieve consistent quality using only 10% of the training data.

2024-01-01

MLSP (published)

doi.org

arxiv.org
