
Toby Dylan Hocking

Associate Academic Member
Associate Professor, Université de Sherbrooke, Department of Computer Science
Research Topics
Medical Machine Learning
Deep Learning
Computational Biology
Data Mining
Optimization
Computer Vision

Biography

Originally from California and educated at Berkeley, Toby Dylan Hocking earned his PhD in mathematics (machine learning) at the École normale supérieure de Cachan (Paris, France) in 2012. He worked as a postdoc in Masashi Sugiyama's machine learning lab at Tokyo Tech in 2013, and in Guillaume Bourque's genomics lab at McGill University.

He was a tenure-track Assistant Professor at Northern Arizona University for five years and is now a tenured Associate Professor at the Université de Sherbrooke, where he leads the LASSO (Learning Algorithms, Statistical Software, Optimization) research lab. Toby is also an Associate Academic Member of Mila - Quebec Artificial Intelligence Institute.

He is the author of dozens of R packages and has published more than 50 peer-reviewed research papers on machine learning and statistical software. He has mentored more than 30 students on research projects, as well as more than 30 open-source software contributors with the R project through Google Summer of Code.

Publications

Finite Sample Complexity Analysis of Binary Segmentation
Binary segmentation is the classic greedy algorithm which recursively splits a sequential data set by optimizing some loss or likelihood function. Binary segmentation is widely used for changepoint detection in data sets measured over space or time, and as a sub-routine for decision tree learning. In theory it should be extremely fast for…
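The greedy recursion described in this abstract can be sketched in a few lines. The following is a minimal illustration under a squared-error loss with a fixed number of requested changepoints, not the paper's implementation; all function names are hypothetical:

```python
import numpy as np

def best_split(x):
    """Return (gain, index) for the split of x that most reduces
    total squared error, or (0.0, None) if no split is possible."""
    n = len(x)
    if n < 2:
        return 0.0, None
    csum = np.cumsum(x)          # prefix sums: each split scored in O(1)
    csum2 = np.cumsum(x ** 2)
    total = csum2[-1] - csum[-1] ** 2 / n
    best_gain, best_idx = 0.0, None
    for i in range(1, n):
        left = csum2[i - 1] - csum[i - 1] ** 2 / i
        right = (csum2[-1] - csum2[i - 1]) - (csum[-1] - csum[i - 1]) ** 2 / (n - i)
        gain = total - left - right
        if gain > best_gain:
            best_gain, best_idx = gain, i
    return best_gain, best_idx

def binary_segmentation(x, n_changepoints):
    """Greedy recursion: repeatedly split whichever current segment
    offers the largest loss reduction."""
    x = np.asarray(x, dtype=float)
    segments = [(0, len(x))]     # half-open [start, end) intervals
    changepoints = []
    for _ in range(n_changepoints):
        scored = [(best_split(x[s:e]), s, e) for s, e in segments]
        (gain, idx), s, e = max(scored, key=lambda t: t[0][0])
        if idx is None or gain <= 0:
            break                # no split improves the loss
        changepoints.append(s + idx)
        segments.remove((s, e))
        segments += [(s, s + idx), (s + idx, e)]
    return sorted(changepoints)
```

On a noiseless series with segment means 0, 5, 1 (20 points each), this recovers changepoints at positions 20 and 40.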
SOAK: Same/Other/All K-fold cross-validation for estimating similarity of patterns in data subsets
Gabrielle Thibault
C. S. Bodine
Paul Nelson Arellano
Alexander F Shenkin
Olivia J. Lindly
In many real-world applications of machine learning, we are interested to know if it is possible to train on the data that we have gathered so far, and obtain accurate predictions on a new test data subset that is qualitatively different in some respect (time period, geographic region, etc). Another question is whether data subsets are similar enough so that it is beneficial to combine subsets during model training. We propose SOAK, Same/Other/All K-fold cross-validation, a new method which can be used to answer both questions. SOAK systematically compares models which are trained on different subsets of data, and then used for prediction on a fixed test subset, to estimate the similarity of learnable/predictable patterns in data subsets. We show results of using SOAK on six new real data sets (with geographic/temporal subsets, to check if predictions are accurate on new subsets), 3 image pair data sets (subsets are different image types, to check that we get smaller prediction error on similar images), and 11 benchmark data sets with predefined train/test splits (to check similarity of predefined splits).
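A toy version of the Same/Other/All comparison can be sketched as follows. This is not the paper's SOAK implementation: it substitutes a trivial predict-the-mean model and two hypothetical subsets with different means, only to show how training on Same vs. Other vs. All data changes test error:

```python
import numpy as np

def kfold_indices(n, k, rng):
    """Random K-fold partition of range(n)."""
    return np.array_split(rng.permutation(n), k)

def fit_mean(y_train):
    """Trivial stand-in model: predict the training-set mean."""
    return float(np.mean(y_train))

def soak(y_by_subset, k=3, seed=0):
    """For each subset, estimate K-fold test error when training on
    Same / Other / All data; returns {subset: {train_set: mean MSE}}."""
    rng = np.random.default_rng(seed)
    subsets = list(y_by_subset)
    folds = {s: kfold_indices(len(y_by_subset[s]), k, rng) for s in subsets}
    errs = {s: {"same": [], "other": [], "all": []} for s in subsets}
    for s in subsets:
        y_s = y_by_subset[s]
        other = np.concatenate([y_by_subset[t] for t in subsets if t != s])
        for f in range(k):
            test = y_s[folds[s][f]]
            same = np.concatenate([y_s[folds[s][g]] for g in range(k) if g != f])
            for name, train in [("same", same), ("other", other),
                                ("all", np.concatenate([same, other]))]:
                pred = fit_mean(train)
                errs[s][name].append(float(np.mean((test - pred) ** 2)))
    return {s: {n: float(np.mean(v)) for n, v in d.items()}
            for s, d in errs.items()}

# two hypothetical geographic subsets with clearly different patterns
y = {"region_A": np.random.default_rng(1).normal(0.0, 1.0, 90),
     "region_B": np.random.default_rng(2).normal(5.0, 1.0, 90)}
err = soak(y)
```

Because the two subsets follow different patterns here, training on the Other subset hurts: `err["region_A"]["other"]` is much larger than `err["region_A"]["same"]`. For similar subsets the "all" error would instead be smallest, which is the signal SOAK uses to decide whether combining subsets helps.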
Enhancing Changepoint Detection: Penalty Learning through Deep Learning Techniques
Tung L. Nguyen
Changepoint detection, a technique for identifying significant shifts within data sequences, is crucial in various fields such as finance, genomics, medicine, etc. Dynamic programming changepoint detection algorithms are employed to identify the locations of changepoints within a sequence, which rely on a penalty parameter to regulate the number of changepoints. To estimate this penalty parameter, previous work uses simple models such as linear or tree-based models. This study introduces a novel deep learning method for predicting penalty parameters, leading to demonstrably improved changepoint detection accuracy on large benchmark supervised labeled datasets compared to previous methods.
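For context on the penalty parameter this abstract refers to, here is a minimal sketch of the Optimal Partitioning dynamic program with a squared-error segment cost, where a larger penalty yields fewer detected changepoints. This is an illustrative O(n²) implementation, not the authors' code:

```python
import numpy as np

def optimal_partitioning(x, penalty):
    """Optimal Partitioning with squared-error segment cost: minimize
    the sum of segment SSEs plus `penalty` per changepoint (O(n^2) DP)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    csum = np.concatenate([[0.0], np.cumsum(x)])
    csum2 = np.concatenate([[0.0], np.cumsum(x ** 2)])

    def sse(i, j):               # squared error of segment x[i:j]
        s, s2, m = csum[j] - csum[i], csum2[j] - csum2[i], j - i
        return s2 - s * s / m

    F = np.full(n + 1, np.inf)   # F[t]: best cost of the prefix x[:t]
    F[0] = -penalty
    prev = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        cands = [F[s] + sse(s, t) + penalty for s in range(t)]
        prev[t] = int(np.argmin(cands))
        F[t] = cands[prev[t]]
    cps, t = [], n               # backtrack the chosen changepoints
    while t > 0:
        if prev[t] > 0:
            cps.append(int(prev[t]))
        t = prev[t]
    return sorted(cps)
```

On a noiseless series with segment means 0, 5, 1 (20 points each), penalty 50 recovers changepoints [20, 40] while penalty 1000 returns none: the penalty directly regulates the changepoint count, which is why predicting it well matters.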
Penalty Learning for Optimal Partitioning using Multilayer Perceptron
Tung L. Nguyen
Changepoint detection is a technique used to identify significant shifts in sequences and is widely used in fields such as finance, genomics, and medicine. To identify the changepoints, dynamic programming (DP) algorithms, particularly the Optimal Partitioning (OP) family, are widely used. To control the changepoints count, these algorithms use a fixed penalty to penalize the changepoints presence. To predict the optimal value of that penalty, existing methods used simple models such as linear or tree-based, which may limit predictive performance. To address this issue, this study proposes using a multilayer perceptron (MLP) with a ReLU activation function to predict the penalty. The proposed model generates continuous predictions -- as opposed to the stepwise ones in tree-based models -- and handles non-linearity better than linear models. Experiments on large benchmark genomic datasets demonstrate that the proposed model improves accuracy and F1 score compared to existing models.
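The continuous-versus-stepwise contrast in this abstract can be illustrated directly: a ReLU MLP is a continuous piecewise-linear function of its input, while a regression stump jumps at its threshold. The weights below are hand-picked for illustration only, not learned, and the feature/penalty mapping is hypothetical:

```python
import numpy as np

def mlp_predict(x, W1, b1, W2, b2):
    """One-hidden-layer ReLU MLP forward pass: a continuous,
    piecewise-linear function of the input."""
    h = np.maximum(0.0, x @ W1 + b1)
    return (h @ W2 + b2).ravel()

def stump_predict(x, threshold, left, right):
    """Depth-1 regression tree: stepwise constant, jumps at threshold."""
    return np.where(x[:, 0] <= threshold, left, right)

# hand-picked (not learned) weights mapping one feature, e.g. a log
# sequence length, to a predicted log(penalty)
W1 = np.array([[1.0, -1.0]]); b1 = np.array([0.0, 4.0])
W2 = np.array([[1.2], [-0.8]]); b2 = np.array([0.5])

x = np.linspace(3.9, 4.1, 5).reshape(-1, 1)   # inputs straddling x = 4
mlp = mlp_predict(x, W1, b1, W2, b2)
tree = stump_predict(x, threshold=4.0, left=1.0, right=3.0)
# mlp changes smoothly across x = 4; tree jumps from 1.0 to 3.0
```

The smooth MLP output is the property the paper argues helps penalty prediction, since nearby sequences get nearby penalties rather than falling on opposite sides of a tree split.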
Automated River Substrate Mapping From Sonar Imagery With Machine Learning
C. S. Bodine
D. Buscombe
Reply to: Model uncertainty obscures major driver of soil carbon
Feng Tao
Benjamin Z. Houlton
Serita D. Frey
Johannes Lehmann
Stefano Manzoni
Yuanyuan Huang
Lifen Jiang
Umakant Mishra
Bruce A. Hungate
Michael W. I. Schmidt
Markus Reichstein
Nuno Carvalhais
Philippe Ciais
Ying-Ping Wang
Bernhard Ahrens
Gustaf Hugelius
Xingjie Lu
Zheng Shi
Kostiantyn Viatkin
Ronald Vargas
Yusuf Yigini
Christian Omuto
Ashish A. Malik
Guillermo Peralta
Rosa Cuevas-Corona
Luciano E. Di Paolo
Isabel Luotto
Cuijuan Liao
Yi-Shuang Liang
Yixin Liang
Vinisa S. Saynes
Xiaomeng Huang
Yiqi Luo
Functional Labeled Optimal Partitioning
Jacob M. Kaufman
Alyssa J. Stenberg
Deep Learning Approach for Changepoint Detection: Penalty Parameter Optimization
Tung L. Nguyen
Changepoint detection, a technique for identifying significant shifts within data sequences, is crucial in various fields such as finance, genomics, medicine, etc. Dynamic programming changepoint detection algorithms are employed to identify the locations of changepoints within a sequence, which rely on a penalty parameter to regulate the number of changepoints. To estimate this penalty parameter, previous work uses simple models such as linear models or decision trees. This study introduces a novel deep learning method for predicting penalty parameters, leading to demonstrably improved changepoint detection accuracy on large benchmark supervised labeled datasets compared to previous methods.