Weihua Shi

PhD - McGill University

Supervisor

Archer Yang

Research Topics

Deep Learning

Optimization

Reinforcement Learning

Publications

Beyond Go/No-Go Decisions: A Regional Selection Framework for Uncertainty-Aware Molecule Screening

Weihua Shi

Yixuan Li

Tian Bai

Yijie Zhang

Kaiqiong Zhao

Marc‐André Legault

Hui Peng

Yue Zhao

Eric D. Kolaczyk

Xiang Yu

Archer Y. Yang

In drug discovery, quantitative structure–activity relationship (QSAR) models are widely used to guide Go/No-Go decisions within the Design–Make–Test–Analyze (DMTA) cycle. However, conventional decision heuristics typically rely on a single cutoff, leading to a rigid binary select/discard paradigm. This approach is particularly ill-suited for borderline compounds near the decision boundary, where screening decisions are especially sensitive to prediction uncertainty and premature choices may either discard viable leads or advance likely failures, thereby increasing downstream assay costs. To address this limitation, we propose Regional Selection (RS), an uncertainty-aware three-way decision framework that partitions compounds into Predicted Pass, Predicted Fail, and Predicted Indeterminate regions. By explicitly reserving high-uncertainty compounds for targeted follow-up, RS avoids the pitfalls of premature binary classification. We formalize this framework through Regional Selection Inference (RSI), which casts region assignment as a multiple-hypothesis testing problem. We develop two implementations of RSI: an empirical calibration-based method (RSI-EC), which thresholds uncertainty-normalized scores via empirical calibration, and a conformal selection-based method (RSI-CS), which constructs conformal p-values for region assignment. RSI-EC is supported by large-sample calibration arguments, whereas RSI-CS provides finite-sample, distribution-free guarantees under exchangeability. Extensive evaluations across 15 high-dimensional QSAR benchmarks show that both RSI procedures reliably control the false discovery rate while maintaining high screening power. In limited-data regimes, RSI-CS yields particularly stable FDR control, whereas RSI-EC can be slightly less conservative; both perform strongly as sample sizes increase.
We further study a cost-aware extension that incorporates asymmetric downstream costs through the score construction while keeping the nominal FDR target fixed. This extension introduces a tuning parameter that can reduce realized downstream cost, with dataset-dependent trade-offs against screening power. Overall, RSI offers a mathematically grounded and resource-aware alternative to single-threshold screening, allowing discovery teams to better balance decision confidence with assay budgets.
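
To make the idea of conformal p-values for region assignment concrete, here is a minimal Python sketch. It is not the RSI-CS procedure from the paper, whose details are not given in this abstract. It assumes a generic recipe: one-sided conformal p-values computed from a held-out calibration set, followed by the Benjamini-Hochberg step-up rule in each direction. The function names, the use of raw predictions as scores, and the rule that a compound rejected in both directions is left indeterminate are all hypothetical choices made for illustration, and no FDR guarantee is claimed for the combined three-way rule.

```python
import numpy as np

def conformal_pvalues(cal_pred, cal_is_null, test_pred):
    """One-sided conformal p-values (illustrative, not the paper's method).

    p_j = (1 + #{null calibration points with prediction >= test_pred_j}) / (n + 1),
    where n is the full calibration set size.
    """
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_is_null = np.asarray(cal_is_null, dtype=bool)
    test_pred = np.asarray(test_pred, dtype=float)
    null_preds = np.sort(cal_pred[cal_is_null])
    # searchsorted(side="left") counts null predictions strictly below each test value.
    counts = len(null_preds) - np.searchsorted(null_preds, test_pred, side="left")
    return (1 + counts) / (len(cal_pred) + 1)

def benjamini_hochberg(pvals, alpha):
    """Boolean rejection mask from the standard BH step-up rule."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()
        reject[order[:k + 1]] = True
    return reject

def three_way_regions(cal_pred, cal_y, test_pred, cutoff, alpha=0.1):
    """Hypothetical pass / fail / indeterminate assignment."""
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_y = np.asarray(cal_y, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)
    # "Pass" evidence: reject H0: y <= cutoff when the prediction is unusually high.
    p_pass = conformal_pvalues(cal_pred, cal_y <= cutoff, test_pred)
    # "Fail" evidence: reject H0: y > cutoff when the prediction is unusually low
    # (negating predictions flips the direction of the test).
    p_fail = conformal_pvalues(-cal_pred, cal_y > cutoff, -test_pred)
    is_pass = benjamini_hochberg(p_pass, alpha)
    is_fail = benjamini_hochberg(p_fail, alpha)
    labels = np.full(len(test_pred), "indeterminate", dtype=object)
    labels[is_pass & ~is_fail] = "pass"
    labels[is_fail & ~is_pass] = "fail"
    return labels
```

A call such as three_way_regions(cal_pred, cal_y, test_pred, cutoff=6.0, alpha=0.1) returns one label per test compound; compounds whose p-values are not small enough in either direction stay in the indeterminate region for follow-up.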

2026-05-27

ChemRxiv (accepted)

doi.org
