Portrait of Soumya Sharma is unavailable

Soumya Sharma

PhD - McGill University
Supervisor
Co-supervisor
Research Topics
Creativity
Deep Learning
Natural Language Processing
Reasoning

Publications

Robust Reward Modeling via Causal Rubrics
Pragya Srivastava
Harman Singh
Rahul Madhavan
Sravanti Addepalli
Arun Suggala
Rengarajan Aravamudhan
Anirban Laha
Aravindan Raghuveer
Karthikeyan Shanmugam
Reward models (RMs) for LLM alignment often exhibit reward hacking, mistaking spurious correlates (e.g., length, format) for causal quality … (see more)drivers (e.g., factuality, relevance), leading to brittle RMs. We introduce CROME (Causally Robust Reward Modeling), a causally-grounded framework using targeted augmentations to mitigate this. CROME employs: (1) Causal Augmentations, pairs isolating specific causal attribute changes, to enforce sensitivity, and (2) Neutral Augmentations, tie-labeled pairs varying spurious attributes while preserving causal content, to enforce invariance. Crucially, augmentations target LLM-identified causal rubrics, requiring no prior knowledge of spurious factors. CROME significantly outperforms baselines on RewardBench (Avg +5.4\%, Safety +13.2\%, Reasoning +7.2\%) and demonstrates enhanced robustness via improved Best-of-N performance across RewardBench, WildGuardTest, and GSM8k.