Portrait of Nour Shaheen

Nour Shaheen

Lab Representative
Research Master's
Principal supervisor
Co-supervisor
Research Topics
Deep Learning
Foundation Models
Natural Language Processing

Biography

Nour is a second-year master's student at Polytechnique Montréal, supervised by professors Amine Mhedhbi and Sarath Chandar. Her research focuses on foundation models for tabular data and on model merging methods. She is passionate about science, good coffee, and the idea of working in an environment where people genuinely enjoy being present.

Publications

Is Depth Heterogeneity a Barrier to Model Merging?
Model merging offers a way to combine the capabilities of several networks at test time without retraining or additional finetuning, but most merging methods assume identical architectures. Depth differences are commonly viewed as a major obstacle because they remove clear layer correspondences. We test this assumption by merging residual networks that differ only in depth, using a simple training-free pipeline based on identity expansion and permutation alignment. Across both same-task and multitask image classification experiments, heterogeneous merges closely match homogeneous ones. The results suggest that, for residual networks, depth mismatch is not the main barrier to effective model merging, and that the main difficulty in model merging comes from aligning independently trained weights in a homogeneous setting.
Towards Optimizing SQL Generation via LLM Routing
Mohammadhossein Malekpour
Text-to-SQL enables users to interact with databases through natural language, simplifying access to structured data. Although highly capable large language models (LLMs) achieve strong accuracy for complex queries, they incur unnecessary latency and dollar cost for simpler ones. In this paper, we introduce the first LLM routing approach for Text-to-SQL, which dynamically selects the most cost-effective LLM capable of generating accurate SQL for each query. We present two routing strategies (score- and classification-based) that achieve accuracy comparable to the most capable LLM while reducing costs. We design the routers for ease of training and efficient inference. In our experiments, we highlight a practical and explainable accuracy-cost trade-off on the BIRD dataset.