Publications

Canadian Spine Society

Antoine Dionne

Majeed Al-Zakri

Hubert Labelle

Julie Joncas

Baron Lonner

Ali Eren

Patrick J Cahill

Peter Newton

Liisa Jaakkimainen

Teresa To

Maryse Bouchard

Sarah Hardy

Dilani Thevarajah

Rajendra Sakhrekar

Ayesha Hadi

Andrea Doria

Aya Mitani

Andrew Howard

Samuel Yoon

Karen Mathias

Tracey Bastrom

Amer Samdani

Marjolaine Roy-Beaudry

Marie Beausejour

Rachelle Imbeault

Justin Dufresne

Stefan Parent

Jessica Romeo

Holly Livock

Kevin Smit

James Jarvis

Andrew Tice

Vivien K. Chan

Robert Cho

Selina Poon

David L. Skaggs

Geoffrey K. Shumilak

Brett Rocos

Juan P. Sardi

Anastasios Charalampidis

Jeff Gum

Peter S. Tretiakov

Oluwatobi Onafowokan

Jamshaid Mir

Ankita Das

Tyler Williamson

Pooja Dave

Bailey Imbo

Jordan Lebovic

Pawel Jankowski

Peter G. Passias

Yousef Aljamaan

Vishal P. Varshney

Ramesh Sahjpaul

Jill Osborn

Rémi Pelletier-Roy

Michael Asmussen

Manjot Birk

Taryn Ludwig

Fred Nicholls

Ariel Zohar

Janneke Loomans

Ferran Pellise

Justin Smith

So Kato

Zeeshan Sardar

Lawrence G. Lenke

Stephen J. Lewis

Aazad Abbas

Jay Toor

Gurjovan Sahi

Dusan Kovacevic

Johnathan Lex

Firoz Miyanji

Anthony V. Perruccio

Nizar Mahomed

Mayilee Canizares

Yousef Kamel

Galil Osman

Nikolaus Koegl

Brandon Herrington

Renan R. Fernandes

Jennifer C. Urquhart

Ramtin Hakimjavadi

Zachary DeVries

Noah Fine

Laura Stone

Mohit Kapoor

Alexandre Chenevert

Sonia Bédard

Julien Goulet

Jerome Couture

Bernard LaRue

Meaghan Rye

Alexa Roussac

Neda Naghdi

Luciana G. Macedo

James Elliott

Richard DeMont

Véronique Pepin

Z. Wang

Maroun Rizkallah

Jesse Shen

Michel Alexandre Lebreton

Edisond Florial

Fidaa Alshakfa

Ghassan Boubez

Abdullah A.S.M. AlDuwaisan

Kim Phan

Sarah Nowell

Niels Wedderkopp

Michael Craig

Abdul Al-Shawwa

Kalum Ost

Saswati Tripathy

Bradley W. Jacobs

Nathan Evaniew

Chris Bailey

W. Bradley Jacobs

Andrew Nataraj

David W. Cadotte

Kenneth C. Thomas

Hamilton Hall

Eva Y. Liu

Amit R.L. Persad

Nathan Baron

Daryl Fourney

Jingyi Huang

Thamer Alfawaz

Tinghua Zhang

CSORN Investigators

Karlo M. Pedro

Mohammed Ali Alvi

Jessica C.W. Wang

Nicolas Dea

Tamir Ailon

Scott Paquette

John Street

Charlotte Dandurand

R. Mumtaz

Khaled Skaik

Eugene K. Wai

Alexandra Stratton

Ragavan Manoharan

Jenna Smith-Forrester

JoAnne E. Douglas

Evan Nemeth

Jacob Alant

Sean Barry

Andrew Glennie

William Oxner

Lutz M. Weise

Sabahat Saeed

Patrick Toyota

Jack Su

Braeden Newton

Nicole Coote

Maria S. Rachevits

Helen Razmjou

Susan Robarts

Albert Yee

Joel Finkelstein

Alysa Almojuela

Frederick Zeiler

Sarvesh Logsetty

Perry Dhaliwal

Mark Abdelnour

Yuxin Zhang

Stephen P. Kingwell

Philippe Phan

Taylor A. Smith

Michael Bond

Stephan Dombrowski

Gwyneth Price

Jose Manuel García-Moreno

Steven Qiu

Vithushan Surendran

Victoria Shi Emily Cheung

Sophie Ngana

Muhammad A. Qureshi

Sunjay V. Sharma

Markian Pahuta

Daipayan Guha

Ahmed Essa

Husain Shakil

James Byrne

Andrew S. Jack

Francois Mathieu

Eva Yuan

Christopher W. Smith

Erin M. Harrington

Rachel H. Jaffe

Alick P. Wang

Karim Ladha

Avery B. Nathens

Ryan V. Sandarage

Ahmad Galuta

Eve C. Tsai

Naama Rotem-Kohavi

Marcel Dvorak

Jijie Xu

Nader Fallah

Zeina Waheed

Melody Chen

Vanessa K. Noonan

Toluyemi Malomo

Charles G. Fisher

Rachael Jaffe

Peter Coyte

Brian Chan

Armaan Malhotra

Rebecca Hancock-Howard

Jefferson R. Wilson

Christopher D. Witiw

Newton Cho

Jordan Squair

Viviana Aureli

Nicholas James

Lea Bole-Feysot

Inssia Dewany

Nicolas Hankov

Laetitia Baud

Anna Leonhartsberger

Kristina Sveistyte

Michael Skinnider

Matthieu Gautier

Katia Galan

Maged Goubran

Jimmy Ravier

Frederic Merlos

Laura Batti

Stéphane Pagès

Nadia Bérard

Nadine Intering

Camille Varescon

Stefano Carda

Kay Bartholdi

Thomas Hutson

Claudia Kathe

Michael Hodara

Mark Anderson

Bogdan Draganski

Robin Demesmaeker

Leonie Asboth

Quentin Barraud

Jocelyne Bloch

Gregoire Courtine

Sean D. Christie

Ryan Greene

Mustafa Nadi

Bill Oxner

Lisa Julien

Clara Lownie

Cumhur F.C. Öner

Alexander Joeris

K. Schnake

Mark Phillips

Alexander R. Vaccaro

Richard Bransford

Eugen Cezar Popescu

Mohammed El-Sharkawi

Shanmuganathan Rajasekaran

Lorin M. Benneker

Greg D. Schroeder

Jin W. Tee

John France

Jérôme Paquet

Richard Allen

William F. Lavelle

Emiliano Vialle

David Magnuson

Andréane Richard-Denis

Yvan Petit

Francis Bernard

Dorothy Barthélemy

Lukas Grassner

Daniel Garcia-Ovejero

Evelyn Beyerer

Orpheus Mach

Iris Leister

Doris Maier

Ludwig Aigner

Angel Arevalo-Martin

Mark Alexander MacLean

Antoinette Charles

Raphaële Charest-Morin

Rory Goodwin

Michael H. Weber

Emile Brouillard

Ismail Laassassy

Paul Khoueir

Étienne Bourassa-Moreau

Gilles Maurais

Jean-Marc Mac-Thiong

Julien Francisco Zaldivar-Jolissaint

Aysha Allard Brown

Kitty So

Neda Manouchehri

Megan Webster

Jay Ethridge

Audrey Warner

Avril Billingsley

Rochelle Newsome

Kirsten Bale

Andrew Yung

Mehara Seneviratne

Jimmy Cheng

Jing Wang

Shenani Basnayake

Femke Streijger

Manraj Heran

Piotr Kozlowski

Brian K. Kwon

Jeff D. Golan

Lior M. Elkaim

Qais Alrashidi

Miltiadis Georgiopoulos

Oliver Lasry

Drew A. Bednar

Alyson Love

Soroush Nedaie

Pranjan Gandhi

Prarthan C. Amin

Christopher J. Neilsen

Amanda Vandewint

Y. Raja Rampersaud

Jeffrey Hebert

Eden Richardson

Jillian Kearney

Raja Rampersaud

Aditya Raj

Nanadan Marathe

Greg McIntosh

Manmeet Dhiman

Taylor J. Bader

David Hart

Ganesh Swamy

Neil Duncan

Dragana Ponjevic

John R. Matyas

Connor P. O’Brien

Erin Bigney

Edward Abraham

Neil Manson

Najmedden Attabib

Chris Small

Luke LaRochelle

Gabriella Rivas

James Lawrence

Robert Ravinsky

Lily S. Switzer

David E. Lebel

Chanelle Montpetit

Nicolas Vaillancourt

Emma Nadler

Jennifer A. Dermott

Dorothy J. Kim

Brent Rosenstein

Daniel Wolfe

Geoffrey Dover

Mathieu Boily

Maryse Fortin

Jetan Badhiwala

Vishu Karthikeyan

Yingshi He

Michael G. Fehlings

2024-11-13

Canadian journal of surgery. Journal canadien de chirurgie (published)

doi.org

Abstract 4142894: Multimorbidity Trajectories Across the Lifespan in Patients with Congenital Heart Disease

Chao Li

Aihua Liu

Solomon Bendayan

Liming Guo

Judith Therrien

Archer Yang

Robyn Tamblyn

Jay Brophy

Yue Li

Ariane Marelli

Background: Thanks to advances in medical care, patients with congenital heart disease (CHD) now survive to adulthood but face elevated risks of both cardiac and non-cardiac complications. Understanding the trajectories of comorbidity development over a patient's lifespan is a cornerstone of optimizing care and is expected to improve long-term health outcomes. Research Aim: This study aims to investigate the temporal sequences and evolution of comorbidities in CHD patients across their lifespan. We hypothesize that multimorbidity trajectories in CHD patients are linked to CHD lesion severity and age at onset of specific comorbidities. Methods: Using the Quebec CHD database, which comprises data on outpatient visits, hospitalization records, and vital status from 1983 to 2017, we designed a longitudinal cohort study evaluating the development of 39 comorbidities coded using ICD-9/10. Temporal sequences were mapped using median age of onset. Associations between disease pairs were quantified by hazard ratios from Cox proportional hazard models adjusting for age, sex, genetic syndrome, and competing risks of death, and taking into account the time-varying nature of the predictor diseases. Results: The cohort included 9,764 individuals with severe and 127,729 with non-severe CHD lesions. In severe CHD patients, most comorbidities developed between ages 25 and 40. Comorbidity progression began with childhood cardiovascular diseases, followed by systemic diseases such as diabetes, liver and kidney diseases, and advanced to heart failure and dementia in middle adulthood. In addition, mental disorders emerged in early adulthood and were associated with subsequent development of kidney diseases and dementia. Different trajectories were observed in non-severe CHD patients, with disease onsets 2-3 decades later and no difference in onset timing between cardiovascular and systemic complications (Figure).
Conclusions: Distinct multimorbidity trajectories were observed in CHD patients by CHD lesion severity. In patients with severe CHD lesions, early systemic diseases significantly influenced subsequent complications. These findings highlight the need for well-timed surveillance guidelines and interventions to improve health outcomes.

2024-11-12

Circulation (published)

doi.org

Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset

Khaoula Chehbouni

Jonathan Colacco-Carr

Yash More

Jackie Ck Cheung

Golnoosh Farnadi

In an effort to mitigate the harms of large language models (LLMs), learning from human feedback (LHF) has been used to steer LLMs towards outputs that are intended to be both less harmful and more helpful. Despite the widespread adoption of LHF in practice, the quality of this feedback and its effectiveness as a safety mitigation technique remain unclear. This study addresses these issues by auditing the widely-used Helpful and Harmless (HH) dataset by Anthropic. Our work includes: (1) a thorough investigation of the dataset's content through both manual and automated evaluation; (2) experiments demonstrating the dataset's impact on models' safety; and (3) an analysis of the 100 most influential papers citing this dataset. Through our audit, we showcase how conceptualization failures and quality issues identified in the HH dataset can create additional harms by leading to disparate safety behaviors across demographic groups. Our findings highlight the need for more nuanced, context-sensitive approaches to safety mitigation in LLMs.

2024-11-12

ArXiv (preprint)

doi.org

arxiv.org

Fault Localization in Deep Learning-based Software: A System-level Approach

Mohammad Mehdi Morovati

Amin Nikanjam

Foutse Khomh

Over the past decade, Deep Learning (DL) has become an integral part of our daily lives. This surge in DL usage has heightened the need for developing reliable DL software systems. Given that fault localization is a critical task in reliability assessment, researchers have proposed several fault localization techniques for DL-based software, primarily focusing on faults within the DL model. While the DL model is central to DL components, there are other elements that significantly impact the performance of DL components. As a result, fault localization methods that concentrate solely on the DL model overlook a large portion of the system. To address this, we introduce FL4Deep, a system-level fault localization approach that considers the entire DL development pipeline to effectively localize faults across DL-based systems. In an evaluation using 100 faulty DL scripts, FL4Deep outperformed four previous approaches in terms of accuracy for three out of six DL-related faults, including issues related to data (84%), mismatched libraries between training and deployment (100%), and loss function (69%). Additionally, FL4Deep demonstrated superior precision and recall in fault localization for five categories of faults: the three fault types mentioned above, plus insufficient training iterations and activation function.

2024-11-12

ArXiv (preprint)

doi.org

arxiv.org

Investigating the Effectiveness of Explainability Methods in Parkinson's Detection from Speech

Paolo Torroni

Speech impairments in Parkinson's disease (PD) provide significant early indicators for diagnosis. While models for speech-based PD detection have shown strong performance, their interpretability remains underexplored. This study systematically evaluates several explainability methods to identify PD-specific speech features, aiming to support the development of accurate, interpretable models for clinical decision-making in PD diagnosis and monitoring. Our methodology involves (i) obtaining attributions and saliency maps using mainstream interpretability techniques, (ii) quantitatively evaluating the faithfulness of these maps and their combinations obtained via union and intersection through a range of established metrics, and (iii) assessing the information conveyed by the saliency maps for PD detection from an auxiliary classifier. Our results reveal that, while explanations are aligned with the classifier, they often fail to provide valuable information for domain experts.

2024-11-12

ArXiv (preprint)

doi.org

arxiv.org
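
The abstract above mentions combining saliency maps "via union and intersection". A minimal sketch of what such a combination could look like follows, assuming each explainability method yields one attribution score per input feature and that maps are binarized by keeping the top-k scores. The function names, the top-k rule, and the toy scores are illustrative assumptions, not the paper's procedure.

```python
from typing import List


def top_k_mask(scores: List[float], k: int) -> List[bool]:
    """Binarize an attribution map by keeping the k highest-scoring features (assumed rule)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:k])
    return [i in keep for i in range(len(scores))]


def union(a: List[bool], b: List[bool]) -> List[bool]:
    """A feature is salient if either method marks it."""
    return [x or y for x, y in zip(a, b)]


def intersection(a: List[bool], b: List[bool]) -> List[bool]:
    """A feature is salient only if both methods mark it."""
    return [x and y for x, y in zip(a, b)]


# Toy attribution scores from two hypothetical explainability methods.
method_a = [0.9, 0.1, 0.5, 0.2]
method_b = [0.2, 0.8, 0.7, 0.1]

mask_a = top_k_mask(method_a, k=2)
mask_b = top_k_mask(method_b, k=2)
combined_union = union(mask_a, mask_b)
combined_intersection = intersection(mask_a, mask_b)
```

The union is the more permissive map and the intersection the more conservative one; which of the two is more faithful is the kind of question the abstract says is evaluated with established metrics.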

Refining SARS-CoV-2 Intra-host Variation by Leveraging Large-scale Sequencing Data

Fatima Mostefai

Jean-Christophe Grenier

Raphael Poujol

Julie Hussin

2024-11-12

NAR Genomics and Bioinformatics (published)

doi.org

Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs

Megh Thakkar

Yash More

Quentin Fournier

Matthew D Riemer

Pin-Yu Chen

Amal Zouaq

Payel Das

Sarath Chandar

There is a growing interest in training domain-expert LLMs that excel in specific technical fields compared to their general-purpose instruction-tuned counterparts. However, these expert models often experience a loss in their safety abilities in the process, making them capable of generating harmful content. As a solution, we introduce an efficient and effective merging-based alignment method called MergeAlign that interpolates the domain and alignment vectors, creating safer domain-specific models while preserving their utility. We apply MergeAlign on Llama3 variants that are experts in medicine and finance, obtaining substantial alignment improvements with minimal to no degradation on domain-specific benchmarks. We study the impact of model merging through model similarity metrics and contributions of individual models being merged. We hope our findings open new research avenues and inspire more efficient development of safe expert LLMs.

2024-11-11

ArXiv (preprint)

doi.org

arxiv.org
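
The MergeAlign abstract above describes interpolating "domain and alignment vectors". A rough sketch of that idea follows, assuming each vector is the difference between a fine-tuned model's weights and a shared base model's weights, and that the merge is a linear interpolation with a single coefficient `alpha`. The helper names, the toy weights, and the exact interpolation formula are assumptions for illustration; the abstract does not specify them.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # parameter name -> flattened values


def task_vector(finetuned: Weights, base: Weights) -> Weights:
    """Difference between fine-tuned and base weights, per parameter."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def merge_interpolated(
    base: Weights, domain: Weights, aligned: Weights, alpha: float = 0.5
) -> Weights:
    """Add an alpha-weighted mix of the domain and alignment vectors to the base (assumed rule)."""
    domain_vec = task_vector(domain, base)
    align_vec = task_vector(aligned, base)
    return {
        name: [
            b + alpha * d + (1.0 - alpha) * a
            for b, d, a in zip(base[name], domain_vec[name], align_vec[name])
        ]
        for name in base
    }


# Toy single-parameter "models" sharing one base.
base = {"w": [0.0, 1.0]}
domain = {"w": [2.0, 1.0]}   # hypothetical domain-expert weights
aligned = {"w": [0.0, 3.0]}  # hypothetical safety-aligned weights

merged = merge_interpolated(base, domain, aligned, alpha=0.5)
```

With `alpha=1.0` this sketch returns the domain expert unchanged and with `alpha=0.0` the aligned model, so the coefficient makes the knowledge-safety trade-off in the title explicit.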

Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks

Madeline Brumley

Joe Kwon

David Scott Krueger

Dmitrii Krasheninnikov

Usman Anwar

A key objective of interpretability research on large language models (LLMs) is to develop methods for robustly steering models toward desired behaviors. To this end, two distinct approaches to interpretability, "bottom-up" and "top-down", have been presented, but there has been little quantitative comparison between them. We present a case study comparing the effectiveness of representative vector steering methods from each branch: function vectors (FV; arXiv:2310.15213), as a bottom-up method, and in-context vectors (ICV; arXiv:2311.06668) as a top-down method. While both aim to capture compact representations of broad in-context learning tasks, we find they are effective only on specific types of tasks: ICVs outperform FVs in behavioral shifting, whereas FVs excel in tasks requiring more precision. We discuss the implications for future evaluations of steering methods and for further research into top-down and bottom-up steering given these findings.

2024-11-11

ArXiv (preprint)

doi.org

arxiv.org
