Dan Poenaru

Associate Academic Member
Professor, McGill University, Department of Pediatric Surgery
Research Topics
AI in Health
Medical Machine Learning

Biography

Dan Poenaru is a professor of pediatric surgery at McGill University and a senior scientist at the research institute of the McGill University Health Centre. He holds master's degrees in health professions education and in international development, and a doctorate in health strategy and management. Poenaru is a Fonds de recherche du Québec - Santé (FRQS)- and Canadian Institutes of Health Research (CIHR)-funded investigator in patient-centred surgical care, head of the McGill CommiSur Lab, director of the Jean-Martin Laberge Fellowship in Global Pediatric Surgery, and a founding member of the Global Initiative for Children's Surgery (GICS).

His current areas of academic interest are technology-assisted surgical communication and medical education (including AI, VR, and digital health devices); patient-centred surgical care; and the development of global surgical research capacity.

Current Students

PhD - McGill University
Master's Research - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University (Principal supervisor)
PhD - McGill University
PhD - McGill University
PhD - Université de Sherbrooke (Co-supervisor)
PhD - Université de Sherbrooke (Co-supervisor)
Master's Research - McGill University

Publications

Assessing Language Bias in Pediatric Surgical Systematic Reviews: A Meta-epidemiological Study.
Dunya Moghul
Elena Guadagno
Robert Baird
Patient safety culture in the operating room of African hospitals: a systematic review
Jacques Fadhili Bake
Naïcen Ghanmi
Elena Guadagno
K. M. Claude
Tsongo Kibendelwa Zacharie
Patient safety in operating rooms has globally improved through interventions such as the World Health Organization (WHO) Surgical Safety Checklist and multidisciplinary team training. However, while evidence from high-income countries is well documented, there remains limited consolidated knowledge on the understanding, application, and effectiveness of safety culture interventions in African surgical settings, which this review seeks to address. This systematic review examined factors and protocols affecting surgical safety in African operating rooms. We hypothesized that persistent systemic barriers undermine safety culture despite adoption of global measures. Following PRISMA 2020, we searched eight databases (Medline, Embase, Cochrane, Africa-Wide, CINAHL, Global Health, Global Index Medicus, Web of Science) from inception to 5 December 2024, using variations of text words present in the title, abstract, or keyword fields, alongside relevant subject headings, to identify articles addressing surgical safety and culture throughout Africa. Included studies involved operating room professionals in African countries and used quantitative, qualitative, or mixed-methods designs. We excluded non-operating room settings, patient-only studies, inaccessible full texts, reviews, editorials, letters, conference abstracts, and duplicates. Two reviewers independently screened and appraised studies using the Mixed Methods Appraisal Tool. Findings were synthesized narratively with subgroup analysis by study type and theme. Out of 9,875 identified records, 22 studies from 12 African countries (2014–2024) met inclusion criteria, with Ethiopia contributing the highest number (n = 4). Various assessment tools, including the Hospital Survey on Patient Safety Culture, the Safety Attitudes Questionnaire, and the National Surgical, Obstetric, and Anaesthesia Plans interview manual, revealed recurring challenges: inadequate non-punitive responses to errors, communication barriers, hierarchical structures, and resource constraints. Four interventions showed promise: implementation and training on the WHO Surgical Safety Checklist, Safe Surgery 2020 initiatives, Non-Technical Skills for Surgeons training, and multidisciplinary training. The heterogeneity of study designs, sample sizes, and outcome measures limited direct comparisons and precluded meta-analysis. Nonetheless, the review highlights persistent barriers and emerging opportunities to strengthen patient safety culture in African operating rooms. While the WHO Surgical Safety Checklist remains valuable, sustainable progress requires multi-level strategies that address systemic constraints and incorporate context-sensitive adaptations. PROSPERO, CRD42024627076.
Comparing Virtual Reality Trauma Training Across Diverse Clinical Backgrounds: A Mixed-Methods Study in Canada And India.
Boaz Laor
Samia Benabess
S. Kundu
Ayla Gerk
F. Botelho
Jean-Robert Kwizera
Arjunaditya Kundu
Tom Dolby
Elena Guadagno
Dhruva Ghosh
Vishal Micheal
Rohit Theodore
Thejus Varghese
Large language models for electronic health records in pediatric and surgical care: a systematic review.
Waseem Abu-Ashour
Elena Guadagno
Suspected Biliary Atresia in Brazil: Impact of Regional Healthcare Variations on Diagnostic Timeliness
Luiza Telles
Paulo Henrique Moreira Melo
Ana Maria Bicudo Diniz
Gabriele Lech
Ayla Gerk
Lauren Kratky
David P. Mooney
Joaquim Bustorff-Silva
The Impact of Pediatric Surgery Global Travel Fellowships: A Study by the Canadian Association of Paediatric Surgeons Global Partnership Committee.
Sacha Williams
Natasha Bejjani
Elena Guadagno
Robert Baird
Shahrzad Joharifard
Melanie Morris
Robin Petroze
Sherif Emil
Perspective on patient and non-academic partner engagement for the responsible integration of large language models in health chatbots
Nikhil Jaiswal
Yuanchao Ma
Bertrand Lebouché
Marie-Pascale Pomey
Sofiane Achiche
David Lessard
Kim Engler
Zully Montiel
Hector Acevedo
Rodrigo Rosa Gameiro
Leo Anthony Celi
Esli Osmanlliu
Uses of large language models (LLMs) in health chatbots are expanding into high-stakes clinical contexts, heightening the need for tools that are evidence-based, accountable, accurate, and patient-centred. This conceptual, practice-informed Perspective reflects on engaging patients and non-academic partners for the responsible integration of LLMs, grounded in the co-construction of MARVIN (for people living with HIV) and in an emerging collaboration with MIT Critical Data. Organised by the Software Development Life Cycle, we describe: conception/needs assessment with patient partners to identify use cases, acceptable trade-offs, and privacy expectations; development that prioritises grounding via vetted sources, structured human feedback, and data-validation committees including patient partners; testing and evaluation using patient-reported outcome measures (PROMs) and patient-reported experience measures (PREMs) chosen in collaboration with patients to capture usability, acceptability, trust, and perceived safety, alongside task performance and harmful-output monitoring; and implementation via diverse governance boards, knowledge-mobilisation materials to set expectations, and risk-management pathways for potentially unsafe outputs. Based on our experience with MARVIN, we recommend early and continuous engagement of patients and non-academic partners, fair compensation, shared decision-making power, transparent decision logging, and inclusive, adaptable governance that can evolve with changing models and standards. These lessons highlight how patient partnership can directly shape chatbot design and oversight, helping teams align LLM-enabled tools with patient-centred goals while building accountable, safe, and equitable systems.

Health chatbots powered by large language models (LLMs) can make medical information more accessible, but most are developed without meaningful input from the people who will use them. This risks unsafe answers, hidden bias, and tools that mainly work for privileged groups. Our team built a chatbot called MARVIN to support people living with HIV, and we are now adapting it for cancer care and children's health. Patients, caregivers, and community partners shaped what MARVIN should do, chose which sources it should trust, and tested early versions. Their feedback led to concrete improvements including clearer language, more relevant features, and safeguards against misinformation. We are also partnering with MIT Critical Data, which brings patients, members of the public, clinicians, engineers, and policymakers together at events to find and fix bias in medical AI. We have learned that technical fixes alone are not enough: trust, fairness, and accountability require active involvement of diverse users at every stage. Based on these lessons, we recommend: (1) including patients and non-academic partners from the start so their insights can shape core design decisions; (2) compensating them fairly so participation is sustainable; (3) giving them real decision-making power so their input is not tokenistic; and (4) being transparent about the limits of AI so expectations are realistic. In our experience, responsible health AI depends on the lived expertise of the people it serves.
Synthetic Validation of Pediatric Trust Instruments using Persona-Driven Large Language Models
Katya Loban
Elena Guadagno
Trust is foundational to patient-physician relationships and is associated with improved care-seeking and adherence in primary care. However, validated trust instruments for pediatric emergency and surgical contexts are lacking, and traditional instrument development is slow and resource-intensive. Large language models (LLMs) could streamline the validation process by serving as scalable, systematic expert panel surrogates. We developed four new trust assessment instruments: two for patient-families and two for physicians. Two-phase content validation was conducted using two parallel synthetic and human expert panels. Synthetic panels consisted of three persona-prompted LLMs (Claude Sonnet 4, GPT-5, Grok 4). Human panels served as traditional comparators. Scale-Content Validity Index (S-CVI) and Fleiss' kappa (κ) acceptance thresholds were set at ≥0.80. Combined human–synthetic expert panels revealed substantial inter-rater reliability across all instruments. Fleiss' κ values for dimensional validation were: patient-family = 0.84 (95% CI [0.72, 0.96]), physician = 0.87 (95% CI [0.72, 1.00]); contextual validation: patient-family = 0.83 (95% CI [0.73, 0.93]), physician = 0.88 (95% CI [0.80, 0.96]). All instruments exceeded S-CVI ≥0.80 thresholds across both validation phases. Persona-prompted LLMs demonstrated comparable validity outcomes to human experts while accelerating validation timelines from months to weeks. Future research needs to evaluate this approach across psychometric testing phases. This synthetic instrument validation methodology offers a scalable blueprint for healthcare measurement development, enabling faster creation of validated tools to support evidence-based patient care.
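As a rough illustration of the content-validity arithmetic in this abstract, the sketch below computes item-level CVIs and the S-CVI/Ave from a panel's 4-point relevance ratings, using the conventional rule that ratings of 3 or 4 count as "relevant." The ratings matrix and panel size here are invented for illustration; they are not data from the study.

```python
# Minimal S-CVI/Ave sketch (illustrative; hypothetical ratings, not study data).
# Item-level CVI = proportion of raters scoring the item 3 or 4 on a 1-4
# relevance scale; S-CVI/Ave = mean of the item-level CVIs.

from statistics import mean

# Rows = instrument items, columns = raters (human or persona-prompted LLM).
ratings = [
    [4, 4, 3, 4, 3, 4],
    [3, 4, 4, 4, 4, 3],
    [4, 3, 4, 2, 4, 4],
    [4, 4, 4, 4, 3, 4],
]

def item_cvi(item_ratings):
    """Proportion of raters judging the item relevant (rating >= 3)."""
    return sum(r >= 3 for r in item_ratings) / len(item_ratings)

item_cvis = [item_cvi(item) for item in ratings]
s_cvi_ave = mean(item_cvis)

print("Item CVIs:", [round(c, 2) for c in item_cvis])
print(f"S-CVI/Ave = {s_cvi_ave:.2f} (acceptance threshold >= 0.80)")
```

The Fleiss' κ agreement values reported above would be computed separately over the same kind of rating table (e.g., with statsmodels' fleiss_kappa), since κ corrects observed agreement for agreement expected by chance.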
Risk factors for catastrophic healthcare expenditure and high economic burden for children with anorectal malformations in Southwestern Uganda
Felix Oyania
Caroline Q. Stephens
Sarah Ullrich
Amy M. Shui
Meera Kotagal
Godfrey Zari Rukundo
Joseph Ngonzi
Ava Yap
Francis Bajunirwe
Doruk Ozgediz
Intersectionality in Surgical Care in LMICs: A Systematic Scoping Review
Ayla Gerk
Elena Guadagno
Justina Seyi-Olajide
Dunya Moghul
Joaquim Bustorff-Silva
Cristina Camargo
Predictive Performance Precision Analysis in Medicine: Identification of low-confidence predictions at patient and profile levels (MED3pa I)
Félix Camirand Lemyre
Jean-François Ethier
Lyna Hiba Chikouche
Ludmila Amriou
Artificial Intelligence models are increasingly used in healthcare, yet global performance metrics can mask variations in reliability across individual patients or subgroups with shared attributes, called patient profiles. This study introduces MED3pa, a method that identifies when models are less reliable, allowing clinicians to better assess model limitations. We propose a framework that estimates predictive confidence using three combined approaches: Individualized (IPC), Aggregated (APC), and Mixed Predictive Confidence (MPC). IPC estimates confidence for each patient, APC assesses it across profiles, and MPC combines both. We evaluate our method on four datasets: one simulated, two public, and one private clinical dataset. Metrics by Declaration Rate (MDR) curves show how performance changes when retaining only the most confident predictions, while interpretable decision trees reveal profiles with higher or lower model confidence. We demonstrate our method in internal, temporal, and external validation settings, as well as through a clinical example. In internal validation, limiting predictions to the 93% most confident cases improved sensitivity by 14.3% and the AUC by 5.1%. In the clinical example, MED3pa identified a patient profile with high misclassification risk, demonstrating its potential for safer deployment. By identifying low-confidence predictions, our framework improves model reliability in clinical settings. It can be integrated into decision support systems to help clinicians make more informed decisions. Confidence thresholds help balance model performance with the proportion of patients for whom predictions are considered reliable. Better leveraging confidence in model predictions could improve reliability and trustworthiness, supporting safer and more effective use in healthcare.
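To make the MDR idea concrete, here is a minimal sketch of how one might trace performance against declaration rate: sort patients by a per-patient confidence score, keep only the most confident fraction, and recompute the metric on that subset. The synthetic data and the distance-from-0.5 confidence heuristic are assumptions for illustration; they are not the MED3pa estimators (IPC/APC/MPC).

```python
# Sketch of a Metrics-by-Declaration-Rate (MDR) curve on synthetic data
# (illustrative only; not the MED3pa implementation).

import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)                                 # hypothetical labels
y_score = np.clip(0.25 * y_true + rng.normal(0.4, 0.2, 500), 0.0, 1.0)  # hypothetical model scores
confidence = np.abs(y_score - 0.5) * 2   # naive confidence: distance from the 0.5 boundary

def mdr_curve(y_true, y_score, confidence, rates):
    """Recompute AUC on the top-`rate` fraction of most-confident predictions."""
    order = np.argsort(-confidence)      # most confident first
    points = []
    for rate in rates:
        idx = order[: int(round(rate * len(y_true)))]
        if len(np.unique(y_true[idx])) < 2:   # AUC needs both classes present
            continue
        points.append((rate, roc_auc_score(y_true[idx], y_score[idx])))
    return points

for rate, auc in mdr_curve(y_true, y_score, confidence, [1.0, 0.93, 0.8, 0.6]):
    print(f"declaration rate {rate:.0%}: AUC = {auc:.3f}")
```

Lowering the declaration rate trades coverage for reliability: the fewer, more confident predictions the model "declares," the better the metric tends to look, which is the balance the abstract describes for clinical deployment.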
The Impact of a Pediatric Surgery Fundamentals Boot Camp on New Surgical Trainees' Perceived Knowledge and Confidence Levels.
Julia Ferreira
Simon Rahman
Fabio Botelho
Farhan Banji
W. A. Igrine
Gianluca Bertolizio
Sam Daniel
Thomas Engelhardt
Chantal Frigon
Lily H P Nguyen
Catherine Paquet
Pramod Puligandla
Hussein Wissanji
Davinia Withington
Yasmine Yousef
Sherif Emil