Trust is foundational to patient-physician relationships and is associated with improved care-seeking and adherence in primary care. However, validated trust instruments for pediatric emergency and surgical contexts are lacking, and traditional instrument development is slow and resource-intensive. Large language models (LLMs) could streamline the validation process by serving as scalable, systematic expert panel surrogates.
We developed four new trust assessment instruments: two for patient-families and two for physicians. Two-phase content validation was conducted using two parallel expert panels, one synthetic and one human. Synthetic panels consisted of three persona-prompted LLMs (Claude Sonnet 4, GPT-5, Grok4). Human panels served as traditional comparators. Scale-Content Validity Index (S-CVI) and Fleiss’ kappa (κ) acceptance thresholds were set at ≥0.80.
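The two acceptance statistics named above can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not state: that S-CVI is computed with the averaging approach (the mean of item-level CVIs), that experts rate relevance on a 4-point scale where 3 or 4 counts as relevant, and that Fleiss’ κ takes its standard form for a fixed number of raters per item. The function names and the sample ratings are hypothetical and are not taken from the study.

```python
from typing import List


def item_cvi(ratings: List[int], relevant_min: int = 3) -> float:
    """Proportion of experts rating one item as relevant (assumed: score >= 3 of 4)."""
    return sum(r >= relevant_min for r in ratings) / len(ratings)


def scale_cvi_ave(items: List[List[int]], relevant_min: int = 3) -> float:
    """S-CVI by the averaging approach: mean of the item-level CVIs (an assumption)."""
    return sum(item_cvi(r, relevant_min) for r in items) / len(items)


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa from an items x categories table of rater counts.

    counts[i][j] is the number of raters who placed item i in category j.
    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of raters")

    n_categories = len(counts[0])
    # Overall share of all ratings that fell in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement among rater pairs, per item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical data: three items, each rated by three experts on a 1-4 scale.
example_ratings = [[4, 4, 3], [4, 2, 3], [3, 3, 4]]
s_cvi = scale_cvi_ave(example_ratings)

# The same ratings collapsed to [not relevant, relevant] counts per item.
example_counts = [[sum(r < 3 for r in row), sum(r >= 3 for r in row)]
                  for row in example_ratings]
kappa = fleiss_kappa(example_counts)
```

In this sketch an instrument would pass when the computed value is at least 0.80, mirroring the thresholds stated above. Note that κ can be low, or even negative, when nearly all ratings fall in a single category, so the two statistics are best read together.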
Combined human–synthetic expert panels revealed substantial inter-rater reliability across all instruments. Fleiss’ κ values for dimensional validation were: patient-family = 0.84 (95% CI [0.72, 0.96]), physician = 0.87 (95% CI [0.72, 1.00]); for contextual validation: patient-family = 0.83 (95% CI [0.73, 0.93]), physician = 0.88 (95% CI [0.80, 0.96]). All instruments met the S-CVI ≥0.80 threshold across both validation phases.
Persona-prompted LLMs demonstrated comparable validity outcomes to human experts while accelerating validation timelines from months to weeks. Future research needs to evaluate this approach across psychometric testing phases.
This synthetic instrument validation methodology offers a scalable blueprint for healthcare measurement development, enabling faster creation of validated tools to support evidence-based patient care.