Validity evidence for the Quality of Assessment for Learning score: a quality metric for supervisor comments in Competency Based Medical Education




Background: Competency based medical education (CBME) relies on supervisor narrative comments contained within entrustable professional activities (EPAs) for programmatic assessment, but the quality of these supervisor comments is unassessed. There is validity evidence supporting the QuAL (Quality of Assessment for Learning) score for rating the usefulness of short narrative comments in direct observation.

Objective: We sought to establish validity evidence for the QuAL score to rate the quality of supervisor narrative comments contained within an EPA by surveying the key end-users of EPA narrative comments: residents, academic advisors, and competence committee members.

Methods: In 2020, the authors randomly selected 52 de-identified narrative comments from two emergency medicine EPA databases using purposeful sampling. Six collaborators (two residents, two academic advisors, and two competence committee members) were recruited from each of four emergency medicine residency programs (Saskatchewan, McMaster, Ottawa, and Calgary) to rate these comments with a utility score and the QuAL score. Correlations between the utility and QuAL scores were calculated using Pearson’s correlation coefficient. Sources of variance and reliability were calculated using a generalizability study.
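The correlation step named in the Methods can be illustrated with a minimal sketch. It is not the authors' analysis code. The function name `pearson_r`, the example ratings, and the 1-to-5 utility scale are assumptions made for illustration only, and none of the values are study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings of five comments by a single rater (not study data).
qual_scores = [0, 1, 3, 4, 5]      # QuAL score for each comment
utility_scores = [1, 2, 3, 4, 5]   # utility score; the scale is assumed
r = pearson_r(qual_scores, utility_scores)
print(f"r = {r:.2f}")
```

In a design like the one described, a coefficient of this kind would be computed on the paired utility and QuAL ratings within each rater group (residents, academic advisors, competence committee members).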

Results: All collaborators (n = 24) completed the full study. The QuAL score had a high positive correlation with the utility score amongst the residents (r = 0.80) and academic advisors (r = 0.75) and a moderately high correlation amongst competence committee members (r = 0.68). The generalizability study found that the major source of variance was the comment, indicating that the tool performs well across raters.

Conclusion: The QuAL score may serve as an outcome measure for program evaluation of supervisors, and as a resource for faculty development.



Author Biographies

Sim Singh, University of Saskatchewan

Sim Singh, BSc, MD Candidate is a medical student at the University of Saskatchewan, Saskatoon, SK, Canada.

Brent Thoma, University of Saskatchewan

Brent Thoma, MD, MA, MSc is an associate professor, Department of Emergency Medicine, University of Saskatchewan, Saskatoon, SK, Canada. He is also a clinician educator, Royal College of Physicians and Surgeons of Canada.

Catherine Patocka, University of Calgary

Catherine Patocka, MD, MHPE is a clinical associate professor, Department of Emergency Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.

Warren Cheung, University of Ottawa

Warren J. Cheung, MD, MMEd is an associate professor, Department of Emergency Medicine, University of Ottawa, Ottawa, ON, Canada. He is also a Clinician Educator, Royal College of Physicians and Surgeons of Canada, Ottawa, ON, Canada.

Sandra Monteiro, McMaster University

Sandra Monteiro, PhD is an associate professor, Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, Canada. She is also Director of Scholarship at the Centre for Simulation Based Learning and Scientist, McMaster Education Research, Innovation, and Theory (MERIT), Hamilton, ON, Canada.

Teresa M Chan, McMaster University

Teresa M. Chan, MD, MHPE is an associate professor, Divisions of Emergency Medicine and Education and Innovation in the Department of Medicine, McMaster University, Hamilton, ON, Canada. She is also associate dean, continuing professional development within the Faculty of Health Sciences and clinician scientist, McMaster Education Research, Innovation, and Theory (MERIT) at McMaster University in Hamilton, ON, Canada.





How to Cite

Woods R, Singh S, Thoma B, Patocka C, Cheung W, Monteiro S, Chan TM. Validity evidence for the Quality of Assessment for Learning score: a quality metric for supervisor comments in Competency Based Medical Education. Can. Med. Ed. J [Internet]. 2022 Aug. 16 [cited 2024 Jul. 21];13(6):19-35. Available from:



Original Research
