Validity evidence for the Quality of Assessment for Learning score: a quality metric for supervisor comments in Competency Based Medical Education

Rob Woods; Sim Singh; Brent Thoma; Catherine Patocka; Warren Cheung; Sandra Monteiro; Teresa M Chan

doi:10.36834/cmej.74860

Authors

Rob Woods University of Saskatchewan https://orcid.org/0000-0002-3811-1369
Sim Singh University of Saskatchewan https://orcid.org/0000-0002-4598-2007
Brent Thoma University of Saskatchewan https://orcid.org/0000-0003-1124-5786
Catherine Patocka University of Calgary https://orcid.org/0000-0003-4683-5655
Warren Cheung University of Ottawa https://orcid.org/0000-0002-2730-8190
Sandra Monteiro McMaster University https://orcid.org/0000-0001-8723-5942
Teresa M Chan McMaster University https://orcid.org/0000-0001-6104-462X

DOI: https://doi.org/10.36834/cmej.74860

Abstract

Background: Competency based medical education (CBME) relies on supervisor narrative comments contained within entrustable professional activities (EPAs) for programmatic assessment, but the quality of these supervisor comments is unassessed. There is validity evidence supporting the QuAL (Quality of Assessment for Learning) score for rating the usefulness of short narrative comments in direct observation.

Objective: We sought to establish validity evidence for the QuAL score to rate the quality of supervisor narrative comments contained within an EPA by surveying the key end-users of EPA narrative comments: residents, academic advisors, and competence committee members.

Methods: In 2020, the authors randomly selected 52 de-identified narrative comments from two emergency medicine (EM) EPA databases using purposeful sampling. Six collaborators (two residents, two academic advisors, and two competence committee members) were recruited from each of four EM residency programs (Saskatchewan, McMaster, Ottawa, and Calgary) to rate these comments with a utility score and the QuAL score. Correlations between the utility and QuAL scores were calculated using Pearson’s correlation coefficient. Sources of variance and reliability were calculated using a generalizability study.

Results: All collaborators (n = 24) completed the full study. The QuAL score had a high positive correlation with the utility score amongst the residents (r = 0.80) and academic advisors (r = 0.75) and a moderately high correlation amongst competence committee members (r = 0.68). The generalizability study found that the major source of variance was the comment, indicating that the tool performs well across raters.

Conclusion: The QuAL score may serve as an outcome measure for program evaluation of supervisors, and as a resource for faculty development.

Author Biographies

Sim Singh, University of Saskatchewan

Sim Singh, BSc, MD Candidate is a medical student at the University of Saskatchewan, Saskatoon, SK, Canada.

Brent Thoma, University of Saskatchewan

Brent Thoma, MD, MA, MSc is an associate professor, Department of Emergency Medicine, University of Saskatchewan, Saskatoon, SK, Canada. He is also a clinician educator, Royal College of Physicians and Surgeons of Canada.

Catherine Patocka, University of Calgary

Catherine Patocka, MD, MHPE is a clinical associate professor, Department of Emergency Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.

Warren Cheung, University of Ottawa

Warren J. Cheung, MD, MMEd is an associate professor, Department of Emergency Medicine, University of Ottawa, Ottawa, ON, Canada. He is also a clinician educator, Royal College of Physicians and Surgeons of Canada, Ottawa, ON, Canada.

Sandra Monteiro, McMaster University

Sandra Monteiro, PhD is an associate professor, Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, Canada. She is also Director of Scholarship at the Centre for Simulation Based Learning and Scientist, McMaster Education Research, Innovation, and Theory (MERIT), Hamilton, ON, Canada.

Teresa M Chan, McMaster University

Teresa M. Chan, MD, MHPE is an associate professor, Divisions of Emergency Medicine and Education and Innovation in the Department of Medicine, McMaster University, Hamilton, ON, Canada. She is also associate dean, continuing professional development within the Faculty of Health Sciences and clinician scientist, McMaster Education Research, Innovation, and Theory (MERIT) at McMaster University in Hamilton, ON, Canada.
