METHODS: We analysed 350 items used in 7 professional examinations and determined the distractor efficiency and the number of functional distractors per item. The items were sorted into five groups (excellent, good, fair, remediable and discarded) based on their discrimination index. We then studied how distractor efficiency and the number of functional distractors per item correlated with these five groups.
RESULTS: The correlation of distractor efficiency with the psychometric indices was significant but far from perfect. The excellent group had the highest distractor efficiency in three tests, the good group in one test, the remediable group equalled the excellent group in one test, and the discarded group was highest in two tests.
CONCLUSIONS: Distractor efficiency did not correlate with the discrimination index in a consistent pattern. A distractor efficiency of fifty per cent or higher, rather than one hundred per cent, was found to be optimal.
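For readers unfamiliar with these indices, the sketch below shows how distractor efficiency and discrimination-index banding are typically computed. The ≥5% functional-distractor threshold and the band cut-offs are conventional values assumed for illustration; the abstract does not state the exact criteria the authors used.

```python
# Minimal sketch of the item analysis described above. Thresholds are
# conventional assumptions, not taken from the paper: a distractor is
# "functional" if chosen by >= 5% of examinees, and distractor efficiency
# (DE) is the percentage of a stem's distractors that are functional.
from collections import Counter

def distractor_efficiency(responses, key, options=("A", "B", "C", "D"),
                          threshold=0.05):
    """Return (number of functional distractors, DE as a percentage)."""
    counts = Counter(responses)
    n = len(responses)
    distractors = [o for o in options if o != key]
    functional = sum(1 for o in distractors if counts[o] / n >= threshold)
    return functional, 100.0 * functional / len(distractors)

def discrimination_band(d):
    """Classify an item by discrimination index using commonly cited
    cut-offs (assumed here; the paper's exact cut points are not given
    in the abstract)."""
    if d >= 0.40:
        return "excellent"
    if d >= 0.30:
        return "good"
    if d >= 0.20:
        return "fair"
    if d >= 0.00:
        return "remediable"
    return "discarded"

# Example: 20 examinees answering one item whose key is "B".
answers = list("BBBBBBBBBBAABCCCDDDB")
print(distractor_efficiency(answers, key="B"))  # (3, 100.0): all 3 distractors functional
print(discrimination_band(0.35))                # "good"
```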
MATERIALS AND METHODS: We compared the LC scores of the three previous years with those of the SBCE and studied the feedback of three stakeholder groups (students, examiners and simulated patients (SPs)) regarding their experience with the SBCE and its suitability as an alternative to the LC in future examinations.
RESULTS: The SBCE scores were higher than the LC scores. Most of the examiners and students were not in favour of the SBCE replacing the LC in its current form. The SPs were more positive about the proposition. The comments of the three stakeholder groups brought out the strengths and weaknesses of the LC and the SBCE, which prompted our proposals to make the SBCE more practical for future examinations.
CONCLUSION: Our analysis of the stakeholder feedback and of the positive and negative aspects of the LC and the SBCE made it evident that the SBCE needed improvement. We have proposed eight modifications to make the SBCE a viable alternative to the LC.
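As a minimal sketch of the kind of score comparison described in the methods: the abstract does not name a statistical test, so the Mann-Whitney U test below is purely one plausible choice, and the score arrays are invented stand-ins for the LC and SBCE data.

```python
# Hypothetical comparison of LC and SBCE score distributions. The test
# choice and all numbers are our assumptions, not the study's analysis.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
lc_scores = rng.normal(60, 8, size=120)    # hypothetical long-case scores
sbce_scores = rng.normal(65, 7, size=118)  # hypothetical SBCE scores

# One-sided test of whether SBCE scores tend to be higher than LC scores.
stat, p = mannwhitneyu(sbce_scores, lc_scores, alternative="greater")
print(f"U = {stat:.1f}, one-sided p = {p:.4f}")
```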
DISCUSSION: Creating an inclusive assessment culture is important for equitable education, even if priorities for inclusion differ between contexts. We recognise challenges in the enactment of inclusive assessment, namely the perception that it lowers standards, the fear that it harms the reliability and robustness of assessment design, and the use of 'inclusion' as a poorly defined, catch-all term. Importantly, a lack of awareness that inclusion means recognising intersectionality is a barrier to well-designed inclusive assessments. This is why we offer considerations for HPE practitioners that can guide a unified direction of travel for inclusive assessments. This article highlights the importance of contextual prioritisation and of initiatives at every level, from the global to the national, institutional, programme and individual levels. Utilising experience and literature from undergraduate and wider higher education contexts, we offer considerations with applicability across the assessment continuum.
CONTEXT: In this state of science paper, we were set the challenge of providing cross-cultural viewpoints on inclusive assessment. In this discursive article, we focus on inclusive assessment within undergraduate health professions education whilst looking to the wider higher education literature, since institutional policies and procedures frequently drive assessment decisions and influence the environment in which they occur. We explore our experiences of working in inclusive assessment, with the aim of bridging and enhancing practices of inclusive assessment for HPE. Unlike other articles that juxtapose views, we all come from the perspective of supporting inclusive assessment. We begin with a discussion of what inclusive assessment is and then describe our contexts as a basis for understanding differences and broadening conversations. We work in the United Kingdom, Australia and Malaysia, having undertaken research and facilitated workshops and seminars on inclusive assessment nationally and internationally. We recognise that our perspectives will differ as a consequence of our global contexts, institutional cultures, individual characteristics and educational experiences. (Note that individual characteristics are known as protected characteristics in some countries.) We then outline challenges and opportunities associated with inclusive assessment, drawing on evidence within our contexts and acknowledging that our understanding of inclusive assessment research is limited to publications in English and currently tilted towards publications from the Global North. In the final section, we offer recommendations for championing inclusion, focussing first on assessment design and then on broader considerations for organising collective action. Our article is unapologetically practical; the deliberate divergence from a theoretical piece is intended so that anyone who reads this paper might enact even one small change towards more inclusive assessment practices within their context.
METHODS: We compared two methods of OSCE feedback delivered to fourth-year medical students in Malaysia: (i) face-to-face (FTF) immediate feedback (semester one) and (ii) individualised enhanced written (EW) feedback containing detailed scores in each domain, examiners' free-text comments and the marking rubric (semester two). Both methods were evaluated by students and staff examiners, and students' responses were compared against their OSCE performance.
RESULTS: Of the 116 students who sat both formative OSCEs, 82.8% (n=96) and 86.2% (n=100) responded to the first and second surveys respectively. Most students were comfortable receiving feedback (91.3% in FTF, 96% in EW), with EW feedback associated with higher comfort levels (p=0.022). Distress affected a small number of students, with no difference between the methods (13.5% in FTF, 10% in EW, p=0.316). Most students perceived that both types of feedback improved their performance (89.6% in FTF, 95% in EW); this perception was significantly stronger for EW feedback (p=0.008). Students who preferred EW feedback had lower OSCE scores than those preferring FTF feedback (mean scores ± SD: 43.8 ± 5.3 in EW, 47.2 ± 6.5 in FTF, p=0.049). Students ranked the marking rubric as the most valuable aspect of the EW feedback. Tutors felt both methods of feedback were equally beneficial. Few examiners felt they needed training (21.4% in FTF, 15% in EW), but students perceived the need for tutor training differently (53.1% in FTF, 46% in EW).
CONCLUSION: Whilst both methods of OSCE feedback were highly valued, students preferred to receive EW feedback and felt it was more beneficial. The learning cultures of Malaysian students may have influenced this view. The information provided in EW feedback should be tailored accordingly to provide meaningful feedback in OSCE examinations.
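The sketch below illustrates the two kinds of comparison reported above. The abstract does not name the tests used, so the chi-square test for the comfort proportions and the independent-samples t-test for the OSCE scores are our assumptions, and all counts and scores are invented for illustration rather than drawn from the study data.

```python
# Hypothetical re-creation of the two comparisons; test choices and all
# numbers are assumptions, not the study's reported analysis.
import numpy as np
from scipy.stats import chi2_contingency, ttest_ind

# Comfort with feedback: [comfortable, not comfortable] for FTF vs EW.
table = np.array([[88, 8],    # hypothetical FTF counts (~91% comfortable)
                  [96, 4]])   # hypothetical EW counts (~96% comfortable)
chi2, p_prop, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p_prop:.3f}")

# OSCE scores of students preferring EW vs FTF feedback (Welch's t-test).
rng = np.random.default_rng(1)
ew_pref = rng.normal(43.8, 5.3, size=30)   # hypothetical EW-preferring group
ftf_pref = rng.normal(47.2, 6.5, size=60)  # hypothetical FTF-preferring group
t, p_scores = ttest_ind(ew_pref, ftf_pref, equal_var=False)
print(f"t = {t:.2f}, p = {p_scores:.3f}")
```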
METHODS: This critique of the OSCE is based on the published findings of researchers from its inception in 1975 to 2004.
RESULTS: The reliability, validity, objectivity and practicability (feasibility) of this examination depend on the number of stations, the construction of the stations, the method of scoring (checklists and/or global scoring) and the number of students assessed. For a comprehensive assessment of clinical competence, other methods should be used in conjunction with the OSCE.
CONCLUSION: The OSCE can be a reasonably reliable, valid and objective method of assessment, but its main drawback is that it is resource-intensive.
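The dependence of reliability on the number of stations noted above is often illustrated with the standard Spearman-Brown prophecy formula. The sketch below is our illustration of that well-known psychometric relationship, not a method taken from the reviewed studies, and the single-station reliability value is invented.

```python
# Why station count drives OSCE reliability: the Spearman-Brown prophecy
# formula predicts reliability when a test is lengthened by a factor k.
# The baseline per-station reliability below is a hypothetical value.
def spearman_brown(r_single, k):
    """Predicted reliability of a test k times as long."""
    return k * r_single / (1 + (k - 1) * r_single)

r_per_station = 0.15  # hypothetical reliability of a single station
for stations in (5, 10, 15, 20):
    print(stations, round(spearman_brown(r_per_station, stations), 2))
# 5 -> 0.47, 10 -> 0.64, 15 -> 0.73, 20 -> 0.78: adding stations helps,
# with diminishing returns (hence the resource-intensiveness trade-off).
```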
METHODS: A total of 328 final-year dental students were trained across six cohorts. Three cohorts (175 students) received F2F training in the academic years 2016/2017 to 2018/2019, and the remaining three (153 students) underwent online training during the COVID-19 pandemic, from 2019/2020 to 2021/2022. Participant scores were analysed using the Wilcoxon signed-rank test, the Mann-Whitney test, Cohen's d effect size, and multiple linear regression.
RESULTS: Both the F2F and online training groups showed increases in mean scores from pre-test to post-test 3: from 67.66 ± 11.81 to 92.06 ± 5.27 and from 75.89 ± 11.03 to 90.95 ± 5.22, respectively. Comparison between the F2F and online methods revealed significant differences in mean scores, with large effect sizes at the pre-test stage (p
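The sketch below shows how the analyses named in the methods fit together. Only the choice of tests (Wilcoxon signed-rank, Mann-Whitney, Cohen's d) comes from the abstract; the score arrays are simulated stand-ins for the paired pre/post-test data, loosely seeded from the reported means and SDs.

```python
# Illustrative re-creation of the named analyses on simulated data.
import numpy as np
from scipy.stats import wilcoxon, mannwhitneyu

rng = np.random.default_rng(2)
f2f_pre = rng.normal(67.7, 11.8, size=175)             # hypothetical
f2f_post = f2f_pre + rng.normal(24.4, 8.0, size=175)   # hypothetical paired gain
online_pre = rng.normal(75.9, 11.0, size=153)          # hypothetical

# Within-group change (paired scores): Wilcoxon signed-rank test.
w, p_within = wilcoxon(f2f_pre, f2f_post)

# Between-group difference at pre-test: Mann-Whitney U test.
u, p_between = mannwhitneyu(f2f_pre, online_pre)

# Cohen's d with a pooled standard deviation.
def cohens_d(a, b):
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                     / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled

print(f"within-group p = {p_within:.4f}, between-group p = {p_between:.4f}, "
      f"d = {cohens_d(online_pre, f2f_pre):.2f}")
```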