DISCUSSION: Creating an inclusive assessment culture is important for equitable education, even if priorities for inclusion differ between contexts. We recognise challenges in enacting inclusive assessment, namely the perception that it lowers standards, the concern that it harms the reliability and robustness of assessment design, and the use of inclusion as a poorly defined, catch-all term. Importantly, a lack of awareness that inclusion means recognising intersectionality is a barrier to well-designed inclusive assessments. We therefore offer considerations for HPE practitioners that can guide a unified direction of travel for inclusive assessment. This article highlights the importance of contextual prioritisation, with initiatives to be considered from the global level through the national, institutional and programme levels to the individual level. Drawing on experience and literature from undergraduate higher education contexts, we offer considerations with applicability across the assessment continuum.
CONTEXT: In this state of science paper, we were set the challenge of providing cross-cultural viewpoints on inclusive assessment. In this discursive article, we focus on inclusive assessment within undergraduate health professions education whilst looking to the wider higher education literature, since institutional policies and procedures frequently drive assessment decisions and influence the environment in which they occur. We explore our experiences of working in inclusive assessment, with the aim of bridging and enhancing practices of inclusive assessment for HPE. Unlike other articles that juxtapose views, we all come from the perspective of supporting inclusive assessment. We begin with a discussion of what inclusive assessment is and then describe our contexts as a basis for understanding differences and broadening conversations. We work in the United Kingdom, Australia and Malaysia, having undertaken research and facilitated workshops and seminars on inclusive assessment nationally and internationally. We recognise that our perspectives will differ as a consequence of our global context, institutional culture, individual characteristics and educational experiences. (Note that individual characteristics are also known as protected characteristics in some countries.) We then outline challenges and opportunities associated with inclusive assessment, drawing on evidence within our contexts, acknowledging that our understanding of inclusive assessment research is limited to publications in English and currently tilted towards publications from the Global North. In the final section, we offer recommendations for championing inclusion, focussing firstly on assessment design and then on broader considerations for organising collective action.
Our article is unapologetically practical; we deliberately diverge from a theoretical piece in the hope that anyone who reads this paper might enact even one small change towards more inclusive assessment practices within their context.
METHODS: A total of 328 final-year dental students were trained across six cohorts. Three cohorts (175 students) received F2F training from the academic years 2016/2017 to 2018/2019, and the remaining three (153 students) underwent online training during the Covid-19 pandemic from 2019/2020 to 2021/2022. Participant scores were analysed using the Wilcoxon signed rank test, the Mann-Whitney test, Cohen's d effect size, and multiple linear regression.
RESULTS: Both F2F and online training showed increases in mean scores from pre-test to post-test 3: from 67.66 ± 11.81 to 92.06 ± 5.27 and from 75.89 ± 11.03 to 90.95 ± 5.22, respectively. Comparison between the F2F and online methods revealed significant differences in mean scores, with large effect sizes at the pre-test stage (p
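To illustrate the effect-size measure used in the comparison above, the following is a minimal sketch of Cohen's d computed with a pooled standard deviation. The cohort scores shown are hypothetical, not the study's data.

```python
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Cohen's d: standardised mean difference using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2 +
                  (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5

# Hypothetical pre-test scores for two cohorts
f2f = [55, 60, 65, 70, 75, 80]
online = [65, 70, 75, 80, 85, 90]
d = cohens_d(online, f2f)  # positive d: online cohort scored higher at pre-test
```

By convention, |d| around 0.2 is read as a small effect, 0.5 as medium and 0.8 or above as large, which is the sense in which the study reports "large effect sizes".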
OBJECTIVES: This study aims to compare student performance in MCQ and VSAQ formats and to obtain feedback from the stakeholders.
METHODS: Multiple true-false, one-best-answer and VSAQ tests were conducted in two batches of medical students; their scores and the psychometric indices of the tests were compared, and opinions on these assessment methods were sought from students and academics.
RESULTS: The multiple true-false and one-best-answer tests showed skewed score distributions and low psychometric performance, compared with better psychometrics and more balanced student performance in the VSAQ tests. Stakeholder opinion was significantly in favour of VSAQ.
CONCLUSION AND RECOMMENDATION: This study concludes that VSAQ is a viable alternative to multiple-choice question tests, and it is widely accepted by medical students and academics in the medical faculty.
OBJECTIVE: To appraise and synthesize the best available evidence on the effectiveness of OBE approaches for developing the competencies of nursing students.
DESIGN: A systematic review of interventional experimental studies.
DATA SOURCES: Eight online databases namely CINAHL, EBSCO, Science Direct, ProQuest, Web of Science, PubMed, EMBASE and SCOPUS were searched.
REVIEW METHODS: Relevant studies were identified using combined approaches of electronic database search without geographical or language filters but were limited to articles published from 2006 to 2016, handsearching journals and visually scanning references from retrieved studies. Two reviewers independently conducted the quality appraisal of selected studies and data were extracted.
RESULTS: Six interventional studies met the inclusion criteria. Two were rated as high methodological quality and four as moderate. The studies were published between 2009 and 2016 and were mostly from Asian and Middle Eastern countries. Results showed that OBE approaches improve competency in knowledge acquisition, reflected in higher final course grades and cognitive skills; improve clinical skills and nursing core competencies; and yield higher behavioural-skills scores when performing clinical skills. Learners' satisfaction was also encouraging, as reported in one of the studies. Only one study reported a negative effect.
CONCLUSIONS: Although OBE approaches do show encouraging effects on improving the competencies of nursing students, more robust experimental designs with larger sample sizes, evaluating other outcome measures such as other areas of competency, student satisfaction and patient outcomes, are needed.
MATERIALS AND METHODS: MCQ items in papers taken from Year II Parts A, B and C examinations for Sessions 2001/02, and Part B examinations for 2002/03 and 2003/04, were analysed to obtain their difficulty indices and discrimination indices. Each paper consisted of 250 true/false items (50 questions of 5 items each) on topics drawn from different disciplines. The questions were first constructed and vetted by the individual departments before being submitted to a central committee, where the final selection of the MCQs was made, based purely on the academic judgement of the committee.
RESULTS: There was a wide distribution of item difficulty indices in all the MCQ papers analysed. Furthermore, the relationship between the difficulty index (P) and discrimination index (D) of the MCQ items in a paper was not linear, but dome-shaped. Maximal discrimination (D = 51% to 71%) occurred with moderately easy/difficult items (P = 40% to 74%). On average, about 38% of the MCQ items in each paper were "very easy" (P ≥ 75%), while about 9% were "very difficult" (P < 25%). About two-thirds of these very easy/difficult items had "very poor" or even negative discrimination (D ≤ 20%).
CONCLUSIONS: MCQ items that demonstrate good discriminating potential tend to be moderately difficult items, and the moderately-to-very difficult items are more likely to show negative discrimination. There is a need to evaluate the effectiveness of our MCQ items.
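The two indices discussed above have standard classical-test-theory definitions, sketched below in a minimal form: P is the percentage of examinees answering an item correctly, and D contrasts the item's correct rate in the top-scoring and bottom-scoring groups (here taken as the top and bottom 27% by total test score, a common but not universal choice; the data are illustrative).

```python
def difficulty_index(responses):
    """P: percentage of examinees answering the item correctly.
    `responses` is a list of 1 (correct) / 0 (incorrect)."""
    return 100 * sum(responses) / len(responses)

def discrimination_index(responses, total_scores, fraction=0.27):
    """D: item correct rate in the high group minus that in the low group,
    where groups are the top/bottom `fraction` of examinees by total score."""
    n = max(1, round(fraction * len(responses)))
    order = sorted(range(len(responses)), key=lambda i: total_scores[i])
    low, high = order[:n], order[-n:]
    p_high = sum(responses[i] for i in high) / n
    p_low = sum(responses[i] for i in low) / n
    return 100 * (p_high - p_low)

# Hypothetical item: 10 examinees, mostly the stronger ones answer correctly
responses = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
total_scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
p = difficulty_index(responses)                       # moderate difficulty
d = discrimination_index(responses, total_scores)     # strong positive D
```

A negative D, as reported for some items above, arises when weaker examinees answer an item correctly more often than stronger ones.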
MATERIALS AND METHODS: This review addresses some of the issues in standard setting, based on published articles by educational assessment researchers.
RESULTS: The standard, or cut-off score, should determine whether the examinee has attained the requirements to be certified as competent. There is no perfect method for determining a cut score on a test, and no single method is agreed upon as the best. Setting a standard is not an exact science. The legitimacy of the standard is supported when the performance standard is linked to the requirements of practice. Test-curriculum alignment and content validity are important for most educational test validity arguments.
CONCLUSION: A representative percentage of must-know learning objectives in the curriculum may form the basis of test items and pass/fail marks. Practice analysis may help identify the must-know areas of the curriculum. A cut score set by this procedure may lend credibility, validity, defensibility and comparability to the standard. Having test items constructed by subject experts and vetted by multidisciplinary faculty members may ensure the reliability of the test as well as of the standard.
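As an illustration of criterion-referenced standard setting of the kind discussed above, the following is a minimal sketch of one widely used procedure, the modified Angoff method; it is not necessarily the procedure the authors describe, and the judge ratings are hypothetical.

```python
def angoff_cut_score(judge_ratings):
    """Modified-Angoff cut score: for each item, average the judges'
    estimates of the probability that a minimally competent examinee
    answers correctly, then sum the averages across items to give the
    pass mark (in raw marks)."""
    per_item = [sum(item) / len(item) for item in judge_ratings]
    return sum(per_item)

# Hypothetical ratings: rows = items, columns = three judges (probabilities 0-1)
ratings = [
    [0.8, 0.7, 0.9],   # an easy "must-know" item
    [0.5, 0.6, 0.4],   # a moderately difficult item
    [0.3, 0.2, 0.4],   # a difficult item
]
cut = angoff_cut_score(ratings)  # pass mark out of 3 available marks
```

Because the ratings are anchored to what a minimally competent examinee should achieve, the resulting standard is linked to the requirements of practice, which supports its defensibility as argued above.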
METHODS: This study employed a phenomenological design. Five focus groups were conducted with medical students who had participated in several Kahoot! sessions.
RESULTS: Thirty-six categories and nine sub-themes emerged from the focus group discussions. They were grouped into three themes: attractive learning tool, learning guidance and source of motivation.
CONCLUSIONS: The results suggest that Kahoot! sessions motivate students to study, to determine the subject matter that needs to be studied and to be aware of what they have learned. Thus, the platform is a promising tool for formative assessment in medical education.