The Extended matching Questions (EMQs), or R-type questions, are format of selected-response. The validity evidence for this format is crucial, but there have been reports of misunderstandings about validity. It is unclear what kinds of evidence should be presented and how to present them to support their educational impact. This review explores the pattern and quality of reporting the sources of validity evidence of EMQs in health professions education, encompassing content, response process, internal structure, relationship to other variables, and consequences. A systematic search in the electronic databases including MEDLINE via PubMed, Scopus, Web of Science, CINAHL, and ERIC was conducted to extract studies that utilize EMQs. The framework for a unitary concept of validity was applied to extract data. A total of 218 titles were initially selected, the final number of titles was 19. The most reported pieces of evidence were the reliability coefficient, followed by the relationship to another variable. Additionally, the adopted definition of validity is mostly the old tripartite concept. This study found that reporting and presenting validity evidence appeared to be deficient. The available evidence can hardly provide a strong validity argument that supports the educational impact of EMQs. This review calls for more work on developing a tool to measure the reporting and presenting validity evidence.