DESIGN: A total of 1805 consecutive unselected patients with FGID who presented for primary or secondary care at 11 centres across Asia completed a culturally and linguistically adapted version of the Rome III Diagnostic Questionnaire, translated into the local languages. Principal components factor analysis with varimax rotation was used to identify symptom clusters.
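As a rough illustration of the varimax step only, the sketch below rotates a two-factor loading matrix by the single angle that maximises the raw varimax criterion (the summed variance of squared loadings per factor). It is not the study's analysis, which extracted nine factors and would normally use a statistics package. The function names and the toy loadings are hypothetical.

```python
import math


def varimax_criterion(loadings):
    """Raw varimax criterion: sum over factors of the variance of squared loadings."""
    p = len(loadings)
    k = len(loadings[0])
    total = 0.0
    for j in range(k):
        squared = [row[j] ** 2 for row in loadings]
        mean = sum(squared) / p
        total += sum((s - mean) ** 2 for s in squared) / p
    return total


def varimax_two_factors(loadings):
    """Rotate a p x 2 loading matrix by the angle that maximises the raw criterion."""
    p = len(loadings)
    u = [x * x - y * y for x, y in loadings]
    v = [2.0 * x * y for x, y in loadings]
    a = sum(u)
    b = sum(v)
    c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
    # Closed-form optimum for a single pair of factors.
    phi = math.atan2(d - 2.0 * a * b / p, c - (a * a - b * b) / p) / 4.0
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)
    return [(x * cos_phi + y * sin_phi, -x * sin_phi + y * cos_phi)
            for x, y in loadings]


# Hypothetical unrotated loadings for four questionnaire items (not study data).
toy_loadings = [(0.7, 0.5), (0.6, 0.6), (0.6, -0.5), (0.7, -0.6)]
rotated = varimax_two_factors(toy_loadings)
for item, (f1, f2) in enumerate(rotated, start=1):
    print(f"item {item}: F1={f1:+.2f}  F2={f2:+.2f}")
```

The rotation is orthogonal, so each item's communality (the sum of its squared loadings) is unchanged. Only the distribution of loading across the two factors shifts, which is what makes the clusters easier to read.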
RESULTS: Nine symptom clusters were identified, consisting of two oesophageal factors (F6: globus, odynophagia and dysphagia; F9: chest pain and heartburn), two gastroduodenal factors (F5: bloating, fullness, belching and flatulence; F8: regurgitation, nausea and vomiting), three bowel factors (F2: abdominal pain and diarrhoea; F3: meal-related bowel symptoms; F7: upper abdominal pain and constipation) and two anorectal factors (F1: anorectal pain and constipation; F4: diarrhoea, urgency and incontinence).
CONCLUSION: We found that the broad categorisation used both in clinical practice and in the Rome system (that is, broad anatomical divisions) and certain diagnoses with long historical records (that is, IBS with diarrhoea and chronic constipation) remain valid in our Asian societies. In addition, we found a bowel symptom cluster with a meal trigger and a gas cluster, which suggest a different emphasis in our populations. Future studies that compare these results with a non-Asian cohort and match the clusters to putative pathophysiology will help to verify our findings.
RESULTS: Our models learned several syntactic, lexical, and n-gram linguistic biomarkers that distinguish the probable AD group from the healthy group. Compared with the healthy group, the probable AD patients used significantly fewer syntactic components and significantly more lexical components in their language. We also observed a significant difference in the use of n-grams: the healthy group were able to identify and make sense of more objects in their n-grams than the probable AD group. Our best diagnostic model, a Support Vector Machine (SVM), significantly distinguished the probable AD group from the healthy elderly group with a better Area Under the Receiver Operating Characteristic Curve (AUC).
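For readers unfamiliar with the AUC, the following minimal sketch computes it directly from its probabilistic definition: the chance that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. The function name, labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical decision scores: 1 = probable AD, 0 = healthy control.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(f"AUC = {roc_auc(toy_labels, toy_scores):.3f}")
```

An AUC of 0.5 corresponds to chance-level separation of the two groups and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, which is acceptable for a small illustration but not for large datasets.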
CONCLUSIONS: Experimental and statistical evaluations suggest that using ML algorithms for learning linguistic biomarkers from the verbal utterances of elderly individuals could help the clinical diagnosis of probable AD. We emphasise that the best ML model for predicting the disease group combines significant syntactic, lexical and top n-gram features. However, there is a need to train the diagnostic models on larger datasets, which could lead to a better AUC and clinical diagnosis of probable AD.
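To make the notion of "top n-gram features" concrete, here is a small sketch that counts word n-grams across transcribed utterances and returns the most frequent ones. It assumes simple lowercase whitespace tokenisation, which may differ from the preprocessing used in the study. The helper names and example utterances are hypothetical.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of a text as tuples (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(utterances, n, k):
    """Count n-grams over all utterances and return the k most frequent."""
    counts = Counter()
    for utterance in utterances:
        counts.update(word_ngrams(utterance, n))
    return counts.most_common(k)


# Hypothetical picture-description utterances (not study data).
toy_utterances = [
    "the boy is on the stool",
    "the boy is reaching for the cookie jar",
]
for ngram, count in top_ngrams(toy_utterances, n=2, k=3):
    print(" ".join(ngram), count)
```

Counts like these could then serve as input features to a classifier alongside syntactic and lexical measures.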