A machine learning-based linguistic battery for diagnosing mild cognitive impairment due to Alzheimer's disease

Independent linguistic batteries have received only limited evaluation for the early diagnosis of Mild Cognitive Impairment due to Alzheimer's disease (MCI-AD). We hypothesized that an independent linguistic battery comprising only the language components or subtests of popular test batteries could give a better clinical diagnosis of MCI-AD than an exhaustive battery of tests. To test this, we combined multiple clinical datasets and performed Exploratory Factor Analysis (EFA) to extract the underlying linguistic constructs from a combination of the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) battery, the Wechsler Memory Scale (WMS) Logical Memory (LM) I and II, and the Boston Naming Test. We then trained a machine-learning algorithm to validate the clinical relevance of the independent linguistic battery for differentiating between patients with MCI-AD and cognitively healthy control individuals. Our EFA identified ten linguistic variables with distinct underlying linguistic constructs, with a Cronbach's alpha of 0.74 in the MCI-AD group and 0.87 in the healthy control group. Our machine-learning evaluation showed a robust AUC of 0.97 when controlling for age, sex, race, and education, and a clinically reliable AUC of 0.88 without controlling for these variables. Overall, the linguistic battery gave better diagnostic results than the Mini-Mental State Examination (MMSE), the Clinical Dementia Rating Scale (CDR), and a combination of MMSE and CDR.
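
For readers unfamiliar with the two statistics reported above, the following is a minimal sketch of how Cronbach's alpha and the AUC are conventionally computed. It is not the authors' pipeline: the function names `cronbach_alpha` and `auc` and the toy numbers are hypothetical, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation rather than any particular library.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of item scores.

    scores: list of rows, one row per participant, each row holding
    the k item (subtest) scores for that participant.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 participants")
    # Sample variance of each item across participants.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each participant's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def auc(labels, predictions):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    (label 1) receives a higher score than a randomly chosen negative
    case (label 0); ties count as half a win.
    """
    pos = [p for y, p in zip(labels, predictions) if y == 1]
    neg = [p for y, p in zip(labels, predictions) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only (not from the study).
toy_items = [[1, 1], [2, 2], [3, 3]]   # two perfectly consistent items
toy_labels = [0, 0, 1, 1]              # 1 = MCI-AD, 0 = healthy control
toy_scores = [0.1, 0.4, 0.35, 0.8]     # classifier output per participant

alpha_value = cronbach_alpha(toy_items)
auc_value = auc(toy_labels, toy_scores)
```

On this toy data the two items move in lockstep, so alpha reaches its maximum of 1, and three of the four positive/negative pairs are ranked correctly, giving an AUC of 0.75.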


