A machine learning approach reveals features related to clinicians’ diagnosis of clinically relevant knee osteoarthritis
Objectives: To identify the highest-ranked features related to clinicians’ diagnosis of clinically relevant knee osteoarthritis (OA).
Methods: General practitioners (GPs) and secondary care physicians (SPs) were recruited to evaluate 5- to 10-year follow-up clinical and radiographic data of knees from the CHECK cohort for the presence of clinically relevant OA. Clinicians were grouped in pairs of 1 GP and 1 SP, and the two clinicians in each pair independently evaluated the same subset of knees. The GP and the SP each made a diagnosis for every knee before and after viewing radiographic data. Random forest models, built with nested 5-fold cross-validation, were used to identify the top 10 features related to the diagnosis.
Results: Seventeen clinician pairs evaluated 1106 knees with 139 clinical and 36 radiographic features. GPs diagnosed clinically relevant OA in 42% and 43% of knees before and after viewing radiographic data, respectively; SPs did so in 43% and 51% of knees, respectively. Models containing the top 10 features performed well in explaining clinicians’ diagnosis, with area under the curve (AUC) ranging from 0.76 to 0.83. Before viewing radiographic data, quantitative symptomatic features (i.e. WOMAC scores) were the most important features related to the diagnosis of both GPs and SPs; after viewing radiographic data, radiographic features appeared in the top 10 lists for both groups but seemed to be more important for SPs than for GPs.
Conclusions: Random forest models performed well in explaining clinicians’ diagnosis, which helped to reveal typical features of patients recognized as having clinically relevant knee OA by clinicians from two different care settings.
Keywords: CHECK cohort; knee osteoarthritis; clinician’s diagnosis; machine learning.
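The Methods mention nested 5-fold cross-validation and the Results report performance as area under the curve. The sketch below illustrates those two ideas only, and it is not the authors' pipeline. It generates nested train/test index splits and computes AUC with the rank-based (Mann-Whitney) formulation. The function names `kfold_indices`, `nested_cv_splits`, and `auc` are hypothetical, the random forest itself is omitted, and splitting by row index (rather than, say, by patient or clinician pair) is an assumption, since this section does not state the split unit.

```python
import random


def kfold_indices(indices, k, seed=0):
    """Shuffle `indices` and deal them into k nearly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, k_outer=5, k_inner=5, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for nested CV.

    inner_splits holds (inner_train, inner_val) pairs drawn only from
    outer_train, so the outer test fold never influences tuning or
    feature ranking (assumed here to happen in the inner loop).
    """
    outer_folds = kfold_indices(range(n_samples), k_outer, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [idx for j, fold in enumerate(outer_folds)
                       if j != i for idx in fold]
        inner_folds = kfold_indices(outer_train, k_inner, seed + 1 + i)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [idx for n, fold in enumerate(inner_folds)
                           if n != m for idx in fold]
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a full analysis, a classifier would be fitted on each `inner_train` set to choose settings and rank features, refitted on `outer_train`, and scored with `auc` on `outer_test`. The pairwise AUC shown here is quadratic in the number of samples, which is acceptable for a dataset of roughly a thousand knees but would be replaced by a sort-based version for larger data.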