Background/Aim: Automated analysis of plain radiographs for prospective hip fracture (HF) risk prediction using artificial intelligence (AI) methods could expand the availability of diagnostic tests, automate the identification of patients at risk, and potentially improve it.
AI-driven computer vision, first applied in 1958, is by no means a new technique. After the field endured several “AI winters”, key contributions eventually led to breakthroughs, in particular for deep convolutional neural networks (CNNs). Most notably, the jump in image classification accuracy achieved by AlexNet in the 2012 ImageNet challenge triggered the so-called “AI spring”.
At the Intelligent Imaging Lab (I2Lab) in Kiel, different AI-related approaches are investigated, both for diagnostic and prospective outcomes. Besides image classification tasks (e.g. COVID-19, stroke, osteoporosis, fractures or anomalies) on radiographs, computed tomography (CT) and/or magnetic resonance imaging (MRI) data, localization and segmentation tasks for identification, labeling and placement of regions/volumes of interest (ROI/VOI) are also an important and substantial part of the desired automated processing pipeline.
AI approaches require large, high-quality datasets. At I2Lab, we started a prognostic hip fracture (HF) risk prediction study, “Study of Osteoporotic Fractures (SOF) - Investigation by Artificial Intelligence” (SOFIA), and will present our findings in relation to other similar prognostic studies. To gain better access to the large datasets required for future studies, we at I2Lab are keen to cooperate in federated learning studies.
Methods: In SOFIA, we investigate how deep CNNs, in conjunction with automatic ROI placement based on a key-point-detector CNN (CenterNet), predict hip fracture risk from digitized pelvic radiographs. We developed and tested three different AI models pre-trained on ImageNet data, two based on ResNet50 and one on DenseNet121. Areal bone mineral density of the femoral neck (aBMD_FN) from dual-energy X-ray absorptiometry (DXA) served as the reference standard. Cox proportional hazards models incorporating aBMD_FN or AI-based risk estimates, without and with age & BMI adjustment, were compared for differences in Harrell’s C.
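As an illustration of the comparison metric named above, the following is a minimal pure-Python sketch of Harrell's C (concordance index) for right-censored data. It is not the study's implementation; the function name `harrell_c` and the toy follow-up times, event flags and risk scores are hypothetical and chosen only to show the pair-counting logic. Pairs with tied follow-up times are skipped here for simplicity, which is an assumption rather than a description of the software used in SOFIA.

```python
from typing import Sequence


def harrell_c(times: Sequence[float],
              events: Sequence[int],
              risks: Sequence[float]) -> float:
    """Harrell's C for right-censored data (simplified sketch).

    A pair (i, j) is comparable when subject i had the event and a
    strictly shorter follow-up time than subject j. The pair is
    concordant when the subject with the earlier event also has the
    higher predicted risk; tied risks count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in years, 1 = incident fracture, 0 = censored.
toy_times = [2.0, 4.0, 6.0, 8.0, 10.0]
toy_events = [1, 0, 1, 0, 1]
toy_risks = [0.9, 0.3, 0.4, 0.5, 0.1]  # e.g. a model's predicted risk score

if __name__ == "__main__":
    print(harrell_c(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to a non-informative predictor and 1.0 to perfect ranking. Comparing two predictors on the same subjects, as done in the abstract, additionally requires a test for the difference in C, which this sketch does not cover.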
Results: Of a total of 7964 women (age 71.6±5.1 years at baseline), preprocessing resulted in a dataset of 6338 women for training and validation of the deep CNNs (with 924 incident HF during 14.0±6.3 years of follow-up) and of 1252 women for the test dataset (with 184 incident HF during 15.0±5.7 years of follow-up). aBMD_FN and all of the AI predictors were significantly associated with HF incidence in univariate and in age & BMI adjusted models (all p<0.001, table). Age & BMI adjusted DenseNet-based predictors showed significantly better predictive power than age-adjusted aBMD_FN in the same subjects (p<0.05).
Conclusion: Automated AI-driven analysis of pelvic radiographs based on a DenseNet121 model predicted HF better than DXA-based aBMD of the femoral neck and may thus enable sites without DXA access to achieve high-quality HF risk predictions.