Predicting Postoperative Progression of Ossification of the Posterior Longitudinal Ligament in the Cervical Spine Using Interpretable Radiomics Models

Article information

Neurospine. 2025;22(1):144-156
Publication date (electronic) : 2025 March 31
doi : https://doi.org/10.14245/ns.2448846.423
1Department of Radiology, Peking University Third Hospital, Beijing, China
2Department of Spinal Surgery, Peking University Third Hospital, Beijing, China
Corresponding Author: Ning Lang, Department of Radiology, Peking University Third Hospital, 49 North Garden Road, Haidian District, Beijing, 100191, China. Email: langning800129@126.com
*Siyuan Qin and Ruomu Qu contributed equally to this study as co-first authors.
Received 2024 August 22; Revised 2024 October 27; Accepted 2024 November 7.

Abstract

Objective

This study investigates the potential of radiomics to predict postoperative progression of ossification of the posterior longitudinal ligament (OPLL) after posterior cervical spine surgery.

Methods

This retrospective study included 473 patients diagnosed with OPLL at Peking University Third Hospital between October 2006 and September 2022. Patients underwent posterior spinal surgery and had at least 2 computed tomography (CT) examinations spaced at least 1 year apart. OPLL progression was defined as an annual growth rate exceeding 7.5%. Radiomic features were extracted from preoperative CT images of the OPLL lesions, followed by feature selection using correlation coefficient analysis and least absolute shrinkage and selection operator, and dimensionality reduction using principal component analysis. Univariable analysis identified significant clinical variables for constructing the clinical model. Logistic regression models, including the Rad-score model, clinical model, and combined model, were developed to predict OPLL progression.

Results

Of the 473 patients, 191 (40.4%) experienced OPLL progression. On the testing set, the combined model, which incorporated the Rad-score and clinical variables (area under the receiver operating characteristic curve [AUC] = 0.751), outperformed both the radiomics-only model (AUC = 0.693) and the clinical model (AUC = 0.620). Calibration curves demonstrated good agreement between predicted probabilities and observed outcomes, and decision curve analysis confirmed the clinical utility of the combined model. SHAP (SHapley Additive exPlanations) analysis indicated that the Rad-score and age were key contributors to the model’s predictions, enhancing clinical interpretability.

Conclusion

Radiomics, combined with clinical variables, provides a valuable predictive tool for assessing the risk of postoperative progression in cervical OPLL, supporting more personalized treatment strategies. Prospective, multicenter validation is needed to confirm the utility of the model in broader clinical settings.

INTRODUCTION

Ossification of the posterior longitudinal ligament (OPLL) is a degenerative spinal condition characterized by pathological ossification along the posterior longitudinal ligament, which can lead to significant spinal canal stenosis, neurological deficits, and, in severe cases, cervical myelopathy [1]. OPLL most commonly affects the cervical spine, with a cross-sectional study reporting a prevalence of approximately 4.1% in the Chinese population [2]. The progression and clinical impact of OPLL can vary significantly among patients, creating substantial challenges in management and prognosis.

The natural history of OPLL is not fully understood, but it is known that the ossified mass can enlarge over time, exacerbating spinal cord compression and worsening clinical outcomes [3]. Some studies have shown that more than 60% of patients experience OPLL progression after posterior cervical surgery [4-7]. This progression can negatively impact the outcomes of spinal decompression surgery, leading to symptom recurrence or the need for additional surgical interventions [8]. Therefore, predicting the likelihood of postoperative progression is crucial for optimizing treatment plans and improving long-term patient outcomes. While some clinical studies have identified risk factors for postoperative OPLL progression, such as age, OPLL subtype, and surgical technique, no studies have yet developed predictive models for this progression [9-12].

Radiomics, an emerging field that extracts a large number of quantitative features from medical imaging, has shown promise in enhancing the prediction of various disease outcomes [13]. By analyzing these high-dimensional data, radiomics can capture intralesional heterogeneity that might not be apparent on standard imaging. In recent years, radiomics has been increasingly applied in oncology for tumor characterization, treatment response prediction, and survival analysis, but its application in degenerative spinal diseases remains underexplored [14-17]. Some researchers have applied this approach to predict the minimal clinically important difference after OPLL surgery, demonstrating the potential of radiomics in OPLL prognostication by providing additional information for predicting surgical outcomes through the extraction of high-dimensional features from OPLL lesions [18].

This study aims to develop interpretable radiomics models to predict the postoperative progression of cervical OPLL. By integrating clinical and radiomics data, the study seeks to enhance the accuracy of prognostic predictions, potentially leading to more personalized treatment and postoperative management strategies for patients undergoing spinal surgery for OPLL.

MATERIALS AND METHODS

1. Patients

The study was performed according to the Helsinki Declaration (http://www.wma.net/en/30publications/10policies/b3/) and approved by the Institutional Review Board (IRB) of Peking University Third Hospital on June 18, 2024 (approval No. IRB00006761-M2024489). Due to the retrospective study design and the use of existing data, the requirement for informed consent was waived. Patient information was retrieved from electronic medical records for a retrospective analysis of individuals diagnosed with OPLL at Peking University Third Hospital between October 2006 and September 2022. Inclusion criteria were: (1) posterior spinal surgery and (2) availability of at least 2 computed tomography (CT) examinations spaced at least 1 year apart. Exclusion criteria were: (1) additional anterior procedures or revision surgeries and (2) an initial OPLL lesion with too few voxels for feature extraction. CT scans were performed both before surgery and at the final follow-up. Demographic data (such as sex and age) were extracted from the electronic health records, and radiological assessments were made on the x-ray, CT, and magnetic resonance images from both the preoperative and the final postoperative visits. The patient inclusion and exclusion process is illustrated in Fig. 1.

Fig. 1.

Patient inclusion and exclusion process. OPLL, ossification of the posterior longitudinal ligament; CT, computed tomography.

2. CT Acquisition Scan Protocol

Initial cervical spine CT scans were performed using the following equipment: GE CT620 (GE Healthcare, Chicago, IL, USA), GE CT750HD (GE Healthcare), GE Rev (GE Healthcare), SE FLASH (Siemens Healthineers, Erlangen, Germany), UI CT790 (United Imaging Healthcare, Shanghai, China), UI CT960 (United Imaging Healthcare), and UI CT710 (United Imaging Healthcare). All CT scans were performed using parameters set at 120 kV and smart mA, with reconstructions oriented along the long axis of the cervical pedicles. The slice thickness was set at 3 mm.

3. Calculation of the Annualized Growth Rate of OPLL Volume

First, the preoperative and the most recent postoperative follow-up axial CT images of the cervical spine were downloaded from the hospital’s PACS (Picture Archiving and Communication System). These images were then imported into ITK-SNAP (ver. 4.0.1) software. The window width and level were adjusted to optimally display the lesions. A radiologist with 4 years of experience segmented the OPLL lesions at the C1–7 levels slice by slice on the images, marking them as the region of interest (ROI). After the initial segmentation, a senior musculoskeletal radiologist with 22 years of experience reviewed and modified the ROI. Upon completion of the segmentation, the software automatically generated a 3-dimensional (3D) volume of interest (VOI) and calculated the lesion volume. The annual growth rate of OPLL volume was calculated using the following formula:

$$\text{Annual growth rate} = \frac{\text{OPLL volume}_{\text{post}} - \text{OPLL volume}_{\text{pre}}}{\text{Follow-up time (months)}} \times 12 \times 100\%$$

Following Katsumi et al. [11], patients with an annual growth rate greater than 7.5% were assigned to the OPLL progression (P) group, and the remaining patients were assigned to the nonprogression (NP) group. Fig. 2 illustrates the segmentation of OPLL and the calculation of the annualized growth rate.

Fig. 2.

A 56-year-old male patient underwent laminoplasty for cervical myelopathy. (A) Axial computed tomography (CT) image of the cervical spine at initial presentation; the red area indicates the ROI for OPLL. (B) Three-dimensional representation of the lesion preoperatively, with a volume of 1,638 mm³. (C) Axial CT image of the cervical spine 80 months postoperatively. (D) Three-dimensional representation of the lesion 80 months postoperatively, with a volume of 3,419 mm³. The annual growth rate of the OPLL volume was calculated to be 16.3%. ROI, region of interest; OPLL, ossification of the posterior longitudinal ligament.
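The Fig. 2 example can be checked with a few lines of Python. This is a minimal sketch, not the authors' code: the function names are hypothetical, and it assumes that the volume change is expressed relative to the preoperative volume, since that normalization is what reproduces the reported 16.3% from the volumes and follow-up time given in the caption.

```python
def annual_growth_rate(vol_pre_mm3, vol_post_mm3, follow_up_months):
    """Annualized relative volume change, in percent per year.

    Assumption: the change is normalized by the preoperative volume.
    """
    if vol_pre_mm3 <= 0 or follow_up_months <= 0:
        raise ValueError("volume and follow-up time must be positive")
    relative_change = (vol_post_mm3 - vol_pre_mm3) / vol_pre_mm3
    return relative_change / follow_up_months * 12 * 100


def is_progression(rate_percent, threshold_percent=7.5):
    """Progression (P) group if the annual rate exceeds the threshold."""
    return rate_percent > threshold_percent


# Values from the Fig. 2 example: 1,638 mm^3 -> 3,419 mm^3 over 80 months.
rate = annual_growth_rate(1638, 3419, 80)
print(f"{rate:.1f}% per year, progression: {is_progression(rate)}")
# prints: 16.3% per year, progression: True
```

With these inputs the patient falls in the P group, since 16.3% exceeds the 7.5% threshold.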

4. Radiomics Feature Extraction

Feature extraction was performed on the VOI of the preoperative OPLL volume. Because multiple CT devices were used in this study, and to minimize the impact of imaging protocol variations, the original images were resampled to a voxel size of 1 mm × 1 mm × 1 mm prior to feature extraction. The min-max normalization method was then used to normalize the voxel values of the images to the [0, 1] range. CT-based radiomic features were extracted using the Pyradiomics package (ver. 3.0) in Python (ver. 3.7.3), following the guidelines of the Image Biomarker Standardization Initiative [19].
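The intensity normalization step can be illustrated with a short sketch. It shows only min-max scaling on a flat list of intensities; the resampling step would normally be done with an imaging library and is not shown. The function name and the sample intensities are hypothetical.

```python
def min_max_normalize(values):
    """Scale intensities linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant image has no contrast to preserve; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical CT intensities (Hounsfield units) from a small region.
intensities = [-100.0, 0.0, 400.0, 900.0]
scaled = min_max_normalize(intensities)
print(scaled)
```

Because the scaling depends on the minimum and maximum of each image, extreme voxels influence the result, which is one reason to apply it after the region and resampling are fixed.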

The manually crafted features can be divided into 3 groups: (1) shape (14 features), (2) intensity (18 features), and (3) texture (73 features). Shape features describe the 3D shape characteristics of OPLL. Intensity features describe the first-order statistical distribution of voxel intensities within the OPLL. Texture features describe patterns or the second- and higher-order spatial distributions of intensities. Various methods were used to extract texture features, including the Gray Level Co-occurrence Matrix, the Gray Level Run Length Matrix, the Gray Level Size Zone Matrix, and the Neighboring Gray Tone Difference Matrix methods.

In addition to extracting features from the original images, the same types of features were extracted from filtered images using several filters, including the Laplacian of Gaussian (with sigma values of 1.0, 2.0, and 3.0 mm), Wavelet, Local Binary Patterns in 3D, Exponential, Square, SquareRoot, Logarithm, and Gradient filters. In total, 1,834 radiomic features were extracted from each VOI.

5. Feature Selection

In the feature selection process, we first grouped the dataset and split it into training and testing sets in a 7:3 ratio. Following this, we standardized the features to have a mean of zero and a standard deviation of one, ensuring that all features contributed equally to the analysis and preventing larger-scale features from disproportionately influencing the model. The standardized dataset then underwent an initial feature removal step in which constant features were discarded. Next, we addressed multicollinearity by calculating the correlation matrix for all numerical features. We identified pairs with correlation coefficients greater than 0.8 and retained only one feature from each pair. Subsequently, we applied the least absolute shrinkage and selection operator (LASSO) for further feature selection, using 5-fold cross-validation to determine the optimal regularization parameter (alpha). To further reduce feature dimensionality and minimize interfeature correlation, we employed principal component analysis (PCA) to extract principal components that cumulatively explained 95% of the variance, enhancing the model’s computational efficiency and robustness.
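The correlation-based pruning step can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: it assumes Pearson correlation (the coefficient type is not specified above), uses a greedy rule that keeps the first feature of each highly correlated group, and runs on a hypothetical toy table. The LASSO and PCA steps are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # constant columns are assumed to be removed beforehand
    return cov / (sd_x * sd_y)


def drop_correlated(features, threshold=0.8):
    """Greedy filter: keep a feature only if |r| <= threshold with all kept ones."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy features: f2 is almost a multiple of f1, f3 is unrelated.
toy = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 3.9, 6.2, 8.0, 9.9],
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(drop_correlated(toy))  # prints: ['f1', 'f3']
```

Which member of a correlated pair survives depends on feature order in this greedy variant, so the retained set is not unique.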

6. Collection of Baseline Clinical Information

Age, sex, surgical approach, number of treatment segments, initial OPLL volume, and OPLL type were recorded as baseline clinical and radiological information. The surgical approaches included laminoplasty (LMP) and posterior decompression with instrumented fusion (PDF). OPLL types included continuous, segmental, mixed, and localized types. On cervical spine x-rays, the C2–7 angle and range of motion (ROM) were measured. The C2–7 angle was defined as the angle between the lower endplates of C2 and C7. Extension ROM was calculated as the change in the C2–7 angle from the neutral position to the extension position, and flexion ROM as the change in angle from the neutral to the flexion position. On sagittal T2-weighted magnetic resonance images, the signal intensity at the narrowest segment of the spinal cord was classified into 3 grades: grade 0, no high signal intensity; grade 1, mild high signal intensity (blurred signal); grade 2, high-intensity signal (bright signal).

7. Statistical Analysis

1) Statistical methods for clinical variables

Continuous variables, including age, initial OPLL volume, C2–7 angle, extension ROM, and flexion ROM, were tested for normality (Shapiro-Wilk test) and subsequently analyzed using either the t-test or the Mann-Whitney U-test. Categorical variables, including sex, surgical approach, number of treatment segments, OPLL type, and intramedullary signal intensity (ISI), were analyzed using chi-square tests. We employed the Kruskal-Wallis H-test to evaluate the significant differences in annual progression rates among different OPLL types. Following this, the Dunn test was conducted to identify specific pairs of OPLL types with significant differences. This analysis utilized pairwise Mann-Whitney U-tests with Bonferroni correction to control for type I errors resulting from multiple comparisons.

2) Development of logistic regression models

Spearman correlation analysis was used to calculate the correlation coefficients between different principal components, which were visualized using a heatmap. The selected principal components were used to develop a logistic regression (LR) model, and the optimal model parameters were determined through grid search. The Rad-score was calculated based on the principal components and the coefficients of the LR model, as described in Supplementary Material 1. Clinical variables with p<0.05 in univariable analysis were used to develop a clinical model. A combined model was constructed using both clinical variables and the Rad-score. The model’s performance was evaluated using the area under the receiver operating characteristic curve (AUC) with a 95% confidence interval (CI), and the optimal classification threshold was determined using the maximum Youden index. Sensitivity, specificity, precision, recall, F1 score, accuracy, positive predictive value (PPV), and negative predictive value (NPV) were calculated at this threshold.
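Threshold selection by the maximum Youden index (J = sensitivity + specificity − 1) can be sketched as below. The labels and predicted probabilities are hypothetical toy values, and the function name is an assumption; the study's own predictions are not reproduced here.

```python
def youden_threshold(y_true, y_prob):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is predicted positive when its probability is >= threshold.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best_t, best_j = None, -1.0
    for t in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= t)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < t)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical labels (1 = progression) and predicted probabilities.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.70, 0.90]
threshold, j_stat = youden_threshold(y_true, y_prob)
print(threshold, round(j_stat, 3))  # prints: 0.4 0.8
```

At the selected threshold of 0.40 all three positives are detected and four of five negatives are correctly rejected, giving J = 1.0 + 0.8 − 1 = 0.8.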

3) Model evaluation, clinical utility, and interpretability techniques

A nomogram was constructed based on the LR model to provide an individualized prediction tool for clinical use. Calibration curves were used to assess the agreement between predicted probabilities and observed outcomes for the LR models. Decision curve analysis (DCA) was conducted to determine the clinical utility of the model by quantifying the net benefit across a range of threshold probabilities. SHAP (SHapley Additive exPlanations) analysis was performed to interpret the contributions of individual features to the model’s predictions. SHAP values indicate the impact of each feature on the prediction, allowing for a comprehensive assessment of feature importance and how different features influence the model output.
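The net benefit used in decision curve analysis can be written out explicitly. The sketch below uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt) at threshold probability pt, together with the "treat all" reference strategy. The data are hypothetical toy values and the function names are assumptions.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical labels (1 = progression) and predicted probabilities.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
probs = [0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.70, 0.90]

nb_model = net_benefit(labels, probs, 0.4)
nb_all = net_benefit_treat_all(labels, 0.4)
print(round(nb_model, 4), round(nb_all, 4))  # prints: 0.2917 -0.0417
```

A decision curve repeats this calculation over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy).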

A 2-sided p-value of less than 0.05 was considered statistically significant.

The radiomics workflow in this study is shown in Fig. 3.

Fig. 3.

The radiomics workflow in this study. CT, computed tomography; 3D, 3-dimensional; ROI, region of interest; PCA, principal component analysis; ROC, receiver operating characteristic; SHAP, SHapley Additive exPlanations.

RESULTS

1. Baseline Patient Information

A total of 473 patients were included in this study, with a follow-up period ranging from 12 to 202 months and a median follow-up time of 22 months. Among these, 191 patients experienced OPLL progression and were classified into the P group (annual progression rate > 7.5%), while 282 patients did not show OPLL progression and were classified into the NP group. The P group had a significantly higher average initial OPLL volume (2,250 mm³) compared to the NP group (1,883 mm³, p=0.027). Additionally, the P group demonstrated a significantly higher median annual progression rate of 13.7%, while the NP group had a median annual progression rate of only 2.0% (p<0.001). The P group had a significantly lower average age (52.94±8.89 years) compared to the NP group (57.06±9.00 years, p<0.001). For preoperative OPLL type, the distribution differed significantly between the 2 groups (p<0.001). In terms of other preoperative clinical and radiological information, there were no significant differences between the 2 groups in sex, C2–7 angle, extension ROM, flexion ROM, ISI, number of treatment segments, or surgical approach. Detailed patient information is shown in Table 1.

Table 1.

Detailed patient information

2. Analysis of Progression Rates in Different Types of OPLL

In this study, the mixed type of OPLL exhibited the highest median annual progression rate at 7.5% (interquartile range [IQR], 4.3%–14.9%), followed by the continuous type at 5.1% (IQR, 1.8%–9.8%) and the segmental type at 4.9% (IQR, 0%–12.5%). The localized type showed the lowest median annual progression rate of 0% (IQR, 0%–4.3%) (Fig. 4A). The Kruskal-Wallis H-test revealed significant differences in annual progression rates among the different OPLL types (H=39.703, p<0.001). Post hoc pairwise comparisons using the Bonferroni correction demonstrated that the mixed type had significantly higher progression rates compared to the segmental (p=0.002), localized (p<0.001), and continuous types (p=0.046). Additionally, the progression rate of the localized type was also significantly lower compared to the segmental type (p=0.001) and the continuous type (p<0.001) (Supplementary Tables 1, 2). These findings underscore the variability in progression rates across OPLL types, suggesting that specific type-related factors may influence disease progression.
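The size of the multiple-comparison problem is easy to make concrete: 4 OPLL types give 6 pairwise comparisons, so a Bonferroni adjustment multiplies each raw p-value by 6 (capped at 1). The sketch below illustrates only this arithmetic. The raw p-values are hypothetical and are not taken from the study.

```python
from itertools import combinations


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


opll_types = ["continuous", "segmental", "mixed", "localized"]
pairs = list(combinations(opll_types, 2))
print(len(pairs))  # prints: 6

# Hypothetical raw p-values, one per pair, for illustration only.
raw_p = [0.40, 0.004, 0.0001, 0.010, 0.0002, 0.30]
for (a, b), p_adj in zip(pairs, bonferroni(raw_p)):
    print(f"{a} vs {b}: adjusted p = {p_adj:.4f}")
```

Under this correction a raw p-value must fall below 0.05/6 ≈ 0.0083 to remain significant at the 0.05 level.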

Fig. 4.

(A) Boxplot of annual progression rates by OPLL type. The distribution of annual progression rates is shown for 4 OPLL types: continuous, segmental, mixed, and localized. The boxes represent the interquartile range (IQR) with medians indicated by horizontal lines. Whiskers extend to 1.5 times the IQR, and outliers are shown as points. Significant differences between OPLL types are marked with asterisks (*) based on Dunn test with Bonferroni correction. (B) Changes in cervical spine ossification types from preoperation to postoperation. Each line represents an individual patient’s transition between segmental, mixed, continuous, and localized ossification types before and after operation. OPLL, ossification of the posterior longitudinal ligament.

During the follow-up period, regarding OPLL classification, 5 patients progressed from localized type to segmental or mixed type, 42 patients progressed from segmental type to mixed or continuous type, 30 patients progressed from mixed type to continuous type, and 1 patient with continuous type developed a new segment, becoming mixed type (Fig. 4B).

3. Feature Selection and Model Development

Following correlation coefficient analysis and LASSO, 16 radiomics features were retained (Supplementary Table 3). After PCA, the first 13 principal components were retained. Detailed procedures and parameters for LASSO and PCA are provided in Supplementary Materials 2.1 and 2.2. As shown in Supplementary Fig. 4, the correlations between the 13 principal components were very low, making them suitable for LR modeling. The calculation formula for the Rad-score is provided in Supplementary Material 1. The clinical model was developed using 3 variables: age, preoperative OPLL type, and preoperative OPLL volume. The combined model was constructed using both the Rad-score and clinical variables. Table 2 summarizes the performance metrics of the Rad-score, clinical, and combined models on the training and testing sets. The combined model achieved the best overall performance, with the highest AUC for both the training set (0.777; 95% CI, 0.729–0.826) and the testing set (0.751; 95% CI, 0.670–0.828). The AUC of the combined model on the testing set was significantly higher than that of the Rad-score model (AUC=0.693, p=0.009) and the clinical model (AUC=0.620, p=0.001) (Supplementary Table 4). On the testing set, the combined model’s sensitivity, specificity, accuracy, precision, recall, F1 score, PPV, and NPV were 0.654, 0.733, 0.704, 0.586, 0.654, 0.618, 0.586, and 0.786, respectively.
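The relationships among the reported testing-set metrics can be verified from a single confusion matrix. The counts below are not reported in the text; they are a hypothetical reconstruction, assuming a testing set of 142 patients (30% of 473), chosen because they reproduce every rounded value listed above. Note that precision equals PPV and recall equals sensitivity, which is why those values repeat.

```python
# Hypothetical counts (an assumption, not taken from the paper), n = 142.
tp, fn, fp, tn = 34, 18, 24, 66

sensitivity = tp / (tp + fn)            # also called recall
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)                    # also called precision
npv = tn / (tn + fn)
accuracy = (tp + tn) / (tp + fn + fp + tn)
f1 = 2 * ppv * sensitivity / (ppv + sensitivity)

metrics = {
    "sensitivity": sensitivity,
    "specificity": specificity,
    "accuracy": accuracy,
    "PPV": ppv,
    "NPV": npv,
    "F1": f1,
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, these give 0.654, 0.733, 0.704, 0.586, 0.786, and 0.618, matching the values reported for the combined model on the testing set.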

Table 2.

Performance metrics of different models on the training and testing sets

Fig. 5A and B display the receiver operating characteristic curves for the Rad-score, clinical, and combined models. Fig. 5C shows the nomogram developed to predict the probability of OPLL progression, integrating Rad-score, preoperative OPLL type, preoperative OPLL volume, and age. The nomogram provides an individualized risk prediction, allowing for practical use in clinical decision-making. Supplementary Fig. 5A and B depict the calibration curves for the combined model on the training and testing sets, respectively. Both curves indicate good calibration, with the bias-corrected line closely following the ideal diagonal line, suggesting a strong agreement between predicted probabilities and observed outcomes. The mean absolute error was 0.012 for the training set and 0.029 for the test set, indicating minimal deviation from ideal calibration. Supplementary Fig. 5C and D present the DCA for the training and testing sets, respectively. The combined model consistently demonstrated a higher net benefit compared to the Rad-score and clinical models across a wide range of threshold probabilities. Fig. 5D and E depict the waterfall plots of predicted probabilities for the training and testing sets of the combined model, respectively. These plots highlight the model’s ability to effectively distinguish between the 2 classes, particularly in the testing set.

Fig. 5.

(A, B) Receiver operating characteristic (ROC) curves for both the training and test datasets. The ROC curves illustrate the performance of 3 different models: the Rad-score model, the clinical model, and the combined model. (C) Nomogram for predicting OPLL progression. The nomogram integrates the Rad-score, preoperative OPLL type, preoperative OPLL volume, and age to predict the likelihood of OPLL progression. To use the nomogram, locate the patient’s value on each predictor axis, draw a line vertically upward to determine the corresponding point value, and sum these points to calculate the total score. The total score is then mapped to the linear predictor and probability of event scales, indicating the estimated risk of postoperative OPLL progression. (D, E) Waterfall plots of predicted probabilities for the training set (D) and testing set (E). Each bar represents a sample, colored according to the actual label (red for label 0, indicating nonprogression, and blue for label 1, indicating progression). The x-axis shows the sample index, sorted by increasing predicted probability, while the y-axis represents the predicted probability of progression. The horizontal red dashed line represents the decision threshold, with predictions above this threshold indicating “progression.” OPLL, ossification of the posterior longitudinal ligament; AUC, area under the receiver operating characteristic curve.

4. SHAP Analysis

Fig. 6A–C illustrates the SHAP analysis results to interpret the contribution of each feature to the model’s predictions. Fig. 6A presents the summary plot of SHAP values, indicating the impact of individual features on the model output. The Rad-score had the highest contribution to the model’s prediction, followed by age. Features such as preoperative OPLL type (segmental, mixed, localized, continuous) and preoperative OPLL volume also contributed, but to a lesser extent. Fig. 6B shows the average absolute SHAP values for each feature, providing an overview of their importance. The Rad-score emerged as the most significant predictor, with a mean SHAP value much higher than the other features. Age also showed a considerable impact, while the different OPLL types and volume had relatively lower contributions. Fig. 6C depicts the SHAP decision plot, illustrating the individual effects of the features on the model output. It shows the interaction effects, with high Rad-score and age values being strong predictors of the outcome, as reflected by positive SHAP values. The plot suggests that both imaging and clinical variables contribute meaningfully to the model’s decision-making process, enhancing the model’s interpretability in predicting OPLL progression.

Fig. 6.

SHAP analysis for model interpretability. (A) A summary plot showing the SHAP values of each feature, with dot color indicating the feature value (blue for low and pink for high). (B) The feature importance ranked by mean absolute SHAP values, indicating the average contribution of each feature to the model. (C) A decision plot, visualizing how each feature sequentially contributes to individual predictions, with color representing feature values and lines showing cumulative impact on the model output. SHAP, SHapley Additive exPlanations; OPLL, ossification of the posterior longitudinal ligament.
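For a linear model such as logistic regression, SHAP values have a simple closed form when features are treated as independent: the contribution of feature i on the log-odds scale is its coefficient times the difference between the patient's value and the background mean. The sketch below illustrates this identity only. The coefficients, feature values, and background means are hypothetical and are not the fitted values from this study.

```python
def linear_shap(weights, x, background_means):
    """SHAP values of a linear model under the feature-independence assumption.

    Each value is weight * (feature value - background mean), on the
    log-odds scale for logistic regression.
    """
    return [w * (xi - mu) for w, xi, mu in zip(weights, x, background_means)]


def log_odds(intercept, weights, x):
    """Linear predictor of a logistic regression model."""
    return intercept + sum(w * xi for w, xi in zip(weights, x))


# Hypothetical model with two features: Rad-score and age (years).
intercept = 2.0
weights = [1.2, -0.05]
background_means = [0.0, 55.0]
patient = [0.8, 50.0]

phi = linear_shap(weights, patient, background_means)
baseline = log_odds(intercept, weights, background_means)
prediction = log_odds(intercept, weights, patient)
print([round(v, 3) for v in phi])
print(round(prediction - baseline, 3), round(sum(phi), 3))
```

The two numbers on the last line agree, which is the additivity property: the SHAP values sum to the difference between the patient's prediction and the baseline prediction.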

DISCUSSION

The findings of our study indicate that the combined model, integrating Rad-score and clinical variables, outperformed both the Rad-score-only and clinical models in predicting the postoperative progression of OPLL, achieving an AUC of 0.751 on the testing set. This demonstrates the value of combining quantitative imaging biomarkers with clinical information to improve predictive accuracy in OPLL patients. The Rad-score-only model achieved an AUC of 0.693, highlighting the utility of imaging features for risk stratification, while the clinical model alone showed limited predictive power with an AUC of 0.620. The incorporation of clinical variables, such as age, preoperative OPLL type, and preoperative OPLL volume, with the Rad-score significantly enhanced the model’s discriminative ability. These results emphasize the importance of a multimodal approach, integrating Rad-score and clinical data to provide a more comprehensive prediction model, potentially facilitating more informed clinical decision-making and better individualized treatment strategies for OPLL patients.

Our analysis revealed that age, preoperative OPLL type, and initial OPLL volume were significant clinical predictors of postoperative progression in OPLL patients. Specifically, our findings indicate that younger patients are more susceptible to OPLL progression, which aligns with previous studies [9,11,20]. This increased risk may be due to higher metabolic activity and the greater mechanical stress exerted on the spine from an increased ROM in younger individuals. We observed that the mixed type of OPLL exhibited the fastest progression, while the localized type showed the slowest progression. This contrasts with the findings of Doi et al. [9], who identified continuous-type OPLL as a risk factor for progression. The discrepancy could be attributed to the differing criteria used to define progression; Doi et al. [9] measured a 2-mm increase in lesion length in any direction, while our study used an annual volume increase of 7.5%. The larger initial volume in continuous-type OPLL might explain its lower progression rate under our criteria. Additionally, our study analyzed the progression patterns of OPLL in each patient, observing that 5 patients progressed from a localized to segmental or mixed type, 42 from segmental to mixed or continuous type, and 30 from mixed to continuous type. This indicates that the classification of OPLL is dynamic and subject to change, which may contribute to the instability of this factor. Our study is the first to find that patients with smaller initial OPLL volumes are less likely to experience OPLL progression, which may be attributed to the fact that smaller OPLL volumes are often associated with the localized type. Surgical techniques may also influence the rate of progression of OPLL. Although several studies have reported that PDF can inhibit OPLL progression by limiting cervical ROM [9,21-23], our results showed no significant difference in progression rates between PDF and LMP. This may be due to selection bias, as patients at higher risk of progression might be more likely to choose the PDF procedure.

Our study used 3D OPLL volume analysis, which, compared with 2-dimensional measurements, represents the entire lesion more comprehensively and allows precise evaluation of lesion changes. Additionally, for the first time, we applied radiomics methods to extract high-dimensional features from the OPLL lesion, capturing information that is difficult to observe with the naked eye. Our study also addresses 2 key challenges in radiomics: the high dimensionality of features and the difficulty of interpretation by clinicians. By combining LASSO for feature selection with PCA for dimensionality reduction, we reduced the risk of overfitting and improved model stability. PCA further compressed the information into fewer principal components, retaining the most relevant variance while reducing redundancy. Computing a single Rad-score from these principal components then simplified interpretation of the model’s output and gave clinicians a straightforward score for risk assessment, bridging the gap between complex radiomic analysis and practical clinical use.

Our findings illustrate the potential of radiomics and machine learning in predicting postoperative outcomes of OPLL surgery, consistent with recent work in this domain. For instance, Maki et al. [18] developed a model using machine learning and deep learning to predict surgical outcomes in cervical OPLL patients, achieving an accuracy of 71.9% and highlighting the significance of preoperative JOA scores and imaging features. Similarly, Ito et al. [24] demonstrated a model with 74.6% accuracy for overall complications and 91.7% for neurological complications, surpassing traditional LR models. Moreover, Kim et al. [25] found that a machine learning model using an adaptive reinforcement learning algorithm and downsampling outperformed conventional LR in predicting postoperative C5 palsy in cervical OPLL patients, with an AUC of 0.88 compared with 0.69.

Our study achieved an AUC of 0.751 and an accuracy of 70.4%, which are modest compared with some earlier studies. This may be attributed to our outcome being derived from a continuous variable, so that machine learning models may struggle with samples near the decision boundary. Our waterfall plots revealed that the model accurately predicts patients with very high or very low risks of OPLL progression, which can support surgical approach selection and postoperative management.
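As a minimal sketch of how a score of this kind can be formed, the snippet below computes a weighted sum of principal-component scores and maps it to a probability with a logistic function. The weights, intercept, and component values are hypothetical placeholders chosen for illustration; they are not the coefficients fitted in this study.

```python
import math


def rad_score(pc_scores, weights, intercept=0.0):
    """Weighted sum of principal-component scores (weights are hypothetical)."""
    if len(pc_scores) != len(weights):
        raise ValueError("pc_scores and weights must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, pc_scores))


def logistic(z):
    """Map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical example: three principal-component scores for one patient.
pcs = [0.8, -1.2, 0.3]
weights = [0.5, -0.25, 0.1]  # illustrative only, not taken from the study
score = rad_score(pcs, weights, intercept=-0.2)
probability = logistic(score)
```

Collapsing the components into one number is what makes the output easy to read at the bedside: a clinician sees a single score rather than a vector of abstract components.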

Our study has several limitations that merit attention. Firstly, while it represents the largest sample of OPLL patients with CT follow-up data to date, the single-center, retrospective design of our research poses a significant limitation. Future prospective data collection and external validation at other centers are essential to demonstrate the generalizability of our model. Secondly, our study did not consider a sufficient range of clinical and radiological factors, such as inflammatory markers, the number of OPLL-involved segments, and occupation rate. Including more of these factors in future research could potentially enhance model performance. Thirdly, our radiomics analysis focused solely on the lesion without considering surrounding structures, which may lead to incomplete information extraction.

CONCLUSION

Radiomics provides a valuable predictive tool for assessing the risk of postoperative progression in cervical OPLL, supporting more personalized treatment strategies. Further research with a prospective, multicenter approach is necessary to validate and enhance the model’s utility.

Supplementary Materials

Supplementary Materials 1-2, Supplementary Tables 1-4, and Supplementary Figs. 1-5 for this article are available at https://doi.org/10.14245/ns.2448846.423.

Supplementary Table 1.

Annual progression rates and proportional distribution of different ossification of the posterior longitudinal ligament types

ns-2448846-423-Supplementary-Table-1.pdf
Supplementary Table 2.

Post hoc pairwise comparisons of annual progression rates among different ossification of the posterior longitudinal ligament types

ns-2448846-423-Supplementary-Table-2.pdf
Supplementary Table 3.

Features selected by least absolute shrinkage and selection operator

ns-2448846-423-Supplementary-Table-3.pdf
Supplementary Table 4.

The results of the DeLong test for different models in the training and test set

ns-2448846-423-Supplementary-Table-4.pdf
Supplementary Fig. 1.

Least absolute shrinkage and selection operator (LASSO) cross-validation plot for optimal alpha selection. The plot shows the mean squared error (MSE) across different values of the regularization parameter, log(alpha). The blue points represent the mean MSE, with error bars indicating the 95% confidence interval (CI). The optimal value of alpha (0.0473) is indicated by the red dashed line, corresponding to the lowest MSE, which balances model complexity and predictive accuracy.

ns-2448846-423-Supplementary-Fig-1.pdf
Supplementary Fig. 2.

Principal component analysis (PCA) cumulative explained variance plot. The x-axis represents the number of principal components, while the y-axis indicates the cumulative explained variance.

ns-2448846-423-Supplementary-Fig-2.pdf
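A cumulative explained variance curve of this kind can be sketched as follows. This is a minimal illustration in plain Python; the eigenvalues and the 85% target are hypothetical and do not reflect the values used in this study.

```python
def cumulative_explained_variance(eigenvalues):
    """Cumulative fraction of total variance, largest component first."""
    total = sum(eigenvalues)
    running = 0.0
    curve = []
    for value in sorted(eigenvalues, reverse=True):
        running += value
        curve.append(running / total)
    return curve


def n_components_for(eigenvalues, target):
    """Smallest number of components whose cumulative variance reaches target."""
    curve = cumulative_explained_variance(eigenvalues)
    for k, fraction in enumerate(curve, start=1):
        if fraction >= target:
            return k
    return len(eigenvalues)


eigenvalues = [4.0, 3.0, 2.0, 1.0]  # hypothetical variances of 4 components
curve = cumulative_explained_variance(eigenvalues)
k = n_components_for(eigenvalues, target=0.85)  # hypothetical target
```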
Supplementary Fig. 3.

Three-dimensional (3D) scatter plot of the first 3 principal components (PC1, PC2, PC3) derived from principal component analysis (PCA). Each point represents a sample, with colors corresponding to different labels (label 0 and label 1). The plot visualizes the separation of the 2 classes based on the reduced feature set, highlighting how the first 3 principal components capture key variance in the data to differentiate between classes.

ns-2448846-423-Supplementary-Fig-3.pdf
Supplementary Fig. 4.

Correlation heatmap between the principal components (PC1 to PC13). The heatmap illustrates the correlation coefficients between the principal components, with values close to zero indicating minimal correlation. The diagonal elements represent a perfect correlation of each principal component with itself (correlation=1), while the off-diagonal elements show very low correlation between the principal components, demonstrating their independence.

ns-2448846-423-Supplementary-Fig-4.pdf
Supplementary Fig. 5.

The calibration curves for the combined model on the training (A) and test sets (B). In both figures, the dashed line represents the ideal calibration curve, where predicted probabilities perfectly match the actual outcomes. The solid line shows the bias-corrected calibration, while the dotted line represents the apparent calibration. Panel A, which corresponds to the training set, demonstrates a mean absolute error (MAE) of 0.012 with 331 samples, while panel B for the test set shows a slightly higher MAE of 0.029 with 142 samples, reflecting the generalization performance of the model. Decision curve analysis for both the training (C) and test sets (D) comparing the net benefit of the Rad-score model (red), clinical model (blue), and combined model (green). The gray lines represent the net benefit of treating all patients (“all”) and treating no patients (“none”). Across a wide range of threshold probabilities, the combined model (green) demonstrates superior net benefit compared with the Rad-score and clinical models alone, particularly at lower threshold probabilities in the test set, indicating better clinical decision-making performance of the combined model in both cohorts.

ns-2448846-423-Supplementary-Fig-5.pdf
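For readers unfamiliar with decision curve analysis, the net benefit plotted in panels C and D is conventionally computed as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The sketch below applies that standard formula; all counts are hypothetical and are not taken from this study.

```python
def net_benefit(tp, fp, n, threshold):
    """Standard decision-curve net benefit at one threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(n_pos, n, threshold):
    """Net benefit of the 'treat all' strategy: every patient is flagged."""
    return net_benefit(tp=n_pos, fp=n - n_pos, n=n, threshold=threshold)


# Hypothetical cohort of 100 patients, 40 of whom progress.
nb_model = net_benefit(tp=35, fp=20, n=100, threshold=0.2)
nb_all = net_benefit_treat_all(n_pos=40, n=100, threshold=0.2)
nb_none = 0.0  # 'treat none' has zero net benefit by definition
```

A model is considered useful at a given threshold when its net benefit exceeds both the "all" and "none" reference strategies.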

Notes

Conflict of Interest

The authors have nothing to disclose.

Funding/Support

This study received funding from the National Natural Science Foundation of China (No. 82371921) and the Proof of Concept Program of Zhongguancun Science City and Peking University Third Hospital (Grant No. HDCXZHKC2022202).

Author Contribution

Conceptualization: NL; Formal analysis: SYQ, RMQ; Methodology: SYQ, RMQ, KL, FFZ, NL; Visualization: SYQ; Data curation: SYQ, RMQ, KL, RXY, WLZ, JX, ELZ; Writing – original draft: SYQ, RMQ, KL; Writing – review & editing: SYQ, RMQ, FFZ, NL

References

1. Ono K, Yonenobu K, Miyamoto S, et al. Pathology of ossification of the posterior longitudinal ligament and ligamentum flavum. Clin Orthop Relat Res 1999;(359):18–26.
2. Liang H, Liu G, Lu S, et al. Epidemiology of ossification of the spinal ligaments and associated factors in the Chinese population: a cross-sectional study of 2000 consecutive individuals. BMC Musculoskelet Disord 2019;20:253.
3. Nouri A, Tessitore E, Molliqaj G, et al. Degenerative cervical myelopathy: development and natural history [AO spine RECODE-DCM research priority number 2]. Global Spine J 2022;12(1_suppl):39S–54S.
4. Fargen KM, Cox JB, Hoh DJ. Does ossification of the posterior longitudinal ligament progress after laminoplasty? Radiographic and clinical evidence of ossification of the posterior longitudinal ligament lesion growth and the risk factors for late neurologic deterioration. J Neurosurg Spine 2012;17:512–24.
5. Iwasaki M, Kawaguchi Y, Kimura T, et al. Long-term results of expansive laminoplasty for ossification of the posterior longitudinal ligament of the cervical spine: more than 10 years follow up. J Neurosurg 2002;96(2 Suppl):180–9.
6. Hori T, Kawaguchi Y, Kimura T. How does the ossification area of the posterior longitudinal ligament thicken following cervical laminoplasty? Spine (Phila Pa 1976) 2007;32:E551–6.
7. Kawaguchi Y, Kanamori M, Ishihara H, et al. Progression of ossification of the posterior longitudinal ligament following en bloc cervical laminoplasty. J Bone Joint Surg Am 2001;83:1798–802.
8. Kawaguchi Y, Nakano M, Yasuda T, et al. Clinical impact of ossification of the posterior longitudinal ligament progression after cervical laminoplasty. Clin Spine Surg 2019;32:E133–9.
9. Doi T, Sakamoto R, Horii C, et al. Risk factors for progression of ossification of the posterior longitudinal ligament in asymptomatic subjects. J Neurosurg Spine 2020;33:316–22.
10. Wang L, Jiang Y, Li M, et al. Postoperative progression of cervical ossification of posterior longitudinal ligament: a systematic review. World Neurosurg 2019;126:593–600.
11. Katsumi K, Watanabe K, Yamazaki A, et al. Predictive biomarkers of ossification progression and bone metabolism dynamics in patients with cervical ossification of the posterior longitudinal ligament. Eur Spine J 2023;32:1282–90.
12. Kang MS, Kim KH, Park JY, et al. Progression of cervical ossification of posterior longitudinal ligament after laminoplasty or laminectomy with posterior fixation. Clin Spine Surg 2019;32:363–8.
13. Nakajima H, Watanabe S, Honjoh K, et al. Long-term outcome of anterior cervical decompression with fusion for cervical ossification of posterior longitudinal ligament including postsurgical remnant ossified spinal lesion. Spine (Phila Pa 1976) 2019;44:E1452–60.
14. Pezoulas VC, Zaridis DI, Mylona E, et al. Synthetic data generation methods in healthcare: a review on open-source tools and methods. Comput Struct Biotechnol J 2024;23:2892–910.
15. Chen M, Copley SJ, Viola P, et al. Radiomics and artificial intelligence for precision medicine in lung cancer treatment. Semin Cancer Biol 2023;93:97–113.
16. Meng Y, Yang Y, Hu M, et al. Artificial intelligence-based radiomics in bone tumors: technical advances and clinical application. Semin Cancer Biol 2023;95:75–87.
17. Kang W, Qiu X, Luo Y, et al. Application of radiomics-based multiomics combinations in the tumor microenvironment and cancer prognosis. J Transl Med 2023;21:598.
18. Maki S, Furuya T, Katsumi K, et al. Multimodal deep learning-based radiomics approach for predicting surgical outcomes in patients with cervical ossification of the posterior longitudinal ligament. Spine (Phila Pa 1976) 2024;49:1561–9.
19. Zwanenburg A, Vallières M, Abdalah MA, et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology 2020;295:328–38.
20. Choi BW, Baek DH, Sheffler LC, et al. Analysis of progression of cervical OPLL using computerized tomography: typical sign of maturation of OPLL mass. J Neurosurg Spine 2015;23:539–43.
21. Shi H, Chen L, Zhu L, et al. Instrumented fusion versus instrumented non-fusion following expansive open-door laminoplasty for multilevel cervical ossification of the posterior longitudinal ligament. Arch Orthop Trauma Surg 2023;143:2919–27.
22. Katsumi K, Izumi T, Ito T, et al. Posterior instrumented fusion suppresses the progression of ossification of the posterior longitudinal ligament: a comparison of laminoplasty with and without instrumented fusion by three-dimensional analysis. Eur Spine J 2016;25:1634–40.
23. Ota M, Furuya T, Maki S, et al. Addition of instrumented fusion after posterior decompression surgery suppresses thickening of ossification of the posterior longitudinal ligament of the cervical spine. J Clin Neurosci 2016;34:162–5.
24. Ito S, Nakashima H, Yoshii T, et al. Deep learning-based prediction model for postoperative complications of cervical posterior longitudinal ligament ossification. Eur Spine J 2023;32:3797–806.
25. Kim SH, Lee SH, Shin DA. Could machine learning better predict postoperative C5 palsy of cervical ossification of the posterior longitudinal ligament? Clin Spine Surg 2022;35:E419–25.

Fig. 1.

Patient inclusion and exclusion process. OPLL, ossification of the posterior longitudinal ligament; CT, computed tomography.

Fig. 2.

A 56-year-old male patient underwent laminoplasty for cervical myelopathy. (A) Axial computed tomography (CT) image of the cervical spine at initial presentation; the red area indicates the ROI for OPLL. (B) Three-dimensional representation of the lesion preoperatively, with a volume of 1,638 mm³. (C) Axial CT image of the cervical spine 80 months postoperatively. (D) Three-dimensional representation of the lesion 80 months postoperatively, with a volume of 3,419 mm³. The annual growth rate of the OPLL volume was calculated to be 16.3%. ROI, region of interest; OPLL, ossification of the posterior longitudinal ligament.
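The 16.3% figure in this caption can be reproduced with a simple, non-compounded annual rate: the relative volume change divided by the follow-up time in years. The sketch below assumes that formula, which agrees with the caption's numbers but is an inference rather than a quotation of the authors' definition. The 7.5% cutoff is the progression threshold stated in the abstract.

```python
def annual_growth_rate(volume_pre, volume_post, months):
    """Simple (non-compounded) annual growth rate, in percent per year."""
    if volume_pre <= 0 or months <= 0:
        raise ValueError("volume_pre and months must be positive")
    relative_change = (volume_post - volume_pre) / volume_pre
    return 100.0 * relative_change / (months / 12.0)


PROGRESSION_THRESHOLD = 7.5  # percent per year, as defined in the abstract

# Values from the Fig. 2 caption: 1,638 mm^3 to 3,419 mm^3 over 80 months.
rate = annual_growth_rate(1638, 3419, 80)
progressed = rate > PROGRESSION_THRESHOLD
```

A compounded rate over the same interval would come out lower, so the choice of formula matters when comparing against the 7.5% threshold.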

Fig. 3.

The radiomics workflow in this study. CT, computed tomography; 3D, 3-dimensional; ROI, region of interest; PCA, principal component analysis; ROC, receiver operating characteristic; SHAP, SHapley Additive exPlanations.

Fig. 4.

(A) Boxplot of annual progression rates by OPLL type. The distribution of annual progression rates is shown for 4 OPLL types: continuous, segmental, mixed, and localized. The boxes represent the interquartile range (IQR) with medians indicated by horizontal lines. Whiskers extend to 1.5 times the IQR, and outliers are shown as points. Significant differences between OPLL types are marked with asterisks (*) based on Dunn test with Bonferroni correction. (B) Changes in cervical spine ossification types from preoperation to postoperation. Each line represents an individual patient’s transition between segmental, mixed, continuous, and localized ossification types before and after operation. OPLL, ossification of the posterior longitudinal ligament.
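With 4 OPLL types there are 6 pairwise comparisons, so a Bonferroni correction multiplies each raw p-value by 6 and caps it at 1. The sketch below shows only that correction step; the raw p-values are hypothetical placeholders and are not the study's results (those are in Supplementary Table 2).

```python
from itertools import combinations


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


opll_types = ["continuous", "segmental", "mixed", "localized"]
pairs = list(combinations(opll_types, 2))  # 6 pairwise comparisons

# Hypothetical raw p-values, one per pair, for illustration only.
raw_p = [0.001, 0.20, 0.004, 0.03, 0.50, 0.009]
adjusted = dict(zip(pairs, bonferroni(raw_p)))
significant = [pair for pair, p in adjusted.items() if p < 0.05]
```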

Fig. 5.

(A, B) Receiver operating characteristic (ROC) curves for both the training and test datasets. The ROC curves illustrate the performance of 3 different models: the Rad-score model, the clinical model, and the combined model. (C) Nomogram for predicting OPLL progression. The nomogram integrates the Rad-score, preoperative OPLL type, preoperative OPLL volume, and age to predict the likelihood of OPLL progression. To use the nomogram, locate the patient’s value on each predictor axis, draw a line vertically upward to determine the corresponding point value, and sum these points to calculate the total score. The total score is then mapped to the linear predictor and probability of event scales, indicating the estimated risk of postoperative OPLL progression. (D, E) Waterfall plots of predicted probabilities for the training set (D) and testing set (E). Each bar represents a sample, colored according to the actual label (red for label 0, indicating nonprogression, and blue for label 1, indicating progression). The x-axis shows the sample index, sorted by increasing predicted probability, while the y-axis represents the predicted probability of progression. The horizontal red dashed line represents the decision threshold, with predictions above this threshold indicating “progression.” OPLL, ossification of the posterior longitudinal ligament; AUC, area under the receiver operating characteristic curve.

Fig. 6.

SHAP analysis for model interpretability. (A) A summary plot showing the SHAP values of each feature, with dot color indicating the feature value (blue for low and pink for high). (B) The feature importance ranked by mean absolute SHAP values, indicating the average contribution of each feature to the model. (C) A decision plot, visualizing how each feature sequentially contributes to individual predictions, with color representing feature values and lines showing cumulative impact on the model output. SHAP, SHapley Additive exPlanations; OPLL, ossification of the posterior longitudinal ligament.
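The ranking in panel B rests on a simple aggregation: for each feature, average the absolute SHAP values over all samples. The sketch below shows that step in plain Python; the feature names and SHAP values are hypothetical and do not reproduce the study's output.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value across samples."""
    n_samples = len(shap_rows)
    means = {
        name: sum(abs(row[j]) for row in shap_rows) / n_samples
        for j, name in enumerate(feature_names)
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP matrix: one row per sample, one column per feature.
features = ["feature_a", "feature_b", "feature_c"]
shap_rows = [
    [0.30, -0.10, 0.05],
    [-0.20, 0.15, -0.05],
    [0.40, -0.05, 0.02],
]
ranking = rank_by_mean_abs_shap(shap_rows, features)
```

Taking absolute values before averaging prevents positive and negative contributions from cancelling, so the ranking reflects magnitude of influence rather than direction.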

Table 1.

Detailed patient information

| Variable | Overall (n = 473) | NP (n = 282) | P (n = 191) | p-value |
|---|---|---|---|---|
| Sex | | | | 0.795 |
|   Male | 310 (65.5) | 183 (64.9) | 127 (66.5) | |
|   Female | 163 (34.5) | 99 (35.1) | 64 (33.5) | |
| Age (yr) | 55.40 ± 9.17 | 57.06 ± 9.00 | 52.94 ± 8.89 | < 0.001 |
| Preoperative OPLL type | | | | < 0.001 |
|   Continuous | 46 (9.7) | 29 (10.3) | 17 (8.9) | |
|   Segmental | 256 (54.1) | 152 (53.9) | 104 (54.5) | |
|   Mixed | 127 (26.8) | 62 (22.0) | 65 (34.0) | |
|   Localized | 44 (9.3) | 39 (13.8) | 5 (2.6) | |
| Postoperative OPLL type | | | | < 0.001 |
|   Continuous | 87 (18.4) | 40 (14.2) | 47 (24.6) | |
|   Segmental | 217 (45.9) | 143 (50.7) | 74 (38.7) | |
|   Mixed | 129 (27.3) | 61 (21.6) | 68 (35.6) | |
|   Localized | 40 (8.5) | 38 (13.5) | 2 (1.0) | |
| C2–7 angle (°) | 8.36 ± 8.95 | 8.65 ± 9.27 | 7.93 ± 8.45 | 0.39 |
| Extension ROM (°) | 11.74 ± 7.44 | 11.95 ± 7.28 | 11.43 ± 7.68 | 0.459 |
| Flexion ROM (°) | 22.64 ± 10.10 | 23.18 ± 10.35 | 21.84 ± 9.69 | 0.157 |
| ISI | | | | 0.932 |
|   0 | 106 (22.4) | 63 (22.3) | 43 (22.5) | |
|   1 | 322 (68.1) | 191 (67.7) | 131 (68.6) | |
|   2 | 45 (9.5) | 28 (9.9) | 17 (8.9) | |
| No. of treatment segments | | | | 0.915 |
|   3 | 3 (0.6) | 2 (0.7) | 1 (0.5) | |
|   4 | 64 (13.5) | 41 (14.5) | 23 (12.0) | |
|   5 | 312 (66.0) | 184 (65.2) | 128 (67.0) | |
|   6 | 85 (18.0) | 49 (17.4) | 36 (18.8) | |
|   7 | 9 (1.9) | 6 (2.1) | 3 (1.6) | |
| Surgical approach | | | | 0.734 |
|   LMP | 374 (79.1) | 221 (78.4) | 153 (80.1) | |
|   PDF | 99 (20.9) | 61 (21.6) | 38 (19.9) | |
| Preoperative OPLL volume (mm³) | 2,031 ± 1,765 | 1,883 ± 1,899 | 2,250 ± 1,524 | 0.027 |
| Postoperative OPLL volume (mm³) | 2,541 ± 2,276 | 2,093 ± 2,217 | 3,201 ± 2,205 | < 0.001 |
| Annual progress rate (%) | 5.2 (1.2–12.0) | 2.0 (0.1–4.6) | 13.7 (10.2–20.7) | < 0.001 |
| Follow-up time (mo) | 22 (13–41) | 21.5 (13.0–43.8) | 22.0 (14.0–38.5) | 0.923 |

Values are presented as number (%), mean±standard deviation, or median (interquartile range).

NP, nonprogression; P, progression; OPLL, ossification of the posterior longitudinal ligament; ROM, range of motion; ISI, intramedullary signal intensity; LMP, laminoplasty; PDF, posterior decompression with instrumented fusion.
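As an illustration of how a categorical p-value in Table 1 can be checked, the sketch below runs a chi-square test with Yates continuity correction on the 2×2 sex-by-group counts. The choice of test is an assumption: this section does not state which test the authors used, and the sketch only shows that this particular test gives a value close to the reported 0.795.

```python
import math


def chi2_2x2_yates(a, b, c, d):
    """Chi-square test with Yates correction for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Continuity-corrected numerator; clipped at 0 for near-independent tables.
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Table 1 counts: NP group (183 male, 99 female), P group (127 male, 64 female).
chi2, p = chi2_2x2_yates(183, 99, 127, 64)
```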

Table 2.

Performance metrics of different models on the training and testing sets

| Model | AUC (95% CI) | Sensitivity | Specificity | Accuracy | Precision | Recall | F1 Score | PPV | NPV |
|---|---|---|---|---|---|---|---|---|---|
| Rad-score model | | | | | | | | | |
|   Train | 0.742 (0.692–0.793) | 0.791 | 0.599 | 0.680 | 0.588 | 0.791 | 0.675 | 0.588 | 0.799 |
|   Test | 0.693 (0.608–0.782) | 0.731 | 0.533 | 0.606 | 0.475 | 0.731 | 0.576 | 0.475 | 0.774 |
| Clinical model | | | | | | | | | |
|   Train | 0.659 (0.602–0.716) | 0.676 | 0.635 | 0.653 | 0.573 | 0.676 | 0.620 | 0.573 | 0.731 |
|   Test | 0.620 (0.526–0.709) | 0.654 | 0.567 | 0.599 | 0.466 | 0.654 | 0.544 | 0.466 | 0.739 |
| Combined model | | | | | | | | | |
|   Train | 0.777 (0.729–0.826) | 0.705 | 0.719 | 0.713 | 0.645 | 0.705 | 0.674 | 0.645 | 0.771 |
|   Test | 0.751 (0.670–0.828) | 0.654 | 0.733 | 0.704 | 0.586 | 0.654 | 0.618 | 0.586 | 0.786 |

AUC, area under the curve; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value.
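The threshold-dependent metrics in Table 2 all derive from a single confusion matrix, which is why sensitivity equals recall and precision equals PPV in every row. The sketch below computes them from four counts. The counts shown are back-calculated to be consistent with the rounded test-set values of the combined model (142 samples); they are an inference for illustration and are not reported in the article.

```python
def classification_metrics(tp, fp, tn, fn):
    """Threshold-dependent metrics from the four confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # identical to recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)          # identical to precision
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Back-calculated (not reported) counts for the combined model's test set.
metrics = classification_metrics(tp=34, fp=24, tn=66, fn=18)
rounded = {name: round(value, 3) for name, value in metrics.items()}
```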