Incidence of Hoarseness After General Spine Surgery: Interim Report of Prospective Observational Study
Abstract
Objective
Hoarseness can occur after spinal surgery under general anesthesia and has mainly been assessed with questionnaire-based self-report measures. Given the inherent biases of self-report instruments, more objective measures are needed to assess hoarseness.
Methods
This single-institute, prospective observational study was planned to include 427 patients undergoing spine surgery. This interim analysis was planned to include 215 patients who met the inclusion criteria. All subjects completed the Korean Voice Handicap Index-10 (KVHI-10) questionnaire. Voice analysis, including low or high pitch (hertz), frequency variation rate (jitter), amplitude variation rate (shimmer), and noise-to-harmonic ratio (NHR), was performed with the Praat software.
Results
This interim report enrolled a total of 215 patients who met the inclusion criteria, and among them, 162 patients (75.5%) were subjected to interim analysis after excluding those with data loss (8 patients), operation cancellation (3 patients), and loss to follow-up (42 patients). The incidence of hoarseness was 35.0% on postoperative day (POD)0 and 5.5% on POD30. Among the acoustic parameters analyzed, hertz and jitter were significantly positively correlated with the KVHI-10 scores on POD0, while only jitter was significantly correlated with the KVHI-10 scores on POD30. The optimal cutoff values of the acoustic parameters on POD30 from the receiver operating characteristic curve were 0.65% for jitter, 4.67% for shimmer, and 16.96 dB for NHR.
Conclusion
This study revealed a correlation between objective acoustic parameters obtained from voice analysis and subjective questionnaire scores for hoarseness.
INTRODUCTION
After surgery, patients often experience changes in their overall voice. They may also have difficulty with their voice, especially in pitch or loudness. Hoarseness refers to a voice that sounds breathy, raspy, or lower in pitch and that takes effort to produce [1-3]. Typically, hoarseness is diagnosed through clinical evaluation and self-report measures, such as questionnaires (Korean Voice Handicap Index-10, KVHI-10), without the need for additional testing or investigations [3-5]. However, given the potential biases in self-report instruments, a more objective indicator is needed to assess hoarseness beyond subjective questionnaires or symptom reporting alone.
The incidence of hoarseness has been studied in only a limited subset of spine surgery cases [3,6]. The risk of hoarseness has mostly been studied for the anterior cervical approach because of its proximity to the recurrent laryngeal nerve and superior laryngeal nerve, and the reported incidence of hoarseness ranges widely, from 1.69% to 24.2% [7-9]. We also encounter postoperative hoarseness in patients undergoing other types of spinal surgery, but the incidence of hoarseness after general spinal surgery has rarely been reported.
We designed a prospective observational study to evaluate the incidence of hoarseness after spine surgery. As an interim analysis, we planned to validate the appropriateness of the KVHI-10 questionnaire as a criterion for the diagnosis of hoarseness by comparing it with objective acoustic parameters from voice analysis. Various studies have examined the correlation between the KVHI-10 and acoustic parameters. However, some studies have reported that the correlation between subjective questionnaire-based assessments and acoustic parameters for hoarseness is weak, suggesting that they evaluate different aspects of voice and should therefore be used complementarily [10,11].
In the ongoing prospective study, the primary outcome is the incidence rate of hoarseness and the characteristics of acoustic parameters on postoperative day (POD)0. The secondary outcomes involve examining the correlation between acoustic parameters and hoarseness based on the KVHI-10, and comparing the acoustic parameters on POD30 with those reported in previous studies.
The objectives of this interim analysis were to report the incidence of hoarseness after general spinal surgery and to provide criteria to objectively assess hoarseness.
MATERIALS AND METHODS
1. Study Design and Population
A prospective observational study was planned to include 427 patients after spine surgery (Clinical Trial No. NCT05996146; Seoul National University College of Medicine/Seoul National University Hospital Institutional Review Board approved the review and analysis of the data No. 2112-013-1279). Spine surgery was defined as surgeries limited to the cervical, thoracic, lumbar, and sacral regions, and patients were recruited without any restrictions on position.
This study recruited patients with spinal diseases requiring surgery at Seoul National University Hospital. Patients admitted to the neurosurgery department who provided consent to participate were included for prospective data collection. The inclusion criteria were as follows: age 20 to 80 years, need for surgical treatment of a degenerative spine condition, and voluntary consent to participate. The exclusion criteria were a history of surgery around the airway or mediastinum; vocal cord-related disorders; fractures, bleeding, or other trauma; neuromuscular diseases; Parkinson disease; psychiatric disorder; and inability to be extubated after surgery or transfer to the intensive care unit (ICU) after surgery. Pregnant women and those who did not wish to participate in the study were also excluded.
The incidence of postoperative hoarseness after tracheal intubation for general anesthesia has been reported to be 49% on the day of surgery among 3,093 patients [12]. The incidence rates on POD1, 3, and 7 were 29%, 11%, and 0.8%, respectively [12]. Assuming a postoperative hoarseness incidence of 0.49, a 95% confidence interval with a total width of 0.1 (±0.05) requires a sample size of 384. Assuming a dropout rate of 10%, we planned to recruit 427 participants. Informed consent was obtained from all participating patients after a comprehensive explanation of the study.
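The sample size arithmetic above can be reproduced with a short sketch. It assumes the usual normal-approximation (Wald) formula for a single proportion and assumes the dropout adjustment divides by (1 - dropout rate); the text does not state which formula was used, and the function names are illustrative only.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, ci_width, confidence=0.95):
    """Sample size so that a Wald confidence interval for a proportion
    has the requested total width (assumed formula, not stated in the text)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    half_width = ci_width / 2
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


def inflate_for_dropout(n, dropout_rate):
    """Number to recruit so that n participants remain after dropout (assumed adjustment)."""
    return math.ceil(n / (1 - dropout_rate))


n_required = sample_size_for_proportion(p=0.49, ci_width=0.1)
n_recruit = inflate_for_dropout(n_required, dropout_rate=0.10)
print(n_required, n_recruit)  # 384 427
```

Under these assumptions the sketch returns the same figures as the study plan: 384 evaluable patients and 427 recruited.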
The interim analysis was planned to include 215 patients who met the inclusion criteria. Among them, 162 patients (75.5%) were subjected to interim analysis after excluding 8 patients with data loss, 3 patients with operation cancellation, and 42 patients lost to follow-up (Fig. 1).
Trial profile. A total of 324 patients were assessed for eligibility, and 109 patients were excluded. Of the 215 patients enrolled, 42 were lost to follow-up. In this prospective study, 162 patients (75.5%) were included in the analysis. POD, postoperative day.
The presence of hoarseness was determined using the KVHI-10 criteria, with hoarseness defined as a KVHI-10 score of 8 or higher, or an increase of 4 points or more compared with the score before surgery [13-20].
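The two-part definition above can be written as a small rule. This is a minimal sketch; the function and parameter names are hypothetical, and the handling of a missing preoperative score is an assumption that the text does not address.

```python
def is_hoarse(score, preop_score=None, absolute_cutoff=8, increase_cutoff=4):
    """Apply the KVHI-10 definition described in the text:
    score >= 8, or an increase of >= 4 points from the preoperative score.
    If no preoperative score is given, only the absolute cutoff is applied
    (an assumption made for this sketch)."""
    if score >= absolute_cutoff:
        return True
    if preop_score is not None and score - preop_score >= increase_cutoff:
        return True
    return False


# Illustrative values only, not study data.
print(is_hoarse(8))                 # absolute cutoff met
print(is_hoarse(7, preop_score=3))  # below 8, but increased by 4 points
print(is_hoarse(7, preop_score=4))  # below 8 and increased by only 3 points
```

Note that under this rule a patient can already meet the absolute cutoff before surgery, so the preoperative score is worth inspecting alongside the postoperative classification.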
The purpose of this interim analysis was to identify which acoustic parameters from voice analysis best correlate with the KVHI-10 scores, in order to examine the validity of the KVHI-10, which is currently used as the gold standard [5], as a criterion for assessing hoarseness. The voice analysis was conducted using the Praat program (ver. 6.2.09, Boersma & Weenink, Amsterdam, accessed on Feb 15, 2022) to examine the correlation between KVHI-10 scores and the following parameters: low or high pitch (Hz), frequency variation rate (jitter), amplitude variation rate (shimmer), and noise-to-harmonic ratio (NHR).
During the study period, all neuroanesthesiologists were blinded to the test results. Orotracheal intubation for general anesthesia was performed by staff anesthesiologists or by fellows and residents under their supervision. Videolaryngoscopic intubation with cervical spine immobilization using a cervical collar was primarily attempted for patients undergoing cervical spine surgery, while direct laryngoscopic intubation without cervical spine immobilization was used for patients undergoing other spine surgeries. Reinforced tracheal tubes with inner diameters of 7.5 mm and 7.0 mm were used for male and female patients, respectively. Malleable stylets were used only when necessary. Endotracheal cuff pressure was adjusted to a range of 15–30 cmH2O immediately after intubation, and no further adjustments were made during surgery unless an alarm or abnormal signal was detected by the ventilator.
2. Data Collection
Patients completed a questionnaire (KVHI-10) to assess hoarseness on the day before surgery, POD0, POD1, and POD2 while in the ward. Additionally, the patients recorded their voice using their own phone and sent the recording files to the research team.
In a quiet environment, the patients uttered the vowel /a/ for 5 seconds at a normal pitch and volume. The recording file was analyzed using the Praat program. When the patient visited the outpatient clinic one month after surgery, the same method was used to collect the questionnaire and voice recording file.
If hoarseness persisted at the 1-month postoperative outpatient visit, the patient was referred to an otolaryngology clinic for additional evaluation and any necessary treatment. The clinical data collected included demographics (age, sex, height, and weight), medical history (hypertension, diabetes mellitus, or others), smoking history, diagnosis, surgical site and method, and anesthetic factors (performer, anesthesia time, Cormack grades, time for intubation, and equipment used).
3. Statistical Analysis
Continuous variables with 3 or more groups were analyzed using analysis of variance, while categorical variables were analyzed using the chi-square test or Fisher exact test. All analyses were performed with IBM SPSS Statistics ver. 26.0 (IBM Co., USA) and R ver. 4.3.1 (R Foundation for Statistical Computing, Vienna). A 2-tailed p-value of less than 0.05 was regarded as significant. We calculated the area under the receiver operating characteristic (ROC) curve (AUC) for each acoustic parameter and used the Youden index to determine the optimal cutoff point for that parameter. The correlation between KVHI-10 scores and acoustic parameters was assessed using the Spearman rank correlation coefficient. When examining the distribution of the differences in each parameter, significance was investigated using the minimum p-value approach.
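As an illustration of the ROC-based cutoff selection described above, the following sketch computes the AUC and the Youden-optimal threshold in plain Python. The analyses in the study were run in SPSS and R; this code is not the authors' implementation, the function names are hypothetical, and the numbers are invented rather than study data.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1,
    where a case is predicted positive when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_threshold, best_j = None, float("-inf")
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Invented example: an acoustic parameter and a binary hoarseness label (1 = hoarse).
values = [0.30, 0.42, 0.48, 0.55, 0.61, 0.70, 0.66, 0.90, 1.10, 0.52]
hoarse = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

area = auc(values, hoarse)
cutoff, j_stat = youden_cutoff(values, hoarse)
print(round(area, 3), cutoff, round(j_stat, 3))  # 0.833 0.66 0.583
```

The Youden index weights sensitivity and specificity equally, so the selected cutoff may not be ideal when missed cases and false alarms carry different clinical costs.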
RESULTS
1. Demographics and Variables
From December 15th, 2021 to November 27th, 2022, a total of 324 patients were assessed for eligibility. Among them, 109 patients were excluded because they did not meet the inclusion criteria. Out of these 109 participants, 61 individuals did not give their consent. Among the remaining 48 participants, 28 were aged older than 80 years, and 3 were aged younger than 20 years. There were 3 participants with vocal cord-related disorders and 2 had fractures due to trauma. Additionally, 12 participants were unable to undergo endotracheal extubation immediately after surgery and were transferred to the intensive care unit after the surgery. Out of the 215 enrolled patients, data from 8 patients were lost, and the operations of 3 patients were canceled. The number of patients lost to follow-up was 42. Out of these 42 participants, 34 patients declined further participation in data collection and 1 participant did not visit the outpatient clinic due to residing overseas. Additionally, 3 participants were unable to participate due to their medical condition, and 4 participants were unable to join due to device problems such as using a different cellphone from the one previously used for voice recording. As planned, this interim report includes the analysis of 162 patients in this prospective study (Fig. 1).
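The patient flow described above can be checked with simple arithmetic. The sketch below uses only the counts given in this paragraph; the dictionary keys are shorthand labels chosen for this example.

```python
assessed = 324

# Reasons for exclusion before enrollment, as listed in the text.
exclusions = {
    "no consent": 61,
    "older than 80 years": 28,
    "younger than 20 years": 3,
    "vocal cord-related disorder": 3,
    "fracture due to trauma": 2,
    "not extubated / transferred to ICU": 12,
}
excluded = sum(exclusions.values())
enrolled = assessed - excluded

# Reasons for loss to follow-up, as listed in the text.
lost_reasons = {
    "declined further participation": 34,
    "residing overseas": 1,
    "medical condition": 3,
    "device problems": 4,
}
lost_to_follow_up = sum(lost_reasons.values())

data_lost = 8
operation_canceled = 3
analyzed = enrolled - data_lost - operation_canceled - lost_to_follow_up

print(excluded, enrolled, lost_to_follow_up, analyzed)  # 109 215 42 162
```

The individual reasons sum to the reported totals at each step of the trial profile.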
Table 1 presents the demographic characteristics of all patients. Among the total 162 participants, 91 were male, and the average age was 55.6±15.4 years. Views obtained by direct laryngoscopy were classified by Cormack grade [21] (full view of glottis, grade I; neither glottis nor epiglottis seen, grade IV), and grade I accounted for the majority at 74%. The average anesthesia time was 227.7 minutes, and the average time for intubation was 48.4±28.8 seconds. Regarding comorbidities, 41 patients had hypertension, and 21 had diabetes mellitus. The average weight was 69.4±12.9 kg, and the average height was 165.2±8.6 cm.
The changes in KVHI-10 scores before and after surgery up to the POD30 follow-up are depicted in Fig. 2. The KVHI-10 scores were highest on POD0 and gradually decreased over time (Fig. 2). The incidence of hoarseness was 35.0% (57 of 162) on POD0 and 5.5% (9 of 162) on POD30 (Tables 2 and 3).
The mean Korean Voice Handicap Index-10 (KVHI-10) scores with 95% confidence intervals, measured in patients before surgery and up to 30 days afterward. The symptoms were most severe immediately after surgery and gradually improved over time. POD, postoperative day; Preop, preoperative.
According to the definition of hoarseness based on KVHI-10 scores, we divided the participants into a hoarseness group and a no hoarseness group. In the comparative analysis of clinical factors between the two groups, only anesthesia time was statistically significant on POD0 (Table 2), while no factor was significant on POD30 (Table 3).
2. Correlation Analysis of Hoarseness and Acoustic Parameters
On POD0, we used ROC curves to analyze the change in acoustic parameters between the preoperative day and POD0 (Fig. 3). The AUC was 0.71 for changes in hertz (ΔHz) and 0.62 for changes in jitter (ΔJitter). The AUC values for changes in shimmer and NHR were 0.58 and 0.48, respectively. To statistically validate whether the parameters significantly divided the patients into 2 groups, the minimum p-value approach was employed. Only hertz and jitter were statistically significant. When the AUC was calculated to determine the optimal values for distinguishing between hoarseness and no hoarseness, both hertz and jitter showed significant results, and comparison of the AUC values indicated that the prediction model for hertz performed most reliably.
The receiver operating characteristic (ROC) curves based on the changes in acoustic parameters between the preoperative day and POD0. (A) The ROC curve for changes in hertz (ΔHz), with an area under the curve (AUC) of 0.71. (B) The ROC curve for changes in jitter (ΔJitter), with an AUC of 0.62. The ROC curve for changes in shimmer (C) and the ROC curve for changes in NHR (D), with respective AUC values of 0.58 and 0.48. Comparing AUC values, the prediction models for hertz and jitter performed most reliably. POD, postoperative day; NHR, noise-to-harmonic ratio.
On POD30, ROC curves revealed the optimal cutoff values of the acoustic parameters (Fig. 4). The cutoff values were 0.65% for jitter, 4.67% for shimmer, and 16.96 dB for NHR (Table 4). Among the acoustic parameters, jitter had the highest AUC (Fig. 4). This suggests that jitter is the most effective predictor for the detection of hoarseness on POD30. When these acoustic parameters were correlated with KVHI-10 scores, a statistically significant positive correlation was observed only for jitter (Fig. 5).
The receiver operating characteristic (ROC) curves of acoustic parameters on POD30. (A) The ROC curve for hertz, with an area under the curve (AUC) of 0.54. (B) The ROC curve for jitter, with an AUC of 0.84. The ROC curves for shimmer (C) and NHR (D), with respective AUC values of 0.78 and 0.75. Comparing AUC values, the prediction model for jitter performed the best.
Correlation analysis using the Spearman rank correlation coefficient between KVHI-10 scores and acoustic parameters on POD30. (A) The correlation between hertz and KVHI-10 scores, which was not statistically significant. (B) Jitter exhibited a statistically significant positive correlation with KVHI-10 scores. (C, D) The correlations of shimmer and NHR with KVHI-10 scores, neither of which was statistically significant. KVHI-10, Korean Voice Handicap Index-10; POD, postoperative day; NHR, noise-to-harmonic ratio.
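For readers unfamiliar with the rank correlation used here, the sketch below computes Spearman's coefficient as the Pearson correlation of average ranks, which handles tied questionnaire scores. It is a plain-Python illustration rather than the SPSS or R routine used in the study, the function names are hypothetical, and the data are invented.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation applied to average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: questionnaire scores and an acoustic parameter.
kvhi_scores = [0, 2, 2, 5, 9, 14]
parameter = [0.31, 0.45, 0.40, 0.52, 0.80, 0.71]

rho = spearman_rho(kvhi_scores, parameter)
print(round(rho, 3))  # 0.928
```

Because it works on ranks, the coefficient captures any monotonic relationship and is not driven by a few extreme acoustic values.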
The logistic regression analysis on POD0 indicated that the significant factors were anesthesia time and acoustic parameters, specifically Hz and jitter. However, the logistic regression analysis on POD30 did not reveal any significant factors (Table 5).
3. Position and Miscellaneous
We conducted a statistical analysis by dividing the spine surgeries into prone and supine positions. However, since there were only 5 cases in the supine position, the sample size was too small to perform a meaningful statistical comparison. This raised serious concerns about the potential for overinterpretation of the results by readers, so we decided to omit this analysis.
A total of 9 patients were defined as having hoarseness on POD30. Although they were advised to visit an otolaryngology clinic, none of them actually attended the appointment. Because none of the patients had new or worsened symptoms and all had improved, they did not wish to seek further medical consultation.
In 1 participant, the KVHI-10 score on the preoperative day was 10, which met the KVHI-10 criteria for hoarseness even before surgery. After surgery, this patient’s KVHI-10 score increased to 33, and the acoustic parameters showed a twofold increase in hertz compared to the preoperative measurement.
DISCUSSION
In this interim analysis, the incidence of hoarseness was 35.0% on the day of surgery, and anesthesia time was the only clinical factor that differed significantly between groups. The incidence of hoarseness on POD30 in this interim report was 5.5%, which appears to be higher than the incidence of less than 1% generally reported in previous studies [12,22]. This may be attributed to the prone position, which was used for most patients included in this study. The prone position, commonly used in spine surgery, may increase the risk of vocal cord edema and increase pressure on the larynx [6,22-24]. The prone position may also promote edema of the pharynx and larynx due to local compression, venous congestion, or lymphatic obstruction, especially with neck rotation or hyperflexion. This edema further increases the likelihood of postoperative hoarseness [25]. Furthermore, prolonged prone positioning can impair microcirculation due to increased pressure on the laryngeal mucosa, making the tissue more vulnerable to injury and edema [26].
Changes in body temperature or ambient temperature during surgery can also affect cuff pressure, as gases expand or contract with temperature fluctuations, resulting in unintended increases or decreases in cuff pressure [27]. The use of muscle relaxants and other anesthetic interventions may alter airway tone and compliance, thereby influencing cuff pressure and further contributing to pressure variations during surgery [28]. Increased cuff pressure in the prone position is common and can contribute to laryngeal and tracheal mucosal injury, increasing the risk of hoarseness. High cuff pressure (≥17 cmH2O) in any position, including supine, is associated with an increased risk of hoarseness after extubation [29,30].
Additionally, reinforced tracheal tubes are known to exert continuous pressure on the airway mucosa due to their relatively rigid structure and lower flexibility. These characteristics of the tube may influence the development of postoperative hoarseness, sore throat, and vocal cord edema [31].
Therefore, it is important to take appropriate measures to minimize the risk of hoarseness in patients undergoing spine surgery in the prone position [8]. Optimizing the duration of intubation and using an appropriately sized endotracheal tube are essential. Since cuff pressure may increase during anesthesia after intubation, raising the risk of vocal cord injury and other complications, adjusting the cuff pressure at 1- to 2-hour intervals during prolonged surgeries may help prevent hoarseness [32-36].
Jitter increases when there is irregularity or instability in the vibration of the vocal folds during phonation. Such irregularities can be caused by conditions like vocal fold paralysis, nodules, inflammation, or aging, which may alter the mass, tension, or symmetry of the vocal folds, leading to less regular vibrations and increased jitter [37,38]. Patients may adapt to voice changes over time or feel less discomfort psychologically, leading to lower KVHI-10 scores even if acoustic parameters like jitter remain unchanged or worsen. Compensatory strategies (speaking slowly or resting the voice) can reduce perceived symptoms but may not improve vocal fold vibration regularity. Thus, a decrease in KVHI-10 with increased jitter is clinically possible, emphasizing the need to interpret both subjective and objective measures together in voice disorder assessment [11,39,40].
Jitter refers to the cycle-to-cycle variability in the period duration of the acoustic signal generated during voice production. In other words, it measures the random, short-term fluctuations in the fundamental frequency of the voice between successive vibratory cycles of the vocal folds [37,41]. Therefore, jitter can be considered a parameter that responds sensitively to anatomical changes around the vocal folds. Future studies will be needed to explore the potential of using jitter as a screening tool for hoarseness.
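To make the definition concrete, the sketch below computes a commonly used "local" jitter: the mean absolute difference between consecutive glottal periods, divided by the mean period and expressed as a percentage. Whether this matches the exact jitter variant extracted in this study is an assumption, and the period values are invented for illustration.

```python
def local_jitter_percent(periods):
    """Mean absolute difference between consecutive periods, divided by the
    mean period, times 100 (a common 'local jitter' definition; assumed here)."""
    if len(periods) < 2:
        raise ValueError("at least two periods are needed")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100 * mean_abs_diff / mean_period


# Invented glottal periods in milliseconds (about 200 Hz), not study data.
periods_ms = [5.00, 5.04, 4.98, 5.02, 4.96]
jitter = local_jitter_percent(periods_ms)
print(round(jitter, 2))  # 1.0
```

A perfectly regular vibration would give 0%, and larger cycle-to-cycle irregularity raises the value.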
Both jitter and NHR are sensitive to anatomical changes in the larynx and vocal tract. Elevated values are commonly associated with pathological or postsurgical states, indicating irregular vocal fold vibration or inefficient glottal closure. Persistent abnormalities in jitter and NHR after anatomical changes may warrant further investigation for underlying structural or functional disorders [38,42].
This interim report aims to provide criteria to objectively assess hoarseness with acoustic parameters, rather than relying solely on KVHI-10 scores. The AUC values for changes in hertz and jitter were statistically significant, at 0.709 and 0.615, respectively. This indicates that the occurrence of hoarseness after surgery can be discerned from changes in hertz and jitter, rather than by relying solely on the KVHI-10.
A previous study suggested that acoustic parameters have the potential to discriminate the presence of hoarseness after spine surgery [43]. In assessing hoarseness, some studies have reported a positive correlation between subjective assessments using questionnaires like the KVHI-10 and acoustic parameters, while others found no statistically significant relationship [10,11,44,45]. In our study as well, among the various acoustic measures evaluated, only jitter showed a statistically significant correlation with the KVHI-10. This finding emphasizes that not all acoustic parameters are strongly or consistently correlated with the KVHI-10 and highlights jitter as a potentially meaningful objective indicator of postoperative voice changes. It therefore suggests that the KVHI-10, which reflects the patient’s subjective experience and quality of life, and acoustic parameters should be used complementarily.
Unlike anterior cervical spine surgery, which is already known to carry a potential risk of postoperative hoarseness, this study focuses on spinal surgeries—most of which were performed in the prone position—where the risk of hoarseness has not been thoroughly evaluated. The distinctive aspect of this study lies in its detailed examination of the correlation between subjective hoarseness assessment using KVHI-10 and objective acoustic parameters in a surgical population with a potentially higher risk (i.e., prone positioning) compared to the supine position.
Our results highlight the risks of postoperative hoarseness associated with general spine surgery. Furthermore, the use of acoustic parameters provides an objective method for evaluating hoarseness, offering greater reliability compared to the subjective questionnaire-based criteria currently in use. This approach has the potential to reveal the true incidence of postoperative hoarseness in general spine surgeries.
This interim report was performed at a single center and did not show a sufficiently large effect size. These shortcomings potentially limit the external validity of the study. The small sample size may have limited the statistical power of the analysis. In particular, we were unable to compare subjective assessment tools such as the KVHI-10 and objective acoustic analysis parameters among surgeries performed in the prone, supine, and lateral positions. Future studies may provide a better understanding of the incidence of postoperative hoarseness based on surgical position, as well as the risk factors associated with its development.
This study has a limitation in that the KVHI-10 was used simultaneously as both the diagnostic criterion and the comparison standard in evaluating the relationship with acoustic parameters. Therefore, future research should include a head-to-head comparison of the KVHI-10 and acoustic measures to evaluate their relative accuracy in reflecting hoarseness from the patient’s perspective.
Patients with persistent hoarseness up to POD30 were advised to seek an otolaryngology consultation; however, none actually underwent further evaluation. As a result, the persistence of hoarseness was assessed solely based on the KVHI-10 score without confirming other possible causes such as anatomical issues, which represents a limitation of using a subjective assessment tool.
CONCLUSION
Currently, there is no clear gold standard for diagnosing hoarseness other than the KVHI-10. Therefore, it cannot be definitively asserted that the KVHI-10 is insufficient. If the questionnaire contained clear items specifically addressing the presence of hoarseness, the KVHI-10 could become a more objective diagnostic tool. However, such items are not currently present in the KVHI-10 questionnaire. Hence, the findings of this study support incorporating objective measures based on acoustic parameters in addition to the KVHI-10.
We aimed to establish objective measurements in voice analysis to complement the subjective evaluation of hoarseness. Hertz and jitter could serve as useful objective indicators in assessing the occurrence of hoarseness after surgery. Because anesthesia time was significantly associated with the occurrence of hoarseness, surgeons should strive to reduce the duration of surgery. The authors anticipate that the final analysis of this prospective observational study will provide new criteria for diagnosing hoarseness based on objective measurements. Furthermore, a more thorough evaluation of the current KVHI-10 criteria used in spine surgery is necessary.
Notes
Conflict of Interest
The authors have nothing to disclose.
Acknowledgments
The authors appreciate the statistical advice provided by the Medical Research Collaborating Center at Seoul National University Hospital.
Author Contribution
Conceptualization: CKC; Data curation: SK, YC, YRK; Formal analysis: YC; Methodology: YC; Project administration: CKC; Writing – original draft: SK; Writing – review & editing: YC, HP, JHK, WYJ, KWS, HO, HCL, HPP, CHL, CHK.
