  • AIMS
    • Patients with cauda equina syndrome (CES) require emergency imaging and surgical decompression. The severity and type of symptoms may influence the timing of imaging and surgery, and help predict the patient's prognosis. Categories of CES attempt to group patients for management and prognostication purposes. This study aimed to assess the inter-rater reliability of dividing patients with CES into categories, in order to determine whether these categories can be reliably applied in clinical practice and in research.
  • METHODS
    • A literature review was undertaken to identify published descriptions of categories of CES. A total of 100 real anonymized clinical vignettes of patients diagnosed with CES from the Understanding Cauda Equina Syndrome (UCES) study were reviewed by consultant spinal surgeons, neurosurgical registrars, and medical students. All were provided with published category definitions and asked to decide whether each patient had 'suspected CES'; 'early CES'; 'incomplete CES'; or 'CES with urinary retention'. Inter-rater agreement was assessed for all categories, for all raters, and for each group of raters using Fleiss's kappa.
  • RESULTS
    • Each of the 100 vignettes was rated by four medical students, five neurosurgical registrars, and four consultant spinal surgeons. No group of raters achieved reasonable inter-rater agreement for any of the categories. CES with retention versus all other categories had the highest inter-rater agreement (kappa 0.34 (95% confidence interval 0.27 to 0.31); minimal agreement). There was no improvement in inter-rater agreement with clinical experience. Across all categories, registrars agreed with each other most often (kappa 0.41), followed by medical students (kappa 0.39). Consultant spinal surgeons had the lowest inter-rater agreement (kappa 0.17).
  • CONCLUSION
    • Inter-rater agreement for categorizing CES is low among clinicians who regularly manage these patients. CES categories should be used with caution in clinical practice and research studies, as groups may be heterogeneous and not comparable.
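
The Methods name Fleiss's kappa as the agreement statistic. For readers unfamiliar with it, the following is a minimal sketch of the standard Fleiss's kappa calculation, assuming every vignette is rated by the same number of raters. It is not the study's analysis code: the function names (`fleiss_kappa`, `one_vs_rest`), the short category labels, and the toy ratings are all hypothetical and chosen only for illustration. The `one_vs_rest` helper shows one plausible way to collapse the four categories into a binary comparison such as 'CES with urinary retention' versus all other categories.

```python
from collections import Counter

# Short labels standing in for the four published categories (illustrative names).
CATEGORIES = ("suspected", "early", "incomplete", "retention")


def fleiss_kappa(ratings, categories=CATEGORIES):
    """Fleiss's kappa for a list of vignettes, each a list of rater labels.

    Assumes the same number of raters (at least two) for every vignette.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])

    # counts[i][j] = number of raters who put vignette i into category j
    counts = []
    for row in ratings:
        tally = Counter(row)
        row_counts = [tally[c] for c in categories]
        if len(row) != n_raters or sum(row_counts) != n_raters:
            raise ValueError("unequal rater counts or unknown category label")
        counts.append(row_counts)

    # Observed agreement: mean proportion of agreeing rater pairs per vignette.
    pairs = n_raters * (n_raters - 1)
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / pairs for row in counts
    ) / n_subjects

    # Chance agreement from the overall share of ratings in each category.
    total = n_subjects * n_raters
    p_e = sum(
        (sum(row[j] for row in counts) / total) ** 2
        for j in range(len(categories))
    )

    if p_e == 1.0:
        # Every rating fell in one category, so kappa is undefined.
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


def one_vs_rest(ratings, target):
    """Collapse labels to `target` versus "other" for a binary comparison."""
    return [[lab if lab == target else "other" for lab in row] for row in ratings]


# Invented toy data (three raters, four vignettes); NOT data from the study.
toy = [
    ["retention", "retention", "retention"],
    ["early", "incomplete", "early"],
    ["suspected", "early", "incomplete"],
    ["incomplete", "incomplete", "retention"],
]
kappa_all = fleiss_kappa(toy)
kappa_retention = fleiss_kappa(
    one_vs_rest(toy, "retention"), categories=("retention", "other")
)
```

A kappa of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance. Confidence intervals, such as those reported in the Results, would need an additional step (for example bootstrapping over vignettes) that this sketch does not cover.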