Accurately measuring the quality and effectiveness of lumbar surgery in registry efforts: determining the most valid and responsive instruments.

BACKGROUND CONTEXT
- Prospective registries have emerged as a feasible way to capture real-world care across large patient populations. However, the proven validity of more robust and cumbersome patient-reported outcomes instruments (PROis) must be balanced with what is feasible to apply in large-scale registry efforts.

PURPOSE
- To determine the relative validity and responsiveness of common PROis in accurately determining effectiveness of lumbar fusion for degenerative lumbar spondylolisthesis in registry efforts.

PATIENT SAMPLE
- Fifty-eight patients undergoing transforaminal lumbar interbody fusion (TLIF) for degenerative lumbar spondylolisthesis.

OUTCOME MEASURES
- Patient-reported outcome measures for pain (numeric rating scale for back and leg pain [NRS-BP, NRS-LP]), disability (Oswestry Disability Index [ODI]), general health (Short Form [SF]-12), quality of life (QOL) (EuroQol five dimensions [EQ-5D]), and depression (Zung depression scale [ZDS]) were assessed.

METHODS
- Fifty-eight patients undergoing primary TLIF for lumbar spondylolisthesis were entered into an institutional registry and prospectively followed for 2 years. Baseline and 2-year patient-reported outcomes were assessed. To assess the validity of PROis to discriminate between effective and noneffective improvements, receiver operating characteristic curves were generated for each outcome instrument. An area under the curve (AUC) of ≥0.80 was considered an accurate discriminator. The difference between standardized response means (SRMs) in patients reporting meaningful improvement versus those not reporting it was calculated to determine the relative responsiveness of each instrument.
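
The two statistics named above can be illustrated with a minimal sketch. It is not the authors' analysis code: the change scores are invented for illustration, the AUC is computed through the Mann-Whitney pairwise comparison (equivalent to the area under an empirical ROC curve), and the SRM is taken as mean change divided by the standard deviation of change, which is the usual definition but is an assumption here because the abstract does not spell it out. How patients were classified as reporting meaningful improvement is also not stated, so the grouping below is hypothetical.

```python
from statistics import mean, stdev


def auc(improved, not_improved):
    """AUC as the probability that a randomly chosen improved patient has a
    larger change score than a randomly chosen non-improved patient.
    Ties count as half a win (Mann-Whitney formulation)."""
    wins = 0.0
    for a in improved:
        for b in not_improved:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(improved) * len(not_improved))


def srm(changes):
    """Standardized response mean: mean change / sample SD of change."""
    return mean(changes) / stdev(changes)


# Hypothetical change scores (baseline minus 2-year score) for one instrument,
# split by whether the patient reported meaningful improvement.
improved = [30, 24, 28, 35, 22, 26]
not_improved = [4, 10, 23, 2, 8]

auc_value = auc(improved, not_improved)
srm_difference = srm(improved) - srm(not_improved)

# Threshold from the Methods: AUC >= 0.80 counts as an accurate discriminator.
is_accurate = auc_value >= 0.80

print(f"AUC = {auc_value:.2f}, accurate discriminator: {is_accurate}")
print(f"SRM difference = {srm_difference:.2f}")
```

A larger SRM difference means the instrument's change scores separate the two groups more sharply relative to their variability, which is the sense in which one instrument is called more responsive than another.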

RESULTS
- For pain and disability, ODI had an AUC of 0.94, indicating that it is an accurate discriminator of meaningful improvement. ODI was also the most responsive to postoperative improvement (SRM difference: 2.18), followed by NRS-BP and NRS-LP. For general health and QOL, SF-12 physical component score (AUC: 0.90), ZDS (AUC: 0.89), and SF-12 mental component score (AUC: 0.85) were all accurate discriminators of meaningful improvement; however, EQ-5D was the most accurate (AUC: 0.97). EQ-5D was also the most responsive (SRM difference: 2.83).

CONCLUSIONS
- For pain and disability, ODI was the most valid and responsive measure of effectiveness of lumbar fusion. NRS-BP and NRS-LP should not be used as substitutes for ODI in measuring effectiveness of care in registry efforts. For health-related QOL, EQ-5D was the most valid and responsive measure of improvement; however, SF-12 and ZDS are valid alternatives with less responsiveness.
