  • BACKGROUND
    • Periprosthetic proximal humerus fractures (PPHFs) are a serious complication of shoulder arthroplasty, yet their characterization and management remain poorly studied. We aimed to determine the intra- and interobserver reliability of 4 previously described PPHF classification systems and to evaluate which classifications are the most consistent.
  • METHODS
    • We retrospectively reviewed 32 patients (34 fractures) who were diagnosed with a PPHF between 1990 and 2017. Patient electronic medical records and Research Electronic Data Capture (REDCap) were used for data collection. Post-PPHF radiographs in multiple views for all 34 cases were organized into an encrypted, randomized Qualtrics survey. Four blinded fellowship-trained shoulder and elbow surgeons graded each fracture using previously reported classification systems by (1) Wright and Cofield (1995), (2) Campbell et al (1998), (3) Worland et al (1999), and (4) Groh et al (2008), and selected a preferred management strategy for each fracture. Grading was performed twice, with at least 2 weeks between each randomized attempt. Intraobserver reliability was calculated as an unweighted Cohen kappa coefficient between attempt 1 and attempt 2 for each surgeon. Interobserver reliability for each classification system, as well as agreement between surgeons' preferred management strategies, was calculated using the Fleiss kappa coefficient. The kappa coefficients were interpreted using the Landis and Koch criteria.
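    • The reliability statistics described above could, for example, be computed as in the following minimal Python sketch. This is not the study's analysis code; the array names, randomly generated gradings, and 3-category coding scheme are illustrative assumptions only.

```python
# Minimal sketch (illustrative only, not the study's actual analysis) of how
# intraobserver Cohen kappa and interobserver Fleiss kappa could be computed
# for one classification system. Gradings are assumed to be integer category
# codes; the data below are randomly generated placeholders.
import numpy as np
from sklearn.metrics import cohen_kappa_score
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(0)
# Hypothetical gradings: 4 surgeons x 2 attempts x 34 fractures, 3 categories
gradings = rng.integers(0, 3, size=(4, 2, 34))

# Intraobserver reliability: unweighted Cohen kappa between attempt 1 and
# attempt 2 for each surgeon, then averaged across the 4 surgeons.
intra = [cohen_kappa_score(gradings[s, 0], gradings[s, 1]) for s in range(4)]
print("mean intraobserver kappa:", np.mean(intra))

# Interobserver reliability: Fleiss kappa across the 4 surgeons' first-attempt
# gradings. aggregate_raters converts a (subjects x raters) rating matrix into
# the (subjects x categories) count table that fleiss_kappa expects.
table, _ = aggregate_raters(gradings[:, 0, :].T)  # 34 fractures x 4 raters
print("interobserver Fleiss kappa:", fleiss_kappa(table, method="fleiss"))
```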
  • RESULTS
    • The average intraobserver kappa coefficient for each classification was as follows: Wright and Cofield = 0.703, Campbell = 0.527, Worland = 0.637, Groh = 0.699. The overall Fleiss kappa coefficient for interobserver reliability for each classification was as follows: Wright and Cofield = 0.583, Campbell = 0.488, Worland = 0.496, Groh = 0.483. Interobserver reliability was significantly greater with the Wright and Cofield classification. Using the Landis and Koch criteria, all the classification systems assessed demonstrated only moderate interobserver agreement. Additionally, the mean Fleiss kappa coefficient for interobserver agreement on the preferred management strategy was 0.490, also indicating only moderate agreement.
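    • For reference, the Landis and Koch (1977) benchmarks map kappa values of 0.00-0.20 to slight, 0.21-0.40 to fair, 0.41-0.60 to moderate, 0.61-0.80 to substantial, and 0.81-1.00 to almost perfect agreement. The short sketch below applies those published cut points to the interobserver coefficients reported above; it introduces no new data.

```python
# Landis and Koch (1977) interpretation applied to the interobserver kappa
# coefficients reported in the results. The thresholds are the standard
# published cut points; the dictionary restates this study's reported values.
def landis_koch(kappa: float) -> str:
    if kappa < 0.00:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

interobserver = {
    "Wright and Cofield": 0.583,
    "Campbell": 0.488,
    "Worland": 0.496,
    "Groh": 0.483,
    "Preferred management strategy": 0.490,
}
for name, k in interobserver.items():
    print(f"{name}: kappa = {k:.3f} -> {landis_koch(k)} agreement")
```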
  • CONCLUSION
    • Interobserver reliability was only moderate for all 4 PPHF classification systems and for the preferred management strategy for the fractures assessed. Of the 4 PPHF classification systems, Wright and Cofield demonstrated the greatest mean intraobserver reliability and overall interobserver reliability. Our study highlights the need for a PPHF classification system that can achieve high intra- and interobserver reliability and that allows for a standardized treatment algorithm in the management of PPHFs.