Original Research

Analysis of the relationship between surgeon procedure volume and complications after total knee arthroplasty using a propensity-matched cohort study

Abstract

Objectives This study aimed to identify a threshold in annual surgeon volume associated with increased risk of revision (for any cause) and deep infection requiring surgery following primary elective total knee arthroplasty (TKA).

Design A propensity score matched cohort study.

Setting Ontario, Canada.

Participants 169 713 persons who received a primary TKA between 2002 and 2016, with 3-year postoperative follow-up.

Main outcome measures Revision arthroplasty (for any cause), and the occurrence of deep surgical infection requiring surgery.

Results Based on restricted cubic spline analysis, the threshold for increased probability of revision and deep infection requiring surgery was <70 cases/year. After matching of 51 658 TKA recipients from surgeons performing <70 cases/year to TKA recipients from surgeons performing 70 or more cases/year, patients in the former group had a higher rate of revision (for any cause, 2.23% (95% Confidence Interval (CI) 1.39 to 3.07) vs 1.70% (95% CI 0.85 to 2.55); Hazard Ratio (HR) 1.33, 95% CI 1.21 to 1.47, p<0.0001) and deep infection requiring surgery (1.29% (95% CI 0.44 to 2.14) vs 1.09% (95% CI 0.24 to 1.94); HR 1.33, 95% CI 1.17 to 1.51, p<0.0001).

Conclusions For primary TKA recipients, cases performed by surgeons who had performed fewer than 70 TKAs in the year prior to the index TKA were at 31% increased relative risk of revision (for any cause), and 18% increased relative risk for deep surgical infection requiring surgery, at 3-year follow-up.

Key messages

What is already known about this subject?

  • Increased surgeon volume is associated with reduced risk for surgical complications for total knee arthroplasty (TKA), but there is no specific threshold associated with a reduced risk.

What are the new findings?

  • This study has shown that, for a cohort of first-time TKA recipients, cases performed by surgeons who had performed fewer than 70 TKA procedures in the year prior to the index TKA were at increased risk for a major surgical complication requiring further surgery.

How might these results affect future research or surgical practice?

  • Having a healthcare policy that restricts provision of TKAs to only high-volume surgeons can have the effect of restricting access to care. Research should be focused on improving quality of care and ensuring that it is uniform across surgeons with varying volume. This has significant cost implications in terms of training, revalidation and continuing professional development.

Background

Patients undergoing a variety of procedures and treatments with high-volume providers, that is, physicians and hospitals that undertake the procedure with relatively high frequency, tend to achieve better outcomes.1 The identification of a specific volume threshold below which the risk for complications increases serves as a useful marker for surgeons planning their practice, and for hospitals when allocating resources.

No accepted rigorous methodology for identification of volume thresholds with regard to total knee arthroplasty (TKA) currently exists. Most studies have defaulted to assigning arbitrary cut-offs or ranking patients by volume and splitting them into quantiles for analysis, with potential information loss and the creation of volume categories that have limited rational basis.2 3 Consequently, published volume categories are remarkably inconsistent between studies.

The aim of this study was to use a restricted cubic splines methodology to visualize the relationship between surgeon volume (defined as the number of TKAs performed by the surgeon in the year before the index arthroplasty) and the rates of revision and deep infection requiring surgery within 3 years of surgery. We further aimed to identify a threshold of surgeon volume below which the risk for complications appreciably increased, and to quantify this risk using a propensity score matched analysis.

Methods

Study sample

We used health administrative databases from Ontario, Canada. The main data sources were hospital discharge abstracts from the Canadian Institute for Health Information Discharge Abstract Database, physician claims from the Ontario Health Insurance Plan (OHIP), and demographic information on each physician from the Ontario Physician Human Resources Data Centre and OHIP Corporate Provider Database. Using specific procedure and diagnostic codes from the Canadian version of the 10th revision of the International Statistical Classification of Diseases and the Canadian Classification of Health Interventions (ICD-10-CA/CCI), we defined a cohort of patients who received their first primary elective TKA for osteoarthritis between April 1, 2002 and March 31, 2016. Patients or the public were not involved in the design, or conduct, or reporting, or dissemination plans of our research.

Outcomes of interest: revision within three years of surgery

The primary outcome of this study was the occurrence of a revision arthroplasty (for any cause) within 3 years of surgery. Revision arthroplasty was defined as placement of new components following failure of some or all of a primary implant. The secondary outcome was the occurrence of deep infection requiring surgery within 3 years. Primary TKAs that involved the use of computer navigation, patient-specific instrumentation or robotics were excluded. Revision procedures (for any cause) were identified using ICD-10-CA/CCI procedure codes accompanied by the supplementary status attribute ‘R’. Deep infections requiring surgery were identified using two methods: (1) occurrence of an ICD-10-CA diagnostic code for intra-articular infection, with a confirmatory code for an irrigation and debridement and/or (2) occurrence of an OHIP code for a spacer insertion. Deep infections that led to a revision (irrigation and debridement as a first stage), as well as irrigation and debridement with implant retention, were also included.

Covariates of interest

We measured several patient and provider covariates that have been previously shown to affect the risk of occurrence of complications following knee replacement. Patient age and sex were obtained from the OHIP Registered Persons Databases.4–6 Comorbidities listed on hospital discharge abstracts in the 3 years before the index admission were categorized according to an adaptation of the Charlson Comorbidity Index.7 The Johns Hopkins Adjusted Clinical Groups (ACG) ‘frailty’ indicator, based on diagnosis codes from hospitalizations and physician visits in the 2 years before the index admission for TKA, was used to determine the presence of ‘frailty’ (yes/no) at the time of the surgery.8 We identified patients with a history of pre-existing coronary artery disease (CAD), diabetes, hypertension, chronic obstructive pulmonary disease (COPD), and dementia using validated algorithms (see online supplemental appendix 1).9–17 This study used administrative data available at the Institute for Clinical Evaluative Sciences (ICES). As Ontario, Canada has a single-payer healthcare system, ICES is able to capture every interaction a patient has within this system. Several algorithms composed of ICD-10 codes and provider billing codes have been used to create cohorts of patients with various comorbidities. The algorithms used in this study have been validated with chart abstraction to determine their sensitivity, specificity and positive predictive value (online supplemental appendix 1).9–17 We also identified patients who had been previously counseled on smoking cessation through appropriate physician billing codes.

Neighborhood income quintile was used as a surrogate for socioeconomic status. This measure categorizes small geographic areas into five roughly equal population groups, with the lowest quintile referring to the least affluent neighborhoods.18 19 The Ontario marginalization index was also assessed. This comprises four elements: ethnic concentration, residential instability, dependency, and deprivation.20 Each element is sorted into fifths, arranged from least (lowest fifth) to most marginalized (highest fifth).20 The index has been shown to be stable across time periods and across different geographic areas and to be associated with health outcomes including depression, smoking, alcohol consumption, and body mass index.20 For each TKA, hospital volume was defined as the number of TKA procedures (both primary and revision) performed at the hospital where the surgery was performed in the 365 days prior to the index procedure. Unicompartmental knee arthroplasties (UKA) were excluded, as was any revision of a UKA to a TKA. Surgeon experience was operationalized as the number of years since completion of residency/fellowship orthopedic training, as evidenced by the start of Royal College accreditation and start of independent practice, for the senior surgeon at the time of the index surgery. This includes foreign surgeons moving to Canada with substantial experience, but recent Royal College accreditation.

Main exposure variable: surgeon volume

For each TKA, surgeon volume was defined as the number of TKA procedures (both primary and revision) performed by the senior surgeon in the 365 days prior to the index procedure. This definition allowed for an individual surgeon’s volume to change dynamically over time. Only revisions of primary TKAs were included; revisions of UKAs to TKAs were excluded.
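The rolling 365-day definition of surgeon volume can be illustrated with a minimal Python sketch. This is not the study's code (the analyses were run in SAS); the function name `prior_year_volumes` and the input layout are hypothetical, and the sketch assumes that the index procedure and any same-day procedures are not counted in the surgeon's prior volume.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import date, timedelta


def prior_year_volumes(procedures):
    """For each procedure, count the surgeon's TKAs in the prior 365 days.

    `procedures` is a list of (surgeon_id, procedure_date) tuples
    (hypothetical layout). The index procedure itself is not counted.
    """
    # Group each surgeon's procedure dates and sort them once.
    dates_by_surgeon = defaultdict(list)
    for surgeon_id, when in procedures:
        dates_by_surgeon[surgeon_id].append(when)
    for dates in dates_by_surgeon.values():
        dates.sort()

    volumes = []
    for surgeon_id, when in procedures:
        dates = dates_by_surgeon[surgeon_id]
        window_start = when - timedelta(days=365)
        # Count dates d with window_start <= d < when.
        lo = bisect_left(dates, window_start)
        hi = bisect_left(dates, when)
        volumes.append(hi - lo)
    return volumes
```

Because the count is recomputed at every index procedure, the same surgeon can fall below the threshold for one patient and above it for a later one, which is the dynamic behaviour described above.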

Statistical analyses

Baseline cohort characteristics were calculated using proportions and medians as appropriate, and were compared between groups using Wilcoxon rank sum tests for continuous variables and χ² tests for categorical variables. A propensity score for receipt of a TKA from a surgeon whose prior-year volume was below the determined threshold value was calculated using a logistic regression model.21 22 The covariates entered into the propensity score were sociodemographics (age, sex, income quintile, Ontario marginalization index), health status (Charlson score, frailty, hypertension, COPD, congestive heart failure (CHF), diabetes, CAD, smoking status), provider characteristics (annual hospital volume), and surgeon experience.

Restricted cubic splines with four knots23 were used to model the relationship between surgeon volume and the occurrence of revision and infection after controlling for patient age, sex, income quintile, comorbidities, hospital volume and surgeon experience. Restricted cubic splines are a flexible tool to model complex, non-linear relationships between a continuous variable and an outcome.23 A conventional regression model assumes a linear relationship between a continuous predictor and the outcome. This would imply that the impact of an increase in surgeon volume would be the same whether the increase was from 10 to 30 cases/year, or from 210 to 230 cases/year. A spline, on the other hand, does not assume linearity. It divides the relationship into smaller ‘chunks’, allowing for portions that are not linear. In the current study, use of a spline also allowed us to identify a threshold of volume at which the greatest benefit of surgeon volume is obtained. The non-linear relationship between surgeon volume and the risk of revision was examined to identify an inflection point, if any, which could be used to dichotomize annual volume into categories in a clinically meaningful way. If an area of inflection was observed, multivariable logistic regression was used to determine the area under the receiver operating characteristic curve (AUC) for the models relating various surgeon volume cut-points to the risk of revision. The surgeon volume with the maximum AUC was selected as the cut-point to dichotomize surgeon volume.
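To make the spline construction concrete, the following Python sketch builds the widely used truncated-power form of a restricted cubic spline basis. It is an illustration only: the study was run in SAS, the function name `rcs_basis` is hypothetical, and the knot locations in `KNOTS` are placeholders, since the knot positions are not reported in this section.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis (truncated-power form) for one value x.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots. A model that is linear
    in these terms is cubic between knots and linear beyond the outer knots.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("need at least three knots")

    def cube_plus(u):
        # Truncated cubic: u^3 when u > 0, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    span = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2  # keeps spline terms on a scale similar to x
    basis = [float(x)]
    for j in range(k - 2):
        term = (
            cube_plus(x - t[j])
            - cube_plus(x - t[-2]) * (t[-1] - t[j]) / span
            + cube_plus(x - t[-1]) * (t[-2] - t[j]) / span
        )
        basis.append(term / scale)
    return basis


# Hypothetical knot locations in cases/year (assumption, not from the study).
KNOTS = [30, 70, 110, 180]
```

With four knots, each surgeon volume is expanded into three regression terms (one linear, two non-linear), so a change from 10 to 30 cases/year is free to have a different effect from a change from 210 to 230 cases/year.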

TKA recipients from a surgeon who had operated below the selected threshold value in the preceding year were matched to those from a surgeon at or above the threshold number of cases on the logit of the propensity score using calipers of width equal to 0.2 of the SD of the logit of the propensity score.24 A matching ratio of 1:1 was used.25
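The caliper rule can be sketched as follows. This is a simplified illustration, assuming greedy nearest-neighbour matching without replacement and a caliper computed from the pooled sample; the section does not state which matching algorithm was used, and the names `logit` and `caliper_match` are hypothetical.

```python
import math
from statistics import stdev


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def caliper_match(low_ps, high_ps, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    low_ps / high_ps map patient id -> propensity score for the low- and
    high-volume groups. Returns a list of (low_id, high_id) pairs.
    """
    low = {i: logit(p) for i, p in low_ps.items()}
    high = {i: logit(p) for i, p in high_ps.items()}
    # Caliper width: 0.2 x SD of the logit of the propensity score (pooled).
    caliper = caliper_sd * stdev(list(low.values()) + list(high.values()))

    available = dict(high)
    pairs = []
    for low_id, lp in low.items():
        if not available:
            break
        best = min(available, key=lambda h: abs(available[h] - lp))
        if abs(available[best] - lp) <= caliper:
            pairs.append((low_id, best))
            del available[best]  # each high-volume patient is used once
    return pairs
```

Low-volume recipients with no high-volume counterpart inside the caliper stay unmatched, which is why fewer than 100% of low-volume cases appear in a matched cohort.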

We estimated standardized differences for all covariates after matching, with a standardized difference of 10% or more considered indicative of imbalance.25 The occurrence of complications was also compared between the two groups after matching, using methods appropriate for the analysis of matched data in estimating the treatment effect and its statistical significance. Generalized estimating equations were used to determine the increased risk (if any) in patients with surgeons whose volumes were lower than the selected threshold, after taking pair-matching into account.23 All analyses were performed at the Institute for Clinical Evaluative Sciences (http://www.ices.ca) using SAS V.9.3 for UNIX (SAS Institute, Cary, North Carolina, USA). The type I error probability was set to 0.05 for all analyses.
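The balance check can be expressed with the usual standardized difference formulas. The sketch below assumes the commonly used definitions (difference in means or prevalences divided by the pooled SD of the two groups); the function names are hypothetical and the section does not spell out the exact formula applied.

```python
import math


def std_diff_binary(p_low, p_high):
    """Standardized difference (in %) for a binary covariate, from prevalences."""
    pooled = math.sqrt((p_low * (1 - p_low) + p_high * (1 - p_high)) / 2)
    return 100 * (p_low - p_high) / pooled


def std_diff_continuous(mean_low, sd_low, mean_high, sd_high):
    """Standardized difference (in %) for a continuous covariate."""
    pooled = math.sqrt((sd_low ** 2 + sd_high ** 2) / 2)
    return 100 * (mean_low - mean_high) / pooled


def is_balanced(d_percent, threshold=10.0):
    # A standardized difference of 10% or more is treated as imbalance.
    return abs(d_percent) < threshold
```

Unlike a p value, the standardized difference does not depend on sample size, which makes it a more suitable balance diagnostic in a cohort of this size.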

Results

Patient and surgeon characteristics

Between April 1, 2002 and March 31, 2016, there were 169 713 eligible TKA recipients (table 1), with a median age of 68 years; 62.5% were female. The median annual surgeon volume was 89 cases (IQR 59–127), and the median number of years in practice was 15 years (IQR 8–24).

Table 1
Characteristics of eligible primary total knee arthroplasty recipients

Regression splines describing the relationship between surgeon volume and risk of complications

The restricted cubic splines relating annual surgeon volume to revision and infection had similar shapes—both were negatively sloped with inflection points at approximately 70 cases/year, after which the rates of complications continued to decrease with increased surgeon volume, but at a lower rate (figures 1 and 2). The AUC was determined for various cut-points of surgeon volume (60, 65, 70, 75, 80, 85, 90). Subsequently, surgeon volume was dichotomized at 70 cases (<70 or ≥70 cases in the 365 days prior to the procedure). A total of 56 265 procedures (33%) were performed by ‘low-volume’ surgeons over the study period.
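The cut-point search can be sketched in a much simplified form. The study compared AUCs from multivariable logistic regression models; the sketch below instead scores each candidate cut-point by the AUC of the dichotomized volume flag alone, which is an assumption made to keep the example self-contained. The function names `auc` and `best_cutpoint` are hypothetical.

```python
def auc(scores, events):
    """Area under the ROC curve by pairwise comparison (ties count one half)."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutpoint(volumes, events, candidates):
    """Return (cut, auc) for the candidate whose 'volume < cut' flag has max AUC."""
    results = []
    for cut in candidates:
        low_volume = [1 if v < cut else 0 for v in volumes]
        results.append((cut, auc(low_volume, events)))
    return max(results, key=lambda r: r[1])
```

Called with the candidate list from this section, for example `best_cutpoint(volumes, events, [60, 65, 70, 75, 80, 85, 90])`, the function returns the cut-point whose low-volume flag best separates patients with and without the complication.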

Figure 1

Probability of revision (for any cause) within 3 years vs annual surgeon volume for patients undergoing primary elective total knee arthroplasty. KEY—thick red line—mean values. Lighter vertical bars—95% CI.

Figure 2

Probability of deep infection requiring surgery within 3 years vs annual surgeon volume for patients undergoing primary elective total knee arthroplasty. KEY—thick blue line—mean values. Lighter vertical bars—95% CI.

Matching

A total of 51 658 subjects (92%) who received a TKA from a surgeon with <70 cases in the 365 days prior to the surgery were successfully matched to a TKA recipient from a surgeon with ≥70 cases in the 365 days prior to the surgery (table 2). There were no exclusions to the propensity-score match. Patients who did not complete follow-up to 3 years postoperatively because of death (n=4374 (2.57%)) were not excluded from the match. A patient flow diagram is illustrated in figure 3. After matching, the absolute standardized differences were less than 10% for all variables entered into the propensity score, indicating an adequate match (table 2).

Figure 3

Participant flow diagram for patients undergoing analysis of outcome based on surgeon procedure volume using a propensity score matched cohort study after primary elective TKA. TKA, total knee arthroplasty.

Table 2
Comparison of primary TKA recipients, after matching

Outcomes after matching

Knee replacement recipients from a surgeon with low annual volumes (<70 cases in the year prior) had a higher rate of revision within 3 years (2.23% (95% CI 1.39 to 3.07) vs 1.70% (95% CI 0.85 to 2.55); HR 1.33, 95% CI 1.21 to 1.47, p<0.0001) and infection requiring surgery within 3 years (1.29% (95% CI 0.44 to 2.14) vs 1.09% (95% CI 0.24 to 1.94); HR 1.33, 95% CI 1.17 to 1.51, p<0.0001) (table 3). The number needed to harm for revision was 189 persons (95% CI 143 to 278), corresponding to an absolute risk increase of 0.53% (95% CI 0.37% to 0.71%).
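The absolute risk increase and number needed to harm follow directly from the two matched revision rates. The short Python check below reproduces that arithmetic from the point estimates reported above; the function name `risk_summary` is hypothetical, and the crude ratio of proportions it returns is a different quantity from the model-based HR.

```python
import math


def risk_summary(risk_low_pct, risk_high_pct):
    """Absolute risk increase, crude relative risk and number needed to harm."""
    ari_pct = risk_low_pct - risk_high_pct        # percentage points
    relative_risk = risk_low_pct / risk_high_pct  # crude ratio of proportions
    nnh = math.ceil(100 / ari_pct)                # round up to whole persons
    return ari_pct, relative_risk, nnh


# 3-year revision rates in the matched cohort: 2.23% vs 1.70%.
ari, rr, nnh = risk_summary(2.23, 1.70)
print(f"ARI {ari:.2f} points, RR {rr:.2f}, NNH {nnh}")
# prints: ARI 0.53 points, RR 1.31, NNH 189
```

The same calculation for deep infection (1.29% vs 1.09%) gives a crude relative risk of about 1.18.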

Table 3
Proportion of primary TKA recipients with specific complications, after matching

Discussion

The model used in this study indicated that as surgeon volume increased, the risks for complications decreased. We observed an inflection point at 70 TKAs/year, after which the rate of decrease in the risks for complications leveled off. After performing a propensity score match for surgeon volume, which also controlled for surgeon experience, we found that in patients operated on by surgeons with annual volumes <70 cases, the relative risk for revision (for any cause) increased by 31% (2.23% (95% CI 1.39 to 3.07) vs 1.70% (95% CI 0.85 to 2.55)), and the relative risk of deep surgical infection requiring surgery increased by 18% (1.29% (95% CI 0.44 to 2.14) vs 1.09% (95% CI 0.24 to 1.94)). The findings of this study indicate that for surgeons performing <70 primary TKAs per year, regardless of previous experience, there is an increased likelihood for these two complications.

Use of the above approach is lacking in other studies, which define volume categories instead. One such study used stratum-specific likelihood ratio analysis, a method of analyzing receiver operating characteristic curves, to show that there is a significant decrease in 90-day complication and 2-year revision rates for surgeons in higher volume categories (60–145 TKAs per year and >146 TKAs per year) vs those in lower volume categories (0–12, 13–59 TKAs per year).26 A criticism of the aforementioned study is that it does not take account of changes in procedures and practices over time,27 something which the cubic spline technique is able to account for, as it is a dynamic assessment of a surgeon’s practice, meaning that each individual surgeon can have different volumes across the study period.

The threshold of 70 primary TKAs per year is higher than previously reported for total hip arthroplasty (THA) (35 cases per year28). A possible reason for this difference is that the rates of early complications after TKA are generally a little lower than after THA; as such, a higher volume threshold is required to observe a difference in outcomes. The THA study complications included dislocation, periprosthetic fracture, and a broader definition for infection. The early complications of venous thromboembolism and death within 90 days were also included in that study.

Many patients do not have the means or social support to travel to high volume TKA providers from rural areas. Having a healthcare policy that restricts provision of TKAs to only high volume surgeons can have the effect of restricting access to care. Research should be focused on improving quality of care and ensuring that it is uniform across surgeons with varying volume.27 This has significant cost implications in terms of training, revalidation and continuing professional development.

Through the use of restricted cubic splines, we found that there was a noticeable decrease in likelihood of revision (for any cause) and deep infection requiring surgery, as the surgeon’s yearly primary knee arthroplasty volume increased; however, the relationship was not linear. While the relative improvement in risk of revision (for any cause) and deep infection requiring surgery with increasing surgeon volume attenuated beyond approximately 70 cases/year, there continued to be a downward trend in the risks for these two complications, indicating that increased surgeon volume continues to have a beneficial impact, although one that is less pronounced.

The cumulative risk of revision for TKAs in the National Joint Registry for England and Wales ranges from 1.53% for cemented TKAs, to 1.80% for hybrid TKAs, and 2.09% for uncemented TKAs, within 3 years following surgery.29 The absolute difference in revision rates between the cemented and uncemented designs (0.56 percentage points) is of similar magnitude to the difference between low and high volume surgeons found in the current study (0.53 percentage points).

There are limitations to the work we have presented. First, we did not have any information on patient-reported outcomes. Second, we are unable to report a subgroup analysis based on the indications for revision TKA, nor assess center effects in the modelling we have undertaken. We were also unable to control for technical aspects of the procedure, for example, severity of pre-existing deformity, the need for adjunctive soft tissue releases, or length of surgery, factors which have been linked with complications following TKA.30 With regard to the analysis performed, there is a risk of introducing bias with propensity score matching.31 In this study, however, we have matched more than 90% of the cases performed by low volume surgeons, suggesting a representative sample, and standardized differences for all relevant measured confounders were under 10% after matching.

This study was conducted in Ontario, Canada where it is standard practice for healthcare organizations (hospitals or hospital networks) to standardize implants for use by their surgeons. For the vast majority of surgeons therefore, implant selection would be equivalent irrespective of volume. We accept that having the detail of the types of brands may help define if there is an advantage for low volume surgeons using the best performing implants but the converse could also be the case. Unfortunately, this detail is not something we have available in our database to provide. There is no role for private practice in the provision of joint replacement; hence, all practicing surgeons are accounted for in the available data set. Surgeons registered with the College of Physicians and Surgeons of Ontario are not allowed to work across two provinces. As such, we are confident that the surgeon volumes captured in the study are not an underestimate, but we also accept that we are unable to account for the unlikely movement of surgeons in and out of the Ontario system during the period of the study.

Further research, potentially using data sources that capture implant-level and surgeon-movement information, is recommended to confirm or refute these hypotheses. With regard to deep infection requiring surgery as an outcome, we have used a limited definition (cases requiring irrigation and debridement, liner exchange or spacer insertion), meaning that we have not captured instances of superficial surgical site infection (SSI). It has recently been demonstrated that increased surgical duration is associated with a higher risk for infection following TKA32 and this is a plausible mechanism linking low volumes and infection following TKA. Infection should be viewed as a spectrum from acute to chronic.33 We define an infection in our database as one that is serious enough to warrant an additional intervention such as spacer insertion or wound irrigation and debridement. The rate of superficial SSIs that are managed primarily by surgeons or primary care providers is difficult to quantify and this information is not available in our database.

This paper and the description of the study methodology provide an opportunity for a similar type of analysis to be carried out in different geographical locations/settings, as it may be inappropriate to apply a number obtained from one region to another. This is especially important as the median number of TKAs performed per surgeon per year differs between countries (36 in the UK29; 23 in the USA34).

Conclusion

In a cohort of primary TKA recipients, we found that cases performed by surgeons who had performed fewer than 70 TKA procedures in the year prior to the index TKA were at 31% increased relative risk of revision (for any cause), and 18% increased relative risk for deep surgical infection requiring surgery, at 3 years of follow-up.