Accuracy of conventional and novel scoring systems in predicting severity and outcomes of acute pancreatitis: a retrospective study

Background: Recently, several novel scoring systems have been developed to evaluate the severity and outcomes of acute pancreatitis. This study aimed to compare the effectiveness of novel and conventional scoring systems in predicting the severity and outcomes of acute pancreatitis.
Methods: Patients treated between January 2003 and August 2020 were reviewed. The Ranson score (RS), Glasgow score (GS), bedside index of severity in acute pancreatitis (BISAP), pancreatic activity scoring system (PASS), and Chinese simple scoring system (CSSS) were determined within 48 h after admission. Multivariate logistic regression was used for severity, mortality, and organ failure prediction. Optimum cutoffs were identified using receiver operating characteristic curve analysis.
Results: A total of 1848 patients were included. The areas under the curve (AUCs) of RS, GS, BISAP, PASS, and CSSS for severity prediction were 0.861, 0.865, 0.829, 0.778, and 0.816, respectively. The corresponding AUCs for mortality prediction were 0.693, 0.736, 0.789, 0.858, and 0.759. The corresponding AUCs for acute respiratory distress syndrome prediction were 0.745, 0.784, 0.834, 0.936, and 0.820. Finally, the corresponding AUCs for acute renal failure prediction were 0.707, 0.734, 0.781, 0.868, and 0.816.
Conclusions: RS and GS predicted severity better than they predicted mortality and organ failure, while PASS predicted mortality and organ failure better. BISAP and CSSS performed equally well in severity and outcome predictions.


Background
Acute pancreatitis (AP) is an inflammatory disease of the pancreas with a worldwide incidence varying from 33.2/100,000 to 45/100,000 in the general population [1][2][3]. Approximately 10% to 20% of patients with AP have a severe clinical course, with significant morbidity and mortality due to local and systemic complications [3][4][5][6]. Acute respiratory distress syndrome (ARDS) and acute renal failure (ARF) are common complications of severe acute pancreatitis and result in worse outcomes [7][8][9]. Therefore, the early detection of ARDS and ARF in patients with AP is indispensable.
Many studies have compared biochemical markers and various scoring systems in the early stage to predict disease course and outcomes in AP [10][11][12][13]. Conventional scoring systems, including the Ranson score (RS), Glasgow score (GS), Acute Physiology and Chronic Health Evaluation (APACHE) II score, and bedside index of severity in acute pancreatitis (BISAP), have been used to assess the severity of AP. However, these scores are complicated and require multiple clinical parameters that are difficult to obtain for risk stratification. Although biomarkers are easy to obtain, their ability to predict outcomes varies [14][15][16][17]. Recently, some novel scoring systems have been developed. A prospective cohort study [18] showed that the pancreatic activity scoring system (PASS; Table 1), which was first reported by the Southern California Pancreas Study Group in 2017 [19], could predict important clinical events at different points during the course of AP. Another new scoring system, the Chinese simple scoring system (CSSS; Table 2), was proposed in 2020 [20]. Neither scoring system is yet widely used.
The present study aimed to specifically determine the accuracy of these conventional and novel scoring systems as well as biomarkers in predicting severity, mortality, and organ failure in patients with AP.

Study design and patient selection
A retrospective study was conducted. Records of patients with AP who were treated at The First Affiliated Hospital of Guangxi Medical University between January 2003 and July 2020 were reviewed.
Patients were diagnosed with AP if they met at least two of the following three criteria: (1) abdominal pain consistent with AP, (2) serum lipase activity or amylase activity at least three times greater than the upper limit of normal, and (3) characteristic findings on abdominal imaging. Patients younger than 16 years, those known to have chronic pancreatitis, or those without sufficient data were excluded from the study.
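The inclusion and exclusion rules above can be expressed as a short screening routine. The following is a minimal sketch only, not the procedure used by the study team: the record layout (`PatientRecord`) and all field names are hypothetical, and the enzyme criterion is assumed to be encoded as the ratio of the measured lipase or amylase activity to the upper limit of normal.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical record layout; field names are illustrative assumptions."""
    age_years: int
    typical_abdominal_pain: bool      # criterion 1: pain consistent with AP
    enzyme_ratio_to_uln: float        # lipase or amylase activity / upper limit of normal
    characteristic_imaging: bool      # criterion 3: characteristic imaging findings
    chronic_pancreatitis: bool
    data_sufficient: bool


def meets_ap_diagnosis(p: PatientRecord) -> bool:
    """AP is diagnosed when at least two of the three criteria are met."""
    criteria = [
        p.typical_abdominal_pain,
        p.enzyme_ratio_to_uln >= 3.0,  # at least three times the upper limit of normal
        p.characteristic_imaging,
    ]
    return sum(criteria) >= 2


def is_eligible(p: PatientRecord) -> bool:
    """Apply the exclusion rules first, then the diagnostic rule."""
    if p.age_years < 16 or p.chronic_pancreatitis or not p.data_sufficient:
        return False
    return meets_ap_diagnosis(p)
```

For example, a 45-year-old patient with typical pain and an enzyme ratio of 3.5 but no imaging findings would be eligible under this sketch, whereas the same findings in a 15-year-old would not.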

Definitions of severity and organ failure
Severity of AP was evaluated based on the revised Atlanta classification [21]. Mild AP was defined as AP in the absence of organ failure and local/systemic complications. Severe AP was characterized by the presence of organ failure and/or local complications. Organ failure was defined according to the modified Marshall scoring system [22].

Biochemical markers, scoring systems, and their cutoffs
Biochemical markers measured within 48 h after admission were analysed. RS [23], GS [24], BISAP [25], PASS [19], and CSSS [20] were calculated for each patient within 48 h after admission. Scores were compared for their accuracy in the prediction of disease severity, mortality, and development of organ failure (ARDS and ARF).

Statistical analysis
SPSS v23.0 (IBM Corp., Armonk, NY) was used for statistical analyses. Continuous variables are presented as mean ± standard deviation. The Student t-test was used for continuous variables. The chi-square test was used for categorical variables. Univariate and multivariate logistic regression analyses were carried out to identify risk factors. Potential risk factors with P < 0.05 in the univariate analyses were entered into the binary logistic backward stepwise regression analysis. The results are presented as odds ratios (ORs) with 95% confidence intervals (CIs). Receiver operating characteristic (ROC) curves of the scores were used for the prediction of severe AP, mortality, ARDS, and ARF. Areas under the curve (AUCs) were used to evaluate the predictive accuracy of each scoring system. All optimum cutoffs were identified on the basis of the highest sensitivity and specificity values generated from the ROC curves.
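To make the ROC step concrete, the sketch below computes an AUC and an optimum cutoff for a single score in plain Python. It is illustrative only: the analyses in this study were run in SPSS, the choice of the Youden index (sensitivity + specificity − 1) as the rule for "highest sensitivity and specificity" is an assumption, and the example scores and outcomes are invented rather than taken from the study data.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (cutoff, sensitivity, specificity) for each rule 'score >= cutoff'."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        points.append((cutoff, true_pos / n_pos, true_neg / n_neg))
    return points


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random non-case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float, float]:
    """Pick the cutoff maximizing the Youden index (an assumed selection rule)."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1.0)


# Hypothetical example: a 0-5 score in ten patients, 1 = severe AP.
example_scores = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5]
example_labels = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
cutoff, sensitivity, specificity = best_cutoff(example_scores, example_labels)
print(f"AUC={auc(example_scores, example_labels):.3f}, cutoff>={cutoff}, "
      f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

When two cutoffs give the same Youden index, `max` returns the lower one because cutoffs are visited in ascending order; a real analysis would need an explicit tie-breaking rule.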

Baseline characteristics
Among 1848 patients enrolled, 1164 (62.99%) had mild AP and 684 (37.01%) had severe AP. The mean age of the patients was 48.22 ± 16.21 years. The mean age was significantly higher in the severe AP group than in the mild AP group (P < 0.001). A male preponderance (68.19%) was found. ARF was more common in male patients than in female patients (P < 0.001). A higher body-mass index (BMI) was observed in the severe AP group than in the mild AP group (P < 0.001). The BMI of patients with ARDS/ARF was higher than that of patients without ARDS/ARF (P < 0.05; Table 3). Gallstones (38.47%) were the most common cause of AP, followed by hypertriglyceridemia (16.72%) and alcohol consumption (10.77%). Alcohol-associated pancreatitis was more common in the severe AP group, ARDS group, and ARF group (Table 3). Hyperlipidemia (14.88%) and type-2 diabetes mellitus (7.52%) were common comorbidities. A history of smoking and a history of alcohol intake were present in 541 (29.27%) and 591 (31.98%) patients, respectively. Alcohol consumption was more common in patients with severe AP (P < 0.001), ARDS (P = 0.002), and ARF (P < 0.001; Table 3). Hospital stay was longer in patients with severe AP than in patients with mild AP (P < 0.001). The mortality rate was much higher in the severe AP group than in the mild AP group (P < 0.001; Table 3).

Value of biomarkers in predicting severity, mortality, and organ failure
In the multivariate analysis, white blood cell count (WBC), serum albumin, lactate dehydrogenase (LDH), calcium, glucose, and C-reactive protein (CRP) predicted the severity of AP.

Discussion
In the present study, BMI was an independent factor for the development of ARDS in AP patients, which is consistent with a meta-analysis demonstrating that obesity was an important risk factor for the development of ARDS [26]. Studies have shown that patients who are obese have higher levels of circulating neutrophils [27] and blood cytokines [28], and have low-grade chronic inflammation triggered by obesity [29]. Moreover, innate immune cell activation and endothelial injury in the pulmonary microvasculature are major contributors to increased cell permeability and pulmonary edema in obese patients [30,31]. This study revealed that serum Ca²⁺ showed good ORs for severity and ARDS prediction. Abnormal regulation of Ca²⁺ signals acts as a crucial trigger in the pathogenesis of AP [32]. A study has shown that hypocalcemia is an independent risk factor for severe AP and for respiratory failure in AP [33]. According to the present study, the WBC predicted the development of severe AP and ARDS. Furthermore, serum albumin, glucose, LDH, and CRP were also predictive factors for severe AP. These biomarkers are commonly used to predict severe AP. In terms of mortality prediction, the multivariate analysis identified an increase in serum total bilirubin as a risk factor. Although few studies have reported a definite relationship between total bilirubin and mortality in AP, some studies have found that the albumin-bilirubin score has a high predictive capacity for in-hospital mortality or prognosis in patients with critical diseases such as acute upper gastrointestinal bleeding due to liver cirrhosis [34], postoperative hepatic carcinoma [35,36], and AP [37]. Moreover, the present study showed that the elevation of serum triglycerides was a risk factor for ARF in AP patients, which is consistent with the findings of a meta-analysis reported in 2018 [38].
Abbreviations: ARDS, acute respiratory distress syndrome; ARF, acute renal failure; AP, acute pancreatitis; BMI, body-mass index; T2DM, type-2 diabetes mellitus; CRP, C-reactive protein; AST, aspartate transaminase; BUN, blood urea nitrogen; LDH, lactate dehydrogenase; BISAP, bedside index of severity in acute pancreatitis; PASS, pancreatic activity scoring system; CSSS, Chinese simple scoring system; WBC, white blood cell count; OR, odds ratio; CI, confidence interval. P < 0.05 was accepted as statistically significant.
RS, GS, and BISAP showed high accuracy in predicting the severity rather than the outcomes of AP in the present study. RS and GS predicted the severity and 3 outcomes of AP equally well, which was probably due to the similar parameters they share. Although simple, these scores are not repeatable. According to this study, BISAP was inferior to both RS and GS in predicting severity, which is consistent with the findings of other prospective studies [39,40]. This is because the items in RS and GS cover more systems than those in BISAP. Nevertheless, BISAP was superior to RS and GS in predicting mortality in the present study. Hall et al. also found that RS and GS were not good indicators of mortality in AP [41]. BISAP was also better than RS and GS at predicting ARDS and ARF, possibly because it is based on 3 important items that are related to the renal and respiratory systems, namely, blood urea nitrogen (BUN), systemic inflammatory response syndrome (SIRS), and pleural effusion.
PASS is a system that assesses the activity of AP at any time during hospitalization. It contains not only objective items (organ failure and SIRS) but also subjective items (abdominal pain, morphine usage, and ability to tolerate solid diet). Because its items can be reassessed, it can be used at any time during hospitalization. A prospective study [18] demonstrated that a cutoff PASS score of > 140 on admission was associated with an AUC of 0.71 for predicting severe AP.
The present study found a similar AUC for PASS for severe AP prediction. As the center in which this study was conducted rarely uses morphine to relieve abdominal pain in patients with AP, the cutoff for severity prediction was only 90. In the present study, PASS scores best predicted mortality and organ failure, especially ARDS. This is because PASS contains organ failure items. However, its subjective items (such as abdominal pain, morphine usage, and ability to tolerate solid diet) make it inferior to other scores in severity prediction. Thus far, no study has reported the predictive ability of PASS for the outcomes of AP.
Abbreviations: ARDS, acute respiratory distress syndrome; ARF, acute renal failure; AP, acute pancreatitis; AUC, area under the curve; PPV, positive predictive value; NPV, negative predictive value; BUN, blood urea nitrogen; BISAP, bedside index of severity in acute pancreatitis; PASS, pancreatic activity scoring system; CSSS, Chinese simple scoring system; CI, confidence interval.
Four biomarkers, heart rate, and pancreatic imaging findings are included in CSSS. According to the present study, the AUCs of CSSS for severity and mortality prediction were 0.834 and 0.838, respectively. The cutoff points were 4 for severity and 6 for mortality. However, the AUCs and cutoff points in this study are lower than those reported in the previous study [20], which is probably attributable to the larger sample size of the current study. In the present study, CSSS showed nearly the same ability in predicting the 4 outcomes of AP, and it shared nearly equal capacity with BISAP for predicting the outcomes of AP, which indicates that CSSS is a promising scoring system. However, no study evaluating CSSS was found. Hence, prospectively designed studies with larger sample sizes are required to verify the efficiency of this new scoring system.
Fig. 1 a Receiver operating characteristic curves of scoring systems to predict severe AP. b ... with AP. c Receiver operating characteristic curves of scoring systems to predict ARDS in patients with AP. d Receiver operating characteristic curves of scoring systems to predict ARF in patients with AP

Study strengths and limitations
A strength of the present study is that it compared both conventional and novel scoring systems as well as biomarkers in a large sample of Chinese patients for the prediction of the severity and outcomes of AP.
This study has some limitations. First, it was a retrospective single-center study. Second, the interval between the onset of AP and admission varied among patients, which probably resulted in heterogeneity in the timing of score calculations and biochemical marker measurements.

Conclusion
RS and GS predicted severity better than mortality and organ failure, while PASS predicted mortality and organ failure better. As a novel scoring system, PASS has potential, but some of its items are not well suited to Chinese medical centers. BISAP and CSSS performed equally well in severity and outcome prediction.