Identification of potential diagnostic and prognostic biomarkers for sepsis based on machine learning.
Computational and structural biotechnology journal
Background: To identify potential diagnostic and prognostic biomarkers of the early stage of sepsis. Methods: The differentially expressed genes (DEGs) between sepsis and control transcriptomes were screened from the GSE65682 and GSE134347 datasets. The candidate biomarkers were identified by least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE) analyses. The diagnostic and prognostic abilities of the markers were evaluated by plotting receiver operating characteristic (ROC) curves and Kaplan-Meier survival curves. Gene Set Enrichment Analysis (GSEA) and single-sample GSEA (ssGSEA) were performed to further elucidate the molecular mechanisms and immune-related processes. Finally, the potential biomarkers were validated in a septic mouse model by qRT-PCR and western blotting. Results: Eleven DEGs were identified between the sepsis and control samples, including YOD1, GADD45A, BCL11B, IL1R2, UGCG, TLR5, S100A12, ITK, HP, CCR7 and C19orf59 (all AUC > 0.9). Furthermore, the survival analysis identified YOD1, GADD45A, BCL11B and IL1R2 as prognostic biomarkers of sepsis. According to GSEA, these four DEGs were significantly associated with immune-related processes. In addition, ssGSEA demonstrated a significant difference in the enriched immune cell populations between the sepsis and control groups (all P < 0.05). Moreover, YOD1, GADD45A and IL1R2 were upregulated, and BCL11B was downregulated, in the heart, liver, lungs, and kidneys of the septic mouse model. Conclusions: We identified four potential immune-related diagnostic and prognostic gene markers for sepsis that offer new insights into its underlying mechanisms.
10.1016/j.csbj.2023.03.034
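The abstract above rates each candidate gene by its ROC AUC. As a rough illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen sepsis sample has a higher marker value than a randomly chosen control (ties count as one half). The function name roc_auc and the expression values are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Return the ROC AUC of a single marker.

    AUC equals the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; tied pairs count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for one upregulated gene (illustration only).
expression = [8.1, 7.9, 8.4, 6.2, 7.7, 5.9, 6.1, 6.4, 5.8, 6.0]
is_sepsis = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

auc = roc_auc(expression, is_sepsis)
print(f"AUC = {auc:.2f}")
```

For a marker that is lower in sepsis than in controls, this function returns a value below 0.5; negating the scores before the call gives the corresponding value above 0.5.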
Predicting sepsis in-hospital mortality with machine learning: a multi-center study using clinical and inflammatory biomarkers.
European journal of medical research
BACKGROUND: This study aimed to develop and validate an interpretable machine-learning model that uses clinical features and inflammatory biomarkers to predict the risk of in-hospital mortality in critically ill patients with sepsis. METHODS: We enrolled all patients diagnosed with sepsis in the Medical Information Mart for Intensive Care IV (MIMIC-IV, v.2.0), eICU Collaborative Research Database (eICU-CRD 2.0), and Amsterdam University Medical Centers (AmsterdamUMCdb 1.0.2) databases. LASSO regression was employed for feature selection. Seven machine-learning methods were applied to develop prognostic models. The optimal model was chosen based on its accuracy, F1 score and area under the curve (AUC) in the validation cohort. Moreover, we used the SHapley Additive exPlanations (SHAP) method to quantify the contribution of each feature to the model and to analyze how individual features affect the model's output. Finally, Spearman correlation analysis examined the associations among continuous predictor variables, and restricted cubic splines (RCS) explored potential non-linear relationships between continuous risk factors and in-hospital mortality. RESULTS: A total of 3535 patients with sepsis were eligible for this study. The median age of the participants was 66 years (IQR, 55-77 years), and 56% were male. After selection, 12 of the 45 clinical parameters collected on the first day after ICU admission remained associated with prognosis and were used to develop machine-learning models. Among the seven constructed models, the eXtreme Gradient Boosting (XGBoost) model achieved the best performance, with an AUC of 0.94 and an F1 score of 0.937 in the validation cohort. Feature importance analysis revealed that age, AST, invasive ventilation treatment, and serum urea nitrogen (BUN) were the four features with the greatest impact on the XGBoost model. Inflammatory biomarkers may have prognostic value. Furthermore, SHAP force analysis visualized how the constructed model arrived at individual predictions. CONCLUSIONS: This study demonstrated the potential of machine-learning approaches for early prediction of outcomes in patients with sepsis. The SHAP method could improve the interpretability of machine-learning models and help clinicians better understand the reasoning behind the predicted outcome.
10.1186/s40001-024-01756-0
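The abstract above selects its model by accuracy, F1 score and AUC in the validation cohort. The following minimal sketch shows how accuracy and F1 are derived from binary predictions. The function name accuracy_and_f1 and the label vectors are hypothetical; they do not reproduce the study's data or its reported values.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels, with 1 as the positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


# Hypothetical outcomes (1 = died in hospital) and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}, F1 = {f1:.3f}")
```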
Machine learning for the prediction of acute kidney injury in patients with sepsis.
Journal of translational medicine
BACKGROUND: Acute kidney injury (AKI) is the most common and serious complication of sepsis, accompanied by high mortality and disease burden. Early prediction of AKI is critical for timely intervention and, ultimately, improved prognosis. This study aims to establish and validate predictive models based on novel machine learning (ML) algorithms for AKI in critically ill patients with sepsis. METHODS: Data of patients with sepsis were extracted from the Medical Information Mart for Intensive Care III (MIMIC-III) database. Feature selection was performed using the Boruta algorithm. ML algorithms including logistic regression (LR), k-nearest neighbors (KNN), support vector machine (SVM), decision tree, random forest, Extreme Gradient Boosting (XGBoost), and artificial neural network (ANN) were applied for model construction using tenfold cross-validation. The performance of these models was assessed in terms of discrimination, calibration, and clinical application. Moreover, the discrimination of the ML-based models was compared with that of the Sequential Organ Failure Assessment (SOFA) and the customized Simplified Acute Physiology Score (SAPS) II model. RESULTS: A total of 3176 critically ill patients with sepsis were included for analysis, of whom 2397 (75.5%) developed AKI during hospitalization. A total of 36 variables were selected for model construction. The LR, KNN, SVM, decision tree, random forest, ANN and XGBoost models and the SOFA and SAPS II scores achieved areas under the receiver operating characteristic curve of 0.7365, 0.6637, 0.7353, 0.7492, 0.7787, 0.7547, 0.821, 0.6457 and 0.7015, respectively. The XGBoost model had the best predictive performance in terms of discrimination, calibration, and clinical application among all models. CONCLUSION: The ML models can be reliable tools for predicting AKI in septic patients. The XGBoost model has the best predictive performance and can be used to assist clinicians in identifying high-risk patients and implementing early interventions to reduce mortality.
10.1186/s12967-022-03364-0
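The abstract above builds its models with tenfold cross-validation. The sketch below shows one plain way to generate the ten train/test index splits. It assumes a simple shuffled, unstratified split; the abstract does not state how the folds were formed. The function name k_fold_indices and the seed are hypothetical, and 3176 is the cohort size reported in the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation.

    Every sample appears in exactly one test fold; fold sizes differ by at
    most one. The split is not stratified by outcome (an assumption).
    """
    if k < 2 or n_samples < k:
        raise ValueError("need k >= 2 and at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(3176, k=10))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_number}: train={len(train_idx)}, test={len(test_idx)}")
```

In each round a model would be fitted on train_idx and scored on test_idx, and the ten scores averaged. With an outcome as imbalanced as the 75.5% AKI rate reported above, a stratified split that preserves the class ratio in every fold is a common alternative.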