Machine learning modeling and prognostic value analysis of invasion-related genes in cutaneous melanoma.
Computers in biology and medicine
In this study, we aimed to develop an invasion-related risk signature and prognostic model for personalized treatment and prognosis prediction in skin cutaneous melanoma (SKCM), as invasion plays a crucial role in this disease. We identified 124 differentially expressed invasion-associated genes (DE-IAGs) and selected 20 prognostic genes (TTYH3, NME1, ORC1, PLK1, MYO10, SPINT1, NUPR1, SERPINE2, HLA-DQB2, METTL7B, TIMP1, NOX4, DBI, ARL15, APOBEC3G, ARRB2, DRAM1, RNF213, C14orf28, and CPEB3) using Cox and LASSO regression to establish a risk score. Gene expression was validated through single-cell sequencing, protein expression, and transcriptome analysis. Using the ESTIMATE and CIBERSORT algorithms, we found that the risk score correlated negatively with both the immune score and the stromal score. High- and low-risk groups exhibited significant differences in immune cell infiltration and checkpoint molecule expression. The 20 prognostic genes effectively differentiated between SKCM and normal samples (AUCs >0.7). We identified 234 drugs targeting 6 genes from the DGIdb database. Our study provides potential biomarkers and a risk signature for personalized treatment and prognosis prediction in SKCM patients. We developed a nomogram and machine-learning prognostic model to predict 1-, 3-, and 5-year overall survival (OS) using the risk signature and clinical factors. The best model, Extra Trees Classifier (AUC = 0.88), was derived from PyCaret's comparison of 15 classifiers. The pipeline and app are accessible at https://github.com/EnyuY/IAGs-in-SKCM.
10.1016/j.compbiomed.2023.107089
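As an illustration of the risk score described in the abstract above, the following minimal sketch computes a linear score (coefficient times expression, summed over signature genes) and splits a cohort into high- and low-risk groups. The coefficients, the expression values, and the median cutoff are all assumptions made for illustration; the abstract reports neither the fitted coefficients nor the cutoff rule.

```python
from statistics import median

# Hypothetical coefficients for three of the 20 genes named in the abstract.
# The fitted Cox/LASSO values are not given there, so these are placeholders.
COEFS = {"PLK1": 0.30, "TIMP1": 0.20, "ARRB2": -0.25}


def risk_score(expression, coefs=COEFS):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(samples, coefs=COEFS):
    """Score every sample and label it relative to the cohort median.

    The median split is an assumption; other cutoffs are equally possible.
    """
    scores = {sid: risk_score(expr, coefs) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    labels = {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
    return labels, scores


# Made-up expression values for four made-up samples.
samples = {
    "s1": {"PLK1": 2.0, "TIMP1": 1.0, "ARRB2": 1.0},
    "s2": {"PLK1": 1.0, "TIMP1": 1.0, "ARRB2": 3.0},
    "s3": {"PLK1": 3.0, "TIMP1": 2.0, "ARRB2": 0.0},
    "s4": {"PLK1": 0.5, "TIMP1": 0.5, "ARRB2": 2.0},
}

labels, scores = stratify(samples)
for sid in samples:
    print(sid, round(scores[sid], 3), labels[sid])
```

In a real analysis the coefficients would come from the fitted penalized Cox model, and the high/low grouping would then feed the survival comparison.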
Characterization of cuproptosis in gastric cancer and relationship with clinical and drug reactions.
Frontiers in cell and developmental biology
Gastric cancer (GC) is the fifth most common cancer worldwide. Cuproptosis is associated with cell growth and death as well as tumorigenesis. To investigate the potential influence of cuproptosis-related genes (CRGs) in gastric cancer, we acquired datasets of gastric cancer patients from TCGA and GEO. Molecular subtypes were identified from CRG expression through unsupervised cluster analysis. To evaluate the application value of the subtypes, Kaplan-Meier (K-M) survival analysis was conducted to assess their clinical prognostic characteristics. Subsequently, we performed Gene Set Variation Analysis (GSVA) and used ssGSEA to quantify the extent of immune infiltration. K-M survival analysis was further used to identify the prognosis-related CRGs. Next, signature genes of diagnostic predictive value were screened from the TCGA expression matrix using the least absolute shrinkage and selection operator (LASSO) algorithm, and the signature gene-related subtypes were clustered with the "ConsensusClusterPlus" package. Finally, immunological and drug sensitivity assessments of the signature gene-related subtypes were conducted. A total of 173 CRGs were identified, most of which undergo copy number variation in gastric cancer. Immune cell levels differed significantly between patient subtypes, and the subtype exhibiting high expression of the CRGs had a better prognosis. Furthermore, we selected 34 CRGs that were highly correlated with the prognosis of gastric cancer. By constructing a multivariate Cox proportional-hazards model and a hazard scoring system, we were able to categorize patients into high- and low-risk groups based on their hazard score. K-M analysis demonstrated a significant survival disadvantage in the high-risk group.
Based on LASSO regression analysis, we screened 16 signature genes. A multivariate logistic regression model [cutoff: 0.149 (0.000, 0.974), AUC: 0.987] and a prognosis network diagram were constructed, and their prediction efficiency for gastric cancer prognostic diagnosis was well validated. According to the signature genes, the patients were separated into two signature subtypes. We found that patients with higher CRG expression and better prognosis had lower levels of immune infiltration. Finally, according to the results of the drug susceptibility analysis, gastric cancer was found to be more sensitive to docetaxel, 5-fluorouracil, gemcitabine, and paclitaxel.
10.3389/fcell.2023.1172895
A reinforcement learning model for AI-based decision support in skin cancer.
Nature medicine
We investigated whether human preferences hold the potential to improve diagnostic artificial intelligence (AI)-based decision support using skin cancer diagnosis as a use case. We utilized nonuniform rewards and penalties based on expert-generated tables, balancing the benefits and harms of various diagnostic errors, which were applied using reinforcement learning. Compared with supervised learning, the reinforcement learning model improved the sensitivity for melanoma from 61.4% to 79.5% (95% confidence interval (CI): 73.5-85.6%) and for basal cell carcinoma from 79.4% to 87.1% (95% CI: 80.3-93.9%). AI overconfidence was also reduced while simultaneously maintaining accuracy. Reinforcement learning increased the rate of correct diagnoses made by dermatologists by 12.0% (95% CI: 8.8-15.1%) and improved the rate of optimal management decisions from 57.4% to 65.3% (95% CI: 61.7-68.9%). We further demonstrated that the reward-adjusted reinforcement learning model and a threshold-based model outperformed naïve supervised learning in various clinical scenarios. Our findings suggest the potential for incorporating human preferences into image-based diagnostic algorithms.
10.1038/s41591-023-02475-5
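To make the idea of nonuniform rewards and penalties in the abstract above concrete, the sketch below applies a reward table at decision time: given class probabilities from any classifier, it picks the diagnosis with the highest expected reward. This is a simple expected-reward decision rule, not the reinforcement learning training procedure the study used, and the two-class table and its values are hypothetical rather than the expert-generated tables from the paper.

```python
# Hypothetical reward table, indexed as REWARD[true_class][predicted_class].
# Missing a melanoma is penalized more heavily than a false alarm.
REWARD = {
    "melanoma": {"melanoma": 1.0, "nevus": -5.0},
    "nevus": {"melanoma": -1.0, "nevus": 1.0},
}
CLASSES = ["melanoma", "nevus"]


def expected_reward(probs, predicted, reward=REWARD):
    """Expected reward of predicting `predicted` under class probabilities."""
    return sum(p * reward[true][predicted] for true, p in probs.items())


def best_action(probs, reward=REWARD, classes=CLASSES):
    """Return the prediction that maximizes expected reward."""
    return max(classes, key=lambda c: expected_reward(probs, c, reward))


def argmax_action(probs):
    """Plain most-probable-class prediction, for comparison."""
    return max(probs, key=probs.get)


uncertain = {"melanoma": 0.3, "nevus": 0.7}
confident = {"melanoma": 0.1, "nevus": 0.9}

print(argmax_action(uncertain), best_action(uncertain))
print(argmax_action(confident), best_action(confident))
```

With the asymmetric table, the uncertain case is called melanoma even though nevus is the more probable class, which is how such a table can trade specificity for sensitivity.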
Machine learning-driven blood transcriptome-based discovery of SARS-CoV-2 specific severity biomarkers.
Journal of medical virology
The coronavirus disease 2019 (COVID-19) pandemic, caused by rapidly evolving variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), continues to be a global health threat. SARS-CoV-2 infection symptoms often overlap with those of other nonsevere respiratory infections, making early diagnosis challenging. There is an urgent need for early diagnostic and prognostic biomarkers to predict severity and reduce mortality when a sudden outbreak occurs. This study implemented a novel approach of integrating bioinformatics and machine learning algorithms over publicly available clinical COVID-19 transcriptome data sets. The robust 7-gene biomarker identified through this analysis can discriminate not only SARS-CoV-2-associated acute respiratory illness (ARI) from other types of ARIs but also severe COVID-19 patients from nonsevere COVID-19 patients. Validation of the 7-gene biomarker in an independent longitudinal blood transcriptome data set of COVID-19 patients across various stages of the disease showed that the dysregulation of the identified biomarkers during severe disease is restored during recovery, showing their prognostic potential. The blood biomarkers identified in this study can serve as potential diagnostic candidates and help reduce COVID-19-associated mortality.
10.1002/jmv.28488
Identifying stroke-related quantified evidence from electronic health records in real-world studies.
Artificial intelligence in medicine
BACKGROUND:Stroke is one of the leading causes of death and disability worldwide. The National Institutes of Health Stroke Scale (NIHSS) scores in electronic health records (EHRs), which quantitatively describe patients' neurological deficits in evidence-based treatment, are crucial in stroke-related clinical investigations. However, the free-text format and lack of standardization inhibit their effective use. Automatically extracting the scale scores from the clinical free text so that its potential value in real-world studies is realized has become an important goal. OBJECTIVE:This study aims to develop an automated method to extract scale scores from the free text of EHRs. METHODS:We propose a two-step pipeline method to identify NIHSS items and numerical scores and validate its feasibility using a freely accessible critical care database: MIMIC-III (Medical Information Mart for Intensive Care III). First, we utilize MIMIC-III to create an annotated corpus. Then, we investigate possible machine learning methods for two subtasks, NIHSS item and score recognition and item-score relation extraction. In the evaluation, we conduct both task-specific and end-to-end evaluations and compare our method with the rule-based method using precision, recall and F1 scores as evaluation metrics. RESULTS:We use all available discharge summaries of stroke cases in MIMIC-III. The annotated NIHSS corpus contains 312 cases, 2929 scale items, 2774 scores and 2733 relations. The results show that the best F1-score of our method was 0.9006, which was attained by combining BERT-BiLSTM-CRF and Random Forest, and it outperformed the rule-based method (F1-score = 0.8098). In the end-to-end task, our method could successfully recognize the item "1b level of consciousness questions", the score "1" and their relation "('1b level of consciousness questions', '1', 'has value')" from the sentence "1b level of consciousness questions: said name = 1", while the rule-based method could not. 
CONCLUSIONS:The two-step pipeline method we propose is an effective approach to identify NIHSS items, scores and their relations. With its help, clinical investigators can easily retrieve and access structured scale data, thereby supporting stroke-related real-world studies.
10.1016/j.artmed.2023.102552
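The precision, recall and F1 metrics used in the abstract above can be sketched for end-to-end relation extraction as follows. The sketch assumes exact-match scoring over (item, score, relation) triples; the abstract does not state the matching criterion. The first gold triple is the example quoted in the abstract, while the remaining triples are invented for illustration.

```python
def prf1(gold, predicted):
    """Precision, recall and F1 over exact-match sets of triples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = {
    ("1b level of consciousness questions", "1", "has value"),
    ("hypothetical item A", "0", "has value"),  # invented for illustration
}
predicted = {
    ("1b level of consciousness questions", "1", "has value"),
    ("hypothetical item A", "2", "has value"),  # wrong score, so no match
}

precision, recall, f1 = prf1(gold, predicted)
print(precision, recall, f1)
```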
Improving prognosis and assessing adjuvant chemotherapy benefit in locally advanced rectal cancer with deep learning for MRI: A retrospective, multi-cohort study.
Radiotherapy and oncology : journal of the European Society for Therapeutic Radiology and Oncology
PURPOSE:Adjuvant therapy is recommended to minimize the risk of distant metastasis (DM) and local recurrence (LR) in patients with locally advanced rectal cancer (LARC). However, its role is controversial. We aimed to develop a pretreatment MRI-based deep learning model to predict LR, DM, and overall survival (OS) over 5 years after surgery and to identify patients benefitting from adjuvant chemotherapy (AC). MATERIALS AND METHODS:The multi-survival tasks network (MuST) model was developed in a primary cohort (n = 308) and validated using two external cohorts (n = 247, 245). An AC decision tree integrating the MuST-DM score, perineural invasion (PNI), and preoperative carbohydrate antigen 19-9 (CA19-9) was constructed to assess chemotherapy benefits and aid personalized treatment of patients. We also quantified the prognostic improvement of the decision tree. RESULTS:The MuST network demonstrated high prognostic accuracy in the primary and two external cohorts for the prediction of three different survival tasks. Within the stratified analysis and decision tree, patients with CA19-9 levels > 37 U/mL and high MuST-DM scores exhibited favorable chemotherapy efficacy. Similar results were observed in PNI-positive patients with low MuST-DM scores. PNI-negative patients with low MuST-DM scores exhibited poor chemotherapy efficacy. Based on the decision tree, 14 additional patients benefiting from AC and 391 patients who received over-treatment were identified in this retrospective study. CONCLUSION:The MuST model accurately and non-invasively predicted OS, DM, and LR. A specific and direct tool linking chemotherapy decisions and benefit quantification has also been provided.
10.1016/j.radonc.2023.109899
A deep learning nomogram kit for predicting metastatic lymph nodes in rectal cancer.
Ding Lei,Liu Guangwei,Zhang Xianxiang,Liu Shanglong,Li Shuai,Zhang Zhengdong,Guo Yuting,Lu Yun
Cancer medicine
BACKGROUND:Preoperative diagnoses of metastatic lymph nodes (LNs) by the most advanced deep learning technology of Faster Region-based Convolutional Neural Network (Faster R-CNN) have not yet been reported. MATERIALS AND METHODS:In total, 545 patients with pathologically confirmed rectal cancer between January 2016 and March 2019 were included and were randomly allocated with a split ratio of 2:1 to the training and validation sets, respectively. The MRI images for metastatic LNs were evaluated by Faster R-CNN. Multivariate regression analyses were used to develop the predictive models. Faster R-CNN nomograms were constructed based on the multivariate analyses in the training sets and were validated in the validation sets. RESULTS:The Faster R-CNN nomogram for predicting metastatic LN status contained predictors of age, metastatic LNs by Faster R-CNN and differentiation degrees of tumors, with areas under the curves (AUCs) of 0.862 (95% CI: 0.816-0.909) and 0.920 (95% CI: 0.876-0.964) in the training and validation sets, respectively. The Faster R-CNN nomogram for predicting LN metastasis degree contained predictors of metastatic LNs by Faster R-CNN and differentiation degrees of tumors, with AUCs of 0.859 (95% CI: 0.804-0.913) and 0.886 (95% CI: 0.822-0.950) in the training and validation sets, respectively. Calibration plots and decision curve analyses demonstrated good calibrations and clinical utilities. The two nomograms were used jointly as a kit for predicting metastatic LNs. CONCLUSION:The Faster R-CNN nomogram kit exhibits excellent performance in discrimination, calibration, and clinical utility and is convenient and reliable for predicting metastatic LNs preoperatively. CLINICAL TRIAL REGISTRATION:ChiCTR-DDD-17013842.
10.1002/cam4.3490
An MRI Deep Learning Model Predicts Outcome in Rectal Cancer.
Radiology
Background Deep learning (DL) models can potentially improve prognostication of rectal cancer but have not been systematically assessed. Purpose To develop and validate an MRI DL model for predicting survival in patients with rectal cancer based on segmented tumor volumes from pretreatment T2-weighted MRI scans. Materials and Methods DL models were trained and validated on retrospectively collected MRI scans of patients with rectal cancer diagnosed between August 2003 and April 2021 at two centers. Patients were excluded from the study if there were concurrent malignant neoplasms, prior anticancer treatment, incomplete course of neoadjuvant therapy, or no radical surgery performed. The Harrell C-index was used to determine the best model, which was applied to internal and external test sets. Patients were stratified into high- and low-risk groups based on a fixed cutoff calculated in the training set. A multimodal model was also assessed, which used DL model-computed risk score and pretreatment carcinoembryonic antigen level as input. Results The training set included 507 patients (median age, 56 years [IQR, 46-64 years]; 355 men). In the validation set (n = 218; median age, 55 years [IQR, 47-63 years]; 144 men), the best algorithm reached a C-index of 0.82 for overall survival. The best model reached hazard ratios of 3.0 (95% CI: 1.0, 9.0) in the high-risk group in the internal test set (n = 112; median age, 60 years [IQR, 52-70 years]; 76 men) and 2.3 (95% CI: 1.0, 5.4) in the external test set (n = 58; median age, 57 years [IQR, 50-67 years]; 38 men). The multimodal model further improved the performance, with a C-index of 0.86 and 0.67 for the validation and external test set, respectively. Conclusion A DL model based on preoperative MRI was able to predict survival of patients with rectal cancer. The model could be used as a preoperative risk stratification tool. Published under a CC BY 4.0 license. See also the editorial by Langs in this issue.
10.1148/radiol.222223
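The Harrell C-index used for model selection in the abstract above is the fraction of comparable patient pairs in which the patient with the shorter observed survival was assigned the higher risk. A minimal pure-Python sketch follows; it assumes right-censored data, counts tied risk scores as half-concordant, and ignores pairs with tied times, which is a simplification. The toy data are invented.

```python
def harrell_c_index(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented toy cohort: the third patient is censored.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
perfect = [0.9, 0.7, 0.5, 0.1]   # risk falls as survival time rises
reversed_ = [0.1, 0.5, 0.7, 0.9]

print(harrell_c_index(times, events, perfect))
print(harrell_c_index(times, events, reversed_))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported values of 0.82 and 0.86.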
Deep-learning-assisted analysis of echocardiographic videos improves predictions of all-cause mortality.
Nature biomedical engineering
Machine learning promises to assist physicians with predictions of mortality and of other future clinical events by learning complex patterns from historical data, such as longitudinal electronic health records. Here we show that a convolutional neural network trained on raw pixel data in 812,278 echocardiographic videos from 34,362 individuals provides superior predictions of one-year all-cause mortality. The model's predictions outperformed the widely used pooled cohort equations, the Seattle Heart Failure score (measured in an independent dataset of 2,404 patients with heart failure who underwent 3,384 echocardiograms), and a machine learning model involving 58 human-derived variables from echocardiograms and 100 clinical variables derived from electronic health records. We also show that cardiologists assisted by the model substantially improved the sensitivity of their predictions of one-year all-cause mortality by 13% while maintaining prediction specificity. Large unstructured datasets may enable deep learning to improve a wide range of clinical prediction models.
10.1038/s41551-020-00667-9
Deep learning for protein structure prediction and design - progress and applications.
Molecular systems biology
Proteins are the key molecular machines that orchestrate all biological processes of the cell. Most proteins fold into three-dimensional shapes that are critical for their function. Studying the 3D shape of proteins can inform us of the mechanisms that underlie biological processes in living cells and can have practical applications in the study of disease mutations or the discovery of novel drug treatments. Here, we review the progress made in sequence-based prediction of protein structures with a focus on applications that go beyond the prediction of single monomer structures. This includes the application of deep learning methods for the prediction of structures of protein complexes, different conformations, the evolution of protein structures and the application of these methods to protein design. These developments create new opportunities for research that will have impact across many areas of biomedical research.
10.1038/s44320-024-00016-x
Imaging research in fibrotic lung disease; applying deep learning to unsolved problems.
Walsh Simon L F,Humphries Stephen M,Wells Athol U,Brown Kevin K
The Lancet. Respiratory medicine
Over the past decade, there has been a groundswell of research interest in computer-based methods for objectively quantifying fibrotic lung disease on high-resolution CT of the chest. In the past 5 years, the arrival of deep learning-based image analysis has created exciting new opportunities for enhancing the understanding of, and the ability to interpret, fibrotic lung disease on CT. Specific unsolved problems for which computer-based imaging analysis might provide solutions include the development of reliable methods for assisting with diagnosis, detecting early disease, and predicting disease behaviour using baseline imaging data. However, to harness this technology, technical and societal challenges must be overcome. Large CT datasets will be needed to power the training of deep learning algorithms. Open science research and collaboration between academia and industry must be encouraged. Prospective clinical utility studies will be needed to test computer algorithm performance in real-world clinical settings and demonstrate patient benefit over current best practice. Finally, ethical standards that ensure patient confidentiality, mitigate biases in training datasets, and can be encoded in machine-learning systems will be needed, as well as bespoke data governance and accountability frameworks to encourage buy-in from health-care professionals, patients, and the public.
10.1016/S2213-2600(20)30003-5
Deep learning in image-based phenotypic drug discovery.
Trends in cell biology
Modern drug discovery approaches often use high-content imaging to systematically study the effect on cells of large libraries of chemical compounds. By automatically screening thousands or millions of images to identify specific drug-induced cellular phenotypes, for example, altered cellular morphology, these approaches can reveal 'hit' compounds offering therapeutic promise. In the past few years, artificial intelligence (AI) methods based on deep learning (DL) [a family of machine learning (ML) techniques] have disrupted virtually all image analysis tasks, from image classification to segmentation. These powerful methods also promise to impact drug discovery by accelerating the identification of effective drugs and their modes of action. In this review, we highlight applications and adaptations of ML, especially DL methods for cell-based phenotypic drug discovery (PDD).
10.1016/j.tcb.2022.11.011
Deep Learning for Cardiovascular Imaging: A Review.
JAMA cardiology
Importance:Artificial intelligence (AI), driven by advances in deep learning (DL), has the potential to reshape the field of cardiovascular imaging (CVI). While DL for CVI is still in its infancy, research is accelerating to aid in the acquisition, processing, and/or interpretation of CVI across various modalities, with several commercial products already in clinical use. It is imperative that cardiovascular imagers are familiar with DL systems, including a basic understanding of how they work, their relative strengths compared with other automated systems, and possible pitfalls in their implementation. The goal of this article is to review the methodology and application of DL to CVI in a simple, digestible fashion toward demystifying this emerging technology. Observations:At its core, DL is simply the application of a series of tunable mathematical operations that translate input data into a desired output. Based on artificial neural networks that are inspired by the human nervous system, there are several types of DL architectures suited to different tasks; convolutional neural networks are particularly adept at extracting valuable information from CVI data. We survey some of the notable applications of DL to tasks across the spectrum of CVI modalities. We also discuss challenges in the development and implementation of DL systems, including avoiding overfitting, preventing systematic bias, improving explainability, and fostering a human-machine partnership. Finally, we conclude with a vision of the future of DL for CVI. Conclusions and Relevance:Deep learning has the potential to meaningfully affect the field of CVI. Rather than a threat, DL could be seen as a partner to cardiovascular imagers in reducing technical burden and improving efficiency and quality of care. High-quality prospective evidence is still needed to demonstrate how the benefits of DL CVI systems may outweigh the risks.
10.1001/jamacardio.2023.3142
Machine Learning-Based Models Incorporating Social Determinants of Health vs Traditional Models for Predicting In-Hospital Mortality in Patients With Heart Failure.
JAMA cardiology
Importance:Traditional models for predicting in-hospital mortality for patients with heart failure (HF) have used logistic regression and do not account for social determinants of health (SDOH). Objective:To develop and validate novel machine learning (ML) models for HF mortality that incorporate SDOH. Design, Setting, and Participants:This retrospective study used the data from the Get With The Guidelines-Heart Failure (GWTG-HF) registry to identify HF hospitalizations between January 1, 2010, and December 31, 2020. The study included patients with acute decompensated HF who were hospitalized at the GWTG-HF participating centers during the study period. Data analysis was performed January 6, 2021, to April 26, 2022. External validation was performed in the hospitalization cohort from the Atherosclerosis Risk in Communities (ARIC) study between 2005 and 2014. Main Outcomes and Measures:Random forest-based ML approaches were used to develop race-specific and race-agnostic models for predicting in-hospital mortality. Performance was assessed using C index (discrimination), regression slopes for observed vs predicted mortality rates (calibration), and decision curves for prognostic utility. Results:The training data set included 123 634 hospitalized patients with HF who were enrolled in the GWTG-HF registry (mean [SD] age, 71 [13] years; 58 356 [47.2%] female individuals; 65 278 [52.8%] male individuals). Patients were analyzed in 2 categories: Black (23 453 [19.0%]) and non-Black (2121 [2.1%] Asian; 91 154 [91.0%] White, and 6906 [6.9%] other race and ethnicity). The ML models demonstrated excellent performance in the internal testing subset (n = 82 420) (C statistic, 0.81 for Black patients and 0.82 for non-Black patients) and in the real-world-like cohort with less than 50% missingness on covariates (n = 553 506; C statistic, 0.74 for Black patients and 0.75 for non-Black patients).
In the external validation cohort (ARIC registry; n = 1205 Black patients and 2264 non-Black patients), ML models demonstrated high discrimination and adequate calibration (C statistic, 0.79 and 0.80, respectively). Furthermore, the performance of the ML models was superior to the traditional GWTG-HF risk score model (C index, 0.69 for both race groups) and other rederived logistic regression models using race as a covariate. The performance of the ML models was identical using the race-specific and race-agnostic approaches in the GWTG-HF and external validation cohorts. In the GWTG-HF cohort, the addition of zip code-level SDOH parameters to the ML model with clinical covariates only was associated with better discrimination, prognostic utility (assessed using decision curves), and model reclassification metrics in Black patients (net reclassification improvement, 0.22 [95% CI, 0.14-0.30]; P < .001) but not in non-Black patients. Conclusions and Relevance:ML models for HF mortality demonstrated superior performance to the traditional and rederived logistic regression models using race as a covariate. The addition of SDOH parameters improved the prognostic utility of prediction models in Black patients but not non-Black patients in the GWTG-HF registry.
10.1001/jamacardio.2022.1900
Machine Learning Models ID Cancer Drivers.
Cancer discovery
Researchers have developed a machine learning method that could help advance research on tumorigenesis. Using large databases of human tumors, the team developed machine learning models that can identify driver and passenger mutations in specific cancer genes and determine the location and key features of cancer drivers.
10.1158/2159-8290.CD-NB2021-0376
Machine learning in haematological malignancies.
Radakovich Nathan,Nagy Matthew,Nazha Aziz
The Lancet. Haematology
Machine learning is a branch of computer science and statistics that generates predictive or descriptive models by learning from training data rather than by being rigidly programmed. It has attracted substantial attention for its many applications in medicine, both as a catalyst for research and as a means of improving clinical care across the cycle of diagnosis, prognosis, and treatment of disease. These applications include the management of haematological malignancy, in which machine learning has created inroads in pathology, radiology, genomics, and the analysis of electronic health record data. As computational power becomes cheaper and the tools for implementing machine learning become increasingly democratised, it is likely to become increasingly integrated into the research and practice landscape of haematology. As such, machine learning merits understanding and attention from researchers and clinicians alike. This narrative Review describes important concepts in machine learning for unfamiliar readers, details machine learning's current applications in haematological malignancy, and summarises important concepts for clinicians to be aware of when appraising research that uses machine learning.
10.1016/S2352-3026(20)30121-6
Deep learning for prediction of colorectal cancer outcome: a discovery and validation study.
Lancet (London, England)
BACKGROUND:Improved markers of prognosis are needed to stratify patients with early-stage colorectal cancer to refine selection of adjuvant therapy. The aim of the present study was to develop a biomarker of patient outcome after primary colorectal cancer resection by directly analysing scanned conventional haematoxylin and eosin stained sections using deep learning. METHODS:More than 12 000 000 image tiles from patients with a distinctly good or poor disease outcome from four cohorts were used to train a total of ten convolutional neural networks, purpose-built for classifying supersized heterogeneous images. A prognostic biomarker integrating the ten networks was determined using patients with a non-distinct outcome. The marker was tested on 920 patients with slides prepared in the UK, and then independently validated according to a predefined protocol in 1122 patients treated with single-agent capecitabine using slides prepared in Norway. All cohorts included only patients with resectable tumours, and a formalin-fixed, paraffin-embedded tumour tissue block available for analysis. The primary outcome was cancer-specific survival. FINDINGS:828 patients from four cohorts had a distinct outcome and were used as a training cohort to obtain clear ground truth. 1645 patients had a non-distinct outcome and were used for tuning. The biomarker provided a hazard ratio for poor versus good prognosis of 3·84 (95% CI 2·72-5·43; p<0·0001) in the primary analysis of the validation cohort, and 3·04 (2·07-4·47; p<0·0001) after adjusting for established prognostic markers significant in univariable analyses of the same cohort, which were pN stage, pT stage, lymphatic invasion, and venous vascular invasion. INTERPRETATION:A clinically useful prognostic marker was developed using deep learning allied to digital scanning of conventional haematoxylin and eosin stained tumour tissue sections. 
The assay has been extensively evaluated in large, independent patient populations, correlates with and outperforms established molecular and morphological prognostic markers, and gives consistent results across tumour and nodal stage. The biomarker stratified stage II and III patients into sufficiently distinct prognostic groups that potentially could be used to guide selection of adjuvant treatment by avoiding therapy in very low risk groups and identifying patients who would benefit from more intensive treatment regimes. FUNDING:The Research Council of Norway.
10.1016/S0140-6736(19)32998-8
Machine learning for ECG diagnosis and risk stratification of occlusion myocardial infarction.
Nature medicine
Patients with occlusion myocardial infarction (OMI) and no ST-elevation on presenting electrocardiogram (ECG) are increasing in numbers. These patients have a poor prognosis and would benefit from immediate reperfusion therapy, but, currently, there are no accurate tools to identify them during initial triage. Here we report, to our knowledge, the first observational cohort study to develop machine learning models for the ECG diagnosis of OMI. Using 7,313 consecutive patients from multiple clinical sites, we derived and externally validated an intelligent model that outperformed practicing clinicians and other widely used commercial interpretation systems, substantially boosting both precision and sensitivity. Our derived OMI risk score provided enhanced rule-in and rule-out accuracy relevant to routine care, and, when combined with the clinical judgment of trained emergency personnel, it helped correctly reclassify one in three patients with chest pain. ECG features driving our models were validated by clinical experts, providing plausible mechanistic links to myocardial injury.
10.1038/s41591-023-02396-3
Machine learning in rare disease.
Nature methods
High-throughput profiling methods (such as genomics or imaging) have accelerated basic research and made deep molecular characterization of patient samples routine. These approaches provide a rich portrait of genes, molecular pathways and cell types involved in disease phenotypes. Machine learning (ML) can be a useful tool for extracting disease-relevant patterns from high-dimensional datasets. However, depending upon the complexity of the biological question, machine learning often requires many samples to identify recurrent and biologically meaningful patterns. Rare diseases are inherently limited in clinical cases, leading to few samples to study. In this Perspective, we outline the challenges and emerging solutions for using ML for small sample sets, specifically in rare diseases. Advances in ML methods for rare diseases are likely to be informative for applications beyond rare diseases for which few samples exist with high-dimensional data. We propose that the method community prioritize the development of ML techniques for rare disease research.
10.1038/s41592-023-01886-z
Deep learning in cancer diagnosis, prognosis and treatment selection.
Genome medicine
Deep learning is a subdiscipline of artificial intelligence that uses a machine learning technique called artificial neural networks to extract patterns and make predictions from large data sets. The increasing adoption of deep learning across healthcare domains together with the availability of highly characterised cancer datasets has accelerated research into the utility of deep learning in the analysis of the complex biology of cancer. While early results are promising, this is a rapidly evolving field with new knowledge emerging in both cancer biology and deep learning. In this review, we provide an overview of emerging deep learning techniques and how they are being applied to oncology. We focus on the deep learning applications for omics data types, including genomic, methylation and transcriptomic data, as well as histopathology-based genomic inference, and provide perspectives on how the different data types can be integrated to develop decision support tools. We provide specific examples of how deep learning may be applied in cancer diagnosis, prognosis and treatment management. We also assess the current limitations and challenges for the application of deep learning in precision oncology, including the lack of phenotypically rich data and the need for more explainable deep learning models. Finally, we conclude with a discussion of how current obstacles can be overcome to enable future clinical utilisation of deep learning.
10.1186/s13073-021-00968-x