Use of cortical volume to predict response to temporary CSF drainage in patients with idiopathic normal pressure hydrocephalus

J Neurosurg 139:1776–1783, 2023

Temporary drainage of CSF with lumbar puncture or lumbar drainage has a high predictive value for identifying patients with suspected idiopathic normal pressure hydrocephalus (iNPH) who may benefit from ventriculoperitoneal shunt insertion. However, it is unclear what differentiates responders from nonresponders. The authors hypothesized that nonresponders to temporary CSF drainage would have patterns of reduced regional gray matter volume (GMV) as compared with those of responders. The objective of the current investigation was to compare regional GMV between temporary CSF drainage responders and nonresponders. Machine learning using extracted GMV was then used to predict outcomes.

METHODS This retrospective cohort study included 132 patients with iNPH who underwent temporary CSF drainage and structural MRI. Demographic and clinical variables were examined between groups. Voxel-based morphometry was used to calculate GMV across the brain. Group differences in regional GMV were assessed and correlated with change in results on the Montreal Cognitive Assessment (MoCA) and gait velocity. A support vector machine (SVM) model that used extracted GMV values and was validated with leave-one-out cross-validation was used to predict clinical outcome.
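As a rough illustration of the leave-one-out cross-validation scheme described above, the sketch below holds out one patient at a time, fits on the rest, and scores the held-out prediction. It is not the authors' pipeline: a nearest-centroid rule stands in for the SVM to keep the example dependency-free, and the GMV values, labels, and function names are all hypothetical.

```python
from statistics import mean


def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose mean feature vector is closest.

    Stand-in classifier (NOT an SVM), used only to keep the sketch stdlib-only.
    """
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [f for f, y in zip(train_x, train_y) if y == label]
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(features, labels):
    """Hold out each case once, train on the remainder, and average the hits."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        pred = nearest_centroid_predict(train_x, train_y, features[i])
        correct += pred == labels[i]
    return correct / len(features)


# Hypothetical GMV values (arbitrary units) for two regions per patient;
# 1 = responder, 0 = nonresponder.
gmv = [[0.62, 0.58], [0.60, 0.55], [0.65, 0.60],
       [0.48, 0.44], [0.50, 0.47], [0.46, 0.45]]
response = [1, 1, 1, 0, 0, 0]
loo_accuracy = leave_one_out_accuracy(gmv, response)
print(f"Leave-one-out accuracy: {loo_accuracy:.3f}")
```

Because every patient serves as the test case exactly once, the estimate uses all of the data without ever scoring a case the model was fitted on, which is why the scheme suits small cohorts.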

RESULTS There were 87 responders and 45 nonresponders. There were no group differences in terms of age, sex, baseline MoCA score, Evans index, presence of disproportionately enlarged subarachnoid space hydrocephalus, baseline total CSF volume, or baseline white matter T2-weighted hyperintensity volume (p > 0.05). Nonresponders demonstrated decreased GMV in the right supplementary motor area (SMA) and right posterior parietal cortex as compared with responders (p < 0.001, p < 0.05 with false discovery rate cluster correction). GMV in the posterior parietal cortex was associated with change in MoCA (r² = 0.075, p < 0.05) and gait velocity (r² = 0.076, p < 0.05). Response status was classified by the SVM with 75.8% accuracy.

CONCLUSIONS Decreased GMV in the SMA and posterior parietal cortex may help identify patients with iNPH who are unlikely to benefit from temporary CSF drainage. These patients may have limited capacity for recovery because of atrophy in these regions, which are known to be important for motor and cognitive integration. This study represents an important step toward improving patient selection and predicting clinical outcomes in the treatment of iNPH.

Network-level prediction of set-shifting deterioration after lower-grade glioma resection

J Neurosurg 137:1329–1337, 2022

The aim of this study was to predict set-shifting deterioration after resection of low-grade glioma.

METHODS The authors retrospectively analyzed a bicentric series of 102 patients who underwent surgery for low-grade glioma. The difference between the completion times of the Trail Making Test parts B and A (TMT B-A) was evaluated preoperatively and 3–4 months after surgery. High dimensionality of the information related to the surgical cavity topography was reduced to a small set of predictors in four different ways: 1) overlap between surgical cavity and each of the 122 cortical parcels composing Yeo’s 17-network parcellation of the brain; 2) Tractotron: disconnection by the cavity of the major white matter bundles; 3) overlap between the surgical cavity and each of Yeo’s networks; and 4) disconets: signature of structural disconnection by the cavity of each of Yeo’s networks. A random forest algorithm was implemented to predict the postoperative change in the TMT B-A z-score.
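The overlap-based predictors (approaches 1 and 3 above) can be pictured as the percentage of each parcel or network that falls inside the surgical cavity. The sketch below assumes masks are available as sets of voxel coordinates. The names `overlap_percent`, `network_A`, and `network_B` and the toy masks are hypothetical, and the 7.5% cutoff is borrowed from the RESULTS purely for illustration, where it applies to one specific network.

```python
def overlap_percent(cavity, parcel):
    """Percentage of the parcel's voxels that lie inside the cavity mask."""
    if not parcel:
        return 0.0
    return 100.0 * len(cavity & parcel) / len(parcel)


# Hypothetical binary masks expressed as sets of (x, y, z) voxel coordinates.
cavity = {(x, y, 0) for x in range(0, 4) for y in range(0, 4)}  # 16 voxels
parcels = {
    "network_A": {(x, y, 0) for x in range(2, 12) for y in range(0, 10)},   # 100 voxels
    "network_B": {(x, y, 0) for x in range(20, 30) for y in range(0, 10)},  # 100 voxels
}

# One predictor per network: the share of that network removed by the cavity.
predictors = {name: overlap_percent(cavity, voxels) for name, voxels in parcels.items()}

# Illustrative threshold only; flags networks whose damage exceeds 7.5%.
flagged = [name for name, pct in predictors.items() if pct > 7.5]
print(predictors, flagged)
```

Reducing the cavity to one number per network is what turns a high-dimensional lesion map into a handful of inputs that a random forest can use with about a hundred patients.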

RESULTS The last two network-based approaches yielded significant accuracies in left-out subjects (area under the receiver operating characteristic curve [AUC] approximately equal to 0.8, p approximately equal to 0.001) and outperformed the two alternatives. In single tree hierarchical models, the degree of damage to Yeo corticocortical network 12 (CC 12) was a critical node: patients with damage to CC 12 higher than 7.5% (cortical overlap) or 7.2% (disconets) had a much higher risk of deterioration, establishing for the first time a causal link between damage to this network and impaired set-shifting.

CONCLUSIONS The authors’ results give strong support to the idea that network-level approaches are a powerful way to address the lesion-symptom mapping problem, enabling machine learning–powered individual outcome predictions.

Survival Prediction After Neurosurgical Resection of Brain Metastases: A Machine Learning Approach

Neurosurgery 91:381–388, 2022

Current prognostic models for brain metastases (BMs) have been constructed and validated almost entirely with data from patients receiving up-front radiotherapy, leaving uncertainty about surgical patients.

OBJECTIVE: To build and validate a model predicting 6-month survival after BM resection using different machine learning algorithms.

METHODS: An institutional database of 1062 patients who underwent resection for BM was split into an 80:20 training and testing set. Seven different machine learning algorithms were trained and assessed for performance; an established prognostic model for patients with BM undergoing radiotherapy, the diagnosis-specific graded prognostic assessment, was also evaluated. Model performance was assessed using area under the curve (AUC) and calibration.
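For readers unfamiliar with the AUC used here, it can be read as the probability that a randomly chosen patient who had the event receives a higher predicted risk than a randomly chosen patient who did not. The sketch below computes it directly from that definition. The labels and risk scores are invented, and `roc_auc` is a hypothetical helper, not part of the published model.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical hold-out set: 1 = died within 6 months, 0 = alive at 6 months.
y = [1, 1, 1, 0, 0, 0, 0]
risk = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.4]
auc = roc_auc(y, risk)
print(f"AUC = {auc:.3f}")
```

AUC measures ranking only, which is why the study reports calibration alongside it: a model can order patients well while still over- or underestimating their absolute risk.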

RESULTS: The logistic regression showed the best performance with an AUC of 0.71 in the hold-out test set, a calibration slope of 0.76, and a calibration intercept of 0.03. The diagnosis-specific graded prognostic assessment had an AUC of 0.66. Patients were stratified into regular-risk, high-risk and very high-risk groups for death at 6 months; these strata strongly predicted both 6-month and longitudinal overall survival (P < .0005). The model was implemented into a web application that can be accessed through http://

CONCLUSION: We developed and internally validated a prediction model that accurately predicts 6-month survival after neurosurgical resection for BM and allows for meaningful risk stratification. Future efforts should focus on external validation of our model.

Artificial intelligence in predicting early‑onset adjacent segment degeneration following anterior cervical discectomy and fusion

European Spine Journal (2022) 31:2104–2114

Anterior cervical discectomy and fusion (ACDF) is a common surgical treatment for degenerative disease in the cervical spine. However, resultant biomechanical alterations may predispose to early-onset adjacent segment degeneration (EO-ASD), which may become symptomatic and require reoperation. This study aimed to develop and validate a machine learning (ML) model to predict EO-ASD following ACDF.

Methods Retrospective review of prospectively collected data of patients undergoing ACDF at a quaternary referral medical center was performed. Patients > 18 years of age with > 6 months of follow-up and complete pre- and postoperative X-ray and MRI imaging were included. An ML-based algorithm was developed to predict EO-ASD based on preoperative demographic, clinical, and radiographic parameters, and model performance was evaluated according to discrimination and overall performance.

Results In total, 366 ACDF patients were included (50.8% male, mean age 51.4 ± 11.1 years). Over 18.7 ± 20.9 months of follow-up, 97 (26.5%) patients developed EO-ASD. The model demonstrated good discrimination and overall performance according to precision (EO-ASD: 0.70, non-ASD: 0.88), recall (EO-ASD: 0.73, non-ASD: 0.87), accuracy (0.82), F1-score (0.79), Brier score (0.203), and AUC (0.794), with C4/C5 posterior disc bulge, C4/C5 anterior disc bulge, C6 posterior superior osteophyte, presence of osteophytes, and C6/C7 anterior disc bulge identified as the most important predictive features.
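The performance measures listed above all derive from a model's predicted probabilities and the observed outcomes. The sketch below shows how precision, recall, F1-score, accuracy, and the Brier score relate to one another. The eight example patients, the 0.5 decision threshold, and the function names are assumptions made for illustration and do not reproduce the study's figures.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1, and accuracy for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": (tp + tn) / len(pairs),
    }


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


# Hypothetical cohort: 1 = developed EO-ASD, 0 = did not.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.8, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in probs]  # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
brier = brier_score(y_true, probs)
print(metrics, round(brier, 3))
```

Precision and recall depend on the chosen threshold, whereas the Brier score is computed on the probabilities themselves, so the two families of measures answer different questions about the same model.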

Conclusions Through an ML approach, the model identified risk factors and predicted development of EO-ASD following ACDF with good discrimination and overall performance. By addressing the shortcomings of traditional statistics, ML techniques can support discovery, clinical decision-making, and precision-based spine care.

Prediction of Shunt Responsiveness in Suspected Patients With Normal Pressure Hydrocephalus Using the Lumbar Infusion Test: A Machine Learning Approach

Neurosurgery 90:407–418, 2022

Machine learning (ML) approaches can significantly improve the classical Rout-based evaluation of the lumbar infusion test (LIT) and the clinical management of normal pressure hydrocephalus.

OBJECTIVE: To develop an ML model that accurately identifies patients as candidates for permanent cerebral spinal fluid shunt implantation using only intracranial pressure and electrocardiogram signals recorded throughout LIT.

METHODS: This was a single-center cohort study of prospectively collected data of 96 patients who underwent LIT and 5-day external lumbar cerebral spinal fluid drainage (external lumbar drainage) as a reference diagnostic method. A set of 48 selected intracranial pressure/electrocardiogram complex signal waveform features describing nonlinear behavior, wavelet transform spectral signatures, or recurrent map patterns was calculated for each patient. After applying a leave-one-out cross-validation training–testing split of the data set, we trained and evaluated the performance of various state-of-the-art ML algorithms.

RESULTS: The highest performing ML algorithm was eXtreme Gradient Boosting. This model showed good calibration and discrimination on the testing data, with an area under the receiver operating characteristic curve of 0.891 (accuracy: 82.3%, sensitivity: 86.1%, and specificity: 73.9%) obtained for 8 selected features. Our ML model clearly outperforms the classical Rout-based manual classification commonly used in clinical practice, which reached an accuracy of 62.5%.

CONCLUSION: This study successfully used the ML approach to predict the outcome of a 5-day external lumbar drainage and hence which patients are likely to benefit from permanent shunt implantation. Our automated ML model thus enhances the diagnostic utility of LIT in management.

Deep Learning for Outcome Prediction in Neurosurgery

Neurosurgery 90:16–38, 2022

Deep learning (DL) is a powerful machine learning technique that has increasingly been used to predict surgical outcomes. However, the large quantity of data required and lack of model interpretability represent substantial barriers to the validity and reproducibility of DL models.

The objective of this study was to systematically review the characteristics of DL studies involving neurosurgical outcome prediction and to assess their bias and reporting quality.

Literature search using the PubMed, Scopus, and Embase databases identified 1949 records of which 35 studies were included. Of these, 32 (91%) developed and validated a DL model while 3 (9%) validated a pre-existing model. The most commonly represented subspecialty areas were oncology (16 of 35, 46%), spine (8 of 35, 23%), and vascular (6 of 35, 17%). Risk of bias was low in 18 studies (51%), unclear in 5 (14%), and high in 12 (34%), most commonly because of data quality deficiencies.

Adherence to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting standards was low, with a median of 12 TRIPOD items (39%) per study not reported. Model transparency was severely limited because code was provided in only 3 studies (9%) and final models in 2 (6%).

With the exception of public databases, no study data sets were readily available. No studies described DL models as ready for clinical use. The use of DL for neurosurgical outcome prediction remains nascent. Lack of appropriate data sets poses a major concern for bias. Although studies have demonstrated promising results, greater transparency in model development and reporting is needed to facilitate reproducibility and validation.

Predicting Spinal Surgery Candidacy From Imaging Data Using Machine Learning

Neurosurgery 89:116–121, 2021

The referral process for consultation with a spine surgeon remains inefficient, given that a substantial proportion of referrals to spine surgeons are nonoperative.

OBJECTIVE: To develop a machine-learning-based algorithm which accurately identifies patients as candidates for consultation with a spine surgeon, using only magnetic resonance imaging (MRI).

METHODS: We trained a deep U-Net machine learning model to delineate spinal canals on axial slices of 100 normal lumbar MRI scans which were previously delineated by expert radiologists and neurosurgeons. We then tested the model against lumbar MRI scans for 140 patients who had undergone lumbar spine MRI at our institution (60 of whom ultimately underwent surgery, and 80 of whom did not). The model generated automated segmentations of the lumbar spinal canals and calculated a maximum degree of spinal stenosis for each patient, which served as our biomarker for surgical pathology warranting expert consultation.
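The abstract does not say how the maximum degree of stenosis was derived from the segmentations, so the sketch below is only one plausible reading. It assumes stenosis on each axial slice is the fractional loss of canal area relative to the patient's widest slice, and takes the largest such value. The tiny binary masks and the function names `canal_area` and `max_stenosis` are hypothetical.

```python
def canal_area(mask):
    """Count canal voxels in one axial slice (mask is a 2D list of 0/1)."""
    return sum(sum(row) for row in mask)


def max_stenosis(slice_masks):
    """Largest fractional area reduction across slices.

    ASSUMPTION: the reference is the patient's own widest slice; the study may
    have used a different reference or a different stenosis definition.
    """
    areas = [canal_area(m) for m in slice_masks]
    reference = max(areas)
    if reference == 0:
        raise ValueError("no canal voxels were segmented")
    return max(1 - area / reference for area in areas)


# Hypothetical U-Net outputs for three axial slices of one patient.
wide = [[0, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0]]    # 12 voxels
medium = [[0, 1, 1, 0], [0, 1, 1, 0], [1, 1, 1, 1], [0, 1, 1, 0]]  # 10 voxels
narrow = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]  # 3 voxels

biomarker = max_stenosis([wide, medium, narrow])
print(f"Maximum stenosis: {biomarker:.2f}")
```

Collapsing each scan to a single scalar is what allows the biomarker to be thresholded and evaluated with an ROC curve, as reported in the RESULTS.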

RESULTS: The machine learning model correctly predicted surgical candidacy (ie, whether patients ultimately underwent lumbar spinal decompression) with high accuracy (area under the curve = 0.88), using only imaging data from lumbar MRI scans.

CONCLUSION: Automated interpretation of lumbar MRI scans was sufficient to correctly determine surgical candidacy in nearly 90% of cases. Given that a significant proportion of referrals placed for spine surgery evaluation fail to meet criteria for surgical intervention, our model could serve as a valuable tool for patient triage and thereby address some of the inefficiencies within the outpatient surgical referral process.

Machine Learning for the Prediction of Molecular Markers in Glioma on Magnetic Resonance Imaging: A Systematic Review and Meta-Analysis

Neurosurgery 89:31–44, 2021

Molecular characterization of glioma has implications for prognosis, treatment planning, and prediction of treatment response. Current histopathology is limited by intratumoral heterogeneity and variability in detection methods. Advances in computational techniques have led to interest in mining quantitative imaging features to noninvasively detect genetic mutations.

OBJECTIVE: To evaluate the diagnostic accuracy of machine learning (ML) models in molecular subtyping gliomas on preoperative magnetic resonance imaging (MRI).

METHODS: A systematic search was performed following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analysis) guidelines to identify studies up to April 1, 2020. Methodological quality of studies was assessed using the Quality Assessment for Diagnostic Accuracy Studies (QUADAS)-2. Diagnostic performance estimates were obtained using a bivariate model and heterogeneity was explored using metaregression.

RESULTS: Forty-four original articles were included. The pooled sensitivity and specificity for predicting isocitrate dehydrogenase (IDH) mutation in training datasets were 0.88 (95% CI 0.83-0.91) and 0.86 (95% CI 0.79-0.91), respectively, and 0.83 to 0.85 in validation sets. Use of data augmentation and MRI sequence type were weakly associated with heterogeneity. Both O6-methylguanine-DNA methyltransferase (MGMT) gene promoter methylation and 1p/19q codeletion could be predicted with a pooled sensitivity and specificity between 0.76 and 0.83 in training datasets.

CONCLUSION: ML application to preoperative MRI demonstrated promising results for predicting IDH mutation, MGMT methylation, and 1p/19q codeletion in glioma. Optimized ML models could lead to a noninvasive, objective tool that captures molecular information important for clinical decision-making. Future studies should use multicenter data and external validation and should investigate the clinical feasibility of ML models.

Correlations between genomic subgroup and clinical features in a cohort of more than 3000 meningiomas

J Neurosurg 133:1345–1354, 2020

Recent large-cohort sequencing studies have investigated the genomic landscape of meningiomas, identifying somatic coding alterations in NF2, SMARCB1, SMARCE1, TRAF7, KLF4, POLR2A, BAP1, and members of the PI3K and Hedgehog signaling pathways. Initial associations between clinical features and genomic subgroups have been described, including location, grade, and histology. However, further investigation using an expanded collection of samples is needed to confirm previous findings, as well as elucidate relationships not evident in smaller discovery cohorts.

METHODS Targeted sequencing of established meningioma driver genes was performed on a multi-institution cohort of 3016 meningiomas for classification into mutually exclusive subgroups. Relevant clinical information was collected for all available cases and correlated with genomic subgroup. Nominal variables were analyzed using Fisher’s exact tests, while ordinal and continuous variables were assessed using Kruskal-Wallis and 1-way ANOVA tests, respectively. Machine-learning approaches were used to predict genomic subgroup based on noninvasive clinical features.
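To make the nominal-variable analysis concrete, the sketch below implements a two-sided Fisher's exact test for a 2x2 table using the hypergeometric distribution. The counts are invented (for example, one genomic subgroup versus all others, cross-tabulated by sex) and `fisher_exact_two_sided` is a hypothetical helper, not the authors' code.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same margins
    that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Hypothetical counts: rows = subgroup vs. other, columns = male vs. female.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```

The exact test makes no large-sample approximation, which matters when a rare subgroup leaves some cells of the table with only a few tumors.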

RESULTS Genomic subgroups were strongly associated with tumor locations, including correlation of HH tumors with midline location, and non-NF2 tumors in anterior skull base regions. NF2 meningiomas were significantly enriched in male patients, while KLF4 and POLR2A mutations were associated with female sex. Among histologies, the results confirmed previously identified relationships, and observed enrichment of microcystic features among “mutation unknown” samples. Additionally, KLF4-mutant meningiomas were associated with larger peritumoral brain edema, while SMARCB1 cases exhibited elevated Ki-67 index. Machine-learning methods revealed that observable, noninvasive patient features were largely predictive of each tumor’s underlying driver mutation.

CONCLUSIONS Using a rigorous and comprehensive approach, this study expands previously described correlations between genomic drivers and clinical features, enhancing our understanding of meningioma pathogenesis, and laying further groundwork for the use of targeted therapies. Importantly, the authors found that noninvasive patient variables exhibited a moderate predictive value of underlying genomic subgroup, which could improve with additional training data. With continued development, this framework may enable selection of appropriate precision medications without the need for invasive sampling procedures.

An Online Calculator for the Prediction of Survival in Glioblastoma Patients Using Classical Statistics and Machine Learning

Neurosurgery, Volume 86, Issue 2, February 2020, Pages E184–E192

Although survival statistics in patients with glioblastoma multiforme (GBM) are well-defined at the group level, predicting individual patient survival remains challenging because of significant variation within strata.

OBJECTIVE: To compare statistical and machine learning algorithms in their ability to predict survival in GBM patients and deploy the best performing model as an online survival calculator.

METHODS: Patients undergoing an operation for a histopathologically confirmed GBM were extracted from the Surveillance Epidemiology and End Results (SEER) database (2005-2015) and split into a training and hold-out test set in an 80/20 ratio. Fifteen statistical and machine learning algorithms were trained based on 13 demographic, socioeconomic, clinical, and radiographic features to predict overall survival, 1-yr survival status, and compute personalized survival curves.

RESULTS: In total, 20 821 patients met our inclusion criteria. The accelerated failure time model demonstrated superior performance in terms of discrimination (concordance index = 0.70), calibration, interpretability, predictive applicability, and computational efficiency compared to Cox proportional hazards regression and other machine learning algorithms. This model was deployed through a free, publicly available software interface (
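The concordance index reported above can be understood as the share of comparable patient pairs in which the patient predicted to be at higher risk actually died first. The sketch below computes Harrell's version of the statistic in plain Python. The survival times, censoring flags, and risk scores are made up, and the function is a hypothetical illustration rather than the study's implementation.

```python
def concordance_index(times, events, risk):
    """Harrell's C for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) strictly before patient j's follow-up time. It is
    concordant when i also has the higher predicted risk; ties in risk
    count as half.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up in months; event 1 = death observed, 0 = censored.
times = [5, 8, 12, 20, 24]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.5, 0.6, 0.3, 0.1]
c_index = concordance_index(times, events, risk)
print(f"Concordance index = {c_index:.3f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported 0.70 indicates moderate discrimination, in line with the CONCLUSION's point that a single metric should not decide which model to deploy.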

CONCLUSION: The development and deployment of survival prediction tools require a multimodal assessment rather than a single metric comparison. This study provides a framework for the development of prediction tools in cancer patients, as well as an online survival calculator for patients with GBM. Future efforts should improve the interpretability, predictive applicability, and computational efficiency of existing machine learning algorithms, increase the granularity of population-based registries, and externally validate the proposed prediction tool.

Machine learning—aided personalized DTI tractographic planning for deep brain stimulation of the superolateral medial forebrain bundle using HAMLET

Acta Neurochirurgica (2019) 161:1559–1569

Growing interest exists for superolateral medial forebrain bundle (slMFB) deep brain stimulation (DBS) in psychiatric disorders. The surgical approach warrants tractographic rendition. Commercial stereotactic planning systems use deterministic tractography which suffers from inherent limitations, is dependent on manual interaction (ROI definition), and has to be regarded as subjective. We aimed to develop an objective but patient-specific tracking of the slMFB which at the same time allows the use of a commercial surgical planning system in the context of deep brain stimulation.

Methods The HAMLET (Hierarchical Harmonic Filters for Learning Tracts from Diffusion MRI) machine learning approach was introduced into the standardized workflow of slMFB DBS tractographic planning on the basis of patient-specific dMRI. Rendition of the slMFB with HAMLET serves as an objective comparison for the refinement of the deterministic tracking procedure. Our application focuses on the tractographic planning of DBS (N = 8) for major depression and OCD.

Results Previous results have shown that only fibers belonging to the ventral tegmental area to prefrontal/orbitofrontal axis should be targeted. With the proposed technique, the deterministic tracking approach, which serves as the surgical planning data, can be refined: over-sprouting fibers are eliminated and bundle thickness is reduced in the target region, which probably facilitates more accurate targeting. The HAMLET-driven method is meant to achieve a more objective surgical fiber display of the slMFB with deterministic tractography.

Conclusions The approach allows overlying the results of patient-specific planning from two different approaches (manual deterministic and machine learning HAMLET). HAMLET shows the slMFB as a volume and thus serves as an objective tracking corridor. It helps to refine results from deterministic tracking in the surgical workspace without interfering with any part of the standard software solution. We have now included this workflow in our daily clinical experimental work on slMFB DBS for psychiatric indications.