Characterization of cerebral small vessel disease by neutrophil and platelet activation markers using artificial intelligence



Introduction
Cerebral small vessel disease (cSVD) is a collective term for different pathological processes that affect the small vessels of the brain, i.e. the small arteries and veins and the capillary beds. cSVD accounts for a quarter of ischemic strokes and is a main cause of vascular dementia (Debette and Markus, 2010, Wardlaw et al., 2019). The most common type of cSVD is age- and hypertension-related sporadic SVD. In current clinical practice, magnetic resonance imaging (MRI) is used for diagnosis. Pathologic MRI features, e.g. white matter hyperintensities (WMH), lacunar infarcts, microbleeds, perivascular spaces, and cerebral atrophy, are common in cSVD patients and considered diagnostic for cSVD (Wardlaw et al., 2013). There is no causative treatment with proven efficacy against cSVD, and current therapeutic options are limited to general cardiovascular risk management, such as lowering blood pressure and plasma cholesterol, and antiplatelet therapy. Despite advances during the last decades in the field of neuroimaging and biomarkers, the pathogenesis of cSVD is still poorly characterized.
Although pathologic features typical for cSVD have been identified, the definition of distinct pathologic stages for cSVD is still challenging (Mustapha et al., 2019). It is hypothesized that a loss of the ability to regulate cerebral blood flow in response to variations in blood pressure during ageing may initiate its development. In addition, hypertension and increased arterial stiffness may result in increased blood flow velocities and increased pulsatility in the cerebral arterioles. These hemodynamic changes might subsequently lead to dysfunctional cerebral microvascular endothelium, disrupting intercellular communication with perivascular cells and oligodendrocytes or their precursors (Mustapha, Nassir, 2019) and leading to an alteration of blood-brain barrier (BBB) integrity and permeability (Zhang et al., 2017). There is increasing evidence that endothelial dysfunction and BBB leakage are involved in cSVD pathophysiology (Cuadrado-Godia et al., 2018, Skoog et al., 1998, Wardlaw et al., 2003, Zhang, Wong, 2017), which is supported by circulating biomarkers of endothelial dysfunction, e.g. soluble forms of ICAM-1, VCAM-1, CD62E and CD62P (Poggesi et al., 2016, Rouhl et al., 2012). Next to endothelial dysfunction, inflammatory responses and leukocyte infiltration are common pathological features of cSVD and may also contribute to, or further propagate, its pathogenesis (Fu and Yan, 2018, Koizumi et al., 2019, Low et al., 2019).
Previous studies have shown a relation between endothelial cell activation and endothelial dysfunction in patients with WMH and lacunar infarction (de Leeuw et al., 2002, Fassbender et al., 1999, Fornage et al., 2008, Hassan et al., 2003). In recent years, a possible role of neutrophils in cSVD has gained increasing attention. For example, the neutrophil-to-lymphocyte ratio, a marker of systemic inflammation, was found to be associated with WMH in healthy people (Nam et al., 2017).
Neutrophils and neutrophil extracellular traps (NETs; nuclear DNA expelled by neutrophils) contain many cytotoxic and inflammatory compounds, with the potential to induce endothelial activation and damage (Xu et al., 2009) and the production of the cytokine interleukin-1 (Folco et al., 2018), and to cause ischemic brain injury and BBB disruption (Armao et al., 1997, Segel et al., 2011). Thus, markers of neutrophil activation such as free or DNA-associated myeloperoxidase (MPO) and calprotectin (S100A8/A9) might help unravel a part of the pathogenesis of cSVD and might offer possibilities for identification and classification of individuals with this illness.

Journal Pre-proof
The vasculature is in tight contact with platelets. Platelets rapidly respond to endothelial damage and they are important for the maintenance of vascular integrity (Gupta et al., 2020, Ho-Tin-Noé et al., 2018), which is particularly relevant in the brain during development or inflammation (Farley et al., 2021, Goerge et al., 2008, Lowe et al., 2015). Platelets can also rapidly induce endothelial activation (Henn et al., 1998, Huo et al., 2003). In turn, inflamed endothelium can activate platelets, a process that may subsequently contribute to thrombosis or to the propagation of inflammation by stimulating endothelial-leukocyte interactions (Coenen et al., 2021, Coenen et al., 2017, Martins et al., 2006). Platelets can also initiate and propagate vascular inflammation by the release of chemokines. The chemokines CXCL4 and CXCL7, abundantly released after platelet activation, are established neutrophil activators and attractants (Kasper et al., 2004, Schenk et al., 2002), and evidence exists that these chemokines act as a link between platelet activation and neutrophil effector functions. For example, platelet-derived CXCL7 directs the migration of neutrophils through arterial platelet thrombi during ischemia-reperfusion injury (Ghasemzadeh et al., 2013) and CXCL4 was found to induce the formation of NETs during experimental lung and heart injury in mice (Rossaint et al., 2014, Vajen et al., 2018). Interestingly, CXCL4 was also linked to NET formation in patients with ANCA-associated vasculitis, highlighting CXCL4 as a functional connection between platelets and neutrophils during vascular inflammation (Matsumoto et al., 2021). A recent study suggested an association of CXCL4 and CXCL7 with neutrophil counts and activation in a cohort of patients with acute ischemic stroke (Kollikowski et al., 2021).
Given the involvement of platelets in the maintenance of vascular integrity and of neutrophils in endothelial damage, we hypothesized that platelet, neutrophil and (endothelial) inflammation markers are altered in patients with cSVD. A further aim of this study was to investigate whether these markers add value to the MRI-based practice in the identification of cSVD patients.

2. Methods

Patient population
Patient recruitment and extensive patient (microvascular) phenotyping was part of a previously published study on BBB integrity in cSVD (Zhang, Wong, 2017). Patients with mild vascular cognitive impairment (mVCI, n=36) due to cSVD or first-ever lacunar stroke (Laci, n=44) were recruited between April 2013 and December 2014 at the Maastricht University Medical Centre and Zuyderland Medical Centre, the Netherlands. Cardiac embolic stroke (e.g. due to atrial fibrillation) or ipsilateral carotid stenosis of ≥ 50% were exclusion criteria. In addition, blood sampling and MRI measurements were performed at least 3 months after diagnosis to exclude changes due to acute stroke (Zhang, Wong, 2017). In some analysis procedures, the mVCI and Laci groups were pooled as a single cSVD group. For every two cSVD patients, one age- and sex-matched healthy control (n=38) was included.
This study was performed in accordance with the Declaration of Helsinki and approved by the Medical Ethical Committee of Maastricht University Medical Centre. All participants gave written informed consent.

Blood samples
Blood samples, drawn by venepuncture in tubes containing EDTA, were centrifuged at 2000 g for 10 minutes at 4 °C without brake to obtain platelet-poor plasma (PPP). Plasma samples were stored at -80 °C until further analysis.

[...] and wavelength correction was set to 540 nm, using an EL808 Ultra Microplate Reader and Gen5 software (Bio-Tek Instruments, Winooski, VT). Standard curves and sample concentrations were calculated using a four-parameter logistic regression algorithm in Python 3. MPO-DNA was measured in 96-well plates using 3 flashes per well, measuring fluorescence at 480 nm excitation and 520 nm emission with a FlexStation 3 (Molecular Devices, San Jose, CA). ELISA measurements of all patient and matching control samples were performed using the same assay, simultaneously and in the same institution.
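As an illustration of the four-parameter logistic (4PL) standard curve mentioned above, the following minimal sketch shows the usual 4PL function and its inverse, which converts a measured optical density back into a concentration. It is not the authors' code: the function names and all parameter values are hypothetical, and the fitting step that estimates the parameters from the standards (for example by non-linear least squares) is not shown.

```python
def four_pl(x, a, b, c, d):
    """4PL response at concentration x.

    a: response at zero concentration, d: response at infinite concentration,
    c: inflection point (concentration at the half-maximal response),
    b: slope factor.
    """
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_pl(y, a, b, c, d):
    """Concentration for a measured response y (y must lie strictly between a and d)."""
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters, for illustration only.
params = dict(a=0.05, b=1.2, c=250.0, d=2.4)

od = four_pl(100.0, **params)          # simulated optical density of a 100-unit standard
conc = inverse_four_pl(od, **params)   # read the concentration back from the curve
print(round(od, 3), round(conc, 3))
```

Readings outside the range spanned by `a` and `d` cannot be inverted, which is one reason samples below the detection limit have to be handled separately.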

Machine learning
The Python machine learning library scikit-learn (open-source software) was used to perform regularized logistic regression (Pedregosa et al., 2011). The liblinear library was used as the solver in the regularized logistic regression to determine the optimal fit for categorized data (scikit-learn, 2021). In total, 67% of the cases were used to train the regularized logistic regression and 33% were used to validate the model. Random forest, also known as random decision forest, uses random sampling of cases and features (including biomarkers, demographics and clinical characteristics of patients and controls) to generate multiple decision trees (n = 100). In total, 80% of the data were used to generate 100 different decision trees, which together form one classification algorithm, the forest of decision trees. K-nearest neighbors (KNN) is a supervised classification algorithm that requires labelled data and classifies new data points according to the k closest labelled data points. The KNN algorithm assumes that similar subjects, in this study individuals with or without cSVD, exist in close proximity. For regularized logistic regression and KNN, normalized data were entered, as KNN relies on Euclidean distance in an n-dimensional space. For random forest, categorized data were used to compute the algorithms. All algorithms are published on GitHub (Roosen, 2021). Prediction models can be validated by various measurements including accuracy, precision, sensitivity/recall and/or specificity, which are mathematically expressed as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Sensitivity (recall)} = \frac{TP}{TP + FN}$$

$$\text{Specificity} = \frac{TN}{TN + FP}$$

in which true positive (TP) is the number of cases correctly identified as patient, false positive (FP) is the number of cases incorrectly identified as patient, true negative (TN) is the number of cases correctly identified as healthy and false negative (FN) is the number of cases incorrectly identified as healthy.
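The four validation measures named above follow directly from the counts of true and false positives and negatives. The sketch below, which is an illustration and not the authors' published code, computes them from a confusion matrix; the function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity (recall) and specificity from confusion-matrix counts."""
    nan = float("nan")
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical counts for a validation set of 38 subjects (25 patients, 13 controls).
metrics = classification_metrics(tp=20, fp=3, tn=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity next to accuracy matters when the groups are unequal in size, because accuracy alone can look acceptable while one of the two groups is poorly recognized.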

Statistical analyses were performed with GraphPad Prism 9 (GraphPad Software, San Diego, CA, USA) or in R 4.1.0 (R Core Team, 2021). Data were checked for normality with the D'Agostino-Pearson omnibus normality test. All continuous data are presented as median (interquartile range, IQR). Categorical data are displayed as frequency (percentage). Significance of differences was determined by the Mann-Whitney U test, the Kruskal-Wallis test, or one-way ANOVA, as appropriate. P values < 0.05 were considered significant.

Levels of individual biomarkers within the study cohort
The platelet activation biomarkers CXCL4 and CXCL7, the vascular inflammation markers CX3CL1 and IL-1β, and the markers of neutrophil activation and NETs, MPO, MPO-DNA and S100A8/A9, were measured in plasma from patients and controls and directly compared by univariable analysis (Figure 1, Table 2). While all IL-1β levels were below detection limits, the concentrations of CXCL4, CXCL7, CX3CL1 and S100A8/A9 did not significantly differ between patients and controls before (Figure 1, Table 2) or after stratification of the patients for mVCI and Laci (Suppl. Figure 1). Some CX3CL1 concentrations were below the detection limit and most were in the pg/mL range, which is low but in agreement with previous findings (Damas et al., 2005, Flierl et al., 2015). Stratification revealed a tendency towards different CX3CL1 levels between mVCI and Laci, but the difference was not statistically significant (p=0.066) (Figure 2A). MPO-DNA, which reflects NET release, also did not differ between cSVD patients and controls, without (Figure 1, Table 2) or with stratification (Suppl. Figure 1 and Suppl. Table 2).
MPO levels were significantly increased in cSVD compared to controls (Figure 1, Table 2). When the cSVD group was split into mVCI and Laci, the MPO levels were significantly increased in the Laci, but not the mVCI group, when compared to controls (Figure 2B and Suppl. Table 2). A multivariable linear regression analysis indicated that MPO levels were not confounded by overrepresented patient characteristics (hypertension, hypercholesterolemia, smoking, and statins, Table 1) or by diabetes (Suppl. Table 3).

Composition of multivariate models for machine learning analysis
As MPO was the only plasma marker that was elevated in cSVD patients, we aimed to determine whether discrimination between cSVD and controls could be improved by incorporating demographics and clinical and MRI characteristics of both patients and controls. For this analysis, only patients and controls with a complete dataset were included, leading to the exclusion of 4 cSVD patients and a final inclusion of 76 cSVD patients and 38 controls.
Three different machine learning algorithms were implemented: 1) regularized logistic regression, 2) KNN, and 3) random forest. cSVD/control was set as the dependent variable, and all 28 patient characteristics, measurements and markers were considered independent variables for the 3 algorithms.
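To illustrate why normalized data matter for a distance-based classifier such as KNN, the following minimal sketch implements min-max normalization and a k-nearest-neighbor majority vote with Euclidean distance. It uses only the Python standard library rather than scikit-learn, and the two features, their values and all function names are hypothetical; it is not the authors' implementation.

```python
import math
from collections import Counter


def fit_min_max(rows):
    """Per-feature minimum and range, learned from the training rows only."""
    cols = list(zip(*rows))
    lows = [min(col) for col in cols]
    spans = [(max(col) - low) or 1.0 for col, low in zip(cols, lows)]
    return lows, spans


def apply_min_max(row, lows, spans):
    """Rescale one subject so that every feature lies on a comparable 0-1 scale."""
    return [(value - low) / span for value, low, span in zip(row, lows, spans)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training subjects closest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: one feature on a scale of units, one on a scale of hundreds.
train = [[2.0, 100.0], [2.2, 120.0], [2.1, 110.0],
         [3.5, 300.0], [3.8, 320.0], [3.6, 310.0]]
labels = ["control"] * 3 + ["cSVD"] * 3

lows, spans = fit_min_max(train)
train_scaled = [apply_min_max(row, lows, spans) for row in train]
query_scaled = apply_min_max([3.4, 290.0], lows, spans)

prediction = knn_predict(train_scaled, labels, query_scaled, k=3)
print(prediction)
```

Without the rescaling step, the feature with the larger numeric range would dominate the Euclidean distance regardless of its clinical relevance.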
Four different models were composed and compared to each other: A) patient characteristics alone; B) patient characteristics combined with MRI characteristics, which is the state-of-the-art for diagnosis of cSVD patients; C) patient characteristics, without MRI characteristics, combined with platelet, neutrophil and vascular plasma markers; D) patient characteristics combined with MRI characteristics and platelet, neutrophil and vascular plasma markers (all parameters). The individual parameters are listed in Suppl. Table 4.

Regularized logistic regression
The results from regularized logistic regression are visualized in Figure 3 and summarized in Table 3.
Model accuracy, specificity and sensitivity, as well as receiver operating characteristic (ROC) curves (insets in Figure 3 panels A-D) and the corresponding area under the curve (AUROC) were calculated (Table 3). Entering only patient characteristics (model A) included 9 out of the 17 possible parameters in an optimized predictive model, listed in Figure 3A (e.g. LDL levels, diabetes, hypertension, BMI, and smoking), with an accuracy of 56.4% and an AUROC of 52.7 (Table 3), indicating poor predictive performance of this model. The prediction improved in model B, in which MRI characteristics (periventricular and/or deep extensive white matter hyperintensities, lacunar lesions, and deep perivascular spaces) were entered in addition to the patient characteristics (model accuracy: 66.7%; AUROC: 62.0, Table 3), with 3 out of 22 variables selected into the optimized regression model by the algorithm (listed in Figure 3B).
When patient characteristics were complemented with the ELISA-determined plasma markers, but without the MRI-related parameters (model C), 5 out of the 23 variables entered were selected into the optimized prediction (Figure 3C). Besides patient characteristics such as LDL levels and diastolic blood pressure, MPO was also included as a determinant in the model. Although model C resulted in the highest specificity of all 4 models tested (88%), the sensitivity was only 38.5% and the AUROC (63.2) was similar to model B (Table 3). When all characteristics were entered in the analysis (model D), 11 out of 28 features were selected as predictors by the algorithm. These included the 3 MRI-related parameters returned by model B, followed by the patient characteristics LDL levels and hypercholesterolemia, and the plasma markers MPO and CXCL7, the former outweighing hypertension (Figure 3D). The model accuracy and AUROC were similar (73.7% and 73.4, respectively, Table 3), indicating robust predictive performance. In retrospect, the inclusion of the plasma marker data appeared to compensate for the exclusion of MRI characteristics, resulting in a model with comparable predictive quality when considering the AUROC values (Table 3, model C vs model B). Model D also showed the highest predictive potential when a KNN algorithm was implemented, while models A-C only showed poor prediction using this algorithm (Suppl. Table 5).
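The AUROC values reported above can be read as the probability, here expressed on a 0-100 scale, that a randomly chosen patient receives a higher predicted score than a randomly chosen control. The following minimal sketch computes this rank-based quantity directly; the function name, scores and labels are hypothetical, and the authors' own values were obtained with their published pipeline, not with this code.

```python
def auroc(scores, labels):
    """Fraction of (patient, control) pairs in which the patient scores higher; ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities (1 = cSVD, 0 = control).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

value = auroc(scores, labels)
print(f"AUROC: {100 * value:.1f}")
```

A value of 50 on this scale corresponds to chance-level ranking, which puts the AUROC of 52.7 for model A into perspective.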

Random forest analysis
To further support the above results and to identify additional relevant parameters associated with cSVD, a random forest analysis was performed. In this analysis, 100 decision trees were generated, yielding a so-called forest in which each parameter is scaled to its importance. The feature importance of all individual parameters sums up to 1. The top 10 most important parameters of each model are listed in the respective panels A-D of Figure 4. Model accuracy, sensitivity, specificity and accompanying out-of-bag (OOB) scores are shown in Table 4. The OOB score reflects how well those subjects who were left out of the sample during bootstrapping were predicted by the algorithm during the training of the random forest model. The model score reflects how well the subjects of the test dataset (i.e. the 20% random sample) were predicted by the random forest algorithm after completion of the training with the remaining 80% of the dataset. Whereas the regularized logistic regression algorithm resulted in models that showed a high specificity in distinguishing cSVD patients from controls but were less efficient in detecting cSVD (sensitivity, Table 3), the random forest algorithm instead resulted in models that could sensitively detect cSVD yet poorly distinguished patients from controls (low specificity) (Table 4). Thus, the primary utility of the random forest algorithm was to identify potential predictors of cSVD among the parameters investigated.
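The out-of-bag idea described above can be sketched as follows: each tree is trained on a bootstrap sample drawn with replacement, and the subjects that were never drawn for that tree form its out-of-bag set, on which the tree can be evaluated without touching the test data. The code below only illustrates the sampling step with the Python standard library; the training-set size of 91 (roughly 80% of 114 subjects) and the function name are assumptions, and no trees are actually fitted.

```python
import random


def bootstrap_split(n_subjects, rng):
    """Draw a bootstrap sample of subject indices; return (in-bag, out-of-bag) indices."""
    in_bag = [rng.randrange(n_subjects) for _ in range(n_subjects)]
    out_of_bag = sorted(set(range(n_subjects)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(0)   # fixed seed so the sketch is reproducible
n_train = 91             # assumption: about 80% of the 114 subjects
n_trees = 100            # number of trees, as stated in the text

oob_fractions = []
for _ in range(n_trees):
    in_bag, out_of_bag = bootstrap_split(n_train, rng)
    oob_fractions.append(len(out_of_bag) / n_train)

mean_oob = sum(oob_fractions) / n_trees
print(f"mean out-of-bag fraction: {mean_oob:.2f}")
```

On average a little over one third of the training subjects are out-of-bag for any given tree, so every subject is typically left out by many of the 100 trees and can be scored by those trees.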
In accordance with the regularized logistic regression analysis, model A, which includes only patient/control characteristics, again returned LDL and BMI as important parameters, and also returned systolic blood pressure, age, HDL, triglycerides, creatinine and glucose levels (Figure 4A). Interestingly, even when MRI parameters were incorporated (model B), patient characteristics remained the most important features in the generation of the decision trees (Figure 4B). When MRI parameters were replaced by the ELISA plasma markers (model C), MPO and LDL were returned as the most [...]

Discussion
In the present study, plasma markers of vascular inflammation, platelet and neutrophil activation were measured in a cohort of patients with cSVD and age-matched controls. In addition, machine learning technology was implemented in order to identify whether these markers could distinguish controls from patients, thereby providing additional clues about cSVD aetiology.
Of the plasma markers measured, only MPO levels were significantly elevated in patients with cSVD in a direct comparison with age- and sex-matched controls. After patient stratification, increased MPO levels were found to be specifically associated with lacunar stroke. The MPO levels were not confounded by diabetes or overrepresented features in the patient group. MPO is stored in the primary secretory granules of neutrophils (Nauseef, 2014). MPO has a cytotoxic effector function as an enzymatic source of hypochlorite, important for the elimination of invading pathogens, and is also required for the proper release of NETs (Papayannopoulos et al., 2010). Although MPO antigen levels only indirectly reflect MPO activity (Pulli et al., 2013), circulating MPO levels have been in focus as a biomarker for inflammation, mainly reflecting the activation of neutrophils (Fuchs et al., 2012). In addition, the circulating complex of DNA with MPO serves as a biomarker for the release of NETs (Fuchs, Kremer Hovinga, 2012, Hally, Parker, 2021, Jiménez-Alcázar, Limacher, 2018, Konkoth et al., 2021). Our finding that increased MPO levels were specifically associated with lacunar stroke suggests that neutrophil activation might be involved in the pathophysiology of lacunar infarct. Moreover, it is conceivable that tissue damage can be exacerbated by the reactive oxygen species generated by MPO, a process that is implicated in driving the progression of neurodegenerative disease (Gellhaar et al., 2017, Ray and Katyal, 2016). In addition, mice lacking MPO were protected against the progression of cognitive decline in a model of Alzheimer's disease (Volkman et al., 2019) and polymorphisms in the MPO gene were found to modify the risk of Alzheimer's disease in a population of Han Chinese (Ji and Zhang, 2017). A recent study indicates that dietary supplementation with anserine, a scavenger of hypochlorite, slowed the progression of cognitive decline in a cohort of individuals with mild cognitive impairment (Masuoka et al., 2021).
Because neutrophil activation, degranulation and NET release lead to damage of the vasculature (Segel, Halterman, 2011), we reasoned that these processes, when occurring in the brain, might contribute to the pathogenesis of cSVD (Guo et al., 2021). The neutrophil granule proteases elastase and cathepsin G were shown to cause disruption of the BBB when injected into rats (Armao, Kornfeld, 1997). Histones, released along with the NETs by neutrophils, can activate platelets (Fuchs et al., 2011, Semeraro et al., 2011) and might also exert direct cytotoxic effects on the cells of the vasculature (Silvestre-Roig et al., 2019). Besides histones, the enzymes cathepsin G, elastase and MPO are retained on NETs and can potentiate inflammation and cell dysfunction by locally activating cytokines (e.g. IL-1) or by the production of reactive oxygen species (Guo, Zeng, 2021, Nauseef, 2014, Ray and Katyal, 2016). In our study, the increase of circulating MPO was not paralleled by elevated levels of MPO-DNA. This suggests that, unlike other (immune-driven) vascular diseases (Fuchs, Kremer Hovinga, 2012, Kessenbrock et al., 2009, Konkoth, Saraswat, 2021, Sui et al., 2021), cSVD is not accompanied by excessive NET formation. In addition, the concentrations of S100A8/A9, a cytosolic protein that also reflects neutrophil activation (Foell et al., 2004, Marki et al., 2020), were not altered in the cSVD group. Although this appears counterintuitive, a poor correlation between neutrophil-derived markers has also been observed in previous studies (Hally, Parker, 2021) and might reflect different modes of neutrophil activation as well as the contribution of different cellular sources of MPO, e.g. macrophages and microglial cells (Gray et al., 2008).
In this study, no differences in plasma CXCL4 and CXCL7 concentrations were observed between controls and patients with cSVD. Although this observation does not rule out an involvement of platelet activation in the pathophysiology of cSVD, it can be concluded that cSVD is not accompanied by increased platelet activation and granule secretion in this cohort of patients, compared with controls. Plasma levels of IL-1β were below the detection limit in the current cohort, and no differences in CX3CL1 levels between patients and controls were observed either. The membrane-bound chemokine CX3CL1 is of particular interest, as it is highly expressed by neurons in the brain (Pawelec et al., 2020), but also on activated vascular cells, e.g. smooth muscle cells and endothelial cells (Flierl, Bauersachs, 2015, Lucas et al., 2003, Ludwig, Berkhout, 2002). Membrane-bound CX3CL1 can directly activate and attract leukocytes and platelets through interactions with CX3CR1 (Hildemann, Schulz, 2014, Konkoth, Saraswat, 2021, Postea et al., 2012, Schafer et al., 2004, Schulz et al., 2007).
The observation that the levels of CX3CL1 did not differ between patients and controls in this study suggests that cSVD is not accompanied by an increased inflammatory activation of CX3CL1-expressing cells, or by inflammation in general. This notion is supported by a recent study in patients with cSVD, which did not reveal differences in IL-1 and TNF levels between patients and controls (Wang et al., 2020). However, CX3CL1 measurements might be interesting for the characterization of mVCI in future studies, as CX3CL1 concentrations showed a trend towards elevation in the mVCI group. Interestingly, the levels of the neutrophil attractant CXCL8 were significantly elevated in patients with cSVD and found to be associated with chronic insomnia (Wang, Chen, 2020). Of note, the chemokines CXCL8 and CXCL7 both activate neutrophils through the receptor CXCR2 (Russo et al., 2014), and activation of CXCR2 induces the release of NETs, for example during thrombosis (Teijeira et al., 2021, Yago et al., 2018).

To further explore the relevance of the plasma markers determined in this study, machine learning algorithms were implemented. Interestingly, both the regularized logistic regression and random forest algorithms returned plasma LDL levels as a strong predictor of cSVD. In the regression-based machine learning algorithm, patient and MRI characteristics dominated over the ELISA plasma markers as predictors, with only MPO and CXCL7 returned as possible predictors. At the level of prediction accuracy, the inclusion of patient characteristics alone or combined with the plasma markers led to >80% specificity in distinguishing cSVD patients from controls, but to poor sensitivity (<40%) for identifying cSVD. Inclusion of all parameters in the model improved the AUROC, and thus the general quality of prediction. Interestingly, the decision tree-based random forest algorithm had a higher preference for the ELISA plasma markers than the regularized logistic regression algorithm, suggesting that these markers facilitated cut-off point selection during the generation of decision trees.
Using random forest, patient characteristics dominated over MRI parameters when plasma markers were not included. When included in the model, MPO was consistently returned as the most important feature, along with LDL levels. CXCL4, MPO-DNA and S100A8/A9 were also among the top 10 most important features, despite the lack of significant differences in their concentrations between the cSVD and control groups. At the level of prediction accuracy, however, all random forest-based models were able to accurately detect cSVD in the cohort, with sensitivities of up to 100%, but performed poorly in distinguishing controls from cSVD patients. The predictions were not improved by the implementation of a third algorithm, KNN. Taken together, the machine learning algorithms do provide information about possible pathophysiologic determinants of cSVD, yet they are currently not optimal for generating models that allow accurate detection of cSVD or that distinguish non-SVD from cSVD individuals.
The study has some limitations. For example, the number of subjects included might not be sufficient to unfold the full potential of the machine learning algorithms. In addition, the ratio of 2 cSVD patients to 1 control might skew the algorithms towards cSVD. Finally, several basic features, e.g. hypercholesterolemia and smoking, were inherently different between the cSVD and control groups, which might skew the learning process in the machine learning algorithms. Although this might appear to be the case for hypercholesterolemia and MRI parameters in the regularized logistic regression model (model D), other parameters enriched in the cSVD group, such as smoking, were not returned by the algorithm (model D). In this sense, it is interesting that the random forest algorithm appeared to disregard the a priori group differences and returned the ELISA plasma markers as important features.
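The concern about the 2:1 ratio of patients to controls can be made concrete with a trivial baseline. The sketch below is an illustration on the analysed cohort sizes (76 cSVD patients, 38 controls), not a reanalysis of the authors' train/test splits; it shows what a hypothetical classifier that labels every subject as cSVD would score.

```python
n_csvd, n_control = 76, 38   # cohort sizes entered into the machine learning analysis

# A hypothetical "always cSVD" classifier: every patient is a true positive,
# every control is a false positive.
tp, fn = n_csvd, 0
fp, tn = n_control, 0

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

print(f"accuracy {accuracy:.1%}, sensitivity {sensitivity:.0%}, specificity {specificity:.0%}")
```

Such a baseline already reaches about two thirds accuracy with perfect sensitivity and zero specificity, so a model trained on imbalanced groups needs to be judged against it rather than against 50%.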
The use of artificial intelligence (AI) in clinical applications has greatly increased during the past decade, and promising results have been achieved particularly in the areas of pathology, radiology and [...]

Figure 3: Graphical results of regularized logistic regression of cSVD and control characteristics. A) Regularized logistic regression analysis of model A, the dataset of patient characteristics, resulted in the selection of: LDL/HDL, low/high-density lipoprotein; hypercholesterolemia; DM, diabetes mellitus; hypertension; DBP, diastolic blood pressure; BMI, body mass index; smoking; and sex, by the algorithm. B) Regularized logistic regression analysis of model B, including patient and MRI parameters, resulted in the selection of: WMH PVD, periventricular and/or deep extensive white matter hyperintensities; presence of lacunes; and PVS D, deep perivascular spaces, by the algorithm. C) Regularized logistic regression analysis of model C, patient characteristics and plasma marker data, resulted in the selection of: LDL; hypercholesterolemia; DBP; hypertension; and MPO, myeloperoxidase, by the algorithm. D) Regularized logistic regression analysis of model D, including the entire set of variables, resulted in the selection of: WMH PVD; PVS D; lacunar infarct; LDL; hypercholesterolemia; MPO; hypertension; number of microbleeds; BMI; CXCL7; and HDL, by the algorithm. Bar graphs represent p-values of the F-statistic for the features which have the most predictive power for each dataset. Inset: ROC curve of true positive rate (TPR) versus false positive rate (FPR). Analysis was performed on data from 76 cSVD patients and 38 controls (training data n=76, validation data n=38).

Figure 4: Graphical results of random forest analysis of cSVD and control characteristics. Top 10 parameters, scaled to their importance (summed up to 1), in random forest analysis for each dataset.