
A mood disorder is a mental health condition that primarily affects a person’s emotional state. Bipolar disorder (BD) and major depressive disorder (MDD) are regarded as representative mood disorders. BD is a chronic psychiatric illness that affects approximately 1% of the general population [1]. MDD is a highly prevalent psychiatric disorder, with a lifetime prevalence ranging from 2 to 21% [2]. Both BD and MDD cause a sizeable socio-economic burden. In addition, some challenges remain in managing BD and MDD, including their underdiagnosis and misdiagnosis in clinical settings [3]. The diagnosis is based solely on clinical evaluation, in which subjective biases of patients and informants can affect the reporting of symptoms [4]. Moreover, the depressive episodes of these two mood disorders have similar clinical manifestations, such as depressive symptoms and duration of affective episodes during the course of illness [5]. When major depressive episodes occur in an individual who also has a history of manic episodes, the condition is called bipolar disorder type 1; when they occur with a history of hypomanic episodes, it is called bipolar disorder type 2. The two mood disorders are therefore distinct but share substantial similarities [6]. Furthermore, many patients with BD first present with a depressive episode, and the conversion from unipolar MDD to BD is more likely to occur within five years of the first episode of depression [7]. One study that followed patients with MDD for eight years revealed that 7.6−12.1% of them subsequently had their MDD diagnosis converted to BD [8].
Moreover, it is also important to distinguish mood disorders from other psychiatric conditions, such as schizophrenia (SPR), anxiety disorder, and attention deficit hyperactivity disorder (ADHD). Depression is a prevalent comorbid syndrome in these three psychiatric disorders, and it exhibits symptoms that are also shared with mood disorders. Unipolar depression is a prevalent comorbidity in SPR, and SPR shares several risk factors and precursors with MDD [9]. Furthermore, manic episodes, hallucinations, and delusions are frequent symptoms and signs of both BD and SPR [10]. In anxiety disorder, alterations in emotion regulation and in brain activity or networks similar to those in mood disorders have been consistently observed [11,12]. Additionally, patients with ADHD exhibit symptoms that are highly similar to those of BD, including impulsivity, inattention, rapid emotional fluctuations, and excessive energy [13].
Because it is difficult to differentiate BD from MDD, and mood disorders from other psychiatric disorders, in clinical settings, many patients with psychiatric disorders are misdiagnosed and receive inappropriate treatment, which might worsen their symptoms [14,15]. Early diagnosis and timely personalized treatment can have essential effects on the prognosis [16]. Given the findings from previous research on differences in genetics, course, and symptoms, we might expect to see significant differences in neuropathology [17-19]. This has increased the need to understand the neurobiological differences between BD and MDD, and between those mood disorders and the other psychiatric disorders.
Meanwhile, the evaluation of treatment response and the finding of factors affecting treatment response are very critical issues in treating mood disorders. Specifically, finding biomarkers affecting treatment response is crucial in this field. There have been efforts to find markers not only for diagnosis but also for treatment response.
Electroencephalography (EEG) signals create a considerable amount of meaningful information that is difficult to observe and interpret. EEG is an excellent tool for exploring neurocognitive functions with high temporal resolution [20]. In addition, EEG is sensitive to alterations in neurotransmission secondary to pharmacological manipulations or brain dysfunction [21]. Resting-state brain activity reflects the baseline status of the brain and has been proposed as a means of exploring the underlying pathophysiology of mental disorders [22]. Moreover, event-related potential (ERP) reflects cognitive function in humans. Thus, resting-state EEG analysis, which has been a validated tool [23], could help to better understand the mental disorders’ pathophysiology. Numerous researchers have identified a variety of significant biomarkers of psychosis from these EEG signals using mathematical signal processing methods, such as frequency band power, signal complexity, and signal connectivity [24,25]. These biomarkers of EEG have been used to find complex neural brain mechanisms underlying depression [26,27]. Also, they have consistently contributed to the diagnosis of mood disorders, as well as the prediction of treatment responses in mood disorders [28]. In addition to conventional statistical or correlation analysis, machine learning (ML) approaches to psychological research have used EEG biomarkers over the past years [29,30].
ML is one part of artificial intelligence that involves computers trained to mimic the cognitive abilities of the human brain. The efficiency and usability of ML have been enhanced throughout time as a result of algorithmic advancements, advances in computational resources, and the availability of extensive datasets. By employing a variety of mathematical ML algorithms, researchers can extract some complicated patterns from a large number of data, enabling them to make predictions [31]. ML has diverse applications, and we can apply ML techniques across many primary domains and specific datasets. In addition, ML approaches have allowed researchers to acquire information and significant insights from EEG data. Furthermore, ML enables the identification of essential EEG biomarkers for the classification of disorders through the process of feature selection [32].
Several EEG studies on mood disorders have adopted ML approaches for disease diagnosis or prediction of treatment response. Using a variety of ML algorithms, they identified several EEG or ERP biomarkers through feature extraction and selection. In almost all of these studies, accuracy estimated with cross-validation exceeded 70%. As a result, the judicious use of EEG-based ML approaches has the potential to assist clinicians in making diagnostic and treatment decisions.
This study will review the research to date on the use of ML techniques to differentiate diagnoses of mood disorders and the use of ML techniques to predict treatment response in mood disorders using EEG. With these attempts, we will explore the potential ML techniques to address the challenges of diagnosing and treating psychiatric disorders.
This narrative review will cover the most relevant findings on the use of ML techniques in the field of affective disorders, especially the studies to compare mood disorders (such as MDD and BD) to other mental illnesses (SPR, anxiety disorder, and ADHD) by summarizing the studies published in the last 23 years (from 2000 to 2023).
The authors searched the following electronic databases: MEDLINE Complete, Embase, PubMed, PsycINFO, and Cochrane Library. The search used a combination of controlled vocabulary and free-text terms to capture the concepts of “mood disorder (MDD, BD),” “EEG,” and “ML.” Example searches are as follows: “EEG-based classification in mood disorder,” “ML approaches using EEG between mood disorders and SPR,” “Classification of treatment response in mood disorders using EEG,” “ML-based predict treatment response of MDD,” “ML techniques using EEG in treatment of mood disorder.”
We focused on recent articles in the English language. A full-text review of the included studies was carried out, and data were extracted on the study’s characteristics and outcomes. An overview of ML approaches using EEG biomarkers is shown in Figure 1.
SPR and BD are high-risk inherited mental illnesses with debilitating symptoms [33]. These two disorders overlap considerably in their clinical symptoms. For example, patients with BD can experience hallucinations and delusions, as in SPR, during their manic and depressive episodes. Because of these clinically similar symptoms, the diagnostic differentiation between psychotic features in BD and SPR is challenging. There is ample electrophysiological evidence of cognitive dysfunction in the EEG signals of both BD and SPR [34]. Most ML publications on SPR and BD are in the domain of magnetic resonance imaging (MRI) and functional MRI (fMRI) data [35]. Moreover, most studies that apply ML techniques to EEG examine the diagnostic distinction between SPR and healthy controls (HCs) or between BD and HCs; few have directly investigated the diagnostic distinction between SPR and BD using classification algorithms.
Alimardani et al. [36] suggested state-of-the-art synchronization methods from various domains, such as phase-locking value (PLV), robust synchronization (RS), and synchronization likelihood (SL), to differentiate patients with SPR from those with BD. The synchronization vectors obtained from 53 subjects were fed into a support vector machine (SVM) classifier, which yielded 92.45% and 88.68% accuracy with and without greedy overall relevancy, respectively. They also investigated whether EEGs evoked by visual stimuli modulated at a specific frequency to induce steady-state visual evoked potentials could be used to classify BD and SPR. O1 and Fz showed significant differences between the two disorders in the mean and kurtosis of the signal-to-noise ratio, respectively. The k-nearest neighbor (KNN) classifier provided the highest classification accuracy of 91.30%, with the best feature set selected by the Fisher score [37]. Another study, using P50 and P300 ERP measures that are regarded as endophenotype candidates for SPR, failed to differentiate SPR from BD [38].
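To make the phase-locking value concrete, the following is a minimal sketch, assuming that instantaneous phases have already been extracted from two channels (for example with a Hilbert transform, which is not shown). The function name and the toy phase series are illustrative assumptions and are not taken from Alimardani et al. [36].

```python
import cmath
import math

def phase_locking_value(phases_a, phases_b):
    """PLV between two channels, given instantaneous phases in radians."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase series must be non-empty and of equal length")
    # Average the unit vectors of the phase difference; the length of the
    # mean vector is 1 for a constant lag and near 0 for an unrelated lag.
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / len(phases_a)

# Hypothetical toy data: a constant phase lag gives a PLV close to 1.
n = 200
locked_a = [2 * math.pi * 0.05 * t for t in range(n)]
locked_b = [p - 0.7 for p in locked_a]
plv_locked = phase_locking_value(locked_a, locked_b)

# A phase difference that sweeps uniformly over one full cycle gives a PLV near 0.
drift_b = [p - 2 * math.pi * t / n for t, p in enumerate(locked_a)]
plv_drift = phase_locking_value(locked_a, drift_b)
```

In a pipeline of the kind described above, one such value per channel pair would form the synchronization vector passed to the classifier.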
More recently, Kim [39] assessed the diagnostic accuracy of ML in distinguishing SPR from BD. To classify schizophrenic and bipolar patients efficiently, features were selected by Fisher score, linear discriminant analysis (LDA) was used as the classifier, and leave-one-out cross-validation (LOOCV) was applied to calculate the classification accuracy. When 15 features were selected from the combination of mismatch negativity and 40-Hz auditory steady-state response features for the classification of both disorder groups, the classification accuracy was 73.85%.
Major depressive disorder vs. schizophrenia
Jang et al. [40] conducted two-group classification using LDA and SVM classifiers. SPR and MDD patients showed significant differences in auditory P300 amplitude compared to a healthy population. When amplitudes and cortical sources were combined, the LDA’s classification accuracies were 71.31% for discriminating SPR vs. HCs and 74.55% for distinguishing MDD vs. HCs. However, the comparison between SPR and MDD showed a low classification accuracy of 59.71%, with a sensitivity of 65.08% and a specificity of 54.83%.
Very recently, Chen et al. [41] conducted a study to identify dynamic patterns within the spatiotemporal feature space specific to non-psychotic MDD, psychotic MDD, and SPR. The study also evaluated the effectiveness of ML algorithms based on these network manifestations in differentiating patients with the three conditions. The study population was quite large: a total of 579 participants were recruited, including 152 patients with non-psychotic major depression, 45 patients with psychotic major depression, 185 patients with SPR, and 197 HCs. A dynamic functional connectivity (FC) approach was employed to estimate each diagnostic group’s principal FC states. Using dynamic FC features, the classification accuracy among patients with non-psychotic MDD, psychotic MDD, and SPR reached 83.9%.
Bipolar disorder vs. major depressive disorder
Differentiating between BD and MDD conditions remains a major clinical challenge, as mentioned in the introduction [42]. While many studies have compared each diagnosis to HCs, few have directly compared the two diagnostic groups (i.e., BD and MDD) [43]. However, more studies have been published in recent years comparing the two disorder groups.
Tekin Erguzel et al. [44] used a novel feature selection algorithm, improved ACO (IACO), which is based on standard ant colony optimization (ACO), to reduce the number of features by removing irrelevant and redundant data. The selected features were then fed into an SVM to classify MDD and BD patients. The proposed method used coherence values calculated from alpha, theta, and delta frequency bands. The novel IACO-SVM approach showed that it is possible to discriminate patients with BD and MDD using 22 features with 80.19% overall classification accuracy. Erguzel et al. [45] also applied another artificial intelligence approach to classify MDD and BD. In this study, the combination of artificial neural networks (ANNs) with particle swarm optimization (PSO) for feature selection performed well: it could discriminate 31 people with bipolar depression from 58 with unipolar depression using selected features from alpha and theta frequency bands with 89.89% overall classification accuracy.
Meanwhile, Barreiros et al. [46] investigated the P300 ERP component as a biomarker of cognitive dysfunction to differentiate BD and MDD. Although significantly lower P300 amplitude was found in the BD group compared to the MDD group, the accuracy of the classification model was only 53.5%. The authors speculated that the low classification accuracy might be due to small sample sizes. Similarly, another study using left and right EEG frontal asymmetry and theta power related to happy and sad face stimuli showed low accuracy for classification between BD and MDD [42].
Ravan et al. [47] introduced a multi-step preprocessing method and proposed an ML algorithm using extracted symbolic transfer entropy features to distinguish MDD from BD patients. They employed a training dataset of resting-state EEG from 71 MDD and 71 BD patients. The proposed method yielded a total accuracy of 84.9%. When 80% of the data were used for training and evaluation and the remaining 20% for testing, the resulting classifier delivered accuracies of 88.5% and 89.3%, respectively. The most recently published article, by Mittal and Korgaonkar [48], implemented traditional techniques such as logistic regression (LR), KNN, and random forest (RF), as well as deep learning architectures. The highest cross-validated accuracy for differentiating BD and MDD was 74.34% using the LR method, with a sensitivity of 70.1%.
Several attempts have been made to differentiate between BD and MDD diagnoses by applying ML techniques to EEG markers. Previous studies have shown that accuracy varies depending on which EEG markers are used, the sample size, and the ML techniques. Moreover, although the accuracy is not as high as that reported in studies comparing each of the BD and MDD groups to a healthy population (BD vs. HC: 95.8% [49]; MDD vs. HC: 98.4% [50]), recent studies have shown that ML can distinguish between the two diagnoses with a considerable degree of accuracy.
Mood disorders (BD and MDD) vs. anxiety disorder
Aderinwale et al. [51] investigated the potential for rapid and accurate diagnosis of panic disorder (PD) and MDD through the application of ML. The SVM was supplied with 2-channel EEG signals from the frontal lobes (Fp1 and Fp2) of 149 participants and used non-linear measures as features for classification. The study obtained a 59% classification accuracy between PD and MDD patients.
Mood disorders (BD and MDD) vs. adult attention deficit hyperactivity disorder (ADHD)
Most reports of ADHD classification with ML use datasets from children with ADHD and HCs, and studies that compare ADHD data with other disorders are lacking. Additionally, it has been widely reported that ADHD has comorbidities. As a consequence, more research contrasting ADHD with other disorders needs to be carried out for differential diagnosis, not only to identify abnormalities but also to determine symptoms specific to ADHD [52].
Since there is little research on the discrimination of adult ADHD from mood disorders, we introduce two studies of adolescents with ADHD. One study with 21 subjects with ADHD and 22 patients with BD showed 74.16% average accuracy using band power in the eyes-open condition [53]. The other study, with 12 patients with ADHD and 12 with BD, reported 92.85% classification accuracy with a 1-nearest neighbor classifier [54].
The studies of ML approaches using EEG to diagnose mood disorders are shown in Table 1.
Main machine learning techniques used to diagnose mood disorders
Selecting EEG data features is one of the most crucial steps in ML. Feature selection reduces the dimensionality of the feature set by eliminating redundant and irrelevant data. By selecting features, it is possible to decrease the computational cost of the operation and enhance the model’s performance [55]. Some studies did not employ feature selection techniques, while others employed various methods for feature selection. To our knowledge, about half of the studies reviewed here did not use a feature selection procedure. Ravan et al. [47] and Sanchez et al. [56] used the minimum redundancy maximum relevance (mRMR) method. mRMR selects the features that are most strongly associated with the class labels (maximum relevance) while being least correlated with the features already selected (minimum redundancy) [57]. Alimardani et al. [37] and Kim [39] used the Fisher score criterion as a feature selection method. The Fisher score is a widely used statistical criterion in pattern classification; a higher score indicates a greater distance between the mean values of the two classes and a smaller variance within each class. It is computed by dividing the between-class scatter by the within-class scatter. Features are then ranked by Fisher score, with higher-ranked features having greater discriminatory ability [58]. Alimardani et al. [36] used another feature selection method called greedy overall relevancy (GOR). GOR is a filter-based algorithm that relies on the information gain ratio and uses a greedy search to select the features. Also, Erguzel et al. [44,45] used ACO, IACO, and PSO methods for feature selection. These three methods are wrapper-based techniques that select features by eliminating irrelevant and redundant information. Chen et al. adopted the Boruta package in Python [59], an algorithm for iterative feature selection based on RF classification. Aderinwale et al. [51] performed least absolute shrinkage and selection operator (LASSO) regression-based feature selection to classify MDD and PD. LASSO shrinks the coefficients of uninformative features toward zero and retains the subset of features with non-zero coefficients to improve accuracy [60]. Sadatnezhad et al. [53] used the Fisher LDA method to select the features. In addition to being used as a classifier, LDA is also used for feature selection: it reduces dimensionality by finding the projection that best separates the classes and projecting the data onto it [58].
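The Fisher score ranking described above can be sketched as follows. This is a minimal illustration, assuming the common formulation in which the between-class scatter of each feature is divided by its within-class scatter; the function names and the toy data are hypothetical and do not reproduce the implementation of any reviewed study.

```python
def fisher_scores(X, y):
    """Return one Fisher score per feature column.

    X is a list of samples (each a list of feature values), y the class labels.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between, within = 0.0, 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            mean_c = sum(values) / len(values)
            # Between-class scatter: how far each class mean is from the grand mean.
            between += len(values) * (mean_c - overall_mean) ** 2
            # Within-class scatter: spread of the samples around their class mean.
            within += sum((v - mean_c) ** 2 for v in values)
        scores.append(between / within if within > 0 else float("inf"))
    return scores

def rank_features(X, y):
    """Feature indices ordered from most to least discriminative."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

# Hypothetical toy data: feature 0 separates the two classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.5], [3.2, 3.5], [2.8, 4.0]]
y = [0, 0, 0, 1, 1, 1]
scores = fisher_scores(X, y)
ranking = rank_features(X, y)
```

Keeping only the first few entries of `ranking` corresponds to the kind of top-ranked feature subset used with the classifiers discussed in this review.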
Machine learning algorithm
Since performance varies depending on which ML algorithm is used, it is essential to choose an algorithm suited to the nature of the data to be learned [61]. Each study used different ML algorithms, and multiple algorithms were often utilized even within a single study. SVM, RF, and KNN were used most frequently. The SVM algorithm separates two groups of data by selecting the hyperplane with the largest margin, that is, the one farthest from the nearest data points of each group [62]. RF is an ensemble algorithm that trains multiple decision tree (DT) models and merges their outcomes to formulate predictions [63]. KNN predicts the label of a data point based on the labels of its nearest neighbors [64]. The Alimardani, Kim, and Jang research teams [37,39,40] used the LDA algorithm for classification. LDA projects the data points onto the vector that minimizes the variance within classes and maximizes the distance between classes [65]. Quadratic discriminant analysis (QDA) was also adopted by Alimardani et al. [37]. While QDA and LDA function similarly, QDA classifies the two populations under the assumption that the covariance matrix can differ between classes [66]. Erguzel et al. [45] used an ANN. An ANN is an artificial intelligence technique for creating a computational model based on the characteristics of biological neural networks that mimic human brain activity [67]. Some studies [37,42,46,48] utilized the LR algorithm to diagnose mood disorders. The LR model employs regression to estimate the probability that a data point belongs to a category as a value between 0 and 1, and then assigns the data point to the most probable category [68]. Also, Sadatnezhad et al. [53] used an extended classifier system for function approximation (XCSF) algorithm to predict diagnosis. As a learning classifier system designed to solve complex problems, XCSF approximates functions by merging reinforcement learning and evolutionary computation [69].
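As an illustration of the simplest of these classifiers, the sketch below implements KNN with Euclidean distance and majority voting. It is a minimal example under stated assumptions: the two-dimensional feature vectors and the labels "A" and "B" are hypothetical, and real studies would use many EEG features and usually standardize them first.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training set with two diagnostic groups.
train_X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25],
           [0.9, 0.8], [0.8, 0.9], [0.85, 0.75]]
train_y = ["A", "A", "A", "B", "B", "B"]

label_near_a = knn_predict(train_X, train_y, [0.2, 0.2], k=3)
label_near_b = knn_predict(train_X, train_y, [0.8, 0.8], k=3)
```

Setting `k=1` gives the 1-nearest neighbor classifier mentioned earlier; an odd `k` avoids tied votes when there are two classes.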
Validation strategy
The classification performance of ML is mostly evaluated using cross-validation. Cross-validation repeatedly splits the data into a training set and a held-out evaluation set when training a model. LOOCV checks the validity of a model by taking one data sample from N data samples as the test set and leaving the remaining N-1 as the training set [70]. The k-fold cross-validation divides the entire dataset into k folds, uses one fold as test data and the remaining k-1 folds as training data, and repeats the procedure until each fold has served as the test set. Stratified k-fold cross-validation follows the same evaluation process but constructs the folds so that each preserves the proportion of classes in the total data [71]. Five out of six studies classifying BD and MDD used nested cross-validation. Nested cross-validation embeds one cross-validation loop inside another. The outer loop divides the data into folds, each of which serves in turn as the test set. The inner loop is then applied to the remaining training data, separating it into a training set and a validation set for validation steps such as feature selection or parameter tuning [72].
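The splitting schemes above can be sketched as index generators. This is a minimal illustration, assuming no shuffling and a simple round-robin assignment for the stratified variant; the function names are hypothetical, and established libraries offer more complete implementations.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

def loocv_indices(n_samples):
    """LOOCV is k-fold cross-validation with k equal to the number of samples."""
    return k_fold_indices(n_samples, n_samples)

def stratified_k_fold_indices(labels, k):
    """Yield (train_idx, test_idx) pairs that preserve class proportions per fold."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    # Deal the members of each class across the folds in turn.
    for members in by_class.values():
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

# Hypothetical labels: six samples of one class and three of another.
labels = ["BD"] * 6 + ["MDD"] * 3
stratified = list(stratified_k_fold_indices(labels, 3))
plain = list(k_fold_indices(10, 3))
loo = list(loocv_indices(5))
```

Nested cross-validation would call one of these generators on the full data for the outer loop and again on each outer training set for the inner loop.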
Machine Learning Approaches Using EEG Biomarkers to Predict Treatment Response in Depression
EEG biomarkers and machine learning performance
There are few studies on the treatment response to mood stabilizers for BD, and one or two relevant studies are based on fMRI [73] or polygenic scores [74] rather than EEG-based ML. Therefore, this review does not include ML studies on the treatment response of BD but only reviews the response of MDD.
Many studies have investigated ML techniques for predicting treatment response to medication in depression. Among the medications, selective serotonin reuptake inhibitors (SSRIs) are the most commonly studied.
In previous studies, salient quantitative biomarkers or features extracted from the pre-treatment EEG data could automatically distinguish between responders and non-responders to antidepressant treatment for MDD. Khodayari-Rostamabad et al. [75] proposed an ML procedure for feature selection known as a mixture of factor analysis (MFA) model that predicts responses in the form of a likelihood value. Twenty-two patients with MDD were treated with a 6-week course of an SSRI (mainly Sertraline), and the specificity of the proposed method was 80.9% while sensitivity was 94.9%, with an overall prediction accuracy of 87.9%.
In another study, 51 patients with MDD in a 12-week antidepressant pharmacotherapy trial were randomized to one of three antidepressant regimens (escitalopram, bupropion, and placebo) [76]. The study used a tree-based estimator to select a relatively small number of significant features from demographic and clinical data, scalp-level EEG power, and source-localized current density. Next, they applied kernel principal component analysis (KPCA) to reduce and map only the important features. Then, they constructed a set of ML models to classify response outcomes based on mapped features. When the most important features were extracted in the final model, 12 predictive features emerged (with 78% accuracy), including baseline scalp-EEG frontopolar theta, parietal alpha2, and frontopolar alpha1.
Zhdanov et al. [77] conducted a study to predict the outcome of escitalopram treatment from baseline EEG data using an SVM classifier on patients with MDD. Of the 122 participants, the classifier was able to identify responders with an estimated accuracy of 79.2% (sensitivity, 67.3%; specificity, 91.0%). Interestingly, the study collected additional EEG data recorded after the first 2 weeks of treatment. After updating the classifiers with these data, the accuracy increased to 82.4% (sensitivity, 79.2%; specificity, 85.5%).
Another study, conducted by Oakley et al. [78], developed an ML algorithm that searched for connectivity patterns within an individual’s EEG signal that are predictive of the probability of responding to sertraline. First, the directed phase lag index (DPLI), a measure of phase synchronization between brain regions that is not sensitive to volume conduction, was applied to resting-state EEG data. Then, the resulting DPLI matrix was searched for a set of features that could be used to successfully predict the response to sertraline or placebo. Among 224 patients with depression, the classifier predicted a response to sertraline or placebo with more than 80% accuracy.
A recent meta-analysis on ML to estimate treatment response using EEG in MDD showed that the pooled accuracy across studies was 83.93% (95% confidence interval [CI], 78.90−89.29) comprising 758 patients within a random-effects model with a restricted maximum likelihood estimator [79]. The study reported that the average sensitivity and specificity across models were 77.96% (95% CI, 60.05−88.70), and 84.60% (95% CI, 67.89−92.39), respectively. The study also showed that the specificity of EEG models was greater than the sensitivity, suggesting that EEG models thus far better identify non-responders than responders to treatment in MDD.
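Because the interpretation above depends on how sensitivity and specificity are defined, the following minimal sketch computes both, together with accuracy, from predicted and true labels. It assumes that "responder" is treated as the positive class; the labels "R" and "NR" and the toy predictions are hypothetical and are not data from the meta-analysis [79].

```python
def classification_metrics(y_true, y_pred, positive):
    """Sensitivity, specificity, and accuracy with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        # Share of true responders that the model labels as responders.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Share of true non-responders that the model labels as non-responders.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }

# Hypothetical toy outcome: 10 responders ("R") and 10 non-responders ("NR").
y_true = ["R"] * 10 + ["NR"] * 10
y_pred = ["R"] * 7 + ["NR"] * 3 + ["NR"] * 9 + ["R"] * 1
metrics = classification_metrics(y_true, y_pred, positive="R")
```

In this toy case specificity exceeds sensitivity, which mirrors the pattern in which a model identifies non-responders more reliably than responders.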
Most recently, one study developed a novel method based on deep learning and brain effective connectivity to classify responders and non-responders to SSRI antidepressants in patients with MDD prior to treatment using EEG signals. The authors concatenated EEG signals and transformed them into images. Using the images, they fine-tuned a hybrid convolutional neural network (CNN) enhanced with bidirectional long short-term memory (BiLSTM) cells based on transfer learning. The pretrained models were followed by BiLSTM and dense layers in order to classify responders and non-responders to SSRI treatment. Results showed that EfficientNet-B0 had the highest accuracy of 98.33%. The method proposed in this study uses deep learning models to extract both spatial and temporal features automatically, which may improve classification results. However, due to the small sample size (N = 30), the result should be interpreted with caution.
There have been some studies that tried to review ML findings for other treatment tools for MDD, such as transcranial magnetic stimulation (TMS). A recent meta-analysis on ML predicting treatment responses using EEG in MDD showed that greater performance was observed in predicting response to repetitive TMS (rTMS) (pooled accuracy: 85.70% [95% CI, 77.45−94.83]) in a subgroup analysis [79].
One study discerned EEG markers for treatment response with 39 patients with refractory depression and 20 HCs who underwent 5−8 weeks of rTMS [80]. Working memory-related features such as fronto-midline theta power and theta connectivity provided a sensitivity of 0.90 at predicting responders and specificity of 0.92. The other study tested the hypothesis that baseline EEG coherence predicted the outcome and assessed if EEG coherence was changed after TMS using Lasso regression and SVM [81]. The authors collected resting-state 8-channel EEG data before and after TMS (5 Hz to the left dorsolateral prefrontal cortex). After treatment, the model could predict clinical responses to TMS based on pre-treatment EEG coherence (N = 29). The accuracy of the model based on alpha, beta, theta, and delta bands classified using the SVM classifier was 75.4 ± 1.5%, 77.4 ± 1.4%, 73.8 ± 1.5%, and 78.6 ± 1.4%, respectively.
Erguzel et al. [82] conducted an ANN classification using pre-treatment frontal quantitative EEG (QEEG) cordance to determine whether 55 MDD subjects were responders or non-responders to rTMS treatment. The ANN classification identified responders to rTMS treatment with a sensitivity of 93.33%, and its overall accuracy reached 89.09%. In another study, pre-treatment QEEG data in the delta and theta bands were collected from 147 MDD patients treated with rTMS. ANN, SVM, and DT were used, and their performance showed that it is possible to predict rTMS treatment responders with a sensitivity of 95.6% and an accuracy of 86.4% [83]. Hasanzadeh et al. [84] extracted EEG features including Lempel-Ziv complexity (LZC), Katz fractal dimension (KFD), correlation dimension (CD), power spectral density, bispectrum-based features, frontal and prefrontal cordance, and combinations of them. KNN was applied to classify responders and non-responders. It showed high performance, with 91.3% accuracy, 91.3% specificity, and 91.3% sensitivity, using EEG beta power, the sum of bispectrum diagonal elements in delta and beta bands, and CD as features.
Predictive models of treatment response using EEG hold promise in MDD. However, there is a need for prospective model validation in independent datasets, larger sample sizes, and a greater emphasis on replicating EEG markers.
The studies of ML approaches using EEG to predict treatment response in mood disorders are shown in Table 2.
Main machine learning techniques used to predict treatment response in mood disorders
As in the diagnosis of mood disorders, various feature selection techniques were used to predict treatment responses in mood disorders. Khodayari-Rostamabad et al. [75] improved the accuracy of classification by selecting features using the Fisher score. Jaworska et al. [76] adopted the KPCA method for feature selection to predict treatment response for MDD. Similar to the PCA method, which transforms data into a lower-dimensional space to extract only meaningful features while preserving as much variance as possible, KPCA reduces dimensionality by projection using a non-linear kernel function [85,86]. An alternative approach was a t test-based filter method, in which each feature is assigned a score according to the t test between classes and the most informative features are ranked highest [77]. Oakley et al. [78] and Hasanzadeh et al. [84] selected features for ML analysis using the mRMR method.
Machine learning algorithm
ML algorithms such as RF, SVM, KNN, and ANN have been used in studies to predict treatment response. Khodayari-Rostamabad et al. [75] utilized an MFA model. The MFA algorithm is a classification method based on maximum likelihood that accommodates inherent nonlinearities as well as linear feature spaces. These likelihoods can be used to predict the response in studies of treatment response and to rank treatment options [87]. Jaworska and Oakley used an RF algorithm for the EEG classification [76,78]. Zhdanov, Oakley, Zandvakili, and Bailey adopted SVM to predict the treatment response of mood disorders [77,78,80,81]. Additionally, Erguzel and Tarhan [83] implemented the ANN, SVM, and DT algorithms. The DT algorithm is a tree structure with a root node and branches that partitions data based on certain conditions to make predictions [82,88]. Hasanzadeh et al. [84] used the KNN classifier.
Validation strategy
K-fold cross-validation and LOOCV have been adopted in studies predicting treatment response in mood disorders. LOOCV was utilized by Khodayari-Rostamabad, Zhdanov, Zandvakili, and Hasanzadeh to evaluate the performance stability of ML [75,77,81,84]. Jaworska and Erguzel validated accuracy using 10-fold cross-validation [76,83]. Additionally, Oakley, Bailey, and Erguzel assessed the efficacy of classification using 5- or 6-fold cross-validation [78,80,82].
Limitations of machine learning approaches in mood disorders
ML methods play a crucial role in data analysis, as they enable prediction and the extraction of meaningful insights from the data provided. ML approaches can facilitate early assessment of clinical symptoms and the prevention of mood disorders [89]. However, ML approaches to diagnosing mood disorders have several significant limitations. In contrast to the majority of studies, which compare patients with healthy individuals, many studies comparing mood disorders, such as MDD and BD, with other mental disorders, such as SPR, ADHD, and anxiety disorders, have used small samples [39-42,46,48,75]. Because of limited resources and variability in data acquisition in clinical settings, current EEG-based ML studies commonly face the problem of small sample sizes. A small sample may not represent the entire population accurately, which constrains how far the reported classifier accuracy can be generalized. Likewise, ML models built on small sample sets are prone to overfitting, which can be a significant cause of unstable classification accuracy [90]. Therefore, studies involving a substantial number of subjects are essential.
Feature extraction and feature selection are essential procedures in ML because a model’s performance is strongly influenced by the specific features used [91,92]. Building an ML model with all available features, without selecting the appropriate ones, can reduce its efficacy. Furthermore, even when feature selection is performed, it must be integrated into the cross-validation procedure to obtain a reliable result. Data leakage can occur when feature selection or standardization is performed outside the performance validation [93,94]. If features are selected from the entire dataset, including the data to be tested, before cross-validation, the training and test data are not entirely independent, because the selected features already reflect both. As a result, building a model with these features and then validating it on the test data increases the likelihood of overfitting and of a biased performance estimate. Four studies mentioned above in this review used nested cross-validation to validate their model’s performance. Nested cross-validation separates the outer and inner validation loops so that the test data of each outer fold are kept apart from model building. Hence, it is crucial to conduct feature selection within the cross-validation loop, ensuring that the selection uses only the training data, which partially prevents the risk of overfitting.
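The leakage problem described above can be illustrated with a small simulation. The sketch below is built on stated assumptions rather than on any cited study: the data are pure noise with labels unrelated to the features, the filter ranks features by the absolute difference of class means, and the classifier is a simple nearest-centroid rule; all function names are hypothetical. The only difference between the two runs is whether features are selected once on all data (`leak=True`) or again inside each LOOCV round using the training subjects only (`leak=False`).

```python
import random


def avg(values):
    return sum(values) / len(values)


def select_top_k(X, y, k):
    """Filter: rank features by |mean(class 0) - mean(class 1)|."""
    scores = []
    for j in range(len(X[0])):
        m0 = avg([r[j] for r, label in zip(X, y) if label == 0])
        m1 = avg([r[j] for r, label in zip(X, y) if label == 1])
        scores.append(abs(m0 - m1))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def nearest_centroid_predict(X_train, y_train, x, feats):
    """Predict the class whose training centroid is closest on `feats`."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        dist = sum((x[j] - avg([r[j] for r in rows])) ** 2 for j in feats)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, k, leak):
    """LOOCV accuracy. leak=True selects features once on ALL subjects
    (flawed); leak=False reselects them on each training set."""
    global_feats = select_top_k(X, y, k) if leak else None
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = global_feats if leak else select_top_k(X_tr, y_tr, k)
        hits += nearest_centroid_predict(X_tr, y_tr, X[i], feats) == y[i]
    return hits / len(X)


# Pure-noise data: 20 hypothetical subjects, 200 features, arbitrary labels.
random.seed(0)
n, p = 20, 200
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [0] * (n // 2) + [1] * (n // 2)

leaky = loocv_accuracy(X, y, k=5, leak=True)
clean = loocv_accuracy(X, y, k=5, leak=False)
print(f"selection outside CV: {leaky:.2f}  selection inside CV: {clean:.2f}")
```

Since the labels carry no information, an honest estimate should stay around chance level; with many features and few subjects, selecting features before cross-validation typically produces an optimistic figure instead. The same reasoning applies to standardization and to any other step fitted on the data.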
Another limitation of ML approaches in mood disorders lies in the diversity of EEG recording devices and diagnostic tools used across institutions. Researchers use different methods for recording and preprocessing EEG data, which can give each dataset different characteristics. In addition, the use of distinct diagnostic tools and criteria across studies may affect the efficacy of the ML methodology. If the diagnostic labels in the training data depend on the criteria used, the model will not generalize and will perform well only on data that fit those criteria.
Future directions in the use of machine learning techniques in mental disorders
Comparing HCs with patients with mood disorders yields reasonably high accuracy, but accuracy in distinguishing MDD from BD still has room for improvement. Although it is to be expected in clinical settings that greater heterogeneity between groups yields higher accuracy, clinicians should still look for ways to achieve maximum accuracy.
To apply better and more appropriate ML approaches to mental disorders, it is crucial to have a sufficient sample size and to optimize the ML techniques [90]. The appropriate ML approach also depends on various factors, including the study’s purpose, the data’s characteristics, and feature engineering (i.e., feature extraction and selection) [95]. It is therefore necessary to experiment with and compare different ML algorithms, feature selection methods, and validation strategies. When experimenting with different algorithms, tuning the hyperparameters of each model can also help fit the model appropriately [96]. Combining multiple EEG features obtained with different analysis methods, or adding neuroimaging data such as structural MRI (sMRI) or fMRI, can also improve ML performance, because some biomarkers may reflect the characteristics of a particular disease better than EEG data do, depending on the purpose of the study.
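Hyperparameter tuning can be sketched as a grid search scored by cross-validation. The example below tunes the number of neighbours of a KNN classifier with LOOCV; it is a minimal illustration under stated assumptions (a hypothetical one-feature dataset containing one deliberately mislabelled subject, and the helper names `knn_predict`, `loocv_accuracy`, and `grid_search_k`), not a recommended protocol.

```python
from collections import Counter


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training samples (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def loocv_accuracy(X, y, k):
    """Leave-one-out accuracy of KNN for a given number of neighbours."""
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += knn_predict(X_tr, y_tr, X[i], k) == y[i]
    return hits / len(X)


def grid_search_k(X, y, candidates):
    """Score every candidate k and return the best one with all scores."""
    scores = {k: loocv_accuracy(X, y, k) for k in candidates}
    best_k = max(candidates, key=lambda k: scores[k])
    return best_k, scores


# Hypothetical data: two well-separated groups plus one noisy label (0.25 -> 1).
X = [[0.0], [0.1], [0.2], [0.3], [0.4], [0.25],
     [1.0], [1.1], [1.2], [1.3], [1.4]]
y = [0, 0, 0, 0, 0, 1,
     1, 1, 1, 1, 1]

best_k, scores = grid_search_k(X, y, candidates=[1, 3, 9])
print("best k:", best_k, "scores:", scores)
```

In this toy case a very small k follows the noisy label and a very large k smooths over the class boundary, so an intermediate value scores best. If the tuned accuracy is also the figure to be reported, the search should sit in the inner loop of a nested cross-validation so that the outer test folds never influence the choice of k.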
In addition, no clinical studies have applied ML techniques to differentiate MDD or BD from other diagnoses, such as anxiety disorders or personality disorders, especially borderline personality disorder, even though ample EEG studies have been conducted on those disorders. Therefore, we were unable to review the diagnoses in these categories. Further studies using ML techniques to differentiate between these diagnoses should be conducted in the future.
Clinicians have been asked to address the various challenges in diagnosing mood disorders and managing their treatment. In particular, the depressive episodes of MDD and BD are difficult to differentiate from each other. Researchers and clinicians have therefore tried to find biological markers of each disorder, but a decisive marker has not been easy to find. Moreover, there is significant heterogeneity in treatment response and progression in mood disorders.
While ML techniques currently lack a robust biomarker and consistent accuracy for diagnosing and predicting treatment responses in mood disorders, we anticipate that as technology advances, these techniques will eventually identify discernible patterns. Through the utilization of extensive and diverse datasets of physiological information, advancing feature selection methods, and ML modeling algorithms tailored to each data characteristic, ML holds significant promise in recognizing both the diagnosis and treatment outcomes. By overcoming the challenges associated with ML, we can gain a deeper insight into these conditions. This, in turn, allows us to develop valuable tools that personalize both the identification and treatment processes, thereby enhancing the clinical approach to mood disorders.
No potential conflict of interest relevant to this article was reported.
Conceptualization: Ji Sun Kim. Formal analysis: Young Wook Song, Ho Sung Lee. Funding: Ji Sun Kim. Investigation: Sungkean Kim, Kibum Kim, Bin-Na Kim. Writing–original draft preparation: Young Wook Song, Ho Sung Lee. Writing–review & editing: Ji Sun Kim. All authors have read and agreed to the published version of the manuscript.