
Attention-deficit/hyperactivity disorder (ADHD) is a common neurodevelopmental disorder in children that is characterized by deficiencies in cognitive and emotional self-control and affects 5−8 percent of school-aged children [1]. Because ADHD symptoms often appear in early elementary school and frequently persist as a chronic condition into adulthood [2], early diagnosis and treatment are critical for long-term prognosis [3]. Unfortunately, despite a significant amount of research on ADHD over the last 10 to 20 years, relevant neurobiological markers that could support diagnostic classification remain lacking [4,5].
Because an objective neural biomarker must be simple, non-invasive, and specific, and must play an auxiliary role in the diagnostic process, non-invasive functional brain imaging techniques, such as functional magnetic resonance imaging (fMRI) and positron emission tomography (PET), have been used to investigate ADHD [6,7]. However, due to the large size of the devices, these approaches are challenging to use in bedside settings for diagnostic and therapeutic applications [8]. In addition, fMRI analysis is time-consuming because of complex preprocessing procedures, and PET requires infusion of a radioactive tracer. Therefore, functional near-infrared spectroscopy (fNIRS) has grown in popularity recently, with participants simply wearing a small, light cap equipped with detector-emitter pairs of near-infrared light [9]. fNIRS offers evident advantages such as compactness, low cost, tolerance to body movement, and accessibility. Compared with fMRI, it imposes no burden from confined space and noise, and unlike PET it raises less concern about side effects because no tracer infusion is needed. Thus, fNIRS is regarded as suitable for the clinical evaluation of children with ADHD [10].
fNIRS is a promising functional brain imaging method for examining changes in cortical activity in ADHD. It irradiates living tissue with near-infrared light at wavelengths of 650−1,000 nm and detects the reflected light [11]. fNIRS can non-invasively measure the degree of hemoglobin oxygenation in cerebral blood flow, which can be related to neuronal activity through the mechanism of neurovascular coupling [12]. Mauri et al. [13] tested 20 ADHD children and 25 typically developing (TD) peers on their capacity to perceive emotional stimuli and deal with emotional interference using a visual continuous performance task with cues of different emotional content. They reported that fNIRS data revealed between-group differences in the right prefrontal and frontal cortices, with the TD group showing more significant hemoglobin concentration changes.
Several studies have analyzed functional connectivity using fNIRS. Wang et al. [14] evaluated 30 ADHD children and 30 healthy control (HC) children and found that ADHD children had significant deficits in functional connectivity and global network efficiency. ADHD children had a lower likelihood of experiencing the dominant connectivity state and a higher likelihood of experiencing other connectivity states. Sutoko et al. [15] found that ADHD children tended to show a decreased probability of the dominant connectivity state and an increased probability of other connectivity states during an inhibitory control task.
Although fNIRS studies can provide valuable insight into ADHD, traditional group-level statistical methodologies do not provide a framework for evaluating the discriminative power of the discovered features at the individual level. Machine learning (ML) has recently been regarded as a promising tool for neuroimaging in psychiatry. Unlike traditional statistical methods, it automatically derives optimal results from various features [16]. In particular, because the neurobiology underlying psychiatric disorders is complex [17,18], the use of ML classification approaches is increasing. In addition, ML is expected to be helpful in personalized diagnostic classification [19]. Several studies employing fMRI data of ADHD participants for diagnosis prediction have been published [20,21]. However, few studies have used fNIRS [10,22], and none have used ML to evaluate Stroop task-based fNIRS data from children with ADHD. Based on the prefrontal cortex (PFC) hemodynamic dysfunction previously reported in ADHD using fNIRS [9], we hypothesized that ADHD patients could be distinguished from the typically developing group. Thus, this study aims to apply an ML approach to discriminate medication-naive ADHD patients from HCs using task-based fNIRS data.
Participants for this study were recruited from four distinct data sources: three for the ADHD group and one for the HC group. The ADHD group was recruited from Wonkwang University Hospital, Hanyang University Seoul Hospital, and Seoul National University Hospital (SNUH). The HC group was recruited from Daegu Catholic University Medical Center. Eligible patients (aged 8−15 years) had an ADHD diagnostic evaluation by a child and adolescent psychiatrist and with the Kiddie Schedule for Affective Disorders and Schizophrenia, Present and Lifetime Version, a full-scale intelligence quotient (FSIQ) score of 70 or higher, and a verbal comprehension index of 80 or higher. Key exclusion criteria were a history of psychotic disorder, mood disorder, anxiety disorder, developmental disorder, or neurologic disorder, and treatment with ADHD medication within the last four weeks. The control group met the same criteria except for the clinical diagnosis. Subjects were recruited through the bulletin boards and homepages of each hospital or community mental health center. A total of 51 patients and 61 HCs were registered, of which 33 patients and 39 HCs were included in the final analysis after excluding missing and incomplete data. The study was conducted with written consent after obtaining approval from the institutional review board of Seoul National University Hospital (SNUH 1905-145-1035).
Based on the Diagnostic and Statistical Manual of Mental Disorders, fourth edition diagnostic criteria, the 18-item ADHD Rating Scale-IV (ARS) was used to assess the severity of ADHD symptoms in the study subjects [23]. A Korean version (K-ARS) has been developed, and its reliability and validity have been established [24]. In the Korean version, odd-numbered items measure symptoms of attention deficit and even-numbered items evaluate hyperactivity and impulsivity, with nine items in each category. The total score comprises all item scores.
In this study, parents were asked to rate the subject’s symptoms [25].
Figure 1 presents the overall flow of this study. Figure 2 shows fNIRS optodes and channel locations. The surface regions of the PFC were covered by 15-channel fNIRS equipment (NIRSIT Lite; OBELAB Inc.). The top-line (Ch03, Ch06, Ch09, and Ch12), mid-line (Ch02, Ch05, Ch08, Ch11, and Ch14), and bottom-line channels (Ch01, Ch04, Ch07, Ch10, Ch13, and Ch15) were laid on dorsolateral PFC (DLPFC), orbitofrontal cortex (OFC), and frontopolar cortex (FPC), respectively. Optical densities at two wavelengths (780 and 850 nm) were sampled at a rate of 13.3 Hz and then transformed into relative concentration changes in oxy-Hb and deoxy-Hb (Δoxy-Hb and Δdeoxy-Hb, respectively) by means of the modified Beer-Lambert law [26].
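As a rough illustration of this conversion step, the modified Beer-Lambert law relates the optical density change at each wavelength to a weighted sum of the oxy- and deoxy-Hb concentration changes, so measurements at two wavelengths give a 2×2 linear system. The Python sketch below solves that system. The extinction coefficients, source-detector distance, and differential pathlength factor are illustrative placeholders, and the function names are hypothetical; none of them are taken from the NIRSIT Lite device or from the analysis reported here.

```python
# Placeholder extinction coefficients, rows = wavelength (780 nm, 850 nm),
# columns = (oxy-Hb, deoxy-Hb). Illustrative values only: they merely reflect
# that deoxy-Hb absorbs more at 780 nm and oxy-Hb absorbs more at 850 nm.
EXT = ((0.7, 1.1),
       (1.1, 0.7))

def forward_od(d_oxy, d_deoxy, ext=EXT, path_cm=3.0, dpf=6.0):
    """Optical density changes predicted by the modified Beer-Lambert law."""
    scale = path_cm * dpf  # assumed effective optical path length
    return tuple((e_oxy * d_oxy + e_deoxy * d_deoxy) * scale
                 for e_oxy, e_deoxy in ext)

def mbll(d_od_780, d_od_850, ext=EXT, path_cm=3.0, dpf=6.0):
    """Invert the 2x2 system to recover (delta oxy-Hb, delta deoxy-Hb)."""
    (a, b), (c, d) = ext
    det = a * d - b * c
    if det == 0:
        raise ValueError("extinction matrix is singular")
    scale = path_cm * dpf
    y1, y2 = d_od_780 / scale, d_od_850 / scale
    d_oxy = (d * y1 - b * y2) / det    # Cramer's rule, first unknown
    d_deoxy = (a * y2 - c * y1) / det  # Cramer's rule, second unknown
    return d_oxy, d_deoxy
```

A simple sanity check is a round trip: `mbll(*forward_od(0.5, -0.2))` should return the original pair up to floating-point error.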
The Stroop task consisted of three repetitions of a 30-second baseline period followed by a 45-second task period and lasted 250 seconds in total. In each task period, participants performed the Stroop task as fast as possible for a single condition, presented in random order among ‘word’ (reading words), ‘color’ (reading colors in XXX letters), and ‘incongruent’ (reading colors with word-color mismatch). In the task period, 100 words were presented on a monitor. In the word, color, and incongruent conditions, participants read aloud the words written in black, named the color of the ‘XXX’ letters, and named the ink color of words whose meaning did not match that color (e.g., ‘red’ written in green, ‘green’ written in red, and the like), respectively. All words were written in the participants’ native language (Korean). Figure 3 shows the experimental design of the Stroop task.
Processing of the fNIRS data was conducted using MATLAB 2021b (Mathworks). The collected fNIRS data were first linearly detrended and then filtered by a zero-phase filter implemented with a third-order low-pass Butterworth filter with a cut-off frequency of 0.1 Hz. Next, the filtered fNIRS data were segmented into oxy- and deoxy-Hb epochs from −1 to 45 seconds relative to the task onset at 0 seconds (45 seconds of task). Then, baseline correction was applied to all oxy- and deoxy-Hb epochs on each fNIRS channel by subtracting the average value of a reference interval from −1 to 0 seconds from the epoched data. A total of three epochs, consisting of a single ‘word,’ ‘color,’ and ‘incongruent’ epoch, were produced for each participant. As a result, 135 epochs (45 epochs per condition) were produced for all participants. To minimize the influence of motion artifacts on the fNIRS signals, the participants were asked to avoid body movements (especially of the head), and the influence of motion artifacts on the fNIRS signals was removed using the low-pass filter.
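The detrending, epoching, and baseline-correction steps can be sketched as follows. This is a minimal Python illustration rather than the MATLAB code used in the study; the function names are hypothetical, and the Butterworth filtering step is left out because it would need a signal-processing library.

```python
FS = 13.3  # sampling rate in Hz, as stated in the text

def linear_detrend(signal):
    """Remove the least-squares straight line from a single-channel signal."""
    n = len(signal)
    t_mean = (n - 1) / 2
    y_mean = sum(signal) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(signal))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(signal)]

def epoch_and_baseline(signal, onset_s, fs=FS, pre_s=1.0, post_s=45.0):
    """Cut the epoch from -pre_s to +post_s around the task onset and
    subtract the mean of the pre-onset reference interval."""
    onset = round(onset_s * fs)
    n_pre = round(pre_s * fs)
    n_post = round(post_s * fs)
    start, stop = onset - n_pre, onset + n_post
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[start:stop]
    baseline = sum(epoch[:n_pre]) / n_pre  # mean over the -pre_s..0 s interval
    return [x - baseline for x in epoch]
```

In this sketch each channel and hemoglobin type would be passed through both functions separately, which mirrors the channel-by-channel correction described above.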
A machine learning approach was adopted to discriminate hemodynamic responses between the two groups. First, we selected only the epochs for the incongruent condition, which showed distinct differences in oxy- and deoxy-Hb time courses. Then, mean oxy- and deoxy-Hb values within sequential 5-second time windows in a task period (i.e., 0−5 seconds, 5−10 seconds, 35−40 seconds, and 40−45 seconds) were extracted to build a feature vector. Hereafter, each feature is named by pairing the channel number with the hemoglobin type (e.g., CH01oxy) to avoid ambiguity. Hence, the dimensionality of the feature vector was 120 (30 fNIRS channels, i.e., 15 channels × 2 hemoglobin types, × 4 windows). Regularized linear discriminant analysis (RLDA) is robust for classification problems with high-dimensional vectors. Therefore, leave-one-out cross-validation (LOOCV) with an RLDA was applied to assess how well the hemodynamic responses of the ADHD and HC groups could be classified. Note that the feature vector was standardized to enhance classification accuracy after dividing the data into training and test sets. Finally, the optimal classification accuracy was determined while the normalization and feature selection parameters were varied using a grid search.
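The evaluation loop described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the regularization shown (shrinking the pooled covariance toward a scaled identity with a hypothetical parameter `gamma`) is one common form of RLDA, and all function names are invented for the example. The point it illustrates is that the standardization statistics are computed on the training fold only and then applied to the held-out sample.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def standardize_fit(X):
    """Per-feature mean and standard deviation (constant features get sd 1)."""
    cols = list(zip(*X))
    mu = [sum(c) / len(c) for c in cols]
    sd = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
          for c, m in zip(cols, mu)]
    return mu, sd

def standardize_apply(X, mu, sd):
    return [[(v - m) / s for v, m, s in zip(row, mu, sd)] for row in X]

def fit_rlda(X, y, gamma=0.5):
    """Two-class LDA with the pooled covariance shrunk toward a scaled identity."""
    d = len(X[0])
    means = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    cov = [[0.0] * d for _ in range(d)]
    for x, t in zip(X, y):
        diff = [xi - mi for xi, mi in zip(x, means[t])]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / (len(X) - 2)
    nu = sum(cov[i][i] for i in range(d)) / d  # average variance
    shrunk = [[(1 - gamma) * cov[i][j] + (gamma * nu if i == j else 0.0)
               for j in range(d)] for i in range(d)]
    w = solve(shrunk, [m1 - m0 for m1, m0 in zip(means[1], means[0])])
    mid = [(m1 + m0) / 2 for m1, m0 in zip(means[1], means[0])]
    bias = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, bias

def predict(model, x):
    w, bias = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + bias > 0 else 0

def loocv_accuracy(X, y, gamma=0.5):
    """Leave-one-out accuracy; scaling is fitted on the training fold only."""
    hits = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        mu, sd = standardize_fit(X_train)
        model = fit_rlda(standardize_apply(X_train, mu, sd), y_train, gamma)
        x_test = standardize_apply([X[i]], mu, sd)[0]
        hits += predict(model, x_test) == y[i]
    return hits / len(X)
```

With the 120-dimensional feature vectors described here, each row of `X` would hold one participant's windowed means and `y` the group label (1 for ADHD, 0 for HC, by assumption). A grid search would simply call `loocv_accuracy` for each candidate `gamma`.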
The performance of our machine learning model was assessed through classification accuracy, sensitivity, and specificity. These metrics were calculated from the confusion matrix. Each performance metric is given by

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP},$$

where TP, TN, FP, and FN stand for the ‘true positive,’ ‘true negative,’ ‘false positive,’ and ‘false negative,’ respectively.
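For concreteness, the three metrics can be computed from predicted and true labels as in the short Python sketch below. The function name and the convention that label 1 denotes the ADHD (positive) class are assumptions made for the example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity from the confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives detected
    }
```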
Table 1 shows the demographic characteristics of ADHD and HC. Age and FSIQ showed no difference between ADHD and HC groups. On the other hand, the proportion of women in the ADHD patient group was lower at 24.2% compared to 51.3% in the HC group, and the total score of K-ARS was higher at 28.9 points on average in the ADHD patient group.
Grand-averaged waveforms of Δoxy-Hb and Δdeoxy-Hb in the ADHD and HC groups are shown in Figure 4. Overall, the ADHD group showed greater activation than the HC group during the Stroop task under the ‘incongruent’ condition, except in the left DLPFC (Ch09 and Ch12). In three channels (Ch04, Ch05, and Ch07), there were statistical differences in both the mean and the maximum values, but they were not significant after false discovery rate correction for multiple comparisons.
Through LOOCV, we achieved an average classification accuracy of 0.82, a sensitivity of 0.67, and a specificity of 0.93.
This is the first study that created a machine learning classifier to identify unmedicated ADHD children using only fNIRS data during the Stroop task, without any other neuropsychological data or clinical evaluation. The model had a classification accuracy of 0.82, which corresponds to acceptable diagnostic accuracy [27].
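The text does not state which false discovery rate procedure was applied. Assuming the widely used Benjamini-Hochberg step-up procedure, the Python sketch below shows how channel-wise p-values that fall below 0.05 individually can fail to survive the correction. The function name and any p-values passed to it are hypothetical and are not results from this study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    # Reject every hypothesis ranked at or below k_max.
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected
```

For example, with 15 channels and made-up p-values of 0.01, 0.02, and 0.03 in three channels and 0.5 in the rest, the smallest threshold is 0.05/15, so none of the three is rejected.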
In this study, we used the RLDA classifier to identify the remarkable hemodynamic responses between ADHD children and HC during the incongruent stimuli of the Stroop task. Using the RLDA classifier, the ADHD children were discriminated from HC with a sensitivity of 0.67, specificity of 0.93, and accuracy of 0.82. A study by Dai et al. [28] used ML classification of ADHD children through multimodal MRI data, but despite 624 participants for training, the accuracy was 0.68. Although there are differences in the number of study participants and classification techniques, in previous studies using fMRI data to diagnose ADHD patients using ML, the accuracy in nine of the total 13 fMRI studies did not exceed 0.8 [20]. Excluding the independent validation study, four of eight studies exceeded 0.8, and their average sample size was 625 people, about nine times higher than ours.
The distinct difference revealed in the task-based brain hemodynamic response is considered to have contributed to the difference in performance. Zhang et al. [21] reviewed 11 studies on ADHD diagnosis using machine learning and deep learning methods. The data modality was T1-weighted, DTI, fMRI, resting state-fMRI, or both, and the mean accuracy was 0.81. Considering that nine studies were conducted on more than 194 participants, our model using just 72 participants showed sufficient ability to identify ADHD children. Also, the model performance in this study was comparable to that of an fNIRS study that included clinical psychological data. Yasumura et al. [22] developed a support vector machine classifier including psychological data, age, and NIRS data during a reverse Stroop task to classify 108 ADHD and 108 TD children, and the classification accuracy was 0.86.
Also, we found that, compared to HC, the ADHD children showed global hyperactivity except for the left DLPFC during the incongruent Stroop task. Although the difference was not significant after correction for multiple comparisons, higher activity was observed in the ADHD group compared to HC, especially in Ch04, Ch05, and Ch07, corresponding to ventromedial PFC (vmPFC) and OFC. Several fNIRS studies have shown differences in activation between ADHD children and HC. During the Stroop task, Jourdan Moser et al. [29] found hyperactivity in the right DLPFC of the ADHD group in response to an increase in task difficulty, and this hyperactivation was considered a compensatory mechanism. Negoro et al. [30] identified hypoactivity in the bilateral inferior lateral PFC. Both studies were conducted on medication-naïve children, as was ours. Yasumura and colleagues reported hypoactivity in the left lateral and medial PFC [22]. Meanwhile, Suzuki et al. [31] showed hyperactivity in the left superior frontal cortex (SFC) during the flanker test. They suggested that children with ADHD need greater SFC activation to cope with interference. As in previous studies [29,31], the hyperactivation in this study might be the result of a compensatory mechanism due to inefficient information processing in the PFC of children with ADHD. Accordingly, children with ADHD would require more energy than the typical group to maintain interference control and cognitive flexibility.
Meanwhile, the heterogeneous results of previous fNIRS studies are thought to be due to differences in sample size, medication use, types of cognitive tasks, and developmental age. fMRI studies so far have also identified abnormal function in the orbital and ventromedial prefrontal areas, consistent with the results of our study [32]. Abnormal hyperactivity of the OFC/vmPFC in ADHD patients is related to motivation and emotional processing rather than to accurately working ‘cool’ executive function. In addition, it can be interpreted as disrupted deactivation of the default mode network during Stroop tasks. Taken together, these functional abnormalities suggest delayed maturation in brain development.
Although classification performance is similar to that of previous studies, our work has several limitations. First, performance may be overestimated because of overfitting, given the relatively small sample size [33] and datasets that are only partially independent during the training process. Second, the fNIRS signal may be relatively unstable because each participant performed the Stroop task in a single session. However, even considering these limitations, the authors expect this study to have high potential for clinical application in various fields, in that it is the first to develop a machine learning model that identifies ADHD children using only fNIRS data. The authors expect that large-scale follow-up studies will further solidify the diagnostic value and increase the possibility of clinical application.
This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI19C0844).
The funders of the study had no role in the study design, data collection, data analysis, data interpretation, or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.
Conceptualization: Chan-Mo Yang, Bung-Nyun Kim. Data acquisition: Jaeyoung Shin, Chan-Mo Yang. Formal analysis: Chan-Mo Yang, Jaeyoung Shin. Funding: Bung-Nyun Kim. Supervision: Johanna Inhyang Kim, You Bin Lim, So Hyun Park. Writing (original draft): Chan-Mo Yang, Jaeyoung Shin. Writing (review & editing): all authors.