Deep transfer learning for cerebral cortex using area-preserving geometry mapping
Cerebral Cortex, Volume 32, Issue 14, 15 July 2022, Pages 2972–2984
Published: 16 November 2021

LeCun et al. 2015; Cole et al. 2017; Farooq et al. 2017; Kamnitsas et al. 2017; Shen et al. 2017; Mohsen et al. 2018). Deep convolutional neural networks (CNNs) (Simonyan and Zisserman 2014; He et al. 2016; Huang et al. 2017; Krizhevsky et al. 2017) have better representation capability and can automatically extract low-to-high-level spatial features (Gu et al. 2018; Abrol et al. 2021), and they usually outperform conventional machine learning methods that need hand-crafted features. Inspired by the great success of deep learning in medical imaging analysis (Shen et al. 2017; Kermany et al. 2018; Sevakula et al. 2018; Ting et al. 2018; Zhu et al. 2020), neuroimaging researchers are paying increasing attention to deep learning. Some studies use 3D CNNs to analyze brain images because the brain is a 3D structure (Valliani and Soni 2017; Oh et al. 2019; Bashyam et al. 2020; Thomas et al. 2020). The 3D models can preserve the spatial information of the brain and capture features at the individual level. Nevertheless, they have a large number of trainable parameters, which makes convergence difficult. Computational efficiency is also low because much of the computation is wasted on voxels that contain null values. Moreover, properly tuning the models requires a large amount of data, but the collection of brain imaging data is expensive and time-consuming.

In recent years, transfer learning has emerged as a crucial method for solving the problem of insufficient training data by transferring knowledge from a source domain to a target domain (Pan and Yang 2009; Tan et al. 2018). In deep learning, the models are usually pretrained on a large-scale source dataset (e.g., ImageNet) (Deng et al. 2009) and then fine-tuned on the small target dataset (Tan et al. 2018). The wealth of knowledge learned from the source dataset is implicitly encoded in the large number of model parameters, making it possible for the target task to achieve better performance with limited samples. For brain imaging analysis, the surface-based cortical shape morphometry is closely related to sex, age, and neuropsychiatric disorders (Yuan et al. 2015; Bedford et al. 2020; Gharehgazlou et al. 2021). Many medical imaging studies use large-scale natural image datasets as the source domain and achieve better performance (Kermany et al. 2018; Sevakula et al. 2018; Ting et al. 2018; Zhu et al. 2020), indicating the feasibility of transfer learning from natural images to brain imaging data. However, most of the current pretrained models are designed for 2D planar images and cannot be directly applied to 3D brain magnetic resonance imaging (MRI). How to bridge 3D brain images and 2D pretrained models remains unsolved.


Most MRI studies based on CNN models use Euclidean distance as the metric for brain imaging analysis, ignoring the fact that the human brain has a complex structure with folded sulci and gyri (Fischl et al. 2008; Zhang et al. 2020). However, Euclidean distance is not the best metric for brain images because the complicated folding of the cerebral cortex gives it a non-Euclidean geometry (Seong et al. 2018). Treating brain images as ordinary images and directly applying Euclidean distance-based deep models to them lacks a neurobiological basis and may mix signals from different brain regions and destroy the topological structure (Glasser et al. 2013). In contrast, the distance along the brain surface is more consistent with neurobiology and the geometry of the cerebral cortex (Fischl 2012; Glasser et al. 2013; Honnorat et al. 2015). For these reasons, it is more reasonable to adopt the distance along the cortical surface instead of Euclidean distance in CNN models to obtain neurobiologically meaningful results.

To address these issues, we propose a novel framework to bridge the gap between 3D MRI data and 2D CNN models by mapping the 3D cerebral cortex into 2D images and utilizing transfer learning to improve network performance. The proposed framework can be roughly divided into 4 steps. The first step is to process the cortical data with FreeSurfer and transform the cerebral cortex into 3D surface meshes and vertex-wise cortical shape metrics (Glasser et al. 2013). In the second step, the 3D surface meshes are topologically and equably mapped into 2D planar meshes through an area-preserving geometry mapping approach (Zhao et al. 2013) and further converted into 2D images for the subsequent analysis. The converted images reflect the distance along the brain surface between different brain regions, and convolution on the converted images is similar to convolution along the cortical surface, which is more neurobiologically meaningful. The third step is to train the models using transfer learning. We choose the pretrained ResNet-50 (He et al. 2016) and DenseNet-121 (Huang et al. 2017) as the backbone networks and fine-tune them with the acquired 2D images. Finally, in the fourth step, the results from different metrics are ensembled using the stacking ensemble method (Wolpert 1992) to generate the final individual-level classification results. The effectiveness of the proposed method is demonstrated with sex classification.

Moreover, previous studies have reported sex differences in brain structure and function in autism spectrum disorder (ASD) (Bejerot et al. 2012; Loomes et al. 2017; Bedford et al. 2020; Liu et al. 2020). A reasonable assumption is that sex-related features may be helpful in ASD classification. Thus, we further develop a 2-stage transfer learning framework for the classification of ASD, using the sex classification of healthy people as an intermediate task to reduce the distribution difference between the source domain and the target domain for better performance.

The contributions of this paper can be summarized as follows:


A novel framework is proposed to bridge the gap between 3D MRI data and 2D CNN models.

We demonstrate the effectiveness of transfer learning in MRI studies under our framework.


We introduce a 2-stage transfer learning method for brain imaging analysis and demonstrate that the sex classification of healthy people could be used as an intermediate task to improve the ASD classification performance.

Materials and Methods

Data and Preprocessing

The data used in sex classification come from the Human Connectome Project (HCP) S1200 release (Van Essen et al. 2012), including 1113 subjects (505 females vs. 606 males). Subjects are scanned on 3T Siemens scanners at Washington University with the following parameters: spatial resolution = 2 × 2 × 2 mm³, repetition time (TR) = 720 ms, echo time (TE) = 33.1 ms, field of view (FoV) = 208 × 180 mm², slices = 72, flip angle = 52 degrees. Male and female subjects are matched in age and education.

We use a large-scale publicly available dataset, the Autism Brain Imaging Data Exchange (ABIDE) (Di Martino et al. 2014; Di Martino et al. 2017), for ASD classification. The ABIDE dataset consists of 2 subsets: ABIDE I and ABIDE II. ABIDE I contains 1112 subjects (539 ASD patients vs. 573 normal controls) collected from 16 sites, and ABIDE II comprises 1114 subjects (521 ASD patients vs. 593 normal controls) collected from 19 sites. Of the 2226 ABIDE samples, we first discard 219 whose scans fail to complete all steps of the FreeSurfer preprocessing pipeline due to low image quality (Backhausen et al. 2016). In addition, FreeSurfer sometimes generates incorrect segmentations owing to low image quality and the difficulty of whole-brain reconstruction, even when the sample passes the pipeline. The segmentation quality is therefore further checked by visual inspection, and 13 subjects whose segmentations are incorrect are excluded. Finally, a total of 1994 subjects are included in the following analysis.

As with many other MRI studies, we focus on the cerebral cortex. The cerebral cortex can be regarded as a thin folded surface with heterogeneous thickness, so it is impossible to transform it into a 2D image directly. In this study, the structural MRI preprocessing pipeline of FreeSurfer is adopted to preprocess data from both HCP and ABIDE (Glasser et al. 2013). The pipeline includes the segmentation of the T1w volume, tessellation and topology correction of the initial white matter surface, spherical inflation of the white matter surface, registration to the fsaverage surface template, segmentation of sulci and gyri, pial surface generation, surface and volume anatomical parcellations, and morphometric measurements (Fischl 2012; Glasser et al. 2013). The 32k cortical meshes and vertex-wise cortical shape metrics, including thickness, sulcal depth, curvature, and myelin map, are generated from the cerebral cortex. Because ABIDE lacks T2-weighted images, the myelin map is unavailable for ABIDE.

Geometry Mapping

As mentioned above, each metric is composed of surfaces from 2 hemispheres. To adapt the 3D imaging data to 2D models, we need to map 3D cortical meshes into 2D images. A 3D mesh generated by FreeSurfer is a folded closed surface that cannot be directly mapped into a planar mesh. However, the vertices corresponding to the medial wall, which is close to the subcortical regions, have null values. We remove these vertices and thus obtain open meshes, which can in theory be mapped into a regular planar mesh using geometry mapping approaches. Conformal mapping (Wang et al. 2011) and area-preserving mapping (Su et al. 2013; Zhao et al. 2013) are 2 commonly used geometry mapping approaches that map irregular 3D meshes to regular planar meshes. The former preserves angles but introduces area distortion, whereas the latter preserves area at the cost of angle distortion (Nadeem et al. 2016). Area distortion may seriously affect the training of deep models, so we adopt the area-preserving mapping approach. For compatibility with deep learning, we map the brain image into a unit rectangle.
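The removal of medial-wall vertices can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `remove_vertices`, the list-based mesh layout, and the boolean `is_null` flags are all assumptions made for the example.

```python
def remove_vertices(vertices, faces, is_null):
    """Drop flagged vertices and every triangle that touches one of them.

    vertices: list of (x, y, z) tuples
    faces:    list of (i, j, k) vertex-index triples
    is_null:  list of bools, True where the vertex carries a null value
              (hypothetical flag standing in for the medial-wall mask)
    """
    keep = [i for i, null in enumerate(is_null) if not null]
    # Map old vertex indices to their positions in the reduced vertex list.
    new_index = {old: new for new, old in enumerate(keep)}
    kept_vertices = [vertices[i] for i in keep]
    kept_faces = [
        tuple(new_index[v] for v in face)
        for face in faces
        if all(v in new_index for v in face)
    ]
    return kept_vertices, kept_faces
```

Removing the flagged vertices together with their triangles leaves a surface with a boundary, which is what allows it to be flattened onto a planar domain.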

Supposing $S$ is the input surface mesh in $\mathbb{R}^3$ with Riemannian metric $\mathbf{g}$, there is a unique conformal mapping $\phi: S \to D$ according to the Riemann mapping theorem, where $D$ is a unit square whose 4 corners are predefined as 4 vertices equally distributed along the surface edge. Then there is a unique Brenier mapping $\tau$, which makes sure the area of each cell is preserved. The area-preserving mapping (Zhao et al. 2013) is the combination of the Riemann mapping $\phi$ and the Brenier mapping $\tau$. In practice, the conformal mapping procedure can be implemented with the discrete Ricci flow method. Supposing the vertices are $v_i$, the curvature at $v_i$ is $K_i$, and the target curvature is $\bar{K}_i$, the conformal factor is defined as $u_i$, and the discrete Ricci flow can be represented by (Wang et al. 2011; Zhao et al. 2013)

$$\frac{du_i(t)}{dt} = \bar{K}_i - K_i(t).$$

After the conformal mapping, the areas of the cells are adjusted to be close to the original areas $A_i$, where $A_i$ represents the initial area associated with $v_i$. The height vector of the cells is denoted as $\mathbf{h} = (h_1, \ldots, h_n)$. For any given measure $\mu$, there must exist a unique $\mathbf{h}$ that satisfies the area-preservation constraints for all cells. The Brenier mapping can be calculated by optimizing the corresponding energy function (Su et al. 2013). We set the target curvature to $\pi/2$ for the 4 corners and 0 for the other vertices to generate a rectangular mesh. After the geometry mapping, the 3D cortical mesh is topologically and equably mapped onto a rectangular planar mesh without any tearing or overlaps.

The acquired planar mesh cannot be directly utilized by convolutional networks, because the brain images need to have the same data form as the natural images used in the pretrained models. It is therefore necessary to transform the planar mesh into 2D images. We use a weighted triangular interpolation approach based on barycentric coordinates to avoid null values in the 2D images. The average of the points that fall within the same pixel is taken as the pixel's value.
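The conversion from a planar mesh to an image can be illustrated with a small rasterizer. This is a sketch under stated assumptions, not the authors' implementation: it assumes the planar vertices lie in the unit square, samples each pixel at its center, interpolates the vertex values with barycentric weights, and averages when more than one triangle contributes to a pixel. The names `barycentric` and `rasterize` are hypothetical.

```python
import math


def barycentric(p, a, b, c):
    """Barycentric weights of 2D point p in triangle (a, b, c), or None if degenerate."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        return None
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w_a, w_b, 1.0 - w_a - w_b


def rasterize(vertices, values, faces, size):
    """Interpolate per-vertex values of a planar mesh onto a size x size grid.

    vertices: list of (x, y) in the unit square; values: one number per vertex;
    faces: list of (i, j, k) index triples. Pixels hit by no triangle stay None.
    """
    total = [[0.0] * size for _ in range(size)]
    count = [[0] * size for _ in range(size)]
    eps = 1e-9  # tolerance so pixel centers on shared edges are not dropped
    for i, j, k in faces:
        a, b, c = vertices[i], vertices[j], vertices[k]
        xs, ys = (a[0], b[0], c[0]), (a[1], b[1], c[1])
        col_lo = max(0, math.floor(min(xs) * size))
        col_hi = min(size - 1, math.floor(max(xs) * size))
        row_lo = max(0, math.floor(min(ys) * size))
        row_hi = min(size - 1, math.floor(max(ys) * size))
        for row in range(row_lo, row_hi + 1):
            for col in range(col_lo, col_hi + 1):
                center = ((col + 0.5) / size, (row + 0.5) / size)
                w = barycentric(center, a, b, c)
                if w is None or min(w) < -eps:
                    continue  # pixel center lies outside this triangle
                total[row][col] += w[0] * values[i] + w[1] * values[j] + w[2] * values[k]
                count[row][col] += 1
    return [
        [total[r][c] / count[r][c] if count[r][c] else None for c in range(size)]
        for r in range(size)
    ]
```

Because every pixel center inside the mesh falls in at least one triangle, this scheme leaves no holes in the image interior, which is the property the interpolation step is meant to provide.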

Training with CNN

Deep Model Architecture

After the area-preserving geometry mapping, the vertex-wise cortical shape metrics are mapped to 224 × 224 images. To demonstrate the reliability of our method, we use 2 popular deep convolutional networks (i.e., ResNet-50 and DenseNet-121) for the experiments. ResNet adds skip connections between adjacent layers and learns residuals from inputs to outputs. This alleviates the vanishing gradient problem in deep networks and achieves better performance. DenseNet introduces skip connections between every pair of layers and uses concatenation instead of the summation used in ResNet. Both models have been demonstrated to be robust and efficient in image classification, and the corresponding pretrained models are widely used and available online.

Transfer Learning from ImageNet

Transfer learning is utilized to improve network performance on small datasets. Deep neural networks are first pretrained on the large-scale natural image dataset ImageNet. Then the fully connected layer is replaced to match the number of classes of the target task, and the pretrained models are fine-tuned separately for each metric using the acquired 2D images. A 10-fold cross-validation strategy is used to test the reliability of the classification performance. The samples are randomly shuffled and divided into 10 folds, without consideration of scanning site, sex, or patient/control ratio. In each experiment, we use 9 folds for training and the remaining fold for testing. Before training, the input images are normalized, and the resulting images can be formulated as $\hat{x} = (x - \mu)/\sigma$, where $\mu$ and $\sigma$ are the mean value and standard deviation of the input images, respectively.
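The shuffled 10-fold split and the normalization step can be sketched in plain Python. This is an illustrative sketch only: the function names, the fixed seed, and the choice to treat the input as a flat list of pixel values are assumptions, and the text does not specify whether the mean and standard deviation are computed per image or over the whole training set.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for a shuffled, unstratified k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def normalize(pixels):
    """Return (x - mean) / std for a flat list of pixel values (std must be nonzero)."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)
    return [(p - mean) / std for p in pixels]
```

The split deliberately ignores site, sex, and diagnosis, mirroring the unstratified shuffling described above; a stratified split would be a different design choice.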

A mix-up strategy is also introduced in the training procedure (Zhang et al. 2018). Mix-up is a widely used data augmentation method in computer vision. It uses the linear interpolation of 2 random samples and their labels as virtual samples to improve the generalization capability of the network. The mix-up can be formulated as $\tilde{x} = \lambda x_i + (1 - \lambda) x_j$ and $\tilde{y} = \lambda y_i + (1 - \lambda) y_j$, where $x_i$ and $x_j$ are randomly selected samples, $y_i$ and $y_j$ are the corresponding labels, and $\tilde{x}$ and $\tilde{y}$ are the generated sample and label, respectively. The hyperparameter $\lambda \in [0, 1]$ is used to adjust the mix ratio. We sample $\lambda$ from the uniform distribution between 0 and 1 in this study. The mix-up strategy is applied in the first half of the model training procedure to improve the representation capability of the model. We then fine-tune the networks again with the original data to improve the performance on real data.
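A minimal sketch of a single mix-up step is given below, following the formulation of Zhang et al. (2018) with the mixing coefficient drawn uniformly from [0, 1). The function name `mix_up`, the list-based sample layout, and the one-hot label encoding are assumptions made for the example.

```python
import random


def mix_up(x_i, y_i, x_j, y_j, rng=random):
    """Blend two samples and their one-hot labels with a uniformly drawn ratio."""
    lam = rng.random()  # mixing coefficient, uniform on [0, 1)
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam
```

Because the labels are blended with the same coefficient as the inputs, the mixed label remains a valid probability vector whenever the two input labels are.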

After the training and testing procedure, we obtain the results of different metrics for both hemispheres. It is necessary to ensemble the metric-level results to generate the final individual-level classification results. Instead of simple voting or weighted voting, we adopt a hierarchical model ensemble method, namely stacking, for the individual-level ensemble (Wolpert 1992). Specifically, the results of different metrics are concatenated as the new input features of the individual-level classification model. Extreme gradient boosting (XGBoost) (Chen and Guestrin 2016) is adopted as the stacking model. Compared with voting-based ensemble methods, stacking can automatically learn the weights of the input features and usually achieves better results. The hyperparameters of XGBoost are optimized with the grid search method.
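The feature construction for stacking can be illustrated as follows. This sketch only assembles the input matrix for the individual-level model and does not fit XGBoost itself. The function name, the dictionary layout, and the assumption that each base model contributes one class probability per subject are hypothetical.

```python
def stacking_features(predictions, subjects, metrics, hemispheres=("left", "right")):
    """Concatenate base-model outputs into one feature row per subject.

    predictions[(metric, hemisphere)][subject] is assumed to hold the output
    (e.g., a class probability) of the base model trained on that metric image.
    """
    return [
        [predictions[(m, h)][s] for m in metrics for h in hemispheres]
        for s in subjects
    ]
```

With 4 metrics and 2 hemispheres, each subject would be described by 8 features under this layout, and these rows would then be passed to the stacking classifier.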

Two-Stage Transfer Learning

For the classification of ASD, we introduce a 2-stage transfer learning approach for better performance. Although it is popular to apply models pretrained on large-scale natural image datasets to other fields, the domain differences between the source and target datasets still affect the effectiveness of transfer learning (Jean et al. 2016). An effective approach is to use an intermediate domain to bridge the source domain and the target domain. The model is first transferred from the source domain to the intermediate domain and then transferred from the intermediate domain to the target domain. For neuropsychiatric disorders such as ASD, sex classification of healthy people is an excellent intermediate task for several reasons. First, brain images from healthy people and ASD patients have similar features. Second, data from healthy people are usually more homogeneous than data from patients with neuropsychiatric disorders. Third, sex labels are reliable, which is vital in brain imaging analysis. In the 2-stage transfer learning, we first convert both the intermediate domain (HCP) and the target domain (ABIDE) from 3D MRI data to 2D images with the FreeSurfer pipeline. The models are transferred from ImageNet to the sex classification of healthy people with the HCP dataset. Then the acquired models are further transferred to the classification of ASD with the ABIDE dataset. In the 2-stage transfer learning framework, the models are thus fine-tuned twice, first on the intermediate domain and then on the target domain.

To interpret the classification results of the models and locate the cortical shape morphometric differences, the occlusion test is adopted to measure the importance of different regions in the classification (Zeiler and Fergus 2014). Specifically, we cover the image with a 30 × 30 black square and calculate the accuracy drop, which is regarded as the importance of the covered region in the classification. Then we move the square to the next region with a stride of 4 until the whole image is covered. We finally get occlusion test maps and resize them to the same size as the original image. We use the average of all images as the final results. The occlusion test results are then reconstructed as 3D meshes and visualized with Connectome Workbench visualization software.
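The occlusion sweep can be sketched in plain Python. This is an illustration under assumptions rather than the authors' code: `predict` is a hypothetical stand-in for the trained network, images are nested lists, the black square is represented by a fill value of 0.0, and only window positions that fit fully inside the image are visited.

```python
def occlusion_positions(size, window, stride):
    """Top-left offsets of all windows that fit fully inside one image axis."""
    return list(range(0, size - window + 1, stride))


def occlude(image, top, left, window, fill=0.0):
    """Return a copy of a square image with a window x window patch set to fill."""
    out = [row[:] for row in image]
    for r in range(top, top + window):
        for c in range(left, left + window):
            out[r][c] = fill
    return out


def accuracy(images, labels, predict):
    """Fraction of images for which predict(image) equals the label."""
    return sum(predict(img) == y for img, y in zip(images, labels)) / len(labels)


def occlusion_map(images, labels, predict, window=30, stride=4):
    """Accuracy drop caused by occluding each window position."""
    size = len(images[0])
    starts = occlusion_positions(size, window, stride)
    baseline = accuracy(images, labels, predict)
    return [
        [
            baseline - accuracy(
                [occlude(img, top, left, window) for img in images], labels, predict
            )
            for left in starts
        ]
        for top in starts
    ]
```

For a 224 × 224 image with a 30 × 30 window and a stride of 4, `occlusion_positions` yields 49 offsets per axis, so the raw map is 49 × 49 before it is resized to the image size.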

Results

Training Details

The models are trained on an Ubuntu 18.04.1 server with two 8-core Intel E5 2609 1.7 GHz processors and 4 NVIDIA GTX-V100 graphics processing units. The code is written in Python with the PyTorch framework (Paszke et al. 2019). The models pretrained on ImageNet are acquired from torchvision. Each model is trained for 125 epochs with a batch size of 64: the first 75 epochs are trained with mix-up, the remaining 50 epochs are trained with the original data, and the model from the last epoch is retained for testing. Stochastic gradient descent and cross-entropy loss are adopted for model optimization. The learning rate is set to 0.01 initially, divided by 10 every 25 epochs for the first 75 epochs, and then fixed at 0.0001 for the following 50 epochs. The momentum is set to 0.9. The hyperparameters are optimized using the grid search strategy.
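The training schedule described above can be written as two small functions. This is a sketch assuming zero-based epoch indices; the function names are hypothetical.

```python
def learning_rate(epoch):
    """Learning rate for a zero-based epoch index in the 125-epoch schedule."""
    if epoch < 75:
        # Start at 0.01 and divide by 10 after every 25 epochs.
        return 0.01 / (10 ** (epoch // 25))
    return 0.0001  # fixed for the final 50 epochs on the original data


def uses_mix_up(epoch):
    """Mix-up is applied only during the first 75 epochs."""
    return epoch < 75
```

Under this reading, epochs 50 to 74 already run at 0.0001, so the learning rate does not change when training switches from mix-up to the original data at epoch 75.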

Area-Preserving Geometry Mapping Results

The sketch diagram of area-preserving geometry mapping for cortical meshes is shown in Figure 1. To visually compare the 3D cortical meshes and the corresponding 2D images, we map the Desikan–Killiany (D–K) atlas (Desikan et al. 2006) to a 2D planar atlas (Fig. 2). Brain regions are illustrated in different colors for visualization. For the 3D atlas, a series of different views is needed to show the complete information of the whole brain, and some regions are still hard to observe due to the complex folding of the cerebral cortex. In contrast, our 2D atlas avoids these disadvantages and shows the whole brain without occlusion in one view, which demonstrates the potential of our method for the visualization of brain images.

Figure 1. Overview of the proposed framework. FreeSurfer is used to generate 3D cortical meshes and vertex-wise cortical shape metrics. The 3D mesh is then converted into a planar mesh using area-preserving geometry mapping. Different metrics are calculated using transfer learning with ImageNet, and then the results are further ensembled with a stacking approach to get individual-level results. The sketch diagram of the geometry mapping is also displayed. The red points distributed evenly on the edge are the selected corners.

Figure 2. The 3D D–K atlas (A) with different views and the 2D D–K atlas (B) generated by our method. The corresponding brain regions of the 2 atlases are shown in the same colors.

Sex Classification Results

Sex classification is a fundamental problem in brain imaging analysis. There has been a long debate about whether male and female brains are distinguishable, and many studies attempt to address the problem with machine learning methods (Weis et al. 2020). We perform the sex classification task on the HCP dataset, and the results are shown in Table 1. Two different deep models are adopted to measure the effectiveness of the proposed method. For comparison, we first test the models trained from scratch and achieve 89.67% accuracy for ResNet and 92.99% accuracy for DenseNet. We then test transfer learning by transferring the models pretrained on the source dataset (ImageNet) to the target dataset (HCP). Transfer learning achieves an accuracy of 94.34% for ResNet and 95.06% for DenseNet, corresponding to improvements of 4.67 and 2.07 percentage points in accuracy, respectively. The results demonstrate that transfer learning can significantly boost classification performance. The receiver operating characteristic (ROC) curves and confusion matrices are also provided. The proposed method achieves a best area under the ROC curve (AUC) of 0.9854. The high accuracy of our experiment on sex classification demonstrates the effectiveness of our framework and suggests that males and females are distinguishable with the cortical shape metrics revealed by structural MRI.

Table 1. Sex classification results of cerebral cortex based on HCP

Methods               Acc (%)   Sen (%)   Spc (%)   AUC
ResNet (transfer)     94.34     95.21     93.29     0.9832
DenseNet (transfer)   95.06     95.87     94.08     0.9854

Acc, Sen, and Spc refer to accuracy, sensitivity, and specificity, respectively. The transfer refers to transfer learning from ImageNet to HCP. The best accuracy, sensitivity, specificity, and AUC are shown in bold.
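For reference, the three reported metrics follow the standard confusion-matrix definitions, sketched below. The function name is hypothetical, and the text does not state which class is treated as positive, so the assignment of classes to `tp`, `fn`, `tn`, and `fp` is left to the caller.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }
```

As a usage example with made-up counts, `classification_metrics(8, 2, 9, 1)` describes a classifier that recovers 8 of 10 positives and 9 of 10 negatives.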

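The results of the individual metrics, for which stacking is also used, are shown in Figure 3. The results from the 2 hemispheres are concatenated as the input of the individual-level classifier. With transfer learning, the metric-level accuracies of ResNet and DenseNet increase by 4.49–11.01% and 1.53–4.13%, respectively. The significant improvement from transfer learning on every metric demonstrates the stability and effectiveness of our framework. The myelin map achieves the best accuracy. Curvature performs worse than the other metrics but gains the most significant improvement from transfer learning.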

Figure 3. Classification results of single metric on sex classification (left) and ASD classification (right). In sex classification, the results on thickness, sulcal depth, curvature, and myelin map are shown to investigate the effectiveness of transfer learning under our framework. In ASD classification, the results of thickness, sulcal depth, and curvature are shown to observe performance improvement with the proposed method. The “transfer” represents the transfer learning from ImageNet to the target dataset, and the “2 stage” refers to the transfer learning from ImageNet to ABIDE with HCP as the intermediate domain.

Moreover, we explore the effects of total intracranial volume (TIV) on sex classification (Sanchis-Segura et al. 2020) under our framework. The results show that our framework still works well with matched TIV.

ASD Classification Results

We further apply our method to a multisite ASD dataset (ABIDE) to distinguish patients from healthy controls. Due to the lack of T2-weighted images, the myelin maps are not available for the ABIDE dataset, so we only use thickness, sulcal depth, and curvature for the classification of ASD. The results are shown in Table 2.

Table 2. ASD classification results of cerebral cortex on ABIDE

Methods                Acc (%)   Sen (%)   Spc (%)   AUC
PCA + SVM              58.12     48.99     66.82     0.6102
DenseNet (slice)       61.83     52.83     69.86     0.6693
DenseNet (3D volume)   62.39     54.53     69.38     0.6355
DenseNet (3D mesh)     61.13     53.24     68.15     0.6438
ResNet (transfer)      65.89     60.28     70.90     0.6996
DenseNet (transfer)    65.59     57.29     72.99     0.7018
ResNet (2-stage)       67.70     62.73     72.13     0.7199
DenseNet (2-stage)     67.85     61.66     73.36     0.7237

The transfer refers to direct transfer learning from ImageNet to ABIDE, whereas the 2 stage represents the ImageNet-HCP-ABIDE transfer learning strategy. The PCA + SVM, DenseNet (slice), and DenseNet (3D volume) are based on the 3D cerebral cortex for a fair comparison. The DenseNet (3D mesh) is based on the 3D cortical shape metrics. The best accuracy, sensitivity, specificity, and AUC are shown in bold.

We first train and test ResNet and DenseNet from scratch and obtain accuracies of 63.04% and 63.64%, respectively. We then test direct transfer learning from ImageNet to the ABIDE dataset and achieve accuracies of 65.89% for ResNet and 65.59% for DenseNet. Next, the 2-stage transfer learning is tested based on the hypothesis that sex classification on healthy people can provide valuable features for the diagnostic classification of neuropsychiatric disorders. The models are first transferred from ImageNet to HCP and then further transferred to ABIDE. The 2-stage transferred ResNet and DenseNet achieve accuracies of 67.70% and 67.85%, respectively. Compared with training from scratch, direct transfer learning from ImageNet increases accuracy by 2.85 percentage points for ResNet and 1.95 for DenseNet, whereas the 2-stage transfer learning achieves improvements of 4.66 percentage points for ResNet and 4.21 for DenseNet. The best AUC score is 0.7237.

Moreover, we validate transfer learning from HCP to ABIDE to investigate the role of sex classification in 2-stage transfer learning. ResNet and DenseNet achieve accuracies of 65.44% and 65.25%, respectively, indicating that the pretraining on sex classification is helpful for the ASD classification.

The results of thickness, sulcal depth, and curvature are shown in Figure 3. The results of single metrics are consistent with the individual-level results. Direct transfer learning achieves better results than training from scratch, whereas the 2-stage transfer learning reaches the highest accuracies in all metrics. Thickness seems to perform better in the ASD classification, whereas sulcal depth and curvature achieve considerable performance. Compared with direct transfer learning, 2-stage transfer learning brings more performance improvements in thickness and sulcal depth.

Moreover, we use leave-one-site-out cross-validation to investigate the performance of different models on unseen sites, which can further demonstrate the generalization ability. In each experiment, one site is used as the testing set, and the rest are used as the training set. In this setting, the model trained from scratch, direct transfer learning, and 2-stage transfer learning achieve lower accuracies (61.34%, 63.45%, and 65.41%, respectively) than in 10-fold cross-validation. This is reasonable because testing on an unseen site is usually more difficult. However, the classification performance still benefits from transfer learning and 2-stage transfer learning in leave-one-site-out cross-validation, indicating the robustness and effectiveness of the proposed framework.
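The leave-one-site-out protocol can be sketched as a simple index generator. The function name and the per-subject list of site labels are assumptions made for the example.

```python
def leave_one_site_out(sites):
    """Yield (held_out_site, train_indices, test_indices) for each distinct site.

    sites: one site label per subject, in subject order.
    """
    for held_out in sorted(set(sites)):
        train = [i for i, s in enumerate(sites) if s != held_out]
        test = [i for i, s in enumerate(sites) if s == held_out]
        yield held_out, train, test
```

Unlike the shuffled 10-fold split, no subject from the held-out site ever appears in the training set, which is what makes this a test of generalization to unseen scanners and protocols.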

Comparison With Other Methods on ASD Classification

Many methods have been used for ASD classification based on the ABIDE dataset (Sabuncu et al. 2015; Aghdam et al. 2018; Monté-Rubio et al. 2018; Arya et al. 2020; Shahamat and Abadeh 2020). However, the sample sizes and brain features used in these studies vary considerably, making direct comparison difficult. To better assess the proposed method and make a fair comparison, we compare the proposed framework with 4 other methods, namely the support vector machine (SVM) (Chang and Lin 2011), slice-based 2D CNN, volume-based 3D CNN, and mesh-based 3D CNN, using identical samples and brain features. Since ResNet and DenseNet have comparable performance, we only test the corresponding methods using the DenseNet architecture. We only consider the cerebral cortex in these experiments for a fair comparison.

The results of these methods are shown in Table 2. SVM is one of the classic methods for brain MRI. The data are preprocessed and reshaped into a vector, and principal component analysis (PCA) (Wold et al. 1987) is adopted for feature extraction. An accuracy of 58.12% is achieved with SVM, which is significantly lower than that of the proposed 2-stage transfer learning, indicating the superiority of our framework.

Some studies use 2D CNNs to analyze brain images by cutting them into slices. Similarly, in the slice-based DenseNet, the 3D brain images are cut into slices in 3 directions, and the pretrained models are fine-tuned using the acquired slices. The results of the slices are finally ensembled with stacking. The mix-up, stacking, and transfer learning strategies are adopted to ensure a fair comparison. Although brain MRI is 3D, the slice-based 2D models analyze the slices of one subject independently, leading to the loss of structural information (Khodatars et al. 2020; Wen et al. 2020). Moreover, the conversion leads to the loss of interslice information, resulting in a suboptimal accuracy of 61.83%. In the volume-based 3D DenseNet, we use a 3D model with the same depth as the 2D model and obtain an accuracy of 62.39%. Mix-up is also adopted in the 3D model. Compared with these 2 methods, our method improves accuracy by 6.02 and 5.46 percentage points, respectively, demonstrating the superiority of our framework.

Moreover, we train a 3D DenseNet using 3D cortical shape metrics to further examine the influence of different distance measurement methods. We resample the 3D surface meshes of each shape metric (thickness, sulcal depth, and curvature) into a 3D matrix and then train 3D DenseNets on the obtained 3D matrices. We use the same training strategies as for the 2D models, including mix-up and stacking, for a fair comparison. The 3D model uses the Euclidean distance directly and achieves an accuracy of 61.13%, which is lower than that of our framework.

Ablation Study

We investigate the effectiveness of each module in our model with an ablation study (Table 3). DenseNet is adopted as the base model. The stacking and mix-up strategies are investigated. We combine stacking and mix-up with the base model to get new models. Then the acquired models are tested with the ABIDE dataset. The stacking and mix-up bring improvements of 1.50% and 0.70% accuracy, respectively, demonstrating the effectiveness of the stacking and mix-up training strategies.

Table 3. Results of ablation study

Stacking   Mix-up   Acc (%)   Sen (%)   Spc (%)   AUC

All experiments are based on the proposed 2-stage transfer learning framework. The average of all metrics is used as the result when the stacking is not adopted. The best results are shown in bold.

As mentioned above, we utilize the occlusion test to visualize the critical regions for sex classification and ASD classification. We average the results of all subjects to obtain the group-level differences. The results of thickness, sulcal depth, curvature, and myelin map are calculated, respectively. The most critical regions are shown in Figure 4. The red color indicates highly discriminative brain regions, whereas the blue color denotes less discriminative regions.
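The idea behind an occlusion test can be sketched as follows: a patch of the input image is masked, and the resulting drop in the model's score is recorded as the importance of that patch. The patch size, the fill value, and the toy scoring function below are assumptions for illustration; the settings actually used in this study are not restated in this section.

```python
def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Return a heat map of score drops caused by occluding each patch."""
    h, w = len(image), len(image[0])
    base = score_fn(image)
    heat = [[0.0] * w for _ in range(h)]
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            occluded = [row[:] for row in image]
            rows = range(top, min(top + patch, h))
            cols = range(left, min(left + patch, w))
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = base - score_fn(occluded)  # large drop = important patch
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat

def toy_score(img):
    """Hypothetical stand-in for a classifier: responds only to the top-left 2x2 block."""
    return sum(img[r][c] for r in range(2) for c in range(2)) / 4.0

image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(image, toy_score)
```

Averaging such maps over all subjects, as described above, gives a group-level map in which high values correspond to the highly discriminative regions.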

Figure 4. Visualization of discriminative regions for sex classification and ASD classification using occlusion test. The most discriminative brain regions are marked. The red regions contribute more to classification.


Discussion

Methodology

In this study, we use deep learning for MRI analysis for several reasons. As datasets grow, the representational ability of traditional machine learning methods reaches its limits. In contrast, CNN models have many trainable parameters and show a better capability for fitting high-dimensional data such as brain images. Traditional machine learning algorithms such as SVM usually depend on hand-crafted features, whereas CNN models can automatically extract local features and distributed representations without complicated feature engineering, which is of great significance for finding brain differences and searching for neurological biomarkers. Moreover, deep learning can be readily adapted to different tasks. Transfer learning techniques for deep learning are mature, so pretrained models can be used in small-sample tasks for better performance.


Mapping the 3D cerebral cortex to 2D images also brings several benefits. The complex geometry and folding patterns of the cerebral cortex hinder its analysis. For example, 2 voxels that are adjacent in Euclidean space may be anatomically or functionally segregated because of the non-Euclidean geometry of the folded cerebral cortex. Current CNN models are based on Euclidean distance and ignore the structural features of the cerebral cortex, an approach that is straightforward but coarse. The signatures from different brain regions are mixed after the convolutional operations, which hinders the search for diagnostic biomarkers. Unlike current deep learning models, the proposed framework uses the distance along the cortical surface to measure the relative positions of different brain regions, which is more neurobiologically relevant. Compared with other models based on Euclidean distance, our framework can better preserve the structural layout of the brain and obtain results with neurobiological significance. Moreover, many 2D pretrained models are available for transfer learning under our framework, which can significantly improve network performance. The converted 2D images are also substantially compressed relative to the 3D raw data while keeping the most valid information, leading to high efficiency during model training. FreeSurfer also provides a 2D mapping, but the topological structure is not preserved because the mapped cerebral cortex is torn. The 2D image mapped by FreeSurfer is irregular, which is detrimental to the training of deep models. Finally, although we use FreeSurfer for preprocessing in this paper, our framework is also compatible with other preprocessing tools such as CIVET (MacDonald et al. 2000) and FastSurfer (Henschel et al. 2020), provided that they can generate similar cortical surface meshes and vertex-wise shape metrics.
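The gap between Euclidean distance and distance along the cortical surface can be made concrete with a toy example. The sketch below models a sulcus as a simple 2D polyline (a deliberately simplified, hypothetical geometry, not a real cortical mesh) and compares the straight-line distance between the two banks with the length of the path that follows the fold.

```python
import math

def path_length(points):
    """Sum of segment lengths along a polyline, a stand-in for along-surface distance."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

# Hypothetical sulcus: down one bank (5 units), across the fundus (1 unit), up the other bank.
fold = [(0.0, 0.0), (0.0, -5.0), (1.0, -5.0), (1.0, 0.0)]

straight = math.dist(fold[0], fold[-1])  # Euclidean distance between the two crowns
along = path_length(fold)                # distance following the folded surface
```

Here the two end points are 1 unit apart in Euclidean space but 11 units apart along the surface, which is the kind of discrepancy that a convolution defined on a voxel grid ignores.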


Transfer Learning From ImageNet

The training of deep models requires a large-scale dataset, which is a problem in brain imaging analysis because data collection is expensive and time-consuming. Many studies have successfully applied models pretrained on natural images to medical image classification. However, natural images are 2D and planar, whereas brain images are 3D, which hinders the application of transfer learning in brain imaging studies. Our solution is to transform the cerebral cortex into 2D images, making transfer learning from natural images to brain images applicable. In this study, the models are first pretrained on a large natural image dataset, ImageNet, and then fine-tuned with the converted 2D brain images. We demonstrate the effectiveness of transfer learning from natural images to MRI data through a robust and significant performance improvement in both sex classification and ASD classification. The success of the proposed deep transfer learning framework is expected and reasonable. Even though brain images differ from natural images, the pretrained model can provide universal features, especially low-level features, which can be effectively reused in brain imaging analysis. In the converted 2D brain images, morphometric features such as sulcal depth can be regarded as a kind of textural feature in the image processing sense. Models pretrained on natural images have learned a great deal about texture and are good at extracting texture features, which helps in learning the brain morphometric features. Transfer learning also provides models with suitable initial parameters and makes the network easier to converge on small-scale datasets.


Two-Stage Transfer Learning

The ABIDE dataset used in this study is collected from over 30 sites. The differences in scanning machines and experimental parameters usually introduce intersite data heterogeneity (Chen et al. 2015). Compared with single-site ASD classification studies, such multisite studies are more difficult (Katuwal et al. 2016). In contrast with many other studies that use part of the ABIDE dataset, our study involves all available subjects from ABIDE-I and ABIDE-II, which is more challenging but fairer. We have demonstrated that transfer learning from natural image classification to ASD classification works. To further improve the performance on ABIDE, we propose a 2-stage transfer learning framework using the HCP as an intermediate domain and achieve higher accuracy than direct transfer learning from ImageNet. HCP is an excellent intermediate dataset because of its good data quality and better homogeneity. As mentioned above, direct transfer learning from ImageNet can provide useful low-level features. Compared with ImageNet, HCP is more similar to the ABIDE data. A reasonable hypothesis is that the distribution gap between natural images and brain images of individuals with ASD can be narrowed by first using brain images of healthy people. The fine-tuning on HCP makes the features more suitable for ASD classification. To validate our hypothesis, we calculate the filter variation of different convolutional layers (). Compared with direct transfer learning, the second-stage transfer learning from HCP to ABIDE shows a smaller variation in both low-level and high-level filters, suggesting that the first-stage transfer learning brings benefits in both low-level and high-level features. Moreover, sex classification is an excellent intermediate task. It is well known that sex is closely related to some neuropsychiatric disorders. For example, ASD is 4 times more common in males than in females. Previous studies have investigated the sex differences in ASD and emphasized the critical role of sex in ASD studies (Lawrence et al. 2020). The models fine-tuned on the HCP dataset have learned sex-related features, which may be helpful in the classification of ASD, considering the sex bias of the disorder. The unambiguous sex labels of HCP also ensure the reliability of the first-stage transfer learning. In addition, we evaluate the effect of age by dividing the ABIDE subjects into 2 groups according to whether the subject's age falls within the age range covered by HCP. There is no significant difference between the 2 groups in the performance improvement obtained with 2-stage transfer learning. Our 2-stage transfer learning provides a new approach to the small-sample problem in the classification of neuropsychiatric disorders by using both natural images and other brain imaging data.
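One way to quantify how much the filters of a layer change during fine-tuning is sketched below. The exact definition of filter variation used in this study is not given in this section, so the relative L2 change used here, the function names, and the toy weights are all assumptions for illustration.

```python
import math

def relative_change(before, after):
    """Relative L2 change ||after - before|| / ||before|| of a flattened filter bank (assumed metric)."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(after, before)))
    norm = math.sqrt(sum(b ** 2 for b in before))
    return diff / norm

def layer_variation(weights_before, weights_after):
    """Map each layer name to the relative change of its weights after fine-tuning."""
    return {name: relative_change(weights_before[name], weights_after[name])
            for name in weights_before}

# Toy flattened weights for a low-level and a high-level layer (hypothetical values).
before = {"conv_low": [3.0, 4.0], "conv_high": [1.0, 0.0]}
after = {"conv_low": [3.0, 4.5], "conv_high": [0.0, 0.0]}
variation = layer_variation(before, after)
```

Under this assumed metric, a smaller value for a layer means that fine-tuning moved its filters less, which corresponds to the kind of comparison between direct and second-stage transfer described above.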

The brain differences between males and females have been explored in many previous studies. In this study, we identify critical regions for sex classification based on 4 metrics, that is, thickness, sulcal depth, curvature, and myelin map. The occlusion test maps directly visualize the importance of different brain regions in the sex classification. As shown in Figure 4, we obtain 4 different occlusion test maps, one per metric, and some discriminative regions are shared across metrics. We find that the superior frontal cortex, superior parietal cortex, supramarginal cortex, paracentral cortex, precuneus, temporal pole, and right lingual cortex exhibit higher discriminative power in the sex classification; these regions have also been reported in previous studies using different imaging modalities. One of our previous studies reported sex-related structural and functional differences in the superior frontal cortex, right supramarginal cortex, right lingual cortex, and left superior parietal cortex (Wang et al. 2012). Gray matter volume differences in the precuneus and temporal pole have been demonstrated in previous studies (Ruigrok et al. 2014). Sex differences in the superior frontal cortex, superior parietal cortex, and paracentral cortex have also been observed during risk-taking tasks (Lee et al. 2009). The identified brain regions are closely related to behavioral or cognitive differences between males and females. The superior frontal cortex is a vital brain region involved in various cognitive and motor tasks, including working memory, self-awareness, and attention (Li et al. 2013). The superior parietal cortex is mainly involved in visuospatial and attention processing and in long-term and working memory (Koenigs et al. 2009). The lingual cortex is closely related to vision tasks and word processing (Mechelli et al. 2000). The precuneus participates in visuospatial imagery, episodic memory retrieval, and self-processing operations (Cavanna and Trimble 2006). The temporal pole is linked to social and emotional processing (Snowden et al. 2004). The paracentral cortex controls the motor and sensory innervations of the lower extremity, including muscles and the urinary bladder (Spasojević et al. 2013).

ASD is a developmental disorder characterized by difficulty in social interactions, verbal and nonverbal communication deficits, stereotyped activities, and limited interests (Lord et al. 2018). In this study, we find that the superior frontal cortex, precentral cortex, postcentral cortex, inferior temporal cortex, middle temporal cortex, left superior temporal cortex, and right fusiform are critical regions in the classification of ASD in more than 2 metrics. These regions have also been investigated in other studies. For example, the curvature and folding index features from the frontal and temporal cortices are dominant in the early detection of ASD (Katuwal et al. 2016). Differences in the right inferior temporal cortex and right fusiform are reported between ASD patients and normal controls (Shahamat and Abadeh 2020). A functional magnetic resonance imaging (fMRI) study reveals group differences in the development of the superior temporal cortex (Prigge et al. 2013). Changes in the postcentral cortex, precentral cortex, and superior frontal cortex are also reported (Chen et al. 2015). The frontal cortex is thought to be related to high-order cognition, social and emotional functions, and language, all of which are deficient in ASD (Carper and Courchesne 2005). Some studies report motor function abnormalities in ASD, which are thought to be related to the precentral and postcentral cortices (Müller et al. 2001). Temporal regions are related to social perception, language, and the “theory of mind,” which are impaired in ASD (Gendry Meresse et al. 2005). The fusiform plays an essential role in face perception, which is a key feature of normal social functioning in humans. However, the fusiform cortex is found to be hypoactive in patients with ASD, which may cause abnormalities in face perception and social interactions (van Kooten et al. 2008).

Our results are consistent with the conclusions of previous sex and ASD studies, validating the reliability of our results. It should also be noted that the brain regions with group differences are observed in different metric maps, indicating that the alterations of these brain regions are stable.


Limitations and Future Work

Although the proposed framework has made some progress, several limitations remain. First, a fair comparison with other studies is lacking because of differences in sample sizes and brain features. Second, the cerebellum and subcortical regions are not included in the analysis; including them may further improve the classification performance. Recent studies have shown that the cerebellum and subcortical regions can also be converted into surface meshes (Chye et al. 2019; Sereno et al. 2020), and we will follow up on the relevant studies to complete our framework. Third, we only use structural MRI in this study, but the proposed framework can be extended to functional MRI. Combining structural and functional MRI may achieve better performance. We will extend our framework to fMRI data in the future.

In this paper, we propose a framework that maps the 3D cerebral cortex to 2D images with geometry mapping and facilitates transfer learning from natural images to brain images. In this way, the mature algorithms and techniques developed for 2D images in computer vision can be easily applied to brain image analysis. The topological information of brain structure is preserved, which benefits cortical visualization and neurobiological analysis. We validate the effectiveness of our framework on sex and ASD classification with both traditional transfer learning and a novel 2-stage transfer learning and achieve significant performance improvement. The proposed framework applies 2D pretrained models to cortical shape-based classification, shedding new light on brain image analysis.



Conflict of Interest: None declared.

National Key Research and Development Program (grant 2018YFB1305101); the National Natural Science Foundation of China (grants 62036013, 61722313, 61773391); the Science & Technology Innovation Program of Hunan Province (grant 2018RS3080).

