Test-retest reliability of FreeSurfer measurements within and between sites: Effects of visual approval process

Zafer Iscan, Tony B. Jin, Alexandria Kendrick, Bryan Szeglin, Hanzhang Lu, Madhukar Trivedi, Maurizio Fava, Patrick J. McGrath, Myrna Weissman, Benji T. Kurian, Phillip Adams, Sarah Weyandt, Marisa Toups, Thomas Carmody, Melvin McInnis, Cristina Cusin, Crystal Cooper, Maria A. Oquendo, Ramin V. Parsey, Christine DeLorenzo

Research output: Contribution to journal › Article


Abstract

In the last decade, many studies have used automated processes to analyze magnetic resonance imaging (MRI) data, such as cortical thickness, one indicator of neuronal health. Due to the convenience of image-processing software (e.g., FreeSurfer), standard practice is to rely on automated results without performing visual inspection of intermediate processing. In this work, structural MRIs of 40 healthy controls who were scanned twice were used to determine the test-retest reliability of FreeSurfer-derived cortical measures in four groups of subjects: the 25 who passed visual inspection (approved), the 15 who failed visual inspection (disapproved), the combined group, and a subset of 10 subjects (Travel) whose test and retest scans occurred at different sites. Test-retest correlation (TRC), intraclass correlation coefficient (ICC), and percent difference (PD) were used to measure reliability in the Destrieux and Desikan-Killiany (DK) atlases. In the approved subjects, the reliability of cortical thickness/surface area/volume (DK atlas only) was: TRC (0.82/0.88/0.88), ICC (0.81/0.87/0.88), and PD (0.86/1.19/1.39), a significant improvement over these measures when disapproved subjects are included. The Travel subjects' results show that cortical thickness reliability is more sensitive to site differences than cortical surface area and volume. To determine the effect of visual inspection on the sample size required for studies of MRI-derived cortical thickness, the number of subjects required to show group differences was calculated. Significant differences observed across imaging sites, between visually approved and disapproved subjects, and across regions of different sizes suggest that these measures should be used with caution.
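The three reliability metrics named in the abstract can be sketched in a few lines of Python. The abstract does not specify which ICC variant or exact percent-difference formula the authors used, so the one-way ICC(1,1) form, the symmetric percent-difference definition, and the toy thickness values below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def trc(test, retest):
    """Test-retest correlation: Pearson r between the two sessions."""
    return np.corrcoef(test, retest)[0, 1]

def icc_1_1(test, retest):
    """One-way random-effects ICC(1,1) for two measurements per subject.
    Assumption: the paper's ICC variant is not stated in the abstract."""
    x = np.column_stack([test, retest])              # n subjects x k=2 sessions
    n, k = x.shape
    subj_means = x.mean(axis=1)
    grand_mean = x.mean()
    msb = k * np.sum((subj_means - grand_mean) ** 2) / (n - 1)   # between-subject MS
    msw = np.sum((x - subj_means[:, None]) ** 2) / (n * (k - 1)) # within-subject MS
    return (msb - msw) / (msb + (k - 1) * msw)

def percent_difference(test, retest):
    """Mean absolute session difference as a percentage of the per-subject
    mean (one common definition; the paper's exact formula is assumed)."""
    return np.mean(2 * np.abs(test - retest) / (test + retest)) * 100

# Toy cortical-thickness values (mm) for five hypothetical subjects
test = np.array([2.4, 2.6, 2.3, 2.7, 2.5])
retest = np.array([2.45, 2.55, 2.35, 2.65, 2.50])
print(trc(test, retest), icc_1_1(test, retest), percent_difference(test, retest))
```

With identical test and retest vectors, TRC and ICC both reach 1.0 and PD falls to 0; noisier retest values pull TRC/ICC down and push PD up, which is the pattern the disapproved group shows in the paper.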

Original language: English (US)
Pages (from-to): 3472-3485
Number of pages: 14
Journal: Human Brain Mapping
Volume: 36
Issue number: 9
DOIs: https://doi.org/10.1002/hbm.22856
State: Published - Sep 1 2015

Keywords

  • Cerebral cortical surface area
  • Cerebral cortical thickness
  • Cerebral cortical volume
  • FreeSurfer
  • Multisite MRI
  • Test-retest reliability

ASJC Scopus subject areas

  • Anatomy
  • Radiological and Ultrasound Technology
  • Radiology Nuclear Medicine and imaging
  • Neurology
  • Clinical Neurology


Cite this

Iscan, Z., Jin, T. B., Kendrick, A., Szeglin, B., Lu, H., Trivedi, M., Fava, M., McGrath, P. J., Weissman, M., Kurian, B. T., Adams, P., Weyandt, S., Toups, M., Carmody, T., McInnis, M., Cusin, C., Cooper, C., Oquendo, M. A., Parsey, R. V., & DeLorenzo, C. (2015). Test-retest reliability of FreeSurfer measurements within and between sites: Effects of visual approval process. Human Brain Mapping, 36(9), 3472-3485. https://doi.org/10.1002/hbm.22856