GSP1000 Preprocessed Connectome

Version 3.0

Cohen, Alexander; Soussand, Louis; McManus, Peter; Fox, Michael, 2020, "GSP1000 Preprocessed Connectome", https://doi.org/10.7910/DVN/ILXIKS, Harvard Dataverse, V3, UNF:6:NaHMV/y3utLZ3D9PE2tRiw==


Dataset Metrics

11,200 Downloads

Description	The GSP1000 Processed Connectome is derived from data acquired by the Brain Genomics Superstruct Project (GSP), which contained 1570 subjects in total (ages 18-36). From this dataset, 1000 subjects ~~(1:1 M/F)~~ were chosen and processed using publicly available tools to generate a normative functional connectivity dataset. For each subject, this release contains one T1w anatomical image warped to the MNI152 2mm isovolumetric space distributed with FSL and either one or two preprocessed resting-state fMRI BOLD runs. This dataset was created to provide a new version of the "Yeo1000" connectome, which was created 10 years ago with software that is no longer available (Yeo et al., J Neurophys 2011) but has been used by a number of laboratories and research groups. Of note, the CBIG pipeline has been slightly modified, as have the default pipeline settings, to approximate those used for the original "yeo1000" dataset. Our modified pipeline, the configuration file, and code to apply this pipeline to BIDS-formatted data are all linked below. (2020-11-12)
We have updated the GSP1000 cohort to consist of a fully 1:1 M:F dataset. The original version consisted of 346:654 M:F participants, while the earlier unreleased "yeo1000" dataset consisted of 426:574 M:F participants. All data in both versions are usable, and the associated text files delineate the entire GSP1570 cohort. Sample pairwise correlations and correlation maps are not substantially different at the group average (n=1000) level, with a very high degree of spatial correlation between "same-seed" functional connectivity maps. (2021-03-22)
File GSP1000/GSP1000_v2_16.tar was made incorrectly (it contained only 1 participant's data). This has been corrected. (2021-04-17)
Subject	Medicine, Health and Life Sciences
Related Publication	(Derived from): Buckner, Randy L.; Roffman, Joshua L.; Smoller, Jordan W., 2014, Brain Genomics Superstruct Project (GSP), Harvard Dataverse, V10.4, doi:10.7910/DVN/25833
Notes	## Data Acquisition:
The original GSP data were acquired on matched Siemens 3T MAGNETOM Tim Trio MRI systems (Erlangen, Germany) using the vendor-supplied 12-channel phased-array head coil. Sequences, parameters, and instructions were unchanged throughout the collection process. However, not all subjects were acquired on the same scanner, as five different scanners were used. In addition, during the scanning period, the scanner console changed from B13 to B15 to B17. The scanner (Scanner_Bin) and console version (Console) for each imaging session are available within the CSV files included in the original data release (GSP_list_140630.csv and GSP_retest_140630.csv). The test-retest data include individuals scanned twice on different scanners and across different console versions, and may be useful in assessing any subtle differences. Finally, as a precaution, the original GSP authors highly recommend regressing scanner and console out of analyses.

## Data Processing:
1. GSP subjects included in this connectome were chosen based on a combined 3-way score of normalized DVARS, normalized Entropy Focused Criterion (EFC), and normalized Framewise Displacement (FD) values generated from mriqc (https://mriqc.readthedocs.io/en/stable/). This is consistent with the goal of minimizing the effects of motion without "scrubbing" (Power et al. 2012), as scrubbing was not standard practice for the original "yeo1000" dataset (Yeo et al., J Neurophys 2011). The 1000 GSP subjects with the best combined scores were selected for connectome inclusion. If a subject has 2 resting-state BOLD runs, the 2 motion quality values are averaged to produce a single value.
2. Anatomical surfaces were created with FreeSurfer v4.5.
3. Functional preprocessing was then performed with a slightly modified version of Thomas Yeo's Computational Brain Imaging Group (CBIG) fMRI preprocessing pipeline to generate the BOLD runs used in the connectome (https://github.com/bchcohenlab/CBIG/blob/master/README.md).

Note: Our "config file" for the CBIG pipeline is also included in the current upload to facilitate methodological replication, and the CBIG preprocessing scripts themselves can be freely accessed here: https://github.com/bchcohenlab/CBIG/tree/master/stable_projects/preprocessing/CBIG_fMRI_Preproc2016.

## License:
This data upload abides by the original GSP Open Access Data Use Terms agreement released with the initial GSP project. All data contained in this upload are deidentified and defaced, and no code (or other analysis technique) used to process the data endangers the security of subjects' Protected Health Information or any additional confidential information in a way that would violate the Terms of the agreement.

Given the size of this dataset, you may find better performance downloading it from Harvard Dataverse using the available command line tools:
`# The API token is obtainable from your profile page after you log in to Harvard Dataverse`
`export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`
`export SERVER_URL=https://dataverse.harvard.edu`
`export PERSISTENT_ID=doi:10.7910/DVN/ILXIKS`
`export VERSION=2.1`
`curl -O -J -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/access/dataset/:persistentId/versions/$VERSION?persistentId=$PERSISTENT_ID"`
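
The subject selection described in Data Processing step 1 can be illustrated with a small shell sketch. This is not the authors' code: the notes do not state how the three metrics were normalized or combined, so the sketch assumes z-scoring each metric across subjects, summing the three z-scores, and treating lower totals as better. The input layout (a tab-separated file with a header row and the columns subject, run, dvars, efc, fd) and the function name `rank_subjects` are likewise hypothetical.

```sh
# rank_subjects FILE N
# Prints the N subjects with the lowest combined score, one per line,
# as "subject<TAB>score".
# ASSUMPTIONS (not from the dataset notes): z-score normalization,
# unweighted sum of the three z-scores, lower is better.
rank_subjects() {
  awk -F'\t' '
    NR > 1 {
      # Accumulate per-subject sums so that two runs can be averaged.
      if (!($1 in n)) order[++count] = $1
      n[$1]++
      for (m = 0; m < 3; m++) sum[$1, m] += $(3 + m)
    }
    END {
      for (m = 0; m < 3; m++) {
        total = 0; sq = 0
        # Average each metric over the runs of a subject.
        for (i = 1; i <= count; i++) {
          s = order[i]
          avg[s, m] = sum[s, m] / n[s]
          total += avg[s, m]
        }
        mean[m] = total / count
        # Population standard deviation across subjects.
        for (i = 1; i <= count; i++) {
          d = avg[order[i], m] - mean[m]
          sq += d * d
        }
        sd[m] = sqrt(sq / count)
      }
      # Combined score: sum of the three z-scores.
      for (i = 1; i <= count; i++) {
        s = order[i]; score = 0
        for (m = 0; m < 3; m++)
          if (sd[m] > 0) score += (avg[s, m] - mean[m]) / sd[m]
        printf "%s\t%.4f\n", s, score
      }
    }
  ' "$1" | sort -t "$(printf '\t')" -k2,2n | head -n "$2"
}
```

With a quality table exported from mriqc in the assumed layout, `rank_subjects qc_metrics.tsv 1000` would list the 1000 best-scoring subjects under these assumptions; `qc_metrics.tsv` is a placeholder name.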
License/Data Use Agreement	Custom Dataset Terms


1 to 10 of 26 Files
File	Type	Size	Published	Downloads	MD5
GSP1000/dataset_description.json	JSON	521 B	Nov 17, 2020	592	8e7954819996acba48f559cb1246e8ae
GSP1000/GSP1000.txt	Plain Text	8.8 KB	Mar 22, 2021	576	3f89fd937b283de9a0d584f52f31d599
GSP1000/GSP1000_v2_00.tar	TAR Archive	10.7 GB	Mar 22, 2021	598	d05a9ede5f23258e713324fcbb76fe18
GSP1000/GSP1000_v2_01.tar	TAR Archive	11.8 GB	Mar 22, 2021	449	97ce2177fb37e863104dc752778022ca
GSP1000/GSP1000_v2_02.tar	TAR Archive	11.6 GB	Mar 22, 2021	405	f3eb12a0feeee68dacd18d4612ee82d3
GSP1000/GSP1000_v2_03.tar	TAR Archive	11.7 GB	Mar 22, 2021	421	f65f9f96184c72ebbac609cbd4ca7d2e
GSP1000/GSP1000_v2_04.tar	TAR Archive	11.1 GB	Mar 22, 2021	404	b7a70b50e609c391ce793bf4bfbd2c32
GSP1000/GSP1000_v2_05.tar	TAR Archive	10.9 GB	Mar 22, 2021	394	dd436fb053362ff2a5fca18ae5f5f66b
GSP1000/GSP1000_v2_06.tar	TAR Archive	10.3 GB	Mar 22, 2021	401	6d76dd62c0f153f4d867387497721fbf
GSP1000/GSP1000_v2_07.tar	TAR Archive	10.9 GB	Mar 22, 2021	416	b3a944b9eaeb90d1d00f3c1bd9f05dcb
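
Because each archive is over 10 GB, it can be worth checking a finished download against the MD5 value listed for that file above. The following is a minimal sketch, assuming either `md5sum` (common on Linux) or `md5` (macOS/BSD) is installed; the function name `verify_md5` is hypothetical and not part of any Dataverse tooling.

```sh
# verify_md5 FILE EXPECTED_MD5
# Returns 0 and prints "OK" if the file's MD5 matches, otherwise
# prints a message to stderr and returns 1.
verify_md5() {
  if command -v md5sum >/dev/null 2>&1; then
    # GNU coreutils: output is "<hash>  <filename>"
    actual=$(md5sum "$1" | awk '{print $1}')
  else
    # macOS/BSD fallback: -q prints only the hash
    actual=$(md5 -q "$1")
  fi
  if [ "$actual" = "$2" ]; then
    echo "OK: $1"
  else
    echo "MISMATCH: $1 (expected $2, got $actual)" >&2
    return 1
  fi
}
```

For example, `verify_md5 GSP1000_v2_00.tar d05a9ede5f23258e713324fcbb76fe18` checks the first archive against its listed checksum.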

Citation Metadata

Persistent Identifier	doi:10.7910/DVN/ILXIKS
Publication Date	2020-11-17
Title	GSP1000 Preprocessed Connectome
Author	Cohen, Alexander (Boston Children's Hospital, Harvard Medical School; ORCID 0000-0001-6557-5866); Soussand, Louis (Boston Children's Hospital, Harvard Medical School); McManus, Peter (Boston Children's Hospital, Harvard Medical School); Fox, Michael (Brigham and Women's Hospital, Harvard Medical School)
Point of Contact	Cohen, Alexander (Boston Children's Hospital)
Related Publication	(Derived from): Buckner, Randy L.; Roffman, Joshua L.; Smoller, Jordan W., 2014, Brain Genomics Superstruct Project (GSP), Harvard Dataverse, V10.4, doi:10.7910/DVN/25833, https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/25833&version=10.4
Depositor	Cohen, Alexander
Deposit Date	2020-11-12

Dataset Terms

License/Data Use Agreement

Our Community Norms as well as good scientific practices expect that proper credit is given via citation. Please use the data citation shown on the dataset page.

Custom Dataset Terms - the following Custom Dataset Terms have been defined for this dataset.

Access and use of this derivative dataset is permitted solely under the terms of the original GSP data release. These restrictive terms of use take precedence over any less restrictive use terms that apply generally to Dataverse Network Terms of Use.

GSP Open Access Data Use Terms, Version: 2014-Apr-22

I request access to data collected as part of the Brain Genomics Superstruct Project (GSP) of Harvard University and the Massachusetts General Hospital, and I agree to the following:

1. I will not attempt to establish the identity of or attempt to contact any of the included human subjects.
2. I will not attempt to link any of the distributed data to any other data that might contain information about the included human subjects.
3. I understand that under no circumstances will the code that would link these data to Protected Health Information be given to me, nor will any additional information about individual human subjects be released to me under these Open Access Data Use Terms.
4. I will comply with all relevant rules and regulations imposed by my institution. This may mean that I need my research to be approved or declared exempt by a committee that oversees research on human subjects, e.g., my Internal Review Board or Ethics Committee. Different committees operate under different national, state, and local laws and may interpret regulations differently, so it is important to ask about this.
5. I may redistribute original GSP Open Access data and any derived data as long as the data are redistributed under these same Data Use Terms.
6. I will acknowledge the use of GSP data and data derived from GSP data when publicly presenting any results or algorithms that benefitted from their use.
   a. Papers, book chapters, books, posters, oral presentations, and all other printed and digital presentations of results derived from GSP data should contain the following wording in the acknowledgments section: "Data were provided [in part] by the Brain Genomics Superstruct Project of Harvard University and the Massachusetts General Hospital, (Principal Investigators: Randy Buckner, Joshua Roffman, and Jordan Smoller), with support from the Center for Brain Science Neuroinformatics Research Group, the Athinoula A. Martinos Center for Biomedical Imaging, and the Center for Human Genetic Research. 20 individual investigators at Harvard and MGH generously contributed data to the overall project."
   b. Authors of publications or presentations using GSP data should cite relevant publications describing the methods used by the GSP to acquire and process the data. The specific publications that are appropriate to cite in any given study will depend on what GSP data were used and for what purposes. An annotated and appropriately up-to-date list of publications that may warrant consideration is available at http://neuroinformatics.harvard.edu/gsp/
   c. The GSP as a consortium should not be included as an author of publications or presentations if this authorship would be based solely on the use of GSP data.
7. Failure to abide by these guidelines will result in termination of my privileges to access GSP data.
