GSP1000 Preprocessed Connectome (doi:10.7910/DVN/ILXIKS)

Document Description
Citation
Title:	GSP1000 Preprocessed Connectome
Identification Number:	doi:10.7910/DVN/ILXIKS
Distributor:	Harvard Dataverse
Date of Distribution:	2020-11-17
Version:	3
Bibliographic Citation:	Cohen, Alexander; Soussand, Louis; McManus, Peter; Fox, Michael, 2020, "GSP1000 Preprocessed Connectome", https://doi.org/10.7910/DVN/ILXIKS, Harvard Dataverse, V3, UNF:6:NaHMV/y3utLZ3D9PE2tRiw== [fileUNF]
Study Description
Citation
Title:	GSP1000 Preprocessed Connectome
Identification Number:	doi:10.7910/DVN/ILXIKS
Authoring Entity:	Cohen, Alexander (Boston Children's Hospital, Harvard Medical School)
	Soussand, Louis (Boston Children's Hospital, Harvard Medical School)
	McManus, Peter (Boston Children's Hospital, Harvard Medical School)
	Fox, Michael (Brigham and Women's Hospital, Harvard Medical School)
Distributor:	Harvard Dataverse
Access Authority:	Cohen, Alexander
Depositor:	Cohen, Alexander
Date of Deposit:	2020-11-12
Holdings Information:	https://doi.org/10.7910/DVN/ILXIKS
Study Scope
Keywords:	Medicine, Health and Life Sciences
Abstract:	The GSP1000 Processed Connectome is derived from data acquired by the Brain Genomics Superstruct Project (GSP), which contained 1570 subjects in total (ages 18-36). From this dataset, 1000 subjects were chosen and processed using publicly available tools to generate a normative functional connectivity dataset. This release contains one T1w anatomical image warped to the MNI152 2mm isovolumetric space distributed with FSL and either one or two preprocessed resting state fMRI BOLD runs. This dataset was created to provide a new version of the "Yeo1000" connectome, which was created 10 years ago with software that is no longer available (Yeo et al., J Neurophys 2011) but has been used by a number of laboratories and research groups. Of note, the CBIG pipeline has been slightly modified, as have the default pipeline settings, to approximate those used for the original "yeo1000" dataset. Our modified pipeline, its configuration file, and code to apply the pipeline to BIDS-formatted data are linked below.
	We have updated the GSP1000 cohort to consist of a fully 1:1 M:F dataset. The original version consisted of 346:654 M:F participants, while the earlier unreleased "yeo1000" dataset consisted of 426:574 M:F participants. All data in both versions are usable, and the associated text files delineate the entire GSP1570 cohort. Sample pairwise correlations and correlation maps are not substantially different at the group-average (n=1000) level, with a very high degree of spatial correlation between "same-seed" functional connectivity maps.
	File GSP1000/GSP1000_v2_16.tar was built incorrectly (it contained only 1 participant's data). This has been corrected.
Notes:	## Data Acquisition:
	The original GSP data were acquired on matched Siemens 3T MAGNETOM Tim Trio MRI systems (Erlangen, Germany) using the vendor-supplied 12-channel phased-array head coil. Sequences, parameters, and instructions were unchanged throughout the collection process. However, not all subjects were scanned on the same scanner, as five different scanners were used. In addition, during the scanning period, the scanner console changed from B13 to B15 to B17. The scanner (Scanner_Bin) and console version (Console) for each imaging session are available within the CSV files included in the original data release (GSP_list_140630.csv and GSP_retest_140630.csv). The test-retest data include individuals scanned twice on different scanners and across different console versions; these data may be useful in assessing any subtle differences. Finally, as a precaution, the original GSP authors highly recommend regressing scanner and console out of analyses.
	## Data Processing:
	1. GSP subjects included in this connectome were chosen based on a combined 3-way score of normalized DVARS, normalized Entropy Focused Criterion (EFC), and normalized Framewise Displacement (FD) values generated from mriqc (https://mriqc.readthedocs.io/en/stable/), consistent with the goal of minimizing the effects of motion without "scrubbing" (Power et al. 2012), as scrubbing was not standard practice for the original "yeo1000" dataset (Yeo et al., J Neurophys 2011). The 1000 GSP subjects with the best combined scores were selected for connectome inclusion. If a subject has 2 resting-state BOLD runs, the 2 motion quality values are averaged to produce a single value.
	2. Anatomical surfaces were created with FreeSurfer v4.5.
	3. Functional preprocessing was then performed with a slightly modified version of Thomas Yeo's Computational Brain Imaging Group (CBIG) fMRI preprocessing pipeline to generate the BOLD runs used in the connectome (https://github.com/bchcohenlab/CBIG/blob/master/README.md).
	-- Note: Our "config file" for the CBIG pipeline is also included in the current upload to facilitate methodological replication, and the CBIG preprocessing scripts themselves can be freely accessed here: https://github.com/bchcohenlab/CBIG/tree/master/stable_projects/preprocessing/CBIG_fMRI_Preproc2016.
	## License:
	This data upload abides by the original GSP Open Access Data Use Terms agreement released with the initial GSP project. All data contained in this upload are deidentified and defaced, and no code (or other analysis technique) used to process the data endangers the security of subjects' Protected Health Information or any additional confidential information in a way that would otherwise violate the Terms of the agreement.
	Given the size of this dataset, you may find better performance downloading it from Harvard Dataverse using the available command line tools:
	# After logging in to Harvard Dataverse, the API token is available from your profile page
	export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
	export SERVER_URL=https://dataverse.harvard.edu
	export PERSISTENT_ID=doi:10.7910/DVN/ILXIKS
	export VERSION=2.1
	# The URL is quoted so the shell does not interpret the "?" character
	curl -O -J -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/access/dataset/:persistentId/versions/$VERSION?persistentId=$PERSISTENT_ID"
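The subject-selection rule in step 1 of Data Processing can be sketched as follows. This record does not state how the three metrics were normalized or combined, so the sketch assumes z-scoring across subjects and an unweighted sum, with lower values meaning better quality. The function names and the input layout are hypothetical; only the averaging of a subject's two runs and the selection of the best-scoring subjects come from the description above.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of numbers (assumed normalization, not from the source)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def select_subjects(runs, n_keep):
    """Pick the n_keep subjects with the lowest combined quality score.

    runs maps a subject id to a list of one or two per-run dicts
    with the keys "dvars", "efc", and "fd".
    """
    subjects = sorted(runs)
    # Average each metric over a subject's one or two runs.
    averaged = {
        metric: [mean(run[metric] for run in runs[s]) for s in subjects]
        for metric in ("dvars", "efc", "fd")
    }
    normalized = {metric: zscores(vals) for metric, vals in averaged.items()}
    # Unweighted sum of the three normalized metrics (assumed combination rule).
    combined = {
        s: sum(normalized[metric][i] for metric in normalized)
        for i, s in enumerate(subjects)
    }
    return sorted(subjects, key=lambda s: combined[s])[:n_keep]
```

With the real cohort, `n_keep` would be 1000 and the per-run values would come from the mriqc output for each of the 1570 subjects.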
Methodology and Processing
Sources Statement
Data Access
Notes:	Access and use of this derivative dataset is permitted solely under the terms of the original GSP data release. These restrictive terms of use take precedence over any less restrictive use terms that apply generally under the Dataverse Network Terms of Use.
	GSP Open Access Data Use Terms, Version: 2014-Apr-22
	I request access to data collected as part of the Brain Genomics Superstruct Project (GSP) of Harvard University and the Massachusetts General Hospital, and I agree to the following:
	1. I will not attempt to establish the identity of or attempt to contact any of the included human subjects.
	2. I will not attempt to link any of the distributed data to any other data that might contain information about the included human subjects.
	3. I understand that under no circumstances will the code that would link these data to Protected Health Information be given to me, nor will any additional information about individual human subjects be released to me under these Open Access Data Use Terms.
	4. I will comply with all relevant rules and regulations imposed by my institution. This may mean that I need my research to be approved or declared exempt by a committee that oversees research on human subjects, e.g., my Internal Review Board or Ethics Committee. Different committees operate under different national, state, and local laws and may interpret regulations differently, so it is important to ask about this.
	5. I may redistribute original GSP Open Access data and any derived data as long as the data are redistributed under these same Data Use Terms.
	6. I will acknowledge the use of GSP data and data derived from GSP data when publicly presenting any results or algorithms that benefitted from their use.
	a. Papers, book chapters, books, posters, oral presentations, and all other printed and digital presentations of results derived from GSP data should contain the following wording in the acknowledgments section: "Data were provided [in part] by the Brain Genomics Superstruct Project of Harvard University and the Massachusetts General Hospital, (Principal Investigators: Randy Buckner, Joshua Roffman, and Jordan Smoller), with support from the Center for Brain Science Neuroinformatics Research Group, the Athinoula A. Martinos Center for Biomedical Imaging, and the Center for Human Genetic Research. 20 individual investigators at Harvard and MGH generously contributed data to the overall project."
	b. Authors of publications or presentations using GSP data should cite relevant publications describing the methods used by the GSP to acquire and process the data. The specific publications that are appropriate to cite in any given study will depend on what GSP data were used and for what purposes. An annotated and appropriately up-to-date list of publications that may warrant consideration is available at http://neuroinformatics.harvard.edu/gsp/
	c. The GSP as a consortium should not be included as an author of publications or presentations if this authorship would be based solely on the use of GSP data.
	7. Failure to abide by these guidelines will result in termination of my privileges to access GSP data.
Other Study Description Materials
Related Publications
Citation
Title:	(Derived from): Buckner, Randy L.; Roffman, Joshua L.; Smoller, Jordan W., 2014, Brain Genomics Superstruct Project (GSP), Harvard Dataverse, V10.4
Identification Number:	10.7910/DVN/25833&version=10.4
Bibliographic Citation:	(Derived from): Buckner, Randy L.; Roffman, Joshua L.; Smoller, Jordan W., 2014, Brain Genomics Superstruct Project (GSP), Harvard Dataverse, V10.4
File Description--f4174738
File: participants.tab
	Number of cases: 1570 No. of variables per record: 4 Type of File: text/tab-separated-values
Notes:	UNF:6:NaHMV/y3utLZ3D9PE2tRiw==

Variable Description
List of Variables:	participant_id - participant_id; sex - sex; age - age; handedness - handedness
Variables
participant_id
f4174738 Location:	Variable Format: character Notes: UNF:6:tzX1YaNXKyqlY8Nou2h6sA==
sex
f4174738 Location:	Variable Format: character Notes: UNF:6:wmbpw8R72Qqby4ayITWczQ==
age
f4174738 Location:	Summary Statistics: Valid 1570.0; StDev 2.8908119008800717; Min. 19.0; Mean 21.54140127388535; Max. 35.0 Variable Format: numeric Notes: UNF:6:7tdS+duiQ+ieMrC+0hq7mw==
handedness
f4174738 Location:	Variable Format: character Notes: UNF:6:UeU/1Q+KuCpKMf2xIoDZyg==
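The summary statistics reported for `age` can be recomputed from `participants.tab` with a short script. This sketch assumes the file is tab-separated with a header row naming the four variables listed above, and that the reported StDev is the sample standard deviation; neither detail is stated in this record. The function name and the toy rows (including the identifier format and the sex and handedness codes) are made up for illustration.

```python
import csv
import io
from statistics import mean, stdev


def age_summary(handle):
    """Return count, min, mean, max, and sample StDev of the age column."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Skip rows where age is empty so they do not count as valid cases.
    ages = [float(row["age"]) for row in reader if row["age"].strip()]
    return {
        "valid": len(ages),
        "min": min(ages),
        "mean": mean(ages),
        "max": max(ages),
        "stdev": stdev(ages),
    }


# Toy rows for illustration only; real values come from participants.tab.
sample = io.StringIO(
    "participant_id\tsex\tage\thandedness\n"
    "sub-0001\tF\t19\tR\n"
    "sub-0002\tM\t21\tR\n"
    "sub-0003\tF\t23\tL\n"
)
summary = age_summary(sample)
```

With the real file, open it with `open("participants.tab", newline="")` and pass the handle to `age_summary`; the result can then be compared against the Valid, Min., Mean, Max., and StDev values listed for `age`.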
Other Study-Related Materials
Label:	dataset_description.json
Text:
Notes:	application/json
Other Study-Related Materials
Label:	GSP1000.txt
Text:
Notes:	text/plain
Other Study-Related Materials
Label:	GSP1000_v2_00.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_01.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_02.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_03.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_04.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_05.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_06.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_07.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_08.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_09.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_10.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_11.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_12.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_13.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_14.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_15.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_16.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_17.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_18.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	GSP1000_v2_19.tar
Text:
Notes:	application/x-tar
Other Study-Related Materials
Label:	participants.json
Text:
Notes:	application/json
Other Study-Related Materials
Label:	README
Text:
Notes:	text/plain; charset=US-ASCII
Other Study-Related Materials
Label:	Legacy_GSP_fwhm7.config
Text:
Notes:	application/octet-stream
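The abstract notes that one archive was once built with only a single participant's data, so it may be worth checking the downloaded archives before use. The sketch below looks for the twenty archive names listed above and counts distinct participant directories in each. It assumes, without confirmation from this record, that participant directories inside the archives are named with a `sub-` prefix (BIDS style); the function names and the `prefix` parameter are illustrative.

```python
import tarfile
from pathlib import Path, PurePosixPath


def count_participants(tar_path, prefix="sub-"):
    """Count distinct path components starting with prefix in a tar archive.

    Assumption: each participant has a directory named like sub-XXXX.
    """
    participants = set()
    with tarfile.open(tar_path) as archive:
        for member in archive:
            # Use the first matching component so nested files are not double counted.
            for part in PurePosixPath(member.name).parts:
                if part.startswith(prefix):
                    participants.add(part)
                    break
    return len(participants)


def check_archives(directory, count=20):
    """Map each expected archive name to its participant count, or None if missing."""
    report = {}
    for i in range(count):
        name = f"GSP1000_v2_{i:02d}.tar"
        path = Path(directory) / name
        report[name] = count_participants(path) if path.is_file() else None
    return report
```

An archive whose count is far below that of its siblings, or an entry that maps to `None`, would be a reason to download that file again.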