Cryo2StructData : Small Subsample Dataset (doi:10.7910/DVN/CGUENL)
(Cryo2StructData)

Document Description

Citation

Title:

Cryo2StructData : Small Subsample Dataset

Identification Number:

doi:10.7910/DVN/CGUENL

Distributor:

Harvard Dataverse

Date of Distribution:

2023-10-26

Version:

1

Bibliographic Citation:

Giri, Nabin; Wang, Liguo; Cheng, Jianlin, 2023, "Cryo2StructData : Small Subsample Dataset", https://doi.org/10.7910/DVN/CGUENL, Harvard Dataverse, V1

Study Description

Citation

Title:

Cryo2StructData : Small Subsample Dataset

Subtitle:

Cryo2StructData: A Large Labeled Cryo-EM Density Map Dataset for AI-based Modeling of Protein Structures

Alternative Title:

Cryo2StructData

Identification Number:

doi:10.7910/DVN/CGUENL

Authoring Entity:

Giri, Nabin (University of Missouri System)

Wang, Liguo (Brookhaven National Laboratory)

Cheng, Jianlin (University of Missouri System)

Other identifications and acknowledgements:

Giri, Nabin

Producer:

Cheng, Jianlin

Date of Production:

2023-03-20

Grant Number:

R01GM146340

Distributor:

Harvard Dataverse

Distributor:

Giri, Nabin

Access Authority:

Giri, Nabin

Access Authority:

Cheng, Jianlin

Depositor:

Giri, Nabin

Date of Deposit:

2023-10-24

Date of Distribution:

2023-10-24

Holdings Information:

https://doi.org/10.7910/DVN/CGUENL

Study Scope

Keywords:

Computer and Information Science; Medicine, Health and Life Sciences; Other; cryo-electron microscopy

Abstract:

This repository contains a small subset of Cryo2StructData: 1867 curated density maps extracted from the full dataset, representing 25% of the original Cryo2StructData collection.

Notes:

Please note that this repository does not include labeled maps. We have provided a collection of scripts in our GitHub repository specifically designed to generate the label maps (amino acid types, atom types, and secondary structure types). These label maps, in combination with the normalized density map, are intended for training and testing AI-based models to build protein structures from cryo-EM density maps, as detailed in the Cryo2StructData paper (https://doi.org/10.1101/2023.06.14.545024).

Methodology and Processing

Sources Statement

Documentation and Access to Sources:

The source code and instructions to reproduce our results are freely available at https://github.com/BioinfoMachineLearning/cryo2struct

Data Access

Other Study Description Materials

Related Materials

The source code and instructions to reproduce our results are freely available at https://github.com/BioinfoMachineLearning/cryo2struct

Related Studies

The original cryo-EM density maps are available in EMDataBank (https://www.emdataresource.org/) and the corresponding protein structures are available in the Protein Data Bank (https://www.rcsb.org/).

Other Study-Related Materials

Label:

small_dataset.zip

Notes:

application/zip
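
The archive listed above can be inspected before it is extracted. The following is a minimal sketch, not part of the authors' tooling: it uses only the Python standard library, the function name summarize_archive is hypothetical, and it makes no assumption about the directory layout or file types inside small_dataset.zip, since this record does not describe them.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(path_or_file):
    """Count the files in a zip archive by extension, without extracting it.

    Hypothetical helper for illustration; accepts a path or a file-like object.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        # Skip directory entries so that only regular members are counted.
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
    # Lower-case the suffix so that differently cased extensions are merged;
    # members without an extension are grouped under "(none)".
    return Counter(PurePosixPath(name).suffix.lower() or "(none)" for name in names)
```

Calling summarize_archive("small_dataset.zip") on the downloaded file returns a Counter that maps each file extension to the number of members with that extension, which gives a quick check that the download is complete and readable.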