Replication data for: Coupling of Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species (doi:10.7910/DVN/P1FC4F)

Document Description

Citation

Title:

Replication data for: Coupling of Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species

Identification Number:

doi:10.7910/DVN/P1FC4F

Distributor:

Harvard Dataverse

Date of Distribution:

2007-11-28

Version:

1

Bibliographic Citation:

Qing Zhou; Wing Hung Wong, 2007, "Replication data for: Coupling of Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species", https://doi.org/10.7910/DVN/P1FC4F, Harvard Dataverse, V1

Study Description

Citation

Title:

Replication data for: Coupling of Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species

Identification Number:

doi:10.7910/DVN/P1FC4F

Authoring Entity:

Qing Zhou (UCLA)

Wing Hung Wong (Stanford University)

Date of Production:

2007

Distributor:

Harvard Dataverse

Distributor:

Institute of Mathematical Statistics

Date of Deposit:

2007-10-01

Date of Distribution:

2007

Holdings Information:

https://doi.org/10.7910/DVN/P1FC4F

Study Scope

Keywords:

Cis-regulatory module, motif discovery, comparative genomics, coupled hidden Markov model, Markov chain Monte Carlo, dynamic programming

Abstract:

Cis-regulatory modules (CRMs) composed of multiple transcription factor binding sites (TFBSs) control gene expression in eukaryotic genomes. Comparative genomic studies have shown that these regulatory elements are more conserved across species due to evolutionary constraints. We propose a statistical method to combine module structure and cross-species orthology in de novo motif discovery. We use a hidden Markov model (HMM) to capture the module structure in each species and couple these HMMs through multiple-species alignment. Evolutionary models are incorporated to consider correlated structures among aligned sequence positions across different species. Based on our model, we develop a Markov chain Monte Carlo approach, MultiModule, to discover CRMs and their component motifs simultaneously in groups of orthologous sequences from multiple species. Our method is tested on both simulated and biological data sets in mammals and Drosophila, where significant improvement over other motif and module discovery methods is observed.

Notes:

Subject: STANDARD DEPOSIT TERMS 1.0 Type: DATAPASS:TERMS:STANDARD:1.0 Notes: This study was deposited under the Data-PASS standard deposit terms. A copy of the usage agreement is included in the file section of this study.

Methodology and Processing

Sources Statement

Data Access

Notes:

CC0 1.0 (http://creativecommons.org/publicdomain/zero/1.0)

Other Study Description Materials

Related Publications

Citation

Title:

Qing Zhou and Wing Hung Wong. 2007. "Coupling of Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species." Ann. Appl. Statist. Volume 1, Number 1 (2007), 36-65. Article available at http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.aoas/1183143728

Bibliographic Citation:

Qing Zhou and Wing Hung Wong. 2007. "Coupling of Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species." Ann. Appl. Statist. Volume 1, Number 1 (2007), 36-65. Article available at http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.aoas/1183143728

Other Study-Related Materials

Label:

supplement.pdf

Text:

Coupling Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species (Supplemental Notes)

Notes:

application/pdf