Genome-wide annotation of microRNA primary transcript structures reveals novel regulatory mechanisms

Tsung Cheng Chang, Mihaela Pertea, Sungyul Lee, Steven L. Salzberg, Joshua T. Mendell

Research output: Contribution to journal › Article

33 Citations (Scopus)

Abstract

Precise regulation of microRNA (miRNA) expression is critical for diverse physiologic and pathophysiologic processes. Nevertheless, elucidation of the mechanisms through which miRNA expression is regulated has been greatly hindered by the incomplete annotation of primary miRNA (pri-miRNA) transcripts. While a subset of miRNAs are hosted in protein-coding genes, the majority of pri-miRNAs are transcribed as poorly characterized noncoding RNAs that are 10's to 100's of kilobases in length and low in abundance due to efficient processing by the endoribonuclease DROSHA, which initiates miRNA biogenesis. Accordingly, these transcripts are poorly represented in existing RNA-seq data sets and exhibit limited and inaccurate annotation in current transcriptome assemblies. To overcome these challenges, we developed an experimental and computational approach that allows genome-wide detection and mapping of pri-miRNA structures. Deep RNA-seq in cells expressing dominant-negative DROSHA resulted in much greater coverage of pri-miRNA transcripts compared with standard RNA-seq. A computational pipeline was developed that produces highly accurate pri-miRNA assemblies, as confirmed by extensive validation. This approach was applied to a panel of human and mouse cell lines, providing pri-miRNA transcript structures for 1291/1871 human and 888/1181 mouse miRNAs, including 594 human and 425 mouse miRNAs that fall outside protein-coding genes. These new assemblies uncovered unanticipated features and new potential regulatory mechanisms, including links between pri-miRNAs and distant protein-coding genes, alternative pri-miRNA splicing, and transcripts carrying subsets of miRNAs encoded by polycistronic clusters. These results dramatically expand our understanding of the organization of miRNA-encoding genes and provide a valuable resource for the study of mammalian miRNA regulation.

Original language: English (US)
Pages (from-to): 1401-1409
Number of pages: 9
Journal: Genome Research
Volume: 25
Issue number: 9
DOI: 10.1101/gr.193607.115
State: Published - Sep 1 2015

Fingerprint

MicroRNAs
Genome
RNA
Untranslated RNA
Proteins
Alternative Splicing
Transcriptome
Cell Line
Genes

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Genome-wide annotation of microRNA primary transcript structures reveals novel regulatory mechanisms. / Chang, Tsung Cheng; Pertea, Mihaela; Lee, Sungyul; Salzberg, Steven L.; Mendell, Joshua T.

In: Genome Research, Vol. 25, No. 9, 01.09.2015, p. 1401-1409.

@article{e0c3f5d78186462cb006d4ab9e9ce3a7,
title = "Genome-wide annotation of microRNA primary transcript structures reveals novel regulatory mechanisms",
abstract = "Precise regulation of microRNA (miRNA) expression is critical for diverse physiologic and pathophysiologic processes. Nevertheless, elucidation of the mechanisms through which miRNA expression is regulated has been greatly hindered by the incomplete annotation of primary miRNA (pri-miRNA) transcripts. While a subset of miRNAs are hosted in protein-coding genes, the majority of pri-miRNAs are transcribed as poorly characterized noncoding RNAs that are 10's to 100's of kilobases in length and low in abundance due to efficient processing by the endoribonuclease DROSHA, which initiates miRNA biogenesis. Accordingly, these transcripts are poorly represented in existing RNA-seq data sets and exhibit limited and inaccurate annotation in current transcriptome assemblies. To overcome these challenges, we developed an experimental and computational approach that allows genome-wide detection and mapping of pri-miRNA structures. Deep RNA-seq in cells expressing dominant-negative DROSHA resulted in much greater coverage of pri-miRNA transcripts compared with standard RNA-seq. A computational pipeline was developed that produces highly accurate pri-miRNA assemblies, as confirmed by extensive validation. This approach was applied to a panel of human and mouse cell lines, providing pri-miRNA transcript structures for 1291/1871 human and 888/1181 mouse miRNAs, including 594 human and 425 mouse miRNAs that fall outside protein-coding genes. These new assemblies uncovered unanticipated features and new potential regulatory mechanisms, including links between pri-miRNAs and distant protein-coding genes, alternative pri-miRNA splicing, and transcripts carrying subsets of miRNAs encoded by polycistronic clusters. These results dramatically expand our understanding of the organization of miRNA-encoding genes and provide a valuable resource for the study of mammalian miRNA regulation.",
author = "Chang, {Tsung Cheng} and Mihaela Pertea and Sungyul Lee and Salzberg, {Steven L.} and Mendell, {Joshua T.}",
year = "2015",
month = "9",
day = "1",
doi = "10.1101/gr.193607.115",
language = "English (US)",
volume = "25",
pages = "1401--1409",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "9",

}

TY - JOUR

T1 - Genome-wide annotation of microRNA primary transcript structures reveals novel regulatory mechanisms

AU - Chang, Tsung Cheng

AU - Pertea, Mihaela

AU - Lee, Sungyul

AU - Salzberg, Steven L.

AU - Mendell, Joshua T.

PY - 2015/9/1

Y1 - 2015/9/1

AB - Precise regulation of microRNA (miRNA) expression is critical for diverse physiologic and pathophysiologic processes. Nevertheless, elucidation of the mechanisms through which miRNA expression is regulated has been greatly hindered by the incomplete annotation of primary miRNA (pri-miRNA) transcripts. While a subset of miRNAs are hosted in protein-coding genes, the majority of pri-miRNAs are transcribed as poorly characterized noncoding RNAs that are 10's to 100's of kilobases in length and low in abundance due to efficient processing by the endoribonuclease DROSHA, which initiates miRNA biogenesis. Accordingly, these transcripts are poorly represented in existing RNA-seq data sets and exhibit limited and inaccurate annotation in current transcriptome assemblies. To overcome these challenges, we developed an experimental and computational approach that allows genome-wide detection and mapping of pri-miRNA structures. Deep RNA-seq in cells expressing dominant-negative DROSHA resulted in much greater coverage of pri-miRNA transcripts compared with standard RNA-seq. A computational pipeline was developed that produces highly accurate pri-miRNA assemblies, as confirmed by extensive validation. This approach was applied to a panel of human and mouse cell lines, providing pri-miRNA transcript structures for 1291/1871 human and 888/1181 mouse miRNAs, including 594 human and 425 mouse miRNAs that fall outside protein-coding genes. These new assemblies uncovered unanticipated features and new potential regulatory mechanisms, including links between pri-miRNAs and distant protein-coding genes, alternative pri-miRNA splicing, and transcripts carrying subsets of miRNAs encoded by polycistronic clusters. These results dramatically expand our understanding of the organization of miRNA-encoding genes and provide a valuable resource for the study of mammalian miRNA regulation.

UR - http://www.scopus.com/inward/record.url?scp=84940983900&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84940983900&partnerID=8YFLogxK

U2 - 10.1101/gr.193607.115

DO - 10.1101/gr.193607.115

M3 - Article

VL - 25

SP - 1401

EP - 1409

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 9

ER -