Precise regulation of microRNA (miRNA) expression is critical for diverse physiologic and pathophysiologic processes. Nevertheless, elucidation of the mechanisms through which miRNA expression is regulated has been greatly hindered by the incomplete annotation of primary miRNA (pri-miRNA) transcripts. While a subset of miRNAs are hosted in protein-coding genes, the majority of pri-miRNAs are transcribed as poorly characterized noncoding RNAs that are tens to hundreds of kilobases in length and low in abundance due to efficient processing by the endoribonuclease DROSHA, which initiates miRNA biogenesis. Accordingly, these transcripts are poorly represented in existing RNA-seq data sets and exhibit limited and inaccurate annotation in current transcriptome assemblies. To overcome these challenges, we developed an experimental and computational approach that allows genome-wide detection and mapping of pri-miRNA structures. Deep RNA-seq in cells expressing dominant-negative DROSHA resulted in much greater coverage of pri-miRNA transcripts compared with standard RNA-seq. A computational pipeline was developed that produces highly accurate pri-miRNA assemblies, as confirmed by extensive validation. This approach was applied to a panel of human and mouse cell lines, providing pri-miRNA transcript structures for 1291/1871 human and 888/1181 mouse miRNAs, including 594 human and 425 mouse miRNAs that fall outside protein-coding genes. These new assemblies uncovered unanticipated features and new potential regulatory mechanisms, including links between pri-miRNAs and distant protein-coding genes, alternative pri-miRNA splicing, and transcripts carrying subsets of miRNAs encoded by polycistronic clusters. These results dramatically expand our understanding of the organization of miRNA-encoding genes and provide a valuable resource for the study of mammalian miRNA regulation.