TY - JOUR
T1 - A long-read RNA-seq approach to identify novel transcripts of very large genes
AU - Uapinyoying, Prech
AU - Goecks, Jeremy
AU - Knoblach, Susan M.
AU - Panchapakesan, Karuna
AU - Bonnemann, Carsten G.
AU - Partridge, Terence A.
AU - Jaiswal, Jyoti K.
AU - Hoffman, Eric P.
N1 - Funding Information:
We thank Adam K.L. Wong, Keith Crandall, and the Colonial One HPC team at George Washington University for providing technical support and computational resources for the project; Hiroki Morizono for bioinformatics lessons and insight; Linda Werling for academic and motivational support; Heather Locovare for helping us optimize the Iso-Seq bench protocol; and Jaakko Saparanta for corrections regarding Ttn exon 312. We thank past and present members of the Eric Hoffman laboratory and the Genetic Medicine department at Children’s National Hospital for helpful discussions. This project was supported by award number 1U54HD090257 from the National Institutes of Health, District of Columbia Intellectual and Developmental Disabilities Research Center Award (DC-IDDRC) program. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the District of Columbia Intellectual and Developmental Disabilities Research Center or the National Institutes of Health. Additional support was provided by the National Institutes of Health award numbers R01NS029525 (NINDS) and T32AR056993 (NIAMS).
Publisher Copyright:
© 2020 Uapinyoying et al.
PY - 2020/6
Y1 - 2020/6
AB - RNA-seq is widely used for studying gene expression, but commonly used sequencing platforms produce short reads that only span up to two exon junctions per read. This makes it difficult to accurately determine the composition and phasing of exons within transcripts. Although long-read sequencing improves this issue, it is not amenable to precise quantitation, which limits its utility for differential expression studies. We used long-read isoform sequencing combined with a novel analysis approach to compare alternative splicing of large, repetitive structural genes in muscles. Analysis of muscle structural genes that produce medium (Nrap: 5 kb), large (Neb: 22 kb), and very large (Ttn: 106 kb) transcripts in cardiac muscle, and fast and slow skeletal muscles identified unannotated exons for each of these ubiquitous muscle genes. This also identified differential exon usage and phasing for these genes between the different muscle types. By mapping the in-phase transcript structures to known annotations, we also identified and quantified previously unannotated transcripts. Results were confirmed by endpoint PCR and Sanger sequencing, which revealed muscle-type-specific differential expression of these novel transcripts. The improved transcript identification and quantification shown by our approach removes previous impediments to studies aimed at quantitative differential expression of ultralong transcripts.
UR - http://www.scopus.com/inward/record.url?scp=85089162894&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85089162894&partnerID=8YFLogxK
U2 - 10.1101/gr.259903.119
DO - 10.1101/gr.259903.119
M3 - Article
C2 - 32660935
AN - SCOPUS:85089162894
SN - 1088-9051
VL - 30
SP - 885
EP - 897
JO - Genome Research
JF - Genome Research
IS - 6
ER -