Hi class, I'm Andrew Jaffe. I'm an investigator at The Lieber Institute for Brain Development, here on the Johns Hopkins medical campus in Baltimore, Maryland. I'm going to be talking about some of the RNA sequencing data that you'll be using in your capstone project, which we generated on the human brain across development. You can follow more of our work at my website, aejaffe.com, or follow me on Twitter @andrewejaffe.
This project uses data that we generated to explore gene expression changes across human brain development, from fetal samples to old age. This research was motivated by previous work that explored these changes and found widespread differences in gene expression when comparing fetal samples to samples from later in life. However, many of these previous approaches used microarray technologies, which use pre-defined probe sequences and so can only query known gene sequences. Additionally, existing RNA sequencing data sets like the BrainSpan project only provide counts for existing features like genes and exons, which might also limit biological discovery of age-related changes in expression.
And so at The Lieber Institute for Brain Development we generated RNA sequencing data, which is an unbiased snapshot of the transcriptome. This process works by taking messenger RNA, from which the introns have presumably been spliced out, and then generating short sequencing reads off of the mRNA. These reads are then aligned to the genome and transcriptome to see where they originated from, and are then used to quantify the expression of known genes and transcripts. This paper by Cole Trapnell has a very nice summary and overview of some of the various analysis pipelines you can use with RNA sequencing data. All of this data comes from a paper that we published in Nature Neuroscience in December 2014 called "Developmental regulation of human cortex transcription and its clinical relevance at single base resolution."
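
As a rough illustration of the align-then-quantify idea described above, here is a minimal Python sketch. It assumes the reads have already been aligned and are given as (chromosome, start, end) tuples. The gene names, coordinates, and the `count_reads_per_gene` function are hypothetical and are not the pipeline used for the paper; real analyses use dedicated aligners and counting tools.

```python
def count_reads_per_gene(reads, genes):
    """Count aligned reads that overlap each annotated gene.

    reads: iterable of (chrom, start, end) half-open intervals
    genes: dict mapping gene name -> (chrom, start, end) half-open interval
    """
    counts = {name: 0 for name in genes}
    for r_chrom, r_start, r_end in reads:
        for name, (g_chrom, g_start, g_end) in genes.items():
            # Two half-open intervals overlap if each starts before the other ends.
            if r_chrom == g_chrom and r_start < g_end and g_start < r_end:
                counts[name] += 1
    return counts


# Toy annotation and toy alignments (made-up coordinates, for illustration only).
toy_genes = {
    "geneA": ("chr1", 100, 500),
    "geneB": ("chr1", 800, 1200),
}
toy_reads = [
    ("chr1", 120, 170),    # inside geneA
    ("chr1", 480, 530),    # overlaps the end of geneA
    ("chr1", 600, 650),    # between the two genes: not counted
    ("chr1", 1000, 1050),  # inside geneB
    ("chr2", 120, 170),    # different chromosome: not counted
]

toy_counts = count_reads_per_gene(toy_reads, toy_genes)
print(toy_counts)
```

Note that the read at 600 to 650 falls outside both annotated genes and is silently dropped. That is the limitation of counting only pre-defined features: expression outside the annotation never shows up in the counts.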
This was done through a collaboration between The Lieber Institute for Brain Development and your own instructor, Jeff Leek. You can find all the data under the BioProject ID indicated on the slide, and if you're not affiliated with a university you can find a PDF copy of the paper on my website, which is listed on the title slide.
The data for this project were generated using polyA+ RNA sequencing, that is, selection of polyadenylated transcripts, which are presumably coding. We performed this sequencing on 36 samples across human brain development, at six important life stages: six samples in the second trimester of fetal life, six samples in the first year of life or infanthood, six samples in childhood from one to ten years of age, six samples in the teenage years, six samples in adulthood from 20 to 50, and six samples from people aged 50 and over. These samples came from non-hospital deaths in DC and Virginia through the medical examiners' offices, with whom we have partnerships. Additionally, the fetal samples came from elective abortions occurring in the second trimester and were obtained through the University of Maryland Brain Bank.
These were very high quality samples for post-mortem human research. We tried to balance and match potential confounding variables, like RNA integrity number, or RIN, which is a measure of RNA quality, and post-mortem interval, which is potentially also a measure of quality in these samples. As you can see, the RNA integrity numbers are relatively balanced across groups, and, with the exception of the fetal samples, so are the post-mortem intervals for all the samples after birth. The data that you will be analyzing for the capstone might involve taking some of the sequencing reads, aligning them to the genome to get BAM files of alignments, and then quantifying expression using an annotation that contains gene and exon information.
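
To keep the study design straight while working with the data, it can help to write it down as a small sample table. The sketch below builds one in Python from the design described above (six age groups, six samples each). The group labels, the `build_design` function, and the sample IDs are hypothetical placeholders, not the identifiers used in the published data.

```python
from collections import Counter

# The six life stages described in the talk; the labels are shorthand chosen here.
AGE_GROUPS = ["fetal", "infant", "child", "teen", "adult", "50+"]
SAMPLES_PER_GROUP = 6


def build_design(groups, n_per_group):
    """Return one row per sample with a placeholder ID and its age group."""
    design = []
    for i, group in enumerate(groups):
        for j in range(n_per_group):
            # Placeholder IDs (S01, S02, ...); the real data use their own sample IDs.
            sample_id = f"S{i * n_per_group + j + 1:02d}"
            design.append({"sample_id": sample_id, "age_group": group})
    return design


design = build_design(AGE_GROUPS, SAMPLES_PER_GROUP)
group_sizes = Counter(row["age_group"] for row in design)

print(len(design))        # total number of samples
print(dict(group_sizes))  # number of samples in each age group
```

In a real analysis, each row of this table would also carry covariates such as RIN and post-mortem interval, taken from the sample metadata rather than made up, so that balance across groups can be checked before testing for expression differences.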
I would like to acknowledge all of my fellow collaborators on this project, including many people at The Lieber Institute for Brain Development as well as several people at the Johns Hopkins Bloomberg School of Public Health. I hope you guys enjoy using this data for your projects, and feel free to ask me any questions.