Some information I’ve missed
I met with Steven on Tuesday, and he suggested I do a few things:
- Figure out if the mRNA genome feature file overlaps with introns and exons
- Count the number of unique genes the gene background overlapped with and get Uniprot codes
Feature file overlaps
TL;DR: Yes, the mRNA feature file includes introns and exons.
Figure 1. Various genome feature files in IGV.
I opened the tracks in IGV and found that the mRNA track overlapped with both the exon and intron tracks. I have to keep this in mind when interpreting what the overlaps between DML or DMR and exon, intron, or mRNA coding regions actually mean. My guess is that exon and intron overlaps should be treated as a subset of the mRNA overlaps. Unless the mRNA coding region file has information that isn’t an intron or exon, I could just compare exon and intron overlaps instead of using mRNA overlaps.
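One way to check the subset idea outside of IGV is to test whether each exon (or intron) interval falls entirely inside an mRNA interval. The following is a minimal base R sketch, not the method used in this notebook: the coordinates are invented toy data, and `withinmRNA`, `mRNA`, and `exon` are hypothetical names.

```r
# Toy coordinates, invented for illustration only (not from the real feature files)
mRNA <- data.frame(chr = c("chr1", "chr1"),
                   start = c(100, 500),
                   end = c(400, 900))
exon <- data.frame(chr = c("chr1", "chr1", "chr1"),
                   start = c(100, 300, 950),
                   end = c(200, 400, 990))

# Hypothetical helper: TRUE when a feature lies entirely inside
# at least one mRNA interval on the same chromosome
withinmRNA <- function(features, mRNA) {
  vapply(seq_len(nrow(features)), function(i) {
    any(mRNA$chr == features$chr[i] &
          mRNA$start <= features$start[i] &
          mRNA$end >= features$end[i])
  }, logical(1))
}

exonInside <- withinmRNA(exon, mRNA)
exon[!exonInside, ] #Features that fall outside every mRNA interval
```

If every exon and intron comes back `TRUE`, treating those overlaps as a subset of the mRNA overlaps seems reasonable. Any rows printed by the last line would need a closer look.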
Unique genes from gene background-mRNA overlaps
I went back to my R Markdown file and subsetted unique Genbank IDs from the file with gene background-mRNA overlaps and Uniprot codes. I used the following code:
uniqueBackgroundmRNAblast <- subset(backgroundmRNAblast, !duplicated(backgroundmRNAblast$Genbank)) #Subset the unique Genbank IDs from backgroundmRNAblast and save as a new dataframe
nrow(uniqueBackgroundmRNAblast) #Count the number of unique genes (nrow() needs the dataframe, not a single column)
The gene background overlapped with 14,943 unique genes. I saved the subsetted information in this file.
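As a sanity check on the count, the number of rows left after removing duplicates should match the number of distinct Genbank IDs. Here is a small sketch on invented data. `toyBlast` and its `Uniprot` column are hypothetical stand-ins, not the real overlap file.

```r
# Invented example data: five overlap rows covering three distinct Genbank IDs
toyBlast <- data.frame(Genbank = c("A1", "A1", "B2", "C3", "C3"),
                       Uniprot = c("P1", "P1", "P2", "P3", "P3"),
                       stringsAsFactors = FALSE)

# Keep the first row for each Genbank ID
uniqueToyBlast <- subset(toyBlast, !duplicated(toyBlast$Genbank))

# Count rows on the dataframe itself; nrow() on a single column returns NULL
nUnique <- nrow(uniqueToyBlast)

# The two ways of counting unique genes should agree
nUnique == length(unique(toyBlast$Genbank))
```

Note that `!duplicated()` keeps only the first row per Genbank ID, so if one ID ever maps to more than one Uniprot code, the later codes are dropped.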
- Describe gene products for all remaining DML and DMR overlaps
- Compare genes with hypermethylated vs. hypomethylated loci and regions