I’ve migrated my notebook to here:
from Sam’s Notebook https://ift.tt/2QueyqN
via IFTTT
methylKit and bedtools with Mox samples
I took my samples off of Mox, identified DML and DMR, and characterized their location! :tada:
bismark
I finished my bismark pipeline on Mox! It did, however, take a bit of tweaking. I ended up needing a few different scripts because I kept making mistakes in my code, including two involving --samtools_path. I moved all of my scripts to this folder for defunct scripts, and created a master script with all revisions that can be found here.
To move my files off of Mox, I initially thought I should create a checksum file and then use rsync. When I tried creating a checksum file on the login node, I got an error message saying I was overloading the CPU. I posted this issue, and learned that I could either create checksums on an interactive node, or just rsync and create the checksum file later, since rsync already verifies checksums as it transfers the files. I went with the second option. All of the files from Mox are now on this gannet folder. I created the checksums with shasum.
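For anyone who wants to build the same kind of checksum file without shasum, here is a minimal Python sketch. It assumes SHA-1 (the shasum default) and the usual "digest, two spaces, path" line format. The function names and the checksums.sha1 file name are made up for illustration and are not what I actually ran.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksums(directory, manifest_name="checksums.sha1"):
    """Write one '<digest>  <relative path>' line per file under directory.

    manifest_name is a hypothetical file name chosen for this sketch.
    """
    root = Path(directory)
    manifest = root / manifest_name
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        if path == manifest:
            continue  # do not checksum the manifest itself
        relative = path.relative_to(root).as_posix()
        lines.append(f"{sha1_of_file(path)}  {relative}")
    manifest.write_text("".join(line + "\n" for line in lines))
    return lines
```

Reading in chunks keeps memory use flat even for large alignment files, which matters more than speed here.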
methylKit
This part was easy since I already had an R Markdown file ready. I changed the path for the filenames, and cranked away. Output from DML identification is here, and output from DMR identification can be found here. Nothing changed between this run and my previous methylKit runs using files generated on genefish.
bedtools
I characterized the location of DML and DMR in this Jupyter notebook. Based on this issue, I added sections to find overlaps between DML, DMR, and transposable elements. There are two different transposable element files. According to Sam, “C_virginica-3.0_TE-all.gff used all species that exist in the database and C_virginica-3.0_TE-Cg.gff only used Crassostrea virginica database” to identify transposable elements in the C. virginica genome. C_virginica-3.0_TE-all.gff had more transposable elements, so I got different results for each file when I used intersectBed. I also looked at overlaps between transposable elements and either exons, introns, or mRNA coding regions. I generated a lot of data! All of the files with overlap locations can be found here.
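To make the overlap idea concrete, here is a small Python sketch of what an intersection between two feature sets boils down to. It is not the code from my notebook, and it does not reproduce intersectBed itself. It assumes both inputs are already in BED-style coordinates (0-based, half-open) and reports each query feature once if it overlaps anything in the target set, roughly like the -u option. The function names and toy coordinates are hypothetical.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse tab-delimited BED-like lines into (chrom, start, end) tuples."""
    features = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        features.append((chrom, int(start), int(end)))
    return features


def features_with_overlap(queries, targets):
    """Return each query feature that overlaps at least one target feature."""
    by_chrom = defaultdict(list)
    for chrom, start, end in targets:
        by_chrom[chrom].append((start, end))
    hits = []
    for chrom, start, end in queries:
        # Half-open intervals overlap when each one starts before the other ends.
        if any(start < t_end and t_start < end
               for t_start, t_end in by_chrom.get(chrom, ())):
            hits.append((chrom, start, end))
    return hits


# Toy data with made-up coordinates, standing in for DMR and TE files.
dmr = parse_bed(["chr1\t100\t200", "chr1\t500\t600", "chr2\t10\t20"])
te = parse_bed(["chr1\t150\t300", "chr2\t20\t40"])
overlapping_dmr = features_with_overlap(dmr, te)
```

Because the larger transposable element file contains more features, the same query set can only gain overlaps when run against it, which is why the two files give different counts.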
Steven said to focus on DMR characterization since he found those more interesting in the Olympia oyster data he’s working with, so I’ll develop all of the code for analyses for DMR first. I can then move on to DML if I want.
from the responsible grad student https://ift.tt/2zGZMWC
via IFTTT