Sam’s Notebook: Software Installation – RepeatMasker v4.0.7 on Emu/Roadrunner


Steven asked that I re-run some Olympia oyster transposable element analysis using RepeatMasker and a newer version of our Olympia oyster genome assembly.

Installed the software on both of the Apple Xserves (Emu and Roadrunner) running Ubuntu 16.04.

Followed the instructions outlined here:

Starting with the prerequisites:

1. Download and install RMBlast

 - NCBI Blast 2.6.0 source
 - isb 2.6.0 patch

Unfortunately, the make command continually failed:

 cd /home/shared/ncbi-blast-2.6.0+-src/c++
 make
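When a long build like this fails repeatedly, saving the complete output to a file makes it easier to find the first real error. The following is a minimal sketch, not part of the RMBlast instructions: the helper name `run_logged` and the log location are placeholders chosen for illustration.

```shell
# Hypothetical helper: run a command, save stdout and stderr to a log file,
# and return the command's own exit status.
run_logged() {
    rl_log=$1
    shift
    "$@" >"$rl_log" 2>&1
    rl_status=$?
    # Print the end of the log only when the command failed.
    if [ "$rl_status" -ne 0 ]; then
        echo "Command failed with status $rl_status; last lines of $rl_log:" >&2
        tail -n 20 "$rl_log" >&2
    fi
    return "$rl_status"
}

# Example usage (build path from the entry above; the log path is an assumption):
# cd /home/shared/ncbi-blast-2.6.0+-src/c++ && run_logged "$HOME/rmblast_make.log" make
```

Searching the saved log for the first occurrence of "error" is usually more informative than the final lines, since later failures in a build often cascade from an earlier one.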


While trying to troubleshoot this issue, continued with the other prerequisites:

2. Downloaded Tandem Repeat Finder v.4.09

 - Saved file (```trf409.linux64```) to ```/home/shared/bin```. NOTE: ```/home/shared/bin``` is part of the system PATH. See the ```/etc/environment``` file.
 - Changed permissions to be executable: <pre><code>sudo chmod 775 trf409.linux64</code></pre>
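The two steps for Tandem Repeat Finder (place the binary in a directory on the PATH, then make it executable) can be sketched as one small function. This is an illustration only: `install_binary` is a hypothetical helper, and the download location in the usage comment is an assumption.

```shell
# Hypothetical helper: copy a downloaded binary into a bin directory,
# set mode 775 (as in the step above), and confirm it is executable.
install_binary() {
    ib_src=$1
    ib_dest_dir=$2
    ib_target="$ib_dest_dir/$(basename "$ib_src")"
    cp "$ib_src" "$ib_target" || return 1
    chmod 775 "$ib_target" || return 1
    # -x tests that the file is executable by the current user.
    [ -x "$ib_target" ] && echo "installed: $ib_target"
}

# Example usage (the Downloads path is an assumption; writing to a shared
# directory may need elevated permissions):
# install_binary "$HOME/Downloads/trf409.linux64" /home/shared/bin
# command -v trf409.linux64   # prints the full path if the directory is on PATH
```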

3. Downloaded RepBase RepeatMasker Edition 20170127 (NOTE: This requires registration in order to obtain a username/password to download the file).

Installed RepeatMasker:

4. Downloaded RepeatMasker 4.0.7

 - Saved to ```/home/shared/RepeatMasker-4.0.7```  

5. Installed RepBase RepeatMasker Edition 20170127 in /home/shared/RepeatMasker-4.0.7/Libraries
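Installing the RepBase library amounts to unpacking the downloaded archive inside the RepeatMasker directory. The sketch below assumes the archive is a gzipped tarball containing a top-level `Libraries/` folder; the archive filename in the usage comment and the helper name `unpack_library` are assumptions, not taken from the entry above.

```shell
# Hypothetical helper: unpack a library tarball into a destination directory
# and list the resulting Libraries/ folder as a quick check.
unpack_library() {
    ul_tarball=$1
    ul_dest=$2
    # -C changes into the destination before extracting.
    tar -xzf "$ul_tarball" -C "$ul_dest" || return 1
    ls "$ul_dest/Libraries"
}

# Example usage (archive name is an assumption):
# unpack_library RepBaseRepeatMaskerEdition-20170127.tar.gz /home/shared/RepeatMasker-4.0.7
```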

Currently re-building RMBlast and it takes forever… Will report back when I have it running.

from Sam’s Notebook

Yaamini’s Notebook: Gonad Methylation Analysis Part 16

A new enchilada

(P.S. I think I really want enchiladas now)

After fixing the bismark_methylation_extractor issue, Steven suggested I duplicate my notebook and rerun the analysis on a subset of the data. I created this notebook and started rerunning bismark to align the sequences to the prepared genome. I then deduplicated, sorted, and indexed the .bam files, and extracted methylation calls successfully! I also completed the HTML and Summary Report steps. All outputs from this notebook can be found in this folder.

The next step is to duplicate the notebook I created today, remove the -u argument, and run the commands on the full dataset. I created this notebook and started to run the alignment. I’ll check on it in a few days!
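The subset run and the full run differ only in whether `-u` (Bismark's "process only the first N reads" option) is present. A dry-run sketch that prints the alignment command either way is shown below; the helper name, the file names, and the paired-end layout are assumptions, since the notebook's actual inputs are not listed here.

```shell
# Hypothetical helper: print (not run) a bismark alignment command.
# The optional fourth argument is the read limit passed to -u.
bismark_cmd() {
    bc_genome=$1
    bc_r1=$2
    bc_r2=$3
    bc_upto=${4:-}
    if [ -n "$bc_upto" ]; then
        # Subset run: stop after the first $bc_upto reads or read pairs.
        echo "bismark --genome $bc_genome -u $bc_upto -1 $bc_r1 -2 $bc_r2"
    else
        # Full run: no -u, so every read is aligned.
        echo "bismark --genome $bc_genome -1 $bc_r1 -2 $bc_r2"
    fi
}

# Example usage with placeholder paths:
# bismark_cmd genome_dir sample_R1.fq.gz sample_R2.fq.gz 10000
# bismark_cmd genome_dir sample_R1.fq.gz sample_R2.fq.gz
```

Printing the command first makes it easy to confirm that the only change between the two notebooks is the missing `-u`.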


from the responsible grad student

Sam’s Notebook: TrimGalore/FastQC/MultiQC – TrimGalore! RRBS Geoduck BS-seq FASTQ data (directional)


Earlier this week, I ran TrimGalore! but, due to a copy/paste mistake, incorrectly set the trimming as --non-directional, so I re-ran it with the correct settings.

Steven requested that I trim the Geoduck RRBS libraries that we have, in preparation to run them through Bismark.

These libraries were originally created by Hollie Putnam using the TruSeq DNA Methylation Kit (Illumina):

All analysis is documented in a Jupyter Notebook; see link below.

Overview of process:

  1. Run TrimGalore! with --paired and --rrbs settings.
  2. Run FastQC and MultiQC on trimmed files.
  3. Copy all data to owl (see Results below for link).
  4. Confirm data integrity via MD5 checksums.
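The last two steps (copy, then confirm integrity via MD5) can be sketched as follows. This is an illustration rather than the notebook's actual commands: the helper names, the `*.fq.gz` pattern, and the `checksums.md5` manifest name are assumptions, and it relies on the GNU `md5sum` tool.

```shell
# Hypothetical helper: write an MD5 manifest for the FASTQ files in a directory.
write_md5() {
    ( cd "$1" && md5sum *.fq.gz > checksums.md5 )
}

# Hypothetical helper: re-check every file listed in the manifest.
# Returns non-zero if any file is missing or its checksum differs.
verify_md5() {
    ( cd "$1" && md5sum -c checksums.md5 )
}

# Example usage with placeholder directories:
# write_md5 trimmed_dir                  # checksums computed at the source
# cp trimmed_dir/* destination_dir/      # copy the data and the manifest
# verify_md5 destination_dir             # prints "OK" per file on success
```

Computing the checksums before the copy and verifying them after it is what actually tests the transfer; checksums generated only at the destination would not detect a corrupted copy.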

Jupyter Notebook: