Sam’s Notebook: Transposable Element Mapping – Olympia Oyster Genome Assembly using RepeatMasker 4.0.7

0000-0002-2747-368X

Steven wanted transposable elements (TEs) in the Olympia oyster genome identified.

After some minor struggles, I was able to get RepeatMasker installed on both of our Apple Xserves (emu & roadrunner; running Ubuntu 16.04LTS).

Genome used: pbjelly_sjw_01

I ran RepeatMasker (v4.0.7) with RepBase-20170127 and RMBlast 2.6.0 four times:

  1. Default settings (i.e. no species selected – RepeatMasker falls back to its human repeat library).
  2. Species = Crassostrea gigas (Pacific oyster)
  3. Species = Crassostrea virginica (Eastern oyster)
  4. Species = Ostrea lurida (Olympia oyster)
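
As a rough sketch, the four runs above boil down to one call per species setting. The snippet below only prints the command lines rather than running them. `GENOME` and `REPEATMASKER` are placeholders I've made up for illustration (the real genome path and any extra flags used in the actual runs aren't shown here); `-species` is RepeatMasker's option for picking the repeat library.

```shell
#!/bin/sh
# Sketch only: prints the four RepeatMasker invocations instead of running them.
# GENOME and REPEATMASKER are placeholders, not the notebook's actual paths.
GENOME="${GENOME:-genome.fasta}"
REPEATMASKER="${REPEATMASKER:-RepeatMasker}"

# Print one RepeatMasker command line; an empty argument means "default settings".
repeatmasker_cmd() {
    species="$1"
    if [ -z "$species" ]; then
        printf '%s %s\n' "$REPEATMASKER" "$GENOME"
    else
        printf '%s -species "%s" %s\n' "$REPEATMASKER" "$species" "$GENOME"
    fi
}

# Runs 1-4, in the order listed above.
for species in "" "crassostrea gigas" "crassostrea virginica" "ostrea lurida"; do
    repeatmasker_cmd "$species"
done
```

Printing first makes it easy to eyeball the quoting of the two-word species names before committing to a long run.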

The idea was to get a sense of how the analyses would differ with species specifications. However, it’s likely that the only species setting that will make any difference will be Run #2 (Crassostrea gigas).

The reason I say this is that RepeatMasker has a built-in tool for querying which species are available in the RepBase database, e.g.:

 RepeatMasker-4.0.7/util/queryRepeatDatabase.pl -species "crassostrea virginica" -stat 

Here’s a very brief overview of what that yields:

  • Crassostrea gigas: 792 specific repeats
  • Crassostrea virginica: 4 Crassostrea virginica specific repeats
  • Ostrea lurida: 0 Ostrea lurida specific repeats
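
Counts like these could be gathered in one pass by looping the query tool over the species names. This is just a sketch: `QUERY_TOOL` defaults to the relative path shown earlier and would need adjusting to the local install, and the `DRY_RUN` switch is my own addition so the commands can be inspected before anything is executed.

```shell
#!/bin/sh
# Sketch only: QUERY_TOOL mirrors the path used above; adjust to the local install.
QUERY_TOOL="${QUERY_TOOL:-RepeatMasker-4.0.7/util/queryRepeatDatabase.pl}"

# With DRY_RUN=1 (the default here) just print the command;
# with DRY_RUN=0 actually query the repeat database.
query_species() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s -species "%s" -stat\n' "$QUERY_TOOL" "$1"
    else
        "$QUERY_TOOL" -species "$1" -stat
    fi
}

for species in "crassostrea gigas" "crassostrea virginica" "ostrea lurida"; do
    query_species "$species"
done
```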

All runs were performed on roadrunner.

All commands were documented in a Jupyter Notebook (GitHub):

NOTE: RepeatMasker writes the desired output files (*.out, *.cat.gz, and *.gff) to the same directory that the genome is located in! If you conduct multiple runs with the same genome in the same directory, it will overwrite those files, as they are named using the genome assembly filename.
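
One way to sidestep the overwriting might be to give each run its own output directory via RepeatMasker's `-dir` option. The sketch below is not what I actually ran: the directory naming scheme, `OUT_ROOT`, and `GENOME` are all hypothetical, and `prepare_run` only creates the directory and prints the command that would write into it.

```shell
#!/bin/sh
# Sketch only: directory names are hypothetical; GENOME is a placeholder path.
GENOME="${GENOME:-genome.fasta}"
OUT_ROOT="${OUT_ROOT:-repeatmasker_runs}"

# Turn a run label such as "crassostrea gigas" into a directory name like
# repeatmasker_runs/crassostrea_gigas ("default" for the no-species run).
run_dir() {
    label="${1:-default}"
    printf '%s/%s\n' "$OUT_ROOT" "$(printf '%s' "$label" | tr ' ' '_')"
}

# Create the run directory and print the command that would write into it.
prepare_run() {
    dir="$(run_dir "$1")"
    mkdir -p "$dir"
    if [ -n "$1" ]; then
        printf 'RepeatMasker -species "%s" -dir %s %s\n' "$1" "$dir" "$GENOME"
    else
        printf 'RepeatMasker -dir %s %s\n' "$dir" "$GENOME"
    fi
}
```

Calling `prepare_run "crassostrea gigas"` and `prepare_run ""` would then set up two separate directories, so the *.out, *.cat.gz, and *.gff files from one run can't clobber those from another.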