Prepping for adding PacBio reads to Oly reference genome w/ PBJelly

We have a visiting scientist from Peru working on some RNA-Seq stuff, so most of Emu’s processing power is taken up with Trinity. Still, I wanted to take the time to at least prepare the new PacBio sequencing data and install the programs required to do the gap-filling on our Oly draft genome. I was working on my laptop, so most of this was done via terminal.

According to the PBJelly2 documentation, it expects .fasta files with their associated quality files. I’m hoping that it will also accept .fastq files, since those have the quality information baked in.
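If it turns out that PBJelly really does insist on a .fasta plus a separate quality file, the conversion is mechanical. Here is a minimal sketch, not something I have run on the real data. It assumes four-line FASTQ records (no wrapped sequences) and Phred+33 quality encoding, and the function name fastq_to_fasta_qual is just a made-up helper, not part of PBSuite.

```shell
# Sketch: split a FASTQ file into a FASTA file and a matching .qual file.
# Assumptions: 4 lines per record, Phred+33 qualities.
# fastq_to_fasta_qual is a hypothetical helper name, not a PBSuite tool.
fastq_to_fasta_qual() {
    in_fastq="$1"      # input FASTQ path
    out_prefix="$2"    # writes <prefix>.fasta and <prefix>.qual
    awk -v fa="${out_prefix}.fasta" -v qual="${out_prefix}.qual" '
        BEGIN {
            # Lookup table: printable ASCII character -> code point
            for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
        }
        NR % 4 == 1 {                # header: "@name" becomes ">name"
            header = ">" substr($0, 2)
            print header > fa
            print header > qual
        }
        NR % 4 == 2 { print > fa }   # sequence line goes to the FASTA
        NR % 4 == 0 {                # quality line: one Phred score per base
            line = ""
            n = length($0)
            for (i = 1; i <= n; i++) {
                score = ord[substr($0, i, 1)] - 33
                line = line (i > 1 ? " " : "") score
            }
            print line > qual
        }
    ' "$in_fastq"
}
```

Usage would look like `fastq_to_fasta_qual reads.fastq reads`, where reads.fastq is a placeholder file name; it would produce reads.fasta and reads.qual side by side.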

Copying files over from Owl to Emu:

[Screenshot, 2017-04-05 12:12 PM]

MD5 checking and whatnot. (This step is probably silly, as we don’t have MD5s from the sequencing center to compare against. Habit, I guess?)

[Screenshot, 2017-04-05 11:36 AM]
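Even without checksums from the sequencing center, MD5s can still confirm that the copy itself went through intact. A rough sketch of that check follows, assuming GNU md5sum is available, that both directories are visible from the same machine (for example, with Owl mounted on Emu), and that the source directory contains only plain files. The name verify_copy and any directory paths passed to it are placeholders of mine, not our real locations.

```shell
# Sketch: confirm every file in a source directory has an identical copy
# (same MD5) in a destination directory.
# Assumptions: GNU coreutils md5sum; flat directories with no subfolders.
# verify_copy is a hypothetical helper name.
verify_copy() {
    src_dir="$1"
    dst_dir="$2"
    manifest="$(mktemp)"
    # Record checksums with file names relative to the source directory...
    ( cd "$src_dir" && md5sum -- * ) > "$manifest" || return 1
    # ...then re-check the same names against the copies in the destination.
    ( cd "$dst_dir" && md5sum --check --quiet "$manifest" )
    status=$?
    rm -f "$manifest"
    return "$status"
}
```

Calling `verify_copy /path/to/source /path/to/copy` returns zero only if every file matches. If the two machines can’t see each other’s disks, the same idea works by running md5sum on each side and comparing the two lists.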

Next I started installing the various dependencies required by PBJelly2. PacBio is nice enough to provide an installer for these, named Pitchfork, available on GitHub.

First, clone the repo with:

git clone https://github.com/PacificBiosciences/pitchfork pitchfork

Then, from inside the pitchfork directory, initialize the make commands with:

make init

Then, install Blasr:

make blasr

Then wait… a long time.

I’ll update with the next step once Blasr is installed!

This gives me enough time to research what to do about the fact that our draft genome is in .fasta format, and we may or may not have the associated quality file that PBJelly wants.

Update: The quality file for the reference is not a problem! PBJelly will just assume it doesn’t exist and chug along happily.

Got Blasr installed and installed the PBSuite from here

I’m now getting an error where the Jelly.py mapping stage isn’t creating the mapping files that the later stages need. Gotta track that down.
