Sean’s Notebook: Canu running, figuring out Racon and what it needs next.
So I got Racon, a consensus caller, installed, and it looks like it requires mapping position information. The Racon developers suggest Minimap for that purpose, so I installed Minimap on Emu to run it on the filtered reads while Canu is running on Hyak.
It looks like the general process will be as follows:
- Canu to make preliminary contigs
- Map raw reads to contigs with Minimap
- Consensus call with Racon
- Repeat the mapping and consensus steps about 3 times (that seems to be a common number)
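- Map the raw Illumina reads to the final Racon output with a short-read aligner, then sort and index the alignments with samtools
- Use Pilon to polish with the mapped reads (Pilon reads sorted, indexed .bam files, so any .sam output gets converted first).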
- End with an assembly built from PacBio reads and polished with Illumina reads, the opposite of the Platanus/Redundans workflow.
We’re currently on step 1, so this might take a little while…
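To keep the mapping and consensus rounds straight, here is a rough dry-run sketch that only prints the commands for each round instead of running them. The file names (filtered_reads.fastq, canu.contigs.fasta, round1.paf and so on) are made up for illustration. The Minimap form (target first, then query, PAF on stdout) and the Racon form (reads, overlaps, target, consensus on stdout) are assumptions to check against the installed versions, since some Racon releases take the output file as a fourth argument instead. The Illumina and Pilon steps are left out.

```shell
#!/bin/sh
# Dry-run sketch: print the Minimap + Racon commands for each polishing round.
# Nothing is executed; all file names are hypothetical placeholders.

polish_plan() {
  # $1 = long reads, $2 = starting draft assembly, $3 = number of rounds
  pp_reads=$1
  pp_draft=$2
  pp_rounds=$3
  pp_i=1
  while [ "$pp_i" -le "$pp_rounds" ]; do
    pp_paf="round${pp_i}.paf"
    pp_out="racon_round${pp_i}.fasta"
    # Map the reads back to the current draft (assumed form: target, then query).
    echo "minimap ${pp_draft} ${pp_reads} > ${pp_paf}"
    # Call consensus (assumed form: reads, overlaps, target; output on stdout).
    echo "racon ${pp_reads} ${pp_paf} ${pp_draft} > ${pp_out}"
    # The consensus from this round becomes the draft for the next one.
    pp_draft=$pp_out
    pp_i=$((pp_i + 1))
  done
}

polish_plan filtered_reads.fastq canu.contigs.fasta 3
```

Each round maps against the previous round's Racon output rather than the original Canu contigs, which is the point of repeating the steps.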
I’ve been trying to get Falcon installed on Hyak, or Emu, or even my laptop for a couple of days now with no success. There wasn’t much help on the Falcon GitHub, and after doing some reading, it looks like Canu may be as good as, or better than, Falcon, so let’s give that a whirl!
Canu’s GitHub is here and documentation is here
Installing was super simple. I just cloned the GitHub repo onto Hyak via
git clone https://github.com/marbl/canu.git into
/gscratch/srlab/programs/canu/, changed directory into
/gscratch/srlab/programs/canu/src/ and ran
The Canu developers supply a sample assembly data set which can be downloaded via
curl -L -o p6.25x.fastq http://gembox.cbcb.umd.edu/mhap/raw/ecoli_p6_25x.filtered.fastq
which I downloaded into
To run the assembly, I spun up a 4-hour interactive session (hopefully this is long enough) and ran
/gscratch/srlab/programs/canu/Linux-amd64/bin/canu -p ecoli -d ecoli-auto genomeSize=4.8m -pacbio-raw p6.25x.fastq
This did not work, because Canu is built to submit its jobs to a grid scheduler by default, so it needs the
--useGrid=FALSE argument added to the command. After changing that, everything looks like it’s working fine. Once I’m sure this works, I’ll start on the PacBio-only assembly for the Oly genome.
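For reference, here is a small dry-run sketch that prints the full single-node command rather than running it. The binary path and arguments are copied from the command above. The one difference is an assumption worth flagging: the Canu documentation writes the option as useGrid=false (no leading dashes), so that spelling is used here.

```shell
#!/bin/sh
# Dry-run sketch: print the Canu command for a run without a grid scheduler.
# Path is the install location from this post; nothing is executed.

CANU_BIN=/gscratch/srlab/programs/canu/Linux-amd64/bin/canu

canu_cmd() {
  # $1 = output prefix, $2 = output directory, $3 = genome size, $4 = raw PacBio reads
  # useGrid=false is the spelling from the Canu docs (assumed equivalent to
  # the --useGrid=FALSE form mentioned above).
  echo "${CANU_BIN} -p $1 -d $2 genomeSize=$3 useGrid=false -pacbio-raw $4"
}

canu_cmd ecoli ecoli-auto 4.8m p6.25x.fastq
```

Swapping in a different prefix, directory, genome size and read file gives the matching command for another data set.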
Edit: It finished, and looks like it works with the sample data. Now to try it with our Oly PacBio stuff.
Went into UW today and checked all the silos for larvae. Two of the 13 individual oysters had spawned, one with thousands of larvae and one with only a few. There was also a possible presence of ciliates, but there were larvae there!
Here are my counts:
In addition, I talked with Steven and Grace about how to image all of Laura’s histology slides from after the OA treatment. From there I will go through them all and compare gonad maturity between treatments, both after the overwintering alone and after the OA treatments.