Redundans finished over the weekend, but the results were a little… odd.
Stats for the Illumina-only Platanus assembly and the completed Redundans run are below. The N50 increases from 105 to 251, but in doing so we lose 75% of the total bases and 80% of the contigs from the initial assembly. Something’s wonky in the pipeline.
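For a quick sanity check of numbers like these, here is a minimal sketch of how contig count, total bases, and N50 could be computed straight from a FASTA file. The function name `assembly_stats` and the file name in the usage comment are hypothetical, and this is not necessarily how the stats above were generated. It uses the usual N50 definition (the length of the contig at which the running total, counted from the longest contig down, first reaches half of all bases) and skips zero-length records.

```shell
# Print "<contigs> <total_bases> <N50>" for the FASTA file given as $1.
# assembly_stats is a hypothetical helper name, not part of Platanus or Redundans.
assembly_stats() {
    awk '
        /^>/ { if (len > 0) print len; len = 0; next }  # header: emit previous record length
        { gsub(/[ \t\r]/, ""); len += length($0) }      # sequence line: add its characters
        END { if (len > 0) print len }                  # emit the last record
    ' "$1" |
    sort -rn |
    awk '
        { lens[NR] = $1; total += $1 }                  # lengths arrive longest first
        END {
            half = total / 2
            for (i = 1; i <= NR; i++) {
                running += lens[i]
                if (running >= half) { print NR, total, lens[i]; exit }
            }
        }
    '
}

# Usage (assembly.fa is a placeholder file name):
#   assembly_stats assembly.fa
```

Running this on both the input and output assemblies makes it easy to see whether a higher N50 came from real merging or simply from discarding short contigs.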
I did some more research and realized I may have wrongly assumed that Redundans would take the final output of the Platanus pipeline, a gap-closed scaffold assembly. It looks like it actually wants the initial assembled contigs.
I’ve started another run supplying the contig assembly, and we’ll see if that yields better results! I’m a little skeptical, though, because the contig assembly only has 16 million base pairs, which still seems awfully low.
Redundans output can be found here.
The final assembly is: scaffolds.reduced.fa
I’ve also installed Falcon on Hyak. Falcon is a de novo assembler for PacBio-only reads.
Some notes on installing Falcon:
- Don’t load the anaconda2_4.3.1 module. Falcon has nice scripts that download and install things inline, so you can’t specify where things are installed (user directory vs. anaconda module directory).
- Everything has to be done on a build node, rather than an interactive node, since the scripts both download and install packages.
- Install networkx v1.10 with easy_install before starting, specifying an install directory via something like:
easy_install --install-dir /gscratch/srlab/programs/networkx networkx==1.10
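One wrinkle with a custom install directory: as far as I know, easy_install generally refuses to install into a directory that isn't already on PYTHONPATH, and Python won't find the package later unless that directory is on the path too. A minimal sketch of the setup, assuming a bash shell; the helper name `add_to_pythonpath` is hypothetical and the path is just the one from the command above:

```shell
# Create the target directory and put it at the front of PYTHONPATH.
# add_to_pythonpath is a hypothetical helper name, not part of Falcon or setuptools.
add_to_pythonpath() {
    mkdir -p "$1" || return 1
    # Only append the old value (with a separator) if PYTHONPATH was already set.
    export PYTHONPATH="$1${PYTHONPATH:+:$PYTHONPATH}"
}

# Usage on a build node, before running the Falcon install scripts:
#   add_to_pythonpath /gscratch/srlab/programs/networkx
#   easy_install --install-dir /gscratch/srlab/programs/networkx networkx==1.10
```

The same export would presumably need to be in place in any later session or job script that runs Falcon, so that networkx is still importable.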