Oly Genome: Redundans run finished.

Redundans finished over the weekend, but the results were a little… odd.

Stats for the Illumina-only Platanus assembly and the completed Redundans run are below. The N50 increased from 105 to 251, but in the process we lost 75% of the total bases and 80% of the contigs from the initial assembly. Something’s wonky in the pipeline.
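For reference, here is a minimal sketch of how N50 and the percent lost can be computed from a list of contig lengths. The lengths are made-up toy numbers, not values from either assembly, and the function names (n50, percent_lost) are just illustrative. It shows why a higher N50 on its own isn't necessarily good news: throwing away short contigs raises N50 even though bases are being lost.

```python
def n50(lengths):
    """Return the N50: the contig length L such that contigs of
    length >= L contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


def percent_lost(before, after):
    """Percent of `before` that is missing from `after`."""
    return 100.0 * (before - after) / before


# Hypothetical toy contig lengths (NOT the real Platanus/Redundans data).
initial = [100, 100, 100, 100, 200, 200, 300, 400]
reduced = [300, 400]  # pretend the reduction step dropped the short contigs

print("N50 before:", n50(initial), "after:", n50(reduced))
print("bases lost (%):", round(percent_lost(sum(initial), sum(reduced)), 1))
print("contigs lost (%):", percent_lost(len(initial), len(reduced)))
```

In this toy case the N50 doubles while more than half of the bases disappear, which is the same pattern as the Redundans result above.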

After some more reading, I realized I may have made a faulty assumption: that Redundans takes the final output of the Platanus pipeline, a gap-closed scaffold assembly. It looks like it actually wants the initial assembled contigs.

I’ve started another run, this time supplying the contig assembly, and we’ll see if that yields better results! I’m a little skeptical, though, because the contig assembly only has 16 million base pairs, which still seems awfully low.

Redundans Output can be found: here/

The final assembly is: scaffolds.reduced.fa

I’ve also installed Falcon on Hyak. Falcon is a de novo assembler for PacBio-only reads.

Some notes on the installation of Falcon:

  • Don’t load the anaconda2_4.3.1 module. Falcon has nice scripts that download and install things inline, so you don’t have the ability to specify where things are installed (user directory vs. anaconda module directory).
  • Everything has to be done on a build node, rather than an interactive node, since the scripts both download and install packages.
  • Before starting, install networkx v1.10 using easy_install, specifying an install directory via something like: easy_install --install-dir /gscratch/srlab/programs/networkx networkx==1.10