I’ve been trying to get Falcon installed on Hyak, Emu, or even my laptop for a couple of days now with no success. There wasn’t much help on the Falcon GitHub. After doing some reading, it looks like Canu may be an option that is as good as, or better than, Falcon, so let’s give that a whirl!
Installation was super simple. I just cloned the GitHub repo onto Hyak via
git clone https://github.com/marbl/canu.git into
/gscratch/srlab/programs/canu/, changed directory into
/gscratch/srlab/programs/canu/src/ and ran
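The build command itself is not recorded here. For reference, the Canu README describes a source build as clone, change into src, then make, so the install likely looked something like the sketch below. The thread count and the DRY_RUN switch are my own assumptions for illustration, not something from the original session.

```shell
#!/bin/sh
# Sketch of a Canu source build following the README (clone, cd src, make).
# Assumptions: THREADS and DRY_RUN are illustrative and not from the post.
set -eu

PROGRAMS_DIR=/gscratch/srlab/programs   # parent of the canu/ directory named in the post
THREADS=4                               # hypothetical; use what the node allows
DRY_RUN=${DRY_RUN:-1}                   # 1 = only print the commands, 0 = run them

run() {
  # Print the command, and execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

build_canu() {
  run cd "$PROGRAMS_DIR"
  run git clone https://github.com/marbl/canu.git
  run cd "$PROGRAMS_DIR/canu/src"
  run make -j "$THREADS"
}

build_canu
```

By default this only prints the four steps. Setting DRY_RUN=0 before running it would actually clone and build.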
The Canu developers supply a sample assembly data set which can be downloaded via
curl -L -o p6.25x.fastq http://gembox.cbcb.umd.edu/mhap/raw/ecoli_p6_25x.filtered.fastq
which I downloaded into
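A quick sanity check on the download can save a wasted interactive session. The sketch below counts reads, assuming standard four-line FASTQ records (no wrapped sequence lines). The function name count_fastq_reads is a hypothetical helper, not part of Canu.

```shell
#!/bin/sh
# Count reads in a FASTQ file, assuming exactly four lines per record.
# count_fastq_reads is a hypothetical helper name, not a Canu tool.
count_fastq_reads() {
  lines=$(wc -l < "$1")
  echo $((lines / 4))
}

# Report on the sample data only if it is present and non-empty.
if [ -s p6.25x.fastq ]; then
  echo "reads: $(count_fastq_reads p6.25x.fastq)"
fi
```

An empty or missing file here would suggest the curl download failed or was redirected somewhere unexpected.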
To run the assembly, I spooled up a 4-hour interactive session (hopefully this is long enough) and ran
/gscratch/srlab/programs/canu/Linux-amd64/bin/canu -p ecoli -d ecoli-auto genomeSize=4.8m -pacbio-raw p6.25x.fastq
This did not work: Canu is built to submit its jobs to a grid scheduler, so running it directly inside an interactive session needs the
--useGrid=FALSE argument added to the command. After changing that, everything looks like it’s working fine. Once I’ve confirmed this works, I’ll get to work on the PacBio-only assembly for the Oly genome.
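For reference, here is a sketch of the full rerun with grid submission turned off. Note that the Canu documentation writes this option as useGrid=false (key=value, no leading dashes). Whether the --useGrid=FALSE spelling is also accepted may depend on the Canu version, so treat the spelling below as the documented form rather than exactly what was typed. The would-run fallback is my own addition for illustration.

```shell
#!/bin/sh
# Sketch: rerun the sample assembly locally, without submitting to the scheduler.
# Paths and parameters are from the post; the "would run" fallback is illustrative.
CANU=/gscratch/srlab/programs/canu/Linux-amd64/bin/canu
READS=p6.25x.fastq

run_canu_local() {
  # Build the command line, using the documented useGrid=false spelling.
  set -- "$CANU" -p ecoli -d ecoli-auto genomeSize=4.8m useGrid=false -pacbio-raw "$READS"
  if [ -x "$CANU" ] && [ -f "$READS" ]; then
    "$@"
  else
    # Binary or reads not found: just show what would have been run.
    echo "would run: $*"
  fi
}

run_canu_local
```

Run from the directory holding p6.25x.fastq, this writes the assembly into ecoli-auto/ with the ecoli prefix, the same as the original command.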
Edit: It finished, and looks like it works with the sample data. Now to try it with our Oly PacBio stuff.