So we’re looking for ways to combine the PacBio sequencing for Olys with the previous Illumina data we had. As always, this isn’t as simple as it looks. One option is to assemble the Illumina data with an assembler like Platanus, and then combine that assembly with the PacBio data via Redundans. So let’s get Redundans installed!
The developer of Redundans made a really nice shell script that installs Redundans from GitHub along with all of its dependencies. This doesn’t work on Hyak because we lack the necessary permissions, as always. So I tried installing by cloning the GitHub repo instead.
First, we clone the GitHub repo via
git clone --recursive https://github.com/lpryszcz/redundans.git
Then, due to some oddities with timestamps and GitHub clones, we navigate to
/redundans/bin/parallel/ and execute
autoreconf -ivf which fixes the timestamps of all the different autoconf files. Then, we execute
This… sort of works, but again fails to install the dependencies. So we get started doing that by hand! We already have GNU parallel and BWA installed, so those can thankfully be skipped.
Last was downloaded via
SNAP aligner was cloned from the repo via
git clone https://github.com/amplab/snap
SSPACE was downloaded via wget after signing up for a license from the provided website.
GapCloser was downloaded via
curl -O -L https://sourceforge.net/projects/soapdenovo2/files/GapCloser/bin/r6/GapCloser-bin-v1.12-r6.tgz
pyScaf was downloaded via
git clone https://github.com/lpryszcz/pyScaf
FastaIndex was downloaded via
git clone https://github.com/lpryszcz/FastaIndex
Some notes about installing individual programs:
Last requires GCC 6.3 which is available via
module load contrib/gcc/6.3.0
SNAP Aligner requires GCC 4.3, which is the native GCC on Hyak. So you have to unload the GCC 6.3.0 module that you just loaded for Last via
module unload contrib/gcc/6.3.0
SSPACE comes in both Standard and LongReads flavors. I wasn’t sure which one Redundans wanted, so I got both!
Both pyScaf and FastaIndex require the
--user argument to the
python setup.py install command to install in the user directory. This will likely cause problems downstream if other users try to use the programs, so we’ll have to figure out a longer-term solution.
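The GCC juggling for Last and SNAP described above can be sketched as a small shell function. This is only a sketch under some assumptions: build_last_then_snap is a made-up name, the two directory arguments are placeholders for wherever the sources were unpacked, and it assumes both programs build with a plain make in their top-level directory.

```shell
# Sketch: build Last under GCC 6.3.0, then SNAP under the system GCC.
# build_last_then_snap is a hypothetical helper. Its two arguments are
# placeholders for the Last and SNAP source directories.
# Assumes a plain `make` is enough to build each program.
build_last_then_snap() {
    last_dir=$1
    snap_dir=$2
    # Load the newer compiler for Last, build, then drop back to the
    # native GCC before building SNAP. Stops at the first failure.
    module load contrib/gcc/6.3.0 &&
        make -C "$last_dir" &&
        module unload contrib/gcc/6.3.0 &&
        make -C "$snap_dir"
}
```

It would be called with the two source directories, for example build_last_then_snap path/to/last path/to/snap. Chaining with && means a failed Last build never reaches the module unload or the SNAP build, so the failure is easy to spot.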
After adding all of the different program paths to my paths.sh file, this gets us nearly all the way to a functioning Redundans install.
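For anyone following along, a paths.sh along these lines would do the job. This is a minimal sketch and not my actual file: path_prepend is a made-up helper, and the snap and last subdirectories are placeholders (only the /gscratch/srlab/programs/redundans location shows up in the output further down).

```shell
# paths.sh (sketch): source this from ~/.bashrc or from a job script.
# path_prepend is a hypothetical helper that puts a directory at the
# front of PATH unless PATH already lists it.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, do nothing
        *) PATH="$1:$PATH" ;;
    esac
}

# Placeholder locations; adjust to wherever each program was built.
PROGRAMS=/gscratch/srlab/programs
path_prepend "$PROGRAMS/redundans"
path_prepend "$PROGRAMS/snap"
path_prepend "$PROGRAMS/last/src"
export PATH
```

Guarding against duplicates keeps PATH from growing every time the file is sourced, which matters if it gets sourced from both a login shell and a job script.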
Using the test data supplied with the install, we can execute
./redundans.py -v -i test/*.fq.gz -f test/contigs.fa -o test/run1 which loads up the contigs.fa file and then fails due to the following error:
Traceback (most recent call last):
  File "./redundans.py", line 517, in <module>
    main()
  File "./redundans.py", line 512, in main
    o.norearrangements, o.verbose, o.log)
  File "./redundans.py", line 312, in redundans
    libraries = get_libraries(fastq, lastOutFn, mapq, threads, verbose, log)
  File "./redundans.py", line 59, in get_libraries
    genomeFrac, stdfracTh, maxcfracTh)
  File "/gscratch/srlab/programs/redundans/bin/fastq2insert_size.py", line 189, in fastq2insert_size
    isstats = get_isize_stats(fq1, fq2, fasta, mapq, threads, limit, verbose, stdfracTh, maxcfracTh)
  File "/gscratch/srlab/programs/redundans/bin/fastq2insert_size.py", line 111, in get_isize_stats
    aligner = _get_snap_proc(fq1, fq2, fasta, threads, verbose, alignerlog)
  File "/gscratch/srlab/programs/redundans/bin/fastq2sspace.py", line 133, in _get_snap_proc
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=log)
  File "/sw/anaconda-4.3.1/python2/lib/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/sw/anaconda-4.3.1/python2/lib/python2.7/subprocess.py", line 1024, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
Time to chase that down.
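A reasonable first step: when subprocess.Popen raises OSError with Errno 2, it usually means the program it was asked to launch could not be found, rather than an input file being missing. Since the last Redundans frame is _get_snap_proc, the SNAP binary is the obvious suspect. The sketch below checks whether a list of executables resolves on PATH. check_on_path is a made-up helper, and the program names passed to it are assumptions about what Redundans calls, so check them against the scripts in redundans/bin.

```shell
# Print one "MISSING: name" line for each program that cannot be found
# on PATH; return non-zero if anything is missing.
# check_on_path is a hypothetical helper, not part of Redundans.
check_on_path() {
    missing=0
    for prog in "$@"; do
        if ! command -v "$prog" >/dev/null 2>&1; then
            echo "MISSING: $prog"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names; verify against the Redundans scripts.
check_on_path bwa lastal lastdb snap-aligner GapCloser parallel || true
```

Anything reported as missing either was never built or lives in a directory that paths.sh does not add to PATH.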