Sean’s Notebook: Starting GARM meta-assembly of PacBio and BGI assemblies for Olys.

The Pilon polishing of the CANU assembly finished last night. The result looks pretty small, but hopefully between it, the BGI assembly, and the Platanus assembly, we can piece together one decent genome. Fingers crossed, at least.

Assembly stats on the polished assembly:

D-69-91-159-59:BGI_Oly_Genome Sean$ assembly-stats oly_polished_.fasta 
stats for oly_polished_.fasta
sum = 46364927, n = 3388, ave = 13685.04, largest = 61211
N50 = 14126, n = 1230
N60 = 12962, n = 1573
N70 = 11906, n = 1947
N80 = 10932, n = 2352
N90 = 9590, n = 2803
N100 = 2074, n = 3388
N_count = 1
Gaps = 1
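
For reference, here is a minimal sketch of how the N50 through N100 lines above can be reproduced from contig lengths. It assumes the usual definition (sort contigs longest first, and report the contig length at which the running total reaches x% of the assembly size, plus how many contigs that took); I have not checked that assembly-stats handles edge cases identically. The helper names `nx` and `fasta_lengths` are made up for this example.

```python
def nx(lengths, x=50):
    """Return (Nx, count) for a collection of contig lengths.

    Nx is the length of the contig at which the running total, taken
    longest contig first, reaches x% of the summed assembly length.
    count is the number of contigs needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length, count


def fasta_lengths(path):
    """Yield the sequence length of each record in a FASTA file."""
    length = None
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if length is not None:
                    yield length
                length = 0
            elif length is not None:
                length += len(line.strip())
    if length is not None:
        yield length


# Toy example: five contigs totalling 30 bp.
toy = [10, 8, 6, 4, 2]
print(nx(toy, 50))   # (8, 2): 10 + 8 = 18 reaches half of 30
print(nx(toy, 100))  # (2, 5): the smallest contig, all five needed
```

Calling `nx(list(fasta_lengths("oly_polished_.fasta")), 50)` should correspond to the N50 line above if the definitions match. The `ave` value is just `sum / n`, i.e. 46364927 / 3388 ≈ 13685.04.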

A single gap is interesting, but with the assembly at least one, if not two, orders of magnitude smaller than the expected genome size, I think we’re short on the coverage needed for conservative error correction settings. I’ll have to reassemble with looser settings and see if we can bump the assembly size up.

Polished CANU assembly found: here

Pilon output file: here

Next step is GARM, to see what that gives us. I think I’ll also re-assemble the PacBio stuff with much less stringent error correction to see if that gives any measurable difference in outputs.

Edit: I also finished the --non_directional runs for Bismark. Mapping rates didn’t change, and there was less than 1% complementary-strand mapping, so it looks like the regular (directional) arguments are correct. Output .bam files are found here with the NonDir tag.