Sean’s Notebook: Starting GARM meta-assembly of PacBio and BGI assemblies for Olys.
The Pilon polishing for the CANU assembly finished last night, and it seems pretty small, but hopefully between that, the BGI assembly, and the Platanus assembly, we can assemble ourselves up one decent genome. Fingers crossed at least.
Assembly stats for the polished assembly:
D-69-91-159-59:BGI_Oly_Genome Sean$ assembly-stats oly_polished_.fasta
stats for oly_polished_.fasta
sum = 46364927, n = 3388, ave = 13685.04, largest = 61211
N50 = 14126, n = 1230
N60 = 12962, n = 1573
N70 = 11906, n = 1947
N80 = 10932, n = 2352
N90 = 9590, n = 2803
N100 = 2074, n = 3388
N_count = 1
Gaps = 1
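For reference, here is a minimal Python sketch of how Nx values like the ones above can be computed from contig lengths. It is not the assembly-stats implementation: the function names `fasta_lengths` and `nx_stats` are made up for illustration, and the tie-breaking rule (stop at the first contig where the running total reaches x% of the total length) is an assumption that may differ from what the tool does.

```python
def fasta_lengths(lines):
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def nx_stats(lengths, fractions=(50, 90)):
    """Return {x: (Nx length, number of contigs needed)} for each x in fractions.

    Assumption: Nx is the length of the contig at which the running sum of
    lengths, sorted longest first, first reaches x% of the total length.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    results = {}
    for x in fractions:
        threshold = total * x / 100
        running = 0
        for count, length in enumerate(ordered, start=1):
            running += length
            if running >= threshold:
                results[x] = (length, count)
                break
    return results


if __name__ == "__main__":
    # Toy contig lengths, not real data.
    toy = [10, 8, 6, 4, 2]
    print(nx_stats(toy, fractions=(50, 90, 100)))
```

With real data, the lengths would come from something like `fasta_lengths(open("oly_polished_.fasta"))`, and the "n" reported next to each Nx in the output above corresponds to the second element of each tuple.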
One gap is interesting, but with the assembly size being at least one, if not two, orders of magnitude smaller than the expected genome size, I think we’re short on the coverage needed for conservative error correction levels. I’ll have to reassemble with looser standards and see if we can bump it up.
Polished CANU assembly found: here
Pilon output file: here
Next step is GARM, to see what that gives us. I think I’ll also re-assemble the PacBio stuff with much less stringent error correction to see if that gives any measurable difference in outputs.
Edit: I also finished the --non_directional runs for Bismark. There was no change in mapping rates and less than 1% complementary mapping, so it looks like the regular arguments are correct. Output .bam files are found here with the NonDir tag.