Checked DNA integrity of the Crassostrea virginica mantle gDNA I isolated on 20181114 in preparation for MBD enrichment. Detailed sample info and sample processing (including calculations for MBD enrichment using the MethylMiner Kit) are here (Google Sheet):
Loaded 100ng of each sample on a 0.8% agarose, 1X Low TAE gel with ethidium bromide. Each sample was mixed with 3uL of 6x loading dye and brought up to 18uL with nuclease-free H2O. Gel was run for ~2hrs @ 100V. Used two ladders in a (failed) attempt to accurately assess gDNA size:
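The per-lane volumes follow from the numbers above: the DNA volume is 100ng divided by the sample concentration, and water makes up the remainder of the 18uL after the 3uL of dye. A minimal sketch of that arithmetic is below; the function name `gel_volumes` and the 25ng/uL example concentration are hypothetical and not taken from the Google Sheet.

```shell
#!/usr/bin/env bash
# Sketch: per-lane volumes for loading a fixed DNA mass on a gel.
# TARGET_NG, DYE_UL and TOTAL_UL come from the entry above;
# the concentration passed in below is a hypothetical example value.

TARGET_NG=100   # ng of gDNA per lane
DYE_UL=3        # uL of 6x loading dye
TOTAL_UL=18     # final volume per lane (uL)

# gel_volumes CONC_NG_PER_UL  -> prints "dna_uL water_uL"
gel_volumes() {
  awk -v conc="$1" -v ng="$TARGET_NG" -v dye="$DYE_UL" -v total="$TOTAL_UL" 'BEGIN {
    if (conc <= 0) { print "concentration must be > 0" > "/dev/stderr"; exit 1 }
    dna = ng / conc                 # uL of sample holding the target mass
    water = total - dye - dna       # uL of water to reach the final volume
    if (water < 0) { print "sample too dilute for this lane volume" > "/dev/stderr"; exit 1 }
    printf "%.2f %.2f\n", dna, water
  }'
}

# Hypothetical sample at 25 ng/uL
gel_volumes 25
```

Note that 3uL of 6x dye in 18uL total gives a 1x final dye concentration, and any sample below roughly 6.7ng/uL would not fit in the lane volume at 100ng.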
The idea was that the HighRange ladder would cover the high molecular weight range of intact gDNA, while the DNA Ladder Mix would cover any degraded DNA and/or RNA present.
Ran BUSCO on our completed annotation of the P.generosa v071 genome ([GFF](https://ift.tt/2GYMI4Jgeoduck_maker_genome_annotation/Pgenerosa_v071_genome_snap02.all.renamed.putative_function.domain_added.gff)) (subset of sequences >10kbp). See this notebook entry for genome annotation info. BUSCO provides a nice metric of how “complete” a genome assembly (or transcriptome) is. Additionally, BUSCO is tied in with Augustus for gene prediction and generates _ab initio_ gene models. That said, since I just want to evaluate the completeness of this particular genome assembly, I’ll be using the annotated genome generated through two rounds of SNAP gene prediction. Otherwise, I’d use the initial MAKER annotations to generate an Augustus gene model that could be used in conjunction with the SNAP models (I’ll likely do this at a later date).
Firstly, I needed a FastA as input for BUSCO, so I extracted the FastA from the GFF with the following script:
    #!/bin/env bash
    # Script to extract FastA sequences from GFF3 (specifically, those produced by MAKER)
    # User needs to set GFF path and desired output file name
    #
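For readers who want to try this kind of extraction, here is a minimal sketch. It is not the script from this entry; it assumes the GFF3 carries its sequences after a `##FASTA` directive line, which is how GFF3 files with embedded sequence are laid out. The function name `extract_fasta` and the tiny example file are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: pull the embedded sequences out of a GFF3 file.
# Assumes the file carries its sequences after a "##FASTA" directive line.

# extract_fasta INPUT_GFF OUTPUT_FASTA
extract_fasta() {
  local gff="$1" fasta="$2"
  # Print only the lines that come after the first "##FASTA" line.
  awk 'found { print } /^##FASTA/ { found = 1 }' "$gff" > "$fasta"
  # Return non-zero if nothing was extracted (no ##FASTA section found).
  [ -s "$fasta" ]
}

# Example with a tiny, made-up GFF3 written to a temp directory
tmpdir=$(mktemp -d)
{
  printf '##gff-version 3\n'
  printf 'contig1\tmaker\tgene\t1\t8\t.\t+\t.\tID=gene1\n'
  printf '##FASTA\n>contig1\nACGTACGT\n'
} > "$tmpdir/example.gff"

extract_fasta "$tmpdir/example.gff" "$tmpdir/example.fasta"
cat "$tmpdir/example.fasta"
```

The emptiness check at the end of the function is there so that a GFF written without embedded sequence fails loudly instead of handing an empty FastA to BUSCO.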