We have an upcoming meeting with Illumina to discuss how the geoduck genome project is coming along and to decide how we want to proceed.
I used the following assemblies as references for read mapping:
- sn_ph_01 : SuperNova assembly of 10x Genomics data
- sparse_03 : SparseAssembler assembly of BGI and Illumina project data
- pga_02 : Hi-C assembly of Phase Genomics data
The analysis is documented in a Jupyter Notebook.
Jupyter Notebook (GitHub):
NOTE: Due to the large amount of stdout from the first genome index command, the notebook does not render well on GitHub. I recommend downloading the notebook and opening it in a locally installed version of Jupyter.
Here’s a brief overview of the process:
- Generate Bowtie2 indexes for each of the genome assemblies.
- Map 1,000,000 reads from the following Illumina NovaSeq FastQ files: