Sam’s Notebook: Kmer Estimation – Kmergenie on Geoduck Sequence Data (default settings)


After the last SparseAssembler assembly completed, I wanted to do another run with a different kmer size (last time it was arbitrarily set at 101). However, I didn’t really know how to decide, particularly since this assembly consisted of mixed read lengths (50bp and 100bp). So, I ran kmergenie on all of our geoduck (Panopea generosa) sequencing data in hopes of getting a kmer determination to apply to my next assembly.

The job was run on our Mox HPC node.

Slurm script:

Input files list (needed for kmergenie command – see Slurm script linked above): geoduck_fastq_list.txt


Output folder: 20180419_kmergenie_geoduck/

Slurm output file: 20180419_kmergenie_geoduck/slurm-161551.out

Kmer histograms (HTML): 20180419_kmergenie_geoduck/histograms_report.html

Screen cap from Kmer report:


Kmergenie estimates the best kmer size for this data to be 121.

However, based on the kmergenie documentation, this estimate is likely to be inaccurate, because the kmer graph should be concave. Our graph, instead, is only partial: we haven’t reached a kmer size where the number of kmers is decreasing, so 121 is just the largest size tested rather than a true peak.
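The "partial graph" problem can be checked mechanically. Below is a minimal sketch, assuming the kmergenie curve has been reduced to a list of (kmer size, number of genomic kmers) pairs. The function name `peak_status` and the example numbers are made up for illustration and are not values from this run.

```python
def peak_status(histogram):
    """Return (best_k, peak_is_interior) for (k, kmer_count) pairs.

    peak_is_interior is False when the highest count sits at the
    smallest or largest k tested, i.e. the curve never turns over
    inside the tested range and best_k is only a boundary value.
    """
    points = sorted(histogram)           # order by kmer size
    ks = [k for k, _ in points]
    counts = [n for _, n in points]
    best_index = counts.index(max(counts))
    peak_is_interior = 0 < best_index < len(ks) - 1
    return ks[best_index], peak_is_interior


# Made-up curves, for illustration only.
partial_curve = [(21, 100), (41, 250), (61, 400), (81, 520), (101, 610), (121, 680)]
concave_curve = [(21, 100), (41, 300), (61, 420), (81, 390), (101, 310), (121, 200)]

if __name__ == "__main__":
    print(peak_status(partial_curve))   # best k is the largest k tested
    print(peak_status(concave_curve))   # best k lies inside the tested range
```

A boundary result like the first one is the situation described above: the reported best kmer size only reflects where the search stopped.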

As such, I’ll try re-running with a different maximum kmer setting (default max is 121).

from Sam’s Notebook

Yaamini’s Notebook: Project Ideas

Things to do if I had infinite research money

Now that I’m wrapping up my DNR and Manchester papers, I need to think of what to do next! I did some literature scans and have some ideas on what I could do if I had research funding. Some ideas are more fleshed out than others, and some are probably better off as side projects, but that’s okay.

  • Translating model organism work to a non-model organism: Manila clams
    • Manila clams have a genome and are an important tribal resource. It would be cool to do some of the first transgenerational ocean acidification, gene expression, and epigenetic regulation work on a species that people depend on for food
  • Invasive species and epigenetics
    • I don’t know why but I’ve really started to get interested in invasive species. I think considering both climate change stressors and invasive species effects on ecosystems would be cool to do at a genetic or epigenetic level
    • Epigenetic comparison of the original Japanese Pacific oyster population and the Washington population. There’s been some evidence to suggest that different populations of C. gigas respond differently to ocean acidification. I could compare the genetic and epigenetic potential of different C. gigas populations to adapt to climate change, and then relate those adaptation potentials to the projections for the environments they live in.
    • Tracking green crab invasions with epigenetics.
  • Response timing: when a stressor occurs, and when the effect can be detected
    • Test a gradient of different exposure times and lag times on both wild and hatchery populations
    • Sample wild oysters after an upwelling event (or other acute stressors) and see if we can detect a response
    • Compare adult and larval methylomes from my 2017 hatchery experiment to see if there’s any evidence of transgenerational epigenetic inheritance
    • Species comparison of genetic and epigenetic responses to multiple stressors. Can compare Pacific and Olympia oysters in both hatchery and wild settings


from the responsible grad student