Grace’s Notebook: Submitted 6 pooled crab samples to NWGC for QC

Today I submitted 6 pooled samples of Crab RNA to NWGC for QC. After they run QC, they’ll let us know what our sequencing options are.

GitHub Issue: #798

Pooled the samples originally on Nov 22nd, 2019

After pooling, I realized the samples had to be concentrated to meet NWGC’s minimum requirement of 50 ng/ul… so I attempted to concentrate them using a kit from Zymo after it arrived in the mail.

The attempt at concentrating was unsuccessful because I didn’t use the correct volume of RNA Binding Buffer (post: here).

Sam recommended I contact NWGC and ask if we can still get some sequencing done even though our samples are not at their requirements, which brought us to today!

I put the 6 pooled samples in a plate that was provided by Jeff Weiss at NWGC, and walked them over there.

They will do QC testing, and then let us know what our sequencing options are afterwards.

Here’s the manifest of what I submitted today:

Plate Well Location | Investigator Sample ID | Sex | Organism | Concentration (ng/uL) | Volume (uL) | Sample Source | Type of Sample | Suspended In | Extraction Method | Investigator Last Name
A:1 | D9_0 | Male | Chionoecetes bairdi | 15.7 | 33 | crab hemolymph | RNA | TE | Zymo Research: Quick-DNA/RNA Microprep Plus Kit | Crandall
B:1 | D9_1 | Male | Chionoecetes bairdi | 17.6 | 33 | crab hemolymph | RNA | TE | Zymo Research: Quick-DNA/RNA Microprep Plus Kit | Crandall
C:1 | D12_cold_0 | Male | Chionoecetes bairdi | 24.9 | 33 | crab hemolymph | RNA | TE | Zymo Research: Quick-DNA/RNA Microprep Plus Kit | Crandall
D:1 | D12_cold_1 | Male | Chionoecetes bairdi | 26 | 33 | crab hemolymph | RNA | TE | Zymo Research: Quick-DNA/RNA Microprep Plus Kit | Crandall
E:1 | D12_warm_0 | Male | Chionoecetes bairdi | 26 | 33 | crab hemolymph | RNA | TE | Zymo Research: Quick-DNA/RNA Microprep Plus Kit | Crandall
F:1 | D12_warm_1 | Male | Chionoecetes bairdi | 24.4 | 33 | crab hemolymph | RNA | TE | Zymo Research: Quick-DNA/RNA Microprep Plus Kit | Crandall

Manifest fields left blank: Additional Sample ID, Family Number, Replacement, Date of Birth, Race, RNA Quality Score, Certified for dbGaP, dbGaP ID. The manifest template also marks nine of its fields as “(Required)”, one of them as “(Required if RNAseq)”.

from Grace’s Lab Notebook

Laura’s Notebook: February 2020 goals

Yikes, it’s been a few months …

  • Finish QuantSeq libraries – last step is to process some deployed juvenile Olys (RNA isolation, library prep)
  • Coordinate sequencing – UW or UMinnesota?
  • DMG and DMR analysis on Oly methylation data. Make sure that I’ve controlled for genotype (i.e. differences aren’t due to presence/absence of certain genes/loci) – does the filtering accomplish this?
  • Prepare and deliver presentation at Aquaculture America 2020
  • Revise Oly Temp/Food draft, and rough draft of introduction
  • Submit Polydora paper to Aquaculture Research
  • Meet OA/Reproduction deadlines

Also … Met with Krista – we are a go on the internship. I will get my hands on the data as soon as it’s ready (April?). She supports me doing the NSF INTERN in Fall/Winter, so I should continue pursuing that. I need to have a full analysis of the data by November at the latest.

from The Shell Game

Sam’s Notebook: Data Wrangling – Arthropoda and Alveolata Day and Treatment Taxonomic RNAseq FastQ Extractions

After using MEGAN6 to extract Arthropoda and Alveolata reads from our RNAseq data on 20200114, I extracted taxonomic-specific reads and aggregated each into basic Read 1 and Read 2 FastQs to simplify transcriptome assembly for C. bairdi and for Hematodinium. That was fine and all, but wasn’t fully thought through.

For gene expression analysis, I need the FastQs based on infection status and sample days. So, I need to modify the read extraction procedure to parse reads based on those conditions. I could’ve/should’ve done this originally, as I could’ve just assembled the transcriptome from the FastQs I’m going to generate now. Oh well.

As a reminder, the reason I’m doing this is that I realized that the FastA headers were incomplete and did not distinguish between paired reads. Here’s an example:

R1 FastQ header:

@A00147:37:HG2WLDMXX:1:1101:5303:1000 1:N:0:AGGCGAAG+AGGCGAAG

R2 FastQ header:

@A00147:37:HG2WLDMXX:1:1101:5303:1000 2:N:0:AGGCGAAG+AGGCGAAG

However, the reads extracted via MEGAN have FastA headers like this:

>A00147:37:HG2WLDMXX:1:1101:5303:1000
SEQUENCE1
>A00147:37:HG2WLDMXX:1:1101:5303:1000
SEQUENCE2

Those are a set of paired reads, but there’s no way to distinguish between R1/R2. This may not be an issue, but I’m not sure how downstream programs (e.g. Trinity) will handle duplicate FastA IDs as inputs. To avoid any headaches, I’ve decided to parse out the corresponding FastQ reads, which have the full header info.

Anyway, here’s a brief rundown of the approach:

  1. Create list of unique read headers from MEGAN6 FastA files.
  2. Use list with seqtk program to pull out corresponding FastQ reads from the trimmed FastQ R1 and R2 files.
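
The two steps above can be sketched in Python. This is only an illustration, not the code from the notebook: the real extraction used seqtk (its `subseq` command takes a FastQ file and a list of read names), while the pure-Python filter below stands in for it. The function names and the toy records are hypothetical.

```python
def unique_read_ids(fasta_lines):
    """Step 1: collect the unique read IDs from MEGAN6-style FastA headers.

    Paired reads share one ID in the MEGAN6 output, so a set collapses them.
    """
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            # Keep only the ID token, dropping ">" and anything after a space.
            ids.add(line[1:].split()[0])
    return ids


def extract_fastq_records(fastq_lines, wanted_ids):
    """Step 2 (stand-in for seqtk): keep 4-line FastQ records whose ID is wanted."""
    lines = [ln.rstrip("\n") for ln in fastq_lines]
    records = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        # "@ID 1:N:0:INDEX" -> "ID"; the full header is preserved in the output.
        read_id = header[1:].split()[0]
        if read_id in wanted_ids:
            records.append((header, seq, plus, qual))
    return records


if __name__ == "__main__":
    # Toy inputs (hypothetical sequences); the read ID matches the example above.
    fasta = [
        ">A00147:37:HG2WLDMXX:1:1101:5303:1000", "ACGT",
        ">A00147:37:HG2WLDMXX:1:1101:5303:1000", "TTGA",
    ]
    r1_fastq = [
        "@A00147:37:HG2WLDMXX:1:1101:5303:1000 1:N:0:AGGCGAAG+AGGCGAAG",
        "ACGT", "+", "FFFF",
        "@A00147:37:HG2WLDMXX:1:1101:9999:1000 1:N:0:AGGCGAAG+AGGCGAAG",
        "GGCC", "+", "FFFF",
    ]
    ids = unique_read_ids(fasta)
    for record in extract_fastq_records(r1_fastq, ids):
        print("\n".join(record))
```

Running the same ID list against the trimmed R1 and R2 files separately yields two FastQs whose headers keep the `1:N:0:…` / `2:N:0:…` pair information.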

The entire procedure is documented in a Jupyter Notebook below.

Jupyter notebook (GitHub):

Laura’s Notebook: Oly DMG analysis, Jan. 30th, 2020

Today I identified 46 differentially methylated genes between two Olympia oyster populations, Hood Canal and South Sound. This was performed using a binomial GLM and Chi-square tests. The script was adapted from Hollie Putnam’s script (/hputnam/Geoduck_Meth/master/RAnalysis/Scripts/GM.Rmd), which may have been adapted from the Lieu et al. 2018 paper.

The analysis was performed in an RMarkdown notebook; please see it here: 09-DMG-analysis
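
To make the binomial GLM plus Chi-square idea concrete, here is a simplified, single-gene sketch. It is not the RMarkdown script: it assumes one two-level predictor (population) and pooled methylated/unmethylated read counts per population, in which case the GLM fit reduces to the per-population proportions and the Chi-square test is a 1-df likelihood-ratio test against a shared proportion. The function names and the counts are hypothetical.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binom_loglik(meth, unmeth, p):
    """Binomial log-likelihood, omitting the constant binomial coefficient."""
    return _xlogy(meth, p) + _xlogy(unmeth, 1.0 - p)


def population_lrt(counts):
    """Likelihood-ratio test of methylation ~ population for one gene.

    counts: dict mapping population name -> (methylated, unmethylated) reads.
    Returns (chi-square statistic, p-value) on 1 degree of freedom.
    """
    if len(counts) != 2:
        raise ValueError("this sketch handles exactly two populations")
    total_m = sum(m for m, _ in counts.values())
    total_u = sum(u for _, u in counts.values())
    # Null model: a single methylation proportion shared by both populations.
    ll_null = binom_loglik(total_m, total_u, total_m / (total_m + total_u))
    # Full model: each population gets its own proportion (the GLM's MLE here).
    ll_full = sum(binom_loglik(m, u, m / (m + u)) for m, u in counts.values())
    stat = max(0.0, 2.0 * (ll_full - ll_null))
    # Survival function of a chi-square with 1 df: P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical read counts for a single gene.
    gene_counts = {"Hood Canal": (80, 20), "South Sound": (50, 50)}
    stat, p = population_lrt(gene_counts)
    print(f"chi-square = {stat:.2f}, p = {p:.2g}")
```

A per-gene test like this would be repeated across genes, so the resulting p-values would normally need a multiple-testing adjustment before calling genes differentially methylated.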

Here are the GO terms associated with genes of known function. Some notes:
– 18 out of the 46 genes were annotated with GO terms
– 9 out of the 46 genes were annotated but did not have associated GO terms (may have to find those manually …)
– 19 out of the 46 genes were of unknown function

term ID | description | frequency | pin? | log10 p-value | uniqueness | dispensability
GO:0006468 | protein phosphorylation | 4.137 % | | -3.7877 | 0.40 | 0.00
GO:0006807 | nitrogen compound metabolic process | 38.744 % | | -2.2764 | 0.78 | 0.03
GO:0006207 | ‘de novo’ pyrimidine nucleobase biosynthetic process | 0.192 % | | -2.2764 | 0.46 | 0.06
GO:0006281 | DNA repair | 2.234 % | | -2.4853 | 0.50 | 0.20
GO:0006030 | chitin metabolic process | 0.077 % | | -1.6311 | 0.49 | 0.21
GO:0006520 | cellular amino acid metabolic process | 5.591 % | | -2.2764 | 0.42 | 0.35
GO:0006412 | translation | 5.686 % | | -2.4853 | 0.28 | 0.55
GO:0016567 | protein ubiquitination | 0.523 % | | -1.4336 | 0.44 | 0.56


from The Shell Game