Grace’s Notebook: Attempt to concentrate test pools using Zymo RNA Clean and Concentrator Kit -5

Received the Zymo RNA Clean and Concentrator Kit -5 yesterday. The plan is to concentrate my 6 pooled samples, so I first tried the kit on some test pools I made quickly. Short answer: unclear if it worked, because the initial Qubit RNA HS readings of both test pools were “TOO LOW”. After concentrating, one test pool was still “TOO LOW”, while the other read 2.20 ng/ul (2 ul of the 35 ul eluate used for the reading). Details in post. (Also at the end of the post: general updates on crab project and my January plans).

Creating test pools

Test Pool 1

FRP  | Uniq_ID    | sample_day | infection_status | maturity | tube_number | sample vol remaining (ul) | total RNA left (ng) | RNA for pool (ng) | vol for pool (ul) | total pool RNA (ng)
6272 | 6272_111_9 | 9          | 0                | I        | 111         | 13                        | 260                 | 30                | 1.5               | 120
6176 | 6176_48_9  | 9          | 0                | M        | 48          | 13                        | 40.3                | 30                | 9.677419355       | 120
6179 | 6179_34_9  | 9          | 0                | M        | 34          | 13                        | 101.92              | 30                | 3.826530612       | 120
6205 | 6205_121_9 | 9          | 0                | I        | 121         | 13                        | 80.99               | 30                | 4.81540931        | 120

total pool vol ul 12.11767753

water to add ul 42.88232247

Test Pool 2

FRP  | Uniq_ID    | sample_day | infection_status | maturity | tube_number | sample vol remaining (ul) | total RNA left (ng) | RNA for pool (ng) | vol for pool (ul) | total pool RNA (ng)
6140 | 6140_8_9   | 9          | 1                | I        | 8           | 13                        | 208                 | 30                | 1.875             | 120
6125 | 6125_144_9 | 9          | 1                | I        | 144         | 13                        | 185.9               | 30                | 2.097902098       | 120
6137 | 6137_81_9  | 9          | 1                | I        | 81          | 13                        | 403                 | 30                | 0.967741935       | 120
6158 | 6158_101_9 | 9          | 1                | I        | 101         | 13                        | 54.34               | 30                | 7.177033493       | 120

total pool vol ul 19.81935928

water to add ul 35.18064072

Steps for Pooling

  1. Vortex each tube
  2. Pool the volume (vol for pool column) into tube
  3. Pipet to mix
  4. Add enough H2O to get to 55ul (Zymo RNA Clean and Concentrator kit requires samples to be at least 50ul)
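
The pooling arithmetic behind the tables and steps above can be sketched as follows. This is a minimal sketch, not the spreadsheet actually used: it assumes each sample contributes 30 ng, that concentration is total RNA left divided by sample volume remaining, and that water tops the pool up to 55 ul. The function name `pool_plan` is made up for illustration. With the rows as listed, the Test Pool 1 volumes sum to about 19.82 ul and the Test Pool 2 volumes to about 12.12 ul, so the two recorded totals may be attached to the opposite pools.

```python
# Sketch of the pooling arithmetic (assumed, not the original spreadsheet).
TARGET_NG = 30.0      # ng of RNA taken from each sample
FINAL_VOL_UL = 55.0   # pool volume after adding water

def pool_plan(samples, target_ng=TARGET_NG, final_vol_ul=FINAL_VOL_UL):
    """samples: list of (sample_vol_remaining_ul, total_rna_left_ng).

    Returns (volumes to pool in ul, total pooled volume in ul, water to add in ul).
    """
    vols = []
    for vol_remaining_ul, total_rna_ng in samples:
        conc_ng_per_ul = total_rna_ng / vol_remaining_ul
        vols.append(target_ng / conc_ng_per_ul)
    pool_vol = sum(vols)
    water = final_vol_ul - pool_vol
    if water < 0:
        raise ValueError("pooled volume already exceeds the final volume")
    return vols, pool_vol, water

# Values copied from the Test Pool 1 and Test Pool 2 tables.
test_pool_1 = [(13, 260), (13, 40.3), (13, 101.92), (13, 80.99)]
test_pool_2 = [(13, 208), (13, 185.9), (13, 403), (13, 54.34)]

for name, pool in (("Test Pool 1", test_pool_1), ("Test Pool 2", test_pool_2)):
    vols, pool_vol, water = pool_plan(pool)
    print(f"{name}: pool vol {pool_vol:.2f} ul, water to add {water:.2f} ul")
```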

Initial Qubit RNA HS on test pools

Ran 2ul of each pool on Qubit using RNA HS Kit.

Both were “TOO LOW”.

This is because I SHOULD NOT have added so much water to the samples… I should have run them on Qubit, THEN added enough water to get to at least 50ul sample…

Using the kit


Followed protocol. (used 2 of 10 preps)
Step 2 – used 100% EtOH and centrifuge 10,000g 30 s
Step 5 – elute 35ul TE (NWGC requires at least 30ul and requires TE)

Post-concentration qubit

Ran 2ul of each 35ul concentrated test pool on qubit using RNA HS Kit.

Pool 1 –> “TOO LOW”
Pool 2 –> 2.20 ng/ul, read from 2 ul (33 ul remaining) –> total RNA remaining: 72.6 ng

If I pooled the samples well, each pool should have started with ~120 ng of RNA. However, since I added so much H2O before running the pools on the Qubit, the RNA concentration was much too low to detect.
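
As a rough check on the Pool 2 numbers, the yield arithmetic can be written out. This is only a sketch: the 120 ng input is the nominal pooled amount, not a measured value (the initial readings were “TOO LOW”), so the recovery figure of about 64% is an estimate under that assumption. The function name is made up for illustration.

```python
def rna_yield(conc_ng_per_ul, elution_vol_ul, vol_used_ul, nominal_input_ng):
    """Return (ng left in tube, ng eluted in total, eluted fraction of nominal input)."""
    remaining_ng = conc_ng_per_ul * (elution_vol_ul - vol_used_ul)
    eluted_ng = conc_ng_per_ul * elution_vol_ul
    return remaining_ng, eluted_ng, eluted_ng / nominal_input_ng

# Pool 2: 2.20 ng/ul, 35 ul eluate, 2 ul used for Qubit, ~120 ng nominal input
remaining, eluted, recovery = rna_yield(2.20, 35, 2, 120)
print(f"{remaining:.1f} ng remaining, {eluted:.1f} ng eluted, {recovery:.0%} of nominal input")
```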


This probably works… I’m just a bit nervous to use it on my 6 pooled samples… maybe I’ll try again on 2 new test pools and make sure I run 2ul on Qubit before adding so much water for the concentrator kit…

from Grace’s Lab Notebook

Shelly’s Notebook: Tues. Jan. 7, 2020 Geoduck methylation analysis on 5x cov. destranded CpG data

Steven generated destranded (“merged”) coverage files from Bismark .cov output.

Analysis below was done 12/20/2019 – 01/07/2020

Global methylation analysis

  • Using this jupyter notebook 20191222_GlobalMethylation_5x_CpGs.ipynb I created this table allc_5x_CpG.txt which has the following columns:
    1. Sample
    2. Number of mCs
    3. Number of total Cs
    4. % methylation
  • Using this R project 20191222.Rproj and this R markdown script Overall_CpG_analysis.Rmd I generated the following figures:
    • number of mCpGs across samples from different time points 5x_num_mCpG_boxplot.jpg
    • percent CpG methylation for each sample group 5xCovPercMeth_boxplots.jpg
    • percent CpG methylation for each sample group faceted by time 5xCovPercMeth_facet_boxplots.jpg
  • I used this R script CpG_analysis_d0to135.Rmd to generate this figure for percent CpG methylation for samples from day 0, day 10, and day 135 for PAG presentation: d0to135_5xCovPercMeth_boxplot.jpg
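
A per-sample summary like the one in allc_5x_CpG.txt could be computed roughly as below. This is a sketch under assumptions, not the code in the notebook: it assumes the destranded files keep the standard Bismark coverage layout (chromosome, start, end, % methylation, count methylated, count unmethylated, tab-separated), and it assumes "number of mCs" and "number of total Cs" are read counts summed over CpGs with at least 5x coverage. The scaffold names and counts in the example are invented.

```python
def global_methylation(cov_lines, min_cov=5):
    """Sum methylated and total read counts over CpGs with coverage >= min_cov.

    Returns (n_mC, n_total_C, percent_methylation).
    """
    n_mc = 0
    n_total = 0
    for line in cov_lines:
        line = line.strip()
        if not line:
            continue
        _chrom, _start, _end, _pct, meth, unmeth = line.split("\t")
        meth, unmeth = int(meth), int(unmeth)
        cov = meth + unmeth          # total = methylated + unmethylated
        if cov < min_cov:
            continue                 # skip CpGs below the coverage cutoff
        n_mc += meth
        n_total += cov
    pct = 100.0 * n_mc / n_total if n_total else float("nan")
    return n_mc, n_total, pct

# Invented example rows in Bismark coverage layout
example = [
    "Scaffold_01\t100\t101\t80.0\t8\t2",
    "Scaffold_01\t150\t151\t50.0\t1\t1",   # 2x coverage, dropped
    "Scaffold_02\t30\t31\t0.0\t0\t5",
]
print(global_methylation(example))
```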

DMRs from 5x destranded coverage files

Convert files to Methylpy allc format

  • Used this jupyter notebook 20191222_DMRfind_5xmerg.ipynb to convert files to allc format on my account on Ostrich
    • previously attempted this on 12/20 using this jupyter notebook 20191220_DMRfind_5xmerg.ipynb, but DMRfind would not run because the last column in the coverage file is the number of unmethylated Cs, NOT the total number of Cs as I had assumed. See issue posted here.
    • copied new allc.tsv files to Gannet using same notebook 20191222_DMRfind_5xmerg.ipynb
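
The column mix-up described above is easy to see in a small conversion sketch. This is not the notebook code: it assumes the Bismark coverage layout (chromosome, start, end, % methylation, count methylated, count unmethylated) on input and the seven-column allc layout (chromosome, position, strand, context, mC count, total count, methylated flag) on output. The strand `+`, the context `CGN`, and the constant flag `1` are placeholders chosen for illustration, since destranded CpG data carries none of them.

```python
def cov_to_allc(cov_line, strand="+", context="CGN", methylated="1"):
    """Convert one Bismark coverage row to one allc-style row (tab-separated)."""
    chrom, start, _end, _pct, meth, unmeth = cov_line.rstrip("\n").split("\t")
    mc = int(meth)
    # The last coverage column is the UNMETHYLATED count, so the total
    # has to be computed as methylated + unmethylated.
    total = mc + int(unmeth)
    return "\t".join([chrom, start, strand, context, str(mc), str(total), methylated])

# Invented example row
print(cov_to_allc("Scaffold_01\t100\t101\t80.0\t8\t2"))
```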

Running Methylpy DMRfind for all 4 comparisons

Filter DMRs for coverage in 3/4 samples per group

Running group statistics (ANOVA) on DMRs

  • Using this script MCmax25_asinT_groupStats.Rmd I performed ANOVA on all the DMRs from each comparison.
    1. CHECK DATA DISTRIBUTION: First looked at each groups’ % methylation distribution
      • All ambient samples: ambDMR_percmeth_hist.jpg
      • Day 10 samples: d10DMR_percmeth_hist.jpg
      • Day 135 samples: d135DMR_percmeth_hist.jpg
      • Day 145 samples: d145DMR_percmeth_hist.jpg
    2. TRANSFORM DATA: No distribution is normal, so I performed an arcsine square root transformation. Here’s how the distributions changed:
      • All ambient samples: ambDMR_Tpercmeth_hist.jpg
      • Day 10 samples: d10DMR_Tpercmeth_hist.jpg
      • Day 135 samples: d135DMR_Tpercmeth_hist.jpg
      • Day 145 samples: d145DMR_Tpercmeth_hist.jpg
    3. Perform ANOVA on transformed data
    4. Plot % methylation x group for significant DMRs
      • DMRs significant at p value < 0.01
        • All ambient samples:
          • ambDMR_MCmax25_Taov0.01pHPercMeth.jpg
        • Day 10 samples:
          • d10DMR_MCmax25_Taov0.01pHPercMeth.jpg
        • Day 135 samples:
          • d135DMR_MCmax25_Taov0.01pHPercMeth.jpg
        • Day 145 samples:
          • d145DMR_MCmax25_Taov0.01pHPercMeth.jpg
      • DMRs significant at p value < 0.05
        • All ambient samples:
          • ambDMR_MCmax25_Taov0.05pHPercMeth.jpg
        • Day 10 samples:
          • d10DMR_MCmax25_Taov0.05pHPercMeth.jpg
        • Day 135 samples:
          • d135DMR_MCmax25_Taov0.05pHPercMeth.jpg
        • Day 145 samples:
          • d145DMR_MCmax25_Taov0.05pHPercMeth.jpg
    5. Plot heatmaps of DMRs x samples colored by % methylation
      • DMRs significant at p value < 0.01
        • All ambient samples: DMR_MCmax25DMR_Taov0.01_amb_heatmap.jpg
        • Day 10 samples: DMR_MCmax25DMR_Taov0.01_d10_heatmap.jpg
        • Day 135 samples: DMR_MCmax25DMR_Taov0.01_d135_heatmap.jpg
        • Day 145 samples: DMR_MCmax25DMR_Taov0.01_d145_heatmap.jpg
      • DMRs significant at p value < 0.05
        • All ambient samples: DMR_MCmax25DMR_Taov0.05_amb_heatmap.jpg
        • Day 10 samples: DMR_MCmax25DMR_Taov0.05_d10_heatmap.jpg
        • Day 135 samples: DMR_MCmax25DMR_Taov0.05_d135_heatmap.jpg
        • Day 145 samples: DMR_MCmax25DMR_Taov0.05_d145_heatmap.jpg
    6. Plot heatmaps of DMRs x group means colored by % methylation
      • DMRs significant at p value < 0.01
        • All ambient samples: DMR_MCmax25DMR_Taov0.01_ambmean_heatmap.jpg
        • Day 10 samples: DMR_MCmax25DMR_Taov0.01_d10mean_heatmap.jpg
        • Day 135 samples: DMR_MCmax25DMR_Taov0.01_d135mean_heatmap.jpg
        • Day 145 samples: DMR_MCmax25DMR_Taov0.01_d145mean_heatmap.jpg
      • DMRs significant at p value < 0.05
        • All ambient samples: DMR_MCmax25DMR_Taov0.05_ambmean_heatmap.jpg
        • Day 10 samples: DMR_MCmax25DMR_Taov0.05_d10mean_heatmap.jpg
        • Day 135 samples: DMR_MCmax25DMR_Taov0.05_d135mean_heatmap.jpg
        • Day 145 samples: DMR_MCmax25DMR_Taov0.05_d145mean_heatmap.jpg
    7. Identify persistent DMRs
      • Using this jupyter notebook 20191223_PersistantDMRs.ipynb I compared DMRs from day10 samples (aov_0.05pH_d10DMR.bed) and DMRs from day 135 samples (aov_0.05pH_d135DMR.bed)
      • none overlapped, and the closest DMR was > 16kb away.
      • realized the DMRs may not be directly comparable because each time point was analyzed separately. If all samples were processed together, I could compare DMRs from different time points
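
The transform and ANOVA steps above can be illustrated in a few lines. The actual analysis was done in R (MCmax25_asinT_groupStats.Rmd); this is only a standard-library sketch of the same idea, with invented % methylation values and made-up group labels. It computes the F statistic but not the p value, which needs the F distribution (for example via `scipy.stats.f_oneway`, if SciPy is available).

```python
import math

def asin_sqrt(pct_meth):
    """Arcsine square root transform of a percentage (0-100)."""
    return math.asin(math.sqrt(pct_meth / 100.0))

def one_way_anova_f(groups):
    """One-way ANOVA on a list of groups; returns (F, df_between, df_within).

    Assumes at least two groups and some within-group variation.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Invented % methylation for one DMR in three hypothetical pH groups
pct_by_group = {
    "ambient": [82.0, 75.5, 90.1, 88.0],
    "low": [60.2, 55.0, 71.3, 64.8],
    "super_low": [40.0, 52.5, 47.1, 38.9],
}
transformed = [[asin_sqrt(p) for p in vals] for vals in pct_by_group.values()]
f_stat, df_b, df_w = one_way_anova_f(transformed)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```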

Running Methylpy DMRfind on all samples together


  • When I ran DMRfind on just the Day 10 samples and on just the Day 135 samples, then performed ANOVA on regions identified in each DMRfind run, there were no overlapping DMRs with significant pH effect (ANOVA p value < 0.05)
  • When I ran DMRfind on all 52 samples, then performed ANOVA on the regions identified, 1 DMR (scaffold 3: 56511986-56512009) with a significant pH effect (ANOVA p value < 0.05) overlapped between day 10 and day 135 samples
  • ANOVA is not the best test for these data because they are not normally distributed
    • a GLM would likely be more sensitive, but I would need to reformat the data to run it (determine # Cs and # mCs for each region).
      • This is possible by running bedtools intersect or closest on the DMR bedfile and the counts files, then collapsing/summing the counts for each DMR. But I would need to code it.
  • For now, I’m just going to go with the results from running Methylpy DMRfind on all samples together.
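
The count-collapsing step mentioned above (summing # mCs and # Cs per DMR so a GLM could be run) might look something like this. It is a plain-Python sketch rather than a bedtools pipeline, it would be run once per sample, and it treats DMR coordinates as inclusive on both ends, which is an assumption. The scaffold label and the per-site counts are invented; only the DMR coordinates come from the result above.

```python
from collections import defaultdict

def sum_counts_per_dmr(dmrs, sites):
    """Sum mC and total C counts for the sites falling inside each DMR.

    dmrs:  list of (chrom, start, end) tuples, end inclusive (assumed)
    sites: list of (chrom, pos, mc_count, total_count) tuples for one sample
    Returns {dmr: [summed_mc, summed_total]}.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in dmrs:
        by_chrom[chrom].append((start, end))
    totals = {dmr: [0, 0] for dmr in dmrs}
    for chrom, pos, mc, total in sites:
        for start, end in by_chrom.get(chrom, ()):
            if start <= pos <= end:
                totals[(chrom, start, end)][0] += mc
                totals[(chrom, start, end)][1] += total
    return totals

# DMR coordinates from the text; scaffold label and site counts are invented
dmrs = [("Scaffold_03", 56511986, 56512009)]
sites = [
    ("Scaffold_03", 56511990, 4, 6),
    ("Scaffold_03", 56512009, 1, 5),
    ("Scaffold_03", 56512100, 3, 3),   # outside the DMR
    ("Scaffold_01", 56511990, 2, 2),   # different scaffold
]
print(sum_counts_per_dmr(dmrs, sites))
```

A simple nested loop is used for clarity; for genome-wide site lists, sorting the sites or using bedtools would scale better.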