Shelly’s Notebook: Wed. Oct. 30, Notes on differential methylation stats

Methods:

Reference: Schultz et al. (2015) Nature. doi:10.1038/nature14464 See pages 4-8 of Supp. Material

  • methylKit:
    1. calculateDiffMeth uses a logistic regression model to test if treatment has any effect on CpG (loci) methylation or region methylation.
      • is log(π_i/(1−π_i)) = β_0 + β_1·Treatment_i a “better” model than log(π_i/(1−π_i)) = β_0 ?
      • a q-value is calculated and a cutoff can be defined to select significant DMLs or DMRs
        • this method is only capable of comparing two groups at a time.

Reference: methylKit documentation, see section 3.6
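The model comparison above can be sketched as a likelihood-ratio test. This is a minimal illustration, not methylKit's implementation: it assumes read counts are pooled per group (so replicates and overdispersion are ignored), and the function names `binom_loglik` and `treatment_lrt` are made up for this example. With a single two-level treatment covariate, the fitted proportions of the full model are simply the per-group proportions, so no iterative fitting is needed.

```python
import math

def binom_loglik(meth, total, p):
    """Binomial log-likelihood of `meth` methylated reads out of `total`
    at methylation rate p (the constant binomial coefficient is dropped)."""
    ll = 0.0
    if meth > 0:
        ll += meth * math.log(p)
    if total - meth > 0:
        ll += (total - meth) * math.log(1.0 - p)
    return ll

def treatment_lrt(ctrl_meth, ctrl_total, trt_meth, trt_total):
    """Likelihood-ratio test of logit(pi) = b0 + b1*Treatment vs. logit(pi) = b0.
    Assumption: counts are pooled within each group."""
    # Null model: one shared methylation proportion for both groups.
    p_null = (ctrl_meth + trt_meth) / (ctrl_total + trt_total)
    ll_null = (binom_loglik(ctrl_meth, ctrl_total, p_null)
               + binom_loglik(trt_meth, trt_total, p_null))
    # Full model: each group gets its own proportion.
    ll_full = (binom_loglik(ctrl_meth, ctrl_total, ctrl_meth / ctrl_total)
               + binom_loglik(trt_meth, trt_total, trt_meth / trt_total))
    stat = max(2.0 * (ll_full - ll_null), 0.0)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

if __name__ == "__main__":
    # Hypothetical counts: 10/100 methylated reads in control, 60/100 in treatment.
    print(treatment_lrt(10, 100, 60, 100))
```

A small p-value says the model with the treatment term fits better than the intercept-only model. The q-value mentioned above would come from correcting these p-values across all tested loci or regions.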

  • DMGs:
    1. binomial glm comparison between methylated and non-methylated CpGs
      • glm (methylated,non_methylated ~ pH * position, family = “binomial”)
      • each treatment replicate is compared to a control replicate in a combinatorial fashion to assign equal weightage to each replicate
    2. Liew et al. also fit a glm on mean % methylation with a gaussian family

Reference: Liew et al. (2018) Science Advances

from shellytrigg https://ift.tt/2pruMbS
via IFTTT

Shelly’s Notebook: Wed. Oct. 30, Geoduck filtered DMR validation

Visualizing filtered DMRs in IGV

DMRs were filtered for 5x coverage in at least 3 of 4 samples per experimental group, then further filtered for a significant difference in % methylation across experimental groups (uncorrected ANOVA p-value < 0.1).
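The coverage filter can be sketched as follows. This is only an illustration under stated assumptions: the rule is read as "every group must have at least 3 samples with 5x coverage", the sample names and data layout are hypothetical, and the actual filtering was done in the R scripts for this project.

```python
def passes_coverage(coverage_by_sample, groups, min_cov=5, min_samples=3):
    """Return True if every group has at least `min_samples` samples
    whose coverage for this DMR is >= `min_cov`.
    Samples with no data (missing or None) count as zero coverage."""
    for samples in groups.values():
        n_ok = sum(1 for s in samples
                   if (coverage_by_sample.get(s) or 0) >= min_cov)
        if n_ok < min_samples:
            return False
    return True

if __name__ == "__main__":
    # Hypothetical sample names, four samples per group.
    groups = {
        "ambient": ["amb_1", "amb_2", "amb_3", "amb_4"],
        "low.pH": ["low_1", "low_2", "low_3", "low_4"],
    }
    # Hypothetical coverage of one DMR; amb_4 has no data.
    dmr_coverage = {"amb_1": 8, "amb_2": 5, "amb_3": 12, "amb_4": None,
                    "low_1": 6, "low_2": 7, "low_3": 9, "low_4": 3}
    print(passes_coverage(dmr_coverage, groups))
```

DMRs that pass this step would then go on to the ANOVA p-value filter.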

…QC to be continued…

from shellytrigg https://ift.tt/31WZred

Sam’s Notebook: Samples Received – Marinelli Shellfish Company C.gigas and C.sikamea Oysters

Steven was recently contacted by Marinelli Shellfish Company to see if we could help them determine if some oysters they had were Crassostrea gigas (Pacific oyster) or Crassostrea sikamea (Kumamoto). Steven knows of a paper with primer sequences to use with qPCR for this specific determination.

They sent ~12 of each:

  • known Crassostrea gigas
  • known Crassostrea sikamea
  • unknown

We collected mantle tissue from 12 of each group. Samples were labeled in the following fashion:

  • C.gigas 1191-SS ## (known Crassostrea gigas)
  • C.sikamea CA5SS ## (known Crassostrea sikamea)
  • C.sikamea 1191-SS ## (unknown)

Samples were stored at -80°C in Rack 2, Column 4, Row 5.

NOTE: Only 11 samples were collected for C.sikamea CA5SS.

Sam’s Notebook: Data Wrangling – Create Panopea-generosa-vv0.74.a4 Intron and Intergenic BED Files

Since generating an updated Pgenerosa_v074 annotation, we also needed updated intergenic and intron bed files to put in the OSF repository for this project.

I generated intergenic and intron BED files by following along with Steven’s notebooks:

Steven intergenic BED file notebook (GitHub):

Steven intron BED file notebook (GitHub):

Here’s how I generated these two BED files.

Jupyter Notebook (GitHub):
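The linked notebook is the record of how the files were actually made. As a rough illustration only, and not necessarily the approach used there, the interval arithmetic behind these two file types can be sketched in plain Python: intergenic regions are the parts of a scaffold not covered by any gene, and introns are the parts of a gene not covered by its exons. Coordinates are assumed to be BED-style (0-based, half-open), and all names and coordinates below are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]

def complement(intervals, region_start, region_end):
    """Return the gaps within [region_start, region_end) not covered by
    `intervals`. Assumes the intervals lie inside the region."""
    gaps = []
    cursor = region_start
    for start, end in merge_intervals(intervals):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < region_end:
        gaps.append((cursor, region_end))
    return gaps

if __name__ == "__main__":
    # Hypothetical scaffold of length 1000 with two genes.
    genes = [(100, 400), (600, 900)]
    print("intergenic:", complement(genes, 0, 1000))
    # Hypothetical exons of the first gene; the gap between them is the intron.
    exons = [(100, 180), (250, 400)]
    print("introns:", complement(exons, 100, 400))
```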

Shelly’s Notebook: Wed. Oct. 30, Geoduck DMR filtering

Summary of DMR analysis so far:

  1. Call methylation state from bismark data (mox script here)
  2. Call DMRs within individual samples (mox script here)
  3. Filter DMRs for those in at least 3/4 samples/group (R script here, R proj here)
  4. Filter DMRs for those significant at ANOVA uncorrected p.value < 0.1 (R markdown script here, Rproj here)

Summary of Step 4 above

Filtering DMRs for those significant at ANOVA uncorrected p.value < 0.1 from all 4 comparisons (all ambient samples, day10 samples, day 135 samples, and day 145 samples)

  • ANOVA significant all ambient MCmax30 DMR heatmap: amb_MCmax30DMR_aov0.1_heatmap.jpg

  • ANOVA significant all ambient MCmax30 DMR violinplots: amb_MCmax30DMR_aov0.1_boxplots.jpg
  • ANOVA significant day 10 MCmax30 DMR heatmap:
    • Heatmap key: Column color bar: cyan = ambient, light pink = low.pH, magenta = super.low.pH. heatmap cell color: Red = more methylation, blue = no methylation, black = no data. day10_MCmax30DMR_aov0.1_heatmap.jpg
  • ANOVA significant day 10 MCmax30 DMR violinplots: day10_MCmax30DMR_aov0.1_boxplots.jpg
  • ANOVA significant day 135 MCmax30 DMR heatmap:
    • Heatmap key: Column color bar: cyan = ambient, light pink = low.pH, magenta = super.low.pH. heatmap cell color: Red = more methylation, blue = no methylation, black = no data. day135_MCmax30DMR_aov0.1_heatmap.jpg
  • ANOVA significant day 135 MCmax30 DMR violinplots: day135_MCmax30DMR_aov0.1_boxplots.jpg
  • ANOVA significant day 145 MCmax30 DMR heatmap:
    • Heatmap key: Column color bar: cyan = ambient, light pink = low.pH, magenta = super.low.pH. heatmap cell color: Red = more methylation, blue = no methylation, black = no data. day145_MCmax30DMR_aov0.1_heatmap.jpg
  • ANOVA significant day 145 MCmax30 DMR violinplots: day145_MCmax30DMR_aov0.1_boxplots.jpg

Next steps:

  • visualize significant DMRs in IGV
  • Functional analysis of DMRs

from shellytrigg https://ift.tt/2Wp7oYu

Shelly’s Notebook: Tues. Oct. 29, Geoduck DMR filtering

Performing group stats on DMRs

Comparing ANOVA vs. GLM significant DMRs

  • ANOVA and GLM each identify different DMRs so I plotted these using an uncorrected p-value < 0.1. Column color bar in heat maps below: cyan = ambient, light pink = low.pH, magenta = super.low.pH. heatmap color: Red = more methylation, blue = no methylation, black = no data.
  • ANOVA significant day 10 MCmax10 DMR heatmap: day10_MCmax10DMR_aov0.1_heatmap.jpg
  • GLM significant day 10 MCmax10 DMR heatmap: day10_MCmax10DMR_glm0.1_heatmap.jpg
  • ANOVA significant day 10 MCmax10 DMR violinplots: day10_MCmax10DMR_aov0.1_boxplots.jpg
  • GLM significant day 10 MCmax10 DMR violinplots: day10_MCmax10DMR_glm0.1_boxplots.jpg
  • CONCLUSIONS:
    • the GLM seems more likely to identify DMRs as significant when one group has zero % methylation and/or half of another group’s samples have zero % methylation.
    • I feel a little more confident going with the ANOVA

Check ANOVA significant DMRs found by other DMRfind parameters

  • Next I ran ANOVA DMRs from DMR parameters MCmax = 25bp, 30bp, and 50bp for day10 samples to see if these DMRs show obvious group patterns in the heatmaps and violinplots
  • Rmarkdown files that generated figures below are here:
  • ANOVA significant day 10 MCmax25 DMR heatmap: day10_MCmax25DMR_aov0.1_heatmap.jpg
  • ANOVA significant day 10 MCmax25 DMR violinplots: day10_MCmax25DMR_aov0.1_boxplots.jpg
  • ANOVA significant day 10 MCmax30 DMR heatmap: day10_MCmax30DMR_aov0.1_heatmap.jpg
  • ANOVA significant day 10 MCmax30 DMR violinplots: day10_MCmax30DMR_aov0.1_boxplots.jpg
  • ANOVA significant day 10 MCmax50 DMR heatmap: day10_MCmax50DMR_aov0.1_heatmap.jpg
  • ANOVA significant day 10 MCmax50 DMR violinplots: day10_MCmax50DMR_aov0.1_boxplots.jpg
  • CONCLUSIONS: MCmax30 gives the largest number of DMRs while still maintaining clear patterns in the heatmap and violinplots. I think the patterns are less obvious in the MCmax50 heatmap. Therefore, I think it’s safe to go with the MCmax30 parameter.
  • NEXT STEPS: Run ANOVA, generate heatmaps and violinplots for DMRs from all 4 comparisons

from shellytrigg https://ift.tt/330ac0J

Sam’s Notebook: Genome Feature Counts – Panopea-generosa-vv0.74.a4

In preparation for a paper we’re writing, we needed some summary stats for Panopea-generosa-vv0.74.a4. This info will be compiled into a table for the manuscript. See our Genomic Resources wiki for more info on GFFs:

Calculations were performed using Python in Jupyter Notebooks.

Genome Features Jupyter Notebook (GitHub):

Repeat Features Jupyter Notebook (GitHub):
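The notebooks themselves are not reproduced here. As a minimal sketch of one kind of summary stat, the snippet below tallies feature types from GFF3 lines, relying only on the GFF3 convention that the third tab-separated column holds the feature type. The function name and the example records are hypothetical and are not taken from the Panopea generosa annotation.

```python
from collections import Counter

def count_feature_types(gff_lines):
    """Count records per feature type (GFF3 column 3),
    skipping comments, directives, and blank lines."""
    counts = Counter()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 9:  # a GFF3 record has 9 tab-separated columns
            counts[fields[2]] += 1
    return counts

if __name__ == "__main__":
    # Hypothetical records for illustration only.
    example = [
        "##gff-version 3",
        "scaffold_1\tsrc\tgene\t101\t400\t.\t+\t.\tID=gene1",
        "scaffold_1\tsrc\tmRNA\t101\t400\t.\t+\t.\tID=mrna1;Parent=gene1",
        "scaffold_1\tsrc\texon\t101\t180\t.\t+\t.\tParent=mrna1",
        "scaffold_1\tsrc\texon\t251\t400\t.\t+\t.\tParent=mrna1",
    ]
    for feature_type, n in sorted(count_feature_types(example).items()):
        print(feature_type, n)
```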

Shelly’s Notebook: Thur. Oct. 24, Geoduck DMR filtering

This analysis is a follow-up to the Wed. Oct. 23 analysis

Rerun DMRfind with different parameters

Reran DMRfind with different MCmax settings. MCmax defines the window used to call a differentially methylated site (DMS): loci do not need to overlap exactly, only to fall within the window of each other, which helps for low-coverage samples.
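To make the effect of the window size concrete, here is a minimal sketch that groups site positions whenever neighbouring sites are within `mc_max` bp of each other. It is an assumption-laden illustration of the windowing idea only, not methylpy's DMRfind algorithm, and the positions are hypothetical.

```python
def group_sites(positions, mc_max):
    """Group sorted positions so that consecutive sites in a group
    are at most `mc_max` bp apart."""
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] <= mc_max:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups

if __name__ == "__main__":
    sites = [100, 105, 128, 300, 340]  # hypothetical CpG positions
    for mc_max in (10, 30, 50):
        print(mc_max, group_sites(sites, mc_max))
```

A larger window merges more neighbouring sites into one unit, which is the trade-off being explored by rerunning with several MCmax values.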

Validate DMR bed files in IGV

CONCLUSIONS

  • Try running group stats on % methylation data and see if it excludes DMRs that don’t make sense
    • Yupeng confirmed that methylpy only runs statistics on within sample data, not on group data. So I need to apply an ANOVA or GLM to determine DMRs that are statistically different between groups
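The group-level test mentioned above can be sketched as a one-way ANOVA on % methylation for a single DMR. This is a minimal illustration with hypothetical values. It computes only the F statistic and its degrees of freedom; the p-value would come from the upper tail of the F distribution, for example via a statistics package, which is left out here to keep the sketch dependency-free.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups,
    each a list of % methylation values with missing samples removed.
    Raises ZeroDivisionError if there is no within-group variation."""
    values = [v for g in groups for v in g]
    k, n = len(groups), len(values)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of samples around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

if __name__ == "__main__":
    # Hypothetical % methylation for one DMR in three treatment groups.
    ambient = [10.0, 12.0, 11.0, 13.0]
    low_ph = [20.0, 22.0, 21.0, 19.0]
    super_low_ph = [30.0, 29.0, 31.0, 32.0]
    print(one_way_anova_f([ambient, low_ph, super_low_ph]))
```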

Grace’s Notebook: Day 12 RNA extractions – RNA in all samples!

Today I did some more extractions, but moved on to day 12. I did a mix of all groupings (temperature and infection statuses). All extracted samples had detectable RNA! Details in post. Additionally, I’ll give a summary table of what all I have extracted so far.

Samples extracted today:

FRP trtmnt_tank sample_day infection_status maturity tube_number
6137 ambient 12 1 I 301
6122 ambient 12 1 I 325
6125 ambient 12 1 I 303
6176 ambient 12 0 M 329
6179 ambient 12 0 M 315
6213 ambient 12 0 I 310
6104 cold 12 0 I 259
6106 cold 12 0 M 241
6118 cold 12 1 I 240
6120 cold 12 1 I 248
6126 cold 12 1 I 201
6191 cold 12 0 I 227
6148 cold 12 1 I 213
6149 cold 12 1 I 226
6151 cold 12 1 I 243
6242 warm 12 0 M 377
6249 warm 12 1 I 279
6250 warm 12 1 I 294
6251 warm 12 1 I 376
6254 warm 12 0 I 296
6259 warm 12 0 M 281
6260 warm 12 0 M 374
6264 warm 12 0 I 268
6265 warm 12 0 I 282

Sample prep and extraction

I did everything the same as what I’ve done for the past two extractions (1 and 2). No obvious mishaps or mistakes as far as I know.

Results:

[Raw qubit] [Google sheet qubit]

qubit_tube_conc_ng.ml original_sample_conc_ng.ul sample_vol_ul dilution_factor tube_number extraction_method ul_sample-used elution_vol_ul total-yield_ng
244 24.4 2 100 282 Zymo_microprep 35 15 317.2
152 15.2 2 100 268 Zymo_microprep 35 15 197.6
183 18.3 2 100 374 Zymo_microprep 35 15 237.9
231 23.1 2 100 281 Zymo_microprep 35 15 300.3
84.9 8.49 2 100 296 Zymo_microprep 35 15 110.37
84.7 8.47 2 100 376 Zymo_microprep 35 15 110.11
52.3 5.23 2 100 294 Zymo_microprep 35 15 67.99
355 35.5 2 100 279 Zymo_microprep 35 15 461.5
223 22.3 2 100 377 Zymo_microprep 35 15 289.9
23 2.3 2 100 243 Zymo_microprep 35 15 29.9
262 26.2 2 100 226 Zymo_microprep 35 15 340.6
340 34 2 100 213 Zymo_microprep 35 15 442
180 18 2 100 227 Zymo_microprep 35 15 234
62.7 6.27 2 100 201 Zymo_microprep 35 15 81.51
134 13.4 2 100 248 Zymo_microprep 35 15 174.2
216 21.6 2 100 240 Zymo_microprep 35 15 280.8
130 13 2 100 241 Zymo_microprep 35 15 169
317 31.7 2 100 259 Zymo_microprep 35 15 412.1
273 27.3 2 100 310 Zymo_microprep 35 15 354.9
178 17.8 2 100 315 Zymo_microprep 35 15 231.4
29 2.9 2 100 329 Zymo_microprep 35 15 37.7
274 27.4 2 100 303 Zymo_microprep 35 15 356.2
311 31.1 2 100 325 Zymo_microprep 35 15 404.3
289 28.9 2 100 301 Zymo_microprep 35 15 375.7
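The arithmetic linking the columns above can be sketched as follows. The formulas are inferred from the numbers in the table rather than stated in the post: the original concentration appears to be the tube reading (ng/mL) times the dilution factor, converted to ng/µL, and the total yield appears to be that concentration times the elution volume minus the 2 µL used for the Qubit reading. The function names are hypothetical.

```python
def original_conc_ng_per_ul(tube_conc_ng_per_ml, dilution_factor):
    """Back-calculate the sample concentration (ng/uL) from the Qubit
    tube reading (ng/mL); dividing by 1000 converts mL to uL."""
    return tube_conc_ng_per_ml * dilution_factor / 1000.0

def total_yield_ng(conc_ng_per_ul, elution_vol_ul, sample_vol_ul):
    """Yield in the volume that remains after the Qubit aliquot is removed
    (an assumption inferred from the table values)."""
    return conc_ng_per_ul * (elution_vol_ul - sample_vol_ul)

if __name__ == "__main__":
    # Values from the first row of the table (tube 282).
    conc = original_conc_ng_per_ul(244, 100)
    print(round(conc, 2), round(total_yield_ng(conc, 15, 2), 2))
```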

Summary table of what I have extracted so far:

from Grace’s Lab Notebook https://ift.tt/2q3zf4h

Laura’s Notebook: November 2019 goals

Tasks that must be completed in November

  • GRIP application (Due Dec. 4)
  • NSF INTERN program (No due date, sooner the better)
  • Address Ecological Applications formatting changes & submit
  • Jackie class – lots of writing
  • Finish larval measurements for Oly temp/food paper
  • Finish final revisions on Polydora MS

Longer-term tasks

  • Revise Oly temp/food paper to incorporate larval size differences
  • Can I automate oocyte measurements for Oly temp/food paper to get time series of oocyte size?
  • Get cracking on the QuantSeq library prep!
  • Revisit Oly methylation data and begin next steps in analysis. Need to determine steps, probably visualize results from MethylKit + MACAU + SNPs together, describe locations including function if possible.
  • Oly methylation data –> what’s the angle in my Aquaculture America talk?
  • Revisit/revise QuantSeq pipeline using Salmon and the Oly genome
  • Make sure Christian knows which samples are which
  • Identify possible Aquaculture 2020 funding

Other responsibilities

  • Start Polydora research position (goes through January)
  • Help with GSS
  • NSA quarterly newsletter
  • Any Baltimore tasks?

from The Shell Game https://ift.tt/2Nlckd3