Sam’s Notebook: Data Received – C.virginica Mantle MBD-BSseq from ZymoResearch

We received the BSseq data from ZymoResearch today. Samples were submitted on 20190326. The submission consisted of 24 Crassostrea virginica mantle samples, which were prepared using the MethylMiner Kit (Invitrogen) on 20190319.

The FastQ data (all file names begin with zr2576) was provided as three FastQ files per read set for each sample (presumably the files were split because of file size limits set by ZymoResearch). I will likely concatenate the corresponding R1 files and the corresponding R2 files for each sample, so that we simply have a single R1 and a single R2 file per sample. That will be detailed in a subsequent notebook entry when it happens.

E.g. current FastQs:

  • zr2576_14_s1_R1.fastq.gz
  • zr2576_14_s2_R1.fastq.gz
  • zr2576_14_s3_R1.fastq.gz
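A minimal sketch of how the planned concatenation could work, not the method that will actually be used. It assumes the naming pattern seen above (`zr2576_<sample>_s<part>_<R1|R2>.fastq.gz`) holds for every file, and the output name `zr2576_<sample>_<R1|R2>.fastq.gz` is a hypothetical choice. It relies on the fact that gzip files joined byte-for-byte still form a valid gzip file.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed naming pattern, based on the example files listed above.
PART_PATTERN = re.compile(r"^(zr2576_[^_]+)_s(\d+)_(R[12])\.fastq\.gz$")


def group_fastq_parts(filenames):
    """Map (sample, read) -> part filenames sorted by part number."""
    groups = defaultdict(list)
    for name in filenames:
        match = PART_PATTERN.match(name)
        if match:
            sample, part, read = match.groups()
            groups[(sample, read)].append((int(part), name))
    return {key: [name for _, name in sorted(parts)]
            for key, parts in groups.items()}


def concatenate_parts(in_dir, out_dir):
    """Byte-concatenate the gzip parts of each read set into one file."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = [p.name for p in in_dir.glob("*.fastq.gz")]
    written = []
    for (sample, read), parts in group_fastq_parts(names).items():
        # Hypothetical output name: the "_s<part>" token is dropped.
        out_path = out_dir / f"{sample}_{read}.fastq.gz"
        with open(out_path, "wb") as out_handle:
            for part in parts:
                with open(in_dir / part, "rb") as in_handle:
                    shutil.copyfileobj(in_handle, out_handle)
        written.append(out_path)
    return written
```

Calling `concatenate_parts("raw", "concatenated")` (both directory names are placeholders) would write one R1 and one R2 file per sample. Checksums would need to be regenerated for the concatenated files.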

Data was downloaded from the ZymoResearch FTP server using FileZilla.

MD5 checksums were verified after downloading:

 md5sum --check zr2576_md5_checksums.txt  
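For systems without `md5sum`, the same check can be sketched in Python. This is an illustration only: it assumes the checksum file uses the usual `md5sum` layout (hash, whitespace, file name) and that the listed files sit in the same directory as the checksum file. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(checksum_file):
    """Return {filename: True/False} for each entry in an md5sum-style file."""
    checksum_file = Path(checksum_file)
    results = {}
    for line in checksum_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum marks binary-mode entries with '*'
        # Assumption: files are located next to the checksum file.
        actual = md5_of_file(checksum_file.parent / name)
        results[name] = actual == expected.lower()
    return results
```

Any entry mapped to `False` in the result of `verify_checksums("zr2576_md5_checksums.txt")` would indicate a file that should be downloaded again.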

Files were rsync’d to owl/nightingales/C_virginica.

File info was only partially added to the nightingales Google Sheet, because it is difficult to detail FastQ data that is split across three files for each sample read set. This is why I will likely concatenate the FastQ files: it greatly simplifies data management.

The master sample sheet (Google Sheet) for the project was updated to reflect sequencing library names for each sample.

from Sam’s Notebook

Shelly’s Notebook: Tues. Jun. 25, 2019 Salmon RRBS prep plans

Summary of RRBS in the lit:

paper | enzyme | dig. time | fragment size (bp) | depth | coverage | DMR window | # DMRs
----- | ------ | --------- | ------------------ | ----- | -------- | ---------- | ------
Rob | MspI + TaqαI | ? | 60-180 | 42M | 6M CpGs @ 6X | 500kb of diff. exp. genes | ~10K sites
Mog | MspI + TaqαI | ? | 60-180 | 34M | didn’t say | 1Mb of diff. exp. genes | ~5.5K sites
Mac | MspI | o/n | 100-300 | 38M | 600K CpGs; 112K regions in all libs @ 10X | 100bp region | ~100 genes
Web | MspI | 12 hr | 70-220 | 64M | 21M CpGs; 335K @ 10X in all libs | within a gene or +/- 1.5kb of TSS for promoters | ~1.9K sites
Lee | MspI + TaqαI | 2 hr + 2 hr | 80-160 | 5M | 3.3% of CpGs @ 10X | NA | NA

Next steps

  1. Digest 2ug DNA
    • using 2ug because I don’t know what the yield will be after gel size-selection
    • get 2ug DNA in 35uL H2O
    • MspI digest reaction mix:
      • 31.5uL MspI (21 * 1.5uL MspI/sample)
      • 105uL Buffer (21 * 5uL CutSmart/sample)
      • 178.5uL Water (21 * 8.5uL H2O/sample)
      • 315uL = final vol.
      • add 15uL master mix/sample
    • incubate @ 37C 12 hrs
    • heat inactivate @ 80C 20 min
    • hold @ 4C
    • TaqαI digest reaction mix:
      • 63uL TaqαI (21 * 3uL TaqαI/sample)
      • 21uL Buffer (21 * 1uL CutSmart/sample)
      • 126uL Water (21 * 6uL H2O/sample)
      • 210uL = final vol.
      • add 10uL master mix/sample
    • incubate @ 65C 2 hrs
    • heat inactivate @ 80C 20 min
    • hold @ 4C
  2. Gel purify
    • excise fragments between 80-280bp (seems like 100-300bp would be fine)
      • there may be a small size range that we want to exclude, but it’s not clear what range or whether it’s even possible (see the fragment size notes below)
    • use Millipore columns
    • measure concentration with Qubit HS
  3. Begin zymo pico methyl kit
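The master mix arithmetic in step 1 can be double-checked with a short script. This is a sketch only: the per-sample volumes and the multiplier of 21 are taken from the notes above, the function and variable names are hypothetical, and whether 21 includes overage is not stated.

```python
def master_mix(per_sample_ul, n_reactions):
    """Scale per-sample component volumes (uL) up to a master mix."""
    totals = {name: vol * n_reactions for name, vol in per_sample_ul.items()}
    totals["total"] = sum(per_sample_ul.values()) * n_reactions
    return totals


# Per-sample volumes (uL) as listed in the digest plan.
MSPI_PER_SAMPLE = {"MspI": 1.5, "CutSmart": 5.0, "water": 8.5}
TAQ_PER_SAMPLE = {"TaqαI": 3.0, "CutSmart": 1.0, "water": 6.0}
N_REACTIONS = 21  # multiplier used in the notes

mspi_mix = master_mix(MSPI_PER_SAMPLE, N_REACTIONS)
taq_mix = master_mix(TAQ_PER_SAMPLE, N_REACTIONS)

for label, mix in [("MspI mix", mspi_mix), ("TaqαI mix", taq_mix)]:
    print(label)
    for component, volume in mix.items():
        print(f"  {component}: {volume:g} uL")

# Reaction volume per sample: 35 uL DNA, then each master mix aliquot.
DNA_UL = 35.0
after_mspi = DNA_UL + sum(MSPI_PER_SAMPLE.values())    # 50 uL
after_taq = after_mspi + sum(TAQ_PER_SAMPLE.values())  # 60 uL
```

The totals agree with the notes (315uL and 210uL). The per-sample reaction volume comes to 50uL after the first master mix is added and 60uL after the second.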

Notes on fragment size selection after TaqαI/MspI digest

  • Lee paper selected 150-197bp and 207-230bp (and excluded 198-206bp) after adapter ligation
    • the paper clarifies that this range corresponds to 80-160bp fragments before adapter ligation, suggesting that the adapters add 70bp to the fragments.
    • the small size range was excluded because in silico analysis showed that fragments in this range contain repetitive sequences that won’t map.
  • Mog. paper selected 150-250bp and 250-350bp after adapter ligation
    • I emailed Mog. to ask why there are two separate ranges when they appear to overlap, and to confirm whether or not there was a fragment size exclusion
    • if the adapters used were the same length as in Lee et al., Mog. fragments before adapter ligation would be 80-280bp. This is pretty close to what Mac size-selected (100-300bp).
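The size conversions in these notes can be written out as a small calculation. The 70bp adapter contribution is the value inferred above from Lee et al. Applying it to the Mog. ranges is an assumption, and the function name is hypothetical.

```python
ADAPTER_BP = 70  # inferred from Lee et al.; assumed (not confirmed) for Mog.


def pre_ligation_range(post_ligation_range, adapter_bp=ADAPTER_BP):
    """Convert a post-ligation size range (bp) to the fragment size range."""
    low, high = post_ligation_range
    return (low - adapter_bp, high - adapter_bp)


lee_selected = [pre_ligation_range(r) for r in [(150, 197), (207, 230)]]
lee_excluded = pre_ligation_range((198, 206))
mog_selected = [pre_ligation_range(r) for r in [(150, 250), (250, 350)]]

print("Lee selected:", lee_selected)  # [(80, 127), (137, 160)]
print("Lee excluded:", lee_excluded)  # (128, 136)
print("Mog selected:", mog_selected)  # [(80, 180), (180, 280)]
```

Under this assumption, the window excluded by Lee et al. corresponds to roughly 128-136bp fragments before adapter ligation, and the two Mog. ranges together span 80-280bp.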

from shellytrigg