Sam’s Notebook: Trimming/MultiQC – Methcompare Bisulfite FastQs with fastp on Mox

Steven asked me to trim a set of FastQ files, provided by Hollie Putnam, in preparation for methylation analysis using Bismark. The analysis is part of a coral project comparing DNA methylation profiles of different species, as well as comparing different sample prep protocols. There’s a dedicated GitHub repo here: https://github.com/hputnam/Meth_Compare

I roughly followed the trimming pipeline that Hollie had already put together, but opted to use fastp, as it is generally faster than other trimmers and comes with the bonus of generating pre/post-trimming graphs and tables, similar to FastQC. Additionally, [MultiQC](https://multiqc.info/) can interpret the output of fastp to generate summary statistics/graphs, just as it can with FastQC.
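As a minimal sketch of how fastp reports feed into MultiQC (not the commands from the actual script), the snippet below prints a paired-end fastp command followed by a MultiQC command. It only echoes the commands, so it runs without either program installed. The sample name, the file-naming scheme and the thread count are hypothetical; the flags shown (`--in1`, `--in2`, `--out1`, `--out2`, `--thread`, `--html`, `--json`) are standard fastp options.

```shell
# Dry-run sketch: print (rather than execute) the commands.
# The sample name and all file names are hypothetical.

build_fastp_cmd() {
  sample="$1"   # hypothetical sample prefix, e.g. "sampleA"
  threads="$2"  # number of worker threads for fastp

  # Paired-end input and trimmed output files.
  printf 'fastp --in1 %s_R1.fastq.gz --in2 %s_R2.fastq.gz' "$sample" "$sample"
  printf ' --out1 %s_R1.trimmed.fastq.gz --out2 %s_R2.trimmed.fastq.gz' "$sample" "$sample"
  printf ' --thread %s' "$threads"
  # --html and --json name the per-sample reports that fastp writes.
  printf ' --html %s.fastp.html --json %s.fastp.json\n' "$sample" "$sample"
}

build_fastp_cmd sampleA 4

# MultiQC is then pointed at the directory holding the fastp reports.
echo "multiqc ."
```

Giving each sample its own JSON report name keeps the per-sample results separate when MultiQC summarizes the directory.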

The data consisted of two different types of libraries: reduced representation bisulfite (RRBS) and whole genome bisulfite (WGBS). Knowing this, I followed the Bismark trimming guidelines for each library type. The fastp trimming and MultiQC were run with the following SBATCH script (GitHub):

#!/bin/bash
## Job Name
#SBATCH --job-name=pgen_fastp_trimming_EPI
## Allocation Definition
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=1-00:00:00
## Memory per node
#SBATCH --mem=120G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --chdir=/gscratch/scrubbed/samwhite/outputs/20200305_methcompare_fastp_trimming

### WGBS and RRBS trimming using fastp.
### FastQ files were provide by Hollie Putnam.
### See this GitHub repo for more info:
### https://github.com/hputnam/Meth_Compare

# Exit script if any command fails
# set -e

# Load Python Mox module for Python module availability
module load intel-python3_2017

# Document programs in PATH (primarily for program version ID)
{
date
echo ""
echo "System PATH for $SLURM_JOB_ID"
echo ""
printf "%0.s-" {1..10}
echo "${PATH}" | tr : \\n
} >> system_path.log

# Set number of CPUs to use
threads=27

# Paths to programs
fastp=/gscratch/srlab/programs/fastp-0.20.0/fastp
multiqc=/gscratch/srlab/programs/anaconda3/bin/multiqc

# Programs array
programs_array=("${fastp}" "${multiqc}")

# Capture program options
for program in "${!programs_array[@]}"
do
  echo "Program options for ${programs_array[program]}: "
  echo ""
  ${programs_array[program]} -h
  echo ""
  echo ""
  echo "
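The script above is cut off before the trimming commands, so the following is only a hedged illustration of how per-library-type trimming could be parameterized, not a reconstruction of the missing part. The function name, the two `*_TRIM_BASES` variables and the choice to clip the same number of bases from both ends of both reads are all assumptions, and the numbers are placeholders rather than values from the Bismark guidelines. The flags themselves (`--trim_front1`, `--trim_front2`, `--trim_tail1`, `--trim_tail2`) are standard fastp options.

```shell
# Sketch: choose extra fastp trimming flags by library type.
# PLACEHOLDER values for illustration only; they are NOT taken from the
# Bismark guidelines or from the original script.
RRBS_TRIM_BASES=3
WGBS_TRIM_BASES=7

trim_args_for() {
  library_type="$1"   # "RRBS" or "WGBS"
  case "$library_type" in
    RRBS) n="$RRBS_TRIM_BASES" ;;
    WGBS) n="$WGBS_TRIM_BASES" ;;
    *) echo "unknown library type: $library_type" >&2; return 1 ;;
  esac
  # Assumption: clip n bases from the 5' and 3' ends of both reads.
  printf '%s\n' "--trim_front1 $n --trim_front2 $n --trim_tail1 $n --trim_tail2 $n"
}

trim_args_for RRBS
trim_args_for WGBS
```

Keeping the library-specific settings in one place means the fastp call itself can stay identical for both library types.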

Sam’s Notebook: RNA Isolation and Quantification – C.bairdi RNA from Hemolymph Pellets in RNAlater

Based on qPCR results testing for residual gDNA from 20200225, a set of 24 samples was identified that required DNase treatment and/or additional RNA. I opted to just isolate more RNA from all samples, since the kit includes a DNase step, which avoids the dilution of the existing RNA that would come with using the Turbo DNA-free Kit that we usually use. Isolated RNA using the Quick DNA/RNA Microprep Kit (ZymoResearch; PDF) according to the manufacturer’s protocol for liquids/cells in RNAlater.

  • Used 35uL from each RNAlater/hemocyte slurry.
  • Mixed with equal volume of H2O (35uL).
  • Retained DNA on the Zymo-Spin IC-XM columns for isolation after RNA isolation.
  • Performed on-column DNase step.
  • RNA was eluted in 15uL H2O.

RNA was quantified on the Roberts Lab Qubit 3.0 using the RNA High Sensitivity Assay (Invitrogen), using 2uL of each sample.
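As a small worked example of the arithmetic implied by these volumes, the sketch below estimates how much RNA remains after quantification. The 15uL elution and 2uL Qubit volumes come from the notes above; the concentration passed in is a made-up value, and the calculation assumes the full elution volume was recovered from the column.

```shell
# Sketch: estimate the RNA mass left after Qubit quantification.
# The concentration argument is hypothetical; no readings are given in the notes.

remaining_rna_ng() {
  conc_ng_per_ul="$1"   # Qubit reading in ng/uL (hypothetical input)
  elution_ul=15         # elution volume from the notes
  qubit_ul=2            # volume consumed by the Qubit assay
  awk -v c="$conc_ng_per_ul" -v e="$elution_ul" -v q="$qubit_ul" \
    'BEGIN { printf "%.1f\n", c * (e - q) }'
}

# Example: a 10 ng/uL reading leaves 13uL, i.e. 130.0 ng.
remaining_rna_ng 10
```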