Sam’s Notebook: TrimGalore/FastQC/MultiQC – 14bp Trim C. virginica MBD BS-seq FASTQ data

0000-0002-2747-368X

Yesterday, I ran TrimGalore/FastQC/MultiQC on the Crassostrea virginica MBD BS-seq data from ZymoResearch with the default settings (i.e. “auto-trim”). There was still some variability in the first ~15bp of the reads and Steven wanted to see how a hard trim would change things.

I ran TrimGalore (using the built-in FastQC option), with a hard trim of the first 14bp of each read and followed up with MultiQC for a summary of the FastQC reports.
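For readers who want to reproduce this kind of run, here is a minimal sketch of how the two commands could be assembled. It is not the job script linked below. It assumes paired-end input (the notes only mention Read 1, so this is a guess), and the file and folder names are hypothetical placeholders. The flags used (`--paired`, `--fastqc`, `--clip_R1`, `--clip_R2`, `--output_dir` for Trim Galore; `--outdir` for MultiQC) are standard options of those tools.

```python
import shlex
from typing import List


def build_trim_galore_cmd(read1: str, read2: str, out_dir: str,
                          clip_5prime: int = 14) -> List[str]:
    """Assemble a Trim Galore call that hard-trims the 5' end of both reads.

    Assumes paired-end data; file names are supplied by the caller.
    """
    return [
        "trim_galore",
        "--paired",                       # assumption: paired-end library
        "--fastqc",                       # run FastQC on the trimmed output
        "--clip_R1", str(clip_5prime),    # remove N bp from the 5' end of Read 1
        "--clip_R2", str(clip_5prime),    # remove N bp from the 5' end of Read 2
        "--output_dir", out_dir,
        read1,
        read2,
    ]


def build_multiqc_cmd(fastqc_dir: str, out_dir: str) -> List[str]:
    """Assemble a MultiQC call that summarizes the FastQC reports in a folder."""
    return ["multiqc", fastqc_dir, "--outdir", out_dir]


if __name__ == "__main__":
    # Hypothetical file and folder names, for illustration only.
    trim_cmd = build_trim_galore_cmd("sample_R1.fastq.gz", "sample_R2.fastq.gz",
                                     "trimmed_14bp")
    print(" ".join(shlex.quote(arg) for arg in trim_cmd))
    multiqc_cmd = build_multiqc_cmd("trimmed_14bp", "multiqc_out")
    print(" ".join(shlex.quote(arg) for arg in multiqc_cmd))
```

Building the command as a list keeps it easy to pass to `subprocess.run` on a machine where both tools are installed.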

TrimGalore job script:

Standard error was redirected on the command line to this file:

MD5 checksums were generated on the resulting trimmed FASTQ files:

All data was copied to my folder on Owl.

Checksums for FASTQ files were verified post-data transfer (data not shown).
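The checksum round trip (generate before the copy, verify after) can be sketched as follows. This illustrates the idea and is not the commands actually used. The manifest layout mimics `md5sum` output (hash, two spaces, file name), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in chunks so large FASTQs fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(files, manifest_path):
    """Write one 'hash  filename' line per file (md5sum-style layout)."""
    with open(manifest_path, "w") as out:
        for f in files:
            out.write(f"{md5_of_file(f)}  {Path(f).name}\n")


def verify_manifest(manifest_path, data_dir):
    """Return the names of files that are missing or whose MD5 no longer matches."""
    mismatches = []
    with open(manifest_path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            expected, name = line.split(maxsplit=1)
            name = name.lstrip("*")  # md5sum marks binary mode with a leading '*'
            target = Path(data_dir) / name
            if not target.is_file() or md5_of_file(target) != expected:
                mismatches.append(name)
    return mismatches
```

An empty list from `verify_manifest` means every file listed in the manifest arrived intact.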

Results:

Output folder:

FastQC output folder:

MultiQC output folder:

MultiQC HTML report:

OK, this trimming definitely took care of the variability seen in the first ~15bp of all the reads.

However, I noticed that the last 2bp of every Read 1 sequence look wonky. I should probably trim those off, too…
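To make the two kinds of hard trim concrete, here is a small sketch of what clipping 14bp from the 5' end and 2bp from the 3' end does to a single read. It only illustrates the slicing; it is not how Trim Galore is implemented, and the function name and example read are made up. In Trim Galore itself, the 3' clip for Read 1 corresponds to the `--three_prime_clip_R1` option.

```python
def hard_trim(seq, qual, clip_5prime=0, clip_3prime=0):
    """Remove a fixed number of bases from each end of a read.

    The sequence and quality strings are trimmed identically so they stay in sync.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings must be the same length")
    # Never let the end index fall before the start index on very short reads.
    end = max(len(seq) - clip_3prime, clip_5prime)
    return seq[clip_5prime:end], qual[clip_5prime:end]


if __name__ == "__main__":
    # Made-up 26 bp read: 14 bp to clip, 10 bp to keep, 2 bp to clip.
    seq = "A" * 14 + "CGTACGTACG" + "NN"
    qual = "#" * 14 + "I" * 10 + "##"
    print(hard_trim(seq, qual, clip_5prime=14, clip_3prime=2))
```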

from Sam’s Notebook https://ift.tt/2qoBz2n

Sam’s Notebook: TrimGalore/FastQC/MultiQC – Auto-trim C. virginica MBD BS-seq FASTQ data

0000-0002-2747-368X

Yesterday, I ran FastQC/MultiQC on the Crassostrea virginica MBD BS-seq data from ZymoResearch. Steven wanted to trim it and see how things turned out.

I ran TrimGalore (using the built-in FastQC option) and followed up with MultiQC for a summary of the FastQC reports.

TrimGalore job script:

Standard error was redirected on the command line to this file:

MD5 checksums were generated on the resulting trimmed FASTQ files:

All data was copied to my folder on Owl.

Checksums for FASTQ files were verified post-data transfer.

Results:

Output folder:

FastQC output folder:

MultiQC output folder:

MultiQC HTML report:

Overall, the auto-trim didn’t change much. Steven’s specific concern is the variability in the first 15bp (seen in the Per Base Sequence Content section of the MultiQC output); it was reduced, but not by much. I’ll do a separate TrimGalore run with a hard trim of the first 14bp of each read and see how that looks.
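For context on what the Per Base Sequence Content plot measures, here is a rough sketch that computes the percentage of each base at the first few read positions. It is a simplified stand-in for what FastQC reports, not FastQC's own code, and the function name and toy reads are hypothetical.

```python
from collections import Counter


def per_base_content(reads, n_positions=15):
    """Return, for each of the first n_positions, the percentage of A/C/G/T/N."""
    counts = [Counter() for _ in range(n_positions)]
    for read in reads:
        for i, base in enumerate(read[:n_positions]):
            counts[i][base] += 1
    content = []
    for position_counts in counts:
        total = sum(position_counts.values())
        content.append({
            base: (100.0 * position_counts[base] / total if total else 0.0)
            for base in "ACGTN"
        })
    return content


if __name__ == "__main__":
    # Toy reads, for illustration only.
    toy_reads = ["ACGT", "ACGA", "TCGA", "ACGA"]
    for pos, row in enumerate(per_base_content(toy_reads, n_positions=4), start=1):
        print(pos, row)
```

Large swings in these percentages from one position to the next are the kind of variability the plot makes visible.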

from Sam’s Notebook https://ift.tt/2v14v6n