Sean’s Notebook: Bismark mapping efficiency with Hard trimmed C. virginica sample.
Yesterday Mackenzie Gavery came by and offered some suggestions to increase mapping rates for our C. virginica BS-Seq data using Bismark. Her two suggestions were to use the --non_directional flag to account for the PBATness of the data, which had a huge effect, and to hard trim the first 16 bases of the reads in our samples, because they look weird.
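For anyone wondering what "hard trim" means in practice, here is a minimal Python sketch. It is not the command I actually ran; the function names (parse_fastq, hard_trim) and the example read are made up for illustration, and it assumes plain 4-line FASTQ records. The only value taken from above is the 16-base cut.

```python
from typing import Iterable, Iterator, Tuple

# (header, sequence, separator, quality) for one read
FastqRecord = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Group FASTQ lines into 4-line records (assumes sequences are not wrapped)."""
    cleaned = [line.rstrip("\n") for line in lines]
    for i in range(0, len(cleaned) - 3, 4):
        header, seq, sep, qual = cleaned[i:i + 4]
        yield header, seq, sep, qual


def hard_trim(records: Iterable[FastqRecord], n_bases: int = 16) -> Iterator[FastqRecord]:
    """Drop the first n_bases from every read, regardless of base quality."""
    for header, seq, sep, qual in records:
        # Sequence and quality strings must stay the same length.
        yield header, seq[n_bases:], sep, qual[n_bases:]


# Hypothetical 24-base read: 16 bases to discard, 8 to keep.
example = [
    "@read1",
    "ACGT" * 4 + "TTGGCCAA",
    "+",
    "I" * 16 + "F" * 8,
]

trimmed = list(hard_trim(parse_fastq(example)))
print(trimmed[0])
```

The point is that a hard trim removes a fixed number of bases from the start of every read, unlike quality trimming, which only removes bases that fall below a threshold.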
I tried everything on a single sample for speed and finished it this morning.
The hard trim cleans stuff up for sure. Unfortunately, it didn’t have much of an effect on mapping rate, bringing us up from 28% to 28.3%. It was worth a shot though!
Final Alignment report
======================
Sequences analysed in total: 12197930
Number of alignments with a unique best hit from the different alignments: 3456338
Mapping efficiency: 28.3%
Sequences with no alignments under any condition: 5760842
Sequences did not map uniquely: 2980750
Sequences which were discarded because genomic sequence could not be extracted: 0

Number of sequences with unique best (first) alignment came from the bowtie output:
CT/CT: 181719 ((converted) top strand)
CT/GA: 166362 ((converted) bottom strand)
GA/CT: 1588675 (complementary to (converted) top strand)
GA/GA: 1519582 (complementary to (converted) bottom strand)

Final Cytosine Methylation Report
=================================
Total number of C's analysed: 61813350

Total methylated C's in CpG context: 12572131
Total methylated C's in CHG context: 4005979
Total methylated C's in CHH context: 12350257
Total methylated C's in Unknown context: 0

Total unmethylated C's in CpG context: 2271987
Total unmethylated C's in CHG context: 12442077
Total unmethylated C's in CHH context: 18170919
Total unmethylated C's in Unknown context: 5

C methylated in CpG context: 84.7%
C methylated in CHG context: 24.4%
C methylated in CHH context: 40.5%
C methylated in Unknown context (CN or CHN): 0.0%
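As a sanity check on the report, the percentages can be recomputed from the raw counts. The sketch below is just arithmetic on the numbers printed above; the variable names and the percent helper are my own, not anything Bismark provides. It also works out what share of the unique hits came from the two complementary strands, which, as far as I understand it, are the ones a default directional run would not report.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts copied from the alignment report above.
total_reads = 12197930
unique_hits = 3456338
strand_hits = {
    "CT/CT": 181719,   # (converted) top strand
    "CT/GA": 166362,   # (converted) bottom strand
    "GA/CT": 1588675,  # complementary to (converted) top strand
    "GA/GA": 1519582,  # complementary to (converted) bottom strand
}

# (methylated, unmethylated) C counts per context, from the methylation report.
c_counts = {
    "CpG": (12572131, 2271987),
    "CHG": (4005979, 12442077),
    "CHH": (12350257, 18170919),
}

mapping_efficiency = percent(unique_hits, total_reads)
complementary_share = percent(
    strand_hits["GA/CT"] + strand_hits["GA/GA"], unique_hits
)
methylation = {
    context: percent(meth, meth + unmeth)
    for context, (meth, unmeth) in c_counts.items()
}

print(f"Mapping efficiency: {mapping_efficiency}%")                  # 28.3%
print(f"Unique hits on complementary strands: {complementary_share}%")  # 89.9%
for context, value in methylation.items():
    print(f"C methylated in {context} context: {value}%")
```

The four strand counts add up to the unique-hit total, and roughly 90% of those hits sit on the complementary strands, which fits with the non-directional flag making such a big difference.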
Bismark output files located: here
Now it's time to run the rest of them!