Shelly’s Notebook: Thur. Nov 1, 2018

Methylation analysis:

Geoduck alignments are too slow, should use directional mapping, and produce many chromosomal sequence extraction errors

  • increase the number of processors/cores from the default (4) to 28 (i.e. -p 28)
  • change to directional mapping, because non-directional mapping should give a strand ratio of roughly 1:1:1:1, but instead it shows ~10:1:1:10, as follows:
    CT/GA/CT: 2758 ((converted) top strand)
    GA/CT/CT: 350 (complementary to (converted) top strand)
    GA/CT/GA: 316 (complementary to (converted) bottom strand)
    CT/GA/GA: 2725 ((converted) bottom strand)
  • padded genome
    • strigg/analyses/20181025/slurm-401715.out shows > 1M errors: ‘Chromosomal sequence could not be extracted’
    • creating padded genome to see if this improves mapping
    • Steven can run scripts from coenv: sbatch -p coenv -A coenv 20181101_bmrkPgenrGenmPadded.sh (I needed to make my directory writable: chmod 777)
  • tested effects of directional mapping and genome padding on mapping efficiency with one sample (103)
    • jupyter notebook
    • mapping reports to compare:
      • original alignment (non-directional mapping, aligned to non-padded genome)
        • /Volumes/web/metacarcinus/Pgenerosa/20181025/EPI-103_S27_L005_R1_001_val_1_bismark_bt2_PE_report.txt
            Bismark was run with Bowtie 2 against the bisulfite genome of /gscratch/srlab/strigg/data/Pgenr/ with the specified options: -q --score-min L,0,-1.2 --ignore-quals --no-mixed --no-discordant --dovetail --minins 60 --maxins 500
            Option '--non_directional' specified: alignments to all strands were being performed (OT, OB, CTOT, CTOB)
            Final Alignment report