WGBS Analysis Part 17

Identifying SNPs with BS-Snper

Now that I know how to use BS-Snper, I’m going to identify merged and individual SNPs in the Manchester data! I set up this Jupyter notebook to start the process.

The first thing I needed to do was merge BAM files with samtools merge:

%%bash samtools merge \ /Volumes/web/spartina/project-gigas-oa-meth/output/BS-Snper/merged-sorted-deduplicated.bam \ /Volumes/web/spartina/project-gigas-oa-meth/output/bismark-roslin/zr3616_1_R1_val_1_val_1_val_1_bismark_bt2_pe.deduplicated.sorted.bam \ /Volumes/web/spartina/project-gigas-oa-meth/output/bismark-roslin/zr3616_2_R1_val_1_val_1_val_1_bismark_bt2_pe.deduplicated.sorted.bam \ /Volumes/web/spartina/project-gigas-oa-meth/output/bismark-roslin/zr3616_3_R1_val_1_val_1_val_1_bismark_bt2_pe.deduplicated.sorted.bam \ /Volumes/web/spartina/project-gigas-oa-meth/output/bismark-roslin/zr3616_4_R1_val_1_val_1_val_1_bismark_bt2_pe.deduplicated.sorted.bam \ /Volumes/web/spartina/project-gigas-oa-meth/output/bismark-roslin/zr3616_5_R1_val_1_val_1_val_1_bismark_bt2_pe.deduplicated.sorted.bam \ /Volumes/web/spartina/project-gigas-oa-meth/output/bismark-roslin/zr3616_6_R1_val_1_val_1_val_1_bismark_bt2_pe.deduplicated.sorted.bam \ /Volumes/web/spartina/project-gigas-oa-meth/output/bismark-roslin/zr3616_7_R1_val_1_val_1_val_1_bismark_bt2_pe.deduplicated.sorted.bam \ /Volumes/web/spartina/project-gigas-oa-meth/output/bismark-roslin/zr3616_8_R1_val_1_val_1_val_1_bismark_bt2_pe.deduplicated.sorted.bam 

Once I had the merged BAM, I looked at the header with samtools view:

Screen Shot 2021-04-07 at 9 18 28 AM

I used default parameters, except for --mincov 5, with the merged BAM to identify SNP variants in my dataset. Again, I saved all of my output to gannet since it was large!

#Defaults: --minhetfreq 0.1 --minhomfreq 0.85 --minquali 15 --maxcover 1000 --minread2 2 --errorate 0.02 --mapvalue 20 #Saving output to gannet !perl /Users/Shared/Apps/BS-Snper-master/BS-Snper.pl \ --fa /Volumes/web/spartina/project-oyster-oa/data/Cg-roslin/cgigas_uk_roslin_v1_genomic-mito.fa \ --input /Volumes/web/spartina/project-gigas-oa-meth/output/BS-Snper/merged-sorted-deduplicated.bam \ --output /Volumes/web/spartina/project-gigas-oa-meth/output/BS-Snper/SNP-candidates.txt \ --methcg /Volumes/web/spartina/project-gigas-oa-meth/output/BS-Snper/CpG-meth-info.tab \ --methchg /Volumes/web/spartina/project-gigas-oa-meth/output/BS-Snper/CHG-meth-info.tab \ --methchh /Volumes/web/spartina/project-gigas-oa-meth/output/BS-Snper/CHH-meth-info.tab \ --mincover 5 \ > /Volumes/web/spartina/project-gigas-oa-meth/output/BS-Snper/SNP-results.vcf \ 2> /Volumes/web/spartina/project-gigas-oa-meth/output/BS-Snper/merged.ERR.log 

I used a loop to run BS-Snper on individual samples, making sure to set any variables necessary for the loop within the same cell. Before running the loop, I moved to this directory where the BAM files were located. After the loop finished running, I moved all the files this directory for BS-Snper output.

%%bash FILES=$(ls *deduplicated.sorted.bam) echo ${FILES} for file in ${FILES} do NAME=$(echo ${file} | awk -F "." '{print $1}') echo ${NAME} perl /Users/Shared/Apps/BS-Snper-master/BS-Snper.pl \ --fa /Volumes/web/spartina/project-oyster-oa/data/Cg-roslin/cgigas_uk_roslin_v1_genomic-mito.fa \ --input ${NAME}.deduplicated.sorted.bam \ --output ${NAME}.SNP-candidates.txt \ --methcg ${NAME}.CpG-meth-info.tab \ --methchg ${NAME}.CHG-meth-info.tab \ --methchh ${NAME}.CHH-meth-info.tab \ --mincover 5 \ > ${NAME}.SNP-results.vcf 2> ${NAME}.ERR.log done mv *SNP-candidates.txt *meth-info.tab *SNP-results.vcf *ERR.log ../BS-Snper/ 

Going forward

  1. Write methods
  2. Obtain relatedness matrix and SNPs with EpiDiverse/snp
  3. Revise methylKit study design: separate tests for indeterminate and female oysters
  4. Run R script on mox
  5. Write results
  6. Identify genomic location of DML
  7. Determine if RNA should be extracted
  8. Determine if larval DNA/RNA should be extracted

Please enable JavaScript to view the comments powered by Disqus.

from the responsible grad student https://ift.tt/2Q1mmF2
via IFTTT