Transcriptome Comparisons – C.bairdi Transcriptomes Evaluations with DETONATE rsem-eval on Mox

UPDATE: I’ll lead in with the fact that this failed with an error message that I can’t figure out. This will save the reader some time. I’ve posted the problem as an Issue on the DETONATE GitHub repo, however it’s clear that this software is no longer maintained, as the repo hasn’t been updated in >3yrs; even lacking responses to Issues that are that old.

Here’s the error message and some other details that could be useful for troubleshooting (which are beyond my knowledge – although I suspect that the XM tag is the culprit and the first entry in the BAM file has XM:i:2 and the error message might suggest that 2 is not an acceptable value e.g. Assertion val == 0 val == 1 val == 5’ failed.`):
rsem-synthesis-reference-transcripts cbai_transcriptome_v3.1.fasta.temp/cbai_transcriptome_v3.1.fasta 0 0 0 /gscratch/srlab/sam/data/C_bairdi/transcriptomes/cbai_transcriptome_v3.1.fasta Transcript Information File is generated! Group File is generated! Extracted Sequences File is generated! rsem-preref cbai_transcriptome_v3.1.fasta.temp/cbai_transcriptome_v3.1.fasta.transcripts.fa 1 cbai_transcriptome_v3.1.fasta.temp/cbai_transcriptome_v3.1.fasta Refs.makeRefs finished! Refs.saveRefs finished! cbai_transcriptome_v3.1.fasta.temp/cbai_transcriptome_v3.1.fasta.idx.fa is generated! cbai_transcriptome_v3.1.fasta.temp/cbai_transcriptome_v3.1.fasta.n2g.idx.fa is generated! rsem-parse-alignments cbai_transcriptome_v3.1.fasta.temp/cbai_transcriptome_v3.1.fasta cbai_transcriptome_v3.1.fasta.temp/cbai_transcriptome_v3.1.fasta cbai_transcriptome_v3.1.fasta.stat/cbai_transcriptome_v3.1.fasta b /gscratch/scrubbed/samwhite/outputs/20201224_cbai_bowtie2_transcriptomes_alignments/cbai_transcriptome_v3.1.fasta.sorted.bam -t 3 -tag XM rsem-parse-alignments: parseIt.cpp:92: void parseIt(SamParser*) [with ReadType = PairedEndReadQ; HitType = PairedEndHit]: Assertion `val == 0 || val == 1 || val == 5' failed. "rsem-parse-alignments cbai_transcriptome_v3.1.fasta.temp/cbai_transcriptome_v3.1.fasta cbai_transcriptome_v3.1.fasta.temp/cbai_transcriptome_v3.1.fasta cbai_transcriptome_v3.1.fasta.stat/cbai_transcriptome_v3.1.fasta b /gscratch/scrubbed/samwhite/outputs/20201224_cbai_bowtie2_transcriptomes_alignments/cbai_transcriptome_v3.1.fasta.sorted.bam -t 3 -tag XM" failed! Plase check if you provide correct parameters/options for the pipeline! 

Here’s what the head of the BAM file looks like:

[samwhite@n2233 20201224_cbai_bowtie2_transcriptomes_alignments]$ /gscratch/srlab/programs/samtools-1.10/samtools view cbai_transcriptome_v3.1.fasta.sorted.bam | head A00147:108:HLLJFDMXX:1:1369:3893:6637 163 TRINITY_DN5604_c0_g2_i1 1 42 101M = 81 181 GAAAGAAAAACCGACAGGAGGAATTTCTTTGTTACCAACAAAAACTAATATATTTCGCATACCTGACAGACATGGTGACAGCGCCTCTGATGTTCGCCGAA :FFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFF:FFFFFFFFFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFFFF AS:i:-2 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:2G31A66 YS:i:0 YT:Z:CP A00147:121:HLLVMDMXX:1:2159:5674:25786 163 TRINITY_DN5604_c0_g2_i1 4 42 101M = 72 169 AGAAAAACCGACAGGAGGAATTTCTTTGTTAACAACAAAAACTAATATATTTCGCATACCTGACAGACATGGTGACAGCGCCTCTGATGTTCGCCGAATTA FFFFFF:FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:101 YS:i:0 YT:Z:CP A00147:108:HLLJFDMXX:2:1420:13702:14920 163 TRINITY_DN5604_c0_g2_i1 5 42 101M = 197 293 GAAAAACCGACAGGAGGAATTTCTTTGTTAACAACAAAAACTAATATATTTCGCATACCTGACAGACATGGTGACAGCGCCTCTGATGTTCGCCGAATTAA FFFFFFFFFFFFFFFFFF:FFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFF AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:101 YS:i:0 YT:Z:CP A00147:108:HLLJFDMXX:2:1420:13431:15796 163 TRINITY_DN5604_c0_g2_i1 5 42 101M = 197 293 GAAAAACCGACAGGAGGAATTTCTTTGTTAACAACAAAAACTAATATATTTCGCATACCTGACAGACATGGTGACAGCGCCTCTGATGTTCGCCGAATTAA FFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:101 YS:i:0 YT:Z:CP A00147:121:HLLVMDMXX:1:1258:5141:32377 163 TRINITY_DN5604_c0_g2_i1 5 42 101M = 107 203 GAAAAACCGACAGGAGGAATTTCTTTGTTACCAACAAAAACTAATATATTTCGCATACCTGACAGACATGGTGACAGCGCCTCTGATGTTCGCCGAATTAA FFFFF,FFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFF:F:FFFFFFFF AS:i:-1 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:30A70 YS:i:0 YT:Z:CP A00147:121:HLLVMDMXX:1:1259:7645:5838 163 TRINITY_DN5604_c0_g2_i1 5 42 101M = 107 203 GAAAAACCGACAGGAGGAATTTCTTTGTTACCAACAAAAACTAATATATTTCGCATACCTGACAGACATGGTGACAGCGCCTCTGATGTTCGCCGAATTAA FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF AS:i:-1 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:30A70 YS:i:0 YT:Z:CP A00147:108:HLLJFDMXX:1:1351:27082:1705 163 TRINITY_DN5604_c0_g2_i1 6 42 101M = 158 253 AAAAACCGACAGGAGGAATTTCTTTGTTAACAACAAAAACTAATATATTTCGCATACCTGACAGACATGGTGACAGCGCCTCTGATGTTCGCCGAATTAAA FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF: AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:101 YS:i:0 YT:Z:CP A00147:108:HLLJFDMXX:1:1475:28926:32706 163 TRINITY_DN5604_c0_g2_i1 6 42 101M = 158 253 AAAAACCGACAGGAGGAATTTCTTTGTTAACAACAAAAACTAATATATTTCGCATACCTGACAGACATGGTGACAGCGCCTCTGATGTTCGCCGAATTAAA FFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:101 YS:i:0 YT:Z:CP A00147:121:HLLVMDMXX:1:2162:31385:26381 163 TRINITY_DN5604_c0_g2_i1 6 42 101M = 158 253 AAAAACCGACAGGAGGAANTTCTTTGTTAACAACAAAAACTAATATATTTCGCATACCTGACAGACATGGTGACAGCGCCTCTGATGTTCGCCGAATTAAA FFFFFFFFFFFFFFFFFF#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFF: AS:i:-1 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:18T82 YS:i:0 YT:Z:CP A00147:121:HLLVMDMXX:2:1425:28745:30639 163 TRINITY_DN5604_c0_g2_i1 7 42 101M = 169 263 AAAACCGACAGGAGGAATTTCTTTGTTAACAACAAAAACTAATATATTTCGCATACCTGACAGACATGGTGACAGCGCCTCTGATGTTCGCCGAATTAAAG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF: AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:101 YS:i:0 YT:Z:CP 

bowtie2 commands to generate the BAM file:

 # Use bowtie2 and paired-end options # Uses settings specified for use with DETONATE # and for paired end reads when using DETONATE. ${programs_array[bowtie2]} \ -x ${transcriptome_name} \ -S ${transcriptome_name}.sam \ --threads ${threads} \ -1 ${R1_list} \ -2 ${R2_list} \ --sensitive \ --dpad 0 \ --gbar 99999999 \ --mp 1,1 \ --np 1 \ --score-min L,0,-0.1 \ --no-mixed \ --no-discordant # Convert SAM to sorted BAM # ${programs_array[samtools_view]} \ -b \ ${transcriptome_name}.sam \ | ${programs_array[samtools_sort]} \ -m ${mem_per_thread} \ --threads ${threads} \ -o ${transcriptome_name}.sorted.bam \ - 

With all of that out of the way, you can find the original post below.

Alignments – C.bairdi RNAseq Transcriptome Alignments Using Bowtie2 on Mox

I had previously attempted to compare all of our C.bairdi transcriptome assemblies using DETONATE on 20200601, but, due to hitting time limits on Mox, failed to successfully get the analysis to complete. I realized that the limiting factor was performing FastQ alignments, so I decided to run this step independently to see if I could at least get that step resolved. DETONATE (rsem-eval) will accept BAM files as input, so I’m hoping I can power through this alignment step and then provided DETONATE (rsem-eval) with the BAM files.

I ran bowtie2 on Mox using the alignment settings described in the DETONATE (rsem-eval) documentation (see SBATCH script below for actual bowtie2 parameters).

SBATCH script (GitHub):

#!/bin/bash ## Job Name #SBATCH --job-name=20201224_cbai_bowtie2_transcriptomes_alignments ## Allocation Definition #SBATCH --account=srlab #SBATCH --partition=srlab ## Resources ## Nodes #SBATCH --nodes=1 ## Walltime (days-hours:minutes:seconds format) #SBATCH --time=15-00:00:00 ## Memory per node #SBATCH --mem=500G ##turn on e-mail notification #SBATCH --mail-type=ALL #SBATCH --mail-user=samwhite ## Specify the working directory for this job #SBATCH --chdir=/gscratch/scrubbed/samwhite/outputs/20201224_cbai_bowtie2_transcriptomes_alignments # This is a script to generate BAM files for use in DETONATE's # rsem-eval program to compare C.bairdi transcriptome assembly "qualities". ################################################################################### # These variables need to be set by user # Assign Variables reads_dir=/gscratch/srlab/sam/data/C_bairdi/RNAseq transcriptomes_dir=/gscratch/srlab/sam/data/C_bairdi/transcriptomes threads=28 mem_per_thread=10G # Program paths bowtie2_dir="/gscratch/srlab/programs/bowtie2-2.4.2-linux-x86_64" samtools="/gscratch/srlab/programs/samtools-1.10/samtools" # Array of the various comparisons to evaluate # Each condition in each comparison should be separated by a "-" transcriptomes_array=( "${transcriptomes_dir}"/cbai_transcriptome_v1.0.fasta \ "${transcriptomes_dir}"/cbai_transcriptome_v1.5.fasta \ "${transcriptomes_dir}"/cbai_transcriptome_v1.6.fasta \ "${transcriptomes_dir}"/cbai_transcriptome_v1.7.fasta \ "${transcriptomes_dir}"/cbai_transcriptome_v2.0.fasta \ "${transcriptomes_dir}"/cbai_transcriptome_v2.1.fasta \ "${transcriptomes_dir}"/cbai_transcriptome_v3.0.fasta \ "${transcriptomes_dir}"/cbai_transcriptome_v3.1.fasta ) ################################################################################### # Exit script if any command fails set -e # Load Python Mox module for Python module availability module load intel-python3_2017 # Programs array declare -A programs_array programs_array=( [bowtie2]="${bowtie2_dir}/bowtie2" \ [bowtie2_build]="${bowtie2_dir}/bowtie2-build" \ [samtools_index]="${samtools} index" \ [samtools_sort]="${samtools} sort" \ [samtools_view]="${samtools} view" ) # Loop through each comparison for transcriptome in "${!transcriptomes_array[@]}" do ## Inititalize arrays R1_array=() R2_array=() reads_array=() # Variables R1_list="" R2_list="" transcriptome_name="${transcriptomes_array[$transcriptome]##*/}" # Capture FastA checksums for verification echo "Generating checksum for ${transcriptome_name}" md5sum "${transcriptomes_array[transcriptome]}" >> fasta.checksums.md5 echo "Finished generating checksum for ${transcriptome_name}" echo "" if [[ "${transcriptome_name}" == "cbai_transcriptome_v1.0.fasta" ]]; then reads_array=("${reads_dir}"/20200[15][13][138]*megan*.fq) # Create array of fastq R1 files R1_array=("${reads_dir}"/20200[15][13][138]*megan*R1.fq) # Create array of fastq R2 files R2_array=("${reads_dir}"/20200[15][13][138]*megan*R2.fq) elif [[ "${transcriptome_name}" == "cbai_transcriptome_v1.5.fasta" ]]; then reads_array=("${reads_dir}"/20200[145][13][138]*megan*.fq) # Create array of fastq R1 files R1_array=("${reads_dir}"/20200[145][13][138]*megan*R1.fq) # Create array of fastq R2 files R2_array=("${reads_dir}"/20200[145][13][138]*megan*R2.fq) elif [[ "${transcriptome_name}" == "cbai_transcriptome_v1.6.fasta" ]]; then reads_array=("${reads_dir}"/*megan*.fq) # Create array of fastq R1 files R1_array=("${reads_dir}"/*megan*R1.fq) # Create array of fastq R2 files R2_array=("${reads_dir}"/*megan*R2.fq) elif [[ "${transcriptome_name}" == "cbai_transcriptome_v1.7.fasta" ]]; then reads_array=("${reads_dir}"/20200[145][13][189]*megan*.fq) # Create array of fastq R1 files R1_array=("${reads_dir}"/20200[145][13][189]*megan*R1.fq) # Create array of fastq R2 files R2_array=("${reads_dir}"/20200[145][13][189]*megan*R2.fq) elif [[ "${transcriptome_name}" == "cbai_transcriptome_v2.0.fasta" ]] \ || [[ "${transcriptome_name}" == "cbai_transcriptome_v2.1.fasta" ]]; then reads_array=("${reads_dir}"/*fastp-trim*.fq) # Create array of fastq R1 files R1_array=("${reads_dir}"/*R1*fastp-trim*.fq) # Create array of fastq R2 files R2_array=("${reads_dir}"/*R2*fastp-trim*.fq) elif [[ "${transcriptome_name}" == "cbai_transcriptome_v3.0.fasta" ]] \ || [[ "${transcriptome_name}" == "cbai_transcriptome_v3.1.fasta" ]]; then reads_array=("${reads_dir}"/*fastp-trim*20[12][09][01][24]1[48]*.fq) # Create array of fastq R1 files R1_array=("${reads_dir}"/*R1*fastp-trim*20[12][09][01][24]1[48]*.fq) # Create array of fastq R2 files R2_array=("${reads_dir}"/*R2*fastp-trim*20[12][09][01][24]1[48]*.fq) fi # Create list of fastq files used in analysis ## Uses parameter substitution to strip leading path from filename printf "%s\n" "${reads_array[@]##*/}" >> "${transcriptome_name}".fastq.list.txt # Create comma-separated lists of FastQ reads R1_list=$(echo "${R1_array[@]}" | tr " " ",") R2_list=$(echo "${R2_array[@]}" | tr " " ",") # Build Bowtie2 index # Transcriptome name is used as index basename ${programs_array[bowtie2_build]} \ --threads ${threads} \ ${transcriptomes_array[$transcriptome]} \ ${transcriptome_name} # Run rsem-eval # Use bowtie2 and paired-end options # Uses settings specified for use with DETONATE # and for paired end reads when using DETONATE. ${programs_array[bowtie2]} \ -x ${transcriptome_name} \ -S ${transcriptome_name}.sam \ --threads ${threads} \ -1 ${R1_list} \ -2 ${R2_list} \ --sensitive \ --dpad 0 \ --gbar 99999999 \ --mp 1,1 \ --np 1 \ --score-min L,0,-0.1 \ --no-mixed \ --no-discordant # Convert SAM to sorted BAM # ${programs_array[samtools_view]} \ -b \ ${transcriptome_name}.sam \ | ${programs_array[samtools_sort]} \ -m ${mem_per_thread} \ --threads ${threads} \ -o ${transcriptome_name}.sorted.bam \ - # Capture BAM checksums for verification echo "Generating checksum for ${transcriptome_name}.sorted.bam" md5sum ${transcriptome_name}.sorted.bam >> bam.checksums.md5 echo "Finished generating checksum for ${transcriptome_name}.sorted.bam" echo "" done # Remove leftover SAM files rm *.sam # Capture program options echo "Logging program options..." for program in "${!programs_array[@]}" do { echo "Program options for ${program}: " echo "" # Handle samtools help menus if [[ "${program}" == "samtools_index" ]] \ || [[ "${program}" == "samtools_sort" ]] \ || [[ "${program}" == "samtools_view" ]] then ${programs_array[$program]} fi ${programs_array[$program]} -h echo "" echo "" echo "

GO terms and Venn Diagrams

Obtaining GO IDs My analysis prior to this used DESeq2 to obtain a series of tables of genes with significantly-different expressions for each of my four comparisons. I wrote a script in R that took in each DESeq2 output file, matched it against this annotated transcriptome, and produced a newline-separated file of UniProt accessions. I then took each newline-separated file and put it into this Bash script from Sam, which retrieved the Gene Ontology terms for each Accession ID. This is progressing fairly well! Next steps are as follows: Get the GOslim terms using the GSEAbase package Perform gene enrichment analysis with TopGO or GO-MWU When time allows, go back and annotate the transcriptome myself to understand it more fully Producing Venn Diagrams In order to determine the overlap between DEGs for my different comparisons, I produced a few Venn diagrams. Since a quick analysis found practically no overlap between…

from Aidan F. Coyle https://ift.tt/3hcKQof
via IFTTT

Lists of Differentially-Expressed Genes (plus graphs)

Outline I put a whole bunch of libraries through the same protocol as before – use Kallisto to create pseudoalignments for libraries, use a Trinity pipeline script to merge Kallisto counts into a matrix, analyze matrix with DESeq2, produce list of significantly different genes (padj <= 0.005) At this point,…

from Aidan F. Coyle https://ift.tt/3p9Gq4e
via IFTTT

Day 0 Ambient vs. Day 17 Ambient

Comparison Yesterday, I described how I planned to start with a comparison of Day 0 and Day 17 libraries at ambient temperatures. I was able to complete that! Methods followed the same protocol outlined previously – use Kallisto to create pseudoalignments for libraries, use a script from Trinity pipeline to…

from Aidan F. Coyle https://ift.tt/38f0AD7
via IFTTT

Adding Individual Libraries

DESeq2 First, I finished up my first quarter! Congratulations to me! Alright, I’ve made a lot of progress since last time. For the moment, building an index for transcriptome 3.0 is on the backburner – the top priority is just getting a list of differentially-expressed genes. I was able to…

from Aidan F. Coyle https://ift.tt/3r3DuYK
via IFTTT

Adding Individual Libraries

DESeq2 First, I finished up my first quarter! Congratulations to me! Alright, I’ve made a lot of progress since last time. For the moment, building an index for transcriptome 3.0 is on the backburner – the top priority is just getting a list of differentially-expressed genes. I was able to…

from Aidan F. Coyle https://ift.tt/34n40me
via IFTTT

Samples Received – Cockle Clam Gonad H and E Slides

Today we received the H & E-stained slides from the cockle clam gonad tissue blocks/cassettes we submitted on 20201201. Slides were added to Slide Case #5 – Rows 13 – 37 (Google Sheet).

All info has been added to:

from Sam’s Notebook https://ift.tt/3oZHVlA
via IFTTT

FastQC-MultiQC – M.magister MBD-BSseq Pool Test MiSeq Run on Mox

Earlier today we received the M.magister (C.magister; Dungeness crab) MiSeq data from Mac.

I ran FastQC and MultiQC on Mox.

SBATCH script (GitHub):

#!/bin/bash ## Job Name #SBATCH --job-name=20201211_mmag_fastqc_multiqc_mbd-bsseq_miseq ## Allocation Definition #SBATCH --account=coenv #SBATCH --partition=coenv ## Resources ## Nodes #SBATCH --nodes=1 ## Walltime (days-hours:minutes:seconds format) #SBATCH --time=10-00:00:00 ## Memory per node #SBATCH --mem=120G ##turn on e-mail notification #SBATCH --mail-type=ALL #SBATCH --mail-user=samwhite ## Specify the working directory for this job #SBATCH --chdir=/gscratch/scrubbed/samwhite/outputs/20201211_mmag_fastqc_multiqc_mbd-bsseq_miseq ### FastQC assessment of raw MiSeq sequencing test run for ### MBD-BSseq pool of M.magister samples from 20201202. ################################################################################### # These variables need to be set by user # FastQC output directory output_dir=$(pwd) # Set number of CPUs to use threads=28 # Input/output files checksums=fastq_checksums.md5 fastq_list=fastq_list.txt raw_reads_dir=/gscratch/srlab/sam/data/C_magister/MBD-BSseq/ # Paths to programs fastqc=/gscratch/srlab/programs/fastqc_v0.11.9/fastqc multiqc=/gscratch/srlab/programs/anaconda3/bin/multiqc # Programs associative array declare -A programs_array programs_array=( [fastqc]="${fastqc}" \ [multiqc]="${multiqc}" ) ################################################################################### # Exit script if any command fails set -e # Load Python Mox module for Python module availability module load intel-python3_2017 # Sync raw FastQ files to working directory rsync --archive --verbose \ "${raw_reads_dir}"CH*.fastq.gz . # Populate array with FastQ files fastq_array=(CH*.fastq.gz) # Pass array contents to new variable fastqc_list=$(echo "${fastq_array[*]}") # Run FastQC # NOTE: Do NOT quote ${fastqc_list} ${programs_array[fastqc]} \ --threads ${threads} \ --outdir ${output_dir} \ ${fastqc_list} # Create list of fastq files used in analysis echo "${fastqc_list}" | tr " " "\n" >> ${fastq_list} # Generate checksums for reference while read -r line do # Generate MD5 checksums for each input FastQ file echo "Generating MD5 checksum for ${line}." md5sum "${line}" >> "${checksums}" echo "Completed: MD5 checksum for ${line}." echo "" # Remove fastq files from working directory echo "Removing ${line} from directory" rm "${line}" echo "Removed ${line} from directory" echo "" done < ${fastq_list} # Run MultiQC ${programs_array[multiqc]} . # Capture program options for program in "${!programs_array[@]}" do { echo "Program options for ${program}: " echo "" # Handle samtools help menus if [[ "${program}" == "samtools_index" ]] \ || [[ "${program}" == "samtools_sort" ]] \ || [[ "${program}" == "samtools_view" ]] then ${programs_array[$program]} fi ${programs_array[$program]} -h echo "" echo "" echo "

Data Received – M.magister MBD-BSseq Pool Test MiSeq Run

After creating _M.magister (C.magister; Dungeness crab) MBD-BSseq libraries (on 20201124), I gave the pooled set of samples to Mac for a test sequencing run on the MiSeq on 20201202.

MiSeq data consisted of 76bp paired-end (PE) sequencing.

All files were downloaded to the C_magister folder on Owl(Synology server).

Have added files to our high-throughput sequencing database (Google Sheet):

Next up:

from Sam’s Notebook https://ift.tt/3gMdQD6
via IFTTT