Sam’s Notebook: Transcriptome Annotation – Hematodinium MEGAN Trinity Assembly Using DIAMOND BLASTx on Mox

As part of annotating the most recent transcriptome assembly from the MEGAN6 Hematodinium taxonomic-specific reads, I need to run DIAMOND BLASTx so the results can be used with Trinotate.

Ran DIAMOND BLASTx against the UniProt/SwissProt database (downloaded 20200123) on Mox.

SBATCH script (GitHub):

#!/bin/bash
## Job Name
#SBATCH --job-name=hemat_blastx_DIAMOND
## Allocation Definition
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=10-00:00:00
## Memory per node
#SBATCH --mem=120G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --chdir=/gscratch/scrubbed/samwhite/outputs/20200331_hemat_diamond_blastx_megan

# Exit script if any command fails
set -e

# Load Python Mox module for Python module availability
module load intel-python3_2017

# SegFault fix?
export THREADS_DAEMON_MODEL=1

# Document programs in PATH (primarily for program version ID)
{
date
echo ""
echo "System PATH for $SLURM_JOB_ID"
echo ""
printf "%0.s-" {1..10}
echo "${PATH}" | tr : \\n
} >> system_path.log

# Program paths
diamond=/gscratch/srlab/programs/diamond-0.9.29/diamond

# DIAMOND UniProt database
dmnd=/gscratch/srlab/blastdbs/uniprot_sprot_20200123/uniprot_sprot.dmnd

# Trinity assembly (FastA)
fasta=/gscratch/srlab/sam/data/Hematodinium/transcriptomes/20200331.hemat.megan.Trinity.fasta

# Strip leading path and extensions
no_path=$(echo "${fasta##*/}")
no_ext=$(echo "${no_path%.*}")

# Run DIAMOND with blastx
# Output format 6 produces a standard BLAST tab-delimited file
${diamond} blastx \
--db ${dmnd} \
--query "${fasta}" \
--out "${no_ext}".blastx.outfmt6 \
--outfmt 6 \
--evalue 1e-4 \
--max-target-seqs 1 \
--block-size 15.0 \
--index-chunks 4
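For reference, --outfmt 6 yields the standard 12-column BLAST tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). A quick sketch of pulling fields out of that file with awk; the hit below is fabricated purely for illustration:

```shell
# Fabricated example line (not a real hit) standing in for the DIAMOND output
printf 'TRINITY_DN1000_c0_g1_i1\tsp|P12345|EXMP_HUMAN\t78.5\t200\t43\t0\t1\t600\t1\t200\t1e-50\t250\n' > example.blastx.outfmt6

# Pull query ID, subject ID, and e-value (columns 1, 2, and 11)
awk -F'\t' '{print $1, $2, $11}' example.blastx.outfmt6
```

With --max-target-seqs 1 there is one line per query, so this kind of extraction maps queries straight to their single best SwissProt hit.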

Sam’s Notebook: Transcriptome Assessment – BUSCO Metazoa on Hematodinium MEGAN Transcriptome

I previously created a de novo transcriptome assembly with Trinity on 20200331, using the MEGAN6 taxonomic-specific reads for Alveolata (extracted from the C.bairdi RNAseq data), and decided to assess its “completeness” using BUSCO and the metazoa_odb9 database.

BUSCO was run with the --mode transcriptome option on Mox.

SBATCH script (GitHub):

#!/bin/bash
## Job Name
#SBATCH --job-name=hemat_busco_megan_transcriptome
## Allocation Definition
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=3-00:00:00
## Memory per node
#SBATCH --mem=120G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --chdir=/gscratch/scrubbed/samwhite/outputs/20200331_hemat_busco_megan

# Load Python Mox module for Python module availability
module load intel-python3_2017

# Load Open MPI module for parallel, multi-node processing
module load icc_19-ompi_3.1.2

# SegFault fix?
export THREADS_DAEMON_MODEL=1

# Document programs in PATH (primarily for program version ID)
{
date
echo ""
echo "System PATH for $SLURM_JOB_ID"
echo ""
printf "%0.s-" {1..10}
echo "${PATH}" | tr : \\n
} >> system_path.log

# Establish variables for more readable code
timestamp=$(date +%Y%m%d)
species="hemat"
prefix="${timestamp}.${species}"

## Input files and settings
base_name="${prefix}.megan"
busco_db=/gscratch/srlab/sam/data/databases/BUSCO/metazoa_odb9
transcriptome_fasta=/gscratch/srlab/sam/data/Hematodinium/transcriptomes/20200331.hemat.megan.Trinity.fasta
augustus_species=fly
threads=28

## Save working directory
wd=$(pwd)

## Set program paths
augustus_bin=/gscratch/srlab/programs/Augustus-3.3.2/bin
augustus_scripts=/gscratch/srlab/programs/Augustus-3.3.2/scripts
blast_dir=/gscratch/srlab/programs/ncbi-blast-2.8.1+/bin/
busco=/gscratch/srlab/programs/busco-v3/scripts/run_BUSCO.py
hmm_dir=/gscratch/srlab/programs/hmmer-3.2.1/src/

## Augustus configs
augustus_dir=${wd}/augustus
augustus_config_dir=${augustus_dir}/config
augustus_orig_config_dir=/gscratch/srlab/programs/Augustus-3.3.2/config

## BUSCO configs
busco_config_default=/gscratch/srlab/programs/busco-v3/config/config.ini.default
busco_config_ini=${wd}/config.ini

# Export BUSCO config file location
export BUSCO_CONFIG_FILE="${busco_config_ini}"

# Export Augustus variables
export PATH="${augustus_bin}:$PATH"
export PATH="${augustus_scripts}:$PATH"
export AUGUSTUS_CONFIG_PATH="${augustus_config_dir}"

# Copy BUSCO config file
cp ${busco_config_default} "${busco_config_ini}"

# Make Augustus directory if it doesn't exist
if [ ! -d "${augustus_dir}" ]; then
  mkdir --parents "${augustus_dir}"
fi

# Copy Augustus config directory
cp --preserve -r ${augustus_orig_config_dir} "${augustus_dir}"

# Edit BUSCO config file
## Set paths to various programs
### The use of the % symbol sets the delimiter sed uses for arguments.
### Normally, the delimiter that most examples use is a slash "/".
### But, we need to expand the variables into a full path with slashes, which screws up sed.
### Thus, the use of % symbol instead (it could be any character that is NOT present in the expanded variable; doesn't have to be "%").
sed -i "/^;cpu/ s/1/${threads}/" "${busco_config_ini}"
sed -i "/^tblastn_path/ s%tblastn_path = /usr/bin/%path = ${blast_dir}%" "${busco_config_ini}"
sed -i "/^makeblastdb_path/ s%makeblastdb_path = /usr/bin/%path = ${blast_dir}%" "${busco_config_ini}"
sed -i "/^augustus_path/ s%augustus_path = /home/osboxes/BUSCOVM/augustus/augustus-3.2.2/bin/%path = ${augustus_bin}%" "${busco_config_ini}"
sed -i "/^etraining_path/ s%etraining_path = /home/osboxes/BUSCOVM/augustus/augustus-3.2.2/bin/%path = ${augustus_bin}%" "${busco_config_ini}"
sed -i "/^gff2gbSmallDNA_path/ s%gff2gbSmallDNA_path = /home/osboxes/BUSCOVM/augustus/augustus-3.2.2/scripts/%path = ${augustus_scripts}%" "${busco_config_ini}"
sed -i "/^new_species_path/ s%new_species_path = /home/osboxes/BUSCOVM/augustus/augustus-3.2.2/scripts/%path = ${augustus_scripts}%" "${busco_config_ini}"
sed -i "/^optimize_augustus_path/ s%optimize_augustus_path = /home/osboxes/BUSCOVM/augustus/augustus-3.2.2/scripts/%path = ${augustus_scripts}%" "${busco_config_ini}"
sed -i "/^hmmsearch_path/ s%hmmsearch_path = /home/osboxes/BUSCOVM/hmmer/hmmer-3.1b2-linux-intel-ia32/binaries/%path = ${hmm_dir}%" "${busco_config_ini}"

# Run BUSCO/Augustus training
${busco} \
--in ${transcriptome_fasta} \
--out ${base_name} \
--lineage_path ${busco_db} \
--mode transcriptome \
--cpu ${threads} \
--long \
--species ${augustus_species} \
--tarzip \
--augustus_parameters='--progress=true'
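The % delimiter trick described in the script's comments is easy to demo in isolation; a throwaway sketch (the path is one of the script's program paths, reused here just for illustration):

```shell
# The replacement string contains "/" characters, so "/" can't be the
# sed delimiter; "%" (or any character absent from the expansion) works
blast_dir="/gscratch/srlab/programs/ncbi-blast-2.8.1+/bin/"
echo "tblastn_path = /usr/bin/" | sed "s%tblastn_path = /usr/bin/%path = ${blast_dir}%"
```

The same substitution with "/" as the delimiter would fail with an "unknown option to `s'" error, because sed would read the slashes inside the expanded path as the end of the pattern.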

Sam’s Notebook: Transcriptome Assembly – Hematodinium with MEGAN6 Taxonomy-specific Reads with Trinity on Mox

Ran a de novo assembly using the extracted reads classified under Alveolata from:

The assembly was performed with Trinity on Mox. It’s important to note that this assembly was not performed using the “stranded” option in Trinity. The previous Trinity assembly from 20200122 was performed using the “stranded” setting. The reason for this difference is that the most recent RNAseq libraries from 20200318 were not stranded libraries. As such, I think it might be best to use the “lowest common denominator” approach.

SBATCH script (GitHub):

#!/bin/bash
## Job Name
#SBATCH --job-name=trinity_hemat
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=10-00:00:00
## Memory per node
#SBATCH --mem=120G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --chdir=/gscratch/scrubbed/samwhite/outputs/20200330_hemat_trinity_megan_RNAseq

# Exit script if a command fails
set -e

# Load Python Mox module for Python module availability
module load intel-python3_2017

# Document programs in PATH (primarily for program version ID)
{
date
echo ""
echo "System PATH for $SLURM_JOB_ID"
echo ""
printf "%0.s-" {1..10}
echo "${PATH}" | tr : \\n
} >> system_path.log

# User-defined variables
reads_dir=/gscratch/srlab/sam/data/Hematodinium/RNAseq
threads=27
assembly_stats=assembly_stats.txt
timestamp=$(date +%Y%m%d)
fasta_name="${timestamp}.hemat.megan.Trinity.fasta"

# Paths to programs
trinity_dir="/gscratch/srlab/programs/trinityrnaseq-v2.9.0"
samtools="/gscratch/srlab/programs/samtools-1.10/samtools"

## Initialize arrays
R1_array=()
R2_array=()

# Variables for R1/R2 lists
R1_list=""
R2_list=""

# Create array of fastq R1 files
R1_array=(${reads_dir}/*_R1.fq)

# Create array of fastq R2 files
R2_array=(${reads_dir}/*_R2.fq)

# Create list of fastq files used in analysis
## Uses parameter substitution to strip leading path from filename
for fastq in ${reads_dir}/*.fq
do
  echo "${fastq##*/}" >> fastq.list.txt
done

# Create comma-separated lists of FastQ reads
R1_list=$(echo "${R1_array[@]}" | tr " " ",")
R2_list=$(echo "${R2_array[@]}" | tr " " ",")

# Run Trinity WITHOUT the "stranded" setting (--SS_lib_type),
# since not all of these libraries are stranded
${trinity_dir}/Trinity \
--seqType fq \
--max_memory 120G \
--CPU ${threads} \
--left "${R1_list}" \
--right "${R2_list}"

# Rename generic assembly FastA
mv trinity_out_dir/Trinity.fasta trinity_out_dir/${fasta_name}

# Assembly stats
${trinity_dir}/util/TrinityStats.pl trinity_out_dir/${fasta_name} \
> ${assembly_stats}

# Create gene map files
${trinity_dir}/util/support_scripts/get_Trinity_gene_to_trans_map.pl \
trinity_out_dir/${fasta_name} \
> trinity_out_dir/${fasta_name}.gene_trans_map

# Create FastA index
${samtools} faidx \
trinity_out_dir/${fasta_name}
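The array-to-comma-list step that feeds Trinity's --left/--right arguments can be sanity-checked on its own; a small sketch with made-up filenames:

```shell
# Fabricated filenames standing in for the real FastQ files
R1_array=(sample_A_R1.fq sample_B_R1.fq)
R2_array=(sample_A_R2.fq sample_B_R2.fq)

# "${array[@]}" expands space-separated; tr swaps the spaces for the
# commas Trinity expects in --left/--right
R1_list=$(echo "${R1_array[@]}" | tr " " ",")
R2_list=$(echo "${R2_array[@]}" | tr " " ",")
echo "${R1_list}"
echo "${R2_list}"
```

Note this approach assumes no spaces in the file paths themselves, which holds for the Mox directory layout used here.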

Sam’s Notebook: RNAseq Reads Extractions – C.bairdi Taxonomic Reads Extractions with MEGAN6 on swoose

I previously annotated reads and converted them to the MEGAN6 format RMA6 on 20200318.

I’ll use the MEGAN6 GUI to “Open” the RMA6 file. Once the file loads, you get a nice looking taxonomic tree! From here, you can select any part of the taxonomic tree by right-clicking on the desired taxonomy and “Extract reads…”. Here, you have the option to include “Summarized reads”. This option allows you to extract just the reads that are part of the exact classification you’ve selected or all those within and “below” the classification you’ve selected (i.e. summarized reads).

Extracted reads will be generated as FastA files.

Example:

If you select Arthropoda and do not check the box for “Summarized Reads” you will only get reads classified as Arthropoda! You will not get any reads with more specific taxonomies. However, if you select Arthropoda and you do check the box for “Summarized Reads”, you will get all reads classified as Arthropoda AND all reads in more specific taxonomic classifications, down to the species level.

I will extract reads from two phyla:

  • Arthropoda (for crabs)
  • Alveolata (for Hematodinium)

After read extractions using MEGAN6, I’ll need to extract the actual reads from the trimmed FastQ files. This will actually entail extracting all trimmed reads from two different sets of RNAseq:

It’s a bit convoluted, but I realized that the FastA headers were incomplete and did not distinguish between paired reads. Here’s an example:

R1 FastQ header:

@A00147:37:HG2WLDMXX:1:1101:5303:1000 1:N:0:AGGCGAAG+AGGCGAAG

R2 FastQ header:

@A00147:37:HG2WLDMXX:1:1101:5303:1000 2:N:0:AGGCGAAG+AGGCGAAG

However, the reads extracted via MEGAN have FastA headers like this:

>A00147:37:HG2WLDMXX:1:1101:5303:1000 SEQUENCE1
>A00147:37:HG2WLDMXX:1:1101:5303:1000 SEQUENCE2

Those are a set of paired reads, but there’s no way to distinguish between R1/R2. This may not be an issue, but I’m not sure how downstream programs (e.g. Trinity) will handle duplicate FastA IDs as inputs. To avoid any headaches, I’ve decided to parse out the corresponding FastQ reads, which have the full header info.

Here’s a brief rundown of the approach:

  1. Create list of unique read headers from MEGAN6 FastA files.
  2. Use list with seqtk program to pull out corresponding FastQ reads from the trimmed FastQ R1 and R2 files.
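A sketch of those two steps in shell. The FastA content and filenames are placeholders, and the seqtk step is guarded since it only makes sense where seqtk and the trimmed FastQ files are actually present:

```shell
# Step 1: unique read IDs from a MEGAN FastA (drop ">" and any description).
# A fabricated two-record FastA standing in for the MEGAN output:
printf '>A00147:37:HG2WLDMXX:1:1101:5303:1000 SEQUENCE1\nACGT\n>A00147:37:HG2WLDMXX:1:1101:5303:1000 SEQUENCE2\nTTGG\n' > megan_reads.fasta

# Strip the leading ">", keep only the read ID field, de-duplicate
grep '^>' megan_reads.fasta | awk 'sub(/^>/, "") {print $1}' | sort -u > headers.txt
cat headers.txt

# Step 2: pull the matching records (with full 1:N:0/2:N:0 headers) from
# the trimmed FastQ R1/R2 files
if command -v seqtk >/dev/null && [ -f trimmed_R1.fq ] && [ -f trimmed_R2.fq ]; then
    seqtk subseq trimmed_R1.fq headers.txt > extracted_R1.fq
    seqtk subseq trimmed_R2.fq headers.txt > extracted_R2.fq
fi
```

Because the R1 and R2 records share the same ID, one header list pulls both mates, and the extracted FastQ headers retain the 1/2 member field that the MEGAN FastA dropped.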

This aspect of read extractions/concatenations is documented in the following Jupyter notebook (GitHub):

Kaitlyn’s notebook: sex/stage plots on primers

Primers by sex & stage:

sex-stage-cqmean

Female Cq.Means by Stage:

female-cqmean

Male Cq.Means by Stage:

male-cqmean

 

Sam’s Notebook: Trimming/FastQC/MultiQC – C.bairdi RNAseq FastQ with fastp on Mox

After receiving our RNAseq data from Genewiz earlier today, I needed to run FastQC on the raw reads, trim them with fastp, and then check the trimmed reads with FastQC.

FastQC on raw reads was run locally and files were kept on owl/nightingales/C_bairdi.

fastp trimming was run on Mox, followed by MultiQC.

FastQC on trimmed reads was run locally, followed by MultiQC.

SBATCH script (GitHub):

#!/bin/bash
## Job Name
#SBATCH --job-name=cbai_fastp_trimming_RNAseq
## Allocation Definition
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=10-00:00:00
## Memory per node
#SBATCH --mem=120G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --chdir=/gscratch/scrubbed/samwhite/outputs/20200318_cbai_RNAseq_fastp_trimming

### C.bairdi RNAseq trimming using fastp.

# Exit script if any command fails
set -e

# Load Python Mox module for Python module availability
module load intel-python3_2017

# Document programs in PATH (primarily for program version ID)
{
date
echo ""
echo "System PATH for $SLURM_JOB_ID"
echo ""
printf "%0.s-" {1..10}
echo "${PATH}" | tr : \\n
} >> system_path.log

# Set number of CPUs to use
threads=27

# Input/output files
trimmed_checksums=trimmed_fastq_checksums.md5
raw_reads_dir=/gscratch/scrubbed/samwhite/data/C_bairdi/RNAseq/

# Paths to programs
fastp=/gscratch/srlab/programs/fastp-0.20.0/fastp
multiqc=/gscratch/srlab/programs/anaconda3/bin/multiqc

## Initialize arrays
fastq_array_R1=()
fastq_array_R2=()
programs_array=()
R1_names_array=()
R2_names_array=()

# Programs array
programs_array=("${fastp}" "${multiqc}")

# Capture program options
for program in "${!programs_array[@]}"
do
{
echo "Program options for ${programs_array[program]}: "
echo ""
${programs_array[program]} -h
echo ""
echo ""
echo "
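The script as captured above cuts off before the actual fastp call. For orientation only, here is a hedged sketch of what a single paired-end fastp (v0.20.0) invocation looks like with its standard flags; the sample names and report paths are placeholders, not the ones from the job:

```shell
# Placeholder sample names; the real script loops over arrays of files
fastp=fastp   # on Mox this was /gscratch/srlab/programs/fastp-0.20.0/fastp
r1=sample_R1.fastq.gz
r2=sample_R2.fastq.gz

# -i/-I: paired inputs; -o/-O: trimmed outputs; --thread: CPUs;
# --html/--json: per-sample report files
cmd="${fastp} -i ${r1} -I ${r2} -o trimmed.${r1} -O trimmed.${r2} --thread 27 --html sample.fastp.html --json sample.fastp.json"
echo "${cmd}"

# Only execute when fastp and the input files actually exist
if command -v "${fastp}" >/dev/null && [ -f "${r1}" ] && [ -f "${r2}" ]; then
    ${cmd}
fi
```

The per-sample JSON reports are what MultiQC later aggregates into a single summary.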

Kaitlyn’s notebook: boxplots on sex & stage for geoduck primers

All primer pairs by sex and stage:

primers-Cqmean

Cq. mean values based on sex for each primer pair tested on either just pooled or both pooled and known samples. Pooled samples are a combination of males and females across all stages.

stage-Cqmean

Cq. mean values based on reproductive stage for each primer pair tested on either just pooled or both pooled and known samples. Pooled samples are a combination of males and females across all stages.


Primers by Stage:


Primers by Sex:

sex-Cqmean

Primers tested on known samples and reported Cq.mean values.

sex-peakheight-zeros

A value of 0 was plotted for values not reported.

sex-melttemp-zeros

A value of 0 was plotted for values not reported.

Sam’s Notebook: Data Received – C.bairdi RNAseq Data from Genewiz

We received the RNAseq data from the RNA that was sent out by Grace on 20200212.

Sequencing is 150bp PE.

Grace has a Google Sheet that describes what the samples constitute (e.g. ambient/cold/warm, infected/uninfected, day, etc.).

Genewiz report:

Project Sample ID Barcode Sequence # Reads Yield (Mbases) Mean Quality Score % Bases >= Q30
30-343338329 72 ACTCGCTA+TCGACTAG 27,249,335 8,175 34.16 85.82
30-343338329 73 ACTCGCTA+TTCTAGCT 25,856,008 7,757 33.87 84.36
30-343338329 113 ACTCGCTA+CCTAGAGT 31,638,462 9,492 32.38 77.77
30-343338329 118 ACTCGCTA+GCGTAAGA 29,253,455 8,776 33.50 82.62
30-343338329 127 ACTCGCTA+CTATTAAG 27,552,329 8,266 33.14 81.13
30-343338329 132 ACTCGCTA+AAGGCTAT 27,518,702 8,256 34.86 88.87
30-343338329 151 ACTCGCTA+GAGCCTTA 33,430,314 10,029 35.01 89.35
30-343338329 173 ACTCGCTA+TTATGCGA 33,262,459 9,979 34.45 87.06
30-343338329 178 GGAGCTAC+TCGACTAG 29,495,389 8,849 35.01 89.62
30-343338329 221 GGAGCTAC+TTCTAGCT 25,902,415 7,771 34.76 88.40
30-343338329 222 GGAGCTAC+CCTAGAGT 53,808,137 16,142 30.90 71.11
30-343338329 254 GGAGCTAC+GCGTAAGA 16,771,613 5,031 35.14 90.03
30-343338329 272 GGAGCTAC+CTATTAAG 27,818,893 8,346 33.30 81.70
30-343338329 280 GGAGCTAC+AAGGCTAT 61,008,799 18,303 30.85 70.86
30-343338329 294 GGAGCTAC+GAGCCTTA 28,539,233 8,562 35.12 90.04
30-343338329 334 GGAGCTAC+TTATGCGA 25,916,895 7,775 34.98 89.39
30-343338329 349 GCGTAGTA+TCGACTAG 32,868,756 9,861 33.53 82.69
30-343338329 359 GCGTAGTA+TTCTAGCT 27,274,149 8,182 34.96 89.20
30-343338329 425 GCGTAGTA+CCTAGAGT 66,224,932 19,867 29.54 65.13
30-343338329 427 GCGTAGTA+GCGTAAGA 18,918,640 5,676 33.31 80.87
30-343338329 445 GCGTAGTA+CTATTAAG 30,745,388 9,224 33.07 80.83
30-343338329 463 GCGTAGTA+AAGGCTAT 19,531,145 5,859 34.27 86.08
30-343338329 481 GCGTAGTA+GAGCCTTA 50,592,084 15,178 31.92 75.59
30-343338329 485 GCGTAGTA+TTATGCGA 26,010,208 7,803 34.63 87.48

Confirmed that SFTP transfer from Genewiz to owl/nightingales/C_bairdi/ was successful:

screencap of md5sum output

Shelly’s Notebook: Mon. Mar. 16, 2020 Trimming Geoduck RRBS data

This entry is about trimming for the 2016 juvenile geoduck RRBS data Hollie generated.

Multi-core TrimGalore!

TrimGalore! can be run with multi-core settings if you use version 0.6.1 or newer. Reference to the multi-core TrimGalore! update is here: https://github.com/FelixKrueger/TrimGalore/pull/39. Using 8 cores reduced the run time for 100M reads from ~2hr10min to ~30min.

Trimming history of RRBS data

Illumina Recommended Trimming:

  • I spoke with Dina from Illumina tech support today and she found trimming recommendations for the Illumina Truseq Methylation Kit here on page 45: Illumina adapter sequences reference
    • R1 Adapter: AGATCGGAAGAGCACACGTCTGAAC
    • R2 Adapter: AGATCGGAAGAGCGTCGTGTAGGGA
      • The first 13 bases of each adapter correspond to the universal Illumina adapter sequence you can specify in TrimGalore! (AGATCGGAAGAGC)
      • There are an additional 12 bp added on by the Illumina TruSeq Methylation Kit that are recommended to be trimmed off
  • Bismark User Guide sections TruSeq DNA-Methylation Kit (formerly EpiGnome) and Random priming and 3’ Trimming in general
  • Illumina Truseq Methylation Kit workflow
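Putting the recommendations above together, a hypothetical TrimGalore! command might look like the sketch below. The --cores value comes from the multi-core note earlier, the adapter is the 13 bp universal sequence, and the 12 bp clip values are my reading of the kit recommendation, not a validated setting:

```shell
# The universal adapter called out above: the first 13 bases of both
# kit adapter sequences
adapter="AGATCGGAAGAGC"

# Hypothetical invocation; guarded so the sketch is harmless where
# trim_galore or the input files are absent
if command -v trim_galore >/dev/null && [ -f EPI-167_S10_L002_R1_001.fastq.gz ]; then
    trim_galore --paired --cores 8 \
        --adapter "${adapter}" --adapter2 "${adapter}" \
        --clip_r1 12 --clip_r2 12 \
        EPI-167_S10_L002_R1_001.fastq.gz EPI-167_S10_L002_R2_001.fastq.gz
fi
echo "${adapter}"
```

Whether the extra kit bases should be clipped from the 5' end (--clip_r1/--clip_r2) or after the adapter on the 3' side is worth confirming against the Bismark guide sections referenced above before running on the full dataset.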

Testing Recommended Trimming Parameters

TrimGalore! with new parameters

I performed a test on just one sample: EPI-167

(base) [strigg@mox2 raw]$ wget --no-check-certificate https://owl.fish.washington.edu/nightingales/P_generosa/EPI-167_S10_L002_R1_001.fastq.gz
--2020-03-16 20:29:13--  https://owl.fish.washington.edu/nightingales/P_generosa/EPI-167_S10_L002_R1_001.fastq.gz
Resolving owl.fish.washington.edu (owl.fish.washington.edu)... 128.95.149.83
Connecting to owl.fish.washington.edu (owl.fish.washington.edu)|128.95.149.83|:443... connected.
WARNING: cannot verify owl.fish.washington.edu's certificate, issued by ‘/C=US/ST=MI/L=Ann Arbor/O=Internet2/OU=InCommon/CN=InCommon RSA Server CA’:
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 1451174652 (1.4G) [application/x-gzip]
Saving to: ‘EPI-167_S10_L002_R1_001.fastq.gz’

100%[=============================================================================================>] 1,451,174,652 27.6MB/s   in 51s

2020-03-16 20:30:04 (26.9 MB/s) - ‘EPI-167_S10_L002_R1_001.fastq.gz’ saved [1451174652/1451174652]

(base) [strigg@mox2 raw]$ wget --no-check-certificate https://owl.fish.washington.edu/nightingales/P_generosa/EPI-167_S10_L002_R2_001.fastq.gz
--2020-03-16 20:30:08--  https://owl.fish.washington.edu/nightingales/P_generosa/EPI-167_S10_L002_R2_001.fastq.gz
Resolving owl.fish.washington.edu (owl.fish.washington.edu)... 128.95.149.83
Connecting to owl.fish.washington.edu (owl.fish.washington.edu)|128.95.149.83|:443... connected.
WARNING: cannot verify owl.fish.washington.edu's certificate, issued by ‘/C=US/ST=MI/L=Ann Arbor/O=Internet2/OU=InCommon/CN=InCommon RSA Server CA’:
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 1496018906 (1.4G) [application/x-gzip]
Saving to: ‘EPI-167_S10_L002_R2_001.fastq.gz’

100%[=============================================================================================>] 1,496,018,906 27.2MB/s   in 53s

2020-03-16 20:31:01 (27.1 MB/s) - ‘EPI-167_S10_L002_R2_001.fastq.gz’ saved [1496018906/1496018906]

Alignments with new trimming

  • ran this script 20200316_BmrkAln_EpiTest2.sh
  • NEXT STEPS:
    • check Mbias plots in report
    • check percent methylation
      • previous
      • new trimming
        Metric                              03/16/2020 trim   05/16/2018 trim
        Read pairs analyzed                 23,436,512        24,481,250
        Mapping efficiency (%)              40.9              42.6
        Ambiguously mapped read pairs (%)   11.8              8.2
        Unaligned read pairs (%)            47.3              49.2
        mC in CpG (%)                       25.3              27.9
        mC in CHG (%)                       1.7               2.9
        mC in CHH (%)                       2.7               3.0
        mC in CN or CHN (%)                 4.9               8.5
    • determine if deduplicating should be done
      • previous report showed 26.85% duplicate alignments were removed. NOTE: previous alignments were done using genome v074. Although there shouldn’t be a difference between this genome and the one on OFS (Panopea-generosa-v1.0.fa), I am currently performing alignments of the 5/16/19 trimmed reads and the 9/23/19 trimmed reads for EPI-167.

from shellytrigg https://ift.tt/33rFOxp
via IFTTT

Kaitlyn’s notebook: Primers next steps/goals

  1. Look at product and compare to theoretical size
    • Must do visually (order based on sequence)
  2. Make plots for each primer w/ pooled
    • Cq.mean
    • Melt temp
    • Melt peak height
  3. Analyze Cq values
    • Sex or development differences (ANOVA?)
      • mean, melt temp and melt peak height by sex
        • And then by dev. stage
  4. Make summary of performance of each primer
    • Via a table (rank performances of each primer w/ grade & notes on grade [3 columns])
      • With pooled sample
      • And known samples