Ronit’s Notebook: qPCR Prep/Spreadsheet Organization

In preparation for the RNA extraction from the heat stress C. gigas samples, Sam and I went over some sample organization/prep. I updated my spreadsheet, which was linked in my previous notebook entry, so that it now has columns showing treatment, ploidy, and tissue type. There are now also entries for replicate samples (e.g. D01-C2), so that every sample tube has a corresponding entry in the spreadsheet.

I then chose 2 samples from each treatment to prep for RNA extraction and subsequent qPCR. As there were 8 distinct treatment groups, we pulled a total of 16 samples:

D01, D02 — Diploid oysters exposed to control conditions (water in aquarium)

D09, D10 — Diploid oysters exposed to control conditions (water in aquarium); subsequently exposed to 1 hr acute heat shock at 45 degrees Celsius.

D11, D12 — Diploid oysters exposed to desiccation + elevated temperature (27 degrees Celsius) for 24 hrs

D19, D20 — Diploid oysters exposed to desiccation + elevated temperature (27 degrees Celsius) for 24 hrs; subsequently exposed to 1 hr acute heat shock at 45 degrees Celsius.

T01, T02 — Triploid oysters exposed to control conditions (water in aquarium)

T09, T10 — Triploid oysters exposed to control conditions (water in aquarium); subsequently exposed to 1 hr acute heat shock at 45 degrees Celsius.

T11, T12 — Triploid oysters exposed to desiccation + elevated temperature (27 degrees Celsius) for 24 hrs

T19, T20 — Triploid oysters exposed to desiccation + elevated temperature (27 degrees Celsius) for 24 hrs; subsequently exposed to 1 hr acute heat shock at 45 degrees Celsius.
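The 16 samples listed above can be written out as a small sample sheet. This is only a sketch: the column names are illustrative assumptions (the real spreadsheet headers are not shown in this entry), and tissue type is left out because it is not stated here.

```shell
#!/bin/bash
# Print one CSV row per pulled sample.
# Column names are illustrative assumptions, not the real spreadsheet headers.
sample_sheet() {
  local prefix ploidy
  echo "sample,ploidy,pretreatment,heat_shock"
  for prefix in D T; do
    if [ "$prefix" = "D" ]; then ploidy=diploid; else ploidy=triploid; fi
    # Sample numbers and treatments follow the list above.
    echo "${prefix}01,${ploidy},control,no"
    echo "${prefix}02,${ploidy},control,no"
    echo "${prefix}09,${ploidy},control,yes"
    echo "${prefix}10,${ploidy},control,yes"
    echo "${prefix}11,${ploidy},desiccation_27C_24h,no"
    echo "${prefix}12,${ploidy},desiccation_27C_24h,no"
    echo "${prefix}19,${ploidy},desiccation_27C_24h,yes"
    echo "${prefix}20,${ploidy},desiccation_27C_24h,yes"
  done
}

# Count distinct treatment groups by dropping the header and the sample column.
sample_sheet | tail -n +2 | cut -d, -f2- | sort -u | wc -l
```

The final pipeline is a quick check that the sheet really contains 8 distinct groups with 2 samples each.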

These tubes were pulled from the -80 freezer and kept in a new, unlabeled box (which is also in the -80 freezer and has been updated in the -80 Inventory Map). We then made sure that we had all the materials on hand for the RNA extraction next week.

Sam’s Notebook: Transcriptome Alignment & Bedgraph – Olympia oyster transcriptome with Olurida_v080 genome assembly

0000-0002-2747-368X

Yesterday, I produced a bedgraph file of our Olympia oyster RNAseq data coverage using our Olurida_v081 genome.

I decided that I wanted to use the Olurida_v080 version instead (or, in addition to?), as the Olurida_v080 version has not been size restricted (the Olurida_v081 version contains only contigs >1000bp). I feel like we could miss some important regions, so I wanted to run this analysis using all of the genome data we currently have available. Additionally, this will be consistent with my previous Bismark (DNA methylation) analysis.
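To illustrate what a size restriction like the one in Olurida_v081 does, here is a minimal awk sketch that keeps only FASTA records longer than a threshold. The function name filter_fasta_min_length is hypothetical, and this is not necessarily how the v081 assembly was actually produced.

```shell
#!/bin/bash
# Keep only FASTA records whose sequence is longer than a threshold.
# $1 = FASTA file, $2 = length threshold in bp (e.g. 1000).
# Note: sequences are written unwrapped, on a single line per record.
filter_fasta_min_length() {
  awk -v min="$2" '
    # Print the buffered record if it passes the length filter.
    function emit() { if (header != "" && length(seq) > min) print header "\n" seq }
    /^>/ { emit(); header = $0; seq = ""; next }
    { seq = seq $0 }
    END { emit() }
  ' "$1"
}
```

Running something like `filter_fasta_min_length Olurida_v080.fa 1000 | grep -c '>'` and comparing against `grep -c '>' Olurida_v080.fa` would show how many contigs the restriction drops.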

Used HISAT2 on our HPC Mox node to align our RNAseq reads to our Olurida_v080 genome assembly:

SBATCH script file:

NOTE: For brevity's sake, I have _not_ listed all of the input RNAseq files below. Please see the full script, which is linked above.

  #!/bin/bash
  ## Job Name
  #SBATCH --job-name=20180926_oly_hisat2
  ## Allocation Definition
  #SBATCH --account=srlab
  #SBATCH --partition=srlab
  ## Resources
  ## Nodes
  #SBATCH --nodes=1
  ## Walltime (days-hours:minutes:seconds format)
  #SBATCH --time=5-00:00:00
  ## Memory per node
  #SBATCH --mem=500G
  ##turn on e-mail notification
  #SBATCH --mail-type=ALL
  #SBATCH --mail-user=samwhite@uw.edu
  ## Specify the working directory for this job
  #SBATCH --workdir=/gscratch/scrubbed/samwhite/20180926_oly_RNAseq_genome_hisat2_bedgraph

  # Load Python Mox module for Python module availability
  module load intel-python3_2017

  # Document programs in PATH (primarily for program version ID)
  date >> system_path.log
  echo "" >> system_path.log
  echo "System PATH for $SLURM_JOB_ID" >> system_path.log
  echo "" >> system_path.log
  printf "%0.s-" {1..10} >> system_path.log
  echo ${PATH} | tr : \\n >> system_path.log

  # Set genome assembly path
  oly_genome_path=/gscratch/srlab/sam/data/O_lurida/oly_genome_assemblies

  # Set sorted transcriptome assembly bam file
  oly_transcriptome_bam=20180926_Olurida_v080.sorted.bam

  # Set hisat2 basename
  hisat2_basename=Olurida_v080

  # Set program paths
  ## hisat2
  hisat2=/gscratch/srlab/programs/hisat2-2.1.0
  ## bedtools
  bedtools=/gscratch/srlab/programs/bedtools-2.27.1/bin
  ## samtools
  stools=/gscratch/srlab/programs/samtools-1.9/samtools

  # Build hisat2 genome index
  ${hisat2}/hisat2-build \
  -f ${oly_genome_path}/Olurida_v080.fa \
  Olurida_v080 \
  -p 28

  # Align reads to oly genome assembly
  ## The -1 and -2 input file lists are omitted here for brevity.
  ${hisat2}/hisat2 \
  --threads 28 \
  -x "${hisat2_basename}" \
  -q \
  -1 \
  -2 \
  -S 20180926_"${hisat2_basename}".sam

  # Convert SAM file to BAM
  "${stools}" view \
  --threads 28 \
  -b 20180926_"${hisat2_basename}".sam > 20180926_"${hisat2_basename}".bam

  # Sort BAM
  "${stools}" sort \
  --threads 28 \
  20180926_"${hisat2_basename}".bam \
  -o 20180926_"${hisat2_basename}".sorted.bam

  # Index for use in IGV
  ##-@ specifies thread count; --thread option not available in samtools index
  "${stools}" index \
  -@ 28 \
  20180926_"${hisat2_basename}".sorted.bam

  # Create bedgraph
  ## Reports depth at each position (-bg in bedgraph format) and report regions with zero coverage (-a).
  ## Screens for portions of reads coming from exons (-split).
  ## Add genome browser track line to header of bedgraph file.
  ${bedtools}/genomeCoverageBed \
  -ibam ${oly_transcriptome_bam} \
  -bga \
  -split \
  -trackline \
  > 20180926_oly_RNAseq.bedgraph

The script performs the following functions:

  • Genome indexing
  • RNAseq alignment to genome
  • Convert SAM to BAM
  • Sort and index BAM
  • Determine RNAseq coverage

Yaamini’s Notebook: Gigas Broodstock DNA Extraction Part 7

Extraction plan for actual samples

Kaitlyn ran DNA samples from protocol test 3 and test 4 on the Bioanalyzer. You can see her results in this notebook post. Based on her findings, I need to use a TissueTearor at setting 1 for 10 seconds to lyse the samples before incubation. I will also remove RNA from the DNA samples. The official protocol can be found here.

I have 10 gonad samples per pH treatment (ambient or low). Because 20 samples is a lot to work with at once, I’ll do 2 days of DNA extractions with 10 samples each. I randomly selected which ambient and low pH samples to extract in each batch. The IDs correspond to names on histology photos in this folder. Below are diagrams for each histology block:

Figures 1-5 (gigas1 through gigas5). Location of each tissue sample on histology blocks.

Batch 1: I’ll extract DNA from this batch tomorrow.

Low pH:

  • 6-T1
  • 9-T2
  • 2-T1
  • 5-T3
  • 4-T3

Ambient pH:

  • UK-07
  • 12-T6
  • UK-04
  • UK-03
  • UK-01

Batch 2: I’ll extract DNA from this batch on Tuesday.

Low pH:

  • 1-T3
  • 7-T2
  • 8-T2
  • 10-T3
  • 3-T1

Ambient pH:

  • UK-06
  • 11-T4
  • UK-05
  • UK-08
  • UK-02
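For reference, a random split like the one above can be reproduced on the command line. This is a sketch only, and not necessarily how the batches were actually drawn: split_batches is a hypothetical helper, and it relies on shuf from GNU coreutils.

```shell
#!/bin/bash
# Randomly split a list of sample IDs into two equal batches.
# split_batches is a hypothetical helper; it expects an even number of IDs.
split_batches() {
  local n=$(( $# / 2 ))
  local shuffled
  # Shuffle the IDs, one per line.
  shuffled=$(printf '%s\n' "$@" | shuf)
  echo "Batch 1: $(echo "$shuffled" | head -n "$n" | paste -sd' ' -)"
  echo "Batch 2: $(echo "$shuffled" | tail -n +"$((n + 1))" | paste -sd' ' -)"
}

# Low pH sample IDs from the lists above; the ambient IDs would be split the same way.
split_batches 1-T3 2-T1 3-T1 4-T3 5-T3 6-T1 7-T2 8-T2 9-T2 10-T3
```

Splitting each pH treatment separately keeps 5 low pH and 5 ambient samples in every batch, so treatment is not confounded with extraction day.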


Kaitlyn’s notebook: Bioanalyzer results

I ran Yaamini’s samples on the Bioanalyzer and got results for the protocol tests of broodstock C. gigas DNA extraction from histology blocks.

 

Sam’s Notebook: Bedgraph – Olympia oyster transcriptome with Olurida_v081 genome assembly


I took the sorted BAM file from yesterday’s corrected RNAseq genome alignment and converted it to a bedgraph using BEDTools genomeCoverageBed tool.

Analysis took place on our HPC Mox node.

SBATCH script file:

  #!/bin/bash
  ## Job Name
  #SBATCH --job-name=20180926_oly_bedgraphs
  ## Allocation Definition
  #SBATCH --account=srlab
  #SBATCH --partition=srlab
  ## Resources
  ## Nodes
  #SBATCH --nodes=1
  ## Walltime (days-hours:minutes:seconds format)
  #SBATCH --time=5-00:00:00
  ## Memory per node
  #SBATCH --mem=500G
  ##turn on e-mail notification
  #SBATCH --mail-type=ALL
  #SBATCH --mail-user=samwhite@uw.edu
  ## Specify the working directory for this job
  #SBATCH --workdir=/gscratch/scrubbed/samwhite/20180926_oly_RNAseq_bedgraphs

  # Load Python Mox module for Python module availability
  module load intel-python3_2017

  # Document programs in PATH (primarily for program version ID)
  date >> system_path.log
  echo "" >> system_path.log
  echo "System PATH for $SLURM_JOB_ID" >> system_path.log
  echo "" >> system_path.log
  printf "%0.s-" {1..10} >> system_path.log
  echo ${PATH} | tr : \\n >> system_path.log

  # Set sorted transcriptome assembly bam file
  oly_transcriptome_bam=/gscratch/scrubbed/samwhite/20180925_oly_RNAseq_genome_hisat2/20180925_Olurida_v081.sorted.bam

  # Set program paths
  bedtools=/gscratch/srlab/programs/bedtools-2.27.1/bin
  samtools=/gscratch/srlab/programs/samtools-1.9/samtools

  # Create bedgraph
  ## Reports depth at each position (-bg in bedgraph format) and report regions with zero coverage (-a).
  ## Screens for portions of reads coming from exons (-split).
  ## Add genome browser track line to header of bedgraph file.
  ${bedtools}/genomeCoverageBed \
  -ibam ${oly_transcriptome_bam} \
  -bga \
  -split \
  -trackline \
  > 20180926_oly_RNAseq.bedgraph

Sam’s Notebook: Transcriptome Alignment – Olympia oyster RNAseq reads aligned to genome with HISAT2


Yesterday’s attempt at producing a bedgraph was a failure and a product of a major brain fart. The worst part is that I was questioning what I was doing the entire time, but still went through with the process! Yeesh!

The problem was that I tried to take our Trinity-assembled transcriptome and somehow align that to our genome. This can’t work because neither of those assemblies knows the coordinates used by the other. So, as was the case, you end up with a bedgraph that shows zero coverage for all genome contigs.
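A quick sanity check would have caught this sooner. The sketch below reports the fraction of bases with zero coverage in a bedgraph. The function name zero_coverage_fraction is hypothetical, and it assumes the standard 4-column bedGraph layout (contig, start, end, depth), with an optional "track" header line as written by -trackline.

```shell
#!/bin/bash
# Report the fraction of bases with zero coverage in a bedgraph file.
# Assumes 4 columns per line: contig, start (0-based), end, depth.
# Header lines beginning with "track" or "#" are skipped.
zero_coverage_fraction() {
  awk '
    $1 == "track" || /^#/ { next }
    {
      span = $3 - $2          # interval length in bases
      total += span
      if ($4 == 0) zero += span
    }
    END {
      if (total == 0) { print "NA"; exit }
      printf "%.4f\n", zero / total
    }
  ' "$1"
}
```

A value of 1.0000 means every interval has zero depth, which is exactly the failure described above.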

Anyway, here’s the correct procedure!

Used HISAT2 on our HPC Mox node to align our RNAseq reads to our Olurida_v081 genome assembly:

SBATCH script files:

PERFORM GENOME INDEXING & ALIGNMENT
20180925_oly_RNAseq_genome_hisat2.sh

SORT & INDEX ALIGNMENT OUTPUT
20180925_oly_RNAseq_genome_sort_bam.sh

Kaitlyn’s notebook: ASCA in MetStaT

I downloaded the R package MetStaT to perform an ASCA. I’m currently working through how best to organize the data, because you input two dataframes. This is the example I’m working with.

I believe the first dataframe, or ‘data’, will contain only the proteins and their abundances (“variables are represented by columns, observations by rows”). Proteins will be in columns and abundances will be in rows, because observations are the measured values (e.g. abundance) and variables are what is observed/measured (e.g. proteins). I initially had this the other way around (proteins in rows, abundances in columns) and didn’t think order would be considered. I’m going to look more into this tomorrow or Thursday, since the ASCA is supposed to take time into consideration (that was the whole point), and I’ll update this.

The second dataframe, or ‘levels’, will contain the temperature data for each protein/observation (“numeric matrix describing the experimental design. Each factor is represented by a column. The elements of the columns give the treatment level the row belongs to”). There will be one column representing a single factor (temperature) and the elements will be 23 or 29, but I’m not sure how/if I can make time a factor?

I think the column would have to be “3, 5, 7, 9, 11, 13, 15” repeating, and I need to make sure that the order of proteins is the same for both dataframes so that the elements (23 or 29C) correctly match the factor (temperature). The data will only be described by temperature.

This will be okay since I will only use Silos 3 and 9. If I decide to do an ASCA between the 23C silos, then I will make the elements “silo 2” or “silo 3” for the factor.

Equation elements are specified as a string that indicates the factors to use in the ASCA. Factors are specified by column number (e.g. ="1"), interacting factors can be considered (="123"), and multiple factors can also be entered (="1,2,12").

ASCA.Calculate(data, levels, equation.elements ="")

I wrote up an issue asking for help in our GitHub.