
[sr320@mox1 jobs]$ cat 1029_1500.sh 
#!/bin/bash
## Job Name
#SBATCH --job-name=bow64
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=5-100:00:00
## Memory per node
#SBATCH --mem=100G
#SBATCH --mail-type=ALL
#SBATCH --mail-user=sr320@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/sr320/analyses/1030

source /gscratch/srlab/programs/scripts/paths.sh


for fastq in /gscratch/scrubbed/sr320/Phred64_fqs/*1.fq.gz; do
  read1_array+=("$fastq")
done

for fastq in /gscratch/scrubbed/sr320/Phred64_fqs/*2.fq.gz; do
  read2_array+=("$fastq")
done

for pair in "${!read1_array[@]}"; do
  i=${read1_array[$pair]}
  j=${read2_array[$pair]}
  filename="${i##*/}"
  no_ext="${filename%%.*}"
  /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/bowtie2 \
  -x /gscratch/srlab/sr320/data/new_genome_files/chinook_genome_masked \
  -1 "$i" \
  -2 "$j" \
  -X 2000 --sensitive  --no-mixed --phred64 --no-discordant --no-unal \
  -S /gscratch/scrubbed/sr320/cw/1017/"$no_ext"_bowtie2.sam \
  -q \
  -p 28
done

OUTPUT: /var/services/homes/charlie/out/1101
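The two read-collection loops in these scripts can be written as direct glob assignments; a minimal sketch, using a throwaway /tmp directory in place of the real Phred64_fqs path:

```shell
# Sketch only: /tmp/fqs stands in for /gscratch/scrubbed/sr320/Phred64_fqs.
# Bash globs expand in sorted order, so the *1.fq.gz and *2.fq.gz files
# pair up by index -- which is what the per-pair bowtie2 loop relies on.
mkdir -p /tmp/fqs
touch /tmp/fqs/sampleA_1.fq.gz /tmp/fqs/sampleA_2.fq.gz \
      /tmp/fqs/sampleB_1.fq.gz /tmp/fqs/sampleB_2.fq.gz
read1_array=(/tmp/fqs/*1.fq.gz)
read2_array=(/tmp/fqs/*2.fq.gz)
echo "${#read1_array[@]} read1 files, ${#read2_array[@]} read2 files"
```

Pairing by sort order only holds if every read1 file has a matching read2 file, so a count check like the echo above is a cheap sanity test before queueing the job.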

[sr320@mox1 jobs]$ cat 1029_1600.sh 
#!/bin/bash
## Job Name
#SBATCH --job-name=bow33
## Allocation Definition
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=5-100:00:00
## Memory per node
#SBATCH --mem=100G
#SBATCH --mail-type=ALL
#SBATCH --mail-user=sr320@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/sr320/analyses/1029

source /gscratch/srlab/programs/scripts/paths.sh


for fastq in /gscratch/scrubbed/sr320/Phred33_fqs/*1.fq.gz; do
  read1_array+=("$fastq")
done

for fastq in /gscratch/scrubbed/sr320/Phred33_fqs/*2.fq.gz; do
  read2_array+=("$fastq")
done

for pair in "${!read1_array[@]}"; do
  i=${read1_array[$pair]}
  j=${read2_array[$pair]}
  filename="${i##*/}"
  no_ext="${filename%%.*}"
  /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/bowtie2 \
  -x /gscratch/srlab/sr320/data/new_genome_files/chinook_genome_masked \
  -1 "$i" \
  -2 "$j" \
  -X 2000 --sensitive  --no-mixed --phred33 --no-discordant --no-unal \
  -S /gscratch/scrubbed/sr320/cw/"$no_ext"_bowtie2.sam \
  -q \
  -p 28
done

OUTPUT: /var/services/homes/charlie/out/1100


Grace’s Notebook: Trinity failed; Re-run, and Notes from Crab Meeting

Trinity

My Trinity job failed and it turns out it’s because I was missing a \ after /gscratch/srlab/programs/Trinity-v2.8.3/Trinity. I fixed that and then re-sent the job to Mox. The new script is called 20181101_Cbairdi_trinity.sh and it lives in my /gscratch/srlab/graceac9/jobs/ directory on Mox.
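That failure mode is worth a tiny illustration (a toy script, not the real Trinity command): without the trailing \, the option lines after the program name run as commands in their own right.

```shell
# Toy reproduction of the bug: "--seqType fq" was meant to continue the
# previous line, but with the "\" missing it runs as its own command.
printf 'echo Trinity\n--seqType fq\n' > /tmp/broken.sh
bash /tmp/broken.sh 2>&1 | grep -c 'command not found'   # → 1
```

The shell happily executes the first (now truncated) command, so the job can start and only fail partway through, which is exactly why the error was easy to miss.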

Here are the FastQC results that Sam ran on the RNAseq data when it came in. Looks good!

Crab Meeting #6

from Grace’s Lab Notebook https://ift.tt/2Dg75sh
via IFTTT

Yaamini’s Notebook: November 2018 Goals

October Goals Recap:

Manchester:

  • I resubmitted the paper! :tada:

Virginica:

This is how I spent most of my time.

DNR:

  • Addressed minor reviewer comments
  • Got information from Micah and Alex about outplant methods
  • Selected multivariate analysis procedure to explore environmental data and relationship between environmental data and protein abundance data

Gigas Broodstock:

November Goals

DNR:

  • Upload preprint to bioRxiv
  • Complete multivariate analyses on environmental data and protein abundance data
  • Remake figures
  • Finish manuscript revisions
  • Upload all supplementary materials to the appropriate databases
  • Organize paper repository
  • Resubmit to MEPS

Virginica:

  • Finish bismark alignment on Mox
  • Run methylKit and bedtools on alignments from Mox
  • Figure out closest for flanking analysis
  • Figure out gene enrichment methods

Gigas Broodstock:

  • Identify kits for MBD-BSseq
  • Submit samples for sequencing

Other:

  • Run my second committee meeting!
  • Draft my PhD proposal
  • Consolidate all other requirements for bypassing


from the responsible grad student https://ift.tt/2zlrHuZ
via IFTTT

Yaamini’s Notebook: DML Analysis Part 16

Mox update

I stopped the bismark job I started on Mox because I didn’t specify a path for samtools. Steven had to figure this out manually: because I was redirecting my standard error, nothing was going into my default slurm output from Mox. I edited my script to include a path to samtools and removed the error redirection, then restarted my Mox job. It’s currently deduplicating my files, but I used cat slurm-401682.out | grep "Mapping efficiency *" to look at the mapping efficiencies for the alignment.
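The grep works because bismark prints a "Mapping efficiency:" line for each sample; a self-contained sketch against a fake log (the real file is slurm-401682.out on Mox):

```shell
# Fake slurm log standing in for slurm-401682.out; the efficiency value
# here is invented for the demo.
cat > /tmp/slurm-demo.out <<'EOF'
Processing alignments...
Mapping efficiency:	42.3%
Finished.
EOF
grep "Mapping efficiency" /tmp/slurm-demo.out
```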

screen shot 2018-11-01 at 9 54 12 am

Figure 1. Mapping efficiency for Mox alignment. The first mapping efficiency is for sample 10, second-tenth for samples 1-9.

Even though my Mox alignment used Bowtie 2-2.3.4 and my genefish alignment used Bowtie 2-2.2.9, I got the same mapping efficiencies! I created a table in my paper draft that Steven looked at. It turns out he was getting different mapping efficiencies because he did not specify that the input data was --non_directional; I verified that I needed that argument in this issue.

However, I encountered another problem with my revised Mox run: I didn’t specify --samtools_path in my alignment step! I didn’t need to do this on genefish because samtools was already in my computer’s path. Because Mox couldn’t find samtools, my output files are now SAM files.

screen shot 2018-11-01 at 9 53 19 am

Figure 2. Output from bismark alignment.

I need to convert my SAM files to BAM files before I move on to methylKit. I also added --samtools_path in a new script and queued the job on Mox.

I then encountered a THIRD issue with Mox. Just looking at my slurm.out, I couldn’t tell if my deduplication was actually running, or if it encountered an error.

47873735-39a6c680-ddcf-11e8-8003-000c1e8827b3

Figure 3. Status of slurm.out file

I posted this issue. Sam pointed out that my script included a trailing "\" at the end of the last line of code, which essentially left it hanging. I edited the script to remove any hanging backslashes, cancelled both of my queued jobs (Sam later mentioned that I probably should not have cancelled both, but oh well), and restarted a new Mox job.
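A cheap guard against this (my own sketch, not part of the original workflow) is to grep the script for lines ending in a backslash before submitting; a "\" on the very last line leaves the final command waiting for input that never comes.

```shell
# Flag every line that ends in "\" so a dangling one on the last line
# stands out before the job is queued.
printf 'samtools sort in.bam \\\n' > /tmp/hang.sh   # toy script ending in "\"
grep -n '\\$' /tmp/hang.sh   # → 1:samtools sort in.bam \
```

Any hit on the script's final line is the hanging-backslash bug described above.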

From now onwards, I need to quadruple check all of my Mox scripts. Because I’m not running the bismark pipeline in chunks like I’m used to in Jupyter, it’s hard to catch errors, fix them, and easily restart from where I left off.

Going forward

  1. Figure out how to convert SAM files to BAM files
  2. Use methylKit to identify DML and DMR
  3. Characterize DML and DMR locations
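For step 1, samtools view with -b is the usual route. A hedged sketch on a tiny hand-written SAM file (it assumes samtools is on the PATH, which is exactly what the Mox runs were missing):

```shell
# Minimal SAM: one header line, one reference, one 4 bp read.
printf '@HD\tVN:1.6\tSO:unsorted\n@SQ\tSN:chr1\tLN:1000\nr1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\n' > /tmp/demo.sam
samtools view -b -o /tmp/demo.bam /tmp/demo.sam   # -b: emit BAM
samtools view /tmp/demo.bam                        # round-trips the read record
```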


from the responsible grad student https://ift.tt/2SDgMFS
via IFTTT

Shelly’s Notebook: Sun. Oct 28, 2018

Methylation analysis

P. generosa juvenile OA response

bismark currently running on full data set

mapping and coverage analysis with subsetted samples (10K reads each)

bismark summary

  • all samples have > 50% mapping efficiency with Bismark settings --score_min L,0,-1.2 -I 60 --non_directional
  • deduplication removed very few reads, per usual
  • % methylation is on average 20%

methylKit summary – coverage plots don’t appear to show that any particular sample or condition should be excluded due to low coverage. This may become more obvious in the full analysis or downstream analyses.

  • the genome region > 60000 seems to have low coverage in all samples

plot of genome regions with no coverage organized by length of treatment

C. virginica

bismark run completed on mox

  • last part failed because I copied from jupyter notebook and had '!' in the code still – need to edit this part for future scripts, but for now I will complete the analysis on ostrich in my jupyter notebook
  • generate summary file
  • generate sorted bam files for methylkit analysis
  • need to rsync everything onto Metacarcinus from Mox gscratch

failed code

  /var/spool/slurm/d/job394229/slurm_script: line 47: !cat: command not found
  /var/spool/slurm/d/job394229/slurm_script: line 51: !sed: command not found
  /var/spool/slurm/d/job394229/slurm_script: line 56: fg: no job control
  xargs: samtools: No such file or directory
  /var/spool/slurm/d/job394229/slurm_script: line 61: -o: command not found

code excerpt

  46 #create summary report
  47 !cat /gscratch/srlab/strigg/analyses/20181004/*PE_report.txt | \
  48 grep 'Mapping\ efficiency\:' | \
  49 cat - /gscratch/srlab/strigg/analyses/20181004/*.deduplication_report.txt > /gscratch/srlab/strigg/analyses/20181004/zr2096_mapping_dedup_summary.txt
  50 #clean up summary report
  51 !sed 's/Mapping\ efficiency\://g' /gscratch/srlab/strigg/analyses/20181004/zr2096_mapping_dedup_summary.txt | \
  52 sed 's/Total\ number\ duplicated\ alignments\ removed\://g' | \
  53 sed 's/ //g' | awk '{print $1}' > /gscratch/srlab/strigg/analyses/20181004/zr2096_mapping_dedup_summary_clean.txt
  54
  55 #sort bams
  56 %%bash
  57 find /gscratch/srlab/strigg/analyses/20181004/*deduplicated.bam | \
  58 xargs basename -s _s1_R1_bismark_bt2_pe.deduplicated.bam | \
  59 xargs -I{} samtools \
  60 sort /gscratch/srlab/strigg/analyses/20181004/{}_s1_R1_bismark_bt2_pe.deduplicated.bam \
  61 -o /gscratch/srlab/strigg/analyses/20181004/{}_dedup.sorted.bam
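A sketch of how the summary-report part of that excerpt looks once the Jupyter "!" and "%%bash" prefixes are gone. Demo paths under /tmp stand in for the /gscratch/srlab/strigg/analyses/20181004 ones, and the one-sample report file is invented so the logic can be exercised anywhere:

```shell
DIR=/tmp/demo20181004          # stand-in for /gscratch/srlab/strigg/analyses/20181004
mkdir -p "$DIR"
printf 'Mapping efficiency:\t75.0%%\n' > "$DIR/zr2096_1_PE_report.txt"   # fake report
# create summary report -- plain cat, no Jupyter "!"
cat "$DIR"/*PE_report.txt | grep 'Mapping efficiency:' > "$DIR/mapping_dedup_summary.txt"
# clean up summary report -- strip the label, keep the number
sed 's/Mapping efficiency://g' "$DIR/mapping_dedup_summary.txt" | awk '{print $1}' > "$DIR/mapping_dedup_summary_clean.txt"
cat "$DIR/mapping_dedup_summary_clean.txt"   # → 75.0%
```

The samtools sort step would follow the same pattern, with samtools called by absolute path (or after a module load) so the "No such file or directory" error above can't recur.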

from shellytrigg https://ift.tt/2Jvzo6t
via IFTTT

Ronit’s Notebook: RNA Extraction for Remaining C. Gigas Desiccation + Elevated Temperature Samples

I extracted RNA for 16 samples (D05, D06, D07, D08, D15, D16, D17, D18, T05, T06, T07, T08, T15, T16, T17, T18). RNA pellets were stored in the -80 freezer following isopropanol precipitation. The protocol I used is described below:

  1. 500 µL of RNAzol RT was added to a clean tube.
  2. Tissue samples were removed and a small section was cut out for RNA extraction.
  3. Tissue portions were placed in the tube and an additional 500 µL of RNAzol RT was added to bring the volume up to 1mL.
  4. The samples were vortexed vigorously for 10 seconds
  5. Samples were incubated at room temperature for 5 minutes.
  6. 400 µL of DEPC-water was added to the samples.
  7. Samples were centrifuged for 15 minutes at 12,000 g.
  8. 750 µL of the supernatant was transferred to a new, clean tube and an equal volume of isopropanol was added to the sample.
  9. The samples were vortexed vigorously for 10 seconds.
  10. Samples were incubated at room temperature for 5 minutes.
  11. Samples were centrifuged for 15 minutes at 12,000 g.
  12. Note: one of the RNA pellets (D17) looked black. Not sure what could have caused this or if this is a sign of contamination, but I’ll proceed with the RNA extraction for D17 and see what the Qubit results show for that sample.

Grace’s Notebook: BLAST with bad Trinity fasta, R plan for adding Qubit data, and Testing out RNeasy Plus Micro Kit

Today I ran BLAST with the bad fasta from the Trinity run from last weekend. I will look more at the notebook Steven sent me to do the BLAST stats, GO slim, contigs, GO slim tables, etc. I also have been getting some input from Sam as to how best to manage adding new Qubit data to a master file consisting of all the hemolymph sampling data joined with the Qubit data results. Finally, I tested out the RNeasy Plus Micro Kit on 4 samples from Day 26, and ran the Qubit.

BLAST

BLAST test notebook

I had to get some help because my original notebook wasn’t working: I had a SPACE after one of the lines, and I guess Jupyter can’t handle that (GitHub Issue #461).

Here is what the head of the BLAST looks like:
img

I’ll look at this notebook of Steven’s to do the goslim, contigs, etc.

This process will help me identify a workflow and pipeline for once my most recent Trinity output is available.

R Plan

I’ve been having trouble with finding a relatively simple and reproducible way of adding new Qubit data to my hemolymph sampling data.

The new general plan, as recommended by Sam (# 460), is as follows:

 The order of operations for all of this should look something like this (all in R):

  1. Transfer Qubit CSV from Qubit USB drive to repo.
  2. Use sys() function to use bash command to format to UTF CSV and strip header.
  3. Use R to insert tube_number column and corresponding tube numbers for each sample and write to CSV.
  4. Use sys() function to use bash to concatenate all UTF CSVs into master UTF CSV.
  5. Use sys() function to use bash to restore header to master UTF CSV.
  6. Do all your downstream manipulations using the fully formatted master UTF CSV.

 You can probably do a bunch (i.e. all) of this stuff using straight R, without needing to use the sys() function, but I don't know how, and I do know that the sys() bash stuff is definitely needed for UTF conversion. Additionally, you should have (at least) two scripts:

  1. Script for creating master UTF CSV.
  2. Script for all other downstream stuff involving master UTF CSV.
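The "format to UTF CSV and strip header" step can be sketched in shell. Assumption: the Qubit export is UTF-16, which is why it needs converting before R can read it cleanly; the filenames here are made up for the demo.

```shell
# Fake a UTF-16LE Qubit export, then convert to UTF-8 and drop the header row.
printf 'Test Name,Conc.\n414-2,12.5\n423-2,9.8\n' | iconv -f UTF-8 -t UTF-16LE > /tmp/qubit_raw.csv
iconv -f UTF-16LE -t UTF-8 /tmp/qubit_raw.csv | tail -n +2 > /tmp/qubit_noheader.csv
cat /tmp/qubit_noheader.csv
```

Concatenating the per-run headerless CSVs and re-prepending a single header line would then give the master UTF CSV the plan describes.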

Here is the start of that in the project-crab/scripts directory.

I’m still trying to wrap my head around the flow of things and still have to figure out how to do the bash script in R. Previously, I just did it in terminal, and then had that new UTF csv transferred to Qubit_data directory in my project-crab repository.

RNeasy Plus Micro Kit test

I tested the protocol out on 4 samples from Day 26 (tubes 2 out of 3):
img

I used the protocol (page 1 and page 2) provided in the box.

Some changes that were made were based on Sam’s notebook post.

Since I did 4 samples, I added 16 µL of 2-ME to 1600 µL of Buffer RLT Plus.
I didn’t understand what to do with the QIAshredder columns, so I didn’t use them.

Some things that happened:
Everything went well in general, I think. Once I had all the materials gathered, it was pretty easy to follow the protocol, and it was quick.

However, during the drying-out step (Step 8), the caps of the RNeasy MinElute columns snapped off for samples 415-2 and 424-2. So I ended up tossing those samples at the final step because the lid of the 1.5 mL collection tube didn’t fit.

I ran the Qubit on the two samples that made it: 414-2 and 423-2.
The samples were eluted in 14 µL of RNase-free water, and I used 1 µL of each sample to quantify on the Qubit.

The results are as follows:
img

img

These results are pretty good! Just want to check on bioanalyzer and/or Nanodrop.

Will discuss next steps at Crab meeting tomorrow morning.

from Grace’s Lab Notebook https://ift.tt/2Q8Cj7B
via IFTTT