	-> Done reading data waiting for calculations to finish
	-> Done waiting for threads
	-> Output filenames:
	-> Sat Jun 23 09:01:00 2018
	-> Arguments and parameters for all analysis are located in .arg file
	-> Total number of sites analyzed: 786914862
	-> Number of sites retained after filetering: 773121 
	[ALL done] cpu-time used =  273699.44 sec
	[ALL done] walltime used =  81714.00 sec

#!/bin/bash
## Job Name
#SBATCH --job-name=angsd-22
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=10-100:00:00
## Memory per node
#SBATCH --mem=100G
#SBATCH --mail-type=ALL
#SBATCH --mail-user=sr320@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/sr320/analyses/0622

source /gscratch/srlab/programs/scripts/paths.sh

/gscratch/srlab/sr320/programs/angsd/angsd \
-bam /gscratch/srlab/sr320/data/cw/all_bam.bamlist \
-out Association_test \
-doAsso 1 \
-yBin /gscratch/srlab/sr320/data/cw/YBin_file \
-GL 1 \
-doMaf 1 \
-doMajorMinor 1 \
-minMaf 0.05 \
-SNP_pval 1e-6 \
-minInd 468 \
-minQ 20 \
-P 28


Grace’s Notebook: Reading up on Trinity and Interviewed Genevieve Johnson for DecaPod

Today I did more reading on how to start assembling a transcriptome with Trinity, and I plan to formulate the first steps tomorrow. I also interviewed Genevieve Johnson (University of Alaska, Fairbanks) about her thesis project on Tanner crab population genetics for a new, soon-to-be-published episode of DecaPod.


I read a lot of cool things about Trinity and here are some links to useful information:
Main Trinity wiki: https://ift.tt/2axHHMZ
RNAseq Workshop slides: https://ift.tt/2luyCvr
Trinity de novo Transcriptome assembly workshop: https://ift.tt/2ttNk9g

DecaPod S1E9

Genevieve Johnson stopped by today (she’s in Seattle for a conference) and I recorded her sharing her experience performing her thesis work on population genetics of Tanner Crabs of Alaska. It was a really good interview and she is very good at communicating, so I won’t have to spend too much time editing and publishing! Will be published either tonight or tomorrow! Woo!

from Grace’s Lab Notebook https://ift.tt/2lqsl3N

bismark ran on mox – default – 6 hours
might play with multicore option

[sr320@mox2 0620]$ cat *report.txt | grep "Mapping"
Mapping efficiency:	35.8%
Mapping efficiency:	34.8%
Mapping efficiency:	35.0%
Mapping efficiency:	35.9%
Mapping efficiency:	34.3%
Mapping efficiency:	34.0%
Mapping efficiency:	35.4%
Mapping efficiency:	33.6%
[sr320@mox2 0620]$ head slurm-194366.out 
Bowtie seems to be working fine (tested command '/gscratch/srlab/programs/bowtie2-2.1.0/bowtie2 --version' [2.1.0])
Output format is BAM (default)
Alignments will be written out in BAM format. Samtools found here: '/gscratch/srlab/programs/samtools-1.4/bin/samtools'
Reference genome folder provided is /gscratch/srlab/sr320/data/olurida-genomes/v081/	(absolute path is '/gscratch/srlab/sr320/data/olurida-genomes/v081/)'
FastQ format assumed (by default)
Attention: using more than 4 cores per alignment thread has been reported to have diminishing returns. If possible try to limit -p to a value of 4
Each Bowtie 2 instance is going to be run with 28 threads. Please monitor performance closely and tune down if needed!

Input files to be analysed (in current folder '/gscratch/srlab/sr320/analyses/0620'):
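
One way to follow up on the multicore idea, given the warning in the log about running Bowtie 2 with more than 4 threads, is to split the node's 28 cores across several Bismark instances. The sketch below only prints a candidate command. It assumes a directional library (so roughly two Bowtie 2 processes per Bismark instance), and the read file name is hypothetical. The resulting `--multicore` value is a starting point to tune, not a recommendation.

```shell
#!/bin/bash
# Sketch only: divide the node's cores between Bismark instances
# instead of giving a single Bowtie 2 process all 28 threads.
total_cores=28          # from the log ("run with 28 threads")
threads_per_bowtie=4    # the -p limit suggested by the Bismark warning
bowtie_per_instance=2   # ASSUMPTION: directional library, two Bowtie 2 processes per instance

# Integer division: how many parallel Bismark instances fit on the node.
multicore=$(( total_cores / (threads_per_bowtie * bowtie_per_instance) ))

genome=/gscratch/srlab/sr320/data/olurida-genomes/v081/
reads=sample_R1.fastq.gz   # HYPOTHETICAL file name, not from the original run

# Dry run: build and print the command rather than executing it.
cmd="bismark --bowtie2 -p ${threads_per_bowtie} --multicore ${multicore} ${genome} ${reads}"
echo "$cmd"
```

Memory use grows with each extra instance, so it is worth watching the first run closely before settling on these values.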

Grace’s Notebook: Still have high error rates in Skyline – posted my issue to the Skyline support page

Today I went through all my settings and files in Skyline for the 2015Oysterseed proteomic project and re-did my Skyline Daily peak-picking error rate determination. I still got an error rate nearing 50%, so per Emma’s suggestion, I submitted the issue to Skyline’s support page.

Here is a link to the Skyline Support page where you can find what I posted: https://ift.tt/2ysIONZ?

My error rate is near 50% in all four samples. Not good! Laura and Yaamini had final admissible error rates of 30%. I emailed Emma and she said that I should post to the Skyline support page, so I am waiting to hear back from them now. Hopefully this can get sorted out soon. I would love to be able to move forward with the analysis and get more involved in this project.

from Grace’s Lab Notebook https://ift.tt/2lkEFT3

Running methylkit on emu

using srlab account



Grace’s Notebook: Update on Crab Pooling and Plan for next week-ish

Today I met with Steven to talk briefly about the crab sample pooling (I need more info on how to proceed and will wait for Sam’s return in 1.5 weeks), and about things I can do in the interim while the samples are being sequenced: practicing Trinity with Geoduck transcriptome data; DecaPod; 2015 Oysterseed project.

Crab Pooling

So I misunderstood what was needed. The CORE facility needs at LEAST 20ng/ul of RNA in a 50ul sample, meaning the sample needs at least 1000ng of RNA. For a pool (4 crab samples per pool) to be successful, each sample needs to contribute at least 250ng of RNA to the pooled sample. However, when the Qubit results (in ng/ul) are multiplied by 50ul (the final volume of the isolated RNA samples), the vast majority of them are well below 250ng of RNA.
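
The pooling arithmetic can be checked with a small script. This is a minimal sketch: the three Qubit concentrations are made-up values for illustration only, not my actual readings.

```shell
#!/bin/bash
# Facility minimum from the notes: 20 ng/ul of RNA in a 50 ul sample.
min_conc=20          # ng/ul
volume=50            # ul
samples_per_pool=4

required_total=$(( min_conc * volume ))                  # ng of RNA per pool
required_each=$(( required_total / samples_per_pool ))   # ng of RNA per crab sample

echo "pool needs ${required_total} ng; each sample needs ${required_each} ng"

# HYPOTHETICAL Qubit readings (ng/ul), for illustration only.
for conc in 1.2 3.8 5.0; do
  # awk handles the decimal multiplication that bash arithmetic cannot.
  awk -v c="$conc" -v v="$volume" -v need="$required_each" 'BEGIN {
    total = c * v
    verdict = (total >= need) ? "enough" : "too low"
    printf "%.1f ng/ul x %d ul = %.0f ng -> %s\n", c, v, total, verdict
  }'
done
```

Swapping the real Qubit values into the loop shows which samples can contribute a full share to a pool.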

Here’s a link to the GitHub issue: here

My epiphany notes from today:

Practicing with Trinity

Steven mentioned that while the samples are being sequenced (it can take about 1.5 months), it would be cool if I practiced and learned how to use Trinity on some Geoduck transcriptome data. I will be using Trinity to assemble the transcriptome from my samples later.

Here is a link to a wiki on Trinity that I have been reading about: here


DecaPod

When I was in Juneau last November with Pam, I met Genevieve Johnson, a current intern at NOAA who is working on Tanner crab hybrid population structure and genomics. I met up with her again this past weekend when I was in Juneau visiting my sister, and we spoke more about our projects. She is in Seattle this week, and if she has time, I will meet up with her and record a little about what she’s up to and her experience with Tanner crabs for a new episode of DecaPod.

If she doesn’t have time while she’s here, we can try and do it remotely. Should be interesting to see how the sound quality works with that.

Some other episode ideas are:

  • Crab pooling plan and sequencing info (wait for Sam to get back to get better idea of what is involved)
  • Some summaries of literature on Tanner crabs and Bitter crab disease and Hematodinium that I read

2015 Oysterseed project

Work on understanding what I’m doing and making sure I am choosing the right settings, because my peaks are looking pretty bad. I had a really high error rate last week, so I need to go through the protocols and Yaamini and Laura’s notebooks to see what they did.

from Grace’s Lab Notebook https://ift.tt/2t8ih3M