Grace’s Notebook: June 23, 2017

Today I scrounged around for missing data from the Heare et al. paper (Evidence of Ostrea lurida Carpenter, 1864 population structure in Puget Sound, WA).

Found it!

Am now going through all the revisions and making sure that they were addressed. Sending the final version to Brent and Steven today, and will resubmit by Monday!



Sean’s Notebook: BWA-Meth Output for EPI-135.

Found a different methylation aligner, BWA-meth, that’s based on a Burrows-Wheeler aligner and is supposed to deal better with gap alignment than Bowtie2. Fired it up with the EPI-135 and 10k Geoduck genome. It gave an answer, but I *really* don’t believe it. The bamtools stats output claimed an 87% mapping rate, compared to the 6% from Bismark.

sean@emu:~/Documents/Geoduck_Rerun/bwatest$ bamtools stats -in bwa-meth.bam

Stats for BAM file(s): 

Total reads:       55253237
Mapped reads:      48098666	(87.0513%)
Forward strand:    30884874	(55.8969%)
Reverse strand:    24368363	(44.1031%)
Failed QC:         16970759	(30.7145%)
Duplicates:        0	(0%)
Paired-end reads:  55253237	(100%)
'Proper-pairs':    30054258	(54.3937%)
Both pairs mapped: 47463686	(85.9021%)
Read 1:            27626448
Read 2:            27626789
Singletons:        634980	(1.14922%)

.bam file is available: here
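
A quick way to double-check the headline numbers is to recompute the percentages from the raw counts. This is only a minimal sketch: the counts are copied from the bamtools output above, and the variable and function names are my own.

```python
# Counts copied from the bamtools stats output above.
stats = {
    "total": 55253237,
    "mapped": 48098666,
    "failed_qc": 16970759,
    "proper_pairs": 30054258,
}


def pct(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total


mapped_pct = pct(stats["mapped"], stats["total"])
failed_qc_pct = pct(stats["failed_qc"], stats["total"])
proper_pct = pct(stats["proper_pairs"], stats["total"])

print(f"Mapped:       {mapped_pct:.4f}%")
print(f"Failed QC:    {failed_qc_pct:.4f}%")
print(f"Proper pairs: {proper_pct:.4f}%")
```

The recomputed values agree with what bamtools printed, so the surprising mapping rate is at least not an arithmetic slip. The roughly 31% of reads flagged as failed QC and the 54% proper-pair rate may be worth a look before trusting it.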

Sean’s Notebook: Starting GARM meta-assembly of PacBio and BGI assemblies for Olys.

The Pilon polishing for the CANU assembly finished last night, and it seems pretty small, but hopefully between that, the BGI assembly, and the Platanus assembly, we can assemble ourselves up one decent genome. Fingers crossed at least.

Assembly stats on the Polished assembly:

D-69-91-159-59:BGI_Oly_Genome Sean$ assembly-stats oly_polished_.fasta 
stats for oly_polished_.fasta
sum = 46364927, n = 3388, ave = 13685.04, largest = 61211
N50 = 14126, n = 1230
N60 = 12962, n = 1573
N70 = 11906, n = 1947
N80 = 10932, n = 2352
N90 = 9590, n = 2803
N100 = 2074, n = 3388
N_count = 1
Gaps = 1

1 gap is interesting, but with the assembly size being at least one, if not two, orders of magnitude smaller than the expected genome size, I think we’re short on coverage to allow for conservative error correction levels. Will have to reassemble with looser standards and see if we can bump it up.
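
For anyone reading the N50 to N100 lines above for the first time, here is a rough sketch of how an Nx value is usually computed. The `nx` function and the toy contig lengths are hypothetical, and assembly-stats may handle ties or rounding slightly differently.

```python
def nx(lengths, x=50):
    """Return (Nx, count): the contig length at which the running total of
    lengths, sorted longest first, first reaches x% of the assembly size,
    and the number of contigs needed to get there."""
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100.0
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical contig lengths, just to show the behaviour.
toy = [100, 80, 60, 40, 20]
print(nx(toy, 50))
print(nx(toy, 100))

# The mean contig length reported above is simply sum / n.
mean_len = 46364927 / 3388
print(f"{mean_len:.2f}")
```

Read this way, `N50 = 14126, n = 1230` says that the 1230 longest contigs, each at least 14126 bp, make up half of the 46.4 Mb assembly.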

Polished CANU assembly found: here

Pilon output file: here

Next step is GARM, to see what that gives us. I think I’ll also re-assemble the PacBio stuff with much less stringent error correction to see if that gives any measurable difference in outputs.

Edit: Also, I finished the --non_directional runs for Bismark: no change in mapping rates and less than 1% complementary mapping, so it looks like the regular arguments are correct. Output .bam files are found here with the NonDir tag.

Grace’s Notebook: June 21, 2017

  1. Today I took pictures of all (except Silo 9 Day 13) oyster seed cryotubes so that I can more easily measure them using ImageJ whenever I have free time.
  2. I also did some scrounging around for missing data for revisions to the Heare et al. paper – Evidence of Ostrea lurida (Carpenter 1864) population structure in Puget Sound, WA. Found some of it in a master data sheet (link: here) and am still looking for some survival data. Link to Jake’s online notebook blog kept during the project: here.
  3. I also took some images of some larvae that are in 50ml tubes. The goal is to image them such that one image has 100 easily-countable larvae. It looks like the tubes have a lot of things in there that aren’t larvae. Here are the images:

    Will work on this more Friday when I’m in.

Tomorrow (Thursday) I’m off to Manchester to assist Laura!

Sean’s Notebook: The internet lies.

Part of the Pilon prep for polishing the PacBio Oly assembly is feeding it a bunch of Illumina data aligned to the PacBio assembly using your favorite aligner, in my case Bowtie2. I initially got a bunch of `.sam` files from Bowtie2 and wanted to convert them, so I turned to Google like any good person does these days. After looking at a bunch of different options, all answers pointed to `samtools view -sB file.sam > file.bam` as the preferred way to do this.

Thinking I knew what I was doing, I whipped up a quick slurm job script to convert everything and fire up Pilon. It completed in less than 5 minutes, with a bunch of `.bam` files ~40kb in size. This was suspect, as their original `.sam` files were ~40gb.

After reading the samtools manual (which corroborated the above conversion syntax), re-aligning some of the Illumina data files thinking I’d done something wrong there, and a whole bunch more googling, it turns out the `> file.bam` syntax is actually supposed to be `-o file.bam`.

Apparently the internet and documentation do not keep up with program changes as well as we’d like.
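
For future conversions, something like the following could catch a bad output before Pilon ever sees it. This is a sketch only: it uses the `-o` form described above with samtools' `-b` (BAM output) flag, and the helper names and the `min_ratio` threshold are my own arbitrary choices, not anything samtools provides.

```python
import os
import subprocess


def sam_to_bam_cmd(sam_path, bam_path):
    """Build a samtools command that writes BAM output to bam_path via -o."""
    return ["samtools", "view", "-b", "-o", bam_path, sam_path]


def looks_truncated(sam_path, bam_path, min_ratio=0.01):
    """Flag a BAM that is implausibly small compared to its SAM.

    min_ratio is an arbitrary, hypothetical threshold: a ~40kb BAM from a
    ~40gb SAM is a ratio of about 1e-6, far below it.
    """
    return os.path.getsize(bam_path) < min_ratio * os.path.getsize(sam_path)


def convert(sam_path, bam_path):
    """Run the conversion (needs samtools on PATH) and sanity-check the size."""
    subprocess.run(sam_to_bam_cmd(sam_path, bam_path), check=True)
    if looks_truncated(sam_path, bam_path):
        raise RuntimeError(f"{bam_path} is suspiciously small; check the conversion")
```

Calling `convert("file.sam", "file.bam")` inside the slurm job would then fail loudly instead of quietly handing Pilon a near-empty file.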

Sean’s Notebook: Pilon prep finished.

Finished the Pilon prep (except for .sam -> .bam conversion) yesterday afternoon. Found a 32-35% mapping rate for Illumina sequencing files to the PacBio backbone. Not great, but hopefully the whole meta-assembly route will manage to combine everything together into one super assembly.

For fun, I mapped an Illumina file to a couple of the BGI assemblies, and got ~75% mapping. Wish we could have gotten that out of the PacBio thing, but ah well.

Bowtie Output

I’m transferring the .sam files over to Owl now, but they’re ~40gb apiece, so that’ll take a bit. Will update here when finished!

Katie’s Notebook: Imaging Day 2- 6/19/17

Went in to UW to continue imaging Laura’s slides. Imaged slides 3-10. Here are the samples that I either couldn’t find gonad tissue in or I wasn’t quite sure about:

Oly 3 slide: HL-6_13 (not sure if I captured gonad tissue), HL-6_14 (none, no picture)
Oly 6 slide: NF-6_17 (none, no picture), NF-6_19 (not sure if I captured gonad tissue)
Oly 7 slide: NF-6_18 (not sure if I captured gonad tissue)
Oly 8 slide: SN-6_18 (not sure if I captured gonad tissue)

I will go through all of my noted questionable slides when I’m done.