Grace’s Notebook: 2015 DIA Oysterseed Paper Progress

A little bit of progress has been made with the DIA proteomics paper. I am aiming to have the methods section of the paper solidly done by our meeting this Wednesday at 2 pm.

Created a paper repo

A lot of work needs to be done to organize both the paper’s repo and the project’s repo.

Work in progress.

Methods developing

Got some input from Steven and Emma on the methods for the paper. I’m working on paring down the methods that Rhonda wrote, as well as adding methods for what I’ve done. I’m still adding to it because I’m currently working towards getting the GO and GOslim terms for the proteins in the Skyline output file so that I can compare the proteins expressed between the two temperature regimes.

Started using Galaxy

I’m using Galaxy right now to run a BLASTp of the proteome I used in the DIA analysis against the newest version of the UniProt Swiss-Prot database.

Once the BLASTp is complete, I’ll join the BLAST output file with the GO terms (probably in R or Jupyter… unless that’s also something you can do in Galaxy; I’ll play around with it). Then, I’ll create some simple Venn diagrams and other basic plots and stats to show which proteins were expressed in general, as well as to compare the two temperatures. Aiming to have this done by the Wednesday meeting as well.
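A rough sketch of what that join and comparison could look like in Python, in case Jupyter ends up being the route. Everything here is an assumption for illustration: the BLAST output is assumed to be in the standard 12-column tabular format with Swiss-Prot style subject IDs, and the function names, protein IDs, and GO mapping are made up rather than taken from the actual project files.

```python
import csv

def best_hits(blast_lines):
    """Return {query_id: accession}, keeping the lowest e-value hit per query.

    Assumes 12-column tabular BLAST output, where column 11 is the e-value.
    """
    best = {}
    for row in csv.reader(blast_lines, delimiter="\t"):
        query, subject, evalue = row[0], row[1], float(row[10])
        # Swiss-Prot subject IDs look like "sp|P12345|NAME_SPECIES"
        parts = subject.split("|")
        accession = parts[1] if len(parts) >= 2 else subject
        if query not in best or evalue < best[query][1]:
            best[query] = (accession, evalue)
    return {q: acc for q, (acc, _) in best.items()}

def annotate(hits, go_by_accession):
    """Attach a list of GO terms to each query protein (empty if none known)."""
    return {q: go_by_accession.get(acc, []) for q, acc in hits.items()}

def venn_counts(set_a, set_b):
    """Counts for a two-set Venn diagram."""
    return {
        "only_a": len(set_a - set_b),
        "shared": len(set_a & set_b),
        "only_b": len(set_b - set_a),
    }

# Made-up example rows (not real results)
blast_text = "\n".join([
    "prot1\tsp|P00001|AAA_TEST\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-50\t200",
    "prot1\tsp|P00002|BBB_TEST\t50.0\t100\t50\t0\t1\t100\t1\t100\t1e-5\t60",
    "prot2\tsp|P00003|CCC_TEST\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-30\t150",
])
# Hypothetical accession-to-GO mapping, for illustration only
go_map = {"P00001": ["GO:0006950"], "P00003": ["GO:0008152"]}

hits = best_hits(blast_text.splitlines())
annotated = annotate(hits, go_map)
overlap = venn_counts({"prot1", "prot2"}, {"prot2", "prot3"})
```

The two sets passed to `venn_counts` would be the proteins detected at each temperature; the counts can then be handed to whatever plotting tool draws the Venn diagram.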

from Grace’s Lab Notebook

Sam’s Notebook: Metagenomics – Taxonomic Diversity from Geoduck Water with BLASTp and Krona plots

We’re working on getting the metagenomics sequencing project written up as a manuscript, and Steven asked me in this GitHub Issue to provide an overview of the taxonomic makeup of our metagenome assembly.

I previously assembled all of the sequencing data into a single assembly (i.e. did not assemble by experimental treatment):

Subsequently, I ran some gene prediction software to help refine the assembly into a more conservative representation, in hopes of getting a more realistic view of biologically relevant DNA (i.e. analyzing sequenced DNA that actually has putative functions, as opposed to random eDNA that may have been floating around in the water):

To get taxonomic info, I took the MetaGeneMark proteins FASTA file and ran BLASTp against the NCBI SwissProt database (v5) to get taxonomic IDs. See this Jupyter Notebook (GitHub):

This was followed up by using Krona to plot the data in an interactive fashion, according to NCBI taxonomic ID abundance (see Results below).
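For anyone curious about the taxonomic ID abundance step, here is a minimal sketch of how the counts could be tallied in Python before plotting. It is not the code from the notebook above: it assumes tab-delimited BLAST output where one column holds the query ID and another holds the subject taxonomy ID(s), and the column positions, function names, and example rows are hypothetical.

```python
from collections import Counter

def taxid_abundance(blast_lines, query_col=0, taxid_col=1):
    """Count taxonomy IDs, using only the first hit seen for each query.

    Column positions are assumptions; adjust to the actual output format.
    """
    seen = set()
    counts = Counter()
    for line in blast_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        query = fields[query_col]
        if query in seen:
            continue  # skip lower-ranked hits for the same query
        seen.add(query)
        # A subject can carry several taxonomy IDs separated by ";"
        taxid = fields[taxid_col].split(";")[0]
        counts[taxid] += 1
    return counts

def abundance_table(counts):
    """Format counts as 'taxid<TAB>count' lines, most abundant first."""
    return [f"{taxid}\t{n}" for taxid, n in counts.most_common()]

# Made-up example rows (arbitrary taxonomy IDs, not real results)
example = [
    "gene1\t562",
    "gene1\t1280",
    "gene2\t562",
    "gene3\t1280;2",
]
counts = taxid_abundance(example)
table = abundance_table(counts)
```

A two-column table like this is the general kind of input a taxonomy-based plotting tool works from; the exact import options used for the Krona plots are not shown here.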