Kaitlyn’s notebook: 20190123 Geoduck histology

We sampled geoduck at the end of January and took some histology samples. I believe only geoduck from tank 4 were strip spawned and we noticed that the treated tanks had poorly developed gonads. I examined the histology slides that we just received from sampling.

I reviewed how Grace staged her geoduck as well as the criteria in the paper Brent sent Shelly and me (Ropes 1968). I talked with Grace a little more about what she did and showed her some slides, and we were both having trouble placing the geoduck at different stages. It was hard to decide whether they were producing less, reabsorbing, or just developing gonad more slowly than the other tanks.

I came up with some scoring criteria to get a better idea of how tanks and treatments compared and entered them into this spreadsheet.

Overall, it appears that the ambient tanks were further along in development than the treated tanks. Tank 4 was the furthest along, whereas tank 1 was dominated by early active geoduck.

All treatment group geoducks were early active:

Least developed male 4 with only spermatogonia present in tiny acini

Early active male 1 with dense connective tissue and a very low amount of spermatids

Early active male 018 with less connective tissue than the previous male, but still early active (granted, further along than male 1) because of the number of acini with less than 30-40% spermatids; the acini are also smaller than those seen below in the late active male

Early active female 006 with dense connective tissue and low volume eggs, most of which are on the follicular wall

Early active female 034 with many elongated eggs on the walls of small follicles; however, the eggs do have some good volume and roundness. This is a good example of the difficulty of staging

While Ambient geoducks looked more like this:

Late active male 41 with visible spermatids covering at least 50% of large acini, and all acini contain spermatids; connective tissue is less dense than in male 018

Ripe female 045 with little connective tissue and high volume eggs; this female has the most eggs per follicle, but it still isn’t a lot compared to a heavily ripe female

Images are located on the FFAR-Geoduck drive under /Images/20190123-histo and on OWL. I also updated the histology-databank with the slide locations.

Ropes, J. W. 1968. Reproductive cycle of the surf clam, Spisula solidissima, in offshore New Jersey. Biol Bull 135:349-365.

Sam’s Notebook: Data Management – Data Migration and Drive Expansion on Gannet

A little while ago, we installed some additional hard drives in Gannet (Synology RS3618XS) with the intention of expanding the total storage space. However, the original set of HDDs was set up as RAID10. As it turns out, RAID10 configurations cannot be expanded! So, the new set of HDDs was configured as a separate volume (Volume 2) in a RAID6 configuration. After backing up the /volume1/web directory (via rsync) to our UW Google Drive, I began the data migration.

Synology provides a simple interface for this:

  1. Select Shared Folders
  2. Select the folder you want to move.
  3. Edit
  • Change “Location” from Volume1 to Volume2.


This process took a couple of days.

After the data migration, I removed Volume1 and Storage Pool1.

Then I restarted the Synology and used Storage Manager to add the unused disks, formerly part of Volume 1, to the only remaining storage pool, Storage Pool2. I re-installed and started the Web Station app to re-enable access to the folder indexes for our web folders, and now we’re up and running again!

The expansion of the storage pool to include the “newly added” disks will take quite a bit of time, but, in the end, we should end up with ~78TB of total storage space, in a RAID6 configuration.

In theory, this change of volume will not be noticeable for most of our regular usage. However, for rsync/ssh, a user will now have to specify /volume2/web instead of /volume1/web like we previously did. There does not appear to be a way to change the name of the volume.
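Any saved scripts or notes that still point at the old location will need their paths updated. A minimal sketch of that rewrite is below; the helper name `migrate_path` and the example file path are hypothetical, and only the two volume paths come from this entry.

```python
from pathlib import PurePosixPath

# The two roots named in this entry.
OLD_ROOT = PurePosixPath("/volume1/web")
NEW_ROOT = PurePosixPath("/volume2/web")


def migrate_path(path: str) -> str:
    """Rewrite a path under /volume1/web to the same location under /volume2/web.

    Paths that are not under the old root are returned unchanged.
    """
    try:
        relative = PurePosixPath(path).relative_to(OLD_ROOT)
    except ValueError:
        # Not inside /volume1/web, so there is nothing to rewrite.
        return path
    return str(NEW_ROOT / relative)


# Hypothetical example path, for illustration only.
print(migrate_path("/volume1/web/example/index.html"))
```

Matching on whole path components (rather than a plain string replace) avoids touching look-alike paths such as /volume1/website.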

from Sam’s Notebook https://ift.tt/2VGi0R7

RNA extraction for first round of oyster tissue samples

Alanna Greene

See Laura’s notebook entry for homogenization steps.

Realized we skipped step 3 (centrifuging again in a DNA spin column to remove DNA). We will see if we can salvage the RNA; if not, we will re-do this with different samples.

2/22/19 1:30 p.m. 

Samples were removed from the -80 and I began processing them, starting at step 3 in the QIAGEN RNeasy Plus protocol. Sample 136 clogged the MinElute column (part of the homogenate did not flow through after centrifuging). I transferred it to a new column labeled 136B, which I processed in place of 136, since the original column was completely clogged. It turned out that a lot of the columns were clogged (steps 6 and 7), so at step 7 I increased the centrifuge speed to 16G. Once the protocol was completed, I stored the samples in the fridge.


  • Autoclaved dishes and got more liquid nitrogen 
  • Ran samples through the Qubit 3.0 following protocol: 
    • Wiped down tubes 
    • Calibrated standards first (blank and full) 
    • For samples: 1 microliter of sample in 199 microliters of solution
  • Successfully quantified! 

Sample # | Qubit Reading
S1 | Too low, out of range
S2 | 9.92 ng/mL
100 | Too high
124 | Too high
126 | Too high
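
The 1 microliter in 199 microliter dilution is a 200-fold dilution, so a concentration measured in the assay tube has to be scaled back up to get the concentration of the original sample. The sketch below shows that arithmetic. It assumes the reading is the in-tube concentration in ng/mL (the instrument can also report the stock concentration directly, in which case no correction is needed); the function name and the 50 ng/mL reading are hypothetical, not values from this run.

```python
def stock_concentration_ng_per_ul(tube_reading_ng_per_ml: float,
                                  sample_volume_ul: float = 1.0,
                                  total_volume_ul: float = 200.0) -> float:
    """Convert an in-tube reading (ng/mL) to the original sample concentration (ng/uL)."""
    if sample_volume_ul <= 0 or total_volume_ul < sample_volume_ul:
        raise ValueError("volumes must satisfy 0 < sample volume <= total volume")
    # 1 uL of sample + 199 uL of working solution = 200 uL total, i.e. a 200x dilution.
    dilution_factor = total_volume_ul / sample_volume_ul
    ng_per_ml = tube_reading_ng_per_ml * dilution_factor
    return ng_per_ml / 1000.0  # 1000 uL per mL


# Hypothetical in-tube reading of 50 ng/mL.
print(stock_concentration_ng_per_ul(50.0))
```

Samples that read "Too high" would need to be diluted further before measuring, and the extra dilution would then be multiplied into the result.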

Shelly’s Notebook: Mon. Mar 4, 2019 oyster proteomics network analysis

Network analysis on same-day temperature ChiSq pvalue 0.1 selected proteins


  • Temperature seems to affect the biological processes transport, metabolism, and development
    • Regarding transport, a couple of proteins related to cytoskeleton-dependent intracellular transport are upregulated at 29C across almost all days while a few proteins related to vesicle mediated transport are downregulated especially on Day 13.
    • Regarding metabolism, over time there is a general downregulation of proteins associated with different metabolic processes at 29C.
    • Regarding development, early on at Day 3 there is an upregulation of proteins associated with cell morphogenesis and nervous system processes at 29C. Day 11 also shows more upregulated than downregulated development proteins at 29C.
      • Revisiting past analysis, the PCA of technical replicates showed that 29C time points sometimes clustered with the succeeding 23C time points, suggesting the increased temperature could speed up development. unnamed-chunk-9-1.png
      • This could be apparent in the ASCA PCA plot of the time x temperature interaction effect, with most of the 23C time points being separated from most of the 29C time points along PC1. The separation could be more dramatic if I removed time point 0.
      • Revisiting the phenotypic data Rhonda gathered, on Day 6 the 29C seed was generally larger in size
      • day6survival.jpg.

To do:

  • Need to swap CHOYP names for gene names or protein names
  • Dig into the lit about the proteins mentioned above to get a better idea of the implications of their temperature-altered abundance.
  • Probably need to threshold the uniprot mapping by e-value since I haven’t done that yet
  • Maybe try this mapping pipeline for molecular function GO terms
  • see what people think about this network analysis
  • It would be cool to visualize these networks as dynamic networks rather than each day side-by-side. And there seem to be a few Cytoscape apps for visualising dynamic networks
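
For the e-value thresholding item, one possible approach is to drop hits above a cutoff and keep the best remaining hit per protein. This is only a sketch: the tuple layout (query, accession, e-value), the function name, the 1e-10 cutoff, and the placeholder IDs are all assumptions, not taken from the existing mapping pipeline.

```python
from typing import Dict, Iterable, Tuple

Hit = Tuple[str, str, float]  # (query protein, UniProt accession, e-value); assumed layout


def best_hits_below_evalue(hits: Iterable[Hit],
                           max_evalue: float = 1e-10) -> Dict[str, Tuple[str, float]]:
    """Keep, for each query protein, the lowest e-value hit that passes the cutoff."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, accession, evalue in hits:
        if evalue > max_evalue:
            continue  # hit is too weak to trust
        if query not in best or evalue < best[query][1]:
            best[query] = (accession, evalue)
    return best


# Placeholder IDs and e-values, for illustration only.
example_hits = [
    ("protein_1", "ACC_A", 1e-50),
    ("protein_1", "ACC_B", 1e-20),
    ("protein_2", "ACC_C", 0.5),
    ("protein_3", "ACC_D", 1e-10),
]
print(best_hits_below_evalue(example_hits))
```

Proteins with no hit under the cutoff simply drop out of the result, which makes it easy to count how many annotations a given threshold would cost.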

from shellytrigg https://ift.tt/2EAC3cW

Shelly’s Notebook: Friday. Mar. 1, 2019 oyster proteomics network analysis

Expanding protein list to bring into network analysis

Instead of using ASCA selected proteins, which tend to not show much change due to temperature, I’m going to try selecting proteins based on their log foldchange or ChiSq. pvalues from the same-day foldchange analysis.

Determining a log foldchange or pvalue cutoff for selecting proteins

Numbers below came from code: ASCA_proteinNetworkAnalysis_withGO.R lines 260-308

  • Using same-day foldchange and pvalues between temperatures, I found:
    • 2224 proteins have a magnitude log2foldchange >=1 (which is 2x or greater foldchange)
    • 165 proteins have a magnitude log2foldchange >=2 (4x)
    • 128 proteins have an adj.pvalue <= 0.05
    • 153 proteins have an adj.pvalue <= 0.1

Conclusions: I think it makes the most sense to use the adj.pvalue cutoff instead of log2foldchange, so I’m going to go with an adj.pvalue cutoff of 0.1.
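
The counts above come from the R script, but the cutoff logic itself is simple enough to sketch. The Python below is an illustration only: it assumes one log2 foldchange and one adjusted pvalue per protein (the real analysis has values per day), and the class name, function names, and toy values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProteinStat:
    protein_id: str    # hypothetical identifier
    log2_fc: float     # log2 foldchange between temperatures
    adj_pvalue: float  # adjusted ChiSq pvalue


def select_by_fold_change(stats: List[ProteinStat], min_abs_log2_fc: float) -> List[str]:
    """Proteins whose foldchange magnitude meets the cutoff, in either direction."""
    return [s.protein_id for s in stats if abs(s.log2_fc) >= min_abs_log2_fc]


def select_by_pvalue(stats: List[ProteinStat], max_adj_pvalue: float) -> List[str]:
    """Proteins whose adjusted pvalue is at or below the cutoff."""
    return [s.protein_id for s in stats if s.adj_pvalue <= max_adj_pvalue]


# Toy values, not real data.
example = [
    ProteinStat("P1", 2.5, 0.01),
    ProteinStat("P2", -1.2, 0.08),
    ProteinStat("P3", 0.3, 0.50),
    ProteinStat("P4", -0.4, 0.04),
]

for cutoff in (1.0, 2.0):
    print("abs(log2FC) >=", cutoff, select_by_fold_change(example, cutoff))
for cutoff in (0.05, 0.1):
    print("adj.pvalue <=", cutoff, select_by_pvalue(example, cutoff))
```

Taking the absolute value of the log2 foldchange is what makes the cutoff catch both up- and downregulated proteins; a protein like P4 in the toy data shows how a small foldchange can still pass the pvalue cutoff.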

Comparing ChiSq. pvalue cutoff protein selection with ASCA protein selection

Numbers below came from code: ASCA_proteinNetworkAnalysis_withGO.R lines 310-316

  • 330 ASCA selected proteins map to uniprot
  • 153 ChiSq. pvalue selected proteins map to uniprot
  • 50 proteins overlap between the two methods with 230 unique to ASCA and 103 unique to ChiSq. pvalue cutoff.
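
The comparison between the two selection methods is a set overlap. A minimal sketch, with hypothetical function names and toy accessions, is below; it also includes a sanity check that the shared and unique counts add back up to each list's total, which is a quick way to verify numbers like the ones reported here.

```python
from typing import Dict, Set


def compare_selections(asca: Set[str], chisq: Set[str]) -> Dict[str, Set[str]]:
    """Split two protein selections into shared and method-specific sets."""
    return {
        "shared": asca & chisq,
        "asca_only": asca - chisq,
        "chisq_only": chisq - asca,
    }


def counts_are_consistent(asca: Set[str], chisq: Set[str]) -> bool:
    """Shared + unique should equal the size of each original selection."""
    parts = compare_selections(asca, chisq)
    return (len(parts["shared"]) + len(parts["asca_only"]) == len(asca)
            and len(parts["shared"]) + len(parts["chisq_only"]) == len(chisq))


# Toy accessions, not real data.
asca_selected = {"A", "B", "C", "D"}
chisq_selected = {"C", "D", "E"}

parts = compare_selections(asca_selected, chisq_selected)
print({name: len(members) for name, members in parts.items()})
print(counts_are_consistent(asca_selected, chisq_selected))
```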


  • Using ASCA to select proteins provides a larger list of proteins, but the majority aren’t affected by temperature.
    • So when we are looking for temperature effects on protein abundance in the networks, it’s not apparent in most of these proteins.
  • I think it’s most relevant to use the same-day comparison adj.Chisq pvalue <= 0.1 threshold for selecting proteins and use their log foldchange to color the nodes in the network.
  • Need to:
    1. make a cleaner version of the R script to make new node and edge attribute files
    2. make a new cytoscape network

from shellytrigg https://ift.tt/2TiUInL

Shelly’s Notebook: Thur. Feb 28, 2019 oyster proteomics network analysis

Calculating abundance fold change between temperatures on the same day

Check that GO Slim terms in protein-GO Slim Terms file match REVIGO’s GO slim terms

  • Figured out that REVIGO changes the names of GO terms (or is using a version of GO terms with different names). See code CheckREVIGOterms.R.
    • 17 GO slim terms from REVIGO don’t overlap with the GO slims from the OntologyX analysis
    • 26 GO slim terms from OntologyX don’t overlap with REVIGO GO slims.


  • only 2/3 of protein-term associations can map to REVIGO GOslim terms
  • REVIGO does not provide a file showing how the 62 OntologyX terms mapped to their GOslim terms
  • Need another way to get GO slim term-term relationships instead of using REVIGO; or another way to relate proteins to each other

Trying Cytoscape STRINGapp to get protein-protein relationships

The STRINGapp gets protein-protein interaction data from the STRING database and uses the data to create a Cytoscape network. Publication here

  • After installing the STRINGapp, I uploaded a list of unique uniprot accessions from ASCA_entry_GO.txt (generated by lines 255-258 in ASCA_proteinNetworkAnalysis_withGO.R) to this Cytoscape session: OysterProteomicsNetworkAnalysis.cys
  • Selected human as the species because the app only allows you to select one species.
  • The resulting network only contained 26 proteins, indicating poor mapping. STRINGappNetwork.jpeg
  • The uniprot accessions I have from mapping the protein sequences to the uniprot database are from many different species


  • Need to find a non-species-specific way to find protein-protein relationships

Using GO semantic similarity in R to get term-term relationships

REVIGO uses GO semantic similarity to determine slim term relationships. The standard cutoff is 0.7.

I used the get_sim_grid() function in the package ontologySimilarity (part of OntologyX) to get GO semantic similarity values. This outputs a symmetric matrix, from which I took the upper half and reshaped it to make a network file listing each GO slim term combination and its GO semantic similarity value. See code: SymanticSim.R, which I used in combination with ASCA_proteinNetworkAnalysis_withGO.R in the UniprotAnnotations_preliminaryNetworkAnalysis.Rproj. The output has 4 columns: (1) Term1, (2) Term2, (3) semantic similarity score, and (4) relationship type (term-term or protein-term). The output file is edge_attb_semsim0.7.csv.
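
The reshaping step (symmetric matrix to edge list) was done in R in SymanticSim.R; the sketch below only illustrates the idea in Python. It assumes the 0.7 cutoff is applied when building the edges, which the output file name suggests but the entry does not state outright. The function name, term labels, and similarity values are hypothetical.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[str, str, float, str]  # (Term1, Term2, similarity, relationship type)


def upper_triangle_edges(terms: Sequence[str],
                         sim: Sequence[Sequence[float]],
                         cutoff: float = 0.7) -> List[Edge]:
    """Turn a symmetric similarity matrix into a term-term edge list.

    Only the upper half (j > i) is visited, so each pair appears once and
    the diagonal (a term compared with itself) is skipped.
    """
    n = len(terms)
    if len(sim) != n or any(len(row) != n for row in sim):
        raise ValueError("similarity matrix must be square and match the term list")
    edges: List[Edge] = []
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= cutoff:  # assumed cutoff step
                edges.append((terms[i], terms[j], sim[i][j], "term-term"))
    return edges


# Placeholder term labels and similarity values, for illustration only.
example_terms = ["term_a", "term_b", "term_c"]
example_sim = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.75],
    [0.2, 0.75, 1.0],
]
for edge in upper_triangle_edges(example_terms, example_sim):
    print(edge)
```

Protein-term rows would be appended to the same four-column layout with "protein-term" in the last column.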

I also generated an edge attribute file with the original GO IDs (not the GO Slim IDs) to avoid loss of information in mapping to slim terms via my OntologyX method and to see if clustering in cytoscape alone would be able to group terms into simpler categories. See lines 65-96 of code SymanticSim.R

Cytoscape network mapping with GO semantic similarity edges

In the same cytoscape session (OysterProteomicsNetworkAnalysis.cys):

  • I created a new network from term-term.csv. This network has no protein information, only term-term info; the point was to see if Cytoscape can cluster the non-slim GO terms into simpler categories. I chose organic clustering and it made this: ASCA_GOterms_semsim_clusters.jpg

It seems like too many GO terms are falling into the same cluster so I’m not sure how meaningful this is.

  • I created a new network from edge_attb_semsim0.7.csv, with ASCA_all_FCtosameday_pval.csv as node attributes. I selected just the GO term nodes, made a new sub-network from those, and did organic clustering to find GO modules. img These smaller clusters of GO slim terms seem a lot easier to interpret than the non-GO slim clusters above.
  • For one module that I’m calling metabolism, I selected all proteins containing any of these terms and made a new network from that.
  • I colored nodes by Day 13 log FC. img
  • I exported the networks as figures and saved the Cytoscape session OysterProteomicsNetworkAnalysis.cys

Conclusions:

  • GO slim terms seem to cluster into more easy-to-interpret groups than non-GO slim terms
  • The Day 13 log FC network does not show many changing proteins
    • Maybe the ASCA selection is too stringent? Or the temperature effect was just overpowered by the time effect so ASCA missed some proteins affected by temperature?
    • I can try this analysis on all proteins showing a same-day logFC between temperatures of 1 or more, or based on ChiSq. pvalue <= 0.1.

from shellytrigg https://ift.tt/2EABPCC