Kaitlyn’s notebook: take down day 1

Today we started breaking down the geoduck experiment at Pt. Whitney. We turned off the Apex and cleaned and stored all probes. CO2 lines were removed from the conicals and secured. We moved most of our materials back up to the lab, where we organized and cleaned them so that the areas are available for the hatchery workers to use and the materials are safe until Sam returns. The geoduck are still in their original heath trays; only their gravity flow is left to be removed. This task will be done tomorrow when Steven and Brent arrive. I took lots of photos. Here are a few!


Kaitlyn’s notebook: Pt. Whitney

Last week I worked at Pt. Whitney, and Sam gave me a tour of the facility. He also showed me how to do titrations to determine alkalinity. I helped him take daily measurements of pH, temperature and salinity for each tray and conical.

Once the dosing pumps arrived last Tuesday, I helped him set up the next part of his experiment. I labelled tubes and snap froze 8 geoduck from each treatment, then we worked as a team to move the top rows of heath trays to different treatments. This was a long and difficult task: we wanted to be sure the geoduck didn’t escape their isolation chambers, but the quarters were tight because of beams, and the trays were quite heavy. Some pipes had to be removed and the conicals had to be lowered to make moving the trays possible. We heard air leaking from the CO2 canisters after the move and had to find a wrench and soap to stop any leakage, as well as clear the lines and clean the banjos. Nevertheless, we were successful in our task!

I also helped with algae counts and changed out the algae bucket, which feeds the larvae for 24 hours. Sam does respirometry on the juveniles every other day as well. I shadowed him moving the geoduck into the respirometry wells, since that task needs to be done quickly to prevent hypoxia. While waiting for titrations and respirometry measurements to finish, I measured the geoduck larvae in ImageJ. These length measurements are used to correct the respiration calculations.

Before I left on Wednesday, I finished measuring larvae and helped Sam take daily measurements. We noticed that the temperature was much higher than in previous measurements, and we saw that it had changed on Tuesday evening. We believe that when we cut off flow to the tanks, the system switched to lagoon water, which may be warmer. However, the chiller should prevent any such change, so the problem could also have been with the chiller. Fortunately, all the trays were at the same temperature at least.

I arrived early today to help Sam with the final day of his experiment. It was the same long procedure as before, except we didn’t have to change treatments. First, we took all of the daily measurements and I helped Sam with the tris calibration. Then I worked with Sam and used the cell counter to measure algae densities of Iso, Pav and Tet. The machine was really cool and much better than the haemocytometer! Once we knew the densities, we could plug the numbers into Sam’s code, which he wrote based on previous literature. This gave us the amounts to use in the buckets, so I filled the 24-hour bucket like last week. Next was respirometry and prepping for snap freezing. I labelled tubes, and emptied and removed labels from all of the old titration samples while Sam did respirometry. Once that was finished, we headed back to the hatchery. This time we only had to move the top heath trays to reach the geoduck in the bottom trays and snap freeze them, although the pipes and conicals still had to be removed and drained.
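Sam’s code is not reproduced in this notebook. As a rough illustration of how a measured cell density can turn into a bucket amount, here is a minimal sketch that assumes a simple dilution (C1 × V1 = C2 × V2). The function name, the stock densities, the target density and the bucket volume are all made up for the example and are not the values we used.

```python
def stock_volume_ml(stock_density, target_density, final_volume_ml):
    """Volume of algae stock (mL) to add so that final_volume_ml ends up at
    target_density, from C1 * V1 = C2 * V2. Densities are in cells/mL."""
    if stock_density <= 0:
        raise ValueError("stock density must be positive")
    if target_density > stock_density:
        raise ValueError("target density cannot exceed stock density")
    return target_density * final_volume_ml / stock_density


# Hypothetical cell-counter readings in cells/mL (not measured values).
stock_densities = {"Iso": 2.0e6, "Pav": 1.5e6, "Tet": 5.0e5}
target_density = 1.0e5   # assumed target for each species, cells/mL
bucket_ml = 20000.0      # assumed bucket volume, mL

volumes = {
    species: stock_volume_ml(density, target_density, bucket_ml)
    for species, density in stock_densities.items()
}
for species, volume in volumes.items():
    print(f"{species}: add {volume:.0f} mL of stock")
```

The real calculation may well account for things this sketch ignores, such as cell size differences between species or the feeding rate over 24 hours.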

Tomorrow we are breaking down the experiment and moving the geoduck to a different part of the hatchery. We will also make sure everything is tidied up and organized for when Sam returns. I will take photos of the geoduck setup and lab tomorrow!

Kaitlyn’s Notebook: New table with annotations and Kmeans run time…

I merged the Uniprot annotated table with each silo that had the quantitative and qualitative tags I previously made: new table.

I want to note that this table includes proteins that are not abundant in each silo. I chose to include them for now since they are easily removable. I was thinking that some Revigo plots with the 0 abundance proteins might reveal some differences between the silos… There are ~1000 to 1500 proteins not expressed in each table (out of about 8400 proteins).

I’m making new scree plots because my last ones weren’t made with the right code (nhclus.scree(x, max.k=#)). It has not been successful yet because of how long it takes to run: I’ve let it run for over 4 hours with no results. I am trying one more time and plan to let it run overnight. If it takes that long, though, it may not be feasible, since I need to do it 3 times and then run kmeans, which takes a few hours itself…
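For context on why this is slow: a scree plot for k-means needs one full clustering run for every candidate number of clusters, and each run loops over all ~8400 proteins many times. The sketch below is not the R function I’m using; it is a small Python illustration, assuming the values plotted are the total within-cluster sum of squares for k = 1 up to max_k. The function names and the toy data are made up.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_wss(rows, k, n_iter=50, seed=0):
    """Run basic Lloyd's k-means; return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centers = rng.sample(rows, k)  # start from k randomly chosen rows
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for row in rows:
            nearest = min(range(k), key=lambda j: sq_dist(row, centers[j]))
            clusters[nearest].append(row)
        # New center = column means of the members; keep old center if empty.
        new_centers = [
            [sum(col) / len(members) for col in zip(*members)]
            if members else centers[j]
            for j, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(row, c) for c in centers) for row in rows)


def scree_values(rows, max_k):
    """One k-means run per k; this repetition is what makes a scree slow."""
    return [kmeans_wss(rows, k) for k in range(1, max_k + 1)]


# Toy data: two obvious groups of 'proteins' measured on two days.
rows = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1],
        [8.0, 9.0], [8.1, 9.2], [7.9, 8.8]]
wss = scree_values(rows, 3)
print(wss)  # look for the 'elbow' where adding clusters stops helping
```

If the run time stays a problem, testing the scree on a random subset of proteins first might be one way to check that the code works before committing to an overnight run.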

Also, this came up in lab meeting: changing max.print options in R.

Kaitlyn’s Notebook: Kmeans Clustering

I used an R script provided by Emma to cluster each silo.

The scree plots for each silo are:

I chose four clusters for each silo so that they could be better compared with one another. It was somewhat difficult to determine cluster numbers because the first component was so much larger than any others…

The plots are comparing the first measured day of the experiment with the last.

Initially I chose only 3 clusters for Silo 2. The comparison between 3 and 4 clusters for Silo 2:

Here is Silo 3 followed by Silo 9 with 4 clusters:


Next I’m working on eigenvectors (I’m having a problem getting R to read my data as numeric) and incorporating the cluster assignment into the Excel spreadsheet as a tag.

Kaitlyn’s Notebook: Basic Statistical ‘Tags’

I updated an Excel spreadsheet so it has multiple stats that I thought might be useful for seeing patterns in expression. There are multiple sheets in the file: combined data with a few tags, followed by Silo 2, 3 and 9 with all tags. The tags, and why they may be helpful in seeing protein expression patterns, are listed below.

  1. Average- is this protein typically highly or lowly expressed?
  2. Standard Deviation- how much does each day deviate from the average?
  3. Coefficient of Variation- normalized dispersion (standard deviation divided by the average); how dispersed the protein expression is
  4. Variance- less useful than (3), but another representation of the dispersion of protein expression
  5. Median- valuable if compared to the average protein abundance to understand if protein expression is consistent
  6. Slope- linear regression to understand the overall trend of protein expression (decreasing vs. increasing)
  7. Kurtosis- understand if the protein has a sharp peak in protein expression
  8. Skewness- informs us if the protein is being expressed more in a certain half of the experiment
  9. Max- is the protein expressed a lot at any point in the experiment?
  10. Min- is there a time when the protein is not expressed?
  11. Range- the overall change in protein expression (does not inform us whether it is increasing or decreasing)
  12. 1st quartile- What is the cutoff for 25% expression over the course of the experiment?
  13. 3rd quartile- What is the cutoff for 75% expression over the course of the experiment?
  14. Sum- determines if the protein was highly abundant over the course of the experiment (relative to the sums of other proteins)
  15. Day0:Day15- a ratio of the day before treatment to the final day of the experiment; informs us if the protein significantly changed after treatment
  16. Day3:Day15- a ratio of the first measured day of treatment to the final day of treatment
  17. Average for Days 0-7- valuable when compared to the second average to see if there was a change in protein expression halfway through the larvae’s lives
  18. Average for Days 9-15- a complement to the above tag
  19. Range for Days 0-7- valuable when compared to the range for days 9-15; further elucidates changes in expression between the first half of the experiment and the second half
  20. Sum:Total Proteins Identified- what percentage of the total protein abundance in the experiment is accounted for by this protein?
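The tags themselves were calculated in the spreadsheet. For anyone who wants to reproduce a few of them in code, here is a minimal Python sketch for a single protein. The sampling days, the abundance values and the function name are assumptions for illustration, and the quartiles use the "inclusive" method, which may or may not match the spreadsheet formulas. Kurtosis, skewness and the day ratios are left out to keep it short.

```python
import statistics


def protein_tags(days, abundance):
    """Return a dict of summary 'tags' for one protein's abundance series."""
    mean = statistics.mean(abundance)
    sd = statistics.stdev(abundance)  # sample standard deviation
    q1, _, q3 = statistics.quantiles(abundance, n=4, method="inclusive")

    # Least-squares slope of abundance against day.
    day_mean = statistics.mean(days)
    slope = (
        sum((d - day_mean) * (a - mean) for d, a in zip(days, abundance))
        / sum((d - day_mean) ** 2 for d in days)
    )

    # Split into the two halves of the experiment used by tags 17-19.
    first_half = [a for d, a in zip(days, abundance) if d <= 7]
    second_half = [a for d, a in zip(days, abundance) if d >= 9]

    return {
        "average": mean,
        "sd": sd,
        "cv": sd / mean if mean else None,  # undefined when the mean is 0
        "variance": statistics.variance(abundance),
        "median": statistics.median(abundance),
        "slope": slope,
        "max": max(abundance),
        "min": min(abundance),
        "range": max(abundance) - min(abundance),
        "q1": q1,
        "q3": q3,
        "sum": sum(abundance),
        "avg_days_0_7": statistics.mean(first_half),
        "avg_days_9_15": statistics.mean(second_half),
        "range_days_0_7": max(first_half) - min(first_half),
    }


# Assumed sampling days and made-up abundances for one protein.
days = [0, 3, 5, 7, 9, 11, 13, 15]
abundance = [0.0, 2.0, 4.0, 6.0, 8.0, 6.0, 4.0, 2.0]
tags = protein_tags(days, abundance)
for name, value in tags.items():
    print(name, value)
```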

I’m not sure how else I should proceed with this data. I could potentially look at gene enrichment, but I believe that a significant portion of proteins should be eliminated beforehand. Knowing which proteins to eliminate can be difficult because each ‘tag’ can highlight a new trait of that protein. Therefore, eliminating proteins will mostly depend on future interests for this data set.

Kaitlyn’s Notebook: NMDS and Protein Quant

Just a refresher, I’ve been working with Rhonda’s 2016 oyster larvae data.

This experiment looked at oyster mortality at two temperatures: 23C and 29C. Proteomic work was done on 3 silos, two at 23C and one at 29C. When the data for Silo 2 and Silo 3 are compared, we can see that Silo 2 had higher rates of mortality. Silo 3 and 9, which were at 23C and 29C respectively, had the same mortality rate of 10%. Therefore we decided to make an NMDS that looked at Silo 3 and 9 only. In the point labels, the X is an artifact of the code. The first number is the silo number and the number following the underscore is the day of the experiment.
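R typically adds an X in front of column names that start with a digit, which is the likely source of the X in the labels. As a small illustration of how the labels break down, here is a Python sketch that assumes they look like X3_15 (silo 3, day 15); the function name and the exact label format are assumptions based on the description above.

```python
import re

# Optional leading X, then silo number, underscore, day number.
LABEL_PATTERN = re.compile(r"^X?(\d+)_(\d+)$")


def parse_label(label):
    """Split a label such as 'X3_15' into (silo, day) as integers."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unexpected label format: {label!r}")
    return int(match.group(1)), int(match.group(2))


silo, day = parse_label("X3_15")
print(f"silo {silo}, day {day}")
```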

It looks like Silo 3 had more days that were less similar to most days of the experiment. Silo 3 was at 23C. Rhonda previously reported that hatcheries are growing larvae at 29C because of the higher mortality rate that occurs at 23C. This NMDS plot shows that Silo 3 stood out at the beginning of the experiment (day 3) and at the end of the experiment (day 15). Silo 9 seems more consistently related to other days of the experiment.

For comparison, here is the previous NMDS plot containing all of the silos:

In addition to the NMDS, I did a quick quantification of the number of different proteins expressed each day (under “Protein”) and the total protein abundance (under “Protein Abundance”).
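The two quantities can be sketched in Python as follows, assuming a table that maps each protein to one abundance value per day and treating any abundance above 0 as "expressed". The protein names, days and numbers are made up for the example.

```python
def daily_summary(days, table):
    """For each day, count proteins with abundance > 0 and sum all abundance.

    `table` maps a protein ID to a list of abundances, one per entry in `days`.
    """
    summary = {}
    for i, day in enumerate(days):
        column = [values[i] for values in table.values()]
        summary[day] = {
            "proteins": sum(1 for value in column if value > 0),
            "abundance": sum(column),
        }
    return summary


# Made-up example: three proteins over three days.
days = [0, 3, 5]
table = {
    "P1": [0.0, 2.0, 3.0],
    "P2": [1.0, 0.0, 4.0],
    "P3": [0.0, 0.0, 0.0],
}
summary = daily_summary(days, table)
for day, counts in summary.items():
    print(day, counts["proteins"], counts["abundance"])
```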

Kaitlyn’s Notebook: Gene enrichment of unique proteins

I grouped proteins that had 0 abundance on day 1 based on the number of days they were abundant and ran each group through CompGO for biological processes at a cutoff of 0.1. To see if there was a difference in the number of proteins expressed in each group, I made a small table. All values were similar. Highlighted values met the 0.1 p-value cutoff.
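The grouping step can be sketched in Python like this. It assumes that "day 1" means the first measured day in the table, that a protein counts as abundant on a day when its value is above 0, and that proteins never detected on any day are skipped. The function name and the data are made up.

```python
from collections import defaultdict


def group_by_days_present(table):
    """Group proteins that start at 0 by how many later days they are abundant.

    `table` maps a protein ID to a list of abundances in day order.
    Returns {number_of_days: [protein IDs]}.
    """
    groups = defaultdict(list)
    for protein, values in table.items():
        if values[0] != 0:
            continue  # only proteins with 0 abundance on the first day
        n_days = sum(1 for value in values[1:] if value > 0)
        if n_days > 0:  # assumption: never-detected proteins are left out
            groups[n_days].append(protein)
    return dict(groups)


# Made-up abundances for five proteins over four days.
table = {
    "P1": [0, 5, 0, 0],
    "P2": [0, 1, 2, 0],
    "P3": [3, 1, 1, 1],
    "P4": [0, 0, 0, 0],
    "P5": [0, 0, 0, 7],
}
groups = group_by_days_present(table)
print(groups)
```

Each list of protein IDs would then be the input for one enrichment run.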

I think it’s interesting that all silos had enrichment among proteins that were abundant for only 1 day. All silos had peptidyl-tyrosine dephosphorylation, or “the removal of phosphoric residues from peptidyl-O-phospho-tyrosine to form peptidyl-tyrosine”, enriched at the 1E-1 p-value cutoff.

Silo 2- 1 day of protein abundance:

Silo 3- 1 day of protein abundance:

Silo 9- 1 day of protein abundance:

Other enriched processes are:

Silo 2- cellular response to retinoic acid (6 days),

Silo 3- intracellular protein transport (4 days) and maturation of SSU-rRNA from tricistronic rRNA transcript (5 days),

and Silo 9- negative regulation of endopeptidase activity (7 days). These can be viewed below in respective order.

Silo 2- 6 days of protein abundance:

Silo 3- 4 and 5 days of protein abundance respectively:

Silo 9- 7 days of protein abundance: