I met with Steven and Shelly today and showed them the basic enrichment analyses I did. We discussed what question we are going to attempt to address and possible ways to answer it, or at least to parse out some data or patterns. If we want to address the differences between silos held at 23C and 29C, then we need to compare them directly. There are two silos at 23C (Silo 2 and Silo 3) and one silo at 29C (Silo 9), so each comparison is between 2 silos. To distinguish temperature differences: Silo 3 or Silo 2 versus Silo 9.
(I could also look into possible changes in protein abundance or detection that resulted in increased mortality in Silo 2. This would mean I would run analyses between Silo 2 and Silo 3.)
Each silo contains about 7400 detected proteins with no more than about 5400 proteins abundant each day. It is likely that many of these proteins are similarly abundant between the silos throughout the experiment. Therefore, we need to parse out which proteins are differentially abundant between the silos:
I can rename the proteins according to their silo (e.g. 3-CHOYP-….. or 9-CHOYP-….) and run a kmeans clustering analysis on the combined set. This would generate a dataframe with cluster assignments for proteins from both silos. A protein whose Silo 3 and Silo 9 versions fall in the same cluster has a similar abundance pattern in both silos, so it would not explain significant differences between the 23C and 29C silos. Proteins whose two versions fall in different clusters are the candidates of interest. Those differences between the silos can be seen in the (or a new) NMDS plot of said silos.
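As a rough illustration of that idea, here is a minimal Python sketch (not the actual analysis code). Everything in it is an assumption made for the example: the protein names (protA, protB, protC), the abundance values, the choice of 2 clusters, and the helper names kmeans and shared_cluster_report are all hypothetical. It prefixes each protein with its silo number, clusters the combined abundance profiles with a plain Lloyd's kmeans, and then checks whether the two silo versions of each protein ended up in the same cluster.

```python
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples.

    Returns one cluster label (0..k-1) per point. Label numbers are arbitrary.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Recompute each center as the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments have stopped changing
        centers = new_centers
    return labels


def shared_cluster_report(silo_a, silo_b, label_a, label_b, k):
    """Cluster both silos together and split proteins into same/different."""
    names, points = [], []
    for label, silo in ((label_a, silo_a), (label_b, silo_b)):
        for protein, profile in silo.items():
            names.append(f"{label}-{protein}")  # e.g. "3-protA", "9-protA"
            points.append(tuple(profile))
    clusters = dict(zip(names, kmeans(points, k)))

    same, different = [], []
    for protein in sorted(set(silo_a) & set(silo_b)):
        if clusters[f"{label_a}-{protein}"] == clusters[f"{label_b}-{protein}"]:
            same.append(protein)
        else:
            different.append(protein)
    return same, different


# Hypothetical abundance profiles over four sampling days (made-up numbers).
silo3 = {"protA": (1, 2, 3, 4), "protB": (4, 3, 2, 1), "protC": (1, 2, 3, 5)}
silo9 = {"protA": (1, 2, 3, 4.5), "protB": (4.2, 3, 2, 1), "protC": (5, 3, 2, 1)}

same, different = shared_cluster_report(silo3, silo9, "3", "9", k=2)
print("similar pattern in both silos:", same)
print("different pattern between silos:", different)
```

In this toy data protC rises in Silo 3 and falls in Silo 9, so it is the only protein flagged as different. With real data the number of clusters would need to be chosen carefully, and the profiles would probably need scaling first so that the clustering reflects the shape of the abundance pattern rather than the overall abundance level.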
I’m running kmeans first and then will move on to the other stats options.
Some other stats options: