Yaamini’s Notebook: MEPS Revisions Part 3

Final unconstrained and constrained ordinations

I finally figured out all of my N/A problems and completed my unconstrained and constrained ordinations!

Protein abundance

I didn’t have any problems with my protein abundance NMDS and previously obtained pairwise ANOSIM R-statistic and p-values. In this R script, I visualized my ordinations. You can find one version with sample numbers here, and another with site and habitat distinctions here. Because I exported my files as pdfs and don’t feel like changing the code and re-exporting files, I’m also including screenshots of the ordinations.

screen shot 2018-11-29 at 10 58 33 am

screen shot 2018-11-29 at 10 58 49 am

Figures 1 and 2. Protein abundance ordinations with confidence ellipses.

Environmental data

My original problem with the environmental data was that I had N/As in my dataframe, but was unable to use method = "gower" with the metaMDS function in the package vegan. Turns out vegan uses vegdist, which doesn’t handle N/As. Instead, I used the daisy function in the cluster library to calculate a Gower’s dissimilarity (distance) matrix. I then inputted that matrix directly into metaMDS with method = "euclidean". It worked! My code can be found here. I then conducted an Analysis of Similarity (ANOSIM) to assess the significance of my results.

Table 1. One-way ANOSIM results for environmental data NMDS based on site (Case Inlet vs. Fidalgo Bay vs. Port Gamble Bay vs. Skokomish River Delta vs. Willapa Bay), habitat (bare vs. eelgrass), parameter (mean vs. variance), and environmental variable (pH vs. dissolved oxygen vs. salinity vs. temperature). Significant p-values are bolded.

One-way ANOSIM R p-value
Site -0.03428841 0.943
Habitat -0.02037461 0.982
Parameter 0.7378174 0.001
Environmental Variable 0.1796322 0.001

Only the parameter and environmental variable ANOSIMs were significant. My guess is that because I ordinated means and variances in the same space, this is skewing my results. I decided to conduct two-way ANOSIMs to go further.

Table 2. Two-way ANOSIM results for environmental data NMDS. Significant p-values are bolded.

Two-way ANOSIM R p-value
Site and Habitat -0.09991229 1
Parameter and Site 0.4778027 0.001
Parameter and Habitat 0.4778027 0.001
Environmental Variable and Site 0.02098092 0.315
Environmental Variable and Habitat 0.1089352 0.007
Environmental Variable and Parameter 0.7802082 0.001

When accounting for parameter, site and habitat were significant. When accounting for environmental variable, only habitat was significant. I included environmental variable and parameter, but I already figured that would be significant. I need to double check with Julian, but I belive conducting pairwise ANOSIMs based on the significant two-way ANOSIM results will help me understand what is driving the significant results.

I also visualized my ordination here.

screen shot 2018-11-29 at 11 08 28 am

Figure 3. Environmental data NMDS. Means are outlined in solid lines, variances are outlined in dashed lines.

Constrained ordination

In this R Markdown file, I conducted a constrained ordination to look at relationships in protein abundance based on my environmental data. To do this, I first created an environmental data matrix with mean and variance values from the entire outplant, and matched them with my objects, oyster sample IDs. This meant I had the same objects in both my protein abundance and environmental data matrices. I calculated the gradient length and found that the underlying model was linear, so I used an RDA. I specified na.action = na.exclude in my rda function so it could handle my missing values.

Table 3. Variance explained by constrained ordination. The RDA explains 29.21% of all variance.

Variance Partition Inertia Proportion
Total 0.015616 1
Constrained 0.004561 0.2921
Unconstrained 0.011055 0.7079

Table 4. ANOVA results for overall RDA analysis. With F(6,19) = 1.3064 and p-value = 0.195, the RDA is not significant.

Partition Df Variance F Pr(>F)
Model 6 0.0045607 1.3064 0.195
Residual 19 0.0110550

Table 5. Significance of each RDA axis. No axis is significant.

Axis Df Variance F Pr(>F)
RDA1 1 0.0023293 4.0033 0.316
RDA2 1 0.0013380 2.2995 0.603
RDA3 1 0.0003888 0.6682 0.998
RDA4 1 0.0003065 0.5268 0.994
RDA5 1 0.0001395 0.2397 0.999
RDA6 1 0.0000586 0.1008 0.999

Table 6. Significance of each environmental descriptor in the RDA. Temperature mean and variance are marginally significant predictors.

Environmental Variable Df Variance F Pr(>F)
Temperature Mean 1 0.0013542 2.3275 0.065
Temperature Variance 1 0.0012584 2.1627 0.067
pH Mean 1 0.0003063 0.5264 0.781
pH Variance 1 0.0007082 1.2171 0.301
Dissolved Oxygen Mean 1 0.0006024 1.0354 0.349
Dissolved Oxygen Variance 1 0.0003312 0.5692 0.725
Residual 19

One weird thing I noticed is that my salinity variables do not show up in Table 6 or my ordinations below. That’s something I’ll have to figure out with Julian. Other than that, it’s interesting to note that my RDA really isn’t significant, which fits the loose interpretation we had earlier of environment affecting protein abundance in the manuscript.

To visualized my contrained ordination, I included all proteins and environmental descriptors here, and only significant proteins and environmental descriptors here.

screen shot 2018-11-29 at 11 06 57 am

screen shot 2018-11-29 at 11 07 06 am

Figures 4-5. RDA visualization including all proteins and descriptors (top) and only those that are significant (bottom).

Going forward

  1. Discuss all results with Julian
  2. Create a better ordination visual
  3. Update manuscript with new methods and results
  4. Tweak discussion

// Please enable JavaScript to view the comments powered by Disqus.

from the responsible grad student https://ift.tt/2ACXXui