Yaamini’s Notebook: Remaining Analyses Part 22

Revising environmental data

Before I begin, a quick shoutout to myself for making the comments on my R scripts detailed enough that I can return to a script I created in December and rerun it!

Micah and Alex both pointed out that the Pacific oysters were only outplanted on June 19, 2016. This means all of my environmental data calculations included data points from before the experiment even started! To fix this, I made changes to this R script. I used two commands to subset the data I needed:

temperatureData$Date <- as.Date(temperatureData$Date, format = "%m/%d/%y")

This command makes R recognize all of my dates as actual dates instead of character strings.

temperatureData <- temperatureData[temperatureData$Date >= "2016-06-19", ]

This command allows me to subset rows with dates on or after 2016-06-19, the start of the outplant experiment.
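To see the two commands working together, here is a minimal sketch on a tiny made-up data frame. The column name Temperature, the readings, and the outplantDate variable are assumptions for illustration only, not values from my actual data.

```r
# Made-up example data: dates stored as month/day/two-digit-year character strings
temperatureData <- data.frame(
  Date = c("06/17/16", "06/18/16", "06/19/16", "06/20/16"),
  Temperature = c(14.2, 14.8, 15.1, 15.6),
  stringsAsFactors = FALSE
)

# Convert the character strings to Date objects
temperatureData$Date <- as.Date(temperatureData$Date, format = "%m/%d/%y")

# Keep only rows on or after the outplant date.
# Wrapping the cutoff in as.Date() makes the comparison explicit.
outplantDate <- as.Date("2016-06-19")
temperatureData <- temperatureData[temperatureData$Date >= outplantDate, ]

# Check that the earliest remaining date is the outplant date
range(temperatureData$Date)
```

Checking the date range after subsetting is a quick way to confirm that no pre-outplant rows slipped through.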

I repeated those commands for my temperature, pH, dissolved oxygen, and salinity data. With the subset, I calculated variables of interest:
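Repeating the same two commands for each variable could also be wrapped in a small helper. The sketch below is one way to do that, not what my script actually does: the function name subsetFromOutplant, the Value column, the toy numbers, and the choice of summary statistics are all assumptions.

```r
# Hypothetical helper: parse the Date column and keep rows on or after the outplant date
subsetFromOutplant <- function(data, startDate = as.Date("2016-06-19")) {
  data$Date <- as.Date(data$Date, format = "%m/%d/%y")
  data[data$Date >= startDate, ]
}

# Made-up example data for two of the environmental variables
environmentalData <- list(
  temperature = data.frame(
    Date = c("06/18/16", "06/19/16", "06/20/16"),
    Value = c(14.8, 15.1, 15.6),
    stringsAsFactors = FALSE
  ),
  pH = data.frame(
    Date = c("06/18/16", "06/19/16", "06/20/16"),
    Value = c(7.9, 8.0, 8.1),
    stringsAsFactors = FALSE
  )
)

# Apply the same subset to every variable
subsetData <- lapply(environmentalData, subsetFromOutplant)

# Example summary statistics per variable (rows = statistic, columns = variable)
summaryStats <- sapply(subsetData, function(data) {
  c(mean = mean(data$Value), sd = sd(data$Value),
    min = min(data$Value), max = max(data$Value))
})
summaryStats
```

Keeping the cutoff date as a default argument means it only needs to change in one place if the start date is ever revised.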

In separate R scripts, I used the same commands to subset data and remake my boxplots and line graphs that show fluctuation over time. You can find the scripts below:

A quick note on salinity:

I had this .csv file with calculated salinity output from Micah. In my paper draft, Micah pointed out that those are conductivity values, and not salinity. However, conductivity values would be much smaller! I emailed Micah to confirm what the values were, and he sent me this response:

I did indeed send you ‘calculated salinity’ values – I took the conductivity output in mS/cm and converted it to salinity in PSU at the in situ temperature using the swSCTp function in the R package oce. BUT, after making a call to the manufacturer, I’ve found out that the conductivity output is standardized by the sensor software to ‘what it would be at 25°C’ rather than outputting the raw value at the in situ temperature. So, roughly speaking, salinity values should be ~7 units lower than they are in this spreadsheet.

TL;DR They are salinity values but the wrong ones. Either way, Willapa Bay still has the lowest salinity values. My code will be able to handle revised data once I get it, so I’m all set on that front.
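To illustrate why the temperature passed to swSCTp matters, here is a rough sketch with made-up numbers. The conductivity reading, in situ temperature, and pressure of 0 dbar are assumptions, and this is not Micah's actual code. It only shows that the same conductivity value yields a lower salinity when it is treated as a 25°C-standardized reading than when it is treated as an in situ reading at a cooler temperature.

```r
# Made-up example values (assumptions, not real sensor output)
conductivity <- 40        # mS/cm, as reported by the sensor (standardized to 25°C)
inSituTemperature <- 15   # °C, water temperature at the time of the reading
pressure <- 0             # dbar, assuming a near-surface sensor

# The oce package must be installed for this comparison to run
if (requireNamespace("oce", quietly = TRUE)) {
  # Original calculation: treats the reading as raw conductivity at in situ temperature
  salinityInSitu <- oce::swSCTp(conductivity, temperature = inSituTemperature,
                                pressure = pressure, conductivityUnit = "mS/cm")

  # Revised calculation: treats the reading as conductivity standardized to 25°C
  salinityStandardized <- oce::swSCTp(conductivity, temperature = 25,
                                      pressure = pressure, conductivityUnit = "mS/cm")

  print(c(inSitu = salinityInSitu, standardized = salinityStandardized))
}
```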


from the responsible grad student https://ift.tt/2IvgR91

Yaamini’s Notebook: Gonad Methylation Analysis Part 14

A spoiled enchilada

I aligned all of my files, and proceeded to deduplication, sorting, and indexing in this Jupyter notebook. All of that was successful!

I was ready to use bismark_methylation_extractor but I ran into an error instead. To run bismark_methylation_extractor, I need to provide the path to my deduplicated — but unsorted — .bam files. The error file says that I need to use unsorted files, but I’m already directing the program to my unsorted files! I posted this issue to figure it out.

Since I’m replicating Steven’s code, the only difference between my work and his is that I used the default --score_min option. However, he replicated my work in this notebook and this notebook, and did not get an error! Thinking everything was peachy, I reran my code and still got the same error!


I am officially stumped and unable to continue these analyses for now. Back to the DNR paper!


from the responsible grad student https://ift.tt/2KyTCvv