Emu had been running Laura’s Geoduck samples for the last three weeks. All of the blanks worked perfectly, producing the expected files and not throwing any exceptions. The actual informative sample files? Not so much. They seemed to be getting killed by Grid Engine at random points in the process. The only explanation I can imagine is some sort of cascading failure, where a subset of jobs used all available system memory and starved the others out, leading Grid Engine to kill them for lack of system resources. Some real Darwinian stuff going on in Emu’s internals. This was corroborated by reading every line of code for Pecan and its support methods and seeing no good reason for failure other than an out-of-memory issue.
On the Geoduck front, I’m slowly working my way through prepping the Bismark .bam files for a run through RADMeth. I’m at the methcounts step, which produces a text file of genomic locations, methylation ratios, and coverage data, but it is exceptionally slow. I started a file at the beginning of my train ride, and it has yet to finish as I see my stop approaching. I’ll try to leave it running over the weekend, and hopefully everything will be finished by the end of it.
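For a quick sanity check once a methcounts file does finish, something like the sketch below could summarize it. It assumes the usual six-column, whitespace-separated methcounts layout (chromosome, position, strand, context, methylation ratio, coverage); the names `MethSite`, `parse_methcounts`, and `weighted_methylation` are made up for illustration and are not part of MethPipe or RADMeth.

```python
from typing import Iterable, Iterator, NamedTuple


class MethSite(NamedTuple):
    """One row of a methcounts-style file (assumed six-column layout)."""
    chrom: str
    pos: int
    strand: str
    context: str
    meth_ratio: float
    coverage: int


def parse_methcounts(lines: Iterable[str]) -> Iterator[MethSite]:
    """Yield one MethSite per non-empty, non-comment line.

    Assumes whitespace-separated columns in the order:
    chrom, position, strand, context, methylation ratio, coverage.
    A line with fewer than six fields raises ValueError.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, pos, strand, context, ratio, cov = line.split()[:6]
        yield MethSite(chrom, int(pos), strand, context, float(ratio), int(cov))


def weighted_methylation(sites: Iterable[MethSite], min_coverage: int = 1) -> float:
    """Coverage-weighted mean methylation over sites with enough reads."""
    total_cov = 0
    total_meth = 0.0
    for site in sites:
        if site.coverage < min_coverage:
            continue  # skip sites with too few reads to be informative
        total_cov += site.coverage
        total_meth += site.meth_ratio * site.coverage
    return total_meth / total_cov if total_cov else float("nan")
```

Because `parse_methcounts` accepts any iterable of lines, an open file handle can be passed straight in (for example `weighted_methylation(parse_methcounts(open("sample.meth")))`, where `sample.meth` is a placeholder name), so the whole file never needs to sit in memory at once.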
All in all, it doesn’t feel like I’ve actually produced much beyond a ton of personal consternation at the difficulty of getting the DIA data analysis workflow actually working. Hopefully next week will be better.