Making bedgraphs and q-value results.
I had a couple of projects to tackle today. First, Steven wanted some new bedgraphs for the Day 10 samples he's working on annotating. The new files can be found here, and the notebook with the generation code, along with an explanation of how they are made, can be found here.
Next was playing with the qvalue() package in R. The q-value is a way to estimate the false discovery rate (FDR) in an analysis with multiple comparisons. Given that we look at a minimum of ~2000 loci, FDR is an important consideration. In my notebook here, I work through the package's supplied sample data to get an idea of what I should expect, and then work through our 10x and 5x coverage data.
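The qvalue package implements Storey's q-value method, which estimates the proportion of true nulls before adjusting. As a rough point of comparison, the simpler Benjamini-Hochberg FDR adjustment (related to, but not the same as, Storey's q-values) can be sketched in a few lines of Python; `bh_adjust` is an illustrative name, not anything from the qvalue package:

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values.

    Simpler than Storey's q-values (no pi0 estimation), but gives a
    ballpark sense of what FDR control does to a set of p-values.
    """
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    # scale each sorted p-value by m / rank
    ranked = p[order] * m / np.arange(1, m + 1)
    # enforce monotonicity from the largest p-value down
    adj = np.minimum.accumulate(ranked[::-1])[::-1]
    adj = np.clip(adj, 0.0, 1.0)
    out = np.empty(m)
    out[order] = adj
    return out

# e.g. bh_adjust([0.01, 0.02, 0.03, 0.5]) -> [0.04, 0.04, 0.04, 0.5]
```

Storey's method additionally multiplies by an estimate of the null proportion (pi0), so its q-values are never larger than the BH values above.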
The end result: our results look nothing like the sample results.
My next plan, at Brent's advice, is to simulate a range of data, starting from what the sample data looks like and working toward what my data looks like (% significant, number of comparisons, distribution of p-values), to see if I can find a trend that explains why we're seeing what we do.
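The simulation plan can be sketched as a p-value mixture model: null tests give uniform p-values, true signals give p-values skewed toward zero, and the knobs are the number of tests and the fraction of true signals. This is a minimal Python sketch of that idea, not the actual simulation code; `simulate_pvals`, `prop_alt`, and the Beta-distributed alternative are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_pvals(n_tests, prop_alt, alt_shape=0.1):
    """Simulate p-values as a mixture of null and alternative tests.

    Null p-values are Uniform(0, 1); alternative p-values are drawn
    from Beta(alt_shape, 1), which piles mass near zero. prop_alt
    controls '% significant' and n_tests the number of comparisons.
    """
    n_alt = int(n_tests * prop_alt)
    null_p = rng.uniform(size=n_tests - n_alt)
    alt_p = rng.beta(alt_shape, 1.0, size=n_alt)
    return np.concatenate([null_p, alt_p])

# sweep over number of comparisons and fraction of true signals,
# and eyeball a crude pi0 estimate (fraction of p > 0.5, doubled)
for n in (200, 2000):
    for frac in (0.05, 0.5):
        p = simulate_pvals(n, frac)
        pi0 = min(1.0, 2 * np.mean(p > 0.5))
        print(f"n={n} prop_alt={frac} pi0~{pi0:.2f}")
```

Sweeping these parameters from sample-data-like settings toward our data's settings, and looking at the resulting p-value histograms, should show where the qvalue output starts behaving the way ours does.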
Also, I plan on re-reading the original q-value paper, which you can find here if you're feeling particularly masochistic.