Steven requested a comparison of geoduck genome assemblies.
Ran the following Quast command:
/home/sam/software/quast-4.5/quast.py \
-t 24 \
--labels 20180405_sparse_kmer101,supernova_pseudohap_duck4-p,20180421_Hi-C \
/mnt/owl/Athaliana/20180405_sparseassembler_kmer101_geoduck/Contigs.txt \
/mnt/owl//halfshell/bu-mox/analyses/0305b/duck4-p.fasta.gz \
/mnt/owl/Athaliana/20180419_geoduck_hi-c/Results/geoduck_roberts\ results\ 2018-04-21\ 18\:09\:04.514704/PGA_assembly.fasta
Quast output folder: results_2018_04_30_08_00_42/
Quast report (HTML): results_2018_04_30_08_00_42/report.html
The data’s pretty interesting and cool!
SparseAssembler has over 2x the amount of data (in base pairs), yet produces the worst assembly.
SuperNova and Hi-C assemblies are very close in nearly all categories. This isn’t surprising, as the SuperNova assembly was used as a reference assembly for the Hi-C assembly.
However, the Hi-C assembly is insanely better than the SuperNova assembly! For example:
- Largest contig is ~7x larger than the SuperNova assembly.
- The N50 size is ~243x larger than the SuperNova assembly!!
- L50 is only 18, 46x smaller than the SuperNova assembly!
This is pretty amazing, honestly. Even more amazing is that this data was sent over to us as some “preliminary” data for us to take a peek at!
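For anyone unfamiliar with the N50 and L50 numbers quoted above, here is a minimal sketch of how they are defined, assuming you already have a list of contig lengths (one per line). The function name n50_l50 and the toy lengths are made up for illustration; this is not how Quast computes its report, just the standard definition: sort contigs longest first, add up lengths until you reach at least half of the total assembly size, and N50 is the length of the contig that gets you there while L50 is how many contigs it took.

```shell
# Hypothetical helper (not part of Quast): reads contig lengths from stdin,
# one per line, and prints N50 and L50.
n50_l50() {
  # Sort longest first, then accumulate lengths until half the total is reached.
  sort -rn | awk '
    { len[NR] = $1; total += $1 }
    END {
      half = total / 2
      for (i = 1; i <= NR; i++) {
        cum += len[i]
        if (cum >= half) {
          # len[i] is the N50 contig length; i is the number of contigs used (L50).
          printf "N50=%d L50=%d\n", len[i], i
          exit
        }
      }
    }'
}

# Toy example with made-up contig lengths (total = 290, half = 145).
printf '%s\n' 80 70 50 40 30 20 | n50_l50
```

A bigger N50 and a smaller L50 both mean the same amount of sequence is packed into fewer, longer pieces, which is why the Hi-C numbers stand out.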
from Sam’s Notebook https://ift.tt/2rbMUE8
Ran the following Quast command to compare the two geoduck assemblies provided to us by Phase Genomics:
/home/sam/software/quast-4.5/quast.py \
-t 24 \
--labels 20180403_pga,20180421_pga \
/mnt/owl/Athaliana/20180421_geoduck_hi-c/Results/geoduck_roberts\ results\ 2018-04-03\ 11\:05\:41.596285/PGA_assembly.fasta \
/mnt/owl/Athaliana/20180421_geoduck_hi-c/Results/geoduck_roberts\ results\ 2018-04-21\ 18\:09\:04.514704/PGA_assembly.fasta
Quast Output folder: results_2018_04_30_11_16_04/
Quast report (HTML): results_2018_04_30_11_16_04/report.html
The two assemblies are nearly identical. Interesting…
from Sam’s Notebook https://ift.tt/2JEINax
The end of a pipeline test
Yesterday I encountered a gunzip error when aligning sequences with bismark. I opened an issue and documented everything in my lab notebook post. Steven said that I shouldn’t worry about it because I got .bam files! Today, I moved on in my Jupyter notebook to the bismark_methylation_extractor step. I successfully used the following parameters:
Once again, I encountered a
And like last time, I have outputs! You can find them in this folder. I ignored the error and moved on in the pipeline to the
This step is fairly simple if you don’t want to customize the command. I used the following parameters:
- Path to
- --dir + path to output directory
The reports generated can be found in this folder.
The last part of the pipeline is bismark2summary. I’m not sure how this differs from bismark2report, but I’m gonna use it anyways. It generated a report that can be found as a .txt file and .html report.
The next steps are to understand the outputs and start the full pipeline. I posted this issue to get Steven’s advice.
from the responsible grad student https://ift.tt/2r9LcCJ
This weekend I did two sets of 9 isolations. I had to do these along with a few more sets in order to get a subset of samples that ALL have quantifiable RNA via the Qubit.
The first set from yesterday was great! All 9 samples (uninfected; ambient) had readable RNA:
Today’s set of nine (uninfected; cold) was mostly good! Just one tube had “Out of range”, and as a result, I will pick a new crab (3 samples) to replace it:
I also edited and published S1Ep6 of DecaPod during which Pam answers a few of my questions on the issue and project. This is part 1 of 2. There were a lot of questions that I had as well as some questions that others in my cohort have asked me that I didn’t know the answer to.
Crab Mtg #3
Crab Mtg #3 is on Thursday. By then I am hoping to have a subset of samples that all have quantifiable RNA, and to have run a couple of samples on the Bioanalyzer with Sam.
from Grace’s Lab Notebook https://ift.tt/2w38Ne2