Yaamini’s Notebook: DML Analysis Part 24

Some information I’ve missed

I met with Steven on Tuesday, and he suggested I do a few things:

  1. Figure out if the mRNA genome feature file overlaps with introns and exons
  2. Count the number of unique genes the gene background overlapped with and get Uniprot codes

Feature file overlaps

TL;DR: Yes, the mRNA feature file includes introns and exons.

Figure 1. Various genome feature files in IGV.

I opened the tracks in IGV and found they overlapped. I have to consider this as I think about what the overlaps between DML and DMR and exon, intron, or mRNA coding regions actually mean. My guess is that I need to consider exon and intron overlaps as a subset of the mRNA overlaps. Unless the mRNA coding region file has information that isn’t an intron or exon, I could just compare exon and intron overlaps instead of using mRNA overlaps.

Unique genes from gene background-mRNA overlaps

I went back to my R Markdown file and subsetted unique Genbank IDs from the file with gene background-mRNA overlaps and Uniprot codes. I used the following code:

 # Subset rows with unique Genbank IDs from backgroundmRNAblast and save as a new dataframe
 uniqueBackgroundmRNAblast <- subset(backgroundmRNAblast, !duplicated(backgroundmRNAblast$Genbank))
 # Count the number of unique genes (nrow() needs the dataframe, not a single column)
 nrow(uniqueBackgroundmRNAblast)

The gene background overlapped with 14,943 unique genes. I saved the subsetted information in this file.

Going forward

  1. Describe gene products for all remaining DML and DMR overlaps
  2. Compare genes with hypermethylated vs. hypomethylated loci and regions

from the responsible grad student https://ift.tt/2Xwpgkj
via IFTTT

Yaamini’s Notebook: Sperm DNA Extractions Part 3

Continuing sperm DNA extractions

Yesterday I finished extracting DNA from two sperm samples. Since my yields were good, I’m finishing up extractions!

Methods

Step 1. Obtain liquid nitrogen (LN2), dry ice, a small ceramic mortar and pestle, and a wide spatula. Place DNA samples in dry ice. Set a heat block to 37ºC.

Step 2. Pour LN2 into the ceramic mortar, and additional LN2 into a styrofoam cooler. Place the pestle and spatula in the cooler. While waiting for the LN2 to boil off the mortar, prepare a 10% bleach solution.

Step 3. Once the LN2 has boiled off, break the frozen sample with the spatula and transfer a piece (no more than 30 mg) into the mortar. If the sample does not break, slightly thaw it with heat from your hand.

  • I lost a chunk of sample 48s (close to half the sample) while I was breaking it apart.

Step 4. Pulverize the DNA sample with the LN2-cooled pestle. Transfer the powder to a clean, labeled 1.5 mL microcentrifuge tube.

Step 5. Obtain a new mortar and pestle and repeat Steps 1-4 with remaining samples.

  • It was only after I processed my third sample that I confirmed with Sam that I needed a new mortar and pestle each time! I processed samples 6s and 7s with the same mortar and pestle. Thankfully they are both from the same treatment (high pCO2). Between these samples, I wiped the mortar and pestle, cleaned them with a 10% bleach solution, and wiped the surfaces with a clean Kimwipe. After processing sample 7s, I found that some of the bleach had frozen onto the LN2-cooled pestle. I got a new mortar and pestle and processed sample 23s, after which I checked my methods with Sam.

Step 6. Obtain the E.Z.N.A. Mollusc Kit. Add 350 µL ML1 Buffer and 25 µL Proteinase K Solution to each sample. Vortex thoroughly.

Step 7. Place the samples on the 37ºC heat block for overnight incubation.

Step 8. Soak used mortars and pestles in a 10% bleach solution for 5 minutes. Clean the equipment with a sponge and rinse with DI water before drying. The equipment can be sprayed with 100% ethanol to speed up the drying process.

Going forward

  1. Friday: Finish isolating DNA and quantify with the Qubit
  2. Figure out where to send samples for WGBS and prepare samples accordingly

from the responsible grad student https://ift.tt/2GOPPMF
via IFTTT

Yaamini’s Notebook: Sperm DNA Extractions Part 2

Isolating and quantifying C. virginica sperm DNA

Yesterday I prepared two C. virginica sperm samples for overnight lysis. Today I’ll continue with the E.Z.N.A. Mollusc Kit to isolate DNA, then use the Qubit to quantify my yield. Hana, an undergraduate biology major, shadowed me while I did labwork! I’m hoping I can get her involved in helping with my labwork in the next few weeks.

DNA Isolation

Step 7. Remove samples from the heat block and add 350 µL of chloroform:isoamyl alcohol (24:1) to each sample. Vortex the samples to mix. Set the heat block to 70ºC.

Step 8. Centrifuge at 10,000 x g for 2 minutes at room temperature.

  • While centrifuging, I labelled two additional tubes for Step 9 with the sample name and “Aq” to designate that this was the aqueous phase (ex. 12s Aq)

Step 9. Transfer the upper aqueous phase to a clean 1.5 mL microcentrifuge tube. Take note of the quantity transferred. Avoid the milky interface containing contaminants and inhibitors.

  • The milky interface was really easy to spot! I forgot I had my phone with me so I didn’t take any pictures, but I will next time.
  • Both samples had 350 µL of the upper aqueous phase

Step 10. Add MBL Buffer in the same amount as the volume of the aqueous phase transferred in Step 9. Add 10 µL of RNase A to each sample, then vortex at maximum speed for 15 seconds.

  • I added 350 µL MBL Buffer to each sample

Step 11. Incubate the samples at 70ºC for 10 minutes.

  • The original protocol specifies this should be done in a water bath, but Sam said it was fine if I just used a heat block.
  • The samples were left at room temperature for 13 minutes while I waited for the heat block to reach 70ºC

Step 12. Cool the sample to room temperature.

  • I let the samples sit in the tube rack for 10 minutes to reach room temperature.
  • While cooling, I pipetted 350 µL of the Elution Buffer into a 1.5 mL centrifuge tube, then placed it on the 70ºC heat block. I need the buffer at 70ºC for the final elution.
  • I also labelled HiBind DNA Mini Columns with sample names while waiting for samples to cool.

Step 13. Add one volume 100% ethanol. Vortex at maximum speed for 15 seconds.

  • I added 350 µL 100% ethanol to each sample.

Step 14. Insert a labelled HiBind DNA Mini Column into a 2 mL Collection Tube. Transfer 750 µL of sample from Step 13 (including any precipitate) into the column.

  • I did not have any precipitate.

Step 15. Centrifuge at 10,000 x g for 1 minute. Discard the filtrate and place the column back in the collection tube. Repeat Steps 14 and 15 until all of the sample has been applied to the spin column.

  • I only needed to repeat these steps once more.

Step 16. Discard the collection tube. Place the spin column into a new collection tube and add 500 µL HBC Buffer to the column. Centrifuge at 10,000 x g for 30 seconds.

Step 17. Discard the filtrate and reuse the collection tube. Add 700 µL DNA Wash Buffer and centrifuge the samples at 10,000 x g for 1 minute. Repeat this step once more for a second DNA Wash Buffer wash step.

Step 18. Centrifuge the empty column at maximum speed for 2 minutes to dry the membrane.

Step 19. Place the spin column in a clean 1.5 mL microcentrifuge tube. Add 50-100 µL of the pre-heated Elution Buffer to the membrane and let it sit at room temperature for 5 minutes.

  • I added 50 µL of buffer to each sample to get a more concentrated sample.

Step 20. Centrifuge at 10,000 x g for 1 minute. Repeat Steps 19 and 20 once more for a second elution step.

  • For the second elution, I used the eluate from the first elution instead of adding new elution buffer. I hoped this would increase my yield and concentration without changing my elution volume.

Quantification

Step 21. Obtain dsDNA BR standards from the fridge.

Step 22. Prepare the master solution by diluting the dye 1:200 in dsDNA BR buffer (1 part dye in 200 parts total solution). Each standard and sample needs 200 µL of solution.

  • I had two standards and two samples, so I needed 800 µL of solution.
  • I prepared 880 µL of solution, using 875.6 µL buffer and 4.4 µL dye (880 µL solution * 0.5 / 100 = 4.4 µL dye; 880 µL solution – 4.4 µL dye = 875.6 µL buffer).
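The master solution arithmetic above can be written as a small awk helper. This is only a sketch: the function name qubit_mix is made up for illustration, and the 10% overage is inferred from the 880 µL prepared versus the 800 µL needed.

```shell
# Hypothetical helper: dye and buffer volumes (µL) for the Qubit master solution.
# $1 = number of tubes (standards + samples)
# $2 = extra fraction to cover pipetting loss (0.10 is inferred from 880 vs. 800 µL)
qubit_mix() {
    awk -v n="$1" -v extra="$2" 'BEGIN {
        total = n * 200 * (1 + extra)   # 200 µL of solution per tube, plus overage
        dye = total / 200               # dye makes up 1/200 of the total volume
        printf "total=%.1f dye=%.1f buffer=%.1f\n", total, dye, total - dye
    }'
}

# Two standards + two samples with 10% extra
qubit_mix 4 0.10
```

For four tubes this reproduces the 880 µL total, 4.4 µL dye, and 875.6 µL buffer listed above.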

Step 23. Pipet 200 µL master solution into each Qubit assay tube.

Step 24. Add 10 µL of the correct standard to the standard assay tube. Add 5 µL of sample to the sample tube. Vortex the tubes for 2-3 seconds, then incubate at room temperature for 2 minutes.

Step 25. Use the Qubit to quantify yield.

Results

Table 1. Sample ID, concentration, and total DNA yield.

Sample      Concentration (ng/µL)   Total DNA Yield (ng in 45 µL total)
L18A0012s   125                     5625
L18A0031s   9.84                    442.8
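
The yield column is the concentration multiplied by the remaining volume. A quick check of that arithmetic, assuming the 45 µL is the 50 µL eluate minus the 5 µL used for the Qubit (the function name total_yield is illustrative only):

```shell
# Hypothetical helper: total DNA yield (ng) = concentration (ng/µL) x volume (µL)
total_yield() {
    awk -v conc="$1" -v vol="$2" 'BEGIN { printf "%.1f\n", conc * vol }'
}

total_yield 125 45     # L18A0012s
total_yield 9.84 45    # L18A0031s
```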

My two samples had very different concentrations! I think I was able to get enough DNA for whole genome bisulfite sequencing out of these samples. Since Sam isn’t in today for me to confirm, I’ll comment in this issue and continue with labwork tomorrow.

Going forward

  1. Thursday: Pulverize remaining samples and start overnight incubation
  2. Friday: Finish isolating DNA from all samples and quantify with the Qubit

from the responsible grad student https://ift.tt/2GOvcjC
via IFTTT

Sam’s Notebook: Data Wrangling – CpG OE Calculations on C.virginica Genes

Steven tasked me with processing ~90 FastA files containing gene sequences from C.virginica in this GitHub Issue. He needed to determine the Observed/Expected (O/E) ratio of CpGs in each FastA. He provided this example code and this link to all the files. Additionally, today, he tasked Kaitlyn with merging all of the output CpG O/E values for each sample into a single file, but I decided to tackle it anyway.

The CpG O/E determination was done in a Jupyter Notebook:

Interestingly, the processing (which relied on awk) required the use of gawk, due to the high number of output fields. The default implementation of awk on the version of Ubuntu I was using was not gawk.
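
To make the O/E calculation concrete, here is a minimal awk sketch. It assumes the commonly used definition CpG O/E = (CpG count / (C count × G count)) × (L² / (L − 1)), where L is the sequence length; the linked example code may define it differently. The function name cpg_oe and the test sequence are hypothetical.

```shell
# Hypothetical helper: CpG O/E for a single sequence passed as $1.
# Assumed definition: (CpG / (C * G)) * (L^2 / (L - 1)), with L = sequence length.
cpg_oe() {
    printf '%s\n' "$1" | awk '{
        seq = toupper($0)
        len = length(seq)
        # gsub() returns the number of matches; replacing a match with itself
        # counts occurrences without changing the sequence.
        c  = gsub(/C/,  "C",  seq)
        g  = gsub(/G/,  "G",  seq)
        cg = gsub(/CG/, "CG", seq)
        if (c == 0 || g == 0 || len < 2) { print "NA"; next }
        printf "%.4f\n", (cg / (c * g)) * (len * len / (len - 1))
    }'
}

# Toy sequence: 8 bases, 3 C, 3 G, 2 CpG
cpg_oe "ACGTCGGC"
```

This version uses only basic awk features, so it does not depend on gawk; it handles one sequence at a time rather than a multi-record FastA file.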

The creation of a single file with all of the CpG O/E info is detailed in this bash script:

  #!/bin/bash
  ## Script to append sample-specific headers to each ID_CpG
  ## file and join all ID_CpG files.

  ## Run file from within this directory.

  # Temp file placeholder
  tmp=$(mktemp)

  # Create array of subdirectories.
  array=(*/)

  # Create column headers for ID_CpG files using sample name from directory name.
  for file in ${array[@]}
  do
    gene=$(echo ${file} | awk -F\[._] '{print $6"_"$7}')
    sed "1iID\t${gene}" ${file}ID_CpG > ${file}ID_CpG_labelled
  done

  # Create initial file for joining
  cp ${array[0]}ID_CpG_labelled ID_CpG_labelled_all

  # Loop through array and performs joins.
  for file in ${array[@]:1}
  do
    join \
    --nocheck-order \
    ID_CpG_labelled_all ${file}ID_CpG_labelled \
    | column -t \
    > ${tmp} \
    && mv ${tmp} ID_CpG_labelled_all
  done

Yaamini’s Notebook: Sperm DNA Extractions

Extracting DNA from C. virginica sperm samples

We got C. virginica sperm and mantle samples from the Lotterhos Lab’s 2018 experiment! I’m going to use sperm samples to examine potential for epigenetic inheritance in this species. But first, I need to see if I can get usable DNA from these samples.

Sam suggested I use the E.Z.N.A Mollusc Kit for these extractions instead of DNAzol like Claire did with her sperm samples. I’m going to test this method on two sperm samples: L18A0012s and L18A0031s.

Methods

Step 1. Obtain liquid nitrogen (LN2), dry ice, a small ceramic mortar and pestle, and a wide spatula. Place DNA samples in dry ice. Set a heat block to 37ºC.

Step 2. Pour LN2 into the ceramic mortar, and additional LN2 into a styrofoam cooler. Place the pestle and spatula in the cooler. While waiting for the LN2 to boil off the mortar, prepare a 10% bleach solution.

Step 3. Once the LN2 has boiled off, break the frozen sample with the spatula and transfer a piece (no more than 30 mg) into the mortar. If the sample does not break, slightly thaw it with heat from your hand.

Step 4. Pulverize the DNA sample with the LN2-cooled pestle. Transfer the powder to a clean, labeled 1.5 mL microcentrifuge tube.

Step 5. Obtain the E.Z.N.A. Mollusc Kit. Add 350 µL ML1 Buffer and 25 µL Proteinase K Solution to each sample. Vortex thoroughly.

Step 6. Place the samples on the 37ºC heat block for overnight incubation.

Going forward

  1. Wednesday: Finish isolating DNA and quantify with the Qubit. If successful, pulverize remaining samples and start overnight incubation
  2. Thursday: Finish isolating DNA from all samples and quantify with the Qubit

from the responsible grad student https://ift.tt/2Th84A3
via IFTTT

Yaamini’s Notebook: DML Analysis Part 23

DMR-mRNA gene product information

Since I didn’t get much information from my previous gene enrichment, I thought I would start describing the function of coding regions with DMRs in them. When I looked at the DMR-mRNA overlap file, I saw that there was gene product information buried in the last column:

 ID=rna48;Parent=gene35;Dbxref=GeneID:111114201,Genbank:XM_022452489.1;Name=XM_022452489.1;Note=The sequence of the model RefSeq transcript was modified relative to this genomic sequence to represent the inferred CDS: inserted 2 bases in 2 codons;exception=unclassified transcription discrepancy;gbkey=mRNA;gene=LOC111114201;model_evidence=Supporting evidence includes similarity to: 4 Proteins%2C and 99%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 9 samples with support for all annotated introns;product=vacuolar protein sorting-associated protein 13B-like;transcript_id=XM_022452489.1  

My first step was to isolate the product= information from this column. The easiest thing to do would be to use awk to separate the text into multiple columns based on a ; delimiter. However, there are different numbers of ; in each line, so the product information would not consistently be in the same column throughout the document. I wanted to separate out all product information, either by creating a custom delimiter or extracting all information after product=. I figured there’d be an elegant way to do this with bash, so I posted this issue. Unfortunately Sam couldn’t help me with a solution! I took the easy way out and used Excel.

I created a duplicate column, replaced product= with >, then used > as a delimiter to separate product information from the rest of the column. I then used ; as a delimiter with the column I just generated to separate the product information from the transcript ID. Finally, I deleted columns without product information that I generated throughout this process. My final document, found here, looks like this:

[Screenshot, 2019-02-25: final document with product information isolated]

I know you can use sed for a find-and-replace to do something similar in bash, but I couldn’t get that to work:

[Screenshot, 2019-02-25: sed find-and-replace attempt]
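
For reference, a sed substitution along these lines should isolate the product field from an attribute string like the one shown earlier. This is only a sketch, not the command that was tried: it assumes each line carries a single product= tag whose value contains no semicolon, and the function name extract_product is made up for illustration.

```shell
# Hypothetical helper: print the product= value from each attribute line on stdin.
# Assumes one product= tag per line and no ';' inside the value.
# Lines without a product= tag print nothing (-n suppresses them, p prints matches).
extract_product() {
    sed -n 's/.*product=\([^;]*\).*/\1/p'
}

# Shortened version of the attribute string from the DMR-mRNA overlap file
line='ID=rna48;Parent=gene35;gbkey=mRNA;gene=LOC111114201;product=vacuolar protein sorting-associated protein 13B-like;transcript_id=XM_022452489.1'

printf '%s\n' "$line" | extract_product
```

Because the pattern captures everything between product= and the next semicolon, it does not matter how many other fields come before it on the line.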

Now that I had the product information isolated, I wanted to create a new document with summary information (gene product, percent methylation difference, number of transcript variants the DMR was found in). I figured this summary document would be a good starting place when I start writing. The summary document can be found here. DMRs were found in interesting genes, including calcium uptake (could help with calcification), cilia and flagella associated protein (sperm motility, cellular structure), cytochrome P450 (oxidative stress protein), tubulin-specific chaperones (motility-related), telomere-associated protein (could be related to cell replication and apoptosis), sperm-tail PG-rich repeat-containing protein (no idea what this does but my guess is that it’s sperm-related), and zygotic DNA replication. There were also 37 uncharacterized gene products.

Going forward

  1. Repeat this process with DMR-exon, DMR-intron, and DMR-TE overlaps
  2. Visualize gene product information
  3. Relate this information to gene enrichment

from the responsible grad student https://ift.tt/2tNxsiN
via IFTTT

Yaamini’s Notebook: DML Analysis Part 22

Gene enrichment for mRNA overlaps

Back in analysis mode! I decided to tackle a gene enrichment for any feature files that overlapped with mRNA coding regions. I only chose files with mRNA coding region overlaps because learning which coding regions are enriched is most interesting to me, and because I needed Genbank IDs to match overlap results with my blastx output. I did a gene enrichment before, but that used the wrong background. For this analysis, I used a previously generated blastx output and the gene background from methylKit. I merged documents and isolated Uniprot codes in this R Markdown file.

I followed the instructions from my previous gene enrichment using DAVID. I downloaded the functional annotation table, functional annotation clustering, and GOterm information (biological processes, cellular components, and molecular functions) from DAVID for each analysis. The output from all my analyses can be found in this master folder. I’ve gone into detail about some of the GOterm results below, but there were barely any significantly enriched terms after correcting for multiple comparisons. There are a handful of enriched GOterms that are significant without correction that could be interesting to describe.

Overlaps

I performed a gene enrichment from the DMR-mRNA and DML-mRNA overlaps to see which genes were overrepresented in differentially methylated loci and regions between ambient and treatment samples.

Based on corrected p-values, there was only one significantly enriched GOterm for DMR-mRNA overlaps: cilium morphogenesis! That could be interesting seeing how impacts on cilia could affect cellular structure. The only other GOterm with less than a 10% FDR was cellular projection organization, which may also be involved with cilia, flagella, and sperm motility.

There were no significantly enriched GOterms when I looked at DMLs instead of DMRs. For cellular components, cytoplasm had an FDR less than 10%. The molecular function ubiquitin-protein transferase activity also had an FDR less than 10%.

Upstream Flanks

I previously conducted a flanking analysis to identify 100 bp flanks upstream and downstream of mRNA coding regions. I then intersected these flanks with DMR and DML. Understanding what processes are enriched in the flanking regions can provide insight into regulatory mechanisms.

For the upstream flanks, I used this DMR overlap file and this DML overlap file.

There were no significantly enriched GOterms for the intersection of upstream flanks with DMRs or DMLs after correcting for multiple comparisons.

Downstream Flanks

For the downstream flanks, I used this DMR overlap file and this DML overlap file.

There were no significantly enriched GOterms for the intersection of downstream flanks with DMRs or DMLs after correcting for multiple comparisons.

Closest Non-Overlapping

When I conducted my flanking analysis, I also identified the closest non-overlapping DMR and DML to each mRNA coding region. Again, understanding what processes are enriched in these non-overlapping elements may provide information about regulatory mechanisms or related gene functions. I used this file for DMRs and this file for DMLs.

There were also no significantly enriched GOterms for the closest non-overlapping DMRs or DMLs.

Going forward

  1. Determine if this is the best gene enrichment approach
  2. Find a way to do a gene enrichment with exon, intron, and transposable element overlaps
  3. Describe functions of most interesting genes with DML and DMR

from the responsible grad student https://ift.tt/2T7WIhX
via IFTTT