In the final GFF from our GenSAS Pgenerosa_v074.a4 annotation, we noticed that no repeat motifs/sequences were identified on Scaffold 01. All of the remaining scaffolds had repeat motifs present on them, so something seemed amiss (see this GitHub Issue for more info).
I ended up contacting GenSAS and it turned out there was a bug on their end that led to this issue:
Taein Lee Nov 26, 2019, 7:27 PM (8 days ago) to me, jhumann
Hi Sam,
Thank you so much for your report. There was a bug and it has been fixed. Your gff3 files has been re-generated.
-Taein
From: gensas-admin on behalf of sam white
Sent: Tuesday, November 26, 2019 3:45 PM
To: gensas-admin; jhumann; taein_lee
Subject: [Website feedback] Merged GFF missing repeats on only one chromosome
Sam (https://ift.tt/2LnmNEw) sent a message using the contact form at https://ift.tt/2qkEE7F.
Hi,
I generated a merged GFF after I “published” my annotation. I included RepeatModeler features in the merged GFF.
My genome has 18 chromosomes. All of them except one chromosome (name: PGA_scaffold1__77_contigs__length_89643857) has the expected repeats annotations present.
I looked at the individual RepeatMasker and RepeatModeler jobs, and both of those GFFs identified repeats on PGA_scaffold1__77_contigs__length_89643857.
Would you happen to have any ideas on why PGA_scaffold1__77_contigs__length_89643857 isn’t showing any repeat features in the merged GFF?
This is for my project Pgenerosa_v074.
Thanks for any insight!
Sam
So, now that I have the updated, final GFF, I want to re-run the splitting of the GFF into separate feature files, and recalculate the counts and sequence length stats for all features (including repeats).
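For anyone who wants the general idea without opening the notebook, here is a minimal Python sketch of that kind of processing. It is not the code from the notebook. The function names, the output file naming, and the example feature types (`gene`, `repeat_region`) and coordinates are all made up for illustration. It only assumes standard 9-column, tab-delimited GFF3, where column 1 is the sequence ID, column 3 is the feature type, and columns 4 and 5 are 1-based inclusive start/end coordinates.

```python
from collections import defaultdict
from pathlib import Path


def split_gff_by_feature(lines):
    """Group GFF3 data lines by feature type (column 3).

    Comment/directive lines (starting with '#'), blank lines, and lines
    that do not have exactly 9 tab-delimited columns are skipped.
    """
    by_feature = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue
        by_feature[fields[2]].append(fields)
    return dict(by_feature)


def feature_stats(by_feature):
    """Return count, total length, and mean length (bp) per feature type."""
    stats = {}
    for feature, records in by_feature.items():
        # GFF3 coordinates are 1-based and inclusive, hence the +1.
        lengths = [int(r[4]) - int(r[3]) + 1 for r in records]
        stats[feature] = {
            "count": len(lengths),
            "total_bp": sum(lengths),
            "mean_bp": sum(lengths) / len(lengths),
        }
    return stats


def seqids_with_feature(by_feature, feature):
    """Return the set of sequence IDs (column 1) carrying a feature type."""
    return {r[0] for r in by_feature.get(feature, [])}


def write_feature_files(by_feature, out_dir):
    """Write one GFF file per feature type (hypothetical naming scheme)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for feature, records in by_feature.items():
        out_path = out_dir / f"{feature}.gff"
        with open(out_path, "w") as handle:
            for fields in records:
                handle.write("\t".join(fields) + "\n")


if __name__ == "__main__":
    # Made-up example records; feature types and coordinates are illustrative.
    example = [
        "##gff-version 3",
        "PGA_scaffold1__77_contigs__length_89643857\tGenSAS\tgene\t100\t500\t.\t+\t.\tID=g1",
        "PGA_scaffold1__77_contigs__length_89643857\tGenSAS\trepeat_region\t600\t650\t.\t+\t.\tID=r1",
    ]
    grouped = split_gff_by_feature(example)
    for feature, values in feature_stats(grouped).items():
        print(feature, values)
    print(seqids_with_feature(grouped, "repeat_region"))
```

The `seqids_with_feature` helper is the sort of quick check that would have flagged the original problem: comparing the set it returns for the repeat feature type against the full list of scaffold names shows which scaffolds have no repeats annotated.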
Everything is documented in this Jupyter Notebook (GitHub):