Roberto’s Notebook: Gene mapping

While testing tophat-2.0.13, the data were mapped to the genome (downloaded from: http://eagle.fish.washington.edu/trilobite/Crassostrea_gigas_ensembl_tracks/Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa.gz), but there was a problem with the GTF file. Steven looked at the TopHat page, which suggests a faster program (hisat2-2.1.0) in place of TopHat.

The HISAT2 program has been downloaded (to /usr/local/bioinformatics/). The supporting documentation (Pertea, M., Kim, D., Pertea, G. M., Leek, J. T., & Salzberg, S. L. (2016). Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nature Protocols, 11(9), 1650.) reveals that a GTF file is needed (it has been downloaded as described in the "Looking back at Tophat" post: Crassostrea_gigas.GCA_000297895.1.24.gtf). With this file and the genome in hand, it was necessary to extract the splice sites and exons from the GTF.
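The extraction step can be sketched as below. This is a minimal sketch, not the exact commands that were run: it only builds and prints the two command lines. It assumes the helper scripts carry the names used in the HISAT2 2.1.0 download (hisat2_extract_splice_sites.py and hisat2_extract_exons.py; the protocol paper lists them without the hisat2_ prefix) and that they are on the PATH. The output prefix "cgigas" is a hypothetical name chosen for illustration.

```python
import shlex

# GTF name taken from the notes above; adjust the path to where the file lives.
GTF = "Crassostrea_gigas.GCA_000297895.1.24.gtf"


def extraction_commands(gtf, prefix="cgigas"):
    """Return (argv, output_file) pairs for the two extraction steps.

    Both scripts write to standard output, so each command is paired with
    the file that the output should be redirected to.
    The prefix is a hypothetical name, not one from the notebook.
    """
    return [
        (["hisat2_extract_splice_sites.py", gtf], f"{prefix}.ss"),
        (["hisat2_extract_exons.py", gtf], f"{prefix}.exon"),
    ]


def shell_line(argv, output_file):
    """Format one command as a shell line with stdout redirected to a file."""
    return shlex.join(argv) + " > " + shlex.quote(output_file)


# Print the commands instead of running them, so they can be checked first.
for argv, out in extraction_commands(GTF):
    print(shell_line(argv, out))
```

The printed lines can be pasted into a terminal once the script names and paths have been checked against the local installation.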
For the moment, the “creating a HISAT2 index” step is running; afterwards, the reads will be mapped again.
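The index and mapping steps can be sketched in the same way. Again this is only a sketch that builds and prints command lines. Several things here are assumptions and not taken from the notebook: the index name (cgigas_tran), the splice-site and exon file names (cgigas.ss, cgigas.exon), the read file names, the thread count, and whether the reads are single-end or paired-end. The genome name assumes the .fa.gz file was decompressed first.

```python
import shlex

# Genome name from the notes above, assuming the .gz has been decompressed.
GENOME = "Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa"


def index_command(genome_fa, index_base, splice_sites=None, exons=None, threads=1):
    """Build the hisat2-build command line, optionally with --ss and --exon."""
    argv = ["hisat2-build", "-p", str(threads)]
    if splice_sites:
        argv += ["--ss", splice_sites]
    if exons:
        argv += ["--exon", exons]
    return argv + [genome_fa, index_base]


def mapping_command(index_base, reads_1, reads_2=None, sam_out="aligned.sam",
                    threads=1, dta=False):
    """Build the hisat2 command line for single-end (-U) or paired-end (-1/-2) reads."""
    argv = ["hisat2", "-p", str(threads)]
    if dta:
        # --dta tailors the alignments for transcript assemblers such as StringTie.
        argv.append("--dta")
    argv += ["-x", index_base]
    if reads_2 is None:
        argv += ["-U", reads_1]
    else:
        argv += ["-1", reads_1, "-2", reads_2]
    return argv + ["-S", sam_out]


# All names below except GENOME are hypothetical placeholders.
print(shlex.join(index_command(GENOME, "cgigas_tran", "cgigas.ss", "cgigas.exon", threads=4)))
print(shlex.join(mapping_command("cgigas_tran", "sample_R1.fastq.gz", "sample_R2.fastq.gz",
                                 "sample.sam", threads=4, dta=True)))
```

Keeping the commands as lists makes it easy to hand them to subprocess.run later, once the file names have been replaced with the real ones.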