mergePolishes
batch spliced alignment of cDNA sequences to a target genome
Install
- All systems
  curl cmd.cat/mergePolishes.sh
- Debian
  apt-get install sim4db
- Ubuntu
  apt-get install sim4db
- Kali Linux
  apt-get install sim4db
- Windows (WSL2)
  sudo apt-get update
  sudo apt-get install sim4db
- Raspbian
  apt-get install sim4db
- Dockerfile
  dockerfile.run/mergePolishes
sim4db
Sim4db performs fast batch alignment of large cDNA (EST, mRNA) sequence sets to a set of eukaryotic genomic regions. It determines the alignments with the sim4 and sim4cc algorithms, and adds a fast sequence indexing and retrieval mechanism, implemented in the sister package 'leaff', to process large volumes of sequences quickly. Although sim4db produces alignments in the same way as sim4 or sim4cc, it has additional features that make it better suited to whole-genome annotation pipelines. A script file can group pairings between cDNAs and their corresponding genomic regions so that they are aligned in one run with the same set of parameters. Sim4db can also report more than one alignment for the same cDNA within a genomic region, as long as each alignment meets user-defined criteria such as minimum length, percentage sequence identity or coverage. This feature is instrumental in finding all alignments of a gene family at one locus. Lastly, the output is presented either as custom sim4db alignments or as GFF3 gene features. This package is part of the Kmer suite.
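The reporting criteria described above (minimum length, percentage identity, coverage) amount to a threshold filter applied to every candidate alignment of a cDNA. The following is a minimal Python sketch of that idea only. It is not sim4db's code or output format: the `Alignment` record, its field names, the `passing_alignments` helper, and the threshold values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    # Hypothetical record for illustration; not sim4db's actual output format.
    cdna_id: str
    matched_length: int      # number of aligned bases
    percent_identity: float  # 0-100
    percent_coverage: float  # 0-100, share of the cDNA that is aligned


def passing_alignments(alignments, min_length=0, min_identity=0.0, min_coverage=0.0):
    """Return every alignment that meets all three thresholds (assumed semantics)."""
    return [
        a for a in alignments
        if a.matched_length >= min_length
        and a.percent_identity >= min_identity
        and a.percent_coverage >= min_coverage
    ]


# Three made-up candidate alignments of one cDNA within one genomic region.
candidates = [
    Alignment("est1", 850, 98.2, 95.0),
    Alignment("est1", 790, 93.5, 88.0),
    Alignment("est1", 120, 99.0, 13.0),
]

kept = passing_alignments(candidates, min_length=200, min_identity=90.0, min_coverage=50.0)
for a in kept:
    print(a.cdna_id, a.matched_length, a.percent_identity, a.percent_coverage)
```

Keeping every alignment that passes, rather than only the best one, is what allows several members of a gene family at one locus to be reported for the same cDNA.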