sumaclust

fast and exact clustering of genomic sequences

Install

All systems
curl cmd.cat/sumaclust.sh
Debian
apt-get install sumaclust
Ubuntu
apt-get install sumaclust
Kali Linux
apt-get install sumaclust
Windows (WSL2)
sudo apt-get update
sudo apt-get install sumaclust
Raspbian
apt-get install sumaclust

sumaclust

With the development of next-generation sequencing, efficient tools are needed to handle millions of sequences in a reasonable amount of time. Sumaclust, a program developed by the LECA, aims to cluster sequences in a way that is both fast and exact. It is designed for the type of data generated by DNA metabarcoding, i.e. short markers that are sequenced in full. Sumaclust uses the same clustering algorithm as UCLUST and CD-HIT. This algorithm is mainly useful for detecting the 'erroneous' sequences that are created during amplification and sequencing and that derive from 'true' sequences.
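
The greedy, centroid-based clustering that UCLUST and CD-HIT are known for can be sketched in a few lines of Python. This is an illustration only, not Sumaclust's implementation: the function names, the similarity measure (longest common subsequence length divided by the longer sequence's length), the ordering by read count, and the first-match assignment rule are all assumptions chosen for the example.

```python
from typing import Dict, List


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical score in [0, 1]: LCS length over the longer length."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def greedy_cluster(counts: Dict[str, int], threshold: float) -> Dict[str, List[str]]:
    """Greedy clustering: each sequence joins the first centre it matches.

    counts maps each distinct sequence to its read count. Sequences are
    visited from most to least abundant (an assumption of this sketch),
    so abundant sequences become cluster centres and rare near-copies
    are attached to them.
    """
    clusters: Dict[str, List[str]] = {}
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    for seq in ordered:
        for centre in clusters:
            if similarity(seq, centre) >= threshold:
                clusters[centre].append(seq)
                break
        else:
            # No existing centre is similar enough: seq starts a new cluster.
            clusters[seq] = [seq]
    return clusters


if __name__ == "__main__":
    reads = {"ACGTACGTAC": 10, "ACGTACGTAA": 1, "TTTTGGGGCC": 5}
    for centre, members in greedy_cluster(reads, threshold=0.85).items():
        print(centre, members)
```

In this toy input the rare sequence ACGTACGTAA differs from the abundant ACGTACGTAC at one position, so it is attached to that centre, which mirrors the idea of grouping an 'erroneous' sequence with the 'true' sequence it derives from.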