find-large-clusters

tool for clustering millions of short DNA sequences

Install

All systems
curl cmd.cat/find-large-clusters.sh
Debian
apt-get install dnaclust
Ubuntu
apt-get install dnaclust
Kali Linux
apt-get install dnaclust
Windows (WSL2)
sudo apt-get update
sudo apt-get install dnaclust
Raspbian
apt-get install dnaclust

dnaclust

dnaclust is a tool for clustering a large number of short DNA sequences. The clusters are created in such a way that the "radius" of each cluster is no more than the specified threshold. The input sequences to be clustered should be in FASTA format. The ID of each sequence is the first word of its FASTA header, that is, the prefix of the header up to the first whitespace character.
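
The ID rule described above can be illustrated with a short Python sketch. This is not dnaclust's own code: the function name fasta_ids and the sample records are made up for illustration, and the sketch assumes the header is the text following the leading ">" character.

```python
import re

def fasta_ids(lines):
    """Yield one ID per FASTA record (hypothetical helper, not part of dnaclust).

    Assumption: the header is the text after '>', and the ID is the
    prefix of that header up to the first whitespace character.
    """
    for line in lines:
        if line.startswith(">"):
            header = line[1:].rstrip("\n")
            # Split once on the first whitespace character and keep the prefix.
            yield re.split(r"\s", header, maxsplit=1)[0]

# Made-up sample input: two records with descriptive text after the ID.
example = [
    ">seq1 sample A, forward read\n",
    "ACGTACGTAC\n",
    ">seq2\tsample B\n",
    "ACGTTCGTAC\n",
]

ids = list(fasta_ids(example))
print(ids)  # ['seq1', 'seq2']
```

Because only the first word is used, two records whose headers start with the same word would end up with the same ID, so it may be worth checking that first words are unique before clustering.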