simhash

generate similarity hashes to find nearly duplicate files

Install

All systems
curl cmd.cat/simhash.sh
Debian
apt-get install simhash
Ubuntu
apt-get install simhash
Kali Linux
apt-get install simhash
Windows (WSL2)
sudo apt-get update
sudo apt-get install simhash
Raspbian
apt-get install simhash

simhash

A useful question to ask about a pair of files is how similar they are. This command-line tool estimates the "degree of similarity" between a pair of nominally sequential files such as text files. The tool uses Manasse's "shingleprinting" technique.
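
The general idea behind shingle-based similarity can be sketched roughly as follows. This is an illustration only, not the simhash tool's actual implementation: the 8-byte shingle size, the CRC32 hash, the limit of 128 retained hashes, and the function names are all assumptions chosen for the example. It hashes every overlapping byte window (a "shingle"), keeps the numerically smallest hashes as a fingerprint, and compares two fingerprints with a plain Jaccard ratio.

```python
import zlib


def shingle_hashes(data: bytes, shingle_size: int = 8) -> set[int]:
    """Hash every overlapping shingle_size-byte window of data.

    The shingle size and the use of CRC32 are assumptions for this sketch.
    """
    if not data:
        return set()
    if len(data) < shingle_size:
        # Input shorter than one shingle: treat the whole input as one shingle.
        return {zlib.crc32(data)}
    return {
        zlib.crc32(data[i:i + shingle_size])
        for i in range(len(data) - shingle_size + 1)
    }


def fingerprint(data: bytes, shingle_size: int = 8, keep: int = 128) -> set[int]:
    """Keep only the `keep` numerically smallest shingle hashes."""
    return set(sorted(shingle_hashes(data, shingle_size))[:keep])


def similarity(a: set[int], b: set[int]) -> float:
    """Jaccard ratio of two fingerprints, from 0.0 (disjoint) to 1.0 (equal)."""
    if not a and not b:
        return 1.0  # two empty inputs are treated as identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    first = b"the quick brown fox jumps over the lazy dog"
    second = b"the quick brown fox jumps over the lazy cat"
    print(similarity(fingerprint(first), fingerprint(second)))
```

Because only a bounded number of hashes is kept per file, fingerprints stay small regardless of file size, and two files that share most of their content tend to share most of their retained hashes.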