simhash
generate similarity hashes to find nearly duplicate files
Install
- All systems
  curl cmd.cat/simhash.sh
- Debian
  apt-get install simhash
- Ubuntu
  apt-get install simhash
- Kali Linux
  apt-get install simhash
- Windows (WSL2)
  sudo apt-get update
  sudo apt-get install simhash
- Raspbian
  apt-get install simhash
- Dockerfile
  dockerfile.run/simhash
A useful question to ask about a pair of files is how similar they are. This command-line tool estimates the "degree of similarity" between a pair of nominally sequential files, such as text files. The tool uses Manasse's "shingleprinting" technique.
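To give a feel for how a shingle-based fingerprint can estimate similarity, here is a minimal Python sketch of the general idea: hash every overlapping run of bytes (a "shingle"), keep only the smallest hash values as the file's fingerprint, and compare fingerprints instead of whole files. This is not simhash's actual code or file format. The function names, the 8-byte shingle size, the 128-hash fingerprint size, and the choice of BLAKE2b are all illustrative assumptions.

```python
import hashlib
import heapq


def shingleprint(data: bytes, shingle_size: int = 8, keep: int = 128) -> set[int]:
    """Return the `keep` smallest 64-bit hashes over all overlapping shingles.

    Hypothetical helper for illustration; parameters are assumptions.
    """
    if len(data) < shingle_size:
        # Input shorter than one shingle: treat the whole input as a single shingle.
        shingles = {data}
    else:
        shingles = {
            data[i:i + shingle_size]
            for i in range(len(data) - shingle_size + 1)
        }
    hashes = {
        int.from_bytes(hashlib.blake2b(s, digest_size=8).digest(), "big")
        for s in shingles
    }
    return set(heapq.nsmallest(keep, hashes))


def similarity(a: set[int], b: set[int], keep: int = 128) -> float:
    """Estimate the Jaccard similarity of two shingle sets from their fingerprints."""
    if not a and not b:
        return 1.0
    # Take the smallest hashes of the combined fingerprints, then count
    # how many of them appear in both fingerprints.
    smallest = set(heapq.nsmallest(keep, a | b))
    return len(smallest & a & b) / len(smallest)


if __name__ == "__main__":
    original = b"".join(str(i).encode() for i in range(500))
    edited = original[:700] + b"X" + original[700:]
    print(similarity(shingleprint(original), shingleprint(edited)))
```

Because only a fixed number of hashes is kept per file, fingerprints stay small regardless of file size, and the comparison yields an estimate rather than an exact measure of overlap.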