Crandore Hub

textreuse

Detect Text Reuse and Document Similarity

Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
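As a rough illustration of the first two ideas in the description above, shingled n-grams and a similarity function, here is a minimal Python sketch. It is not the package's R API: the helper names `shingle_ngrams` and `jaccard_similarity`, the whitespace tokenization, and the sample sentences are all assumptions made for this example.

```python
def shingle_ngrams(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text.

    Hypothetical helper: lowercases and splits on whitespace only.
    """
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Two sample sentences that differ in a single word.
doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

score = jaccard_similarity(shingle_ngrams(doc_a), shingle_ngrams(doc_b))
print(round(score, 2))
```

Each nine-word sentence yields seven trigrams. A single changed word alters three of them, so the two sets share four shingles out of ten distinct ones.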

Versions across snapshots

Version  Repository                        File                    Size
0.1.5    rolling source/ R-                textreuse_0.1.5.tar.gz  1.1 MiB
0.1.5    rolling linux/jammy R-4.5         textreuse_0.1.5.tar.gz  1.2 MiB
0.1.5    latest source/ R-                 textreuse_0.1.5.tar.gz  1.1 MiB
0.1.5    latest linux/jammy R-4.5          textreuse_0.1.5.tar.gz  1.2 MiB
0.1.5    2026-04-23 source/ R-             textreuse_0.1.5.tar.gz  1.1 MiB
0.1.5    2026-04-09 windows/windows R-4.5  textreuse_0.1.5.zip     1.6 MiB
0.1.5    2025-04-20 source/ R-             textreuse_0.1.5.tar.gz  1.1 MiB

Dependencies (latest)

Imports

LinkingTo

Suggests