textreuse
Detect Text Reuse and Document Similarity
Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
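As a rough illustration of two ideas named above, shingled n-grams and a similarity function, here is a minimal Python sketch. It is not the package's R API. The function names `shingle_ngrams` and `jaccard_similarity`, the whitespace tokenization, and the sample sentences are all assumptions made for this example.

```python
def shingle_ngrams(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text.

    Assumption: words are split on whitespace and lowercased; a real
    tokenizer would likely handle punctuation more carefully.
    """
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical sample documents differing by one word.
doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

score = jaccard_similarity(shingle_ngrams(doc_a), shingle_ngrams(doc_b))
print(score)
```

Changing a single word alters every shingle that overlaps it, so larger values of `n` make the comparison more sensitive to small edits.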
Versions across snapshots
| Version | Repository | File | Size |
|---|---|---|---|
| 0.1.5 | rolling source/ R- | textreuse_0.1.5.tar.gz | 1.1 MiB |
| 0.1.5 | rolling linux/jammy R-4.5 | textreuse_0.1.5.tar.gz | 1.2 MiB |
| 0.1.5 | latest source/ R- | textreuse_0.1.5.tar.gz | 1.1 MiB |
| 0.1.5 | latest linux/jammy R-4.5 | textreuse_0.1.5.tar.gz | 1.2 MiB |
| 0.1.5 | 2026-04-23 source/ R- | textreuse_0.1.5.tar.gz | 1.1 MiB |
| 0.1.5 | 2026-04-09 windows/windows R-4.5 | textreuse_0.1.5.zip | 1.6 MiB |
| 0.1.5 | 2025-04-20 source/ R- | textreuse_0.1.5.tar.gz | 1.1 MiB |