NUSS
Mixed N-Grams and Unigram Sequence Segmentation
Segmentation of short text sequences, such as hashtags, into sequences of separate words, using a dictionary that may be built from a custom corpus of texts. A unigram dictionary is used to find the most probable sequence, and an n-gram approach is used to determine possible segmentations given the text corpus.
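The unigram part of this idea can be illustrated with a minimal sketch. This is not the package's implementation or API. The function name `segment`, the dictionary `TOY_COUNTS`, and its word counts are all hypothetical and exist only for illustration. The sketch assumes that each word's probability is its count divided by the total count, and that the best segmentation is the one that maximizes the sum of log-probabilities, found by dynamic programming.

```python
import math


def segment(text, unigram_counts):
    """Return the most probable split of `text` into dictionary words, or None."""
    total = sum(unigram_counts.values())
    max_len = max((len(w) for w in unigram_counts), default=0)
    n = len(text)
    # best[i] holds (log-probability, start index of the last word) for text[:i].
    best = [(0.0, 0)] + [(-math.inf, 0)] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            # Skip unknown words and prefixes that cannot be segmented at all.
            if word not in unigram_counts or best[start][0] == -math.inf:
                continue
            score = best[start][0] + math.log(unigram_counts[word] / total)
            if score > best[end][0]:
                best[end] = (score, start)
    if best[n][0] == -math.inf:
        return None  # the text cannot be covered by dictionary words alone
    # Walk the stored start indices backwards to recover the words.
    words, end = [], n
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]


# Hypothetical toy dictionary: word -> count in some imagined corpus.
TOY_COUNTS = {"this": 50, "is": 80, "a": 100, "hash": 10,
              "tag": 12, "hashtag": 20, "the": 120}

print(segment("thisisahashtag", TOY_COUNTS))
```

With these toy counts, "hashtag" as a single word scores higher than "hash" followed by "tag", because every extra word multiplies in another probability below one. A dictionary built from a real corpus would replace `TOY_COUNTS`.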
Versions across snapshots
| Version | Repository | File | Size |
|---|---|---|---|
| 0.1.0 | rolling linux/jammy R-4.5 | NUSS_0.1.0.tar.gz | 855.7 KiB |
| 0.1.0 | rolling linux/noble R-4.5 | NUSS_0.1.0.tar.gz | 858.1 KiB |
| 0.1.0 | rolling source/ R- | NUSS_0.1.0.tar.gz | 292.7 KiB |
| 0.1.0 | latest linux/jammy R-4.5 | NUSS_0.1.0.tar.gz | 855.7 KiB |
| 0.1.0 | latest linux/noble R-4.5 | NUSS_0.1.0.tar.gz | 858.1 KiB |
| 0.1.0 | latest source/ R- | NUSS_0.1.0.tar.gz | 292.7 KiB |
| 0.1.0 | 2026-04-26 source/ R- | NUSS_0.1.0.tar.gz | 292.7 KiB |
| 0.1.0 | 2026-04-23 source/ R- | NUSS_0.1.0.tar.gz | 292.7 KiB |
| 0.1.0 | 2026-04-09 windows/windows R-4.5 | NUSS_0.1.0.zip | 1.1 MiB |
| 0.1.0 | 2025-04-20 source/ R- | NUSS_0.1.0.tar.gz | 292.7 KiB |