ngram
Fast n-Gram 'Tokenization'
An n-gram is a sequence of n "words" taken, in order, from a body of text. This package is a collection of utilities for creating, displaying, summarizing, and "babbling" n-grams. The "tokenization" and "babbling" are handled by efficient C code, which can also be built as its own standalone library. The babbler is a simple Markov chain. The package includes a vignette with complete example workflows and information about the utilities it offers.
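To make the two ideas above concrete, here is a minimal Python sketch of n-gram extraction and a Markov-chain babbler. It is an illustration only: the function names `ngrams`, `build_chain`, and `babble` are hypothetical and are not the package's API, and splitting on whitespace is an assumption about what counts as a "word".

```python
import random
from collections import defaultdict


def ngrams(text, n):
    """Return the n-grams of `text` as word tuples, in order of appearance."""
    words = text.split()  # assumption: words are whitespace-separated
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_chain(text, n):
    """Map each n-gram to the list of words seen immediately after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - n):
        chain[tuple(words[i:i + n])].append(words[i + n])
    return chain


def babble(text, n, length, seed=0):
    """Generate `length` words by walking the n-gram Markov chain."""
    grams = ngrams(text, n)
    if not grams:
        raise ValueError("text has fewer than n words")
    rng = random.Random(seed)  # seeded so runs are repeatable
    chain = build_chain(text, n)
    state = rng.choice(grams)
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (the final n-gram has no follower): restart.
            state = rng.choice(grams)
            out.extend(state)
            continue
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)  # slide the window by one word
    return " ".join(out[:length])


if __name__ == "__main__":
    sample = "a b a c a b"
    print(ngrams(sample, 2))
    print(babble(sample, 2, 10))
```

Each generated word depends only on the preceding n words, which is what makes the babbler a Markov chain; repeated followers in the chain lists make frequent continuations proportionally more likely.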
Versions across snapshots
| Version | Repository | File | Size |
|---|---|---|---|
| 3.2.3 | 2026-04-09 windows/windows R-4.5 | ngram_3.2.3.zip | 349.2 KiB |