PRISMA
Protocol Inspection and State Machine Analysis
Loads and processes huge text corpora that have been preprocessed with the sally toolbox (<http://www.mlsec.org/sally/>). sally acts as a very fast preprocessor that splits text files into tokens or n-grams. The PRISMA package reads these output files, applies testing-based token selection, and provides replicate-aware, highly tuned implementations of non-negative matrix factorization and principal component analysis, which allow very large data sets to be processed even on desktop machines.
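To make "tokens or n-grams" concrete, the following is a minimal Python sketch of delimiter-based tokenization followed by overlapping token n-grams. It is an illustration only, not sally's implementation: the function names `token_ngrams` and `count_features`, the default delimiter set, and the sample request line are all assumptions made for this example.

```python
from collections import Counter
from typing import List


def token_ngrams(text: str, n: int, delimiters: str = " \t\r\n") -> List[str]:
    """Split text on delimiter characters and return overlapping token n-grams."""
    # Map every delimiter to a space so str.split() treats them uniformly.
    table = str.maketrans({d: " " for d in delimiters})
    tokens = text.translate(table).split()
    # Slide a window of n tokens over the sequence; empty if fewer than n tokens.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_features(text: str, n: int) -> Counter:
    """Count how often each n-gram occurs in the text."""
    return Counter(token_ngrams(text, n))


example = "GET /index.html HTTP/1.1"  # hypothetical input line
print(token_ngrams(example, 1))
print(token_ngrams(example, 2))
```

With `n=1` the result is the plain token list; larger `n` keeps some local ordering at the cost of a larger feature space, which is where a later token selection step becomes useful.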
README
This folder contains an example file that demonstrates the preprocessing step with the sally toolkit (see http://www.mlsec.org/sally/). Before executing the examples, please extract asap.tar.gz to find all data necessary to understand the processing chain from the raw data (asap.raw) to the sally file (asap.sally) and the optimized file (asap.fsally).

The asap.sally file can be produced as follows:

`sally -c asap.cfg asap.raw asap.sally`

This call generates asap.sally from the raw data found in asap.raw. To speed up loading the data in R, apply the sallyPreprocessing.py Python script as follows:

`python sallyPreprocessing.py asap.sally asap.fsally`

Now the data is ready to be loaded and processed efficiently in R.
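The two commands above can be assembled programmatically, for example when several raw files need the same treatment. The sketch below only builds and prints the command lines; it does not run them. The helper name `sally_pipeline` is hypothetical, and the rule that output files share the stem of the raw file is an assumption taken from the asap example.

```python
import shlex
from pathlib import Path
from typing import List


def sally_pipeline(raw: str, cfg: str,
                   script: str = "sallyPreprocessing.py") -> List[List[str]]:
    """Return the argv lists for the raw -> .sally -> .fsally chain."""
    # Assumption: outputs reuse the raw file's stem (asap.raw -> asap.sally).
    stem = Path(raw).with_suffix("")
    sally_file = f"{stem}.sally"
    fsally_file = f"{stem}.fsally"
    return [
        # Step 1: sally turns the raw data into the sally file.
        ["sally", "-c", cfg, raw, sally_file],
        # Step 2: the Python script writes the optimized file for loading in R.
        ["python", script, sally_file, fsally_file],
    ]


for argv in sally_pipeline("asap.raw", "asap.cfg"):
    print(shlex.join(argv))
```

Once sally is installed and the archive is extracted, each argv list could be passed to `subprocess.run(argv, check=True)` so that a failure in the first step stops the chain before the second.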
Versions across snapshots
| Version | Repository | File | Size |
|---|---|---|---|
| 0.2-7 | rolling linux/jammy R-4.5 | PRISMA_0.2-7.tar.gz | 1.0 MiB |
| 0.2-7 | rolling linux/noble R-4.5 | PRISMA_0.2-7.tar.gz | 1.0 MiB |
| 0.2-7 | rolling source/ R- | PRISMA_0.2-7.tar.gz | 1.0 MiB |
| 0.2-7 | latest linux/jammy R-4.5 | PRISMA_0.2-7.tar.gz | 1.0 MiB |
| 0.2-7 | latest linux/noble R-4.5 | PRISMA_0.2-7.tar.gz | 1.0 MiB |
| 0.2-7 | latest source/ R- | PRISMA_0.2-7.tar.gz | 1.0 MiB |
| 0.2-7 | 2026-04-26 source/ R- | PRISMA_0.2-7.tar.gz | 1.0 MiB |
| 0.2-7 | 2026-04-23 source/ R- | PRISMA_0.2-7.tar.gz | 1.0 MiB |
| 0.2-7 | 2026-04-09 windows/windows R-4.5 | PRISMA_0.2-7.zip | 1.0 MiB |
| 0.2-7 | 2025-04-20 source/ R- | PRISMA_0.2-7.tar.gz | 1.0 MiB |