orderanalyzer
Extracting Order Position Tables from PDF-Based Order Documents
Functions for extracting text and tables from PDF-based order documents. The package provides an n-gram-based approach for identifying the language of an order document. It uses the R package 'pdftools' to extract the text from an order document. If the PDF document contains only an image (because it is a scanned document), the R package 'tesseract' is used for OCR. The package also provides functionality for identifying and extracting order position tables in order documents based on a clustering approach.
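To make the n-gram idea concrete, the following is a minimal, language-agnostic sketch of character-trigram language identification, written in Python for illustration only. It is not the package's implementation or API: the function names (`char_ngrams`, `cosine`, `identify_language`), the two tiny reference texts, and the choice of cosine similarity are all assumptions made for this example.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in whitespace-normalised, lower-cased text."""
    padded = f" {' '.join(text.lower().split())} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical, deliberately tiny reference texts; a real profile would be
# built from a much larger corpus per language.
REFERENCE_TEXTS = {
    "en": "please deliver the following items to our warehouse "
          "the order number and the quantity are listed below",
    "de": "bitte liefern sie die folgenden artikel an unser lager "
          "die bestellnummer und die menge sind unten aufgelistet",
}
PROFILES = {lang: char_ngrams(text) for lang, text in REFERENCE_TEXTS.items()}


def identify_language(text):
    """Return the language code whose trigram profile is most similar to the text."""
    sample = char_ngrams(text)
    return max(PROFILES, key=lambda lang: cosine(sample, PROFILES[lang]))


print(identify_language("We would like to order the following quantity"))
print(identify_language("Wir bestellen die folgenden Artikel"))
```

Character n-grams are a common choice for this task because they need no tokeniser or dictionary and tolerate the spelling noise that OCR output tends to contain.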
Versions across snapshots
| Version | Repository | File | Size |
|---|---|---|---|
| 1.0.1 | rolling linux/jammy R-4.5 | orderanalyzer_1.0.1.tar.gz | 359.5 KiB |
| 1.0.1 | rolling linux/noble R-4.5 | orderanalyzer_1.0.1.tar.gz | 359.4 KiB |
| 1.0.1 | rolling source/ R- | orderanalyzer_1.0.1.tar.gz | 215.1 KiB |
| 1.0.1 | latest linux/jammy R-4.5 | orderanalyzer_1.0.1.tar.gz | 359.5 KiB |
| 1.0.1 | latest linux/noble R-4.5 | orderanalyzer_1.0.1.tar.gz | 359.4 KiB |
| 1.0.1 | latest source/ R- | orderanalyzer_1.0.1.tar.gz | 215.1 KiB |
| 1.0.1 | 2026-04-26 source/ R- | orderanalyzer_1.0.1.tar.gz | 215.1 KiB |
| 1.0.1 | 2026-04-23 source/ R- | orderanalyzer_1.0.1.tar.gz | 215.1 KiB |
| 1.0.1 | 2026-04-09 windows/windows R-4.5 | orderanalyzer_1.0.1.zip | 362.3 KiB |
| 1.0.0 | 2025-04-20 source/ R- | orderanalyzer_1.0.0.tar.gz | 215.1 KiB |