autocodebook
Automatic Codebook and Tracking for 'Spark' and 'dplyr' Pipelines
Wraps 'dplyr' verbs (mutate, summarise, filter) to automatically capture variable metadata (type, source columns, categories, and source code), producing a codebook and eligibility tracking table with zero manual documentation. Works with both 'sparklyr' (tbl_spark) and local data frames. Adds big-data optimizations (caching, assume-unique counting, checkpointing) and a standardized report module with an eligibility flowchart, editable codebook export (HTML, DOCX, XLSX), and cross-sectional or longitudinal variable inspection. The eligibility flowchart follows the CONSORT statement (Schulz, Altman and Moher (2010) <doi:10.1136/bmj.c332>) and the reporting of observational cohort studies follows the STROBE recommendations (von Elm and others (2007) <doi:10.1371/journal.pmed.0040296>).
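The idea described above, wrapping a transformation verb so that metadata is recorded as a side effect, can be sketched in a language-neutral way. The Python below is only a conceptual illustration and is not the package's API: `TrackedTable`, its `mutate` and `filter` methods, and the metadata field names are all hypothetical, and plain dictionaries stand in for a data frame.

```python
class TrackedTable:
    """Hypothetical wrapper: rows plus a codebook and an eligibility log."""

    def __init__(self, rows):
        self.rows = [dict(r) for r in rows]
        self.codebook = {}  # variable name -> captured metadata
        self.eligibility = [("start", len(self.rows))]  # (step label, rows left)

    def mutate(self, name, expr):
        """Add a derived column from an expression string and record metadata."""
        code = compile(expr, "<mutate>", "eval")
        columns = set(self.rows[0]) if self.rows else set()
        for row in self.rows:
            # Column names in the expression resolve against the current row.
            row[name] = eval(code, {}, row)
        values = [row[name] for row in self.rows]
        is_categorical = bool(values) and all(isinstance(v, str) for v in values)
        self.codebook[name] = {
            "type": type(values[0]).__name__ if values else "unknown",
            # Names referenced by the expression that are existing columns.
            "sources": [n for n in code.co_names if n in columns],
            "categories": sorted(set(values)) if is_categorical else None,
            "code": expr,
        }
        return self

    def filter(self, label, expr):
        """Keep rows where the expression is true and log the remaining count."""
        code = compile(expr, "<filter>", "eval")
        self.rows = [r for r in self.rows if eval(code, {}, r)]
        self.eligibility.append((label, len(self.rows)))
        return self


table = TrackedTable([{"id": 1, "age": 15}, {"id": 2, "age": 42}, {"id": 3, "age": 67}])
table.mutate("age_group", "'adult' if age >= 18 else 'minor'")
table.filter("adults only", "age_group == 'adult'")
print(table.codebook["age_group"])
print(table.eligibility)
```

In this sketch the codebook entry is a by-product of the transformation itself, and each filter step appends a row count that could later feed an eligibility flowchart. How the package itself stores or exposes this information is not shown in the description above.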
Versions across snapshots
| Version | Repository | File | Size |
|---|---|---|---|
| 0.1.0 | rolling linux/jammy R-4.5 | autocodebook_0.1.0.tar.gz | 626.1 KiB |
| 0.1.0 | rolling linux/noble R-4.5 | autocodebook_0.1.0.tar.gz | 626.2 KiB |
| 0.1.0 | rolling source/ R- | autocodebook_0.1.0.tar.gz | 481.3 KiB |
| 0.1.0 | latest linux/jammy R-4.5 | autocodebook_0.1.0.tar.gz | 626.1 KiB |
| 0.1.0 | latest linux/noble R-4.5 | autocodebook_0.1.0.tar.gz | 626.2 KiB |
| 0.1.0 | latest source/ R- | autocodebook_0.1.0.tar.gz | 481.3 KiB |
| 0.1.0 | 2026-04-23 source/ R- | autocodebook_0.1.0.tar.gz | 0 B |