Crandore Hub

autocodebook

Automatic Codebook and Tracking for 'Spark' and 'dplyr' Pipelines

Wraps 'dplyr' verbs (mutate, summarise, filter) to automatically capture variable metadata (type, source columns, categories, and source code), producing a codebook and eligibility tracking table with zero manual documentation. Works with both 'sparklyr' (tbl_spark) and local data frames. Adds big-data optimizations (caching, assume-unique counting, checkpointing) and a standardized report module with an eligibility flowchart, editable codebook export (HTML, DOCX, XLSX), and cross-sectional or longitudinal variable inspection. The eligibility flowchart follows the CONSORT statement (Schulz, Altman and Moher (2010) <doi:10.1136/bmj.c332>) and the reporting of observational cohort studies follows the STROBE recommendations (von Elm and others (2007) <doi:10.1371/journal.pmed.0040296>).
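The general idea of wrapping a 'dplyr' verb to capture variable metadata can be sketched as follows. This is a minimal illustration for local data frames only, not the package's actual API: the names `tracked_mutate` and `codebook_env` are hypothetical, and the sketch assumes only that 'dplyr' and 'rlang' are installed.

```r
library(dplyr)
library(rlang)

# Hypothetical registry (not part of autocodebook): one entry per derived variable.
codebook_env <- new.env()
codebook_env$entries <- list()

# Hypothetical wrapper around dplyr::mutate() that records, for each new
# variable, its type, the source columns it references, and its source code.
tracked_mutate <- function(.data, ...) {
  exprs <- enquos(..., .named = TRUE)
  out <- mutate(.data, !!!exprs)
  for (nm in names(exprs)) {
    expr <- quo_get_expr(exprs[[nm]])
    codebook_env$entries[[nm]] <- list(
      variable = nm,
      type     = class(out[[nm]])[1],
      # Keep only symbols that are columns of the input data
      sources  = intersect(all.vars(expr), names(.data)),
      code     = paste(deparse(expr), collapse = " ")
    )
  }
  out
}

df  <- data.frame(age = c(34, 61, 17), sex = c("F", "M", "F"))
df2 <- tracked_mutate(df, adult = age >= 18)

# Inspect the metadata captured for the new variable
str(codebook_env$entries$adult)
```

A wrapper for 'sparklyr' tables would need a different way to determine column types, since `out[[nm]]` only works on in-memory data.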

Versions across snapshots

Version  Repository                 File                       Size
0.1.0    rolling linux/jammy R-4.5  autocodebook_0.1.0.tar.gz  626.1 KiB
0.1.0    rolling linux/noble R-4.5  autocodebook_0.1.0.tar.gz  626.2 KiB
0.1.0    rolling source/ R-         autocodebook_0.1.0.tar.gz  481.3 KiB
0.1.0    latest linux/jammy R-4.5   autocodebook_0.1.0.tar.gz  626.1 KiB
0.1.0    latest linux/noble R-4.5   autocodebook_0.1.0.tar.gz  626.2 KiB
0.1.0    latest source/ R-          autocodebook_0.1.0.tar.gz  481.3 KiB
0.1.0    2026-04-23 source/ R-      autocodebook_0.1.0.tar.gz  0 B

Dependencies (latest)

Imports

Suggests