pickmax
Split and Coalesce Duplicated Records
Deduplicates datasets by retaining the most complete and informative records. Identifies duplicated entries based on a specified key column, calculates completeness scores for each row, and compares values within groups. When differences between duplicates exceed a user-defined threshold, records are split into unique IDs; otherwise, they are coalesced into a single, most complete entry. Returns a list containing the original duplicates, the split entries, and the final coalesced dataset. Useful for cleaning survey or administrative data where duplicated IDs may reflect minor data entry inconsistencies.
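The split-or-coalesce logic described above can be sketched in a few lines. This is an illustrative outline in plain Python, not the package's R interface: the function names (`completeness`, `conflict_share`, `split_or_coalesce`), the difference measure (share of non-key columns with conflicting non-missing values), the `_1`, `_2` suffix scheme for new IDs, and the choice to include split rows in the final dataset are all assumptions made for the example.

```python
from collections import defaultdict

MISSING = (None, "")  # values treated as missing in this sketch (assumption)


def completeness(row, key):
    """Count the non-missing fields in a row, ignoring the key column."""
    return sum(1 for col, val in row.items() if col != key and val not in MISSING)


def conflict_share(rows, key):
    """Share of non-key columns where the rows hold different non-missing values."""
    cols = [c for c in rows[0] if c != key]
    if not cols:
        return 0.0
    conflicts = 0
    for col in cols:
        seen = {r.get(col) for r in rows if r.get(col) not in MISSING}
        if len(seen) > 1:
            conflicts += 1
    return conflicts / len(cols)


def split_or_coalesce(rows, key, threshold=0.5):
    """Return the duplicated rows, the split rows, and the final dataset."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    duplicates, split, final = [], [], []
    for key_value, group in groups.items():
        if len(group) == 1:
            final.append(dict(group[0]))
            continue
        duplicates.extend(dict(r) for r in group)
        if conflict_share(group, key) > threshold:
            # Too different: treat as distinct records and give each a new ID.
            for i, row in enumerate(group, start=1):
                new_row = dict(row)
                new_row[key] = f"{key_value}_{i}"
                split.append(new_row)
                final.append(new_row)
        else:
            # Similar enough: start from the most complete row, fill gaps from the rest.
            ranked = sorted(group, key=lambda r: completeness(r, key), reverse=True)
            merged = dict(ranked[0])
            for other in ranked[1:]:
                for col, val in other.items():
                    if merged.get(col) in MISSING and val not in MISSING:
                        merged[col] = val
            final.append(merged)
    return {"duplicates": duplicates, "split": split, "coalesced": final}


rows = [
    {"id": "A", "name": "Ann", "age": 30, "city": None},
    {"id": "A", "name": "Ann", "age": None, "city": "Oslo"},
    {"id": "B", "name": "Bob", "age": 41, "city": "Rome"},
    {"id": "B", "name": "Rob", "age": 19, "city": "Lima"},
    {"id": "C", "name": "Cy", "age": 52, "city": "Kyiv"},
]
result = split_or_coalesce(rows, key="id", threshold=0.5)
```

In this toy data the two `A` rows never contradict each other, so they merge into one complete row, while the two `B` rows disagree on every column and are kept apart under new IDs. A stricter or looser threshold shifts where that boundary falls.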
Versions across snapshots
| Version | Repository | File | Size |
|---|---|---|---|
| 0.1.0 | rolling linux/jammy R-4.5 | pickmax_0.1.0.tar.gz | 13.3 KiB |
| 0.1.0 | rolling linux/noble R-4.5 | pickmax_0.1.0.tar.gz | 13.2 KiB |
| 0.1.0 | rolling source/ R- | pickmax_0.1.0.tar.gz | 3.2 KiB |
| 0.1.0 | latest linux/jammy R-4.5 | pickmax_0.1.0.tar.gz | 13.3 KiB |
| 0.1.0 | latest linux/noble R-4.5 | pickmax_0.1.0.tar.gz | 13.2 KiB |
| 0.1.0 | latest source/ R- | pickmax_0.1.0.tar.gz | 3.2 KiB |
| 0.1.0 | 2026-04-26 source/ R- | pickmax_0.1.0.tar.gz | 3.2 KiB |
| 0.1.0 | 2026-04-23 source/ R- | pickmax_0.1.0.tar.gz | 3.2 KiB |
| 0.1.0 | 2026-04-09 windows/windows R-4.5 | pickmax_0.1.0.zip | 15.9 KiB |