date_utils.py — Date Parser & Power BI Calendar Generator
BCHPR · 2023 – present
3,746-line date engine that parses 60+ formats and Excel serials, converts timezones, and generates Power BI dimension tables with 99+ attributes (fiscal periods, holidays, relative categories, sort orders).
Highlights
- 60+ date format patterns with priority / secondary / tertiary fallback (ISO · European · US · Excel · DICOM · RFC 2822).
- Excel serial detection and conversion (e.g. 45321 → 2024-01-30).
- Power BI-ready dimension tables with fiscal quarters, holidays, relative categories, and sort orders.
- Polars-first with pandas fallback; 3–10× speedup on datasets of 100k+ rows.
- Statistical date outlier detection (IQR, Z-score, range) for data-entry error identification.
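The tiered fallback and Excel serial handling listed above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the module's actual code: `FORMAT_TIERS`, `parse_date`, `excel_serial_to_date`, and the five-digit serial heuristic are all assumptions made for the example.

```python
from datetime import date, datetime, timedelta
from typing import Optional

# Hypothetical tiers for illustration; the module's real 60+ patterns are not listed here.
FORMAT_TIERS = [
    ["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"],  # priority: ISO 8601
    ["%d/%m/%Y", "%d.%m.%Y"],           # secondary: European, day first
    ["%m/%d/%Y", "%Y%m%d"],             # tertiary: US month first, DICOM DA
]

# Day zero of Excel's 1900 date system for serials >= 61
# (the offset absorbs Excel's phantom 1900-02-29).
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(serial: int) -> date:
    """Convert an Excel 1900-system serial (>= 61) to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=serial)


def parse_date(value: object) -> Optional[date]:
    """Return the first successful parse, or None if nothing matches."""
    text = str(value).strip()
    # Assumed heuristic: a bare five-digit number is an Excel serial
    # (roughly 1927 to 2173), so it is converted before any format is tried.
    if len(text) == 5 and text.isdecimal():
        return excel_serial_to_date(int(text))
    # Try each tier in order; within a tier, try each format in order.
    for tier in FORMAT_TIERS:
        for fmt in tier:
            try:
                return datetime.strptime(text, fmt).date()
            except ValueError:
                continue
    return None
```

With these assumed tiers, `parse_date("01/15/2024")` fails the day-first formats (there is no month 15) and resolves in the US tier, while an ambiguous value such as `"01/02/2024"` is read day-first as 1 February 2024 because the European tier is tried earlier. The tier order is therefore a policy decision, not just a performance detail.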
Related projects
Architect
my_functions.py — Centralised Python Library
The 21,086-line shared Python library that every BCHPR data project depends on — APIManager, PathsManager, REDCap wrappers, study-ID generation, SharePoint I/O, and dozens of cross-project utilities.
Architect
data_quality_manager.py — Enterprise DQA Framework
11,007-line data quality platform with fluent QueryBuilder, persistent query lifecycle tracking, duplicate analysis, and double-data-entry verification across 28+ instruments — with SQLite persistence and Polars acceleration.
Engineer
study_id_patterns.py — Study-ID Regex Registry
2,611-line centralised registry of 8 study-ID patterns and 14 site-code patterns across Cameroon, Nigeria, and Vietnam projects — with vectorised extraction, validation, classification, and cleaning.