28 Apache Airflow Production Pipelines
BCHPR · CHPR_DAGS.py (1,242 lines) · 2023 – present
Production Airflow 3 orchestration for every BCHPR data pipeline: 28 DAGs on schedules ranging from every 10 minutes to daily, with a custom FileMtimeSensor, Slack + email alerting, and a reusable DAG factory pattern.
Highlights
- 28 production DAGs covering Wave 11, GHIT FujiLAM II, NPOC, Start4All, Viral Load, Xpert, Truenat, Pluslife, Image Quality, Inventory Management, Specimen Transport, TB Treatment, Chest X-ray, Culture, TBRL Lab, Collaborators, UCD DDE Check, Global Outcome, and maintenance jobs.
- Custom `FileMtimeSensor` — rescheduling sensor that triggers on file / directory mtime changes, with SHA-based baselines stored in Airflow Variables (no false positives on first run).
- `build_script_dag` factory pattern — parameterised DAG construction keeps 1,242 lines maintainable across 28 DAGs.
- Cadences: every 10 min (Culture) · every 15 min (cleanup, UCD DDE) · every 30 min (user activity, specimen transport, lab PDF) · every 45 min (FujiLAM UA) · hourly (most pipelines) · 3-hourly (reports) · daily (DQ scoring, collaborators).
- Timezone-aware scheduling (Africa/Douala) with a 60-min task execution timeout (120 min for GHIT) and a 180-min DAG-run timeout.
- Failure alerting: Slack via webhook or `SlackWebhookHook` connection, plus email to two addresses, with two retries using exponential backoff.
- Airflow 3.x SDK with Airflow <3 fallback compatibility; cross-platform path normalisation (Windows ↔ WSL) baked into every job.
- Runtime config resolution: env vars → Airflow Variables → defaults; `ShortCircuitOperator` guards skip runs when no upstream change is detected.
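
The change-detection and config-resolution highlights above can be illustrated with a minimal, Airflow-free sketch. It is not the code in CHPR_DAGS.py: the names `resolve_setting`, `fingerprint`, and `has_changed` are hypothetical, a plain dict stands in for Airflow Variables, and hashing file paths together with their mtimes is only one assumed way to build a SHA-based baseline. It also assumes that "no false positives on first run" means the first check records a baseline without triggering.

```python
import hashlib
import os
from pathlib import Path
from typing import Mapping, MutableMapping, Optional


def resolve_setting(
    name: str,
    variables: Mapping[str, str],
    default: str,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Resolve a setting in priority order: env var, variable store, default.

    `variables` is a stand-in for Airflow Variables (assumption).
    """
    env = os.environ if environ is None else environ
    if name in env:
        return env[name]
    if name in variables:
        return variables[name]
    return default


def fingerprint(path: Path) -> str:
    """SHA-256 over each file's path and mtime (in ns) at or under `path`."""
    digest = hashlib.sha256()
    if path.is_file():
        targets = [path]
    else:
        targets = sorted(p for p in path.rglob("*") if p.is_file())
    for target in targets:
        digest.update(str(target).encode("utf-8"))
        digest.update(str(target.stat().st_mtime_ns).encode("utf-8"))
    return digest.hexdigest()


def has_changed(path: Path, store: MutableMapping[str, str], key: str) -> bool:
    """Return True only if the fingerprint differs from a stored baseline.

    With no baseline yet (first run), the current fingerprint is recorded
    and False is returned, so the first check never triggers.
    """
    current = fingerprint(path)
    previous = store.get(key)
    store[key] = current
    return previous is not None and previous != current
```

A function shaped like `has_changed` could serve as the callable behind a `ShortCircuitOperator`, which skips downstream tasks when its callable returns a falsy value; in a real deployment the dict would be replaced by reads and writes of Airflow Variables.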
Related projects
Designer & Lead Engineer
QR-Code Patient & Specimen Tracking System
Operational innovation deployed across 365+ primary health facilities and 25+ GeneXpert laboratories — now under active consideration by Cameroon's National TB Programme as national best practice.
Architect & Lead Developer
Request Management System — Power Platform End-to-End
Production-grade Microsoft Power Platform solution running the full lifecycle of financial and operational requests at BCHPR — submission, document validation, executive approval, payment, issue tracking, and reporting — built on SharePoint Lists, Power Automate, Teams Approval Cards, and a Power BI semantic model.
Architect
Inventory & Asset Management — Manager.io + M365
Manager.io-integrated inventory and asset-management system for CHPR logistics across multiple Cameroon sites — with Microsoft 365 ecosystem orchestration (Lists, Power Automate, Power BI) and automated procurement workflows.