GeneXpert MTB/RIF — 16-Lab Pooled Testing Pipeline
BCHPR · 16 GeneXpert laboratories, Cameroon · 2023 – present
4,942-line pipeline ingesting multi-site GeneXpert MTB/RIF CSV exports, harmonising French / English bilingual records, and consolidating pooled + individual TB test results — the data backbone of the 2026 openRxiv preprint (first & corresponding author).
Highlights
- Handles four coexisting GeneXpert CSV export formats, with encoding auto-detection.
- French ↔ English translation dictionary of 200+ lab-specific terms.
- Pool-ID deconvolution with SQLite tracking — maps each positive pool back to individual participant results.
- Fallback regex for study-ID pattern extraction from malformed Notes fields (INSPIRE · S4A · Rapid TB · FujiLAM IDs all detected).
- Auto-correction of negative test durations caused by midnight-rollover timestamps.
- MTB/RIF grade parsing (high / medium / low / trace / negative).
- Streaming mode activates above 10,000 files to prevent out-of-memory (OOM) failures.
- Site / region / lab / project hierarchy extracted from folder paths.
- Unprocessed-file notification daemon with Teams alerts on import failures.
- Evidence for the WHO 9 March 2026 sputum-pooling recommendation.
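Two of the highlights above, the midnight-rollover correction and the pool-ID deconvolution, can be illustrated with a minimal sketch. This is not the pipeline's actual code: the table names (`pool_members`, `pool_results`, `individual_results`), the result labels (`'MTB DETECTED'`, `'PENDING'`), and the function names are all hypothetical, and the rollover rule assumes a negative duration means the end time crossed midnight without its date being advanced.

```python
import sqlite3
from datetime import datetime, timedelta


def corrected_duration(start: datetime, end: datetime) -> timedelta:
    """Return end - start, adding one day when the raw difference is negative.

    Assumption: a negative duration means the test ended after midnight
    but the export kept the start date on the end timestamp.
    """
    duration = end - start
    if duration < timedelta(0):
        duration += timedelta(days=1)
    return duration


def deconvolute(conn: sqlite3.Connection) -> list[tuple[str, str, str]]:
    """List (pool_id, participant_id, individual_result) for positive pools.

    Pool members with no individual retest on record are reported
    as 'PENDING' (a hypothetical label).
    """
    query = """
        SELECT m.pool_id, m.participant_id,
               COALESCE(i.result, 'PENDING')
        FROM pool_members AS m
        JOIN pool_results AS p ON p.pool_id = m.pool_id
        LEFT JOIN individual_results AS i
               ON i.participant_id = m.participant_id
        WHERE p.result = 'MTB DETECTED'
        ORDER BY m.pool_id, m.participant_id
    """
    return conn.execute(query).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical schema and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pool_members (pool_id TEXT, participant_id TEXT);
        CREATE TABLE pool_results (pool_id TEXT PRIMARY KEY, result TEXT);
        CREATE TABLE individual_results (
            participant_id TEXT PRIMARY KEY, result TEXT);
        INSERT INTO pool_members VALUES
            ('P1', 'A'), ('P1', 'B'), ('P2', 'C'), ('P2', 'D');
        INSERT INTO pool_results VALUES
            ('P1', 'MTB DETECTED'), ('P2', 'MTB NOT DETECTED');
        INSERT INTO individual_results VALUES ('A', 'MTB DETECTED');
    """)
    return conn


if __name__ == "__main__":
    # A test logged as 23:50 -> 00:20 on the same date.
    print(corrected_duration(datetime(2024, 1, 1, 23, 50),
                             datetime(2024, 1, 1, 0, 20)))
    # Members of positive pools only; the negative pool P2 is skipped.
    for row in deconvolute(demo_connection()):
        print(row)
```

In this sketch only members of positive pools are returned, so a negative pool clears all of its members in one test, while each member of a positive pool is listed with its individual result or flagged as still awaiting retest.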
Related projects
Country Data Manager
Start4All — Cameroon Country Data Management
End-to-end Cameroon data operations for Start4All — a 7-country TB intervention programme evaluating near-POC molecular diagnostics and sputum pooling. Evidence fed the WHO 9 March 2026 recommendations; LSTM's 27 February 2026 commentary explicitly cited Start4All.
Data Manager
TB Reach Wave 10 Cameroon — Data Management
Cameroon TB Reach Wave 10 data backbone — community-based active case finding using chest X-ray AI triage paired with sputum pooling for molecular confirmation.
Data Systems Lead
TB Reach Wave 11 — Health Camps & Facility Screening
Largest and most operationally complex BCHPR data system — 19,673 lines integrating 5 REDCap projects across community health camps, primary health facility screening, and prison screening in 7 regions of Cameroon. Target: 494,400 screenings and 105,646 diagnostic results.