Core engineering library · Engineer

dump_upload_pipeline.py — Mobile-App Ingestion with QR Dedup

BCHPR · field operations · 2024 – present

4,005-line reusable mobile-app data-dump ingestion pipeline that prevents duplicate enrolment and silent sync collisions via QR-based identity, a SQLite audit trail, and configurable change policies.

Highlights

  • QR-based identity resolution (record_id ← QR code) — immutable participant matching across dumps.
  • FormConfig change policies: skip / flag / overwrite · within-dump policies: skip / flag / most_complete / most_recent.
  • SQLite audit trail on every record: _source_file, _source_folder, _source_created_at, _source_modified_at.
  • Content-hash idempotency (SHA-256 of serialised record) for duplicate detection.
  • Chunked REDCap import (configurable chunk_size, default 100) with drop_fields for sensitive data filtering.
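
The content-hash, change-policy, and chunking highlights above can be sketched roughly as follows. This is a minimal illustration, not the code in dump_upload_pipeline.py: the function names (content_hash, resolve_change, chunked), the canonical-JSON serialisation, and the exact behaviour of each policy are assumptions made for the example.

```python
import hashlib
import json
from typing import Any, Dict, Iterator, List, Sequence, Tuple

Record = Dict[str, Any]


def content_hash(record: Record) -> str:
    """SHA-256 of a canonical serialisation of the record.

    Assumption: records are serialised as JSON with sorted keys so that
    field order does not change the hash.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def resolve_change(existing: Record, incoming: Record, policy: str = "flag") -> Tuple[Record, bool]:
    """Apply a change policy to two records that share a record_id.

    Returns (record_to_keep, flagged_for_review). The policy semantics
    here are assumed, not taken from the real FormConfig.
    """
    if content_hash(existing) == content_hash(incoming):
        return existing, False  # identical content: re-upload is a no-op
    if policy == "skip":
        return existing, False  # keep what is already stored
    if policy == "flag":
        return existing, True   # keep stored record, mark for review
    if policy == "overwrite":
        return incoming, False  # accept the newer dump
    raise ValueError(f"unknown change policy: {policy!r}")


def chunked(
    records: Sequence[Record],
    chunk_size: int = 100,
    drop_fields: Sequence[str] = (),
) -> Iterator[List[Record]]:
    """Yield lists of at most chunk_size records with drop_fields removed."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    drop = set(drop_fields)
    for start in range(0, len(records), chunk_size):
        yield [
            {key: value for key, value in record.items() if key not in drop}
            for record in records[start:start + chunk_size]
        ]
```

In this sketch, each list yielded by chunked would be passed to whatever REDCap import call the pipeline uses; that call is deliberately left out because its signature is not given here. Hashing a sorted-key serialisation means the same record exported twice with a different field order is still detected as a duplicate.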