Technology & Digital
AI Engineer
About this role
PATH is a global nonprofit dedicated to achieving health equity. With more than 40 years of experience forging multisector partnerships and expertise spanning science, economics, technology, and advocacy, PATH develops and scales up innovative solutions to the world's most pressing health challenges. PATH is seeking an AI Engineer to support the expansion of SnapiForm, an AI-powered platform, accessible through a Telegram mini-app, WhatsApp, and the browser, that enables health workers to digitize paper HMIS forms simply by taking a photo. Following a successful pilot in the DRC that significantly improved data accuracy and reduced reporting time, SnapiForm is now scaling to process millions of health records each month.
In this role, the AI Engineer will design and optimize AI pipelines for complex document understanding, focusing on extracting structured data from mobile-captured HMIS forms, including handwriting recognition, complex table extraction, and multilingual parsing. The role involves researching, benchmarking, and fine-tuning state-of-the-art Vision-Language Models and foundational OCR models on domain-specific datasets, using advanced techniques and tools such as LoRA, QLoRA, and DeepSpeed to maximize accuracy on noisy, real-world mobile images. The AI Engineer will architect and deploy production-grade inference pipelines using vLLM or similar engines, optimizing continuous batching, KV cache management, and quantization to meet strict targets for low per-page processing cost. Designing infrastructure for both self-hosted and on-premises environments, with data sovereignty and cost efficiency in mind, is also central to the role.
Qualifications include a degree in Computer Science, Artificial Intelligence, Machine Learning, or a related quantitative field, with seven years of experience in Machine Learning Engineering and at least one to two years focused on Computer Vision, Document AI, or Multimodal Large Language Models. Core technical requirements include deep expertise in PyTorch and the Hugging Face ecosystem, production-level experience deploying models using vLLM, proficiency in OCR, handwriting recognition, and Vision-Language Models, experience with OpenCV and Pillow for image processing, and strong skills in Docker, Kubernetes, and cloud GPU provisioning.