About this role:
We are hiring one AI/LLM Engineer to join a small, focused team building backend features for an existing Retrieval-Augmented Generation (RAG) service.
Your primary focus will be building and optimizing RAG pipelines for search across large-scale medical and scientific document collections. This includes pre-processing and embedding more than half a billion documents so that results are searchable and contextually accurate.
You'll develop semantic chunking strategies, improve the automated evaluation pipeline, and fine-tune LLMs for textual RAG use cases.
The role involves hands-on experimentation, model development, and backend engineering, with deployments to non-production environments and collaboration with DevOps on production rollout.