Job Title:
Data Scientist
About the Role:
We are looking for a highly skilled Data Scientist to join our team in building backend features for an existing Retrieval-Augmented Generation (RAG) service.
Main Responsibilities:
* Developing and optimizing RAG models for search across large-scale medical and scientific documents
* Preparing and embedding over half a billion documents so that they are searchable and yield contextually accurate results
* Improving the automated evaluation pipeline
* Fine-tuning language models for textual RAG use cases
Key Requirements:
* 1+ years of experience as a Data Scientist or AI Engineer
* Experience with search technology (OpenSearch) and with building scalable search systems
* 2+ years of Python development experience, including API creation, model training, testing, and general backend programming
* Familiarity with LangChain for building LLM workflows using tools, memory, and retrieval
Nice-to-Have Skills:
* Familiarity with AWS infrastructure (IAM, VPC, S3, etc.)
* Exposure to RAG architectures, specifically textual RAG use cases