High-Level Evaluation Specialist
Job Description
We are seeking highly analytical and detail-oriented specialists to evaluate AI system outputs across a range of formats, including text, images, video, and multimodal interactions.
Key Responsibilities:
1. Evaluate the quality of large language model (LLM) outputs against project-specific criteria such as correctness, coherence, completeness, style, cultural appropriateness, and safety.
2. Identify subtle errors, hallucinations, or biases in AI responses.
3. Apply domain expertise and logical reasoning to resolve ambiguity in unclear or borderline outputs.
4. Provide detailed written feedback, tags, and scores for outputs to ensure consistency across evaluation teams.
5. Collaborate with project managers and quality leads to meet accuracy, reliability, and turnaround benchmarks.
Requirements:
* Bachelor's degree or equivalent qualification.
* 1+ years of experience in data annotation, LLM evaluation, content moderation, or related AI/ML domains.
* Demonstrated experience working with data annotation tools and software platforms.
* Strong understanding of language and multimodal communication.
* Ability to adapt quickly to changing project directions and fast-paced work environments.
* Prior experience creating or annotating complex data for LLM training is a plus.