Job Overview:
This role involves nuanced assessment of AI system outputs across multiple modalities. Analysts evaluate the accuracy, quality, clarity, and cultural alignment of model outputs against project standards.
Key Responsibilities:
* Evaluate LLM outputs across multiple formats (text, image captions, video descriptions, multimodal prompts).
* Assess quality against project-specific criteria (correctness, coherence, completeness, style, cultural appropriateness, safety).
* Identify subtle errors, hallucinations, or biases in AI responses.
* Apply domain expertise to resolve ambiguous or unclear outputs.
* Provide detailed feedback, tagging, and scoring to ensure consistency.
Requirements:
1. Strong critical reading, observational, and evaluative skills.
2. Ability to articulate nuanced judgments with precision and clarity.
3. Excellent English comprehension (CEFR B2 or above); additional languages are a plus.
4. Familiarity with LLMs, generative AI, and multimodal systems.
5. Awareness of cultural and linguistic nuances.