Multimodal GenAI Evaluation Analyst
We seek highly skilled and detail-oriented professionals to perform nuanced evaluations of AI system outputs across multiple modalities.
Responsibilities:
* Evaluate text, image captions, video descriptions, and multimodal prompts generated by LLMs.
* Assess quality against project-specific criteria including correctness, coherence, completeness, style, cultural appropriateness, and safety.
* Identify subtle errors, hallucinations, or biases in AI responses.
* Apply domain expertise and logical reasoning to resolve ambiguous outputs.
* Provide detailed written feedback, and tag and score outputs consistently.
* Escalate unclear cases and contribute to refining evaluation guidelines.
* Collaborate with project managers and quality leads to meet accuracy, reliability, and turnaround benchmarks.