Job Overview:
* Evaluate model outputs across multiple modalities (text, image captions, video descriptions, and multimodal prompts)
* Apply domain expertise and logical reasoning to resolve ambiguous or unclear outputs
Key Responsibilities:
1. Evaluate LLM-generated outputs
2. Assess quality against project-specific criteria such as correctness, coherence, completeness, style, cultural appropriateness, and safety
3. Identify subtle errors, hallucinations, or biases in AI responses
4. Provide detailed written feedback, tagging, and scoring of outputs to ensure consistency across the evaluation team
Benefits of Working in This Role:
* Pursue a challenging career with high growth potential
* Collaborate with experts in AI and generative AI (GenAI)
* Develop skills in critical thinking, problem-solving, and decision-making