Job Description
* Evaluate outputs generated by large language models across multiple modalities, including text, image captions, video descriptions, and multimodal prompts.
* Assess quality against project-specific criteria such as correctness, coherence, completeness, style, cultural appropriateness, and safety.
* Identify subtle errors, hallucinations, or biases in AI responses.
Strong critical reading, observational, and evaluative skills across different modalities are required, as is the ability to articulate nuanced judgments with precision and clarity.
Requirements:
* Bachelor's degree or equivalent educational qualification.
* 1+ years of experience in data annotation, LLM evaluation, content moderation, or related AI/ML domains.
* Demonstrated experience working with data annotation tools and software platforms.
The successful candidate will have excellent English comprehension (CEFR B2 or above) and familiarity with large language models, generative AI, and multimodal systems.