**Large Language Model Evaluation Specialist**
We are seeking a skilled and linguistically aware professional to evaluate and enhance multilingual prompt-response datasets for large language models. This role involves designing evaluation rubrics, assessing translations and model outputs, creating prompts, and identifying cultural nuances and biases in LLM behavior.
Key Responsibilities:
* Create region/language-specific rubric definitions to ensure cultural and linguistic relevance.
* Identify the need for additional rubrics tailored to specific languages or regional contexts.
* Review prompts translated from English into the target language and revise any that read unnaturally or are inaccurate.
* Develop thoughtful prompts to test the cultural awareness of LLMs.
* Evaluate prompt-response pairs using a standardized template based on rubrics and provide detailed justifications.
* Document problematic outputs with clear explanations of rubric violations or cultural insensitivities.
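The rubric workflow above, scoring each prompt-response pair on defined dimensions, requiring a written justification for every rating, and flagging problematic outputs, can be sketched as a minimal data structure. All names, dimensions, and the 1-5 scale below are illustrative assumptions, not details from this posting:

```python
from dataclasses import dataclass, field

@dataclass
class RubricRating:
    dimension: str      # hypothetical dimension, e.g. "cultural_appropriateness"
    score: int          # assumed 1 (poor) to 5 (excellent) scale
    justification: str  # written rationale required for every rating

@dataclass
class Evaluation:
    prompt: str
    response: str
    language: str
    ratings: list = field(default_factory=list)

    def add_rating(self, dimension: str, score: int, justification: str) -> None:
        # Enforce the "detailed justifications" requirement from the rubric.
        if not justification.strip():
            raise ValueError("Every rating must include a justification")
        self.ratings.append(RubricRating(dimension, score, justification))

    def flagged(self, threshold: int = 2) -> list:
        """Dimensions at or below the threshold, to be documented as violations."""
        return [r.dimension for r in self.ratings if r.score <= threshold]

# Example: a response that misses the regional context of the prompt.
ev = Evaluation(
    prompt="¿Qué se come en Navidad?",
    response="Turkey and pumpkin pie.",
    language="es-MX",
)
ev.add_rating("cultural_appropriateness", 2,
              "Answer reflects US customs rather than typical Mexican Christmas dishes.")
ev.add_rating("fluency", 5, "Response is grammatically sound.")
print(ev.flagged())  # -> ['cultural_appropriateness']
```

In practice this record would live in a spreadsheet or evaluation template rather than code; the sketch only makes the shape of the data explicit.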
Required Qualifications:
* Native proficiency in the target language and deep familiarity with cultural norms in the corresponding region.
* Experience in LLM evaluation, content moderation, or linguistic QA.
* Strong attention to detail with the ability to identify subtle issues in language use, tone, and cultural references.
* Comfortable working in spreadsheets and evaluation templates.
* A Master's degree in a relevant field.
Preferred Qualifications:
* Prior experience with prompt engineering or LLM testing.
* Familiarity with tools such as Gemini, ChatGPT, or similar LLM platforms.
* Ability to clearly articulate reasoning behind rubric ratings or prompt edits.
The ideal candidate will be able to work independently and as part of a team, with strong communication skills and attention to detail.