We are seeking analytical professionals with hands-on experience in Red Teaming, Prompt Evaluation, and AI/LLM Quality Assurance.
Key Responsibilities:
* Conduct Red Teaming exercises to elicit and document adversarial or unsafe outputs from large language models.
* Evaluate AI prompts across multiple domains (e.g., finance, healthcare, security) to uncover failure modes.
* Develop test cases to assess accuracy, bias, toxicity, hallucinations, and misuse potential in AI-generated responses.
* Collaborate with data scientists, safety researchers, and prompt engineers to report risks and suggest mitigations.
* Perform manual QA and content validation across model versions, ensuring factual consistency, coherence, and guideline adherence.
* Create evaluation frameworks and scoring rubrics for prompt performance and safety compliance.
Our team is looking for talented individuals who can help us improve the quality and safety of our AI models. If you have a passion for problem-solving and a strong understanding of AI/LLM concepts, we encourage you to apply.
The successful candidate will have excellent analytical skills, the ability to work independently, and strong communication skills.