Job Overview
 * We are seeking detail-oriented, analytical professionals with experience in Red Teaming, Prompt Evaluation, and AI/LLM Quality Assurance to rigorously test and evaluate AI-generated content.
 * The role focuses on identifying vulnerabilities, assessing risks, and ensuring compliance with safety, ethical, and quality standards.
Key Responsibilities
 * Conduct red-teaming exercises to elicit and document adversarial or unsafe outputs from large language models.
 * Evaluate and stress-test AI prompts across multiple domains to uncover potential failure modes.
 * Develop and apply test cases to assess accuracy, bias, toxicity, and hallucinations in AI-generated responses.
 * Collaborate with data scientists and prompt engineers to report risks and suggest mitigations.
 * Perform manual QA and content validation, ensuring factual consistency and adherence to guidelines.
 * Create evaluation frameworks and scoring rubrics for prompt performance and safety compliance.
Required Skills & Qualifications
 * Prior experience in AI red teaming, LLM safety testing, or adversarial prompt design is essential.
 * Familiarity with prompt engineering, NLP tasks, and ethical considerations in generative AI is required.
 * A strong background in Quality Assurance, content review, or test case development for AI/ML systems is necessary.
 * Understanding of LLM behaviors, failure modes, and model evaluation metrics is crucial.
 * Excellent critical thinking and analytical writing skills are expected.
Preferred Qualifications
 * Prior work with organizations such as OpenAI, Anthropic, or Google DeepMind is highly valued.
 * Experience in risk assessment, red team security testing, or AI policy & governance is desirable.