Job Opportunity:
We are seeking detail-oriented professionals with experience in AI/LLM Quality Assurance to help us rigorously test and evaluate AI-generated content.
Key Responsibilities:
* Conduct thorough testing of large language models (LLMs) to identify vulnerabilities and assess risks.
* Evaluate and stress-test AI prompts across multiple domains (e.g., finance, healthcare, security) to uncover potential failure modes.
* Develop and apply test cases to assess accuracy, bias, toxicity, hallucinations, and misuse potential in AI-generated responses.
* Collaborate with data scientists, safety researchers, and prompt engineers to report risks and suggest mitigations.
* Perform manual QA and content validation across model versions, ensuring factual consistency, coherence, and guideline adherence.
* Create evaluation frameworks and scoring rubrics for prompt performance and safety compliance.
* Document findings, edge cases, and vulnerability reports with high clarity and structure.
Required Skills and Qualifications:
* Proven experience in AI red teaming, LLM safety testing, or adversarial prompt design.
* Familiarity with prompt engineering, NLP tasks, and ethical considerations in generative AI.
* Strong background in Quality Assurance, content review, or test case development for AI/ML systems.
* Understanding of LLM behaviors, failure modes, and model evaluation metrics.
* Excellent critical thinking, pattern recognition, and analytical writing skills.
* Ability to work independently, follow detailed evaluation protocols, and meet tight deadlines.
Why Join Us:
This is an exciting opportunity to contribute to the advancement of AI safety and quality assurance. If you are a motivated professional with a passion for improving AI-generated content, we encourage you to apply.
Additional Information:
This role requires deep expertise in AI safety and quality assurance; we are looking for a candidate who can work independently and make significant contributions to our team.