About This Role
We are seeking an experienced software engineer to collaborate with our research team on a project that aims to improve how large language models (LLMs) perform on realistic software engineering tasks.
* This is a unique opportunity to work directly with AI researchers shaping the future of AI-powered software development, evaluate high-impact open-source projects, and influence the design of datasets that will train and benchmark next-generation LLMs.
Key Responsibilities
* Evaluate code diffs for correctness, quality, style, and efficiency.
* Review and compare model-generated code responses using a structured ranking system.
* Provide clear, detailed rationales explaining the reasoning behind each ranking decision.
Requirements
* 7+ years of professional software engineering experience.
* Strong fundamentals in software design, coding best practices, and debugging.
* Demonstrated ability to assess code quality, correctness, and maintainability.
* Proficient with code review processes and reading diffs in real-world repositories.
* Exceptional written communication skills.
Engagement Details
* Commitment: ~20 hours/week.
* Type: Contractor.
* Duration: 1 month (potential extensions based on performance and fit).