Senior Software Engineer - AI Evaluation & Code Quality Specialist
We are seeking an experienced software engineer to contribute to the development of LLM evaluation and training datasets. The ideal candidate will have expertise in working with high-quality public GitHub repositories, specifically those with 500+ stars.
About the Project:
We are building verifiable SWE tasks from public repository histories, using a synthetic approach with a human in the loop. Our goal is to expand the dataset's coverage across dimensions such as programming language, difficulty level, and more.
About the Role:
This role involves hands-on software engineering work across trending open-source libraries: automating development environments, triaging GitHub issues, and evaluating unit test coverage and quality.
Responsibilities:
* Analyze and triage GitHub issues across trending open-source libraries.
* Set up and configure code repositories, including Dockerization and environment setup.
* Evaluate unit test coverage and quality.
* Modify and run codebases locally to assess LLM performance in bug-fixing scenarios.
* Collaborate with researchers to design and identify repositories and issues that are challenging for LLMs.
Requirements:
To be successful in this role, you should have:
* Strong experience with at least one of the following languages: Python, JavaScript, Java, Go, Rust, C/C++, C#, or Ruby.
* Experience working with well-maintained, widely used repositories (500+ stars).
* Proficiency with Git, Docker, and basic software pipeline setup.
* Ability to understand and navigate complex codebases.
* Comfort running, modifying, and testing real-world projects locally.
Bonus Points:
Previous participation in LLM research or evaluation projects, or experience building or testing developer tools or automation agents, is a plus!
Contract Details:
This is a one-month contractor assignment (no medical or paid leave). You will work 20 hours per week, with some overlap with PST working hours.