Job Title: Senior Software Engineer - LLM Evaluation and Repository Validation
We are seeking experienced software engineers to help build verifiable software engineering (SWE) tasks from public repository histories, using a synthetic approach with a human in the loop.
About the Role:
This role involves hands-on software engineering work, including automating development environments, triaging issues, and evaluating test coverage and quality. You should have experience working with well-maintained, widely used repositories with 500+ GitHub stars.
Responsibilities:
* Analyze and triage GitHub issues across trending open-source libraries.
* Set up and configure code repositories, including Dockerization and environment setup.
* Evaluate unit test coverage and quality.
* Modify and run codebases locally to assess LLM performance in bug-fixing scenarios.
* Collaborate with researchers to identify repositories and issues that are challenging for LLMs, and to design tasks around them.
Required Skills and Qualifications:
* Strong experience with at least one of the following languages: Python, JavaScript, Java, Go, Rust, C/C++, C#, or Ruby.
* Experience working with well-maintained, widely used repositories with 500+ GitHub stars.
* Proficiency with Git, Docker, and basic software pipeline setup.
* Ability to understand and navigate complex codebases.
* Comfort running, modifying, and testing real-world projects locally.
Benefits:
* Fully remote work environment.
* Opportunity to work on cutting-edge AI projects.
Other Details:
* Commitment required: 20 hours per week.
* Employment type: Contractor assignment.