**Software Engineer for AI-Assisted Software Development**
About the Project:
We are building datasets to train Large Language Models (LLMs) on realistic software engineering problems. Our approach involves creating verifiable tasks from public repository histories, with humans in the loop. We aim to expand the dataset's coverage across programming languages, difficulty levels, and more.
About the Role:
We are seeking experienced software engineers who are familiar with high-quality public GitHub repositories. You should have experience working with well-maintained, widely used repositories with 500+ stars. This role involves hands-on software engineering work, including development environment automation, issue triaging, and evaluation of test coverage and quality.
Responsibilities:
* Analyze and triage GitHub issues across trending open-source libraries.
* Set up and configure code repositories, including Dockerization and environment setup.
* Evaluate unit test coverage and quality.
* Modify and run codebases locally to assess LLM performance in bug-fixing scenarios.
* Collaborate with researchers to identify repositories and issues that are challenging for LLMs and to design tasks around them.
* Lead a team of junior engineers on collaborative projects, as opportunities arise.
Requirements:
* Strong experience with at least one of the following languages: Python, JavaScript, Java, Go, Rust, C/C++, C#, or Ruby.
* Experience working with well-maintained, widely used repositories with 500+ stars.
* Proficiency with Git, Docker, and basic software pipeline setup.
* Ability to understand and navigate complex codebases.
* Comfort running, modifying, and testing real-world projects locally.
* Experience contributing to or evaluating open-source projects is a plus.
Perks:
* Work in a fully remote environment.
* Work on cutting-edge AI projects with leading LLM companies.