Senior Software Engineer - LLM Evaluation & Repository Validation
About the Projects:
* We are building evaluation and training datasets for large language models to work on realistic software engineering problems.
* Our approach is to generate verifiable tasks synthetically from public repository histories, with human input.
About the Role:
We are looking for experienced software engineers who can contribute to this project. You should have experience working with high-quality public GitHub repositories.
This role involves hands-on software engineering work, including development environment automation, issue triaging, and evaluating test coverage and quality.
Key Responsibilities:
* Analyze and triage GitHub issues across trending open-source libraries.
* Set up and configure code repositories, including Dockerization and environment setup.
* Evaluate unit test coverage and quality.
* Modify and run codebases locally to assess large language model performance in bug-fixing scenarios.
* Collaborate with researchers to identify challenging repositories and issues and to design tasks around them for large language models.
Required Skills:
* Strong experience with at least one of the following languages: Python, JavaScript, Java, Go, Rust, C/C++, C#, or Ruby.
* Experience working with well-maintained, widely used repositories (500+ GitHub stars).
* Proficiency with Git, Docker, and basic software pipeline setup.
* Ability to understand and navigate complex codebases.
* Comfort with running, modifying, and testing real-world projects locally.
Nice to Have:
* Previous participation in large language model research or evaluation projects.
* Experience building or testing developer tools or automation agents.
Benefits:
* Work in a fully remote environment.
* Opportunity to work on cutting-edge AI projects.