Senior Software Engineer for LLM Evaluation

Company: Confidential
Location: Remote
Closing Date: 06/07/2026
Hours: Full Time
Type: Permanent

Job Description

Role Overview
As a Software Engineering evaluator, you will play a crucial role in creating advanced datasets for training, benchmarking, and enhancing large language models. This position involves collaborating closely with researchers to curate code examples, provide precise solutions, and refine AI-generated code across various programming languages, ensuring the development of reliable and efficient AI-driven coding solutions.

Key Responsibilities
  • Curate code examples, build solutions, and correct code in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go for AI model training initiatives.
  • Evaluate and refine AI-generated code to ensure efficiency, scalability, and reliability.
  • Collaborate with cross-functional teams to enhance AI-driven coding solutions against industry performance benchmarks.
  • Develop agents to verify code quality and identify error patterns.
  • Form hypotheses about the stages of the software engineering lifecycle (prototyping, architecture design, API design, production implementation, launch, experiments, monitoring, and operational maintenance) and assess model capabilities across them.
  • Design verification mechanisms to automatically validate solutions to software engineering tasks.
Qualifications
  • Minimum of 3 years of software engineering experience.
  • Strong expertise in building full-stack applications and deploying scalable, production-grade software using modern languages and tools.
  • Deep understanding of software architecture, design, development, debugging, and code quality/review assessment.
  • Excellent oral and written communication skills for clear, structured evaluation rationales.
Work Terms
  • Commitment: Flexible engagement, minimum 10 hours/week, up to 40 hours/week.
  • Type: Contractor (no medical/paid leave).
  • Duration: 1 month (potential extensions based on performance and fit).
  • Location: Candidates must be based in the US, Canada, or Western European (WEU) countries (Austria, Belgium, France, Germany, etc.).
Eligibility
  • The application process takes 15-30 minutes.
  • Completion of an AI video interview is required.