ML Systems Engineer - Model Training and Infrastructure (SWE-focused LLMs)

Cosine
London Office
£80k – £110k
Posted 27 March 2026

Job Description

Job title: ML Systems Engineer - Model Training and Infrastructure (SWE-focused LLMs)
Location: London; full in-office working as default
Start date: ASAP
Compensation: £80,000 - £110,000 base salary and £80,000 - £110,000 share options
___________________________________________________________________________
COSINE AT A GLANCE

At Cosine, we’re building autonomous AI engineers that plan, write, and ship code inside real development workflows. Cosine is designed for on-premise and virtual private cloud (VPC) deployments, including fully air-gapped environments. We build our agent tooling entirely in-house and post-train open-source models to deliver reliable, enterprise-grade coding performance in security-critical settings.

In 2024, Cosine achieved a 72% score on OpenAI’s SWE-Lancer benchmark, placing us among the strongest real-world software-engineering AI systems evaluated. YC-backed and well-funded, Cosine was founded by experienced operators focused on building dependable, production-grade AI.

This role is based in our Hoxton office, five days a week, because close collaboration, fast feedback, and shared context matter for the problems we’re solving.
___________________________________________________________________________
THE ROLE

We’re looking for an ML Systems Engineer to collaborate on training our Lumen models, our open-source-based software engineering LLMs. This is a unique and truly interdisciplinary role that involves developing and deploying our reinforcement learning (RL) training environments, working on synthetic data pipelines at massive scale, and running fine-tuning jobs to train the next generation of SWE models that will be used in both our self-serve and enterprise products.

We want the models we train to be the best SWEs in the world. That doesn’t just mean training them to get the right answer; it means training them to write readable, maintainable code that fits the architectural patterns already present in the codebase. We believe we’re now in the anti-slop era of coding agents, where data, RL environments, and opinionated reward functions will shape the future standards of SWE models. If this sounds exciting, this could be the role for you.

ABOUT THE ROLE

In this role you will:
- Develop and manage synthetic data generation pipelines to curate datasets that will underpin future RL fine-tunes.
- Design, build, and deploy containerized services using Docker and platforms like Kubernetes to enable our RL infrastructure.
- Build and iterate on large-scale RL loops where models write code, run tests or tools, and get rewarded (or penalized) accordingly.
- Work hands-on across the stack: custom PyTorch dataloaders, RL objectives, and evaluation on real-world repos and tasks.

You’ll collaborate closely with infra, product, and research to decide what to train next, how to train it, and how to measure whether it’s actually better for engineers.
___________________________________________________________________________
WHAT YOU’LL DO

- Participate in end-to-end training of models:
  - Supervised fine-tuning on curated code and conversation datasets.
  - RL on top of those models to align them with software-engineering objectives.
- Architect synthetic data generation pipelines for RL and deploy them using containerization technologies.
- Ideate on novel and opinionated reward functions for the training of SWE agents.
- Improve evaluation for SWE models:
  - Help maintain and extend an evaluation suite for code models (unit tests, benchmark suites, repo-level tasks).
  - Analyze failure modes and feed them back into data and training plans.
___________________________________________________________________________
WHAT WE’RE LOOKING FOR (ESSENTIAL)

- Strong software engineering or computer science background:
  - Typically 3-5 years of experience.
  - You can read, debug, and ... (truncated, view full listing at source)