Software Engineer 5 – Model Runtime, AI Platform

USA - Remote · $466k – $750k · Posted 28 April 2026

Tech Stack

Express, .NET, Scala, AWS, PyTorch, Machine Learning, AI, LLM, Fine-tuning

Job Description

At Netflix, our mission is to entertain the world. Together, we are writing the next episode - pushing the boundaries of storytelling, global fandom and making the unimaginable a reality. We are a dream team obsessed with the uncomfortable excitement of discovering what happens when you merge creativity, intuition and cutting-edge technology. Come be a part of what’s next.

Netflix is the world's leading streaming entertainment service, with over 300 million members in over 190 countries, enjoying TV series, feature films, and games across numerous genres and languages. Members can watch or play as much as they want, anytime, anywhere, on any internet-connected screen.

Machine Learning/Artificial Intelligence powers innovation in all areas of the business, from helping members choose the right title for them through personalization, to optimizing our payment processing. Building highly scalable and differentiated ML infrastructure is key to accelerating this innovation.

The Opportunity

The Model Runtime team owns the systems that train, align, and serve Netflix's most critical ML models. We are a small, highly autonomous team with outsized impact: the infrastructure we build directly shapes what Netflix can do with AI. We're looking for a Software Engineer who thrives at the intersection of systems engineering and ML.

You will:

- Build alignment and post-training infrastructure: Design infrastructure for reinforcement learning (GRPO, DPO, PPO), reward modeling, and preference optimization so Netflix can train recommendation models directly against what members actually value.
- Enable next-generation GenAI workloads: Create infrastructure for multimodal and diffusion models, including distributed training, disaggregated serving, real-time, near-real-time and batch inference, and asynchronous GPU pipelines.
- Scale distributed training: Engineer fault-tolerant training systems using FSDP, tensor/pipeline/context parallelism, and mixed-precision strategies across clusters of hundreds of GPUs.
- Optimize across the full stack: Profile and tune from PyTorch operators down to GPU kernels, driving utilization improvements and building cost models that inform infrastructure strategy.
- Evaluate emerging hardware and frameworks: Be the team's eyes on specialized accelerators, next-gen NVIDIA silicon, and the open-source ecosystem to keep Netflix at the efficiency frontier.

If you want to work on problems where the gap between "possible" and "deployed at scale" is the hard part, this is the role.

Minimum Job Qualifications

- Experience in ML systems engineering: building infrastructure for training, fine-tuning, or inference of pre-LLM and post-LLM era models at scale.
- Strong systems programming skills with the ability to work across multiple layers of the stack, from high-level ML frameworks down to GPU kernels and memory management
- Hands-on experience with PyTorch internals, large-scale distributed training and system-model codesign
- Comfortable with ambiguity and working across multiple business and technical domains to execute on both 0-to-1 and 1-to-100 projects
- Adopt and promote best practices in operations, including observability, logging, reporting, and on-call processes to ensure engineering excellence
- Experience with cloud computing providers, preferably AWS
- Excellent written and verbal communication skills
- Strong communication skills; effective across distributed time zones and remote environments

Preferred Qualifications

- Deep experience with distributed training at scale (FSDP, parallelism strategies, checkpointing) or LLM post-training (SFT, RLHF, DPO/GRPO)
- Inference optimization: vLLM, TensorRT, quantization, continuous batching, KV-cache management
- GPU performance profiling and tuning (CUDA, NCCL, Nsight, PyTorch profiler)
- Experience with multimodal or diffusion model architectures and generation pipelines
- Track record building reusable ML libraries or contributing to open-sourc ... (truncated, view full listing at source)
