Staff Software Engineer, AI Runtime Systems

Crunchyroll
Los Angeles, California, United States
Posted 24 February 2026

Job Description

<div class="content-intro"><h2 data-pm-slice="1 1 []">About Crunchyroll</h2> <p>Founded by fans, Crunchyroll delivers the art and culture of anime to a passionate community. We super-serve over 100 million anime and manga fans across 200+ countries and territories, and help them connect with the stories and characters they crave. Whether that experience is online or in-person, streaming video, theatrical, games, merchandise, events and more, it’s powered by the anime content we all love.</p> <p>Join our team, and help us shape the future of anime!</p></div><h2 data-pm-slice="1 1 []">About the role</h2> <p>Crunchyroll's Platform Development organization powers the infrastructure that delivers anime at scale to millions of fans worldwide. We are seeking a Staff Software Engineer to join our team in Los Angeles.</p> <p>In this role you will drive the design and evolution of core platform services that power Crunchyroll's global ecosystem. Your work will span authentication and security enhancements, notification services, and ML inference runtimes, forming the foundation that enables engineering teams to build reliable, secure, and intelligent experiences at scale.</p> <p>You will lead architectural initiatives, define technical direction, and ensure system scalability, performance, and resilience across distributed environments. 
Partnering closely with ML, data science, and engineering teams, you will shape the platform capabilities that support deploying and operating models in production, ensuring they meet the reliability and efficiency standards required for a global streaming service.</p> <p>In the role of Staff Software Engineer, you will report to the Engineering Manager, Platform.</p> <h3>Core Areas of Responsibility</h3> <ul> <li>Architect, build, and maintain ML inference runtimes for multi-model serving, autoscaling, and GPU/TPU utilization.</li> <li>Optimize inference pipelines and platform services for performance, reliability, and scalability.</li> <li>Lead deployment, operationalization, and maintenance of ML workloads in collaboration with ML and data science teams.</li> <li>Shape and maintain core platform services, including authentication, security, and notifications.</li> <li>Ensure seamless integration with platform infrastructure, CI/CD pipelines, and observability systems.</li> <li>Define scalable system architectures and guide cross-team design alignment.</li> <li>Develop benchmarking, validation, and monitoring tools to measure and maintain system performance.</li> <li>Promote security, compliance, and engineering best practices across platform and ML services.</li> <li>Mentor and influence engineering peers, fostering technical excellence and consistent standards.</li> </ul> <h2>About You</h2> <ul> <li><strong>12+ years of backend software engineering experience, with a track record of leading complex projects end-to-end.</strong></li> <li><strong>Hands-on experience building and optimizing AI/ML inference runtimes (e.g., KServe, TorchServe, TensorRT, Triton) and integrating with CI/CD and MLOps pipelines (e.g., SageMaker, Kubeflow, BentoML).</strong></li> <li><strong>Expertise in JavaScript/TypeScript, with additional experience in Golang or Kotlin.</strong></li> <li>Experience with containers, orchestration (Kubernetes/ECS), cloud platforms (AWS preferred), and 
distributed systems.</li> <li>Experience with performance profiling, model optimization, GPU acceleration, and designing inference workloads to meet latency/throughput SLAs.</li> <li>Experience building scalable APIs (REST/gRPC), caching strategies, and high-performance systems, including relational and NoSQL databases.</li> <li>Familiarity with monitoring, observability tools, security, and compliance best practices in production ML/AI services.</li> <li>Proven ability to collaborate with ML/AI teams, bridge research and production, and mentor peers.</li> <li>Strong problem-solving, communication, and engineering best practices, with attentio ... (truncated, view full listing at source)