Senior Software Engineer II - AI Workload Orchestration

CoreWeave
Sunnyvale, CA / Bellevue, WA
Posted 23 January 2026

Job Description

<div class="content-intro"><div> <div> <div class="gmail_quote"> <div> <div><span id="m_1770241969069985273m_-2746164444908759431gmail-docs-internal-guid-131e4fb0-7fff-b4e9-ff50-e8cf32449b1b">CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at&nbsp;<a href="http://www.coreweave.com/" target="_blank" data-saferedirecturl="https://www.google.com/url?q=http://www.coreweave.com&amp;source=gmail&amp;ust=1762613132717000&amp;usg=AOvVaw3D-UOhNaqEvF5BEWxjYyAU">www.coreweave.com</a>.</span></div> </div> </div> </div> </div></div><h3><span style="text-decoration: underline;"><strong>What You’ll Do:</strong></span></h3> <p>As a <strong>Senior Software Engineer II (IC4)</strong> on the <strong>AI Workload Orchestration Platform</strong> team, you will help build and operate CoreWeave’s Kubernetes-native platform for admitting, scheduling, and operating AI workloads at scale.</p> <p>This platform integrates multiple orchestration and scheduling frameworks such as <strong>Kueue, Volcano, and Ray</strong> to support modern AI training and inference workflows. 
It complements <strong>SUNK (Slurm on Kubernetes)</strong> by providing a Kubernetes-first, cloud-native orchestration layer with deep platform integration.</p> <p>You will own meaningful components of the platform, drive reliability and performance improvements, and help scale the system as customer demand and workload complexity continue to grow.</p> <h3><span style="text-decoration: underline;"><strong>About the role:</strong></span></h3> <ul> <li>Design, build, and operate Kubernetes-native services for AI workload orchestration and scheduling</li> <li>Own one or more platform components end-to-end, including design, implementation, testing, and on-call support</li> <li>Improve scheduling latency, cluster utilization, and workload reliability through metrics-driven engineering</li> <li>Contribute to architectural discussions across services and influence design decisions within the platform</li> <li>Work closely with adjacent teams (CKS, infrastructure, managed inference) to ensure clean interfaces and integrations</li> <li>Mentor junior engineers and raise the quality bar for code, design, and operations</li> </ul> <h3><span style="text-decoration: underline;"><strong>Who You Are:</strong></span></h3> <ul> <li>5–8 years of professional software engineering experience in distributed systems, cloud infrastructure, or platform engineering</li> <li>Strong experience building production systems in <strong>Go</strong> (Python or C++ a plus)</li> <li>Solid understanding of <strong>Kubernetes fundamentals</strong>, APIs, controllers, and operating services in production</li> <li>Experience working with scheduling, resource management, or quota-based systems</li> <li>Proven ability to improve system reliability and performance using data and operational metrics</li> <li>Comfortable owning services in production and participating in on-call rotations</li> </ul> <p><strong>Preferred:</strong></p> <ul> <li>Experience with Kubernetes-native orchestration frameworks such as 
<strong>Kueue, Volcano, Ray, Kubeflow, or Argo Workflows</strong></li> <li>Familiarity with GPU-based workloads, ML training, or inference pipelines</li> <li>Knowledge of scheduling concepts such as <strong>quota enforcement, pre-emption, and backfilling</strong></li> <li>Experience with reliability practices including <strong>SLOs, alerting, and incident response</strong></li> <li>Exposure to AI infrastructure, ... (truncated, view full listing at source)</li> </ul>