Staff Software Engineer, Ads ML Inference Infrastructure
Pinterest · Palo Alto, CA, US; San Francisco, CA, US; Seattle, WA, US · Posted 4 March 2026
Job Description
<div class="content-intro"><p><strong>About Pinterest:</strong></p>
<p>Millions of people around the world come to our platform to find creative ideas, dream about new possibilities and plan for memories that will last a lifetime. At Pinterest, we’re on a mission to bring everyone the inspiration to create a life they love, and that starts with the people behind the product.</p>
<p>Discover a career where you ignite innovation for millions, transform passion into growth opportunities, celebrate each other’s unique experiences and embrace the <a href="https://www.pinterestcareers.com/our-life/pinflex/">flexibility</a> to do your best work. Creating a career you love? It’s Possible.</p></div><p><strong>Staff Software Engineer, Ads ML Inference Infrastructure</strong></p>
<p> </p>
<p>The Ads ML Inference Infra team owns the online inference and feature serving systems that power real-time model scoring and delivery for all Ads models at Pinterest. The team is looking for a staff engineer with strong hands-on experience in large-scale ML inference systems and the ability to solve ambiguous technical problems and drive strategic, cross-functional efforts.</p>
<p> </p>
<p><strong>What you’ll do:</strong></p>
<ul>
<li>Lead and drive efforts to build next-generation <strong>model inference and feature serving systems</strong> that power up to <strong>100x larger models</strong> and directly uplevel Pinterest’s monetization business.</li>
<li>Design and optimize <strong>low-latency, high-throughput inference pipelines</strong> to meet strict SLOs while improving <strong>performance, efficiency, and cost</strong>.</li>
<li>Partner with Ads ML and product teams to <strong>productionize new model architectures</strong> (including LLMs and multi-stage ranking models) and scale them reliably to global traffic.</li>
<li>Evolve the <strong>online feature platform</strong> (feature computation, caching, and retrieval) to improve coverage, freshness, and consistency for Ads models.</li>
<li>Evaluate and integrate new technologies (e.g., <strong>GPU acceleration, model compression, Triton, vLLM, Dynamo</strong>) to advance our inference stack.</li>
<li>Build strong partnerships with other infra and ML teams to improve <strong>end-to-end reliability, observability, and developer velocity</strong> for Ads ML.</li>
<li>Mentor and coach other engineers, guiding them through technical decisions, system design, and career development.</li>
</ul>
<p> </p>
<p><strong>What we’re looking for:</strong></p>
<ul>
<li>BS (or higher) degree in <strong>Computer Science</strong> or a related field.</li>
<li>~8+ years of relevant industry experience designing and operating <strong>large-scale, production ML or distributed infra systems</strong>.</li>
<li>Deep knowledge of at least one programming language (<strong>Java, C++, Python</strong>).</li>
<li>Deep experience with <strong>distributed systems or recommendation / ads serving infrastructure</strong> (e.g., request routing, online storage, caching, feature serving, APIs).</li>
<li>Hands-on experience with at least one deep learning framework (<strong>PyTorch</strong> or <strong>TensorFlow</strong>) and bringing models from offline experimentation to production.</li>
<li>[Preferred] Experience with <strong>model optimization techniques and hardware accelerator libraries</strong> (e.g., CUDA, quantization, distillation, low-precision inference).</li>
<li>[Preferred] Experience with <strong>inference optimization and serving frameworks</strong> such as <strong>Triton, vLLM, or Dynamo</strong>.</li>
<li>Proven track record of <strong>leading complex projects</strong>, setting technical direction, and <strong>collaborating across functions and orgs</strong>; experience mentoring and coaching other engineers.</li>
</ul>
<p> </p>
<p><strong>In-Office Requirement Statement:</strong></p>
<ul>
<li>We let the type of work you do guide the collaboration style. That means we're not always working in an office, but we continue to gathe ... (truncated, view full listing at source)</li>
</ul>