Software Engineer, Large Model Evaluation
Waymo · Mountain View, California, USA; San Francisco, California, USA · Posted 24 February 2026
Job Description
<div class="content-intro"><p>Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver—The World's Most Experienced Driver™—to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo’s fully autonomous ride-hail service and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over ten million rider-only trips, enabled by its experience autonomously driving over 100 million miles on public roads and tens of billions in simulation across 15+ U.S. states.</p></div><p>The Large Model Evaluation team is at the nexus of Waymo’s <a href="https://waymo.com/blog/2025/12/demonstrably-safe-ai-for-autonomous-driving">AI ambition</a>. With advancements in Large Language Models (LLMs) and Vision-Language Models (VLMs), Waymo is building state-of-the-art <a href="https://waymo.com/blog/2024/10/ai-and-ml-at-waymo">AI systems</a> that handle the full complexity of real-world driving. At its core, our progress is defined by our ability to measure it. While robust evaluation is the bottleneck for deploying any large model, the challenge at Waymo is uniquely complex and safety-critical. We are looking for quantitatively-minded engineers to research and propose new ways to assess the ML models deployed in the Waymo Driver.</p>
<p><strong>You will:</strong></p>
<ul>
<li>Develop novel metrics and sampling techniques to measure the driving trajectories generated by ML models.</li>
<li>Employ creative simulation strategies to measure the driving performance of generative AI models, identify potential edge cases, and provide reliable performance insights that inform model development and deployment.</li>
<li>Build data pipelines for signal discovery, data labeling, feature extraction and metric computation based on large-scale simulations.</li>
<li>Conduct data analysis to diagnose regressions in ML models.</li>
<li>Collaborate with world-class engineering and research teams that develop large-scale ML models.</li>
</ul>
<p><strong>You have:</strong></p>
<ul>
<li>BS/MS/PhD in Computer Science, Machine Learning, Robotics, Statistics, Physics, Math or another quantitative area</li>
<li>Proficiency in programming in Python or C++</li>
<li>Knowledge of AI fundamentals, such as transformer architectures and distillation techniques</li>
<li>Demonstrated industry or research experience with creative problem solving and rigorous data analysis of open-ended quantitative problems</li>
</ul>
<p><strong>We prefer:</strong></p>
<ul>
<li>Familiarity with one of the modern deep learning frameworks (e.g. JAX, TensorFlow, PyTorch)</li>
<li>Experience evaluating the quality of ML models</li>
</ul><div class="content-pay-transparency"><div class="pay-input"><div class="description"><p><span style="font-weight: 400;">The expected base salary range for this full-time position across US locations is listed below. Actual starting pay will be based on job-related factors, including exact work location, experience, relevant training and education, and skill level. During the hiring process, your recruiter can share more about the specific salary range for the role location or, if the role can be performed remotely, the specific salary range for your preferred location.</span></p>
<p><span style="font-weight: 400;">Waymo employees are also eligible to participate in Waymo’s discretionary annual bonus program, equity incentive plan, and generous Company benefits program, subject to eligibility requirements. </span></p></div><div class="title">Salary Range</div><div class="pay-range"><span>$170,000</span><span class="divider"></span><span>$216,000 USD</span></div></div></div>