Senior Machine Learning Engineer, Computer Vision/VLM

Waymo
Mountain View, CA, USA; San Francisco, CA, USA
Posted 24 February 2026

Job Description

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver (The World's Most Experienced Driver™) to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo's fully autonomous ride-hail service and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over ten million rider-only trips, enabled by its experience autonomously driving over 100 million miles on public roads and tens of billions of miles in simulation across 15+ U.S. states.

In Semantics, our team's mission is to create the highest-fidelity, most comprehensive offboard perception autolabels at a massive scale, serving as the foundation for training and validating the AV stack. We are an advanced ML and engineering team that leverages state-of-the-art computer vision, deep learning, and generative AI to automatically analyze driving logs, generate rich scene understanding, and power the data engine that enables Waymo to scale safely and efficiently.

In this hybrid role, you will report to a Technical Lead Manager.

You will:

- Develop and train state-of-the-art computer vision / multimodal models (e.g., Gemini) to extract the rich semantic information (e.g., object attributes, scene properties, interaction dynamics) required by the AI agent.
- Design and implement a scalable AI agent framework that integrates large foundation models (e.g., Gemini) with the outputs of our perception models and internal knowledge bases.
- Develop and apply fine-tuning and reinforcement learning (RL) techniques to create a "data flywheel," continuously improving the system's captioning and reasoning abilities through automated feedback.
- Develop and prototype novel prompting strategies for Vision-Language Models (VLMs) to elicit complex, causal reasoning about driving scenarios.
- Collaborate closely with the ML Infra, Perception, Behavior, and AI Foundation teams to define data requirements and integrate the captioning system into the broader ML development lifecycle.
- Own the full system lifecycle, from advanced model development and prototyping to production deployment and scaling for massive data generation.

You have:

- Master's degree in Computer Science or a related technical field.
- 4+ years of hands-on experience training and shipping deep learning models for computer vision tasks (e.g., detection, segmentation, video understanding) using Python and frameworks like PyTorch, JAX, or TensorFlow.
- 1+ years of demonstrated experience working with large language models (LLMs) or vision-language models (VLMs) in areas such as fine-tuning, prompting, or Retrieval-Augmented Generation (RAG).
- Strong software engineering fundamentals, including designing scalable and reliable systems.
- Experience building and managing large-scale data processing pipelines for ML training.
- Proven ability to work autonomously and lead complex technical projects in a fast-paced R&D environment.

We prefer:

- PhD in Computer Science or a related technical field.
- Publication record in top-tier AI conferences (e.g., NeurIPS, ICML, ICLR, CVPR).
- Hands-on experience with reinforcement learning, especially RLHF, RLAIF, or applying RL to language/agentic tasks.
- Experience with modern techniques in self-supervised, weakly-supervised, or multi-task learning for perception.
- Experience building with AI agent frameworks (e.g., LangChain, LlamaIndex) or developing autonomous agentic systems.
- Familiarity with the challenges of ... (truncated, view full listing at source)