Lead ML Engineer
CloudZero · San Francisco · Posted 26 March 2026
Job Description
LEAD ML / DATA SCIENTIST
ABOUT THE ROLE
CloudZero is growing fast. Our customer base is expanding, the data challenges we're solving are getting more complex, and the platform is scaling to match. As our founding ML/Data Scientist, you'll own the hardest data science problems at CloudZero: building the models, pipelines, and intelligence layer that power real-time cost visibility, anomaly detection, forecasting, and agentic governance across billions of dollars in cloud spend.
This is real ML engineering work at scale, not a research role or a prompt engineering job. You'll work at the intersection of financial telemetry, cloud infrastructure, AI inference, and stream processing, shaping how CloudZero evolves from a billing-first platform toward a telemetry-first, cost-per-anything model for cloud and AI. You'll set the technical patterns, solve problems no one has solved before, and help build the team around you.
This role is ideal for an engineer who thrives on hard data science problems, cares deeply about correctness and production quality, and wants to see their work matter to customers in direct and measurable ways.
WHAT YOU'LL DO
Build the ML Foundation
- Spend 70% or more of your time hands-on: building models, writing production code, designing pipelines, and shipping ML capabilities that customers use
- Define the standards, infrastructure, and patterns the future ML team will build on
- Partner closely with platform engineering and product to embed ML into CloudZero's core, serving as the technical bridge rather than a separate track
Solve Genuinely Hard ML Problems
- Build real-time anomaly detection systems that identify cost spikes, efficiency breaches, and AI usage anomalies across millions of cloud and inference events via stream processing (Kafka, Flink/KStreams)
- Develop production-grade time-series forecasting models for cost and usage, with proper seasonality handling, confidence intervals, and feedback loops
- Model relationships between cloud resources, services, products, and business units as semantic cost graphs at cloud scale
- Tackle cardinality estimation for compound effects of high-dimensional column combinations at the core of our data model
- Build the multi-tier architecture that processes every AI inference event in real time, per model, per token, per team, per customer, reconciled against billing to produce total cost-to-produce intelligence
- Design the intelligence layer for autonomous AI agents, including real-time budget enforcement, policy compliance detection, and spend guardrails for the agents customers deploy in production
Take Models to Production
- Own the full stack: feature engineering, model serving, monitoring, retraining pipelines, and feedback loops
- Turn research and prototypes into production-grade features with full observability baked in
- Apply LLM-based approaches for semantic parsing, NL-to-query translation, and conversational analytics where they genuinely fit, and know when they don't
WHAT YOU BRING
- 6+ years of ML engineering and data science experience, with meaningful time in production systems at scale
- Deep time-series fluency: you've built forecasting and anomaly detection systems that made it to production and earned customer trust
- Classical ML foundations across graphs, clustering, probabilistic modeling, and data structures; you reach for the right tool, not the trendiest one
- Full-stack production ML ownership: feature engineering, model serving, monitoring, retraining pipelines, and feedback loops
- Python fluency and data warehouse experience (Snowflake, BigQuery, or equivalent)
- Formal background in Computer Science, Statistics, Mathematics, or a related quantitative field
BONUS IF YOU HAVE...
- GenAI/LLM experience: you've integrated LLMs, seen their failure modes, and know when to use them versus traditional ML
- Cloud ML infrastructure experience with AWS SageMaker, Bedrock, or equivalent at ... (truncated, view full listing at source)