Senior AI Engineer - CD Visibility
Datadog · Madrid, Spain · Posted 24 February 2026
Job Description
<p>At Datadog, we leverage AI across our observability platform to improve monitoring, speed up incident resolution, and ensure data reliability for cloud applications.</p>
<p>Datadog’s <a href="https://docs.datadoghq.com/deployment_gates/">Deployment Gates</a> team builds customer-facing systems that decide whether software should ship to production. Deployment Gates sit directly in customers’ CI/CD pipelines and use observability data to answer one of the hardest questions in software delivery:<br>Given everything we know right now, is this deployment safe to proceed?<br><br>In this role, you will:</p>
<ul>
<li>Work on the evolution of Deployment Gates from static rules to AI-driven decisions, combining Datadog telemetry to detect and block faulty changes before they impact users.</li>
<li>Lead the design of progressive deployment automation, starting with zero-setup, conservative AI rules and evolving toward adaptive gates that learn from incidents and organizational patterns.</li>
<li>Design the foundation for autonomous remediation, connecting what changed in code to what broke in production, from blocking and rolling back unsafe deployments to proposing fixes and enforcing policies.</li>
</ul>
<p>This is a highly product‑minded engineering role: you’ll work from problem discovery and UX all the way to reliable, scalable production systems.</p>
<p><em>At Datadog, we place value in our office culture - the relationships that it builds, the creativity it brings to the table, and the collaboration of being together. We operate as a hybrid workplace to ensure our employees can create a work-life harmony that best fits them.</em></p>
<p> </p>
<p><strong>What you’ll do:</strong></p>
<ul>
<li>Build AI-driven deployment gates: Design and ship decision systems that evaluate customer deployments using CI/CD context and Datadog telemetry, producing safe, explainable allow/block outcomes.</li>
<li>Own evals and rollout: Define precision, recall, and trust metrics; build offline and online evals; validate changes in shadow mode; and safely promote improvements to enforcement.</li>
<li>Design for robustness and safety: Implement conservative defaults, guardrails, fallbacks, and human-in-the-loop paths so gates behave predictably under noisy or incomplete data.</li>
<li>Partner closely with Product: Work hand-in-hand with the Product Manager to translate customer problems, adoption signals, and roadmap goals into concrete technical decisions and iterations.</li>
<li>Integrate across the Datadog platform: Partner with internal AI teams building the Faulty Deployment Detection pipeline, as well as teams working on LLMs and AI agents.</li>
<li>Own production systems: Build and operate reliable backend services that run in the critical path of customer deployments, and be on-call for those services.</li>
</ul>
<p> </p>
<p><strong>Who you are:</strong></p>
<ul>
<li>A product‑minded engineer who ships AI to production</li>
<li>You have 5+ years of experience with backend systems and microservices performance: tracing, latency breakdowns, concurrency, and resiliency patterns</li>
<li>You are proficient in a modern programming language, with strong API/service design skills and production operations experience (monitoring, alerting, on‑call rotation)</li>
<li>You have proven experience delivering LLM/agent features to production</li>
<li>You are comfortable owning user journeys, iterating from prototype → alpha → GA, and measuring impact with clear product metrics</li>
<li>An end-to-end AI implementation owner: you understand the end-to-end LLM product lifecycle</li>
<li>Fluent with offline/online evals for AI systems</li>
</ul>
<p> </p>
<p><strong>Bonus Points:</strong></p>
<ul>
<li>Experience with Continuous Delivery tools (e.g. ArgoCD / Argo Rollouts, Spinnaker, Octopus Deploy)</li>
<li>Exposure to planning/agent frameworks, tool‑use orchestration, RAG, and retrieval/indexing for observability data</li>
</ul>
<p><em>Datadog values people from all walks of li ...</em> (truncated, view full listing at source)</p>