Lead SRE, Site Reliability Engineering

Klaviyo
Dublin, IE
Posted 24 February 2026

Job Description

<div class="content-intro"><p><em>At Klaviyo, we value the unique backgrounds, experiences and perspectives each Klaviyo (we call ourselves Klaviyos) brings to our workplace each and every day. We believe everyone deserves a fair shot at success and appreciate the experiences each person brings beyond the traditional job requirements. If you’re a close but not exact match with the description, we hope you’ll still consider applying. Want to learn more about life at Klaviyo? Visit <a class="_ymio1r31 _ypr0glyw _zcxs1o36 _mizu194a _1ah3dkaa _ra3xnqa1 _128mdkaa _1cvmnqa1 _4davt94y _4bfu18uv _1hms8stv _ajmmnqa1 _vchhusvi _kqswh2mm _ect4ttxp _syaz13af _1a3b18uv _4fpr8stv _5goinqa1 _f8pj13af _9oik18uv _1bnxglyw _jf4cnqa1 _30l313af _1nrm18uv _c2waglyw _1iohnqa1 _9h8h12zz _10531ra0 _1ien1ra0 _n0fx1ra0 _1vhv17z1" href="http://klaviyo.com/careers" data-renderer-mark="true">klaviyo.com/careers</a> to see how we empower creators to own their own destiny.</em></p></div><h2><strong>Lead Site Reliability Engineer – Site Reliability Engineering (Dublin)</strong></h2> <h3><strong>Team Overview</strong></h3> <p>As a Lead Site Reliability Engineer, you will set technical direction and lead reliability strategy for Klaviyo’s most critical platforms. You’ll ensure our systems are reliable, scalable, and sustainable while enabling rapid product development across the company.</p> <p>We treat reliability as a core product feature. Our work spans security, infrastructure, and software engineering, requiring deep systems thinking and strong technical leadership. We build foundational services that must be extremely reliable, secure, and performant at global scale.</p> <p>The SRE team’s charter is to design, build, and operate foundational infrastructure and services, define reliability standards, reduce operational toil through automation, and continuously improve systems based on production learnings. 
As a lead, your work will be highly visible and will directly influence how Klaviyo builds software and how customers experience our platform every day.</p> <h3><strong>How you’ll make an impact</strong></h3> <p>As a <strong>Lead Site Reliability Engineer</strong>, you will provide technical leadership while remaining hands-on with the systems that underpin Klaviyo’s reliability and operational excellence. You will:</p> <ul> <li>Set the technical vision and long-term strategy for reliability, availability, and operational excellence across critical platforms</li> <li>Lead the design, implementation, and evolution of foundational, security-critical services with strong guarantees around availability, scalability, latency, and fault tolerance</li> <li>Drive adoption of SRE best practices across engineering teams, including SLIs, SLOs, error budgets, and reliability-based decision making</li> <li>Identify systemic reliability risks and architectural bottlenecks, and lead cross-team initiatives to address them with durable, preventative solutions</li> <li>Apply software engineering principles to automate infrastructure, eliminate operational toil, and improve system reliability at scale</li> <li>Own and continuously improve observability, alerting, and incident response practices to reduce mean time to detection and recovery</li> <li>Guide on-call strategy and operational processes to ensure sustainability, automation, and healthy operational load</li> <li>Perform and lead quantitative analysis around system behavior, capacity planning, scaling limits, and performance characteristics</li> <li>Partner closely with product, platform, and security leaders to influence system architecture early and ensure reliability is built in from the start</li> <li>Lead incident response for high-severity events, driving effective mitigation, communication, and follow-up</li> <li>Mentor senior and mid-level engineers, raising the bar for technical quality, operational maturity, and 
reliability culture across the organization</li> <li>Review and influence technical designs, ... (truncated, view full listing at source)