Principal Software Engineer, Enterprise Scalability

Klaviyo
Boston, MA
Posted 24 February 2026

Job Description

<div class="content-intro"><p><em>At Klaviyo, we value the unique backgrounds, experiences and perspectives each Klaviyo (we call ourselves Klaviyos) brings to our workplace each and every day. We believe everyone deserves a fair shot at success and appreciate the experiences each person brings beyond the traditional job requirements. If you’re a close but not exact match with the description, we hope you’ll still consider applying. Want to learn more about life at Klaviyo? Visit <a class="_ymio1r31 _ypr0glyw _zcxs1o36 _mizu194a _1ah3dkaa _ra3xnqa1 _128mdkaa _1cvmnqa1 _4davt94y _4bfu18uv _1hms8stv _ajmmnqa1 _vchhusvi _kqswh2mm _ect4ttxp _syaz13af _1a3b18uv _4fpr8stv _5goinqa1 _f8pj13af _9oik18uv _1bnxglyw _jf4cnqa1 _30l313af _1nrm18uv _c2waglyw _1iohnqa1 _9h8h12zz _10531ra0 _1ien1ra0 _n0fx1ra0 _1vhv17z1" href="http://klaviyo.com/careers" data-renderer-mark="true">klaviyo.com/careers</a> to see how we empower creators to own their own destiny.</em></p></div><p>As Klaviyo’s senior IC for scale, you will report to a VP of Engineering and lead performance, reliability, multi‑region, and large‑tenant readiness. You’ll drive platform-wide architectural change, hunt bottlenecks and optimize systems, and partner across teams to productionize improvements. 
Because this is an IC role with no direct reports, you will lead through technical depth, hands‑on impact, and crisp cross‑org alignment.</p> <p><strong>What You’ll Do</strong></p> <ul> <li>Define enterprise scalability fitness functions (latency/throughput/error rates) and a scorecard; align teams to SLOs and budgets.</li> <li>Design and implement sharding and partitioning strategies, caching/back‑pressure, multi‑region readiness, and high‑volume migration paths.</li> <li>Build lightweight enablement: benchmarks, profiling harnesses, reproducible testbeds; pair with teams to land fixes.</li> <li>Lead scalability reviews and readiness gates that accelerate, not block, delivery; drive incident deep dives tied to systemic fixes.</li> <li>Communicate clearly to execs and engineers, tying technical work to business impact and customer outcomes.</li> <li>Integrate AI into scale and resiliency work, from proactive anomaly detection to synthetic load and guided runbooks, so performance improvements stick and incidents don’t repeat.</li> </ul> <p><strong>Who You Are</strong></p> <ul> <li>Experience: 12+ years scaling multi‑tenant SaaS with a reputation for removing major bottlenecks and proving impact with data.</li> <li>Technical expertise: Performance engineering, capacity planning, sharding/partitioning, caching/back‑pressure, multi‑region readiness, and high‑volume migrations; you turn hotspots into robust patterns.</li> <li>AI tools and automation: You apply AI to scale work (profiling assistance, workload modeling, synthetic traffic generation, anomaly detection, and runbook copilots), always with explicit guardrails and observability.</li> <li>Cross‑org influence: You align teams through fitness functions, scorecards, and readiness gates that accelerate, not block, delivery; you communicate tradeoffs crisply to execs and engineers.</li> <li>AI fluency: Curious, adaptable, and proactive in exploring AI that responsibly improves scale outcomes.</li> </ul> <p><strong>Nice to 
Haves</strong></p> <ul> <li>Scale scorecard: Company‑wide fitness functions (latency/throughput/error rates) are adopted and reviewed regularly.</li> <li>High‑impact wins: 2–3 bottlenecks removed with documented, reproducible testbeds; pXX latencies and error rates improve on top enterprise workloads; repeat P0s trend down.</li> <li>AI‑assisted scale engineering: AI‑driven anomaly detection reduces alert noise while improving signal; generative load testing and copilot runbooks are used in release/readiness checks for the top critical services; time‑to‑isolate regressions drops 20–30%.</li> </ul> <p><strong>Success in 6–12 Months</strong></p> <ul> <li>Company‑wide scale scorecard in place; 2–3 high‑impact bottlenecks removed; top en ... (truncated, view full listing at source)