Staff Software Engineer, Replication Foundations

Temporal Technologies
United States - Remote Opportunity
$180k – $304k
Posted 9 March 2026

Job Description

About Us

Temporal is an open source programming model that can simplify code, make applications more reliable, and help developers focus on the important things like delivering features faster. We are on a mission to be the reliable foundation of every developer’s toolbox, and are building the team that will make that happen. Our values guide us: they are present in how we show up, make decisions, and work together to make an impact. We’re curious, driven, collaborative, genuine, and humble. Temporal is growing and we are looking for those who share our values, challenge 'standard' thinking, and want to influence our future. If you have a passion for improving the developer experience, building world-class open-source software and communities, and want to be a part of our amazing team, we'd love to hear from you!

Summary

We’re hiring a Staff Software Engineer to join the Replication Foundations team, part of Temporal’s Cloud Global Services (CGS) organization. Replication Foundations owns and evolves Temporal’s core replication stack in Temporal OSS, the distributed systems backbone behind key Temporal Cloud capabilities like High Availability namespaces, cross-cluster and cross-region failover, and migration products that let customers move workloads between self-hosted Temporal and Temporal Cloud. The team also drives fundamental scalability and reliability mechanisms that define how Temporal operates at large scale.

At the Staff level, you’ll operate as a technical leader in Temporal’s distributed core: shaping protocols and correctness guarantees, leading complex efforts end-to-end, raising engineering rigor across replication foundations, and partnering cross-functionally to ensure OSS primitives unlock current and future cloud products. This role is deeply technical and requires comfort reasoning about consistency, concurrency, failure modes, performance, and operational safety in correctness-critical systems.

Key outcomes you’ll drive:

- Evolve replication protocols and correctness mechanisms that make failures (and cloud outages) a non-event for customers.
- Deliver scalability primitives (e.g., multi-cell namespaces) that unlock the next phase of Temporal Cloud growth.
- Raise the bar on safety, observability, and operability of Temporal’s replication layer across OSS and cloud.

What You'll Do

- Lead the design and implementation of core components of Temporal’s OSS replication stack, from initial design through rollout and long-term operational ownership.
- Design and evolve replication protocols that power:
  - High Availability namespaces
  - Cross-cluster and cross-region replication
  - Migration between Temporal clusters (cloud ↔ self-hosted, cloud ↔ cloud)
- Build scalability and reliability capabilities such as:
  - Multi-cell namespaces
  - Protocols enabling a single namespace to span multiple clusters
  - Dynamic split/merge strategies based on usage, hot spots, and capacity needs
- Reason deeply about correctness: consistency models, ordering guarantees, idempotency, failure recovery, and safe rollouts of protocol changes.
- Drive cross-team alignment with Cloud Enablement and other CGS teams to ensure OSS foundations support current and future cloud products.
- Author high-quality design docs that clarify invariants, trade-offs, failure modes, and operational playbooks for complex changes.
- Raise engineering standards through reviews, mentorship, and technical leadership, improving correctness testing, fault injection, and incident readiness.
- Participate in on-call/incident response related to replication and core system behavior, helping build durable fixes and prevention mechanisms.

What You'll Bring

- 10+ years building production systems, including significant experience with distributed systems and correctness-critical infrastructure.
- Strong experience with replication, consistency, fault tolerance, and failure recovery in distributed environments.
- Demonstrated ability to design and ...
(truncated, view full listing at source)