Staff Software Engineer, Production Engineering

Uber
San Francisco, United States
Posted 26 March 2026

Job Description

Department: Engineering
Team: Backend
Location: San Francisco, United States
Type: Full-Time

**About the Role**

Engineering at Uber means building for real-world impact under real-world constraints. The problems are complex, the systems are massive, and the pace is fast. You’ll need to make smart decisions with imperfect information, and own them. If you think in systems, stay calm under pressure, and care about building things that actually work, this is where you’ll grow.

As a Production Engineer, you will blend software and systems engineering to ensure Uber's services run reliably at a massive scale. This isn't just about maintaining uptime; it's about architecting solutions for high-traffic distributed systems where performance and safety cannot be separated. You will navigate technical debt and shifting priorities while keeping the experience of millions of global users in mind. If you are energized by the challenge of unblocking complex reliability issues and want to build tools that empower an entire engineering organization, you belong on this team.

---- What the Candidate Will Do ----

1. Design, build, and maintain services to increase the reliability, scalability, and efficiency of Uber's thousands of stateless and stateful production services spread across multiple datacenter zones and regions.
2. Lead initiatives end-to-end within the team, the Production Engineering org, and across engineering at large to increase reliability through automation, setting standards, developer tooling, and reusable frameworks.
3. Work with other engineers to deeply understand their services and guide them towards practical and reliable architecture and implementation.
4. Apply SRE concepts such as observability, integration/load/chaos testing, oncall, incident management, failovers, and disaster recovery to design and apply tooling to improve mean time between failures (MTBF), time to detection (TTD), and time to mitigation (TTM) of incidents.

---- Basic Qualifications ----

1. 8+ years of experience in Go, Java, Python, or similar language
2. Expe