Staff Software Engineer, Observability

Pinterest
San Francisco, CA, US; Remote, US
Posted 4 March 2026

Job Description

<div class="content-intro"><p><strong>About Pinterest:</strong></p> <p>Millions of people around the world come to our platform to find creative ideas, dream about new possibilities and plan for memories that will last a lifetime. At Pinterest, we’re on a mission to bring everyone the inspiration to create a life they love, and that starts with the people behind the product.</p> <p>Discover a career where you ignite innovation for millions, transform passion into growth opportunities, celebrate each other’s unique experiences and embrace the <a href="https://www.pinterestcareers.com/our-life/pinflex/">flexibility</a> to do your best work. Creating a career you love? It’s Possible.</p></div><p>We're seeking an exceptional Staff Software Engineer to join our Observability team at Pinterest. This role combines deep technical expertise in distributed systems and data engineering with a product-oriented mindset to build world-class observability solutions that empower our engineering organization. As a Staff Engineer on the Observability team, you'll be responsible for designing and building the infrastructure and tools that provide visibility into Pinterest's large-scale distributed systems, helping thousands of engineers understand, debug, and optimize their services.</p> <p><strong>What you'll do:</strong></p> <ul> <li>Define and execute the observability roadmap, treating it as a product. 
Understand engineering team needs and translate them into technical solutions with measurable impact.</li> <li>Architect, build, and scale distributed observability infrastructure (metrics, logs, traces) to handle massive volumes across Pinterest's distributed systems.</li> <li>Build high-performance data pipelines and storage for real-time and historical telemetry analysis at Pinterest scale.</li> <li>Champion Best Practices: Establish observability standards and patterns across the organization, making it easy for teams to instrument their services and gain actionable insights</li> <li>Technical Leadership: Mentor engineers, lead architectural reviews, and influence technical decisions across teams to improve overall system reliability and performance</li> <li>Cross-functional Collaboration: Partner with SRE, Infrastructure, Product Engineering, and other teams to understand pain points and deliver solutions that improve developer productivity and system reliability</li> <li>Innovation: Stay current with observability trends and technologies, evaluating and adopting cutting-edge tools and techniques to keep Pinterest at the forefront</li> </ul> <p><strong>What we’re looking for:</strong></p> <ul> <li>Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.</li> <li>Product Mindset: Demonstrated ability to work backwards from customer needs: understanding user needs, prioritizing features, measuring success, and iterating based on feedback. 
Experience building internal platforms or tools with strong adoption</li> <li>Distributed Systems Expertise: 7+ years of experience designing and operating large-scale distributed systems with deep understanding of consistency, availability, scalability, and failure modes</li> <li>Data Engineering Skills: Strong background in building data pipelines, working with time-series databases, columnar storage, stream processing (Kafka, Flink, etc.), and data modeling at scale</li> <li>Observability Domain Knowledge: Hands-on experience with modern observability tools and practices including metrics, logging, tracing, and profiling. Familiarity with OpenTelemetry, Prometheus, Grafana, or similar technologies</li> <li>Programming Proficiency: Expert-level coding skills in languages like Java, Python, Go, or Scala with ability to write production-quality code</li> <li>Systems Thinking: Ability to see the big picture while managing complex technical details, balancing trade-offs between cost, performance, and reliability</li> <li>Experience building observability platforms fro ... (truncated, view full listing at source)