Staff Engineer - Observability & Performance

Fastly
Denver, CO; New York City, NY; San Francisco, CA | $196k – $235k | Posted 24 February 2026

Job Description

<div class="content-intro"><p>Fastly helps people stay better connected with the things they love. Fastly’s edge cloud platform enables customers to create great digital experiences quickly, securely, and reliably by processing, serving, and securing our customers’ applications as close to their end users as possible, at the edge of the Internet. The platform is designed to take advantage of the modern Internet, to be programmable, and to support agile software development. Fastly’s customers include many of the world’s most prominent companies, including GitHub, Yelp, Paramount, and JetBlue.</p> <p>We’re building a more trustworthy Internet. Come join us.</p></div><p><strong>Posting Open Date: </strong>Reposted Feb 6, 2026</p> <p><strong>Anticipated Posting Close Date*: </strong>March 2, 2026</p> <p><em>*Job posting may close early due to the volume of applicants.</em></p> <p><strong>Staff Engineer - Observability & Performance</strong></p> <p>We’re seeking a versatile and experienced engineer who thrives in a fast-paced, high-scale environment and is passionate about reliability, performance, automation, and tooling. Reporting to the VP of Performance Center Operations, you’ll serve as a key individual contributor within the Performance Center Operations team. The Fastly Performance Center is the strategic and operational engine that ensures the highest level of performance for the most demanding workloads on the Internet. We proactively safeguard quality of service at global scale, drive technical and product strategies that shape our platform’s evolution, and directly influence revenue outcomes by ensuring our customers succeed.</p> <p>You’ll partner cross-functionally across Engineering, Infrastructure, Product, Revenue, and Account teams to build tooling and processes that drive scale, availability, and intelligent automation.
Your work will help ensure Fastly remains the most performant, trusted, and customer-aligned partner in the industry.</p> <p>The scope of this role will evolve with the needs of the business and the maturity of the program. Additional responsibilities may be assigned based on individual expertise and strategic priorities from the Office of the Founder CTO.</p> <p><strong>What You'll Do:</strong></p> <ul> <li>This role is approximately 50% Cross-functional Operations, 40% Data Analysis / Traffic Insights, and 10% Site Reliability Engineering, balancing technical expertise with collaboration and strategic impact.</li> <li>Drive the development of automation and observability tooling that improves operational efficiency and platform reliability, including traffic monitoring, alerting, and surveillance tools.</li> <li>Partner with observability teams to implement and improve existing dashboards (Grafana, Prometheus) and metrics pipelines that provide meaningful visibility into traffic patterns, surges, and seasonal trends.</li> <li>Help define SLIs/SLOs and improve monitoring frameworks, ensuring alerts and dashboards reflect operational reality and proactively surface issues before customer impact.</li> <li>Collaborate with data/analytics teams to leverage data pipelines (e.g., SQL, BigQuery, or other large-scale data stores) for trend analysis, capacity planning, and traffic pattern recognition.</li> <li>Step in to run daily operational standups or coordination meetings as needed, ensuring priorities are clear, follow-ups are tracked, and cross-functional execution maintains momentum.</li>
<li>Facilitate cross-team communication during high-impact initiatives or incident reviews, surfacing blockers early and maintaining execution momentum.</li> <li>Assist in root-cause investigations of performance, scalability, or traffic anomalies, and translate learnings into improvements in tooling and architecture.</li> <li>Act as a technical liaison, helping contextualize traffic behavior, system performance, and support escalations with clear insight.</li> <li>Help define and evolve run-books, incident response processes, ... (truncated, view full listing at source)