Senior Site Reliability Engineer - Observability
Okta · Bellevue, Washington · Posted 27 February 2026
Job Description
<div class="content-intro"><p><span style="color: #000000;"><strong>Get to know Okta<br><br></strong></span>Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platform, provide secure access, authentication, and automation, placing identity at the core of business security and growth.<br><br>At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box - we’re looking for lifelong learners and people who can make us better with their unique experiences. <br><br>Join our team! We’re building a world where Identity belongs to you.</p></div><div class="SingleJob-content">
<h3><strong>Position Overview:</strong></h3>
<p>We are seeking a highly technical <strong>Senior</strong> <strong>Observability</strong> <strong>Site Reliability Engineer</strong> with a specialty in Splunk to own and evolve our Splunk ecosystem. In this role, you will move beyond simple monitoring to deliver a world-class, comprehensive, scalable Observability Platform that enables our SRE teams and business partners. You will treat <strong>infrastructure as code</strong>, using Terraform and strong coding proficiency in <strong>Go, Python, or Ruby</strong> to automate the deployment of agents and collectors across complex distributed systems.</p>
<p><strong>Key Responsibilities</strong></p>
<ul>
<li>Automated Infrastructure: Design, build, and maintain scalable observability infrastructure using tools like Terraform.</li>
<li>Splunk Engineering: Optimize the collection, processing, and storage of log data to ensure high reliability and low latency of our Splunk services.</li>
<li>Incident Response: Participate in on-call rotations and lead post-incident reviews to drive systemic improvements and "observability-driven development."</li>
<li>Automation: Eliminate "toil" by automating the deployment and scaling of observability agents and collectors.</li>
</ul>
<p><strong>Required Skills and Experience (The Essentials)</strong></p>
<ul>
<li><strong>Log Management:</strong> 5+ years of experience managing Splunk Cloud at scale (1000+ SVCs), including Workload Management (WLM) and HEC optimization.</li>
<li><strong>Visualization:</strong> Expertise in creating intuitive, actionable Splunk dashboards that correlate data across multiple sources.</li>
<li><strong>SRE Mindset:</strong> 3+ years of experience in an SRE, DevOps, or Systems Engineering role with a focus on high-availability systems.</li>
<li><strong>Programming Proficiency:</strong> Strong coding skills in <strong>SPL</strong> and <strong>Go</strong> for building internal tools and automating workflows.</li>
<li><strong>Distributed Systems:</strong> Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and container orchestration (Kubernetes/EKS).</li>
<li><strong>Problem Solving:</strong> A data-driven approach to debugging complex, cross-service performance bottlenecks.</li>
</ul>
<p><strong>Bonus Skills (The "Nice-to-Haves")</strong></p>
<ul>
<li><strong>Telemetry Standards: </strong>Hands-on experience with OpenTelemetry (OTel), Vector, or similar frameworks for instrumenting applications.</li>
<li><strong>Charge-back app:</strong> Experience implementing a Splunk charge-back app for usage reporting.</li>
</ul>
<p><strong>Cloud Platforms:</strong> Experience managing native observability tools within AWS or GCP.</p>
<p><strong>Additional requirements:</strong></p>
<ul>
<li>This position requires the ability to access federal environments and/or protected federal data. As a condition of employment for this position, the successful candidate must be able to submit documentation establishing U.S. Person status (e.g., a U.S. Citizen, National, Lawful Permanent Resident, Refugee, or Asylee; see 22 CFR 120.15) upon hire.</li>
<li>This person must attend in person onboarding in our San F ... (truncated, view full listing at source)