Senior Site Reliability Engineer - Observability

Okta
Bengaluru, India
Posted 27 February 2026

Job Description

<div class="content-intro"><p><span style="color: #000000;"><strong>Get to know Okta<br><br></strong></span>Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platform, provide secure access, authentication, and automation, placing identity at the core of business security and growth.<br><br>At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box; we’re looking for lifelong learners and people who can make us better with their unique experiences.<br><br>Join our team! We’re building a world where Identity belongs to you.</p></div><p><strong>Workforce Identity Cloud</strong></p> <p>Okta Workforce Identity Cloud (WIC) provides easy, secure access for your workforce so you can focus on other strategic priorities, like reducing costs and doing more for your customers.</p> <p>If you like to be challenged and have a passion for solving large-scale automation, testing, and tuning problems, we would love to hear from you. The ideal candidate is someone who exemplifies the ethic “If you have to do something more than once, automate it” and who can rapidly self-educate on new concepts and tools.</p> <h3><strong>Position Overview</strong></h3> <p>We are seeking a highly technical <strong>Site Reliability Engineer</strong> with deep expertise in <strong>Splunk and Grafana</strong> to own and evolve our observability ecosystem. In this role, you will move beyond simple monitoring to architect a comprehensive, scalable telemetry platform.
You will be our subject-matter expert in Splunk optimisation, ensuring our logging architecture is performant, cost-effective, and deeply integrated with our automated workflows.</p> <p>You will treat infrastructure as code, using <strong>Terraform</strong> and strong coding proficiency in <strong>Go, Python, or Ruby</strong> to automate the deployment of agents and collectors across complex distributed systems.</p> <h3><strong>Key Responsibilities</strong></h3> <ul> <li><strong>Splunk Architecture Optimisation:</strong> Lead the design and tuning of Splunk environments. Optimise indexer performance, search efficiency, and data models to ensure rapid troubleshooting and cost-efficiency.</li> <li><strong>Advanced Visualisation:</strong> Architect and maintain sophisticated <strong>Grafana</strong> dashboards that correlate disparate data sources into a single pane of glass for real-time system health.</li> <li><strong>Automated Infrastructure:</strong> Design, build, and maintain scalable observability infrastructure using tools like <strong>Terraform</strong>.</li> <li><strong>Pipeline Engineering:</strong> Optimise the collection, processing, and storage of telemetry data (Metrics, Logs, Traces) to ensure high reliability and low latency.</li> <li><strong>Workflow Automation:</strong> Develop custom Splunk workflows and integrations that trigger automated responses to system events, reducing Mean Time to Resolution (MTTR).</li> <li><strong>Incident Response:</strong> Participate in on-call rotations and lead post-incident reviews to drive systemic improvements through "observability-driven development."</li> </ul> <h3><strong>Required Skills and Experience (The Essentials)</strong></h3> <ul> <li><strong>Splunk Mastery:</strong> Deep, hands-on experience with Splunk administration, search optimisation (SPL), and architecting complex data pipelines.
You know how to make Splunk "hum" at scale.</li> <li><strong>Grafana Expertise:</strong> Proven ability to build actionable, intuitive dashboards in Grafana that go beyond simple charts to provide deep operational insights.</li> <li><strong>SRE Mindset:</strong> At least 3 years of experience in an SRE, DevOps, or Systems Engineering role with a focus on high-availability systems.</li> <li><strong>Programming Proficiency:</strong> Strong coding skills in <strong>Go, ... (truncated, view full listing at source)