Senior Software Engineer - Cloud Infrastructure & Observability 

Roku
Cambridge, United Kingdom
Posted 1 April 2026

Job Description

Teamwork makes the stream work.

Roku is changing how the world watches TV

Roku is the #1 TV streaming platform in the U.S., Canada, and Mexico, and we've set our sights on powering every television in the world. Roku pioneered streaming to the TV. Our mission is to be the TV streaming platform that connects the entire TV ecosystem. We connect consumers to the content they love, enable content publishers to build and monetize large audiences, and provide advertisers unique capabilities to engage consumers.

From your first day at Roku, you'll make a valuable - and valued - contribution. We're a fast-growing public company where no one is a bystander. We offer you the opportunity to delight millions of TV streamers around the world while gaining meaningful experience across a variety of disciplines.

About the Role

We are building a next-generation observability and cloud platform that is high-performance, cost-efficient, secure, and scalable across multi-region, multi-cloud clusters. You will lead the architecture and evolution of Roku's observability and cloud infrastructure stack. This includes metrics, logs, traces, telemetry pipelines, service mesh, developer experience, and the reliability of systems that power thousands of services and millions of devices.

You will drive a vision where developers gain deep visibility with minimal overhead, onboarding is seamless, and insights are available in real time. Your work will directly help Roku scale efficiently while maintaining reliability, cost control, and performance.

What You'll Be Doing

- Architect and lead Roku's observability platform across metrics, logs, and traces; evolve data pipelines and storage layers optimized for high throughput, performance, and cost at Roku scale (TSDBs, Parquet, distributed processing).
- Extend and harden open‑source observability systems; overhaul core components (e.g., storage layers, query paths) to improve performance, reliability, and usability at scale.
- Implement features such as pre‑aggregation, down-sampling, and sampling to reduce load and accelerate queries across the platform.
- Collaborate across platform, SRE, and product teams to migrate hundreds of workloads to our common platform; augment and automate CI/CD flows and onboarding.
- Integrate security into infrastructure and platform services; ensure robust multi‑tenant, multi‑cluster, and multi‑cloud designs.
- Contribute improvements back to open source and CNCF‑aligned projects; shape standards adoption (OpenTelemetry, OpenMetrics) across the company.
- Mentor engineers; establish best practices for reliability, efficiency, and cost management across service mesh and observability domains.

What You'll Have

- Extensive software engineering experience, with a track record of architecting distributed systems or platforms at scale.
- Strong hands‑on experience in Golang and one scripting language (e.g., Python or Shell).
- Experience operating observability at petabyte-scale ingestion and hundreds of millions of series.
- Expertise in observability platforms and tooling (Prometheus, Grafana, Loki, Tempo, ELK/OpenSearch, ClickHouse) and standards (OpenTelemetry, OpenMetrics).
- Deep experience building systems at scale and operating cloud infrastructure with Kubernetes; strong proficiency with service mesh technologies (Istio/Envoy) and infrastructure‑as‑code (Terraform), and experience in multi‑cloud environments (AWS, GCP).
- Demonstrated ability to evolve storage and query architectures for cost, scale, and latency (e.g., TSDB, Parquet, distributed processing).
- Proven experience integrating security as part of infrastructure and platform development.
- Exceptional cross‑functional communication; effective collaboration with both technical and non‑technical stakeholders.
- Culture fit: independent thinker, pragmatic problem‑solver, low‑ego collaborator who moves fast and focuses on company success.
- Experience integrating AI tools to improve processes and reduce toil.
- Open‑source contributions in CN ... (truncated, view full listing at source)