Senior Kubernetes Platform Engineer
TensorWave · Las Vegas, Nevada · Posted 24 February 2026
Job Description
Our mission at TensorWave Cloud is to build seamless, secure, reliable, and resilient AI infrastructure at scale, eliminating barriers and challenging the status quo to empower builders and support AI innovation.

About the role

We are seeking a Senior Kubernetes Platform Engineer focused on support and operations. You’ll play a critical role in maintaining the stability and reliability of our bare-metal Kubernetes infrastructure and work closely with senior engineers, taking point on troubleshooting, incident response, and day-to-day cluster operations across multi-tenant workloads. This is a great opportunity for engineers ready to deepen their Kubernetes expertise while supporting cutting-edge AI environments in real time.

Responsibilities
- Own and troubleshoot operational issues within Kubernetes environments
- Maintain and monitor core services (e.g., Cilium, HAProxy, Prometheus)
- Ensure uptime, performance, and reliability of multi-tenant clusters
- Assist with Ingress/Egress connectivity and network debugging
- Support internal and customer teams in secure, isolated VPC environments
- Collaborate with senior engineers on automation and cluster lifecycle improvements

Required Experience
- Bachelor of Science in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience
- 5+ years of experience in DevOps, SRE, or Linux infrastructure roles
- 4+ years of hands-on experience with Kubernetes in production
- 3+ years designing and operating multi-tenant Kubernetes platforms at CSP or hyperscaler scale (AWS EKS, GCP GKE, Azure AKS), including control plane architecture, cluster federation, and tenant isolation strategies
- Proven experience implementing production-grade cluster authentication (OIDC/SSO integration, RBAC policies) and advanced network design (CNI selection/configuration, network policies, service mesh architecture, cross-cluster networking)
- Familiarity with networking, CNI plugins, and core Linux troubleshooting
- Strong infrastructure-as-code mindset: Helm, Terraform, Ansible
- Solid experience with monitoring and logging tools: Prometheus, Grafana, Loki
- Understanding of secure infrastructure design principles and least-privilege access

Preferred Experience
- Experience with RKE2, Rancher, or similar platforms
- Experience troubleshooting or supporting AI or GPU-based workloads
- Familiarity with HAProxy, Cilium, or other Kubernetes ingress/networking tools

What We Bring
- Mission-driven company
- Competitive Salary
- Stock Options
- 100% paid Medical, Dental, and Vision insurance
- Flexible PTO
- Paid Holidays
- 401(k)
- Parental Leave
- Flexible Spending Account
- Short Term Disability Insurance
- Life and Voluntary Supplemental Insurance
- Mental Health Benefits through Spring Health

We’re looking for resilient, adaptable people to join our team, people who believe in the mission and think at massive scale. The solutions that worked on a handful of devices will not work at Exascale. Be prepared to be pushed daily, to learn a lot, and literally build the future.

TensorWave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, national origin, or veteran status.