Maintenance Engineer

Luminai
San Mateo
Posted 27 March 2026

Job Description

ABOUT LUMINAI

Healthcare operations have always depended on people to bridge the gaps that technology couldn't. The industry relies on complex manual work to carry out critical internal processes, yet most health systems don’t have the resources to automate these tasks properly, leaving them stuck in outdated, labor-intensive SOPs. Luminai structures the chaos, automates the manual handoffs, and deploys end-to-end workflows across every system, providing the integrated intelligence layer to improve processes over time. By delegating mission-critical workflows that previously consumed valuable human time to autonomous AI systems, Luminai allows doctors and administrators to do what they do best: focus on patients.

We’ve raised significant amounts of capital from the best Silicon Valley VCs: General Catalyst, Peak XV (fka Sequoia India), and Y Combinator, as well as from investors including Kevin Weil (Chief Product Officer at OpenAI), Arash Ferdowsi (co-founder of Dropbox), Katie Stanton (former VP Global Media, Twitter), and CEOs of companies such as Flexport, Notion, Front, Ramp and Twitch.

ABOUT THE ROLE

As a Maintenance Engineer at Luminai, you will ensure the reliability, performance, and resilience of the systems that power mission-critical AI workflows. You’ll operate at the core of our production infrastructure, maintaining, monitoring, and continuously improving the systems that healthcare, finance, and telecommunications organizations depend on every day.

This is a highly ownership-driven role for someone who thrives on operational excellence, prevents issues before they arise, and takes pride in keeping complex systems running smoothly in high-stakes environments. You’ll work closely with Engineering, Product, and Forward Deployed teams to ensure our deployments are stable, secure, and scalable.

This is a hybrid position. Our team is in-office 3 days a week (Mon, Tue, Thu) in San Mateo, California.

WHAT YOU’LL DO

- Monitor, maintain, and improve the reliability of production AI systems and workflow infrastructure
- Proactively identify, diagnose, and resolve system issues across application, integration, and cloud infrastructure layers
- Own incident response processes, including root cause analysis and long-term remediation
- Implement monitoring, alerting, and observability tooling to ensure system health and uptime
- Collaborate with Engineering to harden deployments and improve system architecture for resilience and scalability
- Support customer-facing teams by troubleshooting and resolving technical issues in live environments
- Document system configurations, operational procedures, and recovery protocols
- Continuously improve reliability standards, deployment practices, and operational safeguards

WHAT WE’RE LOOKING FOR

- 3+ years of experience in support engineering, site reliability engineering, or infrastructure maintenance
- Strong proficiency in Python or scripting languages
- Experience managing cloud infrastructure (AWS, GCP, or Azure)
- Strong problem-solving skills and a proactive, preventative mindset
- Clear communication skills and ability to collaborate across engineering and customer-facing teams
- High ownership and accountability in high-reliability environments