Manager, Site Reliability Engineering
Veeam Software
Remote, Czechia
Posted 21 March 2026
Job Description
Veeam is the Data and AI Trust Company, specializing in helping organizations ensure their data and AI are fully understood, secured, and resilient to enable the acceleration of safe AI at scale. As the market leader in both data resilience and data security posture management, Veeam is built for the convergence of identity, data, security, and AI risk. Headquartered in Seattle with offices in more than 30 countries, Veeam protects over 550,000 customers worldwide, who trust Veeam to keep their businesses running. Join us as we go fearlessly forward together, growing, learning, and making a real impact for some of the world’s biggest brands.
About the Role
Veeam is expanding its global Site Reliability Engineering (SRE) organization to support the Veeam Data Cloud (VDC). As an SRE Manager, you will report to our Global Director of SRE and will build and lead a high-performing team that partners with product, platform, and security engineering to make our systems reliable, scalable, and observable from the ground up. You’ll collaborate with peer engineering leaders to embed reliability into service roadmaps, and you’ll represent your team in global SRE planning and in the delivery of cross-cutting reliability initiatives across all VDC services.
You’ll drive adoption of SRE principles (SLIs/SLOs/error budgets, toil reduction, blameless learning) and operate a healthy, daytime follow-the-sun on-call model in partnership with our other regions. You will lead your team in contributing code that improves the overall operability, reliability, resilience, and security of the codebase(s) we support.
What You’ll Do
People & Team Leadership
Hire, onboard, and grow your SRE team; coach career development and performance
Foster a psychologically safe, blameless culture that favors learning over blame and emphasizes engineering over firefighting
Ensure sustainable operational coverage; monitor on-call health and workload
Track and cap toil so engineers spend the majority of time on project work that reduces future toil
Reliability Strategy & Governance
Establish and operationalize SLIs/SLOs and error budgets with service owners; run reliability reviews and hold teams accountable to outcomes
Define reliability standards, runbooks, readiness checklists, and alerting patterns (including SLO-based alerting)
Partner with product/EMs to align reliability work with service goals and customer experience, not as a gate but as an enabler
Operations & Incident Excellence
Ensure incident response readiness; lead/coordinate major incidents; drive fast, high-quality postmortems and systemic fixes
Measure MTTR, change failure rate, SLO posture, and repeat-incident reduction; publish learnings broadly
Engineering & Automation
Lead software-first reliability investments: observability, deployment safety (canary/blue-green), resilience testing/chaos, and self-service guardrails
Drive platform improvements (IaC, CI/CD, Kubernetes) and internal tools that scale operations and improve developer experience
What You’ll Bring
7+ years in Software, Platform, and/or Reliability Engineering with 2+ years managing engineers
Demonstrable experience leading engineering teams to predictably deliver outcomes
Experience leading cross-functional initiatives collaboratively with peers through influence
Experience with public cloud (Azure preferred), Kubernetes, IaC (Terraform, Pulumi), CI/CD (GitHub Actions, Argo CD, Azure DevOps), and observability (OpenTelemetry, Elastic, Datadog, Prometheus, Grafana)
Coding background with experience improving service reliability
Hands-on incident management and postmortem practice; excellent cross-geo communication
Willingness to participate in an on-call rotation (typically during daytime hours, including weekends/holidays)
Bonus Skills
Demonstrated success leading SLO/error-budget adoption and reliability programs for cloud services
Experience operating a multi-region, follow-the-sun on-call model
Back ... (truncated, view full listing at source)