Staff Site Reliability Engineer

ServiceTitan
Warsaw, Poland
Posted 5 March 2026

Job Description

The Staff Site Reliability Engineer will be a key player in managing, optimizing, and ensuring the reliability and scalability of our SQL Server and PostgreSQL databases, both in the cloud and on-premises. The ideal candidate will have extensive experience with Azure and AWS platforms, with a strong preference for Azure expertise. You will work closely with our development and operations teams to drive improvements in database performance, automate processes, and implement robust backup and recovery procedures.

What you'll do:

- Own the architecture, design, deployment, and lifecycle management of SQL Server and PostgreSQL databases across Azure, AWS, and on-prem environments.
- Lead database design reviews, schema governance, indexing strategy, and query optimization to ensure performance and scalability.
- Manage database security, including access controls, encryption (at rest and in transit), auditing, and compliance best practices.
- Design and maintain high availability (HA) and disaster recovery (DR) architectures (Always On, replication, failover clusters, logical/physical replication, etc.), defining and enforcing RTO/RPO objectives.
- Implement and manage backup strategies, validation testing, and recovery procedures.
- Perform proactive database performance tuning, capacity planning, and workload optimization for mission-critical systems.
- Own patching, upgrades, migrations (cloud/on-prem), and version lifecycle management.
- Develop automation for DBA operations (provisioning, patching, backups, health checks, failover testing) using scripting (PowerShell, Bash, Python) and Infrastructure as Code.
- Collaborate with engineering teams to optimize data models, troubleshoot production incidents, and resolve performance bottlenecks.
- Establish database monitoring standards and implement observability solutions using tools such as Datadog, Grafana, ELK, and Prometheus.
- Participate in incident response, root cause analysis (RCA), and postmortem improvements related to database systems.
- Contribute to CI/CD processes for database deployments, schema changes, and migration pipelines.
- Define and document database operational standards, runbooks, and best practices.

What you'll need:

- Deep expertise in:
  - Performance tuning (query plans, indexing strategies, locking/blocking analysis, deadlock resolution)
  - High availability and disaster recovery configurations
  - Backup/restore strategies and validation
  - Database security and hardening
- Strong experience operating databases in Azure and AWS, including managed services (Azure SQL, RDS, etc.) and self-managed deployments.
- Proven experience with database migrations (on-prem to cloud, version upgrades, cross-platform migrations).
- Strong scripting skills (PowerShell, Bash, Python) to automate DBA workflows.
- Experience implementing monitoring, alerting, and observability for database systems.
- Solid understanding of reliability engineering principles (SLIs/SLOs) as they apply to database systems.
- Experience with Infrastructure as Code and containerized database environments (Kubernetes, Docker) is a plus.
- Familiarity with CI/CD pipelines for database schema deployments (GitHub Actions, Azure DevOps, TeamCity, etc.).
- Strong troubleshooting skills, attention to detail, and ability to manage high-impact production systems.

What We Offer:

When you join our team, you’re not just accepting a job. You’re making a career move. Here’s how we’ll support you in doing some of the most impactful work of your career:

- Flexibility & Autonomous Work: Enjoy the freedom of a fully remote setup, flexible working hours, and flexible time off. We trust you to manage your time and deliver outstanding results while maintaining a healthy work-life balance, which is a core part of our culture.
- Growth & Development: We invest in your growth. Benefit from a comprehensive onboarding program, leadership training for Titans at all levels, and ample learning and development opportunities to help you grow.
- Ownership & Recognition: As a Tit ... (truncated, view full listing at source)