Cloud Observability Engineer Lead

Ceridian HCM Holding
Remote | Posted 7 March 2026

Job Description

Req #23325 | United States
Posted Thursday, March 5, 2026 at 10:00 PM | Expires Saturday, June 6, 2026 at 9:59 PM

Dayforce is a global human capital management (HCM) company headquartered in Toronto, Ontario, and Minneapolis, Minnesota, with operations across North America, Europe, the Middle East, and Africa (EMEA), and the Asia Pacific Japan (APJ) region.

Our award-winning Cloud HCM platform offers a unified solution database and continuous calculation engine, driving efficiency, productivity and compliance for the global workforce.

Our brand promise - Makes Work Life Better™ - reflects our commitment to employees, customers, partners and communities globally.

About the Opportunity

As a Lead Observability Engineer, you will provide senior technical leadership in the implementation, operation, and continuous improvement of Dayforce’s observability platform. You will partner closely with engineering and infrastructure teams to ensure reliable telemetry collection, actionable insights, and effective operational workflows across distributed systems. This role focuses on executing against established observability strategy, translating architectural direction into scalable and robust solutions, and enabling teams to successfully adopt and operationalize observability capabilities.

What You’ll Get to Do

- Design, implement, and operate components of the Dayforce observability platform in alignment with architectural standards and platform strategy.
- Lead implementation, tuning, and operational improvements across observability tooling including metrics, logs, traces, dashboards, alerting, and synthetic monitoring.
- Apply best practices for telemetry collection and instrumentation across application and infrastructure workloads.
- Build, maintain, and enhance dashboards and alerting mechanisms to support service ownership and incident response.
- Enable and onboard engineering and infrastructure teams to drive consistent adoption and effective platform usage.
- Design and optimize data pipelines for high-cardinality telemetry data, balancing performance, reliability, and cost.
- Partner with platform and engineering teams to gather requirements and deliver solutions aligned to operational needs.
- Provide mentorship through code reviews, documentation, and knowledge sharing.
- Participate in on-call rotations and operational reviews to drive reliability improvements and post-incident learnings.

Skills and Experience We Value

- Strong communication and collaboration skills across engineering and infrastructure teams.
- Ability to gather requirements, prioritize effectively, and deliver high-quality solutions within defined scope.
- Significant experience operating and troubleshooting distributed systems in production environments.
- Experience implementing and operating observability platforms including metrics, logging, tracing, and alerting systems.
- Hands-on experience with OpenTelemetry, distributed tracing, and APM tooling.
- Experience working with data pipelines, ETL processes, and high-cardinality telemetry datasets.
- Proficiency in at least one object-oriented programming language and one scripting language.
- Demonstrated ability to deliver scalable, reliable, and maintainable technical solutions.
- Strong interest in learning and adopting emerging technologies within an established architectural framework.
- Bachelor’s degree plus 5–10 years of related experience, Master’s degree plus 6 years of related experience, or equivalent combination of education and experience.

What Would Make You Stand Out

- Experience operating and tuning observability storage systems such as ClickHouse.
- Hands-on experience with Kubernetes observability and monitoring containerized workloads.
- Experience extending or integrating Grafana dashboards, data sources, or plugins.
- Familiarity applying AI-assisted tooling to observability workflows.
- Contributions to internal ... (truncated, view full listing at source)