Senior Site Reliability Engineer (SRE & Platform Reliability)

Affirm
Remote Spain | €85k – €115k | Posted 23 February 2026

Job Description

Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest.

Site Reliability Engineering at Affirm is a small yet crucial team that helps our Engineering partners “Operate What They Own” with excellence to protect their customers’ experience. SRE accomplishes this by defining frameworks and best practices for operating applications, building tooling, and providing training and consulting. Some of the many SRE responsibilities are:

- Providing data and visibility to teams and leadership on application performance
- Guiding the development of SLOs
- Driving the Incident Management and Analysis process
- Steering the implementation of Change Management and Deployment practices
- Engaging in service and architectural conversations
- Recommending observability and alerting configurations

The SRE team benefits from experience across many domains, including:

- infrastructure, platform, and distributed systems
- capacity management, load and chaos testing
- automation, observability, and configuration management
- development and product experience

The SRE team is seeking motivated software and systems engineers with the experience to build, iterate on, and expand incident lifecycle, reliability, and resilience practices throughout Affirm's Engineering organization and beyond.

What You'll Do:

- You will be responsible for owning and delivering quarterly goals for your team, leading engineers on your team through ambiguity to solve open-ended problems, and ensuring that everyone is supported throughout delivery.
- You will support your peers and stakeholders in the product development lifecycle by collaborating with infrastructure, product management, developer experience, and analytics by participating in ideation, articulating technical constraints, and partnering on decisions that properly consider risks and trade-offs.
- You will proactively identify technical solutions and operational processes that strengthen incident readiness, response, and post-incident analysis.
- You will support the operations and availability of your team’s artifacts by creating and monitoring metrics, escalating when needed, and supporting “keep the lights on” on-call efforts.
- You will foster a culture of quality and ownership on your team by setting or improving code review and design standards for your team, and advocating for them beyond your team through your writing and tech talks.
- You will help develop talent on your team by providing feedback and guidance, and leading by example.
- On-call rotation: participating in an on-call rotation is a requirement of this role.

What We Look For:

- You have 4+ years of experience designing, developing, and launching backend systems at scale using scripting and development languages like Bash, Python, or Kotlin.
- You have a track record of developing highly available distributed systems using technologies like AWS, MySQL, and Kubernetes.
- You have meaningful experience contributing to or driving parts of the Incident Lifecycle process, enabling actionable insights that improve the quality culture, reliability, resilience, and system performance.
- You have 4+ years of experience working on a Site Reliability or Production Engineering team.
- You demonstrate curiosity with empathy, and strong opinions loosely held.
- You have experience defining a technical plan for the delivery of a significant feature or system component with an elegant, simple, and extensible design. You write high-quality code that is easily understood and used by others.
- You have experience in making impactful changes in a large ... (truncated, view full listing at source)