Job Description
About this role
We are seeking an Application Production Support Engineer to support and operate critical developer platform services. This role is operations- and support-focused and is not a software development position. The successful candidate will play a key role in ensuring platform stability, rapid incident resolution, and continuous improvement of our support and operational processes.
Key Responsibilities:
Production Support & Operations
Provide day‑to‑day production support for core developer platform services, including artifact management, CI/CD tooling, build systems, and release pipelines
Triage, troubleshoot, and resolve platform incidents and service degradation in partnership with engineering teams
Act as a point of escalation for complex platform issues, ensuring timely resolution and clear communication to stakeholders
Participate in on‑call or support rotations as required
Reliability & Continuous Improvement
Analyze recurring incidents and operational pain points to identify underlying reliability gaps
Drive proactive improvements in automation, monitoring, alerting, and observability to reduce manual effort and incident volume
Contribute to post‑incident reviews and root cause analysis, ensuring learnings are captured and actions are tracked to completion
Operational Excellence
Formalize and standardize support processes, runbooks, and operating procedures across developer platform services
Improve documentation quality and accessibility to enable faster issue resolution and self‑service by engineering teams
Design and implement structured support workflows, escalation paths, and service engagement models
Project & Stakeholder Management
Own and deliver small operational initiatives end‑to‑end, coordinating across platform, infrastructure, and engineering teams
Partner closely with developers to understand usage patterns and operational requirements
Contribute to improving overall operational maturity and service quality of the Developer Platform
Required Qualifications:
4–5 years of experience in Production Support, Application Support, DevOps Support, or Platform Operations within a large-scale enterprise environment
Strong troubleshooting experience across distributed systems and multi-tier applications
Hands-on experience supporting CI/CD platforms (e.g., Jenkins, GitHub Actions, Azure DevOps or similar)
Experience with artifact repositories and container registries (e.g., JFrog Artifactory, Azure Container Registry (ACR) or similar)
Solid understanding of build and release processes across Java, .NET, or containerized workloads
Proficiency in Linux environments and command-line troubleshooting
Scripting ability in Python, Bash, or PowerShell to drive automation and operational efficiency
Experience working in cloud environments (Azure preferred)
Familiarity with monitoring and observability tools (e.g., Splunk, Prometheus, Grafana, or similar)
Experience with incident management practices, root cause analysis, and post-incident reviews
Strong documentation skills with the ability to formalize and standardize operational processes
Preferred Qualifications:
Experience driving automation or reliability improvements in a production support or SRE‑adjacent role
Familiarity with observability tools, monitoring, and incident management practices
Experience coordinating cross‑team initiatives or owning operational improvement projects
Background in structured support models, service management, or platform operations
Special Note: This role requires mandatory coverage starting at 7:00 AM PST to ensure operational overlap with our EMEA and APAC regions. On Fridays, hours are 9:00 AM to 5:00 PM.

For SF4-San Francisco - 400 Howard Street only, the salary range for this position is USD$132,500.00 - USD$162,000.00. Additionally, employees are eligible for an annual discretionary bonus, and benefits including healthcare, leave benefits, and retirement benefits. BlackRock operates a pay-for-perform ...