Senior Site Reliability Engineer

Remote · $170k · Senior · Full-time

Kotlin, Modern Java (11+), HTTP, JSON, gRPC, Protocol Buffers, MySQL, Vitess, DynamoDB, Event driven architectures, DataDog, LaunchDarkly

Build and extend platforms to improve system reliability
Work on team goals that encompass reliability for the entire company
Standardize reliability tools across multiple platforms and organizations
Triage, coordinate, and lead stabilization of sev 0–1 incidents
Serve as primary oncall, maintaining structured escalation paths and escalating to leadership when needed
Drive platform-wide reliability improvements, shared operational tooling, and deploy-safety patterns
Use AI-driven systems to improve signal detection, reduce noise, and accelerate root cause analysis
Design and implement safe deployment patterns (progressive delivery, automated rollback, guardrails)

Drive root cause analysis on systems with many moving parts and take the necessary steps to fix them
Demonstrated technical initiative and leadership on previous projects, especially those with a backend/platform focus
Familiarity with AI-driven tooling for observability, incident analysis, or automation
A mindset that naturally reaches for AI to accelerate problem-solving and reduce toil
Experience running production oncall for high-availability systems
Strong incident management skills — structured triage, mitigation under pressure, blameless postmortems
Fluency with CI/CD pipelines, progressive rollout strategies, and rollback automation
Monitoring & observability expertise — building/tuning alerts for uptime, error rates, latency regression, and resource exhaustion
Ability to create and maintain evidence-based maturity assessments using trailing 90-day data windows
Comfort with vendor/dependency management, including maintaining validated escalation contacts reachable within 5 minutes
Boundless curiosity, autonomy, and a strong sense of accountability
A strong desire to perform and grow as an engineer
5+ years of software development experience

This program shifts Block from reactive incident handling to repeatable, system-wide reliability gains — fewer customer-visible incidents, faster response, higher product velocity, and lower burnout across the organization.
Block takes a market-based approach to pay, and pay may vary depending on your location. U.S. locations are categorized into one of four zones based on a cost of labor index for that geographic area. The successful candidate’s starting pay will be determined based on job-related skills, experience, qualifications, work location, and market conditions. These ranges may be modified in the future.
Zone A: USD $189,000 - USD $283,600
Zone B: USD $179,600 - USD $269,400
Zone C: USD $170,100 - USD $255,100
Zone D: USD $160,700 - USD $241,100
