Software Engineer II - Site Reliability

About the Role

We're looking for a Software Backend Engineer with strong DevOps expertise to support our team managing restricted government cloud environments for federal customers. This role involves both building scalable infrastructure and supporting distributed systems, with a focus on reliability, performance, and compliance.

As a Software Backend and DevOps Engineer, you'll troubleshoot production issues, enhance infrastructure, and ensure a smooth, secure experience for federal users. You’ll also collaborate with cross-functional teams to drive continuous improvements in deployment, monitoring, and system design.

The ideal candidate:
- Brings a combination of technical depth and problem-solving skills
- Navigates ambiguity and works effectively in complex, distributed systems
- Collaborates across teams and escalates issues when needed to maintain progress
- Identifies and addresses inefficiencies in workflows and operations
- Contributes to clarity, accountability, and process improvement

Key Responsibilities
- Automate and build tools to eliminate repetitive operational tasks and reduce toil
- Maintain and scale reliable software applications using DevOps best practices
- Build and enhance CI/CD pipelines for automated testing, builds, and deployments
- Optimize and maintain Kubernetes-based orchestration systems for performance and reliability
- Troubleshoot complex production issues across application, infrastructure, and distributed system layers
- Participate in on-call rotations and support incident response
- Collaborate with stakeholders and product teams on infrastructure and deployment requirements
- Ensure compliance with government cloud standards across applications and infrastructure

Must-Have Qualifications
- Proven ability to maintain 99.99% uptime in production environments
- 6+ years of overall experience, including 3+ years in software development and 2+ years in DevOps practices
- 2+ years of experience with Kubernetes, Terraform, Python or Go, and AWS
- 2+ years of experience working with distributed systems
- Experience in fast-paced or startup-like environments
- Strong collaboration and communication skills across cross-functional teams and divisions
- Ability to ramp up quickly and contribute in complex, large-scale environments
- Demonstrated leadership in incident management and operational reliability

Nice-to-Have Qualifications
- Experience with FedRAMP compliance and government security requirements
- Track record of implementing secure CI/CD pipelines in restricted or regulated environments
- Familiarity with Redis, Kafka/PubSub, and relational databases

At Abnormal AI, certain roles are eligible for a bonus, restricted stock units (RSUs), and benefits. Individual compensation packages are based on factors unique to each candidate, including their skills, experience, qualifications and other job-related reasons. We know that benefits are also an important piece of your total compensation package. Learn more about our Compensation and Equity Philosophy on our Benefits & Perks page.

Base pay range: $148,800–$175,000 USD
San Francisco/New York base pay range: $165,800–$195,000 USD

Originally posted on Himalayas