Sr TechOps and SRE Lead (AWS Cloud)- REMOTE
Department: Technology / Engineering

Role Overview
We are seeking a highly experienced Sr TechOps and SRE Lead with deep expertise in AWS Cloud to lead our cloud infrastructure, DevOps practices, Site Reliability Engineering best practices, and overall operational excellence initiatives. This role is both strategic and hands-on: it is responsible for designing scalable architectures, improving automation, ensuring system reliability, and leading the TechOps team.

Key Responsibilities
Architect and manage secure, scalable, and highly available infrastructure on AWS.
Design multi-account AWS environments using AWS Organizations.
Implement VPC architecture, IAM policies, networking, and security best practices.
Oversee EC2, ECS/EKS, Lambda, RDS, S3, CloudFront, and related AWS services.
Optimize AWS cost management and resource utilization.

Reliability & Production Operations
Implement Site Reliability Engineering (SRE) best practices.
Define SLIs, SLOs, and error budgets.
Manage monitoring and alerting (CloudWatch, Datadog, Prometheus, Grafana).
Lead incident response, root cause analysis (RCA), and postmortems.
Ensure 24/7 uptime and operational resilience.

Security & Compliance
Implement IAM best practices and least-privilege access controls.
Manage secrets and key management (AWS KMS, Secrets Manager).
Conduct vulnerability management and patching.
Support compliance initiatives (SOC 2, ISO 27001, GDPR as applicable).
Lead disaster recovery planning and backup strategies.

Leadership & Strategy
Lead and mentor a team of DevOps/TechOps engineers.
Establish operational KPIs and performance benchmarks.
Manage on-call rotations and escalation processes.
Collaborate with Engineering, Product, Security, and Data teams.
Contribute to long-term infrastructure strategy and cloud roadmap.

Required Qualifications
Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
10+ years in DevOps, Cloud Engineering, or Infrastructure roles.
5+ years leading technical teams.
Strong hands-on experience with AWS services (EC2, EKS, RDS, S3, IAM, VPC, Lambda).
Deep knowledge of networking, Linux systems, and distributed systems.
Experience with Infrastructure-as-Code (Terraform or CloudFormation).
Strong scripting skills (Python, Bash, or similar).
Experience with containerization (Docker) and Kubernetes (EKS preferred).

Key Competencies
Strong architectural thinking
Hands-on technical leadership
Crisis and incident management
Strategic planning and execution
Excellent cross-functional communication

Success Metrics
99.9%+ production uptime
Reduced deployment lead time
Reduced incident frequency and MTTR
Improved cost efficiency
High-performing and scalable TechOps function