Site Reliability Engineering Ops Team Lead

Remote, Full-time
Department: G&A
Location: United States (Remote)

Are you ready to take the lead in ensuring the heartbeat of ECI's production systems never skips? As the Site Reliability Engineering (SRE) Ops Team Lead, you'll be a hands-on, operations-driven technical leader charged with driving the availability, stability, and operational excellence of our critical systems across hybrid cloud and on-prem environments.

This role is your chance to own the full spectrum of operational responsibilities, from incident command to continuous improvement, while collaborating closely with Product, Development, and Infrastructure teams. You'll be at the forefront of delivering 24/7/365 uptime, meeting SLAs, optimizing performance, and balancing cost-efficiency in a fast-paced, mission-critical environment.

What You'll Own

Operational Excellence & Service Reliability
Take full ownership of the day-to-day operation and support of our always-on production systems.
Lead high-stakes incident response with precision: triage, coordinate, communicate, and resolve.
Drive impactful post-incident reviews that fuel continuous operational improvements and prevent future issues.
Enforce runbooks, SOPs, and escalation paths to keep operations smooth and predictable.
Monitor and elevate uptime, SLIs, SLOs, error budgets, and MTTR with a relentless focus on results.
Oversee and optimize on-call rotations and operational readiness to keep the team sharp.

Observability, Telemetry & Alerting
Own the monitoring and alerting landscape to ensure no critical signal goes unnoticed.
Master and optimize observability platforms like Coralogix to extract actionable insights and improve alert quality.
Refine alerting strategies and incident workflows using Coralogix and FireHydrant to reduce noise and boost response effectiveness.
Build and maintain real-time dashboards that provide crystal-clear visibility into service health.

Operational Automation & Infrastructure Practices
Champion automation initiatives that slash manual toil and accelerate incident response.
Implement GitOps practices to deliver consistent, auditable, and reliable operational changes.
Contribute to Terraform-driven infrastructure with a sharp eye on operability and maintainability.
Review changes rigorously to ensure operational impact, resiliency, and supportability are top-notch.

FinOps & Capacity Management
Lead the charge on operational cost awareness and drive initiatives to optimize spend through right-sizing and waste reduction.
Partner on capacity planning and demand forecasting to guarantee system stability under all conditions.
Make smart trade-offs balancing cost, performance, and reliability in every operational decision.

Team Leadership & Operational Coordination
Inspire and mentor SRE team members with hands-on leadership and operational expertise.
Be the go-to escalation point for production issues and operational challenges.
Collaborate seamlessly across Product, Development, Infrastructure, and Support teams to ensure flawless service delivery.
Cultivate a culture of operational discipline, accountability, and relentless improvement.
Engage actively in Agile ceremonies and manage operational workflows through Jira.

What You Bring
Deep hands-on experience in production operations, SRE, DevOps, or Infrastructure roles.
Proven success operating and supporting production systems in hybrid cloud and on-prem environments.
Expertise in incident management, on-call best practices, and operational processes.
Proficiency with GitOps workflows, Terraform, and observability tools.
Strong communication skills with the ability to lead confidently during high-pressure incidents and coordinate cross-team efforts.

Bonus Points For
Bachelor's degree in Computer Science, Engineering, or related field, or equivalent practical experience.
5+ years in SRE, DevOps, Infrastructure, or Production Operations roles.
Cloud certifications such as AWS, Azure, or Google Cloud.
Experience in Agile/Scrum environments and Jira-based work management.
Background supporting high-availability, customer-facing SaaS platforms.

Why You'll Love Working at ECI
Lead the charge as a hands-on technical leader supporting mission-critical systems.
Work with cutting-edge SRE, observability, and automation technologies.
Collaborate with talented global engineering and product teams.
Enjoy competitive compensation, comprehensive benefits, and exciting growth opportunities.

The International Traffic in Arms Regulations (ITAR) is the United States regulation that controls the manufacture, sale, and distribution of defense and space-related articles and services as defined in the United States Munitions List (USML). Besides rocket launchers, torpedoes, and other military hardware, the list also restricts the plans, diagrams, photos, and other documentation used to build ITAR-controlled military gear. This is referred to by ITAR as "technical data". ITAR mandates that access to physical materials or technical data related to defense and military technologies is restricted to US Persons only.

© 2026 ECI Software Solutions