Senior SRE Engineer – Cloud Operations

Remote – Americas

Full-time

We are recruiting on behalf of a fast-growing AI infrastructure company that builds a high-performance vector database powering semantic search, RAG pipelines, AI agents, and large-scale machine learning applications.

We are seeking a Senior Site Reliability Engineer (SRE) to join the Cloud Operations team and help ensure reliability, observability, and operational excellence across production cloud environments.

This role is highly operations-focused and ideal for engineers who enjoy owning system reliability, improving automation, and operating large-scale distributed systems in production.

About the Role

As a Senior SRE, you will be responsible for maintaining and improving production infrastructure while reducing operational risk and improving system reliability at scale.

You will work closely with platform engineering and infrastructure teams to ensure systems remain secure, performant, and highly available as customer usage grows.

Location Requirements

Remote – Americas (North, Central, or South America)

Candidates must be able to work primarily within American time zones

Key Responsibilities

Cloud Infrastructure & Operations

Operate and maintain production cloud infrastructure at scale

Manage Kubernetes clusters, networking, and deployment pipelines

Improve reliability, performance, and security of production systems

Monitoring & Observability

Enhance monitoring, logging, and alerting systems

Improve operational visibility and incident detection

Incident Response & Reliability

Lead incident response and root cause analysis

Implement preventive measures and continuous reliability improvements

Participate in on-call rotations

Automation & Process Improvement

Reduce operational toil through automation and tooling

Maintain and improve runbooks and operational procedures

Collaboration

Work closely with platform engineering and infrastructure teams

Support scalable architecture and operational best practices

Requirements

5+ years of experience in DevOps, SRE, or infrastructure operations

Strong hands-on experience running Kubernetes in production

Solid understanding of:

Linux systems

Networking fundamentals

Cloud infrastructure (AWS, GCP, or Azure)

Experience with monitoring, alerting, and incident management

Experience with infrastructure automation or infrastructure-as-code

Comfortable participating in on-call rotations

Strong communication and problem-solving skills

Preferred Qualifications

Experience with Terraform or similar IaC tools

Familiarity with Prometheus, Grafana, Loki, or OpenTelemetry

Scripting experience in Python, Bash, or Go

Experience in SaaS, cloud platforms, or data infrastructure environments

Exposure to security, compliance, or system hardening

What's Offered

Competitive compensation and benefits

Fully remote work environment

Flexible working hours

Opportunity to work on mission-critical cloud infrastructure

Collaborative, engineering-driven culture

How to Apply

If you are passionate about reliability engineering, cloud infrastructure, and large-scale distributed systems, we would love to hear from you.

Apply Now