Lead MLOps Engineer

Remote Full-time
Location: Bangladesh, South Asia (Remote)
Department: Software Engineering
ABOUT NEXGEN CLOUD:
NexGen Cloud is the company behind Hyperstack, a full-stack AI cloud serving tens of thousands of customers, from AI researchers to enterprises running the world's most compute-intensive workloads. We deliver on-demand and private GPU infrastructure to teams who treat performance as a requirement, not a feature.
We're a tight-knit, fast-moving team working at the cutting edge of AI cloud infrastructure. We practice what we preach, equipping our people with AI at every level so we can solve harder problems, ship faster, and keep raising the bar for what enterprise GPU infrastructure looks like.
THE ROLE: Lead MLOps Engineer
This role exists because Hyperstack is scaling its AI cloud platform and building out the infrastructure that powers production ML workloads for thousands of customers. As AI Studio capabilities grow and the platform takes on increasingly complex training, fine-tuning, and inference workloads, we need someone to own the MLOps layer — the systems, tooling, and practices that make large-scale AI workloads reliable, observable, and repeatable in production. You’ll have direct ownership over ML platform reliability, deployment workflow engineering, and the operational standards that underpin how AI workloads run on Hyperstack — end to end.
Role positioning:
This is a lead individual contributor role. You’ll set the technical direction for MLOps on the platform, work directly with Product and Engineering, and take end-to-end ownership of the systems that make AI workloads run in production. No hand-holding, lots of impact.
WHAT YOU’LL BE DOING
Rather than a long checklist, here’s what success in this role looks like:

Own the design, implementation, and evolution of core MLOps systems across Hyperstack — including the infrastructure and workflows that underpin AI Studio
Build and improve systems that orchestrate model training, fine-tuning, evaluation, and deployment, engineered for long-running, resource-intensive GPU workloads
Own production readiness across ML infrastructure — monitoring, alerting, incident response, and continuous improvement based on real-world usage
Define and embed strong MLOps practices across teams — model versioning, reproducibility, deployment safety, rollback strategies, and environment management
Provide technical leadership through architecture decisions, implementation guidance, and shared standards — working closely with Product, Engineering, and cross-functional teams

ABOUT YOU:
We’re more interested in how you think and work than in a perfect CV. You’ll likely bring a combination of the following:
Essential

Proven experience designing, building, and operating production ML infrastructure, platform systems, or MLOps workflows in cloud environments

Hands-on Python development skills, with experience building backend systems, automation, and developer or platform tooling
Experience supporting LLM, generative AI, or fine-tuning workflows in production — including training, evaluation, deployment, inference, and lifecycle management
Production-grade experience with Docker, Kubernetes, CI/CD, and infrastructure-as-code in real, operational environments
Experience owning complex, asynchronous, or resource-intensive workloads end to end — including orchestration, reliability, observability, and incident response
Ability to work cross-functionally and provide technical leadership through influence — shaping standards, direction, and ways of working across engineering teams



Nice to Have

Exposure to GPU-intensive, distributed, or performance-sensitive ML workloads
Experience building internal developer platforms or tooling that improve experimentation, reproducibility, and delivery speed for ML teams
Background in cloud infrastructure, platform products, or technically complex B2B software

WHAT WE OFFER

Competitive salary and annual discretionary bonus scheme
Employee wellbeing benefits
25 days of holiday, plus public holidays
Flexible working arrangements (remote or hybrid, depending on role and location)
Real ownership and autonomy, with the trust to take initiative and experiment
The opportunity to make a visible, meaningful impact as we scale
Clear career progression and growth opportunities in a fast-growing company
A collaborative, international culture built on trust, transparency, and ownership
The chance to help shape NexGen Cloud’s team, culture, and future alongside ambitious, mission-driven colleagues

MORE INFORMATION
Head over to our NexGen Cloud careers page to view current openings, and follow us on LinkedIn and X to learn more about our journey and newest releases and to hear the latest news in the neocloud space.


