Architect of Platform Engineering – AI Supercompute Infrastructure | Cloud & Engineering

Remote Full-time
We are a technology consulting firm building and operating next-generation AI supercompute infrastructure for the world's most ambitious organizations. As Architect of Platform Engineering, you will own the full stack, from bare metal and operating system up through cluster orchestration, job scheduling, and observability, across engagements with leading enterprise and public sector clients pushing the frontier of AI adoption.
As a repeatedly awarded NVIDIA Consulting Partner of the Year in EMEA, we hold one of the deepest and most recognized NVIDIA partnerships in the region. This gives our engineers privileged access to adoption programs and NVIDIA's engineering teams.
You will work with technology, and at a scale, that most engineers won't encounter for years.
This role sits at the intersection of deep technical ownership and client-facing leadership. You will shape platform strategy for clients, embed within their teams, and deliver outcomes that define how large-scale AI infrastructure is built and run. You will work in close partnership with NVIDIA to bring cutting-edge GPU architecture and software capabilities directly to client environments.

We are looking for a hands-on technical Architect who thrives in entrepreneurial, high-velocity environments.

What We Expect:
8+ years of hands-on infrastructure and platform engineering experience, including full ownership of production systems
Cluster orchestration expertise: cluster architecture, control plane operations, custom controllers/operators, multi-tenancy, and large-scale fleet management
Experience with Slurm or other HPC/AI workload schedulers: job queuing, fair-share scheduling, MPI integration
Strong Linux internals knowledge: kernel tuning, cgroups, namespaces, NUMA topology, hugepages, and storage subsystems
Familiarity with high-speed networking (InfiniBand, RoCE, RDMA) and with tuning it for distributed training workloads
Infrastructure as Code fluency: Terraform, Ansible, Helm, or equivalent
Demonstrated ability to lead technical engagements with enterprise clients: translating ambiguous requirements into clear deliverables, managing stakeholders across seniority levels, and navigating complex organizational dynamics
Entrepreneurial mindset: comfortable operating with autonomy and moving fast without sacrificing rigor

Bonus Experience:
Proven experience managing NVIDIA GPU infrastructure: driver lifecycle, CUDA toolchain, MIG/MPS partitioning, NVLink/NVSwitch topologies, and GPUDirect RDMA
Familiarity with NVIDIA Base Command Platform, DGX SuperPOD, or CSP GPU cloud deployments
Experience with DCGM or other GPU profiling and telemetry tooling
Prior consulting, professional services, or client delivery experience in an infrastructure or cloud practice
Contributions to open-source platform tooling or CNCF ecosystem projects
