AI Inference Engineer

Remote Full-time
Be part of the team creating the software foundation for next-generation AI compute platforms. In this role, you’ll work across the full stack, from low-level kernels and hardware-optimized operators to large-scale ML deployment frameworks, in close collaboration with compiler developers, ML scientists, and hardware specialists. This position offers the chance to contribute to state-of-the-art AI infrastructure, fine-tune software for custom hardware, and deepen your expertise in system software and machine learning.

Responsibilities (some of the following)

- Design, develop, and maintain components of the deployment stack and software kernels for AI compute platforms
- Optimize and implement core ML operators (e.g., GEMMs, convolutions, BLAS routines, SIMD kernels)
- Translate computational graphs from ML frameworks onto the underlying hardware
- Contribute to compiler infrastructure together with compiler and hardware teams
- Investigate and resolve issues through system-level debugging and performance analysis
- Deliver scalable software solutions under ambitious development schedules
- Define and apply practices for testing, deployment, and scaling AI systems

Minimum qualifications

- Bachelor’s degree in Computer Science, Engineering, Mathematics, or related discipline, with 3+ years of professional software development experience
- Solid knowledge of computer architecture, system software, and data structures
- Strong programming skills in C/C++ or Python in Linux environments using common development tools
- Hands-on experience implementing algorithms in high-level languages (C/C++/Python)
- Exposure to specialized hardware (GPUs, FPGAs, DSPs, AI accelerators) and frameworks such as OpenCL or CUDA
- Experience designing or working with high-performance software systems
- Solid knowledge of ML fundamentals
- Motivated team player with a strong sense of responsibility

You are a great fit if you have experience in at least one of the following areas:

- Model serving frameworks (e.g., Triton Inference Server, DeepSpeed Inference, vLLM)
- Deep learning frameworks (e.g., PyTorch, TensorFlow)
- ML runtimes (e.g., ONNX Runtime, TVM, IREE, XLA)
- Distributed collectives (e.g., Gloo, MPI)
- Software testing and validation methodologies
- Deploying ML workloads (LLMs, VLMs, NLP, etc.) across distributed systems
- Implementation of ML operators and kernels (e.g., SIMD routines, activation functions, pooling layers, quantization layers)
- Hardware-aware optimizations and performance tuning
- 2+ years of experience developing software targeting AI hardware

Contribution to open-source projects (e.g., LLVM, PyTorch, TensorFlow, ONNX Runtime, xDSL, IREE) is a big plus.
