[Remote] AI Researcher — Inference Optimization

Remote Full-time
Note: This is a remote position open to candidates in the USA.

Featherless AI is seeking an AI Researcher with deep experience in inference optimization to design, evaluate, and deploy high-performance inference systems for large-scale machine learning models. The role involves improving latency, throughput, and cost efficiency across real-world production environments by developing techniques to optimize inference performance and collaborating with engineering teams to deploy optimized pipelines.

Responsibilities
• Research and develop techniques to optimize inference performance for large neural networks
• Improve latency, throughput, memory efficiency, and cost per inference
• Design and evaluate model-level optimizations (quantization, pruning, KV-cache optimization, architecture-aware simplifications)
• Implement systems-level optimizations (dynamic batching, kernel fusion, multi-GPU inference, prefill vs decode optimization)
• Benchmark inference workloads across hardware accelerators
• Collaborate with engineering teams to deploy optimized inference pipelines
• Translate research insights into production-ready improvements

Skills
• Strong background in machine learning, deep learning, or AI systems
• Hands-on experience optimizing inference for large-scale models
• Proficiency in Python and modern ML frameworks (e.g., PyTorch)
• Experience with inference tooling (e.g., Triton, TensorRT, vLLM, ONNX Runtime)
• Ability to design experiments and communicate results clearly
• Experience deploying production inference systems at scale
• Familiarity with distributed and multi-GPU inference
• Experience contributing to open-source ML or inference frameworks
• Authorship or co-authorship of peer-reviewed research papers in machine learning, systems, or related fields
• Experience working close to hardware (CUDA, ROCm, profiling tools)

Company Overview
• We enable serverless inference via our GPU orchestration and model load-balancing system.
• The company was founded in 2023 and is headquartered in San Francisco, California, USA, with a workforce of 2-10 employees.

Company H1B Sponsorship
• Featherless AI has a track record of offering H1B sponsorships, with 1 in 2025. Please note that this does not guarantee sponsorship for this specific role.
