Senior / Lead Machine Learning Engineer, Serving - Serbia

Remote Full-time
About Inworld

Inworld is a product-oriented research lab of top AI researchers and engineers, developing best-in-class realtime multimodal models and the only realtime orchestration platform optimized for thousands of queries per second. We've raised more than $125M from Lightspeed, Section 32, Kleiner Perkins, Microsoft's M12 venture fund, Founders Fund, Meta and Stanford, among others. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella and Bible Chat. We've also been recognized by CB Insights as one of the 100 most promising AI companies globally and have been named one of LinkedIn's Top 10 Startups in the USA.

Who We're Looking For

A year ago, reliably working agentic systems and sub-second multimodal inference at scale barely existed. Nobody has a decade of experience here. So we're not screening for a resume template; we're looking for strong people from varied backgrounds who learn fast, thrive in ambiguity, and can show us what they've built, broken, and understood.

Experience We Find Useful

You don't need all of this. But you need enough to make a case.

Inference Optimization. Deep understanding of modern serving techniques and frameworks such as vLLM or TRT-LLM.

Model Acceleration. Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding.

High-Performance Systems. Proficiency in C++, CUDA, Rust, or highly optimized Python. You know how to profile code and squeeze every ounce of performance out of NVIDIA GPUs.

Distributed Systems & Scaling. Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections.

Public work. Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups.

Full-cycle ownership. You can take a model from the research team, containerize it, optimize its serving, and ensure it runs reliably in production.

Background. PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems.

Professional fluency in English (written and spoken) is required, as you will be collaborating daily with our US-based leadership and engineering teams.

Who Thrives Here

You don't need a roadmap to start walking; you're comfortable picking a direction and building the map as you go.

You believe engineering isn't finished until it's shipped and stable. You have a bias for impact over purely theoretical optimizations.

You don't just ship code; you obsess over the why. You're the first to question an architecture if you think there's a better way to solve the core latency or throughput problem.

You aren't satisfied with "the PM said so." You thrive on deep context and want to understand the fundamental logic behind every decision we make.

What Working Here Is Like

We hand you unclear problems and expect you to make them clear.

We value engineers who say "I don't know yet" and then design the benchmark or prototype that finds out.

We treat performance, latency, and reliability as first-class product features, not a box to check before launch.

Impact comes before everything else, though we support sharing work and open-source contributions that move the field forward. Your work should be visible.

Flat structure, fast iterations, minimal process theater.

For candidates interested in relocating to the San Francisco Bay Area in the future, full U.S. visa and relocation support may be available, subject to business needs and applicable legal and work authorization requirements.
