[Remote] Engineering Manager, Machine Learning, Model Evaluations and Data Curation (AI Foundations)

Remote Full-time
Note: This is a remote position open to candidates in the USA.

Netflix is a leading entertainment company focused on pushing the boundaries of storytelling and technology. They are seeking an Engineering Manager to lead a team responsible for model evaluations and data curation for large language models, ensuring that these models improve personalization and discovery for users.

Responsibilities

• Partner with downstream AI application teams to define shared evaluations that codify application expectations of LLMs and other foundation models, ensuring progress can be transparently tracked against real-world needs
• Design rigorous benchmarks and evaluation methodologies across ranking & recommendations, content understanding, and language/text generation, grounded in a deep technical understanding of LLMs, their strengths, limitations, and failure modes
• Lead the development of evaluators and strong baselines to ensure in-house LLMs and other foundation models demonstrate clear advantages over off-the-shelf alternatives
• Build scalable, reproducible data and evaluation systems that make dataset creation and evaluation design as nimble and experiment-friendly as model development itself
• Hire, grow, and nurture a world-class team, fostering an inclusive, high-performing culture that balances research innovation with engineering excellence
• Work closely with the teams developing Netflix's foundation models (including our core LLM) to ensure evaluation and data insights are folded back into the cadence of model development. Proactively influence the ML Platform and Data Engineering teams at key interfaces

Skills

• Experience building and leading high-performing teams of ML researchers and engineers
• Proven track record of leading machine learning initiatives from research to production, ideally involving evaluation frameworks, ML infrastructure, or data-intensive systems
• Strong technical expertise in LLMs, their evaluation, and practical methods for ensuring robustness, reproducibility, and quality
• Broad knowledge of machine learning fundamentals and evaluation methodologies, including benchmark design, model-based evaluators, and offline/online metrics
• Experience driving cross-functional projects, including close collaboration with AI application teams to translate product needs into evaluation frameworks
• Excellent written and verbal communication skills, able to bridge technical and non-technical audiences
• Advanced degree in Computer Science, Statistics, or a related quantitative field
• 8+ years of overall experience, including 3+ years in engineering management
• Experience with large-scale ML systems and foundation models, especially LLMs
• Background in building evaluation frameworks, model benchmarking, or data infrastructure for LLM training
• Familiarity with multi-modal data and evaluation

Benefits

• Health Plans
• Mental Health support
• A 401(k) Retirement Plan with employer match
• Stock Option Program
• Disability Programs
• Health Savings and Flexible Spending Accounts
• Family-forming benefits
• Life and Serious Injury Benefits
• Paid leave of absence programs
• Full-time hourly employees accrue 35 days of paid time off annually, to be used for vacation, holidays, and sick time.
• Full-time salaried employees are immediately entitled to flexible time off.

Company Overview

• Netflix is an online streaming platform that enables users to watch TV shows and movies. It was founded in 1997 and is headquartered in Los Gatos, California, USA, with a workforce of 10,001+ employees.

Company H1B Sponsorship

• Netflix has a track record of offering H1B sponsorships, with 310 in 2025, 309 in 2024, 191 in 2023, 261 in 2022, 268 in 2021, and 225 in 2020. Please note that this does not guarantee sponsorship for this specific role.


