AI Systems Engineer (LLM Performance, Cost & Reliability) | Audit → Recommend → Implement

Remote Full-time
Overview

Jules is a mobile AI-powered style and dating photo coach. We analyze outfit photos and dating profile images, score them, and give actionable feedback using LLMs and vision models. The product is live, architected, and thoughtfully built. What we need now is systems-level optimization.

We’re looking for a senior engineer to audit, optimize, and harden our LLM infrastructure, reducing latency and cost while improving reliability and consistency, without changing product flows or UX.

This is not a greenfield build. This is not prompt polishing. This is a real production system that needs to scale.

What You’ll Do

Phase 1

Audit all LLM usage across the system:
- FitCheck (vision)
- PicReview (vision)
- Comparison modes
- Conversational chat

Analyze:
- Latency bottlenecks (user-perceived and backend)
- Cost per request / feature / user
- Model usage vs actual requirements
- Prompt size, retries, determinism, and waste

Review existing cost instrumentation and update pricing assumptions.

Deliverable: A written audit outlining:
- Current performance & cost profile
- Clear problem areas
- Ranked list of optimization opportunities with estimated impact

Phase 2: Optimize & Implement

Implement agreed optimizations directly in the codebase, which may include:
- Multi-model routing (cheap → expensive fallback)
- Vision + text model rationalization
- Caching (hash-based, context-based, or result reuse)
- Async coordination improvements (queues, batching, retries)
- Prompt minimization and structural refactors (not stylistic rewrites)
- More accurate cost tracking and reporting

Ensure output stability and scoring consistency are preserved.

Deliverable:
- Merged code changes
- Before/after latency and cost comparison
- Clear documentation of decisions and tradeoffs

What You Will Not Do

To be explicit:
❌ Redesign product flows, UX, or scoring logic
❌ Rewrite Jules’ persona or tone
❌ “Improve” the product by adding features
❌ Push unnecessary infra churn before instrumentation
❌ Suggest fine-tuning as a first solution

Your job is to make the engine faster, cheaper, and more reliable, not change the car.

Technical Environment (You’ll Be Working Inside This)

- Frontend: React Native (Expo, TypeScript)
- Backend: Node.js + Express
- Database: MongoDB
- AI: OpenAI (GPT-4o for vision, GPT-4.1-mini for chat)
- Infra: Cloudinary (images), Firebase Auth, Segment, Sentry
- Architecture: Async API calls, structured JSON outputs, prompt routing system

Full architecture documentation will be provided on engagement start.

What We’re Looking For

Required
- Deep experience optimizing production LLM systems
- Strong intuition for cost vs latency vs quality tradeoffs
- Hands-on backend engineering skills (Node.js)
- Experience with model routing, async systems, caching strategies, and deterministic LLM outputs

Nice to Have
- Vision model experience
- Experience evaluating multiple inference providers
- Prior startup or zero-to-scale experience

Engagement Details

- Type: Short-term contract
- Length: TBD
- Scope: Audit → Recommend → Implement
- Potential extension: Yes, based on results
- Timezone: Flexible, but on Pacific Time
