[Remote] AI Engineer (AI System Calibration & Optimization)

Remote Full-time
Note: This is a remote position open to candidates in the USA.

Robots & Pencils is seeking an outcome-oriented AI Engineer to partner with a strategic client on a high-impact AI system calibration and optimization engagement. The role involves embedding directly with the client's teams to improve their AI model's accuracy and reliability through systematic prompt optimization and calibration workflows.

Responsibilities
• Embed with the strategic client as their technical partner for AI system calibration and prompt optimization.
• Build production-grade calibration systems using Python within the client's Azure environment.
• Implement the DSPy framework and GEPA optimizer to systematically improve prompt quality and retrieval performance.
• Design and develop Golden Dataset curation workflows using Azure Data Labeling, establishing gold/silver data tier schemas.
• Create evaluation frameworks to measure model accuracy, precision/recall, latency, and hallucination rates.
• Architect prompt optimization pipelines for retrieval, context synthesis, and answer generation tailored to client needs.
• Own the path to production: evaluation pipelines, Azure ML workflows, KPI dashboards, and optimization automation.
• Iterate rapidly based on client feedback and KPI results; translate business goals into technical calibration improvements.
• Own end-to-end delivery of calibration systems from initial baseline to production-ready optimization workflows.
• Establish measurable KPIs and demonstrate accuracy improvements, latency reduction, and hallucination mitigation.
• Provide strategic guidance on RAG architecture improvements and retrieval parameter optimization.
• Accelerate client time-to-value through hands-on development and comprehensive knowledge transfer.
• Deliver operational playbooks and documentation enabling the client team to maintain calibration systems independently.
• Lead complex, multi-stakeholder calibration initiatives on-site and remotely; drive clarity, remove blockers, and keep execution on track.
• Set coding standards and architectural patterns for calibration components; write clear docs, runbooks, and technical specifications.
• Mentor client engineers through code reviews, pairing sessions, and technical workshops on DSPy, GEPA, and evaluation best practices.
• Make sound tradeoffs under real-world constraints: Azure cost optimization, data quality, performance requirements, and security.
• Align delivery with Robots & Pencils' responsible AI practices and client governance requirements.
• Work closely with the client's AI SMEs and product engineering teams to understand product catalog structure and validation workflows.
• Collaborate with internal R&P product, engineering, and delivery teams on calibration methodology and best practices.
• Share insights from the client engagement to improve R&P's prompt optimization frameworks and tooling.
• Contribute reusable patterns, evaluation frameworks, and documentation back to R&P's core platform.
• Collaborate across time zones with distributed teams.

Skills
• Bachelor's degree in Computer Science, Engineering, or equivalent experience.
• 7+ years of professional software development with significant ownership of architecture and delivery.
• 3+ years of Python in ML/AI systems with a strong focus on data processing and evaluation pipelines.
• 2+ years building with Generative AI, including hands-on prompt engineering and optimization work.
• Experience with prompt optimization frameworks: DSPy strongly preferred, or similar systematic approaches to prompt improvement.
• Deep understanding of RAG architectures: retrieval quality, latency/cost tuning, hallucination mitigation, and evaluation methods.
• Hands-on experience designing evaluation metrics and building assessment frameworks for LLM systems.
• Knowledge of systematic experimentation methods: A/B testing, parameter tuning, performance benchmarking.
• Experience with data curation, labeling workflows, and dataset quality management for AI systems.
• Strong Azure cloud experience with a focus on AI/ML services: Azure Machine Learning, Azure AI Search, Azure OpenAI Service.
• Experience with Azure Data Labeling, Azure Blob Storage, and Azure infrastructure fundamentals.
• Understanding of vector search platforms and retrieval optimization (Azure AI Search, Weaviate, Qdrant, Pinecone).
• Strong IaC background (Terraform or ARM templates) plus containerization and distributed systems knowledge.
• Solid SDLC practices: testing strategies, CI/CD, code reviews, observability, and operational excellence.
• Upper-intermediate English for client communication.
• Experience leading complex technical projects with multiple stakeholders.
• Strong communication skills for technical and executive audiences.
• Ability to context-switch and adapt to client environments.
• Willingness to travel to client sites.
• Direct hands-on experience with the DSPy framework and GEPA optimizer.
• Understanding of systematic optimization principles: evolutionary algorithms, Bayesian optimization, multi-objective optimization, and Pareto efficiency concepts.
• Familiarity with prompt optimization frameworks and methods, with experience in any of: MIPROv2, TextGrad, EvoPrompt, AutoPrompt, or reinforcement learning approaches (GRPO, PPO).
• Experience with LLM-as-judge patterns and automated evaluation pipelines.
• Knowledge of advanced RAG patterns (Adaptive RAG, Self-RAG, Corrective RAG) and retrieval evaluation methods (MRR, NDCG, precision@k).
• Understanding of agentic AI patterns (ReAct, Chain-of-Thought, Tool Use) and their application in RAG systems.
• Experience building evaluation dashboards with Azure Monitor, Application Insights, or similar observability tools.
• Familiarity with MLOps practices: model versioning, experiment tracking, metric logging for evaluation systems.
• Experience with AWS or GCP AI/ML platforms (Bedrock, SageMaker, Vertex AI) and cross-cloud architecture patterns.
• Experience with product catalog systems, cross-reference matching, or e-commerce search optimization.
• Background in manufacturing, industrial equipment, or technical specification systems.
• Prior consulting or professional services experience with enterprise clients.

Company Overview
• Robots & Pencils develops digital strategies and products that deliver exponential impact to its clients. It was founded in 2009 and is headquartered in Calgary, Alberta, Canada, with a workforce of 51-200 employees.

Company H1B Sponsorship
• Robots & Pencils has a track record of offering H1B sponsorships, with 1 in 2025, 1 in 2022, and 1 in 2021. Please note that this does not guarantee sponsorship for this specific role.

Apply to this job
