Technical Reviewer - RL Environment Terminal Benchmarking (Agentic AI)

Remote Full-time
This description is a summary of our understanding of the job description. Click the 'Apply' button to find out more.

Role Description
Mercor is hiring a Technical Reviewer on behalf of a leading AI lab to evaluate and refine benchmarking pipelines for reinforcement learning (RL) environments and agentic AI systems. In this role, you’ll be responsible for reviewing environment design, terminal conditions, and evaluation protocols to ensure accuracy, reproducibility, and fairness in benchmarking. You’ll work closely with researchers and engineers to provide technical feedback that strengthens experimental rigor and system reliability.

Qualifications
Background in reinforcement learning, computer science, or applied AI research
Experience with RL environments
Understanding of benchmarking methodologies, terminal conditions, and evaluation metrics for RL tasks
Comfortable reading and reviewing codebases in Python (PyTorch/TensorFlow a plus)
Strong critical thinking skills and ability to provide structured technical feedback
Deep commitment to experimental reproducibility, fairness, and standardization in agentic AI
Detail-oriented and capable of reviewing both theoretical formulations and implementation details

Requirements
Review RL environments and evaluate terminal conditions for correctness and consistency
Assess benchmarking pipelines for fairness, reproducibility, and alignment with research objectives
Provide structured technical feedback on code implementations and documentation
Collaborate with researchers to refine evaluation metrics and methodologies
Ensure reproducibility by validating results across different runs, seeds, and hardware setups
Document findings and recommend improvements for environment design and benchmarking standards

Benefits
Directly influence the reliability of benchmarking in agentic AI research
Work on cutting-edge RL environments that test the limits of intelligent agents
Help establish standards for evaluation and reproducibility in a fast-moving field
Collaborate with researchers shaping the future of agentic AI systems

Pay & Work Structure
Classified as a full-time hourly contractor to Mercor
Paid weekly via Stripe Connect, based on hours logged
40 hours/week commitment with flexible scheduling
Remote and flexible working style
