NLP Engineer & Computer Vision – Rebuild OCR→LLM Comic Translation Pipeline (Convex + Python) - Contract to Hire

Remote Full-time
We’re hiring an experienced Computer Vision + NLP Engineer to rebuild our entire Korean → English comic translation tool from scratch. The current system works, but we need a clean, modular, much faster, and more accurate version built on top of Convex instead of Supabase for real-time updates. You will replicate the existing workflow exactly, and improve it across accuracy, performance, and architecture. This is a rebuild from scratch. --- # Current Validated Workflow (What You Will Rebuild and Improve) Our existing tool processes full chapters with this pipeline: 1. Upload – Chapter images are uploaded. 2. Text Detection – CRAFT generates bounding boxes around text. 3. Text Extraction (OCR) – Gemini 2.5 Pro extracts Korean text inside each bounding box. 4. Panel Detection – OpenCV identifies comic panels in each image. 5. Panel Filtering – Gemini 2.5 Pro removes inaccurate/outlier panels. 6. Alignment – Remaining text boxes are matched to their correct panels. 7. Translation – Gemini 2.5 Pro produces English translations using panel and chapter context. This workflow is already validated and must behave the same, just faster, cleaner, more accurate, and modular. --- # Your Job in This Project Rebuild this entire system from zero with a modern, maintainable architecture that gives us: Better accuracy • More precise bounding boxes • Higher OCR accuracy (including stylized Korean fonts) • Better panel detection and filtering • More consistent, human-like translations Much faster overall performance • Dramatically reduced processing time per chapter • Efficient batching and async operations • Minimal latency from upload to final results A modular, replaceable architecture Every step must be isolated behind a clear interface so we can easily swap components: • Replace CRAFT → PaddleOCR / Donut / Yolov8 detector • Replace Gemini → GPT or another LLM • Replace panel detector without touching text logic • Swap OCR engines freely (Paddle, Donut, TrOCR, GPT fallback) Modular means no rewrites when upgrading models. Convex-based backend • Real-time updates streamed to the frontend • Job orchestration in Convex • Stable state management • Partial outputs instead of waiting for entire chapter completion --- # What You Must Deliver For The $2,000 Milestone 1. Fully rebuilt pipeline implementing all steps (upload → detection → OCR → panels → alignment → translation). 2. Modular architecture where detection, OCR, panel logic, and translation can be swapped independently. 3. Convex integration for real-time syncing, job progress, and results. 4. Significant accuracy improvements over the current system. 5. Significant performance improvements (faster processing end-to-end). 6. Clean project structure with documentation for all modules and interfaces. --- # Tech Stack You Will Use • Python – OCR, detection, panel processing, AI orchestration • TypeScript – Convex + frontend integration • Convex – backend database, jobs, and real-time sync • OCR Tools – CRAFT for text detection • LLMs – Gemini, GPT --- # Required Skills Must have • Strong OCR experience • Experience with LLM-based translation/localization • Python + TypeScript proficiency • Ability to design clean, modular system architectures • Experience rebuilding/refactoring complex pipelines --- # To Apply Please include: • Relevant projects (OCR, CV, LLM translation, or modular system rebuilds) • Examples where you improved accuracy, performance, or architecture • A short explanation of how you would: 1. Design a modular detection → OCR → panel → translation pipeline 2. Improve bounding boxes and OCR for stylized Korean fonts 3. Integrate Convex for real-time progress streaming to the frontend Apply tot his job
Apply Now

Similar Opportunities

Experienced Registered Behavior Technician for In-Home ABA Therapy - Atlanta, GA

Remote Full-time

Immediate Hiring: Experienced Registered Behavioral Technician (RBT) for Clinic-Based ABA Therapy Services

Remote Full-time

Experienced Registered Behavioral Technician (RBT) - ABA Therapy for Children with Autism Spectrum Disorder

Remote Full-time

Experienced Registered Nurse - Telehealth: Providing Remote Care Coordination and Patient Support

Remote Full-time

Experienced Substitute Teacher for Riverside County Schools - Join Scoot Education's Innovative Team

Remote Full-time

Experienced Substitute Teacher for San Bernardino County - Flexible Schedules & Competitive Pay

Remote Full-time

Experienced School Year Instructional Coach for High-Dosage Tutoring Programs in Edgewater Park, NJ

Remote Full-time

Experienced School Year Tutor for K-8 Students in Math and Literacy - Mickleton, NJ

Remote Full-time

Experienced Secondary Social Studies Teacher for Kansas - Flexible Hybrid Remote Arrangement

Remote Full-time

USPS Office Helper

Remote Full-time

Experienced Virtual Data Entry Clerk – Entry Level Remote Position for Detail-Oriented Individuals with Strong Organizational Skills

Remote Full-time

Licensed P&C Insurance Professional - Sales and Service (Signing Bonus)

Remote Full-time

Remote Customer Service & Data Entry Specialist – Flexible Home‑Based Role with Growth Path at Workwarp

Remote Full-time

Urgently Required Telemetry Nurse – St Joseph Health

Remote Full-time

Part-Time Yelp Spam Comment Remover (Multiple Locations)

Remote Full-time

Experienced Online Data Entry Specialist – Remote E-commerce Data Management and Entry Position with arenaflex

Remote Full-time

Experienced Customer Support Professional - Email and Live Chat Specialist for Dynamic Remote Team

Remote Full-time

**Experienced Full Stack Account Executive - Local Government Sales**

Remote Full-time

**Experienced Customer Service Representative – Remote Support Role with arenaflex**

Remote Full-time

Specialist, Executive Protection

Remote Full-time
← Back to Home