Mercor

Code-Data Eval Author — Machine Learning Engineer (Pilot)

Remote — Americas & | $45 - $140 | Posted Yesterday
Contract | Remote

Job Description

**Code-Data Eval Author — Machine Learning Engineer** (Mercor · remote contract)

Mercor partners with frontier AI labs to build the evaluations their models are trained and measured against. You'll design ML/LLM evaluation tasks and rubrics and grade model/agent outputs — your training-side knowledge directly shapes reward and eval signals.

**What you'll do**

- Design ML/LLM evaluation tasks, rubrics, and metrics
- Grade model/agent outputs and improve eval quality through review
- Bring training-side judgment (SFT / RLHF / reward modeling) to eval design

**You are**

- ~5+ years as an MLE at a real product organization with hands-on training/fine-tuning and evals
- Ideally fluent in SFT / RLHF / reward modeling / eval metrics (rare, high-leverage here)
- PyTorch/JAX, Hugging Face, experiment tracking; clear written communication

**Engagement & pay**

- Remote contract, flexible 30+ hrs/week
- Hourly rate set to your local market (e.g., US/Canada $100–140/hr; Europe and LatAm scaled to region)

**Hiring process — paid**

A short Mercor Technical Screen, a live Code Review Session, and a Domain Expert Interview. You're paid $200 for completing all three, regardless of outcome.
