## Role Overview
We are seeking expert physics researchers to author and verify golden reference solutions for the **CritPt benchmark (arXiv:2509.26574v3)**, a frontier research-level physics benchmark. Depending on role, participants will solve CritPt problems end-to-end, audit solutions from other experts, or adjudicate between parallel solution attempts. The result is 100% human-verified reference data used to evaluate large language models on frontier physics reasoning.
## Physics Subdomains Covered
High Energy Physics & Mathematical Physics, Biophysics & Statistical Physics, Condensed Matter & AMO, Gravitation / Cosmology / Astrophysics, Quantum Information, Optical Properties of Materials, Magnetic Materials, Measurements in Quantum Mechanics.
## Key Responsibilities
- Solve research-level physics challenges end-to-end with verifiable derivations, code, and peer-reviewed references
- Decompose challenges into standalone checkpoint sub-problems that require genuine physical reasoning
- Author Python answer templates with auto-grading functions for symbolic or numerical answers
- Audit submitted solutions for correctness, scope, and method soundness; deliver actionable feedback across iterations
- Adjudicate between parallel solver attempts and decide which solution becomes the golden reference
- Document chain-of-thought reasoning, error tolerances, equivalent symbolic forms, and verification test cases
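
To give a feel for the answer-template and auto-grading work described above, here is a minimal sketch. It is not the actual CritPt template format: the function names (`answer`, `grade_numeric`, `grade_symbolic`), the tolerances, and the pendulum placeholder problem are all hypothetical and chosen only for illustration. The symbolic check here compares two closed forms at random sample points using the standard library; a SymPy-based check (for example, simplifying the difference of two expressions) would be a natural alternative.

```python
import math
import random
from typing import Callable


def answer() -> float:
    """Hypothetical template slot: the solver returns the final numerical answer."""
    # Placeholder problem (not from CritPt): small-angle period of a 1 m pendulum,
    # T = 2*pi*sqrt(L/g) with L = 1.0 m and g = 9.81 m/s^2.
    return 2 * math.pi * math.sqrt(1.0 / 9.81)


def grade_numeric(submitted: float, reference: float, rel_tol: float = 1e-3) -> bool:
    """Accept a numerical answer that lies within a stated relative tolerance."""
    return math.isclose(submitted, reference, rel_tol=rel_tol)


def grade_symbolic(
    submitted: Callable[[float], float],
    reference: Callable[[float], float],
    samples: int = 20,
    rel_tol: float = 1e-9,
    seed: int = 0,
) -> bool:
    """Probabilistic equivalence check for closed forms of one variable.

    Evaluates both expressions at random points in (0.1, 2.0) and rejects
    on the first mismatch. Passing is strong evidence of equivalence on
    that interval, not a proof.
    """
    rng = random.Random(seed)
    for _ in range(samples):
        x = rng.uniform(0.1, 2.0)
        if not math.isclose(submitted(x), reference(x), rel_tol=rel_tol, abs_tol=1e-12):
            return False
    return True


if __name__ == "__main__":
    # Numerical answer graded against a hypothetical reference value.
    print(grade_numeric(answer(), 2.006))
    # Two equivalent symbolic forms: sin(2x) and 2 sin(x) cos(x).
    print(grade_symbolic(lambda x: math.sin(2 * x),
                         lambda x: 2 * math.sin(x) * math.cos(x)))
```

Fixing the random seed keeps grading reproducible across runs, and stating the tolerance as an explicit parameter is one way to document the accepted error alongside the reference answer.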
## Ideal Qualifications
- **Solver:** PhD or postdoc in the relevant subfield (senior PhD student minimum)
- **Auditor:** Postdoc or junior professor in the relevant subfield (PhD minimum)
- **Adjudicator:** Full professor or industry research PI in the relevant subfield (senior postdoc or junior professor minimum)
- Hands-on familiarity with at least two canonical methods of the target subfield, demonstrable through publications (broader coverage strongly preferred)
- 3–5 representative publications (arXiv ID or DOI), ideally within the last ~5 years and in the target subfield
- Working proficiency with LaTeX, Python, Jupyter, and SymPy
- Strong written English (CEFR B2 or higher; native or near-native preferred)
## More About the Opportunity
- Expected commitment: ~10 hours/week, sustained across an 8–10 week window per task pool
- Pay range: $80–$140 per hour, based on role and demonstrated expertise
- Asynchronous work