Job Description
**PLEASE NOTE: You must take the Bilingual Competency interview in Assamese to be considered for this role.**
**Location**: Global
**Type**: Contract Work
**Fluent Language Skills Required:** Assamese (native fluency) and English (strong proficiency)
**Why this role matters:** You will assess AI-generated responses written in Assamese and identify their specific strengths and areas for improvement. Your work will be used to create the "perfect AI-generated response" at a later stage of this project. Note that your analysis will be written in English.
**What You'll Do**
- Conduct fact-checking using trusted public sources and external tools
- Generate high-quality human evaluation data by identifying response strengths, areas for improvement, and factual inaccuracies
- Assess reasoning quality, clarity, tone, and completeness of responses
- Ensure model responses align with expected conversational behavior and system guidelines
**Who You Are**
- You hold a **Bachelor's degree**
- You are a **native speaker** of **Assamese**
- You have **significant experience using large language models** (LLMs) and understand how and why people use them
- You have **excellent writing skills in English** and can clearly articulate nuanced feedback
- You have **strong attention to detail** and consistently notice subtle issues others may overlook
- You have a background or experience in domains requiring **structured analytical thinking** (e.g., research, policy, analytics, linguistics, engineering)
**Nice-to-Have Specialties**
- Prior experience with **RLHF, model evaluation, or data annotation work**
- Experience writing or editing **high-quality written content**
- Experience comparing multiple outputs and making **fine-grained qualitative judgments**
**What Success Looks Like**
- You identify factual inaccuracies, reasoning errors, and communication gaps in model responses
- You produce clear, consistent, and reproducible evaluation artifacts
- Your feedback leads to measurable improvements in response quality and user experience
