Back to jobs
Job Description
- Build advanced classifiers and data pipelines to detect misuse, owning the end-to-end process from automated evaluation to rapid model iteration.
- Build cross-context monitoring systems to detect coordinated harms, developing novel signal aggregation methods across disparate user sessions to identify large-scale attack vectors.
- Implement data-driven, semi-automated account-level response systems to detect, track, and apply strikes against persistent malicious actors using rich signals from production traffic.
- Evaluate and secure agentic AI systems by developing threat models, creating testing environments, and deploying robust mitigations against frontier-level agentic hacking and long-horizon attacks.
- Be able to advance research in automated red-teaming and adversarial robustness, leveraging multi-turn/agentic attacks to systematically test for and uncover misuse vulnerabilities.
