
Research Engineer, Frontier Safety Loss of Control, DeepMind
Job Description
- Identify potential harms from misaligned agents and develop strategies for detection and prevention.
- Implement technical controls that monitor agent thoughts and behaviour and respond to mitigate potential harms.
- Integrate various agent behaviour signals from across the organisation to inform response policies.
- Conduct adversarial testing of controls.
- Work with internal product teams to ensure that control systems are adopted across all high-risk AI surfaces.