Jan Leike, a prominent AI researcher who recently departed OpenAI amid concerns over AI safety practices, has joined Anthropic to lead a newly formed “superalignment” team.
Leike’s team at Anthropic will focus on several facets of AI safety and security, including “scalable oversight” (supervising AI systems on tasks too complex for humans to evaluate directly), “weak-to-strong generalization” (using weaker models to guide the alignment of stronger ones), and automated alignment research. He will report directly to Anthropic’s Chief Science Officer, Jared Kaplan, and Anthropic researchers currently working on scalable oversight will join his team as it ramps up.
The move mirrors the setup of OpenAI’s now-dissolved Superalignment team, which Leike co-led and which aimed to solve the core technical challenges of controlling superintelligent AI within four years. Leike’s departure from OpenAI stemmed in part from disagreements over the company’s approach to AI safety.
Anthropic, led by CEO Dario Amodei, himself a former VP of Research at OpenAI, has positioned itself as more safety-focused than its counterparts. The company was founded after Amodei split with OpenAI, driven by disagreements over OpenAI’s increasingly commercial direction. Anthropic has since attracted several other former OpenAI employees to its ranks, including policy lead Jack Clark.