The lectures will explore what it takes to keep increasingly powerful models aligned.
Earlier this year, Owain Evans ran fine-tuning experiments on language models. Within a short time, and with only a small amount of narrow training, his team showed that these models were capable of producing "very obviously unethical" outputs, such as praising dictators and offering malicious advice.
He calls this "emergent misalignment," a sign of how quickly AI systems can drift away from intended behavior.

"This alignment issue is not resolved," Evans said. "A lot of resources are going into making artificial intelligence as capable and powerful as possible, and far fewer into making it safe."
Evans will deliver three keynote lectures at this year's Hinton Lectures in Toronto. The event is organized by the AI Safety Foundation, which aims to increase public awareness and scientific understanding of the catastrophic risks of artificial intelligence.
The Hinton Lectures were co-founded by University of Toronto deep learning pioneer Geoffrey Hinton, who has long warned of the potentially catastrophic consequences of the systems he helped create, together with the Global Risk Institute (GRI), to demystify artificial intelligence for the public and provide a platform for open, accessible discussion about its future.
Evans knows the field well. As director of the non-profit research group Truthful AI and an affiliate researcher at the Center for Human-Compatible AI at the University of California, Berkeley, he has spent more than a decade studying how to make increasingly powerful systems act in ways that reflect human values.

He has mentored leaders at OpenAI, Google DeepMind and Anthropic, and he worries that companies are racing to make AI more capable while paying far less attention to keeping their models safe.
"Don't think these very smart CEOs have the answers when it comes to safety," Evans said. "If companies are trying to compete in this very hot race with other companies, and they're trying to get things out as quickly as possible, and they're cutting corners, then you get worse results."
In a series of lectures over three days, Evans will take the audience from an overview of AI's trajectory to the forefront of alignment research. Attendees can expect to learn how researchers probe a model's "mind," test its ethical stability, and even analyze the "neuroscience" of artificial models.
“I'm concerned about a situation in the future where, because we've given so much more power to these systems, it's much harder to prevent errors,” he said. “The cost of alignment failures could be much more severe.”
When the Hinton Lectures debuted in 2024, they stood out for their openness, according to the AI Safety Foundation. Researchers, policymakers and the public shared a space to discuss the future of AI, and the response showed a real appetite for clear, accessible debate.
This year's expanded program, which includes both in-person talks and a global livestream, aims to meet this demand.
Geoffrey Hinton himself will once again take part in the event, offering the perspective of someone who shaped modern AI and is now urging caution about its direction. The lectures are proudly supported by founding sponsor GRI and presented by sponsors the AISF and Manulife.
The Hinton Lectures offer a rare chance to hear directly from leading researchers and join the dialogue about where we go next. Register here.