
Book Review: Human Compatible: Artificial Intelligence and the Problem of Control

Updated: Mar 23, 2022



How we currently build AI is fundamentally flawed. We need new principles and a new process for building intelligent machines that humans can control in the long run. That’s the premise that anchors Stuart Russell’s book, Human Compatible: Artificial Intelligence and the Problem of Control. Russell is considered a pioneer in the AI research community. He’s a professor at UC Berkeley and co-author, with Peter Norvig, of a hugely popular textbook, Artificial Intelligence: A Modern Approach, used by thousands of computer science students. So, when Stuart Russell says things are broken, it’s not something to take lightly.

The book is written in an accessible way that a non-technical audience can easily follow but that still offers value for those working in the field. Russell lays out the history of artificial intelligence, a useful primer for those not familiar with AI or for those who may only know its technical side. A central issue, according to Russell, is how intelligence itself came to be defined: “Machines are intelligent to the extent that their actions can be expected to achieve their objectives.” (Russell, 2019, p. 9) This definition, he argues, is problematic. Instead, we need machines whose actions can be expected to achieve our objectives; that is, machines that are beneficial to humans.

Much of the book chronicles the problems of losing control over AI as we move towards superintelligence. Russell’s epiphany about AI’s dangers, and his commitment to discussing these issues publicly, was cemented in 2014 during a screening of the movie Transcendence. If he weren’t a distinguished professor and researcher, he might be amongst the “woke” tech bros making amends for moving too fast and breaking too many things. Having seen the light, Russell’s mission is focused on sounding the alarm around the existential risk of superintelligence.

This is particularly challenging work within the AI community itself, as members refuse to acknowledge the problem through various forms of “denial and deflection”, as Russell puts it. One of the most interesting arguments is that many AI researchers don’t seem to believe in the successful outcome of their own work, instead denying that achieving superintelligence is even possible. He deftly dismantles these arguments from his vantage point as a seasoned AI researcher and chalks up the fissures in the AI community to tribalism. To speak up about the risks of AI is to be seen as anti-AI, which Russell concludes is nonsense. In fact, he claims, those who care about AI should care enough to “own the risks and work to mitigate them.” (Russell, 2019, p. 160)

His solution to the problem of control is a set of principles for building beneficial AI centered on maximizing human preferences. This seems like a logical approach but, as Russell goes on to point out, it’s not all that easy to implement because humans are highly unpredictable in their preferences. This is covered in the chapter called Complications: Us, which contains one of the most illuminating lines in the book. In unpacking the work of psychologist and behavioural economics pioneer Daniel Kahneman, who studies how humans often make irrational or suboptimal choices, Russell notes that “standard mathematical models focusing on maximizing a sum of rewards” were used for mathematical convenience; the justifications came later. (Russell, 2019, p. 240) While he was referring to a specific situation in Kahneman’s work, this led me to wonder: what else are we doing for “mathematical convenience” and then justifying?

One possible example occurs a few pages later, where thousands of years of moral philosophy are glossed over in less than a paragraph and a consequentialist, utilitarian approach to ethics is deemed most relevant to AI. This utilitarian approach, which optimizes for the greatest good of the greatest number, is convenient from a mathematical perspective, though other justifications are given as to why this perspective is best. Russell uses this utilitarian approach (and a dose of game theory) as the basis for his technical solution for provably beneficial AI. To handle the uncertainty of preferences mathematically, he draws on inverse reinforcement learning (IRL), a technique in which observed human behaviour is used to infer the reward function the machine should optimize.
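To make the flavour of IRL concrete, here is a minimal toy sketch of my own (not from the book): a machine watches a human choose between drinks and performs a Bayesian update over candidate reward functions, assuming the human is only approximately rational. All of the action names, hypotheses, and numbers are illustrative.

```python
# A toy sketch of the intuition behind inverse reinforcement learning (IRL).
# Rather than being handed a fixed objective, the machine watches a human act
# and updates its belief over candidate reward functions. Everything here
# (actions, hypotheses, numbers) is illustrative, not from Russell's book.

import math

# Candidate hypotheses about the human's preferences: each assigns a
# reward to every action the human might take.
HYPOTHESES = {
    "likes_coffee": {"coffee": 1.0, "tea": 0.2, "water": 0.1},
    "likes_tea":    {"coffee": 0.2, "tea": 1.0, "water": 0.1},
    "indifferent":  {"coffee": 0.5, "tea": 0.5, "water": 0.5},
}

def choice_probability(rewards, action, rationality=3.0):
    """Boltzmann-rational choice model: the human usually, but not
    always, picks the action with the higher reward."""
    weights = {a: math.exp(rationality * r) for a, r in rewards.items()}
    return weights[action] / sum(weights.values())

def update_belief(belief, observed_action):
    """Bayesian update of the belief over reward hypotheses after
    observing one human choice."""
    posterior = {
        h: p * choice_probability(HYPOTHESES[h], observed_action)
        for h, p in belief.items()
    }
    total = sum(posterior.values())
    return {h: p / total for h, p in posterior.items()}

# Start maximally uncertain, then watch the human pick coffee twice.
belief = {h: 1.0 / len(HYPOTHESES) for h in HYPOTHESES}
for observation in ["coffee", "coffee"]:
    belief = update_belief(belief, observation)

for hypothesis, probability in sorted(belief.items(), key=lambda kv: -kv[1]):
    print(f"{hypothesis}: {probability:.2f}")
```

The point of the sketch mirrors Russell’s broader one: the machine never commits to a single fixed objective. It stays uncertain about human preferences and lets observed behaviour shift that uncertainty.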

While Russell proposes a way forward, he also understands that the problem is hard to solve. He dedicates the last chapter to analyzing the strengths and weaknesses of the proposed solution and whether it is even feasible given the complexity involved and the vast array of stakeholders. The book ends on a note of “wait and see” uncertainty.

I agree that the way AI is currently constructed needs to be fixed, but perhaps not exactly in the manner that Russell proposes. His argument is built on an assumption: that AI is just a tool, one that we as humans need to control. This “problem of control” is his central focus, as the title of the book suggests. Yet is control the lens through which we should examine our future with AI, or are there other ways to think about it?

Seeing AI only as a tool to be controlled sets up a familiar pattern. The argument that we need to keep the machines in line to ensure they do our bidding reflects a Western worldview and colonial frameworks of master and slave or resource exploitation. This perspective sets up a binary dynamic: either we are in control or superintelligent AI is in control. However, as we seek better ways to evolve our thinking about how AI is constructed, perhaps we can also learn from other cultures in reframing our relationship with AI. Could we have a more harmonious relationship? Is there a possible future where AIs and humans co-exist in ways that are truly compatible, without either being dominant? These are interesting questions that deserve further exploration.

Overall, the book does a good job of explaining the risks of continuing to develop superintelligent AI within the goal-seeking optimization framework for intelligence that we currently use. Russell makes a logical, passionate case for change, and I’m on board with the sentiment that change is needed, even if I don’t fully buy into his proposed solution. We need more AI researchers like Stuart Russell working on new directions for AI that align with societal values, and we need more books like Human Compatible that explain the issues and engage a wider group of people in the conversation.


By Katrina Ingram, CEO, Ethically Aligned AI



Resources

Russell, S. (2019). Human Compatible: Artificial Intelligence and the Problem of Control. New York, NY: Penguin.

