Berkeley professor Stuart Russell is co-author, with Peter Norvig, of Artificial Intelligence: A Modern Approach (the standard AI textbook, used in 1,400+ universities), and his 2017 TED Talk, "3 Principles for Creating Safer AI," remains the clearest formulation of the technical alignment problem available. Russell's three principles form the basis of the human-compatible AI research agenda he leads: an AI system's only objective should be to realize human values, it must start out uncertain about what those values are rather than having them hardcoded, and it must learn them from observing human behavior. A system built this way has a reason to let humans switch it off, since being switched off is itself evidence about what people want. The talk is 17 minutes and covers more conceptual ground in that time than most full-length books on AI safety. The honest limitation: Russell's work assumes a level of technical precision in goal specification that current systems are far from achieving. The prescriptive value remains; the timeline is more uncertain than the 2017 framing suggested.
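The link between uncertainty about human values and accepting an off switch can be made concrete with a toy calculation. The sketch below is not taken from the talk; it is a minimal illustration in the spirit of the off-switch analysis from Russell's research group. It assumes a human who knows the true utility of the proposed action and vetoes it exactly when that utility is negative. The function name `off_switch_values` and the sample numbers are hypothetical.

```python
from statistics import fmean


def off_switch_values(utility_samples):
    """Compare three options for a robot unsure about an action's true utility.

    utility_samples: draws from the robot's belief about the utility U of
    its proposed action (hypothetical numbers, for illustration only).
    Assumption: the human knows U and allows the action only when U > 0.
    """
    samples = list(utility_samples)
    return {
        # Act without asking: the robot receives U, whatever it turns out to be.
        "act": fmean(samples),
        # Switch itself off: nothing happens, so the utility is zero.
        "switch_off": 0.0,
        # Defer: the human blocks harmful actions, so the robot gets max(U, 0).
        "defer": fmean([max(u, 0.0) for u in samples]),
    }


# Uncertain robot: the action might help or might harm.
uncertain = off_switch_values([-2.0, -1.0, 1.0, 4.0])
print(uncertain)  # act = 0.5, switch_off = 0.0, defer = 1.25

# Certain robot: no doubt that the action is good.
certain = off_switch_values([1.0, 1.0])
print(certain)  # act = 1.0, switch_off = 0.0, defer = 1.0
```

Under these assumptions, deferring is never worse than acting or switching off, and it is strictly better only when the robot's belief puts weight on both good and bad outcomes. A robot that is certain of its objective gains nothing from keeping the human in the loop, which is the intuition behind the requirement that values not be hardcoded.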