Emergent capabilities of large foundation models often surprise their creators

For example, chain-of-thought reasoning was discovered well after GPT-3’s development (Bowman, S. R. (2023). Eight Things to Know about Large Language Models). See also this report quoting Anthropic’s Dario Amodei:

At a conference on generative AI on Tuesday, OpenAI’s former vice president of research Dario Amodei said onstage that while the company was training its large language model GPT-3, it found unanticipated capabilities, like speaking Italian or coding in Python. When they released it to the public, they learned from a user’s tweet it could also make websites in JavaScript.
“You have to deploy it to a million people before you discover some of the things that it can do,” said Amodei, who left OpenAI to co-found the AI start-up Anthropic, which recently received funding from Google.
“There’s a concern that, hey, I can make a model that’s very good at like cyberattacks or something and not even know that I’ve made that,” he added.

Related: SolidGoldMagikarp (plus, prompt generation) - AI Alignment Forum

Last updated 2023-07-13.