Emergent capabilities of large foundation models often surprise their creators

For example, chain-of-thought reasoning was discovered well after GPT-3’s development (Bowman, S. R. (2023). Eight Things to Know about Large Language Models). See also this report quoting Anthropic’s Dario Amodei:

At a conference on generative AI on Tuesday, OpenAI’s former vice president of research Dario Amodei said onstage that while the company was training its large language model GPT-3, it found unanticipated capabilities, like speaking Italian or coding in Python. When they released it to the public, they learned from a user’s tweet it could also make websites in JavaScript.
“You have to deploy it to a million people before you discover some of the things that it can do,” said Amodei, who left OpenAI to co-found the AI start-up Anthropic, which recently received funding from Google.
“There’s a concern that, hey, I can make a model that’s very good at like cyberattacks or something and not even know that I’ve made that,” he added.

Related: SolidGoldMagikarp (plus, prompt generation) - AI Alignment Forum

Last updated 2023-07-13.