https://www.anthropic.com/index/core-views-on-ai-safety
Unfortunately, I find this awfully unpersuasive on AI risk.
Per MN: Anthropic made a terrible call in trusting FTX/SBF, and this document utterly fails to grapple with what that says about their judgment. From Oliver at Lightcone:
I feel quite worried that the alignment plan of Anthropic currently basically boils down to "we are the good guys, and by doing a lot of capabilities research we will have a seat at the table when AI gets really dangerous, and then we will just be better/more-careful/more-reasonable than the existing people, and that will somehow make the difference between AI going well and going badly". That plan isn't inherently doomed, but man does it rely on trusting Anthropic's leadership, and I genuinely only have marginally better ability to distinguish the moral character of Anthropic's leadership from the moral character of FTX's leadership, and in the absence of that trust the only thing we are doing with Anthropic is adding another player to an AI arms race.