Even if we slow down training runs, we'll still make algorithmic and hardware progress, lowering the barrier to future training by defectors, or producing a sudden discontinuous jump in capabilities if a ban is later lifted.
Matthew Barnett argues that, at least theoretically, a private individual could afford to run inference with a brain-scale model.
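For intuition, here's a rough back-of-envelope sketch of that claim (my own assumed figures, not Barnett's): if brain-equivalent compute is on the order of 1e15 FLOP/s and a rented datacenter GPU delivers a few hundred TFLOP/s for a couple of dollars an hour, the hourly cost of brain-scale inference lands well within an individual's budget.

```python
# Back-of-envelope sketch: could an individual afford brain-scale inference?
# All figures below are assumptions, not sourced numbers.

BRAIN_FLOPS = 1e15        # assumed brain-equivalent compute, FLOP/s (estimates span orders of magnitude)
GPU_FLOPS = 3e14          # assumed dense throughput of one modern datacenter GPU, FLOP/s
GPU_COST_PER_HOUR = 2.0   # assumed cloud rental price per GPU, USD/hour

gpus_needed = BRAIN_FLOPS / GPU_FLOPS
cost_per_hour = gpus_needed * GPU_COST_PER_HOUR

print(f"GPUs needed: ~{gpus_needed:.1f}")       # ~3.3 GPUs under these assumptions
print(f"Cost per hour: ~${cost_per_hour:.2f}")  # ~$6.70/hour under these assumptions
```

Under these (very rough) assumptions, continuous brain-scale inference costs a few dollars an hour, i.e. comparable to renting a handful of GPUs.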
It might be better to slow/stop research later, when we're closer to truly dangerous systems; i.e., maybe AI alignment research is easier with more ASI-like systems.
This is probably true of RLHF as an alignment method.
I notice that much of the progress in mechanistic interpretability has been made on very small models, and it's often cited as one of the more promising paths.
Unsurprisingly, sama and gdb don't sign; they call for regulation and coordination, and for alignment with some democratic standard of behavior (not really sure how that would work).