The more boring (and likely) case is that we simply have too few data points to tell whether AI control can actually work as it's supposed to, so we have to fall back mostly on priors.
I'll flag something from J Bostock's comment here while I'm at it:
> I've only ever heard control talked about as a stopgap for a fairly narrow set of ~human capabilities, which allows us to something something solve alignment.
The human range of capabilities is actually quite large (as discussed on SSC), so "~human capabilities" is not a narrow band.