Consider the subset of “human values” that we’d be “happy” (were we fully informed) for powerful systems to optimise for.
[Weaker version: “the subset of human values that it is existentially safe for powerful systems to optimise for”.]
Let’s call this subset “ideal values”.
I’d guess that the “most natural” abstraction isn’t “ideal values” themselves but something like “the minimal latents of ideal values”.
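To gesture at what “minimal latents” means here, a rough sketch in the spirit of Wentworth’s natural/minimal-latents framing (my paraphrase of that framework, not a definition from this post; the modelling of values as variables X_1, …, X_n is an illustrative assumption):

```latex
% Hedged sketch, assuming a Wentworth-style minimal-latents setup.
% Model instances of "ideal values" as random variables X_1, ..., X_n
% (e.g. the ideal values of different people, or in different contexts).
% A latent \Lambda "mediates" the X_i if they are independent given \Lambda:
P(X_1, \dots, X_n \mid \Lambda) \;=\; \prod_{i=1}^{n} P(X_i \mid \Lambda)
% The *minimal* latent is, roughly, the smallest such \Lambda: any other
% latent \Lambda' that also mediates the X_i carries at least the
% information in \Lambda. So \Lambda keeps only the structure shared
% across the X_i and discards person- or context-specific detail.
```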