You want to align an AGI. You just started your first programming course in the Computer Science faculty and you have a bright idea: let’s decompose the problem into simpler subproblems!
What you want is to reach a world state that is not a dystopia. Write D for a dystopia predicate on world states: D(S) is true if the state S is dystopian, and false otherwise.
You would like to implement the following algorithm:
Start at a random state S.
While D(S):
Change S a little in a non-dystopian direction.
You aligned AI!
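To make the shape of the pseudo-algorithm concrete, here is a toy sketch. Everything in it is a stand-in: a "world state" is just a list of floats, and `dystopia_score` is an arbitrary made-up function; specifying the real D is exactly the open problem the questions below are about.

```python
import random

def dystopia_score(state):
    # Hypothetical stand-in: lower is "less dystopian".
    # A real D would have to capture everything we mean by dystopia.
    return sum(x * x for x in state)

def D(state, threshold=1.0):
    # The dystopia predicate from the post: true iff the state is dystopian.
    return dystopia_score(state) > threshold

def align(state, step=0.1, max_iters=10_000):
    # The loop above as local search: while the state is dystopian,
    # try a small random perturbation and keep it only if it moves
    # the state in a non-dystopian direction.
    for _ in range(max_iters):
        if not D(state):
            return state  # "You aligned AI!"
        candidate = [x + random.uniform(-step, step) for x in state]
        if dystopia_score(candidate) < dystopia_score(state):
            state = candidate
    return state  # gave up: may still be dystopian

random.seed(0)
S = [random.uniform(-2, 2) for _ in range(3)]
S = align(S)
print(D(S))  # False for this toy setup
```

Note that even this toy version smuggles in the hard parts: it assumes D is cheap to evaluate and that "a little change in a non-dystopian direction" can be found by random perturbation, which is precisely what the questions below interrogate.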
Of course, everything here is vaguely defined. Still, I wonder whether it makes sense to pose my questions before doing that defining work. If so, please interpret them under the most reasonable (?) formalizations of the concepts above, assuming such formalizations are possible and not themselves alignment-complete.
So, we have multiple questions:
1. Is specifying anything of the above (e.g., D) alignment-complete?
2. Is evaluating D(S) alignment-complete?
3. Is changing S a little in a non-dystopian direction alignment-complete?
4. How would you actually use that pseudo-algorithm though?
Now, there’s a concept slightly different from alignment-completeness (I’d guess): “constructing a state U such that D(U) is false”. As far as I know, no one has done this yet, and every proposal tends to fail horribly.
So the questions above are worth repeating in this form:
1. Is specifying anything of the above (e.g., D) easier than constructing U?
2. Is evaluating D(S) easier than constructing U?
3. Is changing S a little in a non-dystopian direction easier than constructing U?
4. How would you actually use that pseudo-algorithm though?
Is checking that a state of the world is not dystopian easier than constructing a non-dystopian state?