The REVERSE HEADS Environment always [gives] you 0.5 reward if the coin comes up tails, but [if] it comes up heads, saying “tails” gets you 1 reward and “heads” gets you 0 reward. We have Knightian uncertainty between the two environments.
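To check my reading of the quoted payoff rule, here is a minimal sketch of the REVERSE HEADS rewards. The function names and the `p_heads` parameter are my own, and the fair coin is an assumption (the quote does not state the coin's bias). The second environment is not described in the quote, so it is left out.

```python
def reverse_heads_reward(coin: str, guess: str) -> float:
    """Reward in the REVERSE HEADS environment, as described in the quote."""
    if coin == "tails":
        # Tails: 0.5 reward no matter what you say.
        return 0.5
    # Heads: saying "tails" pays 1, saying "heads" pays 0.
    return 1.0 if guess == "tails" else 0.0


def expected_reward(guess: str, p_heads: float = 0.5) -> float:
    """Expected reward of always saying `guess`.

    p_heads is a hypothetical parameter; 0.5 (a fair coin) is assumed here.
    """
    return (p_heads * reverse_heads_reward("heads", guess)
            + (1.0 - p_heads) * reverse_heads_reward("tails", guess))


if __name__ == "__main__":
    for guess in ("heads", "tails"):
        print(guess, expected_reward(guess))
```

Under the fair-coin assumption, always saying “tails” averages 0.75 in this environment and always saying “heads” averages 0.25. That only covers the one environment, not the Knightian uncertainty between the two.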
In the next post, (#2)
The post after that (#3)
2 more posts to look forward to.
Later posts (not written yet) will be about the “1 reward forever” variant of Nirvana and InfraPOMDP’s (~#4), developing inframeasure theory more (~#5), applications to various areas of alignment research (~#6), the internal logic which infradistributions are models of (~#7), unrealizable bandits (~#8), game theory (~#9), attempting to apply this to other areas of alignment research (~#10), and… look, we’ve got a lot of areas to work on, alright? (*)
Plus a speculative/possible 7 more after that assuming no overlap or multi-post topics. (~#6 and ~#10 already being counted as 2 posts.)
*More leaning on the unenumerated possibilities.
I look forward to seeing more of this!