Here are some maybe useful tags. Interpret these as ideas, not requests.
Mechanism Design (I am imagining this as including systemization that aligns incentives within yourself, which maybe means you would want a more general name like “Aligning Incentives,” but I think I prefer “Mechanism Design.”)
Fake Frameworks (When I first thought of this, I was thinking of people tagging their own posts. Maybe it is a little weird to have people tagging each other’s posts as fake.)
Embedded Agency (Where I am imagining this as being largely for technical work) (In particular, I personally would get more use out of one big embedded agency tag than a bunch of smaller tags, since I feel like all the most interesting stuff in embedded agency cuts across tags like “decision theory”)
Something like the class including: Toward a New Technical Explanation of Technical Explanation, Embedded World Models, technical logical uncertainty work, and things about dealing with the fact that Bayes is not a viable strategy for embedded agents. Candidate names: “Embedded World Models,” “Resource Bounded Epistemics,” “Embedded Epistemics,” “Post-Bayesianism.” I would hope the name here does not make people think it should only be for technical things.
Something like the class including: How I Lost 100 Pounds Using TDT, Humans Are Embedded Agents Too, Inner alignment in the brain, Sources of intuitions and data on AGI, and things about applying AI alignment theory to human rationality and vice versa. Maybe more generally, about applying results from one field to another field. “Interdisciplinary Analogies”?
https://www.lesswrong.com/posts/aiz4FCKTgFBtKiWsE/even-odds is another proposal that gives incentive-compatible betting by having the bet be smaller than the maximum. (Maybe it’s the same; I haven’t checked.)
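For concreteness, one standard way to get an incentive-compatible bet (possibly different from the mechanism in the linked post, which I haven’t reproduced here) is to settle the bet as the difference of proper scoring rules. A minimal sketch, assuming the quadratic (Brier) rule; the function name and `stake` parameter are my own illustration:

```python
def brier_bet_payoff(p1, p2, outcome, stake=1.0):
    """Settle a bet via the quadratic (Brier) scoring rule.

    Each player reports a probability for the event. Player 1 receives the
    difference of the two Brier scores, scaled by `stake`. Because the Brier
    score is a proper scoring rule, each player maximizes their own expected
    payoff by reporting honestly, regardless of what the other reports.
    The payoff is bounded by `stake`, so the bet is smaller than the maximum
    either side could in principle stake.
    """
    y = 1.0 if outcome else 0.0
    score1 = -(p1 - y) ** 2  # higher (less negative) is better
    score2 = -(p2 - y) ** 2
    return stake * (score1 - score2)  # payment from player 2 to player 1
```

For example, if player 1 reports 0.9 and player 2 reports 0.5, player 1 wins 0.24 per unit stake when the event happens and loses 0.56 per unit stake when it does not; under player 1’s own belief, honest reporting maximizes their expected take.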
You should (more strongly?) disambiguate between how long after being sick you are safe and how long after being 100% isolated you are safe.
Is it pro-social or anti-social to vote on posts I have skimmed but not read?
We actually avoided talking about AI in most of the cartoon, and tried to just imply it by having a picture of a robot.
The first time (I think) I presented the factoring in the embedded agency sequence was at a MIRI/CFAR collaboration workshop, so parallels with humans were live in my thinking.
The first time we presented the cartoon in roughly its current form was at MSFP 2018, where we purposely did it on the first night before a CFAR workshop, so people could draw analogies that might help them transfer their curiosity in both directions.
Conspiracy theory: There are no launch codes. People who claim to have launch codes are lying. The real test is whether people will press the button at all. I have failed that test. I came up with this conspiracy theory ~250 milliseconds after pressing the button.
Oh no! Someone is wrong on the internet, and I have the ability to prove them wrong...
Did you consider the unilateralist curse before making this comment?
Do you consider it to be a bad idea if you condition on the assumption that only one other person with launch access who sees this post in the time window chooses to say it was a bad idea?
If any users do submit a set of launch codes, tomorrow I’ll publish their identifying details.
If we make it through this, here are some ideas to make it more realistic next year:
1) Anonymous codes.
2) Karma bounty for the first person to press the button.
1+2) Randomly and publicly give some people the same code as each other, and give a karma bounty to everyone who had the code that took down the site.
3) Anyone with button rights can share button rights with anyone, and a karma bounty for sharing with the most other people that only pays out if nobody presses the button.
Not sure if you’ve seen it, but this paper by Critch and Russell might be relevant when you start thinking about uncertainty.
This is my favorite comment. Thank you.
I think I do want to make my agent-like architecture general enough to include evolution. However, there might be a spectrum of agent-like-ness such that you can’t get much more than Sphex behavior with just evolution (without having a mesa-optimizer in there).
I think you can guarantee that, probabilistically, getting a specific outcome requires information about that outcome (no free lunch), which implies “search” over a “world model.”
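A toy sketch of the counting intuition behind this (not a proof of the claim, and the setup is my own illustration): a policy with no information about the target hits a uniformly random target among N outcomes with probability 1/N, whatever it does; each bit of information about the target can at best halve the candidate set.

```python
import random

def blind_search_hit_rate(n_outcomes=100, trials=10_000, seed=0):
    """Estimate how often a target-independent policy hits a uniformly
    random target. Since the policy gets no information about the target,
    any fixed guess succeeds with probability 1/n_outcomes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        target = rng.randrange(n_outcomes)  # target unknown to the policy
        guess = 0                           # any fixed, target-independent policy
        hits += (guess == target)
    return hits / trials
```

Running this gives a hit rate near 1/100; doing reliably better requires the guess to depend on information about the target, which is one way to cash out needing “search” over a model of the world.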
Yeah, but do you think you can make it feel more like a formal proof?
I think there is a possible culture where people say a bunch of inside-view things and run with speculations all the time, and another possible culture where people mostly only say literally true things that can be put into the listener’s head directly. (I associate these cultures with the books R:A-Z and Superintelligence, respectively.) In the first culture, I don’t feel the need to defend myself. However, I feel like I am often also interacting with people from the second culture, and that makes me feel like I need a disclaimer before I think in public with speculation that conflates a bunch of concepts.
Were you accidentally summoned by this post using your true name?
Nitpick: conservation of expected evidence does not seem to me like why you can’t do divination with a random number generator.