Primarily interested in agent foundations and AI macrostrategy.
I endorse and operate by Crocker’s rules.
I have not signed any agreements whose existence I cannot mention.
Estonia. (Alternatively, Poland, in which case: PLN, not EUR.)
I’m considering donating. Any chance of setting up some tax deduction for Euros?
I think you meant to hide these two sentences in spoiler tags, but you didn't.
guilt-by-association
Not necessarily guilt-by-association, but maybe rather pointing out that the two arguments/conspiracy theories share a similar flawed structure, so if you discredit one, you should discredit the other.
Still, I’m also unsure how much structure they share, and even if they did, I don’t think this would be discursively effective because I don’t think most people care that much about (that kind of) consistency (happy to be updated in the direction of most people caring about it).
Reminds me of how, a few years ago, I realized that I don't feel some forms of stress but can infer I'm stressed by noticing a reduction in my nonverbal communication.
FYI, if you want to use o1-like reasoning, you need to check the "Deep Think" option.
It’s predictably censored on CCP-sensitive topics.
(In a different chat.) After the second question, it typed two lines (something like "There have been several attempts to compare Winnie the Pooh to a public individual...") and then overwrote them with "Sorry...".
glitch tokens are my favorite example
I directionally agree with the core argument of this post.
The elephant(s) in the room according to me:
What is an algorithm? (inb4 a physical process that can be interpreted/modeled as implementing computation)
How do you distinguish (hopefully in a principled way) between (a) the algorithm changing, and (b) you being confused about what algorithm the thing is actually running, the real algorithm being more nuanced, so that what "naively" looks like a change of the algorithm is "actually" a reparametrization of it?
I haven’t read the examples in this post super carefully, so perhaps you discuss this somewhere in the examples (though I don’t think so because the examples don’t seem to me like the place to include such discussion).
Thanks for the post! I expected some mumbo jumbo but it turned out to be an interesting intuition pump.
Based on my attending Oliver’s talk, this may be relevant/useful:
I too have reservations about points 1 and 3, but not providing sufficient references or justifications doesn't imply they're not on SL1.
mentioned in the FAQ
(I see what podcasts you listen to.)
My notion of progress is roughly: something that is either a building block for The Theory (i.e. marginally advancing our understanding) or a component of some solution/intervention/whatever that can be used to move probability mass from bad futures to good futures.
Re the three you pointed out: simulators I consider a useful insight; gradient hacking probably not (10% < p < 20%); and activation vectors I put in the same bin as RLHF, whatever the appropriate label for that bin is.
Also, I’m curious what it is that you consider(ed) AI safety progress/innovation. Can you give a few representative examples?
I think Mesa is saying something like "The missing pieces are too alien for us to expect to discover them by thinking/theorizing, but we'll brute-force the AI into finding/growing those missing pieces by dumping more compute into it anyway," and Tsvi's koan post is meant to illustrate how difficult it would be to think oneself into those missing pieces.