What does it feel like for you to hold your mind in that posture? How would you describe the main sensation by which you navigate when you encounter a maze?
Constrained; there’s no other way for things to be. But I don’t feel that nearly as strongly when I think of picking up my cup. Yet if I imagine moving my arm away from the cup instead of towards it, I might feel just a touch of that constraint. When I consider getting up, and not imagining crazy ways to move, I feel constrained. The objects next to my legs, the screen in front of me (the mouse and keyboard at my side, less so, oddly enough): from all of these I feel constraint.
This kind of reminds me of the primitive reachability, and Eliezer’s explanation of what is behind the illusion of “free will”. You’re terrifying me with this essay. Reading your work is like teetering on the brink of an abyss.
Twitter thread of classic excellent papers.
those willing to engage in such conduct are much less likely to become billionaires.
“The sage does not contend” reminds me of the most effective educational system I’ve ever heard of, DARPA’s digital tutor. Here’s a list of instructional tactics and procedures embodied in the Digital Tutor:
Are you predicting the LW responses or is a model you made predicting them?
-0.37 If you see ghosts, you should believe in ghosts. (Predicted LW response: Disagree)
I find this opinion weird, probably because there are multiple reasonable interpretations with quite different truth-values.
I think this post is worse than the previous one: the techniques it lists appear much less promising, and it somehow reminds me more of a set of notes for yourself about software or techniques you want to look into/remember.
Also, I really think you should’ve just started with the most important post first.
What would be an example of an optical illusion then?
In that paper, did you guys take a good long look at the output of various sized models throughout training, in addition to looking at the graphs of gold-standard/proxy reward model ratings against KL divergence? If not, then maybe that’s the discrepancy: perhaps Sherjil was communicating with the LLM and thinking “this is not what we wanted”.
Circa 2008, I don’t think we had great methods for detecting such cases, so I’m curious how your surgeons realized that you were awake. And there’s a term for that state in the literature: Inverse-Zombies. That happens about 0.13% of the time with some anaesthetics. And surgeons paid less attention to this stuff until about 1.5-2 decades ago, and you’d get some cases where people were awake, paralyzed and in pain. Some proportion of those had PTSD.
The issue is that when GPTs fail to generalize, they lose their capabilities, not just their alignment, because their capabilities originate from mimicking humans.
It is late at night, I can’t think clearly, and I may disavow whatever I say right now later on. But your comment that you link to is incredible and contains content that zogs rather than zigs or zags from my perspective, and I’m going to re-visit it when I can think good. I also want to flag that I have been enjoying your comments when I see them on this site, and find them novel, inquisitive and well-written. Thank you.
This is well written. I am interested in reading your next post.
I’d never actually heard of Quine’s indeterminacy of translation before. It sounds interesting, so thank you for that. But I don’t understand how Quine’s argument leads to the next step. It sounds discrete, absolute, and has no effect on the outward behaviour of entities. And later you use some statements about the difficulty of translation between entities, which sure sound continuous to me.
Sure, my post doesn’t rebut Roko’s. It isn’t an argument, after all. It is a recommendation for some content.
Yep, I didn’t watch the whole debate. I apologize if I gave that impression. Mea culpa.
How much you update on the evidence depends on your priors, no? Daniel went from “probably lab-leak” to “probably zoonosis”, and I went from “IDK” to “quite probably lab-leak”. That is the similarity I was pointing at.
I found Daniel’s thread helpful after watching some hours of the debate, as he noticed numerous points I missed. Plus, I had watched enough that the rest of the data in the thread made sense to me. So you’re right, Daniel’s thread probably isn’t that helpful. At least, not without watching e.g. the opening arguments by Rootclaim and Peter Miller.
I read summaries and watched some of the debate, because I too am lazy. That’s why I linked to Daniel Filan’s thread on the debate. Also, Peter and Rootclaim are working on a condensed version of the three debates, which should be a couple of hours long.
No, it’s about maximizing the free energy
Wait, what? This is literally the opposite of what thermodynamics does though?
So I would’ve agreed with you even a month ago, but then I was informed of the Rootclaim debate over the origins of Covid19 with $100,000 at stake. And oh boy, did that debate (and Daniel Filan’s helpful coverage!) change my mind on the likelihood of the lab-leak hypothesis. Frankly, this is the minimum standard of inquiry we should’ve held ourselves to as a society on such an important topic. Regardless, it changed my mind from about 70% likelihood of a lab-leak to about 1-5%. Manifold seems to agree, given the change from ~50% probability of lab-leak winning to 6%.
EDIT: To clarify, this comment was a recommendation, not an argument. Second, I didn’t watch the whole debate. Instead, I watched the opening arguments, snippets of later parts of the debate, and read through Daniel’s thread, which made sense given the content I had watched. I apologize for not making that clear. Mea culpa. Third, the debate covers the DEFUSE proposal.
I still recommend looking at parts of the debate (they’re handily timestamped!), which consists of written and verbal segments. Or you can wait until the condensed 2-hour version is released. Either way, the breadth of evidence considered, and the amount of detail analysed, was very high. To make things easier, I’ll describe the format of the debate and what each round was about. There were three rounds focusing on different aspects of Covid19’s origins. Each round was composed of three parts: opening statements by both sides (about 90 minutes each) and a third part consisting of a series of shorter debate segments with the judges asking questions. Slides were shared 10 days before each round, and written questions from the judges, plus replies to the judges and to each other’s arguments, were shared 2 days before each round.
Round 1: Opening arguments from Peter (pro zoonotic origin) and Saar/Rootclaim (lab-leak), plus the debate segment with judges asking questions.
Round 2: Opening arguments from Yuri Deigin (siding with Rootclaim) and Peter. They focus on the genetic evidence for lab-leak and gain of function research, including the whole DEFUSE proposal.
Round 3: Peter vs Saar again. Peter mostly summarizes his argument here and gives probabilities for a bunch of different things. Saar gives his final rebuttals, claiming every slide in Peter’s presentation has something wrong with it. Then the final debate segment.
There will be a shorter, edited version of the debate that I think will be released sometime in February.
 Link to a market whose description contains links to all of the written documents.
The arguments against it were (IIRC): the proposal didn’t happen at the WIV; the known viruses at the WIV, and those proposed in DEFUSE, weren’t actually that close genetically to what we got; likely suspects in lab-leak theories weren’t acting suspiciously before Covid blew up; Covid was first known to spread in the Wuhan market, which closely resembled zoonotic viruses spreading in wet markets in other cities; the market was 10 miles across the river from the WIV, where the relevant labs were; there were multiple lineages, which better matches zoonosis compared to a single lab leak; etc.
He gives total odds against a lab leak as 1 in 5^10^25, but I think he committed errors here. He made the conjunction fallacy, and I don’t think he gave likelihood ratios, which are what we care about. Plus, I feel like he left out evidence favouring the lab-leak theory in this probabilistic calculation. Funnily enough, the judges noted that and said he had made the same criticism against some of Rootclaim’s probabilistic analyses.
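To spell out what I mean by likelihood ratios, here’s the standard odds form of Bayes’ rule (the hypothesis labels are just for illustration):

$$\frac{P(\text{lab leak} \mid E)}{P(\text{zoonosis} \mid E)} \;=\; \frac{P(\text{lab leak})}{P(\text{zoonosis})} \times \frac{P(E \mid \text{lab leak})}{P(E \mid \text{zoonosis})}$$

A tiny $P(E \mid \text{lab leak})$ on its own doesn’t settle anything; what moves the posterior is how it compares to $P(E \mid \text{zoonosis})$.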
And pinning my hopes on AI wouldn’t work very well anymore anyway, since AI now seems more likely to me to lead to dystopian outcomes than utopian ones.
Do you mean s-risks, x-risks, age of em style future, stagnation, or mainstream dystopic futures?
I shifted to the view that there’s probably no persisting identity over time anyway
I am suspicious about claims of this sort. It sounds like a case of “x is an illusion. Therefore, the pre-formal things leading to me reifying x are fake too.” Yeah, A and B are the same brightness. No, they’re not the same visually.
I.e. I am claiming that you are probably making the same mistake people make when they say “there’s no such thing as free will”. They are correct that you aren’t “transcendentally free” but make the mistake of treating their feelings which generated that confused statement as confused in themselves, instead of just another part of the world. I just suspect you’re doing a much more sophisticated version of this.
Or maybe I’m misunderstanding you. That’s also quite likely.

EDIT: I just read the wikipedia page you linked to. I was misunderstanding you. Now, I think you are making an invalid inference from our state of knowledge of “Relation R”, i.e. psychological connectedness, which is quite rough and surprisingly far-reaching, to coarsening it while preserving those states that are connected. Naturally, you then notice “oh, R is a very indiscriminating relationship” and impose your abstract model of R over R itself, reducing your fear of death.

EDIT 2: Note I’m not disputing your choice not to identify as a transhumanist. That’s your choice and is valid. I’m just disputing the argument for worrying less about death that I think you’re referencing.
The first conversation is about Tyler’s book GOAT about the world’s greatest economists. Fascinating stuff; this made me more likely to read and review GOAT in the future if I ever find the time. I mostly agreed with Tyler’s takes here, to the extent I am in a position to know, as I have not read that much of what these men wrote, and at this point, even though I very much loved it at the time (don’t skip the digression on silver; I remember it being great), The Wealth of Nations is now largely a blur to me.
If you’re familiar with Tyler, you can guess his decision as to who is the GOAT. Also, I find it hilarious that Tyler claimed the AI x-risk people were these guys in Berkeley who think that the world has gotten a bit better over their lives, and do they really want to risk that, given how “business as usual” he thinks the future will be. Also, are his views really standard econ? Don’t the rules and heuristics of econ, plus ems, show you get crazy growth rates?
The title made me expect a different essay which talked about the resources consumed by ambition to let it blaze high. Also, could you expand on this paragraph by giving more concrete detail?
In actuality, putting a witness on my insecurities, triggers, and shadows has given me the power to empathize with my unintegrated parts: the inner critic who pushes me beyond my limits, the night owl who self-sabotages my early bedtime to do just a little bit more work, the impact-obsessed yes woman who takes on too much in an effort to prove herself.
How did you convince these parts? How long did it take? How easy was it to identify them (sounds not too hard if they map 1-1 to behaviours)? Have you stopped these self-sabotaging behaviours?
FLOPs can be cashed out into bits. Since bits are the universal unit of computation, and hence of optimization, if you can upper bound the number of bits one method uses to elicit a capability, that suggests you can probably elicit the same capability with another method, e.g. activation steering instead of prompt engineering. And if you can’t, if you find that prompting somehow requires fewer bits to elicit a capability than another method does, then that’s interesting in its own right.
Now I’m actually curious: how many bits does a good prompt require to get a model to do X vs activation steering?
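Here’s a back-of-the-envelope sketch of the accounting I have in mind; the vocabulary size, model width, and 16-bit precision are illustrative assumptions, not measurements of any particular model:

```python
import math

def prompt_bits(num_tokens: int, vocab_size: int) -> float:
    """Crude upper bound: each token is one choice out of vocab_size."""
    return num_tokens * math.log2(vocab_size)

def steering_vector_bits(d_model: int, bits_per_dim: int = 16) -> float:
    """Naive upper bound: store the vector at bits_per_dim precision.
    A tighter bound would quantize or compress the vector first."""
    return d_model * bits_per_dim

# Illustrative comparison: a 20-token prompt over a 50k vocabulary
# vs a 4096-dimensional steering vector.
print(prompt_bits(20, 50_000))     # ~312 bits
print(steering_vector_bits(4096))  # 65536 bits before any compression
```

Of course the raw storage of a steering vector overstates the optimization it actually applies, so the interesting question is how small either description can be made while still eliciting the behaviour.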
In some cases, we should aim to use RL-based finetuning to elicit a particular behavior. Especially in cases where the model is supposed to show agentic behavior, RL-based finetuning seems preferable over SFT.
Therefore, it is helpful to be familiar with the combination of RL and the type of model you’re evaluating, e.g. LLMs. Useful knowledge includes how to set up the pipeline to build well-working reward models for LLMs or how to do “fake RL” that skips training a reward model and replaces it e.g. with a well-prompted LLM.
Deep RL is black magic and should be your last resort. But sometimes you have to use the dark arts. In which case, I’ll pass on a recommendation for this RL course from Roon of OpenAI and Twitter fame. Roon claimed that he couldn’t get deep RL to actually work until he watched this video/series. Then it started making sense.
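And for the “fake RL” idea in the quoted passage, here’s a minimal sketch of what replacing a learned reward model with a well-prompted LLM judge could look like. The judge prompt and the query_llm helper are hypothetical placeholders, not any particular library’s API:

```python
def query_llm(prompt: str) -> str:
    """Placeholder for whatever completion API you use."""
    raise NotImplementedError

JUDGE_PROMPT = (
    "Rate the following assistant response for helpfulness "
    "on a scale from 0 to 10. Reply with a single number.\n\n"
    "Response:\n{response}\n\nRating:"
)

def fake_reward(response: str) -> float:
    """Score a rollout with a prompted LLM instead of a trained reward model."""
    raw = query_llm(JUDGE_PROMPT.format(response=response))
    try:
        return float(raw.strip().split()[0]) / 10.0  # normalize to [0, 1]
    except ValueError:
        return 0.0  # unparseable judgement counts as zero reward

# These scores can then be dropped into an RL finetuning loop
# (e.g. PPO on the policy model) in place of a learned reward model.
```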