LessWrong Team
I have signed no contracts or agreements whose existence I cannot mention.
Curated! There’s a difficult-to-bridge divide between the intuitions of people who think everything is going to get really crazy with AGI and those who think a kind of normality will be maintained. This post does an uncommonly good job of piercing that divide by arguing in detail, and mechanistically, for why the current picture doesn’t obviously continue. More generally, it argues for a better epistemic approach.
I struggle when encountering people who predict reality won’t be that different in the coming decades: it feels crazy to me, but that reaction makes discussion harder. The contents and example of this post point at where discussion can be had, suggesting both object-level and meta-level places to explore cruxes. It’s a valuable contribution. Kudos!
Curated. This is a clearly written, succinct version of both arguments and counterarguments, which doesn’t even seem terribly lossy to me (though I’ve read IABIED in full but not the counterarguments). I find it helpful for loading it all up into my mental context at once, and helpful for directing my own thinking for further investigation. All that to say, I think this post does the world a good service. And like much distillation work, deserves more appreciation than is the default.
I’m pretty on the doomy side and find the counterarguments unpersuasive, but it is interesting to realize that this is often because of yet further counter-counter-arguments that I’m aware of but that aren’t in IABIED itself (if I’m remembering correctly), or at least not at the length or depth I think is warranted given how intuitively reasonable those counterarguments seem: e.g. that models are trained on lots of data about human values, so hitting that target wouldn’t be so surprising after all, and that current models seem pretty aligned. I think answering them requires something of a 201 version of IABIED. (But that’s why we have the Four Layers of Intellectual Conversation!)
However, I am saddened that this review is missing the critiques I’m most interested in hearing, e.g. those from the likes of Buck and Ryan (e.g. “I enjoyed most of IABIED”). The counterargument authors like Matthew Barnett, Quintin, Nora, etc. are people with whom I have a lot of divergences of views, so their arguments have a harder time being compelling to me. Buck and Ryan are much, much closer (and I respect their thinking), such that I’d like any list to capture their arguments (or at least link to them). Notwithstanding, I like this piece. Kudos!
Noted! Thanks for responding and clarifying. If you had any examples you’d encountered, that might be helpful.
In my case, I have encountered, e.g., startup founders who lied to their clients blatantly, but semi-competently, in that they could hope not to get caught. Things like “our product can [already] do that too”, and then running to the engineers.
I like the title of this post! The content of the post isn’t bad.
This was supposed to be a grand post explaining that belief. In practice it’s mostly a bunch of pointers to facets of truthseeking and ideas for how to do better.
I want the grand post! (I want clear articulations of the thing I feel is true and important.) Especially after you point out that it might have been.
The points in the post aren’t bad, though it feels like fewer examples in greater depth, ones I could better memorize, would have more value than many short ones. The alphabetical fictional names make me bounce off a bit.
My guess is that the content surpassed the writing; mostly a choice on Elizabeth’s part, as her strongest writing is very strong. I’d be interested in taking some of what’s here and expanding it (into a “meditation” that helps a lesson sink in). And of course, the grand post, please!
+1 in the Review
This is a good post. The examples are clear and it deepened my intuition (though I’m judging from the reread; I don’t remember the delta from before my first reading). After the second read, I think I might notice more instances of adverse selection in the wild, though I don’t think the first read had much impact on me.
The intended subsequent posts look really great, like they’d have interesting models I don’t yet have. I think I had the concept of adverse selection before this, so it wasn’t a conceptual breakthrough.
Then again, maybe the title should have been “availability is an update against goodness because of adverse selection”, which is depressing but perhaps true. I feel like I don’t know what to do with that though. I kind of already know the best restaurants are crowded and the most attractive people aren’t single? Maybe there’s some gain from remembering to make an update on things once they prove available.
It’s a good post, but didn’t give me obvious large value. So giving it a 1 in the review.
I could imagine giving the sequels more though. I suppose it figures, this post was probably adversely selected for being easier to write due to its simpler content ;)
Curated! Beliefs should pay rent in anticipated experiences is a foundational tenet of this site, and AI is turning out to be an important topic, so thank you to jessicata for compiling this list.
Somehow this feels a lot more interesting than the list of predictions on Manifold, perhaps because of the selection/curation and it not being filtered to predictions that ended up on a market. I’d be interested in someone making an @record-prediction on Twitter that you can reply to tweets with, and then they get added to some database like this.
Predictions often come down to operationalization, and that’d be a neat expansion, yet even without it, it feels neat to have them collected.
Thanks. I hope we can eventually evaluate every prediction collected here!
I read this and think “ah, yes, this is valuable and important and I should be trying to do that more”. And I thought as much when I first read it. I don’t think it stayed on my mind. It’s too compressed and not a ready cognitive strategy.
But taking a few moments to extrapolate it into something better, starting with why I’m not doing it to begin with:
A reason I don’t do more of this is that I can’t do it on the order of 30 seconds. My guess is that constructing a picture of what mental operations I did and what I could have done is alone the work of many minutes.
For the kinds of reasoning I really wish I’d done faster, they happened over time and it really would take a bunch of mental excavation to reconstruct my reasoning.
I don’t have a well-specified ontology for mental operations such that it’s easy to specify changes. (In contrast, I have a very clear ontology for driving a car, where realizing an error and rehearsing doing it differently in 30 seconds feels doable.) This means the work of figuring out how to do better is trying to carve out descriptions of what went wrong.
The things that went wrong run deep, or something, into weird emotional territory that is hard to analyze.
Solving problems and reaching true conclusions is hard enough that I’m caught up on that level, from one problem to the next, such that I feel too busy for reflection.
Yet I don’t fully buy all the above.
I do think that to do more of this, to make it a habit, will need intentional practice. Scheduled 30-minute blocks. Seems worth it; I should add it to ye olde exobrain to remind me. I’m forming an intention to try it.
The other piece is the noticing. I don’t think I have a part of my brain that registers a “reached some milestone” event such that other actions could be triggered by it. Something, something Logan’s Noticing sequence. I’ll try that.
Ok, so where does that leave me regarding this crosspost?
I want to give this a 4 because it’s Rationality stuff from Eliezer. I don’t think I can, because, great-seeming as it is, I don’t see that people will be able to do a lot with it without a bunch of unpacking (as I’m attempting). Then again, if I do the post-inspired work for a while and get great gains, I might want to say “it was short, but it had such a large effect on me that it was definitely worth a 4 or even 9!”
I think this post does a good job of conveying the challenges here, grounded in actual cases. (It’s hard for me to evaluate whether it does a great job because of my pre-existing knowledge of the topic.) I think this stuff is hard and I have so much sympathy for anyone who’s been caught up in it, if they weren’t the instigator.
I don’t feel convinced it’s impossible to do this much better. My own median world isn’t very fleshed out, but my gut tells me that dath ilan has figured out some good wisdom and process here, and I trust it. I’d also guess that if Lightcone did more in this realm[1], we’d eventually figure out some better processes here that make things better for all involved, and possibly not just within our own domain, but guidelines for other groups to follow too.
Given that, I think it’s kinda bad to call the problem impossible. Even if something is hard and we’re unlikely to make progress on it, don’t cause people not to try through excessive pessimism.
I’ll probably give this post a 1. I’d be excited if anyone wrote a sequel offering guesses at how to design a better system (inspired by the actual challenges encountered), and perhaps an experimental guide for others to try out.
We were doing much more of this while running the Lightcone Offices. Lighthaven is typically rented out to other groups and doesn’t maintain enough of its own persistent in-person community for us to have reason to do a lot of this kind of adjudication. (Ben’s big investigation started in the Offices days, and is a case from which there is a lot to be learned; I do think that if we unfortunately had more such cases, we would learn and get better.)
I think this is a valuable post. I say that less because of the specific ideas (they all seem like plausibly correct analyses to me) and more for its exploring the problem at all.
1. There’s a societal taboo against discussions of intelligence and IQ that, although much weaker on LessWrong, I suspect is not completely absent, and therefore we don’t get many posts like this one.
2. I often feel annoyed and judgmental that broader society doesn’t clamor for longevity increases; it seems so correct to think these are possible and important. Reading this post, I wonder if I commit the same mistake regarding intelligence enhancement. It clearly should be doable.
The argument against thinking about this stuff is that we have more dire urgent problems (AI) and in contrast there isn’t that much tractability here. But was I justified in believing that before this post? Am I still justified in believing it?
If in reality I (and others?) feel stuck regarding AI, isn’t intelligence stuff worth more attention? This post actually engages with that. I’m caught between giving it a 4 and a 9.
Feels like this points at correct things, and I’m amenable to it being one of the top posts for 2024. It didn’t change much for me (as opposed to @Ben Pace, who thinks about it many times per month according to his review) or feel so spot-on that I’d want to give it a high vote. I’ll probably give it something between 1 and 4.
Areas where it strikes me (admittedly without that much thought or careful reading) as not perfectly right:
Notwithstanding the heading contra this, my instinct is to want to reduce “believing in” statements to a combination of “I believe (Bayesian-style) that good things happen if I invest in X” + “I am publicly declaring myself for X (kickstarter / commitment mechanism)”. Which is a little interesting, but also a known phenomenon. Added to that, you get boring old motivated cognition telling yourself “I’ll get this done in three hours”. This might be an effective semi-self-aware self-deception to get yourself to do things you wouldn’t otherwise do, but it is also manipulation of the Bayesian belief slots in your head in order to get some result.
So believing-in’s are Bayesian beliefs with some indirection, plus an expression of commitment and/or group affiliation. If so, that is useful to point out.
An extension that’d be neat is to analyze how often expressed “values” are believing-in’s, e.g. “I believe in family”, “I believe in democracy”. If those are actually just Bayesian beliefs + commitment, then they’re a lot more defeasible than the intrinsic, inherent base “values” LessWrong normally talks about.
This post is entertaining and was valuable for describing to me a group of people with whom I never interact (highly incompetent liars), but not all that useful given that I never interact with such people. I don’t think I especially need an existence proof for lying; I do think it’d help to get a post with examples of lying that are closer to what I’d encounter, or at least sophisticated enough to pass if you’re too credulous.
I have a feeling of distaste for this post from an unusual angle: when we first introduced a recommendation engine, out of the box the algorithm started maximizing click-through rate (and was pretty good at it), but was definitely doing so by promoting posts with the most click-baity titles. This one was perhaps the algorithm’s favorite post. (We then adjusted things to pull toward a distribution we liked better, at the expense of some CTR.) The exercise gave me a sense of what’s gone wrong with the internet at large.
It’s not a bad post, but I think something has gone wrong if it’s high up on our list of “best posts”. Looking at the post so far, a lot of people voted on it, probably because a lot of people read it given the alluring title... though many votes are 4s, and I’m curious what people thought was so valuable.
I’m giving a −1 because it’s not the intellectual progress I hope to see, and I didn’t find it all that helpful. A fun read though.
The first concept in this post really stuck with me: that of computational kindness vs <whatever the kindness of letting the other choose is>. The OP writes that they got it from elsewhere, but I appreciate it having made it to me.
I’d really love it if it had a better solution for how to pick between the kindnesses, as I can find myself wondering which is preferred.
The other concepts are great too. They hadn’t stuck in my mind from original reading but perhaps will now.
I really wouldn’t mind more posts just providing me with useful handles like this, so good stuff.
The review I have of this post is much the same as the one I left on John’s other delta post, My AI Model Delta Compared To Christiano:
speaking to the concept of deltas between views
I find that in reading this I end up with a better understanding of Paul and John (to the extent John summarized Paul well, but it feels right). Double-crux feels like a large ask, given that the hunt for a mutually shared counterfactual change is just a lot to identify; an ideological Turing test means someone putting aside who they are and what they think too much; but “deltas” feel like a nice alternative that’s not as complicated to compute and doesn’t lose one’s reference to what they think.
I’d be pretty into everyone for six months being gung-ho on deltas and trying to identify and flesh them out. It’s a cool approach. For that matter, I’d be excited for someone to try and flesh out a methodology for eliciting deltas akin to the effort Eli Tyre once put into double-crux, as an adjacent approach.
What really stuck for me from this one is the point Eliezer makes in his top-level comment:
I think that the AI’s internal ontology is liable to have some noticeable alignments to human ontology w/r/t the purely predictive aspects of the natural world; it wouldn’t surprise me to find distinct thoughts in there about electrons. As the internal ontology goes to be more about affordances and actions, I expect to find increasing disalignment.
The Natural Abstraction Hypothesis really does feel so plausible when thinking about predicting the external world, as the external world does appear to have all this structure you’d expect any mind to parse. Because of Eliezer’s response comment (and I give a lot of credit to posts that elicit valuable comments), I see we have a problem anyway.
I’d have liked it if @johnswentworth had responded to Eliezer’s comment at more length, though maybe he did and I missed it? Still, giving this post a 4 in the review, same as the other.
Curated. I like this post for capturing and expressing a struggle I relate to. I very much like the detail in the recollection of the thoughts and feelings throughout, all tying back to the motivation.
The way I’d express the struggle for myself is being caught between wanting to connect to people in general, and finding people in general to be painfully lacking. At some point in recent years I privileged the hypothesis that focusing on ways I was better and others worse was a way to preempt or soothe rejection: I don’t know how to fit in with these folks, but it’s okay, I’m better. I still suspect that dynamic is at play, but sometimes it doesn’t feel like it; it just feels like people are painfully myopic, to their and my detriment. I feel frustrated with them, and I don’t feel kinship.
(Motivated cognition feels like myopia to me – you feel better now with a belief you like, but you pay a greater cost later.)
At present I try to find kinship with people over the things we do have in common. Yet rationality, philosophy, truth-seeking, knowledge, and integrity/cooperation feel so core that it’s hard not to feel distant when I reflect on those.
Speaking of epistemic rigor, it feels like intelligence can be disentangled from rationality. I value both; they’re correlated, but can come apart. Lack of rationality feels more offensive than lack of intelligence. But it feels like intelligent people can lack rationality, while it’s much harder for less intelligent people to achieve a good dose of rationality (like parsing a passage of Descartes properly).
Ultimately, I’m not sure what to do. I identify as human, but see myself as atypical and a bit separated. It sounds nice to feel that one is typical, that all around are good and reasonable. I mean, maybe in a patch of the rainforest?
I’m curious what would happen if I gained in charisma and social skill such that I was completely at ease around people at large, rather than feeling like I’m translating and adapting. That could be shaping feelings here too.
All in all, good post, kudos!
My guess is that not having long-term full ownership means that actual work on these projects goes less deep, e.g., when one is assigned for a few weeks or months to typically narrower tasks.
“Stuck between a rock and a hard place” is an English expression for being stuck between two difficult options, so just playing on that.
Curated. Questions of AI and consciousness are interesting, if not important. Unfortunately, I’ve been inoculated against thinking about the topic due to LessWrong receiving a steady stream of low-quality/AI-slop submissions from new users who claim to have awoken an AI, caused it to become a fractal conscious quantum entity with which they are in symbiosis, and so on. So I’m grateful to this post for engaging with the topic on reasonable terms.
Things I found interesting are the functional vs phenomenal angle, and that [paraphrased] we’ve got forces pushing in opposite directions re self-reports of AI consciousness: (a) for AIs to simulate human reports, (b) active training/suppression against AIs reporting consciousness. It makes for a hard scientific/philosophical problem.
Among other tricky problems, perhaps not as tricky (I don’t know, maybe more), is how to have good discussions of a topic that seems to unhinge so many. Yet maybe we can manage it here :) Kudos, thanks Kaj.
Curated.
I’m 35, born in 1990. The world felt pretty sensible for most of my lifetime, definitely up through 2010, maybe even 2015. Broken, sure, but there was a normality to it. The present era is disorienting. I’ve kind of imagined that the disorientation is because I’m anchored on how life was for me growing up, that that was normal for me, and that someone growing up in this tumultuous era[1] would find it normal for them and not so unsettling.
This post makes me think otherwise. What I’m reading is that perhaps the disorientation comes from the pace of change and the resultant uncertainty, and that having only known uncertainty and rapid change doesn’t make it easier to maintain footing. If anything, and this is a separate thought I’ve had and not connected before, I have felt glad I’m not 20 (perhaps wrongly[2], but still). I had a chance to find some footing and stability in life before things got so mad.
From the reactions (and karma here), sounds like this resonates, and across ages, a lot of us are feeling something here. How to live in times such as this feels important (see my attempt in A Slow Guide to Confronting Doom), and though this post is not an answer, I like seeing the challenge raised again, especially so evocatively and concretely.
speaking to the concept of deltas between views
I find that in reading this I end up with a better understanding of Paul and John (to the extent John summarized Paul well, but it feels right). Double-crux feels like a large ask, given that the hunt for a mutually shared counterfactual change is just a lot to identify; an ideological Turing test means someone putting aside who they are and what they think too much; but “deltas” feel like a nice alternative that’s not as complicated to compute and doesn’t lose one’s reference to what they think.
I’d be pretty into everyone for six months being gung-ho on deltas and trying to identify and flesh them out. It’s a cool approach. For that matter, I’d be excited for someone to try and flesh out a methodology for eliciting deltas akin to the effort Eli Tyre once put into double-crux, as an adjacent approach.
So cool stuff, would like to see more like this.
Not just you; it was superseded by the Following feed, an option you can select when scrolling down to the Feed on the frontpage.