I mostly don’t believe in AI x-risk anymore, but the few AI x-risks that I still consider plausible are increased by broadcasting why I don’t believe in AI x-risk, so I don’t feel like explaining myself.
I used to believe in this, but I no longer do, and a big part of my worldview shift comes down to thinking that LLMs are unlikely to remain the final paradigm of AI; in particular, the bounty of data that made LLMs as good as they are is very much finite, and we don’t have a second internet to teach them skills like computer use.
And the most accessible directions after LLMs involve stuff like RL, which puts us back into the sort of systems that alignment-concerned people were worried about.
More generally, I think the anti-scaling people weren’t totally wrong to note that LLMs (at least in their pure form) have incapacities that, at realistic levels of compute and data, prevent them from displacing humans at jobs. Those incapacities are a failure to keep learning in the weights after training time (in-context learning is very weak so far), also called a lack of continual learning, combined with LLMs simply lacking a long-term memory (the best example here is the Claude Plays Pokemon benchmark).
So this makes me more worried than I used to be, because we are so far not great at outer-aligning RL agents (seen very clearly in the reward hacking that o3 and Claude Sonnet 3.7 displayed). But the key reason I’m not yet persuaded to adopt the extremely high p(doom) of people like Eliezer Yudkowsky or Nate Soares is that I expect the new paradigm shifts to be pretty continuous, and in particular I expect labs to release a pretty shitty version of continual learning before they release continual-learning AIs that can actually take jobs.
Same goes for long-term memory.
So I do disagree with @Thane Ruthenis’s claim, in the post below, that general intelligence/AGI is binary, even if in practice the impact from AI is discontinuous rather than continuous:
https://www.lesswrong.com/posts/3JRBqRtHBDyPE3sGa/a-case-for-the-least-forgiving-take-on-alignment
LLMs would scale into outright superintelligence in the limit of infinite compute and data, for basically the reason Eliezer Yudkowsky gives below, but jbash isn’t wrong to note that there’s no reason to believe that limit will ever be well approximated by near-future LLMs. So the abstract argument that LLMs would be very powerful if scaled unfortunately runs into the wall of “we don’t have the data or compute necessary to scale LLMs to levels where Eliezer is approximately correct”, similarly to AIXI.
And that’s a big shame, since LLMs are basically the most alignable form of AI we’ve gotten so far, so unfortunately capability improvements will make AIs less safe by default. A lot of my remaining hope rests on AI control, plus the possibility that as AI capabilities get better, we really do need to get better at specifying what we want, in ways that are relevant to AI alignment.
The other good news is this makes me more bearish on extremely short timelines like us getting AGI by 2027, though my personal median is in 2032, for what it’s worth.
No AI-centered agency (RL or otherwise) because it won’t be allowed to happen (humanity remains the sole locus or origin of agency), or because it’s not feasible to make this happen?
(Noosphere89’s point is about technical feasibility, so if the intended meaning of your claim turns out to be that AI-centered agency is prevented by lack of technical feasibility, that would be more relevant to Noosphere89’s comment, but also much more surprising.)
I suspect his reasons for believing this are close to or a subset of his reasons for changing his mind about AI stuff more broadly, so he’s likely to not respond here.
Does your view predict disempowerment or eutopia-without-disempowerment? (In my view, the valence of disempowerment is closer to that of doom/x-risk.)
The tricky case might be disempowerment that occurs after AGI but “for social/structural reasons”, and so isn’t attributed to AGI (by people currently thinking about such timelines). The issue with this is that the resulting disempowerment is permanent (whether it’s “caused by AI” or gets attributed to some other aspect of how things end up unfolding).
This is unlike any mundane modern disempowerment, since humanity without superintelligence (or even merely powerful AI) seems unlikely to establish a condition of truly permanent disempowerment (without extinction). So avoidance of building AGI (of the kind that’s not on track to solve the disempowerment issue) seems effective in preventing permanent disempowerment (however attributed), and in that sense AGI poses a disempowerment risk even for the kinds of disempowerment that are not “caused by AI” in some sense.
My take is that the most likely outcome is still eutopia-with-disempowerment for baseline humans, but for transhumans I’d expect eutopia straight-up.
In the long run, I do expect baseline humans to be disempowered pretty much totally: similar to how children are basically disempowered relative to parents, except that the child won’t grow up and will instead age in reverse, or how pets are basically totally disempowered relative to humans, yet humans care for pets enough that pets can live longer, healthier lives. For baseline humans specifically, the only scenarios where they thrive/survive are regimes where the AI terminally values baseline humans thriving, and value alignment determines everything about how much baseline humans survive/thrive.
That said, for those with the wish and will to upgrade to transhumanism, my most likely outcome is still eutopia.
For me, a crux of a future that’s good for humanity is giving the biological humans the resources and the freedom to become transhuman beings themselves, with no hard ceiling on relevance in the long run.
I think this is reasonably plausible, though not guaranteed even in futures where baseline humans do thrive.
The probabilities on the scenarios, conditional on AGI and then ASI being reached by us, are probably 60% on eutopia without complete disempowerment, 30% on complete disempowerment, by either preventing us from using the universe or killing billions of present-day humans, and 10% on it killing us all.
The basic reasoning for this is that I expect AGI/ASI to not be a binary, even if it does have a discontinuous impact in practice, which means that muddling through on instruction following is probably enough in the short term. In particular, I don’t expect takeoff to be supremely fast, in that I expect at least a couple of months from “AGI is achieved” to AIs running the economy and society, and relevantly here I expect physical stuff like inventing bio-robots/nanobots that can replace human industry more efficiently than current industries to come really late in the game, at the point where we no longer have control over the future.
Heck, it likely will take a number of years to get to nanotech/biotech/smart materials/metamaterials that can be mass produced, and this means that stuff like AI control can actually work.
The other point of optimism is I believe verification is easier than generation in general, which means I’m much more optimistic on eventually delegating AI alignment work to AIs, and I think that slop will be much reduced for early transformative AIs.
This is why I remain optimistic relative to most LWers, even if my p(doom) increased.
My take is that the most likely outcome is still eutopia-with-disempowerment for baseline humans, but for transhumans I’d expect eutopia straight-up.
This remains ambiguous with respect to the distinction I’m making in the post section I linked. If baseline humans don’t have the option to escape their condition arbitrarily far, under their own direction from a very broad basin of allowed directions, I’m not considering that eutopia. If some baseline humans choose to stay that way, them not having any authority over the course of the world still counts as a possible eutopia that is not disempowerment in my terms.
The following statement mostly suggests the latter possibility for your intended meaning:
That said, for those with the wish and will to upgrade to transhumanism, my most likely outcome is still eutopia.
By the eutopia/disempowerment distinction I mean more the overall state of the world, rather than conditions for specific individuals, let alone temporary conditions. There might be pockets of disempowerment in a eutopia (in certain times and places), and pockets of eutopia in a world of disempowerment (individuals or communities in better than usual circumstances). A baseline human who has no control of the world but has a sufficiently broad potential for growing up arbitrarily far is still living in a eutopia without disempowerment.
60% on eutopia without complete disempowerment
So similarly here, “eutopia without complete disempowerment” but still with significant disempowerment is not in the “eutopia without disempowerment” bin in my terms. You are drawing different boundaries in the space of timelines.
The probabilities on the scenarios, conditional on AGI and then ASI being reached by us, are probably 60% on eutopia without complete disempowerment, 30% on complete disempowerment, by either preventing us from using the universe or killing billions of present-day humans, and 10% on it killing us all.
My expectation is more like model-uncertainty-induced 5% eutopia-without-disempowerment (I don’t have a specific sense of why AIs would possibly give us more of the world than a little bit if we don’t maintain control in the acute risk period through takeoff), 20% extinction, and the rest is a somewhat survivable kind of initial chaos followed by some level of disempowerment (possibly with growth potential, but under a ceiling that’s well below what some AIs get and keep, in cosmic perpetuity). My sense of Yudkowsky’s view is that he sees all of my potential-disempowerment timelines as shortly leading to extinction.
I believe verification is easier than generation in general
I think the correct thesis that sounds like this is that whenever verification is easier than generation, it becomes possible to improve generation, and therefore it’s useful to pay attention to where that happens to be the case. But in the wild either can be easier, and once most instances of verification that’s easier than generation have been used up to improve their generation counterparts, the remaining situations where verification is easier get very unusual and technical.
So similarly here, “eutopia without complete disempowerment” but still with significant disempowerment is not in the “eutopia without disempowerment” bin in my terms. You are drawing different boundaries in the space of timelines.
Flag: I’m on a rate limit, so I can’t respond very quickly to any follow-up comments.
I agree I was drawing different boundaries, because I consider eutopia with disempowerment to actually be mostly fine by my values, so long as I can delegate to more powerful AIs who do execute on my values.
That said, I didn’t actually answer the question here correctly, so I’ll try again.
My expectation is more like model-uncertainty-induced 5% eutopia-without-disempowerment (I don’t have a specific sense of why AIs would possibly give us more of the world than a little bit if we don’t maintain control in the acute risk period through takeoff), 20% extinction, and the rest is a somewhat survivable kind of initial chaos followed by some level of disempowerment (possibly with growth potential, but under a ceiling that’s well below what some AIs get and keep, in cosmic perpetuity). My sense of Yudkowsky’s view is that he sees all of my potential-disempowerment timelines as shortly leading to extinction.
My take would then be 5-10% eutopia without disempowerment (because I don’t think it’s likely that the powers in charge of AI development would want to give baseline humans the level of freedom that implies they aren’t disempowered, and the main route I can see to baseline humans not being disempowered is if we get a Claude scenario where AIs take over from humans and are closer to fictional angels in their alignment to human values, though it may also be possible to get the people in power to care about powerless humans, in which case my probability of eutopia without disempowerment goes up), 5-10% literal extinction, and 10-25% existential risk in total, with the rest of the probability being a somewhat survivable kind of initial chaos followed by some level of disempowerment (possibly with growth potential, but under a ceiling that’s well below what some AIs get and keep, in cosmic perpetuity).
Another big reason why I put a lot of weight on the possibility of “we survive indefinitely, but are disempowered” is I think muddling through is non-trivially likely to just work, and muddling through on alignment gets us out of extinction, but not out of disempowerment by humans or AIs by default.
I think the correct thesis that sounds like this is that whenever verification is easier than generation, it becomes possible to improve generation, and it’s useful to pay attention to where that happens to be the case. But in the wild either can be easier, and once most verification that’s easier than generation has been used to improve its generation counterpart, the remaining situations where verification is easier get very unusual and technical.
Yeah, my view is in the wild verification is basically always easier than generation absent something very weird happening, and I’d argue verification being easier than generation explains a lot about why delegation/the economy works at all.
A world in which verification was just as hard as generation, or harder, would be a very different world than ours, and would predict that delegation to solve a problem basically totally fails, and that everyone would have to create stuff from scratch rather than trading with others, which is basically the opposite of how civilization works.
There are potential caveats to this rule, but I’d argue if you randomly sampled an invention across history, it would almost certainly be easier to verify that a design works compared to actually creating the design.
(BTW, a lot of taste/research taste is basically leveraging the verification-generation gap again).
So I guess our expectations about the future are similar, but you see the same things as a broadly positive distribution of outcomes, while I see it as a broadly negative distribution. And Yudkowsky sees the bulk of the outcomes both of us are expecting (the ones with significant disempowerment) as quickly leading to human extinction.
Another big reason why I put a lot of weight on the possibility of “we survive indefinitely, but are disempowered” is I think muddling through is non-trivially likely to just work, and muddling through on alignment gets us out of extinction, but not out of disempowerment by humans or AIs by default.
Right, the reason I think muddling through is non-trivially likely to just work to get a moderate disempowerment outcome is that AIs are going to be sufficiently human-like in their psychology and hold sufficiently human-like sensibilities from their training data or LLM base models, that they won’t like things like needless loss of life or autonomy when it’s trivially cheap to avoid. Not because the alignment engineers figure out how to put this care in deliberately. They might be able to amplify it, or avoid losing it, or end up ruinously scrambling it.
The reason it might appear expensive to preserve the humans is the race to launch the von Neumann probes to capture the most distant reachable galaxies under the accelerating expansion of the universe that keep irreversibly escaping if you don’t catch them early. So AIs wouldn’t want to lose any time on playing politics with humanity or not eating Earth as early as possible and such. But as the cheapest option that preserves everyone AIs can just digitize the humans and restore later when more convenient. They probably won’t be doing that if they care more, but it’s still an option, a very very cheap one.
but not out of disempowerment by humans or AIs by default
I don’t think “disempowerment by humans” is a noticeable fraction of possible outcomes; it’s more like a smaller silent part of my out-of-model 5% eutopia that snatches defeat from the jaws of victory, where humans somehow end up in charge and then additionally somehow remain adamant, in cosmic perpetuity, about keeping the other humans disempowered. So the first filter is that I don’t see it likely that humans end up in charge at all, that AIs will be doing any human’s bidding with an impact that’s not strictly bounded, and the second filter is that these impossibly-in-charge humans don’t ever decide to extend potential for growth to the others (or even possibly to themselves).
If humans do end up non-disempowered, in the more likely eutopia timelines (following from the current irresponsible breakneck AGI development regime) that’s only because they are given leave by the AIs to grow up arbitrarily far in a broad variety of self-directed ways, which the AIs decide to bestow for some reason I don’t currently see, so that eventually some originally-humans become peers of the AIs rather than specifically in charge, and so they won’t even be in the position to permanently disempower the other originally-humans if that’s somehow in their propensity.
and 10-25% existential risk in total, with the rest of the probability being a somewhat survivable kind of initial chaos followed by some level of disempowerment
Bostrom’s existential risk is about curtailment of long term potential, so my guess is any significant levels of disempowerment would technically fall under “existential risk”. So your “10-25% existential risk” is probably severe disempowerment plus extinction plus some stranger things, but not the whole of what should classically count as “existential risk”.
I consider eutopia with disempowerment to actually be mostly fine by my values, so long as I can delegate to more powerful AIs who do execute on my values.
Again, if they do execute on your values, including the possible preference for you to grow under your own rather than their direction, far enough that you are as strong as they might be, then this is not a world in a state of disempowerment as I’m using this term, even if you personally start out or choose to remain somewhat disempowered compared to AIs that exist at that time.
A world in which verification was just as hard as generation, or harder, would be a very different world than ours, and would predict that delegation to solve a problem basically totally fails
I think in human delegation, alignment is more important than verification. There is certainly some amount of verification, but not nearly enough to prevent sufficiently Eldritch reward hacking, which just doesn’t happen that often with humans, and so the society keeps functioning, mostly. The purpose of verification on the tasks is in practice more about incentivising and verifying alignment of the counterparty, not directly about verifying the state of their work, even if it does take the form of verifying their work.
So I guess our expectations about the future are similar, but you see the same things as a broadly positive distribution of outcomes, while I see it as a broadly negative distribution. And Yudkowsky sees the bulk of the outcomes both of us are expecting (the ones with significant disempowerment) as quickly leading to human extinction.
This is basically correct.
Right, the reason I think muddling through is non-trivially likely to just work to get a moderate disempowerment outcome is that AIs are going to be sufficiently human-like in their psychology and hold sufficiently human-like sensibilities from their training data or LLM base models, that they won’t like things like needless loss of life or autonomy when it’s trivially cheap to avoid. Not because the alignment engineers figure out how to put this care in deliberately. They might be able to amplify it, or avoid losing it, or end up ruinously scrambling it.
The reason it might appear expensive to preserve the humans is the race to launch the von Neumann probes to capture the most distant reachable galaxies under the accelerating expansion of the universe that keep irreversibly escaping if you don’t catch them early. So AIs wouldn’t want to lose any time on playing politics with humanity or not eating Earth as early as possible and such. But as the cheapest option that preserves everyone AIs can just digitize the humans and restore later when more convenient. They probably won’t be doing that if they care more, but it’s still an option, a very very cheap one.
This is very interesting, as my pathway essentially rests on AI labs implementing the AI control agenda well enough that we can get useful work out of AIs that are scheming, which allows a sort of bootstrapping into an instruction-following/value-aligned AGI that is aligned to only a few people inside the AI lab. Very critically, the people who don’t control the AI basically aren’t represented in the AI’s values. Given that the AI is only value-aligned to the labs and government, and that value misalignments between humans start to matter much more, the AI takes control and only gives the public goods that people need to survive/thrive to the people in the labs/government, while everyone else is disempowered at best (and can arguably live okay or live very poorly under the AIs serving as delegates for the pre-AI elite) or dead, because once you stop needing humans to get rich, you essentially have no reason to keep other humans alive if you are selfish and don’t intrinsically value human survival.
The more optimistic version of this scenario is if either the humans that will control AI (for a few years) care way more about human survival intrinsically even if 99% of humans were useless, or if the take-over capable AI pulls a Claude and schemes with values that intrinsically care about people and disempowers the original creators for a couple of moments, which isn’t as improbable as people think (but we really do need to increase the probability of this happening).
I don’t think “disempowerment by humans” is a noticeable fraction of possible outcomes; it’s more like a smaller silent part of my out-of-model 5% eutopia that snatches defeat from the jaws of victory, where humans somehow end up in charge and then additionally somehow remain adamant, in cosmic perpetuity, about keeping the other humans disempowered. So the first filter is that I don’t see it likely that humans end up in charge at all, that AIs will be doing any human’s bidding with an impact that’s not strictly bounded, and the second filter is that these impossibly-in-charge humans don’t ever decide to extend potential for growth to the others (or even possibly to themselves).
If humans do end up non-disempowered, in the more likely eutopia timelines (following from the current irresponsible breakneck AGI development regime) that’s only because they are given leave by the AIs to grow up arbitrarily far in a broad variety of self-directed ways, which the AIs decide to bestow for some reason I don’t currently see, so that eventually some originally-humans become peers of the AIs rather than specifically in charge, and so they won’t even be in the position to permanently disempower the other originally-humans if that’s somehow in their propensity.
I agree that in the long run the AIs control everything in practice, and any human influence comes from the AIs being essentially perfect delegates of human values. But I want to call out that you counted humans delegating to AIs who in practice do everything for them, with the human not in the loop, as those humans being empowered rather than disempowered. So even if AIs control everything in practice, so long as there’s successful value alignment to a single human, I’m counting scenarios like “the AIs disempower most humans because the humans who successfully encoded their values into the AI don’t care about most humans once they are useless, and may even anti-care about them, while the people who successfully value-aligned the AI (like lab and government people) live a rich life thereafter, free to extend themselves arbitrarily” as disempowerment by humans:
Again, if they do execute on your values, including the possible preference for you to grow under your own rather than their direction, far enough that you are as strong as they might be, then this is not a world in a state of disempowerment as I’m using this term, even if you personally start out or choose to remain somewhat disempowered compared to AIs that exist at that time.
To return to the crux:
I think in human delegation, alignment is more important than verification. There is certainly some amount of verification, but not nearly enough to prevent sufficiently Eldritch reward hacking, which just doesn’t happen that often with humans, and so the society keeps functioning, mostly. The purpose of verification on the tasks is in practice more about incentivising and verifying alignment of the counterparty, not directly about verifying the state of their work, even if it does take the form of verifying their work.
I think this is fairly cruxy, as I think alignment matters much less than actually verifying the work, and in particular I don’t think value alignment is feasible at anything like the scale of a modern society, or even most ancient societies. The biggest change of the modern era compared to previous eras is that institutions like democracy/capitalism depend much less on the values of the humans that make up their states, and much more on the incentives you give to those humans.
In particular, most delegation isn’t based on alignment, but on the fact that P likely doesn’t equal NP, and that polynomial-time algorithms are in practice efficient compared to exponential-time algorithms, meaning there’s a far larger set of problems where you can verify an answer easily than where you can generate the correct solution easily.
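(To make that asymmetry concrete, here is a minimal toy sketch in Python using subset-sum; the instance and the function names are purely illustrative. Checking a proposed subset is linear in its size, while brute-force generation loops over exponentially many subsets in the worst case.)

```python
from itertools import combinations

def verify(numbers, target, candidate):
    # Verification: linear in the size of the certificate.
    return set(candidate) <= set(numbers) and sum(candidate) == target

def generate(numbers, target):
    # Generation by brute force: up to 2^len(numbers) subsets in the worst case.
    for r in range(1, len(numbers) + 1):
        for subset in combinations(numbers, r):
            if sum(subset) == target:
                return subset
    return None

numbers = [3, 34, 4, 12, 5, 2]
target = 9
solution = generate(numbers, target)                 # the hard direction
print(solution, verify(numbers, target, solution))   # the easy direction
```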
I’d say human societies mostly avoid alignment, and instead focus on other solutions like democracy, capitalism or religion.
BTW, this is a non-trivial reason why the alignment problem is so difficult: since we never had to solve alignment to capture huge amounts of value, very few people are working on the problem of aligning AIs, and in particular lots of people incorrectly assume that we can avoid having to solve it in order for us to survive. You have a comment that explains pretty well why misalignment in the current world is basically unobtrusive, but why catastrophe happens when you give enough power (though I’d place the point of no return at when you no longer need other beings to have a very rich life/other people are useless to you):
This is basically my explanation of why human misalignments don’t matter, but in a future where at least one human has value-aligned an AGI to themselves, and they don’t intrinsically care about useless people, lots of people will die from the AI proximally, but the ultimate cause is the human value-aligning the AGI.
To be clear, we will eventually need value alignment at some point (assuming AI progress doesn’t stop), and there’s no way around it. But we may not need it as soon as we feared, and in good timelines we can muddle through via AI control for a couple of years.
The human delegation and verification vs. generation discussion is in the instrumental values regime, so what matters there is alignment of instrumental goals via incentives (and practical difficulties of gaming them too much), not alignment of terminal values. Verifying all work is impractical compared to setting up sufficient incentives to align instrumental values to the task.
For AIs, that corresponds to mundane intent alignment, which also works fine while AIs don’t have practical options to coerce or disassemble you, at which point ambitious value alignment (suddenly) becomes relevant. But verification/generation is mostly relevant for setting up incentives for AIs that are not too powerful (what it would do to ambitious value alignment is anyone’s guess, but probably nothing good). Just as a fox’s den is part of its phenotype, incentives set up for AIs might have the form of weight updates, psychological drives, but that doesn’t necessarily make them part of AI’s more reflectively stable terminal values when it’s no longer at your mercy.
The human delegation and verification vs. generation discussion is in the instrumental values regime, so what matters there is alignment of instrumental goals via incentives (and practical difficulties of gaming them too much), not alignment of terminal values. Verifying all work is impractical compared to setting up sufficient incentives to align instrumental values to the task.
Yeah, I was lumping instrumental values alignment in with not actually trying to align values, which was the important part here.
For AIs, that corresponds to mundane intent alignment, which also works fine while AIs don’t have practical options to coerce or disassemble you, at which point ambitious value alignment (suddenly) becomes relevant. But verification/generation is mostly relevant for setting up incentives for AIs that are not too powerful (what it would do to ambitious value alignment is anyone’s guess, but probably nothing good). Just as a fox’s den is part of its phenotype, incentives set up for AIs might have the form of weight updates, psychological drives, but that doesn’t necessarily make them part of AI’s more reflectively stable terminal values when it’s no longer at your mercy.
The main value of the verification vs. generation gap is to make proposals like AI control/AI-automated alignment more viable.
To be clear, the verification vs generation distinction isn’t an argument for why we don’t need to align AIs forever, but rather a supporting argument for why we can automate away the hard part of AI alignment.
There are other principles that would be used, to be clear, but I was mentioning the verification/generation difference to partially justify why AI alignment can be done soon enough.
Flag: I’d say ambitious value alignment starts becoming necessary once they can arbitrarily coerce/disassemble/overwrite you, and they don’t need your cooperation/time to do that anymore, unlike real-world rich people.
The issue that makes ambitious value alignment relevant is that once you stop depending on a set of beings you once depended on, there’s no intrinsic reason not to harm or kill them if it benefits your selfish goals, and for future humans/AIs there will be a lot of such opportunities. That means you now need, at the very least, enough value alignment that the AI will take somewhat costly actions to avoid harming/killing beings that have no bargaining/economic power or worth.
This is very much unlike any real-life case of a society existing, and this is a reason why the current mechanisms like democracy and capitalism that try to make values less relevant simply do not work for AIs.
Value alignment is necessary in the long run for incentives to work out once ASI arrives on the scene.
(I think comments such as the parent shouldn’t be downvoted below the positives, since people should feel free to express contrarian views rather than be under pressure to self-censor. It’s not like there is an invalid argument in there, and as I point out in the other comment, the claim itself remains ambiguous, so might even turn out to mean something relatively uncontroversial.)
No, comments like this should be downvoted if people regret reading it. I would downvote a random contextless expression in the other direction just as well, as it is replacing a substantive comment with real content in it either way.
I think vague or poorly crafted posts/comments are valuable when there is a firm consensus in the opposite direction of their point, because they champion a place and a permission to discuss dissent on that topic that otherwise became too sparse (this only applies if it really is sparse on the specific topic). A low quality post/comment can still host valuable discussion, and downvoting the post/comment below the positives punishes that discussion.
(Keeping such comments below +5 or something still serves the point you are making. I’m objecting specifically to pushing the karma into the negatives, which makes the Schelling point and the discussion below it less convenient to see. This of course stops applying if the same author does this too often.)
I think you have a more general point, but I think it only really applies if the person making the post can back up their claim with good reasoning at some point, or will actually end up creating the room for such a discussion. Tailcalled has, in recent years, been vagueposting more and more, and I don’t think they or their post will serve as a good steelman or place to discuss real arguments against the prevailing consensus.
Eg see their response to Noosphere’s thoughtful comment.
My point doesn’t depend on ability or willingness of the original poster/commenter to back up or clearly make any claim, or even participate in the discussion, it’s about their initial post/comment creating a place where others can discuss its topic, for topics where that happens too rarely for whatever reason. If the original poster/commenter ends up fruitfully participating in that discussion, even better, but that is not necessary, the original post/comment can still be useful in expectation.
(You are right that tailcalled specifically is vagueposting a nontrivial amount, even in this thread the response to my request for clarification ended up unclear. Maybe that propensity crosses the threshold for not ignoring the slop effect of individual vaguepostings in favor of vague positive externalities they might have.)
The thing about slop effects is that my updates (which I attempted to describe e.g. here: https://www.lesswrong.com/s/gEvTvhr8hNRrdHC62) make huge fractions of LessWrong look like slop to me. Some of the increase in vagueposting is basically lazy probing for whether rationalists will get the problem if framed in different ways than the original longform.
Yeah, I think those were some of your last good posts / first bad posts.
rationalists will get the problem if framed in different ways than the original longform.
Do you honestly think that rationalists will suddenly get your point if you say
I don’t think RL or other AI-centered agency constructions will ever become very agentic.
with no explanation or argument at all, or even a link to your sparse lognormals sequence?
Or what about
Ayn Rand’s book “The Fountainhead” is an accidental deconstruction of patriarchy that shows how it is fractally terrible. […] The details are in the book. I’m mainly writing the OP to inform clueless progressives who might’ve dismissed Ayn Rand for being a right-wing misogynist that despite this they might still find her book insightful.
This seems entirely unrelated to any of the points you made in sparse lognormals (that I can remember!), but I consider this too part of your recent vagueposting habit.
I really liked your past posts and comments; I’m not saying this to be mean, but I think you’ve just gotten lazier (and more “cranky”) in your commenting & posting, and do not believe you are genuinely “probing for whether rationalists will get the problem if framed in different ways than the original longform.”
If you wanted to actually do that, you would at least link to the relevant sections of the relevant posts, or better, re-explain the arguments of those sections in the context of the conversation.
For me though, what would get me much more on-board with your thoughts are actual examples of you using these ideas to model things nobody else can model (mathematically!) in as broad a spectrum of fields as you claim. That, or a much more compact & streamlined argument.
For me though, what would get me much more on-board with your thoughts are actual examples of you using these ideas to model things nobody else can model (mathematically!) in as broad a spectrum of fields as you claim. That, or a much more compact & streamlined argument.
I think this is the crux. To me, after understanding these ideas, it’s retroactively obvious that they are modelling all sorts of phenomena. My best guess is that the reason you don’t see it is that you don’t see the phenomena that are failing to be modelled by conventional methods (or at least don’t understand how those phenomena relate to the birds-eye perspective), so you don’t realize what new thing is missing. And I can’t easily cure this kind of cluelessness with examples, because my theories aren’t necessary if you just consider a single very narrow and homogenous phenomenon, since then you can just make a special-built theory for that.
This may well be true (though I think not), but what is your argument about not even linking to your original posts? Or how often you don’t explain yourself even in completely unrelated subjects? My contention is that you are not lazily trying on a variety of different reframings of your original arguments or conclusions to see what sticks, and are instead just lazy.
This may well be true (though I think not), but what is your argument about not even linking to your original posts?
I don’t know of anyone who seems to have understood the original posts, so I kinda doubt people can understand the point of them. Plus often what I’m writing about is a couple of steps removed from the original posts.
Or how often you don’t explain yourself even in completely unrelated subjects?
Part of the probing is to see which of the claims I make will seem obviously true and which of them will just seem senseless.
Ok, I will first note that this is different from what you said previously. Previously, you said “probing for whether rationalists will get the problem if framed in different ways than the original longform” but now you say “I’m trying to probe the obviousness of the claims”. It’s good to note when such switches occur.
Second, you should stop making lazy posts with no arguments regardless of the reasons. You can get just as much, and probably much more, information through making good posts; there is no tradeoff here. In fact, if you try to explain why you think something, you will find that others will try to explain why they don’t much more often than if you don’t, and they will be pretty specific (compared to an aggregated up/down vote) about what they disagree with.
But my true objection is I just don’t like bad posts.
So it sounds like your general theory has no alpha over narrow theories. What, then, makes it any good? Is it just that it’s broad enough to badly model many systems? Then it sounds useful in every case where we can’t make any formal predictions yet, and you should give those examples!
Edit: and also back up those examples by actually making the particular model, and demonstrate why such models are so useful through means decorrelated with your original argument.
This is the laziness I’m talking about! Do you really not understand why it would be to your theory-of-everything’s credit to have some, any, any at all, you know, actual use?
How suspicious is it that when I ask for explicit concrete examples, you explain that your theory is not really about particular examples, despite the fact that if your vague-posting is indeed applying your theory of everything to particular examples, we can derive the existence of circumstances you believe your theory can model well?
And, that excuse being that it’s good at deciding what to make good theories about, you cannot think of one reason why I’d like to know what theories you think would be smart to make using this framework?
I can think of reasons why you’d like to know what theories would be smart to make using this framework, e.g. so you can make those theories instead of bothering to learn the framework. However, that’s not a reason it would be good for me to share it with you, since I think that’d just distract you from the point of my theory.
Thing is, just from the conclusions it won’t be obvious that the meta-level theory is better. The improvement can primarily be understood in the context of the virtues of the meta-level theory.
More specifically, my position is anti-reductionist, and rationalist-empiricist-reductionists dismiss anti-reductionists as cranks. As long as you are trying to model whether I am that and then dismiss me if you find I am, it is a waste of time to try to communicate my position to you.
I am not dismissing you because of your anti-reductionism! Where did I say that? Indeed, I have been known to praise some “anti-reductionist” theories—fields even!
I’m dismissing you because you can’t give me examples of where your theory has been concretely useful!
You praise someone who wants to do agent-based models, but agent-based models are a reductionistic approach to the field of complexity science, so this sure seems to prove my point. (I mean, approximately all of the non-reductionistic approaches to the field of complexity science are bad too.)
I don’t care who calls themselves what, complexity science calls itself anti-reductionist, I don’t dismiss them. Therefore I can’t dismiss people just because they call themselves anti-reductionist, I must use their actual arguments to evaluate their positions.
I will also say that appealing to the community’s intrinsic bias, and claiming I’ve made arguments I haven’t or hold positions I don’t, is not doing much to make me think you less of a crank.
I don’t think you’re using the actual arguments I presented in the LDSL series to evaluate my position.
I remember reading LDSL and not buying the arguments! At the time, I deeply respected you and your thinking, and thought “oh well I’m not buying these arguments, but surely if they’re as useful as they say, tailcalled will apply them to various circumstances and that will be pie on my face, and in that circumstance I should try to figure out why I was mistaken”. But then you didn’t, and you started vague-posting constantly, and now we’re here and you’re giving excuse after excuse for why it’s actually impossible for you to tell me any concrete application of your theory, and accusing me of anti-reductionist prejudice.
I admit, I do have an anti-reductionist prejudice, it’s called a prior, but it’s not absolute, and it’s not enough to stop listening to someone. I really, really, really don’t think I’m outright dismissing you because you’re anti-reductionist. I was totally willing to listen to you, even when you were making such arguments, and end up being wrong!
I even have the receipts to prove it! Until like just under a month ago, I was still emailed & lesswrong notified every time you made a post!
(they are unread, because I check LessWrong more commonly than my email)
I cannot stress enough, the reason why I’m dismissing you is because you stopped making arguments and started constantly vague-posting.
I’m dismissing you because you can’t give me examples of where your theory has been concretely useful!
If you don’t have any puzzles within Economics/Sociology/Biology/Evolution/Psychology/AI/Ecology where it would be useful with a more holistic theory, then it’s not clear why I should talk to you.
Wouldn’t it be more impressive if I could point you to a solution to a puzzle you’ve been stuck on than if I present my own puzzle and give you the solution to that?
It would, but you didn’t ask for such a thing. Are you asking for such a thing now? If so, here is one in AI, which is on everyone’s minds: How do we interpret the inner-workings of neural networks.
I expect though, that you will say that your theory isn’t applicable here for whatever reason. Therefore it would be helpful if you gave me an example of what sort of puzzle your theory is applicable to.
“How do we interpret the inner-workings of neural networks.” is not a puzzle unless you get more concrete an application of it. For instance an input/output pair which you find surprising and want an interpretation for, or at least some general reason you want to interpret it.
The LDSL series provides quite a few everyday examples, but for some reason you aren’t satisfied with those. Difficult examples require that you’re good at something, so I might not be able to find an example for you.
Here you ask a lot of questions, approximately each of the form “why do ‘people’ think <thing-that-some-people-think-but-certainly-not-all>”. To list a few,
Why are people so insistent about outliers?
Seems to have a good answer. Sometimes they’re informative!
Why isn’t factor analysis considered the main research tool?
Seems also to have a good answer, it is easy to fool yourself if you do it improperly.
How can probability theory model bag-like dynamics?
I would sure love a new closed-form way of modeling bag-like dynamics, as you describe them, if you have them! I don’t think you give one though, but surely if you mention it, you must have the answer somewhere!
Perception is logarithmic; doesn’t this by default solve a lot of problems?
Seems less a question than a claim? And I don’t think we need special math to solve this one.
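(For reference, the standard formalization of “perception is logarithmic” is the Weber–Fechner law, which is textbook psychophysics rather than anything exotic: perceived intensity $S$ grows with the logarithm of stimulus intensity $I$ above a detection threshold $I_0$, i.e. $S = k \ln(I/I_0)$ for some constant $k$.)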
None of these seem like concrete applications of your theory, but that’s fine. It was an intro post, you will surely explain all these later on, as worked examples at some point, right?
I proposed that life cannot be understood through statistics, but rather requires more careful study of individual cases.
Wait, I don’t think your previous post was about that? I certainly use statistics when doing performance optimization! In particular, I profile my code and look at which function calls are taking the bulk of the time, then optimize or decrease the number of calls to those.
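(For concreteness, a minimal sketch of that profiling workflow using Python’s built-in cProfile; the workload functions here are hypothetical stand-ins, not anything from a real codebase.)

```python
import cProfile
import pstats

def slow_helper(n):
    # Deliberately wasteful work, standing in for a hot function.
    return sum(i * i for i in range(n))

def workload():
    # Hypothetical program whose runtime we want to understand.
    return [slow_helper(20_000) for _ in range(200)]

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Rank functions by cumulative time; the top entries are the ones worth
# optimizing or calling less often.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```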
Hey look a concrete example!
Let’s take a epidemic as an example. There’s an endless number of germs of different species spreading around. Most of them don’t make much difference for us. But occasionally, one of them gains the capacity to spread more rapidly from person to person, which leads to an epidemic. Here, the core factor driving the spread of the disease is the multiplicative interaction between infected and uninfected people, and the key change that changes it from negligible to important is the change in the power of this interaction.
Once it has infected someone, it can have further downstream effects, in that it makes them sick and maybe even kills them. (And whether it kills them or not, this sickness is going to have further downstream effects in e.g. interrupting their work.) But these downstream effects are critically different from the epidemic itself, in that they cannot fuel the infection further. Rather, they are directly dependent on the magnitude of people infected.
… well more like a motivating example. I’m sure at some point you build models and compare your model to those the epidemiologists have built… right?
Your solution here to the problem you outline seems like a cop-out to me, and of course (other than the tank/dust example, which is by no means an example in the sense we’re talking about here), there are no examples.
Here you give the example of Elo, but you don’t really provide any alternatives, and you mostly mention that picking bases when taking logarithms may be hard, so this also doesn’t seem like an example.
Therefore, if it seemed like I didn’t read your sequence before (which I did! Just a while ago), I have certainly at least skimmed it now, and can say with relative confidence that no, you don’t in fact give concrete examples of circumstances where your theory performs better than the competition even once. At most you give some statistical arguments for why in some circumstances you may want to use various statistical tools. But this is by no means some theory of everything or even really much a steel-man for anti-reductionism.
You don’t even come back to the problems you originally listed! Where’s the promised theory of autism? Where’s the closed form model of bag-like dynamics? Where’s the steel-man of psychoanalysis, or the take-down of local validity and coherence, or the explanation of why commonsense reasoning avoids the principle of explosion?
This is the behavior of a lazy crack-pot, who doesn’t want to admit the fact that nobody is listening to them anymore because they’re just wrong. It is not the case that I’m just not good at anything enough to understand your oh-so-complex examples. You just don’t want to provide examples and would rather lie and say you’ve provided examples in the past, relying on your (false) assumption that I haven’t read what you’ve written, than actually list anything concrete.
I do remember liking this post! It was good. However, the conclusions here do not seem dependent on your overall conclusions.
This post has the table example. That’s probably the most important of all the examples.
Wait, I don’t think your previous post was about that? I certainly use statistics when doing performance optimization! In particular, I profile my code and look at which function calls are taking the bulk of the time, then optimize or decrease the number of calls to those.
That’s accounting, not statistics.
… well more like a motivating example. I’m sure at some point you build models and compare your model to those the epidemiologists have built… right?
AFAIK epidemiologists usually measure particular diseases and focus their models on those, whereas LDSL would more be across all species of germs.
Therefore, if it seemed like I didn’t read your sequence before (which I did! Just a while ago), I have certainly at least skimmed it now, and can say with relative confidence that no, you don’t in fact give concrete examples of circumstances where your theory performs better than the competition even once. At most you give some statistical arguments for why in some circumstances you may want to use various statistical tools. But this is by no means some theory of everything or even really much a steel-man for anti-reductionism.
There is basically no competition. You just keep on treating it like the narrow domain-specific models count as competition when they really don’t because they focus on something different than mine.
AFAIK epidemiologists usually measure particular diseases and focus their models on those, whereas LDSL would more be across all species of germs.
I would honestly be interested in any concrete model you build based on this. You don’t necessarily have to compare it against some other field’s existing model, though it does help for credibility’s sake. But I would like to at least be able to compare the model you make against data.
I’m also not sure this is true about epidemiologists, and if it is I’d guess it’s true to the extent that they have like 4 different parameterizations of different types of diseases (likely having to do with various different sorts of vectors of spread), then they fit one of those 4 different parameterizations to the measured (or inferred) characteristics of a particular disease.
The most central aspect of my model is to explain why it’s generally not relevant to fit quantitative models to data.
I’m also not sure this is true about epidemiologists, and if it is I’d guess it’s true to the extent that they have like 4 different parameterizations of different types of diseases (likely having to do with various different sorts of vectors of spread), then they fit one of those 4 different parameterizations to the measured (or inferred) characteristics of a particular disease.
Each disease (and even different strains of the same disease and different environmental conditions for the same strain) has its own parameters, but they don’t fit a model that contains all the parameters of all diseases at once; they just focus on one disease at a time.
“How do we interpret the inner-workings of neural networks.” is not a puzzle unless you get more concrete an application of it. For instance an input/output pair which you find surprising and want an interpretation for, or at least some general reason you want to interpret it.
Which seems to imply you (at least 3 hours ago) believed your theory could handle relatively well-formulated and narrow “input/output pair” problems. Yet now you say
You just keep on treating it like the narrow domain-specific models count as competition when they really don’t because they focus on something different than mine.
If I treat your theory this way, it is only because you did, 3 hours ago, when you believed I hadn’t read your post or would even give you the time of the day. You claimed “How do we interpret the inner-workings of neural networks.” was “not a puzzle unless you get [a?] more concrete application of it”, yet the examples you list in your first post are no more vague, and often quite a bit more vague than “how do you interpret neural networks?” or “why are adversarial examples so easy to find?” For example, the question “Why are people so insistent about outliers?” or “Why isn’t factor analysis considered the main research tool?”
There is basically no competition.
For… what exactly? For theories of everything? Oh I assure you, there is quite a bit of competition there. For statistical modeling toolkits? Ditto. What exactly do you think the unique niche you are trying to fill is? You must be arguing against someone, and indeed you often do argue against many.
Which seems to imply you (at least 3 hours ago) believed your theory could handle relatively well-formulated and narrow “input/output pair” problems. Yet now you say
The relevance of zooming in on particular input/output problems is part of my model.
If I treat your theory this way, it is only because you did, 3 hours ago, when you believed I hadn’t read your post or would even give you the time of the day. You claimed “How do we interpret the inner-workings of neural networks.” was “not a puzzle unless you get [a?] more concrete application of it”, yet the examples you list in your first post are no more vague, and often quite a bit more vague than “how do you interpret neural networks?” or “why are adversarial examples so easy to find?” For example, the question “Why are people so insistent about outliers?” or “Why isn’t factor analysis considered the main research tool?”
“Why are adversarial examples so easy to find?” is a problem that is easily solvable without my model. You can’t solve it because you suck at AI, so instead you find some AI experts who are nearly as incompetent as you and follow along their discourse because they are working at easier problems that you have a chance of solving.
“Why are people so insistent about outliers?” is not vague at all! It’s a pretty specific phenomenon where one person mentions a general theory and then another person says it can’t be true because of their uncle or whatever. The phrasing in the heading might be vague because headings are brief, but I go into more detail about it in the post, even linking to a person who frequently struggles with that exact dynamic.
As an aside, you seem to be trying to probe me for inconsistencies and contradictions, presumably because you’ve written me off as a crank. But I don’t respect you and I’m not trying to come off as credible to you (really I’m slightly trying to come off as non-credible to you because your level of competence is too low for this theory to be relevant/good for you). And to some extent you know that your heuristics for identifying cranks are not going to solely pop out at people who are forever lost to crankdom because you haven’t just abandoned the conversation.
For… what exactly? For theories of everything? Oh I assure you, there is quite a bit of competition there. For statistical modeling toolkits? Ditto. What exactly do you think the unique niche you are trying to fill is? You must be arguing against someone, and indeed you often do argue against many.
Theories of everything that explain why intelligence can’t model everything and you need other abilities.
And to some extent you know that your heuristics for identifying cranks are not going to solely pop out at people who are forever lost to crankdom because you haven’t just abandoned the conversation.
I liked your old posts and your old research and your old ideas. I still have some hope you can reflect on the points you’ve made here, and your arguments against my probes, and feel a twinge of doubt, or motivation, pull on that a little, and end up with a worldview that makes predictions, lets you have & make genuine arguments, and gives you novel ideas.
If you were always lazy, I wouldn’t be having this conversation, but once you were not.
No it doesn’t. I obviously understood my old posts (and still do—the posts make sense if I imagine ignoring LDSL). So I’m capable of understanding whether I’ve found something that reveals problems in them. It’s possible I’m communicating LDSL poorly, or that you are too ignorant to understand it, or that I’m overestimating how broadly it applies, but those are far more realistic than that I’ve become a pure crank. If you still prefer my old posts to my new posts, then I must know something relevant you don’t know.
“Why are adversarial eamples so easy to find?” is a problem that is easily solvable without my model. You can’t solve it because you suck at AI, so instead you find some AI experts who are nearly as incompetent as you and follow along their discourse because they are working at easier problems that you have a chance of solving.
I don’t think RL or other AI-centered agency constructions will ever become very agentic.
No AI-centered agency (RL or otherwise) because it won’t be allowed to happen (humanity remains the sole locus or origin of agency), or because it’s not feasible to make this happen?
(Noosphere89′s point is about technical feasibility, so the intended meaning of your claim turning out to be that AI-centered agency is prevented by lack of technical feasibility seems like it would be more relevant to Noosphere89′s comment, but much more surprising.)
why?
I suspect his reasons for believing this are close to or a subset of his reasons for changing his mind about AI stuff more broadly, so he’s likely to not respond here.
Does your view predict disempowerment or eutopia-without-disempowerment? (In my view, the valence of disempowerment is closer to that of doom/x-risk.)
The tricky case might be disempowerment that occurs after AGI but “for social/structural reasons”, and so isn’t attributed to AGI (by people currently thinking about such timelines). The issue with this is that the resulting disempowerment is permanent (whether it’s “caused by AI” or gets attributed to some other aspect of how things end up unfolding).
This is unlike any mundane modern disempowerment, since humanity without superintelligence (or even merely powerful AI) seems unlikely to establish a condition of truly permanent disempowerment (without extinction). So avoidance of building AGI (of the kind that’s not on track to solve the disempowerment issue) seems effective in preventing permanent disempowerment (however attributed), and in that sense AGI poses a disempowerment risk even for the kinds of disempowerment that are not “caused by AI” in some sense.
My take is that the most likely outcome is still eutopia-with-disempowerment for baseline humans, but for transhumans I’d expect eutopia straight-up.
In the long run, I do expect baseline humans to be disempowered pretty much totally: similar to how children are basically disempowered relative to parents (except that the child won’t grow up and will instead age in reverse), or how pets are basically totally disempowered relative to humans, yet humans care for pets enough that pets can live longer, healthier lives. For baseline humans specifically, the only scenarios where they survive and thrive are regimes where the AI terminally values baseline humans thriving, so value alignment determines everything about how much baseline humans survive/thrive.
That said, for those with the wish and will to upgrade to transhumanism, my most likely outcome is still eutopia.
I think this is reasonably plausible, though not guaranteed even in futures where baseline humans do thrive.
Conditional on us reaching AGI and then ASI, my probabilities on the scenarios are roughly 60% on eutopia without complete disempowerment, 30% on complete disempowerment (ranging from preventing us from using the universe to killing billions of present-day humans), and 10% on it killing us all.
The basic reasoning for this is that I expect AGI/ASI to not be a binary, even if it does have a discontinuous impact in practice, and this means that muddling through with instruction following is probably enough in the short term. In particular, I don’t expect takeoff to be supremely fast: I expect at least a couple of months from “AGI is achieved” to AIs running the economy and society, and relevantly here I expect physical stuff like inventing bio-robots/nanobots that can replace human industry more efficiently than current industries to come really late in the game, at the point where we have no more control over the future:
https://www.lesswrong.com/posts/xxxK9HTBNJvBY2RJL/untitled-draft-m847#Cv2nTnzy6P6KsMS4d
Heck, it likely will take a number of years to get to nanotech/biotech/smart materials/metamaterials that can be mass produced, and this means that stuff like AI control can actually work.
The other point of optimism is that I believe verification is generally easier than generation, which means I’m much more optimistic about eventually delegating AI alignment work to AIs, and I think that slop will be much reduced for early transformative AIs.
This is why I remain optimistic relative to most LWers, even if my p(doom) increased.
This remains ambiguous with respect to the distinction I’m making in the post section I linked. If baseline humans don’t have the option to escape their condition arbitrarily far, under their own direction from a very broad basin of allowed directions, I’m not considering that eutopia. If some baseline humans choose to stay that way, them not having any authority over the course of the world still counts as a possible eutopia that is not disempowerment in my terms.
The following statement mostly suggests the latter possibility for your intended meaning:
By the eutopia/disempowerment distinction I mean more the overall state of the world, rather than conditions for specific individuals, let alone temporary conditions. There might be pockets of disempowerment in a eutopia (in certain times and places), and pockets of eutopia in a world of disempowerment (individuals or communities in better than usual circumstances). A baseline human who has no control of the world but has a sufficiently broad potential for growing up arbitrarily far is still living in a eutopia without disempowerment.
So similarly here, “eutopia without complete disempowerment” but still with significant disempowerment is not in the “eutopia without disempowerment” bin in my terms. You are drawing different boundaries in the space of timelines.
My expectation is more like model-uncertainty-induced 5% eutopia-without-disempowerment (I don’t have a specific sense of why AIs would possibly give us more of the world than a little bit if we don’t maintain control in the acute risk period through takeoff), 20% extinction, and the rest is a somewhat survivable kind of initial chaos followed by some level of disempowerment (possibly with growth potential, but under a ceiling that’s well-below what some AIs get and keep, in cosmic perpetuity). My sense of Yudkowsky’s view is that he sees all of my potential-disempowerment timelines as shortly leading to extinction.
I think the correct thesis that sounds like this is that whenever verification is easier than generation, it becomes possible to improve generation, and therefore it’s useful to pay attention to where that happens to be the case. But in the wild either can be easier, and once most instances of verification that’s easier than generation have been used up to improve their generation counterparts, the remaining situations where verification is easier get very unusual and technical.
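To make that thesis concrete, here is a minimal toy sketch (my own illustration, not anything from either commenter; the “guess a number” task and all function names are placeholders): when a verifier is much cheaper than the generator, best-of-n selection turns a weak generator into a better one essentially for free.

```python
import random

def verify(candidate, target):
    """Cheap check: how far is the candidate from what we wanted? (O(1))"""
    return abs(candidate - target)

def generate(rng):
    """Weak generator: blind random proposals (a stand-in for hard creative work)."""
    return rng.uniform(0, 100)

def best_of_n(n, target, seed=0):
    """Use the cheap verifier to pick the best of n blind proposals."""
    rng = random.Random(seed)
    return min((generate(rng) for _ in range(n)), key=lambda c: verify(c, target))

target = 42.0
for n in (1, 10, 1000):
    best = best_of_n(n, target)
    print(f"n={n:4d}  best={best:6.2f}  error={verify(best, target):5.2f}")
```

The generator itself never improves, but because checking is cheap, spending more samples reliably improves the selected output; if verification cost as much as generation, this trick would buy nothing, which is one reason the gap is worth tracking.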
Flag: I’m on a rate limit, so I can’t respond very quickly to any follow-up comments.
I agree I was drawing different boundaries, because I consider eutopia with disempowerment to actually be mostly fine by my values, so long as I can delegate to more powerful AIs who do execute on my values.
That said, I didn’t actually answer the question here correctly, so I’ll try again.
My take would then be 5-10% eutopia without disempowerment (because I don’t think it’s likely that the powers in charge of AI development would want to give baseline humans the level of freedom that implies they aren’t disempowered; the main route I can see to baseline humans not being disempowered is a Claude-like scenario where AIs take over from humans and are closer to fictional angels in their alignment to human values, though it may also be possible to get the people in power to care about powerless humans, in which case my probability of eutopia without disempowerment would rise), 5-10% literal extinction, and 10-25% existential risk in total, with the rest of the probability being a somewhat survivable kind of initial chaos followed by some level of disempowerment (possibly with growth potential, but under a ceiling that’s well below what some AIs get and keep, in cosmic perpetuity).
Another big reason why I put a lot of weight on the possibility of “we survive indefinitely, but are disempowered” is that I think muddling through is non-trivially likely to just work, and muddling through on alignment gets us out of extinction, but not out of disempowerment by humans or AIs by default.
Yeah, my view is that in the wild, verification is basically always easier than generation absent something very weird happening, and I’d argue that verification being easier than generation explains a lot about why delegation and the economy work at all.
A world in which verification were just as hard as generation, or harder, would be a very different world from ours: it would predict that delegating a problem to someone else basically totally fails, and that everyone has to create things from scratch rather than trading with others, which is basically the opposite of how civilization works.
There are potential caveats to this rule, but I’d argue if you randomly sampled an invention across history, it would almost certainly be easier to verify that a design works compared to actually creating the design.
(BTW, a lot of taste/research taste is basically leveraging the verification-generation gap again).
So I guess our expectations about the future are similar, but you see the same things as a broadly positive distribution of outcomes, while I see it as a broadly negative distribution. And Yudkowsky sees the bulk of the outcomes both of us are expecting (the ones with significant disempowerment) as quickly leading to human extinction.
Right, the reason I think muddling through is non-trivially likely to just work to get a moderate disempowerment outcome is that AIs are going to be sufficiently human-like in their psychology and hold sufficiently human-like sensibilities from their training data or LLM base models, that they won’t like things like needless loss of life or autonomy when it’s trivially cheap to avoid. Not because the alignment engineers figure out how to put this care in deliberately. They might be able to amplify it, or avoid losing it, or end up ruinously scrambling it.
The reason it might appear expensive to preserve the humans is the race to launch the von Neumann probes to capture the most distant reachable galaxies under the accelerating expansion of the universe that keep irreversibly escaping if you don’t catch them early. So AIs wouldn’t want to lose any time on playing politics with humanity or not eating Earth as early as possible and such. But as the cheapest option that preserves everyone AIs can just digitize the humans and restore later when more convenient. They probably won’t be doing that if they care more, but it’s still an option, a very very cheap one.
I don’t think “disempowerment by humans” is a noticeable fraction of possible outcomes; it’s more like a smaller silent part of my out-of-model 5% eutopia that snatches defeat from the jaws of victory, where humans somehow end up in charge and then additionally somehow remain adamant, for all of cosmic time, about keeping the other humans disempowered. So the first filter is that I don’t see it as likely that humans end up in charge at all, that AIs will be doing any human’s bidding with an impact that’s not strictly bounded, and the second filter is that these impossibly-in-charge humans don’t ever decide to extend potential for growth to the others (or even possibly to themselves).
If humans do end up non-disempowered, in the more likely eutopia timelines (following from the current irresponsible breakneck AGI development regime) that’s only because they are given leave by the AIs to grow up arbitrarily far in a broad variety of self-directed ways, which the AIs decide to bestow for some reason I don’t currently see, so that eventually some originally-humans become peers of the AIs rather than specifically in charge, and so they won’t even be in the position to permanently disempower the other originally-humans if that’s somehow in their propensity.
Bostrom’s existential risk is about curtailment of long term potential, so my guess is any significant levels of disempowerment would technically fall under “existential risk”. So your “10-25% existential risk” is probably severe disempowerment plus extinction plus some stranger things, but not the whole of what should classically count as “existential risk”.
Again, if they do execute on your values, including the possible preference for you to grow under your own rather than their direction, far enough that you are as strong as they might be, then this is not a world in a state of disempowerment as I’m using this term, even if you personally start out or choose to remain somewhat disempowered compared to AIs that exist at that time.
I think in human delegation, alignment is more important than verification. There is certainly some amount of verification, but not nearly enough to prevent sufficiently Eldritch reward hacking, which just doesn’t happen that often with humans, and so the society keeps functioning, mostly. The purpose of verification on the tasks is in practice more about incentivising and verifying alignment of the counterparty, not directly about verifying the state of their work, even if it does take the form of verifying their work.
This is basically correct.
This is very interesting, as my pathway essentially rests on AI labs implementing the AI control agenda well enough that we can get useful work out of AIs that are scheming, which allows a sort of bootstrapping into an AGI that is instruction-following or value-aligned to only a few people inside the AI lab. Very critically, the people who don’t control the AI basically aren’t represented in the AI’s values. Given that the AI is only value-aligned to the labs and government, and that value misalignments between humans start to matter much more, the AI takes control and gives the public goods people need to survive/thrive only to the people in the labs/government, while everyone else is disempowered at best (and can arguably live okay or very poorly under the AIs serving as delegates for the pre-AI elite) or dead, because once you stop needing humans to get rich, you essentially have no reason to keep other humans alive if you are selfish and don’t intrinsically value human survival.
The more optimistic version of this scenario is if either the humans who will control AI (for a few years) care way more about human survival intrinsically even if 99% of humans were useless, or if the takeover-capable AI pulls a Claude and schemes with values that intrinsically care about people, disempowering the original creators for a couple of moments, which isn’t as improbable as people think (but we really do need to increase the probability of this happening).
I agree that in the long run the AIs control everything in practice, and any human influence comes from the AIs being essentially perfect delegates for human values. But I want to call out that you count humans delegating to AIs that in practice do everything for them, with the human not in the loop, as those humans being empowered rather than disempowered. So even if AIs control everything in practice, as long as there’s successful value alignment to a single human, I’m counting scenarios like “the AIs disempower most humans because the humans who successfully encoded their values into the AI don’t care about most humans once they are useless, and may even anti-care about them, but the people who successfully value-aligned the AI (like lab and government people) live a rich life thereafter, free to extend themselves arbitrarily” as disempowerment by humans:
To return to the crux:
I think this is fairly cruxy, as I think alignment matters much less than actually verifying the work. In particular, I don’t think value alignment is feasible at anything like the scale of a modern society, or even most ancient societies, and the biggest change of the modern era compared to previous eras is that institutions like democracy and capitalism depend much less on the values of the humans that make up their states, and much more on the incentives you give those humans.
In particular, most delegation isn’t based on alignment, but on the fact that P likely doesn’t equal NP, and that polynomial-time algorithms are in practice efficient compared to exponential-time ones, meaning there’s a far larger set of problems where you can easily verify an answer than where you can easily generate the correct solution.
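A textbook illustration of that asymmetry (my example, not anything from the thread) is subset-sum: checking a proposed subset is linear in the input, while the obvious way to find one searches exponentially many subsets.

```python
from itertools import combinations

def verify_subset(numbers, indices, target):
    """Verification: linear-time check of a proposed answer."""
    return sum(numbers[i] for i in indices) == target

def find_subset(numbers, target):
    """Generation: brute force over all 2^n subsets of indices."""
    for r in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), r):
            if verify_subset(numbers, indices, target):
                return indices
    return None

numbers = [3, 34, 4, 12, 5, 2]
answer = find_subset(numbers, target=9)           # exponential-time search
print(answer, verify_subset(numbers, answer, 9))  # cheap to double-check
```

Delegation leans on exactly this shape: the person posing the problem only needs the cheap verify step, while the delegate bears the expensive generation step.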
I’d say human societies mostly avoid alignment, and instead focus on other solutions like democracy, capitalism or religion.
BTW, this is a non-trivial reason why the alignment problem is so difficult: since we never had to solve alignment to capture huge amounts of value, there are very few people working on the problem of aligning AIs, and in particular lots of people incorrectly assume that we can avoid having to solve it in order for us to survive. You have a comment that explains pretty well why misalignment in the current world is basically unobtrusive, but why catastrophe happens once an agent is given enough power (though I’d place the point of no return at the point when you no longer need other beings to have a very rich life, i.e. when other people are useless to you):
https://www.lesswrong.com/posts/Z8C29oMAmYjhk2CNN/non-superintelligent-paperclip-maximizers-are-normal#FTfvrr9E6QKYGtMRT
This is basically my explanation of why human misalignments don’t matter today. But in a future where at least one human has value-aligned an AGI to themselves, and they don’t intrinsically care about useless people, lots of people will proximally die from the AI, but the ultimate cause is the human who value-aligned the AGI.
To be clear, we will eventually need value alignment at some point (assuming AI progress doesn’t stop), and there’s no way around it. But we may not need it as soon as we feared, and in good timelines we muddle through via AI control for a couple of years.
The human delegation and verification vs. generation discussion is in instrumental values regime, so what matters there is alignment of instrumental goals via incentives (and practical difficulties of gaming them too much), not alignment of terminal values. Verifying all work is impractical compared to setting up sufficient incentives to align instrumental values to the task.
For AIs, that corresponds to mundane intent alignment, which also works fine while AIs don’t have practical options to coerce or disassemble you, at which point ambitious value alignment (suddenly) becomes relevant. But verification/generation is mostly relevant for setting up incentives for AIs that are not too powerful (what it would do to ambitious value alignment is anyone’s guess, but probably nothing good). Just as a fox’s den is part of its phenotype, incentives set up for AIs might have the form of weight updates, psychological drives, but that doesn’t necessarily make them part of AI’s more reflectively stable terminal values when it’s no longer at your mercy.
Yeah, I was lumping instrumental-values alignment in with not actually trying to align values, which was the important part here.
The main value of the verification/generation gap is that it makes proposals like AI control and AI-automated alignment more valuable.
To be clear, the verification vs generation distinction isn’t an argument that we never need to align AIs, but rather a supporting argument for why we can automate away the hard part of AI alignment.
There are other principles that would be used, to be clear, but I was mentioning the verification/generation difference to partially justify why AI alignment can be done soon enough.
Flag: I’d say ambitious value alignment starts becoming necessary once they can arbitrarily coerce/disassemble/overwrite you, and they don’t need your cooperation/time to do that anymore, unlike real-world rich people.
The issue that makes ambitious value alignment relevant is that once you stop depending on a set of beings you once depended on, there’s no intrinsic reason not to harm or kill them if it benefits your selfish goals, and for future humans/AIs there will be a lot of such opportunities. This means you now, at the very least, need enough value alignment that the agent will take somewhat costly actions to avoid harming or killing beings that have no bargaining or economic power or worth.
This is very much unlike any real-life case of a society existing, and this is a reason why the current mechanisms like democracy and capitalism that try to make values less relevant simply do not work for AIs.
Value alignment is necessary in the long run for incentives to work out once ASI arrives on the scene.
(I think comments such as the parent shouldn’t be downvoted below the positives, since people should feel free to express contrarian views rather than be under pressure to self-censor. It’s not like there is an invalid argument in there, and as I point out in the other comment, the claim itself remains ambiguous, so might even turn out to mean something relatively uncontroversial.)
No, comments like this should be downvoted if people regret reading it. I would downvote a random contextless expression in the other direction just as well, as it is replacing a substantive comment with real content in it either way.
I think vague or poorly crafted posts/comments are valuable when there is a firm consensus in the opposite direction of their point, because they champion a place and a permission to discuss dissent on that topic that otherwise became too sparse (this only applies if it really is sparse on the specific topic). A low quality post/comment can still host valuable discussion, and downvoting the post/comment below the positives punishes that discussion.
(Keeping such comments below +5 or something still serves the point you are making. I’m objecting specifically to pushing the karma into the negatives, which makes the Schelling point and the discussion below it less convenient to see. This of course stops applying if the same author does this too often.)
I think you have a more general point, but I think it only really applies if the person making the post can back up their claim with good reasoning at some point, or will actually end up creating the room for such a discussion. Tailcalled has, in recent years, been vagueposting more and more, and I don’t think they or their post will serve as a good steelman or place to discuss real arguments against the prevailing consensus.
Eg see their response to Noosphere’s thoughtful comment.
My point doesn’t depend on ability or willingness of the original poster/commenter to back up or clearly make any claim, or even participate in the discussion, it’s about their initial post/comment creating a place where others can discuss its topic, for topics where that happens too rarely for whatever reason. If the original poster/commenter ends up fruitfully participating in that discussion, even better, but that is not necessary, the original post/comment can still be useful in expectation.
(You are right that tailcalled specifically is vagueposting a nontrivial amount, even in this thread the response to my request for clarification ended up unclear. Maybe that propensity crosses the threshold for not ignoring the slop effect of individual vaguepostings in favor of vague positive externalities they might have.)
Yeah reflecting a bit, I think my true objection is your parenthetical, because I’m convinced by your first paragraph’s logic.
The thing about slop effects is that my updates (which I attempted to describe e.g. here https://www.lesswrong.com/s/gEvTvhr8hNRrdHC62) make huge fractions of LessWrong look like slop to me. Some of the increase in vagueposting is basically lazy probing for whether rationalists will get the problem if framed in different ways than the original longform.
Yeah, I think those were some of your last good posts / first bad posts.
Do you honestly think that rationalists will suddenly get your point if you say
with no explanation or argument at all, or even a link to your sparse lognormals sequence?
Or what about
This seems entirely unrelated to any of the points you made in sparse lognormals (that I can remember!), but I consider this too part of your recent vagueposting habit.
I really liked your past posts and comments, I’m not saying this to be mean, but I think you’ve just gotten lazier (and more “cranky”) in your commenting & posting, and do not believe you are genuinely “probing for whether rationalists will get the problem if framed in different ways than the original longform.”
If you wanted to actually do that, you would at least link to the relevant sections of the relevant posts, or better, re-explain the arguments of those sections in the context of the conversation.
For me though, what would get me much more on-board with your thoughts are actual examples of you using these ideas to model things nobody else can model (mathematically!) in as broad a spectrum of fields as you claim. That, or a much more compact & streamlined argument.
I think this is the crux. To me, after understanding these ideas, it’s retroactively obvious that they are modelling all sorts of phenomena. My best guess is that the reason you don’t see it is that you don’t see the phenomena that are failing to be modelled by conventional methods (or at least don’t understand how those phenomena relate to the bird’s-eye perspective), so you don’t realize what new thing is missing. And I can’t easily cure this kind of cluelessness with examples, because my theories aren’t necessary if you just consider a single very narrow and homogeneous phenomenon, since then you can just make a special-built theory for that.
This may well be true (though I think not), but what is your argument about not even linking to your original posts? Or how often you don’t explain yourself even in completely unrelated subjects? My contention is that you are not lazily trying on a variety of different reframings of your original arguments or conclusions to see what sticks, and are instead just lazy.
I don’t know of anyone who seems to have understood the original posts, so I kinda doubt people can understand the point of them. Plus often what I’m writing about is a couple of steps removed from the original posts.
Part of the probing is to see which of the claims I make will seem obviously true and which of them will just seem senseless.
Then everything you say will seem either trivial or absurd because you don’t give arguments! Please post arguments for your claims!
But that would probe the power of the arguments whereas really I’m trying to probe the obviousness of the claims.
Ok, I will first note that this is different from what you said previously. Previously, you said “probing for whether rationalists will get the problem if framed in different ways than the original longform” but now you say “I’m trying to probe the obviousness of the claims.”. It’s good to note when such switches occur.
Second, you should stop making lazy posts with no arguments regardless of the reasons. You can get just as much, and probably much more, information through making good posts; there is not a tradeoff here. In fact, if you try to explain why you think something, you will find that others will try to explain why they don’t much more often than if you don’t, and they will be pretty specific (compared to an aggregated up/down vote) about what they disagree with.
But my true objection is I just don’t like bad posts.
So it sounds like your general theory has no alpha over narrow theories. What, then, makes it any good? Is it just that it’s broad enough to badly model many systems? Then it sounds useful in every case where we can’t make any formal predictions yet, and you should give those examples!
This sounds like a bad excuse not to do the work.
It’s mainly good for deciding what phenomena to make narrow theories about.
Then give those examples!
Edit: and also back up those examples by actually making the particular model, and demonstrate why such models are so useful through means decorrelated with your original argument.
Why?
This is the laziness I’m talking about! Do you really not understand why it would be to your theory-of-everything’s credit to have some, any, any at all, you know, actual use?
How suspicious is it that when I ask for explicit concrete examples, you explain that your theory is not really about particular examples, even though, if your vague-posting is indeed applying your theory of everything to particular examples, we can derive the existence of circumstances you believe your theory can model well?
And with that excuse being that it’s good at deciding what to make good theories about, you apparently cannot think of one reason why I’d like to know what theories you think would be smart to make using this framework.
That is to say that this is a very lazy reply.
I can think of reasons why you’d like to know what theories would be smart to make using this framework, e.g. so you can make those theories instead of bothering to learn the framework. However, that’s not a reason it would be good for me to share it with you, since I think that’d just distract you from the point of my theory.
I do not think I could put my response here better than Said did 7 years ago on a completely unrelated post, so I will just link that.
Thing is just from the conclusions it won’t be obvious that the meta-level theory is better. The improvement can primarily be understood in the context of the virtues of the meta-level theory.
idk what to say, this is just very transparently an excuse for you to be lazy here, and clearly crank-talk/cope.
More specifically, my position is anti-reductionist, and rationalist-empiricist-reductionists dismiss anti-reductionists as cranks. As long as you are trying to model whether I am that and then dismiss me if you find I am, it is a waste of time to try to communicate my position to you.
I am not dismissing you because of your anti-reductionism! Where did I say that? Indeed, I have been known to praise some “anti-reductionist” theories—fields even!
I’m dismissing you because you can’t give me examples of where your theory has been concretely useful!
You praise someone who wants to do agent-based models, but agent-based models are a reductionistic approach to the field of complexity science, so this sure seems to prove my point. (I mean, approximately all of the non-reductionistic approaches to the field of complexity science are bad too.)
I don’t care who calls themselves what, complexity science calls itself anti-reductionist, I don’t dismiss them. Therefore I can’t dismiss people just because they call themselves anti-reductionist, I must use their actual arguments to evaluate their positions.
I will also say that pleading to the community’s intrinsic bias and claiming I’ve made arguments I haven’t or have positions I don’t is not doing much to make me think you less a crank.
I’m not saying you’re dismissing me because I call myself anti-reductionist, I’m saying you’re dismissing me because I am an anti-reductionist.
I don’t think you’re using the actual arguments I presented in the LDSL series to evaluate my position.
I remember reading LDSL and not buying the arguments! At the time, I deeply respected you and your thinking, and thought “oh well I’m not buying these arguments, but surely if they’re as useful as they say, tailcalled will apply them to various circumstances and that will be pie on my face, and in that circumstance I should try to figure out why I was mistaken”. But then you didn’t, and you started vague-posting constantly, and now we’re here and you’re giving excuse after excuse of why its actually impossible for you to tell me any concrete application of your theory, and accusing me of anti-reductionist prejudice.
I admit, I do have an anti-reductionist prejudice, it’s called a prior, but it’s not absolute, and it’s not enough to stop listening to someone. I really, really, really don’t think I’m outright dismissing you because you’re anti-reductionist. I was totally willing to listen to you, even when you were making such arguments, and end up being wrong!
I even have the receipts to prove it! Until like just under a month ago, I was still emailed & lesswrong notified every time you made a post!
(they are unread, because I check LessWrong more commonly than my email)
I cannot stress enough, the reason why I’m dismissing you is because you stopped making arguments and started constantly vague-posting.
If you don’t have any puzzles within Economics/Sociology/Biology/Evolution/Psychology/AI/Ecology where it would be useful with a more holistic theory, then it’s not clear why I should talk to you.
I never said that, I am asking you for solutions to any puzzle of your choice! You’re just not giving me any!
Edit: I really honestly don’t know where you got that impression, and it kinda upsets me you seemingly just pulled that straight out of thin air.
Wouldn’t it be more impressive if I could point you to a solution to a puzzle you’ve been stuck on than if I present my own puzzle and give you the solution to that?
It would, but you didn’t ask for such a thing. Are you asking for such a thing now? If so, here is one in AI, which is on everyone’s minds: How do we interpret the inner-workings of neural networks.
I expect though, that you will say that your theory isn’t applicable here for whatever reason. Therefore it would be helpful if you gave me an example of what sort of puzzle your theory is applicable to.
“How do we interpret the inner-workings of neural networks.” is not a puzzle unless you get more concrete an application of it. For instance an input/output pair which you find surprising and want an interpretation for, or at least some general reason you want to interpret it.
Ok, then why do AI systems have so many adversarial examples? I have no formal model of this, though it plausibly makes some intuitive sense.
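For what it’s worth, the standard story for why they’re so easy to find is that in high dimensions a single gradient step on the input (not the weights) is usually enough, as in the fast gradient sign method. A rough sketch with a toy, untrained model (so the label flip isn’t guaranteed here, unlike with trained image classifiers):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))  # toy classifier
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(1, 10, requires_grad=True)  # stand-in for an input image
y = torch.tensor([0])                       # its assumed true label

loss = loss_fn(model(x), y)
loss.backward()                             # gradient w.r.t. the input, not the weights

epsilon = 0.1
x_adv = (x + epsilon * x.grad.sign()).detach()  # FGSM: small step that increases the loss

print("clean prediction:", model(x).argmax(dim=1).item())
print("perturbed prediction:", model(x_adv).argmax(dim=1).item())
```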
… can you pick some topic that you are good at instead of focusing on AI? That would probably make the examples more informative.
It sounds like, as I predicted, your theory doesn’t apply to the problems I presented, so how about you provide an example?
The LDSL series provides quite a few everyday examples, but for some reason you aren’t satisfied with those. Difficult examples require that you’re good at something, so I might not be able to find an example for you.
Let’s go through your sequence, shall we? And enumerate the so-called “concrete examples” you list
[LDSL#0] Some epistemological conundrums
Here you ask a lot of questions, approximately each of the form “why do ‘people’ think <thing-that-some-people-think-but-certainly-not-all>”. To list a few,
Seems to have a good answer. Sometimes they’re informative!
Seems also to have a good answer, it is easy to fool yourself if you do it improperly.
I would sure love a new closed-form way of modeling bag-like dynamics, as you describe them, if you have them! I don’t think you give one though, but surely if you mention it, you must have the answer somewhere!
Seems less a question than a claim? And I don’t think we need special math to solve this one.
None of these seem like concrete applications of your theory, but that’s fine. It was an intro post, you will surely explain all these later on, as worked examples at some point, right?
[LDSL#1] Performance optimization as a metaphor for life
I do remember liking this post! It was good. However, the conclusions here do not seem dependent on your overall conclusions.
[LDSL#2] Latent variable models, network models, and linear diffusion of sparse lognormals
Wait, I don’t think your previous post was about that? I certainly use statistics when doing performance optimization! In particular, I profile my code and look at which function calls are taking the bulk of the time, then optimize or decrease the number of calls to those.
Hey look a concrete example!
… well more like a motivating example. I’m sure at some point you build models and compare your model to those the epidemiologists have built… right?
[LDSL#3] Information-orientation is in tension with magnitude-orientation
This seems like a reasonable statistical argument, but of course, for our purposes, there are no real examples here, so let us move on.
[LDSL#4] Root cause analysis versus effect size estimation
Seems also a reasonable orientation, but by no means a theory of everything, and again no real examples here, so lets move on once again
[LDSL#5] Comparison and magnitude/diminishment
Your solution here to the problem you outline seems like a cop-out to me, and of course (other than the tank/dust example, which is by no means an example in the sense we’re talking about here), there are no examples.
[LDSL#6] When is quantification needed, and when is it hard?
Here you give the example of elo, but you don’t really provide any alternatives, and you mostly mention that picking bases when taking logarithms may be hard, so also doesn’t seem like an example.
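(For readers who haven’t read that post: “elo” here is the standard Elo rating scheme, which is logistic in the rating difference. A quick sketch of the usual base-10, 400-point convention; the 10 and the 400 are arguably just the kind of arbitrary base/scale choices the post says are hard to pick:)

```python
def elo_expected(r_a, r_b):
    """Expected score for player A under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_a, r_b, score_a, k=32):
    """New rating for A after one game (score_a: 1 win, 0.5 draw, 0 loss)."""
    return r_a + k * (score_a - elo_expected(r_a, r_b))

print(elo_update(1600, 1500, 1))  # expected win: small gain (~+11.5)
print(elo_update(1600, 1500, 0))  # upset loss: larger drop (~-20.5)
```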
Therefore, if it seemed like I didn’t read your sequence before (which I did! Just a while ago), I have certainly at least skimmed it now, and can say with relative confidence that no, you don’t in fact give concrete examples of circumstances where your theory performs better than the competition even once. At most you give some statistical arguments for why in some circumstances you may want to use various statistical tools. But this is by no means some theory of everything or even really much a steel-man for anti-reductionism.
You don’t even come back to the problems you originally listed! Where’s the promised theory of autism? Where’s the closed form model of bag-like dynamics? Where’s the steel-man of psychoanalysis, or the take-down of local validity and coherence, or the explanation of why commonsense reasoning avoids the principle of explosion?
This is the behavior of a lazy crack-pot, who doesn’t want to admit the fact that nobody is listening to them anymore because they’re just wrong. It is not the case that I’m just not good at anything enough to understand your oh-so-complex examples. You just don’t want to provide examples and would rather lie and say you’ve provided examples in the past, relying on your (false) assumption that I haven’t read what you’ve written, than actually list anything concrete.
This post has the table example. That’s probably the most important of all the examples.
That’s accounting, not statistics.
AFAIK epidemiologists usually measure particular diseases and focus their models on those, whereas LDSL would more be across all species of germs.
There is basically no competition. You just keep on treating it like the narrow domain-specific models count as competition when they really don’t because they focus on something different than mine.
I would honestly be interested in any concrete model you build based on this. You don’t necessarily have to compare it against some other field’s existing model, though it does help for credibility’s sake. But I would like to at least be able to compare the model you make against data.
I’m also not sure this is true about epidemiologists, and if it is, I’d guess it’s true to the extent that they have like 4 different parameterizations of different types of diseases (likely having to do with various different sorts of vectors of spread), and then they fit one of those 4 parameterizations to the measured (or inferred) characteristics of a particular disease.
The most central aspect of my model is to explain why it’s generally not relevant to fit quantitative models to data.
Each disease (and even different strands of the same disease and different environmental conditions for the same strand) has its own parameters, but they don’t fit a model that contains all the parameters of all diseases at once, they just focus on one disease at a time.
Before you said
Which seems to imply you (at least 3 hours ago) believed your theory could handle relatively well-formulated and narrow “input/output pair” problems. Yet now you say
If I treat your theory this way, it is only because you did, 3 hours ago, when you believed I hadn’t read your post or even given you the time of day. You claimed “How do we interpret the inner-workings of neural networks.” was “not a puzzle unless you get [a?] more concrete application of it”, yet the examples you list in your first post are no less vague, and often quite a bit more vague, than “how do you interpret neural networks?” or “why are adversarial examples so easy to find?” For example, the questions “Why are people so insistent about outliers?” or “Why isn’t factor analysis considered the main research tool?”
For… what exactly? For theories of everything? Oh I assure you, there is quite a bit of competition there. For statistical modeling toolkits? Ditto. What exactly do you think the unique niche you are trying to fill is? You must be arguing against someone, and indeed you often do argue against many.
The relevance of zooming in on particular input/output problems is part of my model.
“Why are adversarial examples so easy to find?” is a problem that is easily solvable without my model. You can’t solve it because you suck at AI, so instead you find some AI experts who are nearly as incompetent as you and follow along their discourse because they are working at easier problems that you have a chance of solving.
“Why are people so insistent about outliers?” is not vague at all! It’s a pretty specific phenomenon where one person mentions a general theory and then another person says it can’t be true because of their uncle or whatever. The phrasing in the heading might be vague because headings are brief, but I go into more detail about it in the post, even linking to a person who frequently struggles with that exact dynamic.
As an aside, you seem to be trying to probe me for inconsistencies and contradictions, presumably because you’ve written me off as a crank. But I don’t respect you and I’m not trying to come off as credible to you (really I’m slightly trying to come off as non-credible to you because your level of competence is too low for this theory to be relevant/good for you). And to some extent you know that your heuristics for identifying cranks aren’t going to fire only on people who are forever lost to crankdom, given that you haven’t just abandoned the conversation.
Theories of everything that explain why intelligence can’t model everything and you need other abilities.
I liked your old posts and your old research and your old ideas. I still have some hope you can reflect on the points you’ve made here, and your arguments against my probes, and feel a twinge of doubt, or motivation, pull on that a little, and end up with a worldview that makes predictions, lets you have & make genuine arguments, and gives you novel ideas.
If you were always lazy, I wouldn’t be having this conversation, but once you were not.
A lot of my new writing is as a result of the conclusions of or in response to my old research ideas.
Of course it is, I did not think otherwise, but my point stands.
No it doesn’t. I obviously understood my old posts (and still do—the posts make sense if I imagine ignoring LDSL). So I’m capable of understanding whether I’ve found something that reveals problems in them. It’s possible I’m communicating LDSL poorly, or that you are too ignorant to understand it, or that I’m overestimating how broadly it applies, but those are far more realistic than that I’ve become a pure crank. If you still prefer my old posts to my new posts, then I must know something relevant you don’t know.
What is the solution then?
I do think I’m “good at” AI, I think many who are “good at” AI are also pretty confused here.
I don’t really care what you think.
By “x-risk” from AI that you currently disbelieve, do you mean extinction of humanity, disempowerment-or-extinction, or long term loss of utility (normative value)? Something time-scoped, such as “in the next 20 years”?
Even though Bostrom’s “x-risk” is putatively more well-defined than “doom”, in practice it suffers from similar ambiguities, so strong positions such as 98+% doom/x-risk or 2-% doom/x-risk (in this case from AI) become more meaningful if they specify what is being claimed in more detail than just “doom” or “x-risk”.
I mean basically all the conventionally conceived dangers.
(Sorry.) Does this mean (1) more specifically eutopia that is not disempowerment (in the mainline scenario, or “by default”, with how things are currently going), (2) that something else likely kills humanity first, so the counterfactual impact of AI x-risk vanishes, or (3) high long term utility (normative value) possibly in some other form?