No, humans do this all the time, constantly, originarily (https://www.lesswrong.com/posts/5tqFT3bcTekvico4d/do-confident-short-timelines-make-sense#Creativity___Originariness) when they are kids. They keep using roughly the same set of faculties on harder and harder problems, including sometimes making globally novel insights. Gippities learn in a different way, which does not go on to do that. Sample complexity can help you notice that it’s a different way.
TsviBT
So basically you just don’t think creativity is a thing? That’s one impasse we could be at. What I mean is gestured at here:
https://tsvibt.blogspot.com/2022/08/structure-creativity-and-novelty.html
More discussion here:
https://tsvibt.blogspot.com/2023/01/the-voyage-of-novelty.html
https://tsvibt.blogspot.com/2023/01/endo-dia-para-and-ecto-systemic-novelty.html
https://tsvibt.blogspot.com/2023/01/a-strong-mind-continues-its-trajectory.html
If someone asks me “what’s the least impressive thing you think AI won’t be able to do by 20XX”, I give answers like “make lots of original interesting math concepts”. (See https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#comments) People sometimes say “well that’s a pretty impressive thing, you’re talking about the height of intellectual labor”.
A main reason I give examples like this is that math is an area where it’s feasible for there to be a legible absence of legibilization. (By legible, I mean interpersonal explicitness https://www.lesswrong.com/posts/KuKaQEu7JjBNzcoj5/explicitness .) Mathematicians are supposed to make interesting novel definitions that legibilize inexplicit ideas / ways of thinking. If they don’t, you can tell that their publications are not so interesting. It is legible that they failed to legibilize.
In fact I suspect there will be many feats that are much “easier” which AI won’t be able to do for a while (some decades). Easier, in the sense that many more humans are able to do those feats. Much harder, in the sense that they require creativity, and therefore require having the algorithm-pieces for creativity. That’s easy for humans because creativity is our birthright, but “hard” for AI because it doesn’t have that yet. Lots of little novel “knacks”, of the sort people can pick up; surprising connections or analogies; solving lots of little problems; or coming up with a way of seeing some sort of situation that makes it make sense.
But knacks, insights, predictions, inventions—these are not categories. I’m not saying “AI can’t make predictions” or “AI can’t learn knacks”, because that would be nonsensical, because those aren’t categories. I’m saying that some things humans do require broad-spectrum creativity; but it’s hard to describe those things as a task, and so I don’t give them as an answer to the question about what AI can’t do. (If a task is stereotyped, not that hard, has clear feedback, and has lots of demonstrations—all things correlated with legibility—then probably AI will be able to do it soon!)
Nice!
Is there a nice way to bet on large-evidence, small-probability differences?
Normally we bet on substantial probability differences, like I say 10% you say 50% or similar. Betting makes sense there—you incentivize having correct probabilities, at least within a few percent or whatever. Is there some way to incentivize reporting the right log-odds, to within a logit or whatever?
One sort of answer might be showing that/how you can always extract a mid-range probability disagreement on latent variables, under some assumptions on the structure of the latent variable models underlying the large-logit small-probability disagreement.
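Another concrete construction (a minimal sketch of my own, not something settled in this thread) is to settle the bet on the difference of logarithmic scores, so the payoff is denominated in log-odds-like units: honest reporting maximizes each side’s expected payoff, and a 0.1%-vs-1% disagreement carries real stakes instead of rounding to nothing. The function and numbers below are purely illustrative.

```python
import math

def log_score_bet_payoff(p_a: float, p_b: float, event_occurred: bool) -> float:
    """Amount (in nats) that B pays A when the bet settles; negative means A pays B.

    Settling on the difference of log scores makes reporting your true
    probability optimal, and makes stakes scale with log-odds differences
    rather than raw probability differences. Multiply by a stake size to
    convert nats into money.
    """
    if event_occurred:
        return math.log(p_a / p_b)
    return math.log((1 - p_a) / (1 - p_b))

# Illustrative numbers: A reports 0.1%, B reports 1% for some event.
print(log_score_bet_payoff(0.001, 0.01, True))   # ≈ -2.30: A pays B, reflecting the ~10x odds gap
print(log_score_bet_payoff(0.001, 0.01, False))  # ≈ +0.009: B pays A a small amount
```

By each party’s own lights the expected payoff is the KL divergence from their belief to the other’s report, so both honest parties expect to gain; the catch is that the downside is concentrated in the tail case, which is exactly what makes the bet sensitive to logit-scale disagreements.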
So I guess my model says that ‘merely static representations’ of semantic volumetric histories will constitute the first optimally specific board states of Nature in history, and we will use them to define loss functions so that we can do supervised learning on human games (recorded volumetric episodes) and learn a transition model (predictive model of the time evolution of recorded volumetric episodes, or ‘next-moment prediction’) and an action space (generative model of recorded human actions), then we will combine this with Engineered Search and some other stuff, then solve Go (kill everyone).
I think getting this to work in a way that actually kills everyone, rather than merely being AlphaFold or similar, is really really hard—in the sense that it requires more architectural insight than you’re giving credit for. (This is a contingent claim in the sense that it depends on details of the world that aren’t really about intelligence—for example, if it were pretty easy to make an engineered supervirus that kills everyone, then AlphaFold + current ambient tech could have been enough.) I think the easiest way is to invent the more general thing. The systems you adduce are characterized by being quite narrow! For a narrow task, yeah, plausibly the more hand-engineered thing will win first.
Back at the upthread point, I’m totally baffled by and increasingly skeptical of your claim to have some good reason to have a non-unimodal distribution. You brought up the 3D thing, but are you really claiming to have such a strong reason to think that exactly the combination of algorithmic ideas you sketched will work to kill everyone, and that the 3D thing is exactly most of what’s missing, that it’s “either this exact thing works in <5 years, or else >10 years” or similar?? Or what’s the claim? IDK maybe it’s not worth clarifying further, but so far I still just want to call BS on all such claims.
So ‘antepenultimate algorithmic insight,’ and ‘one of just a few remaining puzzle pieces in a lethal neuromorphic architecture’ both strike me as relatively fair characterizations.
Ok. This is pretty implausible to me. Bagiński’s whack-a-mole thing seems relevant here, as well as the bitter lesson. Bolting MAV3D into your system seems like the contemporary equivalent of manually writing convolution filters in your computer vision system. You’re not striking at the relevant level of generality. In other words, in humans, all the power comes from stuff other than a MAV3D-like thing—a human’s MAV3D-like thing is emergent / derivative from the other stuff. Probably.
in which case on my model, actually yes, something akin to human episodic imagination, if not ‘true’ counterfactuals, should come right out of the box
but it seems like an essential part of something that could [kill everyone]
I don’t know how strong of a / what kind of a claim you’re trying to make here… Are you claiming NeRFs represent a substantial chunk of the Xth algorithmic insight? Or not an algorithmic part, but rather setting up a data source with which someone can make the Xth insight? Or...?
However, you don’t believe we know enough to get even that far (by 2030). To you it is perhaps more closely analogous to trying to construct a bridge without having even an intuitive understanding of gravity.
Yeah, if I had to guess, I’d guess it’s more like this. (I’d certainly say so w.r.t. alignment—we have no fucking idea what mind-consequence-determiners even are.)
Though I suppose I don’t object to your analogy here, given that it wouldn’t actually work! That “bridge” would collapse the first time you drive a truck over it.
If someone says 10% by 2030, we disagree, but it would be hard to find something to talk about purely on that basis. (Of course, they could have other more specific beliefs that I could argue with.) If they say, IDK, 25% or something (IDK, obviously not a sharp cutoff by any means, why would there be?), then I start feeling like we ought to be able to find a disagreement just by investigating what makes us say such different probabilities. Also I start feeling like they have strategically bad probabilities (I mean, their beliefs that are incorrect according to me would imply actions that I think are mistakes). (On second thought, probably even 10% has strategically bad implications, assuming that implies 20% by 2035 or similar.)
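(A minimal arithmetic sketch of the extrapolation I have in mind, assuming a roughly constant chance per five-year period; the constant-hazard assumption is a simplification of mine, not an argument:)

```latex
P(\text{AGI by 2035}) \;\approx\; 1 - (1 - 0.10)^2 \;=\; 0.19 \;\approx\; 20\%
```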
(Noting that I don’t endorse the description of my argument as “physicalist”, though I acknowledge that the “spontaneously” thing kinda sounds like that. Allow me to amend / clarify: I’m saying that you, a mind with understanding and agency, cannot spontaneously assemble beams into a bridge—you have to have some understanding about load and steel and bridges and such. I use this to counter “no blockers” arguments, but I’m not denying that we’re in a special regime due to the existence of minds (humans); the point is that those minds still have to understand a bunch of specific stuff. As mentioned here: https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#The__no_blockers__intuition )
anything above 1% by end-of-year 2030 to be “high confidence in short timelines” of the sort he would have something to say about
Say what now?? Did I write that somewhere? That would be a typo or possibly a thinko. My own repeatedly stated probabilities would be around 1% or .5%! E.g. in https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce
Is there something I’m missing on the neuroscience end? I’m not at all familiar with the field.
I’m not a neuroscientist either, but if you’re not at all familiar with the field, then yes of course there’s stuff you’re missing.
As I wrote:
- The butcher number. Current electrodes kill more neurons than they record. That doesn’t scale safely to millions of connections.
- Bad feedback. Neural synapses are not strictly feedforward; there is often reciprocal signaling and regulation. Electrodes wouldn’t communicate that sort of feedback, which might be important for learning.
Current systems still have low numbers of active connections. Maybe this can be scaled up, but it seems quite hard to scale it up by several orders of magnitude.
More zoomed out, I think the only method that would definitely work is germline engineering (well, and WBE, but that has its own problems). Everything else is speculation—what should make us think you can increase someone’s deep problem solving ability that way?
Though to be clear, it seems very very difficult to me, like it might be at a vaguely comparable level of difficulty as “solving biology”. Which is part of why I’m not working on that directly, but instead aiming at technological human intelligence amplification.
Like trying to transcribe the egregore’s algorithm into something easily human-readable?
Yes, that’s basically what I mean. There are a lot of coded movements. Examples of classes of examples:
- dogwhistles
- microaggressions
- signaling, shibboleths
- second- and higher-order norm enforcement (mocking non-enforcers of norms, etc.)
- performativity (playing dumb, performative lying, preference falsification, etc.)
- hype / hyperstitioning
- enthymemes
- anti-inductivity (e.g. cryptolects)
So you’d first of all want to decode this stuff so that
- you can understand what’s even happening
- you can reflect on what’s happening—is it good or bad, how could it be done better or worse, should it be combatted, etc.
- you can support it or combat it effectively if needed.
Further, there’s presumably healthy, humanity-aligned ways of participating in egregores (I mean the name is a bit scary, but like, some companies, governments, religious strains, traditions, norms, grand plans, etc., are good to participate in), or in other words effective, epistemic, Good shared-intentionality-weaving. This is an entire huge and fundamental missing sector of our philosophy. We might have to understand this better to make progress on hard things. Decoding obvious egregores would be a way in. As an example, I suspect there is some sequence of words, humanly producible, maybe with prerequisites (such as having a community backing you up, or similar), that would persuade most AGI researchers to just stop—but you might need more theory to produce those words.
I think I don’t understand this argument. In creating AI we can draw on training data, which breaks the analogy to making a replicator actually from scratch (are you using a premise that this is a dead end, or something, because “Nearly all [thinkers] do not write much about the innards of their thinking processes...”?).
You’re technically right that the analogy is broken in that way, yeah. Likewise, if someone gleans substantial chunks of the needed Architecture by looking at scans of brains. But yes, as you say, I think the actual data (in both cases) doesn’t directly tell you what you need to know, by any stretch. (To riff on an analogy from Kabir Kumar: it’s sort of like trying to infer the inner workings of a metal casting machine, purely by observing price fluctuations for various commodities. It’s probably possible in theory, but staring at the price fluctuations—which are a highly mediated / garbled / fuzzed emanation from the “guts” of various manufacturing processes—is not a good way to discover the important ideas about how casting machines can work. Cf. https://www.lesswrong.com/posts/unCG3rhyMJpGJpoLd/koan-divining-alien-datastructures-from-ram-activations )
We’ve seen that supervised learning and RL (and evolution) can create structural richness (if I have the right idea of what you mean) out of proportion to the understanding that went into them.
Not sure I buy the claims about SL and RL. In the case of SL, it’s only going “a little ways away from the data”, in terms of the structure you get. Or so I claim uncertainly. (Hm… maybe the metaphor of “distance from the data” is quite bad… really I mean “it’s only exploring a pretty impoverished sector in structurespace, partly due to data and partly due to other Architecture”.) In the case of RL, what are the successes in terms of gaining new learned structure? There’s going to be some—we can point to AlphaZero, and maybe some robotics things—but I’m skeptical that this actually represents all that much structural richness. The actual NNs in AlphaZero would have some nontrivial structure, but it’s hard to tell how much, and it’s going to be pretty narrow / circumscribed, e.g. it wouldn’t represent most interesting math concepts.
Anyway, the claim is of course true of evolution. The general point is true, that learning systems can be powerful, and specifically high-leverage in various ways (e.g. lots of learning from small algorithmic complexity fingerprint as with evolution or Solomonoff induction, or from fairly small compute as in humans).
Of course this doesn’t mean any particular learning process is able to create a strong mind, but, idk, I don’t see a way to put a strong lower bound on how much more powerful a learning process is necessary,
Right, no one knows. Could be next month that everyone dies from AGI. The only claims I’d really argue strongly would be claims like
- If you have median 2029 or similar, either you’re overconfident or you know something dispositive that I don’t know.
- If you have probability of AGI by 2029 less than .05%, either you’re overconfident or you know something dispositive that I don’t know.
Besides my comments about the bitter lesson and about the richness of evolution’s search, I’ll also say that it just seems to me like there’s lots of ideas—at the abstract / fundamental / meta level of learning and thinking—that have yet to be put into practice in AI. I wrote in the OP:
The self-play that evolution uses (and the self-play that human children use) is much richer, containing more structural ideas, than the idea of having an agent play a game against a copy of itself.
IME if you think about these sorts of things—that is, if you think about how the 2.5 known great and powerful optimization processes (evolution, humans, humanity/science) do the impressive things they do—if you think about that, you see lots of sorts of feedback arrangements and ways of exploring the space of structures / algorithms, many of which are different in some fundamental character from what’s been tried so far in AI. And these things don’t add up, in my head, to a general intelligence—though of course that is only a deficiency in my imagination, one way or another.
(EDIT: Maybe (you’d say) I should be drawing such a strong lower bound from the point about sample efficiency...?)
I don’t personally lean super heavily on the sample efficiency thing. I mean, if we see a system that’s truly only trained on some human data that’s of size less than 10x the amount that a well-read adult human has read (plus compute / thinking), and it performs like GPT-4 or similar, that would be really weird and surprising, and I would be confused, and I’d be somewhat more scared. But I don’t think it would necessarily imply that you’re about to get AGI.
Conversely, I definitely don’t think that high sample complexity strongly implies that you’re not about to get AGI. (Well, I guess if you’re about to get AGI, there should probably be spikes in sample efficiency in specific areas—e.g. you’d be able to invent much more interesting math with little or no data, whereas previously you had to train on vast math corpora. But we don’t necessarily have to observe these domain spikes before dying of nanopox.)
Yeah, in particular it seems like I’m updating more than you from induction on the conceptual-progress-to-capabilities ratio we’ve seen so far / on what seem like surprises to the ‘we need lots of ideas’ view. (Or maybe you disagree about observations there, or disagree with that frame.) (The “missing update” should weaken this induction, but doesn’t invalidate it IMO.)
Yeah… To add a bit of color, I’d say I’m pretty wary of mushing. Like, we mush together all “capabilities” and then update on how much “capabilities” our current learning programs have. I don’t feel like that sort of reasoning ought to work very well. But I haven’t yet articulated how mushing is anything more specific than categorization, if it is more specific. Maybe what I mean by mushing is “sticking to a category and hanging lots of further cognition (inferences, arguments, plans) on the category, without putting in suitable efforts to refine the category into subcategories”. I wrote:
We should have been trying hard to retrospectively construct new explanations that would have predicted the observations. Instead we went with the best PREEXISTING explanation that we already had.
Ok gotcha, thanks. In that case it doesn’t seem super relevant to me. I would expect there to be lots of gains in any areas where there’s algebraicness to chew through; and I don’t think this indicates much about whether we’re getting AGI. Being able to “unlock” domains, so that you can now chew through algebraicness there, does weakly indicate something, but it’s a very fuzzy signal IMO.
(For contrast, a behavior such as originarily producing math concepts has a large non-algebraic component, and would IMO be a fairly strong indicator of general intelligence.)
Um, ok, were any of the examples impressive? For example, did any of the examples derive their improvement in some way other than chewing through bits of algebraicness? (The answer could easily be yes without being impressive, for example by applying some obvious known idea to some problem that simply hadn’t happened to have that idea applied to it before, but that’s a good search criterion.)
Please provide more detail about this example. What did the system invent? How did the system work? What makes you think it’s novel? Would it have worked without the LLM?
(All of the previous many times someone said something of the form “actually XYZ was evidence of generality / creativity / deep learning being awesome / etc.”, and I’ve spent time looking into the details, it turns out that they were giving a quite poor summary of the result, in favor of making the thing sound more scary / impressive. Or maybe using a much lower bar for lots of descriptor words. But anyway, please be specific.)
These intuitions (at least, my versions, insofar as I have some version of them) seem to be more about harm vs. edification (I mean, damage to / growth of actual capacity) rather than experientially-feels-good vs. -bad. If a child loses a limb, or has permanent developmental damage from starvation, or drowns to death, that will damage zer long term ability to grow and be fulfilled. Eye strain is another kind of lasting harm.
In fact, if there is some way to reliably, significantly, consensually, and truly (that is, in a deep / wholesome sense) increase someone else’s agency, then I do think there’s a strong moral obligation to do so! (But there are many difficulties and tradeoffs etc.)