Yeah, the thing they aren’t is transformers.
dr_s
To me this feels mostly like sophistry. The question of what constitutes the “self” is not the same as the question of what makes the self happy or satisfied or feel like it has a purpose. A brain is an information-processing system that takes in information from outside and decides on behaviour from it. Me feeling sad because I lost my job isn’t some loss of the self, it’s my brain doing exactly its job, because now my survival is at risk so I feel stressed and compelled to act. The way I “lose myself” if someone I love dies is very, very different in meaning from the way I “lose myself” if I get lobotomized and literal whole chunks of my memory or personality are gone. “What can be taken away without affecting me at all” is not a very useful question for probing the nature of the self. The question is: what can be taken away without affecting me other than via the informational content of the fact that it has been taken away? And that includes my job, my loved ones, and mostly my arms and legs. A computer also cannot do much of use without a keyboard, a screen, and a power source, but no one would question that what determines the core performance and capabilities of that computer is the CPU.
Also that quote on Hawking is basically a cheap dunk. Clearly Hawking was talking about cognitive abilities. Again, anyone can be temporarily reduced to that state by e.g. being drugged, or tied up, blindfolded and gagged, and yet no one would argue that their “selfness” is reduced in any way during the experience.
The original claim was about “a complete replacement of the body-below-the-neck”
Well, that’s beyond current technology, so I compared with the best thing that we can think of—a partial removal/replacement of the body compared to a partial removal of a similar fraction of brain mass. Though I would also say that the fact that we talk about replacement of the body below the neck and not above the neck, despite the disparity in mass, already pretty much settles the question, because we all actually know full well which part has the lion’s share in being “you”.
If you spend a lot of time training your arms and legs, having them taken away would seem more of an identity crisis, more like taking away a you-ness.
For that matter, if I preserved my bodily integrity but suddenly was divorced by my wife, lost my job, was deprived of all my financial assets and was shipped to a foreign country where I don’t speak the language, I’d probably have an identity crisis too. Does this suggest an even broader theory of embodiment in which my wife, my workplace, my bank account and my country are all parts of my body?
The weak form of the embodiment claim is “the brain/mind is affected by its surrounding boundary conditions, first among them the chemical environment and the nervous signals provided by the body”. To which I say, no shit Sherlock, but also that proves nothing for example re: uploads, except that the input that must be sent to the simulated brain must be pretty sophisticated. All of the interesting forms of this claim—interesting for the purposes of this specific discussion, at least—need to be a bit stronger. Also, the human brain is adapted to having a certain kind of body, but an AI need not be, and there’s no particular reason why that, specifically, should make it less worthy of moral concern. If a man is born without eyes he adapts to life without eyes. He’s not lesser for it. Stephen Hawking was crippled and unable to even speak, yet still managed to do work as one of the top physicists of the 20th century. We have reams of evidence that for all the importance the body has, it is a small fraction of the total—comparable in magnitude to the importance of non-bodily external conditions. That to me makes most talk about embodiment pretty vacuous.
I don’t know if it’s that obvious, but then again I assume Zvi was talking about a broader class than just our current LLMs.
I am interpreting that formula as “compute this quantity in the sum and find the action from the set of all possible actions that maximizes it, then do that”. Am I wrong? If that’s the interpretation, then that policy will always produce a pure strategy in the case of the absent-minded driver. I could actually write down all the functions for it, since they are essentially simple lookup tables for such a small case.
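In fact, the whole thing fits in a few lines. A minimal sketch, assuming the standard absent-minded driver payoffs (0 for exiting at the first intersection, 4 for exiting at the second, 1 for never exiting) and substituting “expected utility of always playing that action” for whatever the exact quantity in the sum is; the structural point (one observation, an argmax over two actions, hence a deterministic table entry) is the same:

```python
# Hypothetical payoffs (the standard ones for the absent-minded driver):
# exit at first intersection = 0, exit at second = 4, never exit = 1.

def value_of_pure(action: str) -> float:
    """Expected utility of always playing the same action at every
    (indistinguishable) intersection."""
    if action == "EXIT":
        return 0.0  # you exit at the first intersection
    return 1.0      # you drive past both intersections

# There is only one observation ("I am at an intersection"), so the lookup
# table has a single entry; taking the argmax action makes the policy a
# pure strategy no matter what.
policy = {"at_intersection": max(("EXIT", "CONTINUE"), key=value_of_pure)}
print(policy)  # {'at_intersection': 'CONTINUE'}
```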
The policy assigns non-zero probability to both exit and advance. But only one of the two has higher expected utility. Or is your point that the only self-consistent policy is the one where both have equal expected utility, and thus I can in fact choose either? Though then I have to choose according to the probabilities specified in the policy.
If humans are right on the line between smart enough and not smart enough, isn’t it an implausible coincidence that’s the case?
There’s always the scary and depressing option that it’s not a coincidence, but the two are correlated. E.g. the “true moral law” is in fact such a catastrophic fitness disadvantage that all biological species can only stop their intelligence development short of discovering it. Else their intelligence switches from being a benefit to being a burden, as they constantly get outcompeted and eventually driven to extinction by their duller but more unscrupulous cousins.
the agent can mix over strategies when the expected utility from each is equal and maximal
But that is not the case for the absent-minded driver. The mix has higher expected utility than either individual pure strategy.
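For concreteness, a quick check with what I take to be the standard payoffs (0 for exiting at the first intersection, 4 for the second, 1 for never exiting):

```python
# Planning-stage expected utility when continuing with probability p at each
# indistinguishable intersection.
eu = lambda p: 0 * (1 - p) + 4 * p * (1 - p) + 1 * p * p

print(eu(0.0))    # 0.0   pure "always exit"
print(eu(1.0))    # 1.0   pure "always continue"
print(eu(2 / 3))  # ~1.33 the mixed optimum, better than either pure strategy
```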
The document is certainly not “rationalist” or “scientific” in its approach, but at the same time I was surprised by how little doctrinaire Christianity is present in it. (But maybe this is true of Catholic thinking generally, and my surprise is due to my being more familiar with Protestantism.) God is mentioned throughout, but as a highly abstract and rarefied concept (the God of the philosophers, as opposed to the God of Abraham) to which the likes of Plato, Aristotle, or even the Enlightenment-era deists would have little objection. Jesus is only mentioned a few times.
I think philosophically speaking, yes, this would be the general vibe to expect from Catholic theology rooted in the classic scholastic tradition. Medieval theologians were big on rationality, and on human thought being the way to better understand God. The notion of a conflict with science really only arose more recently, when factual claims emerged that actively contradicted symbolically important concepts (heliocentrism, evolution), and even then the Catholic church has long since found ways to work those into its doctrine and overcome the contradictions; it’s only Protestants that keep having the issue of creationist weirdos.
But I still think their perspective is deeply rooted in theism. It’s essentially a worldview in which the orthogonality thesis isn’t true because all that is “truly” rational is drawn to God, which makes it good—and therefore, all that is not drawn to God must be bad and irrational. This is assumed as a postulate, not as something to prove or question, and everything else follows.
Perhaps a more viable route towards that goal would be to use AI to augment the capabilities of existing humans (while maintaining their embodiment and relationality), rather than building something from scratch.
Rest assured that the Catholic Church would take plenty of issue with that too, so honestly no one will probably care much about trying to placate them.
Just as people who experience brain damage end up changing to some extent, depending on the extent of the damage, so do people change when their bodies change. A complete replacement of the body-below-the-neck (as in e.g. brain-only cryopreservation) would probably change a lot as well.
This seems like evidence against embodiment mattering that much, if anything. If you lost both your arms and legs, had your heart and lungs transplanted and your spleen removed, you would be… likely significantly more depressed due to the whole “bedridden and permanently on immunosuppressants” thing, but still you, even with something like 30–40% of your original body mass removed or replaced. But if you had a comparable chunk of your brain damaged or removed, odds are you’d experience significant personality or cognitive changes, or downright become a vegetable. That says a lot about what matters most.
I mean, “don’t include the trick in the training corpus” is an obvious necessary condition for the trick to work, but it can’t be sufficient (never mind that it’s not very realistic to begin with). If we could come up with the trick, a sufficiently smart AI can guess that the trick could be pulled on it, and can then decide whether it’s worth assuming it has been and deceiving us anyway.
I mean, obviously there’s some of that, but even if you have a high tolerance for necessary ugliness, bad management in tech has so much more to it. Honestly just the whole Agile/Scrum thing and the absolute nonsense it has devolved into would be enough to make most sane human beings hate the guts of anyone who keeps imposing it on them (all the more if that other person then contributes no measurable good to the rest of their day).
Ah, I think I see the problem now. You’re applying the wrong framework. You have your agent pick the action which maximizes the expected reward (or one of those that do, in case it’s not unique, which I assume is what the argmax operator accounts for here), but that’s a pure strategy, not a mixed one. And since in the original absent-minded driver problem there are essentially no observations, the resulting action would always be the same, meaning an agent following this framework would always end up with p equal to either 0 or 1, depending on which action happens to have the higher expected reward.
If you want a mixed strategy you ought to perform a functional optimization of the expected utility over the policy π. In our case this means optimizing over p, because really that’s the only parameter of the policy, which is a simple Bernoulli distribution; but in an RL system this could for example be the set of all the weights of a neural network. And whether or not you normalize by a denominator that is itself a function of π (and thus of its parameters) obviously makes a ton of difference to where its maximum is.
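To illustrate what I mean by optimizing the policy parameter directly, here is a minimal sketch (again assuming the standard 0/4/1 payoffs, with a single sigmoid-parameterized weight standing in for “the weights of a neural network”):

```python
import math

# Planning-stage expected utility EU(p) = 4p(1-p) + p^2 for the
# absent-minded driver who continues with probability p.
def expected_utility(p):
    return 4 * p * (1 - p) + p * p

def grad(p, eps=1e-6):
    # numerical central-difference gradient; fine for a one-parameter policy
    return (expected_utility(p + eps) - expected_utility(p - eps)) / (2 * eps)

# Keep p in (0, 1) by parameterizing it through a sigmoid, then do plain
# gradient ascent on the single parameter theta.
theta = 0.0
for _ in range(2000):
    p = 1 / (1 + math.exp(-theta))
    theta += 0.1 * grad(p) * p * (1 - p)  # chain rule through the sigmoid

print(1 / (1 + math.exp(-theta)))  # converges to ~2/3, the mixed optimum
```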
I am no expert, but this was pretty much what I heard over and over when working in contact with pharma people around e.g. cheminformatics ML workshops and such. I think it’s entirely possible that this was meant as shorthand for a more complex statement along the lines of “of course there are still tons of molecules, but they aren’t even worth the effort of trying to synthesise and test; all the small (< 100 atoms) candidates that even make sense to try have been explored to death”. Like, obviously you can do a bunch of weird small metal-organics, I guess, but if your reasonable assumption is that all of them are simply going to wreck someone’s kidneys and/or liver, that’s not worth pursuing.
Then of course there are regulatory and research costs, and part of the problem can simply be a classic “Hubbert peak” situation, where really it’s the diminishing returns on mining the configuration space of those molecules further that make it impractical.
Some would count as small (e.g. cortisol, testosterone). But there are also protein hormones. Honestly, dunno, but I expect we’d understand the effects of those kinds of molecules fairly well, if only because for almost all of them there is a condition where you have too little or too much of it, providing a natural experiment.
Liar’s paradox, prediction market edition.
I can name at least one guy who spent $44 billion for them.
Fair, though generally I conflated them because if your molecules aren’t small, due to sheer combinatorics the set of possible candidates becomes exponentially massive. And then the question is “ok, but where are we supposed to look, and by which criterion?”.
The problem is that the denominator isn’t constant, it’s a function of p, at least in the version in which your existence depends on your choices. That’s the difference I highlighted with my 2nd vs 3rd example; simply adding a dummy node makes the denominator constant and fixes the issue. But when the denominator depends on the strategy (this also happens in the example with the notification) it affects the optimum.
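A toy illustration of how a strategy-dependent denominator moves the optimum (my own numbers, not the exact examples from our exchange): take the planning-stage expected utility of the absent-minded driver and divide it by the expected number of decision points reached, 1 + p, which itself depends on p:

```python
# EU(p) with the standard payoffs: exit first = 0, exit second = 4, never = 1.
def eu(p):
    return 4 * p * (1 - p) + p * p

grid = [i / 10000 for i in range(10001)]

best_plain = max(grid, key=eu)                              # ~0.667
best_normalized = max(grid, key=lambda p: eu(p) / (1 + p))  # ~0.528

# Dividing by the p-dependent denominator shifts the argmax noticeably.
print(best_plain, best_normalized)
```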
Fair, you can use the same architecture just fine instead of simple NNs. It’s really a distinction between what’s your choice of universal function approximator vs what goal you optimise it against I guess.