Max TK

Karma: 49

Max TK 16 Aug 2022 16:21 UTC
1 point
−8
on: Humans provide an untapped wealth of evidence about alignment
I would be the last person to dismiss the potential relevance that understanding value formation and management in the human brain might have for AI alignment research, but I think there are good reasons to assume that the solutions evolution has produced are complex and not sufficiently robust.
Humans are [Mesa-Optimizers](https://www.alignmentforum.org/tag/mesa-optimization), and the evidence is solid that, as a consequence, our alignment with the implicit underlying utility function (reproductive fitness) is rather brittle (e.g. sex with contraceptives and opiate abuse are examples of such “failure points”).
Like others have expressed here before me, I would also argue that human alignment only has to perform in a very narrow environment, one shared with many very similar agents that are all on (roughly) the same power level. The solutions human evolution has produced to ensure human semi-alignment are therefore, to a significant degree, not purely neurological but also social.
Whatever these solutions are, we should not expect them to generalize well or to be reliable in a very different environment, such as that of an intelligent actor with an absolute monopoly on power.

This suggests that researching the human mind alone would not yield a technology that is robust enough to use when we have exactly one shot at getting it right. We need solutions to the aforementioned abstractions and toy models because we should probably try to find a way to build a system that is theoretically safe and not just “probably safe in a narrow environment”.

Max TK 18 Mar 2023 5:32 UTC
1 point
0
in reply to: Muyyd’s comment on: Ethical AI investments?
This is an important question. To what degree are both of these (naturally conflicting) goals important to you? How important is making money? How important is increasing AI safety?

Max TK 18 Mar 2023 7:48 UTC
2 points
0
in reply to: baturinsky’s comment on: AGI With Internet Access: Why we won’t stuff the genie back in its bottle.
I think that’s not an implausible assumption.
However, this could mean that some of the things I described might still be too difficult for it to pull off successfully, so in the case of an early breakout, dealing with it might be slightly less hopeless.

Max TK 18 Mar 2023 13:34 UTC
1 point
0
in reply to: blf’s comment on: AGI With Internet Access: Why we won’t stuff the genie back in its bottle.
Good addition! I even know a few of those “AI rights activists” myself.
Since this is my first post, would it be considered bad practice to edit it to include your addition?

Max TK 18 Mar 2023 13:43 UTC
2 points
1
in reply to: [deleted]’s comment on: AGI With Internet Access: Why we won’t stuff the genie back in its bottle.
One very problematic aspect of this view that I would like to point out is that in a sense, most ‘more aligned’ AGIs of otherwise equal capability level seem to be effectively ‘more tied down’ versions, so we should assume them to have a lower effective power level than a less aligned AGI that has a shorter list of priorities.
If we imagine both as competing players in a strategy game, it seems that the latter has to follow fewer rules.

Max TK 18 Mar 2023 13:51 UTC
2 points
1
in reply to: baturinsky’s comment on: AGI With Internet Access: Why we won’t stuff the genie back in its bottle.
Maybe if it happens early there is a chance that it manages to become an intelligent computer virus but is not intelligent enough to further scale its capabilities or produce effective schemes likely to result in our complete destruction. I know I am grasping at straws at this point, but maybe it’s not absolutely hopeless.

The result could be a corrupted infrastructure and a cultural shock strong enough for people to burn down OpenAI’s headquarters (metaphorically speaking) and for AI-accelerating research to be placed under international sanctions.

In the past I have thought a lot about “early catastrophe scenarios”, and while I am not convinced, it seemed to me that these might be the most survivable ones.

Max TK 9 Apr 2023 13:50 UTC
3 points
2
on: My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”
> weakly suggested that more dimensions do reduce demon formation

This also makes a lot of sense intuitively, as it should become more difficult in higher dimensions to construct walls (hills / barriers without holes).

Max TK 11 Aug 2023 18:41 UTC
2 points
−1
in reply to: AnthonyC’s comment on: Memetic Judo #1: On Doomsday Prophets v.2.2
Interesting insight. Sadly there isn’t much to be done against the beliefs of someone who is certain that god will save us.

Maybe the following: Assuming the frame of a believer, the signs of AGI being a dangerous technology seem obvious on closer inspection. If god exists, then we should therefore assume that this is an intentional test he has placed in front of us. God has given us all the signs. God helps those who help themselves.

Max TK 11 Aug 2023 23:45 UTC
1 point
0
in reply to: Gerald Monroe’s comment on: Memetic Judo #1: On Doomsday Prophets v.2.2
Isn’t that a response to a completely different kind of argument? I am probably not going to discuss this here, since it seems very off-topic, but if you want, I can consider putting it on my list of arguments I might discuss in this format in a future article.

Max TK 16 Aug 2023 14:28 UTC
3 points
2
in reply to: Matt Goldenberg’s comment on: Memetic Judo #3: The Intelligence of Stochastic Parrots v.2
About point 1: I think you are right with that assumption, though I believe that many people repeat this argument without really having a stance on (or awareness of) brain physicalism. That’s why I didn’t hesitate to include it. Still, if you have a decent idea of how to improve this article for people who are sceptical of physicalism, I would like to add it.

About point 2: Yeah you might be right … a reference to OthelloGPT would make it more convincing—I will add it later!

Edit: Still, I believe that “mashup” isn’t even a strictly false characterization of concept composition. I think I might add a paragraph explicitly explaining that and how I think about it.

Max TK 16 Aug 2023 15:26 UTC
3 points
0
in reply to: dr_s’s comment on: Memetic Judo #3: The Intelligence of Stochastic Parrots v.2
Good point. I think I will add it later.

Max TK 16 Aug 2023 15:52 UTC
1 point
0
in reply to: TAG’s comment on: Memetic Judo #3: The Intelligence of Stochastic Parrots v.2
I don’t really know what to make of this objection, because I have never seen the stochastic parrot argument applied to a specific, limited architecture as opposed to the general category.

Edit: Maybe you could suggest how to rephrase it to improve my argument.

Max TK 16 Aug 2023 16:30 UTC
1 point
0
in reply to: Gerald Monroe’s comment on: Memetic Judo #2: Incorporal Switches and Levers Compendium

The delta for power efficiency is currently ~1000 times in favor of brains ⇒ brain: ~20 W, AGI: ~20 kW; price per kWh in Germany: 0.33 Euro; 20 kWh: ~6.6 Euro ⇒ running our AGI would, assuming your description of the situation is correct, cost around 6.6 Euros in energy per hour, which is cheaper than a human worker.

So … while I don’t assume that such estimates need to be correct or apply to an AGI (that doesn’t exist yet) I don’t think you are making a very convincing point so far.
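
For anyone who wants to check the arithmetic above, here is a minimal sketch in Python. It uses only the numbers quoted in this comment (a ~20 W brain, a ~1000x efficiency gap, 0.33 Euro per kWh); the function and constant names are made up for illustration, and none of these values are claims about an actual AGI.

```python
# Rough energy-cost estimate using the figures quoted in the comment above.
# All values are the discussion's assumptions, not measurements.
BRAIN_POWER_W = 20.0        # approximate power draw of a human brain
EFFICIENCY_GAP = 1000.0     # claimed efficiency advantage of brains
PRICE_EUR_PER_KWH = 0.33    # quoted German electricity price


def hourly_energy_cost_eur(brain_power_w, efficiency_gap, price_eur_per_kwh):
    """Return the energy cost in Euro of running the system for one hour."""
    agi_power_kw = brain_power_w * efficiency_gap / 1000.0  # W -> kW
    energy_kwh = agi_power_kw * 1.0                          # one hour of operation
    return energy_kwh * price_eur_per_kwh


cost = hourly_energy_cost_eur(BRAIN_POWER_W, EFFICIENCY_GAP, PRICE_EUR_PER_KWH)
print(f"{cost:.2f} Euro per hour")
```

With these inputs the result is roughly 6.6 Euro per hour, in line with the rough figure above. Changing `EFFICIENCY_GAP` or the electricity price shows how sensitive the comparison with a human wage is to those assumptions.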

Max TK 16 Aug 2023 16:39 UTC
1 point
0
in reply to: TAG’s comment on: Memetic Judo #3: The Intelligence of Stochastic Parrots v.2
LLMs use one or more inner (hidden) layers, so shouldn’t the proof apply to them?

Max TK 16 Aug 2023 16:42 UTC
1 point
−2
in reply to: Gerald Monroe’s comment on: Memetic Judo #3: The Intelligence of Stochastic Parrots v.2
Based on your phrasing I sense you are trying to object to something here, but it doesn’t seem to have much to do with my article. Is this correct or am I just misunderstanding your point?

Max TK 16 Aug 2023 17:14 UTC
0 points
0
in reply to: Gerald Monroe’s comment on: Memetic Judo #3: The Intelligence of Stochastic Parrots v.2
Usually, between people in international forums, there is a gentlemen’s agreement not to be condescending about things like language comprehension or spelling errors, and I would like to continue this tradition, even though your own paragraphs would offer ample opportunities for me to do the same.

Max TK 16 Aug 2023 17:17 UTC
3 points
0
in reply to: TAG’s comment on: Memetic Judo #3: The Intelligence of Stochastic Parrots v.2
Of the universal approximation theorem

Max TK 16 Aug 2023 17:20 UTC
−1 points
−2
in reply to: Gerald Monroe’s comment on: Memetic Judo #2: Incorporal Switches and Levers Compendium
You were the one who made that argument, not me. 🙄

Max TK 16 Aug 2023 18:59 UTC
1 point
0
in reply to: Gerald Monroe’s comment on: Memetic Judo #2: Incorporal Switches and Levers Compendium
My argument does not depend on the AI being able to survive inside a botnet. I mentioned several alternatives.

Max TK 16 Aug 2023 21:59 UTC
1 point
0
in reply to: Morpheus’s comment on: Memetic Judo #3: The Intelligence of Stochastic Parrots v.2
#parrotGang