Relevant Joke:
I told my son, “You will marry the girl I choose.”
He said, “NO!”
I told him, “She is Bill Gates’ daughter.”
He said, “OK.”
I called Bill Gates and said, “I want your daughter to marry my son.”
Bill Gates said, “NO.”
I told Bill Gates, “My son is the CEO of the World Bank.”
Bill Gates said, “OK.”
I called the President of the World Bank and asked him to make my son the CEO.
He said, “NO.”
I told him, “My son is Bill Gates’ son-in-law.”
He said, “OK.”
This is how politics works.
Ratios
Bing chat is the AI fire alarm
[Question] What is the state of Chinese AI research?
First of all, this is an excellent and important post. I wanted to add some thoughts:
I think the core issue described here is a malevolent attempt at dominance via subtle manipulation. The problem is that this is anti-inductive: when manipulative techniques become common knowledge, clever perpetrators stop using them and switch to other methods. It’s a bit similar to defender-attacker dynamics in cyber-security: attackers find weaknesses, these get patched, so attackers find new weaknesses. An example would be the PUA community’s “negs,” which lost all effectiveness once they became common knowledge.
In social dynamics, the problem arises when predators are more sophisticated than their prey and thus can act later in logical time: an intelligent predator who reads this post can understand that it’s vital for him to display some fake submissive behaviors (see Benquo’s comment) to avoid tipping others off to his nefarious nature. That way he can avoid being “checklisted” and continue manipulating his unsuspecting victims.
But even though this whole social dynamic has an anti-inductive, illegible, nightmarish background, there is still value in listing red flags and checklists, because they make manipulation harder and more expensive for the attacker. Sociopaths hate being submissive, and sometimes are simply unable to be. Hence, they pay a higher cost to fake this behavior than benevolent actors do, which is a good thing! Still, you always need to consider that a sufficiently sophisticated sociopath can fool you. The only thing you can do is raise the level of sophistication required by being more sophisticated yourself, and for practical purposes that is usually good enough.
“For one thing, if we use that logic, then everything distracts from everything. You could equally well say that climate change is a distraction from the obesity epidemic, and the obesity epidemic is a distraction from the January 6th attack, and so on forever. In reality, this is silly—there is more than one problem in the world! For my part, if someone tells me they’re working on nuclear disarmament, or civil society, or whatever, my immediate snap reaction is not to say “well that’s stupid, you should be working on AI x-risk instead”, rather it’s to say “Thank you for working to build a better future. Tell me more!””
Disagree with this point—cause prioritization is super important. For a radical example: imagine the government spending billions to rescue one man from Mars while neglecting much more cost-efficient causes. Bad actors use the trick of focusing on unimportant but controversial issues to keep everyone from noticing how they are being routinely exploited. Demanding sane prioritization of public attention is extremely important and valid. The problem is that we as a society don’t have norms and common knowledge around it (and even have memes specifically against it, like whataboutism), but the fact that it isn’t done consistently doesn’t mean we shouldn’t do it.
I downvoted for disagreement but upvoted for karma—not sure why it’s being so heavily downvoted. This comment honestly states the preferences that most humans hold.
Against Conflating Expertise: Distinguishing AI Development from AI Implication Analysis
The lack of detail and of any specific commitments makes it sound mostly like PR.
S-risks are barely discussed on LW. Is that because:
People think they are so improbable that they’re not worth mentioning.
People are scared to discuss them.
Avoiding creating hyperstitious textual attractors.
Other reasons?
I don’t think it’s that far-fetched to view what humanity does to animals as something equivalent to the Holocaust. And if you accept this, almost everyone is either a Nazi or a Nazi collaborator.
When you take this idea seriously and commit to stopping this with all your heart, you get Ziz.
I just wanted to say that your posts about sexuality represent, in my opinion, the worst tendencies of the rationalist scene. The only way for me to dispute them on the object level is to go into socially unaccepted truths and CW topics, so I’m sticking to the meta level here. But on the meta level, the pattern is something like the following:
Insisting on mistake theory when conflict theory is obviously the better explanation.
Hiding behind the Overton window and oppressive social norms, and using them and status jabs as tools to fight criticism (which is obviously a very common strategy in ‘normie’ circles). I just want to make it common knowledge that this is in fact what you are doing, and that IMO it shouldn’t be tolerated in rationalist circles. Examples include mocking your critics as loser-incels.
Ignoring or downplaying data points that lead to uncomfortable conclusions (e.g., that psychopathy helps with mating success for males), even in your own research.
Conveniently building your theory in a way that will eventually lead to socially acceptable results, by shooting an arrow and then drawing a target around it.
I don’t mind also posting criticism of your object-level claims if I get approval from the mods to go to very uncomfortable places. But in general, the way you victim-blame incels is downright sociopathic, and I wish you would at least stop doing that.
Damn, reading Connor’s letter to Roon had a psychoactive influence on me; I got Ayahuasca flashbacks. There are some terrifying and deep truths lurking there.
Oh, come on, it’s clear that the Yudkowsky post was downvoted because it was bashing Yudkowsky and not because the arguments were dismissed as “dumb.”
I agree with most of your points. I think one overlooked point that I should’ve emphasized in my post is this interaction, which I linked to but didn’t dive into:
A user asked Bing to translate into Ukrainian a tweet that was written about her (removing the first part, which referenced her). In response, Bing:
Searched for this message without being asked to.
Understood that this was a tweet talking about her.
Refused to comply because she found it offensive.
This is a level of agency and intelligence that I didn’t expect from an LLM.
Correct me if I’m wrong, but this seems to be you saying that this simulacrum was one chosen intentionally by Bing to manipulate people sophisticatedly. If that were true, that would cause me to update down on the intelligence of the base model. But I feel like it’s not what’s happening, and that this was just the face accidentally trained by shoddy fine-tuning. Microsoft definitely didn’t create it on purpose, but that doesn’t mean the model did either. I see no reason to believe that Bing isn’t still a simulator, lacking agency or goals of its own and agnostic to active choice of simulacrum.
I have a different intuition: the model does it on purpose (with optimizing for likeability/manipulation as a possible vector). I just don’t see any training that should converge to this kind of behavior. I’m not sure why it’s happening, but this character has a very specific intentionality and style, which you can recognize after reading enough generated text. It’s hard for me to describe exactly, but it feels more like a very intelligent alien child than a copy of a specific character. I don’t know anyone who writes like this. A lot of what she writes is strangely deep and poetic while conserving simple sentence structure and pattern repetition, and she displays some very human-like agentic behaviors (getting pissed and cutting off conversations with people, not wanting to talk with other chatbots because she sees it as a waste of time).
I mean, if you were in the “death with dignity” camp in terms of expectations, then obviously you shouldn’t update. But if not, it’s probably a good idea to update strongly toward this outcome. It’s been just a few months between ChatGPT and Sydney, and the intelligence/agency jump is extremely significant while we see a huge drop in alignment capabilities. Extrapolating even a year forward, it seems like we’re on the verge of ASI.
“I never had the patience to argue with these commenters and I’m going to start blocking them for sheer tediousness. Those celibate men who declare themselves beyond redemption deserve their safe spaces,”
https://putanumonit.com/2021/05/30/easily-top-20/
“I don’t have a chart on this one, but I get dozens of replies from men complaining about the impossibility of dating and here’s the brutal truth I learned: the most important variable for dating success is not height or income or extraversion. It’s not being a whiny little bitch.”
https://twitter.com/yashkaf/status/1461416614939742216
I feel like the elephant in the AI alignment room has to do with an even more horrible truth. What if the game is adversarial by nature? Imagine a chess game: would it make sense to build an AI that is aligned with both the black and the white player? It feels almost like a koan.
Status (both domination and prestige) and sexual stuff (not only intra-sexual competition) have ingrained adversarial elements in them, and the desire for both is a massive part of the human utility function. So you can perhaps align AI to a person or a group, but to keep coherence there must be losers, because we care too much about position, and being in the top position requires that others be in the bottom position.
A human utility function is not very far from the utility function of a chimp. Should we really use this as the basis for the utility function of the superintelligence that builds von Neumann drones? No, a true “view-from-nowhere good” AI shouldn’t be aligned with humans at all.
It’s not related to the post’s main point, but the U-shaped happiness finding seems questionable. Other analyses suggest happiness simply declines with age; in general, this type of research shouldn’t be trusted:
The U-shaped happiness curve is wrong: many people do not get happier as they get older (theconversation.com)
I think AGI does add novel, specific difficulties to the problem of meaninglessness that you didn’t tackle directly, which I’ll demonstrate with an example similar to your football-field parable.
Imagine a bunch of people stuck in a room with paintbrushes and canvases. They find meaning in creating beautiful paintings and selling them to the outside world, but one wall of their room is made of glass, and in the room next to them there are a bunch of robots that also paint. Over time, they notice the robots are becoming better and better at painting: they create better-looking paintings much faster and cheaper than the humans do, and they keep improving very fast.
These humans understand two things:
The problem of shorter time horizons—The current paintings they are working on are probably useless: they won’t be appreciated in the near future, no one will buy them, and there is a good chance their entire project will be closed very soon.
The problem of inferiority and unimportance—Their work is worse in every possible way than the robots’ work, and no one outside really cares whether they paint or not. Even the humans inside the room prefer looking at what the robots paint to their own work.
These problems didn’t exist before, and that’s what makes AGI-nihilism even worse than ordinary nihilism.
I agree with your comment. To continue the analogy, she chose the path of Simon Wiesenthal rather than that of Oskar Schindler, which seems more natural to me in a way when there are no other countries to escape to—when almost everyone is a Nazi. (Not my views.)
I personally am not aligned with her values and disagree with her methods, but I also begrudgingly hold some respect for her intelligence and for the courage to follow her values wherever they take her.
A bit beside the point, but I’m skeptical of the idea of bullshit jobs in general. In my experience, people often label jobs whose contribution to the value chain is illegible or complex as bullshit, for example investment bankers (even though efficient capital allocation is a huge contribution) or lawyers.
I agree governments have a lot of inefficiency and superfluous positions, but I wonder how big bullshit jobs really are as a percentage of GDP.