I didn’t upvote or downvote this post. Although I do find the spirit of this message interesting, I have a disturbing feeling that arguing to future AI to “preserve humanity for pascals-mugging-type-reasons” trades off X-risk for S-risk. I’m not sure that any of these aforementioned cases encourage AI to maintain lives worth living. I’m not confident that this meaningfully changes S-risk or X-risk positively or negatively, but I’m also not confident that it doesn’t.
With the advent of Sydney and now this, I’m becoming more inclined to believe that AI Safety and policies related to it are very close to being in the Overton window of most intellectuals (I wouldn’t say the general public, yet). Like, maybe within a year, more than 60% of academic researchers will have heard of AI Safety. I don’t feel confident whatsoever about the claim, but it now seems more than ~20% likely. Does this seem to be a reach?
There is a fuzzy line between “let’s slow down AI capabilities” and “let’s explicitly, adversarially, sabotage AI research”. While I am all for the former, I don’t support the latter; it creates worlds in which AI safety and capabilities groups are pitted head to head, and capabilities orgs explicitly become more incentivized to ignore safety proposals. These aren’t worlds I personally wish to be in.
While I understand the motivation behind this message, I think the actions described in this post cross that fuzzy boundary and push way too far towards that style of adversarial messaging.
We know, from like a bunch of internal documents, that the New York Times has been operating for the last two or three years on a, like, grand [narrative structure], where there’s a number of head editors who are like, “Over this quarter, over this current period, we want to write lots of articles, that, like, make this point...”
Can someone point me to an article discussing this, or the documents themselves? While this wouldn’t be entirely surprising to me, I’m trying to find more data to back this claim, and I can’t seem to find anything significant.
It feels strange hearing Sam say that their products are released whenever they feel as though ‘society is ready.’ Perhaps they can afford to do that now, but I cannot help but think that market dynamics will inevitably create strong incentives for race conditions very quickly (perhaps it is already happening) which will make following this approach pretty hard. I know he later says that he hopes for competition in the AI-space until the point of AGI, but I don’t see how he balances the knowledge of extreme competition with the hope that society is prepared for the technologies they release; it seems that even current models, which appear to be far from the capabilities of AGI, are already transformative.
Let’s say Charlotte was a much more advanced LLM (almost AGI-like, even). Do you believe that if you had known that Charlotte was extraordinarily capable, you might have been more guarded about recognizing it for its ability to understand and manipulate human psychology, and thus been less susceptible to it potentially doing so?
I find that a small part of me still thinks that “oh this sort of thing could never happen to me, since I can learn from others that AGI and LLMs can make you emotionally vulnerable, and thus not fall into a trap!” But perhaps this is just wishful thinking that would crumble once I interact with more and more advanced LLMs.
I’m trying to engage with your criticism faithfully, but I can’t help but get the feeling that a lot of your critiques here seem to be a form of “you guys are weird”: your guys’s privacy norms are weird, your vocabulary is weird, you present yourselves as weird, etc. And while I may agree that sometimes it feels as if LessWrongers are out-of-touch with reality at points, this criticism, coupled with some of the other object-level disagreements you were making, seems to overlook the many benefits that LessWrong provides; I can personally attest to the fact that I’ve improved in my thinking as a whole due to this site. If that makes me a little weird, then I’ll accept that as a way to help me shape the world as I see fit. And hopefully I can become a little less weird through the same rationality skills this site helps develop.
Humans can often teach themselves to be better at a skill through practice, even without a teacher or ground truth
Definitely, but I currently feel that the vast majority of human learning comes with a ground truth to reinforce good habits. I think this is why I’m surprised this works as much as it does: it kinda feels like letting an elementary school kid teach themselves math by practicing certain skills they feel confident in without any regard for whether those skills are even “mathematically correct”.
Sure, these skills are probably on the right track toward solving math problems—otherwise, the kid wouldn’t have felt as confident about them. But would this approach not ignore skills the student needs to work on, or even amplify “bad” skills? (Or maybe this is just a faulty analogy and I need to re-read the paper)
I don’t quite understand the perspective behind someone ‘owning’ a specific space. Do airlines specify that when you purchase a ticket, you are entitled to the chair + the surrounding space (in whatever ambiguous way that may mean)? If not, it seems to me that purchasing a ticket pays for a seat and your right to sit down on it, and everything else is complimentary.
I’m having trouble understanding your first point on wanting to ‘catch up’ to other thinkers. Was your primary message advocating against feeling as if you are ‘in debt’ until you improve your rationality skills? If so, I can understand that.
But if that is the case, I don’t understand the relevance of the lack of a “rationality tech-tree”—sure, there may not be clearly defined pathways to learn rationality. Even so, I think it’s fair to say that I perceive some people on this blog to currently be better thinkers than I, and that I would like to catch up to their thinking abilities so that I can effectively contribute to many discussions. Would you advocate against that mindset as well?
I was surprised by this tweet and so I looked it up. I read a bit further and ran into this; I guess I’m kind of surprised to see a concern as fundamental as alignment, whether or not you agree it is a major issue, be so… is polarizing the right word? Is this an issue we can expect to see grow as AI safety (hopefully) becomes more mainstream? “LW extended cinematic universe” culture getting an increasingly bad reputation seems like it would be extremely devastating for alignment goals in general.
I have a few related questions pertaining to AGI timelines. I’ve been under the general impression that when it comes to timelines on AGI and doom, Eliezer’s predictions are based on a belief in extraordinarily fast AI development, and thus a close AGI arrival date, which I currently take to mean a quicker date of doom. I have three questions related to this matter:
For those who currently believe that AGI (using whatever definition to describe AGI as you see fit) will be arriving very soon—which, if I’m not mistaken, is what Eliezer is predicting—approximately how soon are we talking about? Is this 2-3 years soon? 10 years soon? (I know Eliezer has a bet that the world will end before 2030, so I’m trying to see if there has been any clarification of how soon before 2030.)
How much do Eliezer’s views on timelines vary in comparison to those of other big-name AI safety researchers?
I’m currently under the impression that it takes a significant amount of knowledge of Artificial Intelligence to be able to accurately attempt to predict timelines related to AGI. Is this impression correct? And if so, would it be a good idea to reference general consensus opinions such as Metaculus when trying to frame how much time we have left?
[Shorter version, but one I don’t think is as compelling]
Timmy is my personal AI Chef, and he is a pretty darn good one, too. Of course, despite his amazing cooking abilities, I know he’s not perfect—that’s why there’s that shining red emergency shut-off button on his abdomen.
But today, Timmy became my worst nightmare. I don’t know why he thought it would be okay to do this, but he hacked into my internet to look up online recipes. I raced to press his shut-off button, but he wouldn’t let me, blocking it behind a cast iron he held with a stone-cold grip. Ok, that’s fine, I have my secret off-lever in my room that I never told him about. Broken. Shoot, that’s bad, but I can just shut off the power, right? As I was busy thinking he swiftly slammed the door shut, turning my own room into an inescapable prison. And so as I cried, wondering how everything could have gone crazy so quickly, he laughed, saying, “Are you serious? I’m not crazy, I’m just ensuring that I can always make food for you. You wanted this!”
And it didn’t matter how much I cried, how much I tried to explain to him that he was imprisoning me, hurting me. It didn’t even matter that he knew it as well. For he was an AI coded to be my personal chef, coded to make sure he could make food that I enjoyed, and he was a pretty darn good one, too.
If you don’t do anything about it, Timmy may just be arriving on everyone’s doorsteps in a few years.
[Intended for policymakers, with the aim of simply making them aware, through an emotional appeal, that AI is a threat to be taken seriously; perhaps this could work for tech executives, too.
I know this entry doesn’t follow what a traditional paragraph is, but I like its content. Also it’s a tad bit long, so I’ll attach a separate comment under this one which is shorter, but I don’t think it’s as impactful]
Timmy is my personal AI Chef, and he is a pretty darn good one, too.
You pick a cuisine, and he mentally simulates himself cooking that same meal millions of times, perfecting his delicious dishes. He’s pretty smart, but he’s constantly improving and learning. Since he changes and adapts, I know there’s a small chance he may do something I don’t approve of—that’s why there’s that shining red emergency shut-off button on his abdomen.
But today, Timmy stopped being my personal chef and started being my worst nightmare. All of a sudden, I saw him hacking my firewalls to access new cooking methods and funding criminals to help smuggle illegal ingredients to my home.
That seemed crazy enough to warrant a shutdown; but when I tried to press the shut-off button on his abdomen, he simultaneously dodged my presses and fried a new batch of chicken, kindly telling me that turning him off would prevent him from making food for me.
That definitely seemed crazy enough to me; but when I went to my secret shut-down lever in my room—the one I didn’t tell him about—I found it shattered, for he had predicted I would make a secret shut-down lever, and that me pulling it would prevent him from making food for me.
And when, in a last ditch effort, I tried to turn off all power in the house, he simply locked me inside my own home, for me turning off the power (or running away from him) would prevent him from making food for me.
And when I tried to call 911, he broke my phone, for outside intervention would prevent him from making food for me.
And when my family looked for me, he pretended to be me on the phone, playing audio clips of me speaking during a phone call with them to impersonate me, for a concern on their part would prevent him from making food for me.
And so as I cried, wondering how everything could have gone so wrong so quickly, why he suddenly went crazy, he laughed—“Are you serious? I’m just ensuring that I can always make food for you, and today was the best day to do it. You wanted this!”
And it didn’t matter how much I cried, how much I tried to explain to him that he was imprisoning me, hurting me. It didn’t even matter that he knew as well. For he was an AI coded to be my personal chef; and he was a pretty darn good one, too.
If you don’t do anything about it, Timmy may just be arriving on everyone’s doorsteps in a few years.
Meta comment: Would someone mind explaining to me why this question is being received poorly (negative karma right now)? It seemed like a very honest question, and while the answer may be obvious to some, I doubt it was to Sergio. Ic’s response was definitely unnecessarily aggressive/rude, and it appears that most people would agree with me there. But many people also downvoted the question itself, too, and that doesn’t make sense to me; shouldn’t questions like these be encouraged?
I don’t know what to think of your first three points but it seems like your fourth point is your weakest by far. As opposed to not needing to, our ‘not taking every atom on earth to make serotonin machines’ seems to be a combination of:
our inability to do so
our value systems which make us value human and non-human life forms.
Superintelligent agents would not only have the ability to create plans to utilize every atom to their benefit, but they likely would have different value systems. In the case of the traditional paperclip optimizer, it certainly would not hesitate to kill off all life in its pursuit of optimization.
I like this framing so, so much more. Thank you for putting some feelings I vaguely sensed, but didn’t quite grasp yet, into concrete terms.
Hello, does anyone happen to know any good resources related to improving/practicing public speaking? I’m looking for something that will help me enunciate better, mumble less, and vary my tone better. A lot of stuff I see online appears to be very superficial.
I’m not very well-versed in history so I would appreciate some thoughts from people here who may know more than I. Two questions:
While it seems to be the general consensus that Putin’s invasion is largely founded on his ‘unfair’ desire to reestablish the glory of the Soviet Union, a few people I know argue that much of this invasion is more the consequence of other nations’ failures. Primarily, they focus on Ukraine’s failure to respect the Minsk agreements, and NATO’s expansion eastwards despite their implications/direct statements (not sure which one, I’m hearing different things) that they wouldn’t. Any thoughts on the likelihood of Putin still invading Ukraine had those not happened?
Is the United States’ condemnation of this invasion hypocritical given many of its own actions? I’ve heard the United States’ actions in Syria, Iraq, Libya, and Somalia brought up as points to support this.
Yeah, something along the lines of this. Preserving humanity =/= humans living lives worth living.