Attention LessWrong—I do not have any sort of power as I do not have a code. I also do not know anybody who has the code.
I would like to say, though, that I had a very good apple pie last night.
That’s about it. Have a great Petrov day :)
Woah… I don’t know what exactly I was expecting to get out of this article, but I thoroughly enjoyed it! Would love to see the possible sequence you mentioned come to life.
There is a fuzzy line between “let’s slow down AI capabilities” and “let’s explicitly, adversarially sabotage AI research”. While I am all for the former, I don’t support the latter; it creates worlds in which AI safety and capabilities groups are pitted head to head, and capabilities orgs become explicitly more incentivized to ignore safety proposals. These aren’t worlds I personally wish to be in.
While I understand the motivation behind this message, I think the actions described in this post cross that fuzzy boundary and push way too far towards that style of adversarial messaging.
Awesome recommendations, I really appreciated them (especially the one on game theory, that was a lot of fun to play through). I would also like to suggest the Replacing Guilt series by Nate Soares for those who haven’t seen it on his blog or on the EA Forum; it’s a fantastic series that I would highly recommend people check out.
With the advent of Sydney and now this, I’m becoming more inclined to believe that AI Safety and policies related to it are very close to being in the Overton window of most intellectuals (I wouldn’t say the general public, yet). Like, maybe within a year, more than 60% of academic researchers will have heard of AI Safety. I don’t feel confident whatsoever about the claim, but it now seems more than ~20% likely. Does this seem like a reach?
It feels strange hearing Sam say that their products are released whenever they feel as though ‘society is ready.’ Perhaps they can afford to do that now, but I cannot help but think that market dynamics will very quickly create strong incentives to race (perhaps it is already happening), which will make following this approach pretty hard. I know he later says that he hopes for competition in the AI space until the point of AGI, but I don’t see how he balances the expectation of extreme competition with the hope that society is prepared for the technologies they release; it seems that even current models, which appear to be far from AGI-level capabilities, are already transformative.
Sheesh. Wild conversation. While I felt Lex often missed the points Eliezer was making, I’m glad he gave him the space and time to speak. Unfortunately, it felt like the conversation would keep building towards some critically important insight that Eliezer wanted Lex to understand, and then Lex would change the topic, and Eliezer would have to start building towards a new insight. Regardless, I appreciate that Lex and Eliezer thoroughly engaged with each other; this will probably spark good dialogue and get more people interested in the field. I’m glad it happened.
For those who are time-constrained and wondering what is in it: Lex and Eliezer cover a whole bunch of high-level points related to AI not-kill-everyone-ism, delving into various thought experiments and concepts that underpin Eliezer’s worldview. Nothing super novel if you’ve been following the field for some time.
I have a few related questions about AGI timelines. I’ve been under the general impression that Eliezer’s predictions about AGI and doom are based on a belief in extraordinarily fast AI development, and thus a near AGI arrival date, which I currently take to mean an earlier date of doom. Three questions on this:
1. For those who currently believe that AGI (using whatever definition of AGI you see fit) will arrive very soon, which, if I’m not mistaken, is what Eliezer is predicting: approximately how soon are we talking? 2-3 years soon? 10 years soon? (I know Eliezer has a bet that the world will end before 2030, so I’m trying to see if there has been any clarification of how soon before 2030.)
2. How much do Eliezer’s views on timelines differ from those of other big-name AI safety researchers?
3. I’m currently under the impression that it takes a significant amount of knowledge of artificial intelligence to accurately predict AGI timelines. Is this impression correct? And if so, would it be a good idea to reference aggregate forecasts such as those on Metaculus when trying to frame how much time we have left?
I like this framing so, so much more. Thank you for putting some feelings I vaguely sensed, but didn’t quite grasp yet, into concrete terms.
Ehh… your base rate of 10% for LW users who are willing to pay for a subscription feels too high, especially since the ‘free’ version would still offer everything I (and presumably others) care about. Generalizing to other platforms, this feels closest to Twitter’s situation with Twitter Blue, whose rate appears to be far, far lower: if we’re generous and say they have one million subscribers out of their 41.5 million monetizable daily active users, that would suggest a base rate of less than 3%.
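To spell out the arithmetic behind that figure (the one-million subscriber count is a generous assumption on my part, not an official number):

$$\frac{1{,}000{,}000}{41{,}500{,}000} \approx 0.024 \approx 2.4\%$$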
I’m trying to engage with your criticism faithfully, but I can’t help getting the feeling that a lot of your critiques here boil down to “you guys are weird”: your privacy norms are weird, your vocabulary is weird, you present yourselves as weird, etc. And while I may agree that LessWrongers sometimes feel out of touch with reality, this criticism, coupled with some of the other object-level disagreements you raise, seems to overlook the many benefits LessWrong provides; I can personally attest that I’ve improved my thinking as a whole because of this site. If that makes me a little weird, then I’ll accept that as a way to help me shape the world as I see fit. And hopefully I can become a little less weird through the same rationality skills this site helps develop.
Let’s say Charlotte was a much more advanced LLM (almost AGI-like, even). Do you believe that, had you known Charlotte was extraordinarily capable, you might have been more guarded, recognizing its ability to understand and manipulate human psychology, and thus been less susceptible to it doing so?
I find that a small part of me still thinks “oh, this sort of thing could never happen to me, since I can learn from others that AGI and LLMs can make you emotionally vulnerable, and thus avoid falling into the trap!” But perhaps this is just wishful thinking that would crumble once I interact with more and more advanced LLMs.
The increased public attention towards AI risk is probably a good thing. But when stuff like this gets lumped in with the rest of AI Safety, it feels like the public-facing slow-down-AI movement is going to be a grab-bag of AI Safety, AI Ethics, and AI… privacy(?). As such, I’m afraid the public discourse will devolve into “Woah-there-Slow-AI” versus “GOGOGOGO” tribal warfare; given the track record of American politics, this seems likely, maybe even inevitable.
More importantly, though, what I’m afraid of is that this will translate into adversarial relations between AI Capabilities organizations and AI Safety orgs (more generally, that capabilities teams will become less inclined to incorporate safety concerns in their products).
I’m not actually in an AI organization, so if someone is in one and has thoughts on this dynamic happening/not happening, I would love to hear.
Relevant Manifold Market:
I don’t quite understand the perspective behind someone ‘owning’ a specific space. Do airlines specify that when you purchase a ticket, you are entitled to the chair plus the surrounding space (however ambiguously that may be defined)? If not, it seems to me that purchasing a ticket pays for a seat and your right to sit in it, and everything else is complimentary.
Reverse engineering. Unclear if this is being pushed much anymore. 2022: Anthropic circuits, Interpretability In The Wild, Grokking mod arithmetic
FWIW, I was one of Neel’s MATS 4.1 scholars, and I would classify 3 of the 4 outputs from Neel’s scholars as reverse engineering some component of LLMs (for completeness, this is the other one, which doesn’t fit nicely as ‘reverse engineering’ imo). I would also say that this is still an active direction of research (lots of ground to cover with MLP neurons, polysemantic heads, and more).
Humans can often teach themselves to be better at a skill through practice, even without a teacher or ground truth
Definitely, but I currently feel that the vast majority of human learning comes with a ground truth to reinforce good habits. I think this is why I’m surprised this works as well as it does: it kinda feels like letting an elementary school kid teach themselves math by practicing the skills they feel confident in, without any regard for whether those skills are even “mathematically correct”.
Sure, these skills are probably on the right track toward solving math problems—otherwise, the kid wouldn’t have felt as confident about them. But would this approach not ignore skills the student needs to work on, or even amplify “bad” skills? (Or maybe this is just a faulty analogy and I need to re-read the paper)
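For what it’s worth, here’s a toy sketch of the failure mode I’m imagining (purely illustrative, not taken from the paper): a “student” with one systematically wrong rule practices problems, keeps only the answers it felt confident about, and uses those to reinforce its trust in the rule. With no ground truth in the loop, confidence (not correctness) is what gets amplified.

```python
import random

# Toy illustration (my own hypothetical, not the paper's method):
# the student believes a*b = a*b + bias. It practices, keeps only the
# answers it "felt confident" about, and treats them as training signal.
random.seed(0)

bias = 3          # a systematic error the student starts with
confidence = 0.5  # how strongly the student trusts its own rule

for step in range(5):
    practice = [(random.randint(1, 9), random.randint(1, 9)) for _ in range(20)]
    kept = []
    for a, b in practice:
        answer = a * b + bias             # apply the (wrong) rule
        if confidence > random.random():  # "felt confident" -> keep as training data
            kept.append((a, b, answer))
    # "Self-training": the more of its own answers it keeps, the more it trusts the rule.
    confidence = min(1.0, confidence + 0.05 * len(kept) / len(practice))
    total_error = sum(abs(ans - a * b) for a, b, ans in kept)
    print(f"step {step}: confidence={confidence:.2f}, kept={len(kept)}, error in kept set={total_error}")
```

The mistake never gets corrected, yet the student ends each round more confident than it started, which is roughly the worry I was trying to gesture at with the analogy.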
I’m having trouble understanding your first point about wanting to ‘catch up’ to other thinkers. Was your primary message advocating against feeling as if you are ‘in debt’ until you improve your rationality skills? If so, I can understand that.
But if that is the case, I don’t understand the relevance of the lack of a “rationality tech-tree”: sure, there may not be clearly defined pathways for learning rationality. Even so, I think it’s fair to say that I perceive some people on this blog to currently be better thinkers than I am, and that I would like to catch up to their thinking abilities so that I can effectively contribute to many discussions. Would you advocate against that mindset as well?
Meta comment: Would someone mind explaining to me why this question is being received poorly (negative karma right now)? It seemed like a very honest question, and while the answer may be obvious to some, I doubt it was to Sergio. Ic’s response was definitely unnecessarily aggressive/rude, and it appears most people agree with me there. But many people downvoted the question itself too, and that doesn’t make sense to me; shouldn’t questions like these be encouraged?
I didn’t upvote or downvote this post. Although I do find the spirit of this message interesting, I have a disturbing feeling that arguing to a future AI that it should “preserve humanity for Pascal’s-mugging-type reasons” trades off X-risk for S-risk. I’m not sure that any of the aforementioned cases encourage the AI to maintain lives worth living. I’m not confident that this meaningfully changes S-risk or X-risk positively or negatively, but I’m also not confident that it doesn’t.