Here is a paper from January 2022 on arXiv that details the sort of generalization-hop we’re seeing models doing.
So I guess any case in which you don’t have the data ready at hand would still be more difficult.
I’m not so sure... Another interesting/alarming thing is how these models are “grokking” concepts in a way that lets them generalize.
Agreed that superhuman intelligence seems like the kind of thing that could be a very powerful lever. What gets me is that we don’t seem to know how orthogonal or non-orthogonal intelligence and empathy are to one another.[1] If we were capable of creating a superhumanly intelligent AI and we were able to give it superhuman empathy, I might be inclined to trust ceding a large amount of power and control to that system (or set of systems, whatever). But a sociopathic superhuman intelligence? Definitely not ceding power over to that system.
The question for me then becomes: how confident are we that we are not creating dangerously sociopathic AI?
- ^
If I were to take a stab, I would say they are almost entirely orthogonal, as we have perfectly intelligent yet sociopathic humans walking around today who lack any sort of empathy. Giving any of these people superhuman ability and control would seem like an obviously terrible idea to me.
It’s interesting to think how this fits in with other notions like the Simulators idea.
This nicely sums up the feeling of “excitement” I get when thinking about this sort of thing too. It’s more of an anxiety really. I agree it’s very similar to the feeling of watching COVID-19 unfold in the early weeks of February and March 2020. It’s also similar to the feeling I got when Trump was elected, or on Jan 6. It’s just this feeling that something big and momentous is happening or is about to happen. That all of a sudden something totally consequential that you hadn’t previously considered is coming to pass. It’s a weird, doom-y sort of feeling.
This is kind of an “oh doy!” approach and I love it. It’s so straightforward and yet works so well for being such an obtuse, first-pass implementation. (Though I agree that there’s a high likelihood jailbreaks for this would be found quickly.) Still. This is wild.
Gary Marcus talked about this in his newsletter today and reached sort of the opposite conclusion: that a deal like this might indicate that OpenAI leadership are bailing out now while the getting is good because the long-term prospects for what they have are not looking great. Then again, he would say that, wouldn’t he?
Either way this is indeed a big deal (if true), whatever comes of it!
I’m interested in why you would think that writing “Superintelligence” would require less GI than full self-driving from NY to SF. The former seems like a pretty narrow task compared to the latter.
If it truly raises your hackles then maybe it’s worth sharing with at least one or two people who are working in safety research directly? Spreading it by ones and twos amongst people who would use the information for good (as it were) doesn’t seem too dangerous to me.
EY gets mentioned in a recent newsletter in The Atlantic from writer Derek Thompson.
Another article in The Atlantic today explicitly mentions the existential risk of AI and currently sits as the 9th most popular article on the website.
Some advanced intelligence that takes over doesn’t have to be directed toward human suffering for s-risk to happen. It could just happen as a byproduct of whatever unimaginable things the advanced intelligence might want/do as it goes about its own business completely heedless of us. In those cases we’re suffering in the same way that some nameless species in some niche of the world is suffering because humans, unaware that the species even exists, are encroaching on and destroying its natural domain in the course of just going about our own comparatively unimaginable business.
In addition, there is a front-page article on The Economist website today about AI accelerationism among the big tech firms.
A particularly interesting figure from the article (unrelated to the Overton Window):
In 2022, amid a tech-led stockmarket crunch, the big five poured $223bn into research and development (R&D), up from $109bn in 2019 (see chart 1). That was on top of $161bn in capital expenditure, a figure that had also doubled in three years. All told, this was equivalent to 26% of their combined annual revenues last year, up from 16% in 2015.
Thanks for the helpful clarification!
On the front page of the NY Times today (and this week):
I heavily endorse the tone and message of this post!
I also have a sense of optimism coming from society’s endogenous response to threats like these, especially with respect to how the public response to COVID shifted from Feb → Mar 2020 and how the last 6 months have gone for the public response to AI and AI safety (or even just the 4-5 months since ChatGPT was released in November). We could also look at the shift in response to climate change over the past 5-10 years.
Humanity does seem to have a knack for figuring things out just in the nick of time. Can’t say it’s something I’m glad to be relying on for optimism in this moment, but it has worked out in the past...
“Sorcerer’s Apprentice” from Fantasia as an analogy for alignment
Totally! Just figured for many like myself this might have been the first/only exposure to the tale.
COVID and climate change are actually easy problems
I’m not sure I’d agree with that at all. Also, how are you calculating what cost is “necessary” for problems like COVID/climate change vs. incurred because of a “less-than-perfect” response? How are we even determining what the “perfect” response would be? We have no way of measuring the counterfactual damage from some other response to COVID; we can only (approximately) measure the damage that has happened due to our actual response.
For those reasons alone I don’t make the same generalization you do about predicting the approximate range of damage from these types of problems.
To me the generalization to be made is simply this: the larger an exogenous threat looms in the public consciousness, the larger the societal response to that threat becomes. And the larger the societal response to exogenous threats, the more likely we are to find some way of overcoming them: whether by hard work, miracle, chance, or whatever.
And I think there’s a steady case to be made that the exogenous x-risk from AI is starting to loom larger and larger in the public consciousness: The Overton Window widens: Examples of AI risk in the media.
If I might take a crack at summarizing, it seems that you’ve realized the full scope of both the inner alignment and outer alignment problems. It’s pretty scary indeed. I think your insight that we don’t even need full AGI for these kinds of things to become large social problems is spot on, and as an industry insider I’m super glad you’re seeing that now!
One other thing that I think is worth contemplating is how much computational irreducibility would actually come into play with respect to still having real-world power and impact. I don’t think you would need to get anywhere near perfect simulation in order to begin to have extremely good predictive power over the world. We’re already seeing this in graphics and physics modeling. We’re starting to be able to do virtual wind tunnel simulations that yield faster and better results than physical wind tunnel tests,[1] and I think we’ll continue to find this to be the case. So presumably an advanced AGI would be able to create even better simulations that still work just as well in the real world, which would obviate much of the need for actual physical experimentation. Though I’m curious what others think here!
See here. Link to paper here.