Jack R

Karma: 301

Berkeley group house, spots open

Jack R Sep 22, 2022, 5:13 PM

4 points

1 comment · 1 min read · LW link

Jack R Sep 20, 2022, 3:47 AM
6 points
2
on: Prize idea: Transmit MIRI and Eliezer’s worldviews
Aren’t turned off by perceived arrogance
One hypothesis I’ve had is that people with more MIRI-like views tend to be more arrogant themselves. A possible mechanism is that the idea that the world is going to end and that they are the only ones who can save it is appealing in a way that shifts their views on certain questions and changes the way they think about AI (e.g. they need less explanation that they are some of the most important people ever, so they spend less time considering why AI might go well by default).

[ETA: In case it wasn’t clear, I am positing subconscious patterns correlated with arrogance that lead to MIRI-like views]

Jack R Sep 12, 2022, 6:51 PM
2 points
in reply to: James_Miller’s comment on: How can we ensure that a Friendly AI team will be sane enough?
How’d this go? Just searched LW for “neurofeedback” since I recently learned about it

Jack R Sep 6, 2022, 12:37 AM
1 point
0
in reply to: elifland’s comment on: Discussion on utilizing AI for alignment
That argument makes sense, thanks

Jack R Sep 4, 2022, 10:05 PM
7 points
1
on: Discussion on utilizing AI for alignment
We are very likely not going to miss out on alignment by a 2x productivity boost, that’s not how things end up in the real world. We’ll either solve alignment or miss by a factor of >10x.
Why is this true?

Jack R Sep 4, 2022, 7:52 PM
9 points
1
on: The shard theory of human values
the genome can’t directly make us afraid of death
It’s not necessarily direct, but in case you aren’t aware of it, prepared learning is a relevant phenomenon, since apparently the genome does predispose us to certain fears.

Jack R Aug 30, 2022, 5:13 AM
2 points
0
on: Announcing Encultured AI: Building a Video Game
Seems like this guy has already started trying to use GPT-3 in a videogame: GPT3 AI Game Prototype

[Linkpost] Can lab-grown brains become conscious?

Jack R Aug 28, 2022, 5:45 PM

14 points

3 comments · 1 min read · LW link

Jack R Aug 23, 2022, 10:41 PM
1 point
0
in reply to: johnswentworth’s comment on: AGI Timelines Are Mostly Not Strategically Relevant To Alignment
Not sure if it was clear, but the reason I asked is that if you think the fraction changes significantly before AGI, then the claim that Thane quotes in the top-level comment wouldn’t be true.

Jack R Aug 23, 2022, 10:24 PM
4 points
0
on: AGI Timelines Are Mostly Not Strategically Relevant To Alignment
Don’t timelines change your views on takeoff speeds? If not, what’s an example piece of evidence that updates your timelines but not your takeoff speeds?

Jack R Aug 23, 2022, 10:23 PM
2 points
1
in reply to: Thane Ruthenis’s comment on: AGI Timelines Are Mostly Not Strategically Relevant To Alignment
Same—also interested if John was assuming that the fraction of deployment labor that is automated changes negligibly over time pre-AGI.

Jack R Aug 21, 2022, 4:20 AM
2 points
0
on: Broad Picture of Human Values
Humans can change their action patterns on a dime, inspired by philosophical arguments, convinced by logic, indoctrinated by political or religious rhetoric, or plainly because they’re forced to.
I’d add that action patterns can change for reasons other than logical/deliberative ones. For example, adapting to a new culture means you might adopt new reactions to objects, gestures, etc. that are considered symbolic in that culture.

Jack R Aug 18, 2022, 10:11 PM
2 points
0
on: Discovering Agents
so the edge $(\tilde{S}, \tilde{Q})$ is terminal
Earlier you said that the blue edges were terminal edges.

Jack R Aug 18, 2022, 6:11 AM
11 points
6
in reply to: ShardPhoenix’s comment on: Announcing Encultured AI: Building a Video Game
What are some of the “various things” you have in mind here? It seems possible to me that something like “AI alignment testing” is straightforwardly upstream of what players want, but maybe you were thinking of something else

Jack R Aug 11, 2022, 7:21 PM
3 points
0
on: Flash Classes: Pendulums, Policy-Level Decisionmaking, Saving State
“Go with your gut” [...] [is] insensitive to circumstance.
People’s guts seem very sensitive to circumstance, especially compared to commitments.

Jack R Aug 11, 2022, 7:26 AM
3 points
0
on: What misalignment looks like as capabilities scale
But the capabilities of neural networks are currently advancing much faster than our ability to understand how they work or interpret their cognition;
Naively, you might think that as opacity increases, trust in systems decreases, and hence something like “willingness to deploy” decreases.
How good of an argument does this seem to you against the hypothesis that “capabilities will grow faster than alignment”? I’m viewing the quoted sentence as an argument for the hypothesis.

Some initial thoughts:
- A highly capable system doesn’t necessarily need to be deployed by humans to disempower humans, meaning “deployment” is not necessarily a good concept to use here
- On the other hand, deployability of systems increases investment in AI (how much?), meaning that increasing opacity might in some sense decrease future capabilities compared to counterfactuals where the AI was less opaque
- I don’t know how much willingness to deploy really decreases from increased opacity, if at all
- Opacity can be thought of as the inability to predict behavior in a given new environment. As models have scaled, the number of benchmarks we test them on also seems to have scaled, which does help us understand their behavior. So perhaps the measure that’s actually important is the “difference between tested behavior and deployed behavior” and it’s unclear to me what this metric looks like over time. [ETA: it feels obvious that our understanding of AI’s deployed behavior has worsened, but I want to be more specific and sure about that]

Jack R Jul 28, 2022, 8:59 PM
1 point
0
in reply to: Yonatan Cale’s comment on: Will working here advance AGI? Help us not destroy the world!
I was thinking of the possibility of affecting decision-making, either directly by rising the ranks (not very likely) or indirectly by being an advocate for safety at an important time and pushing things into the Overton window within an organization.
I imagine Habryka would say that a significant possibility here is that joining an AGI lab will wrongly turn you into an AGI enthusiast. I think biasing effects like that are real, though I also think it’s hard to tell in cases like that how much you are biased vs. updating correctly on new information, and one could make similar bias claims about the AI x-risk community (e.g. there is social pressure to be doomy; only being exposed to heuristic arguments for doom and few heuristic arguments for optimism will bias you to be doomier than you would be given more information).

Jack R Jul 27, 2022, 4:51 AM
1 point
0
in reply to: habryka’s comment on: Will working here advance AGI? Help us not destroy the world!
It seems like you are confident that the delta in capabilities would outweigh any delta in general alignment sympathy. Is this what you think?

Jack R Jul 26, 2022, 8:43 AM
2 points
1
on: A central AI alignment problem: capabilities generalization, and the sharp left turn
Attempting to manually specify the nature of goodness is a doomed endeavor, of course, but that’s fine, because we can instead specify processes for figuring out (the coherent extrapolation of) what humans value. […] So today’s alignment problems are a few steps removed from tricky moral questions, on my models.

I’m not convinced that choosing those processes is significantly non-moral. I might be misunderstanding what you are pointing at, but it feels like the fact that being able to choose the voting system gives you power over the vote’s outcome is evidence of this sort of thing: that meta decisions are still importantly tied to decisions.

Jack R Jul 14, 2022, 10:34 PM
1 point
3
in reply to: Algon’s comment on: Criticism of EA Criticism Contest
I think there should be a word for your parsing, maybe “VNM utilitarianism,” but I think most people mean roughly what’s on the wiki page for utilitarianism:
Utilitarianism is a family of normative ethical theories that prescribe actions that maximize happiness and well-being for all affected individuals
