Christopher King
Karma: 669
@theking@mathstodon.xyz
ARC tests to see if GPT-4 can escape human control; GPT-4 failed to do so
Christopher King, 15 Mar 2023 0:29 UTC, 116 points, 22 comments, 2 min read, LW link
Formalizing the “AI x-risk is unlikely because it is ridiculous” argument
Christopher King, 3 May 2023 18:56 UTC, 47 points, 17 comments, 3 min read, LW link
AI community building: EliezerKart
Christopher King, 1 Apr 2023 15:25 UTC, 45 points, 0 comments, 2 min read, LW link
The way AGI wins could look very stupid
Christopher King, 12 May 2023 16:34 UTC, 42 points, 22 comments, 1 min read, LW link
Anthropically Blind: the anthropic shadow is reflectively inconsistent
Christopher King, 29 Jun 2023 2:36 UTC, 40 points, 38 comments, 10 min read, LW link
Solomonoff induction still works if the universe is uncomputable, and its usefulness doesn’t require knowing Occam’s razor
Christopher King, 18 Jun 2023 1:52 UTC, 38 points, 28 comments, 4 min read, LW link
“Corrigibility at some small length” by dath ilan
Christopher King, 5 Apr 2023 1:47 UTC, 32 points, 3 comments, 9 min read, LW link
(www.glowfic.com)
[Question] Accuracy of arguments that are seen as ridiculous and intuitively false but don’t have good counter-arguments
Christopher King, 29 Apr 2023 23:58 UTC, 30 points, 39 comments, 1 min read, LW link
Current AI harms are also sci-fi
Christopher King, 8 Jun 2023 17:49 UTC, 26 points, 3 comments, 1 min read, LW link
Optimality is the tiger, and annoying the user is its teeth
Christopher King, 28 Jan 2023 20:20 UTC, 25 points, 5 comments, 2 min read, LW link
Proposal: we should start referring to the risk from unaligned AI as a type of *accident risk*
Christopher King, 16 May 2023 15:18 UTC, 22 points, 6 comments, 2 min read, LW link
GPT-4 is bad at strategic thinking
Christopher King, 27 Mar 2023 15:11 UTC, 22 points, 8 comments, 1 min read, LW link
A better analogy and example for teaching AI takeover: the ML Inferno
Christopher King, 14 Mar 2023 19:14 UTC, 18 points, 0 comments, 5 min read, LW link
How do low level hypotheses constrain high level ones? The mystery of the disappearing diamond.
Christopher King, 11 Jul 2023 19:27 UTC, 17 points, 11 comments, 2 min read, LW link
Bing finding ways to bypass Microsoft’s filters without being asked. Is it reproducible?
Christopher King, 20 Feb 2023 15:11 UTC, 16 points, 15 comments, 1 min read, LW link
Does GPT-4 exhibit agency when summarizing articles?
Christopher King, 24 Mar 2023 15:49 UTC, 16 points, 2 comments, 5 min read, LW link
GPT-4 aligning with acasual decision theory when instructed to play games, but includes a CDT explanation that’s incorrect if they differ
Christopher King, 23 Mar 2023 16:16 UTC, 7 points, 4 comments, 8 min read, LW link
PCAST Working Group on Generative AI Invites Public Input
Christopher King, 13 May 2023 22:49 UTC, 7 points, 0 comments, 1 min read, LW link
(terrytao.wordpress.com)
Challenge proposal: smallest possible self-hardening backdoor for RLHF
Christopher King, 29 Jun 2023 16:56 UTC, 7 points, 0 comments, 2 min read, LW link
Inference from a Mathematical Description of an Existing Alignment Research: a proposal for an outer alignment research program
Christopher King, 2 Jun 2023 21:54 UTC, 7 points, 4 comments, 16 min read, LW link