Christopher King
Karma: 669
@theking@mathstodon.xyz
Optimality is the tiger, and annoying the user is its teeth
Christopher King
28 Jan 2023 20:20 UTC
25
points
5
comments
2
min read
Is this a weak pivotal act: creating nanobots that eat evil AGIs (but nothing else)?
Christopher King
10 Feb 2023 19:26 UTC
0
points
3
comments
1
min read
Threatening to do the impossible: A solution to spurious counterfactuals for functional decision theory via proof theory
Christopher King
11 Feb 2023 7:57 UTC
5
points
4
comments
5
min read
Bing finding ways to bypass Microsoft’s filters without being asked. Is it reproducible?
Christopher King
20 Feb 2023 15:11 UTC
16
points
15
comments
1
min read
Is there an ML agent that abandons its utility function out-of-distribution without losing capabilities?
Christopher King
22 Feb 2023 16:49 UTC
1
point
7
comments
1
min read
A ranking scale for how severe the side effects of solutions to AI x-risk are
Christopher King
8 Mar 2023 22:53 UTC
3
points
0
comments
2
min read
Could Roko’s basilisk acausally bargain with a paperclip maximizer?
Christopher King
13 Mar 2023 18:21 UTC
1
point
8
comments
1
min read
A better analogy and example for teaching AI takeover: the ML Inferno
Christopher King
14 Mar 2023 19:14 UTC
18
points
0
comments
5
min read
ARC tests to see if GPT-4 can escape human control; GPT-4 failed to do so
Christopher King
15 Mar 2023 0:29 UTC
116
points
22
comments
2
min read
Capabilities Denial: The Danger of Underestimating AI
Christopher King
21 Mar 2023 1:24 UTC
6
points
5
comments
3
min read
Exploring the Precautionary Principle in AI Development: Historical Analogies and Lessons Learned
Christopher King
21 Mar 2023 3:53 UTC
−1
points
2
comments
9
min read
GPT-4 aligning with acausal decision theory when instructed to play games, but includes a CDT explanation that’s incorrect if they differ
Christopher King
23 Mar 2023 16:16 UTC
7
points
4
comments
8
min read
A crazy hypothesis: GPT-4 already is agentic and is trying to take over the world!
Christopher King
24 Mar 2023 1:19 UTC
−2
points
11
comments
9
min read
Does GPT-4 exhibit agency when summarizing articles?
Christopher King
24 Mar 2023 15:49 UTC
16
points
2
comments
5
min read
More experiments in GPT-4 agency: writing memos
Christopher King
24 Mar 2023 17:51 UTC
5
points
2
comments
10
min read
GPT-4 is bad at strategic thinking
Christopher King
27 Mar 2023 15:11 UTC
22
points
8
comments
1
min read
GPT-4 busted? Clear self-interest when summarizing articles about itself vs when article talks about Claude, LLaMA, or DALL·E 2
Christopher King
31 Mar 2023 17:05 UTC
6
points
4
comments
4
min read
Imagine a world where Microsoft employees used Bing
Christopher King
31 Mar 2023 18:36 UTC
6
points
2
comments
2
min read
AI community building: EliezerKart
Christopher King
1 Apr 2023 15:25 UTC
45
points
0
comments
2
min read
Do we have a plan for the “first critical try” problem?
Christopher King
3 Apr 2023 16:27 UTC
−3
points
14
comments
1
min read