Christopher King

Karma: 669

@theking@mathstodon.xyz

Optimality is the tiger, and annoying the user is its teeth

Christopher King, 28 Jan 2023 20:20 UTC

25 points

5 comments, 2 min read, LW link

Is this a weak pivotal act: creating nanobots that eat evil AGIs (but nothing else)?

Christopher King, 10 Feb 2023 19:26 UTC

0 points

3 comments, 1 min read, LW link

Threatening to do the impossible: A solution to spurious counterfactuals for functional decision theory via proof theory

Christopher King, 11 Feb 2023 7:57 UTC

5 points

4 comments, 5 min read, LW link

Bing finding ways to bypass Microsoft’s filters without being asked. Is it reproducible?

Christopher King, 20 Feb 2023 15:11 UTC

16 points

15 comments, 1 min read, LW link

Is there a ML agent that abandons its utility function out-of-distribution without losing capabilities?

Christopher King, 22 Feb 2023 16:49 UTC

1 point

7 comments, 1 min read, LW link

A ranking scale for how severe the side effects of solutions to AI x-risk are

Christopher King, 8 Mar 2023 22:53 UTC

3 points

0 comments, 2 min read, LW link

Could Roko’s basilisk acausally bargain with a paperclip maximizer?

Christopher King, 13 Mar 2023 18:21 UTC

1 point

8 comments, 1 min read, LW link

A better analogy and example for teaching AI takeover: the ML Inferno

Christopher King, 14 Mar 2023 19:14 UTC

18 points

0 comments, 5 min read, LW link

ARC tests to see if GPT-4 can escape human control; GPT-4 failed to do so

Christopher King, 15 Mar 2023 0:29 UTC

116 points

22 comments, 2 min read, LW link

Capabilities Denial: The Danger of Underestimating AI

Christopher King, 21 Mar 2023 1:24 UTC

6 points

5 comments, 3 min read, LW link

Exploring the Precautionary Principle in AI Development: Historical Analogies and Lessons Learned

Christopher King, 21 Mar 2023 3:53 UTC

−1 points

2 comments, 9 min read, LW link

GPT-4 aligning with acausal decision theory when instructed to play games, but includes a CDT explanation that’s incorrect if they differ

Christopher King, 23 Mar 2023 16:16 UTC

7 points

4 comments, 8 min read, LW link

A crazy hypothesis: GPT-4 already is agentic and is trying to take over the world!

Christopher King, 24 Mar 2023 1:19 UTC

−2 points

11 comments, 9 min read, LW link

Does GPT-4 exhibit agency when summarizing articles?

Christopher King, 24 Mar 2023 15:49 UTC

16 points

2 comments, 5 min read, LW link

More experiments in GPT-4 agency: writing memos

Christopher King, 24 Mar 2023 17:51 UTC

5 points

2 comments, 10 min read, LW link

GPT-4 is bad at strategic thinking

Christopher King, 27 Mar 2023 15:11 UTC

22 points

8 comments, 1 min read, LW link

GPT-4 busted? Clear self-interest when summarizing articles about itself vs when article talks about Claude, LLaMA, or DALL·E 2

Christopher King, 31 Mar 2023 17:05 UTC

6 points

4 comments, 4 min read, LW link

Imagine a world where Microsoft employees used Bing

Christopher King, 31 Mar 2023 18:36 UTC

6 points

2 comments, 2 min read, LW link

AI community building: EliezerKart

Christopher King, 1 Apr 2023 15:25 UTC

45 points

0 comments, 2 min read, LW link

Do we have a plan for the “first critical try” problem?

Christopher King, 3 Apr 2023 16:27 UTC

−3 points

14 comments, 1 min read, LW link