Christopher King

Karma: 826

@theking@mathstodon.xyz

Does GPT-4 exhibit agency when summarizing articles?

Christopher King · Mar 24, 2023, 3:49 PM
16 points
2 comments · 5 min read · LW link

A crazy hypothesis: GPT-4 already is agentic and is trying to take over the world!

Christopher King · Mar 24, 2023, 1:19 AM
−2 points
11 comments · 9 min read · LW link

GPT-4 aligning with acausal decision theory when instructed to play games, but includes a CDT explanation that’s incorrect if they differ

Christopher King · Mar 23, 2023, 4:16 PM
7 points
4 comments · 8 min read · LW link

Exploring the Precautionary Principle in AI Development: Historical Analogies and Lessons Learned

Christopher King · Mar 21, 2023, 3:53 AM
−1 points
2 comments · 9 min read · LW link

Capabilities Denial: The Danger of Underestimating AI

Christopher King · Mar 21, 2023, 1:24 AM
6 points
5 comments · 3 min read · LW link

ARC tests to see if GPT-4 can escape human control; GPT-4 failed to do so

Christopher King · Mar 15, 2023, 12:29 AM
116 points
22 comments · 2 min read · LW link

A better analogy and example for teaching AI takeover: the ML Inferno

Christopher King · Mar 14, 2023, 7:14 PM
18 points
0 comments · 5 min read · LW link

Could Roko’s basilisk acausally bargain with a paperclip maximizer?

Christopher King · Mar 13, 2023, 6:21 PM
1 point
8 comments · 1 min read · LW link

A ranking scale for how severe the side effects of solutions to AI x-risk are

Christopher King · Mar 8, 2023, 10:53 PM
3 points
0 comments · 2 min read · LW link

Is there an ML agent that abandons its utility function out-of-distribution without losing capabilities?

Christopher King · Feb 22, 2023, 4:49 PM
1 point
7 comments · 1 min read · LW link

Bing finding ways to bypass Microsoft’s filters without being asked. Is it reproducible?

Christopher King · Feb 20, 2023, 3:11 PM
27 points
15 comments · 1 min read · LW link

Threatening to do the impossible: A solution to spurious counterfactuals for functional decision theory via proof theory

Christopher King · Feb 11, 2023, 7:57 AM
5 points
4 comments · 5 min read · LW link

Is this a weak pivotal act: creating nanobots that eat evil AGIs (but nothing else)?

Christopher King · Feb 10, 2023, 7:26 PM
0 points
3 comments · 1 min read · LW link

Optimality is the tiger, and annoying the user is its teeth

Christopher King · Jan 28, 2023, 8:20 PM
25 points
6 comments · 2 min read · LW link