The LessWrong 2021 Review (Intellectual Circle Expansion)

1 Dec 2022 21:17 UTC
68 points
26 comments, 8 min read, LW link

Summary of a new study on out-group hate (and how to fix it)

AllAmericanBreakfast 4 Dec 2022 1:53 UTC
36 points
7 comments, 3 min read, LW link
(www.pnas.org)

[Question] Is school good or bad?

tailcalled 3 Dec 2022 13:14 UTC
10 points
57 comments, 1 min read, LW link

Inner and outer alignment decompose one hard problem into two extremely hard problems

TurnTrout 2 Dec 2022 2:43 UTC
79 points
6 comments, 53 min read, LW link

Three Fables of Magical Girls and Longtermism

Ulisse Mini 2 Dec 2022 22:01 UTC
20 points
6 comments, 2 min read, LW link

ChatGPT seems overconfident to me

qbolec 4 Dec 2022 8:03 UTC
18 points
1 comment, 16 min read, LW link

The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable

28 Nov 2022 12:54 UTC
155 points
25 comments, 31 min read, LW link

Jailbreaking ChatGPT on Release Day

Zvi 2 Dec 2022 13:10 UTC
162 points
41 comments, 6 min read, LW link
(thezvi.wordpress.com)

AI can exploit safety plans posted on the Internet

Peter S. Park 4 Dec 2022 12:17 UTC
−6 points
4 comments, 1 min read, LW link

Take 3: No indescribable heavenworlds.

Charlie Steiner 4 Dec 2022 2:48 UTC
18 points
8 comments, 2 min read, LW link

Did ChatGPT just gaslight me?

ThomasW 1 Dec 2022 5:41 UTC
120 points
43 comments, 9 min read, LW link
(equonc.substack.com)

[Question] Do any of the AI Risk evaluations focus on humans as the risk?

jmh 30 Nov 2022 3:09 UTC
10 points
7 comments, 1 min read, LW link

Against meta-ethical hedonism

Joe Carlsmith 2 Dec 2022 0:23 UTC
20 points
3 comments, 35 min read, LW link

Utilitarianism is the only option

aelwood 3 Dec 2022 17:14 UTC
−15 points
4 comments, 1 min read, LW link

The blue-minimising robot and model splintering

Stuart_Armstrong 28 May 2021 15:09 UTC
14 points
4 comments, 3 min read, LW link, 1 review

[Question] Will chat logs and other records of our lives be maintained indefinitely by the advertising industry?

mako yass 29 Nov 2022 0:30 UTC
14 points
5 comments, 1 min read, LW link

Godzilla Strategies

johnswentworth 11 Jun 2022 15:44 UTC
149 points
64 comments, 3 min read, LW link

Could an AI be Religious?

mk54 4 Dec 2022 5:00 UTC
0 points
4 comments, 1 min read, LW link

Logical induction for software engineers

Alex Flint 3 Dec 2022 19:55 UTC
88 points
2 comments, 27 min read, LW link

Take 1: We’re not going to reverse-engineer the AI.

Charlie Steiner 1 Dec 2022 22:41 UTC
32 points
4 comments, 4 min read, LW link