Cheap Model → Big Model design

Maxwell Peterson · Nov 19, 2023, 10:50 PM
15 points

4 votes

2 comments · 7 min read · LW link

Human-like systematic generalization through a meta-learning neural network

Burny · Nov 19, 2023, 9:41 PM
8 points

5 votes

0 comments · 2 min read · LW link
(twitter.com)

“Benevolent [ie, Ruler] AI is a bad idea” and a suggested alternative (not author)

the gears to ascension · Nov 19, 2023, 8:22 PM
22 points

12 votes

11 comments · 1 min read · LW link
(www.palladiummag.com)

Alignment is Hard: An Uncomputable Alignment Problem

Alexander Bistagne · Nov 19, 2023, 7:38 PM
−5 points

9 votes

4 comments · 1 min read · LW link
(github.com)

New paper shows truthfulness & instruction-following don’t generalize by default

joshc · Nov 19, 2023, 7:27 PM
60 points

33 votes

0 comments · 4 min read · LW link

In favour of a sovereign state of Gaza

Yair Halberstadt · Nov 19, 2023, 4:08 PM
8 points

13 votes

3 comments · 4 min read · LW link

My Criticism of Singular Learning Theory

Joar Skalse · Nov 19, 2023, 3:19 PM
83 points

52 votes

56 comments · 12 min read · LW link

“Why can’t you just turn it off?”

Roko · Nov 19, 2023, 2:46 PM
48 points

46 votes

25 comments · 1 min read · LW link

Spaciousness In Partner Dance: A Naturalism Demo

LoganStrohl · Nov 19, 2023, 7:00 AM
78 points

22 votes

6 comments · 19 min read · LW link · 1 review

Altman firing retaliation incoming?

trevor · Nov 19, 2023, 12:10 AM
50 points

47 votes

23 comments · 5 min read · LW link

When Will AIs Develop Long-Term Planning?

PeterMcCluskey · Nov 19, 2023, 12:08 AM
18 points

7 votes

5 comments · 4 min read · LW link
(bayesianinvestor.com)

Killswitch

Junio · Nov 18, 2023, 10:53 PM
2 points

2 votes

0 comments · 3 min read · LW link

Superalignment

Douglas_Reay · Nov 18, 2023, 10:37 PM
−4 points

10 votes

4 comments · 1 min read · LW link
(openai.com)

Predictable Defect-Cooperate?

quetzal_rainbow · Nov 18, 2023, 3:38 PM
7 points

3 votes

1 comment · 2 min read · LW link

I think I’m just confused. Once a model exists, how do you “red-team” it to see whether it’s safe. Isn’t it already dangerous?

FTPickle · Nov 18, 2023, 2:16 PM
21 points

11 votes

13 comments · 1 min read · LW link

AI Safety Camp 2024

Linda Linsefors · Nov 18, 2023, 10:37 AM
15 points

9 votes

1 comment · 4 min read · LW link
(aisafety.camp)

Post-EAG Music Party

jefftk · Nov 18, 2023, 3:00 AM
14 points

4 votes

2 comments · 2 min read · LW link
(www.jefftk.com)

Letter to a Sonoma County Jail Cell

MadHatter · Nov 18, 2023, 2:24 AM
9 points

13 votes

1 comment · 1 min read · LW link
(open.substack.com)

1. A Sense of Fairness: Deconfusing Ethics

RogerDearnaley · Nov 17, 2023, 8:55 PM
17 points

10 votes

8 comments · 15 min read · LW link

Sam Altman fired from OpenAI

LawrenceC · Nov 17, 2023, 8:42 PM
192 points

93 votes

75 comments · 1 min read · LW link
(openai.com)

On the lethality of biased human reward ratings

Nov 17, 2023, 6:59 PM
48 points

18 votes

10 comments · 37 min read · LW link

Coup probes: Catching catastrophes with probes trained off-policy

Fabien Roger · Nov 17, 2023, 5:58 PM
93 points

31 votes

9 comments · 11 min read · LW link · 1 review

On Lies and Liars

Gabriel Alfour · Nov 17, 2023, 5:13 PM
31 points

25 votes

4 comments · 14 min read · LW link
(cognition.cafe)

Classifying representations of sparse autoencoders (SAEs)

Annah · Nov 17, 2023, 1:54 PM
15 points

7 votes

6 comments · 2 min read · LW link

R&D is a Huge Externality, So Why Do Markets Do So Much of it?

Maxwell Tabarrok · Nov 17, 2023, 1:14 PM
15 points

8 votes

14 comments · 3 min read · LW link
(maximumprogress.substack.com)

On excluding dangerous information from training

ShayBenMoshe · Nov 17, 2023, 11:14 AM
23 points

11 votes

5 comments · 3 min read · LW link

The dangers of reproducing while old

garymm · Nov 17, 2023, 5:55 AM
23 points

6 votes

6 comments · 1 min read · LW link
(www.garymm.org)

I put odds on ends with Nathan Young

KatjaGrace · Nov 17, 2023, 5:40 AM
8 points

1 vote

0 comments · 1 min read · LW link
(worldspiritsockpuppet.com)

Debate helps supervise human experts [Paper]

habryka · Nov 17, 2023, 5:25 AM
29 points

10 votes

6 comments · 1 min read · LW link
(github.com)

A to Z of things

KatjaGrace · Nov 17, 2023, 5:20 AM
71 points

25 votes

8 comments · 1 min read · LW link · 1 review
(worldspiritsockpuppet.com)

On Tapping Out

Screwtape · Nov 17, 2023, 3:23 AM
52 points

29 votes

14 comments · 8 min read · LW link · 1 review

Eliciting Latent Knowledge in Comprehensive AI Services Models

acabodi · Nov 17, 2023, 2:36 AM
6 points

3 votes

0 comments · 5 min read · LW link

Some Rules for an Algebra of Bayes Nets

Nov 16, 2023, 11:53 PM
98 points

24 votes

45 comments · 14 min read · LW link · 1 review

How much to update on recent AI governance moves?

Nov 16, 2023, 11:46 PM
112 points

41 votes

5 comments · 29 min read · LW link

New LessWrong feature: Dialogue Matching

Bird Concept · Nov 16, 2023, 9:27 PM
106 points

35 votes

22 comments · 3 min read · LW link

Towards Evaluating AI Systems for Moral Status Using Self-Reports

Nov 16, 2023, 8:18 PM
45 points

14 votes

3 comments · 1 min read · LW link
(arxiv.org)

Social Dark Matter

Duncan Sabien (Inactive) · Nov 16, 2023, 8:00 PM
367 points

271 votes

129 comments · 34 min read · LW link · 2 reviews

AI #38: Let’s Make a Deal

Zvi · Nov 16, 2023, 7:50 PM
44 points

24 votes

2 comments · 55 min read · LW link
(thezvi.wordpress.com)

Forecasting AI (Overview)

jsteinhardt · Nov 16, 2023, 7:00 PM
35 points

11 votes

0 comments · 2 min read · LW link
(bounded-regret.ghost.io)

We Should Talk About This More. Epistemic World Collapse as Imminent Safety Risk of Generative AI.

Joerg Weiss · Nov 16, 2023, 6:46 PM
11 points

5 votes

2 comments · 29 min read · LW link

Intelligence in systems (human, AI) can be conceptualized as the resolution and throughput at which a system can process and affect Shannon information.

AiresJL · Nov 16, 2023, 5:46 PM
0 points

2 votes

0 comments · 2 min read · LW link

Life on the Grid (Part 2)

rogersbacon · Nov 16, 2023, 5:22 PM
7 points

3 votes

0 comments · 15 min read · LW link
(www.secretorum.life)

The impossibility of rationally analyzing partisan news

RationalDino · Nov 16, 2023, 4:19 PM
4 points

7 votes

4 comments · 1 min read · LW link

We are Peacecraft.ai!

MadHatter · Nov 16, 2023, 2:15 PM
15 points

15 votes

20 comments · 2 min read · LW link

A dialectical view of the history of AI, Part 1: We’re only in the antithesis phase. [A synthesis is in the future.]

Bill Benzon · Nov 16, 2023, 12:34 PM
6 points

5 votes

0 comments · 12 min read · LW link

[Question] How much fraud is there in academia?

ChristianKl · Nov 16, 2023, 11:50 AM
23 points

9 votes

10 comments · 1 min read · LW link

Learning coefficient estimation: the details

Zach Furman · Nov 16, 2023, 3:19 AM
36 points

13 votes

0 comments · 2 min read · LW link
(colab.research.google.com)

[Question] AI Safety orgs- what’s your biggest bottleneck right now?

Kabir Kumar · Nov 16, 2023, 2:02 AM
1 point

4 votes

0 comments · 1 min read · LW link

My critique of Eliezer’s deeply irrational beliefs

Jorterder · Nov 16, 2023, 12:34 AM
−35 points

13 votes

1 comment · 9 min read · LW link
(docs.google.com)

Extrapolating from Five Words

Gordon Seidoh Worley · Nov 15, 2023, 11:21 PM
40 points

21 votes

11 comments · 2 min read · LW link