Agenda Manipulation

Pazzaz · Nov 9, 2024, 2:13 PM
2 points

1 vote

Overall karma indicates overall quality.

0 comments · 3 min read · LW link

Force Sequential Output with SCP?

jefftk · Nov 9, 2024, 12:40 PM
9 points

2 votes

4 comments · 1 min read · LW link
(www.jefftk.com)

Anthropic teams up with Palantir and AWS to sell AI to defense customers

Matrice Jacobine · Nov 9, 2024, 11:50 AM
9 points

5 votes

0 comments · 2 min read · LW link
(techcrunch.com)

GPT-4o Can In Some Cases Solve Moderately Complicated Captchas

dirk · Nov 9, 2024, 4:04 AM
12 points

5 votes

2 comments · 1 min read · LW link

LLMs Look Increasingly Like General Reasoners

eggsyntax · Nov 8, 2024, 11:47 PM
94 points

44 votes

45 comments · 3 min read · LW link

overengineered air filter shelving

bhauth · Nov 8, 2024, 10:04 PM
26 points

9 votes

2 comments · 5 min read · LW link
(bhauth.com)

Bigger Livers?

sarahconstantin · Nov 8, 2024, 9:50 PM
99 points

44 votes

17 comments · 6 min read · LW link
(sarahconstantin.substack.com)

New UChicago Rationality Group

Noah Birnbaum · Nov 8, 2024, 9:20 PM
11 points

6 votes

0 comments · 1 min read · LW link

Active Recall and Spaced Repetition are Different Things

Saul Munn · Nov 8, 2024, 8:14 PM
51 points

21 votes

2 comments · 3 min read · LW link
(www.brasstacks.blog)

The King and the Golem—The Animation

Writer · Nov 8, 2024, 6:23 PM
72 points

27 votes

1 comment · 1 min read · LW link

Boring & straightforward trauma explanation

lemonhope · Nov 8, 2024, 9:45 AM
24 points

15 votes

7 comments · 1 min read · LW link

Curriculum of Ascension

andrew sauer · Nov 7, 2024, 11:54 PM
13 points

5 votes

0 comments · 18 min read · LW link

Analyzing how SAE features evolve across a forward pass

Nov 7, 2024, 10:07 PM
47 points

40 votes

0 comments · 1 min read · LW link
(arxiv.org)

Mar­kets Are In­for­ma­tion—Beat­ing the Sports­books at Their Own Game

JJXW · Nov 7, 2024, 8:58 PM
9 points

6 votes

1 comment · 2 min read · LW link
(thehobbyist.substack.com)

Signaling with Small Orange Diamonds

jefftk · Nov 7, 2024, 8:20 PM
40 points

18 votes

1 comment · 1 min read · LW link
(www.jefftk.com)

Fundamental Uncertainty: Chapter 9 - How do we live with uncertainty?

Gordon Seidoh Worley · Nov 7, 2024, 6:15 PM
11 points

3 votes

2 comments · 15 min read · LW link

AI #89: Trump Card

Zvi · Nov 7, 2024, 4:30 PM
42 points

28 votes

12 comments · 42 min read · LW link
(thezvi.wordpress.com)

Quantum Immortality: A Perspective if AI Doomers are Probably Right

Nov 7, 2024, 4:06 PM
13 points

22 votes

55 comments · 14 min read · LW link

On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback

Nov 7, 2024, 3:39 PM
51 points

17 votes

7 comments · 11 min read · LW link

In the Name of All That Needs Saving

pleiotroth · Nov 7, 2024, 3:26 PM
18 points

7 votes

3 comments · 22 min read · LW link

Agency overhang as a proxy for Sharp left turn

Nov 7, 2024, 12:14 PM
6 points

4 votes

0 comments · 5 min read · LW link

The Case Against Moral Realism

Zero Contradictions · Nov 7, 2024, 10:14 AM
−5 points

10 votes

10 comments · 1 min read · LW link
(thewaywardaxolotl.blogspot.com)

[Question] What are the primary drivers that caused selection pressure for intelligence in humans?

Towards_Keeperhood · Nov 7, 2024, 9:40 AM
8 points

4 votes

15 comments · 1 min read · LW link

The Logistics of Distribution of Meaning: Against Epistemic Bureaucratization

Sahil · Nov 7, 2024, 5:27 AM
30 points

13 votes

7 comments · 12 min read · LW link

SAEs are highly dataset dependent: a case study on the refusal direction

Nov 7, 2024, 5:22 AM
67 points

25 votes

4 comments · 14 min read · LW link

Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

Thomas Kwa · Nov 6, 2024, 11:01 PM
115 points

60 votes

35 comments · 1 min read · LW link

New Funding Category Open in Foresight’s AI Safety Grants

Allison Duettmann · Nov 6, 2024, 10:59 PM
15 points

4 votes

0 comments · 1 min read · LW link

Scattered thoughts on what it means for an LLM to believe

TheManxLoiner · Nov 6, 2024, 10:10 PM
5 points

3 votes

4 comments · 5 min read · LW link

The Bayesian Conspiracy Live Recording

Eneasz · Nov 6, 2024, 4:25 PM
9 points

3 votes

0 comments · 1 min read · LW link

Anthropic: Three Sketches of ASL-4 Safety Case Components

Zach Stein-Perlman · Nov 6, 2024, 4:00 PM
95 points

35 votes

33 comments · 1 min read · LW link
(alignment.anthropic.com)

Meme Talking Points

ymeskhout · Nov 6, 2024, 3:27 PM
33 points

24 votes

0 comments · 3 min read · LW link

Advisors for Smaller Major Donors?

jefftk · Nov 6, 2024, 2:30 PM
18 points

6 votes

2 comments · 3 min read · LW link
(www.jefftk.com)

Scissors Statements for President?

AnnaSalamon · Nov 6, 2024, 10:38 AM
121 points

58 votes

33 comments · 1 min read · LW link

[Question] How to cite LessWrong as an academic source?

PhilosophicalSoul · Nov 6, 2024, 8:28 AM
10 points

4 votes

6 comments · 1 min read · LW link

How to put California and Texas on the campaign trail!

Yair Halberstadt · Nov 6, 2024, 6:08 AM
25 points

10 votes

4 comments · 1 min read · LW link

LDT (and everything else) can be irrational

Christopher King · Nov 6, 2024, 4:05 AM
11 points

14 votes

15 comments · 2 min read · LW link

Join my new subscriber chat

sarahconstantin · Nov 6, 2024, 2:30 AM
7 points

4 votes

0 comments · 1 min read · LW link
(sarahconstantin.substack.com)

Graceful Degradation

Screwtape · Nov 5, 2024, 11:57 PM
84 points

55 votes

8 comments · 4 min read · LW link

An alternative approach to superbabies

Towards_Keeperhood · Nov 5, 2024, 10:56 PM
48 points

23 votes

19 comments · 3 min read · LW link

Apply to be a mentor in SPAR!

agucova · Nov 5, 2024, 9:32 PM
5 points

3 votes

0 comments · 1 min read · LW link

Going Beyond “immaturity”

moisentinel · Nov 5, 2024, 8:51 PM
−3 points

5 votes

2 comments · 2 min read · LW link

Intent alignment as a stepping-stone to value alignment

Seth Herd · Nov 5, 2024, 8:43 PM
37 points

16 votes

8 comments · 3 min read · LW link

Why Recursion Pharmaceuticals abandoned cell painting for brightfield imaging

Abhishaike Mahajan · Nov 5, 2024, 2:51 PM
29 points

11 votes

1 comment · 18 min read · LW link
(www.owlposting.com)

Winning isn’t enough

Nov 5, 2024, 11:37 AM
44 points

20 votes

30 comments · 9 min read · LW link

An­thropic—The case for tar­geted regulation

anaguma · Nov 5, 2024, 7:07 AM
11 points

2 votes

0 comments · 2 min read · LW link
(www.anthropic.com)

The Shallow Bench

Karl Faulks · Nov 5, 2024, 5:07 AM
48 points

20 votes

5 comments · 3 min read · LW link

Using Narrative Prompting to Extract Policy Forecasts from LLMs

Max Ghenis · Nov 5, 2024, 4:37 AM
5 points

3 votes

0 comments · 1 min read · LW link

ML4Good (AI Safety Bootcamp) - Experience report

JanEbbing · Nov 5, 2024, 1:18 AM
13 points

8 votes

0 comments · 3 min read · LW link

Catastrophic Cyber Capabilities Benchmark (3CB): Robustly Evaluating LLM Agent Cyber Offense Capabilities

Nov 5, 2024, 1:01 AM
8 points

4 votes

0 comments · 6 min read · LW link
(www.apartresearch.com)

[Question] Could orcas be (trained to be) smarter than humans?

Towards_Keeperhood · Nov 4, 2024, 11:29 PM
59 points

30 votes

23 comments · 1 min read · LW link