[Question] Who wants to be invited to the LW Metamodern dialogue?

hunterglenn · Jun 5, 2024, 4:39 PM
−3 points
1 comment · 1 min read · LW link

Nonreactivity: a simple model of meditation

cesiumquail · Jun 5, 2024, 4:26 PM
21 points
4 comments · 6 min read · LW link

graphpatch: a Python Library for Activation Patching

Occam's Laser · Jun 5, 2024, 3:08 PM
13 points
2 comments · 1 min read · LW link

Startup Stock Options: the Shortest Complete Guide for Employees

Boris T · Jun 5, 2024, 3:03 PM
17 points
3 comments · 1 min read · LW link
(borisagain.substack.com)

Aggregative Principles of Social Justice

Cleo Nardo · Jun 5, 2024, 1:44 PM
29 points
10 comments · 37 min read · LW link

What and how much makes a difference?

Marius Adrian Nicoară · Jun 5, 2024, 10:30 AM
7 points
0 comments · 2 min read · LW link

An­nounc­ing ILIAD — The­o­ret­i­cal AI Align­ment Conference

Jun 5, 2024, 9:37 AM
163 points
18 comments · 2 min read · LW link

Second-Order Rationality, System Rationality, and a feature suggestion for LessWrong

Mati_Roy · Jun 5, 2024, 7:20 AM
13 points
2 comments · 8 min read · LW link

Former OpenAI Superalignment Researcher: Superintelligence by 2030

Julian Bradshaw · Jun 5, 2024, 3:35 AM
70 points
30 comments · 1 min read · LW link
(situational-awareness.ai)

On “first critical tries” in AI alignment

Joe Carlsmith · Jun 5, 2024, 12:19 AM
54 points
8 comments · 14 min read · LW link

Takeoff speeds presentation at Anthropic

Tom Davidson · Jun 4, 2024, 10:46 PM
92 points
0 comments · 25 min read · LW link

A Reflection on Richard Hamming’s “You and Your Research”: Striving for Greatness

aysajan · Jun 4, 2024, 8:07 PM
8 points
5 comments · 21 min read · LW link
(www.aysajaneziz.com)

A Semiotic Critique of the Orthogonality Thesis

Nicolas Villarreal · Jun 4, 2024, 6:52 PM
3 points
10 comments · 15 min read · LW link

Here’s Why Indefinite Life Extension Will Never Work, Even Though it Does.

HomingHamster · Jun 4, 2024, 6:48 PM
−13 points
5 comments · 18 min read · LW link

Ideas for Next-Generation Writing Platforms, using LLMs

ozziegooen · Jun 4, 2024, 6:40 PM
26 points
4 comments · LW link

Evidence of Learned Look-Ahead in a Chess-Playing Neural Network

Erik Jenner · Jun 4, 2024, 3:50 PM
121 points
14 comments · 13 min read · LW link

Is This Lie Detector Really Just a Lie Detector? An Investigation of LLM Probe Specificity.

Josh Levy · Jun 4, 2024, 3:45 PM
39 points
0 comments · 18 min read · LW link

[Paper] Stress-testing capability elicitation with password-locked models

Jun 4, 2024, 2:52 PM
85 points
10 comments · 12 min read · LW link
(arxiv.org)

Circuit Board Ordering

jefftk · Jun 4, 2024, 2:00 PM
10 points
0 comments · 1 min read · LW link
(www.jefftk.com)

[Question] Has anyone here written about religious fictionalism?

SpectrumDT · Jun 4, 2024, 12:10 PM
0 points
4 comments · 1 min read · LW link

Is Wittgenstein’s Language Game used when helping Ai understand language?

VisionaryHera · Jun 4, 2024, 7:41 AM
3 points
7 comments · 1 min read · LW link

Smartphone Etiquette: Suggestions for Social Interactions

Declan Molony · Jun 4, 2024, 6:01 AM
26 points
4 comments · 3 min read · LW link

Just admit that you’ve zoned out

joec · Jun 4, 2024, 2:51 AM
91 points
22 comments · 2 min read · LW link

(Not) Derailing the LessOnline Puzzle Hunt

Error · Jun 4, 2024, 1:28 AM
74 points
2 comments · 4 min read · LW link

Mas­culinity—A Case For Courage

James Stephen Brown · Jun 4, 2024, 12:04 AM
24 points
0 comments · 7 min read · LW link
(nonzerosum.games)

Philosophers wrestling with evil, as a social media feed

David Gross · Jun 3, 2024, 10:25 PM
51 points
2 comments · 16 min read · LW link

ACI#8: Value as a Function of Possible Worlds

Akira Pyinya · Jun 3, 2024, 9:49 PM
6 points
2 comments · 7 min read · LW link

in defense of Linus Pauling

bhauth · Jun 3, 2024, 9:27 PM
49 points
8 comments · 2 min read · LW link
(www.bhauth.com)

Finding the estimate of the value of a state in RL agents

Jun 3, 2024, 8:26 PM
8 points
4 comments · 4 min read · LW link

Searching Magic Cards

jefftk · Jun 3, 2024, 5:40 PM
9 points
2 comments · 1 min read · LW link
(www.jefftk.com)

The Standard Analogy

Zack_M_Davis · Jun 3, 2024, 5:15 PM
125 points
28 comments · 12 min read · LW link

[Question] How was Less Online for you?

Gordon Seidoh Worley · Jun 3, 2024, 5:10 PM
22 points
4 comments · 1 min read · LW link

AI catastrophes and rogue deployments

Buck · Jun 3, 2024, 5:04 PM
120 points
16 comments · 8 min read · LW link

Companies’ safety plans neglect risks from scheming AI

Zach Stein-Perlman · Jun 3, 2024, 3:00 PM
73 points
4 comments · 6 min read · LW link

ACX Meetup

svfritz · Jun 3, 2024, 1:02 PM
1 point
0 comments · 1 min read · LW link

Comments on Anthropic’s Scaling Monosemanticity

Robert_AIZI · Jun 3, 2024, 12:15 PM
98 points
8 comments · 7 min read · LW link

Politics is the mind-killer, but maybe we should talk about it anyway

Chris_Leong · Jun 3, 2024, 6:37 AM
14 points
33 comments · 3 min read · LW link

[Question] How do you shut down an escaped model?

quetzal_rainbow · Jun 2, 2024, 7:51 PM
15 points
8 comments · 1 min read · LW link

How to Better Report Sparse Autoencoder Performance

J Bostock · Jun 2, 2024, 7:34 PM
20 points
4 comments · 3 min read · LW link

[Question] List of arguments for Bayesianism

Aryeh Englander · Jun 2, 2024, 7:06 PM
9 points
3 comments · 1 min read · LW link

Origins of the Lab Mouse

Niko_McCarty · Jun 2, 2024, 3:40 PM
16 points
0 comments · 20 min read · LW link
(press.asimov.com)

Why write down the basics of logic if they are so evident?

Crazy philosopher · Jun 2, 2024, 12:02 PM
3 points
9 comments · 1 min read · LW link

How it All Went Down: The Puzzle Hunt that took us way, way Less Online

A* · Jun 2, 2024, 8:01 AM
135 points
5 comments · 5 min read · LW link

Simulations and Altruism

FateGrinder · Jun 2, 2024, 2:45 AM
−7 points
2 comments · 25 min read · LW link

Scanning your Brain with 100,000,000,000 wires?

Johannes C. Mayer · Jun 1, 2024, 6:37 PM
6 points
6 comments · 2 min read · LW link

[Question] Turning latexed notes into blog posts

Terence Coelho · Jun 1, 2024, 6:03 PM
5 points
2 comments · 1 min read · LW link

How do you know you are right when debating? Calculate your AmIRight score.

MrThink · Jun 1, 2024, 3:55 PM
2 points
5 comments · 2 min read · LW link

Links for May

Kaj_Sotala · Jun 1, 2024, 10:20 AM
20 points
16 comments · 18 min read · LW link
(kajsotala.fi)

[Question] What do coherence arguments actually prove about agentic behavior?

sunwillrise · Jun 1, 2024, 9:37 AM
123 points
39 comments · 6 min read · LW link

AI Safety: A Climb To Armageddon?

kmenou · Jun 1, 2024, 6:02 AM
8 points
3 comments · 1 min read · LW link
(arxiv.org)