Four visions of Transformative AI success

Steven Byrnes Jan 17, 2024, 8:45 PM
112 points
22 comments 15 min read LW link

AI Disclosure Ballot Initiative (and voting method)

Aaron Hamlin Jan 17, 2024, 8:02 PM
−8 points
3 comments 1 min read LW link

Hatching the Cosmic Egg (Hymn to Dionysus)

rogersbacon Jan 17, 2024, 6:34 PM
7 points
0 comments 9 min read LW link
(www.secretorum.life)

[Question] What do people colloquially mean by deep breathing? Slow, large, or diaphragmatic?

VipulNaik Jan 17, 2024, 6:01 PM
13 points
8 comments 2 min read LW link

AlphaGeometry: An Olympiad-level AI system for geometry

alyssavance Jan 17, 2024, 5:17 PM
45 points
9 comments 1 min read LW link
(deepmind.google)

On Anthropic’s Sleeper Agents Paper

Zvi Jan 17, 2024, 4:10 PM
54 points
5 comments 36 min read LW link
(thezvi.wordpress.com)

A Pedagogical Guide to Corrigibility

A.H. Jan 17, 2024, 11:45 AM
6 points
3 comments 16 min read LW link

An Introduction To The Mandelbrot Set That Doesn’t Mention Complex Numbers

Yitz Jan 17, 2024, 9:48 AM
82 points
11 comments 9 min read LW link

Vote in the LessWrong review! (LW 2022 Review voting phase)

habryka Jan 17, 2024, 7:22 AM
26 points
9 comments 2 min read LW link

Coalescer Models

Jan 17, 2024, 6:39 AM
16 points
2 comments 10 min read LW link

Maybe talking isn’t the best way to communicate with LLMs

mnvr Jan 17, 2024, 6:24 AM
3 points
1 comment 1 min read LW link
(mrmr.io)

D&D.Sci Hypersphere Analysis Part 3: Beat it with Linear Algebra

aphyer Jan 16, 2024, 10:44 PM
26 points
1 comment 5 min read LW link

The weak-to-strong generalization (WTSG) paper in 60 seconds

sudo Jan 16, 2024, 10:44 PM
12 points
1 comment 1 min read LW link
(arxiv.org)

Social media alignment test

amayhew Jan 16, 2024, 8:56 PM
1 point
0 comments 1 min read LW link
(naiveskepticblog.wordpress.com)

Medical Roundup #1

Zvi Jan 16, 2024, 8:30 PM
57 points
9 comments 29 min read LW link
(thezvi.wordpress.com)

Being nicer than Clippy

Joe Carlsmith Jan 16, 2024, 7:44 PM
109 points
32 comments 27 min read LW link

How polysemantic can one neuron be? Investigating features in TinyStories.

Evan Anders Jan 16, 2024, 7:10 PM
14 points
0 comments 8 min read LW link
(evanhanders.blog)

Applying AI Safety concepts to astronomy

Faris Jan 16, 2024, 6:29 PM
1 point
0 comments 12 min read LW link

Managing catastrophic misuse without robust AIs

Jan 16, 2024, 5:27 PM
63 points
17 comments 11 min read LW link

[Question] What are the most common social insecurities?

Chipmonk Jan 16, 2024, 5:24 PM
9 points
6 comments 1 min read LW link

Why wasn’t preservation with the goal of potential future revival started earlier in history?

Andy_McKenzie Jan 16, 2024, 4:15 PM
31 points
1 comment 6 min read LW link

[Question] Why are people unkeen to immortality that would come from technological advancements and/or AI?

Gabi QUENE Jan 16, 2024, 2:23 PM
12 points
41 comments 1 min read LW link

Dealing with Awkwardness

Jonathan Moregård Jan 16, 2024, 12:32 PM
13 points
0 comments 4 min read LW link
(honestliving.substack.com)

The impossible problem of due process

mingyuan Jan 16, 2024, 5:18 AM
197 points
64 comments 14 min read LW link

[Retracted] Newton’s law of cooling from first principles

Nisan Jan 16, 2024, 4:21 AM
9 points
15 comments 2 min read LW link

Sparse Autoencoders Work on Attention Layer Outputs

Jan 16, 2024, 12:26 AM
84 points
9 comments 18 min read LW link

Goals selected from learned knowledge: an alternative to RL alignment

Seth Herd Jan 15, 2024, 9:52 PM
42 points
18 comments 7 min read LW link

Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols

Jan 15, 2024, 9:21 PM
33 points
0 comments 1 min read LW link

Live Sound: Big-O Improvements

jefftk Jan 15, 2024, 7:50 PM
8 points
0 comments 1 min read LW link
(www.jefftk.com)

Investigating Bias Representations in LLMs via Activation Steering

DawnLu Jan 15, 2024, 7:39 PM
29 points
4 comments 5 min read LW link

Sparse MLP Distillation

slavachalnev Jan 15, 2024, 7:39 PM
30 points
3 comments 6 min read LW link

Review of Alignment Plan Critiques- December AI-Plans Critique-a-Thon Results

Iknownothing Jan 15, 2024, 7:37 PM
24 points
0 comments 25 min read LW link
(aiplans.substack.com)

[Question] What does it look like for AI to significantly improve human coordination, before superintelligence?

Bird Concept Jan 15, 2024, 7:22 PM
22 points
2 comments 1 min read LW link

Now Accepting Player Applications for Band of Blades

Joe Rogero Jan 15, 2024, 5:58 PM
2 points
0 comments 3 min read LW link

Three Types of Constraints in the Space of Agents

Jan 15, 2024, 5:27 PM
26 points
3 comments 17 min read LW link

The case for training frontier AIs on Sumerian-only corpus

Jan 15, 2024, 4:40 PM
130 points
16 comments 3 min read LW link

How to Promote More Productive Dialogue Outside of LessWrong

sweenesm Jan 15, 2024, 2:16 PM
18 points
4 comments 2 min read LW link

[Question] Come and daydream with me about science reform

TeaTieAndHat Jan 15, 2024, 11:09 AM
9 points
1 comment 1 min read LW link

AI doing philosophy = AI generating hands?

Wei Dai Jan 15, 2024, 9:04 AM
46 points
23 comments LW link

Even if we lose, we win

Morphism Jan 15, 2024, 2:15 AM
24 points
17 comments 4 min read LW link

Detachment vs attachment [AI risk and mental health]

Neil Jan 15, 2024, 12:41 AM
15 points
4 comments 3 min read LW link

Making up statistics to establish priority on Land Value Tax vs Earned Income Tax Credit vs Social Media Dynamic Regulation

Canucklug Jan 14, 2024, 11:57 PM
−5 points
2 comments 7 min read LW link

Is the universe all there is? ‘Evidence’ for objects outside the universe...

JonathanHall Jan 14, 2024, 11:56 PM
−4 points
27 comments 11 min read LW link

[Question] What is the minimum amount of time travel and resources needed to secure the future?

Perhaps Jan 14, 2024, 10:01 PM
−3 points
5 comments 1 min read LW link

Gothenburg LW / ACX meetup

Stefan Jan 14, 2024, 9:21 PM
1 point
0 comments 1 min read LW link

Gothenburg LW / ACX meetup

Stefan Jan 14, 2024, 9:20 PM
1 point
1 comment 1 min read LW link

D&D.Sci Hypersphere Analysis Part 2: Nonlinear Effects & Interactions

aphyer Jan 14, 2024, 7:59 PM
24 points
0 comments 7 min read LW link

Gender Exploration

sapphire Jan 14, 2024, 6:57 PM
117 points
26 comments 5 min read LW link
(open.substack.com)

List of projects that seem impactful for AI Governance

Jan 14, 2024, 4:53 PM
14 points
0 comments 13 min read LW link

The Leeroy Jenkins principle: How faulty AI could guarantee “warning shots”

titotal Jan 14, 2024, 3:03 PM
48 points
6 comments LW link
(titotal.substack.com)