$300 for the best sci-fi prompt: the results

RomanS · Jan 3, 2024, 7:10 PM
16 points
19 comments · 7 min read · LW link

Agent membranes/boundaries and formalizing “safety”

Chipmonk · Jan 3, 2024, 5:55 PM
26 points
46 comments · 3 min read · LW link

Safety First: safety before full alignment. The deontic sufficiency hypothesis.

Chipmonk · Jan 3, 2024, 5:55 PM
48 points
3 comments · 3 min read · LW link

Practically A Book Review: Appendix to “Nonlinear’s Evidence: Debunking False and Misleading Claims” (ThingOfThings)

tailcalled · Jan 3, 2024, 5:07 PM
111 points
25 comments · 2 min read · LW link
(thingofthings.substack.com)

Trivial Mathematics as a Path Forward

ACrackedPot · Jan 3, 2024, 4:41 PM
−4 points
2 comments · 2 min read · LW link

Copyright Confrontation #1

Zvi · Jan 3, 2024, 3:50 PM
34 points
7 comments · 18 min read · LW link
(thezvi.wordpress.com)

[Question] Theoretically, could we balance the budget painlessly?

Logan Zoellner · Jan 3, 2024, 2:46 PM
4 points
12 comments · 1 min read · LW link

Johannes’ Biography

Johannes C. Mayer · Jan 3, 2024, 1:27 PM
24 points
0 comments · 10 min read · LW link

What Helped Me—Kale, Blood, CPAP, X-tiamine, Methylphenidate

Johannes C. Mayer · Jan 3, 2024, 1:22 PM
35 points
12 comments · 2 min read · LW link

[Question] Does LessWrong make a difference when it comes to AI alignment?

PhilosophicalSoul · Jan 3, 2024, 12:21 PM
18 points
13 comments · 1 min read · LW link

[Question] Terminology: <something>-ware for ML?

Oliver Sourbut · Jan 3, 2024, 11:42 AM
17 points
27 comments · 1 min read · LW link

Trading off Lives

jefftk · Jan 3, 2024, 3:40 AM
53 points
12 comments · 2 min read · LW link
(www.jefftk.com)

MonoPoly Restricted Trust

ymeskhout · Jan 2, 2024, 11:02 PM
42 points
37 comments · 9 min read · LW link

Agent membranes and causal distance

Chipmonk · Jan 2, 2024, 10:43 PM
20 points
3 comments · 3 min read · LW link

Focusing on Mal-Alignment

John Fisher · Jan 2, 2024, 7:51 PM
1 point
0 comments · 1 min read · LW link

Gentleness and the artificial Other

Joe Carlsmith · Jan 2, 2024, 6:21 PM
313 points
33 comments · 11 min read · LW link

Otherness and control in the age of AGI

Joe Carlsmith · Jan 2, 2024, 6:15 PM
43 points
0 comments · 7 min read · LW link

Apologizing is a Core Rationalist Skill

johnswentworth · Jan 2, 2024, 5:47 PM
156 points
42 comments · 5 min read · LW link

Cortés, AI Risk, and the Dynamics of Competing Conquerors

James_Miller · Jan 2, 2024, 4:37 PM
14 points
2 comments · 3 min read · LW link

OpenAI’s Preparedness Framework: Praise & Recommendations

Orpheus16 · Jan 2, 2024, 4:20 PM
66 points
1 comment · 7 min read · LW link

Dating Roundup #2: If At First You Don’t Succeed

Zvi · Jan 2, 2024, 4:00 PM
54 points
29 comments · 47 min read · LW link
(thezvi.wordpress.com)

Looking for Reading Recommendations: Content Moderation, Power & Censorship

Joerg Weiss · Jan 2, 2024, 11:37 AM
2 points
7 comments · 1 min read · LW link

AI Is Not Software

Davidmanheim · Jan 2, 2024, 7:58 AM
58 points
29 comments · 5 min read · LW link

Are Metaculus AI Timelines Inconsistent?

Chris_Leong · Jan 2, 2024, 6:47 AM
17 points
7 comments · 2 min read · LW link

Boston Solstice 2023 Retrospective

jefftk · Jan 2, 2024, 3:10 AM
33 points
0 comments · 6 min read · LW link
(www.jefftk.com)

Steering Llama-2 with contrastive activation additions

Jan 2, 2024, 12:47 AM
125 points
29 comments · 8 min read · LW link
(arxiv.org)

Twin Cities ACX Meetup—January 2024

Timothy M. · Jan 1, 2024, 9:13 PM
1 point
2 comments · 1 min read · LW link

San Francisco ACX Meetup “First Saturday”

guenael · Jan 1, 2024, 8:58 PM
1 point
1 comment · 1 min read · LW link

Mech Interp Challenge: January—Deciphering the Caesar Cipher Model

CallumMcDougall · Jan 1, 2024, 6:03 PM
17 points
0 comments · 3 min read · LW link

Aldix and the Book of Life

ville · Jan 1, 2024, 5:23 PM
1 point
0 comments · 4 min read · LW link
(medium.com)

Metaculus Hosts ACX 2024 Prediction Contest

ChristianWilliams · Jan 1, 2024, 4:38 PM
4 points
0 comments · LW link
(www.metaculus.com)

The Act Itself: Exceptionless Moral Norms

SebastianG · Jan 1, 2024, 4:06 PM
5 points
11 comments · 6 min read · LW link

Deception Chess

Chris Land · Jan 1, 2024, 3:40 PM
7 points
2 comments · 4 min read · LW link

Stop talking about p(doom)

Isaac King · Jan 1, 2024, 10:57 AM
42 points
22 comments · 3 min read · LW link

[Question] What should a non-genius do in the face of rapid progress in GAI to ensure a decent life?

kaler · Jan 1, 2024, 8:22 AM
11 points
16 comments · 1 min read · LW link

A hermeneutic net for agency

TsviBT · Jan 1, 2024, 8:06 AM
58 points
4 comments · 30 min read · LW link

2023 in AI predictions

jessicata · Jan 1, 2024, 5:23 AM
107 points
35 comments · 5 min read · LW link

Rhythm Stage Setup Components

jefftk · Jan 1, 2024, 3:10 AM
10 points
4 comments · 2 min read · LW link
(www.jefftk.com)

Bayesian updating in real life is mostly about understanding your hypotheses

Max H · Jan 1, 2024, 12:10 AM
68 points
4 comments · 11 min read · LW link

Dark Art: Inception

Abu Ibrahim · Dec 31, 2023, 9:09 PM
11 points
0 comments · 3 min read · LW link

A case for AI alignment being difficult

jessicata · Dec 31, 2023, 7:55 PM
106 points
59 comments · 15 min read · LW link · 1 review
(unstableontology.com)

The Roots of Progress 2023 in review

jasoncrawford · Dec 31, 2023, 6:16 PM
22 points
0 comments · 11 min read · LW link
(rootsofprogress.org)

Extended Navel-Gazing On My 2023 Donations

jenn · Dec 31, 2023, 6:10 PM
8 points
0 comments · LW link
(jenn.site)

aisafety.info, the Table of Content

Charbel-Raphaël · Dec 31, 2023, 1:57 PM
23 points
1 comment · 11 min read · LW link

AIOS

samhealy · Dec 31, 2023, 1:23 PM
−3 points
5 comments · 6 min read · LW link

AI Alignment Metastrategy

Vanessa Kosoy · Dec 31, 2023, 12:06 PM
124 points
13 comments · 7 min read · LW link

[Question] Does the hardness of AI alignment undermine FOOM?

TruePath · Dec 31, 2023, 11:05 AM
8 points
14 comments · 1 min read · LW link

Speed of Failing

nano_brasca · Dec 31, 2023, 10:39 AM
8 points
0 comments · 2 min read · LW link

[Question] Estimating Returns to Intelligence vs Numbers, Strength and Looks

TruePath · Dec 31, 2023, 10:03 AM
3 points
6 comments · 1 min read · LW link

Planning to build a cryptographic box with perfect secrecy

Lysandre Terrisse · Dec 31, 2023, 9:31 AM
40 points
6 comments · 11 min read · LW link