Drug development costs can range over two orders of magnitude

rossry · Nov 3, 2024, 11:13 PM
38 points

6 votes

Overall karma indicates overall quality.

0 comments · 11 min read · LW link

Redefining Tolerance: Beyond Popper’s Paradox

mindprison · Nov 3, 2024, 10:23 PM
−1 points

7 votes

0 comments · 3 min read · LW link

Goal: Understand Intelligence

Johannes C. Mayer · Nov 3, 2024, 9:20 PM
14 points

14 votes

19 comments · 1 min read · LW link

Current safety training techniques do not fully transfer to the agent setting

Nov 3, 2024, 7:24 PM
158 points

62 votes

9 comments · 5 min read · LW link

Why our politicians aren’t Median

Yair Halberstadt · Nov 3, 2024, 2:03 PM
72 points

35 votes

15 comments · 3 min read · LW link

Human Biodiversity (Part 4: Astral Codex Ten)

Evan_Gaensbauer · Nov 3, 2024, 4:20 AM
−13 points

18 votes

6 comments · 1 min read · LW link
(reflectivealtruism.com)

Understanding incomparability versus incommensurability in relation to RLHF

artemiocobb · Nov 2, 2024, 10:57 PM
1 point

3 votes

1 comment · 2 min read · LW link

electric turbofans

bhauth · Nov 2, 2024, 10:50 PM
63 points

34 votes

2 comments · 5 min read · LW link
(bhauth.com)

Reality as Category-Theoretic State Machines: A Mathematical Framework

Wenitte Apiou · Nov 2, 2024, 9:04 PM
−8 points

7 votes

0 comments · 2 min read · LW link

The Median Researcher Problem

johnswentworth · Nov 2, 2024, 8:16 PM
157 points

130 votes

70 comments · 1 min read · LW link

Testing “True” Language Understanding in LLMs: A Simple Proposal

MtryaSam · Nov 2, 2024, 7:12 PM
9 points

5 votes

2 comments · 2 min read · LW link

Testing “True” Language Understanding in LLMs: A Simple Proposal

MtryaSam · Nov 2, 2024, 7:12 PM
−3 points

3 votes

0 comments · 2 min read · LW link

Fragile, Robust, and Antifragile Preference Satisfaction

adamShimi · Nov 2, 2024, 5:25 PM
19 points

9 votes

0 comments · 5 min read · LW link
(formethods.substack.com)

Higher Order Signs, Hallucination and Schizophrenia

Nicolas Villarreal · Nov 2, 2024, 4:33 PM
4 points

9 votes

0 comments · 13 min read · LW link
(nicolasdvillarreal.substack.com)

[Question] Is OpenAI net negative for AI Safety?

Lysandre Terrisse · Nov 2, 2024, 4:18 PM
4 points

4 votes

0 comments · 1 min read · LW link

Two arguments against longtermist thought experiments

momom2 · Nov 2, 2024, 10:22 AM
15 points

9 votes

5 comments · 3 min read · LW link

Both-Sidesism—When Fair & Balanced Goes Wrong

James Stephen Brown · Nov 2, 2024, 3:04 AM
3 points

30 votes

15 comments · 6 min read · LW link
(nonzerosum.games)

What can we learn from insecure domains?

Logan Zoellner · Nov 1, 2024, 11:53 PM
14 points

15 votes

21 comments · 1 min read · LW link

Science advances one funeral at a time

Nov 1, 2024, 11:06 PM
100 points

46 votes

9 comments · 2 min read · LW link

The Cartesian Crisis

mindprison · Nov 1, 2024, 11:02 PM
−5 points

4 votes

2 comments · 2 min read · LW link

Hypothesis on Composition Circuits in Vision Transformers

phenomanon · Nov 1, 2024, 10:16 PM
2 points

2 votes

0 comments · 3 min read · LW link

SAE Probing: What is it good for?

Nov 1, 2024, 7:23 PM
34 points

13 votes

0 comments · 11 min read · LW link

[Question] Set Theory Multiverse vs Mathematical Truth—Philosophical Discussion

Wenitte Apiou · Nov 1, 2024, 6:56 PM
8 points

5 votes

25 comments · 1 min read · LW link

Educational CAI: Aligning a Language Model with Pedagogical Theories

Bharath Puranam · Nov 1, 2024, 6:55 PM
5 points

3 votes

1 comment · 13 min read · LW link

Prediction markets and Taxes

Edmund Nelson · Nov 1, 2024, 5:39 PM
11 points

7 votes

8 comments · 1 min read · LW link

Dentistry, Oral Surgeons, and the Inefficiency of Small Markets

GeneSmith · Nov 1, 2024, 5:26 PM
86 points

58 votes

18 comments · 5 min read · LW link

Live Machinery: An Interface Design Philosophy for Wholesome AI Futures

Sahil · Nov 1, 2024, 5:24 PM
48 points

25 votes

3 comments · 35 min read · LW link

Seeking Collaborators

abramdemski · Nov 1, 2024, 5:13 PM
62 points

23 votes

15 comments · 7 min read · LW link

Complete Feedback

abramdemski · Nov 1, 2024, 4:58 PM
25 points

10 votes

8 comments · 3 min read · LW link

Levers for Biological Progress—A Response to “Machines of Loving Grace”

Niko_McCarty · Nov 1, 2024, 4:35 PM
17 points

5 votes

0 comments · 20 min read · LW link
(www.asimov.press)

2024 Unofficial LW Community Census, Request for Comments

Screwtape · Nov 1, 2024, 4:34 PM
23 points

12 votes

32 comments · 3 min read · LW link

[Question] When engaging with a large amount of resources during a literature review, how do you prevent yourself from becoming overwhelmed?

corruptedCatapillar · Nov 1, 2024, 7:29 AM
25 points

10 votes

2 comments · 3 min read · LW link

(draft) Cyborg software should be open (?)

AtillaYasar · Nov 1, 2024, 7:24 AM
4 points

4 votes

5 comments · 3 min read · LW link

Another UFO Bet

codyz · Nov 1, 2024, 1:55 AM
9 points

9 votes

11 comments · 1 min read · LW link

Trading Candy

jefftk · Nov 1, 2024, 1:10 AM
28 points

13 votes

4 comments · 1 min read · LW link
(www.jefftk.com)

JargonBot Beta Test

Raemon · Nov 1, 2024, 1:05 AM
84 points

38 votes

55 comments · 6 min read · LW link

GPT-4o Guardrails Gone: Data Poisoning & Jailbreak-Tuning

Nov 1, 2024, 12:10 AM
18 points

8 votes

0 comments · 6 min read · LW link
(far.ai)

The slingshot helps with learning

Wilson Wu · Oct 31, 2024, 11:18 PM
33 points

11 votes

0 comments · 8 min read · LW link

Toward Safety Case Inspired Basic Research

Oct 31, 2024, 11:06 PM
55 points

16 votes

3 comments · 13 min read · LW link

Spooky Recommendation System Scaling

phdead · Oct 31, 2024, 10:00 PM
11 points

4 votes

0 comments · 4 min read · LW link

‘Meta’, ‘mesa’, and mountains

Lorec · Oct 31, 2024, 5:25 PM
1 point

3 votes

0 comments · 3 min read · LW link

Toward Safety Cases For AI Scheming

Oct 31, 2024, 5:20 PM
60 points

23 votes

1 comment · 2 min read · LW link

AI #88: Thanks for the Memos

Zvi · Oct 31, 2024, 3:00 PM
46 points

17 votes

5 comments · 77 min read · LW link
(thezvi.wordpress.com)

The Compendium, A full argument about extinction risk from AGI

Oct 31, 2024, 12:01 PM
196 points

87 votes

52 comments · 2 min read · LW link
(www.thecompendium.ai)

Some Preliminary Notes on the Promise of a Wisdom Explosion

Chris_Leong · Oct 31, 2024, 9:21 AM
2 points

2 votes

0 comments · 1 min read · LW link
(aiimpacts.org)

What TMS is like

Sable · Oct 31, 2024, 12:44 AM
218 points

117 votes

23 comments · 6 min read · LW link
(affablyevil.substack.com)

AI Safety at the Frontier: Paper Highlights, October ’24

gasteigerjo · Oct 31, 2024, 12:09 AM
3 points

1 vote

0 comments · 9 min read · LW link
(aisafetyfrontier.substack.com)

Standard SAEs Might Be Incoherent: A Choosing Problem & A “Concise” Solution

Kola Ayonrinde · Oct 30, 2024, 10:50 PM
27 points

11 votes

0 comments · 12 min read · LW link

Generic advice caveats

Saul Munn · Oct 30, 2024, 9:03 PM
27 points

13 votes

1 comment · 3 min read · LW link
(www.brasstacks.blog)

I turned decision theory problems into memes about trolleys

Tapatakt · Oct 30, 2024, 8:13 PM
105 points

54 votes

23 comments · 1 min read · LW link