Reliable Sources: The Story of David Gerard

TracingWoodgrains
Jul 10, 2024, 7:50 PM
390 points
54 comments
43 min read

Managing Emotional Potential Energy

adamShimi
Jul 10, 2024, 6:20 PM
24 points
4 comments
4 min read
(epistemologicalfascinations.substack.com)

[EAForum xpost] A breakdown of OpenAI’s revenue

Jul 10, 2024, 6:09 PM
57 points
5 comments
1 min read
(forum.effectivealtruism.org)

Solving Pascal’s Wager using dynamic programming

Paul Wilczewski
Jul 10, 2024, 6:09 PM
1 point
0 comments
5 min read

Fluent, Cruxy Predictions

Raemon
Jul 10, 2024, 6:00 PM
86 points
14 comments
14 min read

Antitrust as Controlled Creative Destruction

Martin Sustrik
Jul 10, 2024, 4:40 PM
14 points
2 comments
2 min read
(250bpm.substack.com)

New page: Integrity

Zach Stein-Perlman
Jul 10, 2024, 3:00 PM
91 points
3 comments
1 min read

AirBnB Baking

jefftk
Jul 10, 2024, 12:50 PM
7 points
1 comment
1 min read
(www.jefftk.com)

DIY RLHF: A simple implementation for hands on experience

Jul 10, 2024, 12:07 PM
28 points
0 comments
6 min read

Usefulness grounds truth

invertedpassion
Jul 10, 2024, 7:58 AM
0 points
0 comments
4 min read

On passing Complete and Honest Ideological Turing Tests (CHITTs)

Aryeh Englander
Jul 10, 2024, 4:01 AM
11 points
2 comments
1 min read

[Question] Pondering how good or bad things will be in the AGI future

Sherrinford
Jul 9, 2024, 10:46 PM
11 points
9 comments
2 min read

Causal Graphs of GPT-2-Small’s Residual Stream

David Udell
Jul 9, 2024, 10:06 PM
53 points
7 comments
7 min read

[Question] If AI starts to end the world, is suicide a good idea?

IlluminateReality
Jul 9, 2024, 9:53 PM
0 points
8 comments
1 min read

Rationalist Purity Test

Gunnar_Zarncke
Jul 9, 2024, 8:30 PM
−9 points
5 comments
1 min read
(ratpuritytest.com)

That which can be destroyed by the truth, should be assumed to should be destroyed by it

Thac0
Jul 9, 2024, 7:39 PM
6 points
0 comments
3 min read

AISN #38: Supreme Court Decision Could Limit Federal Ability to Regulate AI Plus, “Circuit Breakers” for AI systems, and updates on China’s AI industry

Jul 9, 2024, 7:28 PM
5 points
0 comments
5 min read
(newsletter.safe.ai)

Summer Tour Stops

jefftk
Jul 9, 2024, 7:10 PM
10 points
0 comments
3 min read
(www.jefftk.com)

Fix simple mistakes in ARC-AGI, etc.

Oleg Trott
Jul 9, 2024, 5:46 PM
9 points
9 comments
1 min read

Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers

Jeffrey Heninger
Jul 9, 2024, 4:50 PM
42 points
2 comments
2 min read
(blog.aiimpacts.org)

UC Berkeley course on LLMs and ML Safety

Dan H
Jul 9, 2024, 3:40 PM
36 points
1 comment
1 min read
(rdi.berkeley.edu)

What and Why: Developmental Interpretability of Reinforcement Learning

Garrett Baker
Jul 9, 2024, 2:09 PM
68 points
4 comments
6 min read

Medical Roundup #3

Zvi
Jul 9, 2024, 1:10 PM
39 points
4 comments
19 min read
(thezvi.wordpress.com)

Consent across power differentials

Ramana Kumar
Jul 9, 2024, 11:42 AM
50 points
12 comments
3 min read

[Question] How bad would AI progress need to be for us to think general technological progress is also bad?

Jim Buhler
Jul 9, 2024, 10:43 AM
9 points
5 comments
1 min read

How LLMs Learn: What We Know, What We Don’t (Yet) Know, and What Comes Next

Jonasb
Jul 9, 2024, 9:58 AM
2 points
0 comments
16 min read
(www.denominations.io)

WTF is with the Infancy Gospel of Thomas?!? A deep dive into satire, philosophy, and more

kromem
Jul 9, 2024, 9:29 AM
18 points
2 comments
11 min read

Book Review: Safe Enough? A History of Nuclear Power and Accident Risk

ErickBall
Jul 9, 2024, 1:12 AM
10 points
0 comments
28 min read

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs

Jul 8, 2024, 10:24 PM
109 points
37 comments
5 min read

Robin Hanson & Liron Shapira Debate AI X-Risk

Liron
Jul 8, 2024, 9:45 PM
34 points
4 comments
1 min read
(www.youtube.com)

“The Singularity Is Nearer” by Ray Kurzweil - Review

Lavender
Jul 8, 2024, 9:32 PM
22 points
0 comments
4 min read

Sample Prevalence vs Global Prevalence

jefftk
Jul 8, 2024, 9:00 PM
11 points
0 comments
2 min read
(www.jefftk.com)

Advice to junior AI governance researchers

Orpheus16
Jul 8, 2024, 7:19 PM
66 points
1 comment
5 min read

Pantheon Interface

Jul 8, 2024, 7:03 PM
127 points
22 comments
6 min read

Launching the AI Forecasting Benchmark Series Q3 | $30k in Prizes

ChristianWilliams
Jul 8, 2024, 5:20 PM
5 points
0 comments
(www.metaculus.com)

The Golden Mean of Scientific Virtues

adamShimi
Jul 8, 2024, 5:16 PM
12 points
4 comments
8 min read
(epistemologicalfascinations.substack.com)

Massapequa (Long Island), New York, USA – ACX Meetup

Gabriel Weil
Jul 8, 2024, 5:01 PM
2 points
0 comments
1 min read

Dialogue introduction to Singular Learning Theory

Olli Järviniemi
Jul 8, 2024, 4:58 PM
101 points
15 comments
8 min read

Announcing The Techno-Humanist Manifesto: A new philosophy of progress for the 21st century

jasoncrawford
Jul 8, 2024, 4:33 PM
18 points
4 comments
5 min read
(blog.rootsofprogress.org)

Response to Dileep George: AGI safety warrants planning ahead

Steven Byrnes
Jul 8, 2024, 3:27 PM
27 points
7 comments
27 min read

Why not parliamentarianism? [book by Tiago Ribeiro dos Santos]

Arturo Macias
Jul 8, 2024, 2:57 PM
2 points
1 comment
4 min read

Games of My Childhood: The Troops

Kaj_Sotala
Jul 8, 2024, 11:20 AM
18 points
0 comments
5 min read
(kajsotala.fi)

Towards shutdownable agents via stochastic choice

Jul 8, 2024, 10:14 AM
59 points
11 comments
23 min read
(arxiv.org)

On scalable oversight with weak LLMs judging strong LLMs

Jul 8, 2024, 8:59 AM
49 points
18 comments
7 min read
(arxiv.org)

Poker is a bad game for teaching epistemics. Figgie is a better one.

rossry
Jul 8, 2024, 6:05 AM
105 points
47 comments
11 min read
(blog.rossry.net)

Controlled Creative Destruction

Martin Sustrik
Jul 8, 2024, 4:36 AM
11 points
0 comments
2 min read

On saying “Thank you” instead of “I’m Sorry”

Michael Cohn
Jul 8, 2024, 3:13 AM
136 points
16 comments
3 min read

How can I get over my fear of becoming an emulated consciousness?

James Dowdell
7 Jul 2024 22:02 UTC
6 points
8 comments
5 min read

An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2

Neel Nanda
7 Jul 2024 17:39 UTC
136 points
16 comments
25 min read

Joint mandatory donation as a way to increase the number of donations

Crazy philosopher
7 Jul 2024 10:56 UTC
3 points
3 comments
2 min read