The Long-Term Future Fund is looking for a full-time fund chair

Oct 5, 2023, 10:18 PM
52 points
0 comments · 7 min read · LW link
(forum.effectivealtruism.org)

Provably Safe AI

PeterMcCluskey · Oct 5, 2023, 10:18 PM
35 points
15 comments · 4 min read · LW link
(bayesianinvestor.com)

Stampy’s AI Safety Info soft launch

Oct 5, 2023, 10:13 PM
120 points
9 comments · 2 min read · LW link

Impacts of AI on the housing markets

PottedRosePetal · Oct 5, 2023, 9:24 PM
8 points
0 comments · 5 min read · LW link

Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

Zac Hatfield-Dodds · Oct 5, 2023, 9:01 PM
288 points
22 comments · 2 min read · LW link · 1 review
(transformer-circuits.pub)

Ideation and Trajectory Modelling in Language Models

NickyP · Oct 5, 2023, 7:21 PM
16 points
2 comments · 10 min read · LW link

A well-defined history in measurable factor spaces

Matthias G. Mayer · Oct 5, 2023, 6:36 PM
22 points
0 comments · 2 min read · LW link

Evaluating the historical value misspecification argument

Matthew Barnett · Oct 5, 2023, 6:34 PM
188 points
162 comments · 7 min read · LW link · 3 reviews

Translations Should Invert

abramdemski · Oct 5, 2023, 5:44 PM
48 points
19 comments · 3 min read · LW link

Censorship in LLMs is here to stay because it mirrors how our own intelligence is structured

mnvr · Oct 5, 2023, 5:37 PM
3 points
0 comments · 1 min read · LW link

Twin Cities ACX Meetup October 2023

Timothy M. · Oct 5, 2023, 4:29 PM
1 point
2 comments · 1 min read · LW link

This anime storyboard doesn’t exist: a graphic novel written and illustrated by GPT4

RomanS · Oct 5, 2023, 2:01 PM
12 points
7 comments · 55 min read · LW link

AI #32: Lie Detector

Zvi · Oct 5, 2023, 1:50 PM
45 points
19 comments · 44 min read · LW link
(thezvi.wordpress.com)

Can the House Legislate?

jefftk · Oct 5, 2023, 1:40 PM
26 points
6 comments · 2 min read · LW link
(www.jefftk.com)

Making progress on the “what alignment target should be aimed at?” question is urgent

ThomasCederborg · Oct 5, 2023, 12:55 PM
2 points
0 comments · 18 min read · LW link

Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn

Zvi · Oct 5, 2023, 11:39 AM
129 points
29 comments · 9 min read · LW link

How to Get Rationalist Feedback

Nicholas / Heather Kross · Oct 5, 2023, 2:03 AM
16 points
0 comments · 2 min read · LW link

On my AI Fable, and the importance of de re, de dicto, and de se reference for AI alignment

PhilGoetz · Oct 5, 2023, 12:50 AM
9 points
5 comments · 1 min read · LW link

Underspecified Probabilities: A Thought Experiment

lunatic_at_large · Oct 4, 2023, 10:25 PM
8 points
4 comments · 2 min read · LW link

Fraternal Birth Order Effect and the Maternal Immune Hypothesis

Bucky · Oct 4, 2023, 9:18 PM
20 points
1 comment · 2 min read · LW link

How to solve deception and still fail.

Charlie Steiner · Oct 4, 2023, 7:56 PM
40 points
7 comments · 6 min read · LW link

PortAudio M1 Latency

jefftk · Oct 4, 2023, 7:10 PM
8 points
5 comments · 1 min read · LW link
(www.jefftk.com)

Open Philanthropy is hiring for multiple roles across our Global Catastrophic Risks teams

aarongertler · Oct 4, 2023, 6:04 PM
6 points
0 comments · 3 min read · LW link
(forum.effectivealtruism.org)

Safeguarding Humanity: Ensuring AI Remains a Servant, Not a Master

kgldeshapriya · Oct 4, 2023, 5:52 PM
−20 points
2 comments · 2 min read · LW link

The 5 Pillars of Happiness

Gabi QUENE · Oct 4, 2023, 5:50 PM
−24 points
5 comments · 5 min read · LW link

[Question] Using Reinforcement Learning to try to control the heating of a building (district heating)

Tony Karlsson · Oct 4, 2023, 5:47 PM
3 points
5 comments · 1 min read · LW link

rationalistic probability (litterally just throwing shit out there)

NotaSprayer ASprayer · Oct 4, 2023, 5:46 PM
−30 points
8 comments · 2 min read · LW link

AISN #23: New OpenAI Models, News from Anthropic, and Representation Engineering

Dan H · Oct 4, 2023, 5:37 PM
15 points
2 comments · 5 min read · LW link
(newsletter.safe.ai)

I don’t find the lie detection results that surprising (by an author of the paper)

JanB · Oct 4, 2023, 5:10 PM
97 points
8 comments · 3 min read · LW link

[Question] What evidence is there of LLM’s containing world models?

Chris_Leong · Oct 4, 2023, 2:33 PM
17 points
17 comments · 1 min read · LW link

Entanglement and intuition about words and meaning

Bill Benzon · Oct 4, 2023, 2:16 PM
4 points
0 comments · 2 min read · LW link

Why a Mars colony would lead to a first strike situation

Remmelt · Oct 4, 2023, 11:29 AM
−60 points
8 comments · 1 min read · LW link
(mflb.com)

[Question] What are some examples of AIs instantiating the ‘nearest unblocked strategy problem’?

EJT · Oct 4, 2023, 11:05 AM
6 points
4 comments · 1 min read · LW link

Graphical tensor notation for interpretability

Jordan Taylor · Oct 4, 2023, 8:04 AM
141 points
11 comments · 19 min read · LW link

[Link] Bay Area Winter Solstice 2023

Oct 4, 2023, 2:19 AM
18 points
3 comments · 1 min read · LW link
(fb.me)

[Question] Who determines whether an alignment proposal is the definitive alignment solution?

MiguelDev · Oct 3, 2023, 10:39 PM
−1 points
6 comments · 1 min read · LW link

AXRP Episode 25 - Cooperative AI with Caspar Oesterheld

DanielFilan · Oct 3, 2023, 9:50 PM
43 points
0 comments · 92 min read · LW link

When to Get the Booster?

jefftk · Oct 3, 2023, 9:00 PM
50 points
15 comments · 2 min read · LW link
(www.jefftk.com)

OpenAI-Microsoft partnership

Zach Stein-Perlman · Oct 3, 2023, 8:01 PM
51 points
19 comments · 1 min read · LW link

[Question] Current AI safety techniques?

Zach Stein-Perlman · Oct 3, 2023, 7:30 PM
30 points
2 comments · 2 min read · LW link

Testing and Automation for Intelligent Systems.

Sai Kiran Kammari · Oct 3, 2023, 5:51 PM
−13 points
0 comments · 1 min read · LW link
(resource-cms.springernature.com)

Metaculus Announces Forecasting Tournament to Evaluate Focused Research Organizations, in Partnership With the Federation of American Scientists

ChristianWilliams · Oct 3, 2023, 4:44 PM
13 points
0 comments · LW link
(www.metaculus.com)

What would it mean to understand how a large language model (LLM) works? Some quick notes.

Bill Benzon · Oct 3, 2023, 3:11 PM
20 points
4 comments · 8 min read · LW link

[Question] Potential alignment targets for a sovereign superintelligent AI

Paul Colognese · Oct 3, 2023, 3:09 PM
29 points
4 comments · 1 min read · LW link

Monthly Roundup #11: October 2023

Zvi · Oct 3, 2023, 2:10 PM
42 points
12 comments · 35 min read · LW link
(thezvi.wordpress.com)

Why We Use Money? - A Walrasian View

Savio Coelho · Oct 3, 2023, 12:02 PM
4 points
3 comments · 8 min read · LW link

Mech Interp Challenge: October—Deciphering the Sorted List Model

CallumMcDougall · Oct 3, 2023, 10:57 AM
23 points
0 comments · 3 min read · LW link

Early Experiments in Reward Model Interpretation Using Sparse Autoencoders

Oct 3, 2023, 7:45 AM
17 points
0 comments · 5 min read · LW link

Some Quick Follow-Up Experiments to “Taken out of context: On measuring situational awareness in LLMs”

Miles Turpin · Oct 3, 2023, 2:22 AM
31 points
0 comments · 9 min read · LW link

My Mid-Career Transition into Biosecurity

jefftk · Oct 2, 2023, 9:20 PM
26 points
4 comments · 2 min read · LW link
(www.jefftk.com)