Near-mode thinking on AI

Olli Järviniemi
Aug 4, 2024, 8:47 PM
128 points
9 comments
5 min read
LW link

Watermarks: Signing, Branding, and Boobytrapping

Shankar Sivarajan
Aug 4, 2024, 8:41 PM
4 points
0 comments
1 min read
LW link

Modelling Social Exchange: A Systematised Method to Judge Friendship Quality

Wynn Walker
Aug 4, 2024, 6:49 PM
6 points
0 comments
5 min read
LW link

We’re not as 3-Dimensional as We Think

silentbob
Aug 4, 2024, 2:39 PM
46 points
17 comments
5 min read
LW link

You don’t know how bad most things are nor precisely how they’re bad.

Solenoid_Entity
Aug 4, 2024, 2:12 PM
329 points
49 comments
5 min read
LW link

Can We Predict Persuasiveness Better Than Anthropic?

Lennart Finke
Aug 4, 2024, 2:05 PM
22 points
5 comments
4 min read
LW link

[Question] What should we do about COVID in 2024?

ChristianKl
Aug 4, 2024, 10:57 AM
20 points
2 comments
1 min read
LW link

Tokenized SAEs: Infusing per-token biases.

Aug 4, 2024, 9:17 AM
20 points
20 comments
15 min read
LW link

Thoughts On Democracy

Zero Contradictions
Aug 4, 2024, 6:02 AM
2 points
0 comments
1 min read
LW link
(zerocontradictions.net)

AI Alignment through Comparative Advantage

artemiocobb
Aug 4, 2024, 12:32 AM
−2 points
4 comments
3 min read
LW link

Labelling, Variables, and In-Context Learning in Llama2

Joshua Penman
Aug 3, 2024, 7:36 PM
6 points
0 comments
1 min read
LW link
(colab.research.google.com)

[Question] Dan Hendrycks and EA

jeffreycaruso
Aug 3, 2024, 1:33 PM
−4 points
4 comments
1 min read
LW link

[Question] Why do Minimal Bayes Nets often correspond to Causal Models of Reality?

Dalcy
Aug 3, 2024, 12:39 PM
27 points
1 comment
1 min read
LW link

Why did ChatGPT say that? Prompt engineering and more, with PIZZA.

Jessica Rumbelow
Aug 3, 2024, 12:07 PM
41 points
2 comments
4 min read
LW link

Cooperation and Alignment in Delegation Games: You Need Both!

Aug 3, 2024, 10:16 AM
8 points
0 comments
14 min read
LW link
(www.oliversourbut.net)

SRE’s review of Democracy

Martin Sustrik
Aug 3, 2024, 7:20 AM
48 points
2 comments
3 min read
LW link
(250bpm.substack.com)

The Case Against Libertarianism

Zero Contradictions
Aug 3, 2024, 5:05 AM
−4 points
1 comment
1 min read
LW link
(zerocontradictions.net)

We Don’t Just Let People Die—So What Next?

James Stephen Brown
Aug 3, 2024, 1:04 AM
11 points
8 comments
10 min read
LW link

The EA case for Trump

Judd Rosenblatt
Aug 3, 2024, 1:00 AM
14 points
1 comment
1 min read
LW link
(www.secondbest.ca)

I didn’t think I’d take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is!

mako yass
Aug 2, 2024, 10:35 PM
24 points
2 comments
5 min read
LW link

Evaluating Sparse Autoencoders with Board Game Models

Aug 2, 2024, 7:50 PM
38 points
1 comment
9 min read
LW link

The Bitter Lesson for AI Safety Research

Aug 2, 2024, 6:39 PM
57 points
5 comments
3 min read
LW link

Ethical Deception: Should AI Ever Lie?

Jason Reid
Aug 2, 2024, 5:53 PM
5 points
2 comments
7 min read
LW link

[Question] Request for AI risk quotes, especially around speed, large impacts and black boxes

Nathan Young
Aug 2, 2024, 5:49 PM
6 points
0 comments
1 min read
LW link

A Simple Toy Coherence Theorem

Aug 2, 2024, 5:47 PM
74 points
22 comments
7 min read
LW link

All the Following are Distinct

Gianluca Calcagni
Aug 2, 2024, 4:35 PM
16 points
3 comments
9 min read
LW link

The ‘strong’ feature hypothesis could be wrong

lewis smith
Aug 2, 2024, 2:33 PM
231 points
19 comments
17 min read
LW link

An information-theoretic study of lying in LLMs

Aug 2, 2024, 10:06 AM
17 points
0 comments
4 min read
LW link

How I Wrought a Lesser Scribing Artifact (You Can, Too!)

Lorxus
Aug 2, 2024, 3:35 AM
12 points
0 comments
5 min read
LW link

The Rise and Stagnation of Modernity

Zero Contradictions
Aug 2, 2024, 3:31 AM
1 point
0 comments
1 min read
LW link
(thewaywardaxolotl.blogspot.com)

Lessons from the FDA for AI

Remmelt
Aug 2, 2024, 12:52 AM
1 point
4 comments
LW link
(ainowinstitute.org)

AI Rights for Human Safety

Simon Goldstein
Aug 1, 2024, 11:01 PM
53 points
6 comments
1 min read
LW link
(papers.ssrn.com)

Case Study: Interpreting, Manipulating, and Controlling CLIP With Sparse Autoencoders

Gytis Daujotas
Aug 1, 2024, 9:08 PM
45 points
7 comments
7 min read
LW link

Optimizing Repeated Correlations

SatvikBeri
Aug 1, 2024, 5:33 PM
26 points
1 comment
1 min read
LW link

The need for multi-agent experiments

Martín Soto
Aug 1, 2024, 5:14 PM
43 points
3 comments
9 min read
LW link

Dragon Agnosticism

jefftk
Aug 1, 2024, 5:00 PM
95 points
75 comments
2 min read
LW link
(www.jefftk.com)

Morristown ACX Meetup

mbrooks
Aug 1, 2024, 4:29 PM
2 points
1 comment
1 min read
LW link

Some comments on intelligence

Viliam
Aug 1, 2024, 3:17 PM
30 points
5 comments
3 min read
LW link

[Question] [Thought Experiment] Given a button to terminate all humanity, would you press it?

lorepieri
Aug 1, 2024, 3:10 PM
−2 points
9 comments
1 min read
LW link

Are unpaid UN internships a good idea?

Cipolla
Aug 1, 2024, 3:06 PM
1 point
7 comments
4 min read
LW link

AI #75: Math is Easier

Zvi
Aug 1, 2024, 1:40 PM
46 points
25 comments
72 min read
LW link
(thezvi.wordpress.com)

Temporary Cognitive Hyperparameter Alteration

Jonathan Moregård
Aug 1, 2024, 10:27 AM
9 points
0 comments
3 min read
LW link
(honestliving.substack.com)

Technology and Progress

Zero Contradictions
Aug 1, 2024, 4:49 AM
1 point
0 comments
1 min read
LW link
(thewaywardaxolotl.blogspot.com)

Do Prediction Markets Work?

Benjamin_Sturisky
Aug 1, 2024, 2:31 AM
7 points
0 comments
4 min read
LW link

2/3 Aussie & NZ AI Safety folk often or sometimes feel lonely or disconnected (and 16 other barriers to impact)

yanni kyriacos
Aug 1, 2024, 1:15 AM
13 points
0 comments
8 min read
LW link

[Question] Can UBI overcome inflation and rent seeking?

Gordon Seidoh Worley
Aug 1, 2024, 12:13 AM
5 points
34 comments
1 min read
LW link

Recommendation: reports on the search for missing hiker Bill Ewasko

eukaryote
Jul 31, 2024, 10:15 PM
169 points
28 comments
14 min read
LW link
(eukaryotewritesblog.com)

Economics101 predicted the failure of special card payments for refugees, 3 months later whole of Germany wants to adopt it

Yanling Guo
Jul 31, 2024, 9:09 PM
3 points
3 comments
2 min read
LW link

Ambiguity in Prediction Market Resolution is Still Harmful

aphyer
Jul 31, 2024, 8:32 PM
43 points
17 comments
3 min read
LW link

AI labs can boost external safety research

Zach Stein-Perlman
31 Jul 2024 19:30 UTC
31 points
1 comment
1 min read
LW link