Labelling, Variables, and In-Context Learning in Llama2 · Joshua Penman · Aug 3, 2024, 7:36 PM · 6 points · 0 comments · 1 min read · LW link · (colab.research.google.com)
[Question] Dan Hendrycks and EA · jeffreycaruso · Aug 3, 2024, 1:33 PM · −4 points · 4 comments · 1 min read · LW link
[Question] Why do Minimal Bayes Nets often correspond to Causal Models of Reality? · Dalcy · Aug 3, 2024, 12:39 PM · 27 points · 1 comment · 1 min read · LW link
Why did ChatGPT say that? Prompt engineering and more, with PIZZA. · Jessica Rumbelow · Aug 3, 2024, 12:07 PM · 41 points · 2 comments · 4 min read · LW link
Cooperation and Alignment in Delegation Games: You Need Both! · Oliver Sourbut, Lewis Hammond and HarrietW · Aug 3, 2024, 10:16 AM · 8 points · 0 comments · 14 min read · LW link · (www.oliversourbut.net)
SRE’s review of Democracy · Martin Sustrik · Aug 3, 2024, 7:20 AM · 48 points · 2 comments · 3 min read · LW link · (250bpm.substack.com)
The Case Against Libertarianism · Zero Contradictions · Aug 3, 2024, 5:05 AM · −4 points · 1 comment · 1 min read · LW link · (zerocontradictions.net)
We Don’t Just Let People Die—So What Next? · James Stephen Brown · Aug 3, 2024, 1:04 AM · 11 points · 8 comments · 10 min read · LW link
The EA case for Trump · Judd Rosenblatt · Aug 3, 2024, 1:00 AM · 14 points · 1 comment · 1 min read · LW link · (www.secondbest.ca)
I didn’t think I’d take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is! · mako yass · Aug 2, 2024, 10:35 PM · 24 points · 2 comments · 5 min read · LW link
Evaluating Sparse Autoencoders with Board Game Models · Adam Karvonen, Sam Marks, Can, Benjamin Wright, Jannik Brinkmann, Logan Riggs and Rico Angell · Aug 2, 2024, 7:50 PM · 38 points · 1 comment · 9 min read · LW link
The Bitter Lesson for AI Safety Research · adamk, Richard Ren, Dan H and Gabe M · Aug 2, 2024, 6:39 PM · 57 points · 5 comments · 3 min read · LW link
Ethical Deception: Should AI Ever Lie? · Jason Reid · Aug 2, 2024, 5:53 PM · 5 points · 2 comments · 7 min read · LW link
[Question] Request for AI risk quotes, especially around speed, large impacts and black boxes · Nathan Young · Aug 2, 2024, 5:49 PM · 6 points · 0 comments · 1 min read · LW link
A Simple Toy Coherence Theorem · johnswentworth and David Lorell · Aug 2, 2024, 5:47 PM · 74 points · 22 comments · 7 min read · LW link
All the Following are Distinct · Gianluca Calcagni · Aug 2, 2024, 4:35 PM · 16 points · 3 comments · 9 min read · LW link
The ‘strong’ feature hypothesis could be wrong · lewis smith · Aug 2, 2024, 2:33 PM · 231 points · 19 comments · 17 min read · LW link
An information-theoretic study of lying in LLMs · Annah and Guillaume Corlouer · Aug 2, 2024, 10:06 AM · 17 points · 0 comments · 4 min read · LW link
How I Wrought a Lesser Scribing Artifact (You Can, Too!) · Lorxus · Aug 2, 2024, 3:35 AM · 12 points · 0 comments · 5 min read · LW link
The Rise and Stagnation of Modernity · Zero Contradictions · Aug 2, 2024, 3:31 AM · 1 point · 0 comments · 1 min read · LW link · (thewaywardaxolotl.blogspot.com)
Lessons from the FDA for AI · Remmelt · Aug 2, 2024, 12:52 AM · 1 point · 4 comments · LW link · (ainowinstitute.org)
AI Rights for Human Safety · Simon Goldstein · Aug 1, 2024, 11:01 PM · 53 points · 6 comments · 1 min read · LW link · (papers.ssrn.com)
Case Study: Interpreting, Manipulating, and Controlling CLIP With Sparse Autoencoders · Gytis Daujotas · Aug 1, 2024, 9:08 PM · 45 points · 7 comments · 7 min read · LW link
Optimizing Repeated Correlations · SatvikBeri · Aug 1, 2024, 5:33 PM · 26 points · 1 comment · 1 min read · LW link
The need for multi-agent experiments · Martín Soto · Aug 1, 2024, 5:14 PM · 43 points · 3 comments · 9 min read · LW link
Dragon Agnosticism · jefftk · Aug 1, 2024, 5:00 PM · 95 points · 75 comments · 2 min read · LW link · (www.jefftk.com)
Morristown ACX Meetup · mbrooks · Aug 1, 2024, 4:29 PM · 2 points · 1 comment · 1 min read · LW link
Some comments on intelligence · Viliam · Aug 1, 2024, 3:17 PM · 30 points · 5 comments · 3 min read · LW link
[Question] [Thought Experiment] Given a button to terminate all humanity, would you press it? · lorepieri · Aug 1, 2024, 3:10 PM · −2 points · 9 comments · 1 min read · LW link
Are unpaid UN internships a good idea? · Cipolla · Aug 1, 2024, 3:06 PM · 1 point · 7 comments · 4 min read · LW link
AI #75: Math is Easier · Zvi · Aug 1, 2024, 1:40 PM · 46 points · 25 comments · 72 min read · LW link · (thezvi.wordpress.com)
Temporary Cognitive Hyperparameter Alteration · Jonathan Moregård · Aug 1, 2024, 10:27 AM · 9 points · 0 comments · 3 min read · LW link · (honestliving.substack.com)
Technology and Progress · Zero Contradictions · Aug 1, 2024, 4:49 AM · 1 point · 0 comments · 1 min read · LW link · (thewaywardaxolotl.blogspot.com)
Do Prediction Markets Work? · Benjamin_Sturisky · Aug 1, 2024, 2:31 AM · 7 points · 0 comments · 4 min read · LW link
2/3 Aussie & NZ AI Safety folk often or sometimes feel lonely or disconnected (and 16 other barriers to impact) · yanni kyriacos · Aug 1, 2024, 1:15 AM · 13 points · 0 comments · 8 min read · LW link
[Question] Can UBI overcome inflation and rent seeking? · Gordon Seidoh Worley · Aug 1, 2024, 12:13 AM · 5 points · 34 comments · 1 min read · LW link
Recommendation: reports on the search for missing hiker Bill Ewasko · eukaryote · Jul 31, 2024, 10:15 PM · 169 points · 28 comments · 14 min read · LW link · (eukaryotewritesblog.com)
Economics101 predicted the failure of special card payments for refugees, 3 months later whole of Germany wants to adopt it · Yanling Guo · Jul 31, 2024, 9:09 PM · 3 points · 3 comments · 2 min read · LW link
Ambiguity in Prediction Market Resolution is Still Harmful · aphyer · Jul 31, 2024, 8:32 PM · 43 points · 17 comments · 3 min read · LW link
AI labs can boost external safety research · Zach Stein-Perlman · Jul 31, 2024, 7:30 PM · 31 points · 1 comment · 1 min read · LW link
Women in AI Safety London Meetup · njg · Jul 31, 2024, 6:13 PM · 1 point · 0 comments · 1 min read · LW link
Constructing Neural Network Parameters with Downstream Trainability · ch271828n · Jul 31, 2024, 6:13 PM · 1 point · 0 comments · 1 min read · LW link · (github.com)
Want to work on US emerging tech policy? Consider the Horizon Fellowship. · Elika · Jul 31, 2024, 6:12 PM · 4 points · 0 comments · 1 min read · LW link
[Question] What are your cruxes for imprecise probabilities / decision rules? · Anthony DiGiovanni · Jul 31, 2024, 3:42 PM · 36 points · 33 comments · 1 min read · LW link
The new UK government’s stance on AI safety · Elliot Mckernon · Jul 31, 2024, 3:23 PM · 17 points · 0 comments · 4 min read · LW link
Cat Sustenance Fortification · jefftk · Jul 31, 2024, 2:30 UTC · 14 points · 7 comments · 1 min read · LW link · (www.jefftk.com)
Twitter thread on open-source AI · Richard_Ngo · Jul 31, 2024, 0:26 UTC · 33 points · 6 comments · 2 min read · LW link · (x.com)
Twitter thread on AI takeover scenarios · Richard_Ngo · Jul 31, 2024, 0:24 UTC · 37 points · 0 comments · 2 min read · LW link · (x.com)
Twitter thread on AI safety evals · Richard_Ngo · Jul 31, 2024, 0:18 UTC · 63 points · 3 comments · 2 min read · LW link · (x.com)
Twitter thread on politics of AI safety · Richard_Ngo · Jul 31, 2024, 0:00 UTC · 35 points · 2 comments · 1 min read · LW link · (x.com)