Monitoring for deceptive alignment

evhub 8 Sep 2022 23:07 UTC

135 points

8 comments 9 min read LW link

[An email with a bunch of links I sent an experienced ML researcher interested in learning about Alignment / x-safety.]

David Scott Krueger (formerly: capybaralet) 8 Sep 2022 22:28 UTC

47 points

1 comment 5 min read LW link

Progress links & tweets, 2022-09-08

jasoncrawford 8 Sep 2022 20:43 UTC

13 points

3 comments 1 min read LW link

(rootsofprogress.org)

Turning WhatsApp Chat Data into Prompt-Response Form for Fine-Tuning

hatta_afiq 8 Sep 2022 20:05 UTC

1 point

0 comments 1 min read LW link

Postmortem: Trying out for Manifold Markets

Milli | Martin and Austin Chen

8 Sep 2022 17:54 UTC

24 points

0 comments 3 min read LW link

Thoughts on AGI consciousness / sentience

Steven Byrnes 8 Sep 2022 16:40 UTC

38 points

37 comments 6 min read LW link

A rough idea for solving ELK: An approach for training generalist agents like GATO to make plans and describe them to humans clearly and honestly.

Michael Soareverix 8 Sep 2022 15:20 UTC

2 points

2 comments 2 min read LW link

What Should AI Owe To Us? Accountable and Aligned AI Systems via Contractualist AI Alignment

xuan 8 Sep 2022 15:04 UTC

32 points

15 comments 25 min read LW link

ACX Book Review Discussion

Screwtape 8 Sep 2022 14:22 UTC

5 points

0 comments 1 min read LW link

Covid 9/8/22: Booster Boosting

Zvi 8 Sep 2022 13:50 UTC

34 points

9 comments 24 min read LW link

(thezvi.wordpress.com)

Solar Blackout Resistance

jefftk 8 Sep 2022 13:30 UTC

69 points

32 comments 3 min read LW link

(www.jefftk.com)

All AGI safety questions welcome (especially basic ones) [Sept 2022]

plex 8 Sep 2022 11:56 UTC

22 points

48 comments 2 min read LW link

[Question] Sequences/Eliezer essays beyond those in AI to Zombies?

Domenic 8 Sep 2022 5:05 UTC

4 points

4 comments 1 min read LW link

Linkpost: Github Copilot productivity experiment

Daniel Kokotajlo 8 Sep 2022 4:41 UTC

88 points

4 comments 1 min read LW link

(github.blog)

OpenPrinciples Bootcamp (Free) -- Reflect & Act on your Rationality Principles.

ti_guo 8 Sep 2022 3:06 UTC

6 points

3 comments 4 min read LW link

Searching for Modularity in Large Language Models

NickyP and Stephen Fowler

8 Sep 2022 2:25 UTC

44 points

3 comments 14 min read LW link

90% of anything should be bad (& the precision-recall tradeoff)

cartografie 8 Sep 2022 1:20 UTC

33 points

22 comments 6 min read LW link

How to Do Research. v1

Pablo Repetto 8 Sep 2022 1:08 UTC

29 points

4 comments 41 min read LW link

(pabloernesto.github.io)

Galaxy Trucker Needs a New Second Half

jefftk 7 Sep 2022 20:10 UTC

13 points

7 comments 1 min read LW link

(www.jefftk.com)

[Question] In a lack of data, how should you weigh credences in theoretical physics’s Theories of Everything, or TOEs?

Noosphere89 7 Sep 2022 18:25 UTC

7 points

11 comments 1 min read LW link

Generators Of Disagreement With AI Alignment

George3d6 7 Sep 2022 18:15 UTC

27 points

9 comments 9 min read LW link

(www.epistem.ink)

Shrödinger’s lottery or: Why you are going to live forever

Chase Dowdell 7 Sep 2022 18:13 UTC

1 point

2 comments 4 min read LW link

Is training data going to be diluted by AI-generated content?

Hannes Thurnherr 7 Sep 2022 18:13 UTC

10 points

7 comments 1 min read LW link

It’s (not) how you use it

Eleni Angelou 7 Sep 2022 17:15 UTC

8 points

1 comment 2 min read LW link

First we shape our social graph; then it shapes us

Henrik Karlsson 7 Sep 2022 15:50 UTC

52 points

6 comments 8 min read LW link

(escapingflatland.substack.com)

AI-assisted list of ten concrete alignment things to do right now

lukehmiles 7 Sep 2022 8:38 UTC

8 points

5 comments 4 min read LW link

Can “Reward Economics” solve AI Alignment?

Q Home 7 Sep 2022 7:58 UTC

3 points

15 comments 18 min read LW link

Is there a list of projects to get started with Interpretability?

Franziska Fischer 7 Sep 2022 4:27 UTC

8 points

2 comments 1 min read LW link

Progress Report 7: making GPT go hurrdurr instead of brrrrrrr

Nathan Helm-Burger 7 Sep 2022 3:28 UTC

21 points

0 comments 4 min read LW link

Framing AI Childhoods

David Udell 6 Sep 2022 23:40 UTC

37 points

8 comments 4 min read LW link

Deleted comments archive

Said Achmiz 6 Sep 2022 21:54 UTC

9 points

3 comments 1 min read LW link

Guitar Pedals on Fiddle

jefftk 6 Sep 2022 19:30 UTC

10 points

0 comments 2 min read LW link

(www.jefftk.com)

Rejected Early Drafts of Newcomb’s Problem

zahmahkibo 6 Sep 2022 19:04 UTC

112 points

5 comments 3 min read LW link

[Question] How can we secure more research positions at our universities for x-risk researchers?

Neil Crawford 6 Sep 2022 17:17 UTC

11 points

0 comments 1 min read LW link

Community Building for Graduate Students: A Targeted Approach

Neil Crawford 6 Sep 2022 17:17 UTC

6 points

0 comments 4 min read LW link

How Josiah became an AI safety researcher

Neil Crawford 6 Sep 2022 17:17 UTC

4 points

0 comments 1 min read LW link

No, human brains are not (much) more efficient than computers

Jesse Hoogland 6 Sep 2022 13:53 UTC

20 points

21 comments 3 min read LW link

(www.jessehoogland.com)

On oxytocin-sensitive neurons in auditory cortex

Steven Byrnes 6 Sep 2022 12:54 UTC

32 points

6 comments 12 min read LW link

EA & LW Forums Weekly Summary (28 Aug − 3 Sep 22’)

Zoe Williams 6 Sep 2022 11:06 UTC

51 points

2 comments 14 min read LW link

Alex Lawsen On Forecasting AI Progress

Michaël Trazzi 6 Sep 2022 9:32 UTC

18 points

0 comments 2 min read LW link

(theinsideview.ai)

What are you for?

lsusr 6 Sep 2022 3:32 UTC

42 points

5 comments 1 min read LW link

The Power (and limits?) of Chunking

NicholasKross 6 Sep 2022 2:26 UTC

8 points

2 comments 1 min read LW link

Another Unphrased B-part

jefftk 6 Sep 2022 1:30 UTC

10 points

0 comments 2 min read LW link

(www.jefftk.com)

[Exploratory] Becoming more Agentic

Johannes C. Mayer 6 Sep 2022 0:45 UTC

6 points

1 comment 1 min read LW link

AI Governance Needs Technical Work

Mau 5 Sep 2022 22:28 UTC

41 points

1 comment 8 min read LW link

program searches

Tamsin Leake 5 Sep 2022 20:04 UTC

21 points

2 comments 2 min read LW link

(carado.moe)

Overton Gymnastics: An Exercise in Discomfort

Shoshannah Tekofsky and omark

5 Sep 2022 19:20 UTC

40 points

15 comments 4 min read LW link

The Good King

GregorDeVillain 5 Sep 2022 19:17 UTC

−6 points

0 comments 13 min read LW link

Beta Readers are Great

HoldenKarnofsky 5 Sep 2022 19:10 UTC

28 points

0 comments 1 min read LW link

(www.cold-takes.com)

Impact Shares For Speculative Projects

Elizabeth 5 Sep 2022 18:00 UTC

30 points

8 comments 7 min read LW link

(acesounderglass.com)