ACI#6: A Non-Dualistic ACI Model
Akira Pyinya
Nov 9, 2023, 11:01 PM
10
points
2
comments
6
min read
LW
link
How I got so excited about HowTruthful
Bruce Lewis
Nov 9, 2023, 6:49 PM
17
points
3
comments
5
min read
LW
link
The case for “Generous Tit for Tat” as the ultimate game theory strategy
positivesum
Nov 9, 2023, 6:41 PM
2
points
3
comments
8
min read
LW
link
(tryingtruly.substack.com)
Text Posts from the Kids Group: 2021
jefftk
Nov 9, 2023, 5:50 PM
38
points
1
comment
8
min read
LW
link
(www.jefftk.com)
AI #37: Moving Too Fast
Zvi
Nov 9, 2023, 5:50 PM
53
points
5
comments
76
min read
LW
link
(thezvi.wordpress.com)
Learning-theoretic agenda reading list
Vanessa Kosoy
Nov 9, 2023, 5:25 PM
103
points
1
comment
2
min read
LW
link
1
review
Open-ended/Phenomenal Ethics (TLDR)
Ryo
Nov 9, 2023, 4:58 PM
3
points
0
comments
1
min read
LW
link
Polysemantic Attention Head in a 4-Layer Transformer
Jett Janiak, cmathw and StefanHex
Nov 9, 2023, 4:16 PM
51
points
0
comments
6
min read
LW
link
On OpenAI Dev Day
Zvi
Nov 9, 2023, 4:10 PM
60
points
0
comments
15
min read
LW
link
(thezvi.wordpress.com)
Antropical Probabilities Are Fully Explained by Difference in Possible Outcomes
Ape in the coat
Nov 9, 2023, 3:34 PM
19
points
7
comments
5
min read
LW
link
A free to enter, 240 character, open-source iterated prisoner’s dilemma tournament
Isaac King
Nov 9, 2023, 8:24 AM
64
points
19
comments
1
min read
LW
link
(manifold.markets)
Into AI Safety Episodes 1 & 2
jacobhaimes
Nov 9, 2023, 4:36 AM
2
points
0
comments
1
min read
LW
link
(into-ai-safety.github.io)
Making Bad Decisions On Purpose
Screwtape
Nov 9, 2023, 3:36 AM
49
points
8
comments
5
min read
LW
link
Metaculus’s New Sidebar Helps You Find Forecasts Faster
ChristianWilliams
Nov 8, 2023, 8:56 PM
15
points
0
comments
LW
link
(www.metaculus.com)
Open-ended ethics of phenomena (a desiderata with universal morality)
Ryo
Nov 8, 2023, 8:10 PM
1
point
0
comments
8
min read
LW
link
Open Agency model can solve the AI regulation dilemma
Roman Leventov
Nov 8, 2023, 8:00 PM
22
points
1
comment
2
min read
LW
link
Gothenburg LW / ACX meetup
Stefan
Nov 8, 2023, 7:52 PM
1
point
0
comments
1
min read
LW
link
[Question]
Why is lesswrong blocking wget and curl (scrape)?
nick lacombe
Nov 8, 2023, 7:42 PM
21
points
15
comments
1
min read
LW
link
[Question]
Is there a lesswrong archive of all public posts?
nick lacombe
Nov 8, 2023, 7:26 PM
12
points
7
comments
1
min read
LW
link
Five projects from AI Safety Hub Labs 2023
charlie_griffin
Nov 8, 2023, 7:19 PM
47
points
1
comment
6
min read
LW
link
(www.aisafetyhub.org)
[Question]
Can a stupid person become intelligent?
A. T.
Nov 8, 2023, 7:01 PM
12
points
24
comments
2
min read
LW
link
Prosthetic Intelligence
Krantz
Nov 8, 2023, 7:01 PM
7
points
9
comments
2
min read
LW
link
[Question]
Do you have a satisfactory workflow for learning about a line of research using GPT4, Claude, etc?
ryan_b
Nov 8, 2023, 6:05 PM
9
points
3
comments
1
min read
LW
link
What’s going on? LLMs and IS-A sentences
Bill Benzon
Nov 8, 2023, 4:58 PM
6
points
15
comments
4
min read
LW
link
[Question]
What will happen with real estate prices during a slow takeoff?
Ricardo Meneghin
Nov 8, 2023, 11:58 AM
8
points
1
comment
1
min read
LW
link
Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models
Felix Hofstätter, Francis Rhys Ward, HarrietW, LAThomson, Ollie J, Patrik Bartak and Sam F. Brown
Nov 8, 2023, 11:37 AM
49
points
0
comments
18
min read
LW
link
How well does your research adress the theory-practice gap?
Jonas Hallgren
Nov 8, 2023, 11:27 AM
18
points
0
comments
10
min read
LW
link
Growth and Form in a Toy Model of Superposition
Liam Carroll and Edmund Lau
Nov 8, 2023, 11:08 AM
89
points
7
comments
14
min read
LW
link
Running your own workshop on handling hostile disagreements
Camille Berger
Nov 8, 2023, 10:28 AM
12
points
1
comment
7
min read
LW
link
Thinking By The Clock
Screwtape
Nov 8, 2023, 7:40 AM
197
points
29
comments
8
min read
LW
link
1
review
[Question]
Impressions from base-GPT-4?
mishka
Nov 8, 2023, 5:43 AM
25
points
25
comments
1
min read
LW
link
Quantopian contest, but for food intake and weight
Lucent
Nov 8, 2023, 5:41 AM
40
points
9
comments
3
min read
LW
link
How I Think, Part Two: Distrusting Individuals
Richard Henage
Nov 8, 2023, 4:06 AM
4
points
6
comments
3
min read
LW
link
How I Think, Part One: Investing in Fun
Richard Henage
Nov 8, 2023, 4:00 AM
5
points
2
comments
5
min read
LW
link
Concrete positive visions for a future without AGI
Max H
Nov 8, 2023, 3:12 AM
41
points
28
comments
8
min read
LW
link
South Bay ACX/LW/EA Meetup & Vegansgiving Potluck
IS
Nov 8, 2023, 2:30 AM
10
points
0
comments
1
min read
LW
link
Progress links digest, 2023-11-07: Techno-optimism and more
jasoncrawford
Nov 8, 2023, 2:05 AM
17
points
7
comments
11
min read
LW
link
(rootsofprogress.org)
Announcing Athena—Women in AI Alignment Research
Claire Short
Nov 7, 2023, 9:46 PM
80
points
2
comments
3
min read
LW
link
Vote on Interesting Disagreements
Ben Pace
Nov 7, 2023, 9:35 PM
159
points
131
comments
1
min read
LW
link
What is democracy for?
Johnstone
Nov 7, 2023, 6:17 PM
−5
points
10
comments
7
min read
LW
link
Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation
Soroush Pour, rusheb, Quentin FEUILLADE--MONTIXI, Arush and scasper
Nov 7, 2023, 5:59 PM
38
points
2
comments
2
min read
LW
link
(arxiv.org)
Implementing Decision Theory
justinpombrio
Nov 7, 2023, 5:55 PM
22
points
12
comments
3
min read
LW
link
Mirror, Mirror on the Wall: How Do Forecasters Fare by Their Own Call?
nikos
7 Nov 2023 17:39 UTC
14
points
5
comments
14
min read
LW
link
Symbiotic self-alignment of AIs.
Spiritus Dei
7 Nov 2023 17:18 UTC
1
point
0
comments
3
min read
LW
link
AMA: Earning to Give
jefftk
7 Nov 2023 16:20 UTC
53
points
8
comments
1
min read
LW
link
(www.jefftk.com)
The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs
Quentin FEUILLADE--MONTIXI and Pierre Peigné
7 Nov 2023 16:12 UTC
52
points
21
comments
6
min read
LW
link
Preface to the Sequence on LLM Psychology
Quentin FEUILLADE--MONTIXI
7 Nov 2023 16:12 UTC
33
points
0
comments
2
min read
LW
link
What I’ve been reading, November 2023
jasoncrawford
7 Nov 2023 13:37 UTC
23
points
1
comment
5
min read
LW
link
(rootsofprogress.org)
AI Alignment [Progress] this Week (11/05/2023)
Logan Zoellner
7 Nov 2023 13:26 UTC
24
points
0
comments
4
min read
LW
link
(midwitalignment.substack.com)
On the UK Summit
Zvi
7 Nov 2023 13:10 UTC
74
points
6
comments
30
min read
LW
link
(thezvi.wordpress.com)