[Question]
Why Carl Jung is not popular in AI Alignment Research?
MiguelDev
Mar 17, 2023, 11:56 PM
−3
points
13
comments
1
min read
LW
link
[Event] Join Metaculus for Forecast Friday on March 24th!
ChristianWilliams
Mar 17, 2023, 10:47 PM
3
points
0
comments
LW
link
Meetup Tip: The Next Meetup Will Be. . .
Screwtape
Mar 17, 2023, 10:04 PM
44
points
0
comments
3
min read
LW
link
The Power of High Speed Stupidity
robotelvis
Mar 17, 2023, 9:41 PM
33
points
6
comments
9
min read
LW
link
1
review
(messyprogress.substack.com)
Retrospective on ‘GPT-4 Predictions’ After the Release of GPT-4
Stephen McAleese
Mar 17, 2023, 6:34 PM
26
points
6
comments
6
min read
LW
link
“Carefully Bootstrapped Alignment” is organizationally hard
Raemon
Mar 17, 2023, 6:00 PM
262
points
23
comments
11
min read
LW
link
1
review
[Question]
Are nested jailbreaks inevitable?
judson
Mar 17, 2023, 5:43 PM
1
point
0
comments
1
min read
LW
link
Ethical AI investments?
Justin wilson
Mar 17, 2023, 5:43 PM
24
points
15
comments
1
min read
LW
link
New economic system for AI era
ksme sho
Mar 17, 2023, 5:42 PM
−1
points
1
comment
5
min read
LW
link
On some first principles of intelligence
Macheng_Shen
Mar 17, 2023, 5:42 PM
−14
points
0
comments
4
min read
LW
link
Essential Behaviorism Terms
Rivka
Mar 17, 2023, 5:41 PM
15
points
1
comment
10
min read
LW
link
Vector semantics and “Kubla Khan,” Part 2
Bill Benzon
Mar 17, 2023, 4:32 PM
2
points
0
comments
3
min read
LW
link
Super-Luigi = Luigi + (Luigi - Waluigi)
Alexei
Mar 17, 2023, 3:27 PM
16
points
9
comments
1
min read
LW
link
Survey on intermediate goals in AI governance
MichaelA
and
MaxRa
Mar 17, 2023, 1:12 PM
25
points
3
comments
1
min read
LW
link
GPT-4 solves Gary Marcus-induced flubs
JakubK
Mar 17, 2023, 6:40 AM
56
points
29
comments
2
min read
LW
link
(docs.google.com)
[Question]
Are the LLM “intelligence” tests publicly available for humans to take?
nim
Mar 17, 2023, 12:09 AM
7
points
12
comments
1
min read
LW
link
Donation offsets for ChatGPT Plus subscriptions
Jeffrey Ladish
Mar 16, 2023, 11:29 PM
53
points
3
comments
3
min read
LW
link
The algorithm isn’t doing X, it’s just doing Y.
Cleo Nardo
Mar 16, 2023, 11:28 PM
53
points
43
comments
5
min read
LW
link
Announcing the ERA Cambridge Summer Research Fellowship
Nandini Shiralkar
Mar 16, 2023, 10:57 PM
11
points
0
comments
3
min read
LW
link
Gradual takeoff, fast failure
Max H
Mar 16, 2023, 10:02 PM
15
points
4
comments
5
min read
LW
link
Conceding a short timelines bet early
Matthew Barnett
Mar 16, 2023, 9:49 PM
133
points
17
comments
1
min read
LW
link
Attribution Patching: Activation Patching At Industrial Scale
Neel Nanda
Mar 16, 2023, 9:44 PM
45
points
10
comments
58
min read
LW
link
(www.neelnanda.io)
[Question]
Will 2023 be the last year you can write short stories and receive most of the intellectual credit for writing them?
lc
Mar 16, 2023, 9:36 PM
20
points
11
comments
1
min read
LW
link
Is it a bad idea to pay for GPT-4?
nem
Mar 16, 2023, 8:49 PM
24
points
8
comments
1
min read
LW
link
Are AI developers playing with fire?
marcusarvan
Mar 16, 2023, 7:12 PM
6
points
0
comments
10
min read
LW
link
[Question]
When will computer programming become an unskilled job (if ever)?
lc
Mar 16, 2023, 5:46 PM
36
points
55
comments
1
min read
LW
link
[Appendix] Natural Abstractions: Key Claims, Theorems, and Critiques
LawrenceC
,
Erik Jenner
and
Leon Lang
Mar 16, 2023, 4:38 PM
48
points
0
comments
13
min read
LW
link
Natural Abstractions: Key Claims, Theorems, and Critiques
LawrenceC
,
Leon Lang
and
Erik Jenner
Mar 16, 2023, 4:37 PM
241
points
26
comments
45
min read
LW
link
3
reviews
On the Crisis at Silicon Valley Bank
Zvi
Mar 16, 2023, 3:50 PM
59
points
9
comments
41
min read
LW
link
(thezvi.wordpress.com)
[Question]
What literature on the neuroscience of decision making can you recommend?
quetzal_rainbow
Mar 16, 2023, 3:32 PM
3
points
0
comments
1
min read
LW
link
[Question]
What organizations other than Conjecture have (esp. public) info-hazard policies?
David Scott Krueger (formerly: capybaralet)
Mar 16, 2023, 2:49 PM
20
points
1
comment
1
min read
LW
link
[Question]
Is there an analysis of the common consideration that splitting an AI lab into two (e.g. the founding of Anthropic) speeds up the development of TAI and therefore increases AI x-risk?
tchauvin
Mar 16, 2023, 2:16 PM
4
points
0
comments
1
min read
LW
link
A chess game against GPT-4
Rafael Harth
Mar 16, 2023, 2:05 PM
24
points
23
comments
1
min read
LW
link
ChatGPT getting out of the box
qbolec
Mar 16, 2023, 1:47 PM
6
points
3
comments
1
min read
LW
link
[Question]
Are funds (such as the Long-Term Future Fund) willing to give extra money to AI safety researchers to balance for the opportunity cost of taking an “industry” job?
Malleable_shape
Mar 16, 2023, 11:54 AM
5
points
1
comment
1
min read
LW
link
Three levels of exploration and intelligence
Q Home
Mar 16, 2023, 10:55 AM
9
points
3
comments
21
min read
LW
link
Here, have a calmness video
Kaj_Sotala
Mar 16, 2023, 10:00 AM
111
points
15
comments
2
min read
LW
link
(www.youtube.com)
Wittgenstein’s Language Games and the Critique of the Natural Abstraction Hypothesis
Chris_Leong
Mar 16, 2023, 7:56 AM
16
points
20
comments
2
min read
LW
link
Red-teaming AI-safety concepts that rely on science metaphors
catubc
Mar 16, 2023, 6:52 AM
5
points
4
comments
5
min read
LW
link
[ASoT] Some thoughts on human abstractions
leogao
Mar 16, 2023, 5:42 AM
42
points
4
comments
5
min read
LW
link
How I Run Solstice, Step by Step
maia
Mar 16, 2023, 3:23 AM
42
points
0
comments
16
min read
LW
link
(particularvirtue.blogspot.com)
GPT-4 Multiplication Competition
dandelion4
Mar 16, 2023, 3:09 AM
11
points
7
comments
1
min read
LW
link
Want to predict/explain/control the output of GPT-4? Then learn about the world, not about transformers.
Cleo Nardo
Mar 16, 2023, 3:08 AM
107
points
26
comments
5
min read
LW
link
[Question]
Is it worth avoiding detailed discussions of expectations about agency levels of powerful AIs?
David Johnston
Mar 16, 2023, 3:06 AM
11
points
6
comments
2
min read
LW
link
Why self-improvement?
Adam Zerner
Mar 16, 2023, 2:49 AM
12
points
4
comments
2
min read
LW
link
[Question]
What is a good comprehensive examination of risks near the Ohio train derailment?
1a3orn
Mar 16, 2023, 12:21 AM
17
points
0
comments
1
min read
LW
link
Write a Book?
jefftk
16 Mar 2023 0:10 UTC
45
points
7
comments
3
min read
LW
link
(www.jefftk.com)
AI Safety - 7 months of discussion in 17 minutes
Zoe Williams
15 Mar 2023 23:41 UTC
25
points
0
comments
LW
link
How well did Manifold predict GPT-4?
David Chee
15 Mar 2023 23:19 UTC
49
points
5
comments
2
min read
LW
link
80k podcast episode on sentience in AI systems
Robbo
15 Mar 2023 20:19 UTC
15
points
0
comments
13
min read
LW
link
(80000hours.org)