TW123 (Karma: 1,234)

Posts
Risks from AI Overview: Summary · Dan H, Mantas Mazeika, and TW123 · Aug 18, 2023, 1:21 AM · 25 points · 1 comment · 13 min read · LW link (www.safe.ai)
Catastrophic Risks from AI #6: Discussion and FAQ · Dan H, Mantas Mazeika, and TW123 · Jun 27, 2023, 11:23 PM · 24 points · 1 comment · 13 min read · LW link (arxiv.org)
Catastrophic Risks from AI #5: Rogue AIs · Dan H, Mantas Mazeika, and TW123 · Jun 27, 2023, 10:06 PM · 15 points · 0 comments · 22 min read · LW link (arxiv.org)
Catastrophic Risks from AI #4: Organizational Risks · Dan H, Mantas Mazeika, and TW123 · Jun 26, 2023, 7:36 PM · 23 points · 0 comments · 21 min read · LW link (arxiv.org)
Catastrophic Risks from AI #3: AI Race · Dan H, Mantas Mazeika, and TW123 · Jun 23, 2023, 7:21 PM · 18 points · 9 comments · 29 min read · LW link (arxiv.org)
Catastrophic Risks from AI #2: Malicious Use · Dan H, Mantas Mazeika, and TW123 · Jun 22, 2023, 5:10 PM · 38 points · 1 comment · 17 min read · LW link (arxiv.org)
Catastrophic Risks from AI #1: Introduction · Dan H, Mantas Mazeika, and TW123 · Jun 22, 2023, 5:09 PM · 40 points · 1 comment · 5 min read · LW link (arxiv.org)
[MLSN #9] Verifying large training runs, security risks from LLM access to APIs, why natural selection may favor AIs over humans · Dan H and TW123 · Apr 11, 2023, 4:03 PM · 11 points · 0 comments · 6 min read · LW link (newsletter.mlsafety.org)
[MLSN #8] Mechanistic interpretability, using law to inform AI alignment, scaling laws for proxy gaming · Dan H and TW123 · Feb 20, 2023, 3:54 PM · 20 points · 0 comments · 4 min read · LW link (newsletter.mlsafety.org)
What’s the deal with AI consciousness? · TW123 · Jan 11, 2023, 4:37 PM · 6 points · 13 comments · 9 min read · LW link (aiwatchtower.substack.com)
Implications of simulators · TW123 · Jan 7, 2023, 12:37 AM · 17 points · 0 comments · 12 min read · LW link
“AI” is an indexical · TW123 · Jan 3, 2023, 10:00 PM · 10 points · 0 comments · 6 min read · LW link (aiwatchtower.substack.com)
A Year of AI Increasing AI Progress · TW123 · Dec 30, 2022, 2:09 AM · 148 points · 3 comments · 2 min read · LW link
Did ChatGPT just gaslight me? · TW123 · Dec 1, 2022, 5:41 AM · 123 points · 45 comments · 9 min read · LW link (aiwatchtower.substack.com)
A philosopher’s critique of RLHF · TW123 · Nov 7, 2022, 2:42 AM · 55 points · 8 comments · 2 min read · LW link
ML Safety Scholars Summer 2022 Retrospective · TW123 · Nov 1, 2022, 3:09 AM · 29 points · 0 comments · LW link
Announcing the Introduction to ML Safety course · Dan H, TW123, and ozhang · Aug 6, 2022, 2:46 AM · 73 points · 6 comments · 7 min read · LW link
$20K In Bounties for AI Safety Public Materials · Dan H, TW123, and ozhang · Aug 5, 2022, 2:52 AM · 71 points · 9 comments · 6 min read · LW link
Examples of AI Increasing AI Progress · TW123 · Jul 17, 2022, 8:06 PM · 107 points · 14 comments · 1 min read · LW link
Open Problems in AI X-Risk [PAIS #5] · Dan H and TW123 · Jun 10, 2022, 2:08 AM · 61 points · 6 comments · 36 min read · LW link