AI Misuse
Tag. Last edit: 1 May 2023 17:42 UTC by Raemon
AI misuse is the use of AI by humans in a way that harms humanity.
Adversarial Robustness Could Help Prevent Catastrophic Misuse
aog · 11 Dec 2023 19:12 UTC · 30 points · 18 comments · 9 min read

Managing catastrophic misuse without robust AIs
ryan_greenblatt and Buck · 16 Jan 2024 17:27 UTC · 63 points · 17 comments · 11 min read

Distinguishing misuse is difficult and uncomfortable
lemonhope · 1 May 2023 16:23 UTC · 17 points · 3 comments · 1 min read

Proposal: Align Systems Earlier In Training
OneManyNone · 16 May 2023 16:24 UTC · 18 points · 0 comments · 11 min read

Misalignment or misuse? The AGI alignment tradeoff
Max_He-Ho · 20 Jun 2025 10:43 UTC · 3 points · 0 comments · 1 min read · (forum.effectivealtruism.org)

Human study on AI spear phishing campaigns
Simon Lermen, Fred Heiding and Andrew Kao · 3 Jan 2025 15:11 UTC · 81 points · 8 comments · 5 min read

Proposal: we should start referring to the risk from unaligned AI as a type of *accident risk*
Christopher King · 16 May 2023 15:18 UTC · 22 points · 6 comments · 2 min read

On excluding dangerous information from training
ShayBenMoshe · 17 Nov 2023 11:14 UTC · 23 points · 5 comments · 3 min read

Visual Prompt Injections: Results on testing AI spam-defense and AI vulnerability to deceptive web ads.
Seon Gunness · 3 Jun 2025 20:10 UTC · 4 points · 0 comments · 12 min read

Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation
Soroush Pour, rusheb, Quentin FEUILLADE--MONTIXI, Arush and scasper · 7 Nov 2023 17:59 UTC · 38 points · 2 comments · 2 min read · (arxiv.org)

Technical Risks of (Lethal) Autonomous Weapons Systems
Heramb · 23 Oct 2024 20:41 UTC · 2 points · 0 comments · 1 min read · (encodejustice.org)

How to solve the misuse problem assuming that in 10 years the default scenario is that AGI agents are capable of synthetizing pathogens
jeremtti · 27 Nov 2024 21:17 UTC · 6 points · 0 comments · 9 min read

Covert Malicious Finetuning
Tony Wang and dannyhalawi · 2 Jul 2024 2:41 UTC · 94 points · 4 comments · 3 min read