Artem Karpov
Karma: 161
NEST: Nascent Encoded Steganographic Thoughts
Artem Karpov | 17 Feb 2026 7:55 UTC | 20 points | 8 comments | 13 min read
Steganographic Chains of Thought Are Low-Probability but High-Stakes: Evidence and Arguments
Artem Karpov | 11 Dec 2025 7:40 UTC | 20 points | 1 comment | 6 min read
The Illegible Chain-of-Thought Menagerie
Artem Karpov | 18 Nov 2025 12:01 UTC | 3 points | 0 comments | 8 min read
artkpv’s Shortform
Artem Karpov | 12 Oct 2025 9:52 UTC | 2 points | 15 comments | 1 min read
How dangerous is encoded reasoning?
Artem Karpov | 30 Jun 2025 11:54 UTC | 17 points | 0 comments | 10 min read
Philosophical Jailbreaks: Demo of LLM Nihilism
Artem Karpov | 4 Jun 2025 12:03 UTC | 3 points | 0 comments | 5 min read
The Steganographic Potentials of Language Models
Artem Karpov, Tinuade and SCho | 8 May 2025 11:23 UTC | 9 points | 0 comments | 1 min read
CCS on compound sentences
Artem Karpov | 4 May 2024 12:23 UTC | 6 points | 0 comments | 9 min read
Inducing human-like biases in moral reasoning LMs
Artem Karpov, Austin Meek, Bogdan Ionut Cirstea and SCho | 20 Feb 2024 16:28 UTC | 23 points | 3 comments | 14 min read
How important is AI hacking as LLMs advance?
Artem Karpov | 29 Jan 2024 18:41 UTC | 1 point | 0 comments | 6 min read
My (naive) take on Risks from Learned Optimization
Artem Karpov | 31 Oct 2022 10:59 UTC | 7 points | 0 comments | 5 min read