Aligned AI Proposals
Tag
Last edit: 20 Dec 2022 15:20 UTC by particlemania
An LLM-based “exemplary actor”
Roman Leventov · 29 May 2023 11:12 UTC · 15 points · 0 comments · 12 min read

Aligning an H-JEPA agent via training on the outputs of an LLM-based “exemplary actor”
Roman Leventov · 29 May 2023 11:08 UTC · 11 points · 10 comments · 29 min read

The Goal Misgeneralization Problem
Myspy · 18 May 2023 23:40 UTC · 1 point · 0 comments · 1 min read · (drive.google.com)

Proposal: Align Systems Earlier In Training
OneManyNone · 16 May 2023 16:24 UTC · 18 points · 0 comments · 11 min read

Annotated reply to Bengio’s “AI Scientists: Safe and Useful AI?”
Roman Leventov · 8 May 2023 21:26 UTC · 18 points · 1 comment · 7 min read · (yoshuabengio.org)

Against sacrificing AI transparency for generality gains
Ape in the coat · 7 May 2023 6:52 UTC · −3 points · 0 comments · 2 min read

A Proposal for AI Alignment: Using Directly Opposing Models
Arne B · 27 Apr 2023 18:05 UTC · 0 points · 0 comments · 3 min read

How to express this system for ethically aligned AGI as a Mathematical formula?
Oliver Siegel · 19 Apr 2023 20:13 UTC · −1 points · 0 comments · 1 min read

Speculation on mapping the moral landscape for future Ai Alignment
Sven Heinz (Welwordion) · 16 Apr 2023 13:43 UTC · 1 point · 0 comments · 1 min read

Research proposal: Leveraging Jungian archetypes to create values-based models
MiguelDev · 5 Mar 2023 17:39 UTC · 5 points · 2 comments · 2 min read

An Open Agency Architecture for Safe Transformative AI
davidad · 20 Dec 2022 13:04 UTC · 55 points · 22 comments · 4 min read