Sam Marks
Karma: 1,568
What’s up with LLMs representing XORs of arbitrary features?
By Sam Marks | 3 Jan 2024 19:44 UTC | 154 points | 61 comments | 16 min read
Some open-source dictionaries and dictionary learning infrastructure
By Sam Marks | 5 Dec 2023 6:05 UTC | 45 points | 7 comments | 5 min read
Thoughts on open source AI
By Sam Marks | 3 Nov 2023 15:35 UTC | 52 points | 17 comments | 10 min read
Turning off lights with model editing
By Sam Marks | 12 May 2023 20:25 UTC | 67 points | 5 comments | 2 min read | (arxiv.org)
[Crosspost] ACX 2022 Prediction Contest Results
By Scott Alexander, Eric Neyman and Sam Marks | 24 Jan 2023 6:56 UTC | 46 points | 6 comments | 8 min read
AGISF adaptation for in-person groups
By Sam Marks, Xander Davies and Richard_Ngo | 13 Jan 2023 3:24 UTC | 44 points | 2 comments | 3 min read
Update on Harvard AI Safety Team and MIT AI Alignment
By Xander Davies, Sam Marks, kaivu, tlevin, eleni, maxnadeau and Naomi Bashkansky | 2 Dec 2022 0:56 UTC | 60 points | 4 comments | 8 min read
Recommend HAIST resources for assessing the value of RLHF-related alignment research
By Sam Marks and Xander Davies | 5 Nov 2022 20:58 UTC | 26 points | 9 comments | 3 min read
Caution when interpreting Deepmind’s In-context RL paper
By Sam Marks | 1 Nov 2022 2:42 UTC | 103 points | 6 comments | 4 min read
Safety considerations for online generative modeling
By Sam Marks | 7 Jul 2022 18:31 UTC | 42 points | 9 comments | 14 min read
Proxy misspecification and the capabilities vs. value learning race
By Sam Marks | 16 May 2022 18:58 UTC | 23 points | 3 comments | 4 min read
If you’re very optimistic about ELK then you should be optimistic about outer alignment
By Sam Marks | 27 Apr 2022 19:30 UTC | 17 points | 8 comments | 3 min read
Sam Marks’s Shortform
By Sam Marks | 13 Apr 2022 21:38 UTC | 3 points | 26 comments | 1 min read
2022 ACX predictions: market prices
By Sam Marks | 6 Mar 2022 6:24 UTC | 21 points | 2 comments | 5 min read
Movie review: Don’t Look Up
By Sam Marks | 4 Jan 2022 20:16 UTC | 35 points | 6 comments | 11 min read
[Book review] Gödel, Escher, Bach: an in-depth explainer
By Sam Marks | 29 Sep 2021 19:03 UTC | 98 points | 23 comments | 23 min read | 1 review
[Question] For mRNA vaccines, is (short-term) efficacy really higher after the second dose?
By Sam Marks | 25 Apr 2021 20:21 UTC | 27 points | 13 comments | 3 min read