Teun van der Weij
Karma: 123
Simple distribution approximation: When sampled 100 times, can language models yield 80% A and 20% B?
Teun van der Weij, Felix Hofstätter and Francis Rhys Ward
29 Jan 2024 0:24 UTC, 39 points, 5 comments, 4 min read
An Introduction to AI Sandbagging
Teun van der Weij, Felix Hofstätter and Francis Rhys Ward
26 Apr 2024 13:40 UTC, 41 points, 5 comments, 8 min read