Ollie J

Karma: 179

[Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations

13 Jun 2024 10:04 UTC
77 points
10 comments · 2 min read · LW link
(arxiv.org)

Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models

8 Nov 2023 11:37 UTC
49 points
0 comments · 18 min read · LW link

ChatGPT banned in Italy over privacy concerns

Ollie J · 31 Mar 2023 17:33 UTC
18 points
4 comments · 1 min read · LW link
(www.bbc.co.uk)

Whisper’s Wild Implications

Ollie J · 3 Jan 2023 12:17 UTC
19 points
6 comments · 5 min read · LW link