Paul Colognese

Karma: 220

[Linkpost] Frontier AI Taskforce: first progress report

Paul Colognese · 7 Sep 2023 19:06 UTC
21 points
0 comments · 4 min read · LW link
(www.gov.uk)

Aligned AI via monitoring objectives in AutoGPT-like systems

Paul Colognese · 24 May 2023 15:59 UTC
26 points
4 comments · 4 min read · LW link

Towards a solution to the alignment problem via objective detection and evaluation

Paul Colognese · 12 Apr 2023 15:39 UTC
9 points
7 comments · 12 min read · LW link

Decision Transformer Interpretability

6 Feb 2023 7:29 UTC
82 points
13 comments · 24 min read · LW link

Paul Colognese’s Shortform

Paul Colognese · 2 Feb 2023 19:15 UTC
2 points
1 comment · 1 min read · LW link

Auditing games for high-level interpretability

Paul Colognese · 1 Nov 2022 10:44 UTC
33 points
1 comment · 7 min read · LW link

Deception?! I ain’t got time for that!

Paul Colognese · 18 Jul 2022 0:06 UTC
54 points
5 comments · 13 min read · LW link