Tomáš Gavenčiak

Karma: 111

A researcher in CS theory, AI safety and other stuff.

InterLab – a toolkit for experiments with multi-agent interactions

22 Jan 2024 18:23 UTC
68 points
0 comments · 8 min read · LW link
(acsresearch.org)

Sparsity and interpretability?

1 Jun 2020 13:25 UTC
41 points
3 comments · 7 min read · LW link

How can Interpretability help Alignment?

23 May 2020 16:16 UTC
37 points
3 comments · 9 min read · LW link

What is Interpretability?

17 Mar 2020 20:23 UTC
35 points
0 comments · 11 min read · LW link