VojtaKovarik

Karma: 684

My original background is in mathematics (analysis, topology, Banach spaces) and game theory (imperfect information games). Nowadays, I do AI alignment research (mostly systemic risks, sometimes pondering “consequentialist reasoning”).

AI Safety Debate and Its Applications

VojtaKovarik · 23 Jul 2019 22:31 UTC
38 points
5 comments · 12 min read · LW link

New paper: (When) is Truth-telling Favored in AI debate?

VojtaKovarik · 26 Dec 2019 19:59 UTC
32 points
7 comments · 5 min read · LW link
(medium.com)

OpenAI could help X-risk by wagering itself

VojtaKovarik · 20 Apr 2023 14:51 UTC
31 points
16 comments · 1 min read · LW link

AI Services as a Research Paradigm

VojtaKovarik · 20 Apr 2020 13:00 UTC
30 points
12 comments · 4 min read · LW link
(docs.google.com)

Recursive Middle Manager Hell: AI Edition

VojtaKovarik · 4 May 2023 20:08 UTC
30 points
11 comments · 2 min read · LW link

Values Form a Shifting Landscape (and why you might care)

VojtaKovarik · 5 Dec 2020 23:56 UTC
28 points
6 comments · 4 min read · LW link

Risk Map of AI Systems

15 Dec 2020 9:16 UTC
28 points
3 comments · 8 min read · LW link

AI Unsafety via Non-Zero-Sum Debate

VojtaKovarik · 3 Jul 2020 22:03 UTC
25 points
10 comments · 5 min read · LW link

Extinction Risks from AI: Invisible to Science?

21 Feb 2024 18:07 UTC
24 points
7 comments · 1 min read · LW link
(arxiv.org)

My Alignment “Plan”: Avoid Strong Optimisation and Align Economy

VojtaKovarik · 31 Jan 2024 17:03 UTC
24 points
9 comments · 7 min read · LW link

Extinction-level Goodhart’s Law as a Property of the Environment

21 Feb 2024 17:56 UTC
23 points
0 comments · 10 min read · LW link

Dynamics Crucial to AI Risk Seem to Make for Complicated Models

21 Feb 2024 17:54 UTC
18 points
0 comments · 9 min read · LW link

Which Model Properties are Necessary for Evaluating an Argument?

21 Feb 2024 17:52 UTC
17 points
2 comments · 7 min read · LW link

Weak vs Quantitative Extinction-level Goodhart’s Law

21 Feb 2024 17:38 UTC
17 points
1 comment · 2 min read · LW link

Deconfuse Yourself about Agency

VojtaKovarik · 23 Aug 2019 0:21 UTC
15 points
9 comments · 5 min read · LW link

Formalizing Objections against Surrogate Goals

VojtaKovarik · 2 Sep 2021 16:24 UTC
13 points
23 comments · 1 min read · LW link

[Question] What is the purpose and application of AI Debate?

VojtaKovarik · 4 Apr 2024 0:38 UTC
13 points
9 comments · 1 min read · LW link

Fundamentally Fuzzy Concepts Can’t Have Crisp Definitions: Cooperation and Alignment vs Math and Physics

VojtaKovarik · 21 Jul 2023 21:03 UTC
12 points
18 comments · 3 min read · LW link

Legitimising AI Red-Teaming by Public

VojtaKovarik · 19 Apr 2023 14:05 UTC
10 points
7 comments · 3 min read · LW link

Redefining Fast Takeoff

VojtaKovarik · 23 Aug 2019 2:15 UTC
10 points
1 comment · 1 min read · LW link