
VojtaKovarik

Karma: 684

My original background is in mathematics (analysis, topology, Banach spaces) and game theory (imperfect-information games). Nowadays, I do AI alignment research (mostly systemic risks, sometimes pondering “consequentialist reasoning”).

AI Safety Debate and Its Applications

VojtaKovarik · 23 Jul 2019 22:31 UTC
38 points
5 comments · 12 min read · LW link

Deconfuse Yourself about Agency

VojtaKovarik · 23 Aug 2019 0:21 UTC
15 points
9 comments · 5 min read · LW link

Redefining Fast Takeoff

VojtaKovarik · 23 Aug 2019 2:15 UTC
10 points
1 comment · 1 min read · LW link

New paper: (When) is Truth-telling Favored in AI debate?

VojtaKovarik · 26 Dec 2019 19:59 UTC
32 points
7 comments · 5 min read · LW link
(medium.com)

AI Services as a Research Paradigm

VojtaKovarik · 20 Apr 2020 13:00 UTC
30 points
12 comments · 4 min read · LW link
(docs.google.com)

AI Unsafety via Non-Zero-Sum Debate

VojtaKovarik · 3 Jul 2020 22:03 UTC
25 points
10 comments · 5 min read · LW link

AI Problems Shared by Non-AI Systems

VojtaKovarik · 5 Dec 2020 22:15 UTC
7 points
2 comments · 4 min read · LW link

Values Form a Shifting Landscape (and why you might care)

VojtaKovarik · 5 Dec 2020 23:56 UTC
28 points
6 comments · 4 min read · LW link

Risk Map of AI Systems

15 Dec 2020 9:16 UTC
28 points
3 comments · 8 min read · LW link

Formalizing Objections against Surrogate Goals

VojtaKovarik · 2 Sep 2021 16:24 UTC
13 points
23 comments · 1 min read · LW link

[Question] How do you align your emotions through updates and existential uncertainty?

VojtaKovarik · 17 Apr 2023 20:46 UTC
4 points
10 comments · 1 min read · LW link

Legitimising AI Red-Teaming by Public

VojtaKovarik · 19 Apr 2023 14:05 UTC
10 points
7 comments · 3 min read · LW link

OpenAI could help X-risk by wagering itself

VojtaKovarik · 20 Apr 2023 14:51 UTC
31 points
16 comments · 1 min read · LW link

Recursive Middle Manager Hell: AI Edition

VojtaKovarik · 4 May 2023 20:08 UTC
30 points
11 comments · 2 min read · LW link

Fundamentally Fuzzy Concepts Can’t Have Crisp Definitions: Cooperation and Alignment vs Math and Physics

VojtaKovarik · 21 Jul 2023 21:03 UTC
12 points
18 comments · 3 min read · LW link

AI Awareness through Interaction with Blatantly Alien Models

VojtaKovarik · 28 Jul 2023 8:41 UTC
7 points
5 comments · 3 min read · LW link

Control vs Selection: Civilisation is best at control, but navigating AGI requires selection

VojtaKovarik · 30 Jan 2024 19:06 UTC
7 points
1 comment · 1 min read · LW link

My Alignment “Plan”: Avoid Strong Optimisation and Align Economy

VojtaKovarik · 31 Jan 2024 17:03 UTC
24 points
9 comments · 7 min read · LW link

Weak vs Quantitative Extinction-level Goodhart’s Law

21 Feb 2024 17:38 UTC
17 points
1 comment · 2 min read · LW link

Which Model Properties are Necessary for Evaluating an Argument?

21 Feb 2024 17:52 UTC
17 points
2 comments · 7 min read · LW link