
VojtaKovarik

Karma: 684

My original background is in mathematics (analysis, topology, Banach spaces) and game theory (imperfect-information games). Nowadays, I do AI alignment research (mostly systemic risks, sometimes pondering “consequentialist reasoning”).

AI Safety Debate and Its Applications

VojtaKovarik · 23 Jul 2019 22:31 UTC
38 points
5 comments · 12 min read · LW link

Deconfuse Yourself about Agency

VojtaKovarik · 23 Aug 2019 0:21 UTC
15 points
9 comments · 5 min read · LW link

Redefining Fast Takeoff

VojtaKovarik · 23 Aug 2019 2:15 UTC
10 points
1 comment · 1 min read · LW link

New paper: (When) is Truth-telling Favored in AI debate?

VojtaKovarik · 26 Dec 2019 19:59 UTC
32 points
7 comments · 5 min read · LW link
(medium.com)

AI Services as a Research Paradigm

VojtaKovarik · 20 Apr 2020 13:00 UTC
30 points
12 comments · 4 min read · LW link
(docs.google.com)

AI Unsafety via Non-Zero-Sum Debate

VojtaKovarik · 3 Jul 2020 22:03 UTC
25 points
10 comments · 5 min read · LW link

AI Problems Shared by Non-AI Systems

VojtaKovarik · 5 Dec 2020 22:15 UTC
7 points
2 comments · 4 min read · LW link

Values Form a Shifting Landscape (and why you might care)

VojtaKovarik · 5 Dec 2020 23:56 UTC
28 points
6 comments · 4 min read · LW link

Risk Map of AI Systems

15 Dec 2020 9:16 UTC
28 points
3 comments · 8 min read · LW link

Formalizing Objections against Surrogate Goals

VojtaKovarik · 2 Sep 2021 16:24 UTC
13 points
23 comments · 1 min read · LW link

[Question] How do you align your emotions through updates and existential uncertainty?

VojtaKovarik · 17 Apr 2023 20:46 UTC
4 points
10 comments · 1 min read · LW link

Legitimising AI Red-Teaming by Public

VojtaKovarik · 19 Apr 2023 14:05 UTC
10 points
7 comments · 3 min read · LW link

OpenAI could help X-risk by wagering itself

VojtaKovarik · 20 Apr 2023 14:51 UTC
31 points
16 comments · 1 min read · LW link

Recursive Middle Manager Hell: AI Edition

VojtaKovarik · 4 May 2023 20:08 UTC
30 points
11 comments · 2 min read · LW link

Fundamentally Fuzzy Concepts Can’t Have Crisp Definitions: Cooperation and Alignment vs Math and Physics

VojtaKovarik · 21 Jul 2023 21:03 UTC
12 points
18 comments · 3 min read · LW link

AI Awareness through Interaction with Blatantly Alien Models

VojtaKovarik · 28 Jul 2023 8:41 UTC
7 points
5 comments · 3 min read · LW link

Control vs Selection: Civilisation is best at control, but navigating AGI requires selection

VojtaKovarik · 30 Jan 2024 19:06 UTC
7 points
1 comment · 1 min read · LW link

My Alignment “Plan”: Avoid Strong Optimisation and Align Economy

VojtaKovarik · 31 Jan 2024 17:03 UTC
24 points
9 comments · 7 min read · LW link

Weak vs Quantitative Extinction-level Goodhart’s Law

21 Feb 2024 17:38 UTC
17 points
1 comment · 2 min read · LW link

Which Model Properties are Necessary for Evaluating an Argument?

21 Feb 2024 17:52 UTC
17 points
2 comments · 7 min read · LW link