Thomas Kwa

Karma: 1,483

Doing alignment research with Vivek Hebbar’s team at MIRI.

Failure modes in a shard theory alignment plan

Thomas Kwa, 27 Sep 2022 22:34 UTC
24 points
2 comments, 7 min read

Utility functions and probabilities are entangled

Thomas Kwa, 26 Jul 2022 5:36 UTC
13 points
5 comments, 1 min read

Deriving Conditional Expected Utility from Pareto-Efficient Decisions

Thomas Kwa, 5 May 2022 3:21 UTC
24 points
1 comment, 6 min read

Most problems don’t differ dramatically in tractability (under certain assumptions)

Thomas Kwa, 4 May 2022 0:05 UTC
8 points
0 comments, 3 min read

The case for turning glowfic into Sequences

Thomas Kwa, 27 Apr 2022 6:58 UTC
72 points
24 comments, 5 min read

Mesa-utility functions might not be purely proxy goals

Thomas Kwa, 22 Apr 2022 22:16 UTC
12 points
17 comments, 1 min read

[Question] (When) do high-dimensional spaces have linear paths down to local minima?

Thomas Kwa, 22 Apr 2022 15:35 UTC
12 points
8 comments, 1 min read

How dath ilan coordinates around solving alignment

Thomas Kwa, 13 Apr 2022 4:22 UTC
47 points
37 comments, 5 min read

5 Tips for Good Hearting

Thomas Kwa, 1 Apr 2022 19:47 UTC
25 points
10 comments, 1 min read

Can we simulate human evolution to create a somewhat aligned AGI?

Thomas Kwa, 28 Mar 2022 22:55 UTC
21 points
7 comments, 7 min read

Jetlag, Nausea, and Diarrhea are Largely Optional

Thomas Kwa, 21 Mar 2022 22:40 UTC
83 points
27 comments, 2 min read

The Box Spread Trick: Get rich slightly faster

Thomas Kwa, 1 Sep 2020 21:41 UTC
40 points
44 comments, 6 min read

Thomas Kwa’s Bounty List

Thomas Kwa, 13 Jun 2020 0:03 UTC
12 points
15 comments, 1 min read

[Question] What past highly-upvoted posts are overrated today?

Thomas Kwa, 9 Jun 2020 21:25 UTC
14 points
9 comments, 1 min read

[Question] How to learn from a stronger rationalist in daily life?

Thomas Kwa, 20 May 2020 4:55 UTC
16 points
11 comments, 1 min read

My experience with the “rationalist uncanny valley”

Thomas Kwa, 23 Apr 2020 20:27 UTC
65 points
18 comments, 5 min read

Thomas Kwa’s Shortform

Thomas Kwa, 22 Mar 2020 23:19 UTC
2 points
68 comments, 1 min read