Quantilization

A Quantilizer is a proposed AI design that aims to reduce the harms from Goodhart's law and specification gaming by selecting reasonably effective actions from a distribution of human-like actions, rather than maximizing over actions. It is more a theoretical tool for exploring ways around these problems than a practical, buildable design.
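
The core mechanism is simple to state: a q-quantilizer samples actions from a base distribution (e.g. of human-like behavior), ranks them by a proxy utility, and then picks uniformly at random from the top q fraction rather than taking the single highest-scoring action. Below is a minimal finite-sample sketch in Python; the function names and the toy utility are illustrative assumptions, not part of any particular published design.

```python
import random

def quantilize(base_sample, utility, q=0.1, n=1000):
    """Finite-sample sketch of a q-quantilizer.

    base_sample -- draws one action from a base distribution of
                   human-like actions
    utility     -- proxy utility function we only partially trust
    q           -- fraction of the distribution to keep (0 < q <= 1)
    n           -- number of Monte Carlo samples
    """
    # Draw candidate actions from the base distribution.
    candidates = [base_sample() for _ in range(n)]
    # Rank them by the proxy utility.
    candidates.sort(key=utility, reverse=True)
    # Keep the top q fraction and choose uniformly from it, instead
    # of taking the argmax. This caps how aggressively errors in the
    # proxy utility can be exploited (Goodhart's law / specification
    # gaming), at the cost of some expected proxy utility.
    top = candidates[:max(1, int(q * n))]
    return random.choice(top)

# Toy usage: "human-like" actions cluster around 0, and the
# (misspecified) proxy utility simply rewards larger numbers.
action = quantilize(base_sample=lambda: random.gauss(0.0, 1.0),
                    utility=lambda a: a,
                    q=0.05)
```

The parameter q interpolates between imitating the base distribution (q = 1) and pure maximization (q → 0); intermediate values trade expected proxy utility against robustness to a misspecified utility function.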

How to safely use an optimizer

Simon Fischer · 28 Mar 2024 16:11 UTC
17 points
3 comments · 7 min read · LW link

AISC project: SatisfIA – AI that satisfies without overdoing it

Jobst Heitzig · 11 Nov 2023 18:22 UTC
11 points
0 comments · 1 min read · LW link
(docs.google.com)

Hedonic Loops and Taming RL

beren · 19 Jul 2023 15:12 UTC
20 points
14 comments · 9 min read · LW link

AISC team report: Soft-optimization, Bayes and Goodhart

27 Jun 2023 6:05 UTC
37 points
2 comments · 15 min read · LW link

[Question] Why don’t quantilizers also cut off the upper end of the distribution?

Alex_Altair · 15 May 2023 1:40 UTC
25 points
2 comments · 1 min read · LW link

Soft optimization makes the value target bigger

Jeremy Gillen · 2 Jan 2023 16:06 UTC
117 points
20 comments · 12 min read · LW link

Quantilizers and Generative Models

Adam Jermyn · 18 Jul 2022 16:32 UTC
24 points
5 comments · 4 min read · LW link

Quantilizer ≡ Optimizer with a Bounded Amount of Output

itaibn0 · 16 Nov 2021 1:03 UTC
11 points
4 comments · 2 min read · LW link

Recursive Quantilizers II

abramdemski · 2 Dec 2020 15:26 UTC
30 points
15 comments · 13 min read · LW link

When to use quantilization

RyanCarey · 5 Feb 2019 17:17 UTC
65 points
5 comments · 4 min read · LW link

Stable Pointers to Value III: Recursive Quantilization

abramdemski · 21 Jul 2018 8:06 UTC
20 points
4 comments · 4 min read · LW link

Another view of quantilizers: avoiding Goodhart’s Law

jessicata · 9 Jan 2016 4:02 UTC
26 points
2 comments · 2 min read · LW link

Quantilizers maximize expected utility subject to a conservative cost constraint

jessicata · 28 Sep 2015 2:17 UTC
33 points
3 comments · 5 min read · LW link