A quantilizer is a proposed AI design that aims to reduce the harms from Goodhart’s law and specification gaming by selecting reasonably effective actions from a distribution of human-like actions, rather than maximizing over all actions. It is more of a theoretical tool for exploring ways around these problems than a practical, buildable design.
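
The selection rule described above can be illustrated with a minimal sketch. It assumes a common formulation in which the agent draws candidate actions from a base distribution, keeps the top fraction `q` as ranked by a utility estimate, and picks uniformly among those. The names `quantilize`, `base_sampler`, `utility`, `q`, and `n_samples` are hypothetical and chosen for illustration only, and the finite-sample approximation is an assumption of this sketch rather than part of any particular proposal.

```python
import random


def quantilize(base_sampler, utility, q=0.1, n_samples=1000, rng=None):
    """Return one action drawn uniformly from the top-q fraction of sampled actions.

    base_sampler: hypothetical callable taking an RNG and returning one action
                  from the base (e.g. human-like) distribution.
    utility:      hypothetical callable scoring an action; higher is better.
    q:            fraction of the sampled actions to keep, with 0 < q <= 1.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    rng = rng or random.Random()

    # Approximate the base distribution with a finite sample of actions.
    candidates = [base_sampler(rng) for _ in range(n_samples)]

    # Rank by estimated utility, best first, and keep the top q fraction.
    candidates.sort(key=utility, reverse=True)
    k = max(1, int(q * n_samples))

    # Choose uniformly among the kept actions instead of taking the argmax.
    return rng.choice(candidates[:k])
```

With `q = 1` this sketch simply samples from the base distribution, and as `q` shrinks toward zero it approaches picking the best of the sampled actions, so `q` controls how far the selection departs from human-like behavior.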

Another view of quantilizers: avoiding Goodhart’s Law

Soft optimization makes the value target bigger

Quantilizers maximize expected utility subject to a conservative cost constraint

When to use quantilization

Quantilizers and Generative Models

[Question] Why don’t quantilizers also cut off the upper end of the distribution?

Hedonic Loops and Taming RL

Quantilizer ≡ Optimizer with a Bounded Amount of Output

Stable Pointers to Value III: Recursive Quantilization

AISC team report: Soft-optimization, Bayes and Goodhart

Recursive Quantilizers II

AISC project: SatisfIA – AI that satisfies without overdoing it

How to safely use an optimizer

