Edit: Cleo Nardo has confirmed that they intended ∏i∈I to mean the cartesian product of sets, the ordinary thing for that symbol to mean in that context. I misunderstood the semantics of what B(x) was intended to represent. I’ve updated my implementation to use the intended cartesian product when calculating the best response function; the rest of this comment is based on my initial (wrong) interpretation of B(x).
I write the original expression, and your expression rewritten using the OP’s notation:
Original: B:x↦∏i∈Iψi(g∘Ui(x))
Yours:
(I’m using the notation that a function applied to a set is the image of that set.)
This is a totally clear and valid rewriting using that notation! My background is in programming and I spent a couple minutes trying to figure out how mathematicians write “apply this function to this set.”
I believe the way that B(x) is being used is to find Nash equilibria, using Cleo’s definition 6.5:
Like before, the ψ-nash equilibria of G is the set of option-profiles x such that x∈B(x).
These are going to be option-profiles where “not deviating” is considered optimal by every player simultaneously. I agree with your conclusion that this leads B(x) to take on values that are either ∅ or {x}. When B(x)=∅, this indicates that x is not a Nash equilibrium. When B(x)={x}, we know that x is a Nash equilibrium.
Edit: Cleo Nardo has confirmed that they intended ∏i∈I to mean the cartesian product of sets, the ordinary thing for that symbol to mean in that context. I misunderstood the semantics of what B(x) was intended to represent. I’ve updated my implementation to use the intended cartesian product when calculating the best response function; the rest of this comment is my initial (wrong) interpretation of B(x).
I needed to go back to one of the papers cited in Part 1 to understand what that ∏i∈I was doing in that expression. I found the answer in A Generalization of Nash’s Theorem with Higher-Order Functionals. I’m going to do my best to paraphrase Hedges’ notation into Cleo’s notation, to avoid confusion.
TLDR: B(x) picks out the set of option-profiles (an element of P(X)) that are simultaneously best-responses by all players to the option-profile x. It does this by considering, for each player, the option-profiles that can result from that player best-responding, then taking the intersection of those sets.
On page 6, Hedges defines the best response correspondence B∈X→P(X)
B(x)=⋂i∈IBi(x)
Where
Bi∈X→P(X)
Hedges builds up the idea of Nash Equilibria using quantifiers rather than optimizers (like max rather than argmax), but I believe the approaches are equivalent. Unpacking B:x↦∏i∈Iψi(g∘Ui(x)) from the inside out:
Ui(x)∈Xi→X
g∘Ui(x)∈Xi→R
That makes g∘Ui(x) a ψi-task. Since ψi∈(Xi→R)→P(Xi), we know that ψi(g∘Ui(x))∈P(Xi).
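To make those types concrete for myself in TypeScript: here’s a minimal sketch of a ψi-task and an optimiser of type (Xi→R)→P(Xi), assuming a finite option set represented as an array. The names Task, Optimiser, and argmax are mine, not from the post.

```typescript
// Minimal sketch, assuming a finite option set Xi represented as an array.
// "Task" and "Optimiser" are my names, mirroring ψi ∈ (Xi → R) → P(Xi).
type Task<A> = (a: A) => number;         // a ψi-task: Xi → R
type Optimiser<A> = (k: Task<A>) => A[]; // ψi: tasks to subsets of Xi

// argmax over a finite option set is the most familiar optimiser.
function argmax<A>(options: A[]): Optimiser<A> {
  return (k) => {
    const best = Math.max(...options.map(k));
    return options.filter((a) => k(a) === best);
  };
}

// Example: the options closest to 2 among {1, 2, 3}.
const psi = argmax([1, 2, 3]);
console.log(psi((a) => -Math.abs(a - 2))); // [2]
```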
This is where I had to go looking through papers. What sort of product takes a set of best-responses from each player, relative to a given option-profile, and returns a set of option-profiles that are simultaneously regarded by each player as a best-response? I thought about just taking the Cartesian product of the sets, but that wouldn’t get us only the mutual best-responses.
Let’s call the way that each player maps option-profiles to best-responses bi∈X→P(Xi). These are exactly the sets we want to take the product of:
bi(x)=ψi(g∘Ui(x))
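As a sanity check, bi can be written out directly in TypeScript. This is a toy sketch: the payoff function g (every player gets minus the sum of all options), the option set {0, 1, 2}, and argmax as ψi are all made-up stand-ins, and I’m treating g as returning one payoff per player and projecting out player i’s component.

```typescript
// Toy sketch of bi(x) = ψi(g ∘ Ui(x)). Everything concrete here
// (the payoff g, the option set, argmax as ψi) is an assumption.
type Profile = readonly number[];

// Made-up payoff: every player receives minus the sum of all options.
function g(x: Profile): number[] {
  const total = x.reduce((acc, v) => acc + v, 0);
  return x.map(() => -total);
}

// Ui(x)(α) = x(i ↦ α): vary player i's option, keep the rest.
function deviate(x: Profile, i: number, alpha: number): Profile {
  return x.map((xj, j) => (j === i ? alpha : xj));
}

// ψi as argmax over the finite option set Xi = {0, 1, 2}.
const Xi = [0, 1, 2];
function psi(k: (a: number) => number): number[] {
  const best = Math.max(...Xi.map(k));
  return Xi.filter((a) => k(a) === best);
}

// bi(x) = ψi(g ∘ Ui(x)): player i's best unilateral deviations from x.
function bi(i: number, x: Profile): number[] {
  return psi((alpha) => g(deviate(x, i, alpha))[i]);
}

console.log(bi(0, [2, 1])); // [0] — shrinking the sum maximises -(sum)
```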
Hedges introduces notation on page 3 to handle the operation of taking an option-profile, varying one player’s option, and leaving the rest the same. Paraphrasing, Hedges defines x(i↦α)∈∏j∈IXj by
x(i↦α)j = α if i=j, and xj otherwise
You can read x(i↦α) as “give me a new copy of x, where the ith entry has been set to the value α.” Hedges uses this to define the deviation maps equivalently to the way Cleo did:
Ui:X→(Xi→X)
Ui(x)(α)=x(i↦α)
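The deviation map is nearly a one-liner in TypeScript. A sketch, with option-profiles encoded as readonly arrays indexed by player (the names deviate, U, and Profile are mine):

```typescript
// Option-profiles as readonly arrays, one entry per player (my encoding).
type Profile<A> = readonly A[];

// x(i ↦ α): a copy of x with the ith entry replaced by α.
function deviate<A>(x: Profile<A>, i: number, alpha: A): Profile<A> {
  return x.map((xj, j) => (j === i ? alpha : xj));
}

// The curried deviation map Ui(x)(α) = x(i ↦ α).
const U =
  (i: number) =>
  <A>(x: Profile<A>) =>
  (alpha: A): Profile<A> =>
    deviate(x, i, alpha);

console.log(U(1)([0, 1, 2])(9)); // [0, 9, 2]
```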
The correspondences Bi∈X→P(X) take as input an option-profile and return the set of option-profiles which are player i’s optimal unilateral deviations from that option-profile. To construct Bi from bi, we want to map bi(x)∈P(Xi) to the option-profiles which deviate from x in exactly those ways.
Bi(x)={x(i↦α):α∈bi(x)}
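In code, Bi is just the deviation mapped over bi(x). Another hedged sketch; bi here is an arbitrary stub so the shape is visible:

```typescript
type Profile = readonly number[];

// x(i ↦ α), as before (profiles encoded as readonly arrays).
function deviate(x: Profile, i: number, alpha: number): Profile {
  return x.map((xj, j) => (j === i ? alpha : xj));
}

// Stub bi: pretend player i's best responses are always {0, 1}.
function bi(i: number, x: Profile): number[] {
  return [0, 1];
}

// Bi(x) = { x(i ↦ α) : α ∈ bi(x) }.
function Bi(i: number, x: Profile): Profile[] {
  return bi(i, x).map((alpha) => deviate(x, i, alpha));
}

console.log(Bi(0, [5, 5])); // [[0, 5], [1, 5]]
```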
We can then use Hedges’ B(x)=⋂i∈IBi(x) to get the best-response correspondence! We can unpack this to get a definition of B using objects that Cleo defined, using that deviation notation from Hedges:
B(x)=⋂i∈I{x(i↦α):α∈ψi(g∘Ui(x))}
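Putting the pieces together, here is how that unpacked B looks in TypeScript. This is a sketch under my own encoding (profiles as readonly number arrays, set intersection via JSON value keys, bi supplied as plain functions); the toy example also illustrates the “either ∅ or {x}” behaviour, since two different players’ deviation sets can only overlap at x itself.

```typescript
type Profile = readonly number[];

// x(i ↦ α): copy x, replacing player i's entry with α.
function deviate(x: Profile, i: number, alpha: number): Profile {
  return x.map((xj, j) => (j === i ? alpha : xj));
}

// Intersect sets of profiles by value (JSON keys stand in for set equality).
function intersect(sets: Profile[][]): Profile[] {
  return sets.reduce((acc, s) => {
    const keys = new Set(s.map((p) => JSON.stringify(p)));
    return acc.filter((p) => keys.has(JSON.stringify(p)));
  });
}

// B(x) = ⋂i { x(i ↦ α) : α ∈ bi(x) }, given each player's bi.
function B(x: Profile, bs: Array<(x: Profile) => number[]>): Profile[] {
  return intersect(bs.map((bi, i) => bi(x).map((a) => deviate(x, i, a))));
}

// Toy 2-player example: both players' best response is always option 0.
const bs = [(_x: Profile) => [0], (_x: Profile) => [0]];
console.log(B([0, 0], bs)); // [[0, 0]] — x ∈ B(x): (0, 0) is an equilibrium
console.log(B([1, 0], bs)); // []       — the two deviation sets don't meet
```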
Thank you Cleo for writing this article! This was my first introduction to Higher-Order Game Theory, and I wrote up an implementation in TypeScript to help me understand how all of the pieces fit together!