Cycling in GANs/self-play?
James Camacho
I think having all of this in mind as you train is actually pretty important. That way, when something doesn’t work, you know where to look:
Am I exploring enough, or stuck always pulling the first lever? (free energy)
Is it biased for some reason? (probably the metric)
Is it stuck not improving? (step or batch size)
Weight-initialization isn’t too helpful to think about yet (other than avoiding explosions at the very beginning of training, and maybe a little for transfer learning), but we’ll probably get hyper neural networks within a few years.
I like this take, especially its precision, though I disagree in a few places.
conductance-corrected Wasserstein metric
This is the wrong metric, but I won’t help you find the right one.
the step-size effective loss potential critical batch size regime
You can lower the step-size and increase the batch-size as you train to keep the perturbation bounded. Like, sure, you could claim an ODE solver doesn’t give you the exact solution, but adaptive methods let you get within any desired tolerance.
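As a rough illustration (a toy sketch in my own notation, not anyone’s actual training code; the helper function and constants are made up), here is the kind of schedule I mean: the step size decays while the batch size grows, so each update’s perturbation stays bounded, just like an adaptive solver tightening its tolerance.

```python
# Toy sketch: `grad_estimate` and all constants are hypothetical placeholders.
# The point is only the schedule: lr ~ 1/sqrt(t) and batch ~ sqrt(t), so the
# size of each stochastic update stays bounded as training proceeds.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=10)  # toy "weights"

def grad_estimate(w, batch):
    # stand-in for a stochastic gradient computed on `batch`
    return w + batch.mean()

base_lr, base_batch = 1e-2, 32
for step in range(1, 1001):
    lr = base_lr / np.sqrt(step)                  # shrink the step size...
    batch_size = int(base_batch * np.sqrt(step))  # ...while growing the batch
    batch = rng.normal(size=batch_size)
    w -= lr * grad_estimate(w, batch)             # bounded perturbation per update
```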
for the weight-initialization distribution
This is another “hyper”parameter to feed into the model. I agree that, at some point, the turtles have to stop, and we can call that the initial weight distribution, though I’d prefer the term ‘interpreter’.
up to solenoidal flux corrections
Hmm… you sure you’re using the right flux? Not all boundaries of boundaries are zero, and GANs (and self-play) probably use a 6-complex.
If you “want to stop smoking” or “want to donate more” but do not, you are either deluding yourself, lacking intelligence, or preferring ignorance. Deluding yourself can make you feel happier about yourself. “I’m the kind of person who wants to help out other people! Just not the kind who actually does [but let’s not think about that].” Arguably, this is what you really prefer: to be happy, whether or not your thoughts are consistent with your behavior. If you are smart enough, and really want to get to the bottom of any inconsistencies you find yourself exhibiting, you will, and will no longer be inconsistent. You’ll either bite the bullet and say you actually do prefer the lung cancer over the shakes, or actually quit smoking.
Are the majority of rationalists deluded or dishonest? Absolutely. As I said in my post, utilitarianism is not well-defined, but most rationalists prefer running with the delusion.
There are also people who genuinely prefer others’ well-being over a marginal increase in theirs—mostly wealthy or ascetic folks—and I think this is the target audience of EA evangelism. However, a lot of people don’t genuinely prefer others’ well-being over a marginal increase in their own (or at least, the margin is pretty small), but these people still end up caught by Singer’s thought experiment, not realizing that the conclusions it leads them to (e.g. that they should donate to GiveWell) are inconsistent with their more fundamental values.
-
The ellipsis is, “genuinely prefer others’ well-being over a marginal increase in their own,” from the previous sentence.
-
They have to be smarter to recognize their actual beliefs and investigate what is consistent with them. They have to be more honest, because there is social pressure to think things like, “oh of course I care about others,” and hide how much or little they care.
-
I think the title is fine. The post mostly reads, “if you want a quantum analogue, here’s the path to take”.
Yeah, that was about the only sentence I read in the paper. I was wondering if you’d seen a theoretical justification (logos) rather than just an ethical appeal (ethos), but didn’t want to comb through the maths myself. By the way, fidelity won’t give the same posterior. I haven’t worked through the maths whatsoever, but I’d still put >95% probability on this claim.
Is there a reason they switched from divergence to fidelity when going quantum? You should want to get the classical Bayes’ rule in the limit as your density matrices become classical, and fidelity definitely doesn’t give you that.
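To spell out the consistency check I mean (my notation, not the paper’s): when the density matrices commute, i.e. are simultaneously diagonal, the update should act on the eigenvalues exactly like Bayes’ rule,

```latex
\rho \;=\; \sum_h p(h)\,\lvert h\rangle\langle h\rvert
\quad\xrightarrow{\ \text{observe } D\ }\quad
\rho' \;=\; \sum_h \frac{p(D \mid h)\,p(h)}{\sum_{h'} p(D \mid h')\,p(h')}\,\lvert h\rangle\langle h\rvert,
```

which is just p(h | D) = p(D | h) p(h) / p(D) applied along the diagonal.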
Please avoid the abbreviation “MWI” until you’ve at least written “many-worlds interpretation” once. I had to do a ctrl+f and go to the eighth occurrence of “MWI” before I could read your post, because all the information I was getting was that this is something like UDASSA, and MWI is some information- or decision-theory term that I don’t know, but need to in order to even make sense of the first paragraph.
Then why is it too difficult for you to write down one of those definitions or theories where your criticism makes any sense?
Words demarcate the boundaries of meanings. You seem to be claiming there is some undefinable quality to the word “truth” that is useful to us, i.e. some unmeaningful meaning. Believe in ephemeral qualities all you like, but don’t criticize me for missing out on some “truths” that are impossible to discover anyway.
Millions of years ago, the world was pretty much zero sum. Animals weren’t great at planning, such as going back for reinforcements or waiting months to take revenge, so fights were brief affairs determined mostly by physical prowess, which wasn’t too hard to predict ahead of time. It was relatively easy to tell when you could get away with bullying a weaker animal for food, instead of hunting for your own.
When humans come along, with tools and plans, there is suddenly much less common knowledge when you get into a fight. What allies does this other human have to call upon? What weapons have they trained in? If they’re running away, are they just weaker, or are they leading you into a trap? If you actually can win the fight, you should take it, but the variance has shot up due to the unknowns, so you need a higher expected chance of winning if you don’t want an unlucky roll to end your life. If you enter fights when you instinctively feel you can win, then you will evolve to lower this instinctual confidence.
You do know truth only means, “consistent with some set of assumptions (axioms)”? What does it mean to look for “true axioms”? That’s why I defer to useful ones.
I would say that I focus my thinking on the universes where I can get sensory input showing that the thinking is useful.
Re: this thread
The “guarantor” of your future is two things:
A belief that logic works.
Taking a Kolmogorov prior and Bayesian updating on new sense data.
Believing logic works has nothing to do with faith—it’s that you cannot do anything useful with the alternative. Then, once you’ve assumed logic and created maths, you just find the simplest explanations that fit with what you see. Will the future always be what you expect? No, but you can make claims with very high confidence, e.g. “in 99.99% of worlds where I receive the sense data I did, the Sun will actually rise tomorrow.”
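To make the “Kolmogorov prior” part concrete (standard Solomonoff-style shorthand, my notation): weight each hypothesis by the length of its shortest description and update on the sense data,

```latex
P(h) \;\propto\; 2^{-K(h)},
\qquad
P(h \mid D) \;=\; \frac{P(D \mid h)\,2^{-K(h)}}{\sum_{h'} P(D \mid h')\,2^{-K(h')}},
```

where K(h) is the Kolmogorov complexity of hypothesis h. The simplest explanations that fit what you see are exactly the hypotheses that retain a large posterior under this update.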
Just because you panic about the unknown does not mean the unknown will actually be a large factor in your reality.
Why do I believe this? Well, I’ve seen this evolution in Risk. Newer players will overattack, using all their troops on the first few turns to take territories and break bonuses. They slowly evolve into turtles, keeping all their troops in one stack blocked by their own territories so they can’t do anything even if they want to, and only ever attacking one territory at a time. This is where most players stop their evolution, because after learning zeroth-order heuristics like, “the world is scary, better be super conservative,” the only way to progress further is to start modelling conflicts more than zero turns ahead.
The purpose of “underdog bias” is nearly the opposite of your best guess. It is because conflicts are too complicated for most people to model, and optional to get into. Even after several million years of evolution making brains smarter, humans still usually fail to see more than zero turns ahead in very simple games like Risk (e.g., if I break his bonus, and he goes right after me… well I can break his bonus now! Let’s do it!). If you can’t accurately model the effects of starting a conflict, but you’re also prone to getting into conflicts you think you can win (thanks evolution), the best hack is to make you believe you won’t win.
Some goods can have freeriders, and some cannot. To prevent freeriders on the roads, you need some form of policing. A toll booth or a military could work. While it’s possible to form different governments for different goods, this can lead to fighting between the police forces. Eventually one wins, gains a monopoly on power, and becomes the “legitimate” government.
As to...
Why not have each person deciding whether they value roads enough to subscribe to a road [tax], or whether they value an educated public enough to contribute to that?
It’s because of freeriders. This is why we have someone else decide how much the roads or public education are helping each person, maybe by putting a tax on gasoline or on land around a school. I think if they overestimate how much value you’re getting out of the roads or schools, you should complain and ask them to change the tax code. For most areas, you’ll get more value than you paid in taxes, so you only have to spend mental energy when it becomes apparent that you’re not.
What if somebody doesn’t feel they get value out of foreign trade? Why should they pay? Similarly, if you own a ship and the Nazis might sink it, then why aren’t you paying to protect it, rather than demanding that everybody pay?
They shouldn’t. This can be solved directly by having tariffs (and for thousands of years, that is what was done). This feels obvious, and my guess is there’s some woke mind virus at work, something like, “taxes are a fungible pool of money that everyone gets an equal say in distributing.” If you don’t already believe that, and you’re trying to be the first person to collect taxes, you’ll collect them for a purpose, and refund any extra money, not find a new purpose for it.
what about the foreign trade value that came out of the goodwill PEPFAR and other USAID programs were creating?
Which is it? Are these countries very poor, where PEPFAR would be a huge percent of their GDP, or are they so rich that the goodwill generated exceeds the charity? Or, is it that they virtue signal to other, richer countries that America is a benevolent dictator, and it’s okay to keep the dollar hegemony? I think that is actually a really good reason to have USAID programs—it slows down other nations’ urgency to compete—but I also believe America’s hegemony has <10 years left. I think it’s still good to commit to goodwill, so that the next powers to be are more likely to be kind in return. That is the public good we’re funding, nothing else. Is it worth $20–40bn/year? Probably.
When it comes to foreign aid, the only consistent stance to have is: charity work, not government work.
Seeing that stuff as pure charity is deeply naive.
Deeply naive with a helping of arrogance. Why would you believe I didn’t consider goodwill, and then just decided it wasn’t worth it to add another few paragraphs going several rebuttals deeper? You’ll also find that I tend to respond to people with the same style of argumentation they employ. Such as, if you flippantly call something inconsistent, I’ll flippantly call it consistent.
I get really worried when people seize this much power this easily, especially in education. The field is rife with people reshaping education for hundreds of thousands or millions of students, in ways they believe will be positive, but that end up being massively detrimental.
The very fact you can have this much of an impact after only a few years and no track record or proof of concept points to the system being seriously unmeritocratic. And people who gain power in unmeritocratic systems are unlikely to do a good job with that power.
Does this mean you, in particular, should drop your work? Well, I don’t know you. I have no reason to trust you, but I also have no reason to trust the person who would replace you. What I would recommend is to find ways to make your system more meritocratic. Perhaps you can get your schools to participate in the AI Olympiad, and have the coaches for the best teams in the state give talks on what went well, and what didn’t. Perhaps you can ask professors at UToronto’s AI department to give a PD session on teaching AI. But, looking at the lineup from the 2024 NOAI conference, it looks like there’s no correlation between what gets platformed and what actually works.