Things slow down when Ilya isn’t there to YOLO in the right direction in an otherwise very high-dimensional space.
I often mistakenly behave as if my payoff structure is binary instead of gradual. I think others do too, and this cuts across various areas.
For instance, I might wrap up my day and notice that it’s already 11:30pm, though I’d planned to go to sleep an hour earlier, by 10:30pm. My choice is, do I do a couple of me-things like watch that interesting YouTube video I’d marked as “watch later”, or do I just go to sleep ASAP? I often do the former and then predictably regret it the next day when I’m too tired to function well. I’ve reflected on what’s going on in my mind (with the ultimate goal of changing my behavior) and I think the simplest explanation is that I behave as if the payoff curve, in this case of length of sleep, is binary rather than gradual. Rational decision-making would prescribe that, especially once you’re getting less rest than you need, every additional hour of sleep is worth more rather than less. However, I suspect my instinctive thought process is something like “well, I’ve already missed my sleep target even if I go to sleep ASAP, so might as well watch a couple of videos and enjoy myself a little since my day tomorrow is already shot.”
This is pretty terrible! It’s the opposite of what I should be doing!
Maybe something like this is going on when poor people spend a substantial fraction of their income on the lottery ("I'm already poor and losing an extra $20 won't change that, but if I win I'll stop being poor, so let me try"), or when people who are out of shape choose not to exercise ("I'm already pretty unhealthy and one 30-minute workout won't change that, so why waste my time?"), or when people who have had a setback in their professional career have trouble picking themselves back up ("my story is not going to be picture-perfect anyway, so why bother?").
It would be good to have some kind of mental reframing to help me avoid this predictably regrettable behavior.
What if a major contributor to the weakness of LLMs’ planning abilities is that step-by-step descriptions of how a planning task actually unfolds are the kind of content that isn’t widely available in common text training datasets? Planning is mostly something we do silently, or record in non-public places.
Maybe whoever gets the license to train on Jira data is going to get to crack this first.
Right—successful private companies (like nearly all the hot AI labs) are staying private for far longer (indefinitely?), so this bet will not capture any of the value they create for themselves.
It might also be that AGI is broadly deflationary, in that it will mostly melt moats and, with them, corporate margins (in most cases, except maybe the ones of the first company to roll out AGI).
Daniel Gross’ [AGI Trades](https://dcgross.com/agitrades) (in particular the first question under “Markets”) comes to mind.
It just seems far from certain to me that this bet will benefit from the outcome it’s trying to hedge / capture. Given the possible implications here, I’d urge whoever is considering putting on this kind of bet to get comfortable with that linkage (between real-world outcome and financial outcome) rather than take it for granted.
What gives you confidence that much value will accrue to the equity of the companies in those indices?
It seems like, historically, technological revolutions mostly increased churn and were anti-incumbent in some way. For example (this particular case may be false, but it makes the argument concrete): ORCL has over 150k employees whose jobs might get nuked if AGI can painlessly and securely migrate its clients to OSS instead of expensive enterprise solutions.
If I try to think about what the most incumbent-friendly environment would look like, almost by definition it ought to be one where not much is changing, yet this bet is trying to capture value in the opposite scenario.
(sci-fi take?) If time travel and time loops are possible, would this not be the (general sketch of the) scenario under which they come into existence:
1. A lab identifies some candidate particles that could be sent back in time, builds a detector for them, and starts scanning. Suppose the particle has some binary state: if the particle is +1 (-1), the lab buys (shorts) stock futures and exits after 5 minutes.
2. The trading strategy turns out to be very accurate, and its profits are used to fund the research required to build the time machine.
3. At some point in the future, the R&D and engineering efforts eventually succeed. Once the device is built, the lab starts sending information back in time to tip itself off to future moves in stock futures (the very same particles it originally received). This closes the time loop and guarantees temporal consistency.
Reasons why this might not happen:
time doesn’t work like this, or time travel / loops aren’t possible
civilization doesn’t survive long enough to build the device
the lab can’t commit to using its newfound riches to build the device, breaking the logic and preventing the whole thing from working in the first place
Thanks for these references! I’m a big fan, but for some reason your writing sits in the silly under-exploited part of my 2-by-2 box of “how much I enjoy reading this” and “how much of this do I actually read”, so I’d missed all of your posts on this topic! I caught up with some of it, and it’s far further along than my thinking. On a basic level, it matches my intuitive model of a sparse-ish network of causality which generates a much much denser network of correlation on top of it. I too would have guessed that the error rate on “good” studies would be lower!
Does belief quantization explain (some amount of) polarization?
Suppose people generally do Bayesian updating on beliefs. It seems plausible that most people (unless trained to do otherwise) subconsciously quantize their beliefs—let’s say, for the sake of argument, by rounding to the nearest 1%. In other words, if someone’s posterior on a statement is 75.2%, it will be rounded to 75%.
Consider questions that exhibit group-level polarization (e.g. on climate change, or the morality of abortion, or whatnot) and imagine that there is a series of “facts” that are floating around that someone uninformed doesn’t know about.
If one is exposed to facts in a randomly chosen order, then one will arrive at some reasonable posterior after all facts have been processed—in fact we can use this as a computational definition of what it would be rational to conclude.
However, suppose that you are exposed to the facts that support the in-group position first (e.g. when coming of age in your own tribe) and the ones that contradict it later (e.g. when you leave the nest.) If your in-group is chronologically your first source of intel, this is plausible. In this case, if you update on sufficiently many supportive facts of the in-group stance, and you quantize, you’ll end up with a 100% belief on the in-group stance (or, conversely, a 0% belief on the out-group stance), after which point you will basically be unmoved by any contradictory facts you may later be exposed to (since you’re locked into full and unshakeable conviction by quantization).
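For concreteness, here’s a minimal sketch of this dynamic, with made-up likelihood ratios of 3:1 per “fact” and rounding to the nearest 1% (none of the specific numbers matter):

```python
def update(p, lr, quantize=True):
    """One Bayesian update of P(claim) given a fact with likelihood ratio lr,
    optionally rounding the posterior to the nearest 1%."""
    odds = (p / (1 - p)) * lr
    p = odds / (1 + odds)
    return round(p, 2) if quantize else p

def final_belief(facts, quantize):
    p = 0.5                      # uninformed prior
    for lr in facts:
        if p in (0.0, 1.0):      # quantized into certainty: no evidence moves you anymore
            continue
        p = update(p, lr, quantize)
    return p

pro = [3.0] * 10                 # facts favoring the in-group stance
con = [1 / 3] * 10               # equally strong facts against it

orders = {
    "in-group facts first": pro + con,
    "interleaved":          [lr for pair in zip(pro, con) for lr in pair],
}
for label, facts in orders.items():
    print(f"{label:22s} quantized: {final_belief(facts, True):.2f}   "
          f"exact: {final_belief(facts, False):.2f}")
# Quantized + in-group-first locks at 1.00 and never recovers;
# exact (unquantized) updating ends at 0.50 under either order.
```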
One way to resist this is to refuse to ever be fully convinced of anything. However, this comes at a cost, since it’s cognitively expensive to hold onto very small numbers, and to intuitively update them well.
Causality is rare! The usual statement that “correlation does not imply causation” puts them, I think, on deceptively equal footing. It’s really more like correlation is almost always not causation absent something strong like an RCT or a robust study set-up.
Over the past few years I’d gradually become increasingly skeptical of claims of causality just by updating on empirical observations, but it just struck me that there’s a good first principles reason for this.
For each true cause of some outcome we care to influence, there are many other “measurables” that correlate to the true cause but, by default, have no impact on our outcome of interest. Many of these measures will (weakly) correlate to the outcome though, via their correlation to the true cause. So there’s a one-to-many relationship between the true cause and the non-causal correlates. Therefore, if all you know is that something correlates with a particular outcome, you should have a strong prior against that correlation being causal.
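A toy simulation of that one-to-many picture, with made-up numbers (one true cause, 50 non-causal proxies of it, and an outcome driven only by the true cause):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5000, 50                                   # observations, non-causal correlates

true_cause = rng.normal(size=n)
outcome = true_cause + rng.normal(size=n)         # only the true cause drives the outcome
proxies = true_cause[:, None] + 2 * rng.normal(size=(n, k))   # things correlated with the cause

r_cause = np.corrcoef(true_cause, outcome)[0, 1]
r_proxies = np.array([np.corrcoef(proxies[:, j], outcome)[0, 1] for j in range(k)])

print(f"true cause vs outcome:             r = {r_cause:.2f}")
print(f"median |r|, 50 non-causal proxies: {np.median(np.abs(r_proxies)):.2f}")
print(f"proxies with |r| > 0.2:            {int((np.abs(r_proxies) > 0.2).sum())} of {k}")
# Every proxy correlates with the outcome (here around 0.3, and wildly 'significant'
# at n = 5000), yet intervening on any of them would do nothing to the outcome.
```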
My thinking previously was along the lines of p-hacking: if there are many things you can test, some of them will cross a given significance threshold by chance alone. But I’m claiming something more specific than that: any true cause is bound to be correlated to a bunch of stuff, which will therefore probably correlate with our outcome of interest (though more weakly, and not guaranteed since correlation is not necessarily transitive).
The obvious idea of requiring a plausible hypothesis for the causation helps somewhat here, since it rules out some of the non-causal correlates. But it may still leave many of them untouched, especially the more creative our hypothesis-formation process is! Another heuristic (sensible and obvious, and one that maybe doesn’t even require agreement with the above) is to distrust small-magnitude effects, since the true cause is likely to be more strongly correlated with the outcome of interest than any particular correlate of the true cause is.
Perhaps that can work depending on the circumstances. In the specific case of a toddler, at the risk of not giving him enough credit, I think that type of distinction is too nuanced. I suspect that in practice this will simply make him litigate every particular application of any given rule (since it gives him hope that it might work) which raises the cost of enforcement dramatically. Potentially it might also make him more stressed, as I think there’s something very mentally soothing / non-taxing about bright line rules.
I think with older kids though, it’s obviously a really important learning to understand that the letter of the law and the spirit of the law do not always coincide. There’s a bit of a blackpill that comes with that though, once you understand that people can get away with violating the spirit as long as they comply with the letter, or that complying with the spirit (which you can grok more easily) does not always guarantee compliance with the letter, which puts you at risk of getting in trouble.
Pretending not to see when a rule you’ve set is being violated can be optimal policy in parenting sometimes (and I bet it generalizes).
Example: suppose you have a toddler and a “rule” that food only stays in the kitchen. The motivation is that each time food is brought into the living room there is a small chance of an accident resulting in a permanent stain. There’s a cost to enforcing the rule, as the toddler will put up a fight. Suppose that one night you feel really tired and the cost feels particularly high. If you enforce the rule, it will be much more painful than it’s worth in that moment (meaning, fully discounting future consequences). If you fail to enforce the rule, you undermine your authority, which results in your toddler fighting future enforcement (of this and possibly all other rules!) much harder, as he realizes that the rule is in fact negotiable / flexible.
However, you have a third choice, which is to credibly pretend to not see that he’s doing it. It’s true that this will undermine your perceived competence, as an authority, somewhat. However, it does not undermine the perception that the rule is to be fully enforced if only you noticed the violation. You get to “skip” a particularly costly enforcement, without taking steps back that compromise future enforcement much.
I bet this happens sometimes in classrooms (re: disruptive students) and prisons (re: troublesome prisoners) and regulation (re: companies that operate in legally aggressive ways).
Of course, this stops working and becomes a farce once the pretense is clearly visible. Once your toddler knows that sometimes you pretend not to see things to avoid a fight, the benefit totally goes away. So it must be used judiciously and artfully.
Agreed with your example, and I think that just means that the L2 norm is not a pure implementation of what we mean by “simple”, in that it also induces some other preferences. In other words, it does other work too. Nevertheless, it would point us in the right direction frequently, e.g. it will dislike networks whose parameters perform large offsetting operations, akin to mental frameworks or beliefs that require unnecessary and reducible artifice or intermediate steps.
Worth keeping in mind that “simple” is not clearly defined in the general case (forget about machine learning). I’m sure lots has been written about this idea, including here.
Regularization implements Occam’s Razor for machine learning systems.
When we have multiple hypotheses consistent with the same data (an underdetermined problem), Occam’s Razor says that the “simplest” one is more likely true.
When an overparameterized LLM traverses the subspace of parameters that solve the training set, seeking (say) the smallest L2 norm, it’s also effectively choosing the “simplest” solution from the solution set, where “simple” is defined as lower parameter norm, i.e. more “concisely” expressed.
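A minimal sketch of that “smallest norm in the solution set” idea, with a toy linear model standing in for the LLM (numpy’s lstsq happens to return the minimum-norm solution when the system is rank-deficient):

```python
import numpy as np

# Two redundant features (identical columns), so infinitely many exact fits exist:
# any (w1, w2) with w1 + w2 = 1 reproduces y perfectly.
x = np.linspace(-1, 1, 20)
X = np.column_stack([x, x])
y = x

w_min_norm = np.linalg.lstsq(X, y, rcond=None)[0]   # min-L2-norm member of the solution set
w_contrived = np.array([1000.0, -999.0])            # large offsetting weights, same predictions

for name, w in [("min-norm ", w_min_norm), ("contrived", w_contrived)]:
    fit_error = np.max(np.abs(X @ w - y))
    print(f"{name} w = {np.round(w, 3)}, max fit error = {fit_error:.1e}, "
          f"L2 norm = {np.linalg.norm(w):.2f}")
# Both 'hypotheses' are consistent with the data; the L2 penalty prefers [0.5, 0.5]
# and rejects the one built out of large offsetting operations.
```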
In early 2024 I think it’s worth noting that deep-learning based generative models (presently, LLMs) have the property of generating many plausible hypotheses, not all of which are true. In a sense, they are creative and inaccurate.
An increasingly popular automated problem-solving paradigm seems to be bolting a slow & precise-but-uncreative verifier onto a fast & creative-but-imprecise (deep learning based) idea fountain, a la AlphaGeometry and FunSearch.
Today, in a paper published in Nature, we introduce FunSearch, a method to search for new solutions in mathematics and computer science. FunSearch works by pairing a pre-trained LLM, whose goal is to provide creative solutions in the form of computer code, with an automated “evaluator”, which guards against hallucinations and incorrect ideas. By iterating back-and-forth between these two components, initial solutions “evolve” into new knowledge. The system searches for “functions” written in computer code; hence the name FunSearch.
Perhaps we’re getting close to making the valuable box you hypothesize.
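Not the real thing, of course, but a toy sketch of the pairing: a random guesser stands in for the creative-but-imprecise idea fountain, and an exact check plays the verifier (the Pythagorean-triple “problem” is just a placeholder):

```python
import random

def propose():
    """Stand-in for the fast, creative-but-imprecise generator (the LLM in FunSearch):
    it just guesses candidate triples at random."""
    return tuple(random.randint(1, 50) for _ in range(3))

def verify(candidate):
    """The slow-but-precise verifier: exactly checks the proposed 'solution'.
    Here the toy task is finding a Pythagorean triple."""
    a, b, c = candidate
    return a * a + b * b == c * c

solutions = set()
for _ in range(200_000):                 # iterate: generate, verify, keep what survives
    candidate = propose()
    if verify(candidate):
        solutions.add(tuple(sorted(candidate)))

print(sorted(solutions)[:5])             # e.g. [(3, 4, 5), (5, 12, 13), ...], depending on the draws
```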
Upon reflection, the only way this would work is if verification were easier than deception, so to speak. It’s not obvious that this is the case. Among humans, for instance, it seems very difficult for a more intelligent person to tell, in the general case, whether a less intelligent person is lying or telling the truth (unless the verifier is equipped with more resources and can collect evidence and so on, which is very difficult to do for some topics, such as the verified party’s internal state). So, in the case of humans, deception generally seems easier than verification.
So perhaps the daisy-chain only travels down the intelligence scale, not up.
To be sure, let’s say we’re talking about something like “the entirety of published material” rather than the subset of it that comes from academia. This is meant to very much include the open source community.
Very curious, in what way are most CS experiments not replicable? From what I’ve seen in deep learning, for instance, it’s standard practice to include a working github repo along with the paper (I’m sure you know lots more about this than I do). This is not the case in economics, for instance, just to pick a field I’m familiar with.
I wonder how much of the tremendously rapid progress of computer science in the last decade owes itself to structurally more rapid truth-finding, enabled by:
the virtual nature of the majority of the experiments, making them easily replicable
the proliferation of services like github, making it very easy to replicate others’ experiments
(a combination of the points above) the expectation that one would make one’s experiments easily available for replication by others
There are other reasons to expect rapid progress in CS (compared to, say, electrical engineering) but I wonder how much is explained by this replication dynamic.
It feels like (at least in the West) the majority of our ideation about the future is negative, e.g.
popular video games like Fallout
zombie apocalypse themed tv
shows like Black Mirror (there’s no equivalent White Mirror)
Are we at a historically negative point in the balance of “good vs bad ideation about the future” or is this type of collective pessimistic ideation normal?
If the balance towards pessimism is typical, is the promise of salvation in the afterlife in e.g. Christianity a rare example of a powerful and salient positive ideation about our futures (conditioned on some behavior)?
From personal observation, kids learn text (say, from a children’s book, and from songs) back-to-front. That is, the adult will say all but the last word in the sentence, and the kid will (eventually) learn to chime in to complete the sentence.
This feels correlated to LLMs learning well when tasked with next-token prediction, and those predictions being stronger (less uniform over the vocabulary) when the preceding sequences get longer.
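A rough way to poke at the “stronger with longer prefixes” part, assuming the `transformers` library and the small public GPT-2 checkpoint (the nursery-rhyme prompt is arbitrary):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The itsy bitsy spider climbed up the water spout"
ids = tokenizer(text, return_tensors="pt").input_ids[0]

with torch.no_grad():
    for n in range(1, len(ids)):
        logits = model(ids[:n].unsqueeze(0)).logits[0, -1]   # next-token logits after n tokens
        entropy = torch.distributions.Categorical(logits=logits).entropy().item()
        print(f"{entropy:5.2f} nats of next-token uncertainty after {tokenizer.decode(ids[:n])!r}")
# If the back-to-front intuition tracks, entropy should broadly fall as the prefix
# gets longer (though not monotonically).
```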
I wonder if there’s a connection to having rhyme “live” in the last sound of each line, as opposed to the first.
This raises the question of what it means to want to do something, and who exactly (or which cognitive system) is doing the wanting.
Of course I do want to keep watching YT, but I also recognize there’s a cost to it. So on some level, weighing the pros and cons, I (or at least an earlier version of me) sincerely do want to go to bed by 10:30pm. But, in the moment, the tradeoffs look different from how they appeared from further away, and I make (or, default into) a different decision.
An interesting hypothetical here is whether I’d stay up longer when play time starts at 11:30pm than when play time starts at, say, 10:15pm (if bedtime is 10:30pm). The wanting to play, and the temptation to ignore the cost, might be similar in both scenarios. But this sunk cost / binary outcome fallacy would suggest that I’ll (marginally) blow further past my deadline in the former situation than in the latter.