I wonder if you’re referring to the “spurious rewards” paper. If so, I wonder if you’re aware of [this critique](https://safe-lip-9a8.notion.site/Incorrect-Baseline-Evaluations-Call-into-Question-Recent-LLM-RL-Claims-2012f1fbf0ee8094ab8ded1953c15a37) of its methodology, which might be enough to void the result.
shawnghu
I think the critique generalizes if it’s a little more focused. If a huge number of papers arose that just demonstrated that EM arose in a bunch of settings that varied superficially without a clear theory of why, this post would be a good critique of that phenomenon.
How do you feel about mutual combat laws in Washington and Texas, where you can fight by agreement (edit: you can’t grievously injure each other, apparently)?
I find it absurd on priors to think that soccer of any demographic could result in more concussions than any of those five full-contact sports, particularly the three where part of the objective is explicitly to hit your opponent in the head very hard if you can. (Even factoring in the fact that you do a bunch of headers in soccer.) (Maybe if you do some trickery like selecting certain subpopulations of the practitioners of these sports, but...)
I don’t disagree in general with the claim that words can be useful for coordinating about natural ideas. The thing that’s missing here is my understanding that there’s a particular natural idea here that isn’t captured by “mech interp lacks good paradigms”.
Is anything which lacks a good+relevant paradigm by default “pre-good-relevant-paradigm”, or is there more subtlety to the idea?
So Parameter Decomposition in theory suggests solutions to the anomalies of Second-Wave Mech Interp. But a theory doesn’t make a paradigm.
Nitpicking a little bit here, I think this is a different use of the word “theory” than the use in the phrase “scientific theory”. One could think you mean the latter in its second usage here, but it seems like you’re making a claim more like “these things could make progress explaining some of these things, if the experiments go well”.
> The requirement that the parameter components sum to the original parameters also means that there can be no ‘missing mechanisms’. At worst, there can only be ‘parameter components which aren’t optimally minimal or simple’.
Echoing a part of Adam Shai’s comment, I don’t see how this is different from the feature-based case. Won’t there be a problem if you extract a bunch of parameter components you “can explain”, and then you’re left with a big one you “can’t explain”, which “isn’t optimally minimal or simple”?
> Another attractive property of Parameter Decomposition is that it identifies Minimum Description Length as the optimization criterion for our explanations of neural networks
Why is this an attractive property? (Serious question.)
What’s the distinction between what you’re pointing at and the statement that mech interp lacks good paradigms? I think the latter statement is true and descriptive, but I presume you want to say something else.
Sorry, yeah, it was badly worded.
Being able to discern what makes someone an expert at X is a skill, Y.
People who are good at X aren’t necessarily good at Y; Y is a separate skill. (- Skill in Y generalizes across different values of X somewhat)
One needs to look for authors that somehow are good at Y; I didn’t specify how you could do this, and maybe there’s not a very good way in general. (But I do like the Caro biographies. But also, maybe I like them for their entertainment value.)
Re: self-help books, I mostly share your position in thinking that ~80% of such books could be a paragraph to a page, ~18% of them could be blog posts of varying length, and only the remaining ~2% have something substantial to say from a pure informational standpoint. (Worse, in many cases, padding the length of a self-help book actively makes it worse/less coherent.) Moreover, I agree that of the good-ish 20%, there is a lot of overlap in the prescriptions given, implied or otherwise. I think that even when a book of this type is done “well”, the purpose of most of the text isn’t for it to be of maximum entropy or something in distinguishing world models, but in giving a bunch of perspectives on a small set of ideas in the hopes that one of them sticks particularly well, or the cumulative exposure makes the idea stick with you better. Spaced repetition or other ritualistic behaviors might achieve the same thing, but require more active agency on your part.
I happen to like The Inner Game of Tennis in particular, and feel that its overlap in useful advice with other books in the genre is relatively low, though I might have a hard time defending my taste explicitly.
I like Viliam’s comment and think that it largely depends on the biography; consider that one internet rule that says that 90% of everything is crap (more specifically, I don’t think that people are by default skilled or diligent in discerning what the “true” factors in developing skill or success are, including experts, and this discernment is in itself a skill that you need to look for). You have to select for biographies that have the characteristics you want, which naturally takes more work to discriminate. More broadly, I don’t think there is any systematic answer to your question of whether, for a given story, the named factors are true. For a lot of life wisdom, unfortunately, at the base level the applicability of various stories has to filter through a vibes/intuition layer because lives are so different and the world changes so fast.
That aside, I think there is another nice benefit to reading a biography rather than just taking away the list of advice, which is that the human brain likes stories and characters, and that makes the given advice much more vivid/salient and therefore likelier to make a difference in your end behavior.
One famous example in the category of biographies is the set written by Robert Caro, for which the author has undoubtedly gone to painstaking lengths to investigate causes extremely thoroughly in a mostly epistemically virtuous way, but he himself would admit that his works have presented information in the framework of a narrative which was assembled by him (he would likely also say that this narrative was “true”). (The alternative is the presentation of a bunch of facts in order, which lack salience without some kind of overarching narrative.)
Finally, I wonder if you really feel that, e.g., The Inner Game of Tennis doesn’t have any substantive information (i.e., is fundamentally just willing the reader into believing in a self-fulfilling prophecy).
I do think that $200-$400 seem like reasonable consulting rates.
I think the situations with family are complicated, because sure, there are social/cultural reasons one might be expected to do those things for family. Usually people hold those cultural norms alongside a stronger distinction between the ingroup (family) and the outgroup (all other people by default), though, so letting your impressions from that culture teach you things about how to behave in a culture with a weaker distinction might be maladaptive.
(I actually was suggesting you try asking for objectively completely unreasonable things just to look at the flinch. For example, you could ask a stranger for $100 for no reason. They would say no, but no harm would be done.)
One frame that might be useful to you is that in a way, it is imperative to at least sufficiently assert your value to others (if not overassert it the socially expected amount). An overly modest estimate is still a miscalibrated one, and people will make suboptimal decisions as a result. (Putting aside the behavior and surpluses given to other people, you are also a player in this game, and your being underallocated resources is globally suboptimal.)
- If this would not obviously make things worse, be more socially connected with people who have expectations of you; not necessarily friends but possibly colleagues or people who simply assume you should be working at times and get feedback about that in a natural way. It’s possible that the prospect of this is anxiety-inducing and would be awful but that it would not actually be very awful.
- Recognize that you don’t need to do most things perfectly or even close to it, and as a corollary, you don’t need to be particularly ready to handle tasks even if they are important. You can handle an email or an urgent letter without priming yourself or being in the right state of mind. The vast majority of things are this way.
- Sit in the start position of your task, as best as you can operationalize that (e.g., navigate to the email and open it, or hit the reply button and sit in front of it), for one minute, without taking your attention off of the task. Progress the amount of time upwards as necessary/possible. (One possible success-mode from doing this is that you get bored of being in this position or you become aware that you’re tired of the thing not being done. (You would hope your general anxiety about the task in day-to-day life would achieve this for you, but it’s not mechanically optimized enough to.) Another possible success-mode is that the immediate feelings you have about doing the task subside.)
- Beta-blockers.
I agree that there are internet conflicts worth participating in, for sure. This site contains a large number of them!
But the original post was mostly about the value of passively reading certain things vs certain other things for entertainment. (In the first paragraph, I separate out “arguments on classic culture war topics” as an example of the sorts of conflicts that are most likely a waste of resources.)
To me it seems that the burden of proof lies on the side that asserts that human minds are able to access all heights of creation. Why should that be? Are we past some specific threshold?
Another way of operationalizing the objections to your argument is: what is the analogue to the event “flips heads”? If the predicate used is “conditional on AI models achieving power level X, what is the probability of Y event?” and the new model is below level X, by construction we have gained 0 bits of information about this.
Obviously this example is a little contrived, but not that contrived, and trying to figure out what fair predicates are to register will result in more objections to your original statement.
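To make the “0 bits” point concrete, here is a minimal sketch, assuming a binary hypothesis and made-up numbers (the prior of 0.3 and the likelihoods are purely illustrative, and `posterior` and `bits_gained` are hypothetical helper names). It measures information gained as the KL divergence from prior to posterior: an observation that is equally likely whether or not the hypothesis holds leaves the posterior equal to the prior, so the gain is zero.

```python
import math

def posterior(prior, lik_if_true, lik_if_false):
    """Bayes update for a binary hypothesis H after one observation."""
    evidence = prior * lik_if_true + (1 - prior) * lik_if_false
    return prior * lik_if_true / evidence

def bits_gained(prior, post):
    """KL divergence in bits from the prior to the posterior over {H, not H}."""
    total = 0.0
    for p, q in ((post, prior), (1 - post, 1 - prior)):
        if p > 0:
            total += p * math.log2(p / q)
    return total

prior = 0.3  # illustrative credence in "Y happens once models reach level X"

# A "flips heads"-style observation: more likely if H is true than if it is false.
informative = posterior(prior, lik_if_true=0.8, lik_if_false=0.4)

# A model below level X: the observation is equally likely either way.
uninformative = posterior(prior, lik_if_true=0.5, lik_if_false=0.5)

print(bits_gained(prior, informative))    # strictly positive
print(bits_gained(prior, uninformative))  # zero, since the posterior equals the prior
```

The only thing doing work here is the likelihood ratio; any observation with a ratio of 1 behaves like the second case, whatever the prior is.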
I would be interested, in a week or two, for your assessment of why it did or didn’t work. Projecting from myself, I would not expect to notice the post-it pretty much at all, ever, but this might work for mysterious second-order reasons anyway.
although i think here is fine, in addition you can try the SSC subreddit.
i know we are not all in a position to do this, but maybe if you don’t focus too solely on uni as being for your career interests, but also as a way of growing, learning about things you intrinsically enjoy, and enjoying yourself, the conflict will dissolve.
doing this is a decent all-purpose strategy for thriving long-term in life under most AI outcomes (other than, you know, being dead). (If AI turns out to be a flop, great. If AI turns out mid-strength and requiring human symbiosis, qualitative expertise and passion will be at a premium. If AI turns out to replace all human economic value, hopefully you learned something about how to authentically enjoy your life.)
I tend to think of myself as immune to rage-baiting/click-baiting/general toxicity from social media and politics. I generally don’t engage in arguments on classic culture war topics on the internet, and I knowingly avoid consuming much news on the grounds that it will make me feel worse without inducing meaningful change.
But I recently realized that the phenomenon has slightly broader implications: presumably in any medium, outrage is just more attractive to the human brain, and conflicts are entertaining, especially ones where you can take a side or criticize both sides.
This made me realize that this issue isn’t constrained to just the forms of media I’m more explicitly cynical about. In particular, some of the content I read about culture, even if it is more nuanced, and even if it reads as following a debate, is still essentially to me about observing conflict for entertainment.
Entertainment for entertainment’s sake is fine, but I doubt if doing this kind of reading is on the Pareto frontier of “enjoy myself” and “inform myself about things in an actionable way”.
On the far end of the spectrum, I am not sure if it’s sensible to attempt to fully avoid engaging in any form of social derision/feeling outrage, even unproductive outrage. It does feel like some forms of outrage arise organically as a natural consequence of valuing things.
Just as a look into a different world, ever since I can remember I have loved eating a lot, have wished that I could do more without consequence, and I have dedicated a lot of mental energy to try to figure out ways of maximizing the amount of eating I get to do at a minimum of effort and penalty, and conversely, eating less than I want to involves a considerable amount of dissatisfaction and suffering.
If I could magically eat as much as I wanted to without suffering negative consequences to my health (or getting over full), I would basically be eating constantly and it would be one of the main sources of baseline hedonism I have, the same way others listen to music, take baths, etc. Even though it would be expensive.
Projecting my mind onto others, I think this is why a lot of fad diets exist, and there’s a disproportionately large amount of low quality discourse about diet and nutrition. A lot of people want the same thing to be possible and are coping hard to believe that it could be so.
Viewed in this light, the “growing a bigger liver” thing seems like a pretty straightforward idea, although I admit that it sounds pretty grotesque from a certain perspective. (Still, though, by comparison, some people do want to grow more muscle primarily so that they can eat more (something I explicitly did the calculations for), and on the grotesque side, people have taken some pretty nasty drugs or just gotten themselves infected with tapeworms just to stay thin.)
I’m not trying to be derisive; in fact, I relate to you greatly. But it’s by being on the outside that I’m able to levy a few more direct criticisms:
- Were you not paid for the other work that you did, leading dev teams and getting frontier research done? Those things should be a baseline on the worth of your time.
- If so, have you ever tried to maximize the amount of money you can get other people to acknowledge your time as worth (i.e., get a high salary offer)?
- Separately, do you know the going rate for consultants with approximately your expertise? Or any other reference class you can make up. Consulting can cost an incredible amount of money, and that price can be “fair” in a pretty simple sense if it averts the need to do 10s of hours of labor at high wages. It may be one of the highest leverage activities per unit time that exists as a conventional economic activity that a person can simply do.
- Aside from market rates or whatever, I suggest you just try asking for unreasonable things, or more money than you feel you’re worth (think of it as an experiment, and maybe observe what happens in your mind when you flinch from this).
- Do you have any emotional hangup about the prospect of trading money for labor generally, or money for anything?
- Separately, do you have a hard time asserting your worth to others (or maybe just strangers) on some baseline level?
Yeah, whenever a result is sensational and comes from a less-than-absolutely-huge name, my prior is that the result is due to mistakes (like 60-95% depending on the degree of surprisingness), and de facto this means I just don’t update on papers like this one any more until significant follow-up work is done.