Occasionally think about topics discussed here. Will post if I have any thoughts worth sharing.
Tomás B.
What would be a good exit plan? If you’ve thought about this, can you share your plan and/or discuss (privately) my specific situation?
+1 for this. Would love to talk to other people seriously considering exit. Maybe we could start a Telegram or something.
I have created this Google Calendar link if anyone wants to quickly set up a reminder: https://calendar.google.com/event?action=TEMPLATE&tmeid=MjM5cnQ3cmUwcW5lNXIxM3Nva2pqYTI4MXUgZGtrcXJnamwyM2R1aG8wcjIxZ3ZudWRuMjhAZw&tmsrc=dkkqrgjl23duho0r21gvnudn28%40group.calendar.google.com
One thing we have to account for is architectural advances, even in a world where Moore’s law is dead: to what extent memory bandwidth constrains model size, and so on. You could rephrase this as asking how much of an “architecture overhang” exists. One frame to view this through: in the era of Moore’s law, we sort of banked a lot of parallel architectural advances because we lacked a good use case for them. We now have such a use case. So the question is how much performance is sitting in the bank, waiting to be pulled out in the next 5 years.
I don’t know how seriously to take the AI ASIC people, but they are claiming very large increases in capability, on the order of 100-1000x in the next 10 years; if true, this is a multiplier on top of increased investment. See this response from a panel including big-wigs at NVIDIA, Google, and Cerebras about projected capabilities: https://youtu.be/E__85F_vnmU?t=4016. On top of this, one has to account for algorithmic advancement, too: https://openai.com/blog/ai-and-efficiency/
Another thing to note: though by parameter count the largest modern models are 10,000x smaller than the human brain (if one buys the parameter >= synapse idea, which most don’t, but which is not entirely off the table), their temporal resolution is far higher. So once we get human-sized models, they may be trained almost comically faster than human minds are. On top of an architecture overhang, then, we may have a “temporal resolution overhang” too, where models as powerful as the human brain will almost certainly be trained much faster than humans learn. And on top of this there is an “inference overhang”: because inference is much, much cheaper than training, once you have finished training an economically useful model, you will almost tautologically have a lot of compute with which to exploit it.
Hopefully I am just being paranoid (I am definitely more of a squib than a wizard in these domains), but I am seeing overhangs everywhere!
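To make the overhang arithmetic concrete, here is a minimal back-of-envelope sketch; every number in it is an illustrative assumption I picked for the example, not a figure from the linked sources:

```python
# Back-of-envelope sketch of the stacked "overhangs" above.
# Every constant here is an illustrative assumption, not a sourced figure.

human_synapses = 1e14        # common ballpark for synapses in the human brain
large_model_params = 1e10    # order of magnitude of the largest current models

# Parameter-count gap, if one buys the parameter >= synapse framing
size_gap = human_synapses / large_model_params
print(f"size gap: ~{size_gap:,.0f}x")  # ~10,000x

# Hypothetical multipliers, loosely inspired by the claims linked above
hardware_multiplier = 100     # low end of the ASIC panel's 100-1000x claim
algorithmic_multiplier = 10   # assumed decade of algorithmic efficiency gains

remaining_gap = size_gap / (hardware_multiplier * algorithmic_multiplier)
print(f"gap left after multipliers: ~{remaining_gap:,.0f}x")
```

On these made-up inputs, the stacked multipliers eat most of the nominal 10,000x size gap, which is the sense in which the overhangs compound.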
Your estimates of hardware advancement seem higher than most people’s. I’ve enjoyed your comments on such things and think there should be a high-level, full-length post on them, especially with widely respected posts claiming much longer times until human-level hardware. I would be willing to subsidize such a thing if you are interested: I would pay 500 USD, to you or a charity of your choice, for a post on the potential of ASICs, Moore’s law, how quickly we can overcome the memory bandwidth bottlenecks, and such things. I would also subsidize a post estimating an answer to this question: https://www.lesswrong.com/posts/7htxRA4TkHERiuPYK/parameter-vs-synapse
This is probably not a meta enough comment, but I have been using kettlebells since the pandemic and I think they are the highest ROI form of exercise I have ever tried. I do 5 minutes of kettlebell swings with a 60 pound bell 3 times a day: before work, on my lunch break, and after work. My strength has significantly increased and it feels like a good cardio workout too.
My big problem with exercise is not the discomfort but the monotony. Swings are much more exhausting than most exercises and are also a hybrid of lifting and cardio, making them very efficient.
Just posting in case you did not get my PM. It has my email in it.
I did not write this post. Just thought it was interesting/relevant for LessWrong.
I don’t know too much about it. But I do know it was used extensively by Shell; they credited it with allowing them to respond to the Oil Shock much more quickly than their competitors. They had analyzed the symptoms of a similar scenario (which was considered extremely outlandish at the time of the scenario’s creation) and began to notice eerie similarities between those symptoms and their present reality.
I see it as a sort of social technology that tries to assist an organization (and perhaps an individual) in resisting becoming the proverbial slowly-boiling frog.
As to evidence of its efficacy, I am only aware of anecdotal evidence. There appears to be an extensive Wikipedia page on the topic but I have not read it—my knowledge comes mostly from hearing Vernor Vinge speak about the technique, as he assisted in scenario-creation for several companies.
Ever since I heard Vinge speak about this, I have occasionally tried to think about the present as if it were a scenario I developed in the past: what sort of scenario would it be, how surprised would my past self be, and so on. Seeing how much The Pile improved GPT-J’s performance on this task triggered such thoughts.
Fair enough. “Silly” is out.
This is my favourite LW post in a long while. Trying to think what the shoot-the-moon strat would be for AI risk, ha.
Regarding your podcast example, I have some thoughts:
Psychometrics is both correct and incredibly unpopular—this means there is possibly an arbitrage here for anyone willing to believe in it.
Very high IQ people are rare and often have hobbies that are considered low-status in the general population. Searching for low-status signals that are predictive of cognitive ability looks to be an efficient means of message targeting.
It is interesting to note that Demis Hassabis’s prodigious ability was obvious to anyone paying attention to board game competitions in the late 90s. It may have been high ROI to sponsor the Mind Sports Olympiad at that time just for a small shot at influencing someone like Demis. There are likely other low-status signals of cognitive ability that will allow us to find diamonds in the rough.
Those who do well in strategic video games, board games, and challenging musical endeavors may be worth targeting. (Heavy metal for example—being very low-status and extremely technical musically—is a good candidate for being underpriced).
With this in mind, one obvious idea for messaging is to run ads. Unfortunately, high-impact people almost certainly have ad-blockers on their phones and computers.
However, the podcast space offers a way around this. Most niche 3rd party apps allow podcasters to advertise their podcasts on the podcast search pages. On the iPhone, at least, these cannot be adblocked trivially.
As the average IQ of a 3rd-party podcast app user is likely slightly higher than that of those who use first-party podcast apps, the audience is plausibly already slightly enriched for high-impact people. By focusing ads on podcast categories that are both cheap and good proxies for listeners’ IQs (especially of the low-status kind mentioned above), one may be able to do even better.
I have been doing this for the AXRP podcast on the Overcast podcast app, and it has worked out to about ~5 dollars per subscriber. I did this without asking the permission of the podcast’s host.
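As a minimal sketch of that arithmetic, here is a toy cost model; only the ~$5/subscriber outcome is from my actual campaign, and the inputs below are hypothetical values chosen to reproduce it:

```python
# Toy cost model for podcast-search ads. Only the ~$5/subscriber
# outcome is from my actual Overcast campaign; the inputs below
# are made-up assumptions for illustration.

ad_spend = 500.0         # hypothetical total spend, USD
taps = 400               # hypothetical taps on the ad
subscribe_rate = 0.25    # hypothetical fraction of taps that subscribe

subscribers = taps * subscribe_rate
cost_per_subscriber = ad_spend / subscribers
print(f"cost per subscriber: ${cost_per_subscriber:.2f}")  # $5.00
```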
Due to the recurring nature of podcasts and the parasocial relationship listeners develop with their hosts, it is my opinion that their usefulness as a propaganda and inculcation tool is underappreciated at this time. It is very plausible to me that 5 dollars per subscriber may indeed be very cheap for the right podcast.
Directly sponsoring niche podcasts with extremely high-IQ audiences may be even more promising. There are likely mathematics, music theory, games and puzzle podcasts that are small enough to have not attracted conventional advertisers but are enriched enough in intelligent listeners to be a gold mine from this perspective.
I do not think I am a particularly good fit for this project. My only qualification is I am the only person I am aware of who is running such a project. Someone smarter with a better understanding of statistics would plausibly do far better. Perhaps if you have an application by a higher-quality person with a worse idea, you can give them my project. Then I can use my EA budget on something even crazier!
Thanks! Any thoughts on Codex? Do you think insane progress in code generation will continue for at least a few years?
I think it is fine to take notes, and fine to share them with friends. I’d prefer if this was not posted publicly on the web, as the reason he did not want to be recorded is it allowed him to speak more freely.
If possible on this site, perhaps a good compromise would be to make it available to LessWrong members only.
See, it is on the front page of Hacker News now, and all over Reddit. I’m the person who books guests for Joshua’s meetups, and I feel like this is a sort of defection against Altman and future attendees of the meetup. As I said, I think notes are fine and sharing them privately is fine, but publishing on the open web vastly increases the probability of some journalist writing a click-bait story about your paraphrased take on what Altman said.
Actually attending the meetups was a trivial inconvenience that reduced the probability of this occurring. Perhaps the damage is now done, but I really don’t feel right about this.
I take some responsibility for not being explicit about not publishing notes on the web; for whatever reason, this was not a problem last time.
>If you don’t want any notes to be published this post is a good incentive to make that explicit in the future or even retroactively.
I consider this point to be slightly uncivil, but we will be explicit in the future.
I would also like it to be removed.
I think it has mostly been pretty civil. I have nothing against the OP and don’t think he is malicious. Just think the situation is unfortunate. I was not blameless. We should have explicitly mentioned in the emails to not publish notes. And I should have asked OP flat out to remove it in my initial reply, rather than my initial slightly timid attempt.
Most of our meetups are recorded and posted publicly, and obviously we are fine with summarization and notes, but about 1 in 10 guests prefer not to be recorded.
Though my preferred outcome would be you taking the post down without much of a fuss, I understand this is a pretty self-serving preference. I did like the compromise idea of making the post available only to members, but that does not appear to be an existing feature of the site.
Taking it down helps remedy a failure of my own rather than yours, as we clearly should have been more explicit about this.
You posting them initially is perfectly understandable. Though I disagreed with your desire to keep them up after I requested them down, I understand this is a matter of opinion.
“Defection” is a pretty loaded word and I should not have used it.
In general, I think it is really great when people provide public goods like book reviews or highlights (I also think it is really rewarding and I have never regretted doing such things myself), so to the extent this has discouraged you from this path, I would like to point out that this is obviously a weird “scissor case” and similar efforts in the future will certainly be well received.
Is that truly the case? I recall reading Corey Washington, a former linguist (who left the field for neuroscience in frustration with its culture and methods), claim that when he was a linguist the general attitude was that there was no way in hell something like GPT-2 would ever work even close to the degree that it does.
Found it: