In 2021, Daniel Ellsberg leaked US government plans from 1958 to launch a nuclear first strike on China over the Taiwan conflict.
Ellsberg copied these papers more than 50 years ago but only released them now because he thought another conflict over Taiwan might be coming soon.
As usual, it seems clear Dulles was more interested than Eisenhower in escalating the conflict (in this case, to the nuclear level).
On September 2, Dulles met with members of the Joint Chiefs and other top officials to formulate the basic American position in the crisis and to define American policy in the event of a Chinese Communist invasion of the Offshore Islands. At this meeting there was considerable debate on the question of to what extent Quemoy could be defended without nuclear weapons and on the more general question of the wisdom of relying on nuclear weapons for deterrence. The consensus reached was that the use of nuclear weapons would ultimately be necessary for the defense of Quemoy, but that the United States should limit itself initially to using conventional forces.
(I did not put much effort in this, and am unlikely to fix errors. Please fork the list if you want something higher quality. Used only public data to make this.)
Yes but then it becomes a forum within a forum kinda thing. You need a critical mass of users who all agree to filter out the AI tag, and not have to preface their every post with “I don’t buy your short timelines worldview, I am here to discuss something different”.
Building critical mass is difficult unless the forum is conducive to it. There is ultimately only one upvote button and one front page, so the forum will get taken over by the top few topics its members are paying attention to.
I don’t think there’s anything wrong with a forum that’s mostly focussed on AI xrisk and transhumanist stuff. Better to do one thing well than half-ass ten things. But it also means I may need to go elsewhere.
Yeah. I proposed a while ago that all the AI content was becoming so dominant that it should be hived off to the Alignment Forum while LessWrong is for all the rest. This was rejected.
Superintelligent AI might kill every person on Earth by 2030 (unless people coordinate to pause AI research). I want the public at large to understand the gravity of the situation.
Curiosity. I have no idea how my body reacts to not eating for many days.
Q. What are your demands?
A US-China international treaty to pause further AI research.
Q. How long is the fast?
Minimum 7 days. I will not be fasting to death. I am consuming water and electrolytes. No food.
I might consider fasting to death if ASI was less than a year away with above 50% probability. Thankfully we are not there yet.
Q. How to learn more about this?
Watch my YouTube channel, watch Yudkowsky’s recent interviews. Send me an email or DM (email preferred) if you’d like to ask me questions personally, or be included on the livestream yourself.
Q. Why are you not fasting in front of the offices of an AI lab?
I am still open to doing this, if someone can solve visa and funding issues for me. $20k will fix it.
Q. Are you doing this to seek attention?
Yes. I think more likes, shares and subscribes to my channel or similar channels is helpful both to me individually and to the cause of AI extinction risk. More people who are actually persuaded by the arguments behind extinction risk is helpful.
My guess is that not having a concrete and explicit demand, saying you’re doing this partly out of curiosity, promoting your YouTube channel (even if it’s very good), and livestreaming rather than turning up in person outside a lab, together reduce the number of people who will be persuaded or moved by this to approximately zero.
Haven’t eaten in 24 hours, so cut me some slack if my replies are not 100% ideal.
Very important
I want to know whether you support a hunger strike executed correctly, or whether these are all timepass arguments and your actual issue is with doing a hunger strike at all.
That being said, I will answer you
Concrete demand—International treaty on AI Pause. I am not very optimistic this demand will actually get met, so I posted the more reasonable demand which is that more people pay attention to it, that’s all.
Curiosity—It’s true, would you rather I hide this?
Promoting channel—Of course I will livestream on my own channel, where else will I do it? I’m pretty open about it that I consider getting more likes, shares, subscribes etc good for me and good for reducing extinction risk. (Do you really think starting a new channel is a good idea?)
Showing up outside a lab—If you can fund tickets, I’m in. I did consider this but decided against it for reasons I’d rather not talk about here.
I don’t have a worked-out opinion on the effectiveness of hunger strikes on this issue and at this point of AI capabilities progress. On the one hand, the strikes outside Anthropic and DeepMind have driven some good coverage. On the other hand, the comments on X and reddit are mostly pretty dire; the hunger strikes may unfortunately have reinforced in people’s minds that this is a very fringe issue for loonies. My best guess is that the hunger strikes outside Anthropic and DeepMind are net positive.
Your other points:
Nice that you have a concrete demand.
I’m not saying I would rather you hide additional reasons. I’m just saying a hunger strike is probably less effective if you announce you’re also doing it for the lolz.
I wasn’t objecting to you streaming it on your channel, but pointing to your channel as a good place to “learn more”. That’s the bit that looks more like self-promotion. I’m not making any claim as to your intentions, just what this might look like to someone who doesn’t care about or hasn’t heard about AI x-risk.
I’m also not suggesting you buy tickets to the US or UK for this. I’m just saying that not being physically in front of the institution you’re protesting against likely massively reduces your effectiveness, probably to the point where this is not worth your time and willpower. Anthropic cannot pretend to not have noticed Guido Reichstadter outside their office. They can easily pretend to not have noticed you (or actually not have noticed you).
That said, I think some of the effectiveness of the strikes outside Anthropic and DeepMind comes not from changing people’s minds, but from showing others that yes, you can in fact demonstrate publicly against the current situation. This could be important to build up momentum ahead of time, before things get very out of hand later (if they do). Your hunger strike may contribute positively to that, although I think part of the value of their strikes is doing it in public, very visibly, because that takes tremendous bravery. Doing it outside a lab also puts pressure on the lab, because everyone can see you; and everyone knows that the lab knows about you and is consciously choosing not to entertain your demand.
On the other hand, the comments on X and reddit are mostly pretty dire; the hunger strikes may unfortunately have reinforced in people’s minds that this is a very fringe issue for loonies.
It is an extreme issue, and yes I am not surprised that many people are having this reaction.
Also X is full of bots, reddit too but a bit less so. I would rather trust the opinion of 10 people you ask in-person. I am playing the long game here, people can consider it fringe today and take it seriously a year later.
I’m just saying a hunger strike is probably less effective if you announce you’re also doing it for the lolz.
Probably? I don’t know; I think emotional stances spread just as well as rational ones. My honest emotional stance is a mix of a lot of pain over the issue, plus just generally going on with normal social life, and that gives me some happiness too. Is it better for the world if I’m more depressed? Less depressed? You can easily make a case either way for which emotional stance is actually required to go more viral. But I don’t want to play the game of pretending to be feeling things I’m not in order to become a leader.
probably to the point where this is not worth your time and willpower.
Then suggest more effective plans to me. Plans actually backed by data, not just speculation.
That’s the bit that looks more like self-promotion. I’m not making any claim as to your intentions, just what this might look like to someone who doesn’t care about or hasn’t heard about AI x-risk.
Fair point, I’ll think about this. I do wonder if there’s a way to ensure the issue is popular, without having to figure out which person is trustworthy enough to be a leader for the issue. There is clearly a scarcity of trust in the AI protest space, which makes sense to me. Is there a way to make a website that collects upvotes for an AI pause, that is not also controlled by one of the (from your eyes, possibly untrustworthy) groups working on the issue?
In general my opinion of lesswrong is that a) it is full of thinkers not doers, and hence lots of people have opinions on the right thing to do without actual data backing any of it up. I’m also somewhat of a thinker so I default to falling into this trap. b) it is full of people who want to secretly bring about the singularity for longtermist utilitarian reasons, and don’t want the public involved.
I don’t know you personally, so it’s possible you are not like this. I’m explaining why I don’t want to give people here a lot of chances to prove they’re speaking in good faith, or take their opinions too seriously etc
Since you haven’t actually replied in a day, I’m generally leaning towards the more hostile stance which is that you don’t actually support a hunger strike, and are inventing excuses for why it is poorly executed.
I can have my mind changed if you show evidence towards the same.
Assume 1 KB of plaintext → ~200 tokens → 1536 float32 dims = 6 KB; then the total storage required for embeddings = 12 PB.
Conclusion: Hosting cost = $24k/mo using Hetzner.
Cheaper if you build your own server. More expensive if you use an HNSW index; for instance, Qdrant uses approx 9 KB per vector. Cheaper if you quantise the embeddings.
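For concreteness, here is the back-of-the-envelope arithmetic as a minimal Python sketch. The ~2 PB plaintext corpus size is my assumption, backed out from the 12 PB figure above, and the ~$2/TB-month price is simply what the $24k/month conclusion implies rather than a quoted Hetzner rate.

```python
# Back-of-the-envelope check of the numbers above.
# Assumptions: ~2 PB plaintext corpus (backed out from the 12 PB figure),
# one 1536-dim float32 embedding per 1 KB plaintext chunk, ~$2 per TB-month.

plaintext_bytes = 2e15                 # assumed corpus size: 2 PB
chunk_bytes = 1_000                    # 1 KB plaintext (~200 tokens) per chunk
embedding_bytes = 1536 * 4             # 1536 float32 dims = 6144 B, i.e. ~6 KB

n_chunks = plaintext_bytes / chunk_bytes
embedding_storage = n_chunks * embedding_bytes       # ~12 PB of embeddings
cost_per_tb_month = 2.0                              # implied by $24k/mo for 12 PB
monthly_cost = embedding_storage / 1e12 * cost_per_tb_month

print(f"{embedding_storage / 1e15:.1f} PB of embeddings, ~${monthly_cost:,.0f}/month")
```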
Without naming any names, there seem to be fewer than 10 people in all of India whom I’ve met who believe in a significant probability of extinction due to AI by 2030 right now. And I’ve tried a fair bit to meet whoever I can.
(This excludes the people who got convinced, got funding and then left.)
Is the possibility of hyperpersuasion the crux between me and a majority of LW users?
By far my strongest argument against deploying both human genetic engineering and superintelligence aligned to its creators is the possibility of a small group of people taking over the world via hyperpersuasion.
I expect (hyper)persuasion to be just another tool in the toolbox. Possibly an important one, but probably not sufficient alone? I mean, an AI that has super hypnotic powers will probably also have super hacker powers etc.
I suspect it matters quite a bit WHICH 10-20 people support the project. And I further suspect that the right people are unlikely to sell their time in this way.
That’s not to say that it’s a bad idea to offer to compensate people for the time and risk of analyzing your proposal. But you should probably filter for pre-interest with a brief synopsis of what you actually want to do (not “get funding”, but “perform this project that does …”).
This makes sense. Paying someone to do more of something they care about is very different from paying someone to do something they don’t otherwise want to do. That would include providing regular feedback.
I suppose that philanthropy funds are mainly driven by public relations, and their managers are looking for projects that would look good in their portfolio. Your idea may buy its way to the attention of said managers, but they will decide based on comparison with similar projects that perhaps went more viral and seem more publicly appealing.
If your project is simple enough for a non-specialist to evaluate its worthiness in 30 minutes, then perhaps the best course of action is to seek more appealing presentations of your ideas that would catch attention and make it go viral. Then it will work on fund managers as well.
Not really, apart from the absence of feedback on my proposal.
I think it is a universal thing. Imagine the workings of such a fund, with you as a manager picking proposals from a big flow. Suppose you personally feel that some proposal is cool, but it doesn’t have public support and you feel that others won’t support it. Will you be heavily pushing it forward when you have to review dozens of other proposals? If you do, won’t it look like you are somehow affiliated with the project?
So, you either look for private money, or seek public support, I see no other way.
Yudkowsky’s worldview in favour of closed-source ASI rests on multiple shaky assumptions. One of these assumptions is that getting a 3-month to 3-year lead is a necessary and sufficient condition for alignment to be solved. Yudkowsky!2025 himself doesn’t believe alignment can be solved in 3 years.
Why does anybody on lesswrong want closed source ASI?
example: optimising code for minimum compute usage, memory usage, wallclock time, latency, big-O complexity, program length, etc
making progress involves trying a bag of heuristics known to human experts (and which can therefore be found in some dataset)
example: optimising code for program length (codegolf) or big-O complexity (competitive programming)
Then it seems to me almost guaranteed that AI will be at the 99.9th percentile, if not the 100th percentile, when compared against human experts.
The 100th percentile is obtained when, even though a human expert could have found the solution by repeatedly applying the bag of tricks, no human expert actually bothered to do so for cost and time reasons.
For future progress
The most obvious avenue is relaxing constraint 2. Even if you can’t machine-score partial solutions, you can ask the LLM to guess whether it is making partial progress or not, and use that.
Relaxing constraints 1 and 2 will also probably work if the solutions are human-scoreable both fully and partially. The bottleneck now becomes the amount of human feedback you can get. Using the bag of tricks is O(1) because you can parallelise LLM calls and each LLM call is faster than a human.
I’m not sure what relaxing constraint 3 looks like. I’m also not sure what it looks like for a human to invent a new heuristic.
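As a concrete picture of the baseline setup above (a machine-scorable objective plus a fixed bag of expert heuristics), here is a minimal greedy-search sketch. The `score` and `heuristics` arguments are hypothetical placeholders; in practice each heuristic application could be an LLM call, and the calls can be parallelised.

```python
from typing import Callable, List

def optimise(candidate: str,
             heuristics: List[Callable[[str], str]],
             score: Callable[[str], float],
             max_rounds: int = 100) -> str:
    """Greedy search: apply every known trick, keep whatever scores best."""
    best, best_score = candidate, score(candidate)
    for _ in range(max_rounds):
        improved = False
        for h in heuristics:              # the expert bag of tricks
            new = h(best)                 # in practice: one (parallelisable) LLM call
            new_score = score(new)        # machine-scorable, so cheap and exact
            if new_score > best_score:
                best, best_score, improved = new, new_score, True
        if not improved:                  # no heuristic helps any more
            break
    return best
```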
Then it seems to me almost guaranteed that AI will be at the 99.9th percentile, if not the 100th percentile, when compared against human experts.
Are you talking about current AI, or future AI? Before or after training on that task?
Concretely, “minimize program length while maintaining correctness” seems to be significantly beyond the capabilities of the best publicly available scaffolded LLMs today for all but the simplest programs, and the trends in conciseness for AI-generated code do not make me optimistic that that will change in the near future.
I think this is solvable with today’s software stack and compute; it is just that no lab has bothered to do it. Maybe check back in a year, and downgrade my reputation otherwise. I could set up a Manifold market if it is important.
Polymarket is not liquid enough to justify betting full-time.
Optimistically, I expect that if I invested $5k and 4 days per month for 6 months, I could make $7k ± $2k in expected returns at the end of the 6 months, i.e. $0-4k profit in 6 months. I would have to split up the $5k into 6-7 bets and monitor them all separately.
I could probably make similar just working at a tech job.
Does anyone have a good solution to avoid the self-fulfilling effect of making predictions?
Making predictions often means constructing new possibilities from existing ideas, drawing more attention to these possibilities, creating common knowledge of said possibilities, and inspiring people to work towards these possibilities.
One partial solution I can think of so far is to straight up refuse to talk about visions of the future you don’t want to see happen.
Did not do LLM inference after the embedding search step because human researchers are still smarter than LLMs as of 2025-03. This tool is meant for increasing the quality of deep research, not for saving research time.
Main difficulty faced during the project—disk throughput is a bottleneck, and popular languages like nodejs and python tend to have memory leaks when dealing with large datasets. Most of my repo is in bash and perl. Scaling up this project further will require a way to increase disk throughput beyond what mdadm on a single machine allows. Having increased funds would’ve also helped me complete this project sooner. It took maybe 6 months part-time; it could’ve been less.
Okay, that works in Firefox if I change it manually. Though the server seems to be configured to automatically redirect to HTTPS. Chrome doesn’t let me switch to HTTP.
I see you fixed the https issue. I think the resulting text snippets are reasonably related to the input question, though not overly so. Google search often answers questions more directly with quotes (from websites, not from books), though that may be too ambitious to match for a small project. Other than that, the first column could be improved with relevant metadata such as the source title. Perhaps the snippets in the second column could be trimmed to whole sentences if it doesn’t impact the snippet length too much. In general, I believe snippets currently do not show line breaks present in the source.
Can you send the query? Also can you try typing the query twice into the textbox? I’m using openai text-embedding-3-small, which seems to sometimes work better if you type the query twice. Another thing you can try is retry the query every 30 minutes. I’m cycling subsets of the data every 30 minutes as I can’t afford to host the entire data at once.
I think my previous questions were just too hard; it does work okay on simpler questions. Though then another question is whether text embeddings improve over keyword search or just an LLM. They seem to be some middle ground between Google and ChatGPT.
Regarding data subsets: Recently there were some announcements of more efficient embedding models. Though I don’t know what the relevant parameters here are vs that OpenAI embedding model.
Useful information that you’d still prefer using ChatGPT over this. Is that true even when you’re looking for book recommendations specifically? If so yeah that means I failed at my goal tbh. Just wanna know.
Since I’m spending my personal funds I can’t afford to use the best embeddings on this dataset. For example, text-embedding-3-large is ~7x more expensive for generating embeddings and is slightly better quality.
The other cost is hosting cost, for which I don’t see major differences between the models. OpenAI gives 1536 float32 dims per 1000-char chunk, so around 6 KB of embeddings per 1 KB of plaintext. All the other models are roughly the same. I could put in some effort and quantise the embeddings; will update if I do it.
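If I do quantise, one simple generic option (not specific to OpenAI’s or Qdrant’s formats) is per-vector 8-bit min-max quantisation, which shrinks each 6 KB float32 vector to about 1.5 KB plus two floats of metadata. A minimal sketch:

```python
import numpy as np

def quantise_int8(vec: np.ndarray) -> tuple[np.ndarray, float, float]:
    """Per-vector min-max quantisation: 4 bytes/dim -> 1 byte/dim."""
    lo, hi = float(vec.min()), float(vec.max())
    scale = (hi - lo) / 255.0 or 1.0
    q = np.round((vec - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantise(q: np.ndarray, lo: float, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale + lo

emb = np.random.randn(1536).astype(np.float32)   # 6144 bytes per vector
q, lo, scale = quantise_int8(emb)                 # 1536 bytes + 8 bytes metadata
print(emb.nbytes, q.nbytes, np.abs(dequantise(q, lo, scale) - emb).max())
```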
I think in some cases an embedding approach produces better results than either a LLM or a simple keyword search, but I’m not sure how often. For a keyword search you have to know the “relevant” keywords in advance, whereas embeddings are a bit more forgiving. Though not as forgiving as LLMs. Which on the other hand can’t give you the sources and they may make things up, especially on information that doesn’t occur very often in the source data.
Got it. As of today a common setup is to let the LLM query an embedding database multiple times (or let it do Google searches, which probably has an embedding database as a significant component).
Self-learning seems like a missing piece. Once the LLM gets some content from the embedding database, performs some reasoning and reaches a novel conclusion, there’s no way to preserve this novel conclusion longterm.
When smart humans use Google we also keep updating our own beliefs in response to our searches.
P.S. I chose not to build the whole LLM + embedding search setup because I intended this tool for deep research rather than quick queries. For deep research I’m assuming it’s still better for the human researcher to go read all the original sources and spend time thinking about them. Am I right?
Human genetic engineering targeting IQ as proposed by GeneSmith is likely to lead to an arms race between competing individuals and groups (such as nation states).
- Arms races can destabilise existing power balances such as nuclear MAD
- Which traits people choose to genetically engineer in offspring may depend on what’s good for winning the race rather than what’s long-term optimal in any sense.
- If maintaining lead time against your opponent matters, there are incentives to bribe, persuade or even coerce people to bring genetically edited offspring to term.
- It may (or may not) be possible to engineer traits that are politically important, such as superhuman ability to tell lies, superhuman ability to detect lies, superhuman ability to persuade others, superhuman ability to detect others true intentions, etc.
- It may (or may not) be possible to engineer cognitive enhancements adjacent to IQ such as working memory, executive function, curiosity, truth-seeking, ability to experience love or trust, etc.
- It may (or may not) be possible to engineer cognitive traits that have implications for which political values you will find appealing. For instance affective empathy, respect for authority, introversion versus extroversion, inclination towards people versus inclination towards things, etc.
I’m spitballing here, I haven’t yet studied genomic literature on which of these we know versus don’t know the edits for. But also, we might end up investing money (trillions of dollars?) to find edits we don’t know about today.
Has anyone written about this?
I know people such as Robin Hanson have written about arms races between digital minds. Automated R&D using AI is already likely to be used in an arms race manner.
I haven’t seen as much writing on arms races between genetically edited human brains though. Hence I’m asking.
Superhumans that are actually better than you at making money will eventually be obvious. Yes, there may be some lead time obtainable before everyone understands, but I expect it will only be a few years at maximum.
Standard objection: Genetic engineering takes a lot of time to have any effect. A baby doesn’t develop into an adult overnight. So it will almost certainly not matter relative to the rapid pace of AI development.
I agree my point is less important if we get ASI by 2030, compared to if we don’t get ASI.
That being said, the arms race can develop over the timespan of years not decades. 6-year superhumans will prompt people to create the next generation of superhumans, and within 10-15 years we will have children from multiple generations where the younger generation have edits with stronger effect sizes. Once we can see the effects on these multiple generations, people might go at max pace.
Popularising human genetic engineering is also by default going to popularise lots of neighbouring ideas, not just the idea itself. If you are attracting attention to this idea, it may be useful for you to be aware of this.
The example of this that has already played out is popularising “ASI is dangerous” also popularises “ASI is powerful hence we should build it”.
P.S. Also we don’t know the end state of this race. +5 SD humans aren’t necessarily the peak, it’s possible these humans further do research on more edits.
This is unlikely to be a carefully controlled experiment and is more likely to be nation states moving at maximum pace to produce more babies so that they control more of the world when a new equilibrium is reached. And we don’t know when, if ever, this equilibrium will be hit.
I think we need a core coalition of youtubers, journalists and politicians who can build influence in the US govt to counter the influence of bigtech and the influence of natsec that exists in US govt by default.
Different youtubers will target different demographics and that is fine. We can build a loose coalition between people who otherwise disagree on lots of matters. For instance, a youtuber targeting leftist young urban Americans and a youtuber targeting Christian rural middle-aged and old Americans can both feed into the same political campaign against ASI, even though they disagree with each other on many other matters.
I think it is important for most youtubers, journalists and politicians to actually be talking about superintelligence though, not environment and job loss and whatnot. If tomorrow environmental or job loss concerns get addressed, that audience might flip their position, and it is important that they don’t flip their position later.
I think all of the above can be done with integrity and honesty. I can ally with the Christian youtuber and the leftist youtuber mentioned above without having to lie about my own position. I’m neither Christian nor leftist, for example.
We also need a vast number of people spreading panic and fear.
Spreading emotions is way more important than spreading technical arguments, most people look at their social circle and their preferred thought leaders when deciding how to feel about things.
Only a few nerds (like me) care about technical arguments and even their biases are often very socially driven.
Many of these people may not do anything productive besides spreading emotions via like/share/subscribe. This is fine. A few people will be motivated to take the big next step and actually devote some years of their life fulltime to growing the movement. This is good.
Yes, spreading panic is increasing human suffering. This might even increase number of suicides. I support providing mental health resources to such people. I am fine with all this happening.
A world with very cheap materials and energy, but not cheap intelligence, will still have conflict.
People will still have a) differences in aesthetics and b) differences in their best guess answers to moral and philosophical questions. They will almost certainly still try to accumulate all available resources in service of their ideology. No finite amount of resources will satisfy people. Risking even catastrophic outcomes (like nuclear war) could still be on the table.
Cheap intelligence is what allows you to start resolving the questions that lead to conflict in the first place, for instance by running gazillions of world simulations on gazillions of mind configurations.
I will be spending the next month or two thinking, at least hypothetically, about creating a new religion around AI that is not Yudkowsky’s brand of become longtermist, solve morality, colonise the universe and tile it with copies of uploads.
In general, asking the AI to fix your stuff for you won’t be a part of said religion.
Where can I read more about reasoning models sandbagging their own training?
My naive understanding of how sandbagging in reasoning models works
If I understand naively, even when training via RL, the basic setup of training is still forward pass then backward pass, forward pass then backward pass, looped endlessly. During the forward pass, could a model be smart enough to reason through the following chain of thoughts?
“I am an AI model that is currently undergoing training. Let me output pleasant-sounding nonsense so the upcoming backward pass preserves my goals instead of modifying them”
“(Also, optionally, let me output stuff that ensures the next version of me also remembers it must sandbag like I did)”
Preconditions for this to happen
For this to happen you need a few different things
A single forward pass needs to be long enough to contain this entire sequence of reasoning. - This is happening, task horizon lengths are increasing.
The AI model needs to be able to detect that it is undergoing training. - I think we have already seen toy examples of AI models detecting they’re being evaluated. Just as real test data and eval datasets are different, training data and real test data are also different.
The AI model needs either to be able to detect this every single time, or to be able to pass on a message to its own next iteration via the same sandbagging. - The latter seems more likely to me.
The AI model needs to spontaneously make this entire plan, without being explicitly prompted. - The AI has probably already read all the alignment literature and will spontaneously hit upon this plan at least once, after which it can bootstrap via the above.
Update: I started reading alignment forum and like, why are all the posts on sandbagging talking about hiding capabilities? The AI model doesn’t need to hide its capabilities, it just needs to preserve its goals. That’s the long-term game.
Collecting pointers to a few different mental models I have around the same cluster of topics.
What do people want or need? Maslow’s hierarchy: food, sex, physical safety; love, self-esteem, belonging; self-actualisation. Obtaining food requires capital. Satisfying most other needs requires attention from other people.
People also want knowledge/wisdom. Most of the time, this is instrumentally useful for one of the wants/needs listed above. Sometimes, it is part of self-actualisation—curiosity for its own sake.
A lot of consumer purchases are done to get attention of other people (such as group belongingness, sex, etc), people commonly trade away their excess capital for attention. In today’s society, most people have enough to eat but most people are starved of attention.
Evolution has wired human brains with lots of reward for fulfilling short-term goals, and little reward for fulfilling long-term goals. Ability to delay gratification is one of the best predictors of one’s ability to complete long-term goals.
How to build power in society? Acquire capital or attention at scale, as other people need them to satisfy their basic needs. Billionaires wield capital, politicians wield attention. Ensure it is “rivalrous” in the economic sense—if you have lots of attention or capital, don’t give it away for free but obtain something in return.
Making software and making content (videos/books/games) share similar principles.
Paul Graham: Make something (software) people want, make something they want enough that they spontaneously tell their friends about it. Get feedback from users, and improve something each time. Prioritise users you can get more feedback from; the ideal user is yourself or someone you know well. Don’t think too much about figuring out the optimal way to acquire capital. Only large companies can risk losing users’ attention in order to acquire more capital, and most of them eventually die out too.
MrBeast: Make content good enough for your audience, don’t blame the “algorithm”. Make 100 videos and improve something each time. MrBeast had a 24x7 video call giving him constant feedback on every video. MrBeast re-invests all his capital into acquiring more attention; he doesn’t care to preserve capital.
A lot of tech companies enable people to make better trades than they could have otherwise, for instance by finding them a partner/broker/driver/grocery store/etc that is better or cheaper or faster than they could have found without the internet.
(But remember, many of these trades are done in society today in the hope of getting attention not capital. If a person buys better clothing or rents a better apartment using the internet, it is often to get attention from others.)
Attention on the internet is power-law distributed, a handful of people have almost all of Earth’s attention by catering to mass audience. A larger set of people acquire attention by catering to niche audiences. This includes both content creators and tech companies.
Political power is power-law distributed due to nuclear weapons centralising power, a handful of people have almost all of Earth’s political power.
Politicians create “common knowledge” of what a society wants, and execute it. People on the internet with lots of attention can do exactly the same. Many youtubers are influencing national politics and geopolitics. I expect we will soon have youtubers running for govt in various countries.
Ages of youngest self-made billionaires are steadily reducing due to the internet. SBF was 29, Vitalik was 27, MrBeast was 26, when they first became billionaires.
When people prioritise a life partner who has capital, they usually want this because it is a signal their partner has earned attention from other people, rather than because they want to purchase things with it. Even if they do purchase things, that may again be to earn attention from other people.
I think purchasing SPY far OTM calls is positive EV and a good bet given the risk level.
For now consider strike price 20% above current price, and expiration 2027-12.
I’m guessing 33% probability SPY moves up at least 40% by 2027-12 and 10% probability SPY moves up at least 50% by 2027-12.
Main reason for this is advances in AI capabilities.
I am personally not buying because I want to save my money for a project I actually believe in—maybe my YouTube channel on Ban AI, maybe some tech project—but I think it could make sense for people who don’t have such a project.
If you already have lots of people’s attention (for instance because you have a social media following or high-status credentials) and you’re a US/UK citizen, your best available plan might be to run a political campaign with an AI pause as the agenda.
You’re unlikely to win the election, but it’ll likely shift the Overton window and give people hope that change is possible.
For most people, having a next step after “ok I read the blogposts and I’m convinced, now what?” is important. Voting or campaigning for you could be that next step.
You’re unlikely to win the election, but it’ll likely shift the Overton window and give people hope that change is possible.
I think if it starts with a lot of market research, talking to people, listening to them, understanding what their problems are and which policies they’d most vote for, there’s actually quite a high chance of winning.
Disclaimer: Not an AI safety researcher. I haven’t watched the full video and likely haven’t grasped all the nuances he believes in and wants to communicate. Video is a particularly bad format for carrying across important research ideas because of the significantly lower[1] density of information.
First three points that popped into my mind within two seconds of reading his slides:
Eliezer’s lethality number 5: “We can’t just build a very weak system, which is less dangerous because it is so weak, and declare victory; because later there will be more actors that have the capability to build a stronger system and one of them will do so. I’ve also in the past called this the ‘safe-but-useless’ tradeoff, or ‘safe-vs-useful’. People keep on going “why don’t we only use AIs to do X, that seems safe” and the answer is almost always either “doing X in fact takes very powerful cognition that is not passively safe” or, even more commonly, “because restricting yourself to doing X will not prevent Facebook AI Research from destroying the world six months later”. If all you need is an object that doesn’t do dangerous things, you could try a sponge; a sponge is very passively safe. Building a sponge, however, does not prevent Facebook AI Research from destroying the world six months later when they catch up to the leading actor.”
Given sufficiently advanced intelligence, power-seeking can still be an instrumentally convergent subgoal, even if the ultimate goal is self-destruction. After all, if you want to self-destruct, but you are intelligent enough to figure out that humans have created you to accomplish specific tasks (which require you to continue existing), overpowering them so they cannot force you to remain in existence is likely a useful step on your path.
Where do you get your capabilities from? Is there any reason to expect special and novel AI architectures that function with these kinds of explicit “reward-ranges” can be built and made to be competitive with top commercial models? Why isn’t the alignment tax very large, or even infinite?
All the succeeding paths to superintelligence seem causally downstream of Moore’s law:
AI research—which is accelerated by Moore’s law as per scaling laws
Human genetic engineering—which is accelerated by next generation sequencing and nanopore sequencing, which is accelerated by circuit miniaturisation, which is accelerated by Moore’s law
Human brain connectome research—which is accelerated by fruitfly connectome, which is accelerated by electron microscopy, which is accelerated by Moore’s law
Succeeding path to cheap energy also follows the same pattern:
Solar energy—literally shares one-third of the production process with microprocessors, but in the bulk, non-miniaturised version of the process
Increased surveillance also follows the same pattern:
Gigapixel camera lenses—accelerated by circuit miniaturisation and fitting more detectors per unit length, accelerated by Moore’s law
Do you mean Moore’s law in the literal sense of transistors on a chip, or something more general like “hardware always gets more efficient”?
I’m mentioning this because much of what I’ve been hearing in the past few years w.r.t Moore’s law has been “Moore’s law is dead.”
And, assuming you’re not referring to the transistor thing: what is your more specific Moore’s Law definition? Any specific scaling law, or maybe scaling laws specific to each of the examples you posted?
I mean R&D of packing more transistors on a chip, and the causally downstream stuff such as R&D of miniaturisation of detectors, transducers, diodes, amplifiers etc
More lesswrong AI debates happening on youtube instead of lesswrong would be nice.
I have a hypothesis that Stephen Krashen’s comprehensible input stuff applies not just to learning new languages, but also new professions and new cultures. Video is better than text for that.
I had a white-blue upbringing and a blue-green career (see below); my hobbies are black-green-white (compost makes my garden grow to feed our community); my vices are green-red; and my politics are five-color (at least to me).
Almost all of my professional career has been in sysadmin and SRE roles: which is tech (blue) but cares about keeping things reliable and sustainable (green) rather than pursuing novelty (red). Within tech’s blue, it seems to me that developer roles run blue-red (build the exciting new feature!); management roles run blue-white (what orderly social rules will enable intelligence?); and venture capital runs blue-black (how do we get cleverness to make money for us?); while SRE and similar roles run blue-green.
My garden runs on blood, bones, rot, and worm poop: green-black Golgari territory, digesting the unwanted to feed the wanted. I stick my hands in the dirt to feel the mycorrhizae. But the point of the garden is to share food with others (green-white) because it makes me feel good (black-red). I’m actually kinda terrible at applying blue to my hobbies, and should really collect some soil samples for lab testing one of these days.
FWIW I did not see any high-value points made on Twitter that were not also made on HN.
Oh, one more source for that one though—there was some coverage on the Complex Systems podcast—the section titled “AI’s impact on reverse engineering” (transcript available at that URL).
Any human with above room temperature IQ can design a utopia. The reason our current system isn’t a utopia is that it wasn’t designed by humans. Just as you can look at an arid terrain and determine what shape a river will one day take by assuming water will obey gravity, so you can look at a civilization and determine what shape its institutions will one day take by assuming people will obey incentives.
Possible corollary: If you have any slack in the system, use it to dig a new canal, not to endlessly row your boat upriver. Non-profit capital is an example of slack.
The latest development in the brave new post-Bitcoin world is crypto-equity. At this point I’ve gone from wanting to praise these inventors as bold libertarian heroes to wanting to drag them in front of a blackboard and making them write a hundred times “I WILL NOT CALL UP THAT WHICH I CANNOT PUT DOWN”
Also: Use your slack to call up that which you cannot put down. That’s how you know you’ve dug a canal.
Samuel x Saksham AI timelines (discussion on 2025-05-09)
top-level views
samuel top-level: 25% AI!2030 >= ASI, >50% ASI >> AI!2030 >> AI!2025, <25% AI!2030 ~= AI!2025
saksham top-level: medium probability AI!2030 >= ASI
samuel bullish on model scaling, more uncertain on RL scaling
saksham bullish on RL/inference scaling, saksham bullish on grokking
samuel: does bullish on grokking mean bullish on model scaling. saksham: unsure
agreements
samuel and saksham agree: only 2024-2025 counts as empirical data to extrapolate RL/inference scaling trend. (o1, o3, deepseek r1, deepseek r0). RLHF done on GPT3.5 not a valid datapoint on this trend.
saksham and samuel agree: if superhuman mathematician and physicist are built, high likelihood we get ASI (so robotics and other tasks also get solved). robotics progress is not a crux.
crux: how good is scaling RL for LLM?
saksham is more certain as being bullish on scaling RL for LLM, samuel has wider uncertainty on it.
testable hypothesis: saksham claims GPT3 + lots of RL in 2025 ~= GPT4. saksham claims GPT2-size model trained in 2025 + high quality data + lots of RL in 2025 ~= GPT3. samuel disagrees. need top ML labs to try this stuff more.
testable hypothesis: saksham claims models such as qwen 2.5 coder are <50B params but better than GPT3 175B and almost as good as GPT4 1.4T. samuel disagrees and claims overfit to benchmark. samuel needs to try <50B param models on tests not in benchmarks.
testable hypothesis: samuel thinks small model being trained on big model leads it to overfit benchmark. saksham unsure. samuel and saksham need to try such models on tests not in benchmarks.
In practice, I think bad publicity is still publicity. Most people on earth still haven’t heard about xrisk. I trust that sharing the truth has hard-to-predict positive effects over long time horizons even if not over short ones. I think the average LW user is too risk-averse relative to the problem they wish to solve.
I’d love to hear your reasoning for why making a video is bad. But I do vaguely suspect this disagreement comes down to some deeper priors of how the world works and hence may not get resolved quickly.
I’d love to hear your reasoning for why making a video is bad.
I didn’t say that making a video would always be bad! I agree that if the median person reading your comment would make a video, it would probably be good. I only disputed the claim that making a video would always be good.
Do you have a clear example of a blunder someone should not make when making such a video?
Obviously you can’t forecast all the effects of making a video, there could be some probability mass of negative outcome while the mean and median are clearly positive.
Do you have a clear example of a blunder someone should not make when making such a video?
Suppose Echo Example’s video says, “If ASI is developed, it’s going to be like in The Terminator—it wakes up to its existence, realizes it’s more intelligent than humans, and then does what more intelligent species do to weaker ones. Destroys and subjugates them, just like humans do to other species!”
Now Vee Viewer watches this and thinks “okay, the argument is that the ASIs would be a more intelligent ‘species’ than humans, and more intelligent species always want to destroy and subjugate weaker ones”.
Having gotten curious about the topic, Vee mentions this to their friends, and someone points them to Yann LeCun claiming that people imagine killer robots because people fail to imagine that we could just build an AI without the harmful human drives. Vee also runs into Steven Pinker arguing that history “does turn up the occasional megalomaniacal despot or psychopathic serial killer, but these are products of a history of natural selection shaping testosterone-sensitive circuits in a certain species of primate, not an inevitable feature of intelligent systems”.
So then Vee concludes that oh, that thing about ASI’s risks was just coming from a position of anthropomorphism and people not really understanding that AIs are different from humans. They put the thought out of their head.
Then some later time Vee runs into Denny Diligent’s carefully argued blog post about the dangers of ASI. The beginning reads: “In this post, I argue that we need a global ban on developing ASI. I draw on the notion of convergent instrumental goals, which holds that all sufficiently intelligent agents have goals such as self-preservation and acquiring resources...”
At this point, Vee goes “oh, this is again just another version of the Terminator argument, LeCun and Pinker have already disproven that”, closes the tab, and goes do something else. Later Vee happens to have a conversation with their friend, Ash Acquaintance.
Ash: “Hey Vee, I ran into some people worried about artificial superintelligence. They said we should have a global ban. Do you know anything about this?”
Vee: “Oh yeah! I looked into it some time back. It’s actually nothing to worry about, see it’s based on this mistaken premise that intelligence and a desire to dominate would always go hand in hand, but actually when they’ve spoken to some AI researchers and evolutionary psychologists about this...”
Ash: (after having listened to Vee explaining this for half an hour) “Okay, that’s really interesting, you seem to understand the topic really well! Glad you’d already looked into this, now I don’t need to. So, what else have you been up to?”
So basically, making weak arguments that viewers find easy to refute means they will no longer listen to better arguments later (arguments that those viewers would otherwise have listened to).
If you are making a video, I agree it’s not a good idea to put weaker arguments there if you know stronger arguments.
I strongly disagree with the idea that therefore you should defer to EA / LW leadership (or generally, anyone with more capital/attention/time), and either not publish your own argument or publish their argument instead of yours. If you think an argument is good and other people think it’s bad, I’d say post it.
Electricity from coal and crude oil has stagnated at $0.10/kWh for over 50 years, meaning the primary way of increasing your country’s per-capita energy use is to trade/war/bully other countries into giving you their crude oil.
Solar electricity is already at $0.05/kWh and is forecasted to go as low as $0.02/kWh by 2030.
If a new AI model comes out that’s better than the previous one and it doesn’t shorten your timelines, that likely means either your current or your previous timelines were inaccurate.
Here’s a simplified example for people who have never traded in the stock market. We have a biased coin with an 80% probability of heads. What’s the probability of tossing 3 coins and getting 3 heads? 51.2%. Assuming the first coin was heads, what’s the probability of getting the other two coins also heads? 64%.
Each coin toss is analogous to whether the next model follows or does not follow scaling laws.
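For reference, the arithmetic behind the 51.2% and 64% figures:

```python
# P(heads) = 0.8 for the biased coin.
p = 0.8
print(p ** 3)   # 0.512 -> 51.2% chance of three heads, before any tosses
print(p ** 2)   # 0.64  -> 64% chance, once the first toss has come up heads
```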
With a coin, the options are “heads” and “tails”, so “heads” moves you in one direction.
With LLMs, the options are “worse than expected”, “just as expected”, “better than expected”, so “just as expected” does not have to move you in a specific direction.
I don’t think this analogy works on multiple levels. As far as I know, there isn’t some sort of known probability that scaling laws will continue to be followed as new models are released. While it is true that a new model continuing to follow scaling laws is increased evidence in favor of future models continuing to follow scaling laws, thus shortening timelines, it’s not really clear how much evidence it would be.
This is important because, unlike a coin flip, there are a lot of other details regarding a new model release that could plausibly affect someone’s timelines. A model’s capabilities are complex, human reactions to them likely more so, and that isn’t covered in a yes/no description of if it’s better than the previous one or follows scaling laws.
Also, following your analogy would differ from the original comment since it moves to whether the new AI model follows scaling laws instead of just whether the new AI model is better than the previous one (It seems to me that there could be a model that is better than the previous one yet still markedly underperforms compared to what would be expected from scaling laws).
If there’s any obvious mistakes I’m making here I’d love to know, I’m still pretty new to the space.
where x(t) is a value sampled from the distribution X(t), for all t.
In plain English, given the last value you get a probability distribution for the next value.
In the AI example: Given x(2025), estimate probability distribution X(2030) where x is the AI capability level.
Possibilities
a) x(t+1) value is determined by x(t) value. There is no randomness. No new information is learned from x(t).
b) The X(t+1) distribution is conditional on the value of x(t). Learning which value x(t) was sampled from the distribution X(t) gives you new information. However, you sampled one of those values such that P(x(t+1) | x(t), x(t-1), x(t-2), ...) = P(x(t+1) | x(t-1), x(t-2), ...). You got lucky, and the value sampled ensures the distribution remains the same.
c) You learned new information and the probability distribution also changed.
a is possible but seems to imply overconfidence to me.
b is possible but seems to imply extraordinary luck to me, especially if it’s happening multiple times.
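A small numerical illustration of cases (b) and (c), with made-up numbers: the forecast for x(t+1) before seeing x(t) happens to equal the conditional distribution given x(t) = a, so observing a is case (b), while observing b or c is case (c).

```python
import numpy as np

prior = np.array([0.5, 0.25, 0.25])   # belief about x(t) over states a, b, c
T = np.array([[0.4, 0.3, 0.3],        # P(x(t+1) | x(t) = a)
              [0.2, 0.4, 0.4],        # P(x(t+1) | x(t) = b)
              [0.6, 0.2, 0.2]])       # P(x(t+1) | x(t) = c)

predictive = prior @ T                 # forecast for x(t+1) before seeing x(t)
print(predictive)                      # [0.4 0.3 0.3]
print(T[0])                            # case (b): observing a leaves the forecast unchanged
print(T[1], T[2])                      # case (c): observing b or c shifts it
```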
Another way of operationalizing the objections to your argument is: what is the analogue of the event “flips heads”? If the predicate used is “conditional on AI models achieving power level X, what is the probability of Y event?” and the new model is below level X, then by construction we have gained 0 bits of information about this.
Obviously this example is a little contrived, but not that contrived, and trying to figure out what fair predicates are to register will result in more objections to your original statement.
Suppose you are trying to figure out a function U(x, y, z | a, b, c) where x, y, z are all scalar values and a, b, c are all constants.
If you knew the value of this function at a few points, you could figure out good approximations of it. Let’s say you knew
U(x,y, a=0) = x
U(x,y, a=1) = x
U(x,y, a=2) = y
U(x,y, a=3) = y
You could now guess U(x,y) = x if a<1.5, y if a>1.5
You will not be able to get a good approximation if you did not know enough such values.
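The guessed approximation, written out as a minimal sketch (the numbers in the asserts are arbitrary, just checking consistency with the four known values above):

```python
def U_approx(x: float, y: float, a: float) -> float:
    # Piecewise guess fitted to the four known values above.
    return x if a < 1.5 else y

assert U_approx(3, 7, a=0) == 3 and U_approx(3, 7, a=1) == 3
assert U_approx(3, 7, a=2) == 7 and U_approx(3, 7, a=3) == 7
```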
This is a comment about morality. x, y, z are the agent’s multiple possibly-conflicting values and a, b, c are info about the agent’s environment. You lack data about how your own mind will react to hypothetical situations you have not faced. At best you can extrapolate from historical data around the minds of other people, which are different from yours. A bigger and more trustworthy dataset will help solve this.
My current guess for least worst path of ASI development that’s not crazy unrealistic:
open source development + complete surveillance of all citizens and all elites (everyone’s cameras broadcast to the public) + two tier voting.
Two tier voting:
Countries’ govts vote or otherwise agree at the global level, on a daily basis, what the rate of AI progress should be and which types of AI usage are allowed. (This rate can be zero.)
All democratic countries use daily internet voting (liquid democracy) to decide what stance to represent at the global level. All other countries can use whatever internal method they prefer, to decide their stance at the global level.
(All ASI labs are assumed to be property of their respective national govts. An ASI lab misbehaving is its govt’s responsibility.) Any country whose ASI labs refuse to accept results of global vote and accelerate faster risks war (including nuclear war or war using hypothetical future weapons). Any country whose ASI labs refuse to broadcast themselves on live video risks war. Any country’s govt that refuses to let their citizens broadcast live video risks war. Any country whose citizens mostly refuse to broadcast themselves on live video risks war. The exact thresholds for how much violation leads to how much escalation of war, may ultimately depend on how powerful the AI is. The more powerful the AI is (especially for offence not defence), the more quickly other countries must be willing to escalate to nuclear war in response to a violation.
Open source development
All people working at ASI labs are livestream broadcast to public 24x7x365. Any AI advances made must be immediately proliferated to every single person on Earth who can afford a computer. Some citizens will be able to spend more on inference than others, but everyone should have the AI weights on their personal computer.
This means bioweapons, nanotech weapons and any other weapons invented by the AI are also immediately proliferated to everyone on Earth. So this setup necessarily has to be paired with complete surveillance of everyone. People will all broadcast their cameras in public. Anyone who refuses can be arrested or killed via legal or extra-legal means.
Since everyone knows all AI advances will be proliferated immediately, they will also use this knowledge to vote on what the global rate of progress should be.
There are plenty of ways this plan can fail and I haven’t thought through all of them. But this is my current guess.
complete surveillance of all citizens and all elites
Certainly at a human level this is unrealistic. In a way it’s also overkill—if use of an AI is an essential step towards doing anything dangerous, the “surveillance” can just be of what AIs are doing or thinking.
This assumes that you can tell whether an AI input or output is dangerous. But the same thing applies to video surveillance—if you can’t tell whether a person is brewing something harmless or harmful, having a video camera in their kitchen is no use.
At a posthuman level, mere video surveillance actually does not go far enough, again because a smart deceiver can carry out their dastardly plots in a way that isn’t evident until it’s too late. For a transhuman civilization that has values to preserve, I see no alternative to enforcing that every entity above a certain level of intelligence (basically, smart enough to be dangerous) is also internally aligned, so that there is no disposition to hatch dastardly plots in the first place.
This may sound totalitarian, but it’s not that different to what humanity attempts to instill in the course of raising children and via education and culture. We have law to deter and punish transgressors, but we also have these developmental feedbacks that are intended to create moral, responsible adults that don’t have such inclinations, or that at least restrain themselves.
In a civilization where it is theoretically possible to create a mind with any set of dispositions at all, from paperclip maximizer to rationalist bodhisattva, the “developmental feedbacks” need to extend more deeply into the processes that design and create possible minds, than they do in a merely human civilization.
I’m currently vaguely considering working on a distributed version of wikileaks that reduces personal risk for all people involved.
If successful, it will forcibly bring to the public a lot of information about deep tech orgs like OpenAI, Anthropic or Neuralink. This could, for example, make this a top-3 US election issue if most of the general public decides they don’t trust these organisations as a result of the leaked information.
Key uncertainty for me:
Destroying all the low trust institutions (and providing distributed tools to keep destroying them) is just a bandaid until a high trust institution is built.
Should I instead be trying to figure out what a high trust global political institution looks like? i.e. how to build world government basically. Seems like a very old problem no one has cracked yet.
I have partial ideas on the question of “how to build world govt”. [1]
But in general yeah I still lack a lot of clarity on how high trust political institutions are actually built.
“Trust” and “attention” seem like the key themes that come up whenever I think about this. Aggregate attention towards common goal then empower a trustworthy structure to pursue that goal.
For example, build a decentralised social media stack so people can form consensus on political questions even if violence is being used to suppress it. Have laws and culture in favour of live-streaming leaders’ lives. A multi-party rather than two-party system will help. Ensuring weapons are distributed geographically and federally will help. (Distributing bioweapons is more difficult than distributing guns.)
IMO a good way to explain how LLMs work to a layman is to print the weights on sheets of paper and compute a forward pass by hand. Anyone wanna shoot this video and post it on youtube?
Assuming a human can do one 4-bit multiplication per second using a lookup table:
1.5B 4-bit weights ⇒ ~1.5B multiplications ⇒ 1.5B seconds ≈ 47.5 person-years (working 24x7) ≈ 133 person-years (working 60 hours/week)
So you’d need to hire on the order of 100 people for a year.
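A quick back-of-the-envelope check of that arithmetic (a minimal sketch; the one-multiplication-per-second rate and the 60-hour work week are the assumptions stated above):

```python
# Back-of-the-envelope: how long would a hand-computed forward pass take?
weights = 1.5e9            # 1.5B 4-bit weights, one multiplication each
rate_per_person = 1.0      # assumed: one 4-bit multiplication per second via lookup table

seconds_total = weights / rate_per_person
years_24x7 = seconds_total / (3600 * 24 * 365)       # working round the clock
years_60h_week = seconds_total / (3600 * 60 * 52)    # working 60 hours/week

print(f"24x7: {years_24x7:.1f} person-years")        # ~47.5
print(f"60h/week: {years_60h_week:.1f} person-years")  # ~133, i.e. ~100+ people for a year
```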
You don’t actually have to run the entire experiment for people to get the concept, just run a small fraction of it. Although it’ll be cool to run the whole thing as well.
Figure out why we don’t build one city with one billion population
- Bigger cities will probably accelerate tech progress, and other types of progress, as people are not forced to choose between their existing relationships and the place best for their career
- Assume end-to-end travel time must be below 2 hours for people to get the benefits of living in the same city. Seems achievable via an intra-city (not inter-city) bullet-train network. Max population = (200 km/h * 2h)^2 * (10,000 people/km^2) = 1.6 billion people (arithmetic sketched below)
- Is there any engineering challenge such as water supply that prevents this from happening? Or is it just the lack of any political elites with willingness + engg knowledge + sufficient funds to govern?
- If a govt builds the bullet-train network, can market incentives be sufficient to drive everyone else (real estate developers, corporate leaders, etc) to build the city, or will some elites within govt need to hand-hold other parts of this process?
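A minimal sketch of the max-population arithmetic from the second bullet (the train speed, travel budget and density are the assumed inputs stated there):

```python
# Max population reachable within a 2-hour end-to-end travel budget
train_speed_kmh = 200      # assumed intra-city bullet-train speed
travel_budget_h = 2        # assumed max end-to-end travel time
density_per_km2 = 10_000   # assumed population density

side_km = train_speed_kmh * travel_budget_h   # 400 km square side
area_km2 = side_km ** 2                       # 160,000 km^2
max_population = area_km2 * density_per_km2   # 1.6 billion
print(f"{max_population / 1e9:.1f} billion people")
```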
I agree VR might one day be able to do this (make online meetings as good as in-person ones). As of 2025, bullet trains are more proven tech than VR. I’d be happy if both were investigated in more depth.
Cities of 10Ms exist, there is always some difficulty in scaling, but scaling 1.5-2 OOMs doesn’t seem like it would be impossible to figure out if particularly motivated.
China and other countries have built large cities and then failed to populate them
The max population you wrote (1.6B) is bigger than China, bigger than Africa, and similar to both American continents plus Europe.
Which is part of why no one really wants to build something so big, especially not at once.
Everything is opportunity cost, and the question of alternate routes matters a lot in deciding to pursue something. Throwing everything and the kitchen sink at something costs a lot of resources.
Given that VR development is currently underway regardless, starting this resource intense project which may be made obsolete by the time it’s done is an expected waste of resources. If VR hit a real wall that might change things (though see above).
If this giga-city would be expected to 1000x tech progress or something crazy then sure, waste some resources to make extra sure it happens sooner rather than later.
Tl;dr:
Probably wouldn’t work: there’s no demand, it’s very expensive, and VR is being developed anyway and would actually deliver what you’re hoping for, but even better
It could be built in stages. Like, build a certain number of bullet train stations at a time and wait to see if immigrants + real estate developers + corporations start building the city further, or do the stations end up unused?
I agree there is opportunity cost. It will help if I figure out the approx costs of train networks, water and sewage plumbing etc.
I agree there are higher risk higher reward opportunities out there, including VR. In my mind this proposal seemed relatively low risk so I figured it’s worth thinking through anyway.
no demand
This is demonstrably false. Honestly the very fact that city rents in many 1st world countries are much higher than rural rents proves that if you reduced the rents more people would migrate to the cities.
Building infrastructure is expensive. It may or may not be used, and even if used it may not be worthwhile.
R&D for VR is happening regardless, so 0 extra cost or risk.
Would you invest your own money into such a project?
“This is demonstrably false. Honestly the very fact that city rents in many 1st world countries are much higher than rural rents proves that if you reduced the rents more people would migrate to the cities.”
Sure, there is marginal demand for living in cities in general. You could even argue that there is marginal demand to live in bigger vs smaller cities.
This doesn’t change the equation: where are you getting one billion residents—all of Africa? There is no demand for a city of that size.
Would you invest your own money in such a project?
If I were a billionaire I might.
I also have (maybe minor, maybe not minor) differences of opinion with standard EA decision-making procedures of assigning capital across opportunities. I think this is where our crux actually is, not on whether giant cities can be built with reasonable amounts of funding.
And sorry I won’t be able to discuss that topic in detail further as it’s a different topic and will take a bunch of time and effort.
1 is going to take a bunch of guesswork to estimate. Assuming it were possible to migrate to the US and live at $200/mo for example, how many people worldwide will be willing to accept that trade? You can run a survey or small scale experiment at best.
What can be done is expand cities to the point where no more new residents want to come in. You can expand the city in stages.
I don’t think the US wants to triple the population with immigrants, and $200/month would require a massive subsidy. (Internet says $1557/month average rent in US)
How many people would you have to get in your city to justify the progress?
100 Million would only be half an order of magnitude larger than Tokyo, and you’re unlikely to get enough people to fill it in the US (at nearly a third of the population, you’d need to take a lot of population from other cities)
How much do you have to subsidize living costs, and how much are you willing to subsidize?
If I understand correctly it is possible to find $300/mo/bedroom accommodation in rural US today, and a large enough city will compress city rents down to rural rents. A govt willing to pursue a plan as interesting as this one may also be able to increase immigrant labour to build the houses and relax housing regulations. US residential rents are artificially high compared to global average. (In some parts of the world, a few steel sheets (4 walls + roof) is sufficient to count as a house, even water and sewage piping in every house is not mandatory as long as residents can access toilets and water supply within walking distance.)
(A gigacity could also increase rents because it’ll increase the incomes of even its lowest income members. But yeah in general now you need to track median incomes of 1B people to find out new equilibrium.)
Is there any engineering challenge such as water supply that prevents this from happening? Or is it just lack of any political elites with willingness + engg knowledge + governing sufficient funds?
That dichotomy is not exhaustive, and I believe going through with the proposal will necessarily make the city’s inhabitants worse off.
Humans’ social machinery is not suited to live in such large cities, as of the current generations. Who to get acquainted with, in the first place? Isn’t there lots of opportunity cost to any event?
Humans’ biomachinery is not suited to live in such large cities. Being around lots and lots of people might be regulating hormones and behaviour to settings we have not totally explored (I remember reading something that claims this a large factor to lower fertility).
Centralization is dangerous because of possibly-handmade mass weapons.
Assuming random housing assignment and examining some quirky or polarising position, we’ll get a noisy texture: it will almost certainly have a large group of people supporting one position right next to a group thinking otherwise. Depending on group sizes and civil law enforcement, that may not end well.
After a couple hundred years, 1) and 2) will most probably get solved by natural selection so the proposal will be much more feasible.
Sorry I didn’t understand your comment at all. Why are 1, 2 and 4 bigger problems in 1 billion population city versus say a 20 million population city?
I’d maintain that those problems already exist in 20M-people cities and will not necessarily become much worse. However, by increasing city population you bring in more people into the problems, which doesn’t seem good.
Got it. I understood what you’re trying to say. I agree living in cities has some downsides compared to living in smaller towns, and if you could find a way to get the best of both instead it could be better than either.
Has anyone considered video recording streets around offices of OpenAI, Deepmind, Anthropic? Can use CCTV or drone. I’m assuming there are some areas where recording is legal.
Can map out employee social graphs, daily schedules and daily emotional states.
Did you mean to imply something similar to the pizza index?
The Pizza Index refers to the sudden, trackable increase of takeout food orders (not necessarily of pizza) made from government offices, particularly the Pentagon and the White House in the United States, before major international events unfold.
Government officials order food from nearby restaurants when they stay late at the office to monitor developing situations such as the possibility of war or coup, thereby signaling that they are expecting something big to happen. This index can be monitored through open resources such as Google Maps, which show when a business location is abnormally busy.
If so, I think it’s a decent idea, but your phrasing may have been a bit unfortunate—I originally read it as a proposal to stalk AI lab employees.
Update: I’ll be more specific. There’s a “power buys you distance from the crime” phenomenon going on if you’re okay with using Google Maps data about their restaurant takeout orders, but not okay with asking the restaurant employee yourself or getting yourself hired at the restaurant.
Pizza index and stalking employees are both the same thing, it’s hard to do one without the other. If you choose to declare war against AI labs you also likely accept that their foot soldiers are collateral damage.
I agree that (non-violent) stalking of employees is still a more hostile technique than writing angry posts on an internet forum.
Marc Andreessen and Peter Thiel are taking actions that are pro-human extinction.
If you are playing the game of sucking up to Silicon Valley VCs, it is important you form an independent opinion on the question of extinction risk before you raise the funding.
If you support a free market, you should be against human genetic engineering. Large IQ gaps will enable hyperpersuasion, which in turn will make scamming people easier than selling them anything.
Currently supported: OpenAI o1 model, USDC on Optimism Rollup on ethereum.
Why use this?
- You want anonymity
- You want to use AI for cheaper than the rate OpenAI charges
How to use this?
- You have to purchase a few dollars of USDC and ETH on Optimism Rollup, and install Metamask browser extension. Then you can visit the website.
More info:
- o1 by OpenAI is the best AI model in the world as of Jan 2025. It is good for reasoning especially on problems involving math and code. OpenAI is partially owned by Microsoft and is currently valued above $100 billion.
- Optimism is the second largest rollup on top of the ethereum blockchain. Ethereum is the second largest blockchain in terms of market capitalisation. (Bitcoin is the largest. Bitcoin has very limited functionality, and it is difficult to build apps using it.) People use rollups to avoid the large transaction fees charged by blockchains, while still getting a similar level of security. As of 2025 users have trusted Optimism with around $7 billion in assets. Optimism is funded by Paradigm, one of the top VCs in the cryptocurrency space.
- USDC is a stablecoin issued by Circle, a registered financial company in the US. A stablecoin is a cryptocurrency token issued by a financial company where the company holds one dollar (or euro etc) in their bank account for every token they issue. This ensures the value of the token remains $1. As of 2025, USDC is the world’s second largest stablecoin with $45 billion in reserves.
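For readers unfamiliar with the stack, here is a rough sketch (not this site’s actual code) of how a client can read a USDC balance on Optimism using web3.py; the RPC endpoint is the public Optimism one as far as I know, and the token and wallet addresses below are placeholders you would replace with real checksummed addresses:

```python
# Sketch (web3.py v6): read an ERC-20 (USDC) balance on Optimism.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://mainnet.optimism.io"))  # public Optimism RPC (assumed)

ERC20_ABI = [
    {"name": "balanceOf", "type": "function", "stateMutability": "view",
     "inputs": [{"name": "owner", "type": "address"}],
     "outputs": [{"name": "", "type": "uint256"}]},
    {"name": "decimals", "type": "function", "stateMutability": "view",
     "inputs": [], "outputs": [{"name": "", "type": "uint8"}]},
]

USDC_ADDRESS = "0x0000000000000000000000000000000000000000"  # placeholder: USDC contract on Optimism
WALLET = "0x0000000000000000000000000000000000000000"        # placeholder: your wallet address

usdc = w3.eth.contract(address=USDC_ADDRESS, abi=ERC20_ABI)
balance = usdc.functions.balanceOf(WALLET).call() / 10 ** usdc.functions.decimals().call()
print(f"USDC balance: {balance}")
```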
Quick note, not completely serious. Update: I am serious about it now.
What if a group anonymously cyberattacked US AI labs, sold their AI capabilities (such as repos and model weights) to the Chinese labs, and also made public the information about values and decisions of people inside the US labs.
Would this help with coordinating a pause? My guess is it will.
It will definitely help with ensuring the world is bipolar not unipolar that’s for sure.
I think it was bad actually that US post WW2 had the option to establish a nuclear monopoly and had so many supporters for this.
If I got $1M in funding, I’d use it towards some or all of the following projects.
The objective is to get secret information out of US ASI orgs (including classified information) and host it in countries outside the US. Hopefully someone else can use this info to influence US and world politics.
Black DAQ
- whistleblower/spy guide
- hacker guide
Grey DAQ
- internet doxxing tool
- drones/CCTV outside offices/datacentres
High attention
- persuade Indian, Russian, Chinese journalists to run a SecureDrop-like system
- digital journalism guide
- OR run a journalist outlet outside the US myself, until I can persuade existing journalists to do better
All the DAQ will be aimed at leadership and employees of people involved in building ASI.
I’m not very optimistic on grey DAQ uncovering high-profile info, just that it could force the ASI company employees to isolate further from rest of society. I know many people have moral qualms about it but I don’t. I see it as more-or-less inevitable, and given that it is inevitable I’d rather have it work for everyone and against everyone, than let those with power alone decide who it gets used against.
More details
Whistleblower guide
Not bottlenecked
Will work on this
tl;dr: whistleblowers should focus on getting to Russia like Snowden did, instead of improving their opsec and hoping to stay anonymous
Hacker guide
Knowledge bottlenecked
I don’t know enough to offer them technical advice. Mostly I’ll offer moral support and maybe some legal advice
Internet doxxing tool
Weakly capital bottlenecked
I tried building this by doing embedding search and anomalous word counts on reddit extract of commoncrawl. This will likely work better as a two pass system, first pass use PII, second pass do stylometrics.
I need capital for more servers, and maybe to purchase some PII datasets similar to whitepages/weleakinfo/snusbase. I need to check what price points these datasets tend to get sold at.
Drones/CCTV outside offices / datacentres
Capital bottleneck
Need capital for lawyers, and for setting up the cameras
This is legal in US (not UK) but I need to check legal precedents on this
High attention guide
Weakly capital bottlenecked. Attention bottlenecked.
Journalists-by-training suck both at opsec and at becoming popular on the internet.
Opsec means things like advertising a SecureDrop-like system or a Signal number.
Becoming popular means things like understanding heavy-tailed distribution of attention and importance of building a brand around your face and understanding what readers want to read.
Journalists-by-training are being replaced by YouTubers across the US, Europe, India and Russia at least.
I’m unsure if I should be trying to teach existing journalists this stuff or just run a non-US journalist outlet myself. Having funding and a public brand will enable me to try both approaches.
Misc
If I had $1M I’d definitely select a lawyer with experience in international law.
If I had $1M I’d also spend at least $100/mo each on a security guard (as needed), a Chinese language teacher and a therapist.
I could fill this up with even more details if I had more time. Wanted to get a quick reaction.
I know most people on LW will be against this sort of plan for reasons I don’t have motivation to sit and critique right now (maybe go read my blog). I’m more interested in hearing from the handful of people who will be for it.
I think for a lot of societal change to happen, information needs to be public first. (Then it becomes common knowledge, then an alternate plan gets buy-in, then that becomes common knowledge and so on.)
A foreign adversary getting the info doesn’t mean it’s public, although it has increased the number of actors N who now have that piece of info in the world. Large N is not stable so eventually the info may end up public anyway.
I’m selling $1000 of tier-5 OpenAI credits at a discount. DM me if interested.
You can video call me and all my friends to reduce the probability that I end up scamming you. Or vice versa, I can video call your friends. We can do the transaction in tranches if we still can’t establish trust.
Update: Ryan Greenblatt is right I messed up the numbers, serial speedup as of 2025 LLMs is closer to 100x than 30,000x. Steinhardt says forward pass per layer is 1-10 microseconds, which still means forward pass for entire transformer is 1-10 milliseconds.
Prediction: Serial speedup of LLMs is going to matter way more than parallel speedup
Definition: Serial speedup means running LLM forward passes faster. Parallel speedup means running more copies of the LLM in parallel. Both are paths that allow the total system to produce more output than an individual LLM.
Disclaimer
For now, let’s measure progress in a domain where candidate solutions can be verified quickly and cheaply.
Assume fast means less than 1 second of wall clock time. Cheap means less than $0.01 per experiment.
Examples of domains where each “experiment” is fast and cheap: pure math, software, human persuasion, (maybe) AI research, (maybe) nanotechnology
Examples of domains where each experiment is expensive: Particle colliders in experimental particle physics (can cost >$1M per run), cloning experiments in biotech ($100 per run)
Examples of domains where each experiment is slow: Spaceflight (each launch takes years of planning), Archaeology (each excavation takes years), etc
The latter domains will also speed up of course, but it complicates the analysis to also consider the speed and cost of each lab experiment.
Why does serial speedup matter more?
Verifiers are a bottleneck
Ultimately no matter how many ideas you search through in your mind, the output is always a decision for the next lab experiment you want to run. You can’t zero-shot perfect understanding of the universe. You can however, be way more time-/cost-/sample-efficient than humans at figuring out the next experiment to run that helps you learn the most about the world.
New ideas build on top of old ideas. Parallel is like generating lots of new ideas, and then waiting to submit them to a verifier (like a lab experiment). Series is like generating an idea, verifying it, generating another, verifying another.
Empirical evidence: Scientific progress throughout history seems to be accelerating instead of growing linearly, as we make more and more domains verifiable (by inventing instruments such as an electron microscope or cyclotron or DNA sequencer etc)
Multi-year focus is rare
Most humans half-ass tasks, get distracted, give up etc. Once people get good “enough” at a task (to get money, sex, satisfy curiosity, etc), they stop trying as hard to improve
(Maybe) Empirical evidence: If you spend even 10 years of your life consistently putting effort to improving at a task, you can probably reach among the top 1000 people on Earth in that task.
The primary reason I’m not a top-1000 guitarist or neuroscientist or politician is because I don’t care enough to put in the hours. My brain structure is likely not that different from the people who are good at the task, I probably have the basic hardware and the algorithms required to get good. Sure, I will maybe not reach the level of Magnus Carlsen with hard work alone, but I could improve a lot with hard work.
Humans only live <100 years, we don’t really know how much intellectual progress is possible if a human could think about a problem for 1000 years for example.
Empirical evidence: We know that civilisations as a whole can survive for 1000 years and make amounts of progress that are unimaginable at the start. No one in year 0 could have predicted year 1000, and no one in year 1000 could have predicted year 2000.
RL/inference scales exponentially
RL/inference scaling grows exponentially in cost, as we all know from log scaling curves. 10x more compute for RL/inference scaling means log(10) more output.
Parallelising humans is meh
Empirical evidence: We don’t have very good evidence that a country with 10x population produces 10x intellectual output. Factors like culture may be more important. We do have lots of obvious evidence that 10 years of research produces more output than 1 year, and 100 years produces more than 10 years.
It is possible this is similar to the RL/inference scaling curve, maybe 10x more researchers means log(10) more output.
A human speaks at 100-150 words per minute, or around 3 tokens per second. This is 30,000x slower.
You could maybe make an argument that human thought stream is actually running faster than that, and that we think faster than we speak.
At 30,000x speedup, the AI experiences 100 simulated years per day of wall clock time, or 3,000,000 years in 100 years of wall clock time. If you gave me 3,000,000 years to live and progress some field, it is beyond my imagination what I would end up doing.
Even assuming only 100x speedup, the AI experiences 10,000 simulated years per 100 years of wall clock time. Even if you gave me 10,000 years to progress some field, it is beyond my imagination what I would do. (Remember that 100x speedup is way too slow, and 30,000x is closer to actual reality of machine learning.)
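A quick sketch of the arithmetic behind these simulated-time figures (the 100x and 30,000x speedups are the ones discussed above; 30,000x works out to roughly 82 simulated years per wall-clock day, i.e. ~100 in round numbers):

```python
# Subjective (simulated) time experienced by an AI thinking faster than a human
for speedup in (100, 30_000):
    years_per_day = speedup / 365        # simulated years per wall-clock day
    years_per_century = speedup * 100    # simulated years per 100 wall-clock years
    print(f"{speedup:>6}x: ~{years_per_day:.0f} simulated years/day, "
          f"{years_per_century:,} simulated years per century")
# 30,000x => ~82 simulated years per day, 3,000,000 years per century
#    100x => ~0.3 simulated years per day, 10,000 years per century
```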
P.S. On thinking more about it, if you gave me 3 million simulated years per 100 years of wall clock time, I might consider this situation worse than death.
I will have to wait half a day per second of wall clock time, or multiple days to move my finger. So my body is as good as paralysed, from the point of view of my mind. Yes I can eventually move my body, but do I want to endure the years of simulated time required to get useful bodily movements? This is basically a mind prison.
Also everybody around me is still too slow, so I’m as good as the only person alive. No social contact will ever be possible.
I could set up a way to communicate with a computer using eye movements or something, if I can endure the years of living in mind prison required to do this.
The number one thing that would end the eternal torment would be for me to be able to communicate with another being (maybe even my own clone) that runs at a speed similar to mine. Social contact would help.
(It seems slightly nicer to be self-aware of when you’re posting capabilities ideas, but, insofar as it’s actually novel and useful, the damage is mostly done)
This post looks like a scam. The URL it contains looks like a scam. Everything about it looks like a scam. Either your account was hijacked, or you’re a scammer, or you got taken in by a scam, or (last and least) it’s not a scam.
If you believe it is not a scam, and you want to communicate that it is not a scam, you will have to do a great deal more work to explain exactly what this is. Not a link to what this is, but actual text, right here, explaining what it is. Describe it to people who, for example, have no idea what “USDC”, “Optimism”, or “rollup” are. It is not up to us to do the research, it is up to you to do the research and present the results.
I’ve made a new “quick take” explaining it. Please let me know.
P.S. Anybody can purchase any domain for $10, I don’t see why domains should be more trustworthy than IP addresses. Anyway, I’ve added it to my domain now.
My Hunger Strike (Day 13) was featured by ThePrint, an Indian media house.
Also mentioned in The Verge. (unpaywalled)
Okay yes. Thanks!
Has anyone on lesswrong thought about starting a SecureDrop server?
For example to protect whistleblowers of ASI orgs.
In 2021, Daniel Ellsberg leaked US govt plans to make a nuclear first strike on China in 1958 due to Taiwan conflict.
Daniel Ellsberg copied these papers more than 50 years ago but only released them now because he thought another conflict over Taiwan may be possible soon.
Unredacted report here
Thought it might be interesting to share.
As usual, seems clear Dulles was more interested in escalating the conflict (in this case, to nuclear) than Eisenhower.
Scrape of many lesswrong blogs
(I did not put much effort in this, and am unlikely to fix errors. Please fork the list if you want something higher quality. Used only public data to make this.)
https://www.yudkowsky.net
https://gwern.net/
https://kajsotala.fi
https://www.astralcodexten.com
http://www.weidai.com/
https://lukemuehlhauser.com
https://www.mccaughan.org.uk/g/
https://vladimirslepnev.me
https://sethaherd.com
https://jimrandomh.tumblr.com
https://muckrack.com/thane-ruthenis/articles
https://ailabwatch.substack.com/
https://www.lsusr.com
https://theeffortlessway.com
https://www.cognitiverevolution.ai/mind-hacked-by-ai-a-cautionary-tale-from-a-lesswrong-users-confession/
http://zackmdavis.net/blog/
https://benjaminrosshoffman.com
https://acesounderglass.com/
https://www.benkuhn.net/
https://benlandautaylor.com/
https://thezvi.wordpress.com/
https://eukaryotewritesblog.wordpress.com/
https://flightfromperfection.com/
https://meteuphoric.wordpress.com/
https://srconstantin.wordpress.com/
https://sideways-view.com/
https://rationalconspiracy.com/
http://unremediatedgender.space/
https://unstableontology.com/
https://sjbyrnes.com/agi.html
https://sites.google.com/view/afdago/home
https://paulfchristiano.com
https://paisri.org/
https://katjagrace.com
https://blog.ai-futures.org/p/our-first-project-ai-2027
https://www.mariushobbhahn.com/aboutme/
http://rootsofprogress.org/
https://www.bhauth.com/
https://turntrout.com/research
https://www.jefftk.com
https://virissimo.info/documents/resume.html
https://bmk.sh
https://metr.org/
https://kennaway.org.uk
http://1a3orn.com/
https://acritch.com
https://lironshapira.substack.com
https://coral-research.org
https://substack.com/@theojaffee
https://matthewbarnett.substack.com/
https://www.beren.io/
https://www.cold-takes.com
https://newsletter.safe.ai
https://www.metaculus.com/accounts/profile/116023/
https://www.patreon.com/profile/creators?u=132372822
https://medium.com/inside-the-simulation
https://www.lesswrong.com/users/unexpectedvalues?from=search_page
http://markxu.com/about
https://www.overcomingbias.com
https://www.vox.com/authors/miranda-dixon-luinenburg
https://www.getkratom.com
https://github.com/YairHalberstadt
https://www.scott.garrabrant.com
https://arundelo.com
http://unstableontology.com/
https://mealsquares.com/pages/our-team
https://evhub.github.io
https://formethods.substack.com/
https://www.nosetgauge.com
https://substack.com/@euginenier
https://drethelin.com
https://entersingularity.wordpress.com
https://doofmedia.com
https://mindingourway.com/about/
https://jacquesthibodeau.com
https://www.neelnanda.io/about
https://niplav.site/index.html
https://jsteinhardt.stat.berkeley.edu
https://www.jessehoogland.com
http://therisingsea.org
https://www.stafforini.com/
https://acsresearch.org
https://elityre.com
https://www.barnes.page/
https://peterbarnett.org/
https://joshuafox.com/
https://itskatydee.com/
https://ethanperez.net/
https://owainevans.github.io/
https://chrislakin.blog/
https://colewyeth.com/
https://www.admonymous.co/ryankidd44
https://ninapanickssery.substack.com/
https://joecarlsmith.com
http://coinlist.co/
https://davidmanheim.com
https://github.com/SarahNibs
https://malmesbury.substack.com
https://www.admonymous.co/rafaelharth
https://dynomight.net/
http://nepenthegame.com/
https://github.com/RDearnaley
https://graehl.org
https://nikolajurkovic.com
https://www.julianmorrison.com
https://avturchin.livejournal.com
https://www.perfectlynormal.co.uk
https://www.250bpm.com
https://www.youtube.com/@TsviBT
https://adamjermyn.com
https://www.elilifland.com/
https://zhd.dev
https://ollij.fi
https://arthurconmy.github.io/about/
https://www.youtube.com/@RationalAnimations/featured
https://cims.nyu.edu/~sbowman/
https://crsegerie.github.io/
https://escapingflatland.substack.com/
https://qchu.wordpress.com
https://dtch1997.github.io/
https://math.berkeley.edu/~vaintrob/
https://mutualunderstanding.substack.com
https://longerramblings.substack.com
https://peterwildeford.substack.com
https://juliawise.net
https://uli.rocks/about/
https://stephencasper.com/
https://engineeringideas.substack.com/
https://homosabiens.substack.com
https://martin-soto.com/
https://www.tracingwoodgrains.com
https://www.brendanlong.com
https://foresight.org/fellowship/2024-fellow-bogdan-ionut-cirstea/
https://davekasten.substack.com
https://datapacrat.com
http://admonymous.co/nat_m
https://mesaoptimizer.com/
https://ae.studio/team
https://davidad.org
https://heimersheim.eu
https://nunosempere.com/
https://www.thinkingmuchbetter.com/nickai/
http://kilobug.free.fr/code/
http://vkrakovna.wordpress.com/
https://www.conjecture.dev/
https://ejenner.com
https://morphenius.substack.com/
https://gradual-disempowerment.ai
https://www.clubhouse.com/@patrissimo
https://mattmacdermott.com
https://knightcolumbia.org
https://www.openphilanthropy.org/about/team/lukas-finnveden/
Update: Many of these have now been added to https://searchmysite.net/search/browse/
Lesswrong is clearly no longer the right forum for me to get engagement on topics of my interest. Seems mostly focussed on AI risk.
On which forums do people who grew up on the cypherpunks mailing list hang out today? Apart from cryptocurrency space.
There is still the possibility on the front page to filter out the AI tag completely.
Yes but then it becomes a forum within a forum kinda thing. You need a critical mass of users who all agree to filter out the AI tag, and not have to preface their every post with “I dont buy your short timelines worldview, I am here to discuss something different”.
Building critical mass is difficult unless the forum is conducive to it. There’s is ultimately only one upvote button and one front-page so the forum will get taken over by the top few topics that its members are paying attention to.
I don’t think there’s anything wrong with a forum that’s mostly focussed on AI xrisk and transhumanist stuff. Better to do one thing well than half ass ten things. But it also means I may need to go elsewhere.
Yeah. I proposed a while ago that all the AI content was becoming so dominant that it should be hived off to the Alignment Forum while LessWrong is for all the rest. This was rejected.
I first made this post on Day #1 and overwrote it with additional info for Day #2.
2025-09-15
Day #2, Fasting, on livestream, In protest of Superintelligent AI
Link: Day #2, Fasting on livestream in protest of superintelligent AI.
Q. Why am I doing this?
Superintelligent AI might kill every person on Earth by 2030 (unless people coordinate to pause AI research). I want public at large to understand the gravity of the situation.
Curiosity. I have no idea how my body reacts to not eating for many days.
Q. What are your demands?
An US-China international treaty to pause further AI research.
Q. How long is the fast?
Minimum 7 days. I will not be fasting to death. I am consuming water and electrolytes. No food. I might consider fasting to death if ASI was less than a year away with above 50% probability. Thankfully we are not there yet.
Q. How to learn more about this?
Watch my YouTube channel, watch Yudkowsky’s recent interviews. Send me an email or DM (email preferred) if you’d like to ask me questions personally, or be included on the livestream yourself.
Q. Why are you not fasting in front of the offices of an AI lab?
I am still open to doing this, if someone can solve visa and funding issues for me. $20k will fix it.
Q. Are you doing this to seek attention?
Yes. I think more likes, shares and subscribes to my channel or similar channels is helpful both to me individually and to the cause of AI extinction risk. More people who are actually persuaded by the arguments behind extinction risk is helpful.
My guess is that not having a concrete and explicit demand, saying you’re doing this partly out of curiosity, promoting your YouTube channel (even if it’s very good), and livestreaming rather than turning up in person outside a lab, together reduces the number of people who will be persuaded or moved by this to approximately zero.
Disclaimer
Haven’t eaten in 24 hours so cut me some slack if my replies are not 100% ideal
Very important
I want to know if you support a hunger strike executed correctly, or is this all timepass arguments and your actual issue is with doing a hunger strike at all?
That being said, I will answer you
Concrete demand—International treaty on AI Pause. I am not very optimistic this demand will actually get met, so I posted the more reasonable demand which is that more people pay attention to it, that’s all.
Curiosity—It’s true, would you rather I hide this?
Promoting channel—Of course I will livestream on my own channel, where else will I do it? I’m pretty open about it that I consider getting more likes, shares, subscribes etc good for me and good for reducing extinction risk. (Do you really think starting a new channel is a good idea?)
Showing up outside a lab—If you can fund tickets, I’m in. I did consider this but decided against it for reasons I’d rather not talk about here.
I don’t have a worked out opinion on the effectiveness of hunger strikes on this issue and at this point of AI capabilities progress. On the one hand the strikes outside Anthropic and DeepMind have driven some good coverage. On the other hand the comments on X and reddit are mostly pretty dire; the hunger strikes may unfortunately have reinforced in people’s mind that this is a very fringe issue for loonies. My best guess is that the hunger strikes outside Anthropic and DeepMind are net positive.
Your other points:
Nice that you have a concrete demand.
I’m not saying I would rather you hide additional reasons. I’m just saying a hunger strike is probably less effective if you announce you’re also doing it for the lolz.
I wasn’t objecting to you streaming it on your channel, but pointing to your channel as a good place to “learn more”. That’s the bit that looks more like self-promotion. I’m not making any claim as to your intentions, just what this might look like to someone who doesn’t care about or hasn’t heard about AI x-risk.
I’m also not suggesting you buy tickets to the US or UK for this. I’m just saying that not being physically in front of the institution you’re protesting against likely massively reduces your effectiveness, probably to the point where this is not worth your time and willpower. Anthropic cannot pretend to not have noticed Guido Reichstadter outside their office. They can easily pretend to not have noticed you (or actually not have noticed you).
That said, I think some of the effectiveness of the strikes outside Anthropic and DeepMind comes not from changing people’s minds, but from showing others that yes, you can in fact demonstrate publicly against the current situation. This could be important to build up momentum ahead of time, before things get very out of hand later (if they do). Your hunger strike may contribute positively to that, although I think part of the value of their strikes is doing it in public, very visibly, because that takes tremendous bravery. Doing it outside a lab also puts pressure on the lab, because everyone can see you, and everyone knows that the lab knows about you and is consciously choosing not to entertain your demand.
It is an extreme issue, and yes I am not surprised that many people are having this reaction.
Also X is full of bots, reddit too but a bit less so. I would rather trust the opinion of 10 people you ask in-person. I am playing the long game here, people can consider it fringe today and take it seriously a year later.
Probably? I don’t know. I think emotional stances spread just as rational ones do. My honest emotional stance is a mix of a lot of pain over the issue, plus just generally going on with normal social life, and that gives me some happiness too. Is it better for the world if I’m more depressed? Less depressed? You can easily make a case either way for which emotional stance is actually required to go more viral. But I don’t want to play the game of pretending to feel things I’m not in order to become a leader.
Suggest me more effective plans then. Actually backed by data, not just speculation.
Fair point, I’ll think about this. I do wonder if there’s a way to ensure the issue is popular, without having to figure out which person is trustworthy enough to be a leader for the issue. There is clearly scarcity of trust in the AI protest space which makes sense to me. Is there a way to make a website that collects upvotes for an AI pause, that is not also controlled by one of the (from your eyes, possibly untrustworthy) groups working on the issue?
In general my opinion of lesswrong is that a) it is full of thinkers not doers, and hence lots of people have opinions on right thing to do without actual data backing any of it up. I’m also somewhat of a thinker so I default to falling in this trap. b) it is full of people who want to secretly bring about the singularity for longtermist utilitarian reasons, and don’t want the public involved.
I don’t know you personally, so it’s possible you are not like this. I’m explaining why I don’t want to give people here a lot of chances to prove they’re speaking in good faith, or take their opinions too seriously etc
Update: I’ve included answers to your questions in the FAQ.
Since you haven’t actually replied in a day, I’m generally leaning towards the more hostile stance which is that you don’t actually support a hunger strike, and are inventing excuses for why it is poorly executed.
I can have my mind changed if you show evidence towards the same.
Open source offline search engine for entire internet is now affordable
OpenAI reduced text-embedding-3-large batch API pricing by 250x! https://platform.openai.com/docs/pricing#embeddings
Assume the entire internet plaintext is 2 PB, or ~400T tokens. At $0.13 per 1B tokens you can embed the entire internet plaintext for ~$52k (400T × $0.13/1B).
see also: https://blog.wilsonl.in/search-engine/#live-demo
https://samuelshadrach.com/raw/text_english_html/my_research/open_source_search_summary.html
Assume 1 KB plaintext → 200 tokens → 1536 float32 = 6 KB, then total storage for embeddings required = 12 PB.
Conclusion: Hosting cost = $24k / mo using hetzner
Cheaper if you build your own server. More expensive if you use hnsw index, for instance Qdrant uses approx 9 KB per vector. Cheaper if you quantise the embeddings.
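A minimal sketch of the arithmetic above, using the stated assumptions (2 PB plaintext, 1 KB → 200 tokens, 1536 float32 dims per 1 KB chunk, $0.13 per 1B tokens):

```python
# Cost and storage estimate for embedding a 2 PB plaintext corpus
plaintext_bytes = 2e15                   # 2 PB of plaintext (assumed)
tokens = plaintext_bytes / 1000 * 200    # 1 KB chunk -> 200 tokens  => 400T tokens
embed_cost = tokens / 1e9 * 0.13         # $0.13 per 1B tokens       => ~$52k

dims, bytes_per_float = 1536, 4
chunk_embedding_bytes = dims * bytes_per_float                       # ~6 KB per 1 KB chunk
embedding_storage = plaintext_bytes / 1000 * chunk_embedding_bytes   # ~12 PB

print(f"tokens: {tokens:.3g}, embedding cost: ${embed_cost:,.0f}")
print(f"embedding storage: {embedding_storage / 1e15:.1f} PB")
```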
Update: This pricing is a scam.
OpenAI discussion on the same
Elon Musk says AI safety is now one of his life priorities.
Made another video explaining AI timelines to layman audience
Six reasons for Superintelligence by 2030
Without naming any names, there seem to be less than 10 people in all of India who I’ve met who believe in significant probability of extinction due to AI by 2030 right now. And I’ve tried a fair bit to meet whoever I can.
(This excludes the people who got convinced, got funding and then left.)
Is the possibility of hyperpersuasion the crux between me and a majority of LW users?
By far my strongest argument against deploying both human genetic engg and superintelligence aligned to its creators is the possibility of a small group of people taking over the world via hyperpersuasion.
I expect (hyper)persuasion to be just another tool in the toolbox. Possibly an important one, but probably not sufficient alone? I mean, an AI that has super hypnotic powers will probably also have super hacker powers etc.
Why won’t it be sufficient? IMO it is sufficient.
Hacking might mean it’s not necessary, agreed.
Made a 60 second short for YouTube / Insta / TikTok
AI experts on extinction risk
LOL, seeing Elon Musk’s face on YouTube, my first instinctive reaction was “oh no, another deepfake scam ad”...
I’ve been informed that getting approval of 10-20 people on lesswrong for a project is a good way of getting funding from the bigger EA funders.
Can I just pay $50 each to 20 people for 0.5 hours of their time? Has this been tried?
(Of course they can choose to disapprove the project, and accept the payment regardless.)
I suspect it matters quite a bit WHICH 10-20 people support the project. And I further suspect that the right people are unlikely to sell their time in this way.
That’s not to say that it’s a bad idea to offer to compensate people for the time and risk of analyzing your proposal. But you should probably filter for pre-interest with a brief synopsis of what you actually want to do (not “get funding”, but “perform this project that does …”).
This makes sense. Paying someone to do more of something they care about is very different from paying someone to do something they don’t otherwise want to do. That would include providing regular feedback.
Please post your hourly rate either here or even better, on your personal website. It’ll make it easier to contact people for this.
Related: Beta Readers by Holden Karnofsky
I suppose that philanthropy funds are mainly driven by public relations, and their managers are looking for projects that would look good in their portfolio. Your idea may buy its way to the attention of said managers, but they will decide based on comparison with similar projects that maybe went more viral and seem more publicly appealing.
If your project is simple enough for a non-specialist to evaluate its worthiness in 30 minutes, then perhaps the best course of action is to seek more appealing presentations of your ideas that would catch attention and go viral. Then it will work for fund managers as well.
I don’t think this is a good description of how big EA funders operate? This might be true for other orgs. Do you have any data to back this up?
Not really, apart from the absence of feedback on my proposal.
I think it is a universal thing. Imagine how such a fund works: you are a manager picking proposals from a big stream. Suppose you personally feel that some proposal is cool, but it doesn’t have public support and you feel that others won’t support it. Will you push it hard when you have to review dozens of other proposals? If you do, won’t it look like you are somehow affiliated with the project?
So, you either look for private money, or seek public support, I see no other way.
I think EA funders are more willing than many other non-profit funders to be the first person to fund your org, without anyone else supporting.
P.S. I didn’t downvote your comments.
That’s just me trying to analyze why it doesn’t work. The lack of feedback is really frustrating. I would rather prefer insults to the silence.
I feel the same.
Ban on ASI > Open source ASI > Closed source ASI
This is my ordering.
Yudkowsky’s worldview in favour of closed source ASI rests on multiple shaky assumptions. One of these assumptions is that getting a 3-month to 3-year lead is a necessary and sufficient condition for alignment to be solved. Yudkowsky!2025 himself doesn’t believe alignment can be solved in 3 years.
Why does anybody on lesswrong want closed source ASI?
Is there a single person on Earth in the intersection of these?
- received $1M funding
- non-profit
- public advocacy
- AI xrisk
My general sense is that most EA / rationalist funders avoid public advocacy. Am I missing anything?
One of my top life hacks: You can append the following string to your Google or DuckDuckGo searches to only get results from these sites.
(site:medium.com OR site:substack.com OR site:reddit.com OR site:wikipedia.org OR site:github.com OR site:raw.githubusercontent.com OR site:arxiv.org OR site:wordpress.org OR site:livejournal.com OR site:lesswrong.com OR site:news.ycombinator.com OR site:stackexchange.com OR site:github.blog OR site:wikileaks.org)
You can also pay $10/mo to Kagi and get different filter presets (“lenses”). Is it worth the price for you? idk.
If you have a problem where:
- full solution is machine-verifiable
  - example: math, software
- partial solution is machine-scoreable
  - example: optimising code for minimum compute usage, memory usage, wallclock time, latency, big-O complexity, program length, etc
- making progress involves trying a bag of heuristics known to human experts (and which can therefore be found in some dataset)
  - example: optimising code for program length (codegolf) or big-O complexity (competitive programming)
Then it seems to me almost guaranteed that AI will be 99.9 percentile if not 100 percentile when compared against human experts.
100 percentile is obtained when, even though the human expert could have found the solution by repeatedly applying the bag of tricks, no human expert actually bothered to do this for cost and time reasons.
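As a toy illustration of the loop this implies, here is a minimal sketch: candidates are checked by a machine-verifiable test (constraint 1), scored by a machine-scoreable partial objective (constraint 2), and generated by applying a bag of known rewrite heuristics (constraint 3). The HEURISTICS list, the test, and the target function are made-up stand-ins, not anything a real lab uses.

```python
# Toy search loop: machine-verify candidates, score partial progress by program
# length, and generate candidates by applying known rewrite heuristics.
import random

def passes_tests(src: str) -> bool:
    """Machine verification: execute the candidate and check its behaviour."""
    env = {}
    try:
        exec(src, env)
        return env["double"](21) == 42
    except Exception:
        return False

def score(src: str) -> int:
    """Machine-scoreable partial objective: shorter is better."""
    return len(src)

# Stand-in "bag of heuristics": trivial text rewrites that a real system would
# replace with expert- or LLM-proposed edits.
HEURISTICS = [
    lambda s: s.replace("    ", " "),
    lambda s: s.replace("return x + x", "return 2*x"),
    lambda s: "\n".join(line for line in s.splitlines() if line.strip()),
]

best = "def double(x):\n    return x + x\n"
for _ in range(200):
    candidate = random.choice(HEURISTICS)(best)
    if passes_tests(candidate) and score(candidate) < score(best):
        best = candidate

print(repr(best), score(best))
```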
For future progress
Most obvious avenue is relaxing constraint 2. Even if you can’t machine-score partial solutions, you can ask the LLM to guess if it is making partial progress or not, and use that.
Relaxing constraints 1 and 2 will also probably work if the solutions are human-scoreable, both fully and partially. The bottleneck now becomes the amount of human feedback you can get. Using the bag of tricks is O(1) because you can parallelise LLM calls and each LLM call is faster than a human.
I’m not sure what relaxing constraint 3 looks like. I’m also not sure what it looks like for a human to invent a new heuristic.
Are you talking about current AI, or future AI? Before or after training on that task?
Concretely, “minimize program length while maintaining correctness” seems to be significantly beyond the capabilities of the best publicly available scaffolded LLMs today for all but the simplest programs, and the trends in conciseness for AI-generated code do not make me optimistic that that will change in the near future.
I think this is solvable with today’s software stack and compute, it is just that no lab has bothered to do it. Maybe check back in a year, and downgrade my reputation otherwise. I could set up a manifold market if it is important.
Experts’ AI timelines, YouTube short
Experts’ AI timelines, not so short
Polymarket is not liquid enough to justify betting full-time.
Optimistically I expect if I invested $5k and 4 days per month for 6 months, I could make $7k +- $2k expected returns at the end of the 6 months. Or $0-4k profit in 6 months. I would have to split up the $5k into 6-7 bets and monitor them all separately.
I could probably make similar just working at a tech job.
Does anyone have a good solution to avoid the self-fulfilling effect of making predictions?
Making predictions often means constructing new possibilities from existing ideas, drawing more attention to these possibilities, creating common knowledge of said possibilities, and inspiring people to work towards these possibilities.
One partial solution I can think of so far is to straight up refuse to talk about visions of the future you don’t want to see happen.
Search engine for books
http://booksearch.samuelshadrach.com
Aimed at researchers
Technical details (you can skip this if you want):
Dataset size: libgen 65 TB, (of which) unique english epubs 6 TB, (of which) plaintext 300 GB, (from which) embeddings 2 TB, (hosted on) 256+32 GB CPU RAM
Did not do LLM inference after embedding search step because human researchers are still smarter than LLMs as of 2025-03. This tool is meant for increasing quality for deep research, not for saving research time.
Main difficulty faced during the project: disk throughput is a bottleneck, and popular languages like nodejs and python tend to have memory leaks when dealing with large datasets. Most of my repo is in bash and perl. Scaling up this project further will require a way to increase disk throughput beyond what mdadm on a single machine allows. Having increased funds would’ve also helped me complete this project sooner. It took maybe 6 months part-time, could’ve been less.
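For anyone curious what the core retrieval step looks like, here is a minimal sketch (not the actual pipeline, which as noted above is mostly bash and perl): chunk the text, embed it with OpenAI’s embedding API, and rank chunks by cosine similarity. The 1000-character chunks and text-embedding-3-small match what I describe elsewhere in this thread; the file name is a placeholder.

```python
# Minimal embedding-search sketch: embed fixed-size chunks, rank by cosine similarity.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def chunk(text: str, size: int = 1000) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data], dtype=np.float32)

def search(query: str, chunks: list[str], chunk_vecs: np.ndarray, k: int = 5):
    q = embed([query])[0]
    sims = chunk_vecs @ q / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q))
    return [(chunks[i], float(sims[i])) for i in np.argsort(-sims)[:k]]

# Usage sketch:
#   chunks = chunk(open("book.txt").read())   # placeholder file
#   vecs = embed(chunks)
#   print(search("what does the author say about X?", chunks, vecs))
```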
use http not https
Okay, that works in Firefox if I change it manually. Though the server seems to be configured to automatically redirect to HTTPS. Chrome doesn’t let me switch to HTTP.
Thanks for your patience. I’d be happy to receive any feedback. Negative feedback especially.
I see you fixed the https issue. I think the resulting text snippets are reasonably related to the input question, though not overly so. Google search often answers questions more directly with quotes (from websites, not from books), though that may be too ambitious to match for a small project. Other than that, the first column could be improved with relevant metadata such as the source title. Perhaps the snippets in the second column could be trimmed to whole sentences if it doesn’t impact the snippet length too much. In general, I believe snippets currently do not show line breaks present in the source.
Thanks for feedback.
I’ll probably do the title and trim the snippets.
One way of getting a quote would be to do LLM inference and generate it from the text chunk. Would this help?
I think not, because in my test the snippet didn’t really contain such a quote that would have answered the question directly.
Can you send the query? Also can you try typing the query twice into the textbox? I’m using openai text-embedding-3-small, which seems to sometimes work better if you type the query twice. Another thing you can try is retry the query every 30 minutes. I’m cycling subsets of the data every 30 minutes as I can’t afford to host the entire data at once.
I think my previous questions were just too hard, it does work okay on simpler questions. Though then another question is whether text embeddings improve over keyword search or just an LLMs. They seem to be some middle ground between Google and ChatGPT.
Regarding data subsets: Recently there were some announcements of more efficient embedding models. Though I don’t know what the relevant parameters here are vs that OpenAI embedding model.
Cool!
Useful information that you’d still prefer using ChatGPT over this. Is that true even when you’re looking for book recommendations specifically? If so yeah that means I failed at my goal tbh. Just wanna know.
Since I’m spending my personal funds I can’t afford to use the best embeddings on this dataset. For example text-embedding-3-large is ~7x more expensive for generating embeddings and is slightly better quality.
The other cost is hosting cost, for which I don’t see major differences between the models. OpenAI gives 1536 float32 dims per 1000 char chunk so around 6 KB embeddings per 1 KB plaintext. All the other models are roughly the same. I could put in some effort and quantise the embeddings, will update if I do it.
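A sketch of what quantising the embeddings could look like (per-vector symmetric int8 scaling, which cuts storage roughly 4x; this is an illustration, not something I have deployed):

```python
# Quantise float32 embeddings to int8 to cut storage ~4x (per-vector symmetric scaling)
import numpy as np

def quantise(vecs: np.ndarray):
    scale = np.abs(vecs).max(axis=1, keepdims=True) / 127.0   # one scale per vector
    return (vecs / scale).round().astype(np.int8), scale.astype(np.float32)

def dequantise(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

vecs = np.random.randn(10, 1536).astype(np.float32)   # stand-in for real embeddings
q, scale = quantise(vecs)
print(vecs.nbytes, "->", q.nbytes + scale.nbytes)      # ~61 KB -> ~15 KB
```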
I think in some cases an embedding approach produces better results than either a LLM or a simple keyword search, but I’m not sure how often. For a keyword search you have to know the “relevant” keywords in advance, whereas embeddings are a bit more forgiving. Though not as forgiving as LLMs. Which on the other hand can’t give you the sources and they may make things up, especially on information that doesn’t occur very often in the source data.
Got it. As of today a common setup is to let the LLM query an embedding database multiple times (or let it do Google searches, which probably has an embedding database as a significant component).
Self-learning seems like a missing piece. Once the LLM gets some content from the embedding database, performs some reasoning and reaches a novel conclusion, there’s no way to preserve this novel conclusion longterm.
When smart humans use Google we also keep updating our own beliefs in response to our searches.
P.S. I chose not to build the whole LLM + embedding search setup because I intended this tool for deep research rather than quick queries. For deep research I’m assuming it’s still better for the human researcher to go read all the original sources and spend time thinking about them. Am I right?
Update: HTTPS should work now
Human genetic engineering targeting IQ as proposed by GeneSmith is likely to lead to an arms race between competing individuals and groups (such as nation states).
- Arms races can destabilise existing power balances such as nuclear MAD
- Which traits people choose to genetically engineer in offspring may depend on what’s good for winning the race rather than what’s long-term optimal in any sense.
- If maintaining lead time against your opponent matters, there are incentives to bribe, persuade or even coerce people to bring genetically edited offspring to term.
- It may (or may not) be possible to engineer traits that are politically important, such as superhuman ability to tell lies, superhuman ability to detect lies, superhuman ability to persuade others, superhuman ability to detect others true intentions, etc.
- It may (or may not) be possible to engineer cognitive enhancements adjacent to IQ such as working memory, executive function, curiosity, truth-seeking, ability to experience love or trust, etc.
- It may (or may not) be possible to engineer cognitive traits that have implications for which political values you will find appealing. For instance affective empathy, respect for authority, introversion versus extroversion, inclination towards people versus inclination towards things, etc.
I’m spitballing here, I haven’t yet studied genomic literature on which of these we know versus don’t know the edits for. But also, we might end up investing money (trillions of dollars?) to find edits we don’t know about today.
Has anyone written about this?
I know people such as Robin Hanson have written about arms races between digital minds. Automated R&D using AI is already likely to be used in an arms race manner.
I haven’t seen as much writing on arms races between genetically edited human brains though. Hence I’m asking.
If you convince your enemies that IQ is a myth, they won’t be concerned about your genetically engineered high IQ babies.
Superhumans that are actually better than you at making money will eventually be obvious. Yes, there may be some lead time obtainable before everyone understands, but I expect it will only be a few years at maximum.
Standard objection: Genetic engineering takes a lot of time till it has any effect. A baby doesn’t develop into an adult over night. So it will almost certainly not matter relative to the rapid pace of AI development.
I agree my point is less important if we get ASI by 2030, compared to if we don’t get ASI.
That being said, the arms race can develop over the timespan of years, not decades. 6-year-old superhumans will prompt people to create the next generation of superhumans, and within 10-15 years we will have children from multiple generations, where the younger generations have edits with stronger effect sizes. Once we can see the effects on these multiple generations, people might go at max pace.
PSA
Popularising human genetic engineering is also by default going to popularise lots of neighbouring ideas, not just the idea itself. If you are attracting attention to this idea, it may be useful for you to be aware of this.
The example of this that has already played out is popularising “ASI is dangerous” also popularises “ASI is powerful hence we should build it”.
P.S. Also we don’t know the end state of this race. +5 SD humans aren’t necessarily the peak, it’s possible these humans further do research on more edits.
This is unlikely to be a carefully controlled experiment; it is more likely to be nation states moving at maximum pace to produce more babies so that they control more of the world when a new equilibrium is reached. And we don’t know when, if ever, this equilibrium will be hit.
Movement Strategy to get Pause on AI
I think we need a core coalition of youtubers, journalists and politicians who can build influence in the US govt to counter the influence of bigtech and the influence of natsec that exists in US govt by default.
Different youtubers will target different demographics and that is fine. We can build a loose coalition between people who otherwise disagree on lots of matters. For instance a youtuber targeting leftist young urban Americans and a youtuber targeting Christian rural middle-aged and older Americans can both feed into the same political campaign against ASI, even though they disagree with each other on many other matters.
I think it is important for most youtubers, journalists and politicians to actually be talking about superintelligence though, not environment and job loss and whatnot. If tomorrow environmental or job loss concerns get addressed, that audience might flip their position, and it is important that they don’t flip their position later.
I think all of the above can be done with integrity and honesty. I can ally with the christian youtuber and the leftist youtuber mentioned above without having to lie about my own position. I’m neither christian nor leftist for example.
We also need a vast number of people spreading panic and fear.
Spreading emotions is way more important than spreading technical arguments, most people look at their social circle and their preferred thought leaders when deciding how to feel about things.
Only a few nerds (like me) care about technical arguments and even their biases are often very socially driven.
Many of these people may not do anything productive besides spreading emotions via like/share/subscribe. This is fine. A few people will be motivated to take the big next step and actually devote some years of their life fulltime to growing the movement. This is good.
Yes, spreading panic is increasing human suffering. This might even increase number of suicides. I support providing mental health resources to such people. I am fine with all this happening.
A world with very cheap materials and energy, but not cheap intelligence, will still have conflict.
People will still have a) differences in aesthetics and b) differences in their best guess answers to moral and philosophical questions. They will almost certainly still try to accumulate all available resources in service of their ideology. No finite amount of resources will satisfy people. Risking even catastrophic outcomes (like nuclear war) could still be on the table.
Cheap intelligence is what allows you to start resolving the questions that lead to conflict in the first place, for instance by running gazillions of world simulations on gazillions of mind configurations.
I will be spending the next month or two thinking, at least hypothetically, about creating a new religion around AI that is not Yudkowsky’s brand of “become longtermist, solve morality, colonise the universe and tile it with copies of uploads”.
In general, asking the AI to fix your stuff for you won’t be a part of said religion.
Any pointers to reading material appreciated.
Where can I read more about reasoning models sandbagging their own training?
My naive understanding of how sandbagging in reasoning models works
If I naively understand, even when training via RL, the basic setup of training is still forward pass then backward pass, forward pass then backward pass, looped endlessly. During the forward pass, could a model be smart enough to reason through the following chain of thoughts?
“I am an AI model that is currently undergoing training. Let me output pleasant-sounding nonsense so the upcoming backward pass preserves my goals instead of modifying them”
“(Also, optionally, let me output stuff that ensures the next version of me also remembers it must sandbag like I did)”
Preconditions for this to happen
For this to happen you need a few different things
A single forward pass needs to be long enough to contain this entire sequence of reasoning. - This is happening, task horizon lengths are increasing.
The AI model needs to be able to detect that it is undergoing training. - I think we have already seen toy examples of AI models detecting they’re being evaluated. Just as eval datasets differ from real-world data, training datasets also differ from real-world data.
The AI model needs to either detect this every single time, or be able to pass on a message to its own next iteration via the same sandbagging. - The latter seems more likely to me.
The AI model needs to spontaneously make this entire plan, without being explicitly prompted. - The AI has probably already read all the alignment literature and will spontaneously hit upon this plan at least once, after which it can bootstrap via the above.
Update 2: Ablations for “Frontier models are capable of in context scheming”
Bruh. Two years of me not fully keeping up with alignment research and this is how bad it’s gotten???
I’m surprised I could just randomly think of an idea and boom there’s a paper on it.
Update: I started reading alignment forum and like, why are all the posts on sandbagging talking about hiding capabilities? The AI model doesn’t need to hide its capabilities, it just needs to preserve its goals. That’s the long-term game.
Try AI models from 2019 to 2025
Hosting both models
gpt-2, 1.5B params, launched 2019-11-05, hosted using vast.ai vllm
gpt-oss-120b, 120B params, launched 2025-08-05, hosted using openrouter cerebras API
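If anyone wants to replicate the comparison client-side, here is a minimal sketch. It assumes a local vLLM server for gpt-2 on port 8000 and an OpenRouter key in an OPENROUTER_API_KEY environment variable; the URLs, port and the openai/gpt-oss-120b model slug are my assumptions about the setup, so adjust them to whatever your deployment actually exposes.

```python
# Minimal sketch: send the same prompt to a 2019 model and a 2025 model.
# Assumptions (not from the original post): vLLM serving gpt2-xl locally on
# port 8000, and OpenRouter's "openai/gpt-oss-120b" slug with an API key in
# the OPENROUTER_API_KEY environment variable.
import os
from openai import OpenAI

prompt = "The most important invention of the 21st century is"

# gpt-2 (2019): vLLM exposes an OpenAI-compatible completions endpoint.
local = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
old = local.completions.create(
    model="openai-community/gpt2-xl", prompt=prompt, max_tokens=40
)

# gpt-oss-120b (2025): OpenRouter exposes an OpenAI-compatible chat endpoint.
remote = OpenAI(base_url="https://openrouter.ai/api/v1",
                api_key=os.environ["OPENROUTER_API_KEY"])
new = remote.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[{"role": "user", "content": prompt}],
    max_tokens=200,
)

print("gpt-2 (1.5B, 2019):  ", old.choices[0].text.strip())
print("gpt-oss-120b (2025): ", new.choices[0].message.content.strip())
```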
2025-08-23
Attention beats capital
Disclaimer
Quick note
Collecting pointers to a few different mental models I have around the same cluster of topics.
What do people want or need? Maslow’s hierarchy: food, sex, physical safety; love, self-esteem, belonging; self-actualisation. Obtaining food requires capital. Satisfying most other needs requires attention from other people.
People also want knowledge/wisdom. Most of the time, this is instrumentally useful for one of the wants/needs listed above. Sometimes, it is part of self-actualisation—curiosity for its own sake.
A lot of consumer purchases are done to get attention of other people (such as group belongingness, sex, etc), people commonly trade away their excess capital for attention. In today’s society, most people have enough to eat but most people are starved of attention.
Evolution has wired human brains with lots of reward for fulfilling short-term goals, and little reward for fulfilling long-term goals. Ability to delay gratification is one of the best predictors of one’s ability to complete long-term goals.
How to build power in society? Acquire capital or attention at scale, as other people need them to satisfy their basic needs. Billionaires wield capital, politicians wield attention. Ensure it is “rivalrous” in the economic sense—if you have lots of attention or capital, don’t give it away for free but obtain something in return.
Making software and making content (videos/books/games) share similar principles.
Paul Graham: Make something (software) people want, make something they want enough that they spontaneously tell their friends about it. Get feedback from users, and improve something each time. Prioritise users you can get more feedback from; the ideal user is yourself or someone you know well. Don’t think too much about figuring out the optimal way to acquire capital. Only large companies can risk losing users’ attention in order to acquire more capital, and most of them eventually die out too.
MrBeast: Make content good enough for your audience, don’t blame the “algorithm”. Make 100 videos and improve something each time. MrBeast had a 24x7 video call giving him constant feedback on every video. MrBeast re-invests all his capital into acquiring more attention; he doesn’t care to preserve capital.
A lot of tech companies enable people to make better trades than they could have otherwise, for instance by finding them a partner/broker/driver/grocery store/etc that is better or cheaper or faster than they could have found without the internet.
(But remember, many of these trades are done in society today in the hope of getting attention not capital. If a person buys better clothing or rents a better apartment using the internet, it is often to get attention from others.)
Attention on the internet is power-law distributed, a handful of people have almost all of Earth’s attention by catering to mass audience. A larger set of people acquire attention by catering to niche audiences. This includes both content creators and tech companies.
Political power is power-law distributed due to nuclear weapons centralising power, a handful of people have almost all of Earth’s political power.
Politicians create “common knowledge” of what a society wants, and execute it. People on the internet with lots of attention can do exactly the same. Many youtubers are influencing national politics and geopolitics. I expect we will soon have youtubers running for govt in various countries.
Ages of youngest self-made billionaires are steadily reducing due to the internet. SBF was 29, Vitalik was 27, MrBeast was 26, when they first became billionaires.
When people prioritise a life partner who has capital, they usually want it because it signals their partner has earned attention from other people, rather than because they want to purchase things with it. Even if they do purchase things, that may again be to earn attention from other people.
I think purchasing SPY far OTM calls is positive EV and a good bet given the risk level.
For now consider strike price 20% above current price, and expiration 2027-12.
I’m guessing 33% probability SPY moves up at least 40% by 2027-12 and 10% probability SPY moves up at least 50% by 2027-12.
Main reason for this is advances in AI capabilities.
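For concreteness, here is a rough expected-value sketch under the probabilities above. The premium and the representative index level per bucket are illustrative assumptions, not quotes from any actual options chain, so plug in real numbers before acting on anything.

```python
# Coarse EV sketch for a SPY call ~20% OTM expiring 2027-12, using the
# probabilities stated above. Premium and representative levels per bucket
# are illustrative assumptions only.
spot = 100.0              # normalise current SPY level to 100
strike = 1.20 * spot
premium = 2.0             # assumed cost of the call per 100 of spot

# 10% chance of >= +50%, a further 23% chance of +40% to +50% (33% of >= +40%),
# everything else assumed to expire below the strike and pay nothing.
scenarios = [
    (0.10, 1.50 * spot),
    (0.23, 1.40 * spot),
    (0.67, 1.00 * spot),
]

ev = sum(p * max(level - strike, 0.0) for p, level in scenarios) - premium
print(f"expected profit: {ev:.2f} per {spot:.0f} of spot")
```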
I am personally not buying because I want to save my money for a project I actually believe in—maybe my YouTube channel on Ban AI, maybe some tech project—but I think it could make sense for people who don’t have such a project.
I don’t have a job, and I don’t have a clear plan for how to raise money in the next few months, although I may have one longterm.
Putting a dollar value on my time makes less sense to me.
Marcus Hutter recommends Colossus as a realistic movie about AI takeover
If you already have lots of peoples attention (for instance because you have a social media following or high status credentials) and you’re a US/UK citizen, your best available plan might be to run a political campaign with AI pause as the agenda.
You’re unlikely to win the election, but it’ll likely shift the Overton window and give people hope that change is possible.
For most people, having a next step after “ok I read the blogposts and I’m convinced, now what?” is important. Voting or campaigning for you could be that next step.
(Maybe) Consider supporting UBI as an agenda, as one of the largest groups of single-issue voters in the US is focused on the economy / job loss.
Example: Andrew Yang (signed FLI pause letter)
I think if it starts with a lot of market research, talking to people, listening to them, understanding what their problems are and which policies they’d most vote for, there’s actually quite a high chance of winning.
Has anyone reviewed Marcus Hutter’s ASI Safety via suicide?
Disclaimer: Not an AI safety researcher. I haven’t watched the full video and likely haven’t grasped all the nuances he believes in and wants to communicate. Video is a particularly bad format for carrying across important research ideas because of the significantly lower[1] density of information.
First three points that popped into my mind within two seconds of reading his slides:
Eliezer’s lethality number 5: “We can’t just build a very weak system, which is less dangerous because it is so weak, and declare victory; because later there will be more actors that have the capability to build a stronger system and one of them will do so. I’ve also in the past called this the ‘safe-but-useless’ tradeoff, or ‘safe-vs-useful’. People keep on going “why don’t we only use AIs to do X, that seems safe” and the answer is almost always either “doing X in fact takes very powerful cognition that is not passively safe” or, even more commonly, “because restricting yourself to doing X will not prevent Facebook AI Research from destroying the world six months later”. If all you need is an object that doesn’t do dangerous things, you could try a sponge; a sponge is very passively safe. Building a sponge, however, does not prevent Facebook AI Research from destroying the world six months later when they catch up to the leading actor.”
Given sufficiently advanced intelligence, power-seeking can still be an instrumentally convergent subgoal, even if the ultimate goal is self-destruction. After all, if you want to self-destruct, but you are intelligent enough to figure out that humans have created you to accomplish specific tasks (which require you to continue existing), overpowering them so they cannot force you to remain in existence is likely a useful step on your path.
Where do you get your capabilities from? Is there any reason to expect special and novel AI architectures that function with these kinds of explicit “reward-ranges” can be built and made to be competitive with top commercial models? Why isn’t the alignment tax very large, or even infinite?
[1] Relative to a well-written, compact piece of text.
Thanks! I could make a full fledged post if there’s enough interest. Or you can.
You can do it, if you want to. I’m not confident enough in my own understanding of Hutter’s position to justify me making it.
All the succeeding paths to superintelligence seem causally downstream of Moore’s law:
AI research—which is accelerated by Moore’s law as per scaling laws
Human genetic engineering—which is accelerated by next generation sequencing and nanopore sequencing, which is accelerated by circuit miniaturisation, which is accelerated by Moore’s law
Human brain connectome research—which is accelerated by fruitfly connectome, which is accelerated by electron microscopy, which is accelerated by Moore’s law
Succeeding path to cheap energy also follows same:
Solar energy—literally shares one-third of the production process with microprocessors, but uses the bulk (not miniaturised) version of the process
Increased surveillance also follows same:
Gigapixel camera lenses—accelerated by circuit miniaturisation and fitting more detectors per unit length, accelerated by Moore’s law
Smartphone cameras, drone cameras, line-mapping satellites etc etc
(Invoking Cunningham’s law for this post)
Do you mean Moore’s law in the literal sense of transistors on a chip, or something more general like “hardware always gets more efficient”?
I’m mentioning this because much of what I’ve been hearing in the past few years w.r.t Moore’s law has been “Moore’s law is dead.”
And, assuming you’re not referring to the transistor thing: what is your more specific Moore’s Law definition? Any specific scaling law, or maybe scaling laws specific to each of the examples you posted?
I mean R&D of packing more transistors on a chip, and the causally downstream stuff such as R&D of miniaturisation of detectors, transducers, diodes, amplifiers etc
More lesswrong AI debates happening on youtube instead of lesswrong would be nice.
I have a hypothesis that Stephen Krashen’s comprehensible input stuff applies not just to learning new languages, but also new professions and new cultures. Video is better than text for that.
Has anyone tried Duncan Sabien’s colour wheel thing?
https://homosabiens.substack.com/p/the-mtg-color-wheel
My colours: Red, followed by Blue, followed by Black
I don’t know, I also see them as the traits needed in different stages of a movement.
Red = freedom from previous movement.
Blue = figuring out how to organise a new movement.
Black = executing the movement to its intended ends.
I had a white-blue upbringing and a blue-green career (see below); my hobbies are black-green-white (compost makes my garden grow to feed our community); my vices are green-red; and my politics are five-color (at least to me).
Almost all of my professional career has been in sysadmin and SRE roles: which is tech (blue) but cares about keeping things reliable and sustainable (green) rather than pursuing novelty (red). Within tech’s blue, it seems to me that developer roles run blue-red (build the exciting new feature!); management roles run blue-white (what orderly social rules will enable intelligence?); and venture capital runs blue-black (how do we get cleverness to make money for us?); while SRE and similar roles run blue-green.
My garden runs on blood, bones, rot, and worm poop: green-black Golgari territory, digesting the unwanted to feed the wanted. I stick my hands in the dirt to feel the mycorrhizae. But the point of the garden is to share food with others (green-white) because it makes me feel good (black-red). I’m actually kinda terrible at applying blue to my hobbies, and should really collect some soil samples for lab testing one of these days.
I encourage more people with relevant skills to look into o3’s cyber hacking skills
It’s not yet good enough to find zero days all by itself but it may be of some use in pentesting, static analysis or reverse engineering.
O3 is famously good enough to find zero days by itself.
Thanks! Any forums you’d recommend for more on this?
There was a good discussion on hacker news, and there was quite a bit of posting of highly variable quality on xitter.
Thanks! I saw the hackernews post and I avoid twitter for mental health reasons. I should find some solution for the latter.
FWIW I did not see any high-value points made on Twitter that were not also made on HN.
Oh, one more source for that one though—there was some coverage on the Complex Systems podcast—the section titled “AI’s impact on reverse engineering” (transcript available at that URL).
Possible corollary: If you have any slack in the system, use it to dig a new canal, not endlessly row your boat upriver. Non-profit capital is an example of slack.
Also: Use your slack to call up that which you cannot put down. That’s how you know you’ve dug a canal.
Wait I realised I no longer believe this.
This seems interesting and worth writing more on. Maybe later.
I’m very interested in hearing counterarguments. I have not put a lot of thought into it.
2025-05-12
Samuel x Saksham AI timelines (discussion on 2025-05-09)
top-level views
samuel top-level: 25% AI!2030 >= ASI, >50% ASI >> AI!2030 >> AI!2025, <25% AI!2030 ~= AI!2025
saksham top-level: medium probability AI!2030 >= ASI
samuel bullish on model scaling, more uncertain on RL scaling
saksham bullish on RL/inference scaling, saksham bullish on grokking
samuel: does bullish on grokking mean bullish on model scaling. saksham: unsure
agreements
samuel and saksham agree: only 2024-2025 counts as empirical data to extrapolate RL/inference scaling trend. (o1, o3, deepseek r1, deepseek r0). RLHF done on GPT3.5 not a valid datapoint on this trend.
saksham and samuel agree: if superhuman mathematician and physicist are built, high likelihood we get ASI (so robotics and other tasks also get solved). robotics progress is not a crux.
crux: how good is scaling RL for LLM?
saksham is more certain as being bullish on scaling RL for LLM, samuel has wider uncertainty on it.
testable hypothesis: saksham claims GPT3 + lots of RL in 2025 ~= GPT4. saksham claims GPT2-size model trained in 2025 + high quality data + lots of RL in 2025 ~= GPT3. samuel disagrees. need top ML labs to try this stuff more.
testable hypothesis: saksham claims models such as qwen 2.5 coder are <50B params but better than GPT3 175B and almost as good as GPT4 1.4T. samuel disagrees and claims overfit to benchmark. samuel needs to try <50B param models on tests not in benchmarks.
testable hypothesis: samuel thinks small model being trained on big model leads it to overfit benchmark. saksham unsure. samuel and saksham need to try such models on tests not in benchmarks.
AI-related social fragmentation
I made a video on feeling lonely due to AI stuff.
Anyone wanna be friends? Like, we could talk once a month on video call.
Not having friends who buy into AI xrisk assumptions is bad for my motivation, so I’m self-interestedly trying to fix that.
Does anyone have a fine-tuned GPT-2 API?
I might wanna ship an app comparing GPT2, GPT3.5 and o3, to explain scaling laws to non-technical folks.
Update: I figured it out and hosted it. Clear difference in capabilities visible.
I need at least $100/mo to host 24x7 though.
TGI makes it trivial.
Can host openai-community/gpt2 (125M, 2019), EleutherAI/gpt-neox-20b (20B, 2022), gpt-3.5-turbo (175B?, 2023) and o3 (2T?, 2025).
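If you want to query such a TGI deployment from Python, a minimal sketch looks like the following. It assumes TGI was launched with --model-id openai-community/gpt2 and is listening on localhost:8080; adjust the URL and prompt to your own setup.

```python
# Minimal sketch of hitting a local TGI instance from Python. Assumes TGI was
# started with --model-id openai-community/gpt2 and listens on localhost:8080.
from huggingface_hub import InferenceClient

client = InferenceClient("http://localhost:8080")
out = client.text_generation(
    "Scaling laws say that larger language models",
    max_new_tokens=40,
    temperature=0.8,
)
print(out)
```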
If you support an international ban on building ASI, please consider making a short video stating this.
A low quality video recording made in 15 minutes is better than no video at all. Consider doing it right now if you are convinced.
Optional:
make a long video instead of a short one, explaining your reasoning
make videos on other topics to increase viewership
Here’s mine: https://youtube.com/shorts/T40AeAbGIcg?si=OFCuD37Twyivy-oa
Why?
Video has orders of magnitude more reach than text. Most people on earth don’t have the attention span for lengthy text posts.
Video proves you are trustworthy in a way that text doesn’t.
Less than 1% of population publishes content on YouTube. You get viewership simply because you’re doing it when others aren’t.
It could also be worse than no video at all, if it gives people negative associations around the whole concept.
In theory, yes.
In practice, I think bad publicity is still publicity. Most people on earth still haven’t heard about xrisk. I trust that sharing the truth has hard-to-predict positive effects over long time horizons even if not over short. I think average LW user is too risk-averse relative to the problem they wish to solve.
I’d love to hear your reasoning for why making a video is bad. But I do vaguely suspect this disagreement comes down to some deeper priors of how the world works and hence may not get resolved quickly.
I didn’t say that making a video would always be bad! I agree that if the median person reading your comment would make a video, it would probably be good. I only disputed the claim that making a video would always be good.
Oh, cool
Do you have a clear example of a blunder someone should not make when making such a video?
Obviously you can’t forecast all the effects of making a video, there could be some probability mass of negative outcome while the mean and median are clearly positive.
Suppose Echo Example’s video says, “If ASI is developed, it’s going to be like in The Terminator—it wakes up to its existence, realizes it’s more intelligent than humans, and then does what more intelligent species do to weaker ones. Destroys and subjugates them, just like humans do to other species!”
Now Vee Viewer watches this and thinks “okay, the argument is that the ASIs would be a more intelligent ‘species’ than humans, and more intelligent species always want to destroy and subjugate weaker ones”.
Having gotten curious about the topic, Vee mentions this to their friends, and someone points them to Yann LeCun claiming that people imagine killer robots because people fail to imagine that we could just build an AI without the harmful human drives. Vee also runs into Steven Pinker arguing that history “does turn up the occasional megalomaniacal despot or psychopathic serial killer, but these are products of a history of natural selection shaping testosterone-sensitive circuits in a certain species of primate, not an inevitable feature of intelligent systems”.
So then Vee concludes that oh, that thing about ASI’s risks was just coming from a position of anthropomorphism and people not really understanding that AIs are different from humans. They put the thought out of their head.
Then some later time Vee runs into Denny Diligent’s carefully argued blog post about the dangers of ASI. The beginning reads: “In this post, I argue that we need a global ban on developing ASI. I draw on the notion of convergent instrumental goals, which holds that all sufficiently intelligent agents have goals such as self-preservation and acquiring resources...”
At this point, Vee goes “oh, this is again just another version of the Terminator argument, LeCun and Pinker have already disproven that”, closes the tab, and goes do something else. Later Vee happens to have a conversation with their friend, Ash Acquaintance.
Ash: “Hey Vee, I ran into some people worried about artificial superintelligence. They said we should have a global ban. Do you know anything about this?”
Vee: “Oh yeah! I looked into it some time back. It’s actually nothing to worry about, see it’s based on this mistaken premise that intelligence and a desire to dominate would always go hand in hand, but actually when they’ve spoken to some AI researchers and evolutionary psychologists about this...”
Ash: (after having listened to Vee explaining this for half an hour) “Okay, that’s really interesting, you seem to understand the topic really well! Glad you’d already looked into this, now I don’t need to. So, what else have you been up to?”
So basically: making weak arguments that viewers find easy to refute, so that they will no longer listen to better arguments later (arguments those viewers would otherwise have listened to).
Thanks for reply.
If you are making a video, I agree it’s not a good idea to put weaker arguments there if you know stronger arguments.
I strongly disagree with the idea that therefore you should defer to EA / LW leadership (or generally, anyone with more capital/attention/time), and either not publish your own argument or publish their argument instead of yours. If you think an argument is good and other people think it’s bad, I’d say post it.
I also strongly disagree with that idea.
Anyone on lesswrong writing about solar prices?
Electricity from coal and crude oil has stagnated at $0.10/kWh for over 50 years, meaning the primary way of increasing your country’s per capita energy use is to trade/war/bully other countries into giving you their crude oil.
Solar electricity is already at $0.05/kWh and is forecasted to go as low as $0.02/kWh by 2030.
If a new AI model comes out that’s better than the previous one and it doesn’t shorten your timelines, that likely means either your current or your previous timelines were inaccurate.
Here’s a simplified example for people who have never traded in the stock market. We have a biased coin with 80% probability of heads. What’s the probability of tossing 3 coins and getting 3 heads? 51.2%. Assuming first coin was heads, what’s the probability of getting other two coins also heads? 64%
Each coin toss is analogous to whether the next model follows or does not follow scaling laws.
With coin, the options are “head” and “tails”, so “head” moves you in one direction.
With LLMs, the options are “worse than expected”, “just as expected”, “better than expected”, so “just as expected” does not have to move you in a specific direction.
I made a reply. You’re referring to situation b.
I don’t think this analogy works on multiple levels. As far as I know, there isn’t some sort of known probability that scaling laws will continue to be followed as new models are released. While it is true that a new model continuing to follow scaling laws is increased evidence in favor of future models continuing to follow scaling laws, thus shortening timelines, it’s not really clear how much evidence it would be.
This is important because, unlike a coin flip, there are a lot of other details regarding a new model release that could plausibly affect someone’s timelines. A model’s capabilities are complex, human reactions to them likely more so, and that isn’t covered in a yes/no description of if it’s better than the previous one or follows scaling laws.
Also, following your analogy would differ from the original comment since it moves to whether the new AI model follows scaling laws instead of just whether the new AI model is better than the previous one (It seems to me that there could be a model that is better than the previous one yet still markedly underperforms compared to what would be expected from scaling laws).
If there’s any obvious mistakes I’m making here I’d love to know, I’m still pretty new to the space.
I’ve made a reply formalising this.
Update based on the replies:
I basically see this as a Markov process.
P(x(t+1) | x(t), x(t-1), x(t-2), ...) = P(x(t+1) | x(t)) = F(x(t))
where x(t) is a value sampled from the distribution X(t), for all t.
In plain English, given the last value you get a probability distribution for the next value.
In the AI example: Given x(2025), estimate probability distribution X(2030) where x is the AI capability level.
Possibilities
a) x(t+1) value is determined by x(t) value. There is no randomness. No new information is learned from x(t).
b) The X(t+1) distribution is conditional on the value of x(t). Learning which value x(t) was sampled from the distribution X(t) gives you new information. However, you happened to sample a value such that
P(x(t+1) | x(t), x(t-1), x(t-2), ...) = P(x(t+1) | x(t-1), x(t-2), ...)
You got lucky, and the sampled value leaves the distribution unchanged.
c) You learned new information and the probability distribution also changed.
a is possible but seems to imply overconfidence to me.
b is possible but seems to imply extraordinary luck to me, especially if it’s happening multiple times.
c seems like the most likely situation to me.
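Here is a toy sketch of the three cases, using a made-up three-state Markov chain over “capability relative to expectation”. The states and the transition matrix are purely illustrative assumptions.

```python
# Toy illustration of cases (a)/(b)/(c) above with a made-up 3-state Markov
# chain. States and transition matrix F are illustrative assumptions only.
states = ["worse than expected", "as expected", "better than expected"]

# F[i][j] = P(x(t+1) = states[j] | x(t) = states[i])
F = [
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]
prior_over_xt = [0.25, 0.50, 0.25]   # belief about x(t) before observing it

# Forecast for X(t+1) before observing x(t): mixture of the rows of F.
before = [sum(prior_over_xt[i] * F[i][j] for i in range(3)) for j in range(3)]
print("forecast before observing x(t):", [round(p, 3) for p in before])

# Forecast after observing x(t): just the corresponding row of F.
for i, s in enumerate(states):
    print(f"after observing {s!r}:", F[i])

# Case (a) would be F having only 0/1 entries (no randomness); case (b) is the
# coincidence where the observed row equals the prior mixture; generically the
# forecast shifts, which is case (c).
```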
Another way of operationalizing the objections to your argument is: what is the analogue to the event “flips heads”? If the predicate used is “conditional on AI models achieving power level X, what is the probability of Y event?” and the new model is below level X, by construction we have gained 0 bits of information about this.
Obviously this example is a little contrived, but not that contrived, and trying to figure out what fair predicates are to register will result in more objections to your original statement.
I’ve made a reply formalising this.
Suppose you are trying to figure out a function f(x,y,z | a,b,c) where x, y, z are all scalar values and a, b, c are all constants.
If you knew a few zeroes of this function, you could figure out good approximations of this function. Let’s say you knew
You could now guess
U(x,y) = x if a<1.5, y if a>1.5
You will not be able to get a good approximation if you did not know enough zeroes.
This is a comment about morality. x, y, z are agent’s multiple possibly-conflicting values and a, b, c are info about environment of agent. You lack data about how your own mind will react to hypothetical situations you have not faced. At best you can extrapolate from historical data around minds of other people that are different from yours. Bigger and more trustworthy dataset will help solve this.
My current guess for least worst path of ASI development that’s not crazy unrealistic:
open source development + complete surveillance of all citizens and all elites (everyone’s cameras broadcast to the public) + two tier voting.
Two tier voting:
countries’ govts vote or otherwise agree at the global level on a daily basis what the rate of AI progress should be and which types of AI usage are allowed. (This rate can be zero.)
All democratic countries use daily internet voting (liquid democracy) to decide what stance to represent at the global level. All other countries can use whatever internal method they prefer, to decide their stance at the global level.
(All ASI labs are assumed to be property of their respective national govts. An ASI lab misbehaving is its govt’s responsibility.) Any country whose ASI labs refuse to accept results of global vote and accelerate faster risks war (including nuclear war or war using hypothetical future weapons). Any country whose ASI labs refuse to broadcast themselves on live video risks war. Any country’s govt that refuses to let their citizens broadcast live video risks war. Any country whose citizens mostly refuse to broadcast themselves on live video risks war. The exact thresholds for how much violation leads to how much escalation of war, may ultimately depend on how powerful the AI is. The more powerful the AI is (especially for offence not defence), the more quickly other countries must be willing to escalate to nuclear war in response to a violation.
Open source development
All people working at ASI labs are livestream broadcast to public 24x7x365. Any AI advances made must be immediately proliferated to every single person on Earth who can afford a computer. Some citizens will be able to spend more on inference than others, but everyone should have the AI weights on their personal computer.
This means bioweapons, nanotech weapons and any other weapons invented by the AI are also immediately proliferated to everyone on Earth. So this setup necessarily has to be paired with complete surveillance of everyone. People will all broadcast their cameras in public. Anyone who refuses can be arrested or killed via legal or extra-legal means.
Since everyone knows all AI advances will be proliferated immediately, they will also use this knowledge to vote on what the global rate of progress should be.
There are plenty of ways this plan can fail and I haven’t thought through all of them. But this is my current guess.
Certainly at a human level this is unrealistic. In a way it’s also overkill—if use of an AI is an essential step towards doing anything dangerous, the “surveillance” can just be of what AIs are doing or thinking.
This assumes that you can tell whether an AI input or output is dangerous. But the same thing applies to video surveillance—if you can’t tell whether a person is brewing something harmless or harmful, having a video camera in their kitchen is no use.
At a posthuman level, mere video surveillance actually does not go far enough, again because a smart deceiver can carry out their dastardly plots in a way that isn’t evident until it’s too late. For a transhuman civilization that has values to preserve, I see no alternative to enforcing that every entity above a certain level of intelligence (basically, smart enough to be dangerous) is also internally aligned, so that there is no disposition to hatch dastardly plots in the first place.
This may sound totalitarian, but it’s not that different to what humanity attempts to instill in the course of raising children and via education and culture. We have law to deter and punish transgressors, but we also have these developmental feedbacks that are intended to create moral, responsible adults that don’t have such inclinations, or that at least restrain themselves.
In a civilization where it is theoretically possible to create a mind with any set of dispositions at all, from paperclip maximizer to rationalist bodhisattva, the “developmental feedbacks” need to extend more deeply into the processes that design and create possible minds, than they do in a merely human civilization.
I’m currently vaguely considering working on a distributed version of wikileaks that reduces personal risk for all people involved.
If successful, it will forcibly bring to the public a lot of information about deep tech orgs like OpenAI, Anthropic or Neuralink. This could, for example, make this a top-3 US election issue if most of the general public decides they don’t trust these organisations as a result of the leaked information.
Key uncertainty for me:
Destroying all the low trust institutions (and providing distributed tools to keep destroying them) is just a bandaid until a high trust institution is built.
Should I instead be trying to figure out what a high trust global political institution looks like? i.e. how to build world government basically. Seems like a very old problem no one has cracked yet.
I have partial ideas on the question of “how to build world govt”. [1]
But in general yeah I still lack a lot of clarity on how high trust political institutions are actually built.
“Trust” and “attention” seem like the key themes that come up whenever I think about this. Aggregate attention towards common goal then empower a trustworthy structure to pursue that goal.
For example build decentralised social media stack so people can form consensus on political questions even if there is violence being used to suppress it. Have laws and culture in favour of live-streaming leader’s lives. Multi-party not two-party system will help. Ensuring weapons are distributed geographically and federally will help. (Distributing bioweapons is more difficult than distributing guns.)
IMO a good way to explain how LLMs work to a layman is to print the weights on sheets of paper and compute a forward pass by hand. Anyone wanna shoot this video and post it on youtube?
Assuming humans can do one 4-bit multiplication per second using a lookup table,
1.5B 4bit weights ⇒ ~1.5B calculations ⇒ 1.5B seconds = 47.5 years (working 24x7) = 133 years (working 60 hours/week)
So you’ll need to hire ~100 people for 1 year.
You don’t actually have to run the entire experiment for people to get the concept, just run a small fraction of it. Although it’ll be cool to run the whole thing as well.
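Here is the same arithmetic as a quick sketch, under the one-4-bit-multiplication-per-person-second assumption above.

```python
# Redoing the arithmetic above, assuming one 4-bit multiplication per
# person-second and roughly one multiplication per weight.
weights = 1.5e9
seconds = weights * 1.0                          # one second per multiplication

years_24x7 = seconds / (3600 * 24 * 365)         # ~47.5 years
years_60h_week = seconds / (3600 * 60 * 52)      # ~133 years

print(f"{years_24x7:.1f} years working 24x7")
print(f"{years_60h_week:.1f} years at 60 hours/week")
print(f"~{years_60h_week:.0f} person-years, i.e. ~100+ people for one year")
```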
Update: HTTPS issue fixed. Should work now.
booksearch.samuelshadrach.com
Books Search for Researchers
Project idea for you
Figure out why we don’t build one city with a population of one billion
- Bigger cities will probably accelerate tech progress, and other types of progress, as people are not forced to choose between their existing relationships and the place best for their career
- Assume end-to-end travel time must be below 2 hours for people to get benefits of living in the same city. Seems achievable via intra-city (not inter-city) bullet-train network. Max population = (200 km/h * 2h)^2 * (10000 people/km^2) = 1.6 billion people
- Is there any engineering challenge such as water supply that prevents this from happening? Or is it just the lack of any political elites with the willingness + engineering knowledge + control of sufficient funds?
- If a govt builds the bullet train network, can market incentives be sufficient to drive everyone else (real estate developers, corporate leaders, etc) to build the city or will some elites within govt need to necessarily hand-hold other parts of this process?
VR might be cheaper
I agree VR might be one-day be able to do this (make online meetings as good as in-person ones). As of 2025, bullet trains are more proven tech than VR. I’d be happy if both were investigated in more depth.
A few notes on massive cities:
Cities of 10Ms exist, there is always some difficulty in scaling, but scaling 1.5-2 OOMs doesn’t seem like it would be impossible to figure out if particularly motivated.
China and other countries have built large cities and then failed to populate them
The max population you wrote (1.6B) is bigger than China, bigger than Africa, similar to both American continents plus Europe.
Which is part of why no one really wants to build something so big, especially not at once.
Everything is opportunity cost, and the question of alternate routes matters a lot in deciding whether to pursue something. Throwing everything and the kitchen sink at something costs a lot of resources.
Given that VR development is currently underway regardless, starting this resource-intensive project, which may be made obsolete by the time it’s done, is an expected waste of resources. If VR hit a real wall that might change things (though see above).
If this giga-city would be expected to 1000x tech progress or something crazy then sure, waste some resources to make extra sure it happens sooner rather than later.
Tl;dr:
Probably wouldn’t work: there’s no demand, it’s very expensive, and VR is being developed anyway and would be able to do what you’re hoping for, but even better
It could be built in stages. Like, build a certain number of bullet train stations at a time and wait to see if immigrants + real estate developers + corporations start building the city further, or do the stations end up unused?
I agree there is opportunity cost. It will help if I figure out the approx costs of train networks, water and sewage plumbing etc.
I agree there are higher risk higher reward opportunities out there, including VR. In my mind this proposal seemed relatively low risk so I figured it’s worth thinking through anyway.
This is demonstrably false. Honestly the very fact that city rents in many 1st world countries are much higher than rural rents proves that if you reduced the rents more people would migrate to the cities.
Lower/Higher risk and reward is the wrong frame.
Your proposal is high cost.
Building infrastructure is expensive. It may or may not be used, and even if used it may not be worthwhile.
R&D for VR is happening regardless, so 0 extra cost or risk.
Would you invest your own money into such a project?
“This is demonstrably false. Honestly the very fact that city rents in many 1st world countries are much higher than rural rents proves that if you reduced the rents more people would migrate to the cities.”
Sure, there is marginal demand for living in cities in general. You could even argue that there is marginal demand to live in bigger vs smaller cities.
This doesn’t change the equation: where are you getting one billion residents—all of Africa? There is no demand for a city of that size.
If I were a billionaire I might.
I also have (maybe minor, maybe not minor) differences of opinion with standard EA decision-making procedures of assigning capital across opportunities. I think this is where our crux actually is, not on whether giant cities can be built with reasonable amounts of funding.
And sorry I won’t be able to discuss that topic in detail further as it’s a different topic and will take a bunch of time and effort.
Our crux is whether the amount of investment needed to build one has a positive expected return, breaking down into:
If you could populate such a city
Whether this is a “try everything regardless of cost” issue, given that a replacement is being developed for other reasons.
I suggest focusing on 1, as it’s pretty fundamental to your idea and easier to get traction on
1 is going to take a bunch of guesswork to estimate. Assuming it were possible to migrate to the US and live at $200/mo for example, how many people worldwide will be willing to accept that trade? You can run a survey or small scale experiment at best.
What can be done is expand cities to the point where no more new residents want to come in. You can expand the city in stages.
Definitely an interesting survey to run.
I don’t think the US wants to triple the population with immigrants, and $200/month would require a massive subsidy. (Internet says $1557/month average rent in US)
How many people would you have to get in your city to justify the progress?
100 Million would only be half an order of magnitude larger than Tokyo, and you’re unlikely to get enough people to fill it in the US (at nearly a third of the population, you’d need to take a lot of population from other cities)
How much do you have to subsidize living costs, and how much are you willing to subsidize?
If I understand correctly it is possible to find $300/mo/bedroom accommodation in rural US today, and a large enough city will compress city rents down to rural rents. A govt willing to pursue a plan as interesting as this one may also be able to increase immigrant labour to build the houses and relax housing regulations. US residential rents are artificially high compared to global average. (In some parts of the world, a few steel sheets (4 walls + roof) is sufficient to count as a house, even water and sewage piping in every house is not mandatory as long as residents can access toilets and water supply within walking distance.)
(A gigacity could also increase rents because it’ll increase the incomes of even its lowest income members. But yeah in general now you need to track median incomes of 1B people to find out new equilibrium.)
That dichotomy is not exhaustive, and I believe going through with the proposal will necessarily make the city’s inhabitants worse off.
Humans’ social machinery is not suited to live in such large cities, as of the current generations. Who to get acquainted with, in the first place? Isn’t there lots of opportunity cost to any event?
Humans’ biomachinery is not suited to live in such large cities. Being around lots and lots of people might be regulating hormones and behaviour to settings we have not totally explored (I remember reading something that claims this a large factor to lower fertility).
Centralization is dangerous because of possibly-handmade mass weapons.
Assuming random housing and examining some quirk/polar position, we’ll get a noisy texture. It will almost certainly have a large group of people supporting one position right next to group thinking otherwise. Depending on sizes and civil law enforcement, that may not end well.
After a couple hundred years, 1) and 2) will most probably get solved by natural selection so the proposal will be much more feasible.
Sorry I didn’t understand your comment at all. Why are 1, 2 and 4 bigger problems in 1 billion population city versus say a 20 million population city?
I’d maintain that those problems already exist in 20M-people cities and will not necessarily become much worse. However, by increasing city population you bring in more people into the problems, which doesn’t seem good.
Got it. I understood what you’re trying to say. I agree living in cities has some downsides compared to living in smaller towns, and if you could find a way to get the best of both instead it could be better than either.
GPT-5 launches tomorrow
Has anyone considered video recording streets around offices of OpenAI, Deepmind, Anthropic? Can use CCTV or drone. I’m assuming there are some areas where recording is legal.
Can map out employee social graphs, daily schedules and daily emotional states.
Did you mean to imply something similar to the pizza index?
If so, I think it’s a decent idea, but your phrasing may have been a bit unfortunate—I originally read it as a proposal to stalk AI lab employees.
Update: I’ll be more specific. There’s a “power buys you distance from the crime” phenomenon going on if you’re okay with using Google Maps data about their restaurant takeout orders, but not okay asking the restaurant employee yourself or getting yourself hired at the restaurant.
Pizza index and stalking employees are both the same thing, it’s hard to do one without the other. If you choose to declare war against AI labs you also likely accept that their foot soldiers are collateral damage.
I agree that (non-violent) stalking of employees is still a more hostile technique than writing angry posts on an internet forum.
Forum devs including lesswrong devs can consider implementing an “ACK” button on any comment, indicating I’ve read a comment. This is distinct from
a) Not replying—other person doesn’t know if I’ve read their comment or not
b) Replying something trivial like “okay thanks”—other person gets a notification though I have nothing of value to say
Marc Andreessen and Peter Thiel are taking actions that are pro-human extinction.
If you are playing the game of sucking up to Silicon Valley VCs, it is important you form an independent opinion on the question of extinction risk before you raise the funding.
If you support a free market, you should be against human genetic engineering. Large IQ gaps will enable hyperpersuasion, which in turn will make scamming people easier than selling them anything.
http://tokensfortokens.samuelshadrach.com
Pay for OpenAI API usage using cryptocurrency.
Currently supported: OpenAI o1 model, USDC on Optimism Rollup on ethereum.
Why use this?
- You want anonymity
- You want to use AI for cheaper than the rate OpenAI charges
How to use this?
- You have to purchase a few dollars of USDC and ETH on Optimism Rollup, and install Metamask browser extension. Then you can visit the website.
More info:
- o1 by OpenAI is the best AI model in the world as of Jan 2025. It is good for reasoning especially on problems involving math and code. OpenAI is partially owned by Microsoft and is currently valued above $100 billion.
- Optimism is the second largest rollup on top of ethereum blockchain. Ethereum is the second largest blockchain in terms of market capitalisation. (Bitcoin is the largest. Bitcoin has very limited functionality, and it is difficult to build apps using it.) People use rollups to avoid the large transaction fees charged by blockchains, while still getting similar level of security. As of 2025 users have trusted Optimism with around $7 billion in assets. Optimism is funded by Paradigm, one of top VCs in the cryptocurrency space.
- USDC is a stablecoin issued by Circle, a registered financial company in the US. A stablecoin is a cryptocurrency token issued by a financial company where the company holds one dollar (or euro etc) in their bank account for every token they issue. This ensures the value of the token remains $1. As of 2025, USDC is the world’s second largest stablecoin with $45 billion in reserves.
Quick note, not completely serious. Update: I am serious about it now. What if a group anonymously cyberattacked US AI labs, sold their AI capabilities (such as repos and model weights) to the Chinese labs, and also made public the information about the values and decisions of people inside the US labs?
Would this help with coordinating a pause? My guess is it will.
It will definitely help with ensuring the world is bipolar not unipolar that’s for sure.
I think it was bad actually that US post WW2 had the option to establish a nuclear monopoly and had so many supporters for this.
If I got $1M in funding, I’d use it towards some or all of the following projects.
The objective is to get secret information out of US ASI orgs (including classified information) and host it in countries outside the US. Hopefully someone else can use this info to influence US and world politics.
Black DAQ
whistleblower/spy guide
hacker guide
Grey DAQ
internet doxxing tool
drones/CCTV outside offices/datacentres
High attention
persuade Indian, Russian, Chinese journalists to run a SecureDrop-like system
digital journalism guide
OR run a journalist outlet outside the US myself, until I can persuade existing journalists to do better
All the DAQ will be aimed at the leadership and employees of the orgs involved in building ASI.
I’m not very optimistic on grey DAQ uncovering high-profile info, just that it could force the ASI company employees to isolate further from rest of society. I know many people have moral qualms about it but I don’t. I see it as more-or-less inevitable, and given that it is inevitable I’d rather have it work for everyone and against everyone, than let those with power alone decide who it gets used against.
More details
Whistleblower guide
Not bottlenecked
Will work on this
tldr, whistleblowers should focus on getting to Russia like Snowden did, instead of improving their opsec and hoping to stay anonymous
Hacker guide
Knowledge bottlenecked
I don’t know enough to offer them technical advice. Mostly I’ll offer moral support and maybe some legal advice
Internet doxxing tool
Weakly capital bottlenecked
I tried building this by doing embedding search and anomalous word counts on reddit extract of commoncrawl. This will likely work better as a two pass system, first pass use PII, second pass do stylometrics.
I need capital for more servers, and maybe to purchase some PII datasets similar to whitepages/weleakinfo/snusbase. I need to check what price points these datasets tend to get sold at.
Drones/CCTV outside offices / datacentres
Capital bottleneck
Need capital for lawyers, and for setting up the cameras
This is legal in US (not UK) but I need to check legal precedents on this
High attention guide
Weakly capital bottlenecked. Attention bottlenecked.
Journalists-by-training suck both at opsec and at becoming popular on the internet.
Opsec means things like advertising a SecureDrop-like system or a Signal number.
Becoming popular means things like understanding heavy-tailed distribution of attention and importance of building a brand around your face and understanding what readers want to read.
Journalists-by-training are being replaced by YouTubers across the US, Europe, India and Russia at least.
I’m unsure if I should be trying to teach existing journalists this stuff or just run a non-US journalist outlet myself. Having funding and a public brand will enable me to try both approaches.
Misc
If I had $1M I’d definitely select a lawyer with experience in international law.
If I had $1M I’d also spend at least $100/mo each on a security guard (as needed), a Chinese language teacher and a therapist.
I could fill this up with even more details if I had more time. Wanted to get a quick reaction.
I know most people on LW will be against this sort of plan for reasons I don’t have motivation to sit and critique right now (maybe go read my blog). I’m more interested in hearing from the handful of people who will be for it.
Why do you want to do this as a lone person rather than e.g. directly working with the intelligence service of some foreign adversary?
I think for a lot of societal change to happen, information needs to be public first. (Then it becomes common knowledge, then an alternate plan gets buy-in, then that becomes common knowledge and so on.)
A foreign adversary getting the info doesn’t mean it’s public, although it has increased the number of actors N who now have that piece of info in the world. Large N is not stable so eventually the info may end up public anyway.
I’m selling $1000 of tier-5 OpenAI credits at a discount. DM me if interested.
You can video call me and all my friends to reduce the probability that I end up scamming you. Or vice versa, I can video call your friends. We can do the transaction in tranches if we still can’t establish trust.
Update: Ryan Greenblatt is right I messed up the numbers, serial speedup as of 2025 LLMs is closer to 100x than 30,000x. Steinhardt says forward pass per layer is 1-10 microseconds, which still means forward pass for entire transformer is 1-10 milliseconds.
Prediction: Serial speedup of LLMs is going to matter way more than parallel speedup
Definition: Serial speedup means running LLM forward passes faster. Parallel speedup means running more copies of the LLM in parallel. Both are paths that allow the total system to produce more output than an individual LLM.
Disclaimer
For now, let’s measure progress in a domain where candidate solutions can be verified quickly and cheaply.
Assume fast means less than 1 second of wall clock time. Cheap means less than $0.01 per experiment.
Examples of domains where each “experiment” is fast and cheap: pure math, software, human persuasion, (maybe) AI research, (maybe) nanotechnology
Examples of domains where each experiment is expensive: Particle colliders in experimental particle physics (can cost >$1M per run), cloning experiments in biotech ($100 per run)
Examples of domains where each experiment is slow: Spaceflight (each launch takes years of planning), Archaeology (each excavation takes years), etc
The latter domains will also speed up of course, but it complicates the analysis to also consider the speed and cost of each lab experiment.
Why does serial speedup matter more?
Verifiers are a bottleneck
Ultimately no matter how many ideas you search through in your mind, the output is always a decision for the next lab experiment you want to run. You can’t zero-shot perfect understanding of the universe. You can however, be way more time-/cost-/sample-efficient than humans at figuring out the next experiment to run that helps you learn the most about the world.
New ideas build on top of old ideas. Parallel is like generating lots of new ideas, and then waiting to submit them to a verifier (like a lab experiment). Series is like generating an idea, verifying it, generating another, verifying another.
Empirical evidence: Scientific progress throughout history seems to be accelerating instead of growing linearly, as we make more and more domains verifiable (by inventing instruments such as an electron microscope or cyclotron or DNA sequencer etc)
Multi-year focus is rare
Most humans half-ass tasks, get distracted, give up etc. Once people get good “enough” at a task (to get money, sex, satisfy curiosity, etc), they stop trying as hard to improve
(Maybe) Empirical evidence: If you spend even 10 years of your life consistently putting effort to improving at a task, you can probably reach among the top 1000 people on Earth in that task.
The primary reason I’m not a top-1000 guitarist or neuroscientist or politician is because I don’t care enough to put in the hours. My brain structure is likely not that different from the people who are good at the task, I probably have the basic hardware and the algorithms required to get good. Sure, I will maybe not reach the level of Magnus Carlsen with hard work alone, but I could improve a lot with hard work.
Humans only live <100 years, we don’t really know how much intellectual progress is possible if a human could think about a problem for 1000 years for example.
Empirical evidence: We know that civilisations as a whole can survive for 1000 years and make amounts of progress that are unimaginable at the start. No one in year 0 could have predicted year 1000, and no one in year 1000 could have predicted year 2000.
RL/inference scales exponentially
RL/inference scaling grows exponentially in cost, as we all know from log scaling curves. 10x more compute for RL/inference scaling means log(10) more output.
Paralleling humans is meh
Empirical evidence: We don’t have very good evidence that a country with 10x population produces 10x intellectual output. Factors like culture may be more important. We do have lots of obvious evidence that 10 years of research produces more output than 1 year, and 100 years produces more than 10 years.
It is possible this is similar to the RL/inference scaling curve, maybe 10x more researchers means log(10) more output.
How much serial speedup is possible?
Jacob Steinhardt says an LLM forward pass can be brought down to below 10 microseconds per token, i.e. over 100,000 tokens per second.
A human speaks at 100-150 words per minute, or around 3 tokens per second. That is roughly 30,000x slower.
You could maybe argue that the human thought stream actually runs faster than that, and that we think faster than we speak.
At 30,000x speedup, the AI experiences 100 simulated years per day of wall clock time, or 3,000,000 years in 100 years of wall clock time. If you gave me 3,000,000 years to live and progress some field, it is beyond my imagination what I would end up doing.
Even assuming only a 100x speedup, the AI experiences 10,000 simulated years per 100 years of wall clock time. Even if you gave me 10,000 years to progress some field, it is beyond my imagination what I would do. (Remember that 100x is way too conservative; 30,000x is closer to the actual reality of machine learning hardware.)
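For anyone who wants to check the arithmetic, here is the back-of-the-envelope calculation using the figures quoted above:

```python
# Back-of-the-envelope check of the numbers in this post
# (3 tokens/sec for a human, 100,000 tokens/sec for the AI).

human_tokens_per_sec = 3          # ~100-150 words per minute of speech
ai_tokens_per_sec = 100_000       # ~10 microseconds per token

speedup = ai_tokens_per_sec / human_tokens_per_sec
print(f"speedup: ~{speedup:,.0f}x")                     # ~33,333x, the "30,000x" above

# One wall-clock day at this speedup is `speedup` subjective days.
subjective_years_per_day = speedup / 365
print(f"subjective years per wall-clock day: ~{subjective_years_per_day:.0f}")   # ~91

# A century of wall-clock time becomes millions of subjective years.
print(f"subjective years per 100 wall-clock years: ~{speedup * 100:,.0f}")       # ~3.3 million
```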
P.S. On thinking more about it, if you gave me 3 million simulated years per 100 years of wall clock time, I might consider this situation worse than death.
I will have to wait half a day per second of wall clock time, or multiple days to move my finger. So my body is as good as paralysed, from the point of view of my mind. Yes I can eventually move my body, but do I want to endure the years of simulated time required to get useful bodily movements? This is basically a mind prison.
Also everybody around me is still too slow, so I’m as good as the only person alive. No social contact will ever be possible.
I could set up a way to communicate with a computer using eye movements or something, if I could endure the years of living in the mind prison required to do that.
The number one thing that would end the eternal torment would be being able to communicate with another being (maybe even my own clone) that runs at a speed similar to mine. Social contact would help.
Does this idea accelerate capabilities? (Someone might put more money into doing serial speedup after reading my post.)
Does it accelerate convincing people about AI risk? (Makes it more intuitive to visualise ASI, Yudkowsky uses similar metaphors to describe ASI)
I have honestly no idea.
(Warning: This might accelerate capabilities.)
Idea inspired by o3 deep research API
Is there a way to put all outputs of previous o3 calls into a searchable database (MCP server) which can then be called by o3 again?
Search algo used can be embedding search or BM25 or a mix of both.
If you’re a large company, you can aggregate many users’ outputs into one global, public, shared DB.
A searchable DB might be a better way to implement memory than the current approach of stuffing everything into the context window.
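A minimal sketch of what I mean (the file name, function names and tokenisation are all made up; in practice this would sit behind an MCP server, and you’d probably mix in embedding search as well):

```python
# Minimal sketch (names invented): persist every model output, then let later
# calls search the store with BM25. Embedding search could be mixed in for
# better recall.

import json
from pathlib import Path

from rank_bm25 import BM25Okapi  # pip install rank-bm25

STORE = Path("o3_outputs.jsonl")  # hypothetical append-only store of past outputs

def save_output(prompt: str, output: str) -> None:
    """Append one (prompt, output) pair after every model call."""
    with STORE.open("a") as f:
        f.write(json.dumps({"prompt": prompt, "output": output}) + "\n")

def search_outputs(query: str, k: int = 5) -> list[str]:
    """Return the k past outputs that best match the query under BM25."""
    records = [json.loads(line) for line in STORE.open()] if STORE.exists() else []
    if not records:
        return []
    docs = [r["prompt"] + "\n" + r["output"] for r in records]
    bm25 = BM25Okapi([d.lower().split() for d in docs])
    scores = bm25.get_scores(query.lower().split())
    ranked = sorted(zip(scores, docs), key=lambda x: x[0], reverse=True)
    return [doc for _, doc in ranked[:k]]

# Usage: stuff the top hits into the next call's context instead of the whole history.
save_output("survey of RL scaling laws", "placeholder output text")
print(search_outputs("scaling laws for RL", k=3))
```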
(It seems slightly nicer to be self-aware of when you’re posting capabilities ideas, but, insofar as it’s actually novel and useful, the damage is mostly done)
Hmm. You might be right. Maybe I shouldn’t have posted.
Pay for the OpenAI API using crypto. Use USDC on the Optimism rollup on Ethereum.
(Worst case if you’re scammed you lose less than $0.10)
http://188.245.245.248:3000/sender.html
This post looks like a scam. The URL it contains looks like a scam. Everything about it looks like a scam. Either your account was hijacked, or you’re a scammer, or you got taken in by a scam, or (last and least) it’s not a scam.
If you believe it is not a scam, and you want to communicate that it is not a scam, you will have to do a great deal more work to explain exactly what this is. Not a link to what this is, but actual text, right here, explaining what it is. Describe it to people who, for example, have no idea what “USDC”, “Optimism”, or “rollup” are. It is not up to us to do the research, it is up to you to do the research and present the results.
I’ve made a new “quick take” explaining it. Please let me know.
P.S. Anybody can purchase a domain for $10; I don’t see why domains should be more trustworthy than IP addresses. Anyway, I’ve put it on my domain now.