We have finally solved an age old problem in philosophy:
Gemini 3 pro is 1.2 cents per thousand tokens.
Gemini 3 pro image is 13.4 cents per image.
Therefore an image is worth 11167 words, not 1000 as the classicists would have it.
A single token is ~0.75 words, so it’s more like an image is worth 8375 words.
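The arithmetic above can be checked with a few lines of Python. This is only a sketch using the prices quoted in the post (1.2 cents per thousand tokens, 13.4 cents per image) and the ~0.75 words-per-token rule of thumb; the function name is made up for illustration.

```python
def image_worth_in_words(cents_per_image, cents_per_1k_tokens, words_per_token):
    """Return (tokens, words) that cost the same as one image."""
    # Tokens you could buy for the price of a single image.
    tokens = cents_per_image / cents_per_1k_tokens * 1000
    # Convert tokens to words with the rough words-per-token ratio.
    words = tokens * words_per_token
    return tokens, words


# Prices and ratio as quoted in the post, not checked against a price list.
tokens, words = image_worth_in_words(13.4, 1.2, 0.75)
print(round(tokens), round(words))  # 11167 8375
```

The first number treats a token as a word; the second applies the 0.75 correction.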
Makes total sense: AI images are higher resolution than classical pictures (which are limited by the dexterity of the painter), so you’re basically getting 11.17 pictures in each.
Pricing is linear in tokens even though the actual compute cost grows roughly quadratically with context length. That means the pricing is some approximate curve fit to expected use. I would be curious about where the actual cost curve for tokens intersects the actual cost curve for a single image.
Gotta account for wordflation since the old days. Might have been 1000 back then
Ah, but I can embed 11168 words in an image!
People sometimes say things like “I bribed my child to have an injection with a packet of crisps”.
This is interesting because this clearly isn’t a bribe—it’s a straightforward deal: I got to vaccinate my child, you got a packet of crisps, we’re both better off.
A bribe is only possible when someone is representing someone else’s interests. Then you cut a deal where they abuse their responsibility in return for some personal benefit to them.
So why do people use the term? My guess is it’s because it feels dirty since crisps aren’t healthy, and bribery has been extended to mean any deal that feels immoral.
Or maybe it’s because they feel they shouldn’t have to give the child anything for them to have an injection, since the injection is for the child’s sake, and a bribe is frequently an extortion you shouldn’t have to pay.
Bribery can happen when it’s already someone’s duty to do something, but they refuse to do it until they’re paid extra. The parent may think the child has a duty to accept the vaccination; either arising from the child’s self-interest, from the public interest (or categorical imperative), or from a duty to obey authority. By accepting the vaccination only when paid off with a snack, the child is acting not from duty but from desire for the snack. Thus, the child does not learn the habit of acting from duty.
Among the parents I know, the issue isn’t that kids have a “duty” to get vaccinated. That would imply a critique of their own child that’s not at all implied when they talk about “bribery.”
The word “bribe” in this context has two implications.
They are acknowledging that using external incentives to extract compliance from their kids risks corrupting their intrinsic motivation. They call it a “bribe” to humorously emphasize to other parents that they’re aware of this issue, and that the “bribe” is the exception, not the rule.
They are pointing out that shots are scary and painful, while the benefits are hard for a kid to understand. When you’re forcing your kid through that, it shows love and caring to boost their morale with a treat. This lets them be seen as a caring yet responsible parent, who cares both about their kid’s long-term health and their short-term feelings.
“Bribe” is a convenient, one-syllable word that’s become a widely understood shorthand among parents for precisely this combination of meanings. It lets people imply a deep philosophy of parenting while also seeming funny and with it among their parent friends. The fact that it’s an exaggeration and not literally apt is part of the charm.
Could it also be understood in the sense that the child-in-the-moment is representing the child-overall? The fair transaction between adult (who wants to be a good parent) and the child-overall (who wants to be healthy) is for them to just cooperate and make the injection happen. But the child-overall’s middle-man, the child-in-the-moment, has bargaining power and wants to use this for some personal benefit (crisps)?
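The definition of “bribe” from Google is
persuade (someone) to act in one’s favor, typically illegally or dishonestly, by a gift of money or other inducement.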
I don’t think it necessarily means the person being bribed “is representing someone else’s interests,” although this is often the case, like when bribing police or politicians. I think “making a deal that feels immoral” is a good loose definition of the word “bribe.”
To me, the deal with the child to get a packet of crisps in return for being vaccinated doesn’t feel immoral, and I don’t think that all people who use the word bribe in that context would say that either they or the child acted immorally.
If many native speakers use a word in a way that you think is wrong, you are probably misunderstanding them. They are probably using a different definition of bribe than you.
I feel like many folks use ‘bribe’ to just mean any positive but nonstandard reward for an action.
Most murder mysteries on TV tend to have a small number of suspects, and the trick is to find which one did it. I get the feeling that in real-life murders the police either have absolutely no idea who did it, or know exactly who did it and just need to prove it to the satisfaction of a court of law.
That explains why forensic tests (e.g. fingerprints) are used despite being pretty suspect. They convince the jury that the guilty guy did it, which is all that matters.
See https://issues.org/mnookin-fingerprints-evidence/ for more on fingerprints.
Has anyone looked into the recent Chinese paper claiming to have reversed aging in monkeys?
Is it real or BS?
https://www.cell.com/cell/abstract/S0092-8674(25)00571-9?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0092867425005719%3Fshowall%3Dtrue
I’ll do a review if someone uploads it to sci-hub.
Apparently BS: https://x.com/cremieuxrecueil/status/1975057814696677733
More details here https://x.com/cremieuxrecueil/status/1975079984776577153
(on the blue-red discourse)
We’ve finally created the scissor statement from the classic blog post, don’t create the scissor statement!
It seems LLMs are less likely to hallucinate answers if you end each question with ‘If you don’t know, say “I don’t know”’.
They still hallucinate a bit, but less. Given how easy it is, I’m surprised OpenAI and Microsoft don’t already do that.
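For anyone who wants to try this, a minimal sketch of the trick is just a string helper that appends the instruction to every question. The helper name is made up for illustration, and nothing here calls a real model API.

```python
# Suffix wording taken from the post above.
IDK_SUFFIX = 'If you don\'t know, say "I don\'t know".'


def with_idk_suffix(question: str) -> str:
    """Append the 'say I don't know' instruction to a question."""
    # Strip trailing whitespace so the suffix is separated by a single space.
    return f"{question.rstrip()} {IDK_SUFFIX}"


print(with_idk_suffix("Who won the 1987 village chess open in my town?"))
```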
Has its own failure modes. What does it even mean not to know something? It is just yet another category of possible answers.
Still a nice prompt. Also works on humans.
Quick thoughts on Gemini 3 pro:
It’s a good model sir. Whilst it doesn’t beat every other model on everything, it’s definitely pushed the Pareto frontier a step further out.
It hallucinates pretty badly. ChatGPT 5 did too when it was released; hopefully they can fix this in future patches and it’s not inherent to the model.
To those who were hoping/expecting us to have hit a wall: that clearly hasn’t happened yet (although neither have we proved that LLMs can take us all the way to AGI).
Costs are slightly higher than 2.5-pro, much higher than gpt 5.1, and none of Google’s models have seen any price reduction in the last couple of years. This suggests that it’s not quickly getting cheaper to run a given model, and that pushing the Pareto frontier forward is costing ever more in inference. (However, we are learning how to get more intelligence out of a fixed size with newer small models.)
I would say Google currently has the best image models and the best LLM, but that doesn’t prove they’re in the lead. I expect OpenAI and Anthropic to drop new models in the next few months, and Google won’t release a new one for another 6 months at best. Its lead is not strong enough to last that long.
However, we can firmly say that Google is capable of creating SOTA models that give OpenAI and Anthropic a run for their money, something many were doubting just a year ago.
Google has some tremendous structural advantages:
independent training and inference stack with TPUs, JAX, etc. It is possible they can do ML at a scale and price point no one else can achieve.
trivial distribution. If Google comes up with a good integration they have dozens of products where they can instantly push it out to hundreds of millions of people (monetising is a different question).
deep pockets. No immediate need to generate a profit, or beg investors for money.
lots of engineers. This doesn’t help with the core model, but does help with integrations and RLHF.
Now that they’ve proven they can execute, they should likely be considered frontrunners for the AI race.
On the other hand, ChatGPT has much greater brand recognition, and LLM usage is sticky. Things aren’t looking great for Anthropic though, with neither deep pockets nor high usage.
In terms of existential risk: this is likely to make the race more desperate, which is unlikely to lead to good things.
Running trains more frequently can reduce reliability:
Consider a train line that takes 200 minutes to travel. Assume trains break down once every hundred journeys, and take 4 hours to clear. When a train breaks down, no other trains can pass it.
Now consider the two extremes:
If there’s one train every 2 minutes the train line will essentially always have one broken train, and travelling the line will likely take about 7+ hours.
Meanwhile if there’s just 1 train going back and forth you’ll have 1 delayed train every month, which will delay people for 4 hours. You’re still better off in this scenario than the previous one.
The sweet spot in terms of average transit time is closer to a train every 2 minutes than a train every 400 minutes, but the sweet spot for predictability of the service will have fewer, more reliable trains.
As anecdotal evidence, I noticed that the Northern Line frequently had breakages while running a train every 2-6 minutes, whereas Israel Railways very rarely has breakages and runs a train twice an hour on my line.
This all points to both investing a lot of effort into train reliability and running fewer, longer trains.
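A rough way to put numbers on the two extremes is sketched below. It assumes journeys start every `headway_min` minutes while the line is open, each journey independently breaks down with probability 1 in 100, nothing runs while a breakdown is being cleared, and clearing always takes 240 minutes. The function names are made up for illustration, and this is a back-of-envelope model rather than a proper simulation.

```python
def mean_running_minutes(headway_min, p_breakdown=0.01):
    """Average minutes of normal running between two breakdowns."""
    # One journey starts every headway_min minutes, and 1 in 1/p of them fails.
    return headway_min / p_breakdown


def blocked_fraction(headway_min, p_breakdown=0.01, clear_min=240):
    """Long-run fraction of time the line is blocked by a broken train."""
    running = mean_running_minutes(headway_min, p_breakdown)
    # Each cycle is a stretch of normal running followed by one clearance.
    return clear_min / (clear_min + running)


# A train every 2 minutes: a breakdown roughly every 200 running minutes,
# which is less than the 240 minutes needed to clear one.
print(mean_running_minutes(2), round(blocked_fraction(2), 3))      # 200.0 0.545

# One train shuttling back and forth: one journey per 200 minutes.
print(mean_running_minutes(200), round(blocked_fraction(200), 3))  # 20000.0 0.012
```

Under these assumptions the single train breaks down about once per 20,000 running minutes (roughly 333 hours), which matches "once a month" only if it runs around 11 hours a day.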
I don’t think it’s accurate to model breakdowns as a linear function of journeys or train-miles unless irregular effects like extreme weather are a negligible fraction of breakdowns.
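So far Claude 3.7 is the only non-reasoning model I’ve tried that answers this correctly. All reasoning models did as well.
Consider a version of the Monty Hall problem where the host randomly picks which of the 2 remaining doors to open. It reveals a goat. What should you do?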
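The puzzle in question is the Monty Hall variant where the host opens one of the two remaining doors at random and it happens to reveal a goat. A quick Monte Carlo check is sketched below: it throws away the runs where the random host reveals the car and compares staying with switching on the rest. Under this reading, conditioning on a random goat reveal leaves both unopened doors equally likely, so both strategies should win about half the time. The function name and parameters are made up for illustration.

```python
import random


def simulate_random_host(trials=100_000, seed=0):
    """Return (stay_win_rate, switch_win_rate) given the random host showed a goat."""
    rng = random.Random(seed)
    stay_wins = switch_wins = kept = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        others = [door for door in range(3) if door != pick]
        opened = rng.choice(others)  # host picks at random, not knowingly
        if opened == car:
            continue  # host revealed the car: not the scenario in the puzzle
        kept += 1
        remaining = next(door for door in others if door != opened)
        stay_wins += pick == car
        switch_wins += remaining == car
    return stay_wins / kept, switch_wins / kept


stay, switch = simulate_random_host()
print(f"stay: {stay:.3f}, switch: {switch:.3f}")  # both close to 0.5
```

This differs from the classic version, where the host knowingly avoids the car and switching wins two thirds of the time.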
Fwiw this is the kind of question that has definitely been answered in the training data, so I would not count this as an example of reasoning.
I expected so, which is why I was surprised they didn’t get it.
Anthropic is calling it a “hybrid reasoning model”. I don’t know what they mean by that.
Fun fact I just discovered: Asian elephants are actually more closely related to woolly mammoths than they are to African elephants!
And they call this AGI!
https://g.co/gemini/share/0194f36c58af
Reserve soldiers in Israel are paid their full salaries by national insurance. If they are also able to work (which is common, as the IDF isn’t great at efficiently using its manpower), they can legally work and will get paid by their company on top of whatever they receive from national insurance.
Given how often sensible policies aren’t implemented because of their optics, it’s worth appreciating those cases where that doesn’t happen. The biggest impact of a war on Israel is to the economy, and anything which encourages people to work rather than waste time during a war is a good policy. But it could so easily have been rejected because it implies soldiers are slacking off from their reserve duties.