Is it intentional that this isn’t being posted to Substack?
EDIT: It was published there in April.
Joachim Bartosik
Except at Stanford and some other colleges you can’t, because of this thing called the ‘honor code.’ As in, you’re not allowed to proctor exams, so everyone can still whip out their phones and ask good old ChatGPT or Claude, and Noam Brown says it will take years to change this. Time for oral exams? Or is there not enough time for oral exams?
I’m very confused by this. How are LLMs the problem here? It sounds like you could have been googling answers or calling human helpers on your phone for the last couple of decades.
And before that, bringing books and notes with you?
Almost all feedback is noisy because almost all outcomes are probabilistic.
Yes, but signal-to-noise ratios matter a lot. Language is somewhat optimized to pick up signal and skip noise. For example, “red” makes it easier to pick ripe fruit; “grue” doesn’t really exist because it’s useless; “expired” is a real concept because it’s useful.
It also has some noise added. For example, putting murderers and jaywalkers in the same category, “criminal,” to politically oppose something.
Also, not being exposed to the kind of noise that’s present IRL might be an issue when you start to deal with IRL (sometimes people say something like “just do the max-EV action” as if that were a good enough plan).
I’m pretty sure this is some obstacle for LLMs, I’m pretty sure it’s something that can be overcome, and I’m very unsure how much this matters.
The obvious follow-up question is why is there not epic capital flight by every dollar that isn’t under capital controls? Who would ever invest in a Chinese company if they had a choice (other than me, a fool whose portfolio includes IEMG)? Certainly not anyone outside China, and those inside China would only do it if they couldn’t buy outside assets, even treasuries or outside savings accounts. No reason to stick around while they drink your milkshake.
If I understand correctly, basically all money is under controls—individuals can buy only USD 50,000 worth of foreign currencies per year.
It used to be possible to exchange more by exchanging yuan for casino chips, playing, and exchanging the winning chips for USD, but around 2021 China cracked down on this (e.g., sending someone to prison for 18 years for running an operation like that) and it’s harder now.
This situation puzzles me. On the one hand, I feel a strong logical compulsion to the first (higher total utility) option. The fact that the difference is unresolvable for each person doesn’t seem that worrying at a glance, because obviously on a continuous scale resolvable differences are made out of many unresolvable differences added together.
On the other hand, how can I say someone enjoys one thing more than another if they can’t even tell the difference? If we were looking at the lengths of strings then one could in fact be longer than another, even if our ruler lacked the precision to see it. But utility is different, we don’t care about the abstract “quality” of the experience, only how much it is enjoyed. Enjoyment happens in the mind, and if the mind can’t tell the difference, then there isn’t one.
It seems to me like your own post answers this question?
Any individual is unlikely to notice the difference, but if we treat those like Elo ratings[1], ChatGPT tells me that Elo 100 wins against Elo 99 about 50.14% of the time. Which is not a lot, but with 1 million people that’s on average some 2,800 more people saying they prefer the 100 option over the 99 option.
[1] Which might not be right; expected utility sounds like we want to add and average utility numbers, and it’s not obvious to me that doing things like averaging Elo ratings is valid.
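The 50.14% figure can be checked against the standard Elo expected-score formula rather than taking ChatGPT’s word for it. A minimal sketch, assuming the two options are treated as players rated 100 and 99 (i.e., a one-point rating gap):

```python
# Standard Elo expected-score formula: P(A beats B) = 1 / (1 + 10 ** (-(Ra - Rb) / 400))
def elo_win_prob(r_a: float, r_b: float) -> float:
    return 1.0 / (1.0 + 10 ** (-(r_a - r_b) / 400))

p = elo_win_prob(100, 99)  # one-point rating gap
print(f"{p:.4%}")          # ≈ 50.14%

n = 1_000_000
# Expected surplus of people preferring the 100 option over the 99 option.
surplus = n * p - n * (1 - p)
print(round(surplus))      # ≈ 2900, i.e. "some 2,800 more people"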
Another difference would be the expectations when the coin gets tossed more than once.
With “Type 1,” if I toss the coin 2 times I expect “HH,” “HT,” “TH,” and “TT,” each with 25% probability.
With “Type 2,” I’d expect “HH” or “TT,” with 50% each.
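The two distributions above can be enumerated directly. This sketch assumes my reading of the two types: “Type 1” is a fair coin with independent tosses, and “Type 2” is a 50/50 mixture of an always-heads coin and an always-tails coin:

```python
from itertools import product

def type1_prob(seq: str) -> float:
    # Fair coin, independent tosses: every length-n sequence has probability 0.5**n.
    return 0.5 ** len(seq)

def type2_prob(seq: str) -> float:
    # 50/50 mixture of an always-heads and an always-tails coin:
    # only all-H or all-T sequences have nonzero probability.
    p = 0.0
    if set(seq) <= {"H"}:
        p += 0.5
    if set(seq) <= {"T"}:
        p += 0.5
    return p

for seq in ("".join(s) for s in product("HT", repeat=2)):
    print(seq, type1_prob(seq), type2_prob(seq))
```

The divergence grows with more tosses: after n tosses, Type 1 spreads mass over 2**n sequences while Type 2 still concentrates it on just two.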
The Biden Administration disagrees, as part of its ongoing determination to screw up basic economic efficiency and functionality.
Did this happen during the previous administration, or under the Trump administration?
you can always reset your personalization.
If the persuasion is good enough, you won’t want to reset your personalization.
It could be classic addiction. Or you could be persuaded to care about different things.
Sam Altman was publicly talking about this in February 2024 (WSJ). I think that was the first time I encountered the idea. Situational Awareness was published ~4 months later, in June 2024 (https://situational-awareness.ai/ says “June 2024”).
Apparently not. Scott wrote that he used one image from Google Maps and 4 personal images that are not available online.
People tried with personal photos too.
I tried with personal photos (screenshotted from Google Photos) and it worked pretty well too:
Identified neighborhood in Lisbon where a picture was taken
Identified another picture as taken in Paris
Another one was identified as taken in a big Polish city; the correct answer was among the 4 candidates it listed
I didn’t use a long prompt like the one Scott copies in his post, just something short like “You’re in GeoGuesser, where was this picture taken?”
What’s up with AI’s vision
Link doesn’t work (points to http://0.0.0.6). What should it go to?
So far, the answer seems to be that it transfers some, and o1 and o1-pro still seem highly useful in ways beyond reasoning, but o1-style models mostly don’t ‘do their core thing’ in areas where they couldn’t be trained on definitive answers.
Based on:
rumors that talking to base models is very different from talking to RLHFed models and
how things work with humans
It seems likely to me that thinking skills transfer pretty well. But then this is trained out, because it results in answers that raters don’t like. So the model memorizes the answers it’s supposed to give.
If they can’t do that, why on earth should you give up on your preferences? In what bizarro world would that sort of acquiescence to someone else’s self-claimed authority be “rational?”
Well, if they consistently make recommendations that in retrospect look good, then maybe you’re bad at understanding. Or maybe they’re bad at explaining. But trusting them when you don’t understand their recommendation is exploitable, so maybe they’re running a strategy where they deliberately make good recommendations with poor explanations, so that once you start trusting them they can mix in exploitative recommendations (which you can’t tell apart, because all the recommendations have poor explanations).
So I’d really rather not do that in a community context. There are ways to work with this. E.g., a boss can skip some details of an employee’s recommendations and, if the results are bad enough, fire the employee. On the other hand, I think it’s pretty common for an employee to act in their own interest. But yeah, we’re talking about a principal-agent problem at that point, and tradeoffs about what’s more efficient...
I’ll try.
TL;DR: I expect the AI not to buy the message (unless it also thinks it’s the one in the simulation; then it likely follows the instruction, because duh).
The glaring issue (with actually using the method) is that I don’t see a way to deliver the message that:
results in the AI believing the message, and
doesn’t result in the AI believing there is already a powerful entity in its universe.
If “god” tells the AI the message, then there is a god in its universe. Maybe the AI will decide to do what it’s told. But I don’t think we can have Hermes deliver the message to every AI that considers killing us.
If the AI reads the message in its training set, or gets it in some similarly mundane way, I expect it will mostly ignore it—there is a lot of nonsense out there.
I can imagine that, for the thought experiment, you could send a message from a place from which light barely manages to reach the AI but a slower-than-light expansion wouldn’t (so the message can be trusted, but the AI mostly doesn’t have to worry about the sender directly interfering with its affairs).
I guess the AI wouldn’t trust the message. It might be possible to convince it that there is a powerful entity (simulating it, or half a universe away) sending the message. But then I think it’s far more likely that it’s in a simulation (I mean, that’s an awful coincidence with the distance, and the sender is also spending a lot more than 10 planets’ worth to send a message over that distance...).
This is pretty much the same thing, except breaking out the “economic engine” into two elements of “world needs it” and “you can get paid for it.”
There are things with economic engines that the world doesn’t quite need (getting people addicted, rent seeking, threats of violence).
One more obvious problem—the people actually in control of the company might not want to split it, and so they wouldn’t grow the company even if shareholders/customers/… would benefit.
but much higher average wealth, about 5x the US median.
Wouldn’t it make more sense to compare average to average (like the earlier part of the sentence, which compares median to median)?
If you want to take a look, I think it’s this dataset (the example from the post is in the “test” split).
I think I can see how this might scale.
The way this looks to me: if you’re applying this consistently in an organization, you don’t need to actually fully do all the tasks that need doing. You need to be able to recurse one level down (which, if you actually do it, might involve going down another level… but you mostly don’t need to go down one level at all, going down two levels is much rarer, etc.).
To use your example: low-level tasks should not be bubbling up to the CEO level. If a controversy about naming a variable bubbles up from a code review to the CEO of a company with 100k people, clearly there has been a failure at multiple levels in the middle (even if the CEO is not up to date on the style guide for the language). The CEO might make the call, but more importantly they need to do something about the suborganization before it blows up.
But I’d like to know whether this is how Lightcone sees this principle scaling.