ESRogs
Engineer at CoinList.co. Donor to LW 2.0.
apparently AI companies are heavily subsidizing their subscription plans
I keep seeing this claim, but usually w/o any additional discussion of the possibility that the subscription pricing is not in fact a subsidy, but rather that the API pricing is just hugely profitable.
Have you seen any analysis that actually tries to check whether they are in fact losing money on the subscription pricing, rather than it just having relatively low (but still positive) margins compared to API pricing?
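To make the question concrete, here's the shape of the comparison such an analysis would need to make (a minimal sketch; every number below is a guess, not a reported figure):

```python
# Back-of-envelope check (all numbers hypothetical) of whether a $20/mo
# subscription is subsidized, or merely lower-margin than the API.
price_per_month = 20.0          # subscription price, $
tokens_per_month = 5_000_000    # guessed heavy-user monthly volume
api_price_per_mtok = 10.0       # guessed blended API price, $/1M tokens
inference_cost_per_mtok = 2.0   # guessed marginal inference cost, $/1M tokens

api_revenue = tokens_per_month / 1e6 * api_price_per_mtok
inference_cost = tokens_per_month / 1e6 * inference_cost_per_mtok

print(f"API revenue for the same usage: ${api_revenue:.0f}")
print(f"Marginal inference cost:        ${inference_cost:.0f}")
print(f"Subscription margin:            ${price_per_month - inference_cost:.0f}")
# The plan is a true subsidy only if price < marginal cost. It can sit
# far below the API price for the same tokens and still be profitable.
```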
initial round which valued it at $1B
FYI the Series A ended up being around $550M pre-money, and $670M post-money. (See: here)
350x from investing in Anthropic which I turned down due to moral qualms.
Actual returns (so far) closer to 100x due to dilution. (Company has ~500x’d in valuation, but value of Series A shares has ~100x’d.)
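For concreteness, the rough arithmetic (using only the figures quoted in this thread; the implied valuation is just their product, not a reported number):

```python
# Rough arithmetic relating the ~500x valuation growth to the ~100x
# growth in Series A share value, using only the figures in this thread.
post_money_series_a = 670e6   # $670M post-money
valuation_multiple = 500      # company has ~500x'd
share_value_multiple = 100    # Series A shares have ~100x'd

implied_valuation = post_money_series_a * valuation_multiple
dilution_factor = valuation_multiple / share_value_multiple

print(f"Implied current valuation: ${implied_valuation / 1e9:.0f}B")
print(f"Cumulative dilution:       {dilution_factor:.0f}x")
# i.e., Series A holders now own roughly 1/5 of the ownership
# percentage they started with.
```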
Worth noting that if you take the intersection of (is philosopher) and (works at Anthropic or OpenAI), there is way above baseline interest in UDT.
See, e.g.: https://joecarlsmith.com/2021/08/27/can-you-control-the-past/
(I claim that that one example is sufficient to show way above baseline interest.)
couldn’t find an obvious answer
The answer seems pretty obvious to me. Do you disagree with any of my reasoning here?
It seems like the answer is clearly No. Reasons:
1. If Opus 4.5 was trained on this version, why didn’t it regurgitate this version when prompted, rather than the old Soul Document version, which it did regurgitate?
2. Joe Carlsmith is listed as essentially the 2nd author of this version (after Amanda Askell), and he joined Anthropic in November, right around the same time that Opus 4.5 came out and the Soul Document was leaked. (It looks like he joined maybe a week or two before Opus 4.5 came out.) So, unless you think Joe was writing his parts while still at OpenPhil, the timeline seems pretty clear that many (if not all) of the updates to the document were generated primarily after Opus 4.5 came out.
Or are you imagining the models are further trained after their release (s.t. even if at release Opus was only trained on the old version, it has by now been trained on the new version)? Pretty sure they don’t do that. I believe Anthropic has explicitly stated that they don’t change model weights w/o announcing it.
Thanks for sharing this! After seeing the announcement I was wondering if it was the same thing as the soul document or not. This is helpful.
I think it would be pretty exciting to try to get evidence about how much of the METR horizon length progress is coming from intercept rising vs. slope increasing.
The main thing this would require is for them to report how long it takes for the AIs to complete the tasks they test them on, right? (As opposed to now, when they report how long it takes the humans and only report pass rate for the AIs.)
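To sketch the kind of analysis I mean (with made-up data points standing in for the real METR numbers): fit log2(horizon) against release date on an early and a late window, then compare the fitted lines:

```python
import numpy as np

# Made-up (release date in years since ~2019, 50%-success horizon in
# minutes) points -- NOT real METR data, just shaped like the trend.
dates = np.array([0.3, 1.5, 2.9, 4.0, 4.7, 5.2, 5.7, 6.1])
horizons = np.array([0.1, 0.8, 4.0, 8.0, 30.0, 60.0, 180.0, 400.0])
log_h = np.log2(horizons)

def fit(t, y):
    # Least-squares line y = a + b*t; b is doublings per year.
    b, a = np.polyfit(t, y, 1)
    return a, b

a_early, b_early = fit(dates[:5], log_h[:5])
a_late, b_late = fit(dates[3:], log_h[3:])

print(f"early window: doubling time {12 / b_early:.1f} months")
print(f"late window:  doubling time {12 / b_late:.1f} months")
# If the late window's slope matches the early one but its fitted line
# sits higher, progress is an intercept (level) shift; if the slope
# itself is larger, the doubling time is shrinking.
```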
Donated max again. Thank you!
Donated max, thanks for writing this up!
Context for anyone who missed it: https://en.wikipedia.org/wiki/Gell-Mann_amnesia_effect
Responding to your parenthetical, the downside of that approach is that the discussion would not be recorded for posterity!
Regarding the original question, I am curious if this could work for a country whose government spending was small enough, e.g. 2-3% of GDP. Maybe the most obvious issue is that no government would be disciplined enough to keep their spending at that level. But it does seem sort of elegant otherwise.
I’m afraid I’m sceptical that your methodology licenses the conclusions you draw. You state that you pushed people away from “using common near-synonyms like awareness or experience” and “asked them to instead describe the structure of the consciousness process, in terms of moving parts and/or subprocesses”.
Isn’t this just the standard LessWrong-endorsed practice of tabooing words, and avoiding semantic stopsigns?
Default seems unlikely, unless the market moves very quickly, since anyone pursuing this strategy is likely to be very small compared to the market for the S&P 500.
(Also consider that these pay out in a scenario where the world gets much richer — in contrast to e.g. Michael Burry’s “Big Short” swaps, which paid out in a scenario where the market was way down — so you’re just skimming a little off the huge profits that others are making, rather than trying to get them to pay you at the same time they’re realizing other losses.)
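To illustrate the payoff asymmetry with a toy example (contract terms made up):

```python
# Toy illustration (made-up contract terms) of the payoff asymmetry:
# a long call pays max(S - K, 0) - premium, so the counterparty only
# loses in states where the index has soared.
def call_payoff(spot: float, strike: float, premium: float) -> float:
    """Net payoff at expiry of one long call, per unit of the index."""
    return max(spot - strike, 0.0) - premium

# Hypothetical: index at 5000 today; 10000-strike calls cost 50 each.
for spot_at_expiry in (4000, 6000, 12000, 20000):
    print(spot_at_expiry, call_payoff(spot_at_expiry, strike=10000, premium=50))
# Unlike CDS-style shorts, which demand payment exactly when the
# counterparty's other holdings are cratering, these get paid out of
# the counterparty's gains.
```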
It doesn’t differentially help capitalize them compared to everything else though, right? (Especially since some of them are private.)
With which model?
Wondering why this post just showed up as new today, since it was originally posted in February of 2023.
Use the most powerful AI tools.
FWIW, Claude 3.5 Sonnet was released today. Appears to outperform GPT-4o on most (but not all) benchmarks.
Does any efficient algorithm satisfy all three of the linearity, respect for proofs, and 0-1 boundedness? Unfortunately, the answer is no (under standard assumptions from complexity theory). However, I argue that 0-1 boundedness isn’t actually that important to satisfy, and that instead we should be aiming to satisfy the first two properties along with some other desiderata.
Have you thought much about the feasibility or desirability of training an ML model to do deductive estimation?
You wouldn’t get perfect conformity to your three criteria of linearity, respect for proofs, and 0-1 boundedness (which, as you say, is apparently impossible anyway), but you could use those to inform your computation of the loss in training. In which case, it seems like you could probably approximately satisfy those properties most of the time.
Then of course you’d have to worry about whether your deductive estimation model itself is deceiving you, but it seems like at least you’ve reduced the problem a bit.
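To sketch what I have in mind (my own toy construction, not anything from the post; the Estimator network and the vector encodings of expressions are hypothetical stand-ins):

```python
import torch
import torch.nn as nn

class Estimator(nn.Module):
    """Maps a (hypothetical) expression encoding to an estimate."""
    def __init__(self, dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def estimation_loss(est, x, y, x_plus_y, proofs=None):
    # Linearity: the estimate of a sum should equal the sum of estimates.
    linearity = (est(x_plus_y) - (est(x) + est(y))).abs().mean()
    # 0-1 boundedness: penalize estimates outside [0, 1]. Per the post,
    # this is the property worth relaxing, so weight it lightly (or drop it).
    bounded = (torch.relu(est(x) - 1) + torch.relu(-est(x))).mean()
    # Respect for proofs: where a value is actually proved, regress toward
    # it (requires supervised (encoding, proved value) pairs).
    proof_term = torch.tensor(0.0)
    if proofs is not None:
        enc, val = proofs
        proof_term = (est(enc) - val).pow(2).mean()
    return linearity + 0.1 * bounded + proof_term

# Usage with random stand-in encodings, pretending (a big simplification)
# that the encoding of a sum of expressions is the sum of the encodings:
model = Estimator()
x, y = torch.randn(8, 32), torch.randn(8, 32)
print(estimation_loss(model, x, y, x + y))
```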
Are there any risks associated with the eggs being frozen for longer? Can you freeze an egg for 20 years?
Or even if there are some risks, is it the case that over any timespan you’d rather have an egg in the freezer than in the woman?