Hello! I work at Lightcone and like LessWrong :-)
I sometimes like things being said in a long way. Mostly that’s just because it helps me stew on the ideas and look at them from different angles. But also, specifically, I liked the engagement with a bunch of epistemological intuitions and figuring out what can be recovered from them. I like in particular connecting the “trend continues” trend to the redoubtable “electron will weigh the same tomorrow” intuition.
(I realise you didn’t claim there was nothing else in the dialogue, just not enough to justify the length)
As a general matter, Anthropic has consistently found that working with frontier AI models is an essential ingredient in developing new methods to mitigate the risk of AI.
What are some examples of work that is most largeness-loaded and most risk-preventing? My understanding is that interpretability work doesn’t need large models (though I don’t know about things like influence functions). I imagine constitutional AI does. Is that the central example, or are there other pieces that are further in this direction?
Much sweat and some tears were spent on trying to get something like that working, but the Shoggoths are fickle
Some Manifold markets:
This paper also seems dialectically quite significant. I feel like it’s a fairly well-delineated claim that can be digested by mainstream ML and policy spaces. Like, it seems helpful to me if policy discussions can include phrases like “the evidence suggests that if current ML systems were trying to deceive us, we wouldn’t be able to train them not to”.
Curated! This kicked off a wonderful series of fun data science challenges. I’m impressed that it’s still going after over 3 years, and that other people have joined in with running them, especially @aphyer who has an entry running right now (go play it!).
Thank you, @abstractapplic for making these. I don’t think I’ve ever submitted a solution, but I often like playing around with them a little (nowadays I just make inquiries with ChatGPT). I particularly like
That it added nuance to my understanding of the supremacy of neural networks, and of when “just throw a neural net at it” might or might not work.
Here’s to another 3.4 years!
Some quotes from the wiki article on Shoggoths:
Being amorphous, shoggoths can take on any shape needed, making them very versatile within aquatic environments.
At the Mountains of Madness includes a detailed account of the circumstances of the shoggoths’ creation by the extraterrestrial Elder Things. Shoggoths were initially used to build the cities of their masters. Though able to “understand” the Elder Things’ language, shoggoths had no real consciousness and were controlled through hypnotic suggestion. Over millions of years of existence, some shoggoths mutated, developed independent minds, and rebelled.
Quoting because (a) a lot of these features seem like an unusually good match for LLMs, and (b) it’s worth acknowledging that this is picking a metaphor that fictionally rebelled, and is thus potentially alignment-is-hard loaded as a metaphor.
It seems unlikely that different hastily cobbled-together programs would have the same bug.
Is this true? My sense is that in, for example, Advent of Code problems, different people often write the same bug into their program.
Sometimes running to stand still is the right thing to do
It’s nice when good stuff piles up into even more good stuff, but sometimes it doesn’t:
Sometimes people are worried that they will habituate to caffeine and lose any benefit from taking it.
Most efforts to lose weight are only temporarily successful (unless using medicine or surgery).
The hedonic treadmill model claims it’s hard to become durably happier.
Productivity hacks tend to stop working.
These things are like Alice’s Red Queen’s race: always running to stay in the same place. But I think there’s a pretty big difference between running that keeps you exactly where you would have been if you hadn’t bothered, running that moves you a little way and then stops, and running that stops you from being moved in some direction.
I’m not sure what we should call such things, but one idea is hamster wheels for things that make no difference, bungee runs for things that let you move in a direction a bit but you have to keep running to stay there, and backwards escalators for things where you’re fighting to stay in the same place rather than moving in a direction (named for the grand international pastime of running down rising escalators).
I don’t know which kind of thing is most common, but I like being able to ask which dynamic is at play. For example, I wonder if weight loss efforts are often more like backwards escalators than hamster wheels. People tend to get fatter as they get older. Maybe people who are trying (but failing) to lose weight are gaining weight more slowly than similar people who aren’t trying to do so?
Or my guess is that most people will have more energy than baseline if they take caffeine every day, even though any given dose will have less of an effect than taking the same amount of caffeine while being caffeine-naive, so they’ve bungee ran (done a bungee run?) a little way forward and that’s as far as they’ll go.
I am currently considering whether productivity hacks, which I’ve sworn off, are worth doing even though they only last for a little while. The extra, but finite, productivity could be worth it. (I think this would count as another bungee run).
I’d be interested to hear examples that fit within or break this taxonomy.
FWIW, “powe” has been removed from “official” toki pona. A more standard translation might be “sona ike lili”.
If I imagine having a compiler that translates back-and-forth between intuitionistic and classical logic as in the post, and I want to stop the accumulation of round-trip ‘cruft’, I think the easiest thing to do would be to add provenance information that lets me figure out whether a provability predicate, say, was “original” or “translational”. But frustratingly that’s not really possible in the case where I’m trying to translate between people with pretty different ontologies (who might not be able to parse their interlocutors’ statements natively).
I dunno whether you’re thinking more about the case of differing ontologies or more about the case of preferred framings (but fluency with both), so I’m not sure how relevant this is to your inquiries.
Adding filler tokens seems like it should always be neutral or harm a model’s performance: a fixed prefix designed to be meaningless across all tasks cannot provide any information about each task to locate the task (so no meta-learning) and cannot store any information about the in-progress task (so no amortized computation combining results from multiple forward passes).
I thought the idea was that in a single forward pass, the model has more tokens to think in. That is, the task description on its own is, say, 100 tokens long. With the filler tokens, it’s now, say, 200 tokens long. In principle, because of the uselessness/unnecessariness of the filler tokens, the model can just put task-relevant computation into the residual stream for those positions.
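A toy sketch of that “more positions to think in” point (my illustration, not the paper’s setup): with causal attention, the position that produces the answer can read from every earlier residual stream, so appending meaning-free filler positions gives it more places where intermediate computation could in principle live.

```python
# Toy illustration (assumption: one layer of causal self-attention over random
# embeddings stands in for a transformer's per-position residual streams).
import torch

torch.manual_seed(0)
d_model = 16
attn = torch.nn.MultiheadAttention(d_model, num_heads=2)

def positions_readable_by_final_token(seq_len: int) -> int:
    x = torch.randn(seq_len, 1, d_model)  # (seq, batch=1, dim) stand-in residual streams
    # Boolean causal mask: True above the diagonal = "not allowed to attend"
    causal_mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    _, weights = attn(x, x, x, attn_mask=causal_mask, need_weights=True)
    # weights: (batch, tgt, src); count the positions the final token attends over
    return int((weights[0, -1] > 0).sum())

print("task only, ~100 tokens:", positions_readable_by_final_token(100))    # 100
print("with filler, ~200 tokens:", positions_readable_by_final_token(200))  # 200
```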
Table 2 seems to provide a more direct comparison.
I think my big problem with complexity science (having bounced off it a couple of times, never having engaged with it productively) is that though some of the questions seem quite interesting, none of the answers or methods seem to have much to say.
Which is exacerbated by a tendency to imply they have answers (or at least something that is clearly going to lead to an answer).
I feel like this is the opposite of the quoted text? Or your example is of the bad actor both “remaining reasonable” and “fighting dirty”.
IIUC, 1000x was chosen to be on the order of the solar energy reaching the earth.
Curated. I feel like over the last few years my visceral timelines have shortened significantly. This is partly from contact with LLMs, particularly their increased coding utility, and a lot of it is downstream of Ajeya’s and Daniel’s models and outreach (I remember spending an afternoon on an arts-and-crafts ‘build your own timeline distribution’ exercise that Daniel had nerdsniped me with). I think a lot of people are in a similar position and have been similarly influenced. It’s nice to get more details on those models and the differences between them, as well as to hear Ege pushing back with “yeah but what if there are some pretty important pieces that are missing and won’t get scaled away?”, which I hear from my environment much less often.
There are a couple of pieces of extra polish that I appreciate. First, having some specific operationalisations with numbers and distributions up-front is pretty nice for grounding the discussion. Second, I’m glad that there was a summary extracted out front, as sometimes the dialogue format can be a little tricky to wade through.
On the object level, I thought the focus on schlep in the Ajeya-Daniel section and on the slowness of economic turnover in the Ajaniel-Ege section was pretty interesting. I think there’s a bit of a cycle with trying to do complicated things like forecast timelines, where people come up with simple compelling models that move the discourse a lot and sharpen people’s thinking. People have vague complaints that the model seems like it’s missing something, but it’s hard to point out exactly what. Eventually someone (often the person with the simple model) is able to name one of the pieces that is missing, and the discourse broadens a bit. I feel like schlep is a handle that captures an important axis that all three of our participants differ on.
I agree with Daniel that a pretty cool follow-up activity would be an expanded version of the exercise at the end with multiple different average worlds.
Curated. I am excited about many more distillations and expositions of relevant math on the Alignment Forum. There are a lot of things I like about this post as a distillation:
Exercises throughout. They felt like they were simple enough that they helped me internalise definitions without disrupting the flow of reading.
Pictures! This post made me start thinking of finite factorisations as hyperrectangles, and histories as dimensions that a property does not extend fully along.
Clear links from Finite Factored Sets to Pearl. I think these are roughly the same links made in the original, but they felt clearer and more orienting here.
Highlighting which of Scott’s results are the “main” results (even more than the “Fundamental Theorem” name already did).
Magdalena Wache’s engagement in the comments.
I do think the pictures became less helpful to me towards the end, and I thus have worse intuitions about the causal inference part. I’m also not sure about the emphasis of this post on causal rather than temporal inference. But I still love the post overall.
Curated.
Bayes-type epistemology is a core LessWrong topic, and I think this represents a bunch of progress on that front (whether the results are already real-world-ready or just real-world-inspired). I have only engaged with small parts of the thesis, but those parts seem pretty exciting; so far, I particularly like knowing about quasi-arithmetic pooling. It feels like I’ve become less confused about something that I didn’t know I was confused about: the connection between the character of the proper scoring rule and the right way to aggregate the probabilities it elicits.
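For reference, my possibly-imperfect recollection of the definition (a sketch, not the thesis’s exact statement): a proper scoring rule has an associated convex function $G$, and quasi-arithmetic pooling with respect to that rule averages forecasts after mapping them through $G'$:

$$p^{*} \;=\; (G')^{-1}\!\left(\sum_{i} w_i \, G'(p_i)\right), \qquad \sum_i w_i = 1,$$

so that, as I understand it, the quadratic score recovers ordinary linear averaging and the log score recovers averaging of log-odds.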
I also appreciate Eric’s work making blogposts explaining more of his thoughts in a friendly way. Hope to see a few more distillations come out of this thesis!
Harvard tells us that their median class size is 12 and over 75% of their courses have fewer than 20 students.
Smaller class sizes sound pretty good! Maybe worth paying for? But I am reminded of the claim that most flights have plenty of empty seats, even though most passengers find themselves on full flights. Similarly, most person-class-hours might be spent in the biggest classes (cf. the inspection paradox).
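A minimal sketch of how both statements can be true at once, with made-up class sizes (hypothetical numbers chosen only to roughly match the “median 12, mostly under 20” framing, not Harvard’s actual distribution):

```python
# Toy illustration of the inspection paradox for class sizes.
import statistics

# Hypothetical distribution: many small seminars, a few huge lectures.
class_sizes = [8] * 65 + [12] * 45 + [20] * 15 + [100] * 7 + [400] * 3

# Per-class view: the median class is small and most classes are under 20.
print("median class size:", statistics.median(class_sizes))          # 12
print("share of classes under 20:",
      sum(n < 20 for n in class_sizes) / len(class_sizes))           # ~0.81

# Per-student view: pick a random enrolment and ask how big that class is.
# A class of size n is experienced by n students, so big classes dominate.
enrolments = [n for n in class_sizes for _ in range(n)]
print("median class size a student sits in:",
      statistics.median(enrolments))                                 # 100.0
print("share of student-seats in classes of 100+:",
      sum(n >= 100 for n in enrolments) / len(enrolments))           # ~0.58
```

The per-class median stays small because small classes are numerous, while the typical student’s hours are dominated by the few huge lectures.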