I have a public transport commute that occasionally has a fair bit of variance, so being punctual isn’t necessarily cheap: I’d have to accept being really early on most days just to handle the rare occasions when the extra time is actually needed.
NoUsernameSelected
https://drive.google.com/drive/folders/1R_0NeKfGvdSpsR1Mh0FkTj50cvxV20Wa?usp=sharing
Trying to share a chat out of AI Studio has proved annoying, as it turns out. I copied the transcript and took a full page screen capture instead, but the latter also turned out slightly scuffed with my usual tool. Apologies for the quality.
I just reran this test with Gemini 3 Pro Preview in my apartment.
It passed with flying colors. No real mistakes or major inefficiencies. Opus 4.5 performed worse, and I didn’t end up running that round to completion.
I will note this is impacted by my place being a little smaller and less messy than what’s in the post, and I’m also not at all a coffee snob, but the actual coffee is serviceable imo.
I can share the Gemini chat if anyone’s interested.
ACX Meetup: Fall 2025
I think a lot of people who talk about being n-th percentile in some domain implicitly only include the set of people who participate in the activity at all. That’s a bit less clear-cut than “everyone alive”, but makes more sense to talk about and compare against imo.
I’d assume the number of people they have working on arresting people for memes is orders of magnitude smaller than their shortfall in paramedics or whatever else.
Vilnius – ACX Meetups Everywhere Spring 2025
Seems like a lot of paragraphs got collapsed together in this version of the post (vs the Wordpress and Substack ones)?
Vilnius – ACX Meetups Everywhere Fall 2024
I don’t get a progress bar on mobile (unless I’m missing it somehow), and the word count on hover feature seemingly broke on mobile as well a while ago (I remember it working before).
Why remove “x min read”? Even if it’s not gonna be super accurate between different people’s reading speeds, I still found it very helpful to decide at a glance how long a post is (e.g. whether to read it on the spot or bookmark it for later).
Showing the word count would also suffice.
I compared the Manifold forecasts with the community prediction on Metaculus and calculated a time-averaged Brier score to evaluate the forecasts over time.
The so-called “nonsense” community prediction is still more accurate on average than Manifold for the same questions.
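For anyone unsure what “time-averaged Brier score” means here, this is a rough sketch of one common way to compute it, not necessarily the exact calculation I ran. It assumes each forecast holds until the next update (a step function) and weights the squared error by how long each forecast stood. The function name, the input format, and the numbers in the usage example are all made up for illustration.

```python
def time_averaged_brier(forecasts, close_time, outcome):
    """Time-weighted Brier score for one binary question.

    forecasts:  list of (time, probability) pairs; each probability is
                assumed to hold until the next update (step function).
    close_time: time at which the question closed.
    outcome:    1 if the question resolved YES, 0 if it resolved NO.
    Lower is better; a constant 50% forecast scores 0.25.
    """
    if not forecasts:
        raise ValueError("need at least one forecast")
    points = sorted(forecasts)  # order by time
    start = points[0][0]
    if close_time <= start:
        raise ValueError("close_time must be after the first forecast")

    weighted_error = 0.0
    for i, (t, p) in enumerate(points):
        # Each forecast stands until the next one, or until close.
        t_next = points[i + 1][0] if i + 1 < len(points) else close_time
        weighted_error += (p - outcome) ** 2 * (t_next - t)
    return weighted_error / (close_time - start)


# Hypothetical numbers, purely to show the comparison:
platform_a = [(0, 0.50), (5, 0.90)]  # updates to 90% on day 5
platform_b = [(0, 0.70), (5, 0.80)]
score_a = time_averaged_brier(platform_a, close_time=10, outcome=1)
score_b = time_averaged_brier(platform_b, close_time=10, outcome=1)
```

To compare two platforms, you would compute this per question on the same set of questions and over the same time window, then average the scores.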
+1
Never really got into VRChat, but I’d be happy to try a LW meetup there.
Artemis II has already been delayed to 2025, as of yesterday.
Vilnius, Lithuania – ACX Meetups Everywhere Fall 2023
I agree about it having to fit on a single chip, but surely the neural net on-board would only have a relatively negligible impact on range compared to how much the electric motor consumes in motion?
I’d provide a counterexample analogy: speedruns.
High-level speedruns (and especially TAS runs) often look completely stupid, insane, or incomprehensible to casual players. Nevertheless, they accomplish what they set out to do far more effectively than trying to beat the game quickly with “casual strats” would.
I think a sufficiently smart AI doing stuff in the real world would converge to looking a lot like that from our POV.
Any particular reason you’ve linked all those tweets, but blocked general access to them? I’d probably be interested in reading some of those threads just going by the titles.
Aren’t likeable people more likely to attain status?