LessWrong team member / moderator. I’ve been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I’ve been interested in improving my own epistemic standards and helping others to do so as well.
Raemon
[Geir Isene] A desktop made for one
Wowzers, didn’t know about Patagonia. That’s pretty interesting.
I think the current workflow for those of us who use AI to iterate on ideas and collect information is to draft with AI and then literally rewrite in our own words in the final form. It is quite bizarre but this actually seems to work.
On one hand, I think this can work. But, I caution that many versions of this are a trap that don’t produce good output. I recommend spending at least some chunks of time writing and thinking without any AI assistance, because otherwise I think your critical discernment skills probably won’t sharpen enough to contribute usefully.
I have not currently read most of this (just the tl;dr and some skimming), but wanted to quickly note: I think “rationality” in the LW sense is mostly useful for three reasons
1) having relatively ambitious, open-ended, confusing projects
2) navigating environmental disruption (e.g. covid)
3) being a good citizen, who is able to vote and participate in The Discourse in a way that shapes your country/world for the better.
I don’t think it’s especially great for “being a happy, well adjusted guy” (compared to other schools of thought with different vibes). I think it helps, but, not so overwhelmingly that I recommend it to a person who doesn’t naturally vibe with it.
Slightly varied example: is laying ambushes for enemy humans dishonest, during war?
It’s certainly deceptive. But I feel hesitant to lump it and “normal dishonesty” together, because I think there is some qualitative difference between degrading the commons and Winning At Conflict.
It’s dishonest (and quite bad) to wave a white flag of surrender and then lure people into a trap (compared to leaking bad information to a spy to lure enemies into a trap), because Surrender is a mode of communication that both enemies agree is good to have open.
Yeah but I kinda do put moderate odds on “The White House continues to actively try to destroy Anthropic and eventually either succeeds or at least it’s pretty visibly in-question.”
Fwiw the first one wasn’t rejected for “being raw/sloppy”, we just have particularly high standards for AI content because we get so much of it and we want to keep signal/noise quality high. And both the writing and idea quality need to be actively good.
I think it’s an achievable goal to learn to come up with interesting/meaningful contributions and articulate them well. You can ask AIs for meta-level advice on how to write without having them do your writing for you.
Curated. This seemed like a (relatively) straightforward thing to check that seems straightforwardly useful. I’m interested in seeing METR run a version of this against their existing task suite.
There’s always a bit of a double-edged sword of making a good capabilities eval because, even if people don’t have direct access to the eval to iterate against, it implicitly becomes a target. (i.e. it’s hard to tell, but I get some sense of companies striving to hit the METR trendline and beat the other companies on it).
My understanding is the METR task suite is basically saturated. You could probably construct a good version of this that saturates less quickly.
I’m wondering if there’s any way to keep this artificially low while making CoT time horizons high, and if there’s some sort of index you could publish that’s, like, ratio of CoT-time-horizon to non-CoT or something. I think for it to be that real/helpful you’d also want some kind of “...and the CoT is faithful” metric that I don’t currently know of a robust solution for. (This is not a very well thought out idea, just musing)
Good luck!
Huh, curious for your model of why you predicted the other-which-way? This seemed like a classic “does better on LW” kinda post (for good or for ill). Although I wouldn’t have predicted so much disparity.
Past me definitely would have been frustrated about this right along with you, and somehow I have become a stodgy grownup villain from Peter Pan and I’m not sure how/why.
(I think it’s, like, to my detriment, I have less fun now. But, doesn’t seem like I can fix it by just trying to do more fun silly things, they just don’t resonate like they used to. I recall going to one of your laser-tag things and thinking ‘man, I really should be enjoying this but it feels meh for some reason’)
It’s plausible this is more like a muscle I need to rebuild.
I’m a little worried Anthropic has missed the window for this option, since now it might look like the White House was out to destroy them and they were just throwing in the towel.
(since they clearly don’t care about overrefusals).
(this particular claim here seems false/overstated. Like, clearly, overall, they are willing to accept overrefusals. That doesn’t mean they “don’t care about them”. Maybe they don’t, but, much more likely it just seems like a reasonable tradeoff to them.)
This is presumably not relevant anymore, but... can you not just turn off memory?
Curated. In addition to the obvious “notice the difference between ‘actually helps with x-risk’ and ‘is x-risk themed’”, I think there’s an important corollary to “you’re asking for someone to make you a sucker.”
The problem isn’t just you might personally get exploited, it’s that you’re incentivizing an overall breeding ground for grifters. (Which could be both intentional grifters getting free/cheap labor, and people with just kinda bad taste destroying the signal/noise ratio of the ecosystem). See: The Moral Obligation Not to Get Eaten.
We do try to subsidize non-AI posts when Curating while keeping quality high.
Curated.
When I first opened this post, I skimmed the intro, and thought “this is a cute idea but seems crazy and I don’t believe it can work. Mnemonics for 19,000 genes? No way!”, and I closed the tab.
When I saw it got 200 karma I took a second look. And… well okay this still seems kinda crazy and I’d like to see someone else use the browser extension and see if it actually helps.
But, I read the “gender = protein transmembrane status” bit, and had a sort of sinking feeling I was about to be wrong and a rising feeling of excitement, that, it doesn’t actually take that many dimensions/gradations to get enough bits to specify one guy within 19,000. And the dimensions do seem like categories that leverage my human-racial-bonus-to-identifying-people.
It may just be a cute idea. But I feel like I learned a potentially generalizable tool for mnemonics. I don’t particularly need to memorize protein-coding-genes. But, this has me vaguely excited to try and memorize something. :P
Oh, this was probably because I forgot to put in a street address. Looks fixed now.
My version of this is “don’t try to come up with a name until after you’ve found the venue, because the venue will have some kind of character that lends itself to some names better than others.”
Goddamn it we need to fix our linkpost UI. Thanks