LessWrong team member / moderator. I’ve been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I’ve been interested in improving my own epistemic standards and helping others to do so as well.
Yeah, “just actually have large swaths of code memorized/chunked/tokenized in a way that’s easy to skim and reason about” seems clearly helpful (and seems related to stuff @RobertM was saying).
It doesn’t seem immediately useful as an action-prompt on its own for a junior developer. But, I guess, I’ve probably spent a similar number of hours as Robert at least glancing at terminal logs.
(Unless you think you have a skill of memorizing new things quickly that you expect to help a junior developer?)
The things that seem relevant here are, like, “try at all to memorize swaths of code in some kind of comprehensive way”, and presumably there are subskills that make that easier, which may not help immediately but probably changes one’s trajectory.
To the sorts of people who become successful mountaineers*, I think probably yes? (It’s not about, like, Type 1 fun [enjoying it in the moment], it’s more Type 2 fun [rewarding and enjoyable in retrospect]. And it’s less about the mountain and more about feeling they’ve pushed themselves to their limit, especially if it was farther than any human has gone before.)
*assuming they didn’t get, like, tiger-mom-brainwashed into it, but, I think doing that for mountaineers is pretty rare.
Some principles of using Thinking Assistants for more than “just keep you vaguely accountable”*
* (for me, it’s not at all surprising if others need radically different things)
Often, I’m in the middle of figuring out a confusing thing, which is taking up all my working memory. The exact right question can help direct my attention towards an important place. The wrong question is annoying and overloads my stack.
Sometimes assistants just magically intuit the exact right thing to ask, but it’s hard to rely on this.
Thus:
Principle #1: Construct an if-then tree of prompts, which both I and the assistant can iterate on (there’s a rough sketch of what I mean after the example questions below). The job of the assistant isn’t (usually) to magically intuit the right thing. We’re working together on writing phrases that are helpful for them to say periodically when situations seem to call for it.
Principle #2: Prompts should be very easy to respond to. Usually simple yes/no questions, or questions I don’t really have to respond to at all; they just might nudge my semi-conscious brain in the right direction.
Worse question: “What are you confused by?” (I dunno, that’s the problem. It is sometimes helpful to fully articulate the confusion to see if anything reveals itself, but usually not while I’m in the middle of trying to figure it out)
Less-worse question: “Are you confused right now?” (This is a’ight but trying to figure out “does my current qualia count as ‘confusion’?” is also a bit overloading)
Better questions:
Do you know what to do next?
Have you tried more than 2 things to figure this out?
Do you have a clear understanding of the problem?
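To make Principle #1 a bit more concrete, here’s a minimal sketch of what such an if-then tree could look like if you wrote it down as data (the node shape, the “situation” strings, and the nesting are just my own illustration of the idea, not a prescribed format; I work in TypeScript, so that’s what I’m using):

```typescript
// Illustrative sketch only: the structure and example situations are assumptions.
interface PromptNode {
  situation: string;   // when the assistant should consider asking this
  question: string;    // cheap to answer, ideally yes/no
  ifYes?: PromptNode;  // follow-up if the answer is "yes"
  ifNo?: PromptNode;   // follow-up if the answer is "no"
}

const stuckOnSomething: PromptNode = {
  situation: "I've been staring at the same problem for a while",
  question: "Do you know what to do next?",
  ifNo: {
    situation: "I don't know what to do next",
    question: "Have you tried more than 2 things to figure this out?",
    ifYes: {
      situation: "I've already tried a few things",
      question: "Do you have a clear understanding of the problem?",
    },
  },
};
```

The point is just that the tree is a shared, editable artifact the assistant and I can keep tweaking, rather than something they have to hold in their head.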
Not in much detail, thanks!
There are some things I worked out a few years ago that feel like I basically got a good handle on them and still relate the same way. There are some things I didn’t actually fully work out, such that I’m still sometimes fiddling with them. There are some things that worked then, but then an exogenous shock to my life made them no longer feel sufficient/complete.
That doesn’t feel sufficient to explain to me “why do humans ask questions like ‘what is the meaning of life’?”
What makes a world state meaningful? If you were building the minimum viable robot that might experience meaning or ask “what is the meaning of life?”, what is the smallest bit you could take away such that it no longer asks that question?
I… maybe want to try deliberate practicing “falling asleep.” Which sounds like it’ll suck (because AFAICT the way to do it is sleep deprive yourself and then practice powernapping. And meanwhile, it seems like a habit that’s very easy to backslide out of because, at the moments you most need to practice it, you’re tired and willpower deprived and it’s hard to do things on purpose).
I am wondering if it’s possible to do something like “use an EEG machine to tell when I’m thinking ‘likely to fall asleep soon’ brainwave stuff, and have it beep at me when I’m Thinking Too Many Thoughts To Fall Asleep.”
I’m curious if anyone round these parts has heard of stuff like this and has any experience to share. (I did just ask some LLMs, sounds like it’s niche but not unheard of).
The Muse S looks like it’s maybe good for this (it seemed to be the one with a soft headband you could plausibly wear while drifting off to sleep).
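To gesture at what the feedback loop might look like in code, here’s a minimal sketch, assuming whatever headband/SDK I end up with can stream per-band power estimates (the BandPowerSample shape, the beta-ratio heuristic, and the thresholds are all made-up placeholders for illustration, not real sleep science or a real device API):

```typescript
// Assumed sample shape; a real SDK (Muse or otherwise) would provide
// something you'd map into this.
interface BandPowerSample {
  alpha: number; // relaxed / drowsy rhythms
  beta: number;  // active "thinking lots of thoughts" rhythms
  theta: number; // sleep-onset rhythms
}

// If beta dominates the recent window of samples, emit a gentle cue.
// windowSize and betaRatioThreshold are placeholder numbers, not tuned values.
function makeSleepNudger(windowSize = 10, betaRatioThreshold = 0.5) {
  const recentRatios: number[] = [];
  return function onSample(sample: BandPowerSample): void {
    const total = sample.alpha + sample.beta + sample.theta;
    recentRatios.push(total > 0 ? sample.beta / total : 0);
    if (recentRatios.length > windowSize) recentRatios.shift();
    const avg =
      recentRatios.reduce((sum, r) => sum + r, 0) / recentRatios.length;
    if (recentRatios.length === windowSize && avg > betaRatioThreshold) {
      process.stdout.write("\x07"); // terminal bell as a stand-in for a soft beep
    }
  };
}
```

Whether “beta dominating the window” actually tracks “too many thoughts to fall asleep” is exactly the kind of thing I’d want to sanity-check against the device’s own sleep-staging output before trusting it.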
Part of the situation I think is also that different CFAR founders had somewhat different goals (I’d weakly guess Julia Galef actually did care more about “raise the broader sanity waterline”, but she also left a few years in), so there wasn’t quite a uniform vision to communicate.
I’m not sure I know how to answer this question in its current form; could you try asking a somewhat different one?
Also, I’ve more recently realized this post is pretty related to some stuff in Propagating Facts into Aesthetics:
An aesthetic is a mishmash of values, strategies, and ontologies that reinforce each other.
The values reinforce “you want to use strategies that achieve these values.”
The act of using a particular strategy shapes the ontology that you see the world through.
The ontology reinforces what values seem important to you.
Together, this all creates a feedback loop between your metagoals and subgoals, where the process of using this cluster of value/strategy/ontology makes each link in the chain stronger.
In humans (who have messy, entangled brains), this cashes out into feelings, felt senses. The original goal and the metagoals blur together. I think “this helps me achieve my [generic] goals” might reinforce “these particular subgoals I have are good goals to help with my overall flourishing.”
Where, as implemented in humans, a “meaning” (a particular way of relating to life) seems fairly similar to an aesthetic in that “it’s compressed into a mishmash that includes values, strategies and ontologies that reinforce each other.”
Hmm, that doesn’t feel like the right summary to me. (I acknowledged the motivation thing as probably
I’m not sure where you draw the boundary between object level and meta skill. I think there are:
skills you can apply to metalearning (i.e. make it easier to learn new skills)
general reasoning skills
domain specific skills
The first two I’d (often but not always[1]) call rationality skills. Also, most general reasoning skills you can apply towards metalearning as well as object level domains. I think maybe the only pure meta-level skills I found/worked on were:
track what subskills would be helpful for a given task (so that you can then practice them separately, and/or pay more attention to them as you practice)
generalized “practice applying existing general reasoning skills to the domain of gaining new skills”
generalized (but sort of domain-specific) “apply general reasoning skills to the domain of inventing better feedback loops.”
idk a couple domain-specific skills for inventing better feedback loops.
I think that’s probably it? (and there were like 20+ other skills I focused on). Everything else seems about as object level as CFAR skills. I’m maybe not sure what you mean by meta though – CFAR skills all seem at least somewhat meta to me.
At the very top of the post, I list 7 skills that I’d classify as general reasoning rationality skills. I could theoretically have gained those purely by practicing debugging, but I think it would have taken way longer and would have been very difficult for me even if I was more motivated. (They also would probably not have transferred as much to other domains if I didn’t think of myself as studying rationality)
An important early step was to gain Noticing / Tuning your Cognitive Strategies skills, which I think unlocked the ability to actually make any kind of progress at debugging (because they made it so, when I was bouncing off something, I could figure out why).
[1] I think they are “rationality” if they involve making different choices about cognitive algorithms to run. “Speed reading” is a meta-learning and general-reasoning skill, but it feels like a stretch to call it a rationality skill.
Hmm, I think not right now.
FYI I work in typescript
Do you literally read all the text on the screen, or do you have a way of efficiently skimming?
I think this has two components:
One, the set of tools that I developed for myself via Tuning your Cognitive Strategies / Feedbackloop-first Rationality / Metastrategy (note that those are different) just were actually pretty useful for debugging. There was real transfer from Thinking Physics and Baba is You. See Skills from a year of Purposeful Rationality Practice.
Two: I think I was just a lot more motivated to upskill through the lens of “developing rationality training” than “getting better at debugging.” It seemed pretty unlikely I’d ever become a particularly great debugger, and being a good debugger didn’t actually seem like the biggest bottleneck towards achieving my goals. It’s easier to tell myself an exciting narrative about inventing a rationality training paradigm than to work really hard to become a slightly-above-mid software engineer. (even if the rationality paradigm ends up ultimately kinda mid, fewer other people are even trying)
The latter may not particularly work for anyone else, and it seems pretty likely that “have some kind of motivation” is more important than any particular set of tools. I do think the set of tools is pretty obviously good, though.
I’m not sure that answered your question, but maybe you can ask a more specific one now.
Debugging for Mid Coders
My understanding is that a crucial aspect of Eliezer’s worldview is that we’d be fucked even if we had a 10-year pause where we had access to AGI that we could use to work on developing and aligning superintelligence. I disagree. But this means that he thinks that some truly crazy stuff has to happen in order for ASI to be aligned, which naturally leads to lots of disagreements. (I am curious whether you agree with him on this point.)
I don’t feel competent to have that strong an opinion on this, but I’m like 60% on “you need to do some major ‘solve difficult technical philosophy’ work that you can only partially outsource to AI, and that still requires significant serial time.”
And, while it’s hard for someone with my (lack of) background to have a strong opinion, it feels intuitively crazy to me to put that at <15% likely, which feels sufficient to me to motivate “indefinite pause is basically necessary, or, humanity has clearly fucked up if we don’t do it, even if it turned out to be on the easier side.”
I found the “which comes first?” framing helpful. I don’t think it changes my takeaways but it’s a new gear to think about.
A thing I keep expecting you to say next, but that you haven’t quite said, is something like:
“Sure, there are differences when the AI becomes able to actually take over. But the shape of how the AI is able to take over, how long we get to leverage somewhat-superintelligence, and how super that somewhat-superintelligence is, is not a fixed quantity; it depends on our ability to study scheming, build control systems, and get partial buy-in from labs/government/culture.
And the Yudkowskian framing makes it sound like there’s a discrete scary moment, whereas the Shlegeris framing is that both where-that-point-is and how-scary-it-is are quite variable, which changes the strategic landscape noticeably.”
Does that feel like a real/relevant characterization of stuff you believe?
(I find that pretty plausible, and I could imagine it buying us like 10-50 years of knife’s-edge-gradualist-takeoff-that-hasn’t-killed-us-yet, but that seems to me to have, in practice, >60% likelihood that by the end of those 50 years, AIs are running everything, they still aren’t robustly aligned, and they gradually squeeze us out.)
oh I thought we already changed this
Yeah, I’m pretty familiar with Black Lotus and Leverage and don’t think they used Pascalian reasoning at all. The problem with Black Lotus was that it was designed to meet the psychological needs of its leader (and his needs were not healthy).
I’m less confident about what actually went wrong at Leverage, but pretty confident it wasn’t that.