Eliezer seems to be relatively confident that AI systems will be very alien and will understand many things about the world that humans don’t, rather than understanding a similar profile of things (but slightly better), or having weaker understanding but enjoying other advantages like much higher serial speed. I think this is very unclear and Eliezer is wildly overconfident. It seems plausible that AI systems will learn much of how to think by predicting humans even if human language is a uselessly shallow shadow of human thought, because of the extremely short feedback loops. It also seems quite possible that most of their knowledge about science will be built by an explicit process of scientific reasoning and inquiry that will proceed in a way recognizable to human science even if their minds are quite different. Most importantly, it seems like AI systems have huge structural advantages (like their high speed and low cost) that suggest they will have a transformative impact on the world (and obsolete human contributions to alignment) well before they need to develop superhuman understanding of much of the world or tricks about how to think, and so even if they have a very different profile of abilities to humans they may still be subhuman in many important ways.
It seems to me that this claim is approximately equivalent to “takeoff will be soft, not hard”. In a hard takeoff world, it seems straightforward that AI systems will understand huge, important parts/dynamics of the world in ways that humans don’t, even a little?
Early transformative AI systems will probably do impressive technological projects by being trained on smaller tasks with shorter feedback loops and then composing these abilities in the context of large collaborative projects (initially involving a lot of humans but over time increasingly automated). When Eliezer dismisses the possibility of AI systems performing safer tasks millions of times in training and then safely transferring to “build nanotechnology” (point 11 of list of lethalities) he is not engaging with the kind of system that is likely to be built or the kind of hope people have in mind.
It seems like Paul is imagining something CAIS-like, where you compose a bunch of AI abilities that are fairly robust in their behavior, and then conglomerate them into large projects that do big things, much like human organizations. (Unless I’m misunderstanding, in which case the rest of this comment is obviated.)
It seems like whether this works depends on two factors:
First of all, it needs to be the case that conglomerations like this are competitive with giant models that are a single unified brain.
On first pass, this assumption seems pretty untrue? The communication bandwidth, and the ability to operate as a unit, of people in an organization is much, much lower than that of the sub-modules of a person’s brain.
Second, it supposes that when you compose a bunch of AI systems to do something big and novel, like designing APM systems, each individual component will still be operating within its training distribution, as opposed to the project requiring that some of the AIs be fed inputs that are really weird and might produce unanticipated behavior.
This seems like a much weaker concern, though. For one thing, it seems like you ought to be able to put checks on whether a given AI component is being fed out-of-distribution inputs, and raise a flag for oversight whenever that happens.
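To make that concrete, here’s a minimal, purely illustrative sketch of what such a check might look like: treat an input as out-of-distribution when it is far (in some embedding space) from anything the component saw in training, and escalate to a human instead of acting. The embedding, the distance threshold, and the escalation step are all invented placeholders, not a claim about how a real system would implement this.

```python
import numpy as np

# Illustrative only: flag inputs that fall far outside the training
# distribution, measured by distance to the nearest training example in
# some embedding space. The embedding, threshold, and escalation step
# are placeholders, not a real system's API.

def is_out_of_distribution(x_embedding: np.ndarray,
                           training_embeddings: np.ndarray,
                           threshold: float = 5.0) -> bool:
    """True if this input is far from everything the component was trained on."""
    distances = np.linalg.norm(training_embeddings - x_embedding, axis=1)
    return bool(distances.min() > threshold)

def run_component(component, x, x_embedding, training_embeddings):
    """Run one AI component, escalating to human oversight instead of
    acting when the input looks out-of-distribution."""
    if is_out_of_distribution(x_embedding, training_embeddings):
        print("Out-of-distribution input; flagging for human review:", x)
        return None
    return component(x)
```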
When I get stuck on a problem (e.g. what is the type signature of human values?), I do not stay stuck. I notice I am stuck, I run down a list of tactics, I explicitly note what works, I upweight that for next time.
What tactics in particular?
Realistically I think the core issue is that Eliezer is very skeptical about the possibility of competitive AI alignment. That said, I think that even on Eliezer’s pessimistic view he should probably just be complaining about competitiveness problems rather than saying pretty speculative stuff about what is needed for a pivotal act.
Isn’t the core thing here that Eliezer expects that a local, hard takeoff is possible? He thinks that a single AI system can rapidly gain enormous power relative to the rest of the world (either by recursive self-improvement, or by seizing compute, or by just deploying on more computers).
If this is a possible thing for an AGI system to do, it seems like ensuring a human future requires that you’re able to prevent an unaligned AGI from undergoing a hard takeoff.
If you have aligned systems that are competitive in a number of different domains, that doesn’t matter if 1) local hard takeoff is on the table and 2) you aren’t able to produce systems whose alignment is robust to a hard takeoff.
It seems like the pivotal act ideology is a natural consequence of 1) expecting hard takeoff and 2) thinking that alignment is hard, full stop. Whether or not aligned systems will be competitive doesn’t come into it. Or by “competitive” do you mean specifically “competitive, even across the huge relative capability gain of a hard takeoff”? It seems like Eliezer’s chain of argument is:
[Hard takeoff is likely]
[You need a pivotal act to preempt unaligned superintelligence]
[Your safe AI design needs to be able to do something concrete that can enable a pivotal act in order to be of strategic relevance.]
[When doing AI safety work, you need to be thinking about the concrete actions that your system will do]
The notion of an AI-enabled “pivotal act” seems misguided. Aligned AI systems can reduce the period of risk of an unaligned AI by advancing alignment research, convincingly demonstrating the risk posed by unaligned AI, and consuming the “[free energy](https://www.lesswrong.com/posts/yPLr2tnXbiFXkMWvk/an-equilibrium-of-no-free-energy)” that an unaligned AI might have used to grow explosively. No particular act needs to be pivotal in order to greatly reduce the risk from unaligned AI, and the search for single pivotal acts leads to unrealistic stories of the future and unrealistic pictures of what AI labs should do.
On the face of it, this seems true, and it seems like a pretty big clarification to my thinking. You can buy more time or more safety a little bit at a time, instead of all at once, in sort of the way that you want to achieve life extension escape velocity.
But it seems like this largely depends on whether you expect takeoff to be hard or soft. If AI takeoff is hard, you need pretty severe interventions, because they either need to prevent the deployment of AGI or be sufficient to counter the actions of a superintelligence. Generally, it seems like the sharper takeoff is, the more good outcomes flow through pivotal acts, and the smoother takeoff is, the more we should expect good outcomes to flow through incremental improvements.

Are there any incremental actions that add up to a “pivotal shift” in a hard takeoff world?
[Ngo][17:31]
And some deep principles governing engines, but not really very crucial ones to actually building (early versions of) those engines

[Yudkowsky][17:31]
that’s… not historically true at all?
getting a grip on quantities of heat and their flow was critical to getting steam engines to work
it didn’t happen until the math was there
Checking very quickly, this article, at least, disagrees: it says that thermodynamics was developed a century after the invention of the steam engine. Maybe Eliezer is referring to something more basic than thermodynamics? Or is this just an error?
This comment seems to me to be pointing at something very important which I had not hitherto grasped.
My (shitty) summary: There’s a big difference between gains from improving the architecture/abilities of a system (the genome, for human agents) and gains from increasing knowledge developed over the course of an episode (or lifetime). In particular, they might differ in how easy it is to “get the alignment in”. If the AGI is doing consequentialist reasoning while it is still mostly getting gains from gradient descent, as opposed to from knowledge collected over an episode, then we have more ability to steer its trajectory.
I like this format!
Today, I was reading Mistakes with Conservation of Expected Evidence. For some reason, I was under the impression that the post was written by Rohin Shah; but it turns out it was written by Abram Demski.
In retrospect, I should have been surprised that “Rohin” kept talking about what Eliezer says in the Sequences. I wouldn’t have guessed that Rohin was that “culturally rationalist”, or that he would be that interested in what Eliezer wrote in the Sequences. And indeed, I was updating that Rohin was more of a rationalist, with more rationalist interests, than I had thought. If I had been more surprised, I could have noticed my surprise / confusion, and made a better prediction.
But on the other hand, was my surprise so extreme that it should have triggered an error message (confusion), instead of merely an update? Maybe this was just fine reasoning after all?
From a Bayesian perspective, upon observing this evidence I should have increased my credence both in Rohin being more rationalist-y than I thought, and in the hypothesis that the post wasn’t written by Rohin at all. But practically, I would have needed to generate the second hypothesis, and I don’t think that I had strong enough reason to.
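A toy calculation, just to make the shape of that update concrete (the hypotheses and all the numbers are made up for illustration):

```python
# Toy Bayesian update with invented numbers.
# H1: Rohin wrote it and is more rationalist-y than I modeled.
# H2: someone else (e.g. Abram) wrote it.
# H3: Rohin wrote it and my model of him was about right.
priors      = {"rohin_more_rationalist": 0.20, "different_author": 0.05,
               "rohin_as_modeled": 0.75}
# P(this much Sequences-quoting | hypothesis) -- also made up.
likelihoods = {"rohin_more_rationalist": 0.50, "different_author": 0.80,
               "rohin_as_modeled": 0.05}

unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnormalized.values())
posteriors = {h: p / total for h, p in unnormalized.items()}
print(posteriors)
# Both "Rohin is more rationalist-y" and "different author" gain probability
# mass -- but "different author" only moves if it was generated as a
# hypothesis in the first place.
```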
I feel like there’s a semi-interesting epistemic puzzle here. What’s the threshold for a surprising enough observation that you should be confused (much less notice your confusion)?
Noting for myself: I didn’t make an explicit prediction, but I emotionally expected John to be vindicated by this experiment. My emotional prediction was wrong, and that seems good to notice, even if I don’t do much further reflection.
This is a great comment. The graphs helped a lot.
I just want to say that I found this comment personally helpful.
This is the problem with how the rationalist community approaches the concept of what it means to “make a rational decision” perfectly demonstrated in a single debate. You do not make a “rational decision” in the real world by reasoning in a vacuum.
Something about this seems on point to me. Rationalists, in general, are much more likely to be mathematicians than (for instance) mechanical engineers. It does seem right to me that when I look around, I see people drawn to abstract analyses, very plausibly at the expense of neglecting contextualized details that are crucial for making good calls. This seems like it could very well be a bias of my culture.
For instance, it’s fun and popular to talk about civilizational inadequacy, or how the world is mad. I think that is pointing at something true and important, but I wonder how much of that is basically overlooking the fact that it is hard to do things in the real world with a bunch of different stakeholders and a confusing mess of constraints. In a lot of cases, civilizational inadequacy can be the result of engineers (broadly construed) who understand that “the perfect is the enemy of the good”, pushing projects through to completion anyway. The outcome is sometimes so muddled as to be worse than having done nothing, but also, shipping things under constraints, even though they could be much better on some axes, is how civilization runs.
Anyway, this makes me think that I should attempt to do more engineering projects, or otherwise find ways to operate in domains where the goal is to get “good enough”, within a bunch of not-always crisply-defined constraints.
Actually, my more specific question is “is verification still easier than generation, if the generation is adversarial?” That seems like a much more specific problem space than just “generation and verification in general.”
I think you can lump them together for this conversation
Why do you think this?
It seems to me that reading books about deep learning is a just fine thing to do, but that publishing papers that push forward the frontier of deep learning is plausibly quite bad. These seem like such different activities that I’m not at all inclined to lump them together for the purposes of this question.
Eliezer seems to argue that humans couldn’t verify pivotal acts proposed by AI systems (e.g. contributions to alignment research), and that this further makes it difficult to safely perform pivotal acts. In addition to disliking his concept of pivotal acts, I think that this claim is probably wrong and clearly overconfident. I think it doesn’t match well with pragmatic experience in R&D in almost any domain, where verification is much, much easier than generation in virtually every domain.
I, personally, would like 5 or 10 examples, from disparate fields, of verification being easier than generation.
And also counterexamples, if anyone has any.
I personally found this to be a very helpful comment for visualizing how things could go.
Is this “sharp left turn” a crux for your overall view, or your high probability of failure?

Naively, it seems to me that if capability gains are systematically gradual, and improvements are iterative and occur a little at a time, we’re in a much better situation with regard to alignment.
If capability gains are gradual, we can continuously feed training data to our system and keep its alignment in step with its capabilities. As soon as it starts to enter a distributional shift, and some of its outputs are (or would be) unaligned, those alignment failures are immediately corrected. You can keep reinforcing corrigibility as capabilities generalize, so that it correctly generalizes the corrigibility concept. Similarly, the more gradually capabilities grow, the more reliable oversight schemes will be.
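Here’s a caricature of that loop, just to pin down what “keep alignment in step with capabilities” means procedurally. Every function in it (the capability step, the alignment evaluation, the corrective fine-tune) is a stub I invented for illustration, not a proposal about how real training works:

```python
# Caricature only: "scale a little, check alignment, correct, repeat."
# All of these functions are invented stand-ins for real procedures.

def capability_step(model):
    # Pretend each step adds a small amount of capability.
    return {**model, "capability": model["capability"] + 1}

def evaluate_alignment(model):
    # Stand-in for oversight: pretend we can directly score how aligned
    # the model's behavior is, degrading as alignment data lags capability.
    gap = model["capability"] - model["alignment_data"]
    return max(0.0, 1.0 - 0.02 * gap)

def corrective_finetune(model):
    # Pretend corrective training data brings alignment back in step.
    return {**model, "alignment_data": model["capability"]}

model = {"capability": 0, "alignment_data": 0}
for step in range(100):
    model = capability_step(model)          # small, gradual capability gain
    if evaluate_alignment(model) < 0.99:    # misgeneralization caught while
        model = corrective_finetune(model)  # it is still small and legible
```

The caveats below are exactly about where this picture breaks: when the evaluation step can no longer tell aligned from unaligned outputs, or when someone else skips the gradual loop entirely.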
(On the other hand, this doesn’t solve the problem that there’s some capability threshold beyond which the outputs of an AI system are illegible to humans, and we can’t tell whether or not the outputs are aligned, in order to give it corrective training data.
Also, if one could, in principle, increase capabilities gradually, but someone else can throw caution to the wind and turn up the capability dial to 11, the unilateralist’s curse kills us.)
How much would finding out that there’s not going to be a sharp left turn impact the rest of your model?
Or, suppose we could magically scale up our systems as gradually as you, Nate, would like, slowing down as we start to see super-linear improvement: how much safer is humanity?
A great Rob Miles introduction to this concept:
I’m not compelled by that analogy. There are lots of things that money can’t buy, but that (sufficient) intelligence can.
There are theoretical limits to what cognition is able to do, but those are so far from the human range that they’re not really worth mentioning. The question is: “are there practical limits to what an intelligence can do, that leave even a super-intelligence uncompetitive with human civilization?”

It seems to me that, as an example, you could just take a particularly impressive person (Elon Musk or John von Neumann are popular exemplars) and ask “What if there was a nation of only people who were that capable?” It seems that if a nation of, say, 300,000,000 Elon Musks went to war with the United States, the United States would lose handily. Musktopia would just have a huge military-technological advantage: they would do fundamental science faster, and develop engineering innovations faster, and have better operational competence than the US, on ~all levels. (I think this is true for a much smaller number than 300,000,000, but having a number that high makes the point straightforward.)
Does that seem right to you? If not, why not?
Or alternatively, what do you make of vignettes like That Alien Message?
I shudder to imagine the future we might have had if 10 full-Eliezers and 50 semi-Eliezers had been working on that problem full time for the last fifteen years.
That sounds obviously amazing. Are you under the impression that recruitment succeeded so enormously that there are 10 people that can produce intellectual content as relevant and compelling as the original sequences, but that they’ve been working at MIRI (or something) instead? Who are you thinking of?
I don’t think we got even a single Eliezer-substitute, even though that was one of the key goals of writing the Sequences.