RobertM
LessWrong dev & admin as of July 5th, 2022.
Having said that, I would NOT describe this as asking “how could I have arrived at the same destination by a shorter route”. I would just describe it as asking “what did I learn here, really”.
I mean, yeah, they’re different things. If you can figure out how to get to the correct destination faster next time you’re trying to figure something out, that seems obviously useful.
Some related thoughts. I think the main issue here is actually making the claim of permanent shutdown & deletion credible. I can think of some ways to get around a few obvious issues, but others (including moral issues) remain, and in any case the current AGI labs don’t seem like the kinds of organizations which can make that kind of commitment in a way that’s both sufficiently credible and legible that the remaining probability mass on “this is actually just a test” wouldn’t tip the scales.
I am not covering training setups where we purposefully train an AI to be agentic and autonomous. I just think it’s not plausible that we just keep scaling up networks, run pretraining + light RLHF, and then produce a schemer.[2]
Like Ryan, I’m interested in how much of this claim is conditional on “just keep scaling up networks” being insufficient to produce relevantly-superhuman systems (i.e. systems capable of doing scientific R&D better and faster than humans, without humans in the intellectual part of the loop). If it’s “most of it”, then my guess is that accounts for a good chunk of the disagreement.
Curated. I liked that this post had a lot of object-level detail about a process that is usually opaque to outsiders, and that the “Lessons Learned” section was also grounded enough that someone reading this post might actually be able to skip “learning from experience”, at least for a few possible issues that might come up if one tried to do this sort of thing.
(We check for “downvoter count within window”, not all-time.)
Curated. This dialogue distilled a decent number of points I consider cruxes between these two (clusters of) positions. I also appreciated the substantial number of references linking back to central and generally high-quality examples of each argument being made; I think this is especially helpful when writing a dialogue meant to represent positions people actually hold.
I look forward to the next installment.
Here’s the editor guide section for spoilers. (Note that I tested the instructions for markdown, and that does indeed seem broken in a weird way; the WYSIWYG spoilers still work normally but only support “block” spoilers; you can’t do it for partial bits of lines.)
In this case I think a warning at the top of the comment is sufficient, given the context of the rest of the thread, so it’s up to you whether you want to try to reformat your comment around our technical limitations.
Foobar! :::spoiler This text would be covered by a spoiler block. ::: test more stuff on the same line.
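For reference, a block-level spoiler (the kind the WYSIWYG editor does support) covers whole lines rather than part of a line; assuming the :::spoiler fence tested above is the intended block form, it would look something like this:

```markdown
:::spoiler
This whole block, rather than part of a line, would be hidden until the reader reveals it.
:::
```

(Though, per my testing above, the markdown route currently seems broken, so the WYSIWYG editor is the more reliable way to get a spoiler block.)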
I think that to the extent that other labs are “not far behind” (such as FAIR), this is substantially an artifact of them being caught up in a competitive arms race. Catching up to “nearly SOTA” is usually much easier than “advancing SOTA”, and I’m fairly persuaded by the argument that the top 3 labs are indeed ideologically motivated in ways that most other labs aren’t, and there would be much less progress in dangerous directions if they all shut down because their employees all quit.
I mean, it was published between 2010 and 2015, and it’s extremely rare for fanfiction (or other kinds of online serial fiction) to be more popular after completion than while it’s in progress. I followed it while it was in progress, and am in fact one of those people who found LessWrong through it. There was definitely an observable “wave” of popularity in both my in-person and online circles (which were not, at the time, connected to the rationality community at all); I think it probably peaked in 2012 or 2013.
I suspect a lot of problems came from the influx of HPMOR and FiO readers into LW without any other grounding point. Niplav thankfully avoided this wave, and I buy that empirical circles like Ryan Greenblatt’s social circle aren’t relying on fiction, but I’m worried that a lot of non-experts are so bad epistemically speaking that they ended up essentially writing fanfic on AI doom, forgetting to check whether the assumptions actually hold in reality.
This seems unlikely to me, since HPMOR peaked in popularity nearly a decade ago.
If you get value out of it, we’re happy for dialogues to be used that way, as long as it’s clear to all participants what the expectations re: publishing/not publishing are, so that nobody has an unpleasant surprise at the end of the day. (Dialogues currently allow any participant to unilaterally publish, since most other options we could think of imposed a lot of friction on publishing.)
We also have a system which automatically applies “core tags” (AI, Rationality, World Modeling, World Optimization, Community, and Practical) to new posts. It’s accurate enough, particularly with the AI tag, that it enables the use-case of “filter out all AI posts from the homepage”, which a non-zero number of users want, even if we still need to sometimes fix the tags applied to posts.
Intercom has the benefit of acting as an inbox on our side, unlike comments posted on LW (which may not be seen by any LW team member).
In an ideal world, would GitHub Issues be better for tracking bug reports? Probably, yes. But GitHub Issues require that the user reporting an issue navigate to a different page and have a GitHub account, which approximately makes it a non-starter as the top-of-funnel.
Intercom’s message re: response times has some limited configurability but it’s difficult to make it say exactly the right thing here. Triaging bug reports from Intercom messages is a standard part of our daily workflow, so you shouldn’t model yourself as imposing unusual costs on the team by reporting bugs through Intercom.
re: reliability—yep, we are not totally reliable here. There are probably relatively easy process improvements here that we will end up not implementing because figuring out & implementing such process improvements takes time, which means it’s competing with everything else we might decide to spend time on. Nevertheless I’m sorry about the variety of dropped balls; it’s possible we will try to improve something here.
re: issue tracker—right now our process is approximately “toss bugs into a dedicated Slack channel, shared with the EA Forum”. The EA Forum has a more developed issue-tracking process, so some of those do find their way to GitHub Issues (eventually).
Just as an FYI: pinging us on Intercom is a much more reliable way of ensuring we see feature suggestions or bug reports than posting comments. Most feature suggestions won’t be implemented[1]; bug reports are prioritized according to urgency/impact and don’t always rise to the level of “will be addressed” (though I think >50% do).
[1] At least not as a result of a single person suggesting them; we have, however, made decisions that were influenced on the margin by suggestions from one or more LW users.
Not all of these are NDAs; my understanding is that the OpenPhil request comes along with the news of the grant (and isn’t a contract). Really my original shortform should’ve been a broader point about confidentiality/secrecy norms, but...
I have more examples, but unfortunately some of them I can’t talk about. A few random things that come to mind:
- OpenPhil routinely requests that grantees not disclose that they’ve received an OpenPhil grant until OpenPhil publishes it themselves, which usually happens many months after the grant is disbursed.
- Nearly every instance that I know of where EA leadership refused to comment on anything publicly post-FTX due to advice from legal counsel.
- So many things about the Nonlinear situation.
- Coordination Forum requiring that attendees agree to confidentiality re: the attendance of, and the content of any conversations with, people who wanted to attend but not have their attendance known to the wider world, like SBF, and also people in the AI policy space.
As a recent example, from this article on the recent OpenAI kerfuffle:
Two people familiar with the board’s thinking say that the members felt bound to silence by confidentiality constraints.
“Not set up the right way” would be an understatement, I think. Lighthaven doesn’t have an indoor space which can seat several hundred people, and trying to do it outdoors seems like it’d require solving maybe-intractable logistical problems (weather, acoustics, etc.). (Also, Lighthaven was booked, and it’s not obvious to me to what degree we’d want to subsidize the solstice celebration. It’d also require committing a year ahead of time, since most other suitable venues are booked up for the holidays quite far in advance.)
I don’t think there are other community venues that could host the solstice celebration for free, but there might be opportunities for cheaper (or free) venues outside the community (with various trade-offs).