I am claiming that you can’t make a human seriously superhuman with a good education.
Is the claim that δo/δr for humans goes down over time so that o eventually hits an asymptote? If so, why won’t that apply to AI?
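If that is the claim, a toy version of it (the functional form and the constant k are invented purely for illustration) would be something like

$$o(r) = o_{\max}\left(1 - e^{-r/k}\right), \qquad \frac{\delta o}{\delta r} = \frac{o_{\max}}{k}\,e^{-r/k} \to 0 \text{ as } r \to \infty$$

where r is cumulative research effort and o is the resulting capability. On a curve like that, returns diminish for any researcher, human or AI, which is why I’d want an argument for why AI escapes it.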
Serious genetic modification is another story, but at that point, you’re building an AI out of protein.
But it seems quite relevant that we haven’t successfully done that yet.
You couldn’t get much better results just by throwing more compute at it.
Okay, so my new story for this argument is:
1. For every task T, there are bottlenecks that limit performance on it, which could be compute, data, algorithms, etc.
2. For the task of “AI research”, compute will not be the bottleneck.
3. So, once we get human-level performance on “AI research”, we can apply more compute to get exponential recursive self-improvement.
Is that your argument? If so, I think my question would be “why would the bottleneck in point 2 vanish by point 3?” I think the only way this would be true would be if the bottleneck was algorithms, and there was a discontinuous jump in the capability of algorithms. I agree that in that world you would see a hard/fast/discontinuous takeoff, but I don’t see why we should expect that (again, the arguments in the linked posts argue against that premise).
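A toy sketch of the worry, with the min() form and all numbers invented purely for illustration:

```python
# Illustrative bottleneck model: progress is limited by the scarcest input.
# The min() functional form and the numbers are made up; this is not a model
# of real AI research, just a picture of the argument's structure.

def research_progress(compute: float, algorithms: float, data: float) -> float:
    """Rate of progress when the scarcest input binds."""
    return min(compute, algorithms, data)

# Point 2: suppose algorithms (or researcher talent) are the binding constraint.
before = research_progress(compute=100.0, algorithms=1.0, data=50.0)    # 1.0

# Point 3: throwing 10x more compute at the problem changes nothing
# unless the non-compute bottleneck has somehow vanished.
after = research_progress(compute=1000.0, algorithms=1.0, data=50.0)    # 1.0

print(before, after)
```

On this picture, step 3 only goes through if getting human-level performance on “AI research” also relaxes the non-compute bottleneck, and that is exactly the step I think needs an argument.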
Humans are not currently capable of self improvement in the sense of increasing their own o. I was talking about the subset of worlds where research talent is the limiting factor. The “self improvement” section in bookstores doesn’t change the hardware or the operating system; it basically adds more data.
I’m not sure I understand this. Are you claiming δo/δr is not positive for humans?
In most of the scenarios where the first smarter-than-human AI is orders of magnitude faster than a human, I would expect a hard takeoff.
This sounds like “conditioned on a hard takeoff, I expect a hard takeoff”. It’s not exactly saying that, since speed could be different from intelligence, but you need to argue for the premise too: nearly all of the arguments in the linked post could be applied to your premise as well.
In a world where researchers have little idea what they are doing, and are running a new AI every hour hoping to stumble across something that works, the result holds.
In a world where research involves months thinking about maths, then a day writing code, then an hour running it, this result holds.
Agreed on both counts, and again I think the arguments in the linked posts suggest that the premises are not true.
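To spell out why I agree the conditional holds in (say) the second world, here is the rough arithmetic, with all durations and the speedup factor invented for illustration:

```python
# Wall-clock time of one research cycle, before and after a cognitive speedup.
# The durations (in hours) and the 1000x speedup are made-up illustrative numbers.

def cycle_hours(thinking: float, coding: float, running: float, speedup: float = 1.0) -> float:
    """Thinking and coding scale with how fast the researcher thinks; the compute run doesn't."""
    return (thinking + coding) / speedup + running

months_of_maths = 2 * 30 * 24  # ~2 months of thinking, in hours

human_cycle = cycle_hours(thinking=months_of_maths, coding=24, running=1)
fast_ai_cycle = cycle_hours(thinking=months_of_maths, coding=24, running=1, speedup=1000)

print(human_cycle)    # ~1465 hours, i.e. about two months per cycle
print(fast_ai_cycle)  # ~2.5 hours: the cycle collapses to roughly the compute time
```

The disagreement is not about this arithmetic; it’s about whether the premises hold, i.e. whether real AI research looks like thinking-dominated cycles plus a large jump in cognitive speed.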
As we went from having no algorithms that could (say) tell a cat from a dog straight to having algorithms superhumanly fast at doing so, with no intermediate algorithm that worked but took supercomputer hours, this seems like a plausible assumption.
This seems false to me. At what point would you say that we had AI systems that could tell a cat from a dog? I don’t know the history of object recognition, but depending on how you operationalize it, I think the answer could be anywhere between the 60s and “we still can’t do it”. (Though it’s also possible that people didn’t care about object recognition until the 21st century, and only did other types of computer vision in the 60s-90s. It’s quite strange that object recognition is considered an “interesting” task, given how little information you get from it.)
Humans are already capable of self-improvement. This argument would suggest that the smartest human (or the one who was best at self-improvement, if you prefer) should have undergone fast takeoff and become seriously overpowered, but this doesn’t seem to have happened.
In a world where the limiting factor is researcher talent, not compute
Compute is definitely a limiting factor currently. Why would that change?
I just read through those comments, and didn’t really find any rebuttals. Most of them seemed like clarifications, terminology disagreements, and intuitions without supporting arguments. I would be hard-pressed to distill that discussion into anything close to a response.
One key thing is that AFAICT, when Paul says ‘slow takeoff’ what he actually means is ‘even faster takeoff, but without a sharp discontinuity’, or something like that.
Yes, but nonetheless these are extremely different views with large implications for what we should do.
Fwiw, my epistemic state is similar to SoerenMind’s. I basically believe the arguments for slow/continuous takeoff, haven’t fully updated towards them because I know many people still believe in fast takeoff, but am surprised not to have seen a response in over a year. Most of my work now takes continuous takeoff as a premise (because it is not a good idea to premise on fast takeoff when I don’t have any inside-view model that predicts fast takeoff).
I think a lot of the intuition right now is “there is an argument that inner optimizers will arise by default; we don’t know how likely it is but evolution is one example so it’s not non-negligible”.
For the argument part, have you read More realistic tales of doom? Part 2 is a good explanation of why inner optimizers might arise.
Ooh, I might have to try this, it does sound better.
I didn’t really have the time to write up more explanation, so it was a choice between posting it as is or not posting it at all, and I went with posting it as is.
Makes sense. I think I could not tell how much I should be trying to understand this until I understood it. I probably would have chosen not to read it if I had known how long it would take and how important I thought it was (ex-post, not ex-ante). For posts where that’s likely to be true, I would push for not posting at all.
Another way you could see this: given my current state of knowledge about this post, I think I could spend ~15 minutes making it significantly easier to understand. The resulting post would have been one that I could have read more than 15 minutes faster, probably, for the same level of understanding.
I think it’s not worth making a post if you don’t get at least one person reading it in as much depth as I did; so you should at the very least be willing to trade off some of your time for an equal amount of time of that reader, and the benefit scales massively the more readers you have. The fact that this was not something you wanted to do feels like a fairly strong signal that it’s not worth posting since it will waste other people’s time.
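The rough arithmetic I’m gesturing at, using the illustrative ~15 minute figures from above:

$$\text{net time saved} \approx N \cdot \Delta t_{\text{reader}} - \Delta t_{\text{author}} \approx N \cdot 15\text{ min} - 15\text{ min}$$

so the extra editing pass breaks even with a single thorough reader (N = 1) and is pure gain for every reader after that.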
(Of course, it might have taken you longer than 15 minutes to make the post easier to understand, or readers might usually not take a whole 15+ minutes more to understand a post without exposition, but I think the underlying point remains.)
The point of the distillation step, thus, is just to increase sample efficiency by letting you get additional training in without requiring additional calls to H.
Note that my proposed modification does allow for that, if the adversary predicts that both of the answers are sufficiently good that neither one needs to be recursed on. Tuning α in my version should allow you to get whatever sample efficiency you want. An annealing schedule could also make sense.
(Also, the sum isn’t a typo—I’m using the adversary to predict the negative of the loss, not the loss, which I admit is confusing and I should probably switch it.)
Ah, yeah, I see it now.
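For my own benefit, here is roughly how I now picture the α mechanism. To be clear, this is a guess at the idea rather than the actual proposal: every name below is a hypothetical stand-in, and I’m treating α as a threshold on the adversary’s score, which may not be how it’s actually used.

```python
# Illustrative sketch of adversary-gated recursion (my reading of the idea only).
# `model`, `adversary`, `human_overseer`, and `alpha` are hypothetical stand-ins.

def train_step(question, model, adversary, human_overseer, alpha):
    """Spend an expensive call to H only when the adversary thinks an answer is weak."""
    answers = [model.answer(question) for _ in range(2)]

    # The adversary predicts the *negative* of the loss, so a higher score means
    # a better predicted answer (the sign convention from the comment above).
    scores = [adversary.predicted_negative_loss(question, a) for a in answers]

    if min(scores) >= alpha:
        # Both answers look good enough: train on them directly, with no call to H.
        # This is where the extra sample efficiency comes from.
        return answers
    else:
        # At least one answer looks weak: recurse / fall back to the overseer H.
        return [human_overseer.answer(question)]
```

Raising or lowering α (or annealing it over training) then trades off calls to H against H-free training data, which I take to be the point about sample efficiency.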
If I don’t understand something in your summary, I look it up, so I’ve already begun to organically build a useful knowledge base.
This seems like a great way to use the newsletter :)
Also, the newsletter provides me with a regular dose of reassurance and inspiration. Even when I don’t have time to thoroughly read the summaries, skimming them reminds me how interesting this field is.
Oh, I think there are a lot of email subscribers who skim/passively consume the newsletter. I didn’t focus very much on them in the retrospective because I don’t think I’m adding that much value to them.
It might be true that all of the people who read it thoroughly are subscribed by email, I’m not sure. It’s hard to tell because I expect skimmers far outnumber thorough readers, so seeing a few skimmers via the comments is not strong evidence that there aren’t thorough readers.
I think it might benefit me to read, once a year or maybe once a quarter, a higher-level summary that goes over which papers seemed most important that year, and which overall research trends seemed most significant. I’m not sure if this is worth the opportunity cost for you, but it’d be helpful to me and probably others.
A slightly different option would be to read the yearly AI alignment literature review, use that to find the top N most interesting papers, and read their summaries in the spreadsheet. This also has the benefit of showing you a perspective other than mine on what’s important—there could be an Agent Foundations paper in the list that I haven’t summarized.
(I’d be interested in that both from the standpoint of my own personal knowledge, as well as tracking how stable your opinions are over time – when you list something as particularly interesting or important, do you tend to still think so a year later?)
I think that the stability of my opinions is going up over time, mainly because I started the newsletter while still new to the field.
I also think it’d make more sense for LessWrong to curate a “highlights of the highlights” post once every 3-12 months, than what we currently do, which is every so often randomly decide that a recent Newsletter was particularly good and curate that.
This seems good; I’m currently thinking I could write something like that once every 25 newsletters (which is about half a year), which should also help me evaluate the stability of my opinions.
Yeah, I like the idea of having specific times for feedback, it does seem more likely that people actually bother to give feedback in those cases.
Also, consider the case where nothing in the newsletter ever becomes the subject of wide agreement: this suggests to me that either the field is not making enough progress to settle questions (which is very bad), or that the newsletter is by accident or design excluding ideas upon which the field might settle (which seems bad from the perspective of the newsletter).
Certainly when my opinions are right I would hope that they become widely agreed upon (and I probably don’t care too much if it happens via information cascade or via good epistemics). The question is about when I’m wrong.
That is to say, it is very clear that this is a newsletter, and that your opinion differs from that of the authors of the papers. This goes a long way to preventing the kind of uncritical agreement that typifies information cascades.
Journalism has the same property, but I do see uncritical agreement with things journalists write. Admittedly the uncritical agreement comes from non-experts, but with the newsletter I’m worried mostly about insufficiently critical agreement from researchers working on different areas, so the analogy kinda sorta holds.
Finally, I expect this field and the associated communities are unusually sensitive to information cascades as a problem, and therefore less likely to fall victim to them.
Agreed that this is very helpful (and breaks the analogy with journalism), and it’s the main reason I’m not too worried about information cascades right now. That said, I don’t feel confident that it’s enough.
I think overall I agree with you that they aren’t a major risk, and it’s good to at least get the information that you treat the opinions as opinions.
There’s a lot of speculation about related-ish topics in Chapter 3 of the sequence linked above.
Fwiw the quoted section was written by Paul Christiano, and I have used that blog post in my sequence (with permission).
Also, for this particular question you can read just Chapter 1 of the sequence.
Other relevant writing of mine:
Comment on the AUP post
Comment on the desiderata post
But it’s true that that quoted passage is the best summary of my current position. Daniel’s answer is a good example of an underlying intuition that drives this position.
I can’t quite convince myself that no good method of value learning exists, and some other competent people seem to disagree with me.
No good method of measuring impact, presumably?
Hmm, this seems roughly plausible. It doesn’t gel with my experience of how many people seem to be trying to enter the field (which I would have estimated almost an order of magnitude less, maybe 100-200), but it’s possible that there’s a large group of such people who I don’t interact with who nonetheless are subscribed to the newsletter.
We also might have different intended meanings of “career in the field”.