porby

Karma: 1,687

porby 3 Aug 2022 3:59 UTC
22 points
13
in reply to: aogara’s comment on: Two-year update on my personal AI timelines
I suspect Chinchilla’s implied data requirements aren’t going to be that much of a blocker for capability gain. It is an important result, but it’s primarily about the behavior of current backpropped transformer based LLMs.
The data inefficiency of many architectures was known before Chinchilla, but the industry worked around it because it wasn’t yet a bottleneck. After Chinchilla, it has become one of the largest architectural optimization targets. Given the increase in focus and the relative infancy of the research, I would guess the next two years will see the picking of some very juicy low hanging fruit. There are a lot of options floating nearby in conceptspace and there is a lot of room to grow; I’d be surprised if data limitations still feel as salient in 2025.

Private alignment research sharing and coordination

porby4 Sep 2022 0:01 UTC

62 points

13 comments5 min readLW link

porby 5 Sep 2022 19:53 UTC
3 points
1
in reply to: Peter S. Park’s comment on: Private alignment research sharing and coordination
I suppose that’s an additional consideration. Keeping potentially concerning material out of trivially scraped training sets is pretty low cost and worth it.
I wouldn’t want to sacrifice much usability beyond the standard security measures to focus on that angle, though; that would mean trying to directly fight a threat which is 1. already able to misuse observed research, 2. already able to otherwise socially or technically engineer its way to gaining access to that research, and 3. somehow not already massively lethal without that research.

porby 5 Sep 2022 20:12 UTC
3 points
1
in reply to: Gunnar_Zarncke’s comment on: Private alignment research sharing and coordination
Could be worth poking at- the use case is quite a bit different, but I could see there being some common primitives. Probably some best practices if nothing else.

Why I think strong general AI is coming soon

porby28 Sep 2022 5:40 UTC

325 points

139 comments34 min readLW link 1 review

porby 28 Sep 2022 16:49 UTC
25 points
7
in reply to: jacob_cannell’s comment on: Why I think strong general AI is coming soon
Out of curiosity:
1. What rough probability do you assign to a 10x improvement in efficiency for ML tasks (GPU or not) within 20 years?
2. What rough probability do you assign to a 100x improvement in efficiency for ML tasks (GPU or not) within 20 years?
My understanding is that we actually agree about the important parts of hardware, at least to the degree I think this question is even relevant to AGI at this point. I think we may disagree about the software side, I’m not sure.
I do agree I left a lot out of the hardware limits analysis, but largely because I don’t think it is enough to move the needle on the final conclusion (and the post is already pretty long!).

porby 28 Sep 2022 17:10 UTC
16 points
0
in reply to: jacob_cannell’s comment on: Why I think strong general AI is coming soon
Looks like we’re in almost perfect agreement!

porby 28 Sep 2022 19:42 UTC
2 points
0
in reply to: jacob_cannell’s comment on: Why I think strong general AI is coming soon
Went ahead and included a callout for this explicitly in the text. Thanks for the feedback!

porby 28 Sep 2022 20:20 UTC
14 points
4
in reply to: Darcey’s comment on: Why I think strong general AI is coming soon
I think what’s going on is something like:
1. Being slightly better isn’t enough to unseat an entrenched option that is well understood. It would probably have to very noticeably better, particularly in scaling.
2. I expect the way the internal structures are used will usually dominate the details of the internal structure (once you’re already at the pretty good frontier).
3. If you’re already extremely familiar with transformers, and you can simply change how you use transformers for possible gains, you’re more likely to do that than to explore a from-scratch technique.
For example, in my research, I’m currently looking into some changes to the outer loop of execution to make language models interpretable by construction. I want to focus on that part of it, and I wanted the research to be easily consumable by other people. Building an entire new architecture from scratch would be a lot of work and would be less familiar to others. So, not surprisingly, I picked a transformer for the internal architecture.
But I also have other ideas about how it could be done that I suspect would work quite well. Bit hard to justify doing that for safety research, though :P
I think the amount of low hanging fruit is so high that we can productively investigate transformer derivatives for a long time without diminishing returns. They’re more like a canvas than some fixed Way To Do Things. It’s just also possible someone makes a jump with a non-transformer architecture at some point.

porby 28 Sep 2022 20:23 UTC
16 points
6
in reply to: Lone Pine’s comment on: Why I think strong general AI is coming soon
Now you know how the transformer feels!

porby 28 Sep 2022 21:02 UTC
14 points
10
in reply to: mocny-chlapik’s comment on: Why I think strong general AI is coming soon
I don’t actually think we’re bottlenecked by data. Chinchilla represents a change in focus (for current architectures), but I think it’s useful to remember what that paper actually told the rest of the field: “hey you can get way better results for way less compute if you do it this way.”
I feel like characterizing Chinchilla most directly as a bottleneck would be missing its point. It was a major capability gain, and it tells everyone else how to get even more capability gain. There are some data-related challenges far enough down the implied path, but we have no reason to believe that they are insurmountable. In fact, it looks an awful lot like it won’t even be very difficult!
With regards to whether deep learning goes anywhere: in order for this to occupy any significant probability mass, I need to hear an argument for how our current dumb architectures do as much as they do, and why that does not imply near-term weirdness. Like, “large transformers are performing {this type of computation} and using {this kind of information}, which we can show has {these bounds} which happens to include all the tasks it has been tested on, but which will not include more worrisome capabilities because {something something something}.”
The space in which that explanation could exist seems small to me. It makes an extremely strong, specific claim, that just so happens to be about exactly where the state of the art in AI is.

porby 28 Sep 2022 22:34 UTC
20 points
13
in reply to: mocny-chlapik’s comment on: Why I think strong general AI is coming soon
As mentioned in the post, that line of argument makes me more alarmed, not less.
1. We observe these AIs exhibiting soft skills that many people in 2015 would have said were decades away, or maybe even impossible for AI entirely.
2. We can use these AIs to solve difficult reasoning problems that most humans would do poorly on.
3. And whatever algorithms this AI is using to go about its reasoning, they’re apparently so simple that the AI can execute them while still struggling on absolutely trivial arithmetic.
4. WHAT?
Yes, the AI has some blatant holes in its capability. But what we’re seeing is a screaming-hair-on-fire warning that the problems we thought are hard are not hard.
What happens when we just slightly improve our AI architectures to be less dumb?

porby 28 Sep 2022 22:49 UTC
15 points
6
in reply to: agi-hater’s comment on: Why I think strong general AI is coming soon
Hmm...
Given the new account, the account name, the fact that there were a few posts in the minutes prior to this one rejected by the spam filter, the arguments, and the fact that the decently large followup comment was posted only 3 minutes after the first...
… are… are you the AI? Trying to convince me of dastardly things?
You can’t trick me!
:P

porby 29 Sep 2022 0:18 UTC
4 points
0
on: Private alignment research sharing and coordination
Following up on this:

I’ve contacted a number of people in the field regarding this idea (thanks to everyone who responded!).
The general vibe is “this seems like it could be useful, maybe, if it took off,” but it did not appear to actually solve the problems any specific person I contacted was having.
My expectation would be that people in large organizations would likely not publish anything in this system that they would not publish out in the open.
One of the critical pieces of this proposal is having people willing to coordinate across access boundaries. There were zero enthusiastic takers for that kind of burden, which doesn’t surprise me too much. Without a broad base of volunteers for that kind of task, though, this idea seems to require paying a group of highly trusted and well-informed people to manage coordination instead of, say, researching things. That seems questionable.
Overall, I still think there is an important hole in coordination, but I don’t think this proposal fills it well. I’m not yet sure what a better shape would be.
What links here?
- Has private AGI research made independent safety research ineffective already? What should we do about this? by Roman Leventov (23 Jan 2023 7:36 UTC; 43 points)
- Has private AGI research made independent safety research ineffective already? What should we do about this? by Roman Leventov (EA Forum; 23 Jan 2023 16:23 UTC; 15 points)

porby 29 Sep 2022 0:57 UTC
1 point
0
in reply to: Lone Pine’s comment on: Why I think strong general AI is coming soon
I did wonder about him. My understanding is that his most publicized bet was offering even odds on AGI in 2029. If I’m remembering that right… I can’t really fault him for trying to get free money from his perspective, but if one of the most notable critics in the field offers even odds on timelines even more aggressive than my own, I’m… not updating to longer timelines, probably.

porby 29 Sep 2022 2:06 UTC
15 points
3
in reply to: Ege Erdil’s comment on: Why I think strong general AI is coming soon
Notably, the result is correct; I did convert it to kelvin for the actual calculation. Just a leftover from when I was sketching things on wolframalpha. I’ll change that, since it is weird. (Thanks for the catch!)

porby 29 Sep 2022 16:55 UTC
9 points
4
in reply to: mocny-chlapik’s comment on: Why I think strong general AI is coming soon
The observation that things that people used to consider intelligent are now considered easy is critical.
The space of stuff remaining that we call intelligent, but AIs cannot yet do, is shrinking. Every time AI eats something, we realize it wasn’t even that complicated.
The reasonable lesson appears to be: we should stop default-thinking things are hard, and we should start thinking that even stupid approaches might be able to do too much.
It’s a statement more about the problem being solved, not the problem solver.
When you stack this on a familiarity with the techniques in use and how they can be transformatively improved with little effort, that’s when you start sweating.

porby 29 Sep 2022 17:17 UTC
2 points
0
in reply to: Qumeric’s comment on: Why I think strong general AI is coming soon
Thanks! I had forgotten that one; I’ll add it since it did seem to be one of the more meaningful ones.

porby 29 Sep 2022 17:44 UTC
6 points
5
in reply to: nlholdem’s comment on: Why I think strong general AI is coming soon
I didn’t actually update my timelines shorter in response to his bets since I was aware his motivations were partially to poke Elon and maybe get some (from what I understand his perspective to be) risk-free money. I’d just be far more persuaded had he offered odds that actually approached his apparent beliefs. As it is, it’s uninformative.
His 5 tests are indeed a solid test of capability, though some of the tests seem much harder than others. If an AI could do ³⁄₅ of them, I would be inclined to say AGI is extremely close, if not present.
I would be surprised if we see the cook one before AGI, given the requirement that it works in an arbitrary kitchen. I expect physical world applications to lag purely digital applications just because of the huge extra layer of difficulty imposed by working in a real time environment, all the extra variables that are difficult to capture in a strictly digital context, and the reliability requirements.
The “read a book and talk about it” one seems absolutely trivial in comparison.
I would really like to see him make far more predictions on a bunch of different timescales. If he predicted things correctly about GPT-4, the state of {whatever architecture} in 2025, the progress on the MATH dataset by 2025, and explained how all of these things aren’t concerning and so on, I would be much more inclined to step towards his position. (I don’t expect him to get everything right, that would be silly, I just want to see evidence, and greater details, of a generally functioning mental model.)

porby 29 Sep 2022 21:11 UTC
8 points
1
in reply to: nlholdem’s comment on: Why I think strong general AI is coming soon
So I think it fair to have it part of AGI.
I’d agree with that, I just strongly suspect we can hit dangerous capability without running this experiment first given how research proceeds. If there’s an AI system displaying other blatant signs of being an AGI (by this post’s definition, and assuming non-foom situation, and assuming we’re not dead yet), I won’t bother spending much time wondering about whether it could be a cook.
As an aside, I had a fun 20 minute chat with GPT-3 today and convinced myself that it doesn’t have the slightest understand of meaning at all!
Yup- GPT-3 is shallow in a lot of important ways. It often relies on what appears to be interpolation and memorization. The part that worries me is that architectures like it can still do very difficult reasoning tasks that many humans can’t, like the MATH dataset and minerva. When I look at those accomplishments, I’m not thinking “wow this ML architecture is super duper smart and amazing,” I think “uh oh that part of reasoning is apparently easy if current transformers can do it, while simultaneously failing at trivial things.” We keep getting signals that more and more of our ineffable cognitive skills are… just not that hard.
As we push into architectures that rely more on generalization through explicit reasoning (or maybe even interpolation/memorization at sufficiently absurd scales), a lot of those goofy little mistakes are going to collapse. I’m really worried that an AI that is built for actual reasoning with an architecture able to express what reasoning entails algorithmically is going to be a massive discontinuity, and that it might show up in less than 2 years. It might not take us all the way to AGI in one step, but I’m not looking forward to it.
I really dislike that, as a byproduct of working on safety research, I keep coming up with what look like promising avenues of research for massive capability gain. They seem so much easier to find than good safety ideas, or good ideas in the other fields I work in. I’ve done enough research that I know they wouldn’t all pan out, but the apparent ease is unsettling.

porby

Pri­vate al­ign­ment re­search shar­ing and coordination

Why I think strong gen­eral AI is com­ing soon

Private alignment research sharing and coordination

Why I think strong general AI is coming soon