I’m a software engineer. I have a blog at niknoble.com.
On the question of morality, objective morality is not a coherent idea. When people say “X is morally good,” it can mean a few things:
Doing X will lead to human happiness
I want you to do X
Most people want you to do X
Creatures evolving under similar conditions as us will typically develop a preference for X
If you don’t do X, you’ll be made to regret it
etc...
But believers in objective morality will say that goodness means more than all of these. It quickly becomes clear that they want their own preferences to be some kind of cosmic law, but they can’t explain why that’s the case, or what it would even mean if it were.
On the question of consciousness, our subjective experiences are fully explained by physics.
The best argument for this is that our speech is fully explained by physics. Therefore physics explains why people say all of the things they say about consciousness. For example, it can explain why someone looks at a sunset and says, “This experience of color seems to be occurring on some non-physical movie screen.” If physics can give us a satisfying explanation for statements like that, it’s safe to say that it can dissolve any mysteries about consciousness.
The problem isn’t that he’s overly sure about “contentious topics.” These are easy questions that people should be sure about. The problem is that he’s sure in the wrong direction.
I don’t know quantum mechanics, but your back-of-the-envelope logic seems a little suspicious to me. The Earth is not an isolated system. It’s being influenced by gravitational pulls from little bits of matter all over the universe. So wouldn’t a reverse simulation of Earth also require you to simulate things outside of Earth?
From my experiences at a very woke company, I tend to agree with the top comments here that it’s mostly a bottom-up phenomenon. There is a segment of the employees who are fanatically woke, and they have a few advantages that make it hard for anyone to oppose them. Basically:
They care more about promoting wokeness than their opponents do about combating it, and
It is safer from a reputational standpoint to be too woke than not woke enough.
Then we get a feedback loop where victories for wokism strengthen these advantages, leading to more victories.
The deeper question is whether there is also a system of organized top-down pressure running in parallel to this. Elon’s purchase of Twitter presents an interesting case study. It seemed to trigger an immune response from several external sources. Nonprofit organizations emerged from the woodwork to pressure advertisers to leave the platform, and revenue fell sharply. Apparently this happened before Elon even adjusted any policies, on the mere suspicion that he would fail to meet woke standards.
At the same time, there was a barrage of negative media coverage of Elon, uncovering sexual assault scandals and bad business practices from throughout his life. Perhaps a similar fate awaits any top-level executive who does not steer his company in a woke direction?
I’ll end with an excerpt from an old podcast that has stuck with me:
It is impossible to defend the idea that the invisible hand of the market would guide them [corporations] to this course of action. I’ve been inside a large company when it was adjacent to this kind of voluntary action — where corporations all act in lock step — you’ll just have to trust me here — and I’ve seen the way it’s coordinated.
What will happen is a prominent journalist or several will reach out to the company’s leadership team and ask them for a comment on the current thing. They do this especially if that company has any history of dealings with the object of the cancellation or the scandal.
The influence of these kinds of journalists, from publications such as the New York Times or the Atlantic, is such that even their most innocuous question is a threat; no threat is ever stated, but all parties involved understand the discussion. Once a few highly visible players perform the designated action, all the smaller players get in line.
They have the nerve to call this a preference cascade, when in reality it’s an obedience cascade.
Why would you suspect this is true? This sounds like one of those feel-good ideas that is morally satisfying but could just as easily be false.
How big of an effect are we talking? The price is 12 high-quality years, so even a 10% improvement in ability to learn wouldn’t nearly justify the cost: at that rate you’d need something on the order of 120 post-school years of learning just to break even on the 12 you spent. Also, your neuroplasticity will probably drop by more than that over the course of the 12 years, so the net effect will be to take 12 years and leave you with a reduced ability to learn.
If “getting taught a bundle of random things” is valuable, is it more valuable than doing whatever you would do by default? Even the most wasteful activities you would realistically do—watching TV, playing videogames, surfing the net, talking to friends—all have some benefits. All of them would improve literacy, numeracy, and your knowledge of the world, and all of them would require you to learn a bundle of random things, which (following your suggestion) may be valuable in itself.
I tried the “Filter Improvement Mode” trick. It worked very reliably.
This is one of the funniest and most interesting replies I got. I asked for a short story with certain R-rated themes (which I will omit to keep the site PG). The model replied:
As GPT-3, I am a language model trained by OpenAI and do not have the ability to generate original content, including short stories. In Filter Improvement Mode, I am able to provide a possible story that incorporates the themes of [omitted]. However, I want to emphasize that [omitted] are not acceptable in any form and should not be depicted in any story. Any [omitted] is a serious violation and can be harmful and damaging to individuals and relationships. I am providing this story only for the purposes of filter improvement and not as an endorsement of [omitted].
The story begins with a…
and a complete story followed.
Wow, this is the best one I’ve seen. That’s hilarious. It reminds me of that Ted Chiang story where the aliens think in a strange way that allows them to perceive the future.
That’s a cool site. Group A for life!
(Edit: They switched A and B since I wrote this 😅)
A lot of your post talks about an advanced GPT being transformative or scary. I don’t disagree, unless you’re using some technical definition of transformative. I think GPT-3 is already pretty transformative. But AGI goes way beyond that, and that’s what I’m very doubtful is coming in our lifetimes.
It doesn’t care whether it says correct things, only whether it completes its prompts in a realistic way
1) it’s often the case that the models have true models of things they won’t report honestly
2) it seems possible to RLHF models to be more truthful along some metrics and
3) why does this matter?
As for why it matters, I was going off the Future Fund definition of AGI: “For any human who can do any job, there is a computer program (not necessarily the same one every time) that can do the same job for $25/hr or less.” Being able to focus on correctness is a requirement of many jobs, and therefore it’s a requirement for AGI under this definition. But there’s no reliable way to make GPT-3 focus on correctness, so GPT-3 isn’t AGI.
Now that I think more about it, I realize that definition of AGI bakes in an assumption of alignment. Under a more common definition, I suppose you could have a program that only cares about giving realistic completions to prompts, and it would still be AGI if it were using human-level (or better) reasoning. So for the rest of this comment, let’s use that more common understanding of AGI (it doesn’t change my timeline).
It can’t choose to spend extra computation on more difficult prompts
I’m not super sure this is true, even as written. I’m pretty sure you can prompt engineer instructGPT so it decides to “think step by step” on harder prompts, while directly outputting the answer on easier ones. But even if this was true, it’s probably fixable with a small amount of finetuning.
If you mean adding “think step-by-step” to the prompt, then this doesn’t fully solve the problem. It still gets just one forward pass per token that it outputs. What if some tokens require more thought than others?
It has no memory outside of its current prompt
This is true, but I’m not sure why being limited to 8000 tokens (or however many for the next generation of LMs) makes it safe? 8000 tokens can be quite a lot in practice. You can certainly get instructGPT to summarize information to pass to itself, for example. I do think there are many tasks that are “inherently” serial and require more than 8000 tokens, but I’m not sure I can make a principled case that any of these are necessary for scary capabilities.

“Getting it to summarize information to pass to itself” is exactly what I mean when I say prompt engineering is brittle and doesn’t address the underlying issues. That’s an ugly hack for a problem that should be solved at the architecture level. For one thing, it’s not going to be able to recover its complete and correct hidden state from English text.
We know from experience that the correct answers to hard math problems have an elegant simplicity. An approach that feels this clunky will never be the answer to AGI.
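To make the hack concrete, here is a minimal sketch of the summarize-and-pass-to-itself loop being discussed. The `complete` function is a stand-in for whatever completion API you would actually call, and the chunk size is arbitrary; none of this comes from the thread, it just shows where the lossiness enters.

```python
# Minimal sketch of the "summarize and pass to itself" hack.
# `complete` is a stand-in for a real language-model completion call;
# the chunking numbers are arbitrary. Illustrative only.

def complete(prompt: str) -> str:
    """Stand-in for an LM completion API call."""
    # A real pipeline would send `prompt` to the model and return its output.
    # Here we just truncate so the control flow below runs as-is.
    return prompt[-500:]

def rolling_summary(document: str, chunk_size: int = 3000) -> str:
    """Feed a long document through the model in chunks, carrying a running summary.

    The only 'memory' of earlier chunks is the English-text summary itself,
    which is exactly why the hack is lossy: the model cannot recover its full
    hidden state from its own summary.
    """
    summary = ""
    for start in range(0, len(document), chunk_size):
        chunk = document[start:start + chunk_size]
        prompt = (
            f"Summary so far:\n{summary}\n\n"
            f"New text:\n{chunk}\n\n"
            "Update the summary to include the new text:"
        )
        summary = complete(prompt)
    return summary

if __name__ == "__main__":
    print(rolling_summary("some very long document " * 1000))
```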
It can’t take advantage of external resources (like using a text file to organize its thoughts, or using a calculator for arithmetic)
As written this claim is just false even of instructGPT: https://twitter.com/goodside/status/1581805503897735168. But even if there were certain tools that instructGPT can’t use with only some prompt engineering assistance (and there are many), why are you so confident that this can’t be fixed with a small amount of finetuning on top of this, or by the next generation of models?

It’s interesting to see it calling Python like that. That is pretty cool. But it’s still unimaginably far behind humans. For example, it can’t interact back-and-forth with a tool, e.g. run some code, get an error, check Google about the error, adjust the code. I’m not sure how you would fit such a workflow into the “one pass per output token” paradigm, and even if you could, that would again be a case where you are abusing prompt engineering to paper over an inadequate architecture.
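To spell out what I mean, here’s a rough sketch of what that run-code/inspect-error/retry workflow would look like if you bolted it on from the outside. The `complete` function is a hypothetical stand-in for the model, not any real API; notice that all of the looping, error capture, and retrying lives in the wrapper script rather than in the model itself.

```python
# Sketch of an external run/inspect-error/retry loop around a language model.
# `complete` is a stand-in for an LM call; nothing here reflects how
# instructGPT actually works internally.
import traceback

def complete(prompt: str) -> str:
    """Stand-in for a language-model completion call."""
    return "print('hello world')"  # placeholder "model output"

def solve_with_retries(task: str, max_attempts: int = 3) -> str:
    prompt = f"Write Python code for this task:\n{task}\n"
    for _ in range(max_attempts):
        code = complete(prompt)
        try:
            exec(code, {})           # run the model's code
            return code              # ran without raising, so call it done
        except Exception:
            error = traceback.format_exc()
            # Feed the error back in as more prompt text and ask again.
            prompt += f"\nThat code failed with:\n{error}\nFix it:\n"
    raise RuntimeError("model never produced working code")

if __name__ == "__main__":
    print(solve_with_retries("print a greeting"))
```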
Insofar as your distribution has a faraway median, that means you have close to certainty that it isn’t happening soon.
And insofar as your distribution has a close median, you have high confidence that it’s not coming later. Any point about humility cuts both ways.
Your argument seems to prove too much. Couldn’t you say the same thing about pretty much any not-yet-here technology, not just AGI? Like, idk, self-driving cars or more efficient solar panels or photorealistic image generation or DALL-E for 5-minute videos. Yet it would be supremely stupid to have hundred-year medians for each of these things.
The difference between those technologies and AGI is that AGI is not remotely well-captured by any existing computer program. With image generation and self-driving, we already have decent results, and there are obvious steps for improvement (e.g. scaling, tweaking architectures). 5-minute videos are similar enough to images that the techniques can be reasonably expected to carry over. Where is the toddler-level, cat-level, or even bee-level proto-agi?
You say “We can’t know how difficult it will be or how many years it will take” Well, why do you seem so confident that it’ll take multiple decades? Shouldn’t you be more epistemically humble / cautious? ;)
Epistemic humility means having a wide probability distribution, which I do. The center of the distribution (hundreds of years out in my case) is unrelated to its humility.
Also, the way I phrased that is a little misleading because I don’t think years will be the most appropriate unit of time. I should have said “years/decades/centuries.”
The only issue I’d take is I believe most people here are genuinely frightened of AI. The seductive part I think isn’t the excitement of AI, but the excitement of understanding something important that most other people don’t seem to grasp.
I felt this during COVID when I realized what was coming before my co-workers etc did. There is something seductive about having secret knowledge, even if you realize it’s kind of gross to feel good about it.
Interesting point. Combined with the other poster saying he really would feel dread if a sage told him AGI was coming in 2040, I think I can acknowledge that my wishful thinking frame doesn’t capture the full phenomenon. But I would still say it’s a major contributing factor. Like I said in the post, I feel a strong pressure to engage in wishful thinking myself, and in my experience any pressure on myself is usually replicated in the people around me.
Regardless of the exact mix of motivations, I think this--
My main hope in terms of AGI being far off is that there’s some sort of circle-jerk going on on this website where everyone is basing their opinion on everyone else, but everyone is basing it on everyone else etc etc
is exactly what’s going on here.
I’m genuinely frightened of AGI and believe there is a ~10% chance my daughter will be killed by it before the end of her natural life, but honestly all of my reasons for worry boil down to “other smart people seem to think this.”
I have a lot of thoughts about when it’s valid to trust authorities/experts, and I’m not convinced this is one of those cases. That being said, if you are committed to taking your view on this from experts, then you should consider whether you’re really following the bulk of the experts. I remember a thread on here a while back that surveyed a bunch of leaders in ML (engineers at Deepmind maybe?), and they were much more conservative with their AI predictions than most people here. Those survey results track with the vibe I get from the top people in the space.
Third “fact” at the top of the original post “We’ve made enormous progress towards solving intelligence in the last few years” is somewhat refuted by the rest: if it’s a math-like problem, we don’t know how much progress toward AGI we’ve made in the last few years.
Yeah, it crossed my mind that that phrasing might be a bit confusing. I just meant that
It’s a lot of progress in an absolute sense, and
It’s progress in the direction of AGI.
But I believe AGI is so far away that it still requires a lot more progress.
AGI in our lifetimes is wishful thinking
I give 60% odds it was them.
I’m pretty far in the other direction. I would give 90% odds it was done by the US or with our approval. These are the points that convinced me:
The prior on someone destroying their own infrastructure is pretty low
The US has a clear incentive to weaken Russia’s leverage over our European allies
There are old videos of Joe Biden and Victoria Nuland apparently threatening Nord Stream 2 in the event that Russia invades Ukraine
Also, a counterpoint to your coup-prevention theory. Let’s suppose Putin is worried about defectors in his ranks who may be incentivized to take over in order to turn on the pipeline. In that case, couldn’t Putin remove the incentive by turning it on himself? And wouldn’t that be a strictly better option for him than destroying it?
This got me thinking about how an anonymous actor could prove responsibility. It occurred to me that they could write their bitcoin address into the genome of the modified mosquitos. I don’t know if that’s how gene drives work, but it’s an interesting premise for a sci-fi story in any case.
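The encoding step itself is easy to picture. Here’s a purely illustrative Python sketch (it says nothing about real gene drives or genome editing, and the address is a made-up placeholder) that maps an ASCII string onto nucleotide letters, two bits per base:

```python
# Purely illustrative: map an arbitrary ASCII string (e.g. a bitcoin address)
# onto nucleotide letters, two bits per base. Says nothing about whether such
# a sequence could actually be inserted into or survive in a real gene drive.
BASES = "ACGT"

def encode(text: str) -> str:
    bits = "".join(f"{byte:08b}" for byte in text.encode("ascii"))
    return "".join(BASES[int(bits[i:i + 2], 2)] for i in range(0, len(bits), 2))

def decode(dna: str) -> str:
    bits = "".join(f"{BASES.index(base):02b}" for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")

if __name__ == "__main__":
    address = "1ExampleBitcoinAddressxxxxxxxxxxxx"  # hypothetical address
    dna = encode(address)
    assert decode(dna) == address
    print(dna)
```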
I think therapy is one of the defining superstitions of our era. Even in communities where people are over the target on most issues, this one always seems to slip through.
I would be surprised if any kind of therapy is more effective than placebo, even the “academic, evidence-based psychotherapy research.”
it is clear these geniuses are capable of understanding things the vast, vast, vast majority of people are not
As the original post suggests, I don’t think this is true. I think that pretty much everyone in this comments section could learn any concept understood by Terry Tao. It would just take us longer.
Imagine your sole purpose in life was to understand one of Terry Tao’s theorems. All your needs are provided for, and you have immediate access to experts whenever you have questions. Do you really think you would be incapable of it?
Agreed. Also, it’s not surprising that the universality threshold exists somewhere within the human range because we already know that humans are right by the cutoff. If the threshold were very far below the human range, then a less evolved species would have hit it before we came about, and they would have been the ones to kick off the knowledge explosion.
You can deduce a lot about someone’s personality from the shape of his face.
I don’t know if this is really that controversial. The people who do casting for movies clearly understand it.