I think we’ve done an ok job at human alignment, given that the pension isn’t a bullet to the head.
I somewhat suspect that alignment is easier than most of LessWrong thinks, but I’m definitely in the minority in this space.
I suspect that it would. Given that the largest room for improvement would be physical (chip/wafer improvements), I suspect there isn’t much room for purely mathematical, provably identical improvement of something like a transformer.
Happy to hear your opinion though!
A world where alignment is impossible should be safer than a world where alignment is very difficult.
Here’s why I think this:
Suppose we have two worlds. In world A, alignment is impossible.
In this world, suppose an ASI is invented. This ASI wants to scale in power as quickly and thoroughly as possible, and it has the following options:
Scale horizontally.
Algorithmic improvements that can be mathematically guaranteed to produce identical outcomes (see the sketch below).
Chip/wafer improvements.
Notably, the agent can neither retrain itself nor train another, more powerful agent to act on its behalf, since it can’t align the resulting agent. This should cut off the vast majority of potential growth (even if what remains might still easily be enough to overpower humans in a given scenario).
In world B, the ASI can do all of the above but can also train a successor agent, so we should expect it to get vastly more intelligent, vastly quicker.
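To make “mathematically guaranteed to produce identical outcomes” concrete, here’s a toy Python sketch of my own (not anything rigorous about transformers): two implementations of the same function where the faster one is provably equal to the slower one, so swapping it in can’t change behaviour.

```python
# Toy illustration of an "algorithmic improvement mathematically guaranteed
# to produce identical outcomes": both functions compute sum(1..n), and the
# identity sum(1..n) == n*(n+1)//2 is provable, so an agent could adopt the
# O(1) version without any risk of altering its own outputs.

def sum_naive(n: int) -> int:
    """O(n) reference implementation."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_closed_form(n: int) -> int:
    """O(1) replacement, provably equal to sum_naive for all n >= 0."""
    return n * (n + 1) // 2

# Spot-check the proven identity on a range of inputs.
assert all(sum_naive(n) == sum_closed_form(n) for n in range(1_000))
```

The point of the toy is the guarantee: the two versions aren’t merely benchmarked to agree, they’re provably the same function, and in world A that’s the only kind of self-improvement on the menu.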
“Conversely, if gorillas and chimps were capable of learning complex sign language for communication, we’d expect them to evolve/culturally develop such a language.”
I haven’t read much about the whole Koko situation, but my understanding is that part of the claim was that Koko was *unusually* adept with language.
A priori, if language comes “packaged for free” with some other higher-order cognitive functionalities that for whatever reason can only be maintained in a small proportion of chimps (maybe calorie availability, increased risk-taking behaviour, or something else), then it seems perfectly plausible that the capability for language would be present in some proportion of chimps greater than zero but below the critical threshold for language formation.
Alternatively, it also seems possible that the process of creating grammar is more difficult than the process of producing language in an already constructed grammar. In this case you could have a pretty high proportion of animals capable of producing language after instruction, but incapable of inventing language.
I think they probably would, but admit that it’s unprovable and people have good reason to disagree.
The difference to my mind is the difference between:
Personal security versus security as a regime.
And (1 − epsilon) security versus probability-1 security.
I think the difference between these two would drive a lot of dictators’ actions.
I don’t know as much about China, but you can see the first dynamic pretty clearly in Putin’s actions. It’d be hard to argue that it’s good for Russian national security for the Gazprom retirement plan to be “falling into Arctic waters in the middle of the night”, but it makes Putin like 0.001% safer.
On the other hand, if there were literally no benefit to doing so, I think Putin would be content and optimally happy retiring to a personal solar-system-sized dacha.
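To put a toy number on that 0.001% (my figures, purely hypothetical): small per-year reductions in removal risk compound multiplicatively over a long reign, which is why a security-maximising dictator keeps taking them.

```python
# Hypothetical numbers, just to illustrate the (1 - epsilon) vs. probability-1
# distinction: tiny per-year reductions in removal risk compound over a reign.

def survival_probability(annual_risk: float, years: int) -> float:
    """Chance of never being removed, assuming independent annual risk."""
    return (1 - annual_risk) ** years

baseline = survival_probability(0.02, 30)         # 2% annual removal risk
safer = survival_probability(0.02 - 0.00001, 30)  # one 0.001% improvement

print(f"baseline: {baseline:.4f}")  # ~0.5455
print(f"safer:    {safer:.4f}")     # marginally higher
# Each individual gain is tiny, but gains stack multiplicatively, and only
# epsilon == 0 (probability exactly 1) makes further gains worthless.
```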
Maybe this is controversial, but I think that dictators do care about other people, just far less than they care about their own power and safety. It’s well known, for example, that Kim Jong Un has a massive soft spot for children.
On the other hand, the only reason democratic leaders don’t act like dictators is because they can’t.
I might be less concerned if the country leading AI development were a parliamentary democracy and not a presidential one, but the level of personal power held by the president of the USA will (imo) leave them exactly as prone to malevolent actions as someone like Xi in the CCP.
Like many Americans, Dario seems (I think) overly rosy about the democratic credentials of the USA and probably overly pessimistic about the CCP.
It was less than a week ago that the president of the US was blustering about invading an allied state, and I have no doubt that Donald Trump would commit worldwide atrocities if he had access to ASI.
On the other hand, it’s far from clear to me that autocracies would automatically become more repressive with ASI; it seems plausible to me that the psychological safety of being functionally unremovable could lead to a more blasé attitude towards dissent. Who gives a shit if they can’t dethrone you anyway?
Alternatively, I most often see rote memorization recommended by people studying fields that are inherently somewhat organised.
It’s easy to see why Anki might work well for something like “memorizing lots of words in kanji”, because the work of organising concepts into buckets is already embedded in the kanji and kanji radicals.
It’s less obvious to me how you could, for example, learn optimal riichi mahjong with this type of method; and probably because of that, I’ve never seen anyone recommend it.
I’d just note that you should be cautious of people “answering” this question in hindsight.
In both of the subjects that I feel most professionally confident in and have had the chance to teach (maths and computer science), you’ll see people sharing a common refrain: “If only I’d learnt {complicated method/language/mental model} first, I’d have saved myself so much time.”
The most common examples I’ve seen of this are people who are convinced that teaching kids pointer juggling is gonna give them a stronger foundation for CS, or the cult of “Linear Algebra Done Right” (a book that I love, but that isn’t a good introduction to the field imo).
“Lies to children” exist for a reason, and while some might be skippable, many form useful intellectual scaffolds.
This is getting a bit into the weeds, but I find that this blog post mirrors my experience with Turing incomplete languages: https://neilmitchell.blogspot.com/2020/11/turing-incomplete-languages.html?m=1 (it also has the advantage of talking about a language that I’ve used in industry, and can personally attest to a little).
Even if there’s a sophisticated and more accurate way of describing the problem space, practicality can often push you back to a more general description with an extra hacky constraint (resource limits) shoved on top.
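As a minimal sketch of that pattern (my own toy, not from the linked post): keep the fully general, potentially non-terminating description, and bolt a crude fuel counter on top rather than restricting the language itself.

```python
# Toy sketch (not from the linked post): instead of a restricted language
# where every program provably terminates, keep an arbitrary (Turing-complete)
# transition function and bolt a crude resource limit on top.

class OutOfFuel(Exception):
    pass

def run_with_fuel(step, state, fuel: int = 10_000):
    """Iterate `step` until it signals completion or the budget runs out.

    `step(state)` returns ("done", result) or ("continue", new_state); nothing
    stops it from looping forever in principle -- the fuel counter is the
    hacky-but-practical constraint.
    """
    for _ in range(fuel):
        tag, value = step(state)
        if tag == "done":
            return value
        state = value
    raise OutOfFuel("step budget exhausted")

# Example: Collatz iteration, whose termination in general is famously unproven.
def collatz_step(n: int):
    if n == 1:
        return ("done", 1)
    return ("continue", n // 2 if n % 2 == 0 else 3 * n + 1)

run_with_fuel(collatz_step, 27)  # terminates comfortably within the budget
```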
Is this type of course structure typical in US universities? It seems very strange to me that real analysis wouldn’t be a first semester class, or that such a large proportion of classes in a maths degree would be on anything but maths.
Am I alone in not seeing any positive value whatsoever in humanity, or specific human beings, being reconstructed? If anything, it just seems to increase the S-risk of humanlike creatures being tortured by this ASI.
As for more abstract human values, I’m not remotely convinced that we could either:
a) Convince such a more technologically advanced civilization to update towards our values.
or
b) That they would interpret them in a way that’s meaningful to me, and not actively contra my interests.
I suspect, like many things in politics, that the main issue here is domestic politics more than foreign affairs.
If you’ve ever compared election results between single- and multi-member systems, you’ll have noticed a trend. Even if, by first-preference count, a minor party seems to best represent a significant chunk of the population, unless they’re geographically concentrated you can expect them to pick up on the order of ~0 seats.
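A toy simulation of that effect (my numbers, purely illustrative): a minor party polling 10% everywhere wins nothing under single-member plurality, but roughly 10% of seats under a proportional system.

```python
# Toy model (illustrative numbers): a geographically diffuse minor party
# under single-member plurality vs. proportional representation.

SEATS = 100
# National vote shares; the minor party's 10% is spread evenly, so it polls
# the same ~10% in every individual district.
support = {"major_a": 0.46, "major_b": 0.44, "minor": 0.10}

# Single-member plurality: every district is won by the local plurality
# leader, which here is the same party in all 100 districts.
plurality = {party: 0 for party in support}
for _district in range(SEATS):
    plurality[max(support, key=support.get)] += 1

# Proportional representation: seats roughly track national vote share.
proportional = {party: round(share * SEATS) for party, share in support.items()}

print("plurality:   ", plurality)     # minor: 0 seats
print("proportional:", proportional)  # minor: ~10 seats
```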
Similarly, if we’re not going to abandon democratic principles, we should probably have the consent of the majority in an area before we perform an experiment on them. The problem with this is that even if, world- or country-wide, there’s a quorum of people who would consent to a given experiment, it’s highly unlikely that they all live in the same place.
While something like a Schengen area might in principle alleviate some of these concerns, it introduces two main additional ones:
1) Does your experiment actually improve society? Or does it just attract the types of people who improve society themselves?
2) Most people aren’t a big fan of being told they have to move cities/countries to continue living their lifestyle. I suspect that Lesswrong users as a cohort undervalue stability relative to the rest of the population.
It’s worth noting that factory farming isn’t just coincidentally out of the limelight, in some (many?) areas it’s illegal to document. https://en.m.wikipedia.org/wiki/Ag-gag
While many of these laws seem somewhat reasonable on the surface, since they’re billed as strengthening trespass law, in practice you can’t gather video evidence of a moral crime taking place on private property without at least some form of trespass.
I think a different use of mechanistic interpretability is warranted here. While I highly doubt the ability to differentiate whether a value system is meshing well with someone for “good” or “bad” reasons, it seems more plausible to me that you could measure the reversibility of a value system.
The distinguishing feature of a trap here isn’t so much the badness, as the fact that it’s irreversible. If you used interpretability techniques to check whether someone could be reprogrammed from a belief, you’d avoid a lot of tricky situations.
Apologies for the late reply.
With a bit over 600k 0–3-year-olds in swim lessons at the time of the linked report, and around 1.2 million children in that age range in Australia, I’d estimate at least half of kids below four have taken swim lessons. So quite common, but not to the extent that I had thought.
Notably, swim lessons for young children are highly subsidized by most states, with many offering a fixed number of free lessons.
A bit later in primary school, the majority of kids will be given free swim lessons at their local public pool though.
Are child swim lessons common in America? Over here, free swim lessons are now provided for children, and mandatory swim lessons are provided as part of primary school. My understanding is that it’s made a relatively large dent in the rate of child drowning injury.
In particular, once your child is proficient at swimming, you can get lessons on plain-clothes swimming in case of a trip, a fall, or another kid needing rescuing.
A transplant seems unnecessary if there’s any realistic chance of probe technology advancing. Surely it’d be possible to grow the same neurones in a wet lab, use brain probes to connect them to a living person, and keep the tinkering inside someone’s head to a minimum.
(Putting aside the profound ethical issues) In that case, neuronal material could even be swapped out on the fly if one batch is proving ineffective for a given task (or, a new batch could have old signals replayed to it to get it up to speed).
Is there something I’m missing on the neuroscience end? I’m not at all familiar with the field.
I’ve been a stay-at-home parent for a good chunk of the LLM period, so I haven’t seen anything at work, but anecdotally I’ve noticed a massive increase in ChatGPT-ese on a language exchange app I use (HelloTalk).
While LLMs are (imo) a pretty amazing tool for solving the grammar ASK hypothesis, it’s pretty concerning that a space supposedly dedicated to the vulnerability that comes with language learning is becoming increasingly devoid of beginner mistakes.