Quadratic Reciprocity

Karma: 666

Quadratic Reciprocity 10 Jul 2022 16:52 UTC
13 points
2
on: Quadratic Reciprocity’s Shortform
Interesting bet on AI progress (with actual money) made in 1968:
1968 – Scottish chess champion David Levy makes a 500 pound bet with AI pioneers John McCarthy and Donald Michie that no computer program would win a chess match against him within 10 years.
1978 – David Levy wins the bet made 10 years earlier, defeating Chess 4.7 in a six-game match by a score of 4½–1½. The computer’s victory in game four is the first defeat of a human master in a tournament.
In 1973, Levy wrote:
“Clearly, I shall win my … bet in 1978, and I would still win if the period were to be extended for another ten years. Prompted by the lack of conceptual progress over more than two decades, I am tempted to speculate that a computer program will not gain the title of International Master before the turn of the century and that the idea of an electronic world champion belongs only in the pages of a science fiction book.”
After winning the bet:
“I had proved that my 1968 assessment had been correct, but on the other hand my opponent in this match was very, very much stronger than I had thought possible when I started the bet.” He observed that, “Now nothing would surprise me (very much).”
In 1996, Popular Science asked Levy about Garry Kasparov’s impending match against Deep Blue. Levy confidently stated that “...Kasparov can take the match 6 to 0 if he wants to. ‘I’m positive, I’d stake my life on it.’” In fact, Kasparov lost the first game, and won the match by a score of only 4–2. The following year, he lost their historic rematch 2.5–3.5.
So it seems like he very much underestimated progress in chess despite winning the original bet.
https://en.wikipedia.org/wiki/David_Levy_(chess_player)

[Question] How do AI timelines affect how you live your life?

Quadratic Reciprocity 11 Jul 2022 13:54 UTC

80 points

50 comments · 1 min read · LW link

Quadratic Reciprocity 12 Jul 2022 5:16 UTC
LW: 42 AF: 17
3
AF
on: On how various plans miss the hard bits of the alignment challenge
As someone with limited knowledge of AI or alignment, I found this post accessible. There were times when I thought I knew vaguely what Nate meant but would not be able to explain it so I’m recording my confusions here to come back to when I’ve read up more. (If anyone wants to answer any of these r/NoStupidQuestions questions, that would be very helpful too).
1. “Your first problem is that the recent capabilities gains made by the AGI might not have come from gradient descent”. This is something that comes up in response to a few of the plans. Is the idea that during training, for advanced enough AIs, capabilities gains come both from gradient descent and from processing input / interacting with the world? Or does the second part happen only after training has finished? What does that concretely look like in ML?
2. Is a lot of the disagreement about these plans just because others find the idea of a “sharp left turn” less likely than Nate does, or is there more agreement about that idea and the disagreement is about what proposals might give us a shot at solving it?
3. What might an ambitious interpretability agenda focused on the sharp left turn and the generalization problem look like besides just trying harder at interpretability?
4. Another explanation of the “sharp left turn” would also be really helpful to me. At the moment, it feels like I can only explain why that happens by using analogies to humans/apes rather than being able to give a clear explanation for why we should expect that by default, using ML/alignment language.

Quadratic Reciprocity 18 Jul 2022 15:15 UTC
3 points
0
on: Quadratic Reciprocity’s Shortform
When we compare results from PaLM 540B to our own identically trained 62B and 8B model variants, improvements are typically log-linear. This alone suggests that we have not yet reached the apex point of the scaling curve. However, on a number of benchmarks, improvements are actually discontinuous, meaning that the improvements from 8B to 62B are very modest, but then jump immensely when scaling to 540B. This suggests that certain capabilities of language models only emerge when trained at sufficient scale, and there are additional capabilities that could emerge from future generations of models.
Examples of the tasks for which there was discontinuous improvement were: english_proverbs (guess which proverb best describes a text passage from a list—requires a very high level of abstract thinking) and logical_sequence (order a set of “things” (months, actions, numbers, letters, etc.) into their logical ordering.).
An example of a logical_sequence task:
Input: Which of the following lists is correctly ordered chronologically? (a) drink water, feel thirsty, seal water bottle, open water bottle (b) feel thirsty, open water bottle, drink water, seal water bottle (c) seal water bottle, open water bottle, drink water, feel thirsty

Quadratic Reciprocity 18 Jul 2022 22:26 UTC
1 point
0
in reply to: Quadratic Reciprocity’s comment on: Quadratic Reciprocity’s Shortform
Over all 150 tasks [in BIG-bench], 25% of tasks had discontinuity greater than +10%, and 15% of tasks had a discontinuity greater than +20%.
Discontinuity = (actual accuracy for 540B model) - (log-linear projection using 8B → 62B)
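
To make the definition above concrete, here is a rough sketch of how I understand the calculation. It assumes "log-linear projection" means fitting a straight line of accuracy against log(parameter count) through the 8B and 62B points and extending it to 540B. The function names and the example accuracies are made up for illustration and are not taken from the paper.

```python
import math


def log_linear_projection(size_a, acc_a, size_b, acc_b, target_size):
    """Extend the line through two (log size, accuracy) points to target_size."""
    slope = (acc_b - acc_a) / (math.log(size_b) - math.log(size_a))
    return acc_b + slope * (math.log(target_size) - math.log(size_b))


def discontinuity(acc_8b, acc_62b, acc_540b):
    """Actual 540B accuracy minus the log-linear projection from 8B -> 62B."""
    projected = log_linear_projection(8e9, acc_8b, 62e9, acc_62b, 540e9)
    return acc_540b - projected


# Hypothetical accuracies, for illustration only (not real benchmark numbers).
example = discontinuity(acc_8b=0.20, acc_62b=0.25, acc_540b=0.60)
print(f"discontinuity: {example:+.1%}")
```

Under this reading, a task whose three accuracies sit exactly on a line in log(parameters) has a discontinuity of zero, and a positive value means the 540B model did better than the 8B → 62B trend predicted.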

Quadratic Reciprocity 3 Aug 2022 1:17 UTC
LW: 8 AF: 5
0
AF
on: Two-year update on my personal AI timelines
As a result, my timelines have also concentrated more around a somewhat narrower band of years. Previously, my probability increased from 10% to 60% over the course of the ~32 years from ~2032 and ~2064; now this happens over the ~24 years between ~2026 and ~2050.
10% probability by 2026 (!!)

Quadratic Reciprocity 3 Aug 2022 3:29 UTC
3 points
2
in reply to: Zach Stein-Perlman’s comment on: Two-year update on my personal AI timelines
While that is more coherent, it feels wrong to change the numbers like that if it causes losing track of the intuition of TAI by 2026 with probability 10%

Quadratic Reciprocity 12 Aug 2022 19:49 UTC
5 points
2
in reply to: Steven Byrnes’s comment on: Shard Theory: An Overview
the AGI might self-modify to be more coherent in a way that involves crushing / erasing a subset of its desires, and this subset might include the desires related to human flourishing. This is analogous to how you or I might try to self-modify to erase some of our (unendorsed) desires (to eat junk food, be selfish, be cruel, etc.), if we could
This seems like a big deal to me, because it feels like if I could modify myself, I would probably do so to make myself better at achieving a handful of goals like {having a positive impact, (maybe) obtaining a large amount of power/money, getting really really good at a particular skill} and everything else like {desire to eat good food, be vengeful, etc} would be thrown out. The first set feels different from the second because those desires feel more like maximising something, which is worrying.

Quadratic Reciprocity 13 Aug 2022 6:27 UTC
2 points
1
on: Team Shard Status Report
We’re going to take it off-distribution and see whether it terminally values (1) just the coins, (2) a small handful of in-distribution proxies for getting coins, or (3) all of its in-distribution proxies for coins
Is (2) here just referring to the type of stuff seen in Goal Misgeneralization in Deep Reinforcement Learning (CoinRun agent navigates to right-hand end of level instead of fetching the coin)?

Quadratic Reciprocity 15 Aug 2022 22:58 UTC
1 point
0
in reply to: David Udell’s comment on: Team Shard Status Report
Cool! How do you tell if it is (2) or (3)?

Quadratic Reciprocity 18 Aug 2022 6:37 UTC
9 points
2
on: Quadratic Reciprocity’s Shortform
I am constantly flipping back and forth between “I have terrible social skills” and “People only think I am smart and competent because I have charmed them with my awesome social skills”.

Quadratic Reciprocity 26 Aug 2022 4:05 UTC
12 points
6
in reply to: Tomás B.’s comment on: Common misconceptions about OpenAI
If we’re thinking about the same “very, very, very high-level person at OpenAI”, it does seem like this person now buys that inner alignment is a thing and is concerned about it (or says he’s concerned). It is scary because people at these AI labs don’t know all that much about AI alignment but also hopeful because they don’t seem to disagree with it and maybe just need to be given the arguments in a good way by someone they would listen to?

Quadratic Reciprocity 26 Aug 2022 4:16 UTC
6 points
1
in reply to: Noosphere89’s comment on: Common misconceptions about OpenAI
Letting GPT-3 interact with the internet seems pretty bad to me

Quadratic Reciprocity 29 Aug 2022 13:52 UTC
6 points
1
on: Quadratic Reciprocity’s Shortform
“If a factory is torn down but the rationality which produced it is left standing, then that rationality will simply produce another factory. If a revolution destroys a government, but the systematic patterns of thought that produced that government are left intact, then those patterns will repeat themselves. . . . There’s so much talk about the system. And so little understanding.”

Quadratic Reciprocity 5 Sep 2022 11:22 UTC
1 point
0
on: Quadratic Reciprocity’s Shortform
Game theory has many paradoxical models in which a player prefers having worse information, not a result of wishful thinking, escapism, or blissful ignorance, but of cold rationality. Coarse information can have a number of advantages.
(a) It may permit a player to engage in trade because other players do not fear his superior information.
(b) It may give a player a stronger strategic position because he usually has a strong position and is better off not knowing that in a particular realization of the game his position is weak.
Or, (c) as in the more traditional economics of uncertainty, poor information may permit players to insure each other.

Quadratic Reciprocity 11 Sep 2022 21:45 UTC
3 points
1
in reply to: the gears to ascension’s comment on: AI Risk Intro 1: Advanced AI Might Be Very Bad
I think this post is good for general consumption (if we’re thinking of general consumption as can be read and understood by the median college student). I think the ideal format for this post if targeted at that audience is probably different though—ideally, a lot of the text would be collapsible and hidden by default unless people want to read the longer explanation. So someone coming across this post could quickly understand the structure of the argument and then dive deeper into the sections they are unconvinced by or that feel unclear.

Quadratic Reciprocity 20 Sep 2022 0:31 UTC
19 points
9
in reply to: johnswentworth’s comment on: How my team at Lightcone sometimes gets stuff done
It would also be interesting to see examples of what terrible ops looks like. As one of the “kids”, here are some examples of things that were bad in previous projects I worked on (some mistakes for which I was responsible):

- getting obsessed with some bad metric (eg: number of people who come to an event) and spending lots of hours getting that number up instead of thinking about why I was doing that
- so many meetings, calling a meeting if there’s any uncertainty about what to do next
- there being uncertainty about what to do next because team members lack context, don’t know who’s responsible for what, who is working on what, etc
- some people taking on too much responsibility and not being able to pass it on because having to explain how to do a task to someone else would itself take up too much time
- very disorganised meetings where everyone wanted to have a say and it wasn’t clear at the end of it what the action steps were
- an unwillingness for the person with the most context to take the role of explicitly telling others what concrete tasks to do (because the other team members were their friends and they didn’t want to be too bossy)
- there not being enough structure for team members to give feedback if they thought an idea or project someone else was very excited about would be useless. As a result, some mini-projects got incubated that people weren’t excited about, and some projects predictably failed because the person who wanted to do them did not have enough information to avoid the failure modes other team members would have been concerned about (intuitions were not shared well). Also, people picking projects and tasks for bad reasons rather than via structured thinking about priorities
- including too many people in meetings because “we’d love to get person X’s thoughts on our plans as well (and person Y and person Z...)”
- it being hard to trust that other team members would actually get things assigned to them done on time, partly because it was difficult to see partial progress without having to ask the person how things were going
- changing platforms and processes based on whims rather than figuring things out early and sticking with them

I think some of the things mentioned in the post are pretty helpful for avoiding some of these problems.

Quadratic Reciprocity 29 Oct 2022 1:28 UTC
6 points
0
on: Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley
Is there a rough heuristic someone reading this post could use to decide if they have the requisite coding experience / comfort with python and PyTorch before applying and doing the TripleByte tests? In particular, how does the acceptance threshold differ from MLAB?

Quadratic Reciprocity 3 Nov 2022 21:07 UTC
3 points
0
on: All AGI Safety questions welcome (especially basic ones) [~monthly thread]
Why wouldn’t a solution to Eliciting Latent Knowledge (ELK) help with solving deceptive alignment as well? Isn’t the answer to whether the model is being deceptive part of its latent knowledge?

If ELK is solved in the worst case, how much more work needs to be done to solve the alignment problem as a whole?

Quadratic Reciprocity 3 Nov 2022 21:25 UTC
1 point
0
on: Quadratic Reciprocity’s Shortform
The volunteers’ dilemma game models a situation in which each player can either make a small sacrifice that benefits everybody or instead wait in the hope of benefiting from someone else’s sacrifice.
An example could be: you see a person dying and can decide whether to call an ambulance or not. You prefer that someone else call but if no one else would, you would strongly prefer calling over not calling and the person being dead. So if it is just you watching the person die, you would call the ambulance given these payoffs.

There are as many pure strategy Nash equilibria in this game as there are players: one “player x calls the ambulance, everyone else does not” equilibrium for every x. There’s also a symmetric mixed strategy Nash equilibrium where every player has the same probability p of calling the ambulance.

The fun part is that as the number of bystanders goes up, not only does your own probability of calling go down but also at equilibrium, the combined probability that anyone at all would call the ambulance also goes down. The person is more likely to die if there are more observers around, assuming everyone is playing optimally.
One implication of this is that if you have a team of people, unless you try to assign specific individuals to take charge of specific tasks, you might end up in situations where the probability of tasks happening at all decreases as you add more people (everyone feels like it’s less their responsibility to take care of any particular ball).
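
A quick sketch of the symmetric mixed equilibrium, to check the claim numerically. The payoffs are my own assumption (they are not specified above): everyone gets `benefit` if at least one person calls, a caller also pays `cost`, and nobody gets anything if no one calls, with 0 < cost < benefit. A player is indifferent between calling and not calling when benefit - cost = benefit * (1 - (1 - p)^(N - 1)), which gives p = 1 - (cost / benefit)^(1 / (N - 1)).

```python
def equilibrium_call_probability(n_players, cost, benefit):
    """Symmetric mixed-strategy probability that one given player calls.

    Assumes 0 < cost < benefit. A lone bystander always calls.
    """
    if n_players == 1:
        return 1.0
    return 1.0 - (cost / benefit) ** (1.0 / (n_players - 1))


def probability_someone_calls(n_players, cost, benefit):
    """Chance that at least one of the players calls at that equilibrium."""
    p = equilibrium_call_probability(n_players, cost, benefit)
    return 1.0 - (1.0 - p) ** n_players


# Illustrative payoffs only: calling costs 1, the person surviving is worth 4.
for n in (1, 2, 3, 5, 10, 100):
    p = equilibrium_call_probability(n, cost=1, benefit=4)
    total = probability_someone_calls(n, cost=1, benefit=4)
    print(f"N={n:>3}  p(each calls)={p:.3f}  p(anyone calls)={total:.3f}")
```

With these payoffs the chance that nobody calls is (cost / benefit)^(N / (N - 1)), which rises toward cost / benefit as N grows, so both columns shrink as bystanders are added.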
