Cleo Nardo
To me it seems that it might just as well make timelines longer if progress depends on algorithmic innovations, as opposed to the improvements in compute that would help increase parameter counts.
I’ll give you an analogy: Suppose your friend is running a marathon. You hear that at the halfway point she has a time of 1 hour 30 minutes. You think, “Okay, I estimate she’ll finish the race in 4 hours.” Now you hear she has been running with her shoelaces untied. Should you increase or decrease your estimate?
Well, decrease. The time of 1:30 is more impressive if you learn her shoelaces were untied! It’s plausible your friend will notice and tie up her shoelaces.
But note that if you didn’t condition on the 1:30 information, then your estimate would increase if you learned her shoelaces were untied for the first half.
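Here’s a minimal Monte Carlo sketch of the shoelace point — entirely my own toy model with made-up numbers, not anything from the original argument — just to show how conditioning on the halfway split flips the sign of the update:

```python
import numpy as np

# Toy model (my own assumptions): intrinsic half-marathon pace is Gaussian,
# untied shoelaces add a fixed penalty per half, and the runner ties them
# for the second half with probability 0.5.
rng = np.random.default_rng(0)
n = 1_000_000

pace = rng.normal(90, 15, n)          # intrinsic minutes per half-marathon
untied = rng.random(n) < 0.5          # shoelaces untied during the first half?
ties_later = rng.random(n) < 0.5      # does she tie them for the second half?
penalty = 10.0                        # extra minutes per half while untied

first_half = pace + penalty * untied
second_half = pace + penalty * (untied & ~ties_later)
finish = first_half + second_half

# Unconditionally, "untied" is bad news about the finish time:
print(finish[untied].mean(), finish[~untied].mean())

# Conditional on a ~90-minute first half, "untied" becomes good news:
split_90 = np.abs(first_half - 90) < 1
print(finish[split_90 & untied].mean(), finish[split_90 & ~untied].mean())
```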
Now for Large Language Models: Believing Kaplan’s scaling laws, we figure that the performance of LLMs depends on the number of parameters. But maybe there’s no room for improvement in parameter-efficiency. LLMs aren’t much more parameter-inefficient than the human brain, which is our only reference point for general intelligence. So we expect little algorithmic innovation. LLMs will only improve because compute and parameter-count grow.
On the other hand, believing Hoffmann’s scaling laws, we figure that the performance of LLMs depends on the number of datapoints. But there is likely room for improvement in data-efficiency. The brain is far more data-efficient than LLMs. So LLMs have been metaphorically running with their shoelaces untied. There is room for improvement, so we’re less surprised by algorithmic innovation. LLMs will still improve because compute and data grow, but this isn’t the only path.
So Hoffmann’s scaling laws shorten our timeline estimates.
This is an important observation to grok. If you’re already impressed by how an algorithm performs, and you learn that the algorithm has a flaw which would disadvantage it, then you should increase your estimate of future performance.
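To make the Kaplan-vs-Hoffmann contrast concrete, here is a rough sketch of the parametric loss fitted in Hoffmann et al. 2022. The coefficients are the paper’s fitted values as I recall them, so treat them as illustrative rather than authoritative:

```python
def chinchilla_loss(N, D, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Chinchilla-style predicted loss as a function of parameters N and tokens D.
    Coefficients are approximate values from Hoffmann et al. 2022 (from memory)."""
    return E + A / N ** alpha + B / D ** beta

# Kaplan-style thinking treats the N term as the bottleneck, so you scale parameters.
# The Hoffmann-style fit says the D term matters just as much, so data-efficiency
# (shrinking the B / D**beta term algorithmically) is a live target for innovation.
print(chinchilla_loss(N=70e9, D=1.4e12))   # roughly Chinchilla-scale training
print(chinchilla_loss(N=280e9, D=300e9))   # roughly Gopher-scale training
```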
what do you mean “the solomonoff prior is correct”? do you mean that you assign high prior likelihood to theories with low kolmogorov complexity?
this post claims: many people assign high prior likelihood to theories with low time complexity. and this is somewhat rational for them to do if they think that they would otherwise be susceptible to fallacious reasoning.
when translating between proof theory and computer science:
(computer program, computational steps, output) is mapped to (axioms, deductive steps, theorems) respectively.
kolmogorov-complexity maps to “total length of the axioms” and time-complexity maps to “number of deductive steps”.
You could still be doing perfect Bayesian reasoning regardless of your prior credences. Bayesian reasoning (at least as I’ve seen the term used) is agnostic about the prior, so there’s nothing defective about assigning a low prior to programs with high time-complexity.
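To illustrate with toy numbers of my own (nothing here is from the post): a Kolmogorov-style prior penalises only description length, while a speed-prior-style alternative also penalises runtime, and both slot straight into Bayesian updating — they just put their mass in different places.

```python
import math

theories = {
    # name: (description length in bits, runtime in steps) -- made-up numbers
    "short-but-slow": (10, 2 ** 60),
    "long-but-fast": (40, 2 ** 5),
}

def normalise(weights):
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# K-prior: weight 2^-length (Kolmogorov complexity only).
k_prior = normalise({n: 2.0 ** -length
                     for n, (length, _) in theories.items()})
# T-prior: weight 2^-(length + log2(runtime)), a speed-prior-style penalty.
t_prior = normalise({n: 2.0 ** -(length + math.log2(steps))
                     for n, (length, steps) in theories.items()})

print(k_prior)  # K-type: "short-but-slow" gets almost all the prior mass
print(t_prior)  # T-type: "long-but-fast" gets almost all the prior mass
```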
yep. amended.
Thanks for the comments. I’ve made two edits:
There is a spectrum between two types of people, K-types and T-types.
and
I’ve tried to include views I endorse in both columns; however, most of my own views are in the right-hand column because I am more K-type than T-type.
You’re correct that this is a spectrum rather than a strict binary. I should’ve clarified this. But I think it’s quite common to describe spectra by their extrema, for example:
Quick remarks and questions:
AI developers have been competing to solve purely-adversarial / zero-sum games, like Chess or Go. But Diplomacy, in contrast, is semi-cooperative. Will it be safer if AGI emerges from semi-cooperative games rather than from purely-adversarial games?
Is it safer if AGI can be negotiated with?
No-Press Diplomacy was solved by DeepMind in 2020. Meta AI has just solved Full-Press Diplomacy. The difference is that in No-Press Diplomacy the players can’t communicate, whereas in Full-Press Diplomacy the players can chat for 5 minutes between rounds.
Is Full-Press more difficult than No-Press Diplomacy, other than the skill of communicating one’s intentions?
Full-Press Diplomacy requires a recursive theory of mind — does No-Press Diplomacy also?
CICERO consists of a planning engine and a dialogue engine. How much of the “intelligence” is the dialogue engine?
Maybe the planning engine is doing all the work, and the dialogue engine is just converting plans into natural language, but isn’t doing anything more impressive than that.
Alternatively, it might be that the dialogue engine (which is a large language model) contains latent knowledge and skills.
Could an architecture like this actually be used in international diplomacy and corporate negotiations? Will it be?
There’s hope among the AI Safety community that competent-but-not-yet-dangerous AI might assist them in alignment research. Maybe this Diplomacy result will boost hope in the AI Governance community that competent-but-not-yet-dangerous AI might assist them in governance. Would this hope be reasonable?
I think classic style is bad for all the situations in which Pinker endorses it:
Academic papers
Non-fiction books
Textbooks
Blog posts
Manuals
This is because I can’t think of any situations where the five limitations I mention would be appropriate.
Avoiding hedging is only one aspect of classic style. I would also recommend against hedging, but I would replace it with more precise notions of uncertainty.
Writing can definitely be overly “self-aware” sometimes (trust me I know!) but “classic style” is waaaayyy too restrictive.
My rule of thumb would be:
Write sentences that are maximally informative to your reader.
If you know that X, and you expect that the reader’s beliefs about the subject matter would significantly change if they also knew X, then write that X.
This will include sentences about the document and the author — rather than just the subject.
When reading an academic paper, you don’t find it useful when the author points out their contributions? I definitely do. I like to know whether the author asserts X because it’s the consensus in the field, or whether the author asserts X because that’s the conclusion of the data. If I later encounter strong evidence against X then this difference matters — it determines whether I update against that particular author or against the whole field.
I agree that Pinker’s advice is moderate — e.g. he doesn’t prohibit authors from self-reference.
But this isn’t because classic style is moderate — actually classic style is very strict — e.g. it does prohibit authors from self-reference.
Rather, Pinker’s advice is moderate because he weakly endorses classic style. His advice is “use classic style, except in rare situations where this would be bad on some other metric.”
If I’ve read him correctly, then he might agree with all the limitations of classic style I’ve mentioned.
(But maybe I’ve misread Pinker. Maybe he endorses classic style absolutely but uses “classic style” to refer to a moderate set of rules.)
Yeah I mostly agree with Connor’s interpretation of Death with Dignity.
I know a lot of the community thought it was a bad post, and some thought it was downright infohazardous, but the concept of “death with dignity” is pretty lindy actually. When a group of soldiers are fighting a battle with awful odds, they don’t change their belief to “a miracle will save us”, they change their goal to “I’ll fight till my last breath”.
If people find the mindset harmful, then they won’t use it. If people find the mindset helpful, then they will use it. But I think everyone should try out the mindset for an hour or two.
Yep, the crux is: do we need a unique solution which solves all our problems, or can we accept that different problems are solved by different solutions? I somewhat lean to the former.
BeReal — the app.
If you download the app BeReal then each day at a random time you will be given two minutes to take a photo with the front and back camera. All the other users are given a simultaneous “window of time”. These photos are then shared with your friends on the app. The idea is that (unlike Instagram), BeReal gives your friends a representative random sample of your life, and vice-versa.
If you and your friends are working on something impactful (e.g. EA or x-risk), then BeReal is a fun way to keep each other informed about your day-to-day life and work. Moreover, I find it keeps me “accountable” (i.e. stops me from procrastinating or wasting the whole day in bed).
Correct — there’s a chance the expected utility quantilizer takes the same action as the expected utility maximizer. That probability is the inverse of the number of actions in the quantile, which is quite small (possibly measure zero) because the action space is so large.
Maybe it’s defined like this so it has simpler mathematical properties. Or maybe it’s defined like this because it’s safer. Not sure.
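For concreteness, here is a minimal sketch of a q-quantilizer over a finite action space — my own toy illustration with a uniform base distribution, not the canonical definition — which makes the “inverse of the number of actions in the quantile” point visible:

```python
import random

def quantilize(actions, utility, q=0.1, rng=random):
    """Rank actions by utility and sample uniformly from the top q fraction."""
    ranked = sorted(actions, key=utility, reverse=True)
    top_quantile = ranked[: max(1, int(q * len(ranked)))]
    return rng.choice(top_quantile)

actions = list(range(1000))
utility = lambda a: -(a - 700) ** 2        # toy utility function, peaked at a = 700

print(max(actions, key=utility))           # the maximizer always picks 700
print(quantilize(actions, utility))        # the quantilizer picks 700 with prob ~1/100
```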
Okay, I think I’ll write an alignment-relevant distillation of Myers’ book.
It might be useful for thinking about embedded agency — and especially for the problem that Scott Garrabrant has recently been grappling with in his Cartesian Frames, i.e. how do we formalise world-states that admit multiple distinct decompositions into environment and agents.
The heuristic is “assemblage is safer than its primitives”.
Formally:
For every primitive x, assemblages a and b, and wiring diagram w, the following is true:
If w(x, a) strongly dominates a, then w(x, b) weakly dominates b.
Recall that w(x, a) is the wiring-together of x and a using the wiring diagram w.
In English, this says that x can’t be helpful in one assemblage and unhelpful in another.
I expect counterexamples to this heuristic to look like this:
Many corrigibility primitives allow a human to influence certain properties of the internal state of the AI.
Many interpretability primitives allow a human to learn certain properties of the internal state of the AI.
These primitives might make an assemblage less safe because the AI could use these primitives itself, leading to self-modification.
Google owns DeepMind, but it seems that there is little flow of information back and forth.
Example 1: Google Brain spent approximately $12M to train PaLM, and $9M was wasted on suboptimal training because DeepMind didn’t share the Hoffmann et al. 2022 (Chinchilla) results with them.
Example 2: I’m not a lawyer, but I think it would be illegal for Google to share any of its non-public data with DeepMind.