This happened with EURISKO.
When reading an academic paper, you don’t find it useful when the author points out their contributions? I definitely do. I like to know whether the author asserts ϕ because it’s the consensus in the field, or whether the author asserts ϕ because that’s the conclusion of the data. If I later encounter strong evidence against ϕ then this difference matters — it determines whether I update against that particular author or against the whole field.
Writing can definitely be overly “self-aware” sometimes (trust me I know!) but “classic style” is waaaayyy too restrictive.
My rule of thumb would be:
Write sentences that are maximally informative to your reader.
If you know that ϕ and you expect that the reader’s beliefs about the subject matter would significantly change if they also knew ϕ, then write that ϕ.
This will include sentences about the document and the author — rather than just the subject.
Avoiding hedging is only one aspect of classic style. I would also recommend against hedging, but I would replace hedging with more precise notions of uncertainty.
I think classic style is bad in all the situations for which Pinker endorses it:
This is because I can’t think of any situations where the five limitations I mention would be appropriate.
Quick remarks and questions:
AI developers have been competing to solve purely-adversarial / zero-sum games, like Chess or Go. But Diplomacy, in contrast, is semi-cooperative. Will it be safer if AGI emerges from semi-cooperative games than from purely-adversarial games?
Is it safer if AGI can be negotiated with?
No-Press Diplomacy was solved by DeepMind in 2020. Meta AI has just solved Full-Press Diplomacy. The difference is that in No-Press Diplomacy the players can’t communicate, whereas in Full-Press Diplomacy the players can chat for 5 minutes between rounds.
Is Full-Press Diplomacy more difficult than No-Press Diplomacy, other than the skill of communicating one’s intentions?
Full-Press Diplomacy requires a recursive theory of mind. Does No-Press Diplomacy also?
CICERO consists of a planning engine and a dialogue engine. How much of the “intelligence” is in the dialogue engine?
Maybe the planning engine is doing all the work, and the dialogue engine is just converting plans into natural language, but isn’t doing anything more impressive than that.
Alternatively, it might be that the dialogue engine (which is a large language model) contains latent knowledge and skills.
Could an architecture like this actually be used in international diplomacy and corporate negotiations? Will it be?
There’s hope among the AI Safety community that competent-but-not-yet-dangerous AI might assist them in alignment research. Maybe this Diplomacy result will boost hope in the AI Governance community that competent-but-not-yet-dangerous AI might assist them in governance. Would this hope be reasonable?
EA is constrained by the following formula:
Number of Donors x Average Donation = Number of Grants x Average Grant
If we lose a big donor, there are four things EA can do:
Increase the number of donors:
Outreach. Community growth. Might be difficult right now for reputation reasons, though fortunately, EA was very quick to denounce SBF.
Maybe lobby the government for cash?
Maybe lobby OpenAI, DeepMind, etc for cash?
Increase average donation:
Get another billionaire donor. Presumably, this is hard because otherwise EA would’ve done it already, but there might be factors that are hidden from me.
80K could begin pushing earning-to-give again. They shifted their recommendations a few years ago to promoting direct-impact careers. This made sense when EA was less funding-constrained.
Get existing donors to ramp up their donations. In the good ol’ days, EA used to be a club for people donating 60% of their income to anti-malaria bednets. Maybe EA will return to that frugal ascetic lifestyle.
Reduce the number of grants:
FTX was funding a number of projects. Some of these were higher priorities than others. Hopefully the high-priority projects retain their funding, whereas low-priority projects are paused.
EA has been engaged in a “hit-or-miss” approach to grant-making. This makes sense when you have more cash than sure-thing ideas. But now that we have less cash, we should focus on sure-thing ideas.
The problem with the “sure-thing” approach to grant-making is that it is biased toward certain causes (e.g. global health & dev) over others (e.g. x-risk). I think that would be a mistake. Someone needs to think about how to calibrate for this bias.
Here’s a tentative idea: EA needs more prizes and other forms of retrodictive funding. This will shift risk from the grant-maker to the researcher, which might be good because the researcher is more informed about the likelihood of success than the grant-maker.
Reduce average grant:
Maybe EA needs to focus on cheaper projects.
For example, in AI safety there has been a recent shift away from theoretical work (like MIRI’s decision theory) towards experimental work. This experimental work is very expensive because it involves (say) training large language models. This shift should be at least somewhat reversed.
Academics are very cheap! And they often already have funding. EA (especially AI safety) needs to do more outreach to established academics, such as top philosophers, mathematicians, economists, computer scientists, etc.
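To make the four levers concrete, here is a minimal sketch of the identity in Python. Every number is a made-up placeholder rather than real EA funding data, and the names `Budget` and `rebalance` are hypothetical helpers of mine. Each lever solves the identity for one variable while holding the other three fixed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Budget:
    donors: float
    average_donation: float
    grants: float
    average_grant: float

    @property
    def income(self) -> float:
        # Left-hand side: Number of Donors x Average Donation
        return self.donors * self.average_donation

    @property
    def spending(self) -> float:
        # Right-hand side: Number of Grants x Average Grant
        return self.grants * self.average_grant


def rebalance(budget: Budget, lever: str) -> Budget:
    """Restore income == spending by changing exactly one of the four terms."""
    if lever == "donors":
        return replace(budget, donors=budget.spending / budget.average_donation)
    if lever == "average_donation":
        return replace(budget, average_donation=budget.spending / budget.donors)
    if lever == "grants":
        return replace(budget, grants=budget.income / budget.average_grant)
    if lever == "average_grant":
        return replace(budget, average_grant=budget.income / budget.grants)
    raise ValueError(f"unknown lever: {lever}")


# Hypothetical shortfall: income has dropped below committed spending.
shortfall = Budget(donors=1000, average_donation=60_000,
                   grants=500, average_grant=200_000)

if __name__ == "__main__":
    for lever in ("donors", "average_donation", "grants", "average_grant"):
        print(lever, rebalance(shortfall, lever))
```

In practice EA would presumably pull on all four levers at once. The sketch only shows how large each adjustment has to be if it is made alone.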
(Cross-post from EA forum)
Are you saying that it’s too early to claim “SBF committed fraud”, or “SBF did something unethical”, or “if SBF committed fraud, then he did something unethical”?
I think we have enough evidence to assert all three.
Thanks for the comments. I’ve made two edits:
There is a spectrum between two types of people, K-types and T-types.
I’ve tried to include views I endorse in both columns; however, most of my own views are in the right-hand column because I am more K-type than T-type.
You’re correct that this is a spectrum rather than a strict binary. I should’ve clarified this. But I think it’s quite common to describe spectra by their extrema, for example:
Conflict theorists vs Mistake theorists
Convex and Concave Dispositions
Bullet-dodgers vs Bullet-swallowers.
You could still be doing perfect Bayesian reasoning regardless of your prior credences. Bayesian reasoning (at least as I’ve seen the term used) is agnostic about the prior, so there’s nothing defective about assigning a low prior to programs with high time-complexity.
when translating between proof theory and computer science:
(computer program, computational steps, output) is mapped to (axioms, deductive steps, theorems) respectively.
kolmogorov-complexity maps to “total length of the axioms” and time-complexity maps to “number of deductive steps”.
what do you mean “the solomonoff prior is correct”? do you mean that you assign high prior likelihood to theories with low kolmogorov complexity?
this post claims: many people assign high prior likelihood to theories with low time complexity. and this is somewhat rational for them to do if they think that they would otherwise be susceptible to fallacious reasoning.
To me it seems that it might just as well make timelines longer to depend on algorithmic innovations as opposed to the improvements in compute that would help increase parameters.
I’ll give you an analogy:
Suppose your friend is running a marathon. You hear that at the halfway point she has a time of 1 hour 30 minutes. You think “okay I estimate she’ll finish the race in 4 hours”. Now you hear she has been running with her shoelaces untied. Should you increase or decrease your estimate?
Well, decrease. The time of 1:30 is more impressive if you learn her shoelaces were untied! It’s plausible your friend will notice and tie up her shoelaces.
But note that if you didn’t condition on the 1:30 information, then your estimate would increase if you learned her shoelaces were untied for the first half.
Now for Large Language Models:
Believing Kaplan’s scaling laws, we figure that the performance of LLMs depends on N, the number of parameters. But maybe there’s no room for improvement in N-efficiency. LLMs aren’t much more N-inefficient than the human brain, which is our only reference-point for general intelligence. So we expect little algorithmic innovation. LLMs will only improve because N and D grow.
On the other hand, believing Hoffmann’s scaling laws, we figure that the performance of LLMs depends on D, the number of datapoints. But there is likely room for improvement in D-efficiency. LLMs are far more D-inefficient than the brain. So LLMs have been metaphorically running with their shoelaces untied. There is room for improvement. So we’re less surprised by algorithmic innovation. LLMs will still improve because N and D grow, but this isn’t the only path.
So Hoffmann’s scaling laws shorten our timeline estimates.
This is an important observation to grok. If you’re already impressed by how an algorithm performs, and you learn that the algorithm has a flaw which would disadvantage it, then you should increase your estimate of future performance.
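The marathon analogy can be checked with a toy model. This is only a sketch: the model is deterministic rather than properly Bayesian, and the lace penalty, the chance she ties her laces, and the fatigue factor are all numbers I made up. It shows that the same fact (untied laces) moves the estimate in opposite directions depending on whether you have already conditioned on the halfway time.

```python
# Toy, deterministic model of the marathon analogy.
# All parameters are hypothetical; times are in minutes.

PENALTY = 10.0  # assumed minutes lost per half when running with untied laces
P_TIES = 0.8    # assumed chance she notices and ties them at halfway
FATIGUE = 1.2   # assumed slowdown factor for the second half


def second_half(intrinsic_half: float, untied: bool) -> float:
    """Expected second-half time given her intrinsic (laces-tied) pace."""
    expected_penalty = PENALTY * (1 - P_TIES) if untied else 0.0
    return intrinsic_half * FATIGUE + expected_penalty


def predict_given_half(observed_half: float, untied: bool) -> float:
    """Estimate the finish time after observing the halfway split."""
    # If the laces were untied, the observed split understates her real pace.
    intrinsic_half = observed_half - (PENALTY if untied else 0.0)
    return observed_half + second_half(intrinsic_half, untied)


def predict_from_prior(prior_intrinsic_half: float, untied: bool) -> float:
    """Estimate the finish time without having observed the halfway split."""
    first_half = prior_intrinsic_half + (PENALTY if untied else 0.0)
    return first_half + second_half(prior_intrinsic_half, untied)


if __name__ == "__main__":
    for untied in (False, True):
        print("untied:", untied,
              "| given 1:30 split:", predict_given_half(90, untied),
              "| from prior alone:", predict_from_prior(90, untied))
```

With these assumed parameters, learning about the untied laces lowers the estimate once the 1:30 split is known and raises it when only the prior is available. The first direction holds whenever the fatigue factor exceeds the chance that she keeps running with untied laces (here 1.2 versus 0.2).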