It’s not a Schelling point if you communicate about it!
utilistrutil
MATS Alumni Impact Analysis
Something Is Lost When AI Makes Art
My favored version of this project would involve >50% of the work going into the econ literature and models on investor incentives, with attention to
Principal-agent problems
Information asymmetry
Risk preferences
Time discounting
And then a smaller fraction of the work would involve looking into AI labs, specifically. I’m curious if this matches your intentions for the project or whether you think there are important lessons about the labs that will not be found in the existing econ literature.
How does the fiduciary duty of companies to investors work?
OpenAI instructs investors to view their investments “in the spirit of a donation,” which might be relevant for this question.
I would really like to see a post from someone in AI policy on “Grading Possible Comprehensive AI Legislation.” The post would lay out what kind of safety stipulations would earn a bill an “A-” vs a “B+”, for example.
I’m imagining a situation where, in the next couple years, a big omnibus AI bill gets passed that contains some safety-relevant components. I don’t want to be left wondering “did the safety lobby get everything it asked for, or did it get shafted?” and trying to construct an answer ex-post.
I don’t know how I hadn’t seen this post before now! A couple weeks after you published this, I put out my own post arguing against most applications of analogies in explanations of AI risk. I’ve added a couple references to your post in mine.
Adult brains are capable of telekinesis, if you fully believe in your ability to move objects with your mind. Adults are generally too jaded to believe such things. Children have the necessary unreserved belief, but their minds are not developed enough to exercise the ability.
File under ‘noticing the start of an exponential’: A.I. Helped to Find a Vast Source of the Copper That A.I. Needs to Thrive
Scott Alexander says:
Suppose I notice I am a human on Earth in America. I consider two hypotheses. One is that everything is as it seems. The other is that there is a vast conspiracy to hide the fact that America is much bigger than I think—it actually contains one trillion trillion people. It seems like SIA should prefer the conspiracy theory (if the conspiracy is too implausible, just increase the posited number of people until it cancels out).
I am often confused by the kind of reasoning at play in the text I bolded. Maybe someone can help sort me out. As I increase the number of people in the conspiracy world, my prior on that world also decreases. If my prior falls faster than the number of people in the considered world grows, I will not be able to construct a conspiracy-world that allows the thought experiment to bite.
Consider the situation where I arrive at the airport, where I will wait in line at security. Wouldn’t I be more likely to discover a line 1000 people long than 100 people long? I am 10x more likely to exist in the longer line. The problem is that our prior on 1000-person security lines might be very low. The reasoning on display in the above passage would invite us to simply crank up the length of the line, say, to 1 million people. I suspect that SIA proponents don’t show up at the airport expecting lines this long. Why? Because the prior on a million-person line is more than ten thousand times lower than the prior on a 100-person line, which more than cancels the 10,000x boost from the longer line.
This also applies to some presentations of Pascal’s mugging.
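To make the airport arithmetic concrete, here is a minimal sketch of the SIA-style update, assuming the posterior on each hypothesis is proportional to its prior times its number of observers. The function name and the relative priors are hypothetical, chosen only for illustration.

```python
def sia_posterior(priors):
    """Weight each hypothesis's prior by its number of observers, then normalize.

    `priors` maps a line length (number of observers) to a relative prior.
    """
    weights = {n: n * p for n, p in priors.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


SHORT, LONG = 100, 1_000_000  # the long line holds 10,000x more observers

# Hypothetical relative priors (illustrative numbers, not taken from the post).
slow_falling = {SHORT: 1.0, LONG: 1e-3}  # prior falls 1,000x, slower than the line grows
fast_falling = {SHORT: 1.0, LONG: 1e-6}  # prior falls 1,000,000x, faster than the line grows

post_slow = sia_posterior(slow_falling)
post_fast = sia_posterior(fast_falling)

print("P(long line), slowly falling prior:", post_slow[LONG])
print("P(long line), quickly falling prior:", post_fast[LONG])
```

Under these made-up priors, the long line dominates in the first case (roughly 0.91) and is negligible in the second (roughly 0.01). The break-even point is a prior exactly 10,000 times lower, which is why cranking up the number of people only helps the thought experiment if the prior falls more slowly than the population grows.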
Jacob Steinhardt on predicting emergent capabilities:
There’s two principles I find useful for reasoning about future emergent capabilities:
If a capability would help get lower training loss, it will likely emerge in the future, even if we don’t observe much of it now.
As ML models get larger and are trained on more and better data, simpler heuristics will tend to get replaced by more complex heuristics. . . This points to one general driver of emergence: when one heuristic starts to outcompete another. Usually, a simple heuristic (e.g. answering directly) works best for small models on less data, while more complex heuristics (e.g. chain-of-thought) work better for larger models trained on more data.
The nature of these things is that they’re hard to predict, but general reasoning satisfies both criteria, making it a prime candidate for a capability that will emerge with scale.
I think you could also push to make government liable as part of this proposal.
MATS Winter 2023-24 Retrospective
How LLMs Work, in the Style of The Economist
There might be indirect effects like increasing hype around AI and thus investment, but overall I think those effects are small and I’m not even sure about the sign.
Sign of the effect of open source on hype? Or of hype on timelines? I’m not sure why either would be negative.
Open source --> more capabilities R&D --> more profitable applications --> more profit/investment --> shorter timelines
The example I’ve heard cited is Stable Diffusion leading to LoRA.
There’s a countervailing effect of democratizing safety research, which one might think outweighs the capabilities effect, because safety research is so much more neglected than capabilities research and has more low-hanging fruit.
GDP is an absolute quantity. If GDP doubles, then that means something. So readers should be thinking about the distance between the curve and the x-axis.
But 1980 is arbitrary. When comparing 2020 to 2000, all that matters is that they’re 20 years apart. No one cares that “2020 is twice as far from 1980 as 2000” because time did not start in 1980.
This is the difference between a ratio scale and a cardinal (interval) scale. In a cardinal scale, the distance between points is meaningful, e.g., “The gap between 2 and 4 is twice as big as the gap between 1 and 2.” In a ratio scale, there is a well-defined zero point, which means the ratios of points are also meaningful, e.g., “4 is twice as large as 2.”
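A small sketch of the point about 1980 being arbitrary: gaps between years survive a change of origin, but ratios do not. The helper name and the alternative origin of 1900 are hypothetical, picked only to illustrate.

```python
def gap_and_ratio(origin, earlier, later):
    """Measure two years from a chosen origin; return their gap and their ratio."""
    a = earlier - origin
    b = later - origin
    return b - a, b / a


# Measured from 1980, 2020 looks "twice as far" as 2000.
print(gap_and_ratio(1980, 2000, 2020))

# Measured from 1900, the gap is unchanged but the ratio is different.
print(gap_and_ratio(1900, 2000, 2020))
```

The gap is 20 years under either origin, while the ratio depends entirely on where the axis happens to start. GDP has no such freedom, since zero GDP is a real zero.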
I just came across this word from John Koenig’s Dictionary of Obscure Sorrows that nicely captures the thesis of All Debates Are Bravery Debates.
redesis n. a feeling of queasiness while offering someone advice, knowing they might well face a totally different set of constraints and capabilities, any of which might propel them to a wildly different outcome—which makes you wonder if all of your hard-earned wisdom’s fundamentally nontransferable, like handing someone a gift card in your name that probably expired years ago.
Analogy Bank for AI Safety
(and perhaps also reversing some past value-drift due to the structure of civilization and so on)
Can you say more about why this would be desirable?
Favorite post of the year so far!