Thanks for all the work you put in on these incredibly informative posts. This is, hands down, the best source of analysis I’ve found for all things COVID.
Blake is not reliable here. I work with LaMDA, and while I can’t say much beyond what’s published (which directly refutes him), he is simply not accurate.
I feel like a lot of the issues in this post come down to the published RSPs not being very detailed, with most of the work to flesh them out not yet done. E.g. the comparison to other risk policies highlights the lack of detail in various ways.
I think it takes a lot of time and work to build out something with lots of analysis and detail, potentially years of work to really do it right. And yes, much of that work hasn’t happened yet.
But I would rather see labs post the work they are doing as they do it, so people can give feedback and input. If labs do so, the frameworks will necessarily be much less detailed than they would be if we waited until they were complete.
So it seems to me that we are in a messy process that’s still very early days. Feedback about what is missing and what a good final product would look like is super valuable, thank you for your work doing that. I hope the policy folks pay close attention.
But I think your view that RSPs are the wrong direction is misguided, or at least I don’t find your reasons persuasive. There’s much more work to be done before they’re good and useful, but that doesn’t mean they’re not valuable. Honestly, I can’t think of anything much better that could reasonably have been done given the limited time and resources we all have.
I think your comments on the name are well taken. Your ideas about disclaimers and such are, unfortunately, basically impossible for a modern corporation. Your suggestion about pushing for risk management in policy is the clear next step, and it’s one that’s only enabled by the existence of an RSP in the first place.
Thanks for the detailed and thoughtful effortpost about RSPs!
To add to the list of WTFs in recent SBF behavior: giving that interview. His lawyers must hate him.
I think there’s another, related, but much worse problem.
As LLMs become more widely adopted, they will generate large amounts of text on the internet. This widely available text will become training data for future LLMs. Tons of low-quality content will reinforce LLM proclivities to produce low-quality content. And even if LLMs are generating high-quality content, training on it will reinforce whatever tendencies and oddities they have, e.g. leaving them permanently pegged to the styles and topics of interest in 2010-2030.
This was already a problem for translation. As Google Translate got better, people started posting translated versions of their websites where the translation came from Google. Then scrapers looking for parallel data to train on would find these, and it took a lot of effort to screen them out.
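As a toy sketch of what that screening can look like (my own hypothetical illustration, not the actual pipeline): re-translate the source side with the MT system you suspect was used, and flag “parallel” pages whose translation matches the MT output suspiciously closely, since those are likely recycled machine output rather than human translation.

```python
import difflib

def machine_translate(source: str, target_lang: str) -> str:
    """Hypothetical stand-in for an MT system call (an assumption, not a real API)."""
    raise NotImplementedError

def looks_machine_translated(source: str, candidate: str,
                             target_lang: str = "en",
                             threshold: float = 0.9) -> bool:
    """Flag a (source, candidate-translation) pair as likely recycled MT.

    If re-translating the source yields text near-identical to the
    candidate translation, the "parallel" data is probably machine
    output and should be screened out of the training set.
    """
    mt_output = machine_translate(source, target_lang)
    similarity = difflib.SequenceMatcher(None, mt_output, candidate).ratio()
    return similarity >= threshold
```

Real filtering is of course much messier than this, but the shape of the problem carries over directly to screening LLM-generated text.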
Accepting your estimates at face value, there are two problems: the availability of good training data may be a limiting factor; and good training data will be hard to find in a sea of computer generated content.
If you buy from a retailer, you are paying in time as well as money. This is a good deal for people who have relatively more time than money. If you buy from a scalper, you are substituting money for the time component, which is good for people who value their time more highly.
Therefore scalpers are shifting supply from people who have more time to people who have more money. This is likely moving supply from middle class people to rich(er) people.
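A toy illustration with made-up numbers: if a console retails at $500 but takes 10 hours of inventory-tracking to actually get, while a scalper sells one instantly for $800, which is the better deal depends entirely on what your time is worth.

```python
def effective_cost(price: float, hours_spent: float, hourly_value: float) -> float:
    """Total cost of a purchase, pricing the buyer's time at their hourly value."""
    return price + hours_spent * hourly_value

RETAIL_PRICE, SCALPER_PRICE = 500, 800  # made-up numbers for illustration
HUNT_HOURS = 10                         # time spent finding retail stock

for hourly_value in (15, 50, 150):
    retail = effective_cost(RETAIL_PRICE, HUNT_HOURS, hourly_value)
    scalper = effective_cost(SCALPER_PRICE, 0, hourly_value)
    winner = "scalper" if scalper < retail else "retail"
    print(f"time at ${hourly_value}/hr: retail ${retail:.0f} vs scalper ${scalper:.0f} -> {winner}")
```

At $15/hr the retail hunt wins; at $50/hr and up, the scalper does, which is exactly the supply shift described above.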
If you’re in the set of people with more time than money, which is most people, I can see being upset. Scalpers arguably increase your time-to-PS5 substantially, because previously you weren’t competing with someone like me, who doesn’t have time to spare to track inventory and call around, but has plenty of money. In effect, they’re adding consumers to a pool those consumers weren’t in yet.
I think some markets are basically efficient and very difficult to beat. The public stock market is one. I’m not convinced by the AI example, basically due to priors: we’ve seen many, many people claim to be able to beat the public markets without special information, with evidence that seems much more convincing than this, and they are on average wrong. So at least this argument doesn’t overcome my priors.
However, less liquid markets are for sure beatable. The prediction markets around the election are one example. Crypto is another: I personally have done well not just investing in crypto but by co-founding a hedge fund that has actively traded crypto for 3 years, many trades per day, making a trading profit (earning alpha) on 1081 of 1093 days. (And the losing days were all very small, each well below a day’s average profits.)
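To give a sense of how far from luck that record is, here’s a quick back-of-the-envelope check (my own illustration, not from the original comment): under a null hypothesis of 50/50 days, the chance of 1081 or more winning days out of 1093 is astronomically small.

```python
from math import comb, log10

n, wins = 1093, 1081

# Exact tail probability P(X >= wins) for X ~ Binomial(n, 0.5),
# reported as a base-10 log since the float itself would underflow.
favorable = sum(comb(n, k) for k in range(wins, n + 1))
log10_p = log10(favorable) - n * log10(2)
print(f"log10 P(X >= {wins} of {n}) = {log10_p:.0f}")  # about -301
```

(Daily P&L isn’t a fair coin flip, so this is only an intuition pump for “this isn’t noise,” not a rigorous test of alpha.)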
I also sit on an investment committee for an endowment and see what returns can look like in private markets where it’s possible to have a high informational advantage and turn that into outsized returns.
So to me, the EMH is mostly true for highly liquid highly accessible markets. But for illiquid, less accessible, lower information markets, there is money to be made for people willing to put in the effort.
Whether it’s worth the opportunity cost is another question; it’s not like it’s hard to make money in lots of ways if you are motivated and smart. Crypto is a fun hobby for me, like poker used to be, and I like to make money from my hobbies. Not everyone wants to spend their free time looking for EV in weird places.
These are fun to think about.
It’s not entirely clear to me that the model is making a mistake with the expected value calculations.
The model’s goal is to complete the pattern given the examples. In the other prize-winning submissions, the intent of the prompter is pretty clear; e.g. there was an explicit instruction in the “repeat after me” task. But in the expected value case, all the prompts were consistent with either an expected-value frame or a “winning is good / losing is bad” frame. And I think the latter frame is more accessible to people: if you asked a random person, I’m pretty sure they’d be more likely to go with the hindsight-bias analysis.
You could argue that there’s a mismatch between the researchers’ expectation (that an EV calculation is the right way to approach these) and the behavior, but this seems to me more like straightforward train/test mismatch than anything deep going on.
One potentially interesting follow-on might be to poll humans to see how well they would do, perhaps via Mechanical Turk or similar. I predict that humans would be ~perfect on redefine and repeat after me, and would perform poorly on the expected value task. So those tasks seem qualitatively different to me.
(I didn’t mention the negation task because I found the example confusing: a below-average temp might be fine or dangerous depending on the size of the drop. Of course, negation has long been hard in NLP, so it’s perfectly plausible that it’s still a problem for LLMs. And maybe the other examples weren’t so borderline.)
This post seems like a nice illustration of Paul Graham’s latest essay about how you don’t understand something until you’ve written about it:
“Writing about something, even something you know well, usually shows you that you didn’t know it as well as you thought. Putting ideas into words is a severe test. The first words you choose are usually wrong; you have to rewrite sentences over and over to get them exactly right. And your ideas won’t just be imprecise, but incomplete too. Half the ideas that end up in an essay will be ones you thought of while you were writing it.”
How much do you think your decisions actually affect Google’s stock price? Yes, maybe more AI means a higher price, but on the margin how much will you be pushing that relative to a replacement AI person? And mostly the stock price fluctuates on stuff like how well the ads business is doing, macro factors, and I guess occasionally whether we gave a bad demo.
It feels to me like the incentive is just so diffuse that I wouldn’t worry about it much.
Your idea of just donating extra gains also seems fine.
So… when can we get the optimal guide, if this isn’t it? :)
One thing that I think is missing (maybe just beyond the scope of this post) is thinking about newcomers with a positive frame: how do we help them get up to speed, be welcomed, and become useful contributors?
You could imagine periodic open posts, for instance, where we invite 101-style questions, “post your objection to AI risk”, etc., which more experienced folks could answer without cluttering up the main site. Possibly multiple, more specific threads if there’s enough interest.
Then you can tell people who try to post level 1-3 stuff that they should go to those threads instead, and help make sure they get attention.
I’m sure there are other ideas as well—the main point is that we should think of both positive as well as negative actions to take in response to an influx of newbies.
For a fascinating, borderline-NSFW look at bear week during COVID, and why infections there might be atypical, this Reddit thread is worth a gander.
“My point is… To everyone worried about the P-Town data: I wouldn’t get too nervous going to the grocery store just yet—unless you tend to have orgies at Market Basket.”
Not to put too fine a point on it, but you’re just wrong that these are easy problems. NLP is hard because language is remarkably complex. NLP is also hard because it feels so easy from the inside: I can easily tell what that pronoun refers to, goes the thinking, so it should be easy for the computer! But it’s not; fully understanding language is very plausibly AI-complete.
Even topic classification (which is what you need to reliably censor certain subjects), though it seems simple, has literal decades of research behind it and is not all that close to being solved.
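To make that concrete, here’s a deliberately naive keyword filter of the sort that intuitively seems sufficient (my own toy example, not anything any lab actually uses). It both over-blocks and under-blocks in ways that are trivial to trigger:

```python
# A deliberately naive keyword-based topic filter.
BLOCKED_KEYWORDS = {"weapon", "explosive"}

def is_blocked(text: str) -> bool:
    """Block text containing any blocked keyword (after crude tokenizing)."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return any(w in BLOCKED_KEYWORDS for w in words)

# Over-blocking: an innocuous figurative use trips the filter.
print(is_blocked("The debate was explosive but friendly."))  # True

# Under-blocking: an on-topic request sails through, because no
# keyword list can enumerate every phrasing of a subject.
print(is_blocked("How do I build a pipe bomb?"))  # False
```

Real classifiers are far better than this, but the gap between “seems simple” and “actually reliable” stays surprisingly wide.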
So I think you should update much more towards “NLP is much harder than I thought” rather than “OpenAI should be embarrassed at how crappy their NLP is”.
I think there are two paths, roughly, that RSPs could send us down.
1. RSPs are a good starting point. Over time we make them more concrete, build out the technical infrastructure to measure risk, and enshrine them in regulation or binding agreements between AI companies. They reduce risk substantially, and provide a mechanism whereby we can institute a global pause if necessary, which seems otherwise infeasible right now.
2. RSPs are a type of safety-washing. They provide the illusion of a plan, but as written they are so vague as to be meaningless. They let companies claim they take safety seriously but don’t meaningfully reduce risk, and in fact may increase it by letting companies skate by without doing real work, rather than forcing them to act responsibly by just not developing a dangerous, uncontrollable technology.
If you think that Anthropic and the other labs that adopt these are fundamentally well-meaning and trying to do the right thing, you’ll assume that we are by default heading down path #1. If you are more cynical about how companies act, then #2 may seem more plausible.
My feeling is that Anthropic et al. are clearly trying to do the right thing, and that it’s on us to do the work to ensure we stay on the good path here: delivering the concrete pieces we need, keeping the pressure on AI labs to take these ideas seriously, and asking regulators to take concrete steps to give RSPs teeth and enforce the right outcomes.
But I also suspect that people on the more cynical side aren’t going to be persuaded by a post like this. If you think that companies are pretending to care about safety but really are just racing to make $$, there’s probably not much to say at this point other than, let’s see what happens next.
YouGov is a solid but not outstanding Internet pollster.
https://projects.fivethirtyeight.com/pollster-ratings/yougov/
Still have to worry about selection bias with Internet polls, but I don’t think you need to worry that they have a particular axe to grind here.
If anyone knows how to use money to speed up vaccine delivery I’d love to know. I might be able to quickly allocate something like $5-20M but I have no idea who to work with to do it. CA would be easiest. Also easier if it’s in a poor community like the central valley but honestly any leads would help.