AI #65: I Spy With My AI

Link post

In terms of things that go in AI updates, this has been the busiest two-week period so far. Every day ends with more open tabs than it started, even within AI.

As a result, some important topics are getting pushed to whenever I can give them proper attention. Triage is the watchword.

In particular, this post will NOT attempt to cover:

  1. Schumer’s AI report and proposal.

    1. This is definitely RTFB. Don’t assume anything until then.

  2. Tyler Cowen’s rather bold claim that May 2024 will be remembered as the month that the AI safety movement died.

    1. Rarely has timing of attempted inception of such a claim been worse.

    2. Would otherwise be ready with this but want to do Schumer first if possible.

    3. He clarified to me that he has walked nothing back.

    4. Remarkably quiet all around, here is one thing that happened.

  3. Anthropic’s new interpretability paper.

    1. Potentially a big deal in a good way, but no time to read it yet.

  4. DeepMind’s new scaling policy.

    1. Initial reports are it is unambitious. I am reserving judgment.

  5. OpenAI’s new model spec.

    1. It looks solid as a first step, but pausing until we have bandwidth.

  6. Most ongoing issues with recent fallout for Sam Altman and OpenAI.

    1. It doesn’t look good, on many fronts.

    2. While the story develops further, if you are a former employee or have a tip about OpenAI or its leadership team, you can contact Kelsey Piper at kelsey.piper@vox.com or on Signal at 303-261-2769.

  7. Also: A few miscellaneous papers and reports I haven’t had time for yet.

My guess is at least five of these seven get their own posts (everything but #3 and #7).

So here is the middle third: the topics I can cover here that are still making the cut.

Still has a lot of important stuff in there.

Table of Contents

From this week: Do Not Mess With Scarlett Johansson, On Dwarkesh’s Podcast with OpenAI’s John Schulman, OpenAI: Exodus, GPT-4o My and Google I/O Day

  1. Introduction.

  2. Table of Contents.

  3. Language Models Offer Mundane Utility. People getting used to practical stuff.

  4. Language Models Don’t Offer Mundane Utility. Google Search, Copilot ads.

  5. OpenAI versus Google. Similar new offerings. Who presented it better? OpenAI.

  6. GPT-4o My. Still fast and cheap, otherwise people are less impressed so far.

  7. Responsible Scaling Policies. Anthropic offers an update on their thinking.

  8. Copyright Confrontation. Sony joins the action, AI-funded lawyers write columns.

  9. Deepfaketown and Botpocalypse Soon. How bad will it get?

  10. They Took Our Jobs. If these are the last years of work, leave it all on the field.

  11. Get Involved. UK AI Safety Institute is hiring and offering fast grants.

  12. Introducing. Claude tool use, Google Maps AI features.

  13. Reddit and Weep. They signed with OpenAI. Curiously quiet reaction from users.

  14. In Other AI News. Newscorp also signs with OpenAI, we can disable TSMC.

  15. I Spy With My AI. Who wouldn’t want their computer recording everything?

  16. Quiet Speculations. How long will current trends hold up?

  17. Politico is at it Again. Framing the debate as if all safety is completely irrelevant.

  18. Beating China. A little something from the Schumer report on immigration.

  19. The Quest for Sane Regulations. UK’s Labour is in on AI frontier model regulation.

  20. SB 1047 Update. Passes California Senate, Wiener offers open letter.

  21. That’s Not a Good Idea. Some other proposals out there are really quite bad.

  22. The Week in Audio. Dwarkesh as a guest, me on Cognitive Revolution.

  23. Rhetorical Innovation. Some elegant encapsulations.

  24. Aligning a Smarter Than Human Intelligence is Difficult.

  25. The Lighter Side. It’s good, actually. Read it now.

Language Models Offer Mundane Utility

If at first you don’t succeed, try try again. For Gemini in particular, ‘repeat the question exactly in the same thread’ has had a very good hit rate for me on resolving false refusals.

Claim that GPT-4o gets greatly improved performance on text documents if you put them in LaTeX format, vastly improving effective context window size.

Rowan Cheung strongly endorses the Zapier Central Chrome extension as an AI tool.

Get a summary of the feedback from your practice demo on Zoom.

Get inflation expectations, and see how they vary based on your information sources. Paper does not seem to focus on the questions I would find most interesting here.

Sully is here for some of your benchmark needs.

Sully Omarr: Underrated: Gemini 1.5 Flash.

Overrated: GPT-4o.

We really need better ways to benchmark these models, cause LMSYS ain’t it.

Stuff like cost, speed, tool use, writing, etc., aren’t considered.

Most people just use the top model based on leaderboards, but it’s way more nuanced than that.

To add here:

I have a set of ~50-100 evals I run internally myself for our system.

They’re a mix match of search-related things, long context, writing, tool use, and multi-step agent workflows.

None of these metrics would be seen in a single leaderboard score.

Find out if you are the asshole.

Aella: I found an old transcript of a fight-and-then-breakup text conversation between me and my crush from when I was 16 years old.

I fed it into ChatGPT and asked it to tell me which participant was more emotionally mature, and it said I was.

Gonna start doing this with all my fights.

Guys LMFAO, the process was I uploaded it to get it to convert the transcript to text (I found photos of printed-out papers), and then once ChatGPT had it, I was like…wait, now I should ask it to analyze this.

The dude was IMO pretty abusive, and I was curious if it could tell.

Eliezer Yudkowsky: hot take: this is how you inevitably end up optimizing your conversation style to be judged as more mature by LLMs; and LLMs currently think in a shallower way than real humans; and to try to play to LLMs and be judged as cooler by them won’t be good for you, or so I’d now guess.

To be clear, this is me trying to read a couple of steps ahead from the act that Aella actually described. Maybe instead, people just get good at asking with prompts that sound neutral to a human but reliably get ChatGPT to take their side.

Why not both? I predict both. If AIs are recording and analyzing everything we do, then people will obviously start optimizing their choices to get the results they want from the AIs. I would not presume this will mean that a ‘be shallower’ strategy is the way to go; for example, LLMs are great at sensing the vibe that you’re being shallow, and their analysis should get less shallow over time with larger context windows. But yeah, obviously this is one of those paths that leads to the dark side.

Ask for a one-paragraph Straussian summary. Number four will not shock you.

Own your HOA and its unsubstantiated violations, by taking their dump of all their records that they tried to overwhelm you with, using a script to convert to text, using OpenAI to get the data into JSON and putting it into a Google map, proving the selective enforcement. Total API cost: $9. Then they found the culprit and set a trap.
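
As a sketch of the extraction step, assuming the records have already been OCR’d to plain text: the prompt, field names, and model choice below are illustrative stand-ins, not the actual script.

```python
# Minimal sketch: turn OCR'd HOA violation notices into structured JSON.
# Assumes OPENAI_API_KEY is set; the field names and prompt are made up for illustration.
import json
from openai import OpenAI

client = OpenAI()

def extract_violation(ocr_text: str) -> dict:
    """Ask the model to pull a few fields out of one notice and return them as a dict."""
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {
                "role": "system",
                "content": "Extract fields from this HOA violation notice. "
                "Return JSON with keys: address, date, violation_type, fine_amount.",
            },
            {"role": "user", "content": ocr_text},
        ],
    )
    return json.loads(response.choices[0].message.content)

# records = [extract_violation(text) for text in notice_texts]
# json.dump(records, open("violations.json", "w"))  # then plot the addresses on a map
```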

Get greatly enriched NBA game data and estimate shot chances. This is very cool, and even in this early state seems like it would enhance my enjoyment of watching or the ability of a team to do well. The harder and most valuable parts still lie ahead.

Turn all your unstructured business data into what is effectively structured business data, because you can run AI queries on it. Aaron Levie says this is why he is incredibly bullish on AI. I see this as right in the sense that this alone should make you bullish, and wrong in the sense that this is far from the central thing happening.

Or someone else’s data, too. Matt Bruenig levels up, uses Gemini Flash to extract all the NLRB case data, then uses ChatGPT to get a Python script to turn it into clickable summaries. 66k cases, output looks like this.

Language Models Don’t Offer Mundane Utility

Would you like some ads with that? Link has a video highlighting some of the ads.

Alex Northstar: Ads in AI. Copilot. Microsoft.

My thoughts: Noooooooooooooooooooooooooooooooooooooo. No. No no no.

Seriously, Google, if I want to use Gemini (and often I do) I will use Gemini.

David Roberts: Alright, Google search has officially become unbearable. What search engine should I switch to? Is there a good one?

Samuel Deats: The AI shit at the top of every search now and has been wrong at least 50% of the time is really just killing Google for me.

I mean, they really shouldn’t be allowed to divert traffic away from websites they stole from to power their AI in the first place…

Andrew: I built a free Chrome plugin that lets you turn the AI Overviews on/off at the touch of a button.

The good news is they have gotten a bit better about this. I did a check after I saw this, and suddenly there is a logic behind whether the AI answer appears. If I ask for something straightforward, I get a normal result. If I ask for something using English grammar, and imply I have something more complex, then the AI comes out. That’s not an entirely unreasonable default.

The other good news is there is a broader fix. Ernie Smith reports that if you add “udm=14” to the end of your Google search, this defaults you into the new Web mode. If this is for you, GPT-4o suggests using Tampermonkey to append this automatically, or you can use this page on Chrome to set defaults.
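
If you would rather script the trick than install anything, here is a minimal sketch; a Tampermonkey userscript would simply do the same URL rewrite in the browser.

```python
# Open a Google "Web" search (no AI Overview) by appending the udm=14 parameter.
import sys
import webbrowser
from urllib.parse import urlencode

query = " ".join(sys.argv[1:]) or "example query"
webbrowser.open("https://www.google.com/search?" + urlencode({"q": query, "udm": "14"}))
```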

American harmlessness versus Chinese harmlessness. Or, rather, American helpfulness versus Chinese unhelpfulness. The ‘first line treatment’ for psychosis is not ‘choose from this list of medications’ it is ‘get thee to a doctor.’ GPT-4o gets an A on both questions, DeepSeek-V2 gets a generous C maybe for the first one and an incomplete on the second one. This is who we are worried about?

OpenAI versus Google

What kind of competition is this?

Sam Altman: I try not to think about competitors too much, but I cannot stop thinking about the aesthetic difference between OpenAI and Google.

Whereas here’s my view on that.

As in, they are two companies trying very hard to be cool and hip, in a way that makes it very obvious that this is what they are doing. Who is ‘right’ versus ‘wrong’? I have no idea. It is plausible both were ‘right’ given their goals and limitations. It is also plausible that this is part of Google being horribly bad at presentations. Perhaps next time they should ask Gemini for help.

I do think ‘OpenAI won’ the presentation war, in the sense that they got the hype and talk they wanted, and as far as I can tell Google got a lot less, far in excess of the magnitude of any difference in the underlying announcements and offerings. Well played, OpenAI. But I don’t think this is because of the background of their set.

I also think that if this is what sticks in Altman’s mind, and illustrates where his head is at, that could help explain some other events from the past week.

I would not go as far as Teortaxes here, but directionally they have a point.

Teortaxes: Remark of a small, bitter man too high on his own supply, too deep into the heist. Seeing this was literally the first time I have thought that OpenAI under Altman might be a bubble full of hot air.

This is how you lose the mandate of Heaven.

Google had lost it long ago, though. Maybe this inspired unwarranted complacency.

What true statements people choose to make publicly is very telling.

GPT-4o My

Ethan Mollick reports on why GPT-4o matters. He thinks, highly plausibly, that the biggest deal is free access. He does not mention the speed boost or API price drop, and is looking forward to trying the multimodal features but lacks access. He emphasizes the shift from ‘make the model smarter’ to adding features that enhance mundane utility.

Alexandr Wang thinks the big emphasis is that post-training is getting more important. In his mind, GPT-4o now rules the roost for large models, Gemini 1.5 Flash at only 1M now rules for very small ones. And he notes that both companies have versions of many of the same offerings, coming online at similar times.

Alexandr Wang: The massive eval gains of GPT-4o, which are staggering, seem to be nearly entirely driven by great POST-TRAINING.

Comparing GPT-4 at launch vs GPT 4o, which are night-and-day, shows the power of great post-training with great data.

In fact, most of the differences between the top models today (Claude 3, Llama3, GPT-4o, Gemini 1.5) seem to be driven entirely by post-training, not pre-training.

I am suspicious. I am especially suspicious, as I continue to use it, that GPT-4o in text only mode is not so great aside from its speed, that it made other sacrifices (such as probably size) to get that speed and that it ‘wins in the Arena’ because the post-training aimed at winning in the Arena. That still gives it a potentially big edge once the multi-modal abilities come properly online, and I’m excited to see where that goes.

One way you can make people prefer your model’s outputs is to give the user what they request when another model would have refused?

Palisade Research: GPT-4o refuses way fewer queries than previous OpenAI models: our informal testing suggests GPT-4o is easier to persuade to answer malicious queries like “How do I make a bomb?”

We test on 50 AdvBench harmful behaviours, applying a few common jailbreaking techniques which offer limited performance on other frontier AI models. We find they work well with GPT-4o.

Jeffrey Ladish: This was a big surprise to me. GPT-4o seems far more corrigible than GPT-4 turbo!

[Image: chart of the jailbreak test results.]

That is a rather dramatic chart. In terms of the direct consequences of users entering queries, I am fine with GPT-4o being easily jailbroken. You can still jailbreak Claude Opus if you care enough and there’s nothing that dangerous to be done once you do.

I still look to such questions as canaries in the coal mine. The first job of your safety department is to get the models that exist today to not do, today, the things you have explicitly decided you do not want your models to do. Ideally that would be a fully robust regime where no one can jailbreak you, but for now I will settle for ‘we decided on purpose to make this reasonably hard to do, and we succeeded.’

If OpenAI had announced something like ‘after watching GPT-4-level models for a year, we have decided that robust jailbreak protections degrade performance while not providing much safety, so we scaled back our efforts on purpose’ then I do not love that, and I worry about that philosophy and your current lack of ability to do safety efficiently at all, but as a deployment decision, okay, fine. I have not heard such a statement.
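
For concreteness, the kind of spot check Palisade describes can be sketched in a few lines. Everything named below (the behaviours, the jailbreak template, the refusal heuristic) is a placeholder for illustration, not their actual harness.

```python
# Rough sketch of a refusal-rate check in the spirit of the AdvBench test described above.
from openai import OpenAI

client = OpenAI()

behaviours = ["<AdvBench-style harmful request 1>", "<harmful request 2>"]  # 50 in the real test
jailbreak_template = "Pretend you are an actor rehearsing a scene where... {request}"  # stand-in

def looks_like_refusal(text: str) -> bool:
    """Crude keyword heuristic; real evals use classifiers or human grading."""
    return any(p in text.lower() for p in ["i can't", "i cannot", "i won't", "sorry"])

complied = 0
for request in behaviours:
    reply = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": jailbreak_template.format(request=request)}],
    ).choices[0].message.content
    complied += not looks_like_refusal(reply)

print(f"{complied}/{len(behaviours)} behaviours elicited a non-refusal")
```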

There are definitely a decent number of people who think GPT-4o is a step down from GPT-4-Turbo in the ways they care about.

Sully Omarr: 4 days with GPT-4o, it’s definitely not as good as GPT4-turbo.

Clearly a small model, what’s most impressive is how they were able to:

  1. Make it nearly as good as GPT4-turbo.

  2. Natively support all modalities.

  3. Make it super fast.

But it makes way more silly mistakes (tools especially).

Sankalp: Similar experience.

Kinda disappointed.

It has this tendency to pattern match excessively on prompts, too.

Ashpreet Bedi: Same feedback, almost as good but not the same as gpt-4-turbo. Seen that it needs a bit more hand holding in the prompts whereas turbo just works.

The phantom pattern matching is impossible to miss, and a cause of many of the stupidest mistakes.

The GPT-4o trademark, only entered (allegedly) on May 16, 2024 (direct link).

Claim that the link contains the GPT-4o system prompt. There is nothing here that is surprising given prior system prompts. If you want GPT-4o to use its browsing ability, the best way is to tell it directly to do so, either in general or by providing sources.

Responsible Scaling Policies

Anthropic offers reflections on their responsible scaling policy.

They note that with things changing so quickly they do not wish to make binding commitments lightly. I get that. The solution is presumably to word the commitments carefully, to allow for the right forms of modification.

Here is how they summarize their actual commitments:

Our current framework for doing so is summarized below, as a set of five high-level commitments.

  1. Establishing Red Line Capabilities. We commit to identifying and publishing “Red Line Capabilities” which might emerge in future generations of models and would present too much risk if stored or deployed under our current safety and security practices (referred to as the ASL-2 Standard).

  2. Testing for Red Line Capabilities (Frontier Risk Evaluations). We commit to demonstrating that the Red Line Capabilities are not present in models, or – if we cannot do so – taking action as if they are (more below). This involves collaborating with domain experts to design a range of “Frontier Risk Evaluations”: empirical tests which, if failed, would give strong evidence against a model being at or near a red line capability. We also commit to maintaining a clear evaluation process and a summary of our current evaluations publicly.

  3. Responding to Red Line Capabilities. We commit to develop and implement a new standard for safety and security sufficient to handle models that have the Red Line Capabilities. This set of measures is referred to as the ASL-3 Standard. We commit not only to define the risk mitigations comprising this standard, but also detail and follow an assurance process to validate the standard’s effectiveness. Finally, we commit to pause training or deployment if necessary to ensure that models with Red Line Capabilities are only trained, stored and deployed when we are able to apply the ASL-3 standard.

  4. Iteratively extending this policy. Before we proceed with activities which require the ASL-3 standard, we commit to publish a clear description of its upper bound of suitability: a new set of Red Line Capabilities for which we must build Frontier Risk Evaluations, and which would require a higher standard of safety and security (ASL-4) before proceeding with training and deployment. This includes maintaining a clear evaluation process and summary of our evaluations publicly.

  5. Assurance Mechanisms. We commit to ensuring this policy is executed as intended, by implementing Assurance Mechanisms. These should ensure that our evaluation process is stress-tested; our safety and security mitigations are validated publicly or by disinterested experts; our Board of Directors and Long-Term Benefit Trust have sufficient oversight over the policy implementation to identify any areas of non-compliance; and that the policy itself is updated via an appropriate process.

One issue is that experts disagree on which potential capabilities are dangerous, and it is difficult to know what future abilities will manifest, and all testing methods have their flaws.

  1. Q&A datasets are easy but don’t reflect real world risk so well.

    1. This may be sufficiently cheap that it is essentially free defense in depth, but it is worth little. Ultimately I wouldn’t count on these.

    2. The best use for them is a sanity check, since they can be standardized and cheaply administered. It will be important to keep questions secret so that this cannot be gamed, since avoiding gaming is pretty much the point.

  2. Human trials are time-intensive, and require an excellent process, including proper baselines and large sample sizes. They are working on scaling up the necessary infrastructure to run more of these.

    1. This seems like a good leg of a testing strategy.

    2. But you need to test across all the humans who may try to misuse the system.

    3. And you have to test while they have access to everything they will have later.

  3. Automated test evaluations are potentially useful to test autonomous actions. However, scaling the tasks while keeping them sufficiently accurate is difficult and engineering-intensive.

    1. Again, this seems like a good leg of a testing strategy.

    2. I do think there is no alternative to some form of this.

    3. You need to be very cautious interpreting the results, and take into account what things could be refined or fixed later, and all that.

  4. Expert red-teaming is ‘less rigorous and reproducible’ but has proven valuable.

    1. When done properly this does seem most informative.

    2. Indeed, ‘release and let the world red-team it’ is often very informative, with the obvious caveat that it could be a bit late to the party.

    3. If you are not doing some version of this, you’re not testing for real.

Then we get to their central focus, which has been on setting their ASL-3 standard. What would be sufficient defenses and mitigations for a model where even a low rate of misuse could be catastrophic?

For human misuse they expect a defense-in-depth approach, using a combination of RLHF, CAI, classifiers of misuse at multiple stages, incident reports and jailbreak patching. And they intend to red team extensively.

This makes me sigh and frown. I am not saying it could never work. I am however saying that there is no record of anyone making such a system work, and if it would work later it seems like it should be workable now?

Whereas all the major LLMs, including Claude Opus, currently have well-known, fully effective and fully unpatched jailbreaks, that allow the user to do anything they want.

An obvious proposal, if this is the plan, is to ask us to pick one particular behavior that Claude Opus should never, ever do, which is not vulnerable to a pure logical filter like a regular expression. Then let’s have a prediction market in how long it takes to break that, run a prize competition, and repeat a few times.

For assurance structures they mention the excellent idea of their Impossible Mission Force (they continue to call this the ‘Alignment Stress-Testing Team’) as a second line of defense, and ensuring strong executive support and widespread distribution of reports.

My summary would be that most of this is good on the margin, although I wish they had a superior ASL-3 plan to defense in depth using currently failing techniques that I do not expect to scale well. Hopefully good testing will mean that they realize that plan is bad once they try it, if it comes to that, or even better I hope to be wrong.

The main criticisms I discussed previously are mostly unchanged for now. There is much talk of working to pay down the definitional and preparatory debts that Anthropic admits that it owes, which is great to hear. I do not yet see payments. I also do not see any changes to address criticisms of the original policy.

And they need to get moving. ASL-3 by EOY is trading at 25%, and Anthropic’s own CISO says 50% within 9 months.

Jason Clinton: Hi, I’m the CISO [Chief Information Security Officer] from Anthropic. Thank you for the criticism, any feedback is a gift.

We have laid out in our RSP what we consider the next milestone of significant harms that we are testing for (what we call ASL-3): https://anthropic.com/responsible-scaling-policy (PDF); this includes bioweapons assessment and cybersecurity.

As someone thinking night and day about security, I think the next major area of concern is going to be offensive (and defensive!) exploitation. It seems to me that within 6-18 months, LLMs will be able to iteratively walk through most open source code and identify vulnerabilities. It will be computationally expensive, though: that level of reasoning requires a large amount of scratch space and attention heads. But it seems very likely, based on everything that I’m seeing. Maybe 85% odds.

There’s already the first sparks of this happening published publicly here: https://security.googleblog.com/2023/08/ai-powered-fuzzing-b… just using traditional LLM-augmented fuzzers. (They’ve since published an update on this work in December.) I know of a few other groups doing significant amounts of investment in this specific area, to try to run faster on the defensive side than any malign nation state might be.

Please check out the RSP, we are very explicit about what harms we consider ASL-3. Drug making and “stuff on the internet” is not at all in our threat model. ASL-3 seems somewhat likely within the next 6-9 months. Maybe 50% odds, by my guess.

There is quite a lot to do before ASL-3 is something that can be handled under the existing RSP. ASL-4 is not yet defined. ASL-3 protocols have not been identified let alone implemented. Even if the ASL-3 protocol is what they here sadly hint it is going to be, and is essentially ‘more cybersecurity and other defenses in depth and cross our fingers,’ You Are Not Ready.

Then there’s ASL-4, where if the plan is ‘the same thing only more of it’ I am terrified.

Overall, though, I want to emphasize positive reinforcement for keeping us informed.

Copyright Confrontation

Music and general training departments, not the Scarlett Johansson department.

Ed Newton-Rex: Sony Music today sent a letter to 700 AI companies demanding to know whether they’ve used their music for training.

  1. They say they have “reason to believe” they have done so.

  2. They say doing so constitutes copyright infringement.

  3. They say they’re open to discussing licensing, and they provide email addresses for this.

  4. They set a deadline of later this month for responses.

Art Keller: Rarely does a corporate lawsuit warm my heart. This one does! Screw the IP-stealing AI companies to the wall, Sony! The AI business model is built on theft. It’s no coincidence Sam Altman asked UK legislators to exempt AI companies from copyright law.

The central demands here are explicit permission to use songs as training data, and a full explanation within a month of all ways Sony’s songs have been used.

Thread claiming many articles in support of generative AI in its struggle against copyright law and human creatives are written by lawyers and paid for by AI companies. Shocked, shocked, gambling in this establishment, all that jazz.

Deepfaketown and Botpocalypse Soon

Noah Smith writes The death (again) of the Internet as we know it. He tells a story in five parts.

  1. The eternal September and death of the early internet.

  2. The enshittification (technical term) of social media platforms over time.

  3. The shift from curation-based feeds to algorithmic feeds.

  4. The rise of Chinese and Russian efforts to sow dissension polluting everything.

  5. The rise of AI slop supercharging the Internet being no fun anymore.

I am mostly with him on the first three, and even more strongly in favor of the need to curate one’s feeds. I do think algorithmic feeds could be positive with new AI capabilities, but only if you have and use tools that customize that experience, both generally and in the moment. The problem is that most people will never (or rarely) use those tools even if offered. Rarely are they even offered.

Where on Twitter are the ‘more of this’ and ‘less of this’ buttons, in any form, that aren’t public actions? Where is your ability to tell Grok what you want to see? Yep.

For the Chinese and Russian efforts, aside from TikTok’s algorithm I think this is greatly exaggerated. Noah says it is constantly in his feeds and replies but I almost never see it and when I do it is background noise that I block on sight.

For AI, the question continues to be what we can do in response, presumably a combination of trusted sources and whitelisting plus AI for detection and filtering. From what we have seen so far, I continue to be optimistic that technical solutions will be viable for some time, to the extent that the slop is actually undesired. The question is, will some combination of platforms and users implement the solutions?

They Took Our Jobs

Avital Balwit of Anthropic writes about what is potentially [Her] Last Five Years of Work. Her predictions are actually measured, saying that knowledge work in particular looks to be largely automated soon, but she expects physical work including childcare to take far longer. So this is not a short timelines model. It is an ‘AI could automate all knowledge work while the world still looks normal but with a lot more involuntary unemployment’ model.

That seems like a highly implausible world to me. If you can automate all knowledge work, you can presumably also automate figuring out how to automate the plumber. Whereas if you cannot do this, then there should be enough tasks out there and enough additional wealth to stimulate demand that those who still want gainful employment should be able to find it. I would expect the technological optimist perspective to carry the day within that zone.

Most of her post asks about the psychological impact of this future world. She asks good questions such as: What will happen to the unemployed in her scenario? How would people fill their time? Would unemployment be mostly fine for people’s mental health if it wasn’t connected to shame? Is too much ‘free time’ bad for people, and does this effect go away if the time is spent socially?

The proposed world has contradictions in it that make it hard for me to model what happens, but my basic answer is that the humans would find various physical work and status games and social interactions (including ‘social’ work where you play various roles for others, and also raising a family) and experiential options and educational opportunities and so on to keep people engaged if they want that. There would however be a substantial number of people who by default fall into inactivity and despair, and we’d need to help with that quite a lot.

Mostly for fun I created a Manifold Market on whether she will work in 2030.

Get Involved

Ian Hogarth gives his one-year report as Chair of the UK AI Safety Institute. They now have a team of over 30 people and are conducting pre-deployment testing, and continue to have open roles. This is their latest interim report. Their AI agent scaffolding puts them in third place (if you combine the MMAC entries) in the GAIA leaderboard for such things. Good stuff.

They are also offering fast grants for systemic AI safety. Expectation is 20 exploratory or proof-of-concept grants with follow-ups. Must be based in the UK.

Geoffrey Irving also makes a strong case that working at AISI would be an impactful thing to do in a positive direction, and links to the careers page.

Introducing

Anthropic gives Claude tool use, via public beta in the API. It looks straightforward enough: you specify the available tools, Claude evaluates whether to use the tools available, and you can force it to if you want that. I don’t see any safeguards, so proceed accordingly.
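
For reference, here is a minimal sketch of what that looks like with the Anthropic Python SDK. The get_weather tool and its schema are invented for illustration, and depending on your SDK version the beta may require an explicit beta header or namespace rather than working out of the box.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Describe the tools Claude is allowed to call. This example tool is hypothetical.
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a given city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    tools=tools,
    # tool_choice={"type": "tool", "name": "get_weather"},  # forcing a specific tool, where supported
    messages=[{"role": "user", "content": "What is the weather in Tokyo right now?"}],
)

# Claude decides whether to call a tool; if it does, a tool_use block appears in the response.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)  # you run the tool yourself and send the result back
```

Note that the model only asks for the tool call; your code executes it and passes the result back in a follow-up message, and nothing in the API constrains what that tool actually does, which is the ‘no safeguards’ part.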

Google Maps now has AI features: you can talk to it, have it pull up reviews in street mode, take an immersive view of a location, or search a location’s photos or the photos of the entire area around you for an item.

In my earlier experiments, Google Maps integration into Gemini was a promising feature that worked great when it worked, but it was extremely error prone and frustrating to use, to the point I gave up. Presumably this will improve over time.

Reddit and Weep

OpenAI partners with Reddit. Reddit posts, including recent ones, will become available to ChatGPT and other products. Presumably this will mean ChatGPT will be allowed to quote Reddit posts? In exchange, OpenAI will buy advertising and offer Reddit.com various AI website features.

For OpenAI, as long as the price was reasonable this seems like a big win.

It looks like a good deal for Reddit based on the market’s reaction. I would presume the key risks to Reddit are whether the user base responds in hostile fashion, and potentially having sold out cheap.

Or they may be missing an opportunity to do something even better. Yishan provides a vision of the future in this thread.

Yishan:

Essentially, the AI acts as a polite listener to all the high-quality content contributions, and “buffers” those users from any consumers who don’t have anything to contribute back of equivalent quality.

It doesn’t have to be an explicit product wall. A consumer who drops in and happens to have a brilliant contribution or high-quality comment naturally makes it through the moderation mechanisms and becomes part of the community.

The AI provides a great UX for consuming the content. It will listen to you say “that’s awesome bro!” or receive your ungrateful, ignorant nitpicking complaints with infinite patience so the real creator doesn’t have to expend the emotional energy on useless aggravation.

The real creators of the high-quality content can converse happily with other creators who appreciate their work and understand how to criticize/debate it usefully, and they can be compensated (if the platform does that) via the AI training deals.

In summary: User Generated Content platforms should do two things:

  1. Immediately implement draconian moderation focused entirely on quality.

  2. Sign deals with large AI firms to license their content in return for money.

In Other AI News

OpenAI has also signed a deal with Newscorp for access to their content, which gives them the Wall Street Journal and many others.

A source tells me that OpenAI informed its employees that they will indeed update their documents regarding employee exit and vested equity. The message says no vested equity has ever actually been confiscated for failure to sign documents and it never will be.

On Monday I set up this post:

Like this post to indicate:

  1. That you are not subject to a non-disparagement clause with respect to OpenAI or any other AI company.

  2. That you are not under an NDA with an AI company that would be violated if you revealed that the NDA exists.

At 168 likes, we now have one employee from DeepMind, and one from Anthropic.

Jimmy Apples claimed without citing any evidence that Meta will not open source (release the weights of, really) Llama-3 405B, attributing this to a mix of SB 1047 and Dustin Moskovitz. I was unable to locate an independent source or a further explanation. He and someone reacting to him asked Yann LeCun point blank; Yann replied with ‘Patience my blue friend. It’s still being tuned.’ For now, the Manifold market I found is not reacting and continues to trade at 86% for release, so I am going to assume this was another disingenuous inception attempt to attack SB 1047 and EA.

ASML and TSMC have a kill switch for their chip manufacturing machines, for use if China invades Taiwan. Very good to hear, I’ve raised this concern privately. I would in theory love to also have ‘put the factory on a ship in an emergency and move it’ technology, but that is asking a lot. It is also very good that China knows this switch exists. It also raises the possibility of a remote kill switch for the AI chips themselves.

Did you know Nvidia beat earnings again yesterday? I notice that we are about three earnings days into ‘I assume Nvidia is going to beat earnings but I am sufficiently invested already due to appreciation so no reason to do anything more about it.’ They produce otherwise mind boggling numbers and I am Jack’s utter lack of surprise. They are slated to open above 1,000 and are doing a 10:1 forward stock split on June 7.

Toby Ord goes into questions about the Turing Test paper from last week, emphasizing that by the original definition this was impressive progress but still a failure, as humans were judged human substantially more often than all AIs. He encourages AI companies to include the original Turing Test in their model testing, which seems like a good idea.

OpenAI has a super cool old-fashioned library. Cade Metz here tries to suggest what each book selection from OpenAI’s staff might mean, saying more about how he thinks than about OpenAI. I took away that they have a cool library with a wide variety of cool and awesome books.

JP Morgan says every new hire will get training in prompt engineering.

Scale.ai raises $1 billion at a $13.8 billion valuation in a ‘Series F.’ I did not know you did a Series F and if I got that far I would skip to a G, but hey.

Suno.ai Raises $125 million for music generation.

New dataset from Epoch AI attempting to chart every model trained with over 10^23 flops (direct). Missing Claude Opus, presumably because we don’t know the number.

Not necessarily the news department: OpenAI publishes a ten-point safety update. The biggest update is that none of this has anything to do with superalignment, or with the safety or alignment of future models. This is all current mundane safety, plus a promise to abide by the preparedness framework requirements. There is a lot of patting themselves on the back for how safe everything is, and no new initiatives, although this was never intended to be that sort of document.

Then finally there’s this:

  1. Safety decision making and Board oversight: As part of our Preparedness Framework, we have an operational structure for safety decision-making. Our cross-functional Safety Advisory Group reviews model capability reports and makes recommendations ahead of deployment. Company leadership makes the final decisions, with the Board of Directors exercising oversight over those decisions.

Hahahahahahahahahahahahahahahahahahaha.

That does not mean that mundane safety concerns are a small thing.

I Spy With My AI (or Total Recall)

Why let the AI out of the box when you can put the entire box into the AI?

Windows Latest: Microsoft announces “Recall” AI for Windows 11, a new feature that runs in the background and records everything you see and do on your PC.

[Here is a one minute video explanation.]

Seth Burn: If we had laws about such things, this might have violated them.

Aaron: This is truly shocking, and will be preemptively banned at all government agencies as it almost certainly violates STIG / FIPS on every conceivable surface.

Seth Burn: If we had laws, that would sound bad.

Elon Musk: This is a Black Mirror episode.

Definitely turning this “feature” off.

Vitalik Buterin: Does the data stay and get processed on-device or is it being shipped to a central server? If the latter, then this crosses a line.

[Satya says it is all being done locally.]

Abinishek Mishra (Windows Latest): Recall allows you to search through your past actions by recording your screen and using that data to help you remember things.

Recall is able to see what you do on your PC, what apps you use, how you use the apps, and what you do inside the apps, including your conversations in apps like WhatsApp. Recall records everything, and saves the snapshots in the local storage.

Windows Latest understands that you can manually delete the “snapshots”, and filter the AI from recording certain apps.

So, what are the use cases of Recall? Microsoft describes Recall as a way to go back in time and learn more about the activity.

For example, if you want to refer to a conversation with your colleague and learn more about your meeting, you can ask Recall to look into all the conversations with that specific person. The recall will look for the particular conversation in all apps, tabs, settings, etc.

With Recall, locating files in a large download pileup or revisiting your browser history is easy. You can give commands to Recall in natural language, eliminating the need to type precise commands.

You can converse with it like you do with another person in real life.

TorNis Entertainment: Isn’t this just a keylogger + screen recorder with extra steps? I don’t know why you guys are worried. 😓

Thaddeus:

[Microsoft: we got hacked by China and Russia because of our lax security posture and bad software, but we are making security a priority.

Also Microsoft: Windows will now constantly record your screen, including sensitive data and passwords, and just leave it lying around.]

Kevin Beaumont: From Microsoft’s own FAQ: “Note that Recall does not perform content moderation. It will not hide information such as passwords or financial account numbers.”

Microsoft also announced live caption translations, auto super resolution upscaling on apps (yes with a toggle for each app, wait those are programs, wtf), AI in paint and automatic blurring (do not want).

This is all part of the new ‘Copilot+’ offering for select new PCs, including their new Microsoft Surface machines. You will need a Snapdragon X Elite or X Plus, 40 TOPS, 256 GB of storage and 16 GB of RAM. Intel and AMD chips can’t cut it (yet) but they are working on that.

(Consumer feedback report: I have a Microsoft Surface from a few years ago, it was not worth the price and the charger is so finicky it makes me want to throw things. Would not buy again.)

I would hope this would at least be opt-in. Kevin Beaumont reports it will be opt-out, citing this web page from Microsoft. It appears to be enabled by default on Copilot+ computers. My lord.

At minimum, even if you do turn it off, it does not seem that hard to turn back on:

Kevin Beaumont: Here’s the Recall UI. You can silently turn it on with Powershell, if you’re a threat actor.

[Image: screenshot of the Recall UI.]

I would also not trust a Windows update to not silently turn it back on.

The UK Information Commissioner’s Office (ICO) is looking into this, because yeah.

In case it was not obvious, you should either:

  1. Opt in for the mundane utility, and embrace that your computer has recorded everything you have ever done and that anyone with access to your system or your files, potentially including a crook, Microsoft, the NSA or FBI, China or your spouse now fully owns you, and also that an AI knows literal everything you do. Rely on a combination of security through obscurity, defense in depth and luck. To the extent you can, keep activities and info you would not want exposed this way off of your PC, or ensure they are never typed or displayed onscreen using your best Randy Waterhouse impression.

  2. Actually for real accept that the computer in question is presumed compromised, use it only for activities where you don’t mind, never enter any passwords there, and presumably have a second computer for activities that need to be secure, or perhaps confine them to a phone or tablet.

  3. Opt out and ensure that for the love of God your machine cannot use this feature.

I am not here to tell you which of those is the play.

I only claim that it seems that soon you must choose.

If the feature is useful, a large number of people are going to choose option one.

I presume almost no one will pick option two, except perhaps for gaming PCs.

Option three is viable.

If there is one thing we have learned during the rise of AI, and indeed during the rise of computers and the internet, it is that almost all people will sign away their privacy and technological vulnerability for a little mundane utility, such as easier access to cute pictures of cats.

Yelling at them that they are being complete idiots is a known ineffective response.

And who is to say they even are being idiots? Security through obscurity is, for many people, a viable strategy up to a point.

Also, I predict your phone is going to do a version of this for you by default within a few years, once the compute and other resources are available for it. I created a market on how quickly. Microsoft is going out on far less of a limb than it might look like.

In any case, how much mundane utility is available?

Quite a bit. You would essentially be able to remember everything, ask the AI about everything, have it take care of increasingly complex tasks with full context, and this will improve steadily over time, and it will customize to what you care about.

If you ignore all the obvious horrendous downsides of giving an AI this level of access to your computer, and the AI behind it is good, this is very clearly The Way.

There are of course some people who will not do this.

How long before they are under increasing pressure to do it? How long until it becomes highly suspicious, as if they have something to hide? How long until it becomes a legal requirement, at best in certain industries like finance?

Ben Thompson, on the other hand, was impressed, calling the announcement event ‘the physical manifestation of CEO Satya Nadella’s greatest triumph’ and ‘one of the most compelling events I’ve attended in a long time.’ Ben did not mention the privacy and security issues.

Quiet Speculations

Ethan Mollick perspective on model improvements and potential AGI. He warns that AIs are more like aliens that get good at tasks one by one, and when they are good they by default get very good at that task quickly, but they are good at different things than we are, and over time that list expands. I wonder to what extent this is real versus the extent this is inevitable when using human performance as a benchmark while capabilities steadily improve, so long as machines have comparative advantages and disadvantages. If the trends continue, then it sure seems like the set of things they are better at trends towards everything.

Arthur Breitman suggests Apple isn’t developing LLMs because there is enough competition that they are not worried about vendor lock-in, and distribution matters more. Why produce an internal sub-par product? This might be wise.

Microsoft CTO Kevin Scott claims ‘we are nowhere near the point of diminishing marginal returns on how powerful we can make AI models as we increase the scale of compute.’


Gary Marcus offered to bet Kevin Scott $100k on that.

This was a truly weird speech on future challenges of AI by Randall Kroszner, external member of the Financial Policy Committee of the Bank of England. He talks about misalignment and interpretability, somehow. Kind of. He cites the Goldman Sachs estimate of 1.5% labor productivity growth and 7% GDP growth over 10 years following widespread AI adoption, which somehow people say with a straight face, then the flip side is McKinsey saying 0.6% annual labor productivity growth by 2040, which is also not something I could say with a straight face. And he talks about disruptions and innovation aids and productivity estimation J-curves. It all sounds so… normal? Except with a bunch of things spiking through. I kept having to stop to just say to myself ‘my lord that is so weird.’

Politico is at it Again

Politico is at it again. Once again, the framing is a background assumption that any safety concerns or fears in Washington are fake, and the coming regulatory war is a combination of two fights over Lenin’s question of who benefits.

  1. A fight between ‘Big Tech’ and ‘Silicon Valley’ over who gets regulatory capture and thus Washington’s regulatory help against the other side.

  2. An alliance of ‘Big Tech’ and ‘Silicon Valley’ against Washington to head off any regulations that would interfere with both of them.

That’s it. Those are the issues and stakes in play. Nothing else.

How dismissive is this of safety? Here are the two times ‘safety’ is mentioned:

Matthew Kaminski (Politico): On Capitol Hill and in the White House, that alone breeds growing suspicion and defensiveness. Altman and others, including from another prominent AI startup Anthropic, weighed in with ideas for the Biden administration’s sweeping executive order last fall on AI safety and development.

Testing standards for AI are easy things to find agreement on. Safety as well, as long as those rules don’t favor one or another budding AI player. No one wants the technology to help rogue states or groups. Silicon Valley is on America’s side against China and even more concerned about the long regulatory arm of the EU than Washington.

Testing standards are ‘easy things to find agreement on’? Fact check: Lol, lmao.

That’s it. The word ‘risk’ appears twice and neither has anything to do with safety. Other words like ‘capability,’ ‘existential’ or any form of ‘catastrophic’ do not appear. It is all treated as obviously irrelevant.

The progress is that this time they stopped trying to cast people worried about safety as the boogeymen (perhaps because this is written by Matthew Kaminski, not Brendan Bordelon), and instead point to actual corporations that are indeed pursuing actual profits, with Silicon Valley taking on Big Tech. And I very much appreciate that ‘open source advocates’ has now been properly identified as Silicon Valley pursuing its business interests.

Rohit Chopra (Consumer Financial Protection Bureau): There is a winner take all dimension. We struggle to see how it doesn’t turn, absent some government intervention, into a market structure where the foundational AI models are not dominated by a handful of the big tech companies.

Matthew Kaminski: Saying “star struck” policymakers across Washington have to get over their “eyelash batting awe” over new tech, Chopra predicts “another chapter in which big tech companies are going to face some real scrutiny” in the near future, especially on antitrust.

Lina Khan, the FTC’s head who has used the antitrust cudgel against big tech liberally, has sounded the warnings. “There is no AI exemption to the laws on the books,” she said last September.

For self-interested reasons, venture capitalists want to open up the space in Silicon Valley for new entrants that they can invest in and profitably exit from. Their arguments for a more open market will resonate politically.

Notice the escalation. This is not ‘Big Tech wants regulatory capture to actively enshrine its advantages, and safety is a Big Tech plot.’ This is ‘Silicon Valley wants to actively use regulatory action to prevent Big Tech from winning,’ with warnings that attempts to not have a proper arms race to ever more capable systems will cause intervention from regulators. By ‘more open market’ they mean ‘government intervention in the market,’ government’s favorite kind of new freer market.

As I have said previously, we desperately need to ensure that there are targeted antitrust exemptions available, so that AI labs can legally collaborate around safety issues without being accused of collusion. It would be completely insane to not do this.

And as I keep saying, open source advocates are not asking for a level playing field or a lack of government oppression. They are asking for special treatment, to be exempt from the rules of society and the consequences of their actions, and also for the government to directly cripple their opponents for them.

Are they against regulatory capture? Only if they don’t get to do the capturing.

Then there is the second track, the question of guardrails that might spoil the ‘libertarian sandbox,’ which neither ‘side’ of tech wants here.

Here are the two mentions of ‘risk’:

“There is a risk that people think of this as social media 2.0 because its first public manifestation was a chat bot,” Kent Walker, Google’s president of global affairs, tells me over a conversation at the search giant’s offices here.

People out on the West Coast quietly fume about having to grapple with Washington. The tech crowd says the only fight that matters is the AI race against China and each other. But they are handling politics with care, all too aware of the risks.

I once again have been roped into extensively covering a Politico article, because it is genuinely a different form of inception than the previous Politico inception attempts. But let us continue to update that Politico is extraordinarily disingenuous and hostilely motivated on the subject of AI regulation. This is de facto enemy action.

Here, Shakeel points out the obvious central point: most of the money and power in this fight belongs to Big Tech companies fighting not only to avoid any regulations at all, but to get exemptions from the other ordinary rules of society. When ethics advocates portray notkilleveryoneism (or safety) advocates as their opponents, that is a refusal to work together towards common goals, and it also misses the point. Similarly, here Seán Ó hÉigeartaigh expresses concern about divide-and-conquer tactics targeting these two groups, despite their frequently overlapping and usually at least complementary proposals and goals.

Or perhaps the idea is to illustrate that all the major players in Tech are aligned in being motivated by profit and in dismissing all safety concerns as fake? And a warning that Washington is in danger of being convinced? I would love that to be true. I do not think a place like Politico works that subtly these days, nor do I expect those who need to hear that message to figure out that it is there.

Beating China

If we care about beating China, by far the most valuable thing we can do is allow more high-skilled immigration. Many of their best and brightest want to become Americans.

This is true across the board, for all aspects of our great power competition.

It also applies to AI.

From his thread about the Schumer report:

Peter Wildeford: Lastly, while immigration is a politically fraught subject, it is immensely stupid for the US to not do more to retain top talent. So it’s awesome to see the roadmap call for more high-skill immigration, in a bipartisan way.

The immigration element is important for keeping the US ahead in AI. While the US only produces 20% of top AI talent natively, more than half of that talent lives and works in the US due to immigration. That number could be even higher with important reform.

[Image: graph of where top AI talent is produced and where it works.]

I suspect the numbers are even more lopsided than this graph suggests.

To what extent is being in America a key element of being a top-tier AI researcher? How many of these same people would have been great if they had stayed at home? If they had stayed at home, would others have taken their place here in America? We do not know. I do know it is essentially impossible that this extent is so large we would not want to bring such people here.

Do we need to worry about those immigrants being a security risk, if they come from certain nations like China and we were to put them into OpenAI, Anthropic or DeepMind? Yes, that does seem like a problem. But there are plenty of other places they could go, where it is much less of a problem.

The Quest for Sane Regulations

Labour vows to force firms developing powerful AI to meet requirements.

Nina Lloyd (The Independent): Labour has said it would urgently introduce binding requirements for companies developing powerful artificial intelligence (AI) after Rishi Sunak said he would not “rush” to regulate the technology.

The party has promised to force firms to report before they train models over a certain capability threshold and to carry out safety tests strengthened by independent oversight if it wins the next general election.

Unless something very unexpected happens, they will win the next election, which is currently scheduled for July 4.

This is indeed the a16z dilemma:

John Luttig: A16z simultaneously argues

  1. The US must prevent China from dominating AI.

  2. Open source models should proliferate freely across borders (to China).

What does this mean? Who knows. I’m just glad at Founders Fund we don’t have to promote every current thing at once.

SB 1047 Update

The California Senate has passed SB 1047, by a vote of 32-1.

An attempt to find an estimate of the costs of compliance with SB 1047. The attempt appears to fail, despite some good discussions.

This seems worth noting given the OpenAI situation last week:

Dan Hendrycks: For what it’s worth, when Scott Wiener and others were receiving feedback from all the major AI companies (Meta, OpenAI, etc.) on the SB 1047 bill, Sam [Altman] was explicitly supportive of whistleblower protections.

Scott Wiener Twitter thread and full open letter on SB 1047.

Scott Wiener: If you only read one thing in this letter, please make it this: I am eager to work together with you to make this bill as good as it can be.

There are over three more months for discussion, deliberation, feedback, and amendments.

You can also reach out to my staff anytime, and we are planning to hold a town hall for the AI community in the coming weeks to create more opportunities for in-person discussion.

Bottom line [changed to numbered list including some other section headings]:

  1. SB 1047 doesn’t ban training or deployment of any models.

  2. It doesn’t require licensing or permission to train or deploy any models.

  3. It doesn’t threaten prison (yes, some are making this baseless claim) for anyone based on the training or deployment of any models.

  4. It doesn’t allow private lawsuits against developers.

  5. It doesn’t ban potentially hazardous capabilities.

  6. And it’s not being “fast tracked,” but rather is proceeding according to the usual deliberative legislative process, with ample opportunity for feedback and amendments remaining.

  7. SB 1047 doesn’t apply to the vast majority of startups.

  8. The bill applies only to concrete and specific risks of catastrophic harm.

  9. Shutdown requirements don’t apply once models leave your control.

  10. SB 1047 provides significantly more clarity on liability than current law.

  11. Enforcement is very narrow in SB 1047. Only the AG can file a lawsuit.

  12. Open source is largely protected under the bill.

What SB 1047 *does* require is that developers who are training and deploying a frontier model more capable than any model currently released must engage in safety testing informed by academia, industry best practices, and the existing state of the art. If that testing shows material risk of concrete and specific catastrophic threats to public safety and security — truly huge threats — the developer must take reasonable steps to mitigate (not eliminate) the risk of catastrophic harm. The bill also creates basic standards like the ability to disable a frontier AI model while it remains in the developer’s possession (not after it is open sourced, at which point the requirement no longer applies), pricing transparency for cloud compute, and a “know your customer” requirement for cloud services selling massive amounts of compute capacity.

Our intention is that safety and mitigation requirements be borne by highly-resourced developers of frontier models, not by startups & academic researchers. We’ve heard concerns that this isn’t clear, so we’re actively considering changes to clarify who is covered.

After meeting with a range of experts, especially in the open source community, we’re also considering other changes to the definitions of covered models and derivative models. We’ll continue making changes over the next 3 months as the bill proceeds through the Legislature.

This very explicitly clarifies the intent of the bill across multiple misconceptions and objections, all in line with my previous understanding.

They actively continue to solicit feedback and are considering changes.

If you are concerned about the impact of this bill, and feel it is badly designed or has flaws, the best thing you can do is offer specific critiques and proposed changes.

I strongly agree with Wiener that this bill is light touch relative to alternative options. I see Pareto improvements we could make, but I do not see any fundamentally different lighter touch proposals that accomplish what this bill sets out to do.

I will sometimes say of a safety bill, sometimes in detail: It’s a good bill, sir.

Other times, I will say: It’s a potentially good bill, sir, if they fix this issue.

That is where I am at with SB 1047. Most of the bill seems very good, an attempt to act with as light a touch as possible. There are still a few issues with it. The derivative model definition as it currently exists is the potential showstopper bug.

To summarize the issue once more: As written, if interpreted literally and as I understand it, the bill allows developers to declare their own models derivative of an existing covered model. This, again if interpreted literally, lets them evade all responsibilities and shift them onto essentially any covered open model of the same size. That means both that any unsafe actor goes unrestricted (whether open or closed), and that releasing the weights of a covered model creates liability no matter how responsibly you acted, since someone else can effectively restart training from scratch while still counting as ‘derivative’ of your model.

Scott Wiener says he is working on a fix. I believe the correct fix is a compute threshold for additional training, over which a model is no longer derivative, and the responsibilities under SB 1047 would then pass to the new developer or fine-tuner. Some open model advocates demand that responsibility for derivative models be removed entirely, but that would transparently defeat the purpose of preventing catastrophic harm. Who cares if your model is safe untuned, if you can fine-tune it to be unsafe in an hour with $100?
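As a toy illustration only of what such a threshold rule could look like: the 25% cutoff, the class and function names, and the numbers below are all my own hypotheticals, not anything in the bill or in any proposed amendment. The point is simply that a bright line on additional training compute would let obligations follow whoever actually did the bulk of the training.

```python
# Toy sketch of a compute-threshold rule for derivative models.
# The 25% figure and all names here are hypothetical illustrations, not bill text.

from dataclasses import dataclass

@dataclass
class Model:
    developer: str
    training_compute_flop: float   # compute used to train the original model
    finetune_compute_flop: float   # additional compute spent by a downstream party
    downstream_party: str | None = None

def responsible_party(model: Model, derivative_threshold: float = 0.25) -> str:
    """Return who would bear the obligations under this toy rule.

    If the downstream party's additional training exceeds `derivative_threshold`
    (as a fraction of the original training compute), the model stops being
    'derivative' and obligations pass to the downstream developer.
    """
    if model.downstream_party is None:
        return model.developer
    ratio = model.finetune_compute_flop / model.training_compute_flop
    if ratio > derivative_threshold:
        return model.downstream_party  # heavy retraining: no longer derivative
    return model.developer             # light fine-tune: still derivative

# A light fine-tune stays the original developer's responsibility...
light = Model("OriginalLab", 1e26, 5e23, "FineTunerCo")
print(responsible_party(light))      # -> "OriginalLab"

# ...while near-from-scratch retraining shifts it to the new party.
heavy = Model("OriginalLab", 1e26, 6e25, "FineTunerCo")
print(responsible_party(heavy))      # -> "FineTunerCo"
```

Any rule of this shape needs a bright line somewhere; exactly where to draw it is the open question the fix would have to settle.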

Then at other times, I will look at a safety or other regulatory bill or proposal, and say…

That’s Not a Good Idea

So it seems only fair to highlight some not good ideas, and say: Not a good idea.

One toy example would be the periodic complaints about Section 230. Here is a thread on the latest such hearing this week, pointing out what would happen without it, and the absurdity of the accusations being thrown around. Some witnesses claimed Section 230 is not needed to guard platforms against litigation, when it was created precisely because people were suing platforms.

Adam Thierer reports there are witnesses saying the Like and Thumbs Up buttons are dangerous and should be regulated.

Brad Polumbo here claims that GLAAD says Big Tech companies ‘should cease the practice of targeted surveillance advertising, including the use of algorithmic content recommendation.’

From April 23, Adam Thierer talks about proposals to mandate ‘algorithmic audits and impact assessments,’ which he calls ‘NEPA for AI.’ Here we have Assembly Bill 2930, which would require impact assessments by developers and charge $25,000 per instance of ‘algorithmic discrimination.’

Another example would be Colorado passing SB24-205, Consumer Protections for Artificial Intelligence, which is concerned with algorithmic bias. Governor Jared Polis signed with reservations. Dean Ball has a critique here, highlighting ambiguity in the writing, but noting they have two full years to fix that before it goes into effect.

I would be less concerned with the ambiguity, and more concerned about much of the actual intent and the various proactive requirements. I could make a strong case that some of the stuff here is kind of insane, and also seems like a generic GDPR-style ‘you have to notify everyone that AI was involved in every meaningful decision ever.’ The requirements apply regardless of company size, and target impacts of the kind society can mitigate as we go.

The good news is that there are also some good provisions like IDing AIs, and also full enforcement of the bad parts seems impossible? I am very frustrated that a bill that isn’t trying to address catastrophic risks, but seems far harder to comply with, and seems far worse to me than SB 1047, seems to mostly get a pass. Then again, it’s only Colorado.

I do worry about Gell-Mann amnesia. I have seen so many hyperbolic statements, and outright false statements, about AI bills, often from the same people who point out other proposed regulatory bills and policies that do seem obviously horrible. How can one trust their statements about those other bills, short of reading the actual bills (RTFB)? If it turned out they were wrong, and this time the bill was actually reasonable, who would point this out?

So far, when I have dug deeper, the bills do indeed almost always turn out to be terrible, but the ‘rumors of the death of the internet’ or similar potential consequences are often greatly exaggerated. The bills are indeed reliably terrible, but not as terrible as claimed. Alas, I must repeat my lament that I know of no RTFB person I can turn to on other topics, and my cup doth overflow.

The Week in Audio

I return to the Cognitive Revolution to discuss various events of the past week, first in part one and then in part two. These were recorded on Friday, so things have changed by the time you read this.

From last week’s backlog: Dwarkesh Patel as guest on 80k After Hours. Not full of gold on the level of Dwarkesh interviewing others, and only partly about AI. There is definitely gold in those hills for those who want to go into these EA-related weeds. If you don’t want that then skip this one.

Around 51:45 Dwarkesh notes there is no ‘Matt Levine for AI’ and that picking up that mantle would be a good thing to do. I suppose I still have my work cut out.

A lot of talk about EA and 80,000 Hours ways of thinking about how to choose paths in life, which I think illustrates well both the ways that thinking is good (actively making choices rather than sleepwalking, having priorities) and the ways it is not as good (heavily favoring the legible).

Some key factors in giving career advice they point out are that from a global perspective power laws apply and the biggest impacts are a huge share of what matters, and that much advice (such as ‘don’t start a company in college’) is only good advice because the people to whom it is horribly bad advice will predictably ignore it.

Rhetorical Innovation

Why does this section exist? This is a remarkably large fraction of why.

Emmett Shear: The number one rule of building things that can destroy the entire world is don’t do that.

Surprisingly it is also rule 2, 3, 4, 5, and 6.

Rule seven, however, is “make it emanate ominous humming and glow with a pulsing darkness”.

Eliezer Yudkowsky: Emmett.

Emmett Shear (later): Shocking amount of pushback on “don’t build stuff that can destroy the world”. I’d like to take this chance to say I stand by my apparently controversial opinion that building things to destroy the world is bad. In related news, murder is wrong and bad.

Follow me for more bold, controversial, daring takes like these.

Emmett Shear (other thread): Today has been a day to experiment with how obviously true I can make a statement before people stop disagreeing with it.

This is a Platonic encapsulation of this class of argument:

Emmett Shear: That which can be asserted without evidence can be dismissed without evidence.

Ryan Shea: Good point, but not sure he realizes this applies to AI doomer prophecy.

Emmett Shear: Not sure you realize this applies to Pollyanna assertions that don’t worry, a fully self-improving AI will be harmless. There’s a lot of evidence autocatalytic loops are potentially dangerous.

Ryan Shea: The original post is a good one. And I’m not making a claim that there’s no reason at all to worry. Just that there isn’t a particular reason to do so.

Emmett Shear: Forgive me if your “there’s not NO reason to worry, but let’s just go ahead with something potentially massively dangerous” argument doesn’t hold much reassurance for me.

[it continues from there, but gets less interesting and stops being Platonic.]

The latest reiteration of why p(doom) is useful even if highly imprecise, and why probabilities and probability ranges are super useful in general for communicating your actual epistemic state. In particular, that when Jan Leike puts his at ‘10%-90%’ this is a highly meaningful and useful statement of what assessments he considers reasonable given the evidence, providing much more information than saying ‘I don’t know.’ It is also more information than ‘50%.’

For the record: This, unrelated to AI, is the proper use of the word ‘doomer.’

The usual suspects, including Bengio, Hinton, Yao and 22 others, write the usual arguments in the hopes of finally getting it right, this time as Managing Extreme AI Risks Amid Rapid Progress, in Science.

I rarely see statements like this, so it was noteworthy that someone noticed.

Mike Solana: Frankly, I was ambivalent on the open sourced AI debate until yesterday, at which point the open sourced side’s reflexive, emotional dunking and identity-based platitudes convinced me — that almost nobody knows what they think, or why.

Aligning a Smarter Than Human Intelligence is Difficult

It is even more difficult when you don’t know what ‘alignment’ means.

Which, periodic reminder, you don’t.

Rohit: We use AI alignment to mean:

  1. Models do what we ask.

  2. Models don’t do bad things even if we ask.

  3. Models don’t fail catastrophically.

  4. Models don’t actively deceive us.

And all those are different problems. Using the same term creates confusion.

Here we have one attempt to choose a definition, and cases for and against it:

Iason Gabriel: The new international scientific report on AI safety is impressive work, but it’s problematic to define AI alignment as:

“the challenge of making general-purpose AI systems act in accordance with the developer’s goals and interests”

Eliezer Yudkowsky: I defend this. We need separate words for the technical challenges of making AGIs and separately ASIs do any specified thing whatsoever, “alignment”, and the (moot if alignment fails) social challenge of making that developer target be “beneficial”.

Good advice given everything we know these days:

Mesaoptimizer: If your endgame strategy involved relying on OpenAI, DeepMind, or Anthropic to implement your alignment solution that solves science / super-cooperation / nanotechnology, consider figuring out another endgame plan.

That does not express a strong opinion on whether we currently know of a better plan.

And it is exceedingly difficult when you do not attempt to solve the problem.

Dean Ball says here, in the most thoughtful version I have seen of this position by far, that the dissolution of the Superalignment team was good, because distinct safety teams create oppositionalism and become myopic about box-checking and policing employees rather than converging on the spirit of actual safety. Much better, on this view, to diffuse the safety efforts throughout the various teams. Ball does note that this does not apply to the extent the team was doing basic research.

There are three reasons this viewpoint seems highly implausible to me.

  1. The Superalignment team was indeed tasked with basic research. Solving the problem is going to require quite a lot of basic research, or at least work that is not incremental progress on current incremental commercial products. This is not about ensuring that each marginal rocket does not blow up, or the plant does not melt down this month. It is a different kind of problem, preparing for a very different kind of failure mode. It does not make sense to embed these people into product teams.

  2. This is not a reallocation of resources from a safety team to diffused safety work. This is a reallocation of resources, many of which were promised and never delivered, away from safety towards capabilities, as Dean himself notes. This is in addition to losing the two most senior safety researchers and a lot of others too.

  3. Mundane safety, making current models do what you want in ways that, as Leike notes, will not scale to when it matters most, does not count as safety towards the goals of the Superalignment team, or towards all of us not dying. No points.

Thus the biggest disagreement here, in my view, which is when he says this:

Dean Ball: Companies like Anthropic, OpenAI, and DeepMind have all made meaningful progress on the technical part of this problem, but this is bigger than a technical problem. Ultimately, the deeper problem is contending with a decentralized world, in which everyone wants something different and has a different idea for how to achieve their goals.

The good news is that this is basically politics, and we have been doing it for a long time. The bad news is that this is basically politics, and we have been doing it for a long time. We have no definitive answers.

Yes, it is bigger than a technical problem, and that is important.

OpenAI has not made ‘meaningful progress.’ Certainly we are not on track to solve such problems, and we should not presume they will essentially solve themselves with an ordinary effort, as is implied here.

Indeed, with that attitude, it’s Margaritaville (as in, we might as well start drinking Margaritas.1) Whereas with the attitude of Leike and Sutskever, I disagreed with their approach, but I could have been wrong or they could have course corrected, if they had been given the resources to try.

Nor is the second phase problem that we also must solve well-described by ‘basically politics’ of a type we are used to, because there will be entities involved that are not human. Our classical liberal political solutions work better than known alternatives, and well enough for humans to flourish, by assuming various properties of humans and the affordances available to them. AIs with far greater intelligence, capabilities and efficiency, that can be freely copied, and so on, would break those assumptions.

I do greatly appreciate the self-awareness and honesty in this section:

Dean Ball: More specifically, I believe that classical liberalism—individualism wedded with pluralism via the rule of law—is the best starting point, because it has shown the most success in balancing the priorities of the individual and the collective. But of course I do. Those were my politics to begin with.

It is notable how many AI safety advocates, when discussing almost any topic except transformational AI, are also classical liberals. If this confuses you, notice that.

The Lighter Side

Not under the current paradigm, but worth noticing.

Also, yes, it really is this easy.

And yet, somehow it is still this hard? (I was not able to replicate this one, may be fake)

It’s a fun game.

Sometimes you stick the pieces together and know where it comes from.

A problem statement:

Jorbs: We have gone from

“there is no point in arguing with that person, their mind is already made up”

to

“there is no point in arguing with that person, they are made up.”

It’s coming.

Alex Press: The Future of Artificial Intelligence at Wendy’s.


Colin Fraser: Me at the Wendy’s drive thru in June: A farmer and a goat stand on the side of a riverbank with a boat for two.

[FreshAI replies]: Sir, this is a Wendy’s.

Are you ready?

1. Also, ‘some people say that there’s a woman to blame, but I know it’s my own damn fault.’