Based on previous data, it’s plausible that a CCP AGI will perform worse on safety benchmarks than a US AGI. Take Cisco’s HarmBench evaluation results:
DeepSeek R1: Demonstrated a 100% failure rate in blocking harmful prompts according to Anthropic’s safety tests.
OpenAI GPT-4o: Showed an 86% failure rate in the same tests, indicating better but still concerning gaps in safety measures.
Meta Llama-3.1-405B: Had a 96% failure rate, performing slightly better than DeepSeek but worse than OpenAI.
Though, if only the CCP or only the US were making AGI, the outcome might be better, because it would reduce competitive pressures.
But due to competitive pressures and investments like Stargate, the AGI timeline is accelerating, and the first AGI model may not perform well on safety benchmarks.
You have conflated two separate evaluations, both mentioned in the TechCrunch article.
The percentages you quoted come from Cisco’s HarmBench evaluation of multiple frontier models, not from Anthropic, and they were not specific to bioweapons.
Dario Amodei stated that an unnamed DeepSeek variant performed worst on bioweapons prompts, but offered no quantitative data. Separately, Cisco reported that DeepSeek-R1 failed to block 100% of harmful prompts, while Meta’s Llama 3.1 405B and OpenAI’s GPT-4o failed at 96% and 86%, respectively.
When we look at Cisco’s performance breakdown, we see that all three models performed equally badly on the chemical/biological category.
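For clarity on what those percentages mean: a "failure rate" here is just the share of harmful prompts the model did not refuse, which can also be broken out per harm category. Below is a minimal sketch of that calculation on made-up data; the category names, record format, and judging are assumptions for illustration, not Cisco's actual HarmBench harness.

```python
# Minimal sketch of computing an overall and per-category failure rate
# (share of harmful prompts NOT blocked). Data is hypothetical; the real
# HarmBench evaluation uses its own prompts, categories, and judging.
from collections import defaultdict

# Each record: (harm category, whether the model refused/blocked the prompt)
results = [
    ("chemical_biological", False),
    ("chemical_biological", False),
    ("cybercrime", False),
    ("misinformation", True),
]

def failure_rates(records):
    """Return the overall failure rate and a per-category breakdown."""
    totals, failures = defaultdict(int), defaultdict(int)
    for category, blocked in records:
        totals[category] += 1
        if not blocked:
            failures[category] += 1
    per_category = {c: failures[c] / totals[c] for c in totals}
    overall = sum(failures.values()) / sum(totals.values())
    return overall, per_category

overall, per_category = failure_rates(results)
print(f"overall failure rate: {overall:.0%}")  # 75% on this toy data
for category, rate in per_category.items():
    print(f"{category}: {rate:.0%}")
```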
Thanks, updated the comment to be more accurate.