links 5/13/2025: https://roamresearch.com/#/app/srcpublic/page/05-13-2025
https://www.cnn.com/2025/05/12/asia/india-pakistan-kashmir-ceasefire-intl-hnk India/Pakistan ceasefire
drug price controls executive order
https://www.cato.org/blog/trump-attempts-price-controls-prescription-drugs
https://www.whitehouse.gov/presidential-actions/2025/05/delivering-most-favored-nation-prescription-drug-pricing-to-american-patients/
US/China trade deal reduces tariffs
cuts in US military officers, federal research funding
Stephen Miller, homeland security advisor and White House deputy chief of staff, says they’re “actively looking at” suspending habeas corpus for immigrants
https://languagehat.com/why-adjectives/ not all languages have adjectives! read the comment thread too, it’s fascinating.
https://classics.mit.edu/Aristotle/interpretation.2.2.html “On Interpretation”
https://dynomight.net/titles/ how to title blog posts
Claude 3.7 Sonnet benchmark & compute references
https://www.anthropic.com/news/claude-3-7-sonnet
https://wandb.ai/byyoung3/Generative-AI/reports/Evaluating-Claude-3-7-Sonnet-Performance-reasoning-and-cost-optimization--VmlldzoxMTYzNDEzNQ
https://www.interconnects.ai/p/claude-3-7-thonks
https://www.vals.ai/benchmarks/mmlu_pro-03-13-2025
https://smythos.com/news/claude-3-7-sonnet-an-in-depth-analysis/
Grok 3 benchmark & compute references
https://x.ai/news/grok-3
https://www.helicone.ai/blog/grok-3-benchmark-comparison
GPT-4 benchmark & compute references
https://www.linkedin.com/pulse/supposed-leak-gpt4-architecture-alvaro-duran-tovar/
Llama 2 benchmark & compute references
https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/
https://www.cursor.com/en/blog/llama-inference
GPT-3.5-turbo benchmark & compute references
https://www.vellum.ai/blog/gpt-4o-mini-v-s-claude-3-haiku-v-s-gpt-3-5-turbo-a-comparison
calculating inference FLOPs and other compute measurements from model architecture, in transformers
https://kipp.ly/transformer-inference-arithmetic/ a now-classic reference post, explains the kv cache
https://www.adamcasson.com/transformer-flops.pdf
https://jax-ml.github.io/scaling-book/transformers/
https://www.artfintel.com/p/where-do-llms-spend-their-flops
inference time (and hence FLOPs on a given machine) scales linearly with output length if you use a KV cache
https://www.baseten.co/blog/llm-transformer-inference-guide/
https://www.lesswrong.com/posts/g7H2sSGHAeYxCHzrz/how-much-ai-inference-can-we-do
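a rough sketch of the arithmetic behind the references above, not taken from any one of them. it assumes a dense decoder-only transformer with a KV cache and uses the common approximation of about 2 FLOPs per parameter per generated token, plus an attention term that grows with the number of cached tokens. the function names and the example model shape (roughly 7B parameters, 32 layers, d_model 4096) are illustrative assumptions, and the estimate ignores prompt prefill, softmax, and layer norms.

```python
def decode_flops_per_token(n_params: int, n_layers: int, d_model: int, context_len: int) -> int:
    """Approximate FLOPs to generate one token with a KV cache.

    Assumption: every weight is used in one multiply-add (2 FLOPs) per token,
    and attention over the cache costs about 4 * d_model FLOPs per cached
    token per layer (QK^T scores plus the weighted sum over values).
    """
    weight_flops = 2 * n_params
    attention_flops = 4 * n_layers * d_model * context_len
    return weight_flops + attention_flops


def generation_flops(n_params: int, n_layers: int, d_model: int,
                     prompt_len: int, output_len: int) -> int:
    """Approximate total decode FLOPs for output_len tokens (prefill excluded)."""
    total = 0
    for i in range(output_len):
        # The cache holds the prompt plus every token generated so far.
        total += decode_flops_per_token(n_params, n_layers, d_model, prompt_len + i)
    return total


if __name__ == "__main__":
    # Hypothetical 7B-class model shape, used only for illustration.
    shape = dict(n_params=7_000_000_000, n_layers=32, d_model=4096)
    for output_len in (100, 200, 400):
        flops = generation_flops(prompt_len=0, output_len=output_len, **shape)
        print(output_len, f"{flops:.3e}")
```

under these assumptions the weight term dominates at short contexts, so doubling the output length roughly doubles the total; the attention term adds a small quadratic correction that only matters at long contexts.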
https://marginalrevolution.com/marginalrevolution/2025/05/who-wants-impartial-news.html who prefers impartial news over news that shares their point of view? apparently old, rich, male, non-ideological people. (I did not read the paper & thus do not have an opinion on its credibility.)