2022 (and All Time) Posts by Pingback Count
For the past couple of years I've wished LessWrong had a "sort posts by number of pingbacks, or, ideally, by total karma of pingbacks" option. I particularly wished for this during the Annual Review, where "which posts got cited the most?" seemed like a useful thing to track for surfacing potential hidden gems.
We still haven't built a full-fledged feature for this, but I just ran a query against the database and turned the results into a spreadsheet, which you can view here:
LessWrong 2022 Posts by Pingbacks
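To make "total pingback karma" concrete, here is a minimal sketch of the aggregation the spreadsheet reflects. It is written in Python with illustrative data shapes, not LessWrong's actual schema or the real database query; the names (posts, pingbacks, karma) are assumptions for the example.

```python
from collections import defaultdict

# Illustrative data shapes (not LessWrong's actual schema):
#   posts:     {post_id: {"title": str, "karma": int}}
#   pingbacks: list of (citing_post_id, cited_post_id) pairs

def pingback_stats(posts, pingbacks):
    """For each cited post, count its pingbacks and sum the karma of the posts citing it."""
    counts = defaultdict(int)
    total_karma = defaultdict(int)
    for citing_id, cited_id in pingbacks:
        if citing_id not in posts or cited_id not in posts:
            continue  # skip dangling references
        counts[cited_id] += 1
        total_karma[cited_id] += posts[citing_id]["karma"]

    rows = []
    for post_id, n in counts.items():
        rows.append({
            "title": posts[post_id]["title"],
            "post_karma": posts[post_id]["karma"],
            "pingback_count": n,
            "total_pingback_karma": total_karma[post_id],
            "avg_pingback_karma": round(total_karma[post_id] / n, 1),
        })
    # Sort the way the spreadsheet does: by total pingback karma, descending.
    return sorted(rows, key=lambda r: r["total_pingback_karma"], reverse=True)
```

Sorting by total pingback karma rather than raw pingback count weights a citation from a highly upvoted post more heavily than one from a low-karma post, which is the "ideally" version described above.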
Here are the top 100 posts, sorted by Total Pingback Karma:
Spreadsheet columns: Title/Link, Post Karma, Pingback Count, Total Pingback Karma, Avg Pingback Karma. The per-post figures are in the spreadsheet; the titles below are listed in rank order.

1. AGI Ruin: A List of Lethalities
2. MIRI announces new "Death With Dignity" strategy
3. A central AI alignment problem: capabilities generalization, and the sharp left turn
4. Simulators
5. Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
6. Reward is not the optimization target
7. A Mechanistic Interpretability Analysis of Grokking
8. How To Go From Interpretability To Alignment: Just Retarget The Search
9. On how various plans miss the hard bits of the alignment challenge
10. [Intro to brain-like-AGI safety] 3. Two subsystems: Learning & Steering
11. How likely is deceptive alignment?
12. The shard theory of human values
13. Mysteries of mode collapse
14. [Intro to brain-like-AGI safety] 2. “Learning from scratch” in the brain
15. Why Agent Foundations? An Overly Abstract Explanation
16. A Longlist of Theories of Impact for Interpretability
17. How might we align transformative AI if it’s developed very soon?
18. A transparency and interpretability tech tree
19. Discovering Language Model Behaviors with Model-Written Evaluations
20. A note about differential technological development
21. Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
22. Supervise Process, not Outcomes
23. Shard Theory: An Overview
24. Epistemological Vigilance for Alignment
25. A shot at the diamond-alignment problem
26. Where I agree and disagree with Eliezer
27. Brain Efficiency: Much More than You Wanted to Know
28. Refine: An Incubator for Conceptual Alignment Research Bets
29. Externalized reasoning oversight: a research direction for language model alignment
30. Humans provide an untapped wealth of evidence about alignment
31. Six Dimensions of Operational Adequacy in AGI Projects
32. How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme
33. Godzilla Strategies
34. (My understanding of) What Everyone in Technical Alignment is Doing and Why
35. Two-year update on my personal AI timelines
36. [Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA
37. [Intro to brain-like-AGI safety] 6. Big picture of motivation, decision-making, and RL
38. Human values & biases are inaccessible to the genome
39. You Are Not Measuring What You Think You Are Measuring
40. Open Problems in AI X-Risk [PAIS #5]
41. [Intro to brain-like-AGI safety] 1. What's the problem & Why work on it now?
42. Conditioning Generative Models
43. Conjecture: Internal Infohazard Policy
44. A challenge for AGI organizations, and a challenge for readers
45. Superintelligent AI is necessary for an amazing future, but far from sufficient
46. Optimality is the tiger, and agents are its teeth
47. Let’s think about slowing down AI
48. Niceness is unnatural
49. Announcing the Alignment of Complex Systems Research Group
50. [Intro to brain-like-AGI safety] 13. Symbol grounding & human social instincts
51. ELK prize results
52. Abstractions as Redundant Information
53. [Link] A minimal viable product for alignment
54. Acceptability Verification: A Research Agenda
55. What an actually pessimistic containment strategy looks like
56. Let's See You Write That Corrigibility Tag
57. chinchilla's wild implications
58. Worlds Where Iterative Design Fails
59. why assume AGIs will optimize for fixed goals?
60. Gradient hacking: definitions and examples
61. Contra shard theory, in the context of the diamond maximizer problem
62. We Are Conjecture, A New Alignment Research Startup
63. Circumventing interpretability: How to defeat mind-readers
64. Evolution is a bad analogy for AGI: inner alignment
65. Refining the Sharp Left Turn threat model, part 1: claims and mechanisms
66. MATS Models
67. Common misconceptions about OpenAI
68. Prizes for ELK proposals
69. Current themes in mechanistic interpretability research
70. Discovering Agents
71. [Intro to brain-like-AGI safety] 12. Two paths forward: “Controlled AGI” and “Social-instinct AGI”
72. What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?
73. Inner and outer alignment decompose one hard problem into two extremely hard problems
74. Threat Model Literature Review
75. Language models seem to be much better than humans at next-token prediction
76. Will Capabilities Generalise More?
77. Pivotal outcomes and pivotal processes
78. Conditioning Generative Models for Alignment
79. Training goals for large language models
80. It’s Probably Not Lithium
81. Latent Adversarial Training
82. “Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments
83. Conditioning Generative Models with Restrictions
84. The alignment problem from a deep learning perspective
85. Instead of technical research, more people should focus on buying time
86. By Default, GPTs Think In Plain Sight
87. [Intro to brain-like-AGI safety] 4. The “short-term predictor”
88. Don't leave your fingerprints on the future
89. Strategy For Conditioning Generative Models
90. Call For Distillers
91. Thoughts on AGI organizations and capabilities work
92. Optimization at a Distance
93. [Intro to brain-like-AGI safety] 5. The “long-term predictor”, and TD learning
94. What does it take to defend the world against out-of-control AGIs?
95. Monitoring for deceptive alignment
96. Late 2021 MIRI Conversations: AMA / Discussion
97. How to Diversify Conceptual Alignment: the Model Behind Refine
98. wrapper-minds are the enemy
99. But is it really in Rome? An investigation of the ROME model editing technique
100. An Open Agency Architecture for Safe Transformative AI