EA & LW Forums Weekly Summary (17–23 Oct '22)

Supported by Rethink Priorities

This is part of a weekly series—you can see the full collection here. The first post includes some details on purpose and methodology.

If you’d like to receive these summaries via email, you can subscribe here.

Podcast version: prefer your summaries in podcast form? A big thanks to Coleman Snell for producing these! Subscribe on your favorite podcast app by searching for ‘Effective Altruism Forum Podcast’.


EA Forum

Philosophy and Methodologies

Effective Altruism’s Implicit Epistemology

by Violet Hour

Longtermist philosophy is pretty reasonable (future people matter, there might be a lot of them, and we can make a difference to them). However, many outside EA find the priorities that have arisen from these premises (eg. AI safety & bio risk) to be weird. The author argues this is due to EA’s unusual epistemic culture, and uses this post to highlight its norms and how they influence decision-making.

In particular, EAs tend to be comfortable with speculative reasoning, put numbers on things (even when they’re highly unsure), and use those numbers as inputs to decision-making, but remain skeptical if all that leads to conclusions that are too speculative or fanatical. The author suggests making these norms explicit: it invites better outside criticism, and if we’re really onto something, it allows others to benefit from it.

The Relative Importance of the Severity and Duration of Pain

by William McAuliffe, Adam_Shriver

Pains vary in their severity and duration. This report reviews the research and philosophy on how to trade off these two dimensions, which can impact cause prioritization decisions.

Some viewpoints explored include that badness scales non-linearly with the severity of pain, or that a long-duration pain can only outweigh a high-severity pain if it meets the bar of preventing pleasure or making more moments bad than good. Utilitarian views that simply multiply (severity × duration) are also presented. It’s also possible these trade-offs vary between individuals—one study found most participants make decisions as if adding severity and duration to get badness, while a minority multiply them.

Ethical constraints, severity being more salient in retrospect, imagination failures and other factors make research and experimentation in this area difficult. The authors are planning to gather scientists and philosophers at a workshop to develop new methodologies to push the area forward.

Sign of quality of life in GiveWell’s analyses

by brb243

The author conducted a small-scale (N=30) survey in a Kenyan slum in 2021, which found most participants rated themselves closer to the ‘worst possible situation’ than the ‘best possible situation’, and that the median participant wanted to live 2 more years if their situation didn’t change.

Taking this into account could influence GiveWell recommendations. For instance, GiveWell recommended a grant to support the deregistration of pesticides commonly used in suicide, on the basis of lives saved. However, the survey suggests those saved lives may be valued negatively by the people living them, and the grant could also reduce agricultural productivity and therefore quality of life for others.

Object Level Interventions /​ Reviews

My experience experimenting with a bunch of antidepressants I’d never heard of

by Luisa_Rodriguez

The author systematically experimented with different antidepressants over the course of a year, after putting together a best-guess ranked list with their psychiatrist. They share both this desk research and the results of their personal experiment. While the year was grueling, they found a drug that was effective with limited side effects for them. Antidepressant effects vary significantly between individuals, so they suggest this process could be worthwhile for others too (particularly for those with enough money and support to help manage side effects along the way). They also found CBT and changing their job role to focus on particularly satisfying / exciting tasks were a big help.

Growing the US tofu market—a roadmap

by George Stiffman

Chinese tofus are varied (eg. some are melty and cheese-like, others have crumbly textures), but little known outside China. Expanding access to them could save a substantial number of animal lives.

Limited supply and awareness are bottlenecks, particularly as shipping is expensive if done in small quantities. Encouraging existing trading companies to import more, helping local producers scale up, or creating a new distribution company are all potential solutions. Developing novel uses for the tofus, or researching how other ingredients have previously gained popularity, would also be helpful.

You can support this project by co-founding one of several types of organizations, funding the author, connecting them with cofounders / chefs / researchers / etc., doing research, or advising. More details on each in the post.

AI Safety Ideas: A collaborative AI safety research platform

by Apart Research, Esben Kran

Author’s tl;dr: We present the AI safety ideas and research platform AI Safety Ideas in open alpha. Add and explore research ideas on the website here: aisafetyideas.com.

‘Dissolving’ AI Risk – Parameter Uncertainty in AI Future Forecasting

by Froolow

Most models of AI risk have a number of discrete steps which all need to occur for bad outcomes to result. These models calculate total risk by multiplying together the central probability estimate for each step. Eg. if the central estimate for each of 4 steps is 60%, simple multiplication gives 13%. However, we actually have a probability distribution over each step, and multiplying central estimates ignores that uncertainty. If we happen to be in a world where one step’s true value sits in its lower tail and another’s in its upper tail, the final probability changes dramatically—eg. 60% * 60% * 5% * 99% is only 1.8%. This means that if we repeatedly sample from the distribution for each step, simulating possible worlds, we get a lower typical predicted risk than if we simply multiply our best guesses for each step together.
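To illustrate the difference, here is a toy Monte Carlo sketch (the Beta distributions below are invented for illustration; they are not the post’s actual survey data or results):

```python
# Toy comparison: multiplying central estimates for each step vs. sampling
# full distributions over each step and looking at the spread of outcomes.
import numpy as np

rng = np.random.default_rng(0)
n_worlds = 100_000

# Uncertainty over four conditional steps, each with mean 0.6 but wide spread,
# standing in for disagreement between survey respondents.
steps = [rng.beta(3, 2, n_worlds) for _ in range(4)]

product_of_means = np.prod([s.mean() for s in steps])  # "multiply best guesses"
per_world_risk = np.prod(steps, axis=0)                # risk in each simulated world

print(f"product of central estimates:  {product_of_means:.3f}")
print(f"median simulated risk:         {np.median(per_world_risk):.3f}")
print(f"geometric mean of risk:        {np.exp(np.log(per_world_risk).mean()):.3f}")
print(f"share of worlds under 3% risk: {(per_world_risk < 0.03).mean():.1%}")
```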

The author collects estimates from the community and AI risk experts on each step of a well-accepted path to AI risk (Carlsmith model, 2021), which via simple multiplication ends up around the usual estimates in the 15-30% range. However, via sampling from the distribution of answers, they find we are far more likely to be in a world with <3% risk of catastrophe due to out-of-control AGI, with a geometric mean of only 1.6% risk. This analysis also allows us to identify which steps are most important for determining if we are in a low or high risk world, which could be useful for prioritizing research directions.

A top comment notes that this method requires each expert’s estimates for the different steps of the AI risk model to be independent of one another; that assumption is likely not met, and relaxing it can hugely influence the results.

What is the likelihood that civilizational collapse would cause technological stagnation? (outdated research)

by Luisa_Rodriguez

An incomplete draft (though still with lots of useful findings) from 2019/​2020 on the probability that a catastrophe that caused civilizational collapse might lead to indefinite technological stagnation. It explores three questions in relation to this:

  1. If we re-ran history, would we see the agricultural and industrial revolutions again?

  2. Would technological progress look different in a post-collapse world?

  3. What are the recovery timelines for a collapsed civilization?

Brief evaluations of top-10 billionaires

by NunoSempere

The author briefly (1-2 paragraphs each) ranks the world’s top 10 billionaires according to how much value /​ impact they’ve created through their business and philanthropic activities.

Opportunities & Resources

Jobs, programs, competitions, fellowships, courses, resources, and more.

Introducing Cause Innovation Bootcamp

by Akhil, Leonie Falk

Fellows will participate in training on evidence-based research, and then produce a shallow report on a pre-selected global health and development (GHD) cause area. The research will be aimed at novel areas, with the hope of identifying new interventions that could be competitive with the current top of the field.

Applications are open until 30th Oct for the pilot, which will run 7th Nov – 20th Dec.


Announcing Squigglepy, a Python package for Squiggle

by Peter Wildeford

Squiggle is a simple programming language for intuitive probabilistic estimation. This package implements many Squiggle-like features in Python. It also includes utility functions for Bayesian networks, pooling forecasts, Laplace’s rule of succession, and Kelly betting.
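A minimal sketch of the kind of estimate this supports, following the package’s documented style (function names like sq.norm, sq.lognorm, and sq.sample are taken from its README; exact signatures may differ between versions):

```python
import squigglepy as sq

# Inputs expressed as 90% confidence intervals over their distributions.
people_reached = sq.norm(10_000, 50_000)   # 90% CI: 10k to 50k people
cost_per_person = sq.lognorm(2, 20)        # 90% CI: 2 to 20 (currency units)

# Draw samples from each input and combine them into a total-cost estimate.
n = 1_000
total_cost = [sq.sample(people_reached) * sq.sample(cost_per_person) for _ in range(n)]
print(sorted(total_cost)[n // 2])  # rough median of the total-cost distribution
```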

Call for applications for Zanzibar residency

by Anne Nganga

Applications are open for the 2023 Effective Altruism Africa Residency Fellowship. The program runs Jan 15th—Mar 31st, and is aimed at providing support and community for EAs working on improving wellbeing in Africa. Accommodation and working space are provided.

A couple of expert-recommended jobs in biosecurity at the moment (Oct 2022)

by Clifford

The author asked Chris Bakerlee (Senior Program Associate for Biosecurity and Pandemic Preparedness at Open Philanthropy) for biosecurity roles he is excited to see filled right now. He responded with an Executive Assistant role on his team, and a Senior Program Officer /​ Senior Director for Global Biological Policy and Programs role at Nuclear Threat Initiative.

Community & Media

EA Funds has a Public Grants Database

by calebp

All public grants by EA Funds will now appear in this database. Entries include a project summary, the grantee, which fund made the grant, and the payout amount.

Careers in medicine—a new path profile from Probably Good and High Impact Medicine

by Probably Good, High Impact Medicine

A guide to impactful careers within the medical space, primarily aimed at existing doctors and medical students. Includes ways to have more impact within clinical work (eg. taking high paying roles and donating) as well as high-impact alternatives that benefit from a medical background (eg. medical research, public health, biosecurity, and nonprofit entrepreneurship).

Healthier Hens Y1 update including challenges so far and a call for funding

by lukasj10, Isaac_Esparza

Healthier Hens investigates dietary interventions to improve the welfare of cage-free hens and engages farming stakeholders to adopt these interventions. In Y1 they spent most of their budget on staff, research, and travel. In Y2 they intend to ramp up their program work. However, they are short of funding (missing 180K of the 230K needed for Y2) and are looking for donations.

Centre for Exploratory Altruism Research (CEARCH)

by Joel Tan (CEARCH)

CEARCH is a new org focused on cause prioritization research. They will investigate a large number of causes shallowly, doing more intensive research if a cause’s cost-effectiveness seems plausibly at least an order of magnitude higher than a GiveWell top charity. So far they have completed three shallow investigations: nuclear war, fungal disease, and asteroids. Asteroids ranked highest (2.1x top GiveWell charities), surprising the researchers.

Announcing VIVID: A new EA organization aspiring to scale effective self-improvement & reflection

by GidiKadosh

A new organization building an app for effective self-improvement and reflection, initially targeting EAs. The app distinguishes itself via a focus on extensive customization and self-testing of plans to tackle internal obstacles and change mindsets.

You can help by trying the beta version and providing feedback on what does /​ doesn’t work for you personally, getting in touch if you do EA wellbeing workshops or coaching, joining the team (several open positions) or giving feedback on the website /​ comms.

Be careful with (outsourcing) hiring

by Richard Möhn

Organisations not used to hiring might outsource it, but hiring firms don’t always do a good job—and the author has seen an example where it was hard for founders without hiring experience to identify this. In that example, the hiring company:

  • Turned candidates off with long ads, unnecessary requirements, unclear process, and delays

  • Failed to distinguish good candidates due to asking questions that didn’t dig into the candidates’ experience

  • Rejected candidates late in the process via email with a form letter that stated no feedback could be given

LW Forum

They gave LLMs access to physics simulators

by ryan_b

Google has plugged large language models into a physics simulator to let them reason better about the physical world. This increased performance on physics questions / tasks by a large margin (eg. a ~27% absolute improvement in zero-shot accuracy), and allowed small LMs to perform at the level of models 100x bigger that lacked simulator access (on these specific questions).

Decision theory does not imply that we get to have nice things

by So8res

Some people believe logical decision theory (LDT) agents are friendly, and so if an AI were one, we’d be alright. The author argues this is incorrect, because cooperative behavior for an LDT agent (eg. cooperating in Prisoner’s Dilemmas, or one-boxing in Newcomb’s problem) is entirely driven by maximizing utility—not an in-built property of cooperativeness. If such an agent doesn’t expect helping us to lead to better outcomes on its own goals, it won’t help us.

How To Make Prediction Markets Useful For Alignment Work

by johnswentworth

As an alignment researcher, the author often has to make decisions on what to pay attention to vs. ignore. Eg. will shard theory pan out? Will a certain conjecture be proven even if they don’t focus on it? However, prediction markets focus almost exclusively on AI capability timelines. Eg. will we have an AI-generated feature film by 2026? Will AI wipe out humanity by 2100?

The author suggests more predictions that affect researchers day-to-day decision-making would make prediction markets more impactful.

Counterarguments to the basic AI risk case

by Katja_Grace

Summary repeated from last week as context for the next two posts, which directly respond to this one.

Counters to the argument that goal-directed AIs are likely and it’s hard to align them to good goals, so there’s significant x-risk:

  • AIs may optimize more for ‘looking like they’re pursuing X goal’ than actually pursuing it. This would mean they wouldn’t go after instrumental goals like money or power.

  • Even if an AI’s values /​ goals don’t match ours, they could be close enough, or be non-destructive. Or they could have short time horizons that don’t make worldwide takeovers worth it.

  • We might be more powerful than a superintelligent AI. Collaboration was as or more important than intelligence for humans becoming the dominant species, and we could have non-agentic AIs on our side. AIs might also hit ceilings in intelligence, or be working on tasks that don’t scale much with intelligence.

  • The core AI x-risk argument could apply to corporations too: they are goal-directed, hard to align precisely, far more powerful than individual humans, and adapt over time—yet we don’t consider them x-risks.

Response to Katja Grace’s AI x-risk counterarguments

by Erik Jenner, Johannes_Treutlein

Counterarguments to each of Katja’s points in the post above, drawing from the existing literature. These defend the position that if AI proceeds without big advances in alignment, we would reach existential catastrophe eventually:

  1. Goal-directedness is vague /​ AIs may optimize more for ‘looking like they’re pursuing X goal’ than actually pursuing it. Counter: If we define ‘goal-directedness’ as ‘reliably ensuring a goal will be achieved’ then economic pressures will tilt toward this. To ensure very hard goals are achieved, the AI will need to use novel methods /​ a wide action space eg. ‘acquire lots of power first’.

  2. An AI’s values could be close enough to ours. Counter: Imagine an AI is rewarded when sensors say a diamond is in a room. So it manipulates the sensors to always say that, instead of protecting the diamond. These are hugely different values that arise from the training signal not distinguishing ‘this looks good to humans’ from ‘this is actually good for humans, given full knowledge’, which could be a common failure mode. Human values might vary little, but AI values could vary a lot, particularly in situations with no training examples (because we don’t have superhuman performance to train on).

  3. We might be able to handle a superintelligent AI. Counter: While some tasks don’t benefit from intelligence, many do (eg. take over the world) and eventually someone will direct AI at one of these tasks, and keep improving it because of economic incentives. The question is if we can have superhuman alignment research (or another alignment solution) first.

  4. The core AI x-risk argument could apply to corporations too. Counter: corporations have limited scaling in comparison to AI, due to finite numbers of people and inadequate coordination.

A conversation about Katja’s counterarguments to AI risk

by Matthew Barnett, Ege Erdil, Brangus Brangus

Transcript of a conversation between Ege Erdil and Ronny Fernandez about Katja’s post above. Mostly focused on two of the arguments:

1. An AI’s values could be close enough to ours. Our training processes train things to look good to humans, not to be good. Even if these are only rarely badly different, if we run enough powerful AIs enough times, we’ll get that case and therefore catastrophe. And we might not have a chance to recognize it or recover, because of the AI’s powerful optimization towards it. This is particularly likely for AIs doing things we find hard to rate (eg. ‘does this look like a human face?’, the example in Katja’s post, is much easier to rate than ‘is this or that similar world better?’).

2. The core AI x-risk argument could apply to corporations too. Counter: corporations are bad at coordination. AIs can be much better (eg. combine forces toward a weighted merge of their goals).


Scaling Laws for Reward Model Overoptimization

by leogao, John Schulman, Jacob_Hilton

Author’s tl;dr: “Reward model (RM) overoptimization in a synthetic-reward setting can be modelled surprisingly well by simple functional forms. The coefficients also scale smoothly with scale. We draw some initial correspondences between the terms of the functional forms and the Goodhart Taxonomy. We suspect there may be deeper theoretical reasons behind these functional forms, and hope that our work leads to a better understanding of overoptimization.”

I learn better when I frame learning as Vengeance for losses incurred through ignorance, and you might too

by chaosmage

The author experimented for 2 weeks with consciously learning ‘with a vengeance’, aiming to avenge whatever they lost because they didn’t learn the thing earlier. They had better motivation and recall, and suggested others try the same.

Age changes what you care about

by Dentin

The author believes there are double-digit odds of AI-caused extinction in the next century. However, this is less salient to them than the >50% chance that, as a 49-year-old, they will die in the next 3-4 decades, with the odds increasing every year—particularly after several health scares. It’s hard to focus on anything beyond personal survival.

Plans Are Predictions, Not Optimization Targets

by johnswentworth

Treating a plan as a step-by-step list that we should always optimize toward isn’t as helpful as developing multiple plans, identifying common bottlenecks between them, and tackling those. This is particularly the case if your field is preparadigmatic and you’re working on hard problems, as it allows you to adapt when surprises are thrown your way.

In this case, a plan simply becomes one path we predict. We might even have a mainline /​ modal plan we most expect. But we’re selecting our actions to be useful in both this and other paths.

Wisdom Cannot Be Unzipped

by Sable

Compression works by assuming knowledge on the receiver’s end. If we know the receiver understands ‘4x1’ to mean ‘1111’, we can compress binary strings. If we know the receiver understands the general idea that problems are easier to fix early on, while they’re small, we can compress that reminder into ‘a stitch in time saves nine’.
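The ‘4x1’ example is essentially run-length encoding; a minimal sketch (illustrative only, not from the post):

```python
from itertools import groupby

def run_length_encode(bits: str) -> str:
    """Compress a binary string by replacing each run with '<count>x<bit>'."""
    return ",".join(f"{len(list(group))}x{bit}" for bit, group in groupby(bits))

def run_length_decode(encoded: str) -> str:
    """Invert the encoding, assuming the receiver knows the 'NxB' convention."""
    return "".join(bit * int(count) for count, bit in
                   (chunk.split("x") for chunk in encoded.split(",")))

print(run_length_encode("11110001"))     # -> "4x1,3x0,1x1"
print(run_length_decode("4x1,3x0,1x1"))  # -> "11110001"
```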

When we share wisdom or learnings, we lose a lot of the value—there is no way for the receivers to ‘unzip’ these lessons and get an understanding of the experiences, context, and nuance that formed them.

Other

My resentful story of becoming a medical miracle

by Elizabeth

On their doctors’ advice, the author tried many things to deal with a medical problem; they were eventually suggested a treatment for a different issue, tried it, and it solved the original problem (in this case, an antihistamine taken for a rash fixed their difficulty digesting protein). They also ran studies on self-help books and found no correlation between helpfulness and rigor / theoretical backing, and ran an RCT on ketone esters and found no benefits, despite the author and friends previously getting huge gains from them.

They conclude that “once you have exhausted the reliable part of medicine without solving your problem, looking for a mechanistic understanding or empirical validation of potential solutions is a waste of time. The best use of energy is to try shit until you get lucky.”

Voting Theory Introduction

by Scott Garrabrant

The first post in a sequence introducing a new voting system. It covers background framework, notation, and core voting theory.

Important criteria for voting theories include:

  • Condorcet: If a candidate would defeat all others in one-on-one elections, that candidate should win.

  • Consistent: If two disjoint electorates would produce the same result, then combined, they should also produce that result.

  • Participation: No voter should be made worse off by voting compared to staying home.

  • Clone: If a set of candidates are clumped together in all voters’ preference orderings, the result of the election should be the same as if they were a single candidate.

Most voting methods violate at least one of these principles; eg. in any deterministic voting system the Condorcet and consistency principles are in conflict.
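To make the Condorcet criterion concrete, here is a minimal sketch (not from the post) that finds the Condorcet winner, if one exists, from a list of voter rankings; it assumes every ranking lists all candidates:

```python
from itertools import combinations

def condorcet_winner(rankings: list[list[str]]) -> str | None:
    """Return the candidate who beats every other head-to-head, or None."""
    candidates = set(rankings[0])
    wins = {c: set() for c in candidates}
    for a, b in combinations(candidates, 2):
        # Count voters who rank a above b (lower index = more preferred).
        a_over_b = sum(r.index(a) < r.index(b) for r in rankings)
        if a_over_b * 2 > len(rankings):
            wins[a].add(b)
        elif a_over_b * 2 < len(rankings):
            wins[b].add(a)
    for c in candidates:
        if wins[c] == candidates - {c}:
            return c
    return None  # no Condorcet winner, eg. a rock-paper-scissors cycle

print(condorcet_winner([["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]))  # -> "A"
```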

Maximal Lotteries

by Scott Garrabrant

Maximal lotteries are a voting system in which a candidate who would beat every other candidate one-on-one always wins. If no such candidate exists, the votes instead define a probability distribution over candidates (eg. one candidate may be assigned 80%), and a random number is rolled to determine the winner.

This system fulfills all 4 voting principles from the previous post. It is distinct from the ‘random dictatorship’ voting method (choose a random voter, go with their vote as the winner) only in that it first checks and fulfills the Condorcet principle, so a clear winner always wins.

Scott continues to build on this idea in posts on Maximal Lottery-Lotteries and Open Problem in Voting Theory, which discusses the open question of whether maximal lottery-lotteries exist.

Why Weren’t Hot Air Balloons Invented Sooner?

by Lost Futures

Some technologies couldn’t have been invented much earlier than they were, because they rely on prior discoveries. Others were possible for an extended time before being discovered—hot air balloons are one of these.

The basic principles were already at work in Chinese sky lanterns over a thousand years before hot air balloons were first invented. Once experimentation began in 1782, there was a working balloon within a year, and the technology quickly proliferated around the world. It’s possible the invention was bottlenecked on textile prices or quality, but even accepting that, it still arrived decades to centuries later than it could have.

Untapped Potential at 13-18

by belkarx

Intelligent 13-18 year olds who aren’t ambitious enough to start their own projects have those years somewhat wasted by school busywork. Making meaningful work more accessible to them would be good.

Didn’t Summarize

An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers by Neel Nanda

Introduction to abstract entropy by Alex_Altair

Notes on “Can you control the past” by So8res

This Week on Twitter

AI

Meta released Universal Speech Translator, the first AI speech-to-speech translation system for languages that are primarily spoken rather than written. (tweet)

Stability AI (who released Stable Diffusion) are delaying the release of version 1.5, driven by community feedback, in order to focus on security and ensuring it isn’t used for illegal purposes. (article)

AW

Nine EU countries are pushing for a Europe-wide ban on culling male chicks. (tweet) (article)

National Security

A new analysis by CSIS of the AI export restrictions. It also mentions a US$53B commitment to semiconductor R&D that the US government made in early August.

Science

Biden’s latest National Biodefense Strategy calls for the US to produce a test for a new pathogen within 12 hours of its discovery and enough vaccine to protect the nation within 130 days. (tweet) (article)
