EA & LW Forums Weekly Summary (5th Dec − 11th Dec 22′)

Supported by Rethink Priorities

This is part of a weekly series summarizing the top posts on the EA and LW forums—you can see the full collection here. The first post includes some details on purpose and methodology. Feedback, thoughts, and corrections are welcomed.

If you’d like to receive these summaries via email, you can subscribe here.

Podcast version: prefer your summaries in podcast form? A big thanks to Coleman Snell for producing these! Subscribe on your favorite podcast app by searching for ‘EA Forum Podcast (Summaries)’. More detail here.


Top /​ Curated Readings

Designed for those without the time to read all the summaries. Everything here is also within the relevant sections later on so feel free to skip if you’re planning to read it all. These are picked by the summaries’ author and don’t reflect the forum ‘curated’ section.

Some observations from an EA-adjacent (?) charitable effort

by patio11

The author founded VaccinateCA, which ran USA vaccine location infrastructure for 6 months. They think this saved many thousands of lives, at a cost of ~$1.2M.

Learnings from this include:
1. Enabling trade as a mechanism for impact:

  • Many actors (eg. Google, the White House, pharmacists) wanted residents vaccinated, but weren’t working together.

  • By getting data via calling pharmacies and convincing a few people at Google to show it, they were able to do something very impactful without being hired to.

2. Engaging with the system:

  • Many people in policy are skeptical of tech.

  • 501c3 charity status is extremely useful for reputation and funding.

  • Carefully planned comms created initial positive press, which in turn was a key reason Google and government stakeholders engaged with them.

3. Factors in the team’s success:

  • Money to commit early and boldly.

  • Social capital to call in favors in the tech community /​ get resources.

  • Good understanding of the infrastructure of the project (call centers and information flows of IT systems).

  • Ability to write good software quickly.

  • PR experience and talent.

  • A network with lots of high agency people willing to try things.

  • People management skills (they didn’t have this, but it would have been very helpful)

While this was not an “EA Project” the author expects it wouldn’t have happened if EA hadn’t drawn their attention to expected value math, and is grateful for this.

Why development aid is a really exciting field

by MathiasKB

Wealthy countries spend a collective $178B on development aid per year − 25% of all giving worldwide. Some aid projects have been cost-effective on a level with Givewell’s top recommendations (eg. PEPFAR), while others have caused outright harm.

Aid is usually distributed via a several step process:

  1. Decide to spend money on aid. Many countries signed a 1970 UN resolution to spend 0.7% of GNI on official development assistance.

  2. Government decides a general strategy /​ principles.

  3. Government passes a budget, assigning $s to different aid subcategories.

  4. The country’s aid agency decides on projects. Sometimes this is donating to intermediaries like the UN or WHO, sometimes it’s direct.

  5. Projects are implemented.

This area is large scale, tractability is unsure but there are many pathways and some past successes (eg. a grassroots EA campaign in Switzerland redirected funding, and the US aid agency ran a cash-benchmarking experiment with GiveDirectly), and few organisations focus on this area compared to the scale.

The author and their co-founder have been funded to start an organization in this area. Get in touch if you’re interested in Global Development and Policy.

Rethink Priorities is hiring: help support and communicate our work

by Rethink Priorities

Author’s tl;dr:

EA Forum

Philosophy and Methodologies

Do Brains Contain Many Conscious Subsystems? If So, Should We Act Differently?

by Bob Fischer, Adam Shriver, MichaelStJules

The fifth post in the Moral Weight Project Sequence.

The Conscious Subsystems Hypothesis says that brains have subsystems that realize phenomenally conscious states that aren’t accessible to the subjects we typically associate with those brains—namely, the ones who report their experiences to us.

Some have argued that humans are more likely to have conscious subsystems, so risk-neutral expected utility maximizers should weigh human welfare higher relative to other animals. The authors argue against this on three points:

  1. If humans have conscious subsystems, other animals probably do too.

  2. Theories of consciousness that support the conscious subsystems hypothesis also tend to support the hypothesis that many small invertebrates are sentient.

  3. Risk neutral expected utility maximizers are committed to assumptions, including the assumption that all welfare counts equally.

The post also discusses claims that support CSH, but assigns low credence to these. Overall, the authors suggest we should not act on the conscious subsystems hypothesis.

What can we learn from the empirical social science literature on the expected contingency of value change?

by jackva

A short overview of some existing literature on the drivers of value change. A consistent theme is that “there are good reasons to expect that value change is actually quite predictable and that a certain set of values tend to emerge out of the conditions of modernization.” This makes the author believe a re-run of history reaching today’s level of technological development while maintaining slavery is much more unlikely than stated in WWOTF (What We Owe the Future).

The author notes historical examples tend to overemphasize contingency, while social science overemphasizes patterns and regularities, so those methodologies can be a good balance to each other.

Promoting compassionate longtermism

by jonleighton

Some suffering is bad enough that non-existence is preferable. The lock-in of uncompassionate systems (eg. through AI or AI-assisted governments) could cause mass suffering in the future.

OPIS (Organisation for the Prevention of Intense Suffering) has until now worked on projects to help ensure that people in severe pain can get access to effective medications. In future, they plan to “address the very principles of governance, ensure that all significant causes of intense suffering receive adequate attention, and promote strategies to prevent locked-in totalitarianism”. One concrete project within this is a full length film to inspire people with this vision and lay out actionable steps. They’re looking for support in the form of donations and /​ or time.

Object Level Interventions /​ Reviews

Why development aid is a really exciting field

by MathiasKB

Wealthy countries spend a collective $178B on development aid per year − 25% of all giving worldwide. Some aid projects have been cost-effective on a level with Givewell’s top recommendations (eg. PEPFAR), while others have caused outright harm.

Aid is usually distributed via a several step process:

  1. Decide to spend money on aid. Many countries signed a 1970 UN resolution to spend 0.7% of GNI on official development assistance.

  2. Government decides a general strategy /​ principles.

  3. Government passes a budget, assigning $s to different aid subcategories.

  4. The country’s aid agency decides on projects. Sometimes this is donating to intermediaries like the UN or WHO, sometimes it’s direct.

  5. Projects are implemented.

This area is large scale, tractability is unsure but there are many pathways and some past successes (eg. a grassroots EA campaign in Switzerland redirected funding, and the US aid agency ran a cash-benchmarking experiment with GiveDirectly), and few organisations focus on this area compared to the scale.

The author and their co-founder have been funded to start an organization in this area. Get in touch if you’re interested in Global Development and Policy.

Visualizing the development gap

by Stephen Clare

The US poverty threshold, below which one qualifies for government assistance, is $6625 per person for a family of four. In Malawi, one of the world’s poorest countries, the median income is a twelfth of that (adjusted for purchasing power). Without a change in growth rates, it will take Malawi almost two centuries to catch up to where the US is today.

This example illustrates the development gap: the difference in living standards between high and low income countries. Working on this is important both for the wellbeing of those alive today, and because it allows more people to participate meaningfully in humanity’s most important century and therefore help those in the future too.

Some observations from an EA-adjacent (?) charitable effort

by patio11

The author founded VaccinateCA, which ran the national shadow vaccine location infrastructure for 6 months. They think this saved many thousands of lives, at a cost of ~$1.2M.

Learnings from this include:
1. Enabling trade as a mechanism for impact:

  • Many actors (eg. Google, the White House, pharmacists) wanted residents vaccinated, but weren’t working together.

  • By convincing a few people at Google to show the data, and getting data via calling pharmacies, they were able to do something very impactful without being hired to do it.

2. Engaging with the system:

  • Many people in policy are skeptical of tech.

  • 501c3 charity status is extremely useful for reputation and funding.

  • Carefully planned comms created initial positive press, which in turn was a key reason Google and government stakeholders engaged with them.

3. Factors in the team’s success:

  • Money to commit early and boldly.

  • Social capital to call in favors in the tech community /​ get resources.

  • Good understanding of the infrastructure of the project (call centers and information flows of IT systems).

  • Ability to write good software quickly.

  • PR experience and talent.

  • A network with lots of high agency people willing to try things.

  • People management skills (they didn’t have this, but it would have been very helpful)

While this was not an “EA Project” the author expects it wouldn’t have happened if EA hadn’t drawn their attention to expected value math, and is grateful for this.

Race to the Top: Benchmarks for AI Safety

by isaduan

AI Safety benchmarks make it easier to track the field’s progress, create incentives for researchers (particularly in China) to work on problems relevant to AI safety, and develop auditing and regulations around advanced AI systems.

The author argues this should be given higher priority than it is, and that we can’t assume good benchmarks will be developed by default. They suggest AI safety researchers help by identifying traits for benchmarks, valuable future benchmarks and their prerequisites, or creating benchmarks directly. AI governance professionals can help by organizing workshops /​ competitions /​ prizes around benchmarking, researching how benchmarks could be used, and advising safety researchers on this. It could also be useful to popularize a “race to the top” narrative on AI Safety.

Thoughts on AGI organizations and capabilities work

by RobBensinger, So8res

Rob paraphrases Nate’s thoughts on capabilities work and the landscape of AGI organisations. Nate thinks:

  1. Capabilities work is a bad idea, because it isn’t needed for alignment to progress and it could speed up timelines. We already have many ML systems to study, which our understanding lags behind. Publishing that work is even worse.

  2. He appreciates OpenAI’s charter, openness to talk to EAs /​ rationalists, clearer alignment effort than FAIR or Google Brain, and transparency about their plans. He considers DeepMind and Anthropic on par and slightly ahead respectively on taking alignment seriously.

  3. OpenAI, Anthropic, and DeepMind are unusually safety-conscious AI capabilities orgs (e.g., much better than FAIR or Google Brain). But reality doesn’t grade on a curve, there’s still a lot to improve, and they should still call a halt to mainstream SotA-advancing potentially-AGI-relevant ML work, since the timeline-shortening harms currently outweigh the benefits.

You should consider launching an AI startup

by Joshc

AI startups can be big money-makers, particularly as capabilities scale. The author argues that money is key to AI safety, because money:

  • Can convert into talent (eg. via funding AI safety industry labs, offering compute to safety researchers, and funding competitions, grants, and fellowships). Doubly so if the bottleneck becomes engineering talent and datasets instead of creative researchers.

  • Can convert into influence (eg. lobbying, buying board seats, soft power).

  • Is flexible and always useful.

The author thinks another $10B AI company would be unlikely to counterfactually accelerate timelines by more than a few weeks, and that money /​ reduced time to AGI tradeoff seems worth it. They also argue that the transformative potential of AI is becoming well-known, and now is the time to act to benefit from our foresight on it. They’re looking for a full-stack developer as a cofounder.

Smallpox eradication

by Lizka

Smallpox was confirmed as eradicated on December 9th, 1979. Our World in Data has a great explorer on its history and how eradication was achieved.

Smallpox killed ~300 million people in the 20th century alone, and is the only human disease to have been completely eradicated. It also led to the first ever vaccine, after Edward Jenner demonstrated that exposure to cowpox—a related but less severe disease—protected against smallpox. In the 19th and 20th centuries, further improvements were made to the vaccine. In 1959, the WHO launched a global program to eradicate smallpox, including efforts to vaccinate (particularly those in contact with infected individuals - ‘ring vaccination’), isolate those infected, and monitor spread. They eventually contained the virus primarily to India (86% of cases were there in 1974), and with a final major vaccination campaign, dropped cases there to zero in 1976.

Main paths to impact in EU AI Policy

by JOMG_Monnet

An non-exhaustive overview of paths to impact in EU AI Policy, cross-checked with 4 experts:

1. Working on enforcement of the AI Act, related AI technical standards and adjacent regulation

2. Working on export controls and using the EU’s soft power

3. Using career capital from time in EU AI Policy to work in private sector or other policy topics

Building career capital is a prerequisite for impact in any of these. The author recommends building a personal theory of change before acting.

Binding Fuzzies and Utilons Together

by eleni, LuisMota

Some interventions are neglected because they have less emotional appeal. EA typically tackles this by redirecting more resources there. The authors suggest we should also tackle the cause, by designing marketing to make them more emotionally appealing. This could generate significant funding, more EA members, and faster engagement.

As an example, the Make-A-Wish website presents specific anecdotes about a sick child, while the Against Malaria Foundation website focuses on statistics. Psychology shows the former is more effective at generating charitable behavior.

Downsides include potential organizational and personal value drift, and reduction in relative funding for Longtermist areas if these are harder to produce emotional content for. They have high uncertainty and suggest a few initial research directions that EAs with a background in psychology could take to develop this further.

Giving Recommendations and Organisation Updates

Center on Long-Term Risk: 2023 Fundraiser
by stefan.torges
CLR’s goal is to reduce the worst risks of astronomical suffering, primarily via identifying and advocating for interventions that reliably shape the development and deployment of advanced AI systems in a positive way.

Current research programs include AI conflict, evidential cooperation in large worlds, and s-risk (suffering risk) macrostrategy. They had originally planned on expanding their s-risk community building, but these plans are on hold due to the funding situation.

They have a short-term funding shortfall, with only a 6 month runway after implementing cost-saving measures. You can donate here (with tax deductible options for Germany, Switzerland, Netherlands, USA and UK).

Presenting: 2022 Incubated Charities (Charity Entrepreneurship)

by KarolinaSarek, Joey

5 new charities have launched as a result of the June—August 2022 incubation program:

  • Center for Effective Aid Policy- identifying and promoting high-impact development policies and interventions.

  • Centre for Exploratory Altruism Research (CEARCH)- conducting cause prioritization research and outreach.

  • Maternal Health Initiative- producing transformative benefits to women’s health, agency, and income through increased access to family planning.

  • Kaya Guides- reducing depression and anxiety among youth in low-and middle-income countries.

  • Vida Plena- building strong mental health in Latin America.

In 2023, there will be two incubation programs, the first of which applications have closed. If you’re interested in applying for the second (June—August), you can sign up here to be notified when applications open.

SoGive Grants: a promising pilot. Our reflections and payout report.

by SoGive, Isobel P

SoGive is an EA-aligned research organization and think tank. In 2022, they ran a pilot grants program, granting £223k to 6 projects (out of 26 initial applicants):

  • Founders Pledge - £93,000 - to hire an additional climate researcher.

  • Effective Institutions Project - £62,000 - for a regranting program.

  • Doebem - £35,000 - a Brazillian effective giving platform, to continue scaling.

  • Jack Davies - £30,000 - for research improving methods to scan for neglected X-risks.

  • Paul Ingram - £21,000 - poll how nuclear winter info affects nuclear armament support.

  • Social Change Lab - £18,400 − 2xFTE for 2 months, researching social movements.

The funds were sourced from private donors, mainly people earning to give. If you’d like to donate, contact isobel@sogive.org.

They advise future grant applicants to lay out their theory of change (even if their project is one small part), reflect on how you came to your topic and if you’re the right fit, and consider downside risk.

They give a detailed review of their evaluation process, which was heavy touch and included a standardized bar to meet, ITN+ framework, delivery risks (eg. is 80% there 80% of the good?), and information value of the project. They tentatively plan to run it again in 2022, with a lighter touch evaluation process (extra time didn’t add much value). They also give reflections and advice for others starting grant programs, and are happy to discuss this with anyone.

Announcing BlueDot Impact

by Dewi Erwan, Jamie Bernardi, Will Saunter

BlueDot Impact is a non-profit running courses that support participants to develop the knowledge, community and network needed to pursue high-impact careers. The courses focus on literature in high-impact fields, in addition to career opportunities in the space, and bring together interested participants to help build networks and collaborations.

They’ve made multiple improvements over the past few months including working with pedagogy experts to make discussion sessions more engaging, formalizing the course design process, building systems to improve participant networking, collating downstream opportunities for participants to pursue after the courses, and building their team.

You can register your interest for future courses here.

r.i.c.e.’s neonatal lifesaving partnership is funded by GiveWell; a description of what we do

by deanspears
r.i.c.e collaborates with the Government of Uttar Pradesh and an organization in India to promote Kangaroo Mother Care (KMC), which is a well-established tool for increasing survival rates of low birth weight babies. They developed a public-private partnership to cause the government’s KMC guidelines to be implemented cost-effectively in a public hospital.

Their best estimate based on a combination of implementation costs and pre-existing research is that it costs ~$1.8K per life saved. However they are unsure and are planning to compare survival rates in the hospital targeted vs. others in the region next year.

Both Founders Pledge and GiveWell have made investments this year. They welcome further support—you can donate here. Donations will help maintain the program, scale it up, do better impact evaluation, and potentially expand to other hospitals if they find good implementation partners.

Our 2022 Giving

by Jeff Kaufman

For the past several years, Julia and Jeff have donated 50% of their income to charity, divided between GiveWell and the EA Infrastructure Fund. After a decrease in salary to focus on direct work, they’re planning to do the same for 2022 for the sake of simplicity and because of the greater funding needed post the FTX situation. They’ll re-evaluate the percentage in 2023.

Opportunities

Rethink Priorities is hiring: help support and communicate our work

by Rethink Priorities

Author’s tl;dr:

SFF Speculation Grants as an expedited funding source by Andrew Critch & SFF is doubling speculation (rapid) grant budgets; FTX grantees should consider applying by JueYan

The Survival and Flourishing Fund (SFF) funds many longtermist, x-risk, and meta projects. Its Speculation Grants program can fund charities and projects hosted by organizations with charity status, with some applications able to be improved in days and paid out within a month. In response to the recent extraordinary need, Jaan Tallinn, the main funder of SFF, is doubling speculation budgets. Grantees impacted by recent events should apply.

Applications for EAGx LatAm closing on the 20th of December

by LGlez

EAGx LatAM will take place in Mexico City Jan 6 − 8. It is primarily targeted at those from /​ with ties to Latin America, or experienced members of the international community excited to meet them. There is no requirement to speak Spanish. Applications can be made here, and close on 20th December.

Climate research webinar by Rethink Priorities on Tuesday, December 13 at 11 am EST

by Rethink Priorities

Rethink Priorities resident climate expert, Senior Environmental Economist Greer Gosnell, will give a presentation regarding the research process and findings of a report that evaluates anti-deforestation as a promising climate solution. It will include time for Q&A.

Sign up here to join, or to be emailed the recording if you can’t make it live.

Community & Media

Announcing: Audio narrations of EA Forum posts

by peterhartree, Sharang Phadke, JP Addison, Lizka, type3audio

You can now subscribe to a podcast of EA forum human-narrated content—including top posts, and these weekly summaries. Post narrations will also appear on the post itself, click the speaker button to listen to them. The nonlinear library will continue to generate AI narrations of a larger number of posts on a separate feed, with quality improvements planned.

What specific changes should we as a community make to the effective altruism community? [Stage 1]

by Nathan Young

The author is collecting ideas in the comments section, which will then be added to Polis. Polis allows people to vote on ideas, and groups similar voters together, to better understand the different clusters in the community. The suggestions with the highest consensus, or high consensus within specific clusters, will be put in a google doc for further research.

Learning from non-EAs who seek to do good

by Siobhan_M

The author asks whether EA aims to be a question about doing good effectively, or a community based around ideology. In their experience, it has mainly been the latter, but many EAs have expressed they’d prefer it be the former.

They argue the best concrete step toward EA as a question would be to collaborate more with people outside the EA community, without attempting to bring them into the community. This includes policymakers on local and national levels, people with years of expertise in the fields EA works in, and people who are most affected by EA-backed programs.

Specific ideas include EAG actively recruiting these people, EA groups co-hosting more joint community meetups, EA orgs measuring preferences of those impacted by their programs, applying evidence-based decision-making to all fields (not just top cause areas), engaging with people and critiques outside the EA ecosystem, funding and collaborating with non-EA orgs (eg. via grants), and EA orgs hiring non-EAs.

The Spanish-Speaking Effective Altruism community is awesome

by Jaime Sevilla

Since Sandra Malagón and Laura González were funded to work on growing the Spanish-speaking EA community, it’s taken off. There have been 40 introductory fellowships, 2 university groups started, 2 camps, many dedicated community leaders, translation projects, 7-fold activity on Slack vs. 2020, and a community fellowship /​ new hub in Mexico City. If you’re keen to join in, the slack workspace is here, and anyone (English or Spanish speaking) can apply to EAGxLatAm.

Revisiting EA’s media policy

by Arepo

CEA follows a fidelity model of spreading ideas, which claims because EA ideas are nuanced and the media often isn’t, media communication should only be done by those qualified who are confident the media will report the ideas exactly as stated.

The author argues against this on four points:

  1. Sometimes many people doing something ‘close to’ is better than few doing it ‘exactly’ eg. few vegans vs. many reductitarians.

  2. If you don’t actively engage the media, a large portion of coverage will be from detractors, and therefore negative.

  3. EA’s core ideas are not that nuanced. Most critics have a different emotional response or critique how it’s put into practice, rather than get anything factually wrong.

  4. The fidelity model contributes to hero worship and concentration of power in EA.

The author suggests further discussion on this policy, acknowledgement from CEA of the issues with it, experimenting with other approaches in low-risk settings, and historical /​ statistical research into what approaches have worked for other groups.

Why did CEA buy Wytham Abbey?

by Jeroen_W

Last year, CEA bought Wytham Abbey, a substantial estate in Oxford, to establish as a center to run workshops and meetings. Some media on this suggested it was extremely expensive (~15M pounds), though it seems like CEA bought <1% of the land that figure was based on. The author questions if this was cost-effective and asks CEA to share reasoning and EV calculations, to ensure we’re not rationalizing lavish expenses.

EA Taskmaster Game

by Gemma Paterson

Resources, how-tos and reflections for an EA themed game the author ran that was a mix between a scavenger hunt, the TV show Taskmaster and a party quest game. There were various tasks for teams to complete within a time limit, which awarded points not always in relation to difficulty, and had different scoring mechanisms (eg. winner takes all, split between submissions, all get the points). The idea was that prioritization of which tasks to complete would be best done using the scale, neglectedness, and tractability framework—and that doing them would be really fun!

[Link Post] If We Don’t End Factory Farming Soon, It Might Be Here Forever.

by BrianK

Article from Forbes. “Once rooted, value systems tend to persist for an extremely long time. And when it comes to factory farming, there’s reason to believe we may be at an inflection point.” It references What We Owe the Future and argues that AI may lock in the values of today, including factory farming.

New interview with SBF on Will MacAskill, “earn to give” and EA

by teddyschleifer

New interview with SBF by the post author, focused on “Will MacAskill, effective altruism, and whether “earn to give” led him to make the mistakes he made at FTX.” One key section quoted in the comments states he hasn’t talked to Will M since FTX collapsed, and feels “incredibly bad about the impact of this on E.A. and on him”, that SBF has a “duty to sort of spend the rest of my life doing what I can to try and make things right as I can”, and confirmation that a previous reference to a “dumb game that we woke Westerners play” was in regard to “corporate social responsibility and E.S.G”.

EA London Rebranding to EA UK

by DavidNash

The EA London website and newsletter have been rebranded to EA UK. The UK has a population of 67 million, only 14% of which live in a place with a paid group organizer (London, Oxford, or Cambridge). Setting up EA UK will help provide virtual support (1-1s, newsletter, directory, project advice) and may encourage more local groups to set up by making it easier to find other EAs in the area.

CEA “serious incident report”?

by Pagw

CNBC reported that CEA filed a ‘serious incident report’ with the Charity Commission tied to the collapse of FTX. Commenters note this is standard /​ expected—the Commission asks for serious incident reports to be filed on anything actual or alleged which risks significant loss of money or reputation.

Didn’t Summarize

The EA Infrastructure Fund seems to have paused its grantmaking and approved grant payments. Why? by Markus Amalthea Magnuson

LW Forum

Using GPT-Eliezer against ChatGPT Jailbreaking

by Stuart_Armstrong, rgorman

The authors propose a separate LLM (large language model) should be used to evaluate safety of prompts to ChatGPT. They tested a hacky version of this by instructing ChatGPT to take on the persona of a suspicious AI safety engineer (Eliezer). They ask that, within that persona, it assesses whether certain prompts are safe to send to ChatGPT.

In tests to date, this eliminates jailbreaking and effectively filters dangerous prompts, even including the less-straightforwardly-dangerous attempt to get ChatGPT to generate a virtual machine. Commenters have been able to break it, but primarily via SQL-injection style attacks, which could be avoided with a different implementation.

Updating my AI timelines

by Matthew Barnett

The author published a post last year titled Three reasons to expect long AI timelines. They’ve updated, and now have a median TAI timeline of 2047 and mode 2035. This is because they now:

  1. Think barriers to language models adopting human-level reasoning were weaker than they’d believed. (eg. reasoning over long sequences, after seeing ChatGPT).

  2. Built a TAI timelines model, which came out with median 2037.

  3. Reflected on short-term AI progress accelerating AI progress.

  4. Noted they’d been underestimating returns to scaling, and the possibility of large companies scaling training budget quickly to the $10B-$100B level.

  5. Almost everyone else updated to shorter timelines (unless they already had 5-15 year ones).

  6. They still think regulation will delay development, but not as much as before, given governments have been mostly ignoring recent AI developments.

Probably good projects for the AI safety ecosystem

by Ryan Kidd

Projects the author would be excited to see:

  • Variants of the MATS program: eg. one in London, one with rolling admissions, one for AI governance research.

  • Support for AI safety university groups: eg. bi-yearly workshops, plug-and-play curriculums, university course templates covering ‘precipism’.

  • Talent recruitment organisations: eg. focused on cyber security researchers, established ML talent, or specific gaps like for Sam Bowman’s planned AI safety research group.

  • Programs to develop ML safety engineering talent at scale eg. like ARENA.

  • Contests like ELK on well-operationalized research problems.

  • Hackathons for people with strong ML knowledge to critique AI alignment papers.

  • Supplemental work to OP’s worldview investigations team like GCP did for CEA & 80K.

AI Safety Seems Hard to Measure

by HoldenKarnofsky

The author argues it will be hard to tell if AI Safety efforts are successfully reducing risk, because of 4 problems:
1. The AI might look like it’s behaving, but actually isn’t.

2. The AI’s behavior might change once it has power over us.

3. Current AI systems might be too primitive to deceive or manipulate (but that doesn’t mean future ones won’t).

4. Far-beyond-human capabilities AI is really hard to predict in general.

Other

Setting the Zero Point

by Duncan_Sabien

‘Setting the Zero Point’ is a “Dark Art” ie. something which causes someone else’s map to unmatch the territory in a way that’s advantageous to you. It involves speaking in a way that takes for granted that the line between ‘good’ and ’bad is at a particular point, without explicitly arguing for that. This makes changes between points below and above that line feel more significant.

As an example, many people draw a zero point between helping and not helping a child drowning in front of them. One is good, one is bad. The Drowning Child argument argues this point is wrongly set, and should be between helping and not helping any dying child.

The author describes 14 examples, and suggests that it’s useful to be aware of this dynamic and explicitly name zero points when you notice them.

The Story Of VaccinateCA

by hath

Linkpost for Patrick MacKenzie’s writeup of VaccinateCA, a non-profit that within 5 days created the best source of vaccine availability data in California, improving from there to on best guess save many thousands of lives.

Didn’t Summarize

Logical induction for software engineers by Alex Flint

Finite Factored Sets in Pictures by Magdalena Wache

[Link] Why I’m optimistic about OpenAI’s alignment approach by janleike (linkpost for this)

Crossposted from EA Forum (27 points, 0 comments)
No comments.