What are the flaws in this AGI argument?

William the Kiwi Aug 11, 2023, 11:31 AM

5 points

14 comments 1 min read LW link

Google DeepMind’s RT-2

SandXbox Aug 11, 2023, 11:26 AM

9 points

1 comment 1 min read LW link

(robotics-transformer2.github.io)

Linkpost: We need another Expert Survey on Progress in AI, urgently

David Mears Aug 11, 2023, 8:22 AM

25 points

2 comments 2 min read LW link

(open.substack.com)

What Does a Marginal Grant at LTFF Look Like? Funding Priorities and Grantmaking Thresholds at the Long-Term Future Fund

Linch, calebp99 and Daniel_Eth

Aug 11, 2023, 3:59 AM

64 points

0 comments 1 min read LW link

(forum.effectivealtruism.org)

[Question] Will posting any thread on LW guarantee that a LLM will index all my content, and if questions people ask to the LLM after my name will surface up all my LW content?

Alex K. Chen (parrot) Aug 11, 2023, 1:40 AM

0 points

0 comments 1 min read LW link

AI Safety Concepts Writeup: WebGPT

JustisMills Aug 11, 2023, 1:35 AM

9 points

1 comment 7 min read LW link

[Question] What is science?

Adam Zerner Aug 11, 2023, 12:00 AM

6 points

4 comments 1 min read LW link

Three configurable prettyprinters

philh Aug 10, 2023, 11:10 PM

9 points

0 comments 22 min read LW link

(reasonableapproximation.net)

Ilya Sutskever’s thoughts on AI safety (July 2023): a transcript with my comments

mishka Aug 10, 2023, 7:07 PM

21 points

3 comments 5 min read LW link

Seeking Input to AI Safety Book for non-technical audience

Darren McKee Aug 10, 2023, 5:58 PM

10 points

4 comments 1 min read LW link

Evaluating GPT-4 Theory of Mind Capabilities

gcmac and Nathan

Aug 10, 2023, 5:57 PM

15 points

2 comments 14 min read LW link

Some alignment ideas

SelonNerias Aug 10, 2023, 5:51 PM

1 point

0 comments 11 min read LW link

Self Supervised Learning (SSL)

Varshul Gupta Aug 10, 2023, 5:43 PM

5 points

1 comment 2 min read LW link

(dubverseblack.substack.com)

Predicting Virus Relative Abundance in Wastewater

jefftk Aug 10, 2023, 3:46 PM

33 points

2 comments 1 min read LW link

(naobservatory.org)

AI #24: Week of the Podcast

Zvi Aug 10, 2023, 3:00 PM

49 points

5 comments 44 min read LW link

(thezvi.wordpress.com)

Could We Automate AI Alignment Research?

Stephen McAleese Aug 10, 2023, 12:17 PM

34 points

10 comments 21 min read LW link

The positional embedding matrix and previous-token heads: how do they actually work?

AdamYedidia Aug 10, 2023, 1:58 AM

27 points

4 comments 13 min read LW link

LLMs are (mostly) not helped by filler tokens

Kshitij Sachan Aug 10, 2023, 12:48 AM

66 points

36 comments 6 min read LW link

2023 ACX Meetups Everywhere—Newton, MA

duck_master Aug 9, 2023, 10:47 PM

6 points

2 comments 1 min read LW link

Progress links digest, 2023-08-09: US adds new nuclear, Katalin Karikó interview, and more

jasoncrawford Aug 9, 2023, 7:22 PM

18 points

0 comments 3 min read LW link

(rootsofprogress.org)

Mech Interp Challenge: August—Deciphering the First Unique Character Model

CallumMcDougall Aug 9, 2023, 7:14 PM

36 points

1 comment 3 min read LW link

Real Meaning of life has been found. Eliezer discovered it in 2000′s.

Jorterder Aug 9, 2023, 6:13 PM

−15 points

1 comment 1 min read LW link

(docs.google.com)

Marginal Revolution unofficial birthday party

Derek M. Jones Aug 9, 2023, 2:35 PM

4 points

0 comments 1 min read LW link

A content analysis of the SQ-R questionnaire and a proposal for testing EQ-SQ theory

tailcalled Aug 9, 2023, 1:51 PM

10 points

2 comments 13 min read LW link

[Question] Does LessWrong allow exempting posts from being scraped by GPTBot?

mic Aug 9, 2023, 1:02 PM

29 points

3 comments 1 min read LW link

If I Was An Eccentric Trillionaire

niplav Aug 9, 2023, 7:56 AM

9 points

8 comments 26 min read LW link

Modulating sycophancy in an RLHF model via activation steering

Nina Panickssery Aug 9, 2023, 7:06 AM

69 points

20 comments 12 min read LW link

Open Thread—August 2023

habryka Aug 9, 2023, 3:52 AM

18 points

49 comments 1 min read LW link

marine cloud brightening

bhauth Aug 9, 2023, 2:50 AM

40 points

14 comments 3 min read LW link

(www.bhauth.com)

Inflection.ai is a major AGI lab

Nikola Jurkovic Aug 9, 2023, 1:05 AM

137 points

13 comments 2 min read LW link

Acausal Now: We could totally acausally bargain with aliens at our current tech level if desired

Christopher King Aug 9, 2023, 12:50 AM

1 point

5 comments 4 min read LW link

Necromancy’s unintended consequences.

Christopher King Aug 9, 2023, 12:08 AM

−6 points

2 comments 2 min read LW link

What’s A “Market”?

johnswentworth Aug 8, 2023, 11:29 PM

94 points

16 comments 10 min read LW link

Podcast (+transcript): Nathan Barnard on how US financial regulation can inform AI governance

Aaron Bergman Aug 8, 2023, 9:46 PM

8 points

0 comments 23 min read LW link

(www.aaronbergman.net)

What are the flaws in this argument about p(Doom)?

William the Kiwi Aug 8, 2023, 8:34 PM

−2 points

26 comments 1 min read LW link

A Simple Theory Of Consciousness

SherlockHolmes Aug 8, 2023, 6:05 PM

2 points

5 comments 1 min read LW link

(peterholmes.medium.com)

[Linkpost] Rationally awake

jpc Aug 8, 2023, 5:59 PM

−1 points

0 comments 4 min read LW link

(jpc.dev)

Yet more UFO Betting: Put Up or Shut Up

MoreRatsWrongReUAP Aug 8, 2023, 5:50 PM

10 points

18 comments 1 min read LW link

AISN #18: Challenges of Reinforcement Learning from Human Feedback, Microsoft’s Security Breach, and Conceptual Research on AI Safety

Dan H Aug 8, 2023, 3:52 PM

13 points

0 comments 5 min read LW link

(newsletter.safe.ai)

[Question] Beginner’s question about RLHF

FTPickle Aug 8, 2023, 3:48 PM

1 point

3 comments 1 min read LW link

My Trial Period as an Independent Alignment Researcher

Bart Bussmann Aug 8, 2023, 2:16 PM

34 points

1 comment 3 min read LW link

4 types of AGI selection, and how to constrain them

Remmelt Aug 8, 2023, 10:02 AM

−4 points

3 comments 3 min read LW link

Notice your everything

metachirality Aug 8, 2023, 2:38 AM

15 points

1 comment 2 min read LW link

Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research

evhub, Nicholas Schiefer, Carson Denison and Ethan Perez

Aug 8, 2023, 1:30 AM

319 points

30 comments 18 min read LW link 1 review

Perpetually Declining Population?

jefftk Aug 8, 2023, 1:30 AM

48 points

29 comments 3 min read LW link

(www.jefftk.com)

[Question] How do I find all the items on LW that I’ve favorited or upvoted?

Alex K. Chen (parrot) Aug 7, 2023, 11:51 PM

14 points

3 comments 1 min read LW link

A plea for more funding shortfall transparency

porby Aug 7, 2023, 9:33 PM

73 points

4 comments 2 min read LW link

[Question] Tips for reducing thinking branching factor

Simon Berens Aug 7, 2023, 8:21 PM

4 points

6 comments 1 min read LW link

An interactive introduction to grokking and mechanistic interpretability

Adam Pearce and Asma Ghandeharioun

Aug 7, 2023, 7:09 PM

23 points

3 comments 1 min read LW link

(pair.withgoogle.com)

Feedbackloop-first Rationality

Raemon Aug 7, 2023, 5:58 PM

205 points

69 comments 8 min read LW link 2 reviews