
Abstraction

Last edit: 25 Jul 2025 23:22 UTC by satpugnet

Abstraction is the process of simplifying a system by capturing only the essential features needed for your purpose, while deliberately ignoring irrelevant details. In AI alignment, effective abstraction means creating models or concepts that genuinely reflect what matters for reasoning or control, not just convenient proxies. If the abstraction misses important structure, it can fail dramatically when optimized or applied in new situations. The challenge is to develop abstractions that remain valid and useful, even as systems scale or face new pressures.
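As a toy illustration of the idea (a minimal sketch invented for this description, not taken from any of the posts below): abstract a system's many low-level variables down to a single summary statistic, keeping exactly the information that far-away predictions depend on and deliberately discarding the rest.

```python
import random

# Low-level state: many individual particle velocities.
random.seed(0)
velocities = [random.gauss(0, 1) for _ in range(10_000)]

# The abstraction: keep only the mean squared velocity (a "temperature").
# Everything else about the low-level state is thrown away on purpose.
temperature = sum(v * v for v in velocities) / len(velocities)

# Far-away quantities (e.g. pressure on a distant wall) depend on the
# low-level state only through this summary, so the abstraction stays
# valid for those questions, while questions about any single particle
# are given up. The abstraction fails exactly when some downstream
# question depends on detail the summary discarded.
print(f"temperature = {temperature:.2f}")
```

The "temperature" name is illustrative; the point is that a good abstraction is defined relative to which far-away predictions it needs to support.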

(This is a stub; please rewrite it if you have a better tag description.)

Whence Your Abstractions?

Eliezer Yudkowsky, 20 Nov 2008 1:07 UTC
12 points
6 comments · 3 min read · LW link

What is Abstraction?

johnswentworth, 6 Dec 2019 20:30 UTC
37 points
8 comments · 5 min read · LW link

Underconstrained Abstractions

Eliezer Yudkowsky, 4 Dec 2008 13:58 UTC
11 points
27 comments · 5 min read · LW link

[Question] What is abstraction?

Adam Zerner, 15 Dec 2018 8:36 UTC
25 points
11 comments · 4 min read · LW link

Abstraction = Information at a Distance

johnswentworth, 19 Mar 2020 0:19 UTC
29 points
1 comment · 3 min read · LW link

Intuitions on Universal Behavior of Information at a Distance

johnswentworth, 20 Apr 2020 21:44 UTC
21 points
3 comments · 9 min read · LW link

How to Throw Away Information in Causal DAGs

johnswentworth, 8 Jan 2020 2:40 UTC
20 points
2 comments · 2 min read · LW link

Causal Abstraction Intro

johnswentworth, 19 Dec 2019 22:01 UTC
24 points
6 comments · 1 min read · LW link

Causality Adds Up to Normality

johnswentworth, 15 Jun 2020 17:19 UTC
13 points
2 comments · 5 min read · LW link

Public Static: What is Abstraction?

johnswentworth, 9 Jun 2020 18:36 UTC
97 points
18 comments · 11 min read · LW link

Formulating Reductive Agency in Causal Models

johnswentworth, 23 Jan 2020 17:03 UTC
33 points
0 comments · 2 min read · LW link

The Indexing Problem

johnswentworth, 22 Jun 2020 19:11 UTC
36 points
2 comments · 4 min read · LW link

(A → B) → A in Causal DAGs

johnswentworth, 22 Jan 2020 18:22 UTC
48 points
11 comments · 2 min read · LW link

Example: Markov Chain

johnswentworth, 10 Jan 2020 20:19 UTC
15 points
2 comments · 4 min read · LW link

Writing Causal Models Like We Write Programs

johnswentworth, 5 May 2020 18:05 UTC
90 points
11 comments · 4 min read · LW link

Abstraction, Causality, and Embedded Maps: Here Be Monsters

johnswentworth, 18 Dec 2019 20:25 UTC
26 points
1 comment · 4 min read · LW link

Logical Representation of Causal Models

johnswentworth, 21 Jan 2020 20:04 UTC
37 points
0 comments · 3 min read · LW link

Abstraction, Evolution and Gears

johnswentworth, 24 Jun 2020 17:39 UTC
29 points
11 comments · 4 min read · LW link

Motivating Abstraction-First Decision Theory

johnswentworth, 29 Apr 2020 17:47 UTC
43 points
16 comments · 5 min read · LW link

[Question] Problems Involving Abstraction?

johnswentworth, 20 Oct 2020 16:49 UTC
30 points
12 comments · 1 min read · LW link

Noise Simplifies

johnswentworth, 15 Apr 2020 19:48 UTC
28 points
3 comments · 2 min read · LW link

Pointing to a Flower

johnswentworth, 18 May 2020 18:54 UTC
61 points
18 comments · 9 min read · LW link

Integrating Hidden Variables Improves Approximation

johnswentworth, 16 Apr 2020 21:43 UTC
15 points
5 comments · 1 min read · LW link

Causal Abstraction Toy Model: Medical Sensor

johnswentworth, 11 Dec 2019 21:12 UTC
34 points
6 comments · 6 min read · LW link

Trace README

johnswentworth, 11 Mar 2020 21:08 UTC
37 points
1 comment · 8 min read · LW link

Cartesian Boundary as Abstraction Boundary

johnswentworth, 11 Jun 2020 17:38 UTC
35 points
3 comments · 5 min read · LW link

Mediation From a Distance

johnswentworth, 20 Mar 2020 22:02 UTC
16 points
0 comments · 2 min read · LW link

Examples of Causal Abstraction

johnswentworth, 12 Dec 2019 22:54 UTC
24 points
7 comments · 4 min read · LW link

Trace: Goals and Principles

johnswentworth, 28 Feb 2020 23:50 UTC
13 points
3 comments · 8 min read · LW link

Alignment By Default

johnswentworth, 12 Aug 2020 18:54 UTC
177 points
101 comments · 11 min read · LW link · 2 reviews

Definitions of Causal Abstraction: Reviewing Beckers & Halpern

johnswentworth, 7 Jan 2020 0:03 UTC
30 points
4 comments · 4 min read · LW link

The Flexibility of Abstract Concepts

lsusr, 2 Mar 2021 6:43 UTC
48 points
8 comments · 5 min read · LW link

Finite Factored Sets

Scott Garrabrant, 23 May 2021 20:52 UTC
149 points
95 comments · 24 min read · LW link · 1 review

Reducing Agents: When abstractions break

Hazard, 31 Mar 2018 0:03 UTC
13 points
10 comments · 8 min read · LW link

Assorted thoughts about abstraction

Adam Zerner, 5 Jul 2022 6:40 UTC
16 points
9 comments · 7 min read · LW link

Testing The Natural Abstraction Hypothesis: Project Update

johnswentworth, 20 Sep 2021 3:44 UTC
88 points
17 comments · 8 min read · LW link · 1 review

Sapir-Whorf for Rationalists

Duncan Sabien (Inactive), 25 Jan 2023 7:58 UTC
155 points
49 comments · 20 min read · LW link

Finite Factored Sets: Introduction and Factorizations

Scott Garrabrant, 4 Jun 2021 17:41 UTC
36 points
2 comments · 10 min read · LW link

Functors and Coarse Worlds

Scott Garrabrant, 30 Oct 2020 15:19 UTC
52 points
3 comments · 8 min read · LW link

What Does The Natural Abstraction Framework Say About ELK?

johnswentworth, 15 Feb 2022 2:27 UTC
35 points
0 comments · 6 min read · LW link

how 2 tell if ur input is out of distribution given only model weights

dkirmani, 5 Aug 2023 22:45 UTC
49 points
10 comments · 1 min read · LW link

AXRP Episode 15 - Natural Abstractions with John Wentworth

DanielFilan, 23 May 2022 5:40 UTC
34 points
1 comment · 58 min read · LW link

Seeing the Matrix, Switching Abstractions, and Missing Moods

Raemon, 4 Jun 2019 21:08 UTC
36 points
3 comments · 4 min read · LW link

When to use “meta” vs “self-reference”, “recursive”, etc.

Alex_Altair, 6 Apr 2022 4:57 UTC
21 points
5 comments · 5 min read · LW link

What Makes an Idea Understandable? On Architecturally and Culturally Natural Ideas.

16 Aug 2022 2:09 UTC
21 points
2 comments · 16 min read · LW link

Distributed Decisions

johnswentworth, 29 May 2022 2:43 UTC
66 points
6 comments · 6 min read · LW link

Neural networks as non-leaky mathematical abstraction

George3d6, 19 Dec 2019 12:23 UTC
14 points
12 comments · 8 min read · LW link
(blog.cerebralab.com)

Chaos Induces Abstractions

johnswentworth, 18 Mar 2021 20:08 UTC
100 points
13 comments · 7 min read · LW link

AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant

DanielFilan, 24 Jun 2021 22:10 UTC
59 points
2 comments · 59 min read · LW link

On the Role of Proto-Languages

adamShimi, 22 Sep 2024 16:50 UTC
54 points
1 comment · 4 min read · LW link
(epistemologicalfascinations.substack.com)

Saving Time

Scott Garrabrant, 18 May 2021 20:11 UTC
162 points
20 comments · 4 min read · LW link · 1 review

Remarks 1–18 on GPT (compressed)

Cleo Nardo, 20 Mar 2023 22:27 UTC
145 points
35 comments · 31 min read · LW link

Finite Factored Sets: Conditional Orthogonality

Scott Garrabrant, 9 Jul 2021 6:01 UTC
29 points
2 comments · 7 min read · LW link

Natural Latents: The Math

27 Dec 2023 19:03 UTC
131 points
41 comments · 12 min read · LW link · 2 reviews

Towards the Operationalization of Philosophy & Wisdom

Thane Ruthenis, 28 Oct 2024 19:45 UTC
20 points
2 comments · 33 min read · LW link
(aiimpacts.org)

A reformulation of Finite Factored Sets

Matthias G. Mayer, 24 Jul 2023 13:02 UTC
79 points
1 comment · 8 min read · LW link

Maxent and Abstractions: Current Best Arguments

johnswentworth, 18 May 2022 19:54 UTC
33 points
2 comments · 3 min read · LW link

To Be Particular About Morality

AGO, 31 Dec 2022 19:58 UTC
6 points
2 comments · 7 min read · LW link

The Natural Abstraction Hypothesis: Implications and Evidence

CallumMcDougall, 14 Dec 2021 23:14 UTC
40 points
9 comments · 19 min read · LW link

The Pragmascope Idea

johnswentworth, 4 Aug 2022 21:52 UTC
59 points
20 comments · 3 min read · LW link

[Question] Why would code/English or low-abstraction/high-abstraction simplicity or brevity correspond?

curi, 4 Sep 2020 19:46 UTC
2 points
15 comments · 1 min read · LW link

Do we have the right kind of math for roles, goals and meaning?

mrcbarbier, 22 Oct 2022 21:28 UTC
13 points
5 comments · 7 min read · LW link

Abstraction Talk

johnswentworth, 25 May 2021 16:45 UTC
38 points
3 comments · 1 min read · LW link

State, Art, Identity

musq, 25 Jan 2021 20:22 UTC
1 point
0 comments · 2 min read · LW link

Embedded Agency via Abstraction

johnswentworth, 26 Aug 2019 23:03 UTC
42 points
20 comments · 11 min read · LW link

Breaking Down Goal-Directed Behaviour

Oliver Sourbut, 16 Jun 2022 18:45 UTC
11 points
1 comment · 2 min read · LW link

Abstractions as morphisms between (co)algebras

Erik Jenner, 14 Jan 2023 1:51 UTC
17 points
1 comment · 8 min read · LW link

No Abstraction Without a Goal

dkirmani, 10 Jan 2022 20:24 UTC
28 points
27 comments · 1 min read · LW link

Schematic Thinking: heuristic generalization using Korzybski’s method

romeostevensit, 14 Oct 2019 19:29 UTC
28 points
7 comments · 3 min read · LW link

Research agenda: Formalizing abstractions of computations

Erik Jenner, 2 Feb 2023 4:29 UTC
93 points
10 comments · 31 min read · LW link

Analogical Reasoning and Creativity

jacob_cannell, 1 Jul 2015 20:38 UTC
39 points
15 comments · 14 min read · LW link

Identification of Natural Modularity

Stephen Fowler, 25 Jun 2022 15:05 UTC
15 points
3 comments · 7 min read · LW link

The Fundamental Theorem for measurable factor spaces

Matthias G. Mayer, 12 Nov 2023 19:25 UTC
41 points
2 comments · 2 min read · LW link

Agency As a Natural Abstraction

Thane Ruthenis, 13 May 2022 18:02 UTC
55 points
9 comments · 13 min read · LW link

Abstract concepts and metalingual definition: Does ChatGPT understand justice and charity?

Bill Benzon, 16 Dec 2022 21:01 UTC
2 points
0 comments · 13 min read · LW link

A comparison of causal scrubbing, causal abstractions, and related methods

8 Jun 2023 23:40 UTC
73 points
3 comments · 22 min read · LW link

Please Understand

samhealy, 1 Apr 2024 12:33 UTC
28 points
11 comments · 6 min read · LW link

Towards Gears-Level Understanding of Agency

Thane Ruthenis, 16 Jun 2022 22:00 UTC
25 points
4 comments · 18 min read · LW link

A Thorough Introduction to Abstraction

RohanS, 13 Jan 2023 0:30 UTC
9 points
1 comment · 18 min read · LW link

But Where do the Variables of my Causal Model come from?

Dalcy, 9 Aug 2024 22:07 UTC
38 points
1 comment · 8 min read · LW link

Good ontologies induce commutative diagrams

Erik Jenner, 9 Oct 2022 0:06 UTC
49 points
5 comments · 14 min read · LW link

[Question] What is an agent in reductionist materialism?

Valentine, 13 Aug 2022 15:39 UTC
7 points
17 comments · 1 min read · LW link

Tradeoffs in complexity, abstraction, and generality

12 Dec 2022 15:55 UTC
32 points
0 comments · 2 min read · LW link

Abstractions as Redundant Information

johnswentworth, 13 Feb 2022 4:17 UTC
69 points
9 comments · 14 min read · LW link

One way to manipulate your level of abstraction related to a task

Andy_McKenzie, 19 Aug 2013 5:47 UTC
36 points
5 comments · 1 min read · LW link

Deliberation, Reactions, and Control: Tentative Definitions and a Restatement of Instrumental Convergence

Oliver Sourbut, 27 Jun 2022 17:25 UTC
12 points
0 comments · 11 min read · LW link

Rational Effective Utopia & Narrow Way There: Math-Proven Safe Static Multiversal mAX-Intelligence (AXI), Multiversal Alignment, New Ethicophysics… (Aug 11)

ank, 11 Feb 2025 3:21 UTC
13 points
8 comments · 38 min read · LW link

Abstractions and translation

Amir Bolous, 20 May 2021 2:45 UTC
5 points
1 comment · 2 min read · LW link

Scarce Channels and Abstraction Coupling

johnswentworth, 28 Feb 2023 23:26 UTC
41 points
11 comments · 6 min read · LW link

Epistemic Motif of Abstract-Concrete Cycles & Domain Expansion

Dalcy, 10 Oct 2023 3:28 UTC
26 points
2 comments · 3 min read · LW link