RSS

For­mal Proof

TagLast edit: 26 Sep 2021 22:04 UTC by Pablo

A Formal Proof is a finite sequence of steps from axiom(s) or previous derived proof(s) which strictly follow the allowed rules of inference of the mathematical system in which it exists. They are used to establish statements as true within a mathematical framework in a way which can be independently verified with extremely high certainty, with the most reliable flavor of proof being machine-checked proofs generated by proof assistants since they have even less room for human error.

Proofs, Im­pli­ca­tions, and Models

Eliezer Yudkowsky30 Oct 2012 13:02 UTC
135 points
218 comments12 min readLW link

Com­pact Proofs of Model Perfor­mance via Mechanis­tic Interpretability

24 Jun 2024 19:27 UTC
104 points
4 comments8 min readLW link
(arxiv.org)

Please Mea­sure Ver­ifi­ca­tion Burden

Quinn23 Nov 2025 17:25 UTC
17 points
4 comments4 min readLW link

AXRP Epi­sode 40 - Ja­son Gross on Com­pact Proofs and Interpretability

DanielFilan28 Mar 2025 18:40 UTC
26 points
0 comments89 min readLW link

Lies, Damned Lies, and Proofs: For­mal Meth­ods are not Slopless

12 Jan 2026 22:32 UTC
101 points
9 comments7 min readLW link

HERMES: Towards Effi­cient and Ver­ifi­able Math­e­mat­i­cal Rea­son­ing in LLMs

Gunnar_Zarncke1 Dec 2025 10:07 UTC
8 points
0 comments1 min readLW link
(arxiv.org)

A List of things I might do with a Proof Oracle

Logan Zoellner5 Feb 2023 18:14 UTC
−14 points
13 comments3 min readLW link

Most Minds are Irrational

Davidmanheim10 Dec 2024 9:36 UTC
17 points
4 comments10 min readLW link

[Question] What Pro­gram­ming Lan­guage Char­ac­ter­is­tics Would Allow Prov­ably Safe AI?

Davidmanheim28 Aug 2019 10:46 UTC
4 points
9 comments1 min readLW link

Eleuther re­leases Llemma: An Open Lan­guage Model For Mathematics

mako yass17 Oct 2023 20:03 UTC
22 points
0 comments1 min readLW link
(blog.eleuther.ai)

Squeez­ing foun­da­tions re­search as­sis­tance out of for­mal logic nar­row AI.

Donald Hobson8 Mar 2023 9:38 UTC
16 points
1 comment2 min readLW link

The Fun­da­men­tal Cir­cu­lar­ity The­o­rem: Why Some Math­e­mat­i­cal Be­havi­ours Are In­her­ently Unprovable

Alister Munday22 Jan 2025 18:20 UTC
−11 points
2 comments4 min readLW link

[Question] I Tried to For­mal­ize Mean­ing. I May Have Ac­ci­den­tally De­scribed Con­scious­ness.

Erichcurtis9130 Apr 2025 3:16 UTC
0 points
0 comments2 min readLW link

Progress on au­to­mated math­e­mat­i­cal the­o­rem prov­ing?

JonahS3 Jul 2013 18:40 UTC
26 points
65 comments1 min readLW link

The Ad­e­quacy of Class Separation

milanrosko6 Dec 2025 6:10 UTC
4 points
0 comments5 min readLW link

The Perfec­tion Trap: How For­mally Aligned AI Sys­tems May Create Inescapable Eth­i­cal Dystopias

Chris O'Quinn1 Jun 2025 23:12 UTC
1 point
0 comments43 min readLW link

Fun­da­men­tals of For­mal­i­sa­tion Level 5: For­mal Proof

philip_b9 Jul 2018 20:55 UTC
13 points
0 comments1 min readLW link

Roadmap for a col­lab­o­ra­tive pro­to­type of an Open Agency Architecture

Deger Turan10 May 2023 17:41 UTC
31 points
0 comments12 min readLW link

The Man­sion of State­ments: A Re­fu­ta­tion of Gödel’s First In­com­plete­ness Theorem

Ángel Huerta Álvarez4 Sep 2025 23:09 UTC
−3 points
0 comments8 min readLW link

Proofs Sec­tion 1.2 (Mix­tures, Up­dates, Push­for­wards)

Diffractor27 Aug 2020 7:57 UTC
8 points
0 comments14 min readLW link

[Question] Search­ing for Im­pos­si­bil­ity Re­sults or No-Go The­o­rems for prov­able safety.

Maelstrom27 Sep 2024 20:12 UTC
2 points
1 comment1 min readLW link

Log­i­cal in­duc­tor limits are dense un­der poin­t­wise convergence

SamEisenstat6 Oct 2016 8:07 UTC
5 points
0 comments6 min readLW link

Towards Guaran­teed Safe AI: A Frame­work for En­sur­ing Ro­bust and Reli­able AI Systems

Joar Skalse17 May 2024 19:13 UTC
67 points
10 comments2 min readLW link

The value of learn­ing math­e­mat­i­cal proof

JonahS2 Jun 2015 3:15 UTC
8 points
42 comments1 min readLW link

Mea­sur­ing Non­lin­ear Fea­ture In­ter­ac­tions in Sparse Cross­coders [Pro­ject Pro­posal]

6 Jan 2025 4:22 UTC
19 points
0 comments12 min readLW link

A list of core AI safety prob­lems and how I hope to solve them

davidad26 Aug 2023 15:12 UTC
165 points
29 comments5 min readLW link

LBIT Proofs 5: Propo­si­tions 29-38

Diffractor16 Dec 2020 3:35 UTC
8 points
0 comments21 min readLW link

LBIT Proofs 2: Propo­si­tions 10-18

Diffractor16 Dec 2020 3:45 UTC
7 points
0 comments20 min readLW link

An Opinionated Look at In­fer­ence Rules

Gianluca Calcagni3 Sep 2024 13:32 UTC
−5 points
2 comments13 min readLW link

Weak HCH ac­cesses EXP

evhub22 Jul 2020 22:36 UTC
16 points
0 comments3 min readLW link

Video In­tro to Guaran­teed Safe AI

11 Jul 2024 17:53 UTC
27 points
0 comments1 min readLW link
(youtu.be)

LBIT Proofs 3: Propo­si­tions 19-22

Diffractor16 Dec 2020 3:40 UTC
8 points
0 comments17 min readLW link

The Un­der­val­ued Kleene Hierarchy

milanrosko18 Dec 2025 11:57 UTC
10 points
2 comments6 min readLW link

For­mal­ized math: dream vs reality

cousin_it9 Jul 2009 20:51 UTC
19 points
10 comments2 min readLW link

Proofs Sec­tion 2.2 (Iso­mor­phism to Ex­pec­ta­tions)

Diffractor27 Aug 2020 7:52 UTC
8 points
0 comments46 min readLW link

Proofs Sec­tion 2.3 (Up­dates, De­ci­sion The­ory)

Diffractor27 Aug 2020 7:49 UTC
8 points
0 comments31 min readLW link

Proofs Sec­tion 1.1 (Ini­tial re­sults to LF-du­al­ity)

Diffractor27 Aug 2020 7:59 UTC
8 points
0 comments26 min readLW link

Davi­dad’s Bold Plan for Align­ment: An In-Depth Explanation

19 Apr 2023 16:09 UTC
167 points
40 comments21 min readLW link2 reviews

Speedrun­ning 4 mis­takes you make when your al­ign­ment strat­egy is based on for­mal proof

Quinn16 Feb 2023 1:13 UTC
63 points
18 comments2 min readLW link

An Illus­trated Proof of the No Free Lunch Theorem

lifelonglearner8 Jun 2020 1:54 UTC
20 points
0 comments1 min readLW link
(mlu.red)

LBIT Proofs 7: Propo­si­tions 48-52

Diffractor16 Dec 2020 3:31 UTC
7 points
0 comments20 min readLW link

I bet $500 on AI win­ning the IMO gold medal by 2026

azsantosk11 May 2023 14:46 UTC
37 points
31 comments1 min readLW link

[Math] Towards Proof Writ­ing as a Skill In Itself

Andrew Quinn13 Jun 2018 4:39 UTC
25 points
8 comments2 min readLW link

LBIT Proofs 8: Propo­si­tions 53-58

Diffractor16 Dec 2020 3:29 UTC
7 points
0 comments18 min readLW link

So­cial Choice The­ory and Log­i­cal Handshakes

StrivingForLegibility29 Dec 2023 3:49 UTC
17 points
0 comments4 min readLW link

In­ter­view Daniel Mur­fet on Univer­sal Phenom­ena in Learn­ing Machines

Alexander Gietelink Oldenziel6 Feb 2023 0:00 UTC
58 points
1 comment16 min readLW link

Coun­ter­fac­tual In­duc­tion (Al­gorithm Sketch, Fix­point proof)

Diffractor17 Dec 2019 5:04 UTC
5 points
2 comments7 min readLW link

LBIT Proofs 6: Propo­si­tions 39-47

Diffractor16 Dec 2020 3:33 UTC
7 points
0 comments23 min readLW link

Proofs Sec­tion 2.1 (The­o­rem 1, Lem­mas)

Diffractor27 Aug 2020 7:54 UTC
8 points
0 comments36 min readLW link

In­fra-Do­main proofs 1

Diffractor28 Mar 2021 9:16 UTC
13 points
0 comments23 min readLW link

Plan­ning to build a cryp­to­graphic box with perfect secrecy

Lysandre Terrisse31 Dec 2023 9:31 UTC
40 points
6 comments11 min readLW link

A proof of Löb’s the­o­rem in Haskell

cousin_it19 Sep 2014 13:01 UTC
54 points
8 comments3 min readLW link

LBIT Proofs 4: Propo­si­tions 22-28

Diffractor16 Dec 2020 3:38 UTC
7 points
0 comments17 min readLW link

The LVV–HNV Co­her­ence Frame­work: A For­mal Model for Why Ra­tional AGI Can­not Re­place Humanity

oiia oiia2 Dec 2025 17:56 UTC
0 points
0 comments3 min readLW link

Ques­tion/​Is­sue with the 5/​10 Problem

acgt29 Nov 2021 10:45 UTC
6 points
3 comments3 min readLW link

Allow­ing a for­mal proof sys­tem to self im­prove while avoid­ing Lo­bian ob­sta­cles.

Donald Hobson23 Jan 2019 23:04 UTC
6 points
4 comments2 min readLW link

An ex­am­ple of self-fulfilling spu­ri­ous proofs in UDT

cousin_it25 Mar 2012 11:47 UTC
33 points
43 comments2 min readLW link

Align­ment pro­pos­als and com­plex­ity classes

evhub16 Jul 2020 0:27 UTC
40 points
26 comments13 min readLW link

LBIT Proofs 1: Propo­si­tions 1-9

Diffractor16 Dec 2020 3:48 UTC
7 points
0 comments25 min readLW link

In­fra-Do­main Proofs 2

Diffractor28 Mar 2021 9:15 UTC
14 points
0 comments21 min readLW link

Limi­ta­tions on For­mal Ver­ifi­ca­tion for AI Safety

Andrew Dickson19 Aug 2024 23:03 UTC
135 points
60 comments23 min readLW link
No comments.