Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers

hugofry · Apr 29, 2024, 8:57 PM
94 points

52 votes

Overall karma indicates overall quality.

9 comments · 11 min read · LW link

Towards a formalization of the agent structure problem

Alex_Altair · Apr 29, 2024, 8:28 PM
55 points

24 votes

6 comments · 14 min read · LW link

Ironing Out the Squiggles

Zack_M_Davis · Apr 29, 2024, 4:13 PM
159 points

70 votes

36 comments · 11 min read · LW link

Super additivity of consciousness

Arturo Macias · Apr 29, 2024, 3:41 PM
−2 points

8 votes

13 comments · 2 min read · LW link

AISC9 has ended and there will be an AISC10

Linda Linsefors · Apr 29, 2024, 10:53 AM
75 points

34 votes

4 comments · 2 min read · LW link

Open-Source AI: A Regulatory Review

Apr 29, 2024, 10:10 AM
18 points

10 votes

0 comments · 8 min read · LW link

Big-endian is better than little-endian

Menotim · Apr 29, 2024, 2:30 AM
32 points

29 votes

17 comments · 3 min read · LW link

The Prop-room and Stage Cognitive Architecture

Robert Kralisch · Apr 29, 2024, 12:48 AM
14 points

5 votes

4 comments · 14 min read · LW link

How are Simulators and Agents related?

Robert Kralisch · Apr 29, 2024, 12:22 AM
6 points

3 votes

0 comments · 7 min read · LW link

Extended Embodiment

Robert Kralisch · Apr 29, 2024, 12:18 AM
8 points

6 votes

1 comment · 3 min read · LW link

Referential Containment

Robert Kralisch · Apr 29, 2024, 12:16 AM
2 points

1 vote

4 comments · 3 min read · LW link

Disentangling Competence and Intelligence

Robert Kralisch · Apr 29, 2024, 12:12 AM
23 points

8 votes

7 comments · 6 min read · LW link

List your AI X-Risk cruxes!

Aryeh Englander · Apr 28, 2024, 6:26 PM
42 points

18 votes

7 comments · 2 min read · LW link

Things I tell myself to be more agentic

DMMF · Apr 28, 2024, 5:44 PM
10 points

8 votes

0 comments · 3 min read · LW link
(danfrank.ca)

Estimating the Number of Players from Game Result Percentages

Daniel L · Apr 28, 2024, 5:42 PM
1 point

1 vote

2 comments · 1 min read · LW link

The Science Algorithm—AISC 2024 Final Presentation

Johannes C. Mayer · Apr 28, 2024, 2:55 PM
4 points

8 votes

0 comments · 1 min read · LW link
(www.youtube.com)

[Aspiration-based designs] Outlook: dealing with complexity

Apr 28, 2024, 1:06 PM
13 points

9 votes

3 comments · 2 min read · LW link

[Aspiration-based designs] 3. Performance and safety criteria, and aspiration intervals

Jobst Heitzig · Apr 28, 2024, 1:04 PM
10 points

10 votes

0 comments · 12 min read · LW link

[Aspiration-based designs] 2. Formal framework, basic algorithm

Apr 28, 2024, 1:02 PM
18 points

14 votes

2 comments · 16 min read · LW link

[Aspiration-based designs] 1. Informal introduction

Apr 28, 2024, 1:00 PM
44 points

20 votes

4 comments · 8 min read · LW link

Playing Northboro with Lily and Rick

jefftk · Apr 28, 2024, 2:40 AM
10 points

7 votes

1 comment · 2 min read · LW link
(www.jefftk.com)

Release of UN’s draft related to the governance of AI (a summary of the Simon Institute’s response)

Sebastian Schmidt · Apr 27, 2024, 6:34 PM
7 points

2 votes

0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Mercy to the Machine: Thoughts & Rights

False Name · Apr 27, 2024, 4:36 PM
7 points

8 votes

5 comments · 17 min read · LW link

Constructability: Plainly-coded AGIs may be feasible in the near future

Apr 27, 2024, 4:04 PM
91 points

54 votes

15 comments · 13 min read · LW link

So What’s Up With PUFAs Chemically?

J Bostock · Apr 27, 2024, 1:32 PM
57 points

24 votes

25 comments · 6 min read · LW link

Link: Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models by Jacob Pfau, William Merrill & Samuel R. Bowman

Chris_Leong · Apr 27, 2024, 1:22 PM
12 points

10 votes

0 comments · 1 min read · LW link
(twitter.com)

Two Vernor Vinge Book Reviews

Maxwell Tabarrok · Apr 27, 2024, 12:14 PM
17 points

10 votes

0 comments · 2 min read · LW link
(www.maximum-progress.com)

Refusal in LLMs is mediated by a single direction

Apr 27, 2024, 11:13 AM
252 points

154 votes

95 comments · 10 min read · LW link

[Question] Plausibility of Getting Early Warning Shots because AIs can’t coordinate?

hmys · Apr 27, 2024, 8:02 AM
12 points

4 votes

0 comments · 1 min read · LW link

AI Safety Sphere

Myles H · Apr 27, 2024, 1:49 AM
6 points

10 votes

2 comments · 2 min read · LW link

Exploring the Esoteric Pathways to AI Sentience (Part One)

jeffreycaruso · Apr 27, 2024, 1:02 AM
−11 points

5 votes

6 comments · 2 min read · LW link

Superposition is not “just” neuron polysemanticity

LawrenceC · Apr 26, 2024, 11:22 PM
69 points

34 votes

4 comments · 13 min read · LW link

D&D.Sci Long War: Defender of Data-mocracy

aphyer · Apr 26, 2024, 10:30 PM
44 points

19 votes

20 comments · 4 min read · LW link

On Not Pulling The Ladder Up Behind You

Screwtape · Apr 26, 2024, 9:58 PM
190 points

107 votes

21 comments · 9 min read · LW link

We are headed into an extreme compute overhang

devrandom · Apr 26, 2024, 9:38 PM
54 points

36 votes

34 comments · 2 min read · LW link

[Concept Dependency] Edge Regular Lattice Graph

Johannes C. Mayer · Apr 26, 2024, 9:14 PM
9 points

2 votes

1 comment · 1 min read · LW link

[Concept Dependency] Concept Dependency Posts

Johannes C. Mayer · Apr 26, 2024, 8:57 PM
11 points

4 votes

3 comments · 2 min read · LW link

[Question] Wouldn’t weak AI agents provide warning?

Looked At To Win · Apr 26, 2024, 7:34 PM
5 points

3 votes

0 comments · 1 min read · LW link

World models

A* · Apr 26, 2024, 7:11 PM
1 point

1 vote

0 comments · 1 min read · LW link

Duct Tape security

Isaac King · Apr 26, 2024, 6:57 PM
72 points

36 votes

11 comments · 5 min read · LW link

Fundamental Uncertainty: Chapter 8 - When does fundamental uncertainty matter?

Gordon Seidoh Worley · Apr 26, 2024, 6:10 PM
11 points

3 votes

4 comments · 32 min read · LW link

Scaling of AI training runs will slow down after GPT-5

Maxime Riché · Apr 26, 2024, 4:05 PM
42 points

20 votes

5 comments · 3 min read · LW link

Spatial attention as a “tell” for empathetic simulation?

Steven Byrnes · Apr 26, 2024, 3:10 PM
55 points

17 votes

12 comments · 8 min read · LW link

Arch-anarchy

Peter lawless · Apr 26, 2024, 3:05 PM
−1 points

2 votes

1 comment · 25 min read · LW link

Breadboarding a Whistle Synth

jefftk · Apr 26, 2024, 3:00 PM
9 points

1 vote

2 comments · 2 min read · LW link
(www.jefftk.com)

An Introduction to AI Sandbagging

Apr 26, 2024, 1:40 PM
50 points

24 votes

13 comments · 8 min read · LW link

LLMs seem (relatively) safe

JustisMills · Apr 25, 2024, 10:13 PM
53 points

29 votes

24 comments · 7 min read · LW link
(justismills.substack.com)

Losing Faith In Contrarianism

Bentham's Bulldog · Apr 25, 2024, 8:53 PM
47 points

56 votes

44 comments · 5 min read · LW link

Why I stopped being into basin broadness

tailcalled · Apr 25, 2024, 8:47 PM
16 points

7 votes

3 comments · 2 min read · LW link

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma

DanielFilan · Apr 25, 2024, 7:10 PM
20 points

9 votes

1 comment · 63 min read · LW link