GPT4o is still sensitive to user-induced bias when writing code

Sep 22, 2024, 9:04 PM
6 points
0 comments · 4 min read · LW link

My 10-year retrospective on trying SSRIs

Kaj_Sotala · Sep 22, 2024, 8:30 PM
80 points
9 comments · 2 min read · LW link
(kajsotala.fi)

Making Eggs Without Ovaries

Sep 22, 2024, 5:44 PM
58 points
3 comments · 16 min read · LW link
(www.asimov.press)

Becket First

jefftk · Sep 22, 2024, 5:10 PM
9 points
0 comments · 2 min read · LW link
(www.jefftk.com)

On the Role of Proto-Languages

adamShimi · Sep 22, 2024, 4:50 PM
54 points
1 comment · 4 min read · LW link
(epistemologicalfascinations.substack.com)

I’m creating a deep dive podcast episode about the original Leverage Research—would you like to take part?

spencerg · Sep 22, 2024, 2:03 PM
37 points
2 comments · 1 min read · LW link

Who Feels More Alone?

marvinscheffold · Sep 22, 2024, 11:54 AM
−8 points
2 comments · 39 min read · LW link

Another argument against utility-centric alignment paradigms

Fiora Sunshine · Sep 22, 2024, 7:28 AM
67 points
39 comments · 8 min read · LW link

My hopes for YouCongress.com

Nathan Helm-Burger · Sep 22, 2024, 3:20 AM
14 points
3 comments · 4 min read · LW link

How Often Does Taking Away Options Help?

niplav · Sep 21, 2024, 9:52 PM
21 points
7 comments · 2 min read · LW link

A Rational Company—Seeking Advisors

AlignmentOptimizer · Sep 21, 2024, 7:51 PM
0 points
1 comment · 1 min read · LW link

Seeking mentorship

Kevin Afachao · Sep 21, 2024, 4:54 PM
5 points
0 comments · 1 min read · LW link

Applications of Chaos: Saying No (with Hastings Greer)

Elizabeth · Sep 21, 2024, 4:30 PM
50 points
16 comments · 2 min read · LW link
(acesounderglass.com)

Investigating an insurance-for-AI startup

Sep 21, 2024, 3:29 PM
70 points
0 comments · 16 min read · LW link
(www.strataoftheworld.com)

An Unmeasured Song of Measurement

jan Sijan · Sep 21, 2024, 3:08 PM
−3 points
0 comments · 4 min read · LW link

Should Sports Betting Be Banned?

Maxwell Tabarrok · Sep 21, 2024, 2:13 PM
18 points
2 comments · 4 min read · LW link
(www.maximum-progress.com)

Work with me on agent foundations: independent fellowship

Alex_Altair · Sep 21, 2024, 1:59 PM
59 points
5 comments · 4 min read · LW link

Electric Mandola

jefftk · Sep 21, 2024, 1:40 PM
9 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Glitch Token Catalog - (Almost) a Full Clear

Lao Mein · Sep 21, 2024, 12:22 PM
38 points
3 comments · 37 min read · LW link

The Other Existential Crisis

James Stephen Brown · Sep 21, 2024, 1:16 AM
9 points
24 comments · 2 min read · LW link

Apply to MATS 7.0!

Sep 21, 2024, 12:23 AM
32 points
0 comments · 5 min read · LW link

Moscow – ACX Meetups Everywhere Fall 2024

red-hara · Sep 20, 2024, 11:03 PM
−1 points
0 comments · 1 min read · LW link

Validating / finding alignment-relevant concepts using neural data

Bogdan Ionut Cirstea · Sep 20, 2024, 9:12 PM
7 points
0 comments · 1 min read · LW link
(docs.google.com)

Augmenting Statistical Models with Natural Language Parameters

jsteinhardt · Sep 20, 2024, 6:30 PM
34 points
0 comments · 8 min read · LW link
(bounded-regret.ghost.io)

Fun With The Tabula Muris (Senis)

sarahconstantin · Sep 20, 2024, 6:20 PM
25 points
0 comments · 8 min read · LW link
(sarahconstantin.substack.com)

My Critique of Effective Altruism

Dylan Price · Sep 20, 2024, 5:41 PM
−10 points
8 comments · 4 min read · LW link

[Question] Why be moral if we can’t measure how moral we are? Is it even possible to measure morality?

OKlogic · Sep 20, 2024, 5:40 PM
−2 points
0 comments · 3 min read · LW link

On Measuring Intellectual Performance—personal experience and several thoughts

Alexander Gufan · Sep 20, 2024, 5:21 PM
3 points
2 comments · 8 min read · LW link

Introduction to Super Powers (for kids!)

Shoshannah Tekofsky · Sep 20, 2024, 5:17 PM
25 points
0 comments · 3 min read · LW link
(kidquest.substack.com)

Collapsing “Collapsing the Belief/Knowledge Distinction”

Jeremias · Sep 20, 2024, 4:11 PM
3 points
0 comments · 4 min read · LW link

A New Class of Glitch Tokens—BPE Subtoken Artifacts (BSA)

Lao Mein · Sep 20, 2024, 1:13 PM
37 points
7 comments · 5 min read · LW link

o1-preview is pretty good at doing ML on an unknown dataset

Håvard Tveit Ihle · Sep 20, 2024, 8:39 AM
67 points
1 comment · 2 min read · LW link

Moral Trade, Impact Distributions and Large Worlds

Larks · Sep 20, 2024, 3:45 AM
7 points
0 comments · LW link

Keyboard Gremlins

jefftk · Sep 20, 2024, 2:30 AM
10 points
0 comments · 2 min read · LW link
(www.jefftk.com)

The case for more Alignment Target Analysis (ATA)

Sep 20, 2024, 1:14 AM
27 points
13 comments · 17 min read · LW link

Piling bounded arguments

momom2 · Sep 19, 2024, 10:27 PM
7 points
0 comments · 4 min read · LW link

We Don’t Know Our Own Values, but Reward Bridges The Is-Ought Gap

Sep 19, 2024, 10:22 PM
49 points
48 comments · 5 min read · LW link

Interested in Cognitive Bootcamp?

Raemon · Sep 19, 2024, 10:12 PM
48 points
0 comments · 2 min read · LW link

Just How Good Are Modern Chess Computers?

nem · Sep 19, 2024, 6:57 PM
10 points
1 comment · 6 min read · LW link

RLHF is the worst possible thing done when facing the alignment problem

tailcalled · Sep 19, 2024, 6:56 PM
32 points
10 comments · 6 min read · LW link

AISafety.info: What are Inductive Biases?

Algon · Sep 19, 2024, 5:26 PM
11 points
4 comments · 2 min read · LW link
(aisafety.info)

Physics of Language models (part 2.1)

Nathan Helm-Burger · Sep 19, 2024, 4:48 PM
9 points
2 comments · 1 min read · LW link
(youtu.be)

Why good things often don’t lead to better outcomes

DMMF · Sep 19, 2024, 4:37 PM
16 points
1 comment · 4 min read · LW link
(danfrank.ca)

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Bogdan Ionut Cirstea · Sep 19, 2024, 4:13 PM
21 points
1 comment · 1 min read · LW link
(arxiv.org)

Laziness death spirals

PatrickDFarley · Sep 19, 2024, 3:58 PM
276 points
40 comments · 8 min read · LW link

[Intuitive self-models] 1. Preliminaries

Steven Byrnes · Sep 19, 2024, 1:45 PM
91 points
23 comments · 15 min read · LW link

AI #82: The Governor Ponders

Zvi · Sep 19, 2024, 1:30 PM
50 points
8 comments · 27 min read · LW link
(thezvi.wordpress.com)

Slave Morality: A place for every man and every man in his place

Martin Sustrik · Sep 19, 2024, 4:20 AM
16 points
7 comments · 2 min read · LW link
(250bpm.substack.com)

Which LessWrong/Alignment topics would you like to be tutored in? [Poll]

Ruby · Sep 19, 2024, 1:35 AM
43 points
12 comments · 1 min read · LW link

The Obliqueness Thesis

jessicata · Sep 19, 2024, 12:26 AM
95 points
19 comments · 17 min read · LW link