Refine

Tag. Last edit: 1 Sep 2022 14:27 UTC by NeuralBets

Refine is a conceptual research incubator hosted by Conjecture.

confusion about alignment requirements

Tamsin Leake, 6 Oct 2022 10:32 UTC

39 points

10 comments, 3 min read

(carado.moe)

Refine’s Second Blog Post Day

adamShimi, 20 Aug 2022 13:01 UTC

19 points

0 comments, 1 min read

Benchmarking Proposals on Risk Scenarios

Paul Bricman, 20 Aug 2022 10:01 UTC

25 points

2 comments, 14 min read

What if we approach AI safety like a technical engineering safety problem

zeshen, 20 Aug 2022 10:29 UTC

33 points

4 comments, 7 min read

What I Learned Running Refine

adamShimi, 24 Nov 2022 14:49 UTC

107 points

5 comments, 4 min read

PreDCA: vanessa kosoy’s alignment protocol

Tamsin Leake, 20 Aug 2022 10:03 UTC

50 points

8 comments, 7 min read

(carado.moe)

the Insulated Goal-Program idea

Tamsin Leake, 13 Aug 2022 9:57 UTC

43 points

4 comments, 2 min read

(carado.moe)

goal-program bricks

Tamsin Leake, 13 Aug 2022 10:08 UTC

31 points

2 comments, 2 min read

(carado.moe)

ordering capability thresholds

Tamsin Leake, 16 Sep 2022 16:36 UTC

27 points

0 comments, 4 min read

(carado.moe)

Refine Blogpost Day #3: The shortforms I did write

Alexander Gietelink Oldenziel, 16 Sep 2022 21:03 UTC

23 points

0 comments, 1 min read

Refine’s Third Blog Post Day/Week

adamShimi, 17 Sep 2022 17:03 UTC

18 points

0 comments, 1 min read

Representational Tethers: Tying AI Latents To Human Ones

Paul Bricman, 16 Sep 2022 14:45 UTC

30 points

0 comments, 16 min read

Epistemic Artefacts of (conceptual) AI alignment research

Nora_Ammann and particlemania, 19 Aug 2022 17:18 UTC

30 points

1 comment, 5 min read

Oversight Leagues: The Training Game as a Feature

Paul Bricman, 9 Sep 2022 10:08 UTC

20 points

6 comments, 10 min read

Ideological Inference Engines: Making Deontology Differentiable*

Paul Bricman, 12 Sep 2022 12:00 UTC

6 points

0 comments, 14 min read

Levels of goals and alignment

zeshen, 16 Sep 2022 16:44 UTC

27 points

4 comments, 6 min read

Cataloguing Priors in Theory and Practice

Paul Bricman, 13 Oct 2022 12:36 UTC

13 points

8 comments, 7 min read

Refine: what helped me write more?

Alexander Gietelink Oldenziel, 25 Oct 2022 14:44 UTC

12 points

0 comments, 2 min read

Embedding safety in ML development

zeshen, 31 Oct 2022 12:27 UTC

24 points

1 comment, 18 min read

A newcomer’s guide to the technical AI safety field

zeshen, 4 Nov 2022 14:29 UTC

42 points

3 comments, 10 min read

Interlude: But Who Optimizes The Optimizer?

Paul Bricman, 23 Sep 2022 15:30 UTC

15 points

0 comments, 10 min read

Summary of ML Safety Course

zeshen, 27 Sep 2022 13:05 UTC

7 points

0 comments, 6 min read

My Thoughts on the ML Safety Course

zeshen, 27 Sep 2022 13:15 UTC

50 points

3 comments, 17 min read

(Structural) Stability of Coupled Optimizers

Paul Bricman, 30 Sep 2022 11:28 UTC

25 points

0 comments, 10 min read

Refine’s First Blog Post Day

adamShimi, 13 Aug 2022 10:23 UTC

55 points

3 comments, 1 min read

Boolean Primitives for Coupled Optimizers

Paul Bricman, 7 Oct 2022 18:02 UTC

9 points

0 comments, 8 min read

my current outlook on AI risk mitigation

Tamsin Leake, 3 Oct 2022 20:06 UTC

63 points

6 comments, 11 min read

(carado.moe)

How I think about alignment

Linda Linsefors, 13 Aug 2022 10:01 UTC

31 points

11 comments, 5 min read

Steelmining via Analogy

Paul Bricman, 13 Aug 2022 9:59 UTC

24 points

0 comments, 2 min read

(paulbricman.com)

I missed the crux of the alignment problem the whole time

zeshen, 13 Aug 2022 10:11 UTC

53 points

7 comments, 3 min read

All the posts I will never write

Alexander Gietelink Oldenziel, 14 Aug 2022 18:29 UTC

53 points

8 comments, 8 min read

Refine: An Incubator for Conceptual Alignment Research Bets

adamShimi, 15 Apr 2022 8:57 UTC

144 points

13 comments, 4 min read

How to Diversify Conceptual Alignment: the Model Behind Refine

adamShimi, 20 Jul 2022 10:44 UTC

87 points

11 comments, 8 min read
