Stuart_Armstrong (Stuart Armstrong)

Karma: 16,244

Different way classifiers can be diverse

Stuart_Armstrong, 17 Jan 2022 16:30 UTC
9 points
3 comments, 2 min read

Value extrapolation partially resolves symbol grounding

Stuart_Armstrong, 12 Jan 2022 16:30 UTC
19 points
10 comments, 1 min read

How an alien theory of mind might be unlearnable

Stuart_Armstrong, 3 Jan 2022 11:16 UTC
25 points
35 comments, 5 min read

Finding the multiple ground truths of CoinRun and image classification

Stuart_Armstrong, 8 Dec 2021 18:13 UTC
15 points
3 comments, 2 min read

Declustering, reclustering, and filling in thingspace

Stuart_Armstrong, 6 Dec 2021 20:53 UTC
15 points
6 comments, 3 min read

Are there alternative to solving value transfer and extrapolation?

Stuart_Armstrong, 6 Dec 2021 18:53 UTC
19 points
7 comments, 5 min read

$100/$50 rewards for good references

Stuart_Armstrong, 3 Dec 2021 16:55 UTC
19 points
3 comments, 1 min read

Morally underdefined situations can be deadly

Stuart_Armstrong, 22 Nov 2021 14:48 UTC
17 points
8 comments, 2 min read

General alignment plus human values, or alignment via human values?

Stuart_Armstrong, 22 Oct 2021 10:11 UTC
26 points
25 comments, 3 min read

Beyond the human training distribution: would the AI CEO create almost-illegal teddies?

Stuart_Armstrong, 18 Oct 2021 21:10 UTC
35 points
2 comments, 3 min read

Classical symbol grounding and causal graphs

Stuart_Armstrong, 14 Oct 2021 18:04 UTC
22 points
2 comments, 5 min read

Preferences from (real and hypothetical) psychology papers

Stuart_Armstrong, 6 Oct 2021 9:06 UTC
14 points
0 comments, 2 min read

Force neural nets to use models, then detect these

Stuart_Armstrong, 5 Oct 2021 11:31 UTC
17 points
8 comments, 2 min read

AI learns betrayal and how to avoid it

Stuart_Armstrong, 30 Sep 2021 9:39 UTC
30 points
4 comments, 2 min read

AI, learn to be conservative, then learn to be less so: reducing side-effects, learning preserved features, and going beyond conservatism

Stuart_Armstrong, 20 Sep 2021 11:56 UTC
14 points
4 comments, 3 min read

Sigmoids behaving badly: arXiv paper

Stuart_Armstrong, 20 Sep 2021 10:29 UTC
24 points
1 comment, 1 min read

Immobile AI makes a move: anti-wireheading, ontology change, and model splintering

Stuart_Armstrong, 17 Sep 2021 15:24 UTC
32 points
3 comments, 2 min read

Reward splintering as reverse of interpretability

Stuart_Armstrong, 31 Aug 2021 22:27 UTC
10 points
0 comments, 1 min read

What are biases, anyway? Multiple type signatures

Stuart_Armstrong, 31 Aug 2021 21:16 UTC
11 points
0 comments, 3 min read

What does GPT-3 understand? Symbol grounding and Chinese rooms

Stuart_Armstrong, 3 Aug 2021 13:14 UTC
35 points
15 comments, 12 min read