Owain_Evans (Owain Evans)

Karma: 2,212

https://owainevans.github.io/

Model Mis-specification and Inverse Reinforcement Learning

Owain_Evans and jsteinhardt

9 Nov 2018 15:33 UTC

33 points

3 comments · 16 min read · LW link

Machine Learning Projects on IDA

Owain_Evans, William_S and stuhlmueller

24 Jun 2019 18:38 UTC

49 points

3 comments · 2 min read · LW link

Neural nets as a model for how humans make and understand visual art

Owain_Evans

9 Nov 2019 16:53 UTC

28 points

7 comments · 2 min read · LW link

(owainevans.github.io)

Update on Ought’s experiments on factored evaluation of arguments

Owain_Evans

12 Jan 2020 21:20 UTC

29 points

1 comment · 1 min read · LW link

(ought.org)

Quantifying Household Transmission of COVID-19

Owain_Evans

6 Jul 2020 11:19 UTC

35 points

4 comments · 4 min read · LW link

AI Safety Research Project Ideas

Owain_Evans

21 May 2021 13:39 UTC

58 points

2 comments · 3 min read · LW link

How truthful is GPT-3? A benchmark for language models

Owain_Evans

16 Sep 2021 10:09 UTC

58 points

24 comments · 6 min read · LW link

Truthful AI: Developing and governing AI that does not lie

Owain_Evans, owencb and Lukas Finnveden

18 Oct 2021 18:37 UTC

82 points

9 comments · 10 min read · LW link

AMA on Truthful AI: Owen Cotton-Barratt, Owain Evans & co-authors

Owain_Evans

22 Oct 2021 16:23 UTC

31 points

15 comments · 1 min read · LW link

The Rationalists of the 1950s (and before) also called themselves “Rationalists”

Owain_Evans

28 Nov 2021 20:17 UTC

187 points

32 comments · 3 min read · LW link · 1 review

Lives of the Cambridge polymath geniuses

Owain_Evans

25 Jan 2022 4:45 UTC

107 points

40 comments · 3 min read · LW link

How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA?

Owain_Evans

26 Feb 2022 12:46 UTC

44 points

3 comments · 11 min read · LW link

Paper: Teaching GPT3 to express uncertainty in words

Owain_Evans

31 May 2022 13:27 UTC

97 points

7 comments · 4 min read · LW link

Paper: Forecasting world events with neural nets

Owain_Evans, Dan H and Joe Kwon

1 Jul 2022 19:40 UTC

39 points

3 comments · 4 min read · LW link

Paper: On measuring situational awareness in LLMs

Owain_Evans, Daniel Kokotajlo, Mikita Balesni, Tomek Korbak, lberglund, Asa Cooper Stickland, Meg and Maximilian Kaufmann

4 Sep 2023 12:54 UTC

106 points

16 comments · 5 min read · LW link

(arxiv.org)

Paper: Tell, Don’t Show- Declarative facts influence how LLMs generalize

Owain_Evans and AlexMeinke

19 Dec 2023 19:14 UTC

45 points

4 comments · 6 min read · LW link

(arxiv.org)

How do LLMs give truthful answers? A discussion of LLM vs. human reasoning, ensembles & parrots

Owain_Evans

28 Mar 2024 2:34 UTC

26 points

0 comments · 9 min read · LW link