Harrison G

Karma: 42

Interested in AI alignment, thinking about ethics, tap dancing, playing instruments, and wearing sandals year-round.

Apply to the Cambridge ERA:AI Fellowship 2025

Harrison G 25 Mar 2025 13:50 UTC

16 points

0 comments 3 min read LW link

Thinking About Propensity Evaluations

Maxime Riché, Harrison G, JaimeRV and Edoardo Pona

19 Aug 2024 9:23 UTC

10 points

0 comments 27 min read LW link

A Taxonomy Of AI System Evaluations

Maxime Riché, JaimeRV, Harrison G and Edoardo Pona

19 Aug 2024 9:07 UTC

13 points

0 comments 14 min read LW link

Harrison G 5 Jul 2023 22:01 UTC
3 points
0
in reply to: Noosphere89’s comment on: [Linkpost] Introducing Superalignment
The quote: “Finally, we can test our entire pipeline by deliberately training misaligned models, and confirming that our techniques detect the worst kinds of misalignments (adversarial testing).”

Harrison G 3 Jan 2023 2:33 UTC
3 points
0
on: More ways to spot abysses
Super helpful; thanks for writing!

Harrison G 3 Jan 2023 2:15 UTC
3 points
0
on: On sincerity
(read: The Athena-Parfit Long-Term Institute for Raising for Effectively Prioritizing Global Alignment Challenges)
I laughed about this for a while. Thank you for this thought-provoking post, and for incorporating occasional humor throughout.

Harrison G 3 Jan 2023 1:03 UTC
1 point
0
on: Things I carry almost every day, as of late December 2022
At the top right is a pocket constitution made by Legal Impact for Chickens. I received this at an Effective Altruism Global conference, during the career fair. What actually happened was that someone came up to the booth I was at holding the pocket constitution, I noted that it looked cool, and they were kind enough to offer it to me. Unfortunately, I have never knowingly met anybody from Legal Impact for Chickens. I have not actually used this pocket constitution, but I carry it anyway in my winter jacket’s inner breast pocket since (a) it fits very unobtrusively and (b) it seems cool to carry around a pocket constitution.
If this was EAG SF, I remember an experience that sounds very similar to this, and I think I was this person! Ha

Harrison G 3 Aug 2022 4:04 UTC
0 points
0
on: A Proof Against Oracle AI
“[...] since every string can be reconstructed by only answering yes or no to questions like ‘is the first bit 1?’ [...]”
Why would humans ever ask this question, and (furthermore) why would we ever ask it n times? It seems unlikely, and easy to prevent. Is there something I’m not understanding about this step?

Distilled—AGI Safety from First Principles

Harrison G 29 May 2022 0:57 UTC

11 points

1 comment 14 min read LW link