Lê Nguyên Hoang

Karma: 293

Dr. in math. AI Alignment and Safety researcher. Bayesian. Science YouTuber, podcaster, writer. Author of the books “The Equation of Knowledge”, “Le Fabuleux Chantier” and “Turing à la plage”.

SmartPoop 1.0: An AI Safety Science-Fiction

Lê Nguyên Hoang 15 Dec 2021 22:28 UTC

7 points

1 comment · 1 min read · LW link

Pathways: Google’s AGI

Lê Nguyên Hoang 25 Sep 2021 7:02 UTC

47 points

5 comments · 1 min read · LW link

Lê Nguyên Hoang 13 Feb 2021 16:58 UTC
5 points
in reply to: Filipe Marchesini’s comment on: Tournesol, YouTube and AI Risk
Thanks for the interesting comment. To clarify, our current algorithms are by no means a final solution. In fact, our hope is to collect an interesting database and then encourage research on better algorithms that will factor in, e.g., the comments on the videos.
Also, in the “settings” of the rating page, we have a functionality that allows contributors to input both their judgments and their confidence in their judgments, on a scale from 0 to 3 stars (default is 2). One idea could be to require a comment whenever a contributor claims 3-star confidence in a judgment. This would allow disputes to play out in the comment section.

Lê Nguyên Hoang 9 Jul 2020 6:55 UTC
3 points
in reply to: Apodosis’s comment on: The Equation of Knowledge
Hi Apodosis, I did my PhD in Bayesian game theory, so this is a topic close to my heart ^^ There are plenty of fascinating things to explore in the study of interactions between Bayesians. One important finding of my PhD was that, essentially, Bayesians end up playing (stable) Bayes-Nash equilibria in repeated games, even if the only feedback they receive is their utility (and in particular even if the private information of other players remains private). I also studied Bayesian incentive-compatible mechanism design, i.e. coming up with rules that incentivize Bayesians’ honesty. The book also discusses interesting features of interactions between Bayesians, such as Aumann-Aaronson’s agreement theorem or Bayesian persuasion (i.e. maximizing a Bayesian judge’s probability of convicting a defendant by optimizing which investigations should be pursued).
One research direction I’m interested in is Byzantine Bayesian agreement, i.e. how much a group of honest Bayesians can agree if they are infiltrated by a small number of malicious individuals, though I have not yet found the time to dig into this topic further.
A more empirical challenge is to determine how well these Bayesian game theory models fit the description of human (or AI) interactions. Clearly, we humans are not Bayesians. We have some systematic cognitive biases (and even powerful AIs may also have systematic biases, since they won’t be running Bayes’ rule exactly!). How can we best model and predict humans’ divergence from Bayes’ rule? There have been many spectacular advances in the cognitive sciences in this regard (check out Josh Tenenbaum’s work for instance), but there’s definitely a lot more to do!

The Equation of Knowledge

Lê Nguyên Hoang 7 Jul 2020 16:09 UTC

58 points

3 comments · 7 min read · LW link

Lê Nguyên Hoang 7 Feb 2020 7:35 UTC
7 points
in reply to: Bird Concept’s comment on: Bayes-Up: An App for Sharing Bayesian-MCQ
I promoted Bayes-up on my YouTube channel a couple of times 😋 (and on Twitter)
https://www.youtube.com/channel/UC0NCbj8CxzeCGIF6sODJ-7A/

Lê Nguyên Hoang 7 Feb 2020 7:33 UTC
7 points
on: Plausibly, almost every powerful algorithm would be manipulative
The YouTube algorithm is arguably an example of a “simple” manipulative algorithm. It’s probably a combination of some reinforcement learning and a lot of supervised learning by now; but the following arguments apply even for supervised learning alone.
To maximize user engagement, it may recommend more addictive content (cat videos, conspiracy, …) because it learned from previous examples that users who clicked on such content tended to stay longer on YouTube afterwards. This is user manipulation at massive scale.
Is this an existential risk? Well, some of this addictive content is radicalizing and angering users. This arguably increases the risk of international tensions, which increases the risk of nuclear war. This may not be the most dramatic increase in existential risk; but it’s one that already seems to be going on today!
More generally, I believe that a lot can be learned about complex algorithms, including AGI, by thinking much more carefully about the behavior and impact of the YouTube algorithm. In a sense, the YouTube algorithm is doing so many different tasks that it can be argued to be already quite “general” (audio, visual, text, preference learning, captioning, translating, recommending, planning...).
More on this algorithm here: https://robustlybeneficial.org/wiki/index.php?title=YouTube

Lê Nguyên Hoang 16 Jan 2020 20:27 UTC
6 points
in reply to: Mark_Friedenbach’s comment on: A rant against robots
This is probably more contentious. But I believe that the concept of “intelligence” is unhelpful and causes confusion. For instance, Legg-Hutter intelligence does not seem to require any “embodied intelligence”.
I would rather stress two key properties of an algorithm: the quality of the algorithm’s world model and its (long-term) planning capabilities. It seems to me (but maybe I’m wrong) that “embodied intelligence” is not very relevant to world model inference and planning capabilities.