Any sort of probabilistic model offers the usual interpretations of the probabilities as an interface. For instance, I can train an LDA topic model, look at the words in the learned topics, pick a topic I’m interested in, then look at that topic’s weighting in each document in order to find relevant documents. More generally, I can train any clustering model, pick a cluster I’m interested in, then look for more things in that cluster. Or if I train a causal model, I can often interpret the learned parameters as estimates of physical interactions in the world. In each case, I’m effectively using the interpretation of the model’s built-in probabilities as an interface.
This is arguably the main advantage of probabilistic models over non-probabilistic models: they come with a fairly reliable, well-understood built-in interface.
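A minimal sketch of the LDA workflow described above, using gensim as one common implementation (the toy corpus, number of topics, and the chosen topic index are all made up for illustration): train the model, inspect each topic's words, pick a topic, then rank documents by that topic's weight.

```python
from gensim import corpora
from gensim.models import LdaModel

# Toy tokenized corpus, purely illustrative.
docs = [
    ["cell", "protein", "gene", "expression"],
    ["market", "stock", "price", "trading"],
    ["gene", "mutation", "protein", "sequence"],
    ["price", "inflation", "market", "economy"],
]

dictionary = corpora.Dictionary(docs)
bow_corpus = [dictionary.doc2bow(doc) for doc in docs]

lda = LdaModel(bow_corpus, num_topics=2, id2word=dictionary, random_state=0)

# Step 1: look at the words in each learned topic and pick one of interest.
for topic_id, words in lda.show_topics(num_topics=2, num_words=4, formatted=False):
    print(topic_id, [w for w, _ in words])

# Step 2: use that topic's probability in each document as the "interface" --
# documents where the chosen topic has high weight are the relevant ones.
topic_of_interest = 0  # assumed to be picked by inspecting the word lists above
weights = [
    (i, dict(lda.get_document_topics(bow)).get(topic_of_interest, 0.0))
    for i, bow in enumerate(bow_corpus)
]
for doc_id, weight in sorted(weights, key=lambda x: -x[1]):
    print(f"doc {doc_id}: P(topic {topic_of_interest}) = {weight:.2f}")
```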
You might be interested in some of Chris Olah’s work on interpretability. For example, this.
EDIT: Or even just the example of sampling from the latent space of a variational autoencoder should count, I would think.
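To make that concrete, here is a minimal sketch of what sampling from a VAE's latent space looks like, in PyTorch: draw z from the standard-normal prior and pass it through the decoder. The decoder here is an untrained stand-in with assumed dimensions; in practice it would be the decoder of a trained VAE.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 8, 784  # assumed sizes, e.g. MNIST-shaped outputs

# Stand-in decoder; a real use case would load trained weights instead.
decoder = nn.Sequential(
    nn.Linear(latent_dim, 128),
    nn.ReLU(),
    nn.Linear(128, data_dim),
    nn.Sigmoid(),  # pixel intensities in [0, 1]
)

with torch.no_grad():
    z = torch.randn(16, latent_dim)  # sample from the N(0, I) prior
    samples = decoder(z)             # decode latent points into data space

print(samples.shape)  # torch.Size([16, 784])
```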