P.

Karma: 521

P. 17 Feb 2022 14:37 UTC
2 points
in reply to: Thomas’s comment on: Prizes for ELK proposals
The Markov property doesn’t imply that we can’t determine what variable we care about using some kind of “correlation”. Some part of the information in some node in the chain might disappear when computing the next node, so we might be able to distinguish it from its successors. And it might also have been gained when randomly computing its value from the previous node, so it might be possible to distinguish it from its predecessors.
In the worst case scenario where all variables are in fact correlated to G what we need to do is to use a strong prior so that it prefers the correct computational graph over the wrong ones. This might be hard but it isn’t impossible.
But you can also try to create a dataset that makes the problem easier to solve, or train a wrong reporter and only reply when the predictions made when using each node are the same so we don’t care what node it actually uses (as long as it can use the nodes properly, instead of computing other node and using it to get the answer, or something like that).

P. 17 Feb 2022 16:59 UTC
2 points
in reply to: Thomas’s comment on: Prizes for ELK proposals
RNNs break the Markov property in the sense that they depend on more than just the previous element in the sequence they are modelling. But I don’t see why that would be relevant to ELK.
When I say that a strong prior is needed I mean the same thing that Paul means when he writes: “We suspect you can’t solve ELK just by getting better data—you probably need to ‘open up the black box’ and include some term in the loss that depends on the structure of your model and not merely its behaviour.”. Which is a very broad class of strategies.
I also don’t understand what you mean by having a strong idea about A->G, we of course have pairs of [A, G] in our training data but what we need to know is how to compute G from A given these pairs.

P. 18 Feb 2022 19:29 UTC
3 points
in reply to: Alex Flint’s comment on: Implications of automated ontology identification
I might just be repeating what he said, but he is right. Iteration doesn’t work. Assuming that you have performed an optimal Bayesian update on the training set and therefore have a probability distribution over models, generating new data from those models can’t improve your probability distribution over them, it’s just the law of conservation of expected evidence, if you had that information you would have already updated on it. Any scheme that violates this law simply can’t work.

P. 9 Mar 2022 14:26 UTC
1 point
on: ELK prize results
If human simulation is much simpler and faster than direct translation, then an obfuscated human-simulator would also be simpler and faster than a direct translator.
That is not obvious at all, indistinguishable obfuscation is a problem that seems to inherently require enormous amounts of computation. From the Wikipedia page you linked:
There have been attempts to implement and benchmark IO candidates. For example, as of 2017, an obfuscation of the function x1∧x2∧…∧x32 at a security level of 80 bits took 23.5 minutes to produce and measured 11.6 GB, with an evaluation time of 77 ms. Additionally, an obfuscation of the Advanced Encryption Standard hash function at a security level of 128 bits would measure 18 PB and have an evaluation time of about 272 years.
And the line below that links to a paper about a method that is exponential on the size of the circuit to obfuscate:
To give an idea about the practicality of this construction, consider a 2-bit multiplication circuit. It requires 4 inputs and between 1 and 8 AND gates for each of its 4 output bits. An obfuscation would be generated in about 10**27 years on a 2,6 GHz CPU and would require 20 Zetta Bytes of memory for m = 1 and p = 1049. Executing this circuit on the same CPU would take 1.3 × 10**8 years.
I was unable to find anything better and doubt this could be improved to the point where it would be hard to distinguish from a simple direct translator.

P. 10 Mar 2022 12:56 UTC
2 points
in reply to: paulfchristiano’s comment on: ELK prize results
Since the point of the contest is to discover a method for which there is no conceivable way it could fail, I grant you that given the uncertainty about what would really happen that counterexample suffices. But I still believe that in practice the interpreter will win. I think you are overextending your intuition, neural networks are incomprehensible to you because you are human, but activations in a network are what they are because they have been optimized so that the rest of the network can process (“understand”) them. So if a value can be written and read by a network, another one could do the same, since they are both being optimized to do it. It is only through complex cryptographic magic that we can avoid that (and maybe not even then if it turns out that IO can only create programs that are exponentially large).

P. 10 Mar 2022 13:30 UTC
4 points
on: ELK prize results
There is still one proposal for which I don’t think there is a known counterexample, my fifth one. Since it got an honorable mention, if I understand correctly it got classified under “reward reporters that are sensitive to what’s actually happening in the world”, with the counterexample “reporter randomizes its behavior”. I don’t think that’s right, or if it is, it might be easy to modify it so that it keeps working.
The basic idea was: Train a bayesian network or some similar model to map from [question,answer] pairs to the latent variables and activations (including observations) of the predictor. Penalize computation time. In order to answer questions, set any reasonable prior over answers and then perform bayesian inference.
Here both the “direct generator” and the “generator of worlds that a human thinks are consistent with a QA pair” (what I think you think is the counterexample) need to perform a lot of computation that scales with the size of the predictor. But assuming that we have a clean dataset where the answers correspond to reality, given an input like “What is on top of the pedestal? A diamond” the direct generator can just generate a representation of a diamond, while the other one needs to perform extra computation to determine in what other worlds a human would give that reply. This extra computation will be penalized, so the direct generator will be preferred and then the algorithm uses it to answer questions.

P. 10 Mar 2022 14:01 UTC
5 points
on: Online Interpreter Training in Canada
Is this a bot? If so, I’m impressed. How does it know where to click to create and submit a new post? Does LW follow a template shared by other sites?
Otherwise, why would you even bother wasting time creating an account on LW, knowing you will just be downvoted and banned?

P. 10 Mar 2022 19:17 UTC
4 points
in reply to: paulfchristiano’s comment on: ELK prize results
Sorry for writing so many comments, but I just don’t see it. Unless it is implemented in a very weird way or I’m misunderstanding something, my proposal doesn’t fail to the steganography counterexamples.
I don’t know why you think it would. One possibility is that there is a superficially similar proposal that might fail to steganography: we can train a reporter so that it produces answers that allow another system to reconstruct the activations of the predictor. In this case steganography is useful to pass extra information to that generator. But my proposal isn’t trying to make the answers as informative about the world as possible (during training); if we are using a proper distribution matching generator (i.e. almost anything except GANs), then it is trying to model the distribution as accurately as possible (given the computational limitations). Encoding information on the far side of the moon would just increase its loss function: given any random variable the distribution that minimizes the expected -log-likelihood is its true distribution, not any smart encoding. And if there is a QA pair that a human could have answered with uncertainty, the generator should then again accurately model that distribution (but a well made dataset won’t reflect that uncertainty).
And even if it did perform steganography (for some reason unknown to me) it would do so “on top” of the “direct generator” instead of the “generator of worlds that a human thinks are consistent with a QA pair”, because it is simpler.

P. 11 Mar 2022 16:34 UTC
4 points
in reply to: paulfchristiano’s comment on: ELK prize results
Is this precise enough?
- Start with a regularizer C that prefers faster programs. The exact implementation depends on the exact F used. For a regular neural network it is some proxy of simplicity, like L2 regularization. Although it would also be helpful to make F shallow.
- Learn a conditional probability distribution F: (question, answer) --> (predictor activations)
  - It must be such that we can compute P(predictor activations|question, answer), or some approximation
- It is optimized for some linear combination of C(F) and log loss when predicting the activations.
- Learn a prior distribution P(answer|question)
  - In the document I actually wrote just P(answer), that was a mistake, but it doesn’t change anything meaningful.
- In order to answer a question, iterate over all possible answers and choose the one that maximizes P(A|Q)*P(z|Q,A), z being the activations.
  - This might be slow, but this whole system can then be distilled into one that answers questions directly, we just need to train it with generated samples outside the training distribution, perhaps chosen adversarially. But that is outside the scope of the proposal.
This is given by P(A|Q,z) = P(Q,A,z)/P(Q,z) = P(Q)*P(A|Q)*P(z|Q,A)/P(Q,z) ∝ P(A|Q)*P(z|Q,A)
As discussed in the document, generating the sensory data first is a problem that won’t happen. Neither can it fill the activations at random, that will give it a terrible loss.

P. 12 Mar 2022 13:22 UTC
3 points
in reply to: Thomas’s comment on: ELK prize results
The disagreement stems from them not understanding my proposal, I hope that now it is clear what I meant. I explained in my submission why the speed prior works, but in a nutshell it is because both the “direct generator” and the “generator of worlds that a human thinks are consistent with a QA pair” need to generate the activations that represent the world, but the “direct generator” does it directly (e.g. “diamond” → representation of a diamond), while the other one performs additional computation to determine what worlds a human thinks are consistent with a QA pair (e.g. “diamond” → it might be a diamond, but it might also be just a screen in front of the camera and what is really in the vault is…), in the training set where the labels are always correct the direct generator is preferred.

P. 12 Mar 2022 17:28 UTC
2 points
in reply to: interstice’s comment on: ELK prize results
I don’t think that would happen. But imagine that somehow it does happen, the regularization is too strong and the dataset doesn’t include any examples where the camera was hacked, so our model predicts both the activations of the physical diamond and the diamond image independently, what then? Try to think about any toy model of a scenario like that, any simple enough that we can analyze exactly. The simplest is that the variable on which we are conditioning the generation is an “uniformly” distributed scalar to which we apply two linear transformations to predict two values (which are meant to stand for the two diamonds) and then add gaussian noise. Given two observed values I’m pretty sure (I didn’t actually do the math but it seems obvious) that the reconstructed initial value is a weighted average of what would be predicted by either value independently. I expect that something analogous would happen in more realistic scenarios. Is this an acceptable behavior? I think so, or at least much better than any known alternative. If we used an AI to optimize the world so that the answer to “Is the diamond in the vault?” is “Yes”, it would make sure that both the real diamond and the one in the image stay in place.

P. 12 Mar 2022 17:47 UTC
2 points
in reply to: Thomas’s comment on: ELK prize results
A generative model isn’t like a regression model. If we have two variables that are strongly correlated and want to predict another variable, then we can shift the parameters to either of them and get very close to what we would get by using both. In a generative model on the other hand, we need to predict both, no value is privileged, we can’t just shift the parameters. See my reply to interstice on what I think would happen in the worst case scenario where we have failed with the implementation details and that happens. The result wouldn’t be that bad.
If you can think of another algorithm as simple as a direct generator that performs well in training, say so. I think that almost by definition the direct generator is the simplest one.
And if we make a good enough but still human level dataset (although this isn’t a requirement for my approach to work) the only robust and simple correlation that remains is the one we are interested in.

P. 12 Mar 2022 18:43 UTC
1 point
in reply to: P.’s comment on: ELK prize results
As an aside, I think that that property of regression models, in addition to using small networks and poor regularization might be why adversarial examples exist (see http://gradientscience.org/adv.pdf). Some features might not be robust. If we have an image of a cat and the model depends on some non robust feature to tell it apart from dogs, we might be able to use the many degrees of freedom we have available to make a cat look like a dog. On the other hand if we used something like this method we would need to find an image of a cat that is more likely to have been generated from the input “dog” than from the input “cat”, it’s probably not going to happen.

P. 13 Mar 2022 10:29 UTC
4 points
in reply to: paulfchristiano’s comment on: ELK prize results
The concrete generative model I had in mind was the one I used as an example in the document (page 1 under section “Simplest implementation”):
Train a conditional VAE to generate all the variables of the decoder of the predictor, condition on the [question, answer] pair. Use L2 regularization on the decoder of our model as a proxy for complexity (since the computation time of VAEs is constant).
An autoregressive model is probably the single worst model you could choose. It forces an order of generation of the latents, which breaks the arguments I wrote in my proposal. And for the purposes of how I expect the intended model to work (which I will explain below), it is very very deep. During training we can generate everything in parallel, but at inference time we need to generate each token after the previous one, which is a very deep computational graph.
I don’t understand what you mean by considering a single question-answer pair. If for a given scenario we have multiple questions and answers, we just feed each [question, answer, activations] triple into our model as training data.
Does that sound right?
It does, that is the distribution it is modeling.
The intended behavior is the one I wrote in my reply to Thomas:
both the “direct generator” and the “generator of worlds that a human thinks are consistent with a QA pair” need to generate the activations that represent the world, but the “direct generator” does it directly (e.g. “diamond” → representation of a diamond), while the other one performs additional computation to determine what worlds a human thinks are consistent with a QA pair (e.g. “diamond” → it might be a diamond, but it might also be just a screen in front of the camera and what is really in the vault is…), in the training set where the labels are always correct the direct generator is preferred.
If we have a dataset where what is seen on screen always corresponds (simply) with reality, I expect the model to generate the “world” first (using a direct map like I said above) and then use it to generate the images (I argued for this order in the document). If we use this to answer questions, it will care about what is really happening in reality (this is just the Markov property, see the last row of my drawing if that isn’t clear).
If for some reason this fails and the model generates both the images and the world each conditioned mostly independently on the QA pair, then using that to answer questions will get us a system that cares about both what is happening on screen and what is happening in reality. Given that the data it had access to is consistent with both hypotheses, I don’t fault it for doing that. Using it to optimize the world so that the answer to “Is the diamond in the vault?” is “Yes”, will give us a system that cares about “both” diamonds. Not a terrible failure mode.
If on the other hand we have a dataset where the vault has an operator, so that it is hard for another person to answer correctly, the ordering argument becomes stronger. For simple enough models QA->world->images works but in QA->images->world the QA information is lost in the first step and doesn’t get to the world (since the images are modeled as independent from the true answer). And in this case if this fails for some reason and the model generates the world and the images independently, using it to answer questions will give us a model that cares strongly about reality. Because the correlation of the QA pair is stronger with reality (by construction, at least for simple models) than with the images. Here you can’t use the argument of “but what if performing perfect inference on the human simulation is still cheaper than the direct reporter”, since both models need to be big enough to generate all the data.
If you can think of another algorithm as simple as a direct generator that performs well in the training set, describe it.
Even if you think I’m wrong, do you at least understand why I currently think this might work?

P. 13 Mar 2022 17:53 UTC
2 points
in reply to: Rafael Harth’s comment on: March 2022 Welcome & Open Thread
Noether’s theorem? Symmetries and conservation laws seem quite different.

P. 14 Mar 2022 7:52 UTC
4 points
in reply to: paulfchristiano’s comment on: ELK prize results
Once it observes that covert tampering did in fact occur, it stops assuming that the human will be correct. (Since the most likely explanation is either that the human messed up, or that the model underestimated human abilities.) It seems like it won’t end up assuming that both tampering occurred to show a diamond and that the diamond was actually present.
But the neat thing is that there is no advantage, either to the size of the computational graph or in predictive accuracy, to doing that. In the training set the human is always right. Regular reporters make mistakes because what is seen on camera is a non-robust feature that generalizes poorly, here we have no such problems.
But I might have misunderstood, pseudocode would be useful to check that we can’t just remove “function calls” and get a better system.

P. 14 Mar 2022 8:28 UTC
1 point
in reply to: Thomas’s comment on: ELK prize results
You aren’t the only one, z is usually used for the latent, I just followed Paul’s notation to avoid confusion.
P(A|Q) comes just from training on the QA pairs. But I did say “set any reasonable prior over answers”, because I expect P(z|Q,A) to be orders of magnitude higher for the right answer. Like I said in another comment, an image generator (that isn’t terrible) is incredibly unlikely to generate a cat from the input “dog”, so even big changes to the prior probably won’t matter much. That being said, machine learning rests on the IID assumption, regular reporters are no exception, they also incorporate P(A|Q), it’s just that here it is explicit.
The whole point of VAEs is that the estimation of the probability of a sample is efficient (see section 2.1 here: https://arxiv.org/abs/1606.05908v1), so I don’t expect it to be a problem.

P. 26 Mar 2022 12:56 UTC
4 points
on: Compute Governance: The Role of Commodity Hardware
I like the seagull as a recurring character. In the caption of the first image you mention a ReLu but drew a sigmoid instead.

P. 1 Apr 2022 10:02 UTC
8 points
on: Replacing Karma with Good Heart Tokens (Worth $1!)
Given the flood of comments that will inevitably result from this, it might be hard to get noticed and to surface the best ones to the top. So I am offering the following service: If you reply to this I guarantee that I will read your comment, and then will give you one or two upvotes (or none) depending on how insightful I consider it to be. Sadly, this only works if people get to see this comment, so it is in your best interest to upvote it. Let’s turn this into a new, better comment section!

P. 1 Apr 2022 12:50 UTC
−2 points
on: Prime numbers (April fool’s)
25195908475657893494027183240048398571429282126204032027777137836043662020707595556264018525880784406918290641249515082189298559149176184502808489120072844992687392807287776735971418347270261896375014971824691165077613379859095700097330459748808428401797429100642458691817195118746121515172654632282216869987549182422433637259085141865462043576798423387184774447920739934236584823824281198163815010674810451660377306056201619676256133844143603833904414952634432190114657544454178424020924616515723350778707749817125772467962926386356373289912154831438167899885040445364023527381951378636564391212010397122822120720357