DavidHolmes
Neural networks biased towards geometrically simple functions?
Categorial preferences and utility functions
I really liked the content, but I found some of the style (`Sit down!′ etc.) really off-putting, which is why I only actually read the post on my 3rd attempt. Obviously you’re welcome to write in whatever style you want, and probably lots of other people really like it; I just thought it might be useful to mention that a non-empty set of people find it off-putting.
Hi Zack,
Can you clarify something? In the picture you draw, there is a codimension-1 linear subspace separating the parameter space into two halves, with all red points to one side and all blue points to the other. Projecting onto any 1-dimensional subspace orthogonal to this (there is a unique one through the origin) will thus yield a `variable’ which cleanly separates the points into the red and blue categories. So in the illustrated example, it looks just like a problem of bad coordinate choice.
On the other hand, one can easily have much more pathological situations; for example, the red points could all lie inside a certain sphere, and the blue points outside it. Then no choice of linear coordinates will exhibit this, and one has to use more advanced analysis techniques to pick up on it (e.g. persistent homology).
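To make the sphere situation concrete, here is a small sketch of my own (the sampling ranges and thresholds are my choices, not from the post): no linear functional separates points inside the unit circle from points in a surrounding annulus, but the quadratic feature r² separates them perfectly.

```python
import math
import random

random.seed(0)

def sample(radius_lo, radius_hi, n, dim=2):
    """Rejection-sample n points whose radius lies in [radius_lo, radius_hi]."""
    pts = []
    while len(pts) < n:
        p = [random.uniform(-2, 2) for _ in range(dim)]
        r = math.sqrt(sum(c * c for c in p))
        if radius_lo <= r <= radius_hi:
            pts.append(p)
    return pts

red = sample(0.0, 1.0, 200)    # inside the unit circle
blue = sample(1.2, 2.0, 200)   # in an annulus surrounding it

# For any direction w, the projections of the two classes overlap,
# so no linear coordinate separates them.
for _ in range(5):
    w = [random.gauss(0, 1), random.gauss(0, 1)]
    proj = lambda p: sum(wi * pi for wi, pi in zip(w, p))
    assert max(proj(p) for p in red) > min(proj(p) for p in blue)

# But the (nonlinear) feature r^2 separates them cleanly.
r2 = lambda p: sum(c * c for c in p)
assert max(r2(p) for p in red) < min(r2(p) for p in blue)
```

This is of course the simplest nonlinear example; persistent homology would detect the same structure without being handed the radial feature.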
So, to my vague question: do you have only the first situation in mind, or are you also considering the general case, but made the illustrated example extra-simple?
Perhaps this is clarified by your numerical example, I’m afraid I’ve not checked.
I expect most people on LW to be okay being asked their Cheerful Price to have sex with someone.
I find this a surprising assertion. It does not apply to me, probably it does apply to you. Ordinarily I would ask if you had any other data points, but I don’t want to take the conversation in this direction...
Some advice for getting papers accepted on arxiv
As some other comments have pointed out, there is a certain amount of moderation on arXiv. This is a little opaque, so below is an attempt to summarise some things that are likely to make it easier to get your paper accepted. I’m sure the list is very incomplete!
In writing this I don’t want to give the impression that posting things to the arXiv is hard; I currently have 28 papers there, have never had a single problem or delay with moderation, and the submission process generally takes me <15 minutes these days.
- Endorsement. When you first attempt to submit a paper you may need to be endorsed. JanBrauner kindly offered below to help people with endorsements; I might also be able to do the same, but I’ve never posted in the CS part of arXiv, so I’m not sure how effective this would be. However, it is even better to avoid needing endorsement at all. To this end, use an academic email address if you have one; this is quite likely to be enough on its own. Also, see below on subject classes (endorsement requirements depend on which subject class(es) you want to post in).
- Choosing subject classes. Each paper gets one or more subject classes, like CS.AI; see [https://arxiv.org/category_taxonomy] for a list. Some subject classes attract more junk than others, and those that attract more junk are more heavily moderated. In mathematics it is math.GM (General Mathematics) that attracts the most junk, and hence is most heavily moderated. I guess most people here are looking at CS.AI; I don’t know what moderation is like there. But one easy thing is to minimise cross-listing (adding additional subject classes to your paper), since you are then moderated by all of them.
- Write in (La)TeX, and submit the .tex file. You don’t have to do this, but it is standard and preferred by the arXiv, and I suspect it makes it less likely that your paper gets flagged for moderation. It is also an easy way to make sure your paper looks like a serious academic paper.
- It is possible to submit papers on behalf of third parties. I’ve never done this, and I suspect such papers are more heavily moderated.
- If you have multiple authors, it doesn’t really matter who submits. After the submission is posted you are sent a ‘paper password’ allowing coauthors to ‘claim’ the paper; it is then associated with their arXiv account, ORCID etc. (ORCID is optional, but a really good idea, and free).
Finally, a request: please be nice to the moderators! They are generally unpaid volunteers doing a valuable service to the community (e.g. making sure I don’t have to read nonsense proofs of the Riemann hypothesis every morning). Of course it doesn’t feel good if your paper gets held up, but please try not to take it personally.
I think that
provable guarantees on the safety of an FHE scheme that do not rely on open questions in complexity theory such as the difficulty of lattice problems.
is far out of reach at present (in particular, to the extent that no bounty would meaningfully affect people’s likelihood of working on it). It is hard to do much in crypto without assuming some kind of problem to be computationally difficult, and there are very few results proving that a given problem is computationally difficult in an absolute sense (rather than just ‘at least as hard as some other problem we believe to be hard’); cf. P vs NP. Or perhaps I misunderstand your meaning; are you OK with assuming e.g. integer factorisation to be computationally hard?
Personally I also don’t think this is so important; if we could solve alignment modulo assuming e.g. integer factorisation (or some suitable lattice problem) is hard, then I think we should be very happy…
More generally, I’m a bit sceptical of the effectiveness of a bounty here, because the commercial applications of FHE are already so great.
About 10 years ago, when I last talked to people in the area about this, I got the impression that FHE schemes were generally expected to be somewhat less secure than non-homomorphic schemes, just because the extra structure gives an attacker so much more to work with. But I have no idea whether people still believe this.
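As a concrete (and deliberately toy, totally insecure) illustration of the kind of extra structure at issue: textbook RSA with tiny parameters is multiplicatively homomorphic. This sketch is mine, not from the thread; real FHE schemes are lattice-based, but the homomorphic identity below shows the general phenomenon of ciphertexts carrying algebraic structure.

```python
# Textbook RSA with toy primes, purely to illustrate the
# multiplicative homomorphism: Enc(a) * Enc(b) encrypts a*b mod n.
p, q = 61, 53
n = p * q                  # modulus, 3233
phi = (p - 1) * (q - 1)    # Euler totient, 3120
e = 17                     # public exponent, coprime to phi
d = pow(e, -1, phi)        # private exponent (Python 3.8+ modular inverse)

enc = lambda m: pow(m, e, n)
dec = lambda c: pow(c, d, n)

a, b = 7, 12
# Multiplying ciphertexts multiplies the underlying plaintexts.
assert dec(enc(a) * enc(b) % n) == (a * b) % n
```

This very structure is what makes unpadded RSA malleable, which is the flavour of worry I remember people expressing about homomorphic schemes generally.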
I suspect the arXiv might not be keen on an account that posts papers by a range of people (not including the account-owner as coauthor). This might lead to heavier moderation/whatever. But I could be very wrong!
I was about to write approximately this, so thank you! To add one point in this direction: I am sceptical about the value of reducing the expectation that researchers explain what they are doing. My research is in two fields (arithmetic geometry and enumerative geometry). In the first we put a lot of the burden on the writer to explain themselves, while in the latter poor and incomplete explanations are standard. This sometimes allows people in the latter field to move faster, but:
1. it leaves critical foundational gaps, which we can ignore for a while but which eventually cause a lot of pain;
2. sometimes really critical points are hidden in the details, and we just miss these if we don’t write the details down properly.
Disclaimers:
1. while I think a lot of people working in these fields would agree with me that this distinction exists, not so many will agree that it is generally a bad thing;
2. I’m generally criticising lack of rigour rather than lack of explanation. I am not claiming these necessarily have to go together, but in my experience they very often do.
Thank you for the quick reply! I’m thinking about section 5.1 on reparametrising the model, where they write:
every minimum is observationally equivalent to an infinitely sharp minimum and to an infinitely flat minimum when considering nonzero eigenvalues of the Hessian;
If we stick to section 4 (and so don’t allow reparametrisation), I agree there seems to be something more tricky going on. I initially assumed that I could e.g. modify the proof of Theorem 4 to make a sharp minimum flat by taking alpha to be big, but it doesn’t work like that (basically we’re looking at alpha + 1/alpha, which can easily be made big, but not very small). So maybe you are right that we can only make flat minima sharp and not conversely. I’d like to understand this better!
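A quick numerical sanity check of the alpha + 1/alpha point (my own sketch, not from the paper): the quantity is unbounded above but bounded below by 2, with the minimum at alpha = 1, so rescaling can make things arbitrarily sharp but never arbitrarily flat.

```python
# alpha + 1/alpha >= 2 for alpha > 0 (AM-GM), with equality iff alpha = 1,
# so the quantity can be made arbitrarily large but never small.
alphas = [0.01, 0.5, 1.0, 2.0, 100.0]
vals = [a + 1 / a for a in alphas]
assert min(vals) == 2.0    # attained at alpha = 1
assert max(vals) > 100     # blows up as alpha -> 0 or alpha -> infinity
```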
Definitely the antagonistic bits - I enjoyed the casual style! Really just the line ‘Sit down. Sit down. Shut up. Listen. You don’t know nothing yet’ I found quite off-putting - even though in hindsight you were correct!
So the set of worlds, , is the set of functions from to …
I guess the should be a ? Also, you don’t seem to define ; perhaps ?
Bias towards simple functions; application to alignment?
The arXiv really prefers that you upload in TeX. For the author this makes it less likely that your paper will be flagged for moderation etc. (I guess). So if it were possible to export to TeX, I think that for the purposes of uploading to the arXiv this would be substantially better. Of course, I don’t know how much more/less work it is…
Thanks very much for the link!
I’m not sure I agree with interstice’s reading of the ‘sharp minima’ paper. As I understand it, they show that a given function can be made into a sharp or a flat minimum by finding a suitable point in the parameter space mapping to that function. So if one has a sharp minimum that does not generalise (which I think we will agree exists), then one can make the same function into a flat minimum, which will still not generalise, as it is the same function! Sorry I’m 2 years late to the party...
I’m sceptical of your decision to treat tenured and non-tenured faculty alike; as tenured faculty myself, this has long seemed to me perhaps the most important distinction.
More generally, what you write here is not very consistent with my own experience of academia (which is in mathematics and in Europe, though I have friends and collaborators in other countries and fields, so I am not totally clueless about how things work there).
Some points I am not seeing in your post are:
- For many academics, being able to do their own research and work with brilliant students is their primary motivation. Grants etc. are mainly valuable in how they facilitate that. This makes for a confusing situation where ‘losers’ in the original LCS model do the minimum work necessary for their paycheck, whereas ‘losers’ in the academic system (as you seem to be defining them?) do the maximum work that is compatible with their health and personal situation. Not only is this conceptually confusing to me, it also means that, all other things being equal, the more of a `loser’ one is in academia, the more impressive one’s CV will tend to be. Which is, I think, the opposite of the situation in the conventional LCS hierarchy?
- The fact that I ‘perform peer review for nothing at all’ apparently makes me clueless. But this is weird; it does not go on my CV, and I do it because I think it is important to the advancement of science. Surely this makes it a `loser’ activity?
- Acceptance of papers and awarding of grants is decided by people external to your university. This makes a huge difference, and I think you miss it by writing `So we might analyze this system at the department level, at the university level, or at the all-academia level, but it doesn’t make much of a difference.’
Perhaps the above makes it sound as if I view academia as an organisational utopia; this is far from the case! But I do not think this post does a good job of identifying problems. I think a post analysing moral mazes in academia would be interesting, but I’m not convinced that the LCS hierarchy is an appropriate model, and this attempt to apply it does not seem to me to make useful category distinctions.
Sure, in the end we only really care about what comes top, as that’s the thing we choose. My feeling is that information on (relative) strengths of preferences is often available, and when it is available it seems to make sense to use it (e.g. allowing circumvention of Arrow’s theorem).
In particular, I worry that, when we only have ordinal preferences, the outcome of attempts to combine various preferences will depend heavily on how finely we divide up the world; by using information on strengths of preferences we can mitigate this.
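To make the world-division worry concrete, here is a small hypothetical example of my own construction: a purely ordinal rule (Borda count) changes its verdict when one outcome is split into two near-identical copies, whereas summing cardinal utilities is unaffected by the split.

```python
def borda(profiles):
    """Borda scores for a list of rankings, each ordered best-first."""
    scores = {}
    for ranking in profiles:
        for pts, alt in enumerate(reversed(ranking)):
            scores[alt] = scores.get(alt, 0) + pts
    return scores

# Two voters, two worlds: a clean tie under Borda.
coarse = borda([["A", "B"], ["B", "A"]])
assert coarse["A"] == coarse["B"]

# Split world B into near-identical worlds B1, B2: the ordinal
# rule now breaks the tie purely because of how we carved up B.
fine = borda([["A", "B1", "B2"], ["B1", "B2", "A"]])
assert fine["B1"] > fine["A"]

# With cardinal utilities, each voter gives B1 and B2 the same
# utility they gave B, so A and the best B-world stay tied.
u1 = {"A": 1.0, "B1": 0.0, "B2": 0.0}
u2 = {"A": 0.0, "B1": 1.0, "B2": 1.0}
best_B = max(u1[w] + u2[w] for w in ["B1", "B2"])
assert u1["A"] + u2["A"] == best_B
```

The cloning move here is the standard trick behind independence-of-clones failures; using strength-of-preference information is one way to make the aggregate robust to such redescriptions.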
Thanks for pointing me to this updated version :-). This seems a really neat trick for writing down a utility function that is compatible with the given preorder. I thought a bit more about when/to what extent such a utility function will be unique, in particular if you are given not only the data of a preorder, but also some information on the strengths of the preferences. This ended up a bit too long for a comment, so I wrote a few things in outline here:
It may be quite irrelevant to what you’re aiming for here, but I thought it was maybe worth writing down just in case.
Hi Stuart,
I’m working my way through your `Research Agenda v0.9’ post, and am therefore going through various older posts to understand things. I wonder if I could ask some questions about the definition you propose here?
First, the condition that X be contained in R^N for some N seems not so relevant; can I just assume X, Y and Z are some manifolds (C^k for some 0≤k≤∞)? And we are given some partial order ≺ on X, so that we can refer to `being a better world’?
Then, as I understand it, your definition says the following:
Fix X, ≺ and Z. Let Y be a manifold and y+, y−∈Y. Given a local homeomorphism +:Y×Z→X, we say that y+ is partially preferred to y− if for all z∈Z, we have y−+z≺y++z.
I’m not sure which inequalities should be strict, but this seems non-essential for now. On the other hand, the dependence of this definition on the choice of Y seems somewhat subtle and interesting. I will try to illustrate this in what follows.
First, let us make a new definition. Fix X, ≺, and Z as before. Let Y′={y+,y−}, a two-element set equipped with the discrete topology, and let +′:Y′×Z→X be an immersion of C^k-manifolds. We say that y+ is weakly partially preferred to y− if for all z∈Z, we have y−+′z≺y++′z.
First, it is clear that partial preference implies weak partial preference. More formally:
Claim 1: Fix X, ≺ and Z. Suppose we have a manifold Y, points y+, y−∈Y, and a local homeomorphism +:Y×Z→X such that y+ is partially preferred to y−. Setting Y′={y+,y−} with the subspace topology from Y (i.e. discrete), and taking +′ to be the restriction of + from Y×Z to Y′×Z, we have that y+ is weakly partially preferred to y−.
Proof: obvious. □
However, the converse can fail if Z is not contractible. First, let’s prove that the concepts are equivalent for Z contractible:
Claim 2: Fix X, ≺ and Z, and assume that Z is contractible. Suppose we have a two-element set Y′={y+,y−} and a map +′:Y′×Z→X making y+ weakly partially preferred to y−. Then there exist a manifold Y, an injection Y′→Y, and a local homeomorphism +:Y×Z→X whose restriction to Y′×Z is +′, making y+ partially preferred to y−.
Proof: Let’s assume for simplicity of notation that X is equidimensional, say of dimension d_X, and write d_Z for the dimension of Z. Let Y be the disjoint union of two open balls of dimension d_X−d_Z, with Y′→Y the inclusion of the centres of the balls. Then take an ϵ-neighbourhood of Z in X; it is diffeomorphic to Y×Z since the normal bundle to Z in X is trivialisable (cf. https://math.stackexchange.com/questions/857784/product-neighborhood-theorem-with-boundary). □
If we want examples where weak partial preference and partial preference don’t coincide, we should look for an example where Z is not contractible and its normal bundle in X is not trivialisable.
Example 3: Let X be the disjoint union of two Möbius bands, and let Z be a circle. Note that including Z along the centre of either band gives a submanifold whose tubular neighbourhood is not a product. Assume that ≺ is such that one component of X is preferred to the other (and ≺ is indifferent within each connected component). Then take Y′={y+,y−}, and +′:Y′×Z→X to be the inclusion of the two circles along the centres of the two Möbius bands, such that {y+}×Z ends up in the preferred band. This yields a situation where y+ is weakly partially preferred to y−, but the conclusion of Claim 2 fails, i.e. this cannot be extended to a partial preference for y+ over y−.
What conclusion should we draw from this? To me, it suggests that the notion of partial preference is not yet quite as one would want. In the setting of Example 3, where X consists of two Möbius strips, one of which is preferred to the other, surely landing in the preferred strip should be preferred to landing in the un-preferred strip?! And yet the `local homeomorphism from a product’ condition gets in the way. This example is obviously quite artificial, and maybe analogous things cannot occur in reality. But I’m not so happy with this as an answer, since our approaches to AI safety should be (so far as possible) robust against flaws in our understanding of physics.
Apologies for the overly-long comment, and for the imperfect LaTeX (I’ve not used this type of form much before).