Thanks a lot!
Thomas Sepulchre
The ZOPA issue you raise actually disappears when the trade involves a lot of players, not only two.
Let’s say we have N players. The first consequence would be the existence of a unique price. Many mechanisms can lead to a unique price: you could spy on your neighbors to see if they get a better deal than you do, or you could just keep a price in mind and update it after each attempted trade (“If I got a deal, that’s suspicious, my price wasn’t good enough, I’ll update it. If I didn’t, I was too greedy, I’ll update it.”). In the end, everyone will use the same price.
At this point, everyone will specialize in one good (banana or coconut) based on whether each one values banana/coconut more or less than the market does.
The ZOPA is therefore the ZOPA between the worst banana gatherer and the worst coconut gatherer who still trade. The bigger N is, the smaller this ZOPA, and therefore the smaller the need for perverse behaviors.
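The price-updating mechanism described above can be sketched with a toy simulation (the update rule, step size, and starting prices are all made up for illustration): each seller keeps an ask and each buyer a bid in mind, and both nudge their number after every attempted trade, exactly in the “I got a deal, I update / I didn’t, I update” spirit.

```python
import random

random.seed(0)

N, STEP, ROUNDS = 10, 1.0, 2000
asks = [random.uniform(20, 80) for _ in range(N)]   # sellers' prices in mind
bids = [random.uniform(20, 80) for _ in range(N)]   # buyers' prices in mind

initial_spread = max(asks) - min(asks)

for _ in range(ROUNDS):
    random.shuffle(bids)                  # random pairing each round
    for i in range(N):
        if bids[i] >= asks[i]:            # deal: "my price wasn't good enough"
            asks[i] += STEP
            bids[i] -= STEP
        else:                             # no deal: "I was too greedy"
            asks[i] -= STEP
            bids[i] += STEP

final_spread = max(asks) - min(asks)
print(initial_spread, final_spread)       # asks cluster around a common price
```

With these (arbitrary) parameters the posted prices collapse from a wide initial range into a narrow band around a single market price, which is the convergence claimed above.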
Dumb question here: what does “agnostic” mean in this context?
1960 real GDP (and 1970 real GDP, and 1980 real GDP, etc) calculated at recent prices is dominated by the things which are expensive today—like real estate, for instance. Things which are cheap today are ignored in hindsight, even if they were a very big deal at the time.
I think this part could be misleading. Gross Domestic Product only includes goods that are produced in one year. Thus, if you live in a big city where it is almost impossible to build new buildings, variations in real-estate prices in this area don’t affect GDP. Therefore, the fact that real estate is expensive today, which is mainly true in city centers, has nothing to do with GDP, GDP growth, or the weight of the construction sector in GDP.
I feel like you are barking up the wrong tree. You are modelling the expected time needed for a sequence of steps, given the number of hard steps in the sequence, and given that all steps are completed before time T. I agree with you that those results are surprising, especially the fact that the expected time required for hard steps is almost independent of how hard those steps are, but I don’t think this is the kind of question that comes to mind on this topic. The questions would be closer to:
How many hard steps has life on earth already achieved, and how hard were they?
How many hard steps remain in front of us, and how hard are they?
How long will it take us to achieve them / how likely is it that we will achieve them?
I may have misunderstood your point, if so, feel free to correct me.
Sorry for this late response
For instance, knowing that the expected hard-step time is ~identical to the expected remaining time gives us reason to expect that the number of hard steps passed on Earth already is perhaps ~4.5 (given that the remaining time in Earth’s habitability window appears to be ~1 billion years).
I actually disagree with this statement. Assuming from now on that T = 1, so that we can normalize everything, your post shows that, if there are k hard steps, given that they are all achieved, then the expected time required to achieve all of them is k/(k+1). Thus, you have built an estimator of t, given k: t̂(k) = k/(k+1).
Now you want to solve the reverse problem: there are k hard steps, and you want to estimate this quantity. We have one piece of information, which is t, the time it actually took. Therefore, the problem is, given t, to build an estimator k̂(t). This is not the same problem. You propose to use k̂(t) = t/(1−t). The intuition, I assume, is that this is the inverse function of the previous estimator.
The first thing we could expect from this estimator is to have the correct expected value, i.e. E[k̂(t)] = k. Let’s check that.
The density of t is f(t) = k·t^(k−1) (quick sanity check here: the expected value of t is indeed ∫₀¹ t·k·t^(k−1) dt = k/(k+1)). From this we can derive the expected value E[k̂(t)] = ∫₀¹ (t/(1−t))·k·t^(k−1) dt. And we conclude that E[k̂(t)] = +∞: the integral diverges at t = 1.
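Both facts are easy to see numerically. The density k·t^(k−1) on [0, 1] is exactly that of the maximum of k i.i.d. uniform draws, which gives a one-line sampler (a quick sketch; k = 4 and the sample size are arbitrary choices of mine):

```python
import random

random.seed(1)
k, n = 4, 200_000

# t ~ k * t**(k-1) on [0, 1] is the law of the max of k uniforms
samples = [max(random.random() for _ in range(k)) for _ in range(n)]

mean_t = sum(samples) / n                              # close to k/(k+1) = 0.8
mean_estimate = sum(t / (1 - t) for t in samples) / n  # heavy tail near t = 1:
                                                       # keeps growing with n,
                                                       # the expectation is infinite
print(mean_t, mean_estimate)
```

The sample mean of t sits right at k/(k+1), while the sample mean of t/(1−t) comes out far above k and gets worse as the sample grows, since it is dominated by the few draws closest to 1.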
How did this happen? Well, it happened because it is likely for t to be really close to 1, which makes t/(1−t) explode.
Ok, so the expected value doesn’t match anything. But maybe k is the most likely result, which would already be great. Let’s check that too.
The density of k̂ = t/(1−t) is g(u) = k·u^(k−1)/(1+u)^(k+1). (Since t = u/(1+u), we have dt/du = 1/(1+u)², therefore g(u) = k·t^(k−1)·dt/du.) We can differentiate this to get the argmax: g′(u) ∝ u^(k−2)·((k−1) − 2u)/(1+u)^(k+2), and therefore argmax = (k−1)/2. Surprisingly enough, the most likely result is not k.
Just to be clear about what this means: if there are infinitely many planets in the universe on which live sentient species as advanced as us, and on each of them a smart individual is using your estimator to guess how many hard steps they already went through, then the average of these estimates is infinite, and the most common result is (k−1)/2.
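A quick numerical check that the mode of t/(1−t) sits at (k−1)/2 rather than at k (a sketch; k = 10 is an arbitrary choice):

```python
k = 10

def density(u):
    # density of t/(1-t) when t has density k * t**(k-1) on [0, 1]
    return k * u**(k - 1) / (1 + u)**(k + 1)

grid = [i / 1000 for i in range(1, 30_000)]   # u from 0.001 to 29.999
mode = max(grid, key=density)
print(mode)   # ≈ (k - 1) / 2 = 4.5, far below k = 10
```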
Fixing the estimator
An easy thing is to find the maximum likelihood estimator. Just use k̂(t) = (1+t)/(1−t) (that is, 2·t/(1−t) + 1) and, if you check the math again, you will see that the most likely result is k. Matching the expected value is a bit more difficult, because as soon as your estimator blows up like 1/(1−t) near t = 1, the expected value will be infinite.
For us, t = 4.5/5.5 ≈ 0.82, therefore the maximum likelihood estimator gives us k̂ = (1+t)/(1−t) = (5.5+4.5)/(5.5−4.5) = 10. Therefore, 10 hard steps seems a more reasonable estimate than 4.5.
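The mode of this corrected estimator k̂ = (1+t)/(1−t) can be checked the same way: writing v = (1+t)/(1−t), i.e. t = (v−1)/(v+1) and dt/dv = 2/(v+1)², its density is 2k·(v−1)^(k−1)/(v+1)^(k+1), and the argmax should now land on k itself (a sketch, again with k = 10 chosen arbitrarily):

```python
k = 10

def density(v):
    # density of (1+t)/(1-t) when t has density k * t**(k-1), for v >= 1
    return 2 * k * (v - 1)**(k - 1) / (v + 1)**(k + 1)

grid = [1 + i / 1000 for i in range(1, 50_000)]   # v from 1.001 to 50.999
mode = max(grid, key=density)
print(mode)   # ≈ k = 10
```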
So the estimate for the number of hard steps doesn’t make sense in the absence of some prior. Starting with a prior distribution for the likelihood of the number of hard steps, and applying bayes rule based on the time passed and remaining, we will update towards more mass on k = t/(T–t) (basically, we go from P( t | k) to P( k | t)).
Ok, let’s do this. Since k is an integer, I guess our prior should be a sequence (p_k). We already know P(t | k) = k·t^(k−1). We can derive from this P(t) = Σ_k p_k·k·t^(k−1), and finally P(k | t) = p_k·k·t^(k−1) / Σ_j p_j·j·t^(j−1). In our case, t = 4.5/5.5 ≈ 0.82.
I guess the most natural prior would be the uniform prior: we fix an integer K, and set p_k = 1/K for 1 ≤ k ≤ K. From this we can derive the posterior distribution. This is a bit tedious to do by hand, but easy to code. From the posterior distribution we can for example extract the expected value of k, E[k | t]. I computed it for a range of values of K, and voilà!
Obviously K ↦ E[k | t] is strictly increasing. It also converges toward 10 (which is exactly (1+t)/(1−t)). Actually, for almost all values of K, the expected value is very close to 10. To give a criterion: for K ≥ 15, the expected value is already above 7.25, which implies that it is closer to 10 than to 4.5.
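Here is a sketch of that posterior computation with exact rational arithmetic (the cutoff values K to print are arbitrary):

```python
from fractions import Fraction

t = Fraction(45, 55)   # 4.5 Gy elapsed out of T = 5.5 Gy, normalized to T = 1

def posterior_mean(K):
    # uniform prior p_k = 1/K on k = 1..K; likelihood P(t|k) = k * t**(k-1)
    weights = [k * t**(k - 1) for k in range(1, K + 1)]
    return sum(k * w for k, w in zip(range(1, K + 1), weights)) / sum(weights)

for K in (5, 10, 15, 50, 200):
    print(K, float(posterior_mean(K)))
```

As K grows, the posterior mean climbs toward (1+t)/(1−t) = 10, and it crosses the 7.25 midpoint between 4.5 and 10 at a fairly small cutoff.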
We can use different types of prior. I also tried p_k ∝ exp(−k/K) (with a normalization constant), which is basically a smoother version of the previous one. Instead of stating with certainty “the number of hard steps is at most K”, it’s more “the number of hard steps is typically K, but any huge number is possible”. This gives basically the same result, except it separates from 4.5 even faster, as soon as K ≥ 13.
My point is not to say that the number of hard steps is 10 in our case. Obviously I cannot know that. Whatever prior we may choose, we will end up with a probability distribution, not a nice clear answer. My point is that if, for the sake of simplicity, we choose to only remember one number / to only share one number, it should probably not be 4.5 (that is, t/(1−t)), but instead 10 (that is, (1+t)/(1−t)). I bet that, if you actually have a prior, and actually make the Bayesian update, you’ll find the same result.
I agree with those computations/results. Thank you
Could you please explain what the second “danger zone” graph means, i.e. what kind of skill could have such a danger zone?
The first danger zone is pretty clear: it is a skill for which being overconfident is bad. Any skill where failure is expensive falls into that category; any dangerous sport would be an example, where being overconfident could mean death.
The third follows a similar idea, where misplaced confidence is bad.
Here is what I take from your post:
Zvi: Allowing more tests at home would provide information. Information is good. The FDA should use lower criteria to allow more tests (and should have done that months ago)
FDA: Too low sensitivity and bad use of tests at home could lead to misplaced confidence. Misplaced confidence is bad.
Zvi: In this situation the extra information outweighs the misplaced confidence.
FDA: In this situation the extra information does not outweigh the misplaced confidence.
Can you provide evidence that the information indeed brings more value? Also, did I miss the point of your post and/or misrepresent it?
Wouldn’t this lead to some bystander effect?
An adversary would need to try about 2^H passwords in order to guess yours, on average.
This statement is wrong. If we take your example of choosing the password “password” with probability 1/2 and a uniformly random 1000-character lowercase string with probability 1/2 (which represents about 26^1000 ≈ 10^1415 different passwords), then the average number of tries needed to guess your password is about 26^1000/4: indeed, it will take the attacker about 26^1000/2 tries on average in the (probability 1/2) unlucky situation where you have not chosen the obvious password. This is extremely far from 2^H ≈ 2·26^500.
I apologize, my wording was indeed rude.
As for why I was confident, well, this was a clear example (in my eyes at least) of Jensen’s inequality: we are comparing the mean of the log to the log of the mean. And, if you see this inequality come up often, you know that it is always strict, except for a constant distribution. That’s how I knew.
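The gap is easy to reproduce at a smaller scale. Here is a sketch with a made-up, scaled-down version of the example: “password” with probability 1/2, otherwise one of M = 26^6 equally likely random strings. The entropy-based figure 2^H sits at the geometric-mean scale, while the true expected number of guesses is driven by the huge arithmetic-mean term:

```python
import math

M = 26**6   # number of equally likely "strong" passwords (scaled down for speed)

# Optimal attacker: try "password" first, then the M strong candidates.
# Expected guesses = 1*(1/2) + sum_{i=1..M} (i+1) * (1/(2M)), in closed form:
expected_guesses = 0.5 + (M * (M + 1) / 2 + M) / (2 * M)

# Shannon entropy of the password distribution, in bits
entropy = 0.5 * math.log2(2) + M * (1 / (2 * M)) * math.log2(2 * M)

print(expected_guesses, 2**entropy)   # the former is thousands of times larger
```

Here 2^H works out to 2·26^3 = 35,152, while the expected number of guesses is about M/4 ≈ 77 million: the mean of the log badly understates the log of the mean, exactly as Jensen’s inequality predicts.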
As a last note, I must praise the fact that you left your original comments (while editing them of course) instead of removing them. I respect you a lot for that.
You have to replace the right term by . This makes the sum equal to , or if you keep only two digits after the decimal point. Still pretty close to 600 though :)
[Question] How would you learn absolute pitch?
TL;DR: Tailed peacocks make better female chicks
Let’s, for a moment, pretend to be a peahen choosing a sexual mate. We have a few options, with tails of varying degrees of impressiveness. As stated in the post, it is difficult to tell whether the tail is a good proxy for fitness. Indeed, one could either argue that having a big tail is a handicap for the peacock, limiting agility for example, or that it is a strong hint that the peacock is otherwise very fit, despite the big tail. I would argue that, given the information we have (namely, that all the potential male mates survived so far), we shouldn’t assume a higher or lower fitness among them.
But, why do we care anyway? We are not interested in the fitness of our future mate, but rather in the fitness of our future chicks. And here, I think, the tail is relevant.
If we have male chicks, the choice of a mate will influence both the size of their tail and other characteristics like the ability to find food, agility and so on. As before, it doesn’t seem that the tail is a reasonable proxy on how to produce better male chicks.
If we have female chicks, the story is very different. A female chick will partially inherit the agility and general ability to survive from the mate we will choose, but will not inherit the handicap of a big tail. Therefore, we should choose the mate with the biggest tail.
Could you try “size of Jupiter, banana for scale”?
Not the main point of the post, just a small comment
So if you have any good hypotheses, you might want to test them with a compiled language for a 100-fold speedup over python
This used to be a kind of common knowledge that Python is 100 times slower than C++, for example—at least I heard it quite a lot in the world of competitive programming—but I think this is kind of false nowadays.
Libraries like NumPy are basically compiled C code made accessible to the Python user; thus, using such libraries yields performance comparable to C/C++. Since a lot of operations can be written using NumPy or similar libraries, Python is in fact no longer that slow.
I don’t yet have access to the code of this challenge, so maybe this one cannot be written this way, in which case it would genuinely be 100 times faster in another language. But, in general, many standard operations can be done in Python without such a big slowdown.
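For instance, a rough sketch of the difference (exact timings are machine-dependent, and NumPy is assumed to be installed): summing squares with an interpreted Python loop versus the same operation dispatched once to NumPy’s compiled kernels.

```python
import time
import numpy as np

n = 1_000_000

t0 = time.perf_counter()
total_py = sum(x * x for x in range(n))   # interpreted: one bytecode loop per element
t1 = time.perf_counter()

arr = np.arange(n, dtype=np.int64)
t2 = time.perf_counter()
total_np = int((arr * arr).sum())         # one call into compiled C code
t3 = time.perf_counter()

assert total_py == total_np
print(f"pure Python: {t1 - t0:.4f}s, NumPy: {t3 - t2:.4f}s")
```

Both compute the same number; the vectorized version is typically one to two orders of magnitude faster, which is the whole point.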
The ML technique known as dropout corresponds to the idea of randomly deleting neurons from the network during training, while still requiring that it perform whatever its task is.
So I guess you can make sure that your NN is robust with respect to the loss of neurons.
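In its standard (“inverted”) form, dropout is just a random mask plus a rescaling, so that the layer’s expected output is unchanged; a minimal sketch (the layer size and drop probability are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop=0.5, train=True):
    """Zero each neuron with probability p_drop during training,
    scaling survivors by 1/(1 - p_drop) to preserve the expectation."""
    if not train:
        return activations
    keep = rng.random(activations.shape) >= p_drop
    return activations * keep / (1.0 - p_drop)

a = np.ones(100_000)
out = dropout(a)
print(out.mean())   # close to 1.0: the expected activation is unchanged
```

At inference time (`train=False`) the layer is the identity, so the network must have learned to do its job under random deletion of neurons, which is exactly the robustness mentioned above.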
I think there’s actually one way to set relative responsibility weights which makes more mathematical sense than the others. But first, let’s slightly change the problem, and assume that the members of the group arrived one by one, so that we can order the members by time of arrival. If this is the case, I’d argue that the complete responsibility for the soup goes to the one who brought the last necessary element.
Now, back to the main problem, where the members aren’t ordered. We can set the responsibility weight of an element to be the number of permutations in which this particular element is the last necessary element, divided by the total number of permutations.
This method has several qualities: the sum of responsibilities is exactly one; each useful element (each Necessary Element of some Sufficient Set) has a positive responsibility weight, while each useless element has a responsibility weight of 0. It also respects the symmetries of the problem (in our example, the responsibility of the pot, the fire and the water is the same, and the responsibility of each ingredient is the same).
In a subtle way, it also takes into account the scarcity of each resource. For example, let’s compare the situation [1 pot, 1 fire, 1 water, 4 ingredients] with the situation [2 pots, 1 fire, 1 water, 4 ingredients]. In the first one, in any order, the responsibility goes to the last element, therefore the final responsibility weight is 1⁄7 for each of the 7 elements. The second situation is a bit trickier; we must consider two cases. First case: the last element is a pot (1/4 of the time). In this case, the responsibility goes to the seventh element, which gives 1⁄7 responsibility weight to each non-pot element, and 1⁄14 to each pot. Second case: the last element is not a pot (3/4 of the time), in which case the responsibility goes to this last element, which gives 1⁄6 responsibility weight to each non-pot element, and 0 to each pot. In total, the responsibilities are 1⁄56 for each pot, and 9⁄56 for each other element. We see that the responsibility of each pot has been divided by 8, basically because the pot is no longer a scarce resource.
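The two-pot computation can be checked by brute force over all orderings. A sketch, assuming (as the fractions 1⁄56 and 9⁄56 require) two interchangeable pots plus six distinct non-pot necessities; the element names are made up:

```python
from fractions import Fraction
from itertools import permutations
from collections import Counter

pots = ["pot_a", "pot_b"]
others = ["fire", "water", "ing1", "ing2", "ing3", "ing4"]
elements = pots + others

def completer(order):
    """Return the element whose arrival first makes the soup possible
    (all non-pot necessities present, plus at least one pot)."""
    have = set()
    for e in order:
        have.add(e)
        if all(o in have for o in others) and have & set(pots):
            return e

counts = Counter(completer(order) for order in permutations(elements))
total = sum(counts.values())   # 8! = 40320 orderings

resp = {e: Fraction(counts[e], total) for e in elements}
print(resp["pot_a"], resp["fire"])   # 1/56 and 9/56
```

The enumeration recovers 1⁄56 per pot and 9⁄56 per other element, and the weights sum to exactly one.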
Anyway, my point is, this method seems to be a good way to generalize the idea of NESS: instead of just checking whether an element is a Necessary Element of some Sufficient Set, one counts the permutations in which this element is the last necessary element.