jimmy's Posts - LessWrong 2.0 viewer
https://www.greaterwrong.com/
Meetup : Garden grove meetup by jimmy
https://www.greaterwrong.com/posts/rM35xcXLWD2JaZam3/meetup-garden-grove-meetup
<h2>Discussion article for the meetup : <a href="https://www.lesswrong.com/meetups/a4">Garden grove meetup</a></h2>
<div class="meetup-meta">
<p>
<strong>WHEN:</strong>
<span class="date">22 May 2012 07:18:15PM (-0700)</span><br>
</p>
<p>
<strong>WHERE:</strong>
<span class="address">Brookhurst and Garden Grove 10130 Garden Grove Blvd, Garden Grove, CA 92844</span>
</p>
</div><!-- .meetup-meta -->
<div class="content">
<div class="md"><p>At Genki Living. For snacks and drinks, with light discussion on how to properly have a meetup and what to meet up about.</p></div>
</div><!-- .content -->
by jimmy (rM35xcXLWD2JaZam3), Tue, 15 May 2012 02:17:14 +0000

26 March 2011 Southern California Meetup by jimmy
https://www.greaterwrong.com/posts/FrjZxiYSxBkNjX7J9/26-march-2011-southern-california-meetup
<p>We’re having another SoCal LessWrong meetup this Saturday, the 26th. It’ll be held in the upstairs meeting area at <a href="http://maps.google.com/maps?hl=en&ie=UTF8&q=18542+Macarthur+Blvd+Irvine+CA,+92612+ihop&fb=1&gl=us&hq=ihop&hnear=18542+MacArthur+Blvd,+Irvine,+CA+92612&cid=0,0,16042522294109526569&ei=uLa8TOG7MoWCsQO1nM2TDw&ved=0CBYQnwIwAA&ll=33.679247,-117.859418&spn=0.009285,0.021136&t=h&z=16&iwloc=A">this IHOP</a> in Irvine. It will start at 2PM and probably run until 7 or so.</p>
<p>The format for past meetups has varied based on the number of attendees and their interests; at various points we have tried: <a href="http://wiki.lesswrong.com/wiki/Paranoid_debating">paranoid debating</a>, small group “dinner party conversations”, <a href="https://www.greaterwrong.com/posts/Ru39RvX2jnGm8TCjh/september-2010-southern-california-meetup#comment-PBmqRuhveJWtgi6Mf">structured rationality exercises</a>, large discussions with people sharing personal experiences with sleep and “nutraceutical” interventions for intelligence augmentation, and specialized subprojects to develop tools for quantitatively estimating the value of things like <a href="http://wiki.lesswrong.com/wiki/Cryonics">cryonics</a> or <a href="http://wiki.lesswrong.com/wiki/Existential_risk">existential risk</a> interventions.</p>
<p>If you need or can offer a ride, post in the comments.</p>

by jimmy (FrjZxiYSxBkNjX7J9), Sun, 20 Mar 2011 18:29:16 +0000

October 2010 Southern California Meetup by jimmy
https://www.greaterwrong.com/posts/HZb5vKcRcXZ62gPGb/october-2010-southern-california-meetup
<p>We’re having the third SoCal LessWrong meetup this Saturday, the 23rd. It’ll be held at <a href="http://maps.google.com/maps?hl=en&ie=UTF8&q=18542+Macarthur+Blvd+Irvine+CA,+92612+ihop&fb=1&gl=us&hq=ihop&hnear=18542+MacArthur+Blvd,+Irvine,+CA+92612&cid=0,0,16042522294109526569&ei=uLa8TOG7MoWCsQO1nM2TDw&ved=0CBYQnwIwAA&ll=33.679247,-117.859418&spn=0.009285,0.021136&t=h&z=16&iwloc=A">this IHOP</a> in Irvine, from 1PM to 8PM, in the upstairs meeting area.</p>
<p>For those who haven’t yet come, the last two were quite successful, bringing 13 and 16 people respectively, and there was plenty of intelligent and friendly discussion.</p>
<p>Make sure to comment if you have suggestions for how to improve on the last one, if you can give/need a ride, or just to say you’re coming.</p>

by jimmy (HZb5vKcRcXZ62gPGb), Mon, 18 Oct 2010 21:28:17 +0000

Localized theories and conditional complexity by jimmy
https://www.greaterwrong.com/posts/8f7sXMEiKRWQdBRna/localized-theories-and-conditional-complexity
<p>Suppose I hand you a series of data points without providing the context. Consider the theory v = a*t for t &lt;&lt; 1, v = b for t &gt;&gt; 1. Without knowing anything a priori about the shapes of the curves, one must have enough data to make sure that v follows the right lines in the two limits, since there is complexity that must be justified. Here we have two one-parameter curves, so we need at least two data points to pick the right slope and offset, as well as at least a couple more to make sure it follows the right shape.</p>
<p>This is what I’ll call a completely local theory – see data, fit curve. Dealing with problems at this level does not leave much room for human bias or error, but it also does not allow for improvement by including background knowledge.</p>
<p>Now consider the case where v = velocity of a rocket sled, a = thrust/mass of sled, and b = sqrt(thrust/(1/2*rho*Cd*Af)). If you have a theory explaining rockets and aerodynamics, the equation v=a*t and v = b are just the limiting cases for small and large t. In this case, you only need two data points to find a and b since you already know the shape (over the full range) from solving the differential equations. If you understand the aerodynamics well enough, and know the shape and mass of the sled, you don’t even need to do the experiment! The “<a href="http://en.wikipedia.org/wiki/Conditional_entropy">conditional complexity</a>” is 0 since it is directly predicted from what we already know. This is the magic of keeping track of the dependencies between theories.</p>
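The rocket-sled example can be checked numerically. A minimal sketch (the thrust, mass, and drag numbers are made up for illustration; only the functional form comes from the post):

```python
import math

# Hypothetical rocket-sled parameters (illustrative, not from the post).
thrust = 2000.0                  # N
mass = 100.0                     # kg
rho, Cd, Af = 1.2, 1.0, 1.0      # air density, drag coefficient, frontal area

a = thrust / mass                             # small-t slope: v ~ a*t
b = math.sqrt(thrust / (0.5 * rho * Cd * Af)) # terminal velocity: v -> b

# The ODE m*dv/dt = thrust - 0.5*rho*Cd*Af*v^2 has the closed-form
# solution v(t) = b * tanh(a*t/b), which interpolates both limits.
def v(t):
    return b * math.tanh(a * t / b)

print(v(0.01))   # ~ a*t in the small-t limit
print(v(100.0))  # ~ b in the large-t limit
```

Solving the differential equation once gives the whole curve, so once the background theory is in hand, the two "local" theories cost no additional bits.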
<p>We can take this a step further and derive a theory of aerodynamics from a theory of air molecules, and so on until we have one massively connected <a href="http://en.wikipedia.org/wiki/Theory_of_everything">TOE</a>.</p>
<p>Now step back to the beginning. If all I tell you is that when t = 1e-5, v = 2e-5 and when t = 1e-3, v = 2e-3, you’re going to come up with the equation v = 2*t. If someone, with no further information, suggested that v = 2*t was only a small-t approximation, and that for large t, v = 5.32, you’d think that he’s nutso (and rightfully so), with all that unnecessary complexity.</p>
<p>As a wannabe Bayesian, you need to update on all evidence, so we’re almost never trying to fit data without knowing what it means. We prefer globally simple theories, not theories where each local section is simple but the sections don’t fit together.</p>
<p>I suspect that one of the main reasons people fail to <a href="http://en.wikipedia.org/wiki/Occams_razor#Anti-razors">understand/accept</a> Occam’s razor comes from trying to apply it to theories locally and then noticing that by importing information from a more general theory, they can do better. Of course you do better with more information than you do with a wisely chosen ignorance prior. You need to apply Occam’s razor to the whole bundle. Since all of the background theory is the same, you can reduce this to the <a href="http://en.wikipedia.org/wiki/Entropy">entropy</a> of the local theories that is left after <a href="http://en.wikipedia.org/wiki/Conditional_entropy">conditioning</a> on the background theory.</p>
<p>When Eliezer says that he doesn’t expect humans to have <a href="https://www.greaterwrong.com/posts/NnohDYHNnKDtbiMyp/fake-utility-functions">simple utility functions</a>, it’s not because it is a magical case where Occam’s razor doesn’t apply. It’s that it would take more bits overall to explain evolution creating a simple utility function than to explain evolution creating a particular <a href="https://www.greaterwrong.com/posts/cSXZpvqpa9vbGGLtG/thou-art-godshatter">locally complex utility function</a>. This is very different from concluding that Occam’s razor doesn’t fit real life. If Occam’s razor seems to be giving you bad answers, you’re <a href="http://www.doingitwrong.com/">doing it wrong</a>.</p>
<p>What does this imply for the future? Those with poor memories and/or a poor understanding of history will answer “much like the present” based on a single point and the locally simplest fit. You can find people one step up from that who notice improvements over time and fit it to a line, which again isn’t a bad guess if that’s all you know (you almost always know more; actually using the rest of your information efficiently is the trick). Another step ahead and you get people who hypothesize exponential growth based on their understanding of improvements feeding improvements, or at least a wider-spanning dataset. This is where you’ll find Ray Kurzweil and the ‘accelerating improvement singularity’ folk. The last step I know of is where you’ll find Eliezer Yudkowsky and other ‘hard takeoff’ folks. This is where you say “yes, I know my theory is locally more complex; I know that it isn’t obvious from looking at the curve, quite the opposite. However, my theory is less complex after conditioning on the rest of my knowledge that <em>doesn’t show up on this plot</em>, and for that reason, I believe it to be true”.</p>
<p>This might sound like saying “Emeralds are <a href="http://en.wikipedia.org/wiki/Grue_and_Bleen">Grue</a>, not green”, but while “X until arbitrary date, then Y” fares worse when applying Occam’s razor locally, if our theory of color indicated a special ‘turning point’, then we would have to conclude “Emeralds are grue, not green”, and we would conclude this because of Occam’s razor, not in spite of it.</p>
<p>I chose this example because it is important and well known at LW, but not for lack of other examples. In my experience, this is a very common mistake, even for otherwise intelligent individuals. This makes getting it right a quite fun rationality ‘<a href="https://www.greaterwrong.com/posts/5o4EZJyqmHY4XgRCY/einstein-s-superpowers">superpower</a>’.</p>

by jimmy (8f7sXMEiKRWQdBRna), Mon, 19 Oct 2009 07:29:34 +0000

How to use “philosophical majoritarianism” by jimmy
https://www.greaterwrong.com/posts/5XMrWNGQySFdcuMsA/how-to-use-philosophical-majoritarianism
<p>The majority of people would hold more accurate beliefs if they simply <a href="http://www.overcomingbias.com/2007/03/on_majoritarian.html">believed the majority</a>. To state this in a way that doesn’t risk <a href="https://www.greaterwrong.com/posts/DNQw596nPCX4x7xT9/information-cascades">information cascades</a>, we’re talking about averaging <a href="https://www.greaterwrong.com/posts/ZP2om2oWHPhvWP2Q3/the-ethic-of-hand-washing-and-community-epistemic-practice">impressions</a> and coming up with the same <a href="https://www.greaterwrong.com/posts/a7n8GdKiAZRX86T5A/making-beliefs-pay-rent-in-anticipated-experiences">belief</a>.</p>
<p>To the degree that you come up with different averages of the impressions, you acknowledge that your belief was just your impression of the average, and you average those metaimpressions and get closer to belief convergence. You can repeat this until you get bored, but if you’re doing it right, your beliefs should get closer and closer to agreement, and you shouldn’t be able to predict who is going to fall on which side.<br><br>Of course, most of us are atypical cases, and as good rationalists, we need to<em> update</em> on this information. Even if our impressions were (on average) no better than the average, there are certain cases where <a href="https://www.greaterwrong.com/posts/NKECtGX4RZPd7SqYp/the-modesty-argument">we know</a> that the majority is wrong. If we’re going to selectively apply majoritarianism, we need to figure out the <em>rules</em> for when to apply it, to whom, and how the weighting works.<br><br>This much I think has been said again and again. I’m gonna attempt to describe <em>how</em>.<a id="more"></a></p>
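The iterated averaging described above can be sketched as a toy simulation (the two-person setup, starting impressions, and update weight are my own illustration, not from the post):

```python
# Each round, both parties shift partway toward the other's stated belief.
# The gap shrinks geometrically, and by symmetry neither side is
# predictably "the high one" at any iteration.
def converge(mine, yours, weight=0.3, rounds=20):
    for _ in range(rounds):
        # Both updates use the pre-round values (simultaneous update).
        mine, yours = (mine + weight * (yours - mine),
                       yours + weight * (mine - yours))
    return mine, yours

print(converge(0.9, 0.5))  # both end within ~1e-8 of the midpoint, 0.7
```

With simultaneous updates the gap multiplies by (1 - 2*weight) each round, so any weight between 0 and 0.5 converges; the endpoint is the average of the starting impressions.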
<p>Imagine for a moment that you are a perfectly rational Bayesian, and you just need data.<br><br>First realize that “duplicate people” don’t count double. If you make a maximum precision copy of someone, that doesn’t make him any more likely to be right; clearly we can do better than averaging over all people with equal weighting. By the same idea, finding out that a certain train of thought leading to a certain belief is common shouldn’t make you proportionally more confident in that idea. The only reason it might make you <em>any</em> more confident in it is the possibility that its truth leads to its proliferation and therefore its popularity is (weak) evidence.</p>
<p>This explains why we can dismiss the beliefs of the billions of theists. First of all, their beliefs are very well correlated, so that all useful information can be learned through only a handful of theists. Second of all, we understand their arguments and we understand how they formed their beliefs, and have already taken them into account. The reason they continue to disagree is that the situation isn’t symmetric—they don’t understand the opposing arguments or the causal path that leads one to be a reductionist atheist.</p>
<p>No wonder “majoritarianism” doesn’t seem to work here.</p>
<p>Since we’re still pretending to be perfect Bayesians, we only care about people who are fairly predictable (given access to their information) and have information that we don’t have. If they don’t have any new information, then we can just follow the causal path and say “and here, sir, is where you went wrong”. Even if we don’t understand their mind perfectly, we don’t take them seriously, since it is clear that whatever they were doing, they’re doing it wrong. On the other hand, if the other person has a lot of data, but we have no idea how data affects their beliefs, then we can’t extract any useful information.</p>
<p>We only change our beliefs to more closely match theirs when they are not only predictable, but <em>predictably rational</em>. If you know someone is <a href="https://www.greaterwrong.com/posts/qNZM3EGoE5ZeMdCRt/reversed-stupidity-is-not-intelligence"><em>always</em> wrong</a>, then reversing his stupidity can help you get more accurate beliefs, but it won’t bring you closer to agreement- just the opposite!</p>
<p>If we stop kidding ourselves and realize that we aren’t perfect Bayesians, then we have to start giving credit to how other people think. If you and an epistemic peer come upon the same data set and come to different conclusions, then you have no reason to think that your way of thinking is any more accurate than his (as we assumed he’s an epistemic peer). While you may have different initial impressions, you had better be able to converge to the same belief. And again, on each iteration, it shouldn’t be predictable who is going to fall on which side.</p>
<p>If we revisit cases like religion, then you still understand how they came to their beliefs and exactly why they fail. So to the extent that you believe you can recognize stupidity when you see it, you still stick to your own belief. Even though you aren’t perfect, for this case, you’re good enough.<br><br><a href="http://rob-zahra.blogspot.com/2009/04/overcoming-bias-summaries.html"><strong>One sentence summary:</strong></a> <strong>You want to shift your belief to the average over answers given by predictably rational “Rituals of Cognition”/data set pairs<sup>1</sup>, <em>not people</em><sup>2</sup></strong>.</p>
<p>You weight the different “Rituals Of Cognition”/data pairs by how much you trust the ROC and by how large the data set is. You must, however, keep in mind that to trust yourself more than average, you have to have a <a href="http://www.overcomingbias.com/2006/12/benefit_of_doub.html">better than average</a> reason to think that you’re better than average.</p>
<p>To the extent that everyone has a unique take on the subject, counting people and counting cognitive rituals are equivalent. But when it comes to a group where all people think pretty close to the same way, then they only get one “vote”. </p>
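The "one vote per way of thinking" rule can be sketched as follows (the ROC tags and belief numbers are invented purely for illustration):

```python
from collections import defaultdict

# Beliefs tagged by the "ritual of cognition" that produced them; duplicated
# reasoning collapses to a single vote (tags and numbers are hypothetical).
beliefs = [
    ("scripture", 0.99), ("scripture", 0.98), ("scripture", 0.99),
    ("physics", 0.01), ("outside-view", 0.10),
]

by_roc = defaultdict(list)
for roc, b in beliefs:
    by_roc[roc].append(b)

# One vote per ritual of cognition: average within each ROC, then across ROCs.
votes = [sum(bs) / len(bs) for bs in by_roc.values()]
pooled = sum(votes) / len(votes)

naive = sum(b for _, b in beliefs) / len(beliefs)
print(naive, pooled)  # head-counting vs one-vote-per-ROC pooling
```

Head-counting lets the three near-identical "scripture" votes dominate; collapsing them to one vote moves the pooled answer substantially, which is the point of counting cognitive rituals rather than people.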
<p>You can get “bonus points” if you can predict how irrational people’s beliefs respond to data, and update based on that. For practical purposes though, I don’t think much of this happens, as not many people are <a href="https://www.greaterwrong.com/posts/qNZM3EGoE5ZeMdCRt/reversed-stupidity-is-not-intelligence">intelligently stupid</a>.</p>
<p><strong>ETA:</strong> This takes the anthropomorphism out of the loop. We’re looking at valid ROC, and polling human beliefs is just a cheap way to <em>find</em> them. If we can come up with other ways of finding them, I expect that to be <em>very</em> valuable. The smart people that impress me most aren’t the ones that learn slightly quicker, since everyone else gets there too. The smart people that impress me the most come in where everyone else is stumped and chop Gordian’s knot in half with their unique way of thinking about the problem. Can we train this skill?</p>
<p><strong>Footnotes:</strong><br><strong>1.</strong> I’m fully aware of how hokey this sounds without any real math there, but it seems like it should be formalizable.<br>If you’re just trying to improve human rationality (as opposed to programming AI), the real math would have to be interpreted again anyway and I’m not gonna spend the time right now.<br><br><strong>2.</strong> Just as thinking identically to your twin doesn’t help you get the right answer (and therefore is weighted less), if you can come up with more than one <em>valid</em> way of looking at things, you can justifiably expect to be weighted as strongly as a small group of people.</p>

by jimmy (5XMrWNGQySFdcuMsA), Tue, 05 May 2009 06:49:45 +0000

How to come up with verbal probabilities by jimmy
https://www.greaterwrong.com/posts/6Bz4TK37T8t5S3AbM/how-to-come-up-with-verbal-probabilities
<p>Unfortunately, we are kludged together, and we can’t just look up our probability estimates in a register somewhere when someone asks us “How sure are you?”.<br><br>The usual heuristic for putting a number on the strength of beliefs is to ask “When you’re this sure about something, what fraction of the time do you expect to be right in the long run?”. This is surely better than just “making up” numbers with no feel for what they mean, but it still has its faults. The big one is that unless you’ve done your calibrating, you may not have a good idea of how often you’d expect to be right.<br><br>I can think of a few different heuristics to use when coming up with probabilities to assign.<br><br>1) Pretend you have to bet on it. Pretend that someone says “I’ll give you ____ odds, which side do you want?”, and figure out what the odds would have to be to make you indifferent to which side you bet on. Consider the question as though you were <a href="http://www.overcomingbias.com/2007/06/uncovering_rati.html"><em>actually going to put money on it</em></a>. If this question is covered on a prediction market, your answer is given to you.</p>
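Heuristic 1 amounts to reading your probability off the odds that leave you indifferent; a minimal sketch (the 3-to-1 example is my own, not from the post):

```python
# If betting `stake` to win `payout` on X feels exactly fair, zero expected
# value pins down your implied probability:
#   p * payout = (1 - p) * stake  =>  p = stake / (stake + payout)
def implied_probability(payout, stake):
    return stake / (stake + payout)

print(implied_probability(3, 1))  # indifferent at 3-to-1 odds -> p = 0.25
```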
<p>2) Ask yourself how much evidence someone would have to give you before you’re back to 50%. Since we’re trying to update according to Bayes’ law, knowing how much evidence it takes to bring you to 50% tells you the probability you’re implicitly assigning.</p>
<p>For example, pretend someone said something like “I can guess people’s names by their looks”. If he guesses the first name right, and it’s a common name, you’ll probably write it off as a fluke. The second time, you’ll probably think he knew the people or is somehow fooling you, but <a href="https://www.greaterwrong.com/posts/neQ7eXuaXpiYw7SBy/the-least-convenient-possible-world">conditional on that</a>, you’d probably say he’s just lucky. By Bayes’ law, this suggests that you put the prior probability of him pulling this stunt at 0.1% &lt; p &lt; 3%, and less than 0.1% prior probability of him having his claimed skill. If it takes 4 correct calls to leave you equally unsure either way, then that’s about 0.03^4 if they’re common names, or one in a million<sup>1</sup>...<a id="more"></a><br><br>There’s a couple of neat things about this trick. One is that it allows you to get an idea of what your subconscious level of certainty is before you ever think of it. You can imagine your immediate reaction to “Why yes, my name is Alex, how did you know?” as well as your carefully deliberated response to the same data (if they’re much different, be wary of <a href="https://www.greaterwrong.com/posts/CqyJzDZWvGhhFJ7dY/belief-in-belief">belief in belief</a>). The other neat thing is that it pulls up alternate hypotheses that you find more likely, and how likely you find those to be (e.g. “you know these people”).<br><br>3) Map out the typical shape of your probability distributions (i.e. through calibration tests) and then go by how many standard deviations off the mean you are.
If you’re asked to give the probability that x &lt; C, you can find your one-sigma confidence intervals and then pull up your curve to see what it predicts based on how far out C is<sup>2</sup>.<br><br>4) Draw out your <a href="https://www.greaterwrong.com/posts/LKHJ2Askf92RBbhBp/metauncertainty">metaprobability distribution</a>, and take the mean.<br><br>You may initially have different answers for each question, and in the end you have to decide which to trust when actually placing bets.<br><br>I personally tend to lean towards 1 for intermediate probabilities, and 2 then 4 for very unlikely things. The betting model breaks down as risk gets high (either by high stakes or extreme odds), since we bet to maximize a utility function that is not linear in money.<br><br>What other techniques do you use, and how do you weight them?<br><br><strong>Footnotes:</strong><br><br><strong>1:</strong> A common name covers about 3% of the population, so p(b|!a) = 0.03^4 for 4 consecutive correct guesses, and p(b|a) ~= 1 for sake of simplicity. Since p(a) is small, (1-p(a)) is approximated as 1.</p>
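The footnote's arithmetic can be checked directly; a small sketch of the Bayes calculation using the post's own numbers (3% common-name rate, four correct guesses):

```python
# p(a) = prior that he has the claimed skill; p(b|!a) = chance of four
# lucky guesses of common names (~3% of the population each).
def posterior(prior, p_data_given_skill, p_data_given_fluke):
    # Bayes' law: p(a|b) = p(b|a)p(a) / (p(b|a)p(a) + p(b|!a)(1 - p(a)))
    num = p_data_given_skill * prior
    return num / (num + p_data_given_fluke * (1.0 - prior))

p_fluke = 0.03 ** 4   # ~8.1e-7: roughly one in a million
# If four correct calls leave you at 50/50, your prior must equal p(b|!a):
print(posterior(p_fluke, 1.0, p_fluke))  # ~0.5
```

Setting the posterior to 0.5 and solving backwards is exactly the trick in heuristic 2: the evidence needed to reach 50% reveals the prior you were implicitly assigning.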
<p>p(a|b) = p(b|a)*p(a)/p(b) = p(b|a)*p(a)/(p(b|a)*p(a)+p(b|!a)*(1-p(a))) ⇒ approximately 0.5 = p(a)/(p(a)+0.03^4) ⇒ p(a) = 0.03^4 ~= <span class="frac"><sup>1</sup>⁄<sub>1,000,000</sub></span><br><br><strong>2:</strong> The idea came from <a href="https://www.greaterwrong.com/posts/ZEj9ATpv3P22LSmnC/selecting-rationalist-groups">paranoid debating</a>, where Steve Rayhawk assumed a Cauchy distribution. I tried to fit some data I had taken myself, but had insufficient statistics to figure out what the real shape is (if you guys have a bunch more data I could try again). It’s also worth noting that the shape of one’s probability distribution can change significantly from question to question, so this would only apply in some cases.</p>

by jimmy (6Bz4TK37T8t5S3AbM), Wed, 29 Apr 2009 08:35:01 +0000

Metauncertainty by jimmy
https://www.greaterwrong.com/posts/LKHJ2Askf92RBbhBp/metauncertainty
<p><strong>Response to:</strong> <a href="https://www.greaterwrong.com/posts/AJ9dX59QXokZb35fk/when-not-to-use-probabilities">When (Not) To Use Probabilities</a> <br><br>“It appears to be a quite general principle that, whenever there is a randomized way of doing something, then there is a nonrandomized way that delivers better performance but requires more thought.” —E. T. Jaynes<br><br>The uncertainty due to vague (non-math) language is no different from uncertainty by way of “randomizing” something (after all, <a href="https://www.greaterwrong.com/posts/f6ZLxEWaankRZ2Crv/probability-is-in-the-mind">probability is in the mind</a>). The principle still holds; you should be able to come up with a better way of doing things if you can put in the extra thought.<br><br>In some cases, you can’t afford to waste time or it’s not worth the thought, but when dealing with things such as deciding whether to run the LHC or signing up for cryonics, there’s time, and it’s sorta a big deal, so it pays to do it right.<br><br>If you’re asked “how likely is X?”, you can answer “very unlikely” or “0.127%”. The latter may give the impression that the probability is known more precisely than it is, but the first is too vague; both strategies do poorly on the <a href="http://yudkowsky.net/rational/technical">log score</a>.<br><br>If you are unsure what probability to state, state this with… another probability distribution.</p>
<p><a id="more"></a>“My probability distribution over probabilities is an exponential with a mean of 0.127%” isn’t vague, it isn’t overconfident (at the meta^1 level), and gives you numbers to actually bet on.<br><br>The expectation value of the metaprobability distribution (integral from 0 to 1 of Pmeta(p)*p*dp) is equal to the probability you give when trying to maximize your expected log score.<br><br>To see this, we write out the expected log score (integral from 0 to 1 of Pmeta(p)*(p*log(q)+(1-p)*log(1-q))*dp). If you split this into two integrals and pull out the terms that are independent of p, the integrals just turn into the expectation value of p, and the formula is now that of the log score with p replaced with mean(p). We already know that the log score is maximized when q = p, so in this case we set q = mean(p).<br><br>This is a very useful result when dealing with extremes where we are not well calibrated. Instead of punting and saying “err… prolly aint gonna happen”, put a probability distribution on your probability distribution and take the mean. For example, if you think X is true, but you don’t know if you’re 99% sure or 99.999% sure, you’ve got to bet at ~99.5%.<br><br>It is still no guarantee that you’ll be right 99.5% of times (by assumption we’re not calibrated!), but you can’t do any better given your metaprobability distribution.<br> <br>You’re not saying “99.5% of the time I’m this confident, I’m right”. You’re just saying “I expect my log score to be maximized if I bet on 99.5%”. The former implies the latter, but the latter does not (necessarily) imply the former.<br><br>This method is much more informative than “almost sure”, and gives you numbers to act on when it comes time to “shut up and multiply”. Your first set of numbers may not have “come from numbers”, but the ones you quote now do, which is an improvement. Theoretically this could be taken up a few steps of meta, but once is probably enough.</p>
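The claim that you should bet the mean of your metaprobability distribution can be checked numerically; a small sketch using the post's 99% vs 99.999% example (the 50/50 mixture weights are my own assumption):

```python
import math

# Metaprobability: equally unsure whether we're 99% or 99.999% confident.
meta = [(0.99, 0.5), (0.99999, 0.5)]   # (probability, meta-weight)
mean_p = sum(p * w for p, w in meta)   # ~0.995

def expected_log_score(q):
    """Expectation over the metaprobability of p*log(q) + (1-p)*log(1-q)."""
    return sum(w * (p * math.log(q) + (1 - p) * math.log(1 - q))
               for p, w in meta)

# Scan candidate bets: the maximizer sits at the mean, not at either mode.
best_q = max((k / 1000 for k in range(1, 1000)), key=expected_log_score)
print(mean_p, best_q)  # the best grid bet lands next to mean_p (~0.995)
```

Betting either mode (0.99 or 0.99999) scores strictly worse in expectation than betting the mean, which is the whole argument for taking the expectation value.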
<p>Note: <a href="https://www.greaterwrong.com/posts/dz3Mmr2Cykz6RRfhK/rationality-cryonics-and-pascal-s-wager#comment-zRGFpNd8cNiQkPdEN">Anna Salamon’s comment</a> makes this same point.</p>

by jimmy (LKHJ2Askf92RBbhBp), Fri, 10 Apr 2009 23:41:52 +0000