This looks really interesting! Your special notation for composing functions makes this a lot harder to read for me, though. I’ve read up to ‘Embeddings and Projections’, but I have to take a break here and set extra time aside to read the rest at a slower pace. I’m happy the notation works for you, but it’s really jarring for me.
Already partially mentioned by others, including OP.
I usually start by comparing the conclusion with my expectations (I’m painfully aware that this creates a confirmation bias, but what else am I supposed to compare it with?). If they are sufficiently different, I try to imagine how, using the method described by the authors, I would be able to get a positive result in their experiment conditional on my priors being true, i.e. their conclusion being false. This is basically the same as trying to figure out how I would run the experiment and which data would disprove my assumptions, and then seeing if the published results fall in that category.
Usually the buck stops there: most published research uses methods that are sufficiently flimsy that (again, conditional on my priors) it is very likely the result was a fluke. This approach is pretty much the same as your third bullet point, and also waveman’s point number 5. I would like to stress, though, that it’s almost never enough to have a checklist of “common flaws in method sections” (although again, you have to start somewhere). Unfortunately, different strengths and types of results in different fields require different methods.
A small Bayesian twist on the interpretation of this approach: when you’re handed a paper (that doesn’t match your expectations), that is evidence of something. I’m specifically looking at the chance that, conditional on my priors being accurate, the paper I’m given would still have been published.
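To make that twist concrete, here is a toy version of the update; every number below is invented purely for illustration, not taken from any real field:

```python
# Toy Bayesian update: how much should seeing a surprising paper
# published shift my belief in its claim? All numbers are made up.
p_claim_true = 0.2    # prior that the surprising claim is true
p_pub_if_true = 0.9   # chance such a paper gets published if the claim is true
p_pub_if_false = 0.3  # chance it still gets published if the claim is false
                      # (flimsy methods, publication bias, flukes)

posterior = (p_pub_if_true * p_claim_true) / (
    p_pub_if_true * p_claim_true + p_pub_if_false * (1 - p_claim_true)
)
print(f"P(claim true | paper published) = {posterior:.2f}")
```

The point of the exercise: the higher you think the false-positive publication rate is in a field, the less a published surprising result should move you.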
Most of the problems you mention deal with limited battery life. I have never had any issues with this on my Huawei Nova, and in fact I only charge it every 2 or 3 days with normal use. Would you still recommend dual wielding if battery concerns are not an issue?
Is this based on https://www.wired.com/story/airline-emissions-carbon-offsets-travel/ ? I think https://slatestarcodex.com/2019/08/05/links-8-19/#comment-783425 might be relevant—it seems only the increase compared to 2019 is being offset, not the whole flight.
I think a very significant (probably even dominant) fraction of this geoengineering project would not be the industrial aspect but the organisational and political aspects. Building some ships sounds very doable (although I don’t know to what extent “spraying water” and “autonomous” are assembly-line projects; do we already have industries that make ships like this?), coordinating around letting them sail around and alter the atmosphere less so.
Just to share my two cents on the matter, the distinction between abstract vectors and maps on the one hand, and columns with numbers in them (confusingly also called vectors) and matrices on the other hand, is a central headache for Linear Algebra students across the globe (and by extension also for the lecturers). If the approach this book takes works for you then that’s great to hear, but I’m wary of ‘hacks’ like this that only supply a partial view of the distinction. In particular, matrix-vector multiplication is something that’s used almost everywhere; if you need several translation steps to make use of this, that could be a serious obstacle. Also the base map ΦB:Fn→V that limerott mentions is of central importance from a category-theoretic point of view and is essential in certain more advanced fields, for example in differential geometry. I’m therefore not too keen on leaving it out of a Linear Algebra introduction.
Unfortunately I don’t really know what to do about this, like I said this topic has always caused major confusion and the trade-off between completeness and conciseness is extremely complicated. But do beware that, based on only my understanding of your post, you might still be missing important insights about the distinction between numerical linear algebra and abstract linear algebra.
I think ‘competent’ should in this context mean something like ‘has the ability to, after being pointed to a gap in the market, build and/or keep functional a company that fills this gap’. This agrees fully with what you said; there is an enormous amount of wiggle room between ‘total retard’ and this sense of competent (in fact, I think almost everybody lives in this wiggle room). Furthermore, I think it makes sense to naively think that the abundance of successful companies suggests a lot of people are competent in this sense, whereas I claim this is not the case.
Very interesting observations! Personally I’d perhaps phrase it the other way around, not ‘incompetence is killing corporations’ but more something like ‘what changed in the past 70 years that allowed people to build long-living corporations back then and not now, assuming today’s regular company deaths are caused by incompetence?’. My personal guess is that either back when these long-living companies were founded (~1890’s) there was much more low-hanging fruit on the market, allowing less efficient companies to still survive, or alternatively that today’s economic environment is much more risk-tolerant so the selection for competence happens much more *after* founding a company.
I agree fully with the government bureaucracy remark, although I suspect there are a ton of other very important effects at work there too (for example, out of all organisations I expect governments in particular to have high accountability and regular run-ins with Chesterton’s fence, both of which increase bureaucratic load).
I personally think we don’t need to posit a mechanism that explains why people’s wrong beliefs don’t cause immediate disaster for companies. In my worldview this is fully explained by selection effects in the market, both at the level of organisations and at the level of individual employees. Since long-term views are very hard to link to individual outcomes, the selection pressure is weaker here.
I’d like to point out that this does suggest that organisations and companies fail and go bankrupt regularly, we just don’t hear that much about the quick failures (which I think fits reasonably well with observations, but I haven’t looked into this all that much).
This is in fact also an/my answer to the non-rhetorical question why anything works at all. I disagree with Kirkpatrick in attributing this to individuals, which seems to suggest there is some class of millions of managers who have attained some mystical level of competence that somehow doesn’t scale to groups.
This is part of the meaning of ‘utility’. In real life we often have risk-averse strategies where, for example, 100% chance at 100 dollars is preferred to 50% chance of losing 100 dollars and 50% chance of gaining 350 dollars. But, under the assumption that our risk-averse tendencies satisfy the coherence properties from the post, this simply means that our utility is not linear in dollars. As far as I know this captures most of the situations where risk-aversion comes into play: often you simply cannot tolerate extremely negative outliers, meaning that your expected utility is mostly dominated by some large negative terms, and the best possible action is to minimize the probability that these outcomes occur.
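A tiny numerical illustration of that point, reusing the dollar amounts from the example above; the starting wealth and the square-root utility function are my own made-up assumptions:

```python
import math

wealth = 200         # assumed starting wealth (made up for illustration)
utility = math.sqrt  # a concave utility: diminishing returns in dollars

# Option A: a sure gain of 100 dollars.
eu_sure = utility(wealth + 100)
# Option B: 50% lose 100 dollars, 50% gain 350 dollars.
eu_gamble = 0.5 * utility(wealth - 100) + 0.5 * utility(wealth + 350)

ev_sure = wealth + 100
ev_gamble = 0.5 * (wealth - 100) + 0.5 * (wealth + 350)

print(ev_gamble > ev_sure)  # the gamble has MORE expected dollars...
print(eu_sure > eu_gamble)  # ...but LESS expected utility, so A is preferred
```

So the risk-averse choice is perfectly consistent with maximising expected utility, as long as utility is not linear in dollars.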
Also there is the following: consider the case where you are repeatedly offered bets of the example you give (B versus C). You know this in advance, and are allowed to redesign your decision theory from scratch (but you cannot change the definition of ‘utility’ or the bets being offered). What criteria would you use to determine if B is preferable to C? The law of large numbers(/central limit theorem) states that in the long run with probability 1 the option with higher expected value will give you more utilons, and in fact that this number is the only number you need to figure out which option is the better pick in the long run.
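A toy simulation of the long-run claim, with made-up payoffs standing in for B and C (the concrete bets and numbers here are my own, not from the post):

```python
import random

random.seed(0)

def bet_B():  # made-up stand-in: a sure 100 utilons
    return 100

def bet_C():  # made-up stand-in: 50% lose 100, 50% gain 350 (expected value 125)
    return -100 if random.random() < 0.5 else 350

n = 10_000
total_B = sum(bet_B() for _ in range(n))
total_C = sum(bet_C() for _ in range(n))

# By the law of large numbers, the option with the higher expected value
# pulls ahead with probability approaching 1 as n grows.
print(total_C > total_B)
```

Note that for small n the comparison can still go either way; the expected value only settles the question in the long run.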
The tricky bit is the question whether this also applies to one-shot problems. Maybe there are rational strategies that use, say, the aggregate median instead of the expected value, which has the same limit behaviour. My intuition is that this clashes with what we mean by ‘probability’ - even if this particular problem is a one-off, at least our strategy should generalise to all situations where we talk about probability 1⁄2, and then the law of large numbers applies again. I also suspect that any agent that uses more information than the expected value to make this decision (in particular, one that occasionally deliberately chooses the option with lower expected utility) can be cheated out of utilons with clever adversarial selections of offers, but this is just a guess.
I think your first remark is exactly the point. If the visits are useless then this is a crappy doctor scamming money and time out of patients and insurance companies, if the visits are important then asking OP’s friend to come in (for being over 4 months late on a 3-month checkup) sounds very reasonable to me. I think Zyryab’s suggestion of asking a doctor to Turing Test this makes a lot of sense—maybe the checkups are more valuable in certain life stages/demographics/early after diagnosis? Maybe the checkup is something more complicated than recording the HbA1c levels? I’m surprised to hear that without outside medical information the doctor is guilty until proven innocent.
I’m really surprised this is being downvoted so much.
As far as I can tell (and frankly I don’t care enough to put serious effort towards finding more information, but I do note nobody in the comments started with “I am a doctor” or “After talking about this with my own doctor, …”) OP’s friend was in a life-threatening situation, the solution to which is a renewed insulin prescription. On top of that, the doctor/medical establishment enforces the rule that people (only young people? only people who recently developed diabetes? There could be a good medical reason here, I don’t know) with Type I Diabetes have regular checkups.
Now I imagine there are all sorts of reasons for wanting to skip this checkup. Maybe the checkup isn’t needed, and is just a money scam (small aside: if my doctor tells me I need a regular checkup, this is not my first thought. But individual situations can vary). Maybe the doctor’s schedule is so unreasonable that it’s impossible to make an appointment. There could be thousands of valid reasons. The problem as I see it is that, from the point of view of both the doctor and the nurse, they are only negotiating over the checkup. You mention right at the start that the nurse offered a solution (“drop everything and come see your doctor tomorrow”) - from that point on the situation was no longer life-threatening! There was no realistic scenario in which this would cost your friend more than the plans they made for the next day! You were just haggling over what is more important, your friend’s schedule or the rules set by the medical establishment that you need an active prescription to get insulin and you need a checkup to renew your prescription. Guess which one the nurse is going to find more important.
I understand if it feels like your friend is being blackmailed by the doctor (and in fact it seems like they are), but by refusing to visit the next day you are the ones who escalated the situation. And then escalated even further by threatening with media exposure. I think from the point of view of the nurse your friend is showing rather hostile behaviour. I’ll take the liberty of going through the phone call as you posted it, filling in how I expect nurses to act:
The nurse tells my friend he needs to go see his doctor, because it has been seven months, and the doctor feels he should see his doctor every three.
Probably standard procedure. At any rate this decision is out of the nurse’s hands, so they are just providing information here.
My friend replies that he agrees he should see his doctor, and he has made an appointment in a few weeks when he has the time to do that.
The nurse says that he can’t get his prescription refilled until he sees the doctor.
Still standard. Nurses don’t get to overrule conditions doctors set for medication, if the doctor says a checkup is needed then the nurse has no way of handing over insulin.
My friend explains that he does not have the time to drop what he is doing and see the doctor the next day. That he is happy to see the doctor in a few weeks. But that until then, he requires insulin to live.
The nurse says that he can’t get his prescription refilled until he sees the doctor. That if he wants it earlier he can find another doctor.
Still the same issue. The nurse doesn’t have the authority to overrule the conditions set by the doctor. Also I’m missing a sentence here, who introduced talking to the doctor the very next day?
My friend explains again that he does not have the time to see any doctor the next day, nor can one find a doctor on one day’s notice in reasonable fashion. And that he has already made an appointment, and needs insulin to live. And would like to speak with the doctor.
The nurse refuses to get the prescription filled. The nurse does not offer to let him speak to the doctor, and says that he can either wait, make an appointment for the next day, or find a new doctor.
So apparently making an appointment on one day’s notice is very doable on the doctor’s side. By this point you are solidly haggling about time, not medicine. I also think the nurse could have let you speak with the doctor here. But I think it’s also plausible that they get/did in the past get phone calls from all kinds of entitled weirdos who refuse to show up to appointments, and at this moment it’s really not clear your friend is not one of them. Why would their day plans be more important?
My friend points out that without insulin, he will die. He asks if the nurse wants him to die. Or what the nurse suggests he do instead, rather than die.
This seems not to get through to the nurse, because my friend asks these questions several times. The nurse does not offer to refill the prescription, or let my friend talk to the doctor.
My friend says that if the doctor does not give him access to life saving medicine and instead leaves him to die, he will post about it on social media.
The nurse now decides, for the first time in the conversation, that my friend should perhaps talk to his doctor.
Really? Your friend escalates from “I don’t want to visit you tomorrow” to “that means you must want me to die”, which of course the sensible nurse ignores, and your strategy was to repeat it a few more times? Yeah, you really showed them there. I bet the nurse immediately realised they were wrong the first time, and connected you through with the doctor before you got to the third repetition. From their point of view you’ve refused a good solution to the problem and are now just bugging them to make your life easier (who likes going to checkups? Nobody. So who haggles about not wanting to show up? Well, not everybody, but more than just your friend I bet). And at that point your strategy is to escalate even more by threatening media exposure, and put even more pressure on that poor nurse? I’m not surprised the doctor claimed you are blackmailing them after this.
What was your goal of the conversation with the nurse in the first place? You need a doctor’s prescription for the insulin, so shouldn’t you have aimed for talking with the doctor? And if that was your goal, what purpose did it serve to tighten the screws on the nurse? You should have acted like a model patient and calmly requested you speak with the doctor, who can (and did) overrule the normal medical process just to give you life-saving medicine.
I guess that became a far longer monologue than I planned; I’m not going to go through the phone call with the doctor because it’s just more of the same. I think OP is in the wrong here, at the very least in their interaction with the nurse. And I do agree that this is a bad medical system, but you really can’t lump together the co-pay costs, the lack of automatic prescription extensions or prescriptions large enough to last you a long time, and your interaction with the nurse and doctor, and pretend it is all the fault of “the American medical system”. The overall structure sucks, but some of these people are just local actors who cannot make a change, and your friend threatened them to not have to change their schedule.
I have a bit of time on my hands, so I thought I might try to answer some of your questions. Of course I can’t speak for TurnTrout, and there’s a decent chance that I’m confused about some of the things here. But here is how I think about AUP and the points raised in this chain:
“AUP is not about the state”—I’m going to take a step back, and pretend we have an agent working with AUP reasoning. We’ve specified an arcane set of utility functions (based on air molecule positions, well-defined human happiness, continued existence, whatever fits in the mathematical framework). Next we have an action A available, and would like to compute the impact of that action. To do this our agent would compare how well it would be able to optimize each of those arcane utility functions in the world where A was taken, versus how well it would be able to optimize these utility functions in the world where the rest action was taken instead. This is “not about state” in the sense that the impact is determined by the change in the ability for the agent to optimize these arcane utilities, not by the change in the world state. In the particular case where the utility function is specified all the way down to sensory inputs (as opposed to elements of the world around us, which have to be interpreted by the agent first) this doesn’t explicitly refer to the world around us at all (although of course implicitly the actions and sensory inputs of the agent are part of the world)! The thing being measured is the change in ability to optimize future observations, where what is a ‘good’ observation is defined by our arcane set of utility functions.
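The comparison described above can be sketched in a few lines. To be clear, this is my own minimal toy version, not TurnTrout’s actual formalism: `q_value(u, s)` stands in for “how much of utility u the agent could still attain from s” (an optimal value function in the real thing), and all names and numbers are invented:

```python
def aup_penalty(state, action, step, q_value, aux_utilities):
    """Total change in attainable utility caused by `action`,
    compared against taking the rest/null action instead."""
    s_action = step(state, action)
    s_noop = step(state, None)  # the rest action
    return sum(
        abs(q_value(u, s_action) - q_value(u, s_noop))
        for u in aux_utilities
    )

# Toy instantiation: 'state' is a single number standing for the agent's
# power, and each arcane utility's attainable value scales with that power.
aux = [lambda p, k=k: k * p for k in (1.0, 2.5, 0.3)]
step = lambda p, a: p + (a or 0)
q = lambda u, p: u(p)

print(aup_penalty(10, 5, step, q, aux))   # gaining power: penalised
print(aup_penalty(10, -9, step, q, aux))  # losing power: also penalised
```

Even this crude toy shows the key property discussed below: both large gains and large losses of power change attainable utility for arbitrary utility functions, and therefore register as high impact.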
“overfitting the environment”—I’m not too sure about this one, but I’ll have a crack at it. I think this should be interpreted as follows: if we give a powerful agent a utility function that doesn’t agree perfectly with human happiness, then the wrong thing is being optimized. The agent will shape the world around us to what is best according to the utility function, and this is bad. It would be a lot better (but still less than perfect) if we had some way of forcing this agent to obey general rules of simplicity. The idea here is that our bad proxy utility function is at least somewhat correlated with actual human happiness under everyday circumstances, so as long as we don’t suddenly introduce a massively powerful agent optimizing something weird (oops) to massively change our lives we should be fine. So if we can give our agent a limited ‘budget’ - in the case of fitting a curve to a dataset this would be akin to the number of free parameters—then at least things won’t go horribly wrong, plus we expect these simpler actions to have less unintended side-effects outside the domain we’re interested in. I think this is what is meant, although I don’t really like the terminology “overfitting the environment”.
“The long arms of opportunity cost and instrumental convergence”—this point is actually very interesting. In the first bullet point I tried to explain a little bit about how AUP doesn’t directly depend on the world state (it depends on the agent’s observations, but without an ontology that doesn’t really tell you much about the world); instead all its gears are part of the agent itself. This is really weird. But it also lets us sidestep the issue of human value learning—if you don’t directly involve the world in your impact measure, you don’t need to understand the world for it to work. The real question is this one: “how could this impact measure possibly resemble anything like ‘impact’ as it is intuitively understood, when it doesn’t involve the world around us?” The answer: “The long arms of opportunity cost and instrumental convergence”. Keep in mind we’re defining impact as change in the ability to optimize future observations. So the point is as follows: you can pick any absurd utility function you want, and any absurd possible action, and odds are this is going to result in some amount of attainable utility change compared to taking the null action. In particular, precisely those actions that massively change your ability to make big changes to the real world will have a big impact even on arbitrary utility functions! This sentence is so key I’m just going to repeat it with more emphasis: the actions that massively change your ability to make big changes in the world—i.e. massive decreases of power (like shutting down) but also massive increases in power—have big opportunity costs/benefits compared to the null action for a very wide range of utility functions. So these get assigned very high impact, even if the utility function set we use is utter hocus-pocus! Now this is precisely instrumental convergence, i.e. 
the claim that for many different utility functions the first steps of optimizing them involves “make sure you have sufficient power to enforce your actions to optimize your utility function”. So this gives us some hope that TurnTrout’s impact measure will correspond to intuitive measures of impact even if the utility functions involved in the definition are not at all like human values (or even like a sensible category in the real world at all)!
“Wirehead a utility function”—this is the same as optimizing a utility function, although there is an important point to be made here. Since our agent doesn’t have a world-model (or at least, shouldn’t need one for a minimal working example), it is plausible the agent can optimize a utility function by hijacking its own input stream, or something of the sorts. This means that its attainable utility is at least partially determined by the agent’s ability to ‘wirehead’ to a situation where taking the rest action for all future timesteps will produce a sequence of observations that maximizes this specific utility function, which if I’m not mistaken is pretty much spot on the classical definition of wireheading.
“Cut out the middleman”—this is similar to the first bullet point. By defining the impact of an action as our change in the ability to optimize future observations, we don’t need to make reference to world-states at all. This means that questions like “how different are two given world-states?” or “how much do we care about the difference between two world-states?” or even “can we (almost) undo our previous action, or did we lose something valuable along the way?” are orthogonal to the construction of this impact measure. It is only when we add in an ontology and start interpreting the agent’s observations as world-states that these questions come back. In this sense this impact measure is completely different from RR: I started to write exactly how this was the case, but I think TurnTrout’s explanation is better than anything I can cook up. So just ctrl+F “I tried to nip this confusion in the bud.” and read down a bit.
Thanks, that addresses the concerns I had!
I think the evidence presented is way too weak to support the type of conclusions drawn in this piece. I mean really, we’re computing doubling times by taking logarithms of estimated GDP, inserting an arbitrary offset in our definition of the horizontal axis and then plotting THAT on a log-log scale? What were you expecting to find?
More specifically: the horizontal labels of the most recent data points are heavily influenced by the particular choice of 2020 offset. I’ve taken the liberty of repeating (I hope) Scott’s analysis with the data from the paper, and swapping the offset to 2050 or even 2100 bunches the last data points a lot closer together, allowing a linear fit to pretty much pass through them. I think some argument can be made that we need a higher time resolution in an era with a doubling time of ~20 years compared to an era with a doubling time of ~500 years, but I’m still not happy with how sensitive this analysis is and would love to hear why 2020 is a better choice than 2100.
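To illustrate the sensitivity I mean, here is a minimal sketch. The years below are invented stand-ins for the most recent data points, not the paper’s actual data; the only point is how much their horizontal log-scale positions depend on the choice of offset year:

```python
import math

# Made-up example years for the most recent 'economic era' data points.
years = [1700, 1900, 1950, 2000]

for offset in (2020, 2050, 2100):
    # Horizontal coordinate on the log axis: log10(offset - year).
    coords = [math.log10(offset - y) for y in years]
    print(offset, [round(c, 2) for c in coords])
# With offset 2020 the last points are spread far apart on the log axis;
# with offset 2100 they bunch together, so a straight line through the
# earlier points passes much closer to them.
```

Nothing about the underlying GDP data changes between these three plots; only the arbitrary offset does, and the visual impression of the recent points changes dramatically with it.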
Also I notice that Scott left a bunch of data points from the paper out of the graph. I can live with excluding the really early ones (before 10000 B.C.), but why skip the ones near 0 A.D.? The 1100s and 1200s? And where are the data points with negative doubling times (i.e. declining GDP)? Maybe I missed it, but I don’t see any mention of these.
Yes, I think you’re right. Personally I think this is where the charitable reading comes in. I’m not aware of Einstein specifically stating that there have to be hidden variables in QM, only that he explicitly disagreed with the nonlocality (in the sense of general relativity) of Copenhagen. In the absence of experimental proof that hidden variables is wrong (through the EPR experiments) I think hidden variables was the main contender for a “local QM”, but all the arguments I can find Einstein supporting are more general/philosophical than this. In my opinion most of these criticisms still apply to the Copenhagen Interpretation as we understand it today, but instead of supporting hidden variables they now support [all modern local QM interpretations] instead.
Or more abstractly: Einstein backed a category of theories, and the main contender of that category has been solidly busted (ongoing debate about hidden variables blah blah blah I disagree). But even today I think other theories in that pool still come ahead of Copenhagen in likelihood, so his support of the category as a whole is justified.
I feel like I’m walking into a trap, but here we go anyway.
Einstein disagreed with some very specific parts of QM (or “QM as it was understood at the time”), but also embraced large parts of it. Furthermore, on the parts Einstein disagreed with there is still to this day ongoing confusion/disagreement/lack of consensus (or, if you ask me, plain mistakes being made) among physicists. Discussing interpretations of QM in general and Einstein’s role in them in particular would take way too long but let me just offer that, despite popular media exaggerations, with minimal charitable reading it is not clear that he was wrong about QM.
I know far less about Einstein’s work on a unified field theory, but if we’re willing to treat absence of evidence as evidence of absence here then that is a fair mark against his record.
I think this is an interesting idea, but doesn’t really intersect with the main post. The marginal benefits of reaching a galaxy earlier are very very huge. This means that if we are ever in the situation where we have some probes flying away, and we have the option right now to build faster ones that can catch up, then this makes the old probes completely obsolete even if we give the new ones identical instructions. The (sunk) cost of the old probes/extra new probes is insignificant compared to the gain from earlier arrival. So I think your strategy is dominated by not sending probes that you feel you can catch up with later.
Well, I still don’t have any experience with this. But maybe possible avenues include:
Looking into moderation rules.
Including some kind of reputation/point/reward system, and other methods to keep your users engaged.
Tracking metrics on the growth of the Site, and ideally having some advance expectations/plans on how to respond to different rates of growth/decline.
A more radical approach might be to give up on phase 2 and beyond in their entirety, and settle for a target audience of people close enough to you that you can reasonably trust them.
The survivorship bias is a very valid point, but [not doing research on how to make websites grow] is also a poor strategy. Personally I’d still look into the advice, but I’m afraid what you’re trying to do is simply very difficult.
Epistemic status: worried about effort/time lost.
I am by no means experienced with any of this, and seriously considered not writing anything at all. But it only takes me a bit of time (an hour max) to write down why I feel the odds are very strongly against you, and if you are serious about pursuing this idea then, even at a low probability that my comment is helpful to you, writing it is worth it on average. So here we go.
During my read of the post, top-to-bottom, at the part
On matters of truth, it needs to support epistemic arguments for why we should believe or not believe particular claims. On matters of action, it needs to provide important pro/cons of taking that action. Site must have a method of allowing the best arguments to rise to the top.
my internal monologue went “The first bit is difficult but perhaps possible. The second is a mess. Oh dear, the third is basically impossible!”. The sentence immediately after, explaining that this functionality would be the bare basics, shocked me quite a lot. I think aiming for the quoted section is nigh-impossible, and then we haven’t started on the possible additional features you mention. Your post strongly reminds me of Benjamin Hoffman’s piece on Anglerfish (in my opinion worth reading in full), and also a bit of a segment (near the start) in one of Eliezer’s posts on security mindset—where the character Amber makes the mistake of thinking that the critical part of her startup is the technology, where really it is the security. I think in a similar manner your Site would, besides depending on the UI, the back-end, the marketing etc., also depend critically on its ability to continue growing during certain critical phases, and the lack of discussion on this as a plausible failure mode is making me rather pessimistic.
In my mind, conditional on Site eventually operating as intended, it should grow through several phases. First you have a low number of users (~100 regular users? Sorry, I don’t have experience with this) who basically filtered in from your social circles, and are able to aggregate their opinions/thoughts as intended. Then in the next phase Site grows more popular as people notice this is a valuable source of truth/plans/speculation, and they provide new questions and answers covering broader topics. After that there should be some third phase where Site is diverse and big enough that all those extra features you mentioned might become plausible to implement (I’ll come back to this later).
My problem lies with the second phase. Benjamin’s piece suggests that as soon as Site is big enough to have any real value, this immediately creates incentives for outsiders to try to abuse/free-ride on the project (for example through manipulating the questions or voting). This would be worse on discussions on *actions*, which is why at the start I mentioned that that is more difficult than discussing *truth*. Your wish to keep Site crowd-sourced makes it more difficult to guard against this phenomenon, and to me Eliezer’s writing on security mindset suggests that if you don’t treat this problem as central the odds are strongly against you. It is unclear to me what motivates people to keep coming back to Site in this second phase if they disagree with a large part of the demographic/consensus, or in general why echo-chamber effects would not apply. In fact, it is unclear to me why people would spend time participating in discussions outside their immediate interests at all (see also for example evaporative cooling).
Lastly I think a large part of Site would only function after you have some critical mass of users to have sufficient discussion on a lot of different topics. This is troubling as it means those parts existing at all is conditional on Site being a success. In the spirit of “If you’re not growing you’re shrinking” I think a lot more time and effort should be focused on figuring out how to obtain and keep a userbase, and introducing fancy features is downstream from this.
Sorry for being so critical and nonconstructive. I don’t know how to solve any of these problems, but like I said at the start it felt like a wrong strategy to just stay quiet. I hope I’m wrong about most/all of this, and let me as a closure mention again that I don’t have experience with this at all.