DanielFilan’s Shortform Feed
Rationality-related writings that are more comment-shaped than post-shaped. Please don’t leave top-level comments here unless they’re indistinguishable to me from something I would say here.
The below is the draft of a blog post I have about why I like AI doom liability. My dream is that people read it and decide “ah yes this is the main policy we will support” or “oh this is bad for a reason Daniel hasn’t noticed and I’ll tell him why”. I think usually you’re supposed to flesh out posts, but I’m not sure that adds a ton of information in this case.
Why I like AI doom liability
AI doom liability is my favourite approach to AI regulation. I want to sell you all on it.
the basic idea
general approach to problems: sue people for the negative impacts
internalizes externalities
means that the people figuring out how to avoid harms are informed and aligned (rather than bureaucrats less aware of on-the-ground conditions / trying to look good / seeking power)
less fucked than criminal law, regulatory law
look at what hits the supreme court, which stuff ends up violating people’s rights the worst, what’s been more persistent over human history, what causes massive protests, etc.
first-pass approach to AI: sue for liabilities after AI takes over
can’t do that
so sue for intermediate disasters, get punitive damages for how close you were to AI takeover
intuition: pulling liability forward into places it can be paid, for same underlying conduct.
also mention strict liability, liability insurance
See Foom Liability (Hanson, 2023), Tort Law as a Tool for Mitigating Catastrophic Risk from Artificial Intelligence (Weil, 2024).
why it’s nice
liability when you’re more informed of risks, vs regulation now, when we know less
doesn’t require the right person in the right position
judged by juries informed by lawyers on both sides, not power-hungry politically constrained
we don’t really know what the right way to make safe AI is right now
good in high-risk worlds or low-risk worlds—as long as you believe in intermediate disasters
intermediate disasters seem plausible because slow takeoff
more fair: ai companies can’t get away with making egregiously unsafe AI, but they’re not penalized for doing stuff that is actually harmless.
difficulties with the proposal:
jury discretion
you could give the jury the optimal formula, which isn’t easy to plug numbers in, and give them a bunch of discretion how to apply it
or you could give them a more plug-and-play formula which sort of approximates the optimal formula, making things more predictable but less theoretically optimal.
it’s not clear how you want to trade off predictability with theoretical optimality, or what the trade-off even looks like (Hanson’s post is a bit more predictable but it’s unclear how predictable it actually is).
positive externalities
in a world where research produces positive externalities, it’s a really bad idea to force people to internalize all negative externalities
one way this is clear: open source AI. tons of positive externalities—people get to use AI to do cool stuff, and you can do research on it, maybe helping you figure out how to make AI more safely.
this regime, without tweaks, would likely make it economically unviable to open source large SOTA models. it’s unclear whether this is optimal.
I don’t know a principled way to deal with this.
Further note: this policy doesn’t work to regulate government-developed AGI, which is a major drawback if you expect the government to develop AGI. It also probably lowers the relative cost for the government to develop AGI, which is a major drawback if you think the private sector would do a better job of responsible AGI development than the government.
I think you could also push to make government liable as part of this proposal
You could but (a) it’s much harder constitutionally in the US (governments can only be sued if they consent to being sued, maybe unless other governments are suing them) and (b) the reason for thinking this proposal works is modelling affected actors as profit-maximizing, which the government probably isn’t.
Oh: it would be sad if there were a bunch of frivolous suits for this. One way to curb that without messing up optionality would be to limit such suits to large enough intermediate disasters.
You can’t always use liability to internalise all the externality because e.g. you can’t effectively sue companies for more than they have, and for some companies that stay afloat by regular fundraising rounds, that may not even be that much? like, if they’re considering an action that is a coinflip between “we cause some huge liability” and “we triple the value of our company” then it’s usually going to be correct from a shareholder perspective to take it, no matter the size of the liability, right?
Criminal law has the ability to increase the deterrent somewhat – probably many people will not accept any amount of money for a significant enough chance of prison – though obviously it’s not perfect either
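The coinflip argument above can be put in numbers. This is a minimal sketch, assuming shareholders lose at most the company's value (limited liability); the function names, the 50% probability and the 3x upside are illustrative stand-ins for the comment's "coinflip" and "triple the value", not anything from the OP's proposal.

```python
def equity_after_liability(company_value: float, liability: float) -> float:
    """Shareholder value after paying damages, floored at zero (limited liability)."""
    return max(company_value - liability, 0.0)


def expected_equity_if_gamble(
    company_value: float,
    liability: float,
    p_disaster: float = 0.5,      # assumption: the "coinflip"
    upside_multiple: float = 3.0,  # assumption: "triple the value of our company"
) -> float:
    """Expected shareholder value of taking the risky action under limited liability."""
    disaster_branch = equity_after_liability(company_value, liability)
    success_branch = upside_multiple * company_value
    return p_disaster * disaster_branch + (1.0 - p_disaster) * success_branch


def expected_value_if_fully_internalised(
    company_value: float,
    liability: float,
    p_disaster: float = 0.5,
    upside_multiple: float = 3.0,
) -> float:
    """Same gamble if the full liability could actually be collected."""
    disaster_branch = company_value - liability  # no floor at zero
    success_branch = upside_multiple * company_value
    return p_disaster * disaster_branch + (1.0 - p_disaster) * success_branch


if __name__ == "__main__":
    for liability in (10.0, 100.0, 1_000_000.0):
        capped = expected_equity_if_gamble(1.0, liability)
        full = expected_value_if_fully_internalised(1.0, liability)
        print(f"liability={liability}: capped={capped}, fully internalised={full}")
```

Under these assumptions the capped expectation stays at 1.5 times the company's current value however large the liability gets, which is why the deterrent stops scaling with the harm. Mandatory insurance or criminal penalties are attempts to restore the uncapped calculation.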
OP doesn’t emphasize liability insurance enough but part of the hope is that you can mandate that companies be insured up to $X00 billion, which costs them less than $X00 billion assuming that they’re not likely to be held liable for that much. Then the hope is the insurance company can say “please don’t do extremely risky stuff or your premium goes up”.
Hot take: if you think that we’ll have at least 30 more years of future where geopolitics and nations are relevant, I think you should pay at least 50% as much attention to India as to China. Similarly large population, similarly large number of great thinkers and researchers. Currently seems less ‘interesting’, but that sort of thing changes over 30-year timescales. As such, I think there should probably be some number of ‘India specialists’ in EA policy positions that isn’t dwarfed by the number of ‘China specialists’.
For comparison, in a universe where EA existed 30 years ago we would have thought it very important to have many Russia specialists.
Just learned that 80,000 hours’ career guide includes the claim that becoming a Russia or India specialist might turn out to be a very promising career path.
I’ve been wondering recently whether CFAR should try having some workshops in India for this reason. Far more people speak English than in China, and I expect we’d encounter fewer political impediments.
Also, anecdotally, there have been lots of Indian applicants (and attendees) at ESPR throughout the years. Seems like people there also think rationality is cool (lots of the people I interviewed had read HPMOR, there are LW meetups there, etc. etc.)
Also fyi, a nontrivial fraction of new users on LessWrong have Indian sounding usernames.
Brazil is another interesting place. In addition to the large populations and GDP, anecdotally based on online courses I’ve taken, philosophy meme groups etc, Brazilians seem more interested in Anglo-American academic ethics than people from China or India, despite the presumably large language barrier.
fwiw the global poverty part of EA already does a fair amount of work in India. I know EA is a bit (and increasingly) fragmented between different cause areas, but that still might be a useful entry point?
The Indian grammarian Pāṇini wanted to exactly specify what Sanskrit grammar was in the shortest possible length. As a result, he did some crazy stuff:
There are two surprising facts about this:
His grammar was written in the 4th century BC.
People then failed to build on this machinery to do things like formalise the foundations of mathematics, formalise a bunch of linguistics, or even do the same thing for languages other than Sanskrit, in a way that is preserved in the historical record.
I’ve been obsessing about this for the last few days.
A complaint about AI pause: if we pause AI and then unpause, progress will then be really quick, because there’s a backlog of improvements in compute and algorithmic efficiency that can be immediately applied.
One definition of what an RSP is: if a lab makes observation O, then they pause scaling until they implement protection P.
Doesn’t this sort of RSP have the same problem with fast progress after pausing? Why have I never heard anyone make this complaint about RSPs? Possibilities:
They do and I just haven’t seen it
People expect “AI pause” to produce longer / more serious pauses than RSPs (but this seems incidental to the core structure of RSPs)
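To make the structural point concrete, here is a toy model of overhang. It is only a sketch under loud assumptions: trainable "potential" capability grows at a fixed hypothetical rate every month whether or not anyone is scaling, and the whole backlog is applied the moment scaling resumes. The function and parameter names are invented for illustration.

```python
def deployed_capability(
    months: int,
    pause_start: int,
    pause_length: int,
    monthly_growth: float = 0.05,  # assumption: steady compute + algorithmic progress
) -> list[float]:
    """Deployed capability per month; frozen while paused, then catches up at once."""
    potential = 1.0
    deployed = 1.0
    history = []
    for month in range(months):
        potential *= 1.0 + monthly_growth  # the backlog keeps building during a pause
        paused = pause_start <= month < pause_start + pause_length
        if not paused:
            deployed = potential  # assumption: the backlog is applied immediately
        history.append(deployed)
    return history


def largest_monthly_jump(history: list[float]) -> float:
    """Largest month-over-month growth factor in deployed capability."""
    return max(later / earlier for earlier, later in zip(history, history[1:]))


if __name__ == "__main__":
    for pause_length in (0, 2, 6, 12):
        jump = largest_monthly_jump(deployed_capability(36, 6, pause_length))
        print(f"pause of {pause_length} months -> largest jump x{jump:.2f}")
```

Note that nothing in the model records why the pause happened. In this sketch the jump depends only on how long the pause lasts, so a policy-mandated pause and an RSP-triggered pause of the same length produce the same overhang.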
Basically I just agree with what James said. But I think the steelman is something like: you should expect shorter (or no) pauses with an RSP if all goes well, because the precautions are matched to the risks. Like, the labs aim to develop safety measures which keep pace with the dangers introduced by scaling, and if they succeed at that, then they never have to pause. But even if they fail, they’re also expecting that building frontier models will help them solve alignment faster. I.e., either way the overall pause time would probably be shorter?
It does seem like in order to not have this complaint about the RSP, though, you need to expect that it’s shorter by a lot (like by many months or years). My guess is that the labs do believe this, although not for amazing reasons. Like, the answer which feels most “real” to me is that this complaint doesn’t apply to RSPs because the labs aren’t actually planning to do a meaningful pause.
Good point!
Man, my model of what’s going on is:
The AI pause complaint is, basically, total self-serving BS that has not been called out enough
The implicit plan for RSPs is for them to never trigger in a business-relevant way
It is seen as a good thing (from the perspective of the labs) if they can lose less time to an RSP-triggered pause
...and these, taken together, should explain it.
The point that a capabilities overhang might cause rapid progress in a short period of time has been made by a number of people without any connections to AI labs, including me, which should reduce your credence that it’s “basically, total self-serving BS”.
More to the point of Daniel Filan’s original comment, I have criticized the Responsible Scaling Policy document in the past for failing to distinguish itself clearly from AI pause proposals. My guess is that your second and third points are likely mostly correct: AI labs think of an RSP as different from AI pause because it’s lighter-touch, more narrowly targeted, and the RSP-triggered pause could be lifted more quickly, potentially minimally disrupting business operations.
I think it’s not an unreasonable point to take into account when talking price, but also a lot of the time it serves as a BS talking point for people who don’t really care about the subtleties.
My guess is:
AI pause: no observation on what safety issue to address, work on capabilities anyways, then may lead to only capability improvements. (Assumption is that AI pausing means no releasing of models.)
RSP: observed O, shift more resources to work on mitigating O and less on capabilities, and when protection P is done, publish the model, then shift back to capabilities. (Ideally.)
I’m not saying there’s no reason to think that RSPs are better or worse than pause, just that if overhang is a relevant consideration for pause, it’s also a relevant consideration for RSPs.
I’d imagine that RSP proponents think that if we execute them properly, we will simply not build dangerous models beyond our control, period. If progress after pausing were faster than labs could handle, RSPs should imply that you’d just pause again. On the other hand, there are no clear criteria for when we would pause again after, say, a six-month pause in scaling.
Now whether this would happen in practice is perhaps a different question.
I think pause proponents think similarly!
Realized that I didn’t respond to this—PauseAI’s proposal is for a pause until safety can be guaranteed, rather than just for 6 months.
Are they the same people advocating for RSPs and also using compute/algorithm overhang as a primary argument against a pause? My understanding of the main argument in favor of RSPs over an immediate pause is:
Sure, we could continue to make some progress on safety if we paused other AI progress.
But:
we could make even more progress on safety if we could work with more advanced models; and
right now we have the necessary safety measures to create the next generation of models with low risk.
If AI progress continues without corresponding progress on safety, then (2.b) will no longer hold, so we should indeed pause at that time, hence the RSP.
If you believe that (2.a) and (2.b) are both true, then you can argue that RSPs are better than an immediate pause without referring to compute/algorithm overhang. If you believe that one of (2.a) and (2.b) is false, but are skeptical of a pause because you believe compute/algorithm overhang would increase risk (or at least negate the benefit), then it seems you should also be skeptical of RSPs.
I’m not saying that RSPs are or aren’t better than a pause. But I would think that if overhang is a relevant consideration for pauses, it’s also a relevant consideration for RSPs.
I agree that if overhang is a relevant consideration for pauses, then it’s also a relevant consideration for RSPs. My previous question was: Do you see the same people invoking overhang as an argument against pauses and also talking about RSPs as though they are not also impacted?
Maybe you’re not saying that there are people taking that position, but rather that those who invoke overhang as an argument against pauses don’t seem to be equally vocal against RSPs (if not necessarily in favor of them either). I can think of a couple of separate reasons this could be the case:
To the extent I think a pause is bad (for example, because of overhang), I might still be more motivated to prioritize arguing against “unconditional pause” than “maybe pause in the future”, even if the argument could apply to both. This is especially true if I consider the prospect of an unconditional pause a legitimate, near-term threat.
If I think a pause introduces a high, additional risk, and I think the base level of risk is low, it seems clear that I should not introduce that high risk. But if I get new evidence that there is an immediate, even-higher risk, which a pause could help mitigate, I should be willing to roll the dice on the pause, which now comes with a net reduction in risk.
(2) isn’t a very reassuring position, but it does suggest that “immediate pause bad because overhang” and “RSPs good [in spite of overhang]” are logically compatible.
I guess I’m not tracking this closely enough. I’m not really that focussed on any one arguer’s individual priorities, but more about the discourse in general. Basically, I think that overhang is a consideration for unconditional pauses if and only if it’s a consideration for RSPs, so it’s a bad thing if overhang is brought up as an argument against unconditional pauses and not against RSPs, because this will distort the world’s ability to figure out the costs and benefits of each kind of policy.
Also, to be clear, it’s not impossible that RSPs are all things considered better than unconditional pauses, and better than nothing, despite overhang. But if so, I’d hope someone somewhere would have written a piece saying “RSPs have the cost of causing overhang, but on net are worth it”.
As others have said, I believe AI pauses by governments would absolutely be more serious and longer, preventing overhangs from building up too much.
The big worry I do have with pause proposals in practice is that I expect most realistic pauses to buy us several years at most, not decades, because incentives will shift towards algorithmic progress, which isn’t very controllable by default. I also expect there to be at most 1 OOM of compute left to build AGI which scales to superintelligence by the time we pause. That makes a pause a very unstable policy: any algorithmic advance, like AI search actually working in complicated domains, would immediately blow up the pause, and there are likely strong incentives to break the pause once people realize what superintelligence means.
See here for one example:
https://yellow-apartment-148.notion.site/AI-Search-The-Bitter-er-Lesson-44c11acd27294f4495c3de778cd09c8d
Are you saying that overhangs wouldn’t build up too much under pauses because the government wouldn’t let it happen, or that RSPs would have less overhang because they’d pause for less long so less overhang would build up? I can’t quite tell.
That RSPs would have less overhang because they’d pause for less long so less overhang would build up.
I continue to think that agent foundations research is kind of underrated. Like, we’re supposed to do mechinterp to understand the algorithm models implement—but how do we know what algorithms are good?
It additionally seems likely to me that we are presently missing major parts of a decent language for talking about minds/models, and developing such a language requires (and would constitute) significant philosophical progress. There are ways to ‘understand the algorithm a model is’ that are highly insufficient/inadequate for doing what we want to do in alignment. For instance, even if one gets from where interpretability is currently to being able to replace a neural net by a somewhat smaller boolean (or whatever) circuit and is thus able to translate various NNs to such circuits and proceed to stare at them, one probably won’t thereby be more than 1/10 of the way to the kind of strong understanding that would let one modify a NN-based AGI to be aligned or build another aligned AI (in case alignment doesn’t happen by default) (much like how knowing the weights doesn’t deliver that kind of understanding). To even get to the point where we can usefully understand the ‘algorithms’ models implement, I feel like we might need to have answered sth like (1) what kind of syntax should we see thinking as having: for example, should we think of a model/mind as a library of small programs/concepts that are combined and updated and created according to certain rules (Minsky’s frames?), or as having a certain kind of probabilistic world model that supports planning in a certain way, or as reasoning in a certain internal logical language, or in terms of having certain propositional attitudes; (2) what kind of semantics should we see thinking as having: what kind of correspondence between internals of the model/mind and the external world should we see a model as maintaining (also, wtf are values).
I think that trying to find answers to these questions by ‘just looking’ at models in some ML-brained, non-philosophical way is unlikely to be competitive with trying to answer these questions with an attitude of taking philosophy (agent foundations) seriously, because one will only have any hope of seeing the cognitive/computational structure in a mind/model by staring at it if one stares at it already having some right ideas about what kind of structure to look for. For example, it’d be very tough to try to discover [first-order logic]/ZFC/[type theory] by staring at the weights/activations/whatever of the brain of a human mathematician doing mathematical reasoning, from a standpoint where one hasn’t already invented [first-order logic]/ZFC/[type theory] via some other route — if one starts from the low-level structure of a brain, then first-order logic will only appear as being implemented in the brain in some ‘highly encrypted’ way.
There’s really a spectrum of claims here that would all support the claim that agent foundations is good for understanding the ‘algorithm’ a model/mind is to various degrees. A stronger one than what I’ve been arguing for is that once one has these ideas, one needn’t stare at models at all, and that staring at models is unlikely to help one get the right ideas (e.g. because it’s better to stare at one’s own thinking instead, and to think about how one could/should think, sort of like how [first-order logic]/ZFC/[type theory] was invented), so one’s best strategy does not involve staring at models; a weaker one than what I’ve been arguing is that having more and better ideas about the structure of minds would be helpful when staring at models. I like TsviBT’s koan on this topic.
Not only “good”, but “obedient”, “non-deceptive”, “minimal impact”, “behaviorist” and don’t even talk about “mindcrime”.
In this sense agent foundations research seems similar to research on normative ethics.
Shower thought[*]: the notion of a task being bounded doesn’t survive composition. Specifically, say a task is bounded if the agent doing it is only using bounded resources and only optimising a small bit of the world to a limited extent. The task of ‘be a human in the enterprise of doing research’ is bounded, but the enterprise of research in general is not bounded. Similarly, being a human with a job vs the entire human economy. I imagine keeping this in mind would be useful when thinking about CAIS.
Similarly, the notion of a function being interpretable doesn’t survive composition. Linear functions are interpretable (citation: the field of linear algebra), as is the ReLU function, but the consensus is that neural networks are not, or at least not in the same way.
I basically wish that the concepts that I used survived composition.
[*] Actually I had this on a stroll.
Fwiw, this seems like an interesting thought but I’m not sure I understand it, and curious if you could say it in different words. (but, also, if the prospect of being asked to do that for your shortform comments feels ughy, no worries)
Often big things are made of smaller things: e.g., the economy is made of humans and machines interacting, and neural networks are made of linear functions and ReLUs composed together. Say that a property P survives composition if knowing that P holds for all the smaller things tells you that P holds for the bigger thing. It’s nice if properties survive composition, because it’s easier to figure out if they hold for small things than to directly tackle the problem of whether they hold for a big thing. Boundedness doesn’t survive composition: people and machines are bounded, but the economy isn’t. Interpretability doesn’t survive composition: linear functions and ReLUs are interpretable, but neural networks aren’t.
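A small sketch of what "survives composition" means, using toy properties chosen for illustration rather than anything from the original shortform. Here "bounded" is modelled as "moves any input by at most 1", which is a crude stand-in, and the property checks only look at a handful of sample points.

```python
def compose(*functions):
    """Return the function that applies each given function in order."""
    def composed(x):
        for f in functions:
            x = f(x)
        return x
    return composed


def is_bounded_change(f, sample_points, bound=1.0):
    """Toy 'boundedness': f moves every sampled input by at most `bound`."""
    return all(abs(f(x) - x) <= bound for x in sample_points)


def is_monotone(f, sample_points):
    """Toy contrast property: f never decreases across the sorted sample points."""
    outputs = [f(x) for x in sorted(sample_points)]
    return all(a <= b for a, b in zip(outputs, outputs[1:]))


def nudge(x):
    """A 'small thing': changes the world by exactly one unit."""
    return x + 1.0


if __name__ == "__main__":
    points = [-2.0, -1.0, 0.0, 1.0, 2.0]
    many_nudges = compose(*[nudge] * 10)  # the 'big thing' made of small things

    print(is_bounded_change(nudge, points))        # each part is bounded
    print(is_bounded_change(many_nudges, points))  # the whole is not
    print(is_monotone(nudge, points))              # each part is monotone
    print(is_monotone(many_nudges, points))        # and so is the whole
```

In this sketch monotonicity survives composition, so checking the parts tells you about the whole, while bounded change does not: ten bounded nudges add up to a change of ten.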
Frankfurt-style counterexamples for definitions of optimization
In “Bottle Caps Aren’t Optimizers”, I wrote about a type of definition of optimization that says system S is optimizing for goal G iff G has a higher value than it would if S didn’t exist or were randomly scrambled. I argued against these definitions by providing examples of systems that satisfy the criterion but are not optimizers. But today, I realized that I could repurpose Frankfurt cases to get examples of optimizers that don’t satisfy this criterion.
A Frankfurt case is a thought experiment designed to disprove the following intuitive principle: “a person is morally responsible for what she has done only if she could have done otherwise.” Here’s the basic idea: suppose Alice is considering whether or not to kill Bob. Upon consideration, she decides to do so, takes out her gun, and shoots Bob. But little-known to her, a neuroscientist had implanted a chip in her brain that would have forced her to shoot Bob if she had decided not to. That said, the chip didn’t activate, because she did decide to shoot Bob. The idea is that she’s morally responsible, even tho she couldn’t have done otherwise.
Anyway, let’s do this with optimizers. Suppose I’m playing Go, thinking about how to win—imagining what would happen if I played various moves, and playing moves that make me more likely to win. Further suppose I’m pretty good at it. You might want to say I’m optimizing my moves to win the game. But suppose that, unbeknownst to me, behind my shoulder is famed Go master Shin Jinseo. If I start playing really bad moves, or suddenly die or vanish etc, he will play my moves, and do an even better job at winning. Now, if you remove me or randomly rearrange my parts, my side is actually more likely to win the game. But that doesn’t mean I’m optimizing to lose the game! So this is another way such definitions of optimizers are wrong.
That said, other definitions treat this counter-example well. E.g. I think the one given in “The ground of optimization” says that I’m optimizing to win the game (maybe only if I’m playing a weaker opponent).
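The Go example can be written as a tiny calculation. This is a sketch only: the win probabilities are made-up numbers, and `counterfactual_criterion` is a simplified rendering of the "would G be lower without S?" style of definition, not a quotation of any particular formalisation.

```python
P_WIN_IF_I_PLAY = 0.7      # hypothetical: I'm pretty good
P_WIN_IF_SHIN_PLAYS = 0.99  # hypothetical: the backup player is much better


def p_my_side_wins(i_am_present: bool, backup_present: bool) -> float:
    """Probability my side wins, depending on who ends up choosing the moves."""
    if i_am_present:
        return P_WIN_IF_I_PLAY
    if backup_present:
        return P_WIN_IF_SHIN_PLAYS  # the backup steps in only if I'm removed
    return 0.0  # nobody plays my moves


def counterfactual_criterion(backup_present: bool) -> bool:
    """Says 'I optimize for winning' iff winning is likelier with me than without me."""
    with_me = p_my_side_wins(True, backup_present)
    without_me = p_my_side_wins(False, backup_present)
    return with_me > without_me


if __name__ == "__main__":
    print(counterfactual_criterion(backup_present=False))  # ordinary game
    print(counterfactual_criterion(backup_present=True))   # Frankfurt-style game
```

My move-selection procedure is identical in both calls, yet the criterion's verdict flips once the backup player is standing behind me. That is the sense in which the criterion tracks the environment's counterfactuals rather than what I am doing.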
Interesting, but I’m not sure how successful the counterexample is. After all, if your terminal goal in the whole environment was truly for your side to win, then it makes sense to understand anything short of letting Shin play as a shortcoming of your optimization (with respect to that goal). Of course, even in the case where that’s your true goal and you’re committing a mistake (which is not common), we might want to say that you are deploying a lot of optimization, with respect to the different goal of “winning by yourself”, or “having fun”, which is compatible with failing at another goal.
This could be taken to absurd extremes (whatever you’re doing, I can understand you as optimizing really hard for doing exactly what you’re doing), but the natural way around that is for your imputed goals to be required to be simple (in some background language or ontology, like that of humans). This is exactly the approach mathematically taken by Vanessa in the past (the equation at 3:50 here).
I think this “goal relativism” is fundamentally correct. The only problem with Vanessa’s approach is that it’s hard to account for the agent being mistaken (for example, you not knowing Shin is behind you).[1]
I think the only natural way to account for this is to see things from the agent’s native ontology (or compute probabilities according to their prior), however we might extract those from them. So we’re unavoidably back at the problem of ontology identification (which I do think is the core problem).
Say Alice has lived her whole life in a room with a single button. People from the outside told her pressing the button would create nice paintings. Throughout her life, they provided an exhaustive array of proofs and confirmations of this fact. Unbeknownst to her, this was all an elaborate scheme, and in reality pressing the button destroys nice paintings. Alice, liking paintings, regularly presses the button.
A naive application of Vanessa’s criterion would impute Alice the goal of destroying paintings. To avoid this, we somehow need to integrate over all possible worlds Alice can find herself in, and realize that, when you are presented with an exhaustive array of proofs and confirmations that the button creates paintings, it is on average more likely for the button to create paintings than destroy them.
But we face a decision. Either we fix a prior to do this that we will use for all agents, in which case all agents with a different prior will look silly to us. Or we somehow try to extract the agent’s prior, and we’re back at ontology identification.
(Disclaimer: This was SOTA understanding a year ago, unsure if it still is now.)
Live in Berkeley? I think you should consider running for the city council. Why?
4 seats are going to be open with no incumbents:
District 4: the area between Sacramento, Blake, Fulton, and University, plus the area between University, Cedar, MLK, and Fulton. Lots of rationalists live in this area. This will be a special election that’s yet to be scheduled, but I imagine it will be held in April or May, with a filing deadline in late Feb / early Mar. (Or maybe it will be held at the same time as District 7, on April 16, filing deadline on EOD Feb 16)
District 5: north of Cedar, between Spruce and Sacramento/Tulare/Nelson. Election in November.
District 6: north of Hearst, between Oxford/Spruce and Wildcat Canyon Road. Election in November.
District 7: campus and the couple blocks immediately south of it. Borders are hard to describe, check here. Special election: filing deadline is EOD Feb 16, election is April 16.
Nobody is running in those races yet.
You probably have gripes with how the city is running: maybe you wish policing were different, or there were more permissive zoning, or better education.
You probably have a bunch of friends who feel similarly who maybe would want to vote for you or support your campaign.
Here is a candidate handbook for the District 7 election, I imagine running for the other districts is similar (but with different relevant dates).
On the most recent episode of the podcast Rationally Speaking, David Shor discusses how members of the USA’s Democratic Party could perform better electorally by not talking about their unpopular extreme views. He notes, however, that many individual Democrats have better lives by talking about those views, which are unpopular with voters in general but popular with left-wing activists (e.g. because they become more prominent and get to feel good about themselves). This causes some voters to associate those views with the Democratic Party and not vote for it.
This is discussed as a sad irrationality that constitutes a coordination failure among Democrats, but I found that an odd tone. Part of the model in the episode is that Democratic politicians in fact have these unpopular extreme views, but it would hurt their electoral chances if that became known. From a non-partisan perspective, you’d expect it to be a good thing to know what elected officials actually think. Now, you might think that elected officials shouldn’t enact the unpopular policies that they in fact believe in, but it’s odd to me that they apparently can’t credibly communicate that they won’t enact those policies. At any rate, I’m a bit bothered by the idea of coordinated silence to ensure that people don’t know what powerful people actually think being portrayed as good.
The episode is quite interesting!
The system is set up in the way that before a Democratic candidate can enter their final battle against the Republican candidate, first they have to defeat their fellow Democrats. And things that help them in the previous rounds (talking like a SJW, to put it bluntly) seem to hurt them in the final round, and vice versa.
The underlying reason is that within the Democratic Party, the opinions of the vanguard have recently drifted so far from the opinions of hoi polloi that it has become almost impossible for any candidate to make both happy. With the vanguard, you score by being extreme, by “pushing the Overton window”. With hoi polloi, you score by being a relatable person, by (the illusion of) caring about their boring everyday problems.
(Followed by an interesting explanation of why the Republican Party doesn’t have the symmetric problem. Within both parties, the more educated and more politically active people are more left-wing than their average voter. In the Republican Party, this pushes the candidates towards the center, making them more attractive to voters in general; in the Democratic Party, this pushes the candidates away from the center, making them less attractive to voters.)
The part where I disagree with Julia’s summary is that to Julia, if I understand her correctly, the vanguard is a more extreme version of hoi polloi. To me it seems like they often care about different things. Consider the television ads that were popular among elite Democrats, but actually made people more likely to vote for Republicans.
I don’t read this as “you represent my opinion too strongly”, but rather as “you don’t represent my opinion”.
David suggests an interesting solution: Democrats should have more non-white candidates with less woke opinions, because (these are my words) the vanguard will hesitate to attack them because of their color, and hoi polloi will find them more acceptable because of their opinions. (Kinda like Obama.) Cool trick, but I suspect it will stop working as soon as you have two candidates from the same minority, so they need to compete for the ideological support again.
David suggests it helped Biden that he refused to join the “defund police” bandwagon. It made him less popular with the vanguard, but more popular with hoi polloi (especially with Hispanic voters).
Julia dislikes “the fact that people increasingly vote [...] all Democrats or all Republicans [...] because politicians are increasingly judged by what other politicians in their party say and do”. So not only are the politicians naturally more extreme, but they have to compete for opinion of other naturally more extreme people, which makes the winners even more extreme. So perhaps it is inevitable that the politicians will be one step more extreme than their voters, but we could stop them from becoming two steps more extreme. The problem is not that the voters will learn their opinions, but that the super-woke journalists will (and that the voters will ultimately take the journalists’ verdict into consideration).
I share your preference for more transparency about powerful people. I think that at some point the Democratic Party will somehow have to address the issue of being disconnected from the average voter.
I think this is false. Shor, from the transcript:
I don’t have much to say about your take, but it was interesting!
You’re right, the main difference is not between the primaries and the final round, but rather somewhere between Twitter/journalists and primaries.
It seems clear that we want politicians to honestly talk about what they’re intending to do with the policies that they’re actively trying to change (especially if they have a reasonable chance of enacting new policies before the next election). That’s how voters can know what they’re getting.
It’s less obvious how this should apply to their views on things which aren’t going to be enacted into policy. Three lines of thinking that point in the direction of maybe it’s good for politicians to keep quiet about (many of) their unpopular views:
It can be hard for listeners to tell how likely the policy is to be enacted, or how actively the politician will try to make it happen. I guess it’s hard to fit into 5 words? e.g. I saw a list of politicians’ “broken promises” on one of the fact checking sites, which was full of examples where the politician said they were in favor of something and then it didn’t get enacted, and the fact checkers deemed that sufficient to count it as a broken promise. This can lead to voters putting too little weight on the things that they’re actually electing the politician to do, e.g. local politics seems less functional if local politicians focus on talking about their views on national issues that they have no control over.
Another issue is that it’s cheap talk. The incentive structure / feedback loops seem terrible for politicians talking about things unrelated to the policies they’re enacting or blocking. Might be more functional to have a political system where politicians mostly talk about things that are more closely related to their actions, so that their words have meaning that voters can see.
Also, you can think of politicians’ speech as attempted persuasion. You could think of voters as picking a person to go around advocating for the voters’ hard-to-enact views (as well as to implement policies for the voters’ feasible-to-enact views). So it seems like it could be reasonable for voters to say “I think X is bad, so I’m not going to vote for you if you go around advocating for X”, and for a politician who personally favors X but doesn’t talk about it to be successfully representing those voters.
Note that the linked podcast is not merely arguing that politicians should keep quiet about their views, it’s also arguing that their fellow partisans in e.g. think-tanks and opinion sections should also keep quiet, because people can tell that the politicians secretly believe what the think-tankers and opinionists openly say. I think these arguments don’t imply that those think-tankers and opinionists should keep quiet.
Shor is very open about the fact that his views are to the left of 90%+ of the electorate, and that his goal is to maximize the power of people that share his views despite their general unpopularity.
Yeah, I think I’m more surprised by Galef’s tone than Shor’s.
I get to nuke LW today AMA.
I think the use of dialogues to illustrate a point of view is overdone on LessWrong. Almost always, the ‘Simplicio’ character fails to accurately represent the smart version of the viewpoint he stands in for, because the author doesn’t try sufficiently hard to pass the ITT of the view they’re arguing against. As a result, not only is the dialogue unconvincing, it runs the risk of misleading readers about the actual content of a worldview. I think this is true to a greater extent than posts that just state a point of view and argue against it, because the dialogue format naively appears to actually represent a named representative of a point of view, and structurally discourages disclaimers of the type “as I understand it, defenders of proposition P might state X, but of course I could be wrong”.
I’ve seen such dialogs, and felt exactly the same way. At least twice I’ve later found out that the dialog actually happened and there was no misrepresentation or simplification, just a HUGE inferential distance about what models of the universe (really, models of groups of people are the main sticking points) should be applied in what circumstances.
Possibly this could also be a strength, because by representing the views separately like that it makes it easier to see exactly what assumptions are causing them to fail the ITT.
On the other hand if they’re sufficiently far off, the dialogue basically goes off in the entirely wrong direction.
Do you have examples of dialogues that fail to pass the ITT? I’m curious if you think any of the dialogues I’ve read might have been misleading.
A bunch of my friends are very skeptical of the schooling system and promote homeschooling or unschooling as an alternative. I see where they’re coming from, but I worry about the reproductive consequences of stigmatising schooling in favour of those two alternatives. Based on informal conversations, the main reason why people I know aren’t planning on having more children is the time cost. A move towards normative home/unschooling would increase the time cost of children, and as such make them less appealing to prospective parents[*]. This in turn would reduce birth rates, worsening the problem that first-world countries face in the next couple of decades of a low working-age:elderly population ratio [EDIT: also, low population leading to less innovation, also low population leading to fewer people existing who get to enjoy life]. As such, I tentatively wish that home/unschooling advocates would focus on more institutional ways of supervising children, e.g. Sudbury schools, community childcare, child labour [EDIT: or a greater emphasis on not supervising children who don’t need supervision, or similar things].
[*] This is the weakest part of my argument—it’s possible that more people home/unschooling their kids would result in cooler kids that were more fun to be around, and this effect would offset the extra time cost (or kids who are more willing to support their elderly parents, perhaps). But given how lucrative the first world labour market is, I doubt it.
While I agree that a world where home/un-schooling is a norm would result in greater time-costs and a lower child-rate, I don’t think that promoting home/un-schooling as an alternative will result in a world where home/un-schooling is normative. Because of this, I don’t think that promoting home/un-schooling as an alternative to the system carries any particularly broad risks.
Here’s my reasoning:
I expect the associated stigmas and pressures for having kids to always dwarf the associated stigmas and pressures against having kids if they are not home/un-schooled. Having kids is an extremely strong norm both because of the underpinning evolutionary psychology and because a lot of life-style patterns after thirty are culturally centered around people who have kids.
Despite its faults, public school does the job pretty well for most people. This applies to the extent that the opportunity cost of home/un-schooling instead of building familial wealth probably outweighs the benefits for most people. Thus, I don’t believe that the promotion of home/un-schooling is scalable to everyone.
Lots of rich people who have the capacity to home/un-school and who dislike the school system decide not to do that. Instead they (roughly speaking) coordinate towards expensive private schools outside the public system. I doubt that this has caused a significant number of people to avoid having children for fear of not sending them to a fancy boarding school.
Even if the school system gets sufficiently stigmatised, I actually expect that the incentives will naturally align around institutional schooling outside the system for most children. Comparative advantages exist and local communities will exploit them.
Home/un-schooling often already involves institutional aspects. Explicitly, home/un-schooled kids would ideally have outlets for peer-to-peer interactions during the school-day, and these are often satisfied through community coordination.
I grant that maybe increased popularity of home/un-schooling could reduce reproduction rate by an extremely minor amount on the margin. But I don’t think that amount is anywhere near even the size of, say, the way that people who claim they don’t want to have kids because global warming will reproduce less on the margin.
And as someone who got screwed by the school system, I really wish that when I asked my parents about home/un-schooling, there was some broader social movement that would incentivize them to actually listen.
Developed countries already have below-replacement fertility (according to this NPR article, the CDC claims that the US has been in this state since 1971), so apparently you can have pressures that outweigh pressures to have children. In general I don’t understand why you think that a marginal increase in the pressure to invest in each kid won’t result in marginally fewer kids.
Presumably this is not true in a world where many people believe that schools are basically like prisons for children, which is a sentiment that I do see and seems more memetically fit than “homeschooling works for some families but not others”.
My impression was that rich people often dislike the public school system, but are basically fine with schools in general?
Rich people have fewer kids than poor people and it doesn’t seem strange to me to imagine that that’s partly due to the fact that each child comes at higher expected cost.
This seems right to me barring strong normative home/unschooling, and I wish that this were a more promoted alternative (as my post mentions!).
Yep—you’ll notice that my post doesn’t deny the manifold benefits of the home/unschooling movement, and I think the average unschooling advocate is basically right about how bad typical schools are.
I think the crux of our perspective difference is that we model the decrease in reproduction differently. I tend to view poor people and developing countries having higher reproduction rates as a consequence of less economic slack. That is to say, people who are poorer have more kids because those kids are decent long-term investments overall (ie old-age support, help-around-the-house). In contrast, wealthy people can make way more money by doing things that don’t involve kids.
This can be interpreted in two ways:
Wealthier people see children as higher cost and elect not to have children because of the costs
or
Wealthier people are not under as much economic pressure so have fewer children because they can afford to get away with it
At the margin, both of these things are going on at the same time. Still, I attribute falling birthrates mostly to the latter rather than the former. So I don’t quite buy the claim that falling birth-rates have been dramatically influenced by greater pressures.
Of course, Wei Dai indicates that parental investment definitely has an effect so maybe my attribution isn’t accurate. I’d be pretty interested in seeing some studies/data trying to connect falling birthrates to the cultural demands around raising children.
...
Also, my understanding of the pressures re:homeschooling is something like this:
The social stigma against having kids is satisficing. Having one kid (below replacement level) hurts you dramatically less than having zero kids
The capacity to home-school is roughly all-or-nothing. Home-schooling one kid immediately scales to home-schooling all your kids.
I doubt the stigma for schooling would punish a parent who sends two kids to school more than a parent who sends one kid to school
This means that, for a given family, you essentially choose between having kids and home-schooling all of them (expected cost of home-schooling doesn’t scale with number of children) or having no kids (maximum social penalty). Electing for “no kids” seems like a really undesirable trade-off for most people.
There are other negative effects but they’re more indirect. This leads me to believe that, compared to other pressures against having kids, stigmas against home-schooling will have an unusually low marginal effect.
Interesting—my bubble doesn’t really have a “schools are like prisons” group. In any case, I agree that this is a terrible meme. To be fair though, a lot of schools do look like prisons. But this definitely shouldn’t be solved by home-schooling; it should be solved by making schools that don’t look like prisons.
Kids will grow up and move away no matter if you’re rich or poor though, so I’m not sure the investment explanation makes sense. But your last sentence rings true to me. If someone cares more about career than family, they will always have “no time” for a family. I’ve heard it from well-paid professionals many times: “I’d like to have kids… eventually...”
I think you’re overstating the stigma against not having kids. I Googled “is there stigma around not having kids” and the top two US-based articles both say something similar:
USA Today:
Times:
Agreed. Per my latest reply to DanielFilan:
I massively underestimated the rate of childfree-ness and, broadly speaking, I’m in agreement with Daniel now.
[next quote is reformatted so that I can make it a quote]
Glad to see we agree—and again, the important point for my argument isn’t whether most of existing low fertility can be attributed to the existing cost of kids, but whether adding extra cost per kid will reduce the number of kids (as the law of demand predicts).
I’m sure this can’t be exactly right, but I do think that the low marginal cost of home-schooling was something I was missing.
I continue to think that you aren’t thinking on the margin, or making some related error (perhaps in understanding what I’m saying). Electing for no kids isn’t going to become more costly, so if you make having kids more costly, then you’ll get fewer of them than you otherwise would, as the people who were just leaning towards having kids (due to idiosyncratically low desire to have kids/high cost to have kids) start to lean away from the plan.
(I assume you meant pressure in favour of home-schooling?) Please note that I never said it had a high effect relative to other things: merely that the effect existed and was large and negative enough to make it worthwhile for homeschooling advocates to change course.
Yeah, I was thinking in broad strokes there. I agree that there is a margin at which point people switch from choosing to have kids to choosing not to have kids and that moving that margin to a place where having kids is less net-positive will cause some people to choose to have fewer kids.
My point was that the people on the margin are not people who will typically say “well we were going to have two kids but now we’re only going to have one because home-schooling”; they’re people who will typically say “we’re on the fence about having kids at all.” Whereas most marginal effects relating to having kids (ie the cost of college) pertain to the former group, the bulk of marginal effects on reproduction pertaining to schooling stigmas pertain to the latter group.
Both the margin and the population density at the margin matter in terms of determining the effect. What I’m saying is that the population density at the margin relevant to schooling-stigmas is notably small.
However, I’ve actually been overstating my case here. The childfree rate in the US is currently around 15% which is much larger than I expected. The childfree rate for women with above a bachelor’s degree is 25%. In absolute terms, these are not small numbers and I’ve gotta admit that this indicates a pretty high population density at the margin.
Per the above stats, I’ve updated to agree with this claim.
The trend in China of extreme parental investment (lots of extra classes starting from a young age, forcing one’s kid to practice hours of musical instrument each week, paying huge opportunity costs to obtain a 学区房) almost certainly contributes significantly to its current low birth rate. I think normative home/unschooling has the potential to have a similar influence elsewhere.
But have you thought about whether lower birth rate is good or bad from a longtermist / x-risk perspective? It’s not clear to me that it’s bad, at least.
I haven’t thought incredibly carefully about this. My guess is that a high birth rate accelerates basically everything but elderly care, and so the first-order question is whether you think humanity is pushing in roughly the right or wrong direction—I’d say it’s going in the right direction. That being said, there’s also a trickier factor of whether you’d rather have all your cognition be in serial or in parallel, and if you want it to be in serial, then low birth rates look good.
A couple of considerations in the “lower birth rate is good for longtermism” direction:
1. Lower birth rate makes war less likely. (Less perceived need to grab other people’s resources. Parents are loath to lose their only children in war.)
2. Increased parental investment and inheritance, which shifts up average per-capita human and non-human capital, which is probably helpful for increasing understanding of x-risk and ability/opportunity to work on it. (Although this depends on the details of how the parental investment is done, since some kinds, e.g., helicopter parenting, can be counterproductive. Home/unschooling seems likely to be good in this regard though.)
One factor here that is big in my mind: I expect per-capita wealth to be lower in worlds with lower populations, since fewer people means fewer ideas that enrich everyone. I think that this makes 2 go in the opposite direction, but it’s not obvious to me what it does for 1.
It’s not clear that positive-sum innovation is linear (or even monotonically positive) with total population. There almost certainly exist levels at which marginal mouths to feed drive unpleasant and non-productive behaviors more than they do the growth-driving shared innovations.
Whether we’re in a downward-sloping portion of the curve, and whether it slopes up again in the next few generations, are both debatable. And they should be debated.
My sense is that on average, more population means more growth (see this study on the question). But at some point you presumably run out of ideas for how to make material more valuable, and growth just becomes making more people with the same consumption per capita.
I find this comment kind of irksome, because (a) neither I nor anybody else said that they weren’t proper subjects for debate and (b) you’ve exhorted debate on the topic but haven’t contributed anything other than the theoretical possibility that the effect could go the other way. So I see this as trying to advance some kind of point illegitimately. If you make another such comment that I find irksome in the same way, I’ll delete it, as per my commenting guidelines.
I now think the biggest flaw with this argument is that home/unschooling actually don’t take that many hours out of the day, and there’s a lot of pooling of work going on. Thanks to many FB commenters and Isnasene for pointing that out.
And also that anti-standard-school memes are less fit than pro-home/unschooling memes, such that “normative home/unschooling” doesn’t seem that likely to be a big thing.
So you’re a proponent of improving institutional ways of supervising children?
Tentatively, yes. But I’ve only just had this thought today, so I’m not very committed to it. Also note my edit: it’s more about being in favour of low-time-investment ways to raise children that don’t have the problems schooling is alleged to have.
I often see (and sometimes take part in) discussion of Facebook here. I’m not sure whether when I partake in these discussions I should disclaim that my income is largely due to Good Ventures, whose money largely comes from Facebook investments. Nobody else does this, so shrug.
Huh. Indeed seems good to at least have talked about talking about.
Why I am less than infinitely hostile to the time / bloomberg pieces:
They are kinda informative about the way the scene has been in the past
Much of the behaviour described in them is pretty fucked up
It is relevant for people to know that EA/rationality is not an abuse-free zone
The reported people have faced professional consequences, including being expelled from the community for the most serious offenders, but given that there were several, it’s plausible that others will crop up.
They point to a dynamic of “single-mindedness on AI stuff / extremely bad at normal human relationships” that is real and kinda bad.
One result that’s related to Aumann’s Agreement Theorem is that if you and I alternate saying our posterior probabilities of some event, we converge on the same probability if we have common priors. You might therefore wonder why we ever do anything else. The answer is that describing evidence is strictly more informative than stating one’s posterior. For instance, imagine that we’ve both secretly flipped coins, and want to know whether both coins landed on the same side. If we just state our posteriors, we’ll immediately converge to 50%, without actually learning the answer, which we could have learned pretty trivially by just saying how our coins landed. This is related to the original proof of the Aumann agreement theorem in a way that I can’t describe briefly.
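Here is a minimal sketch of the coin example in Python, assuming two fair, independent coins. The function names `posterior_same` and `answer_after_sharing` are made up for illustration. It enumerates the four equally likely outcomes and shows that each person's posterior is 1/2 whichever way their own coin landed, so stating posteriors reveals nothing, while stating the coins settles the question.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes as (my coin, your coin). Assumes fair, independent coins.
OUTCOMES = list(product("HT", repeat=2))

def posterior_same(own_coin, position):
    """P(both coins match | the coin at `position` landed `own_coin`)."""
    consistent = [o for o in OUTCOMES if o[position] == own_coin]
    matching = [o for o in consistent if o[0] == o[1]]
    return Fraction(len(matching), len(consistent))

def answer_after_sharing(my_coin, your_coin):
    """Once both coins are stated, the question is settled outright."""
    return my_coin == your_coin

# Whatever either of us saw, the stated posterior is the same number,
# so hearing it tells the other person nothing about the hidden coin.
for coin in "HT":
    for position in (0, 1):
        print(coin, position, posterior_same(coin, position))

print(answer_after_sharing("H", "T"))
```

Since every announced posterior equals 1/2 regardless of the hidden coin, the exchange of posteriors stops at once with both of us agreeing and neither of us knowing the answer.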
Models and considerations.
There are two typical ways of deciding whether on net something is worth doing. The first is to come up with a model of the relevant part of the world, look at all the consequences of doing the thing in the model, and determine if those consequences are net positive. When this is done right, the consequences should be easy to evaluate and weigh off against each other. The second way is to think of a bunch of considerations in favour of and against doing something, and decide whether the balance of considerations supports doing the thing or not.
I prefer model-building to consideration-listing, for the following reasons:
By building a model, you’re forcing yourself to explicitly think about how important various consequences are, which is often elided in consideration-listing. Or rather, I don’t know how to quantitatively compare importances of considerations without doing something very close to model-building.
Building a model lets you check which possible consequences are actually likely. This is an improvement on considerations, which are often of the form “such-and-such consequence might occur”.
Building a model lets you notice consequences which you might not have immediately thought of. This can either cause you to believe that those consequences are likely, or look for a faulty modelling assumption that is producing those consequences within the model.
Building a model helps you integrate your knowledge of the world, and explicitly enforces consistency in your beliefs about different questions.
However, there are also upsides to consideration-listing:
The process of constructing a model is pretty similar to consideration-listing: specifically, the part where one uses one’s judgement to determine which aspects of reality are important enough to include.
Consideration-listing is much easier to do, which is why it’s the form that this hastily-written shortform post takes.
Homework: come up with a model of this.
Hot take: the norm of being muted on video calls is bad. It makes it awkward and difficult to speak, clap, laugh, or make “I’m listening” sounds. A better norm set would be:
use zoom in gallery mode, so somebody making noise doesn’t make them more focussed than they were before
call from a quiet room
be more tolerant of random background sounds, the way we are IRL
Agreed. I often find myself unmuting because I’m trying to make social sounds (often laughter). However, in a large conversation, I’d rather someone become a weird void without backchannel sounds than be plunged into domestic mayhem.
As far as I can tell, people typically use the orthogonality thesis to argue that smart agents could have any motivations. But the orthogonality thesis is stronger than that, and its extra content is false—there are some goals that are too complicated for a dumb agent to have, because the agent couldn’t understand those goals. I think people should instead directly defend the claim that smart agents could have arbitrary goals.
I no longer endorse this claim about what the orthogonality thesis says.
A rough and dirty estimate of the COVID externality of visiting your family in the USA for Christmas when you don’t feel ill [EDIT: this calculation low-balls the externality, see below]:
You incur some number of μCOVIDs[*] a week, let’s call it x. Since the incubation time is about 5 days, let’s say that your chance of having COVID is about 5x/7,000,000 when you arrive at the home of your family with n other people. In-house attack rate is about 1/3, I estimate based off hazy recollections, so in expectation you infect 5xn/21,000,000 people, which is about xn/4,000,000 people.
How bad is it to infect one family member? Well, people tend to be most infectious about 1.5 days before symptoms show, which is about 3.5 days after they get infected. Furthermore, we empirically see that R is about 1 on average, so the people you infect each infect one person, who goes on to infect one other person, etc. How long until the chain ends? It looks like vaccines will be widely distributed in the USA some time between the 1st of April and the 31st of December 2021. Median date looks kinda like the 1st of September. So let’s say that there’s 8 months of transmission. A month has about 30.5 days, so that’s 244 days of transmission, which is 244/3.5 ≈ 70 people. IFR is about 0.5%, so you get about 70 × 0.5% = 0.35 deaths. Each death loses maybe 13 life-years, altho that’s not quality-adjusted. Since I don’t want to quality-adjust that number, that’s 13 × 0.35 = 4.55 life-years lost. But some infections result in bad disability but not death. I estimate the disability burden at about equal to the mortality burden, so that’s 4.55 × 2 = 9.1 QALYs lost. A year is 365 days, so that’s 9.1 × 365 = 3321.5 QALDs (Quality-Adjusted Life-Days) lost.
So: when you travel to visit family, the rest of the world loses about 33xn/40,000 QALDs, where n is the number of family members and x is how many μCOVIDs you incurred over the last week. If you follow microcovid.org’s advice for healthy people and incur 200 μCOVIDs per week, and your family has 4 other people, that’s about two thirds of a healthy life-day lost by strangers. There are a bunch of estimates in this calculation, so this number might be off by an order of magnitude.
[*] A μCOVID is a one in one million chance of catching COVID-19.
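For anyone who wants to plug in their own numbers, here is a rough Python sketch of the arithmetic above. The function and parameter names are invented for illustration, and every default is just one of the guesses from the text (5-day incubation, 1/3 attack rate, 244 days of transmission, 3.5 days per link in the chain, 0.5% IFR, 13 life-years per death, disability burden equal to mortality burden). It doesn't round intermediate values, so it comes out slightly below the hand calculation.

```python
def expected_infections(x, n, incubation_days=5, attack_rate=1 / 3):
    """Expected number of family members you infect.

    x: μCOVIDs incurred per week; n: number of other people in the house.
    """
    p_infected_on_arrival = incubation_days * x / 7 / 1_000_000
    return p_infected_on_arrival * n * attack_rate

def qalds_lost_per_infection(
    transmission_days=244,    # about 8 months until vaccines are widespread (a guess)
    days_per_link=3.5,        # time from infection to peak infectiousness
    ifr=0.005,                # infection fatality rate
    life_years_per_death=13,  # not quality-adjusted
    burden_multiplier=2,      # disability burden assumed equal to mortality burden
):
    """Quality-adjusted life-days lost down the chain started by one infection, with R = 1."""
    chain_length = transmission_days / days_per_link
    deaths = chain_length * ifr
    qalys = deaths * life_years_per_death * burden_multiplier
    return qalys * 365

def externality_qalds(x, n):
    """Expected QALDs lost by the rest of the world from one visit."""
    return expected_infections(x, n) * qalds_lost_per_infection()

# 200 μCOVIDs per week, visiting 4 other people.
print(externality_qalds(200, 4))
```

With x = 200 and n = 4 this gives roughly 0.63 QALDs, in line with the two-thirds figure above given the rounding along the way.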
I recently realized, thanks to a FB comment by Paul Christiano, that this is thinking about things in kind of the wrong way. R is approximately 1 because society is tamping down infection rates when infections are high and ‘loosening’ when infections are low. So, by infecting people, you cause some chain of counterfactual infections that perhaps ends when society notices and tamps down infection, but also you cause the rest of society to do less fun interacting in order to tamp down the virus. So the cost of infecting somebody is to cause everybody else to be more conservative. I’m still not quite sure how to think about that cost tho.
Note: this calculation only accounts for you infecting your relatives who then infect others, and not your relatives infecting you and you infecting others. Accounting for this should probably raise the cost by a factor of 2.
Note: this calculation assumes that travelling is not risky at all. Realistically that should be bundled into x.
Better to concretise 3 ways than 1 if you have the time.
Here’s a tale I’ve heard but not verified: in the good old days, Intrade had a prediction market on whether Obamacare would become law, which resolved negative, due to the market’s definition of Obamacare.
Sometimes you’re interested in answering a vague question, like ‘Did Donald Trump enact a Muslim ban in his first term’ or ‘Will I be single next Valentine’s day’. Standard advice is to make the question more specific and concrete into something that can be more objectively evaluated. I think that this is good advice. However, it’s inevitable that your concretisation may miss out on aspects of the original vague question that you cared about. As such, it’s probably better to concretise the question multiple ways which have different failure modes. This is sort of obvious for evaluating questions about things that have already happened, like whether a Muslim ban was enacted, but seems to be less obvious or standard in the forecasting setting. That being said, sometimes it is done—OpenPhil’s animal welfare series of questions seems to me to basically be an example—to good effect.
This procedure does have real costs. Firstly, it’s hard to concretise vague questions, and concretising multiple times is harder than concretising once. It’s also hard to predict multiple questions, especially if they’re somewhat independent as is necessary to get the benefits, meaning that each question will be predicted less well. In a prediction market context, this may well manifest in having multiple thin, unreliable markets instead of one thick and reliable one.
This weekend, I looked up Benquo’s post on zetetic explanation in order to nominate it for the 2019 review. Alas, it was posted in 2018, and wasn’t nominated for that year’s review. Nevertheless, I’ve recently gotten interested in amateur radio, and have noticed that the mechanistic/physical explanations of radio waves and such that I’ve come across while studying for exams are not really sufficient to empower me to actually get on the radio, and more zetetic explanations are useful, altho harder to test. Anyway, I recommend re-reading the post.
My bid for forecasters: come up with conditional prediction questions to forecast likely impacts of potential US policies towards Ukraine. See this thread where I brainstorm potential such questions.
Challenges as I see it: figuring out which policies are live options, operationalizing, and figuring out good success/failure metrics.
Benefits: potentially make policy more sane, or more realistically practice doing the sort of thing that might one day make policy more sane.
Ted Kaczynski as a relatively apolitical test case for cancellation norms:
Ted Kaczynski was a mathematics professor who decided that industrial society was terrible, and waged a terroristic bombing campaign to foment a revolution against technology. As part of this campaign, he wrote a manifesto titled “Industrial Society and Its Future” and said that if a major newspaper printed it verbatim he would desist from terrorism. He is currently serving eight life sentences in a “super-max” security prison in Colorado.
My understanding is that his manifesto (which, incidentally, has been updated and given a new title “Anti-Tech Revolution: Why and How”, the second edition of which was released this year) is lucid and thought-out. Here are some questions the answers to which are not obvious to me:
Should anybody read “Industrial Society and Its Future”, given its origin?
Suppose an EA group wrote to Kaczynski in prison, asking him to write a letter about opposition to technology to be read aloud and discussed in an EA meetup, and he complied. Would it have been unacceptable for the EA group to do this, and should it be unacceptable for the EA group to hold this meetup?
Generally speaking, if someone commits heinous and unambiguous crimes in service of an objective like “getting people to read X”, and it doesn’t look like they’re doing a tricky reverse-psychology thing or anything like that, then we should not cooperate with that objective. If Kaczynski had posted his manifesto on LessWrong, I would feel comfortable deleting it and any links to it, and I would encourage the moderator of any other forum to do the same under those circumstances.
But this is a specific and unusual circumstance. When people try to cancel each other, usually there’s no connection or a very tenuous connection between their writing and what they’re accused of. (Also the crime is usually less severe and less well proven.) In that case, the argument is different: the people doing the cancelling think that the crime wasn’t adequately punished, and are trying to create justice via a distributed minor punishment. If people are right about whether the thing is bad, then the main issues are about standards of evidence (biased readings and out-of-context quotes go a long way), proportionality (it’s not worth blowing up people’s lives over having said something dumb on the internet), and relation to nonpunishers (problems happen when things escalate from telling people why someone is bad, to punishing people for not believing or not caring).
There’s no need to cancel anyone who’s failing to have influence already. I suspect there are no apolitical test cases: cancellation (in the form of verbally attacking and de-legitimizing someone as a person, rather than arguing against specific portions of their work) is primarily politically motivated. It’s pretty pure ad-hominem argument: “don’t listen to or respect this person, regardless of what they’re saying”. In this case, I’m not listening because I think it’s low-value on its own, regardless of authorship.
The manifesto is pretty easy to find in PDF form for free. I wasn’t able to get very far—way too many crackpot signals and didn’t seem worth my time. To your bullet points:
I can read this two ways: “should anybody” meaning “do you recommend any specific person read it” or “do you object to people reading it”. My answers are “yes, but not many people”, and “no”. Anybody who is interested, either from a direct curiosity on the topic (which I predict won’t be rewarded) or from wanting to understand this kind of epistemic pathology (which might be worthwhile) should read it.
It’s absolutely acceptable. I wouldn’t enjoy it, but I’m not a member of the group, so no harm there. To decide whether YOUR group should do it, try to identify what you’d hope to get out of it, and what likely consequences there are from pursuing that direction. If your group is visible and sensitive to public perception (aka politically influenced), then certainly you should consider those effects.
To be explicit, here are some reasons that the EA community should cancel Kaczynski. Note that I do not necessarily think that they are sound or decisive.
EAs are known as utilitarians who are concerned about the impact of AI technology. Associating with him could give people the false impression that EAs are in favour of terroristic bombing campaigns to retard technological development, which would damage the EA community.
His threat to bomb more people and buildings if the Washington Post (WaPo) didn’t publish his manifesto damaged good discourse norms by inducing the WaPo to talk about something it wasn’t otherwise inclined to talk about, and good discourse norms are important for effective altruism.
It seems to me (not having read the manifesto) that the policies he advocates would cause large amounts of harm. For instance, without modern medical technology, I and many others would not have survived to the age of one year.
His bombing campaign is evidence of very poor character.
Did a newspaper print it verbatim?
Did he desist? Did he start again later?
Who wrote these: “(which, incidentally, has been updated and given a new title “Anti-Tech Revolution: Why and How”, the second edition of which was released this year)”?
How long is “eight life sentences”, and how much time does he have left?
Yes, it was published by the Washington Post.
Yes, there were no further bombings after its publication.
He did.
“eight life sentences”, IIUC, means that he will serve the rest of his life, and if the justice system decides that one (or any number less than 8) of the sentences should be vacated, he will still serve the rest of his life. I’m not sure what his life expectancy is, but he’s 78 at the moment.
I made this post with the intent to write a comment, but the process of writing the comment out made it less persuasive to me. The planning fallacy?
If this is all that Shortform Feed posts ever do it still seems net positive. :P
[edit: conditional on, you know, you endorsing it being less persuasive]
Similarly, I sometimes start a shortform post and then realize “you know what, this is actually a long post”. And I think that’s also shortform doing an important job of lowering the barrier to getting started even if it doesn’t directly get used.
Here’s a script I wrote to analyze how good Manifold Markets is at predicting Ukraine stuff. Basically: it’s about as good as you would be if you were calibrated at 80% accuracy if you average market prices over the life of the market, and if you take the probabilities at the mid-point of the market, it’s about as good as you would be if you were calibrated at 72% accuracy.
In order to figure out how good this is, you’d also want to check how hard the questions were.
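To make “as good as you would be if you were calibrated at X% accuracy” concrete, here is a minimal sketch of one way such a number could be computed. It assumes the Brier score as the scoring rule, which may not be what the script above actually uses, and the function names are made up for illustration. The idea: a forecaster who always says c and is right a fraction c of the time has expected Brier score c(1 - c), so you can invert that to turn an observed score into an equivalent c.

```python
import math

def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def equivalent_calibrated_accuracy(brier):
    """Confidence c >= 0.5 whose calibrated forecaster has expected Brier score `brier`.

    Assumption: a forecaster who always says c and is right a fraction c of the
    time scores c*(1-c)**2 + (1-c)*c**2 = c*(1-c) in expectation, so we solve
    c*(1-c) = brier for the root with c >= 0.5.
    """
    if not 0.0 <= brier <= 0.25:
        raise ValueError("Brier score must lie in [0, 0.25] for this conversion")
    return 0.5 + math.sqrt(0.25 - brier)

# Hypothetical data: five forecasts of 80%, four of which resolved YES.
score = brier_score([0.8] * 5, [1, 1, 1, 1, 0])
accuracy = equivalent_calibrated_accuracy(score)
```

A log-score version would work the same way in spirit, but the inversion would have to be done numerically.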
Some puzzles:
rubber ducking is really effective
it’s very difficult to write things clearly, even if you understand them clearly
These seem like they should be related, but I don’t quite know how. Maybe if someone thought about it for an hour they could figure it out.
Related: As I wrote just recently:
https://www.facebook.com/Xuenay/posts/10161257148333662?comment_id=10161257444543662
The feeling of something being obvious or easy in the above sense can be mistaken sometimes. It is an intuition or heuristic our brain applies I guess to figure out which things we are supposed to know in a tribe. It can be put on more solid footing by spelling out things and being forced to make intuitions explicit.
My 5-second take is basically what Gunnar_Zarncke already said. If you’re finding difficulty writing something clearly, it might mean you don’t understand it as clearly as you think. Maybe you understand 90%, and you gloss over the unclear 10%. Writing it out (or trying to fully explain it to someone) forces you to work through that 10%.
You might be better at writing than I am.
Quantitative claims about code maintenance from Working in Public, plausibly relevant to discussion of code rot and machine intelligence:
“most computer programmers begin their careers doing software maintenance, and many never do anything but”, attributed to Nathan Ensmenger, professor at Indiana University.
“most software at Google gets rewritten every few years”, attributed to Fergus Henderson of Google.
“A 2018 Stripe survey of software developers suggested that developers spend 42% of their time maintaining code”—link
“Nathan Ensmenger, the informatics professor, notes that, since the early 1960s, maintenance costs account for 50% to 70% of total expenditures on software development”—paper
Does this definition of “maintenance” include writing new functionality for existing applications?
If yes, then I agree; it is a rare opportunity to start coding a non-trivial project from scratch.
If no, then I find it difficult to believe how someone could e.g. fix bugs without ever having written their own code first (the school exercises do not count, because in my experience they do not resemble actual industry code).
From when the book introduces ‘maintenance’:
So, sounds like the book author isn’t including writing new functionality, but IDK if the term has such a fixed and clear meaning that Nathan Ensmenger and all the respondents to the Stripe survey mean the same thing as the book.
Here’s a project idea that I wish someone would pick up (written as a shortform rather than as a post because that’s much easier for me):
It would be nice to study competent misgeneralization empirically, to give examples and maybe help us develop theory around it.
Problem: how do you measure ‘competence’ without reference to a goal??
Prior work has used the ‘agents vs devices’ framework, where you have a distribution over all reward functions, some likelihood distribution over what ‘real agents’ would do given a certain reward function, and do Bayesian inference on that vs choosing actions randomly. If conditioned on your behaviour you’re probably an agent rather than a random actor, then you’re competent.
I don’t like this:
Crucially relies on knowing the space of reward functions that the learner in question might have.
Crucially relies on knowing how agents act given certain motivations.
A priori it’s not so obvious why we care about this metric.
Here’s another option: throw out ‘competence’ and talk about ‘consequential’.
This has a name collision with ‘consequentialist’ that you’ll probably have to fix but whatever.
The setup: you have your learner do stuff in a multi-agent environment. You use the AUP metric on every agent other than your learner. You say that your learner is ‘consequential’ if it strongly affects the attainable utility of other agents.
How good is this?
It still relies on having a space of reward functions, but there’s some more wiggle-room: you probably don’t need to get the space exactly right, just to have goals that are similar to yours.
Note that this would no longer be true if this were a metric you were optimizing over.
You still need to have some idea about how agents will act realistically, because if you only look at the utility attainable by optimal policies, that might elide the fact that it’s suddenly gotten computationally much harder to achieve that utility.
That said, I still feel like this is going to degrade more gracefully, as long as you include models that are roughly right. I guess this is because this model is no longer a likelihood ratio where misspecification can just rule out the right answer.
It’s more obvious why we care about this metric.
Bonus round: you can probably do some thinking about why various setups would tend to reduce other agents’ attainable utility, prove some little theorems, etc., in the style of the power-seeking paper.
Ideally you could even show a relation between this and the agents vs devices framing.
I think this is the sort of project a first-year PhD student could fruitfully make progress on.
Toryn Q. Klassen, Parand Alizadeh Alamdari, and Sheila A. McIlraith wrote a paper on the multi-agent AUP thing, framing it as a study of epistemic side effects.
This is a fun Aumann paper that talks about what players have to believe to be in a Nash equilibrium. Here, instead of imagining agents randomizing, we’re instead imagining that the probabilities over actions live in the heads of the other agents: you might well know exactly what you’re going to do, as long as I don’t. It shows that in 2-player games, you can write down conditions that involve mutual knowledge but not common knowledge that imply that the players are at a Nash equilibrium: mutual knowledge of players’ conjectures about each other, players’ rationality, and players’ payoffs suffices. By contrast, in 3-player games (or games with more players), you need common knowledge: common priors, and common knowledge of conjectures about other players.
The paper writes:
This is pretty mysterious to me and I wish I understood it better. Probably it would help to read more carefully thru the proofs and examples.
Got it, sort of. Once you have 3 people, then each person has a conjecture about the actions of the other two people. This means that your distribution might not be the product of the marginals over your distributions over the actions of each opponent, so you might be maximizing expected utility wrt your actual beliefs, but not wrt the product of the marginals, and the marginals are what are supposed to form the Nash equilibrium. Common knowledge and common priors stop this by forcing your conjecture over the different players to be independent. I still have a hard time explaining in words why this has to be true, but at least I understand the proof.
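Here is a toy illustration of the gap between a correlated conjecture and the product of its marginals. It is a made-up example, not one from the paper: player 1 believes players 2 and 3 will match each other, and can either bet on a match or take a safe payoff. All names and payoffs are assumptions chosen to make the point.

```python
from itertools import product

# Player 1's conjecture about (a2, a3): perfectly correlated, hypothetical numbers.
conjecture = {(0, 0): 0.5, (1, 1): 0.5, (0, 1): 0.0, (1, 0): 0.0}

def payoff(action, a2, a3):
    """Player 1's payoff: 'bet_match' pays 1 iff the opponents match, 'safe' pays 0.6."""
    if action == "bet_match":
        return 1.0 if a2 == a3 else 0.0
    return 0.6

def marginal(joint, index):
    """Marginal distribution over one opponent's action."""
    out = {0: 0.0, 1: 0.0}
    for profile, prob in joint.items():
        out[profile[index]] += prob
    return out

def product_of_marginals(joint):
    """Independent belief with the same marginals as `joint`."""
    m2, m3 = marginal(joint, 0), marginal(joint, 1)
    return {(a2, a3): m2[a2] * m3[a3] for a2, a3 in product((0, 1), repeat=2)}

def best_response(belief):
    """Action maximizing player 1's expected payoff under `belief`."""
    def expected_utility(action):
        return sum(p * payoff(action, a2, a3) for (a2, a3), p in belief.items())
    return max(("bet_match", "safe"), key=expected_utility)

against_actual_beliefs = best_response(conjecture)
against_marginals = best_response(product_of_marginals(conjecture))
```

Under the actual conjecture, betting on a match has expected payoff 1 versus 0.6 for playing safe; under the product of the marginals, the bet is only worth 0.5, so the best response flips. So player 1 is rational given their beliefs without best-responding to the marginals.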
Let it be known: I’m way more likely to respond to (and thereby algorithmically signal-boost) criticisms of AI doomerism that I think are dumb than those that I think are smart, because the dumb objections are easier to answer. Caveat emptor.
An attempt at rephrasing a shard theory critique of utility function reasoning, while restricting myself to things I basically agree with:
Yes, there are representation theorems that say coherent behaviour is optimizing some utility function. And yes, for the sake of discussion let’s say this extends to reward functions in the setting of sequential decision-making (even tho I don’t remember seeing a theorem for that). But: just because there’s a mapping, doesn’t mean that we can pull back a uniform measure on utility/reward functions to get a reasonable measure on agents—those theorems don’t tell us that we should expect a uniform distribution on utility/reward functions, or even a nice distribution! They would if agents were born with utility functions in their heads represented as tables or something, where you could swap entries in different rows, but that’s not what the theorems say!
Here are two EA-themed podcasts that I think someone could make. Maybe that someone is you!
More or Less, but EA (or for forecasting)
More or Less is a BBC Radio program. They take some number that’s circulating around the news, and provide context like “Is that literally true? How could someone know that? What is that actually measuring? Is that a big number? Does that mean what you think it means?”—stuff like that. They spend about 10 minutes on each number, and usually include interviews with experts in the field. IMO, someone could do this for numbers that circulate around in the EA space. Another variant is to focus on forecasts—what factors are going in, what’s the reasoning for those guesses, etc.
This could be pretty easy to listen to, but moderately hard to make—requires research, editing conversations down, etc.
AI Safety Fellowship / Course thing—the podcast.
Get someone who’s doing something like the AGI Safety Fundamentals course or the Center for AI Safety’s thing like that. Each week, they make a podcast episode about what they think of the week’s readings—what seemed persuasive, what didn’t, what was interesting, what was novel. For a long version, you could make an episode about each reading.
If someone’s already doing one of these courses, I think it wouldn’t be much extra work to make this podcast (after the set cost of learning how you make a podcast). It would end up having an inherently limited run (but maybe you could do future seasons about reading thru Superintelligence / MIRI chat logs / various agendas?).
If you’re reading this, you might wonder: how do I actually make a podcast? Well, here’s the basic technical stuff to get started.
Buy a decent microphone, e.g. the Blue Yeti (costs ~$100). This will make you not sound bad.
If you’re going to be talking to people who aren’t physically near you, use some service that will record both of you talking. I recommend Zencastr (free for how I use it).
Record some talking (this is the hard part). My strong advice is that if you’re doing this remotely, you should both be wearing wired headphones. Please do this in a non-echoey, non-noisy space if you can. Kitchen is bad, sound-isolated place with blankets is good.
Do some minimal editing. Don’t try to delete every um and ah, that will take way too long. You can use the computer program “audacity” for this (free), or ask me who I pay to do my editing.
Optionally, make transcripts by uploading your edited audio files to rev.com (~$1 per minute of audio). You’ll then have to re-listen to the audio and fix mistakes in the transcript. If you do this, you will probably want to make a website to put transcripts on, which will maybe involve using Github Pages or Squarespace (or maybe you just put transcripts on a pre-existing Medium/Substack/blog?)
Think of a name and logo for your podcast. Your logo needs to be exactly square and high-res.
Use a podcast hosting service. I like libsyn (~$10/month for basic plan). Upload your audio files there, write descriptions and episode titles. You should now have an RSS feed.
Submit your RSS feed to Google Podcasts, Apple Podcasts, and Spotify. This will involve googling how to do this, you might make some errors, and then it will take ages for Apple to list your podcast.
Once you’ve done all this and dealt with the inevitable hiccups, you now have a podcast! Congratulations! It is certainly possible to do all of this better, but you at least have the basics.
What do you see as the main value of idea 2?
More easily digestible discussion / analysis of AI alignment ideas. Also it might be fun to listen to.
A sad fact is that good methods to elicit accurate probabilities of the outcome of some future process, e.g. who will win the next election, give you an incentive to influence that outcome, e.g. by campaigning and voting for the candidate you said was more likely to win. But with mind uploading and the ‘right’ theory of personal identity, we can fix this!
First, suppose that you think of all psychological descendants of your current self as ‘you’, but you don’t think of descendants of your past self as ‘you’. So, if you were to make a copy of yourself tomorrow, today you would think of your copy as fully you, but your ‘main self’ and the ‘copy’ would think of themselves as totally different people, and not care particularly much about the other one winning money.
Once you’re happy with that, here’s what you do: first, make your prediction about who wins the next election. Then, save your brain state right after making the prediction. Then wait for the election to happen, and after the result is known, instantiate a copy of you from that saved brain state, and reward or punish them according to how good the prediction was. At the time of making the prediction, you’re incentivized to be right so that your future self gets rewarded, but after the prediction is made but before the election, you think of the person who gets rewarded/punished as not you, and therefore don’t want to influence the election (any more than you already did).
(NB: this assumes that acausal trade isn’t a thing.)
Objections might include:
That’s mindcrime and/or murder, which is bad.
Acausal trade is in fact a thing
blah blah technical feasibility
Why murder? No sims are being deleted in this proposal.
Ok, a much simpler way is to put yourself in storage right after making the prediction and revive you after the event happens (e.g. by not having the copy of you that hangs out between the prediction and the event). Then you don’t need the weird theory of identity.
I’m having trouble supposing this. Aren’t ALL descendants of my past selves “me”, including the me who is writing this comment? I’m good with differing degrees of “me-ness”, based on some edit-distance measure that hasn’t been formalized, but that’s not based on path, it’s based on similarity. My intuition is that it’s symmetrical.
I’m sympathetic to the idea this is a silly assumption, I just think it buys you a neat result.
Suppose there are two online identities, and you want to verify that they’re associated with the same person. It’s not too hard to verify this: for instance, you could tell one of them something secretly, and ask the other what you told the first. But how do you determine that two online identities are different people? It’s not obvious how you do this with anything like cryptographic keys etc.
One way to do it if the identities always do what’s causal-decision-theoretically correct is to have the two identities play a prisoner’s dilemma with each other, and make it impossible to enforce contracts. If you’re playing with yourself, you’ll cooperate, but if you’re playing with another person you’ll defect.
That being said, this only works if the payoff difference between both identities cooperating and both identities defecting is greater than the amount a single person controlling both would pay to convince you that they’re actually two people. Which means it only works if the amount you’re willing to pay to learn the truth is greater than the amount they’re willing to pay to deceive you.
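A minimal sketch of the prisoner’s dilemma test and the condition under which it breaks. The payoff numbers are made up, and modelling the single controller as someone who maximizes total payoff plus whatever they’d pay to look like two people is my assumption about how to formalize it.

```python
ACTIONS = ("C", "D")

# Hypothetical payoffs (identity A, identity B) for each action profile.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def payoff(player, own, other):
    """Payoff to `player` (0 or 1) when they play `own` and the other plays `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]

def separate_person_action(player):
    """Action of a self-interested player: the one that is best whatever the other does."""
    for own in ACTIONS:
        if all(payoff(player, own, other) >= payoff(player, alt, other)
               for other in ACTIONS for alt in ACTIONS):
            return own
    raise ValueError("no dominant action in this game")

def single_controller_profile(deception_value=0.0):
    """Profile chosen by one person controlling both identities.

    Assumption: they maximize total payoff, plus `deception_value` if the
    outcome (both defect) makes them look like two separate people.
    """
    def total(profile):
        bonus = deception_value if profile == ("D", "D") else 0.0
        return sum(PAYOFFS[profile]) + bonus
    return max(PAYOFFS, key=total)

two_people = (separate_person_action(0), separate_person_action(1))
one_person_honest = single_controller_profile(deception_value=0.0)
one_person_deceiving = single_controller_profile(deception_value=5.0)
```

With these numbers the gap between both cooperating and both defecting is 6 - 2 = 4 in total, so a controller who values the deception at 5 defects with themselves and the test tells you nothing.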
Here’s one way you can do it: Suppose we’re doing public key cryptography, and every person is associated with one public key. Then when you write things online you could use a linkable ring signature. That means that you prove that you’re using a private key that corresponds to one of the known public keys, and you also produce a hash of your keypair, such that (a) the world can tell you’re one of the known public keys but not which public key you are, and (b) the world can tell that the key hash you used corresponds to the public key you ‘committed’ to when writing the proof.
Actually I’m being silly, you don’t need ring signatures, just signatures that are associated with identities and also used for financial transfers.
Note that for this to work you need a strong disincentive against people sharing their private keys. One way to do this would be if the keys were also used for the purpose of holding cryptocurrency.
If you want to search for literature the relevant term is Sybil attack.
Blog post request: a summary of all the UFO stuff and what odds I should put on alien visitations of earth.
‘Seminar’ announcement: me talking quarter-bakedly about products, co-products, deferring, and transparency. 3 pm PT tomorrow (actually 3:10 because that’s how time works at Berkeley).
I was daydreaming during a talk earlier today (my fault, the talk was great), and noticed that one diagram in Dylan Hadfield-Menell’s off-switch paper looked like the category-theoretic definition of the product of two objects. Now, in category theory, the ‘opposite’ of a product is a co-product, which in set theory is the disjoint union. So if the product of two actions is deferring to a human about which action to take, what’s the co-product? I had an idea about that which I’ll keep secret until the talk, when I’ll reveal it (you can also read the title to figure it out). I promise that I won’t prepare any slides or think very hard about what I’m going to say. I also won’t really know what I’m talking about, so hopefully one of you will. The talk will happen in my personal zoom room. Message me for the passcode.
I do not have many ideas here, so it might mostly be me talking about the category-theoretic definition of products and co-products.
Avoid false dichotomies when reciting the litany of Tarski.
Suppose I were arguing about whether it’s morally permissible to eat vegetables. I might stop in the middle and say:
But this ignores the possibility that it’s neither morally permissible nor morally impermissible to eat vegetables, because (for instance) things don’t have moral properties, or morality doesn’t have permissible vs impermissible categories, or whether or not it’s morally permissible or impermissible to eat vegetables depends on whether or not it’s Tuesday.
Luckily, when you’re saying the litany of Tarski, you have a prompt to actually think about the negation of the belief in question. Which might help you avoid this mistake.
Alternate title: negation is a little tricky.
An interesting tension: it’s kind of obvious from a micro-econ view that group houses should have Pigouvian taxes on uCOVIDs[*] (where I pay housemates for the chance I get them sick) rather than caps on how many uCOVIDs everyone can incur per week—and of course both of these are better than “just sort of be reasonable” or having no system. But uCOVID caps are nice in that they make it significantly easier to coordinate with other houses—it’s much easier to figure out how risky interacting with somebody is when they can just tell you their cap, rather than having to guess how much they’ll interact with others between now and when you’ll meet them. It’s not totally clear to me which system ends up being better, although I think for people who have stable habits Pigouvian taxes end up better.
[*] a uCOVID, short for microCOVID, is a one in a million chance of catching COVID-19.
It’s μCOVID, with a μ!
Sorry, I maybe should have cared enough to copy-paste but didn’t.
(Which you get using option-m on a mac.)
(Or M-x insert-char GREEK SM<tab> L<tab>M<tab> in Emacs.)
Even easier in Helm-mode!
FYI: I am not using the dialogue matching feature. If you want to dialogue with me, your best bet is to ask me. I will probably say no, but who knows.
Research project idea: formalize a set-up with two reinforcement learners, each training the other. I think this is what’s going on in baby care. Specifically: a baby is learning in part by reinforcement learning: they have various rewards they like getting (food, comfort, control over environment, being around people). Some of those rewards are dispensed by you: food, and whether you’re around them, smiling and/or mimicking them. Also, you are learning via RL: you want the baby to be happy, nourished, rested, and not cry (among other things). And the baby is dispensing some of those rewards.
Questions:
What even happens? (I think in many setups you won’t get mutual wireheading)
Do you get a nice equilibrium?
Is there some good alignment property you can get?
Maybe a terrible alignment property?
This could also be a model of humans and advanced algorithmic timelines or some such thing.
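One possible starting point for the formalization, as a sketch only: two tabular Q-learners whose rewards are each dispensed by the other’s action. Everything here is an assumption for illustration: the action sets, the reward numbers, the feeding cost, using the previous joint action as the state, and epsilon-greedy exploration.

```python
import random

BABY_ACTIONS = ("cry", "coo")
PARENT_ACTIONS = ("ignore", "feed")
# State = previous joint action, so each learner can see how the other reacted.
STATES = [(b, p) for b in BABY_ACTIONS for p in PARENT_ACTIONS]

def baby_reward(parent_action):
    """The parent dispenses the baby's reward (hypothetical values)."""
    return 1.0 if parent_action == "feed" else 0.0

def parent_reward(baby_action, parent_action):
    """The baby dispenses the parent's reward; feeding has a small assumed cost."""
    reward = 1.0 if baby_action == "coo" else 0.0
    return reward - (0.2 if parent_action == "feed" else 0.0)

def choose(q_row, actions, epsilon, rng):
    """Epsilon-greedy action selection."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_row[a])

def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard one-step Q-learning update."""
    target = reward + gamma * max(q[next_state].values())
    q[state][action] += alpha * (target - q[state][action])

def simulate(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run both learners against each other and return their Q-tables."""
    rng = random.Random(seed)
    baby_q = {s: {a: 0.0 for a in BABY_ACTIONS} for s in STATES}
    parent_q = {s: {a: 0.0 for a in PARENT_ACTIONS} for s in STATES}
    state = ("cry", "ignore")
    for _ in range(steps):
        b = choose(baby_q[state], BABY_ACTIONS, epsilon, rng)
        p = choose(parent_q[state], PARENT_ACTIONS, epsilon, rng)
        next_state = (b, p)
        q_update(baby_q, state, b, baby_reward(p), next_state, alpha, gamma)
        q_update(parent_q, state, p, parent_reward(b, p), next_state, alpha, gamma)
        state = next_state
    return baby_q, parent_q

baby_q, parent_q = simulate(steps=2000, seed=1)
```

Whether this settles into a nice equilibrium depends on the reward numbers and on how much each learner cares about the future (gamma), which is roughly the question above. Neither learner’s environment is stationary here, since the other is learning too, so the usual Q-learning convergence guarantees don’t apply.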
This will always multiply error, every time, until you have a society, at which point the agents aren’t really doing naked RL any more because they need to be resilient enough to not get parasitized/dutchbooked.
Blog posts I could write up in the next few days:
My EDC as of late Dec 2022 [EDIT: here]
thoughts on media I consumed in 2022 (would include Kindle books + stuff I watched on Netflix and Amazon Prime Video)
I could also do my “cover” of the “you should start a blog” genre of post. [EDIT: done]
A rationalist cover of Paul Washer’s shocking youth message sermon (worth watching if you’re interested in sermons).
I guess rationalist salvation is now a matter of degree: “What fraction of your multiverse measure will experience a future optimized according to true human values?”
I always enjoy a good EDC discussion. I’ve switched this year from trying to use a phone/small tablet out in the world, to just admitting that I really do prefer a full keyboard/trackpad and an OS that’s designed for it. Right now that’s a Surface Laptop 3 (13.5″ screen, under 3 lbs). It doesn’t go everywhere with me, but it’s around often enough that I don’t try to type more than a sentence or two on my phone.
An argument for stock-picking:
I’m not sure whether I can pick stocks better than the market. But if I can, then money is more valuable to me in that world, since I have better-than-market opportunities in that world but only par-with-market opportunities in the EMH world. So I should buy stocks that look good to me, at least for a while, to check whether I’m in the world where I can do that, because it’s a transfer from a world where money is less valuable to me to one where money is more valuable.
I think this argument goes thru if you assume market returns are equal in both worlds, which I think I think.
I agree market returns are equal in expectation, but you’re exposing yourself to more risk for the same expected returns in the “I pick stocks” world, so risk-adjusted returns will be lower.
According to Michael Dickens, if you pick like 50 and they’re not too correlated it’s not actually all that much more risk. Which sort of makes sense—it’s like how you can accurately estimate population averages of stuff by taking a relatively small random sample and looking at the sample average.
Results from an experiment I just found about inside vs outside view thinking (but haven’t read the actual study, just the abstract: beware!)
Excerpts from a FB comment I made about the principle of charity. Quote blocks are from a person that I’m responding to, not me. Some editing for coherence has been made. tl;dr: it’s fine to conclude that people are acting selfishly, and even to think that it’s likely that they’re acting selfishly on priors regarding the type of situation they’re in.
If this were true, then one shouldn’t engage in charitable discourse. People often do things for entirely selfish reasons. I can determine this because I often do things for entirely selfish reasons, and in general things put under selection pressure will behave in accordance with that pressure. I could elaborate or develop this point further, but I’d be surprised to learn that you disagreed. I further claim that you shouldn’t assume that something isn’t the case if it is often the case.
That being said, the “non-selfish” qualifier doesn’t appear in what Wikipedia thinks the principle of charity is, nor does it appear in r/slatestarcodex’s sidebar description of what it means to be charitable, and I don’t understand why you included it. In fact, in all of these cases, the principle of charity seems like it’s meant to apply to arguments or stated beliefs rather than actions in general...
You should in fact assume that tech companies are in it for the money and have no principles, at least until seeing contrary evidence, since that’s the standard and best model of corporations (although you shouldn’t necessarily assume the same of their employees). Regarding “we shouldn’t do the same”, I wholeheartedly reject the implication that if people don’t like having certain inferences drawn about them, one shouldn’t draw those inferences. Sometimes the truth is unflattering!
Whether it is more charitable to assume someone is or isn’t selfish can depend on context.
I think you (and Wikipedia and Scott) are limiting your ideas of what the principle really means. _IF_ you only care about rationality, it’s about assuming rationality. For those of us in conversations where we _ALSO_ care about intent, nuance, and connotation, it can include assuming goodwill and best intentions of your conversational partners.
In all cases, the assumption is only a prior—you’re getting a lot of evidence in the discussion, and you don’t need to cling to a false belief when shown that your opponent and their statements are not correct or useful.