I’ve published in this area, so I have some meta-comments about this work.
First the positive:
1. Assurance cases are the state of the art for making sure things don’t kill people in a regulated environment. Ever wonder why planes are so safe? Safety cases. Because the actual process of making one is so unsexy (GSNs make me want to cry), people tend to ignore them, so you deserve lots of credit for somehow getting x-risk people to upvote this. More lesswronger types should be thinking about safety cases.
2. I do think you have good / defensible arguments overall, minus minor quibbles that don’t matter much.
Some bothers:
1. Since I used to be a little involved, I am perhaps a bit too aware of the absolutely insane amount of relevant literature that was not mentioned. To me, the introduction made it sound a little bit like the specifics of applying safety cases to AI systems have not been studied. That is very, very, very not true.
That’s not to say you don’t have a contribution! Just that I don’t think it was placed well in the relevant literature. Many have done safety cases for AI, but they usually do it as part of concrete applied work on drones or autonomous vehicles, not x-risk pie-in-the-sky stuff. I think your arguments would be greatly improved by referencing back to this work.
I was extremely surprised to see so few of the (to me) obvious suspects referenced, particularly more from York. Some labs whose people publish a lot in this area:
University of York Institute for Safe Autonomy
NASA Intelligent Systems Division
Waterloo Intelligent Systems Engineering Lab
Anything funded by the DARPA Assured Autonomy program
2. Second issue is a little more specific, related to this paragraph:
To mitigate these dangers, researchers have called on developers to provide evidence that their systems are safe (Koessler & Schuett, 2023; Schuett et al., 2023); however, the details of what this evidence should
look like have not been spelled out. For example, Anderljung et al vaguely state that this evidence should be “informed by evaluations of dangerous capabilities and controllability” (Anderljung et al., 2023). Similarly, a recently proposed California bill asserts that developers should provide a “positive safety determination” that “excludes hazardous capabilities” (California State Legislature, 2024). These nebulous requirements raise questions: what are the core assumptions behind these evaluations? How might developers integrate other kinds of evidence?
The reason the “nebulous requirements” aren’t explicitly stated is that when you make a safety case, you assure the safety of a system against specific hazards relevant to the system you’re assuring. These are usually identified by performing a HAZOP analysis or similar. Not all AI systems have the same list of hazards, so it’s obviously dubious to expect you can list requirements a priori. This should have been stated, imo.
I hear what you’re saying. I probably should have made the following distinction:
A technology in the abstract (e.g. nuclear fission, LLMs)
A technology deployed to do a thing (e.g. nuclear fission in a power plant, an LLM used for customer service)
The question I understand you to be asking is essentially: how do we make safety cases for AI agents generally? I would argue that’s more case 1 than case 2, and as I understand it, safety cases are basically only ever applied to case 2. The nuclear facilities document you linked definitely is case 2.
So yeah, admittedly the document you were looking for doesn’t exist, but that doesn’t really surprise me. If you start looking for narrowly scoped safety principles for AI systems, you start finding them everywhere. For example, a search for “artificial intelligence” on the ISO website results in 73 standards.
Just a few relevant standards, though I admit standards are exceptionally boring (also, many aren’t public, which is dumb):
UL 4600 standard for autonomous vehicles
ISO/IEC TR 5469 standard for AI safety stuff generally (this one is decently interesting)
ISO/IEC 42001 this one covers what you do if you set up a system that uses AI
You also might find this paper a good read: https://ieeexplore.ieee.org/document/9269875