Apparently the new ChatGPT model is obsessed with the immaculate conception of Mary
I mean, “shoggoth” is not that far off from biblically accurate angels… ;-)
I’d say that in most contexts in normal human life, (3) is the thing that makes this less of an issue for (1) and (2). If the thing I’m hearing about is real, I’ll probably keep hearing about it, and from more sources. If I come across 100 new crazy-seeming ideas and decide to indulge them 1% of the time, and so do many other people, that’s usually enough to amplify the ones that (seem to) pan out. By the time I hear about the thing from 2, 5, or 20 sources, I will start to suspect it’s worth thinking about at a higher level.
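To put rough numbers on that amplification intuition, here is a toy sketch in Python (the audience size and the 1% indulgence rate are assumptions for illustration, not claims about any real community):

```python
# Toy model of the "1% indulgence is enough" intuition.
# Assumptions (invented for illustration): N people each hear a new
# crazy-seeming idea, and each independently decides to look into it
# with probability p. Ideas that pan out get re-shared by whoever
# looked into them; bogus ones mostly don't.
N = 10_000   # people who hear the idea
p = 0.01     # chance any one of them bothers to investigate

expected_investigators = N * p          # ~100 people dig in
prob_nobody_checks = (1 - p) ** N       # chance the idea dies unexamined

print(f"expected investigators: {expected_investigators:.0f}")
print(f"P(no one checks at all): {prob_nobody_checks:.2e}")
# With even a modest audience, a real finding almost surely gets a few
# independent checks, which is what lets it reach you later from 2, 5,
# or 20 sources.
```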
Exactly. More fundamentally, that is not a probability graph, it’s a probability density graph, and we’re not shown the line beyond 2032, so we just have to assume the integral from 2100 to infinity is >10% of the integral from 0 to infinity. Infinity is far enough away that the decay doesn’t even need to be all that slow for the total to be that high.
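To make that concrete, here is a toy tail calculation (the post-2032 mass fraction and the half-life below are numbers I am inventing for illustration, not values read off the graph):

```latex
% Assume a fraction q of the total probability mass lies after the last
% shown year t_0 = 2032, and that beyond t_0 the density decays
% exponentially with half-life h:
\[
  p(t) \propto e^{-\lambda (t - t_0)}, \qquad \lambda = \frac{\ln 2}{h},
  \qquad \text{share of total mass beyond 2100} = q\, e^{-\lambda (2100 - t_0)} .
\]
% For example, with q = 0.4 and h = 35 years:
\[
  0.4 \cdot e^{-\frac{\ln 2}{35} \cdot 68} \approx 0.4 \cdot 0.26 \approx 0.10 .
\]
```

So a post-2032 half-life of only ~35 years is already enough to leave roughly 10% of the distribution past 2100.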
I second what both @faul_sname and @Noosphere89 said. I’d add: Consider ease and speed of integration. Organizational inertia can be a very big bottleneck, and companies often think in FTEs. Ultimately, no, I don’t think it makes sense to have anything like 1:1 replacement of human workers with AI agents. But, as a process occurring in stages over time, if you can do that, then you get a huge up-front payoff, and you can use part of the payoff to do the work of refactoring tasks/jobs/products/companies/industries to better take advantage of what else AI lets you do differently or instead.
“Ok, because I have Replacement Agent AI v1 I was able to fire all the people with job titles A-D, now I can hire a dozen people to figure out how to use AI to do the same for job titles E through Q, and then another dozen to reorganize all the work that was being done by A-Q into more effective chunks appropriate for the AI, and then three AI engineers to figure out how to automate the two dozen people I just had to hire...”
This was really interesting, thanks! Sorry for the wall of text. TL;DR version:
I think these examples reflect, not quite exactly willingness to question truly fundamental principles, but an attempt at identification of a long-term vector of moral trends, propagated forward through examples. I also find it some combination of suspicious/comforting/concerning that none of these are likely to be unfamiliar (at least as hypotheses) to anyone who has spent much time on LW or around futurists and transhumanists (who are probably overrepresented in the available sources regarding what humans think the world will be like in 300 years).
To add: I’m glad you mentioned in a comment that you removed examples you thought would lead to distracting object-level debates, but I think at minimum you should mention that in the post itself. It means I can’t trust anything else I think about the response list, because it’s been pre-filtered to only include things that aren’t fully taboo in this particular community. I’m curious if you think the ones you removed would align with the general principles I try to point at in this comment, or if they have any other common trends with the ones you published?
Longer version:
My initial response is: good work, although… maybe my reading habits are just too eclectic to give me a fair intuition here, but all of these are familiar to me, in the sense that I have seen works and communities that openly question them. That doesn’t mean the models are wrong—you specified not being questioned by a ‘large’ group. The even-harder-than-this problem I’ve yet to see models handle well is genuine whitespace analysis of some set of writings and concepts. Don’t get me wrong, in many ways I’m glad the models aren’t good at this yet. But that seems like where this line of inquiry is leading? I’m not even sure that’s fundamentally important for addressing the concerns in question—I’ve been known to say that humans have been debating details of the same set of fundamental moral principles for as far back as we have records. And also, keep in mind that within the still-small-but-growing-and-large-enough-for-AI-to-easily-recognize EA community there are, or have been, open debates about things like “Should we sterilize the biosphere?” and “Different species obviously have non-zero, but finite and different, levels of intrinsic moral worth, so does that mean their welfare might sometimes be more important than human welfare?” It’s really hard to find a taboo that’s actually not talked about semi-publicly in at least some searchable forums.
I do kinda wish we got to see the meta-reasoning behind how the models picked these out. My overall sense is that to the degree moral progress is a thing at all, it entails a lot of the same factors as other kinds of progress. A lot of our implementations of moral principles are constrained by necessity, practicality, and prejudice. Over time, as human capabilities advance, we get to remove more of the epicycles and make the remaining core principles more generally applicable.
For example, I expect at some point in the next 300 years (plausibly much much sooner) humanity will have the means to end biological aging. This ends the civilizational necessity of biological reproduction at relatively young ages, and probably also the practical genetic problems caused by incest. This creates a fairly obvious set of paths for “Love as thou wilt, but only have kids when you can be sure you or someone else will give them the opportunity for a fulfilling life such that they will predictably agree they would have wanted to be created” to overcome our remaining prejudices and disgust responses and become more dominant.
Also, any taboo against discussing something that is fundamentally a measurable or testable property of the world is something I consider unlikely to last into the far future, though taboos against discussing particular responses to particular answers to those questions might last longer.
@jbash made the good point that some of these would have been less taboo 300 years ago. I think that also fits the mold. 500 years ago Copernicus (let alone the ancient Greeks millennia prior) faced weaker taboos against heliocentrism than Galileo in part because in his time the church was stronger and could tolerate more dissent. And 300 years ago questioning democracy was less taboo than now in part because there were still plenty of strong monarchs around making sure people weren’t questioning them, and that didn’t really reverse until the democracies were strong but still had to worry about the fascists and communists.
Credit cards are kind of an alternative to small claims court, and there are various reputational and other reasons that allow ordinary business to continue even if it is not in practice enforced by law.
True, but FWIW this essentially puts unintelligible enforcement in the hands of banks instead of the police. Which is probably a net improvement, especially under current conditions. But it does have its own costs. My wife is on the board of a nonprofit that last year got a donation, then the donor’s spouse didn’t recognize the charge and disputed it. The donor confirmed verbally and in writing, both to the nonprofit and to the credit card company, that the charge was valid. The nonprofit provided all requested documentation and another copy of the donor’s written confirmation. The credit card company refused both to reinstate the charge and to reverse the fee.
I walked around the neighborhood and took some photos.
As far as I’m concerned, this is almost literally zero evidence of anything, in any inhabited area, except to confirm or deny very specific, narrow claims. To assume otherwise, you’d have to look at my own photos from my last few years of traveling and conclude that no one ever goes to national parks and that the visitation numbers and the reports people write of crowding and bad behavior are all lies.
As things stand today, if AGI is created (aligned or not) in the US, it won’t be by the USG or agents of the USG. It will be by a private or public company. Depending on the path to get there, there will be more or less USG influence of some sort. But if we’re going to assume the AGI is aligned to something deliberate, I wouldn’t assume AGI built in the US is aligned to the current administration, or at least I’d assume that to a significantly lesser degree than I’d assume AGI built in China by a Chinese company would be aligned to the current CCP.
For more concrete reasons regarding national ideals, the US has a stronger tradition of self-determination and shifting values over time, plausibly reducing risk of lock-in. It has a stronger tradition (modern conservative politics notwithstanding) of immigration and openness.
In other words, it matters a lot whether the aligned US-built AGI is aligned to the Trump administration, the Constitution, the combined writings of the US founding fathers and renowned leaders and thinkers, the current consensus of the leadership at Google or OpenAI, the overall gestalt opinions of the English-language internet, or something else. I don’t have enough understanding to make a similar list of possibilities for China, but some of the things I’d expect it would include don’t seem terrible. For example, I don’t think a genuinely-aligned Confucian sovereign AGI is anywhere near the worst outcome we could get.
I won’t comment on your specific startup, but I wonder in general how an AI Safety startup becomes a successful business. What’s the business model? Who is the target customer? Why do they buy? Unless the goal is to get acquired by one of the big labs, in which case, sure, but again, why or when do they buy, and at what price? Especially since they already don’t seem to be putting much effort into solving the problem themselves despite having better tools and more money to do so than any new entrant startup.
I really, really hope at some point the Democrats will acknowledge the reason they lost is that they failed to persuade the median voter of their ideas, and/or adopt ideas that appeal to said voters. At least among those I interact with, there seems to be a denial of the idea that this is how you win elections, which is a prerequisite for governing.
That seems very possible to me, and if and when we can show whether something like that is the case, I do think it would represent significant progress. If nothing else, it would help tell us what the thing we need to be examining actually is, in a way we don’t currently have an easy way to specify.
If you can strike in a way that prevents retaliation that would, by definition, not be mutually assured destruction.
Correct, which is in part why so much effort went into developing credible second strike capabilities, building up all parts of the nuclear triad, and closing the supposed missile gap. Because both the US and USSR had sufficiently credible second strike capabilities, it made a first strike much less strategically attractive and reduced the likelihood of one occurring. I’m not sure how your comment disagrees with mine? I see them as two sides of the same coin.
If you live in Manhattan or Washington DC today, you basically can assume you will be nuked first, yet people live their lives. Granted people could behave differently under this scenario for non-logical reasons.
My understanding is that in the Cold War, a basic MAD assumption was that if anyone were going to launch a first strike, they’d try to do so with overwhelming force sufficient to prevent a second strike, hitting everything at once.
I agree that consciousness arises from normal physics and biology, there’s nothing extra needed, even if I don’t yet know how. I expect that we will, in time, be able to figure out the mechanistic explanation for the how. But right now, this model very effectively solves the Easy Problem, while essentially declaring the Hard Problem not important. The question of, “Yes, but why that particular qualia-laden engineered solution?” is still there, unexplained and ignored. I’m not even saying that’s a tactical mistake! Sometimes ignoring a problem we’re not yet equipped to address is the best way to make progress towards getting the tools to eventually address it. What I am saying is that calling this a “debunking” is misdirection.
I’ve read this story before, including and originally here on LW, but for some reason this time it got me thinking: I’ve never seen a discussion of what this tradition meant for early Christianity, before the Christians decided to just declare (supposedly after God sent Peter a vision, an argument that only works by assuming the conclusion) that the old laws no longer applied to them. After all, the Rabbi Yeshua ben Joseph (as the Gospels sometimes call him) explicitly declared the miracles he performed to be a necessary reason why not believing in him was a sin.
We apply different standards of behavior for different types of choices all the time (in terms of how much effort to put into the decision process), mostly successfully. So I read this reply as something like, “Which category of ‘How high a standard should I use?’ do you put ‘Should I lie right now?’ in?”
A good starting point might be: one rank higher than you would apply if you weren’t lying; see how it goes and adjust over time. If I tried to make an effort-ranking of all the kinds of tasks I regularly engage in, I expect there would be natural clusters I could roughly draw an axis through. E.g. I put more effort into client-facing or boss-facing tasks at work than I do into casual conversations with random strangers. I put more effort into setting the table, washing dishes, and plating food for holidays than for a random Tuesday. Those are probably more than one rank apart, but for any given situation, I think the bar for lying should be somewhere in the vicinity of that size of gap.
One of the factors to consider, that contrasts with old-fashioned hostage exchanges as described, is that you would never allow your nation’s leaders to visit any city that you knew had such an arrangement. Not as a group, and probably not individually. You could never justify doing this kind of agreement for Washington DC or Beijing or Moscow, in the way that you can justify, “We both have missiles that can hit anywhere, including your capital city.” The traditional approach is to make yourself vulnerable enough to credibly signal unwillingness to betray one another, but only enough that there is still a price at which you would make the sacrifice.
Also, consider that compared to the MAD strategy of having launchable missiles, this strategy selectively disincentivizes people from moving to whatever cities are the subject of such agreements, which are probably your most productive and important cities.
It’s a subtle thing. I don’t know if I can eyeball two inches of height.
Not from a picture, but IRL, if you’re 5′11″ and they claim 6′0″, you can. If you’re 5′4″, probably not so much. Which is good, in a sense, since the practical impact of this brand of lying on someone who is 5′4″ is very small, whereas unusually tall women may care whether their partner is taller or shorter than they are.
This makes me wonder what the pattern looks like for gay men, and whether their reactions to it and feelings about it differ from those of straight women.
Lie by default whenever you think it passes an Expected Value Calculation to do so, just as for any other action.
How do you propose to approximately carry out such a process, and how much effort do you put into pretending to do the calculation?
I’m not as much a stickler/purist/believer in honesty-as-always-good as many around here; I think there are many times when deception of some sort is a valid, good, or even morally required choice. I definitely think e.g. Kant was wrong about honesty as a maxim, even within his own framework. But in practice, I think your proposed policy sets much too low a standard, and the gap between what you proposed and “Lie by default whenever it passes an Expected Value Calculation to do so, just as for any other action” is enormous, both in theoretical defensibility and in the skillfulness (and internal levels of honesty and self-awareness) required to successfully execute it.
I personally wouldn’t want to do a PhD that didn’t achieve this!
Agreed. It was somewhere around reason #4 that I quit my PhD program as soon as I qualified for a master’s in passing.
That’s true, and you’re right, the way I wrote my comment overstates the case. Every individual election is complicated, and there’s a lot more than one axis of variation differentiating candidates and voters. The whole process of Harris becoming the candidate made this particular election weird in a number of ways. And as a share of the electorate, there are many fewer swing voters than there were a few decades ago, and they are not conveniently sorted into large, coherent blocks.
And yet, it’s also true that as few as ~120,000 votes in WI, MI, and PA could have swung the result, three moderate states that have flipped back and forth across each of the past four presidential elections. Only slightly more for several other combinations of states. It’s not some deep mystery who lives in the rust belt, and what positions on issues a few tens of thousands of voters who are on the fence might care about. It’s not like those issues are uncorrelated, either. And if you look at the last handful of elections, a similar OOM of voters in a similar set of states could have swung things either way, each time.
And it’s true that Harris underperformed Biden-2020 by vote share in every state but Utah (and 37.7% vs 37.8% does not matter to the outcome in any plausible scenario). If I’m reading the numbers correctly, she also received fewer votes numerically than Biden in all but 6 states.
So yes: I can very easily imagine scenarios where you’re right, and the fact that we don’t meet the theoretical assumptions necessary for the median voter theorem to apply means we can’t assume an approximation of it in practice. It’s even possible, if the Dems had really started organizing sustained and wide-ranging GOTV campaigns fifteen years ago, that there could be the kinds of blue wave elections I keep getting told are demographic inevitabilities just around the corner, as long as they keep moving further towards the current set of progressive policy goals. But what I cannot imagine is that, in July 2024, in the country as it actually existed, Harris wouldn’t have done better by prioritizing positions (many of which she actually already said she held!) that a relative handful of people in an actual handful of states consistently say they care the most about, and explaining why Trump’s usually-only-vaguely-gestured-at plans would make many of their situations worse. Would it have been enough? I don’t know. But it is a better plan than what happened, if what you want is to win elections in order to govern.