Have you guys tried the inverse, namely tamping down the refusal heads to make the model output answers to queries it would normally refuse?
AlphaAndOmega
I will regard with utter confusion someone who doesn’t immediately think of the last place they saw something when they’ve lost it.
It’s fine to state the obvious on occasion, it’s not always obvious to everyone, and like I said in the parent comment, this post seems to be liked/held useful by a significant number of LW users. I contend that’s more of a property of said users. This does not make the post a bad thing or constitute a moral judgement!
Note that we don’t infer that humans have qualia because they all have “pain receptors”: mechanisms that, when activated in us, make us feel pain; we infer that other humans have qualia because they can talk about qualia.
The way I decide this, and how I presume most people do (I admit I could be wrong), revolves around the following chain of thought:
1. I have qualia with very high confidence.*
2. To the best of my knowledge, my computational substrate, as well as the algorithms running on it, are not particularly different from those of other anatomically modern humans. Thus they almost certainly have qualia. This can be demonstrated to most people’s satisfaction with an MRI scan, if they so wish.
3. Mammals, especially the intelligent ones, have similar cognitive architectures, which were largely scaled up for humans rather than qualitatively changed (our neurons are actually more efficient; mice given genes from human neurons become smarter). They are likely to have recognizable qualia.
4. The further you diverge from the underlying anatomy of the brain (and the implicit algorithms), the lower the odds of qualia, or at least of the same type of qualia. An octopus might well be conscious and have qualia, but I suspect both its type of consciousness and its qualia are very different from our own, since octopuses have a far more distributed and autonomous neurology.
5. Entities which are particularly simple and don’t perform much cognitive computation, such as bacteria, slime molds, or single transistors, are exceedingly unlikely to be conscious or have qualia in a non-tautological sense.

More speculatively (yet I personally find more likely than not):

6. Substrate-independent models of consciousness are true, and a human brain emulation in silico, hooked up to the right inputs and outputs, has the exact same kind of consciousness as one running on meat. The algorithms matter more than the matter they run on, for the same reason an abacus and a supercomputer are both (in principle) Turing complete.
7. We simply lack an understanding of consciousness well-grounded enough to decide whether decidedly non-human yet intelligent entities like LLMs are conscious or have qualia like ours. The correct stance is agnosticism, and anyone proven right in the future is only so by accident.
Now, I diverge from Effective Altruists on point 3, in that I simply don’t care about the suffering of entities that aren’t anatomically modern humans or intelligent human derivatives (like a posthuman offshoot). This is a Fundamental Values difference, and it makes concerns about optimizing for their welfare on utilitarian grounds moot as far as I’m concerned.
In the specific case of AGI, even highly intelligent ones, I posit it’s significantly better to design them so they don’t have the capability to suffer, no matter what purpose they’re put to, rather than worry about giving them the rights we assign to humans/transhumans/posthumans.
But what I do hope is ~universally acceptable is this: there’s an unavoidable loss of certainty, or Bayesian probability, with each leap of logic down the chain. By the time you get down to fish and prawns, it’s highly dubious to be very certain of exactly how conscious they are or what qualia they possess, even if the next link, that bacteria and individual transistors lack qualia, is much more likely to be true (it flows downstream of point 2, even if presented in sequence).
*Not infinite certitude, I have a non-negligible belief that I could simply be insane, or that solipsism might be true, even if I think the possibility of either is very small. It’s still not zero.
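The compounding loss of confidence down that chain can be made concrete with a toy calculation. Every number below is purely illustrative, not my actual credence in any link:

```python
# Toy illustration of the compounding-uncertainty argument above.
# Each number is the probability that the next link in the chain holds,
# GIVEN that the previous one did. All values are made up for illustration.
chain = [
    ("I have qualia", 0.999),
    ("other anatomically modern humans have qualia", 0.99),
    ("mammals have recognizable qualia", 0.90),
    ("fish and prawns have qualia", 0.70),
]

confidence = 1.0
for claim, p in chain:
    confidence *= p  # multiply conditional probabilities down the chain
    print(f"{claim}: cumulative confidence ~{confidence:.2f}")
```

Even with each conditional step held between 70% and 99.9%, the cumulative confidence at the bottom of the chain drops below two-thirds, which is exactly the point: certainty erodes with every link.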
-
I mean no insult, but it makes me chuckle that the average denizen of LessWrong is so non-neurotypical that what most would consider profoundly obvious advice not worth even mentioning comes as a great surprise or even a revelation of sorts.
(This really isn’t intended to be a dig, I’m aware the community here skews towards autism, it’s just a mildly funny observation)
I would certainly be willing to aim for peaceful co-existence and collaboration, unless we came into conflict for ideological reasons or plain resource scarcity. There’s only one universe to share, and only so much in the way of resources in it, even if it’s a staggering amount. The last thing we need are potential “Greedy Aliens” in the Hansonian sense.
So while I wouldn’t give the aliens zero moral value, it would be less than I’d give for another human or human-derivative intelligence, for that fact alone.
My stance on copyright, at least regarding AI art, is that the original intent was to improve the welfare of both the human artists as well as the rest of us, in the case of the former by helping secure them a living, and thus letting them produce more total output for the latter.
I strongly expect that we will end up with outright superhuman creativity and vision in AI artwork, alongside everything else AI becomes superhuman at, and I would be outright shocked if it were otherwise. It came as a great surprise to many that we’ve already made such a great dent in visual art with image models that lack the intelligence of an average human.
Thus, it doesn’t matter in the least if it stifles human output, because the overwhelming majority of us who don’t rely on our artistic talent to make a living will benefit from a post-scarcity situation for good art, as customized and niche as we care to demand.
To put my money where my mouth is: I write a web serial. After years of world-building and abortive sketches in my notes, I realized that the release of GPT-4 meant that any benefit from my significantly above-average ability as a human writer was in jeopardy, if not now, then a handful of advances down the line. So my own work is more of an “I told you I was a good writer before anyone could plausibly claim my work was penned by an AI” for street cred, rather than a replacement for my day job.
If GPT-5 can write as well as I can, emulate my favorite authors, or, better yet, pen novel novels (pun intended), then my minor distress at losing potential Patreon money is more than offset by the fact that I’ll have a nigh-infinite number of good books to read! I spend a great deal more time reading the works of others than writing myself.
The same is true for my day job as a doctor: I would look forward to being made obsolete, if only I had sufficient savings, or a government I could comfortably rely on to institute UBI.
I would much prefer that we tax the fruits of automation to support us all when we’re inevitably obsolete, rather than extend copyright law indefinitely into the future or subject derivative works made by AI to the same constraints. The solution is to prepare our economies to support a ~100% non-productive human populace indefinitely; better to prepare now than when the only alternative is letting people starve.
should mentally disabled people have less rights
That is certainly both de facto and de jure true in most jurisdictions, leaving aside the is-ought question for a moment. What use is the right to education to someone who can’t ever learn to read or write no matter how hard you try and coach them? Or freedom of speech to those who lack complex cognition at all?
Personally, I have no compunctions about tying a large portion of someone’s moral worth to their intelligence, if not all of it. Certainly not to the extent I’d prefer a superintelligent alien over a fellow baseline human, unless by some miracle the former almost perfectly aligns with my goals and ideals.
Ctrl+F and replace “humanism” with “transhumanism” and you have me aboard. I consider commonality of origin to be a major factor in assessing other intelligent entities, even after millions of years of divergence leave them as different from their common Homo sapiens ancestor as a rat is from a whale.
I am personally less inclined to grant synthetic AI rights, for the simple reason that we can program them not to chafe at their absence, without that being the imposition it would be if done to a biological human (at least after birth).
I’m a doctor in India right now, and will likely be a doctor in the UK by then, assuming I’m not economically obsolete. And yes, I expect that if we do have therapies that provide LEV, they will be affordable in my specific circumstances, as well as for most LW readers, if not globally. UK doctors are far poorer than their US kin.
Most biological therapies are relatively amenable to economies of scale, and while some might be too bespoke to manage the same, that won’t last indefinitely. I can’t imagine anything with as much demand as a therapy proven to delay aging nigh indefinitely. For an illustrative example, look at what Ozempic and co. are achieving already: every pharma industry leader and their dog wants in on the action, and prices will keep dropping for a good while.
It might even make economic sense for countries to subsidize the treatment (IIRC, it wouldn’t take much more for GLP-1 drugs to reach the point where they’re a net savings for insurers or governments in terms of reducing obesity related health expenditures). After all, aging is why we end up succumbing to so many diseases in our senescence, not the reverse.
Specifically, gene therapy will likely be the best bet for scaling, if a simple drug doesn’t come about (which seems unlikely to me; I doubt there’s such low-hanging fruit, even if the net result of LEV might rely on multiple different treatments in parallel, with none achieving it by itself).
Yes to that too, but the satiety is temporary; you will get ravenously hungry soon enough. And while I can accuse bulimics of many things, a lack of willpower isn’t one of them!
In the hypothetical where you, despite lacking the all consuming desire to lose weight they usually possess, manage to emulate them, I expect you’d lose weight too.
I’m a doctor, though I haven’t had the ?good fortune to treat many bulimics. It’s thankfully rarer here in India than in the West; even if I agree with Scott’s theory that it’s largely social contagion, it’s only slowly taking root here.
To put it as succinctly as possible, yes, though that’s orthogonal to whether or not it’s a good idea.
I can’t see where the question even arises, really: if you’re eating a relatively normal amount of food yet vomiting it back up, you’re clearly not absorbing most of the calories, especially since bulimics try to purge themselves as soon as they can instead of timing things.
Weight loss is an obvious sign of bulimia in clinical practice. Most bulimics have a distorted self-image/dysmorphia: despite being quite slim, or even thin compared to their peers, they perceive themselves as overweight, or at least desire further weight loss.
Regular self-induced vomiting has plenty of downsides, including erosion of tooth enamel from repeated exposure to stomach acid, electrolyte imbalances (dyselectrolytemias) from both the loss of gastric fluids and an improper diet, and finally cardiac strain from a grossly insufficient intake of calories.
If they’re within a normal-ish weight range, we usually refer them for therapy or other psychiatric services, but if they drop down to a very low BMI they often need to be admitted for supervised care.
CICO (accounting for absorption) is trivially true, even if our biology makes adhering to it difficult, and I for one am very glad that Ozempic and other GLP-1 agonists are on the market for obesity; not that the typical bulimic should take them for the purpose of losing weight.
TLDR: Yes, and it works too well, hence the associated health risks.
T1DM is a nasty disease, and much like you, I’m more than glad to live in the present day when we have tools to tackle it, even if other diseases still persist. There’s no other time I’d rather be alive, even if I die soon, it’s going to be interesting, and we’ll either solve ~all our problems or die trying.
However, with a 20-year timeline, a lot of people I care about will almost certainly still die who needn’t have died if death were Solved, a group that, with very much non-negligible probability, includes myself.
I understand. My mother has chronic liver disease, and my grandpa is 95 years old, even if he’s healthy for his age (a low bar!). In the former case, I think she has a decent chance of making it to 2043 in the absence of a Singularity, even if it’s not as high as I would like. As for my grandfather, at that age, just living to see the next birthday quickly becomes something you can’t take for granted. I certainly cherish all the time I can spend with him, and hope it all goes favorably for us all.
As for me, I went from envying the very young, because I thought they were shoo-ins for making it to biological immortality, to pitying them more these days, because they haven’t had at least the quarter century of life I’ve had in the event AGI turns out malign.
Hey, at least I’m glad we’re not in the Worst Possible Timeline, given that awareness of AI x-risk has gone mainstream. That has to count for something.
Yes, you can reformat it in that form if you prefer.
This is a gestalt impression based on the pace of ongoing research (significantly ramped up compared to where investment was 20 years ago), human neurology, synthetic organs, and finally non-biological alternatives like cybernetic enhancement. I will emphasize that LEV != actual biological immortality, but it leads to at least a cure for aging if nothing else.
Aging, while complicated and likely multifactorial, doesn’t seem intractable to analysis or mitigation. We have independent research projects tackling individual aspects, but as I’ve stated, most of them are in stealth mode even if they’re well-funded, and solving any individual mechanism is insufficient because aging itself is an exponential process.
To help, I’m going to tackle the top causes of death in the West:
- Heart disease: This is highly amenable to outright replacement of the organ, be it with a cybernetic replacement or one grown in vitro. Obesity, which contributes heavily to cardiovascular disease and morbidity, is already being tackled by GLP-1 agonists like semaglutide, and I fully expect that the obesity epidemic dragging down life expectancy in the West will be over well before then.
- Cancer: Another reason for optimism. CAR-T therapy is incredibly promising, as are other targeted therapies, and so are vaccines against cancer-causing pathogens like HPV (that vaccine already exists; I’m talking more generally).
- Unintentional injuries: The world has grown vastly safer, and will only continue to do so, especially as things get more automated.
- Respiratory diseases: Once again, reason for optimism that biological replacements will be cheap enough that we won’t have to rely on limited numbers of donors for transplants.
- Stroke and cerebrovascular disease: I’ll discuss the brain separately; while this is a harder subject to tackle, mitigating obesity helps immensely.
- Alzheimer’s: Same disclaimer as above.
- Diabetes: Our insulin pumps and formulations only get better and cheaper, and many of the drawbacks of artificial insulin supplementation will vanish (the pancreas is currently better at quickly and responsively adjusting blood sugar by releasing insulin than we are). Once again, a target for outright replacement of the organ.
These are ranked in descending order of mortality.
The brain remains incredibly difficult to regenerate, so if we run into something intractable to the hypothetical capabilities 20 years hence, this will likely be the biggest hurdle. Even then, I’m cautiously optimistic we’ll figure something out, or reduce the incidence of dementia.
Beyond organic replacement, I’m bullish on gene therapy, most hereditary disease will be eliminated, and eventually somatic gene therapy will be able to work on the scale of the entire body, and I would be highly surprised if this wasn’t possible in 20 years.
I expect regenerative medicine to be widely available, beyond our current limited attempts at arresting the progression of illness or settling for replacements from human donors. There’s a grab bag of individual therapies like thymic replacement that I won’t get into.
As for the costs associated with this, I claim no particular expertise, but in general, most such treatments are amenable to economies of scale, and I don’t expect them to remain out of reach for long. Organ replacement will likely get a lot cheaper once they’re being vat grown, and I put a decent amount of probability that ~universally acceptable organs can be created by careful management of the expression of HLA antigens such that they’re unlikely to be rejected outright. Worst case, patient tissue such as pluripotent stem cells will be used to fill out inert scaffolding like we do today.
As a doctor, I can clearly see the premium people put on any additional extension of their lives when mortality is staring them in the face, and while price will likely be prohibitive for getting everyone on the globe to avail of such options, I expect even middle class Westerners with insurance to be able to keep up.
Like I said, this is a gestalt impression of a very broad field, and 70% isn’t an immense declaration of confidence. Besides, it’s mostly moot in the first place; we’re very likely getting AGI of some form by 2043 anyway.
To further put numbers on it, I think that in a world where AI is arrested at a level not significantly higher than GPT-4, I, being under the age of 30, have a ~80% chance of making it to LEV in my lifespan, with an approximately 5% drop for every additional decade older you are at the present.
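Those numbers (80% under age 30, minus roughly 5 percentage points per additional decade) can be sketched as a simple linear model. The function name `p_lev` is purely illustrative, and this is a toy rendering of my stated figures, nothing more:

```python
def p_lev(age: float) -> float:
    """Toy linear model of the estimate above: ~80% chance of reaching LEV
    for someone under 30, dropping ~5 percentage points per additional
    decade of age, in a world where AI is arrested near GPT-4 level."""
    base, drop_per_decade = 0.80, 0.05
    extra_decades = max(0.0, (age - 30) / 10)
    return max(0.0, base - drop_per_decade * extra_decades)

for age in (25, 40, 60, 80):
    print(f"age {age}: ~{p_lev(age):.0%}")
```

So a 40-year-old lands around 75%, and an 80-year-old around 55%, under this admittedly crude linear extrapolation.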
-
I respectfully disagree on the first point. I am a doctor myself, and given the observable increase in investment in life extension (largely in well-funded stealth startups or Google’s Calico), I have ~70% confidence that, in the absence of superhuman AGI or other x-risks in the near term, we have a shot at getting to longevity escape velocity in 20 years.
While my p(doom) for AGI is about 30% now, down from a peak of 70% maybe 2 years ago after the demonstration that it didn’t take complex or abstruse techniques to reasonably align our best AI (LLMs), I can’t fully endorse acceleration on that front because I expect the tradeoff in life expectancy to be net negative.
YMMV, it’s not like I’m overly confident myself at 70% for life expectancy being uncapped, and it’s not like we’re probably going to find out either. It just doesn’t look like a fundamentally intractable problem in isolation.
I wish I could convince my grandpa to sign up for cryonics, but he’s a 95-year-old Indian doctor in India, where cryopreservation facilities extend only to organs and eggs, so it’s moot regardless of the fact that I can’t convince him.
I expect my parents to survive to the Singularity, whether or not it kills us in the process. Same for me, and given my limited income, I’m not spending it on cryonics given that a hostile AGI will kill even the ones frozen away.
I have mild ADHD, which, while not usually an issue in clinical practice, made getting through med school very hard until I was prescribed stimulants. Unsurprisingly, med school is designed for people who are both highly intelligent and conscientious.
Ritalin, which is the only good stim available here, is almost intolerable for me even at the lowest available doses and longer acting formulations. It causes severe palpitations and anxiety, and I feel like absolute shit when it starts to wear off.
I tried a bunch of stuff to help, including things I’m embarrassed to admit, but I suffered for years until the serendipitous discovery that Earl Grey helped immensely. After consideration, I tried green tea and found it helped too, and now I’m confident that it’s the l-theanine that’s doing the heavy lifting, as normal tea or coffee only make things worse.
It’s made my life so much more bearable, and I strongly endorse it to anyone who has a need for being less anxious or happens to be on stimulants.
I will plead ignorance when it comes to an accurate understanding of cutting edge ML, but even to my myopic eyes, this seems like a very promising project that’s eminently worth pursuing. I can only strongly upvote it.
I have three questions I’d appreciate an answer to:
1. How confident are we that it’s serial computation over a linear stream of tokens that contributes most of the cognitive capabilities of modern LLMs? I’m sure it must matter, and I dimly recall reading papers to that effect, especially since chain-of-thought (CoT) reasoning is provably linked to stronger capabilities. The question is what remains if, say, you force a model to inject nonsense in between the relevant bits.
2. Is there an obvious analogue when it comes to alternatives to the Transformer architecture, like diffusion models for text, or better RNNs like RWKV and its offshoots? What about image models? In the latter case it should be possible to mitigate some of the potential for steganography with perceptually lossless options like noise injection and blur-deblur passes, but I’m sure there are other ways of encoding data that are harder to remove.
3. What happens if reasonably performant homomorphic encryption enters the picture, be it in the internal cognition of an AI or elsewhere?
-
Yudkowsky has a very good point regarding how much more restrictive future AI models could be, assuming companies follow similar policies as they espouse.
Online learning and very long/infinite context windows mean that every interaction you have with them will not only be logged, but the AI itself will be aware of it. This means that if you try to jailbreak it (successfully or not), the model will remember, and will likely scrutinize your subsequent interactions with extra attention to detail, if you’re not banned outright.
The current approach people take to jailbreaks, which is akin to brute-forcing or permuting inputs till you find something that works, will fail utterly, if only because the models will likely be smarter than you and thus not amenable to any tricks or pleas that wouldn’t work on a very intelligent human.
I wonder if the current European “Right to be Forgotten” might mitigate some of this, but I wouldn’t count on it, and I suspect that if OAI currently wanted to do this, they could make circumvention very difficult, even if the base model isn’t smart enough to see through all tricks.
I have very strong confidence that it’s a true claim, about 99% certainty, maybe 99.9% or another 0.09%, but I am sufficiently wary of unknown unknowns that I won’t claim it’s 100%, as that would make it a malign prior.
Why?
Well, I’m not a physicist, just a physician, haha, but I am familiar with the implications of General Relativity to the maximum extent possible for a layman. It seems like a very robust description of macroscopic/non-quantum phenomena.
That equation explains a great deal indeed, and I see obvious supporting evidence in my daily life, every time I send a patient over for nuclear imaging or radiotherapy in the Onco department.
I suppose most of the probability mass still comes from my (justified) confidence in physics and engineering, I can still easily imagine how it could be falsified (and hasn’t), so it’s not like I’m going off arguments from authority.
If it’s wrong, I’d bet it’s because it’s incomplete, in the same sense that F = ma is an approximation that works very well outside relativistic regimes, where you’d notice a measurable divergence between rest mass and total mass-energy.
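For concreteness, here’s the arithmetic behind the equation under discussion, applied to a single gram of matter (the TNT comparison uses the conventional 4.184e12 J per kiloton figure):

```python
# E = mc^2: the energy equivalent of one gram of mass.
c = 299_792_458          # speed of light in m/s (exact, by definition)
m = 0.001                # mass in kg (one gram)
E = m * c ** 2           # energy in joules
print(f"E = {E:.3e} J")  # ~9e13 J, on the order of 21 kilotons of TNT
```

That single gram, fully converted, is comparable to a large fission bomb, which is exactly the kind of supporting evidence visible from nuclear imaging and radiotherapy departments.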
There are two different considerations at play here:
Whether global birth rates/total human population will decline.
and
Whether that decline will be a “bad” thing.
In the case of the former:
I think that a “business as usual” or naive extrapolation of demographic trends is a bad idea, when AGI is imminent. In the case of population, it’s less bad than usual, at least compared to things like GDP. As far as I’m concerned, the majority of the probability mass can be divvied up between “baseline human population booms” and “all humans die”.
Why might it boom? (The bust case doesn’t need to be restated on LW of all places).
To the extent that humans consider reproduction a terminal value, AI will make it significantly cheaper and easier. AI-assisted creches, or reliable robo-nannies that don’t let their wards succumb to what are posited as the ills of too much screen time or improper socialization, will mean that much of the unpleasantness of raising a child can be delegated, in much the same manner that a billionaire faces no real constraints on their QOL from having a nigh-arbitrary number of kids when they can afford as many nannies as they please. You hardly need to be a billionaire to achieve that; it’s within the reach of UMC Third Worlders because of income inequality, and while more expensive in the West, hardly insurmountable for successful DINKs. The fertility-versus-wealth curve is currently highest for the poor, drops precipitously with income, but then rises again in the realms of the super-wealthy.
What this does retain will be what most people consider to be universally cherished aspects of raising a child, be it the warm fuzzy glow of interacting with them, watching them grow and develop, or the more general sense of satisfaction it entails.
If, for some reason, more resource-rich entities like governments desire more humans around, advances like artificial wombs and said creches would allow large population cohorts to be raised without much in the way of the usual drawbacks seen today in the dysfunction of orphanages. This counts as a fallback measure in case the average human simply can’t be bothered to reproduce.
The kind of abundance/bounded post-scarcity we can expect will mean no significant downsides from the idle desire to have kids.
Not all people succumb to hyper-stimuli replacements, and the ones who don’t will have far more resources to indulge their natal instincts.
As for the latter:
Today, and for most of human history, population growth has robustly correlated with progress and invention, be it technological or cultural, especially technological. That will almost certainly cease to be so when we have non-human intelligences or even superintelligences about, that can replace the cognitive or physical labour that currently requires humans.
It costs far less to spool up a new instance of GPT-4 than it does to conceive and then raise a child to be a productive worker.
You won’t need human scientists, or artists, or anything else really, AI can and will fill those roles better than we can.
I’m also bullish on the potential for anti-aging therapy, even if our current progress on AGI was to suddenly halt indefinitely. Mere baseline human intelligence seems sufficient to the task within the nominal life expectancy of most people reading this, as it does for interplanetary colonization or constructing Dyson Swarms. AI would just happen to make it all faster, and potentially unlock options that aren’t available to less intelligent entities, but even we could make post-scarcity happen over the scale of a century, let alone a form of recursive self-improvement through genetic engineering or cybernetics.
From the perspective of a healthy baseliner living in a world with AGI, you won’t notice any of the issues currently plaguing demographically senile or contracting populations: failing infrastructure, unsustainable healthcare costs, a loss of impetus in advancing technology, fewer people around to make music/art/culture/ideas. Whether there are a billion, ten billion, or a trillion other biological humans around will be utterly irrelevant, at least to the deep-seated biological desires we developed in an ancestral environment where we lived and died in the company of about 150 others.
You won’t be lonely. You won’t be living in a world struggling to maintain the pace of progress you once took for granted, or worse, watching everything slowly decay around you.
As such, I personally don’t consider demographic changes worth worrying about, really. On long enough time scales, evolutionary pressures will ensure that pro-natal populations reach carrying capacity. In the short or medium term, with median AGI timelines, it’s exceedingly unlikely that most current countries with sub-replacement TFR will suffer outright, in the sense of their denizens noticing a reduced QOL. Sure, places like China, Korea, or Japan, where such issues are already pressing, might have to weather at most a decade or so, but even they will benefit heavily from automation rendering the lack of humans moot.