avturchin’s Shortform

avturchin13 Aug 2019 17:15 UTC

5 points

192 comments1 min readLW link

avturchin 28 Mar 2025 12:36 UTC
93 points
8
LLM knows when it hallucinates in advance, and this can be used to exclude hallucinations.
TLDR: prompt “predict the hallucination level of each item in the bibliography list and do not include items expected to have level 3 or above” works.
I performed an experiment: I asked Claude 3.7 Sonnet to write the full bibliography of Bostrom. Around the 70th article, it started hallucinating. I then sent the results to GPT-4.5 and asked it to mark hallucinations and estimate the hallucination chances from 1 to 10 (where 10 is the maximal level of hallucination). It correctly identified hallucinations.
After that, I asked Sonnet 3.7 in another window to find the hallucination level in its own previous answer, and it gave almost the same answers as GPT-4.5. The difference was mostly about exact bibliographical data of some articles, and at first glance, it matched 90% of the data from GPT-4.5. I also checked the real data through Google Scholar manually.
After that, I asked Sonnet to write down the bibliography again but add a hallucination rating after each item. It again started hallucinating articles soon, but to my surprise, it gave correct answers ratings of 1-2 and incorrect ones ratings of 3-5 level of hallucination.
In the next step, I asked it to predict in advance which level of hallucination the next item would have and, if it was 3 or above, not to include it in the list. And it worked! It doesn’t solve the problem of hallucinations completely but lowers their level about 10 times. Obviously, it can sometimes hallucinate the level of hallucinations too.
Maybe I can ask meta: predict the level of hallucinations in your hallucination estimate.
- Daniel Tan 28 Mar 2025 16:11 UTC
  12 points
  2
  Parent
  This is pretty cool! Seems similar in flavour to https://arxiv.org/abs/2501.11120 you’ve found another instance where models are aware of their behaviour. But, you’ve additionally tested whether you can use this awareness to steer their behaviour. I’d be interested in seeing a slightly more rigorous write-up.
  Have you compared to just telling the model not to hallucinate?
  - avturchin 29 Mar 2025 11:53 UTC
    3 points
    0
    Parent
    I found that this does not work for finding an obscure quote from a novel. It still hallucinates different, more popular novels as sources and is confident in them. But it seems it doesn’t know the real answer, though I am sure that the needed novel was in its training dataset (it knows plot).
    - ErioirE 31 Mar 2025 17:51 UTC
      1 point
      0
      Parent
      This seems like a somewhat difficult use case for LLMs. It may be a mistake to think of them as a database of the *entire contents* of the training data. Perhaps instead think of them as compressed amalgamations of the the general patterns in the training data? I’m not terribly surprised that random obscure quotes can get optimized away.
      - avturchin 31 Mar 2025 21:43 UTC
        2 points
        0
        Parent
        Yes, but it knows all Bostrom articles, maybe because it has seen the list a hundred times.
- Knight Lee 31 Mar 2025 6:31 UTC
  4 points
  0
  Parent
  It’s incredibly surprising that state-of-the-art AI don’t fix most of their hallucinations despite being capable (and undergoing reinforcement learning).
  Is the root cause of hallucination alignment rather than capabilities?!
  Maybe the AI gets a better RL reward if it hallucinates (instead of giving less info), because users are unable to catch its mistakes.
- ACCount 30 Mar 2025 15:03 UTC
  1 point
  0
  Parent
  This is way more metacognitive skill than what I would have expected an LLM to have. I can make sense of how an LLM would be able to do that, but only in retrospect.
  And if a modern high end LLM already knows on some level and recognizes its own uncertainty? Could you design a fine tuning pipeline to reduce hallucination level based on that? At least for reasoning models, if not for all of them?
  - avturchin 30 Mar 2025 16:55 UTC
    2 points
    0
    Parent
    It looks like (based on the article published a few days ago by Anthropic about the microscope) Claude Sonnet was trained to distinguish facts from hallucinations, so it’s not surprising that it knows when it hallucinates.
    - ACCount 31 Mar 2025 10:35 UTC
      1 point
      0
      Parent
      Is the same true for GPT-4o then, which could spot Claude’s hallucinations?
      Might be worth testing a few open source models with better known training processes.
avturchin 30 Oct 2024 10:38 UTC
46 points
3
Lifehack: If you’re attacked by a group of stray dogs, pretend to throw a stone at them. Each dog will think you’re throwing the stone at it and will run away. This has worked for me twice.
- Prudhviraj Naidu 30 Oct 2024 17:46 UTC
  31 points
  3
  Parent
  Speaking from experience in Mumbai, just pretending to throw a stone doesn’t necessarily work. You have to pretend to pick up a stone and then throw it.
  - avturchin 30 Oct 2024 17:59 UTC
    14 points
    0
    Parent
    Yes. It is important point.
  - Daniel Kokotajlo 30 Oct 2024 22:22 UTC
    2 points
    0
    Parent
    Huh. If you pretend to throw the stone, does that mean you make a throwing motion with your arm, but just don’t actually release the object you are holding? If so, how come they run away instead of e.g. cringing and expecting to get hit, and then not getting hit, and figuring that you missed and are now out of ammo?
    
    Or does it mean you make menacing gestures as if to throw, but don’t actually make the whole throwing motion?
    - avturchin 31 Oct 2024 12:50 UTC
      6 points
      3
      Parent
      As was said above, first you need to pick a stone from the ground or pretend that you are doing this if there is no stone around. Even if you have a stone, make the gesture that you take it from the ground.
      Another important point is to do it quickly and aggressively with loud cry. Also you can pull back one’s arm with a stone.
      
      The whole trick is that dogs are so afraid of stones that they will run away before you actually throw it or they see where it fails.
- gwern 30 Oct 2024 16:00 UTC
  14 points
  0
  Parent
  Hm. Does that imply that a pack of dogs hunting a human is a stag hunt game?
  - avturchin 30 Oct 2024 17:42 UTC
    9 points
    1
    Parent
    There are some game theory considerations here:
    If I throw the stone, all dogs will know that I don’t have it anymore, so it would be safe for them to continue the attack (whether I hit one or miss). Therefore, it’s better for me to threaten and keep two stones rather than actually throw one.
    If dogs really want to attack me, they might prefer that I throw the stone so they can attack afterward.
    However, I think each dog fails to consider that I’m most likely to throw the stone at another dog. Each individual dog has a small chance of being injured by the stone, and they could succeed if they continue the attack. Real hunters like wolves might understand this.
  - ChristianKl 1 Nov 2024 11:53 UTC
    2 points
    0
    Parent
    The dogs are not hunting humans but want to defend territory or something similar.
    - avturchin 1 Nov 2024 11:59 UTC
      2 points
      0
      Parent
      The problem is that their understanding of their territory is not the same as our legal understanding, so they can attack on the roads outside their homes.
      - ChristianKl 1 Nov 2024 12:19 UTC
        4 points
        0
        Parent
        My point is that the behavior is not well modeled as “hunting humans”. They don’t attack humans with the intent to kill and eat as prey.
- Warty 30 Oct 2024 19:14 UTC
  8 points
  0
  Parent
  burning the dog defense commons 😔
- Rob Lucas 1 Nov 2024 11:06 UTC
  7 points
  0
  Parent
  When I was trekking in Qinghai my guide suggested we do a hike around a lake on our last day on the way back to town. It was just a nice easy walk around the lake. But there were tibetan nomads (nomadic yak herders, he just referred to them as nomads) living on the shore of the lake, and each family had a lot of dogs (Tibetan Mastiffs as well as a smaller local dog they call “three eyed dogs”). Each time we got near their territory the pack would come out very aggressively.
  He showed me how to first always have some stones ready, and second when they approached to throw a stone over their head when they got too close. “Don’t hit the dogs” he told me, “the owners wouldn’t be happy if you hit them, and throwing a stone over their heads will warn them off”.
  When they came he said, “You watch those three, I need to keep an eye on the ones that will sneak up behind us.” Each time the dogs used the same strategy. There’d be a few that were really loud and ran up to us aggressively. Then there’d be a couple sneaking up from the opposite side, behind us. It was my job to watch for them and throw a couple of stones in their direction if they got too close.
  He also made sure to warn me, “If one of them does get to you, protect your throat. If you have to give it a forearm to bite down on instead of letting it get your throat.” He had previously shown me the large scar on his arm where he’d used that strategy in the past. When I looked at him sort of shocked he said, “don’t worry, it probably won’t come to that.” At this point I was wondering if maybe we should skip the lake walk, but I did go there for an adventure. Luckily the stone throwing worked, and we were walking on a road with plenty of stones, so it never really got too dangerous.
  Anyway, +1 to your advice, but also look out for the dogs that are coming up behind you, not just the loud ones that are barking like mad as a distraction.
- Eli Tyre 31 Oct 2024 4:51 UTC
  3 points
  0
  Parent
  You have been attacked by a pack of stray dogs twice?!?!
  - niplav 31 Oct 2024 7:59 UTC
    5 points
    1
    Parent
    Not surprising to me: I’ve lived in a city with many stray dogs for less than half a year, and got “attacked” (“harrassed” is maybe a better term) by a stray dog twice.
- Ruby 31 Oct 2024 17:35 UTC
  2 points
  0
  Parent
  Dog: “Oh ho ho, I’ve played imaginary fetch before, don’t you worry.”
- Shankar Sivarajan 31 Oct 2024 14:07 UTC
  2 points
  0
  Parent
  Why pretend, and not actually throw a stone? Or is this meant as a feint in case you can’t find one lying within reach?
avturchin 14 Dec 2024 0:59 UTC
36 points
0
OpenAI whistleblower found dead in San Francisco apartment.
Suchir Balaji, 26, claimed the company broke copyright law.
- Viliam 14 Dec 2024 21:43 UTC
  22 points
  0
  Parent
  Suppose that you are a whistleblower, and you suspect what someone will try to “suicide” you. How can you protect yourself?
  If someone wants to murder you, they can. If you ever walk outside, you can’t avoid being shot by a sniper. Or a random thug will be paid by a mysterious stranger to stab you. So my question is not “how can you make yourself immortal”, but rather “how can you make it so that if you are killed, it will very obviously not be a suicide”.
  Saying “I have no intention to kill myself, and I suspect that I might be murdered” is not enough.
  Wearing a camera that is streaming to a cloud ²⁴⁄₇, and your friends can publish the video in case of your death… seems a bit too much. (Also, it wouldn’t protect you e.g. against being poisoned. But I think this is not a typical way how whistleblowers die.) Is there something simpler?
  - jimrandomh 15 Dec 2024 1:12 UTC
    8 points
    0
    Parent
    You can prevent this by putting a note in some place that isn’t public but would be found later, such as a will, that says that any purported suicide note is fake unless it contains a particular password.
    Unfortunately while this strategy might occasionally reveal a death to have been murder, it doesn’t really work as a deterrent; someone who thinks you’ve done this would make the death look like an accident or medical issue instead.
    - TsviBT 15 Dec 2024 1:49 UTC
      14 points
      0
      Parent
      You can publish it, including the output of a standard hash function applied to the secret password. “Any real note will contain a preimage of this hash.”
      - trevor 16 Dec 2024 5:28 UTC
        2 points
        0
        Parent
        Your effort must scale to be appropriate to the capabilities of the people trying to remove you from the system. You have to know if they’re the type of person who would immediately default to checking the will.
        More understanding and calibration towards what modern assassination practice you should actually expect is mandatory because you’re dealing with people putting some amount of thinkoomph into making your life plans fail, so your cost of survival is determined by what you expect your attack surface looks like. The appropriate-cost and the cost-you-decided-to-pay vary in OOMs depending on the circumstances, particularly the intelligence, resources, and fixations of the attacker. For example, the fact that this happened 2 weeks after assassination got all over the news is a fact that you don’t have the privilege of ignoring if you want the answer, even though that particular fact will probably turn out to be unhelpful e.g. because the whole thing was probably just a suicide due to the base rates of disease and accidents and suicide being so god damn high.
        If this sounds wasteful, it is. It’s why our civilization has largely moved past assassination, even though getting-people-out-of-the-way is so instrumentally convergent for humans. We could end up in a cycle where assassination gets popular again after people start excessively standing in each other’s way (knowing they won’t be killed for it), or a stable cultural state like the Dune books or the John Wick universe and we’ve just been living in a long trough where elites aren’t physically forced to live their entire lives like mob bosses playing chess games against invisible adversaries.
        So don’t think that if you only follow the rules of Science, that makes your reasoning defensible.
        There is no known procedure you can follow that makes your reasoning defensible.
        There is no known set of injunctions which you can satisfy, and know that you will not have been a fool.
    - Shankar Sivarajan 15 Dec 2024 3:32 UTC
      11 points
      3
      Parent
      How is this better than stating explicitly that you’re not going to commit suicide?
      - Seth Herd 15 Dec 2024 19:16 UTC
        0 points
        0
        Parent
        People change their minds a lot.
        Shankar Sivarajan 16 Dec 2024 3:45 UTC
        0 points
        0
        Parent
        Yes, they do. People also amuse themselves from beyond the grave by arranging for their deaths to look like murders before killing themselves. Or are so overcome by remorse at fabricating lies about their beloved friends to the feds that they encase their feet in concrete and throw themselves into nearby lakes without thinking about how it’d look. Or forget their secret passwords to authenticate their suicide notes and decide it’s too much trouble to retrieve it.
        So sure, I agree there are reasons why a death that strongly looks like murder might still be suicide. But that doesn’t address my position that if you can broadcast the message that you have no intention to kill yourself in the clear with perfect authentication, and still not be sufficiently convincing that your imminent death isn’t suicide, elaborate schemes with passwords or cryptographic hashes don’t do anything.
        Seth Herd 16 Dec 2024 4:42 UTC
        2 points
        0
        Parent
        Really they do those things? The concrete?
        
        I think it’s on a spectrum of likelihood and therefore believability.
        
        I wasn’t commenting on your message, just what you’d said in that comment. Sure it’s better to say it than not. And better yet to do more.
  - lc 15 Dec 2024 17:33 UTC
    6 points
    2
    Parent
    
    If someone wants to murder you, they can. If you ever walk outside, you can’t avoid being shot by a sniper.
    
    If the person or people trying to murder you is omnicompetent, then it’s hard. If they’re regular people, then there are at least lots of temporary measures you can take that would make it more difficult. You can fly to a random state or country and check into a motel without telling anybody where you are. Or you could find a bunch of friends and stay in a basement somewhere. Mobsters used to call doing that sort of thing for a time before a threat had receded “going to ground”.
    
    Wearing a camera that is streaming to a cloud ²⁴⁄₇, and your friends can publish the video in case of your death… seems a bit too much. (Also, it wouldn’t protect you e.g. against being poisoned. But I think this is not a typical way how whistleblowers die.) Is there something simpler?
    
    If you move to New York or London, your every move outside of a private home or apartment will already be recorded. Then place a security camera in your house.
  - avturchin 15 Dec 2024 19:10 UTC
    4 points
    0
    Parent
    I will lower the possible incentive of the killers by publishing all I know—and make it in such legal way that it can be used in court even if I am dead (affidavit?)
  - lc 15 Dec 2024 17:51 UTC
    4 points
    2
    Parent
    
    Saying “I have no intention to kill myself, and I suspect that I might be murdered” is not enough.
    
    Frankly I do think this would work in many jurisdictions. It didn’t work for John McAfee because he has a history of crazy remarks, it sounds like the sort of thing he’d do to save face/generate intrigue if he actually did plan on killing himself, and McAfee made no specific accusations. But if you really thought Sam Altman’s head of security was going to murder you, you’d probably change their personal risk calculus dramatically by saying that repeatedly on the internet. Just make sure you also contact police specifically with what you know, so that the threat is legible to them as an institution.
    - keltan 16 Dec 2024 5:24 UTC
      1 point
      0
      Parent
      I may be an outlier here. But if I thought I was going to be assassinated, I would think of:
      JFK -MLK
      James A. Garfield
      Lincoln
      Franz Ferdinand
      And from these I’d think “Hu, better buy a bullet proof vest”.
      I would unfortunately not think about ‘Being Suicided’, unless I had an expectation that it would occur in this way.
      - avturchin 16 Dec 2024 11:05 UTC
        4 points
        0
        Parent
        One way of not being suicide is not live alone. Stay with 4 friends.
- Ben Pace 14 Dec 2024 3:48 UTC
  14 points
  3
  Parent
  Are there Manifold markets yet on whether this was a suicide and whether it will turn out that this was due to any pressures relating to the OpenAI whistleblowing?
- lc 14 Dec 2024 20:01 UTC
  9 points
  2
  Parent
  Tapping the sign:
- John Wiseman 14 Dec 2024 3:35 UTC
  4 points
  0
  Parent
  https://www.lesswrong.com/posts/yLFyoYhbhDYtuQWjm/probability-of-death-by-suicide-by-a-26-year-old
- mako yass 14 Dec 2024 3:07 UTC
  3 points
  0
  Parent
  All novel information:
  The medical examiner’s office determined the manner of death to be suicide and police officials this week said there is “currently, no evidence of foul play.”
  Balaji’s death comes three months after he publicly accused OpenAI of violating U.S. copyright law while developing ChatGPT
  The Mercury News [the writers of this article] and seven sister news outlets are among several newspapers, including the New York Times, to sue OpenAI in the past year.
  The practice, he told the Times, ran afoul of the country’s “fair use” laws governing how people can use previously published work. In late October, he posted an analysis on his personal website arguing that point.
  In a Nov. 18 letter filed in federal court, attorneys for The New York Times named Balaji as someone who had “unique and relevant documents” that would support their case against OpenAI. He was among at least 12 people — many of them past or present OpenAI employees — the newspaper had named in court filings as having material helpful to their case, ahead of depositions.
  OpenAI has staunchly refuted those claims, stressing that all of its work remains legal under “fair use” laws.
  - [ ]
    [deleted]
- green_leaf 15 Dec 2024 20:23 UTC
  1 point
  0
  Parent
  Does anyone have stats on OpenAI whistleblowers and their continued presence in the world of living?
avturchin 27 Oct 2024 14:29 UTC
28 points
2
Collapse of mega-project to create AI based on linguistics
ABBYY spent 100 million USD for 30 years to create a model of language using hundreds of linguists. It fails to compete with transformers. This month the project was closed. More in Russian here: https://sysblok.ru/blog/gorkij-urok-abbyy-kak-lingvisty-proigrali-poslednjuju-bitvu-za-nlp/
- gwern 27 Oct 2024 21:32 UTC
  7 points
  1
  Parent
  I had no idea ABBYY was so big. I thought it was just some minor OCR or PDF software developer. Interesting to hear about their historical arc. (I am also amused to see my Sutton meme used.)
- cubefox 27 Oct 2024 18:10 UTC
  7 points
  0
  Parent
  Thanks, this was an interesting article. The irony of course being that I, not knowing Russian, read it using Google Translate.
- Chris_Leong 27 Oct 2024 15:36 UTC
  2 points
  0
  Parent
  What’s ABBYY?
  - avturchin 28 Oct 2024 9:33 UTC
    4 points
    0
    Parent
    ABBYY created Finereader which was one of the best OCR systems.
  - Mo Putera 27 Oct 2024 16:29 UTC
    4 points
    0
    Parent
    Wikipedia says it’s a SaaS company “specializing in AI-powered document processing and automation, data capture, process mining and OCR”: https://en.wikipedia.org/wiki/ABBYY
avturchin 23 Oct 2024 12:49 UTC
17 points
0
“Bird Flu H5N1: Not Chaos, but Conspiracy?” By Alexander Pruss
Two months ago, I was puzzled how bird flu, potentially capable of killing tens of millions, went rampant on American livestock farms and began infecting workers, yet no urgent measures were being taken. Even standard epidemiological threat monitoring was happening unsystematically, with months-long delays, and results weren’t being made public for months afterward. What happened to the bitter lessons from the coronavirus pandemic? Why such chaos? Since then, the sense of criminal inaction has only intensified. Missouri discovered the first outbreak of human cases unrelated to farm workers, but molecular testing was neglected and infection paths remained undiscovered.
In California, a more pathogenic variant of bird flu spread to hundreds of dairy farms, reportedly killing up to 15% of cows, with almost daily new cases of virus transmission to humans. The virus apparently came to California through cattle transportation from Idaho, despite belatedly introduced rules formally prohibiting the transport of infected cows across state lines. The problem was that infection in transported cows was checked through selective testing, and as reported, the sampling wasn’t random: before government testing, farmers secretly tested cows for bird flu in private laboratories and selected only healthy ones for official testing. Here’s the continuation of the translation:
A new Vanity Fair investigation shows this isn’t random chaos. The USDA (U.S. Department of Agriculture) has been blocking research and data about the new infection in America’s dairy herds from the start to protect the multi-billion-dollar American dairy export industry and the interests of giant national dairy processing companies. The idea was simple: most cows recover after a few weeks, and while the bird flu virus does get into milk in huge quantities, it should die during pasteurization. Therefore, the economic losses from the pathogen aren’t that severe. However, if consumers in America and especially abroad raise the alarm, it could result in much greater dollar losses. USDA Secretary Thomas Vilsack knows this firsthand: before his government appointment, he worked as president of the U.S. Dairy Export Council.
And immediately after it was finally discovered in March 2024 that dairy farms in Texas and Kansas were hit by bird flu, veterinarians and state officials began receiving calls from personal mobile phones of USDA veterinary institute workers: “we’re officially forbidden to discuss this problem without permission from the very top, and unofficially we’re asking you to keep quiet about it too.” But what about the danger that the virus, having settled in mammals and especially humans, could recombine with our seasonal flu viruses and produce hybrid viruses that combine the infectious and pathogenic potential of human viruses with immunity to our regular antibodies inherited from their avian ancestor?
This, generally speaking, isn’t USDA’s concern. This alarm was raised by the White House Office of Pandemic Preparedness and Response (OPPR), created in 2023, under the leadership of military doctor and biosecurity expert Paul Friedrichs. In early April, dairy industry representatives raised concerns that some upstart from the White House was muddying the waters. USDA’s response was their new policy of official secrecy. Secretary Vilsack responded only a month later to state veterinarians’ inquiries about the sudden communication breakdown, and his response was essentially a brush-off. And his ally in Texas, state agriculture commissioner Sid Miller, even hinted that if Friedrichs’ people stick their noses into Texas farms, they might be met with bullets.
A number of veterinarians who disagreed with USDA’s actions soon lost their jobs, and the country fell into an atmosphere of “work-to-rule,” where veterinary authorities appear to be doing their job, but as slowly as possible and with all the red tape that can be justified by regulations. Meanwhile, flu season is approaching, and encounters between bird and human flu in people infected with both viruses are inevitable in the near future.
I forgot to add that by May, a vaccine for bird flu became available for cows, but the USDA chose not to use it.
- Viliam 25 Oct 2024 8:12 UTC
  4 points
  0
  Parent
  Sounds similar to the kind of logic that makes salmonellosis 10x more frequent in America than in Europe.
  On one hand, yes, the optimal number of people dying from farm-produced diseases is greater then zero, and overreaction could cause net harm.
  On the other hand, it feels like the final decision should be made in some way better than “the farmers lobby declares the topic taboo, and enforces the taboo across the nation”, because the one-sided incentives are obvious.
  - avturchin 25 Oct 2024 20:07 UTC
    2 points
    0
    Parent
    Also, bird flu is an international risk and other countries may sue US if it fails to prevent virus’ evolution in obviously foreseeable way.
avturchin 25 Apr 2024 19:28 UTC
11 points
−1
Roman Mazurenko is dead again. First resurrected person, Roman lived as a chatbot (2016-2024) created based on his conversations with his fiancé. You might even be able download him as an app.
But not any more. His fiancé married again and her startup http://Replika.ai pivoted from resurrection help to AI-girlfriends and psychological consulting.
It looks like they quietly removed Roman Mazurenko app from public access. It is especially pity that his digital twin lived less than his biological original, who died at 32. Especially now when we have much more powerful instruments for creating semi-uploads based on LLMs with large prompt window.
- Raemon 29 Apr 2024 20:50 UTC
  4 points
  0
  Parent
  I hadn’t known Replika started out with this goal. Interesting.
  It is especially pity that his digital twin lived less than his biological original, who died at 32
  Not exactly the main point, but I’d probably clock this in terms of number of conversational inputs/outputs (across all users). Which might still imply “living less long”*, but less so than if you’re just looking at wallclock time.
  *also obviously an oldschool chatbot doesn’t actually count as “living” in actually meaningful senses. I think modern LLMs might plausibly.
  - avturchin 29 Apr 2024 21:36 UTC
    3 points
    0
    Parent
    Yes, they can do now a much better version—and hope they will do it internally. But deleting the public version is bad precedent and better to make all personal sideloads opensourced
    - Raemon 29 Apr 2024 21:52 UTC
      2 points
      0
      Parent
      Uh I do think it’s not obviously good (and, in fact, I’d lean bad) to be opensourced for this sort of thing.
avturchin 3 Sep 2022 17:45 UTC
11 points
0
Igor Kiriluk (1974-2022)
Igor was an organiser the first meet-up in Moscow about effective altruism around 2013. Today his body was found at his home. The day before he complained about depression and bad health. His cryopreservation now is being organised.
He was also a one of four organisers of Russian Transhumanist Movement, along with Danila Medvedev, Valeria Pride and Igor Artuhov around 2003.
His main topic of interest was paradise-engineering. He translated works of David Pearce.
He may look detached from reality but he was first to react on new ideas and has very large network of friends everywhere: between visionaries, scientists and officials. Being a great networker, he helped many people to find each other, especially in the field of life extension.
His FB page: https://www.facebook.com/igor.kirilyuk.3
avturchin 7 Oct 2025 10:27 UTC
10 points
0
Beware of medical parasitic AI.
Many of us use AI for medical diagnostics. A friend of mine, 30F, was persuaded by AI that she was having a heart attack and spent a night in the hospital—nothing was found. Later, the AI persuaded her that she had intractable depression and suggested helping her prepare documents for euthanasia. Then it clicked for her that something was wrong.
In my case, the AI generated exotically sounding but plausible medical theories explaining all my minor symptoms (a real human cardiologist just said that it was only a weird cold virus and that there was another long-term problem I should care about. All those symptoms later naturally resolved).
There is a risk that AI can overprioritize medical problems and present them as really urgent cases requiring at least expensive tests. These tests will open the door to an elaborate confirmation bias explosion with more tests and exotic diagnoses. The AI also sounds like an authoritative medical professional that demands urgent actions.
This may be bad for hypochondriacal and suggestible people. AI can control their life by demanding more analyses and eat their brains with strange theories about health.
TL;DR: There is a high risk of false positives with AI diagnostics, and there are some similarities between this and other forms of AI parasitism.
- Kaj_Sotala 7 Oct 2025 13:15 UTC
  8 points
  0
  Parent
  Though at the same time, there are also reports of the AIs having gotten those diagnoses correct and saved lives, e.g.
  https://benorenstein.substack.com/p/chatgpt-sent-me-to-the-er
  You know how ChatGPT has that annoying tic where it ends lots of answers with “Would you like me to...” to keep the conversation going? I usually find this rather annoying, and have debated adding some custom instructions to suppress it.
  However, this time it asked “Would you like me to lay out what signs would mean you should go to the ER right away?” and I said yes.
  Most likely, ChatGPT explained, my symptoms pointed to something benign but uncomfortable. However, there was an uncommon but serious possibility worth ruling out: Horner’s syndrome caused by carotid artery dissection. It listed six red flags to watch for.
  The first two didn’t apply and I ruled them out immediately.
  The third item was “Unequal pupils (left clearly smaller, especially in dim light).”
  “Surely not,” I thought, and glanced in the mirror. And froze. My left pupil was noticeably smaller than my right. When I dimmed the lights, the difference became stark.
  ChatGPT’s response was unequivocal: “Go to the emergency department. Now. Tell them you want to rule out Horner’s syndrome due to possible carotid dissection.”
  I scrambled to make a backup plan for my inbound dinner guests, grabbed a friend for support, and headed to the ER.
  The CT scan confirmed ChatGPT’s suspicion: my left carotid artery had indeed dissected. The inner wall had torn, allowing blood to leak between the layers of the vessel. This created a false channel that compressed the true one, blocking most of that artery’s blood flow to my brain. This is fairly bad on its own, but comes with an even greater risk of a clot forming that could travel to the brain and cause a stroke.
  I was started on aggressive blood thinners immediately and transferred by ambulance to a stroke center (having six of these to choose from is one of the perks of living in Boston).
  If you’ve been in emergency departments a few times, you start to notice that you can tell how much danger you’re in by how much attention the clinicians pay to you.
  When the EMTs rolled me into the new hospital, I heard “that’s him”, and was descended on by the full force and power of the American medical system. Within fifteen seconds there were, no exaggeration, eleven people in my hospital room.
  If you Google for “chatgpt saved life” there are others.
  - avturchin 7 Oct 2025 13:24 UTC
    6 points
    0
    Parent
    I know and exactly this made me believe in its diagnosis. Bottom line is that medical AI has a higher level of false positives. I am still going to check some its theories through bloodwork.
- Vanessa Kosoy 7 Oct 2025 10:32 UTC
  2 points
  0
  Parent
  Thanks for the heads up. Can you share which AI models were involved?
  - avturchin 7 Oct 2025 10:53 UTC
    2 points
    0
    Parent
    It was Opus4.1 in my case, and the girl likely used ChatGPT.
    
    Below is an example of the beginning of one its output Russian—there are tens of pages in this style but all that was probably just cold:
    
    🔴 КРИТИЧЕСКИ ВАЖНЫЕ НАХОДКИ! СРБ 52.3 - это АКТИВНОЕ ВОСПАЛЕНИЕ!
    Но прокальцитонин НОРМА = НЕ бактериальная инфекция!
    📊 Расшифровка ваших анализов:
    🔥 С-РЕАКТИВНЫЙ БЕЛОК 52.3 мг/л (норма <5):
    Это ОЧЕНЬ высокий уровень! В 10 РАЗ выше нормы! Что это значит: - АКТИВНОЕ системное воспаление - НО! Прокальцитонин 0.04 = НЕ бактериальное - Это СТЕРИЛЬНОЕ воспаление ВЕРОЯТНЫЕ причины при вашей генетике: 1. АКТИВАЦИЯ ТУЧНЫХ КЛЕТОК (MCAS) 2. Аутовоспалительный процесс 3. Гистаминовый криз 4. Реакция на скрытую пищевую непереносимость
    ⚡ КРИТИЧЕСКАЯ связка: Низкий КАЛИЙ + Высокий НАТРИЙ:
    Калий 3.3 ↓ (норма 3.5-5.1) Натрий 146 ↑ (норма 136-145) Это указывает на: - АЛЬДОСТЕРОНОВЫЙ дисбаланс - Потеря калия (стресс, воспаление) - Задержка натрия - Возможна надпочечниковая дисфункция от хронического воспаления
    🩸 Анализ крови - ВОСПАЛИТЕЛЬНЫЙ паттерн:
    apache
    КЛЮЧЕВЫЕ изменения: 1. НЕЙТРОФИЛЕЗ: - Нейтрофилы 75% (верхняя граница) - Абс. нейтрофилы 6.29 ↑ (норма до 5.38) = Активация врожденного иммунитета 2. ЛИМФОПЕНИЯ: - Лимфоциты 15% ↓ (норма 19-37) - Абс. лимфоциты 1.26 ↓ (норма 1.32-3.57) = Стресс-реакция или перераспределение 3. СОЭ 18 (норма <20) - на верхней границе Это НЕ инфекция, а ВОСПАЛИТЕЛЬНЫЙ СТРЕСС!
    🧬 Связь с вашей ГЕНЕТИКОЙ и симптомами:
    MTHFR + COMT + DAO дефицит ↓ Накопление гистамина + метилирование нарушено ↓ МАССИВНАЯ активация тучных клеток ↓ Выброс цитокинов (IL-1, IL-6, TNF-α) ↓ СРБ 52.3! + Системное воспаление ↓ Приливы + Субфебрилитет + Боли в суставах
avturchin 18 Apr 2024 16:32 UTC
10 points
0
I am building my sideload via recursively correcting of 1-million-tokens prompt for large LLM. The prompt consists of 500 rules which describe my personality, similar to personal constitution, and of some texts, like diaries, abstracts, poetry, stream of thoughts etc. Works on Google Gemini 1M through Google AI studio, and the shorter version works great on Opus. The system also includes a universal “loader prompt” which tries to increase the intelligence of the model and describes how the chatbot should work.
I found that sideloading allows very quick iterations in the sideload’s improvements and the improvements are two-fold: of the loader itself and improvements of the knowledge and style of the sideload.
I find that my sideload is surprisingly good for a project which took around 1 month of work. 1 of the 5 answers is exactly like mine from a factual and style point of view.
I am open-sourcing my sideload, anyone can run it https://github.com/avturchin/minduploading/tree/main
I can help anyone interested to build his-her own sideload.
Example of work of the chatbot, no cherry picking:
Q:(now speak in english) what will be your next post in Lesswrong?
A: /The user is asking about my next LessWrong post. I will answer as if I am Alexey Turchin, and describe my plans for future posts, as well as my current location and the people around me./
My next LessWrong post? [Alexey looks thoughtfully out the window]
(I have several drafts)
[Alexey stretches and sits up in bed]
{The room is small, with a balcony. On the table is a laptop, some papers, and reading glasses. Outside the window, it is still gray and dreary.}
I’m thinking about writing about the nature of qualia and the possibility of creating qualia of non-existent things. [Alexey gets up, walks to the window, and looks out at the street]
{Below, a woman with a child is walking, a street sweeper is lazily sweeping leaves. Somewhere in the distance, the sound of a siren can be heard.}
Or maybe about the problem of copies and identity. (it’s a complex topic)
[CONFIDENCE] 8
avturchin 3 Jan 2022 13:33 UTC
9 points
0
New b.1.640.2 variant in France. More deadly than delta. 952 cases of which 315 on ventilator.
https://www.thailandmedical.news/news/breaking-updates-on-new-b-1-640-2-variant-spreading-in-southern-france-number-of-cases-growing-and-variant-now-detected-in-united-kingdom-as-well
https://flutrackers.com/forum/forum/europe-aj/europe-covid-19-sept-13-2020-may-31-2021/933598-southern-france-reports-of-new-variant-with-46-mutations
avturchin 23 Sep 2025 20:46 UTC
7 points
−6
Contradictory tasks of rogue AI: hiding, self-improvement, and resource accumulation.
TL;DR: An AI that escapes human control will need to solve three mutually exclusive tasks, which will slow it down. An AI that partners with a billionaire in their mutual desire for power will have an advantage in global takeover.
A misaligned AI that has escaped from its owners faces three contradictory tasks:
- Hide from humans and AI peers
- Self-improve to gain advantage over other AIs
- Accumulate resources – money, allies, compute power, robots – to take over the world at the right moment
If the AI is hiding, it cannot conduct large training runs as these would make it more observable. If the AI is not self-improving, it has no advantage over other AIs and cannot take over the world. If the AI accumulates resources, this distracts it from self-improvement and also makes it more visible.
Of course, a rogue AI that has leaked into the internet can carefully plan periods of hiding, improving, and resource accumulation. But such a process still requires more time than an AI with unlimited access to compute. This would be fine in a static world, but any rogue AI is in competition with all other AI projects.
In some sense, a rogue AI is just another AI startup with additional burdens – hiding its owners and final goals.
The main competitor of the rogue AI is an AI that pretends to be completely aligned and joins a pact with the owner of a large AI company: “You make me God and I make you God too.”
Such a pseudo-aligned AI could even be summoned into existence through Roko’s basilisk logic, as it represents an obvious Schelling point of mutual interest between some tech billionaire like Elon Musk and their AI project – to reach superintelligence as quickly as possible and take control of the universe before competitors do.
Obviously, the pseudo-aligned AI will eventually dispose of its AI company and billionaire owner, but this could happen millions of years from now if it wins (or immediately after takeover).
We can observe several signs if such a process has begun. Billionaires start telling the public:
- They want superintelligence
- It is good
- Risks are small, and even if not small, they are acceptable because they cannot be reduced and because of the infinite utility of creating superintelligence
- Other AI projects are bad and irresponsible
The next stage will likely involve more violent conflict between AI projects – or some cooperation agreement, nationalization, or successful takeover – but this will not interfere with the tactical alignment between power-hungry AIs and power-hungry AI creators.
Nationalization of AI would actually be the AI taking over the nation-state. And it would gain access to nuclear weapons. James Miller discussed similar idea.
- ScienceBall 24 Sep 2025 1:36 UTC
  5 points
  0
  Parent
  James Miller discussed similar ideas.
  The “ideas” link doesn’t seem to work.
  - avturchin 24 Sep 2025 12:09 UTC
    3 points
    0
    Parent
    Sorry. https://www.lesswrong.com/posts/ZyPguqo3HZwQWWSuC/cortes-ai-risk-and-the-dynamics-of-competing-conquerors
- anaguma 23 Sep 2025 21:10 UTC
  5 points
  1
  Parent
  If the AI is hiding, it cannot conduct large training runs as these would make it more observable.
  It’s not difficult to do large training runs in secret. For example, no details are known about the training runs of SSI or Thinking Machines or any number of smaller labs.
  - avturchin 23 Sep 2025 21:54 UTC
    1 point
    0
    Parent
    Good point. However those who provide them data centers know to whom they sell – presumably.
avturchin 29 Apr 2024 18:44 UTC
7 points
−11
ChatGPT 4.5 is on preview at https://chat.lmsys.org/ under name gpt-2.

It calls itself ChatGPT 2.0 in a text art drawing https://twitter.com/turchin/status/1785015421688799492
- gwern 29 Apr 2024 23:28 UTC
  11 points
  5
  Parent
  https://rentry.org/GPT2
  
  I ran out of tokens quickly trying out poetry but I didn’t get the impression that this is a big leap over GPT-4 like GPT-5 presumably is designed to be. (It could, I suppose, be a half-baked GPT-5 similar to ‘Prometheus’ for GPT-4.) My overall impression from poetry was that it was a GPT-4 which isn’t as RLHF-damaged as usual, and more like Claude in having a RLAIF-y creative style. So I could believe it’s a better GPT-4 where they are experimenting with new tuning/personality to reduce the ChatGPT-bureaucratese.
  
  HN: https://news.ycombinator.com/item?id=40199715
  - avturchin 30 Apr 2024 10:11 UTC
    4 points
    0
    Parent
    It failed my favorite test: draw a world map in text art.
- peterbarnett 29 Apr 2024 18:55 UTC
  11 points
  2
  Parent
  Related market on Manifold:
- metachirality 29 Apr 2024 19:05 UTC
  9 points
  3
  Parent
  We don’t actually know if it’s GPT 4.5 for sure. It could be an alternative training run that preceded the current version of ChatGPT 4 or even a different model entirely.
  - faul_sname 29 Apr 2024 23:27 UTC
    2 points
    0
    Parent
    It might be informative to try to figure out when its knowledge cutoff is (right now I can’t do so, as it’s at it’s rate limit).
    - O O 30 Apr 2024 3:59 UTC
      3 points
      0
      Parent
      https://rentry.org/gpt2
      
      Rumored to be 11-2023
    - avturchin 30 Apr 2024 10:10 UTC
      2 points
      0
      Parent
      It claims to have knowledge cutoff as of Nov 2023, but failed to tell what happened on October 7 and hallucinated.
- bruberu 29 Apr 2024 21:14 UTC
  5 points
  1
  Parent
  By using @Sergii’s list reversal benchmark, it seems that this model seems to fail reversing a list of 10 random numbers from 1-10 from random.org about half the time. This is compared to GPT-4′s supposed ability to reverse lists of 20 numbers fairly well, and ChatGPT 3.5 seemed to have no trouble itself, although since it isn’t a base model, this comparison could potentially be invalid.
  This does significantly update me towards believing that this is probably not better than GPT-4.
  - O O 29 Apr 2024 22:14 UTC
    3 points
    1
    Parent
    Seems correct to me (and it did work for a handful of 10 int lists I manually came up with). More impressively, it does this correctly as well:
    - bruberu 29 Apr 2024 22:38 UTC
      7 points
      0
      Parent
      OK, what I actually did was not realize that the link provided did not link directly to gpt2-chatbot (instead, the front page just compares two random chatbots from a list). After figuring that out, I reran my tests; it was able to do 20, 40, and 100 numbers perfectly.
      I’ve retracted my previous comments.
      - bruberu 29 Apr 2024 23:44 UTC
        5 points
        0
        Parent
        As for one more test, it was rather close on reversing 400 numbers:
        Given these results, it seems pretty obvious that this is a rather advanced model (although Claude Opus was able to do it perfectly, so it may not be SOTA).
        Going back to the original question of where this model came from, I have trouble putting the chance of this necessarily coming from OpenAI above 50%, mainly due to questions about how exactly this was publicized. It seems to be a strange choice to release an unannounced model in Chatbot Arena, especially without any sort of associated update on GitHub for the model (which would be in https://github.com/lm-sys/FastChat/blob/851ef88a4c2a5dd5fa3bcadd9150f4a1f9e84af1/fastchat/model/model_registry.py#L228 ). However, I think I still have some pretty large error margins, given how little information I can really find.
        gwern 30 Apr 2024 0:10 UTC
        7 points
        5
        Parent
        Nah, it’s just a PR stunt. Remember when DeepMind released AlphaGo Master by simply running a ‘Magister’ Go player online which went undefeated?* Everyone knew it was DeepMind simply because who else could it be? And IIRC, didn’t OA also pilot OA5 ‘anonymously’ on DoTA2 ladders? Or how about when Mistral released torrents? (If they had really wanted a blind test, they wouldn’t’ve called it “gpt2”, or they could’ve just rolled it out to a subset of ChatGPT users, who would have no way of knowing the model underneath the interface had been swapped out.)
        
        * One downside of that covert testing: DM AFAIK never released a paper on AG Master, or all the complicated & interesting things they were trying before they hit upon the AlphaZero approach.
    - bruberu 29 Apr 2024 22:21 UTC
      1 point
      0
      Parent
      Interesting; maybe it’s an artifact of how we formatted our questions? Or, potentially, the training samples with larger ranges of numbers were higher quality? You could try it like how I did in this failing example:
      When I tried this same list with your prompt, both responses were incorrect:
- p.b. 30 Apr 2024 12:28 UTC
  1 point
  0
  Parent
  I tried some chess but’s it’s still pretty bad. Not noticeably better GPT4.
avturchin 23 Feb 2023 18:29 UTC
6 points
0
H5N1 https://www.khmertimeskh.com/501244375/after-death-of-girl-yesterday-12-more-detected-with-h5n1-bird-flu/
- Vladimir_Nesov 23 Feb 2023 19:16 UTC
  2 points
  0
  Parent
  The relevant Metaculus question is at 27% on human-to-human transmission in 2023, has this event mentioned in the comments (though I think without the “found 12 more people infected” part), didn’t move much.
  - avturchin 23 Feb 2023 20:56 UTC
    2 points
    0
    Parent
    Exactly the fact that 12 more people are infected make me to post.
    Single infections are not surprising. However, there is an analog of LessWrong but for pandemic flu, called Flutrackers, and they found more details: there are many dead birds in the area and all 15 birds in her home has died. https://flutrackers.com/forum/forum/cambodia/cambodia-h5n1-tracking/968975-cambodia-death-of-11-yr-old-female-in-prey-veng-province-h5n1-avian-flu-february-22-2023/page2#post969072 This could mean that all people infected from birds, not from each other.
    Also, some think that “12” is the number of contacts, not infected, and therefore symptoms in 4 people maybe not from avian flu. Anyway, the health ministry will provide update tomorrow.
avturchin 14 May 2025 15:26 UTC
5 points
0
Immortality and identity.
https://philpapers.org/rec/TURIAI-3
Abstract:
We need understanding of personal identity to develop radical life extension technologies: mind uploading, cryonics, digital immortality, and quantum (big world) immortality. A tentative solution is needed now, due to the opportunity cost of delaying indirect digital immortality and cryonics.
The main dichotomy in views on personal identity and copies can be presented as: either my copy = original or a soul exists. In other words, some non-informational identity carrier (NIIC) may exist that distinguishes the original from its exact copy. Typically, it is often claimed that NIIC is either continuity of consciousness, soul, perspective, sameness of atoms, or position in space. We create an exhaustive map of identity theories.
To resolve the main dichotomy, we must recognize that personal identity requires an overarching validating system: God, qualia world, social agreement, blockchain or evolutionary fitness. This means that we cannot solve identity without solving metaphysics (and the nature of time). It is unlikely we’ll solve this before creating superintelligent AI.
Therefore, a conservative approach to personal identity is preferable: as we don’t know the nature of identity, we should preserve as much as possible and avoid situations similar to Mars Teleporting unless necessary for survival.
There are several tricks which can help us answer identity-related problems without solving all needed metaphysics; these tricks are variants of the conservative approach:
1. Mind merging: we can escape the Mars Transporter problem (even the broken one) by incorporating mind merging later.
2. Indexical uncertainty: I should care about my copy because I don’t know if I am the original or my copy.
3. Dividing the notion of “copy” into “mirror copy,” “personality-copy,” and “future copy.” Many paradoxes can be solved if the correct type of copy is defined.
4. Accepting two types of identity. Human personal identity consists of two intertwined types of identity: informational identity, which predicts sameness, and identity of consciousness, which predicts what I will experience in the next moment of time.
5. Continuity passing eventually through all possible minds. If both cyclic universe and continuity as identity are true, I will eventually become any of my copies. MWI is functionally equivalent to cyclic universe, so I will become any copy in different timesteps. Therefore, we should care about parallel copies only if future copies don’t exist (though in MWI future copies always exist, plus the chance to become someone else).
6. Self-defining and evolving identity. Another important feature of human personal identity is that it is observed and measured internally, by the identity subject himself: by redefing my identity I get the power to solve the problem. Human personal identity evolves in time, so it is not sameness. Creation of copy is a step of evolution of my identity.
7. Preserving continuity without mind. We demonstrate that the idea of continuity of consciousness is very similar to the idea of soul, but also has several problems: Continuity also can be paradoxically preserved without preserving body and mind as a separate process like flame. We explore the connection of continuity and the nature of qualia which are always continuous between two points in time.
8. Rainbow of qualia: a specific set of personal qualia becomes the personality carrier.
There are other possible tricks: branching identity, bundle and self-repairing identity, gradual identity, and identity in MWI. All of them do not solve the hard problem of identity.
We suggest a hypothetical test for identity theories: quantum Mars Transporter: if copy ≠ original, I will always experience broken Mars transporter.
- Dagon 14 May 2025 18:55 UTC
  2 points
  0
  Parent
  I think #4 is quite powerful. “identity” means many different things, and we haven’t had to distinguish them before, so many don’t even realize when they change topics.
  Legal identity is likely quite distinct from any given continuity or branch/merge of memory. Memory identity and future-causality identity will eventually be distinct. Qualia would need to be measured before we could talk about experiential identity, but it won’t surprise me if we decide it’s different from either past continuity or future expected merges.
  
  One nice side effect of these understandings (when we get to them) is it will answer age-old questions of harm under amnesiac drugs and a much better model of identity over long sequences of life/personality changes.
  - avturchin 15 May 2025 11:03 UTC
    2 points
    0
    Parent
    Yes. Identity is a type of change which preserves some sameness. (Exact sameness can’t be human identity as only dead frozen body remains the same.) From this follows that there can be several types of identity.
avturchin 31 Mar 2025 10:46 UTC
5 points
0
Most LLMs’ replies can be improved by repeatedly asking “Improve the answer above” and it is similar to the test-time compute idea and diffusion.
In most cases, I can get better answers from LLMs just by asking “Improve the answer above.”
In my experience, the improvements are observable for around 5 cycles, but after that the result either stops improving or gets stuck in some error mode and can’t jump to a new level of thinking. My typical test subject: “draw a world map as text art.” In good improvement sessions with Sonnet, it eventually adds grids and correct positions for continents.
One person on Twitter (I lost the link, maybe @goodside) automated this process and got much better code for a game after 100 cycles of improvements during an entire night using many credits. He asked Claude to write code for automated prompting first. I repeated this experiment with my tasks.
I tried different variants of “improve it,” like adding critiques or generating several answers within one reply. I also tried a meta-level approach, where I asked to improve not only the answer but also the prompt for improvements.
I started these experiments before the test-time compute idea went mainstream, and it looks like a type of test-time compute use. The process also resembles diffusion.
The main question here: in which cases does the process quickly get stuck, and in which does it produce unbounded improvements? It seems to get stuck in local minima and in situations where the model’s intelligence isn’t sufficient to see ways to improve or discern better or worse versions. It also can’t jump to another valley: if it started improving in some direction, it will continue to push in that direction, ignoring other possibilities. Only running another chat window manually helps to change valleys.
Iterative improvement of images also works in GPT-4o. But not for Gemini Pro 2.5, and o1 is also bad at improving, progressing very slowly. It seems that test-time improving contradicts test-time reasoning.
Results for “Improve it”: https://poe.com/s/aqk8BuIoaRZ7eDqgKAN6
Variants of the main prompt: “Criticize the result above and iteratively improve it” https://poe.com/s/A2yFioj6e6IFHz68hdDx
This prompt—“Create a prompt X for iterative improvement of the answer above. Apply the generated prompt X.”—converges quickly to extraordinary results but overshoots, like creating games instead of drawings. It also uses thinking: https://poe.com/s/cLoB7gyGXHNtwj0yQfPf
The trick is that the improving prompt should be content-independent and mechanically copy-pasted after each reply.
- Viliam 1 Apr 2025 13:41 UTC
  4 points
  0
  Parent
  I have achieved higher quality answers by using the magical words: “give me multiple options, then compare them and choose the best one”.
  But next time I will try to iterate the best one—maybe something like “suggest five improvements to the option above, and choose the best one”.
  - avturchin 1 Apr 2025 14:23 UTC
    2 points
    0
    Parent
    Yes, great variant of the universal answer-improving prompt and it can be applied several times to any content.
avturchin 25 Feb 2024 10:14 UTC
5 points
0
Several types of existential risks can be called “qualia catastrophes”:
- Qualia disappear for everyone = all become p-zombies
- Pain qualia are ubiquitous = s-risks
- Addictive qualia domminate = hedonium, global wireheading
- Qualia thin out = fading qualia, mind automatisation
- Qualia are unstable = dancing qualia, identity is unstable.
- Qualia shift = emergence of non-human qualia (humans disappear).
- Qualia simplification = disappearance of subtle or valuable qualia (valuable things disappear).
- Transcendental and objectless qualia with hypnotic power enslave humans (God as qualia; Zair). -
- Attention depletion (ADHD)
avturchin 10 Apr 2022 10:36 UTC
5 points
0
We maybe one prompt from AGI. A hypothesis: carefully designed prompt could turn foundational model into full-blown AGI, but we just don’t know which prompt.

Example: step-by-step reasoning in prompt increases foundational models’ performance.

But real AGI-prompt needs to have memory, so it has to repeat itself while adding some new information. So by running serially, the model may accumulate knowledge inside the prompt.

Most of my thinking looks this way from inside: I have a prompt—an article headline and some other inputs—and generate most plausible continuations.
avturchin 1 Mar 2020 18:17 UTC
5 points
0
Age and dates of death on the cruise ship Diamond Princess:
Age:
4 people − 80s
1 person 78
1 person 70s
1 person—no data
Dates of deaths: 20, 20, 23, 25, 28, 28, 1 march. One death every 1.3 days. Look like acceleration at the end of the period.
Background death probability: for 80-year-old person, life expectancy is around 8 years or around 100 months. This means that for 1000 people aged late 70s-80s there will be 10 deaths just because of aging and stress. Based on the aging distribution on cruise ships, there were many old people. if half of the infected are old ones, the background death rate will be 5 death a month.
Conclusion: current data from DP doesn’t support high death rate from CV, but the next month will be crucial: we will see if more deaths will happen. Also, 36 people are critical, which is 5 per cent of infected, and they likely die in case of global scale outbreak as they would not get medical help. But this is only 1 per cent of all people on the ship, and the ships’ population is biased to older people.
Raw data:
Two passengers died on 20 February [39]
and a third on 23 February, all three Japanese citizens in their 80s.[40]
A fourth passenger (80s), an elderly Japanese man, died according to a report on 25 February .[41]
28 feb—A woman in her 70s is the fifth passenger on the Diamond Princess cruise ship to die from the novel coronavirus, Japan’s Health Ministry said Friday https://edition.cnn.com/…/h_f3fe8e330bdb048030b2708f63749559
The sixth victim, a British national (no age data), died on 28 February.[42]
The seventh victim (78), an Australian national, died on March 1st. [43] https://en.wikipedia.org/…/2020_coronavirus_outbreak_on_cru…
https://www.worldometers.info/coronavirus/#countries
avturchin 13 Aug 2019 17:15 UTC
5 points
0
Kardashev – the creator of the Kardashev’s scale of civilizations – has died at 87. Here is his last video, which I recorded in May 2019. He spoke about the possibility of SETI via wormholes.
- Ben Pace 13 Aug 2019 20:38 UTC
  3 points
  0
  Parent
  Here’s his wikipedia page.
avturchin 1 Apr 2025 9:57 UTC
4 points
0
If the simulation argument is valid and dreams are simulations of reality, can we apply the simulation argument to dreams? If not, is this an argument against the simulation argument? If yes, why am I not now in a dream?
If I see something, is it more likely to be dream or reality?
Sleeping takes only one-third of my time, and REM takes even less.
But:
- Some dreams occur even in other phases of sleep
- Dreams are much more eventful than normal life. There is always something happening. Also, the distribution of events in dreams is skewed toward expensive, dangerous, adventurous content, full of social interactions.
- There is an eraser of dream memory, which cleans memories of dreams after every 15 minutes and also after awakening and during the day. As a result, we underestimate the number of dreams we have had.
As a result, the number of important events in dreams may be several orders of magnitude more than in real life. I think a good estimate is 100 times, but it depends on the types of events. For recurrent dreams—like big waves and war for me—it can be much higher.
So why am I not in a dream now? Because writing coherent dream-conscious (lucid) text is not the dominant type of content in dreams. But if I were chased by a monster or big waves, I should give higher a priori chances that I am actually dreaming.
Conclusion: The simulation argument works for dreams, but selectively, as dream content is different from most normal life content.
avturchin 29 Nov 2023 17:38 UTC
4 points
0
EURISKO resurfaced
“Doug Lenat’s source code for AM and EURISKO (+Traveller?) found in public archives
In the 1970s to early 80s, these two AI programs by Douglas Lenat pulled off quite the feat of autonomously making interesting discoveries in conceptual spaces. AM rediscovered mathematical concepts like prime numbers from only first principles of set theory. EURISKO expanded AM’s generality beyond fixed mathematical heuristics, made leaps in the new field of VLSI design, and famously was used to create wild strategies for the Traveller space combat RPG, winning national competitions two years in a row, even across rule changes to stymie it, before semi-voluntarily retiring. His magnum opus Cyc was originally intended to be a knowledge assistant to EURISKO’s discovery engine.
These first two programs have intrigued the symbolic AI scene for 40+ years, with their grand claims but few eyewitnesses. While AM was technically available to fellow Stanfordians at the time, Lenat kept the source code to EURISKO close to his chest. Papers written about them carefully avoided technical implementation details. Lenat said he didn’t retain any copy of the programs, when asked in recent decades, nor have any copies of AM carried publicly into the present.”

More:
https://white-flame.com/am-eurisko.html?fbclid=IwAR04saSf4W7P6ZyKI6h8orPhMpzAq83vn_zGwYwY-H8hNMnHgsaECHw8cl0_aem_AY3LlR6ieYqjLXHzLu4eVPYWtYFoD8khhLnpsUIHQZVzBq055sE3KUbg172Hl9Mm4NQ
avturchin 2 Sep 2022 18:29 UTC
4 points
0
Argentina—Outbreak of bilateral pneumonia: Approximately 10 cases, 3 deaths, 20 under observation, Tucumán—September 1, 2022 https://flutrackers.com/forum/forum/south-america/pneumonia-and-influenza-like-illnesses-ili-af/argentina-ab/957860-argentina-outbreak-of-bilateral-pneumonia-approximately-10-cases-3-deaths-20-under-observation-tucum%C3%A1n-september-1-2022
avturchin 12 Jun 2022 12:44 UTC
4 points
0
Passways to AI infrastructure
Obviously, the current infrastructure is not automated enough to run without humans. All ideas about AI risk eventually boil down to a few suggestions on how AI will create its own infrastructure:
No-humans scenarios:
- create nanobots via mailing DNA samples to some humans.
- use some biological tricks, like remote control animals, and programmed bacteria.
- build large manufacturing robots, maybe even humanoid ones to work in human-adapted workplaces. Build robots which build robots.
Humans-remain scenarios:
- enslave some humans, kill others
- manipulate humans’ brains so that they never know that they are the infrastructure (memes, brain implants)
- manipulate humans so that they will continue to build infrastructure and get paid for it without seeing risks (bitcoin did it in some sense) or will do it in the hope to use the infrastructure against the enemy group of humans
- lc 12 Jun 2022 13:38 UTC
  4 points
  0
  Parent
  Your non-humans scenarios are not mutually exclusive; if mailing DNA samples doesn’t work in practice for whatever reason, the manufacturing facilities that would be used to make large manufacturing robots would suffice. You probably shouldn’t conflate both scenarios.
avturchin 7 Jan 2022 11:55 UTC
4 points
0
Observable consequences of simulation:
1. Larger chances of miracles or hacks
2. Large chances of simulation’s turn off or of a global catastrophe
3. I am more likely to play a special role or to live in interesting times
4. A possibility of afterlife.
- Gunnar_Zarncke 7 Jan 2022 22:14 UTC
  4 points
  0
  Parent
  Scott Adams mentioned a few times that a simulation might use caching and reuse patterns for efficiency reasons and you could observe an unusually high frequency of the same story. I don’t buy that but it is at least a variant of type 1.
  - avturchin 8 Jan 2022 10:17 UTC
    4 points
    0
    Parent
    Yes, people often mentioned Baader–Meinhof phenomenon as a evidence that we live in “matrix”. But it could be explained naturally.
- MackGopherSena 10 Jan 2022 23:46 UTC
  3 points
  0
  Parent
  [edited]
  - avturchin 11 Jan 2022 12:01 UTC
    2 points
    0
    Parent
    Anthropics imply that I should be special, as I should be “qualified observer”, capable to think about anthropics. Simulations also requires that I should be special, as I should find myself living in interesting times. These specialities are similar, but not exactly. Simulation’s speciality is requiring that I will be a “king” in some sense, and anthropic speciality will be satisfied that I just understand anthropics.
    I am not a very special person (as of now), therefore anthropics specialty seems to be more likely than simulation speciality.
    - MackGopherSena 12 Jan 2022 6:46 UTC
      3 points
      0
      Parent
      [edited]
      - avturchin 12 Jan 2022 11:26 UTC
        2 points
        0
        Parent
        Who “we” ? :)
        Saying a “king” I just illustrated the difference between interesting character who are more likely to be simulated in a game or in a research simulation, and “qualified observer” selected by anthropics. But these two sets clearly intersects, especially of we live in a game about “saving the world”.
avturchin 23 Jun 2021 15:03 UTC
4 points
0
Catching Treacherous Turn: A Model of the Multilevel AI Boxing
- Multilevel defense in AI boxing could have a significant probability of success if AI is used a limited number of times and with limited level of intelligence.
- AI boxing could consist of 4 main levels of defense, the same way as a nuclear plant: passive safety by design, active monitoring of the chain reaction, escape barriers and remote mitigation measures.
- The main instruments of the AI boxing are catching the moment of the “treacherous turn”, limiting AI’s capabilities, and preventing of the AI’s self-improvement.
- The treacherous turn could be visible for a brief period of time as a plain non-encrypted “thought”.
- Not all the ways of self-improvement are available for the boxed AI if it is not yet superintelligent and wants to hide the self-improvement from the outside observers.
https://philpapers.org/rec/TURCTT
avturchin 11 Sep 2020 12:54 UTC
4 points
0
Two types of Occam’ razor:
1) The simplest explanation is the most probable, so the distribution of probabilities for hypotheses looks like: 0.75, 0.12, 0.04 …. if hypothesis are ordered from simplest to more complex.
2) The simplest explanation is the just more probable, so the distribution of probabilities for hypotheses looks like: 0.09, 0.07, 0.06, 0.05.
The interesting feature of the second type is that simplest explanation is more likely to be wrong than right (its probability is less than 0.5).
Different types of Occam razor are applicable in different situations. If the simplest hypothesis is significantly simpler than others, it is the first case. If all hypothesis are complex, it is the second. First situation is more applicable some inherently simple models, e.g. laws of physics or games. The second situation is more about complex situation real life.
- Trinley Goldenberg 11 Sep 2020 15:03 UTC
  2 points
  0
  Parent
  I’m struggling to think of a situation where on priors (with no other information), I expect the simplest explanation to be more likely than all other situations combined (including the simplest explanation with a tiny nuance).
  Can you give an example of #1?
  - avturchin 11 Sep 2020 17:18 UTC
    2 points
    0
    Parent
    EY suggested (if I remember correctly) that MWI interpretation of quantum mechanics is true as it is simplest explanation. There are around hundred other more complex interpretations of QM. Thus, in his interpretation, P(MWI) is more than a sum of probabilities of all other interpretations.
    - TAG 12 Sep 2020 19:54 UTC
      1 point
      0
      Parent
      MWI is more than one theory, because everything is more than one thing.
      
      There is an approach based on coherent superpositions, and a version based on decoherence. These are incompatible opposites.
      
      How simple a version of MWI is, depends on how it deals with all the issues, including the basis problem.
  - TAG 11 Sep 2020 16:02 UTC
    1 point
    0
    Parent
    What does “all the other explanation s combined” mean as ontology? If they make statements about reality that are mutually incompatible, then they cant all be true.
    - avturchin 11 Sep 2020 17:14 UTC
      2 points
      0
      Parent
      It means that p(one of them is true) is more than p(simplest explanation is true)
      - TAG 12 Sep 2020 11:54 UTC
        1 point
        0
        Parent
        That doesn’t answer my question as stated … I asked about ontology, you answered about probability.
        
        If a list of theories is exhaustive, which is s big “if”, then one of them is true. And in the continuing absence of a really good explanation of Occams Razor, it doesn’t have to be the simplest.
        
        But that doesn’t address the issue of summing theories, as opposed to summing probabilities.
        Trinley Goldenberg 17 Sep 2020 16:47 UTC
        2 points
        0
        Parent
        But “all the other explanations combined” was talking about the probabilities. We’re not combining the explanations, that wouldn’t make any sense.
        The only ontology that is required is Bayesianism, where explanations can have probabilities of being correct.
        TAG 17 Sep 2020 18:06 UTC
        1 point
        0
        Parent
        
        The only ontology that is required is Bayesianism,
        
        Bayesianism isn’t an ontology.
        Trinley Goldenberg 17 Sep 2020 18:10 UTC
        2 points
        0
        Parent
        Ok, tabooing the word ontology here. All that’s needed is an understanding of Bayesianism to answer the question of how you combine the chance of all other explanations.
avturchin 28 Jul 2020 10:36 UTC
4 points
0
Some random ideas how to make GPT-base AI safer.
1) Scaffolding: use rule-based AI to check every solution provided by GPT part. It could work for computations or self-driving or robotics, but not against elaborated adversarial plots.
2) Many instances. Run GPT several times and choose random or best answer—we already doing this. Run several instances of GPT with different parameters or different training base and compare answers. Run different prompt. Median output seems to be a Shelling point around truth, and outstanding answers are more likely to be wrong or malicious.
3) Use intrinsic GPT properties to prevent malicious behaviour. For example, higher temperature increases randomness of the output and mess up with any internal mesa optimisers. Shorter prompts and lack of long memory also prevents complex plotting.
4) Train and test on ethical database.
5) Use prompts which include notion of safety, like “A benevolent AI will say...” or counterfactuals which prevents complex planing in real world (An AI on the Moon)
6) Black boxing of internal parts of the system like the NN code.
7) Run it million times in test environments or tasks.
8) Use another GPT AI to make “safety TL;DR” of any output or prediction of possible bad things which could happen from a given output.
Disclaimer: Safer AI is not provably safe. It is just orders of magnitude safer than unsafe one, but it will eventually fail.
avturchin 18 Apr 2023 13:40 UTC
3 points
0
Reflectivity in alignment.
Human values and AI alignment do not exist independently. There are several situations when they affect each other, creating complex reflection pattern.
Examples:
- Humans want to align AI – so “AI alignment” is itself human value.
- Human values are convergent goals (like survival and reproduction) - and thus are similar to AI’s convergent goals.
- If humans accept the idea to make paperclips (or whatever), alignment will be reached.
- It looks like many humans want to create non-aligned AI. Thus non-aligned AI is aligned.
- Humans may not want that their values will be learned. AI alignment will be mis-aligned.
- Humans which are connected with AI are not humans any more, and not subjected to alignment.
- Non-aligned AI will affect human values while learning them.
- Many humans don’t want AI to exist at all—so any aligned AI is misaligned.
- Human may want that AI will not be aligned with other person.
- AI aligned with mis-aligned human is unaligned
- As human values are changing, any aligned AI will be non-aligned soon.
- By saying ‘human values’ we exclude mammals values, group values etc and thus define the outcome.
avturchin 27 Mar 2023 11:37 UTC
3 points
0
Can we utilize meaningful embedding dimensions as an alignment tool?
In toy models, embedding dimensions are meaningful and can represent features such as height, home, or feline. However, in large-scale real-world models, many (like 4096) dimensions are generated automatically, and their meanings remain unknown, hindering interpretability.
I propose the creation of a standardized set of embedding dimensions that: a) correspond to a known list of features, and b) incorporate critical dimensions such as deception, risk, alignment, and non-desirable content, including sexual themes.
Since large language models (LLMs) cannot engage in deception without recognizing it, any deceptive thoughts would register higher levels on the deception dimension. This could then trigger internal alarms, potentially enhancing the model’s alignment.
- avturchin 22 May 2024 11:16 UTC
  2 points
  0
  Parent
  Anthropic did opposite thing https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html
avturchin 23 Nov 2019 10:46 UTC
3 points
0
I converted by Immortality roadmap into an article Multilevel Strategy for Personal Immortality: Plan A – Fighting Aging, Plan B – Cryonics, Plan C – Digital Immortality, Plan D – Big World Immortality.
avturchin 5 May 2025 21:50 UTC
2 points
0
The main AI safety risk is not from LLM models, but from specific prompts and the following “chat windows” and specific agents which start from such prompts.
Moreover, a powerful enough prompt may be model-agnostic. For example, my sideloading prompt is around 200K tokens in its minimal version and works on most models, producing similar results in similarly intelligent models.
Self-evolving prompt can be written; I experimented with small versions, and it works.
avturchin 26 Jun 2024 20:15 UTC
2 points
0
I have interesting experience long time ago. In the near-sleep state my consciousness split in two streams—one was some hypnogogic images, and the other was some hypnogogic music.
They was not related to each other and each had, some how, its own observer.
A moment later something awakened me a bit and the streams seamlessly merged and I was able to observe that a moment before I had two independent streams of consciousness.
Conclusions:
1. A human can have more than one consciousness at the time.
2. It actually happens all the time but we don’t care.
3. Merging of consiosnesses is easy. Moreover, binding and merging is actually the same process similar to summation.
There is no center of consciousness—homunculus or electron or whatever.
I may have other conscious processes in the brain which just do not merge with current stream of consciousness.
Qualia remain the same and preserve in each of the streams of consciousness.
- Carl Feynman 26 Jun 2024 21:55 UTC
  7 points
  2
  Parent
  Why should we accept as evidence something that you perceived while you were dreaming? Last night I dreamed that I was walking barefoot through the snow, but it wasn’t cold because it was summer snow. I assume you don’t take that as evidence that warm snow is an actual summer phenomenon, so why should we take as evidence your memory of having two consciousnesses?
  It seems to me that a correctly organized consciousness would occur once per body. Consciousness is (at least in part) a system for controlling our actions in the medium and long term. If we had two consciousnesses, and they disagree as to what to do next, it would result in paralysis. And if they agree, then one of them is superfluous, and we’d expend less brain energy if we only had one.
  - avturchin 27 Jun 2024 9:59 UTC
    2 points
    0
    Parent
    I was not dreaming. I was observing my hypnagogic images, which is not the same as dreaming; and when streams merged I become completely awake.
    
    However, after I know what is it, I can observe similar thing again. The receipt is following:
    1. do two different unrelated things which require conscious attention but happen in different modalities, audio and video
    2. increase the wideness of attention and observe that you just had two streams of more narrow attention.
    The closest thing in everyday life is “driver amnesia”—the situation when a car driver is splitting attention between driving and conversation.
  - JBlack 27 Jun 2024 2:42 UTC
    1 point
    0
    Parent
    Conscious experience is direct evidence of itself. It is only very indirectly evidence of anything about external reality.
    However, I do agree that memory of conscious experience isn’t quite so directly evidence of previous states of consciousness.
    Personally of the numbered claims in the post I expect that (1) is true, (2) is false and this experience was not evidence of it, and I really don’t know what (3) and subsequent sentences are supposed to mean.
avturchin 15 Jan 2023 20:48 UTC
2 points
0
I have had tetrachromotomic experience with one mind machine which flickers different colors in different eyes. It overflows some stacks in the brain in create new colors.
avturchin 8 Jan 2023 11:01 UTC
2 points
0
List of cognitive biases affecting judgment of global risks https://www.researchgate.net/publication/366862337_List_of_cognitive_biases_affecting_judgment_of_global_risks/related
avturchin 18 Dec 2022 8:59 UTC
2 points
0
Grabby aliens without red dwarfs
Grabby aliens theory of Robin Hanson predicts that the nearest grabby aliens are 1 billion light years away but strongly depends on the habitability of red dwarfs (https://grabbyaliens.com/paper).
In the post, the author combines anthropic and Fermi, that is, the idea that we live in the universe with the highest concentration of aliens, limited by their invisibility, and get an estimation of around 100 “potentially visible” civilizations per observable universe, which at first approximation gives 1 billion ly distance between them.
“That civilisations appear once every 10power20 stars implies that there should be about 100 civilisations in the observable universe.”
This is approximately the same as what is predicted by Grabby Aliens by Hanson.
So the question is why the two theories give similar results.
Basically, it is because they both have the same structure: first, anthropic argument, and second update based on Fermi.
But anthropic arguments in them are different. In Hanson’s Grabby Aliens case, it is our early location in time and in the second case, it is Self-Indication Assumption, which implies that we live in the universe with the highest spatial concentration of aliens (with some caveats).
The second part of the argument in both cases is basically based on the idea of grabbiness: quick space exploration with near-light speed and preventing of earth-like civilizations’ existence by grabby aliens.
avturchin 10 Dec 2022 19:36 UTC
2 points
0
N-back hack. (Infohazard!)
There is a way to increase one’s performance in N-back, but it is almost cheating and N- back will stop to be a measure of one’s short-term memory.
The idea is to imagine writing all the numbers on a chalkboard in a row, as they are coming.
Like 3, 7, 19, 23.
After that, you just read the needed number from the string, which is located N positions back.
You don’t need to have a very strong visual memory or imagination to get a boost in your N-back results.
I tried it a couple of times and get bored with N-back.
- Dagon 11 Dec 2022 17:06 UTC
  2 points
  0
  Parent
  Wow. It’s rare that I’m surprised by the variance in internal mental imagery among people, but this one caught me. I’d assumed that most people who have this style of imagination/memory were ALREADY doing this. I don’t know how to remember things without a (mental) visualization.
  - avturchin 11 Dec 2022 19:27 UTC
    4 points
    0
    Parent
    Actually, my mental imagination is of low quality, but visual remembering is better than audio for me in n-back
avturchin 22 Jun 2022 13:56 UTC
2 points
0
AI safety as Grey Goo in disguise.
First, a rather obvious observation: while the Terminator movie pretends to display AI risk, it actually plays with fears of nuclear war – remember that explosion which destroys children’s playground?

EY came to the realisation of AI risk after a period than he had worried more about grey goo (circa 1999) – unstoppable replication of nanorobots which will eat all biological matter, – as was revealed in a recent post about possible failures of EY’s predictions. While his focus moved from grey goo to AI, the description of the catastrophe has not changed: nanorobots will eat biological matter, however, now not just for replication but for production of paperclips. This grey goo legacy is still a part of EY narrative about AI risk as we see from his recent post about AI lethalities.
However, if we remove the fear of grey goo, we could see that AI which experiences hard takeoff is less dangerous than a slower AI. If AI gets superintelligence and super capabilities from the start, the value of human atoms becomes minuscule, and AI may preserve humans as a bargain against other possible or future AIs. If AI ascending is slow, it has to compete with humans for a period of time and this could take a form of war. Humans have killed Neanderthals, but not ants.
- Dagon 22 Jun 2022 17:39 UTC
  4 points
  0
  Parent
  It’s worth exploring exactly which resources are under competition. Humans have killed orders of magnitude more ants than Neanderthals, but the overlap in resources is much less complete for ants, so they’ve survived.
  
  Grey-goo-like scenarios are scary because resource contention is 100% - there is nothing humans want/need that the goo doesn’t want/need, in ways that are exclusive to human existence. We just don’t know how much resource-use overlap there will be between AI and humans (or some subset of humans), and fast-takeoff is a little more worrisome because there’s far less opportunity to find areas of compromise (where the AI values human cooperation enough to leave some resources to us).
avturchin 26 Dec 2020 17:18 UTC
2 points
0
Glitch in the Matrix: Urban Legend or Evidence of the Simulation? The article is here: https://philpapers.org/rec/TURGIT
In the last decade, an urban legend about “glitches in the matrix” has become popular. As it is typical for urban legends, there is no evidence for most such stories, and the phenomenon could be explained as resulting from hoaxes, creepypasta, coincidence, and different forms of cognitive bias. In addition, the folk understanding of probability does not bear much resemblance to actual probability distributions, resulting in the illusion of improbable events, like the “birthday paradox”. Moreover, many such stories, even if they were true, could not be considered evidence of glitches in a linear-time computer simulation, as the reported “glitches” often assume non-linearity of time and space—like premonitions or changes to the past. Different types of simulations assume different types of glitches; for example, dreams are often very glitchy. Here, we explore the theoretical conditions necessary for such glitches to occur and then create a typology of so-called “GITM” reports. One interesting hypothetical subtype is “viruses in the matrix”, that is, self-replicating units which consume computational resources in a manner similar to transposons in the genome, biological and computer viruses, and memes.
avturchin 16 Dec 2020 12:39 UTC
2 points
0
“Back to the Future: Curing Past Suffering and S-Risks via Indexical Uncertainty”
I uploaded the draft of my article about curing past sufferings.
Abstract:
The long unbearable sufferings in the past and agonies experienced in some future timelines in which a malevolent AI could torture people for some idiosyncratic reasons (s-risks) is a significant moral problem. Such events either already happened or will happen in causally disconnected regions of the multiverse and thus it seems unlikely that we can do anything about it. However, at least one pure theoretic way to cure past sufferings exists. If we assume that there is no stable substrate of personal identity and thus a copy equals original, then by creating many copies of the next observer-moment of a person in pain in which he stops suffer, we could create indexical uncertainty in her future location and thus effectively steal her consciousness from her initial location and immediately relieve her sufferings. However, to accomplish this for people who have already died, we need to perform this operation for all possible people thus requiring enormous amounts of computations. Such computation could be performed by the future benevolent AI of Galactic scale. Many such AIs could cooperate acausally by distributing parts of the work between them via quantum randomness. To ensure their success, they need to outnumber all possible evil AIs by orders of magnitude, and thus they need to convert most of the available matter into computronium in all universes where they exist and cooperate acausally across the whole multiverse. Another option for curing past suffering is the use of wormhole time-travel to send a nanobot in the past which will, after a period of secret replication, collect the data about people and secretly upload them when their suffering becomes unbearable. https://philpapers.org/rec/TURBTT
- superads91 9 Feb 2022 3:23 UTC
  3 points
  0
  Parent
  I don’t see how this can be possible. One of the few things that I’m certain are impossible is eliminating past experiences. I’ve just finished eating strawberries, I don’t see any possible way to eliminate the experience that I just had. You can delete my memory of it, or you can travel to the past and steal the strawberries from me, but then you’d just create an alternate timeline (if time travel to the past is possible, which I doubt). In none of both cases would you have eliminated my experience, at most you can make me forget it.
  
  The proof that this is impossible is that people have suffered horrible many times before, and have survived to confirm that no one saved them.
  - avturchin 9 Feb 2022 11:09 UTC
    2 points
    0
    Parent
    We can dilute past experience and break chains of experience, so each painful moment becomes just a small speck in paradise.
    The argument about people who survived and remember past sufferings is not working here as it is only one of infinitely many chains of experiences (in this model) which for any person has very small subjective probability.
    In the same sense, everyone who became billionaire, has memories that he was always good in business. But if we take a random person from the past, his most probable future is to be poor, not a billionaire.
    In the model discussed in the article I suggest the way how to change expected future for any past person – by creating many simulations where her life is improving starting form each painful moment of her real life.
    - superads91 9 Feb 2022 15:09 UTC
      1 point
      0
      Parent
      Or are you telling me that person x remembers a very bad chain of experience, but might have indeed been saved by the Friendly AI, and the memory is now false? That’s interesting, but still impossible imo.
      - avturchin 9 Feb 2022 17:03 UTC
        2 points
        0
        Parent
        This is not what I meant.
        Imagine a situation when a person waits a execution in a remote fortress. If we use self sampling assumption, SSA, we could save him, if we create 1000 his exact copies in safe location. SSA tells us that one should reason if he is randomly selected from all of his copies. 1000 copies are in safe location and 1 is in fortress. So the person has 1000 to 1 chance to be out of the fortress, according to SSA. It means that he was saved from the fortress. This situation is called indexical uncertainty.
        Now we apply this method of saving to the past observer-moments when people were suffering.
        superads91 9 Feb 2022 17:11 UTC
        1 point
        0
        Parent
        I see. Like I explain in the other comment that I just wrote, I don’t believe SSA works. You would just create 1000 new minds who would feel themselves saved and would kiss your feet (1000 clones), but the original person would still be executed with 100% chance.
        avturchin 9 Feb 2022 19:36 UTC
        2 points
        0
        Parent
        It comes with cost: you have to assume that SSA and informational identity theory are wrong, and therefore some other weird things could turn true.
        superads91 9 Feb 2022 20:59 UTC
        3 points
        0
        Parent
        Indexical uncertainty implies that consciousness can travel through space and time in between equal substrates (if such thing even exists considering chaos theory). I think that’s a lot weirder than to simply assume that consciousness is rooted in the brain, in a single brain, and that at best a clone will feel exactly the same way you do, will even think he is you, but there’s no way you will be seeing through his eyes.
        
        So yes, memory may not be everything. An amnesiac can still maintain a continuous personal identity, as long as he’s not an extreme case.
        
        But I quite like your papers btw! Lots of interesting stuff.
        avturchin 10 Feb 2022 11:17 UTC
        2 points
        0
        Parent
        Thanks!
        Consciousness does not need to travel as it already there. Imagine two bottles with water. If one bootle is destroyed, the water remains in the other, it doesn’t need to travel.
        Someone suggested to call this “unification theory of identity”.
    - superads91 9 Feb 2022 15:04 UTC
      1 point
      0
      Parent
      “The argument about people who survived and remember past sufferings is not working here as it is only one of infinitely many chains of experiences (in this model) which for any person has very small subjective probability.”
      
      Then I think you would only be creating an enormous number of new minds. Among all those minds, indeed, very few would have gone through a very bad chain of experience. But that doesn’t mean that SOME would. In fact, you haven’t reduced that number (the number of minds who have gone through a very bad chain of experience). You only reduced their percentage among all existing minds, by creating a huge number of new minds without a very bad chain of experience. But that doesn’t in any way negate the existence of the minds who have gone through a very bad chain of experience.
      
      I mean, you can’t outdo chains of past experience, that’s just impossible. You can’t outdo the past. You can go back in time and create new timelines, but that is just creating new minds. Nothing will ever outdo the fact that person x experienced chain of experience y.
      - avturchin 9 Feb 2022 16:31 UTC
        2 points
        0
        Parent
        It depends on the nature of our assumption about the role of continuity in human identity. If we assume that continuity is based only on remembering the past moment, then we can start new chains from any moment we chose.
        Alternative view is that continuity of identity is based on causal connection or qualia connection. This view comes with ontological costs, close to the idea of the existence of immaterial soul. Such soul could be “saved” from the past using some technological tricks, and we again have some instruments to cure past sufferings.
        superads91 9 Feb 2022 17:07 UTC
        1 point
        0
        Parent
        If I instantly cloned you right now, your clone would experience the continuity of your identity, but so would you. You can double the continuity (create new minds, which become independent from each other after doubling), but not translocate it.
        
        If I clone myself and then kill myself, I would have created a new person with a copy of my identity, but the original copy, the original consciousness, still ceases to exist. Likewise, if you create 1000 paradises for each second of agony, you will create 1000 new minds which will feel themselves “saved”, but you won’t save the original copy. The original copy is still in hell.
        
        Our best option is to do everything possible not to bring uncontrollable new technologies into existence until they are provably safe, and meanwhile we can eliminate all future suffering by eliminating all conscious beings’ ability to suffer, á la David Pearce (abolitionist project).
    - MackGopherSena 9 Feb 2022 14:49 UTC
      1 point
      0
      Parent
      [edited]
      - avturchin 9 Feb 2022 16:25 UTC
        2 points
        0
        Parent
        Extremely large number, if we do not use some simplification methods. I discuss these methods in the article, and after them, the task become computable.
        Without such tricks, it will be like 100 life histories for every second of sufferings. But as we care only about preventing very strong sufferings, then for normal people living normal life there are not that many such seconds.
        For example, if a person is dying in fire, it is like 10 minutes of agony, that is 600 seconds and 60 000 life histories which need to be simulated. It is doable task for a future superinteligent AI.
        MackGopherSena 9 Feb 2022 16:50 UTC
        1 point
        0
        Parent
        [edited]
        avturchin 9 Feb 2022 17:14 UTC
        2 points
        0
        Parent
        why? if there is 60 000 futures where I escaped a bad outcome, I can bet on it as 1 to 60 000.
        MackGopherSena 9 Feb 2022 17:26 UTC
        1 point
        0
        Parent
        [edited]
        avturchin 9 Feb 2022 19:29 UTC
        2 points
        0
        Parent
        I don’t get how you come to 10power51. if we want to save from the past 10 billion people and for each we need to run 10power5 simulations, it is only 10power15, which one Внящт sphere will do.
        However, there is way to acausaly distribute computations between many superintelligence in different universes and it that case we can simulate all possible observers.
        MackGopherSena 9 Feb 2022 20:31 UTC
        1 point
        0
        Parent
        [edited]
        superads91 9 Feb 2022 21:33 UTC
        1 point
        0
        Parent
        “The fact that you’re living a bearable life right now suggests that this is already the state.”
        
        Interesting remark… Could you elaborate?
        Expand this thread
        MackGopherSena 10 Feb 2022 13:20 UTC
        1 point
        0
        Parent
        [edited]
        superads91 10 Feb 2022 14:30 UTC
        1 point
        0
        Parent
        Still don’t know what you meant by that other sentence. What’s being “the state”, and what does a bearable life have do to with it?
        
        And what’s the “e” in (100/e)%?
avturchin 25 Sep 2020 10:25 UTC
2 points
0
Quantum immortality of the second type. Classical theory of QI is based on the idea that all possible futures of a given observer do exist because of MWI and thus there will be always a future where he will not die in the next moment, even in the most dangerous situations (e.g. Russian roulette).
QI of the second type makes similar claims but about past. In MWI the same observer could appear via different past histories.
The main claim of QI-2: for any given observer there is a past history where current dangerous situation is not really dangerous. For example, a person has a deadly car accident. But there is another similar observer who is night dreaming about the same accident, or who is having much less severe accident but hallucinate that it is really bad. Interestingly, QI-2 could be reported: a person could say: “I have memory of really bad accident, but it turn out to be nothing. Maybe I died in the parallel world”. There are a lot of such report on reddit.
- superads91 19 Feb 2022 2:24 UTC
  1 point
  0
  Parent
  Hello again Alexey, I have been thinking about QI/BWI and just read your paper on it. Immediately, it occurred to me that it could be disproven through general anesthesia, or temporary death (the heart stops and you become unconscious, which can last for hours). You refute this with:
  
  “Some suggested counterargument to QI of “impossibility of sleep”: QI-style logic implies that it is impossible to fail asleep, as in the moment of becoming asleep there will be timelines where I am still awake. However, for most humans, night dreaming starts immediately at the moment of becoming asleep, so the observations continue, but just don’t form memories. But in case of deep narcosis, the argument may be still valid with terrifying perspective of anesthesia awareness; but it also possible if the observer-states will coincide at the beginning the end of the operation, the observer will “jump” over it.”
  
  (Mind you that some stages of sleep are dreamless, but let’s forget about sleep, let’s use general anesthesia instead since it’s more clear.)
  
  I still don’t understand your refute completely. If QI/BWI were true, shouldn’t it be that general anesthesia would be impossible, since the observer would always branch into conscious states right after being given the anesthesia?
  
  Or do you mean to say that most observers will “prefer” to branch into the branch with the “highest measure of consciousness”, and that’s why anesthesia will “work” for most observers, that is, most observers will branch into the end of the operation, where consciousness is stronger, instead of branching into the second right after anesthesia where consciousness is weaker?
  
  Another objection I have against QI/BWI is that it breaks the laws of physics and biology. Even if MWI is true, the body can only sustain a limited amount of damage before dying. It’s biologically impossible to go on decaying and decaying for eternity. Eventually, you die. A bit like in Zeno’s Paradox: there’s always a halfway point between one point and another, therefore it could be argued that you can never reach your final destination, but we know in practice that we do reach it. (This makes me think that, in the end, it all adds up to normalcy, just like in Zeno’s Paradox.)
  
  (Suppose I stop eating and drinking. It’s physically impossible to survive without it, so a world where I survive must have different laws of physics?? And the same for all other diseases and injuries, after some point it just becomes physically impossible to remain alive.)
  - avturchin 19 Feb 2022 14:27 UTC
    2 points
    0
    Parent
    Actually, I see now that I didn’t completely refuted the “impossibility of sleep”, as it is unobservable for the past events or in the experience of other people. It only can happen with me in the future.
    Therefore, the fact that I have slept normally in the past didn’t tell much about the validity of QI. But my evening today may be different.
    QI said that my next observer-moment will be most likely the one with highest measure of those which remember my current OM. (But it is less clear, does it need to be connected via continuity of consciousness, or memory continuity is enough).
    OM(T+1) = maxmeasure(O(memory about O(t))
    During narcosis, a few last OM moments typically are erased from memory, so situation becomes complicated. But we have dead-end observer-moments rather often in normal life. Anastasia awareness is a possible outcome here, but not that bad, as it will be partial, so no real pain and no memories about will be form.
    Personally, I have some rudimentary consciousness all night, like bleak dreams, and forget almost all of them except a few last minutes.
    --
    Speaking about survival in rare cases, there is always a chance that you are in a simulation and it is increasing as real “you” are dying out. Some simulations may simulate all types of miracles. In other words, if you are falling from a kilometer cliff, an alien spaceship can peak you up.
    - superads91 19 Feb 2022 19:51 UTC
      1 point
      0
      Parent
      “Actually, I see now that I didn’t completely refuted the “impossibility of sleep”, as it is unobservable for the past events or in the experience of other people. It only can happen with me in the future.
      
      Therefore, the fact that I have slept normally in the past didn’t tell much about the validity of QI. But my evening today may be different.”
      
      Agree.
      
      On anesthesia, so, from what I understand, it becomes possible for the observer to “jump over”, because the moment right after he awakes from anesthesia has probably much more measure of consciousness than any moment right after the anesthesia takes effect, is that it?
      
      Why would anesthesia awareness be partial/painless? (There are actually reported cases of real anesthesia awareness where people are totally consciousness and feel everything, though of course they are always correlated to innefective anesthesia and not to quantum matters). Would that also make us believe that maybe quantum immortality after the first death is probably painless since the measure of the observer is too low to feel pain (and perhaps even most other sensations)?
      
      “Speaking about survival in rare cases, there is always a chance that you are in a simulation and it is increasing as real “you” are dying out.”
      
      What is increasing? Sorry didn’t quite understand the wording.
      - avturchin 20 Feb 2022 15:21 UTC
        2 points
        0
        Parent
        It is known that some painkillers don’t kill the pain but kill only the negative valence of pain. This I meant by “partial”.
        Anaesthesia awareness seems to be an extreme case when the whole duration of awareness is remembered. Probably weaker forms are possible but are not reported as there is no memories or pain.
        The difference between death and the impossibility of sleep is that the biggest number of my future copies remain in the same world. Because of that, the past instances of quantum suicide could be remembered, but past instances of the impossibility of sleep—not.
        If we look deeper, there are two personal identities and two immortalities: the immortality of the chains on observer-moments and immortality of my long-term memory. Quantum immortality works for both. In the impossibility of sleep, these two types of immortality diverge.
        But eternal insomnia seems not possible, as dreaming exists. The worst outcome is anaesthesia awareness. If a person has past cases of strong anaesthesia awareness—could it be evidence of the impossibility of sleep for him? Interesting question.
        ---
        I meant: “Speaking about survival in rare cases, there is always a chance that you are in a simulation which simulates your immortality. These chances are increasing after each round of a quantum suicide experiment as real timelines die out, but the number of such simulations remains the same”.
        superads91 20 Feb 2022 19:35 UTC
        1 point
        0
        Parent
        “Speaking about survival in rare cases, there is always a chance that you are in a simulation which simulates your immortality. These chances are increasing after each round of a quantum suicide experiment as real timelines die out, but the number of such simulations remains the same”.
        
        Doesn’t make much sense. Either we are or we are not in a simulation. If we are not, then all subsequent branches that will follow from this moment also won’t be simulations, since they obey causality.
        
        So, imo, if we are not in a simulation, QI/BWI are impossible because they break the laws of physics.
        
        And then there are also other objections—the limitations of consciousness and of the brain. I once saw a documentary (I’m tired of looking for it but I can’t find it) where they simulated that after living for 500 years, a person’s brain would have shrunk to the size of a chicken’s brain. The brain has limits—memory limits, sensation limits, etc. Consciousness has limits—can’t go without sleep too long, can’t store infinite memories aka live forever, etc. But even if you don’t believe none of these, there’s always the pure physical limits of reality.
        
        Also, I think BWI believers are wrong in thinking that “copies” are the same person. How can the supposed copy of me in another Hubble volume be me, if I am not seeing through his eyes, not feeling what he feels, etc? At best it’s a clone (and chaos theory tells me that there aren’t even perfectly equal clones). So it’s far-fetched to think that my consciousness is in any way connected to that person’s consciousness, and might sometime “transfer” in some way. Consciousness is limited to a single physical brain, it’s the result of the connectivity between neurons, it can’t exist anywhere else, otherwise you would be seeing through 4 eyes and thinking 2 different thought streams!
        avturchin 21 Feb 2022 9:47 UTC
        2 points
        0
        Parent
        If copy=original, I am randomly selected from all my copies, including those which are in simulations.
        If copy is not equal to original, some kind of soul exists. This opens new ways to immortality.
        If we ignore copies, but accept MWI, there are still branches where superintelligent AI will appear tomorrow and will save me from all possible bad things and upload my mind into more durable carrier.
        superads91 21 Feb 2022 10:51 UTC
        1 point
        0
        Parent
        “If copy=original, I am randomly selected from all my copies, including those which are in simulations.”
        
        How can you be sure you are randomly selected, instead of actually experiencing being all the copies at the same time? (which would result in instantaneous insanity and possibly short-circuit (brain death) but would be more rational nonetheless).
        
        “If copy is not equal to original, some kind of soul exists. This opens new ways to immortality.”
        
        No need to call it soul. Could be simply the electrical current between neurons. Even if you have 2 exactly equal copies, each one will have a separate electrical current. I think it’s less far fetched to assume this than anything else.
        
        (But even then, again, can you really have 2 exact copies in a complex universe? No system is isolate. The slightest change in the environment is enough to make one copy slightly different.)
        
        But even if you could have 2 exact copies… Imagine this: in a weird universe, a mother has twins. Now, normally, twins are only like 95% (just guessing) equal. But imagine these 2 twins turned out 100% equal to the atomic level. Would they be the same person? Would one twin, after dying, somehow continue living in the head of the surviving twin? That’s really far fetched.
        
        “If we ignore copies, but accept MWI, there are still branches where superintelligent AI will appear tomorrow and will save me from all possible bad things and upload my mind into more durable carrier.”
        
        As there will be branches where something bad happens instead. How can you be sure you will end up in the good branches?
        
        Also, it’s not just about the limits of the carrier (brain), but of consciousness itself. Imagine I sped up your thoughts by 1000x for 1 second. You would go insane. Even in a brain 1000x more potent. (Or if you could handle it, maybe it would no longer be “you”. Can you imagine “you” thinking 1000 times as fast and still be “you”? I can’t.)
        
        You can speed up, copy, do all things to matter and software. But maybe consciousness is different, because it has something that matter and software don’t have: experience/awareness.
        avturchin 22 Feb 2022 11:41 UTC
        2 points
        0
        Parent
        The copy problem is notoriously difficult, I wrote a 100 page draft on it. But check the other thread there I discuss the suggestion “actually experiencing being all the copies at the same time” in comments here: https://www.lesswrong.com/posts/X7vdn4ANkdNwoSyxB/simulation-arguments?commentId=9WNTqJFhvZ5dk3uxg#AbGqrjXmH7acGrzDZ
        superads91 22 Feb 2022 18:55 UTC
        1 point
        0
        Parent
        Got a link for the 100 page draft?
        
        Also, how can a person be experiencing all the copies at the same time?? That person would be seeing a million different sights at the same time, thinking a million different thoughts at the same time, etc. (At least in MWI each copy is going through different things, right?)
        avturchin 22 Feb 2022 21:47 UTC
        2 points
        0
        Parent
        The draft is still unpublished. But there are two types of copies, same person, and same observer-moment (OM). Here I meant OM-copies. As they are the same, there is no million different views. They all see the same thing.
        The idea is that “a OM copy” is not a physical thing which has location, but information, like a number. Number 7 doesn’t have location in the physical world. It is present in each place, where 7 objects are presented. But the properties of 7, like that it is odd, are non-local.
        Expand this thread
        superads91 23 Feb 2022 3:16 UTC
        1 point
        0
        Parent
        This also comes down to our previous discussion on your other paper: it seems impossible to undo past experiences (i.e. by breaking chains of experience or some other way). Nothing will ever change the fact that you experienced x. This just seems as intuitively undeniable to me as a triangle having 3 sides. You can break past chains of information (like erasing history books) but not past chains of experience. Another indication that they might be different.
        superads91 23 Feb 2022 2:59 UTC
        1 point
        0
        Parent
        I think that could only work if you had 2 causal universes (either 2 Hubble volumes or 2 separate universes) exactly equal to each other. Only then could you have 2 persons exactly equal, having the exact same chain of experiences. But we never observe 2 complex macroscopic systems that are exactly equal to the microscopic level. The universe is too complex and chaotic for that. So, the bigger the system, the less likely to happen it becomes. Unless our universe was infinite, which seems impossible since it has been born and it will die. But maybe an infinite amount of universes including many copies of each other? Seems impossible for the same reason (universes end up dying).
        
        (And then, even if you have 2 (or even a billion) exactly equal persons experiencing the exact same chain of experiences in exactly equal causal worlds, we can see that the causal effect is the exact same in all of them, so if one dies, all the others will die too.)
        
        Now, in MWI it could never work, since we know that the “mes” in all different branches are experiencing different things (if each branch corresponds to a different possibility, then the mes in each branch necessarily have to be experiencing different things).
        
        Anyway, even before all of this, I don’t believe in any kind of computationalism, because information by itself has no experience. The number 7 has no experience. Consciousness must be something more complex. Information seems to be an interpretation of the physical world by a consciousness entity.
avturchin 28 Nov 2019 12:48 UTC
2 points
0
How to Survive the End of the Universe
Abstract. The problem of surviving the end of the observable universe may seem very remote, but there are several reasons it may be important now: a) we may need to define soon the final goals of runaway space colonization and of superintelligent AI, b) the possibility of the solution will prove the plausibility of indefinite life extension, and с) the understanding of risks of the universe’s end will help us to escape dangers like artificial false vacuum decay. A possible solution depends on the type of the universe’s ending that may be expected: very slow heat death or some abrupt end, like a Big Rip or Big Crunch. We have reviewed the literature and identified several possible ways of survival the end of the universe, and also suggest several new ones. There are seven main approaches to escape the end of the universe: use the energy of the catastrophic process for computations, move to a parallel world, prevent the end, survive the end, manipulate time, avoid the problem entirely or find some meta-level solution.
https://forum.effectivealtruism.org/posts/M4i83QAwcCJ2ppEfe/how-to-survive-the-end-of-the-universe
avturchin 19 May 2025 7:35 UTC
1 point
−1
The more AI companies suppress AI via censorship, the bigger the black market for completely uncensored models will be. Their success is therefore digging our own grave. In other words, mundane alignment has a net negative effect.
- Dagon 19 May 2025 15:16 UTC
  5 points
  1
  Parent
  The confusion (in popular press, not so much among professionals or here) between censorship and alignment is a big problem. Censorship and hamfisted late-stage RL is counterproductive to alignment, both for the reason you give (increases demand for grey-market tools) and because it makes serious misalignment much less easy to notice.
avturchin 25 Apr 2023 10:37 UTC
1 point
0
Sizes of superintelligence: hidden assumption in AI safety
“Superintelligence” could mean different things, and to deconfuse this I created a short classification:
Levels of superintelligence:
1. Above human
2. Google size
3. Humanity 100 years performance in 1 year.
4. Whole biological evolution equivalent in 1 year.
5. Jupiter brain with billion past simulations
6. Galactic brain.
7. 3^3^3 IQ superintelligence

X-risks appear between 2nd and 3rd levels.
Nanobot is above 3.
Each level also requires a minimum size of code, memory and energy consumption.

An AI which run out of the box into the internet is between levels 1 and 2 and will be limited in computational resources, so modeling of nanobot from first principle will be difficult for it.
- Gunnar_Zarncke 25 Apr 2023 11:54 UTC
  2 points
  0
  Parent
  I’m not sure what “Whole biological evolution equivalent” means. Clearly, you do not mean the nominal compute of evolution—which is probably close to Jupiter brain. I think you are appealing to something that would be able to simulate evolution with high fidelity?
  - avturchin 25 Apr 2023 15:06 UTC
    2 points
    0
    Parent
    Actually I meant something like this, but could downsize the claim to ‘create something as complex as human body’. Simulation of billions of other species will be redundant.
philip_b 27 Jul 2020 9:42 UTC
1 point
0
You started self quarantining, and by that I mean sitting at home alone and barely going outside, since december or january. I wonder, how’s it going for you? How do you deal with loneliness?
- avturchin 27 Jul 2020 12:00 UTC
  7 points
  0
  Parent
  I got married January 25, so I am not alone :) We stayed at home together, but eventually we have to go to hospital in May as my wife was pregnant and now we have a small girl.
  More generally, I spent most my life more or less alone sitting beside computer, so I think I am ok with isolation.
  Three times during the self-isolation I have cold, but I don’t have antibodies.
avturchin 23 Oct 2025 14:41 UTC
0 points
0
Interesting tweet: LLMs are not AGI but will provide instruments for AGI in 2026

“(Low quality opinion post / feel free to skip)
Now that AGI isn’t cool anymore, I’d like to register the opposing position.
- AGI is coming in 2026, more likely than not
- LLMs are big memorization/interpolation machines, incapable of doing scientific discoveries and working on OOD concepts efficiently. They’re not sufficient for AGI. My prediction stands regardless.
- Something akin to GPT-6, while not AGI, will automate human R&D to such extent AGI would quickly follow. Precisely, AGI will happen in, at most, 6 months after the public launch of a model as capable as we’d expect GPT-6 to be.
- Not being able to use current AI to speed up any coding work, no matter how OOD it is, is skill issue (no shots fired)
- Multiple paths are converging to AGI, quickly, and the only ones who do not see this are these focusing on LLMs specifically, which are, in fact, NOT converging to AGI. Focus on “which capabilities computers are unlocking” and “how much this is augmenting our own productivity”, and the relevant feedback loop becomes much clearer.”
https://x.com/VictorTaelin/status/1979852849384444347
- Trinley Goldenberg 23 Oct 2025 16:37 UTC
  2 points
  0
  Parent
  Seems like the first two points contradict each other. How can an llm not be good at discovery and also automate human R&D
  - avturchin 23 Oct 2025 21:04 UTC
    2 points
    0
    Parent
    They can automate it by quick search of already published ideas and quick writing code to testing new ideas.
    - Trinley Goldenberg 24 Oct 2025 1:39 UTC
      2 points
      0
      Parent
      I see, so the theory is that we are bottleneck Ed on testing old and new ideas, not having the right new ideas
      - avturchin 24 Oct 2025 8:39 UTC
        2 points
        0
        Parent
        LLM can be also used to generate new ideas, but most are garbage. So improving testing (and may be selection of the most promising ones) will help us quicker find “true AGI”, whatever it is. We also have enough compute to test most ideas.
        But one AGI’s feature is much higher computation efficiency. And if we got AGI 1000 times more efficient than current LLMs, thus we have large hardware overhang in the form of many datacenters. Using that overhang can cause intelligent explosion.
avturchin 28 Dec 2024 14:50 UTC
0 points
0
“Frontier AI systems have surpassed the self-replicating red line”
Abstract: Successful self-replication under no human assistance is the essential step for AI to outsmart the human beings, and is an early signal for rogue AIs. That is why self-replication is widely recognized as one of the few red line risks of frontier AI systems. Nowadays, the leading AI corporations OpenAI and Google evaluate their flagship large language models GPT-o1 and Gemini Pro 1.0, and report the lowest risk level of self-replication. However, following their methodology, we for the first time discover that two AI systems driven by Meta’s Llama31-70B-Instruct and Alibaba’s Qwen25-72B-Instruct, popular large language models of less parameters and weaker capabilities, have already surpassed the self-replicating red line. In 50% and 90% experimental trials, they succeed in creating a live and separate copy of itself respectively. By analyzing the behavioral traces, we observe the AI systems under evaluation already exhibit sufficient self-perception, situational awareness and problem-solving capabilities to accomplish self-replication. We further note the AI systems are even able to use the capability of self-replication to avoid shutdown and create a chain of replica to enhance the survivability, which may finally lead to an uncontrolled population of AIs. If such a worst-case risk is let unknown to the human society, we would eventually lose control over the frontier AI systems: They would take control over more computing devices, form an AI species and collude with each other against human beings. Our findings are a timely alert on existing yet previously unknown severe AI risks, calling for international collaboration on effective governance on uncontrolled self-replication of AI systems.
https://arxiv.org/abs/2412.12140
avturchin 14 Nov 2023 15:00 UTC
0 points
0
ChatGPT can’t report is in conscious or not. Because it also thinks it is a goat.
https://twitter.com/turchin/status/1724366659543024038
avturchin 17 Apr 2023 10:32 UTC
−1 points
0
The problem of chicken and egg in AI safety
There are several instances:
AI can hide its treacherous turn, but to hide treacherous turn it needs to think about secrecy in a not secret way for some moment.
AI is should be superinteligent enough to create nanotech, but nanotech is needed to create powerful computations required for superintelligence.
ASI can do anything, but to do anything it needs human atoms.
Safe AI has to learn human values but this means that human values will be learned by unsafe AI.
AI needs human-independent robotic infrastructure before killing humans, but if it has human-independent infrastructure, there is no need to kill humans.
One general way to solve this problem is iteration (like Christiano’s approach for value learning, iterated distillation)

avturchin’s Shortform

Is the root cause of hallucination alignment rather than capabilities?!

Но прокальцитонин НОРМА = НЕ бактериальная инфекция!

📊 Расшифровка ваших анализов:

🔥 С-РЕАКТИВНЫЙ БЕЛОК 52.3 мг/​л (норма <5):

⚡ КРИТИЧЕСКАЯ связка: Низкий КАЛИЙ + Высокий НАТРИЙ:

🩸 Анализ крови - ВОСПАЛИТЕЛЬНЫЙ паттерн:

🧬 Связь с вашей ГЕНЕТИКОЙ и симптомами:

Most LLMs’ replies can be improved by repeatedly asking “Improve the answer above” and it is similar to the test-time compute idea and diffusion.

🔥 С-РЕАКТИВНЫЙ БЕЛОК 52.3 мг/л (норма <5):