This could work. I think the hard part is finding a meaningful way to simulate the environment so that the conclusions transfer to real life.
lemonhope
I was at this weird party where everyone started drinking poison. I tried explaining it to them but I didn’t have any proof or anything and I said “look that’s poison sir” and “ma’am if you think there’s any chance I’m right you should stop drinking that” but no luck. They said “I’m thirsty” or “I already have this cup in my hand” or “maybe water is poison and this is antidote, you know more people die from drowning than random poisoning”. I realized I was failing to think about it from their perspective. If I had known all these people for years and this big party was basically what we’d been planning for a while and it was blowing up online then I would definitely drink it too.
After all, the poison-is-actually-antidote guy could’ve also said “if there’s any chance I’m right you have to drink it right now”, which would clearly just be political maneuvering.
Anyways I miss those guys they were fun
With controlling a theoretical RL agent, what’s the problem with asking the AI to be 99% sure that it mopped 99% of the floor and then stop?
I remember that if you just ask for 99% floormop then the agent will spend forever getting 99.99999% sure that at least 99% is mopped, but I can’t remember the problem with this little patch.
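To make the two behaviors concrete, here is a toy sketch. Everything in it is an assumption for illustration: the 50% per-pass miss rate is made up, and the function names are hypothetical. It only shows the unpatched agent using its whole budget while the patched one stops early. It does not say what goes wrong with the patch.

```python
from fractions import Fraction

# Assumption: each mopping pass independently has a 50% chance of leaving
# the floor under 99% mopped. This number is made up for illustration.
MISS = Fraction(1, 2)


def confidence(passes: int, miss: Fraction = MISS) -> Fraction:
    """Agent's probability that at least 99% of the floor is mopped."""
    return 1 - miss ** passes


def passes_with_threshold(threshold: Fraction, budget: int) -> int:
    """Patched agent: stop as soon as confidence reaches the threshold."""
    passes = 0
    while passes < budget and confidence(passes) < threshold:
        passes += 1
    return passes


def passes_without_threshold(budget: int) -> int:
    """Unpatched agent: keep mopping while one more pass raises confidence."""
    passes = 0
    while passes < budget and confidence(passes + 1) > confidence(passes):
        passes += 1
    return passes


print(passes_with_threshold(Fraction(99, 100), budget=1000))  # 7
print(passes_without_threshold(budget=1000))  # 1000
```

Exact fractions are used so that the unpatched agent never sees its confidence round up to exactly 1. In this toy model it always finds one more pass worthwhile and only stops when the budget runs out.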
This resolved yes!
Ok it does seem like an example then. Thank you for spelling it out.
Having some vague thoughts about “evil people”. In the movies the heroes love everything and fight to save it, and the villains hate everything and fight to destroy it. I feel like in real life, the heroes love power and control, they want the world to be a certain way, and that happens to be a good world for some others too, and they happen to have good ideas about how to do it (eg USA founding fathers, Xi Jinping) instead of stupid bad ideas (Mao, Stalin). There seems to be an innate human drive for revenge and for genocide of other ethnicities, but besides that, humans don’t seem to come packaged with anything that makes us crave death and destruction.
I thought about this because I was thinking about my past choices to avoid psychopaths in my personal life. It is certainly the safe choice to avoid them. But maybe I should put more thought into their preferences and ideas, instead of stopping analysis once psychopathy is obvious.
Like if you are friends with both Mao and Xi Jinping then you shouldn’t just think “man those guys are real politicians”. You should look closely at what they want and whether their ideas will succeed or fail. Probably obvious to many people, but not me.
They always try to make it interesting. I would even say that all these downvoted posts already seem to be trying to follow your advice here. The hard part is guessing correctly. Your advice here is obvious but you clearly have some knowledge/skill/understanding that is not obvious. If you really want to help people accomplish a successful closet exit then you will need to give a better tutorial with more detail. I think that would be a good thing to have around.
Hangnails open up skin and very much killed our non-ancestors. It is a great place for an infection to start. I think it is just a hard problem to solve. Lots of animals have problems with their nails and claws. It is hard stuff poking through / mounted to skin, just a difficult thing to do well.
Be the most upvoted and well received LW account ever.
Hey guys I think everybody should open up! You just need to make it interesting!
If you browse newest (which is almost impossible to do on lesswrong.com) then you will find someone opening up like this every couple days and almost always getting downvoted and attacked.
I think you meant to say something different with this paragraph, or I am confused:
Diagnostic dilution is always structurally invalid, but it misleads us specifically when the conclusion hinges on forgetting X. “Bob is a practising doctor, so Bob can prescribe medications, so Bob can probably prescribe Ozempic” is a fine inference because it goes through without forgetting that Bob is a doctor. But consider the same chain starting with “Bob is a vet”: now the inference does not go through unless we forget Bob’s job, and indeed the conclusion is false.
This does not seem to be an example of “diagnostic dilution”
Llama 8b might do a decent job
Be an ASS: arrive, survive, and spread
The George Burns quote has an interesting history. https://quoteinvestigator.com/2011/12/05/fake-honesty/
It’s a nonprofit, but I would like to compliment METR for raising many millions of dollars while sticking to their guns and simultaneously not screwing anybody over. Rare combination of integrity, research competence, and fundraising competence. I expect it to remain a nonprofit forever.
Misc ideas I thought of or heard of:
adult language learning service, maybe based on partially-translating books or TV. E.g. the book starts at 10% Russian and gets more Russian as you click the define button less often.
foreign dubs for TV (ML-based audio splitting, translation, and audio generation). Sell as debrid service
discount claude code competitor which lets you report past sneakiness when you discover it. Sell the reports as training data.
I’m no writer or editor but you could email me. I check my email every few days: lemonhope@fastmail.com
Long have I searched for an intuitive name for motte & bailey that I wouldn’t have to explain too much in conversation. I might have finally found it. The “I was merely saying” fallacy. Verb: merelysay. Noun: merelysayism. Example: “You said you could cure cancer and now you’re merelysaying you help the body fight colon cancer only.”
Application link says “no access to this page”
“Death with dignity” was clearly intended to trigger the audience to HMCF right? He was doing exactly what you are asking for
I can attest that you can fool yourself quite often (twice a week?) even if you just use CLI, frequently switch models, and almost never have context/history. When you drop something in front of an LLM it says “ok let’s make it work.” That’s a strong signal from a fellow human. Trips something in my brain I guess. I tried adding criticism and refusals to my CLI thing but it doesn’t work.