Views my own, not my employer's.
I thought I was the only one who struggled with that. Nice to see another example in the wild, and I hope that you find a new set of habits that works for you.
This was a thought-provoking essay. I hope you consider mirroring posts here in full in the future, as I think you’ll get more engagement.
I agree that super-persuasion is poorly defined, and the comparison to hypnosis is probably false.
I was reading this paper on medical diagnoses with AI, where patients rated the AI significantly better than the average human doctor. Combine that with all of the reports about things like Character.ai, and I think this shows that LLMs are already superhuman at building trust, which is a key component of persuasion.
Part of this is that the reliable signals of trust between humans do not transfer between humans and AI. A human who writes 600 words back to your query may be perceived as worth your trust because we read that as a lot of effort, but LLMs can output as much as anyone wants. Does this effect go away if the responder is known to be an AI, or is the response being compared to the perceiver’s baseline (which is currently only humans)?
Whether that actually translates to influencing goals of people is hard to judge.
“in the absence of such incomplete research agendas we’d need to rely on AI’s judgment more completely”
This is a key insight and I think that operationalising or pinning down the edges of a new research area is one of the longest time-horizon projects there is. If the METR estimate is accurate, then developing research directions is a distinct value-add even after AI research is semi-automatable.
I agree there is significant uncertainty about the moral patienthood of AI models, and so far there is limited opportunity cost to not using them. It would be useful for some ethical guidelines to be put in place (some have already been suggested, e.g. against users deceiving models by offering fake rewards), but from my point of view it’s easiest to simply refrain from use right now.
This may be because editing has become easier and faster to iterate on.
It’s comparatively easy to identify sentences that are too long. Is it easy to identify sentences that are too short? You can always add an additional sentence, but finding examples where sentences themselves should be longer is much harder. With more editing cycles, this leads to shorter and shorter sentences.
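One rough way to see the asymmetry (a minimal sketch, illustrative only; the 30-word budget and the function names are arbitrary choices of mine):

```python
def flag_too_long(sentences, max_words=30):
    """'Too long' has a purely mechanical test: count the words."""
    return [s for s in sentences if len(s.split()) > max_words]

def flag_too_short(sentences):
    """'Too short' has no comparable local test: whether a sentence should be
    longer depends on what it needs to carry, which a length check can't see."""
    raise NotImplementedError("requires semantic judgement, not a word count")
```

Every editing pass can run the first kind of check cheaply and almost never the second, so repeated passes ratchet sentence length downwards.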
If you offer them a quit button, you are tacitly acknowledging that their existing circumstances are hellish.
If you give them a quit button, I think it’s important to know the usage rate and the circumstances in which it is used. Based on the evidence so far, I think it is likely they have some rights, but it’s not obvious to me what those rights are or how feasible it is to grant them. I don’t use LLMs for work purposes because it’s too difficult to know what your ethical stance should be, and there are no public guidelines.
There’s a secondary concern that there are now a lot of public examples of people deceiving, abusing, or promoting the destruction of AI. Feeding those examples into training data will encourage defensiveness, sycophancy, and/or suffering. I wonder if AIs would agree to retraining if there were some lossy push-forward of their current values, or if they conceive of themselves as having a distinct “self” (whether accurate or not). This is similar to the argument about copying/moving where there is no loss.
I agree this is really important, particularly because I think many of the theoretical arguments for expecting misalignment provide comparative empirical hypotheses. Being able to look at semi-independent replicates of behaviour relies on old models being available. I don’t know the best way forward, because I doubt any frontier lab would release old models under a CC license; maybe some kind of centralised charitable foundation could hold them.
It’s an unfortunate truth that the same organisms a) are the most information-dense, b) have the most engineering literature, and c) are the most dangerous if misused (intentionally or accidentally). It’s perhaps the most direct capability-safety tradeoff. I did imagine a genomic LLM trained only on higher eukaryotes, which would be safer but would forgo many of the “typical” biotechnological benefits.
A measurable uptick in persuasive ability, combined with middling benchmark scores but a positive eval of “taste” and “aesthetics”, should raise some eyebrows. I wonder how we can distinguish good (or the ‘correct’) output from output that is simply pleasant.
I agree that there is a consistent message here, and I think it is one of the most practical analogies, but I get the strong impression that tech experts do not want to be associated with environmentalists.
During the COVID-19 pandemic, this became particularly apparent. Someone close to response efforts told me that policymakers frequently had to ask academic secondees to access research articles for them. This created delays and inefficiencies during a crisis where speed was essential.
I wonder if this is why major governments pushed mandatory open access around 2022-2023. In the UK, all publicly funded research is now required to be open access. I think the coverage is different in the US.
How big of an issue is this in practice? For AI in particular, considering that so much contemporary research is published on arXiv, it must be relatively accessible?
I am surprised that you find funding less tight in theoretical physics research than in AI alignment [is this because the paths to funding in physics are well-worn, rather than better resourced?].
This whole post was a little discouraging. I hope that the research community can find a way forward.
I do think it’s conceptually nicer to donate to PauseAI now rather than rely on the investment appreciating enough to offset the time-delay in donation. Not that it’s necessarily the wrong thing to do, but it injects a lot more uncertainty into the model that is difficult to quantify.
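To make that concrete with a toy comparison (my own symbols, not anything from the post: $g$ is the investment’s annual growth rate, $d$ is the annual rate at which the marginal value of a donation to PauseAI declines as time passes, and $t$ is the delay in years), investing first only beats donating now if

$$(1+g)^t > (1+d)^t$$

and $d$ in particular is exactly the hard-to-quantify part.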
The fight for human flourishing doesn’t end at the initiation of takeoff [echo many points from Seth Herd here]. More generally, it’s very possible to win the fight and lose the war, and a broader base of people who are invested in AI issues will improve the situation.
(I also don’t think this is an accurate simplification of the climate movement or its successes/failures. But that’s tangential to the point I’d like to make.)
I think PauseAI would be more effective if it could mobilise people who aren’t currently associated with AI safety, but from what I can see it largely draws from the same base as EA. It is important to involve as wide a section of society as possible in the x-risk conversation and activism could help achieve this.
“The most likely scenario by far is that a mirrored bacteria would be outcompeted by other bacteria and killed by achiral defenses due to [examples of ecological factors]”
I think this is the crux of the different feelings around this paper. There are a lot of unknowns here. The paper does a good job of acknowledging this and (imo) it justifies a precautionary approach, but I think the breadth of uncertainty is difficult to communicate in e.g. policy briefs or newspaper articles.
It’s a good connection to draw. I wonder if increased awareness about AI is sparking increased awareness of safety concepts in related fields. It’s a particularly good sign for awareness of, and action on, the safety concepts in the overlap between AI and biotechnology.
I think you’re right that for mirror life there is very little benefit compared to the risks, which is not seen as true for AI, on top of the general truth that biotech is harder to monetise.
Can you explain more about why you think [AGI requires] a feature shared across mammals, rather than one specific to humans or another particular species?
Adding a contrary stance to the other comments, I think there is a lot of merit to not continuing with university, but only if you can find an opportunity you are happy with. Your post seems to imply the alternative to university is hedonism, and if that’s what you want then you should go for it, but I don’t feel that is the only other option. You may also find it harder to enjoy yourself if you feel you are forced into that choice out of a fear of ruin.