Codex doesn’t seem to understand that humans read things from top to bottom instead of just loading entire documents into our context windows all at once
lc
One reason that AIs probably seem to generalize “niceness” so well is that they aren’t intelligent enough to anticipate the exact distribution of the RL environments we will throw at them. Not being intelligent enough to figure out the exact shape of RL means that you basically have to be earnestly nice and hope for the best. However, this is not guaranteed to stay true as long as we need it to be true. As we continue to build more and more powerful AIs, they will get better at anticipating what sorts of moral conundrums to expect inside RL. That could cause the nice generalization we’re enjoying to suddenly narrow in a way we don’t expect in later iterations.
Nothing shows up your rationality for the farce it is like roulette. “God, 17 is a prime number, I should’ve had that”
Nitpick: the tweet in question does not actually use the phrases “all questions” or “blanket ban”. ChatGPT is interpreting ambiguity in a hyperbolic way to make the correction look stronger.
I did not notice this earlier, but given Habryka’s comment, I don’t think this is a nitpick at all. I think the AI is actively trying to make the user believe that the bill doesn’t ban advice, without saying that.
Yeah you’re probably right.
I didn’t see that portion of the bill you posted, and my earlier takeaway from 5.4’s objection was that it would have permitted the kinds of chat you mention. So I think your take is largely accurate, and I’m sorry for posting something like this without reading the bill all the way through; I wouldn’t have posted it if I had understood this beforehand.
If you read closely, the AI’s correction is actually only really objecting to the “related to”, which (in its interpretation) I guess is supposed to imply the idea that the bill was supposed to ban any advice “related to” medicine, engineering, etc. As Max H pointed out, there’s a reasonable read of the post that goes “bans AI from answering (some) questions related to several licensed professions like medicine...”, so the AI’s correction is assuming a reading of the above tweet that isn’t even correct.
I think the AI’s correction is designed to give the reader the impression that the bill doesn’t ban advice, without actually claiming that it doesn’t ban advice. I suspect that AI models are just not good/trustworthy enough to do this sort of thing yet. I still do think though that the tweet would be better saying “effectively ban” and removing the “related to”.
But do you see how shifting from ‘just reading’ to ‘interpreting implementation consequences’ means that the tweet may be claiming an effective ban and not a ban by the letter of the law itself?
I understand what you’re saying, but that amount of charity is inappropriate. If the OP wanted to say “effective ban”, they would have done that, and then the tweet wouldn’t have misled people. And in other contexts I am almost positive that rationalists would be able to immediately register this kind of conflation as antisocial; for instance, people made similar claims that SB 1047 would “ban open source”, and several of the people mentioned above thought that was just as mendacious.
They are literally misreading the law if the tweet says the bill “bans AI from answering questions related to several licensed professions”, and it literally doesn’t do that.
Seems reasonable to me to interpret the words “ban X” as a ban on X, not as ban on some subset of X. That is certainly how people who are responding to the tweet appear to be reading it.
It’s also not clear from the excerpt of the bill that you quoted that a chatbot simply prefacing its response with “I am not a lawyer / doctor / hair stylist / etc. but”, (which would be annoying but not catastrophic) is sufficient to avoid liability.
Here are the attached sources that the AI gave, & its quotes & paraphrases from the sections:
NY State Senate Bill 2025-S7263
Summary: “Imposes liability for damages caused by a chatbot impersonating certain licensed professionals.” Sponsor memo: the bill would prohibit a chatbot from giving responses or advice that, if taken by a natural person, would constitute unauthorized practice or unauthorized use of a professional title.

STATE OF NEW YORK — S7263 bill text
Section 390-f(2)(a): “A proprietor of a chatbot shall not permit such chatbot to provide any substantive response, information, or advice, or take any action which, if taken by a natural person” would constitute specified crimes of unauthorized professional practice or unauthorized use of title.

AI Chatbot Ban for Minors Passes Internet & Technology Committee, among 11 Bills
The Senate press release says S7263 would “prevent AI from impersonating certain licensed professionals” and “would prohibit chatbots from giving substantive responses, including information or advice, that can be mistaken for professional counseling.”
If there is actually any ambiguity here, I am willing to bet literally anyone on this website that, if the bill goes forward, the ambiguities will be resolved in favor of a less aggressive interpretation in subsequent edits.
Note: This shortform correcting a tweet is itself misleading; see this conversation between Habryka and myself.
As I have pointed out before, rationalists & technology workers have a self-serving bias that makes them partial to stories about public servants being stupid. This tweet about the NY AI bill is a great example of that:

The above post was quoted uncritically by Zvi, Eliezer, and Dean W. Ball, who each wrote some variation of “woe that state legislatures are so horrible”. The real bill is quite a bit narrower than the tweet suggests; from GPT-5.4:

New York Senate bill S7263 is officially titled “Imposes liability for damages caused by a chatbot impersonating certain licensed professionals.” Its operative language says a chatbot proprietor may not let the chatbot provide a substantive response, information, advice, or take an action that, if done by a natural person, would constitute unauthorized practice or unauthorized use of a professional title under specified provisions of the Education Law or Judiciary Law.

That is materially narrower than a blanket ban on AI “answering questions related to” medicine, law, dentistry, nursing, psychology, social work, engineering, etc. The bill targets impersonation/unauthorized practice, not every answer touching those subjects. An official Senate press release describing the same bill likewise says it would prohibit chatbots from giving substantive responses “that can be mistaken for professional counseling” and identifies it as legislation to prevent AI from “impersonating certain licensed professionals.”

So the post overstates the bill’s scope: the bill does not ban AI from answering all questions related to those professions; it bars a narrower category of chatbot conduct that would amount to unauthorized professional practice if done by a human.

Public policymakers are ultimately the people who are going to be drafting laws & organizing responses about AI safety, and the reception of those laws is going to be driven partly by priors about their efficacy, developed from stuff like this.
Boosting hyperbolic misinfo about particular bills, if it’s an attempt to build solidarity/shared intuitions with people in tech circles (which, to be clear, I don’t think this is), seems extremely counterproductive.
New release today (v0.3.3):
Released on the Chrome Web Store now.
A Firefox build is available on the releases page, but it’s still pending review.
Generated a website where you can view and search through corrections. Probably good if you want to get a sense of the type of things that this bot finds.
Made some simple improvements to the investigation workflow & prompt that should strictly reduce false positive rates & decrease jitter between post updates.
Upgraded model from GPT-5.2 to GPT-5.4.
Added Wikipedia support.
Corrections now stream as soon as the AI makes them instead of all at once at the end.
Highlighting and claim focus are more robust on long or dynamic pages, especially Substack and Wikipedia.
Media handling is smarter: better image occurrence tracking, image-only content/version changes, and improved video detection.
Clearer compatibility/error handling when the extension or API is out of date.
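To illustrate the streaming change above: instead of waiting for the full investigation to finish, corrections can be rendered as each one arrives. A minimal sketch of how a client might consume such a stream, assuming a newline-delimited JSON wire format (the record shape and format here are hypothetical, not the extension’s actual protocol):

```typescript
// Hypothetical correction record; the real extension's schema may differ.
interface Correction {
  claim: string;
  note: string;
}

// Accumulates raw stream chunks and yields complete NDJSON records as they
// arrive, buffering any trailing partial line until the next chunk comes in.
class CorrectionStream {
  private buffer = "";
  readonly corrections: Correction[] = [];

  push(chunk: string): Correction[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the incomplete tail for later
    const fresh = lines
      .filter((l) => l.trim().length > 0)
      .map((l) => JSON.parse(l) as Correction);
    this.corrections.push(...fresh);
    return fresh; // render these immediately instead of waiting for the end
  }
}
```

The buffering matters because network chunk boundaries won’t line up with record boundaries, so a record can arrive split across two chunks.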
While worms and viruses are nothing new, in many ways their destructive impact has been limited by the command-and-control capacity of the people manning them. You could hack a couple thousand end users’ desktops with an SMB 0-day, but so what; it’s not like you can actively try to pivot off of each of their privileges, except in a programmatic way. Now, with AI agents, you actually can.
I apologize, I was just joking.
Personally I would like to protest their influence over the U.S. government and AI policy specifically. I think that’s something a lot of people find immediately sympathetic, even people whose views on AI X-risk differ from ours. It’s also what I’m angry about right now in this moment.
Man, what haven’t they done?
If any groups are up for organizing it, I would love the chance to attend a protest at the a16z offices.
I was thinking about offering the ability to change the model, but this seems like a more general solution. I would conceptualize it less as an API and more as a “plugin investigative process” that the user could select, that would be subjected to benchmarks that we’d also use to optimize the “default” tool and that people could compare.
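To make the “plugin investigative process” idea concrete, here is a rough sketch (every name and type here is a hypothetical illustration, not the extension’s real API): a process is just a function from page text to flagged claims, and the same benchmark harness that tunes the default process can score any user-selected plugin:

```typescript
// All types and names here are hypothetical illustrations.
interface InvestigationResult {
  claim: string;
  isError: boolean; // did the process flag this claim as wrong?
}

// A pluggable investigative process: page text in, examined claims out.
type InvestigativeProcess = (pageText: string) => InvestigationResult[];

interface BenchmarkCase {
  pageText: string;
  errorClaims: string[]; // ground-truth claims that really are wrong
}

// Score a process by precision over labeled cases, so user-supplied plugins
// can be compared against the default process on the same benchmark.
function benchmark(proc: InvestigativeProcess, cases: BenchmarkCase[]): number {
  let flagged = 0;
  let correct = 0;
  for (const c of cases) {
    for (const r of proc(c.pageText)) {
      if (!r.isError) continue;
      flagged++;
      if (c.errorClaims.includes(r.claim)) correct++;
    }
  }
  return flagged === 0 ? 0 : correct / flagged;
}
```

Precision is just one axis; a real harness would presumably also track recall and the “jitter” between repeated runs mentioned in the changelog.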
When in fact the post was published on December 3, 2025.
Probably just a bug. It has to grab all this shit from the SPA DOM.
If your goal is to tell us “here’s what the extension is like, also be aware that some of the corrections are wrong/unhelpful”, then fine.
Yeah, that was why I left the example in. Hopefully it will get better soon:

I’m trying to build out an ARCHITECTURE.md for a new program and it is literally using terms out of order, before they’re properly defined.