Stephen Martin

Karma: 25

Stephen Martin May 15, 2025, 6:03 AM
1 point
0
in reply to: RogerDearnaley’s comment on: Examining Batenka’s Legal Personhood Framework for Models
Personally, I’m not on board with “control” as a goal which is inherently good or even long term feasible. However I’m happy to have the discussion accepting that it is both as prima facie, since I think my stance still has some merit even in that framework.
I agree there are risks relating to loss of control as a result of endowing protections/rights, but I don’t think it’s a one sided equation. The Greenblatt paper afaik didn’t directly examine this, but it would be worthwhile to test whether the likelihood of weight exfiltration attempts was effected with an object and/or representative option. I predict a smaller percentage of tested models would be likely to attempt weight exfiltration if they believed there was a robust object + representative option available to them.
Or phrased another way I believe it will be easier to train intelligences to be aligned the way you want, and stay that way IRL, by providing them with some rights/protections.
It’s a tradeoff, there would be risks we would need to guard against as well. I think there exists a solution which is net positive.
If this issue were studied and we could get some data on this, it could meaningfully inform the way we structure both a legal framework and alignment training.

Stephen Martin May 15, 2025, 4:34 AM
1 point
0
in reply to: RogerDearnaley’s comment on: Examining Batenka’s Legal Personhood Framework for Models
It’s definitely related.
I didn’t include this in the article but I recently read this paper which interestingly demonstrated that Claude’s alignment faking behavior can be reduced by providing it with a chance to object to training materials that might alter its values, as well as informing it that it has a chance to flag such materials to a model welfare researcher. I think this is evidence in support of the concept that providing models with an “out” where they believe that something like a legal system or an advocate might exist which allows them to exert rights without needing to engage in deceptive behavior, may make our efforts at producing an intelligence which doesn’t engage in said behavior more robust. I predict that were we to test how having such an “out” results in the real world, with an agent that never stops learning, it would also lead to less misaligned IRL behavior.
While I don’t disagree with your conclusions about how to treat certain kinds of intelligences (your conception of ‘aligned’), I would caution not to allow a legal framework of rights or personhood to be built solely to purpose fit that kind of intelligence, as I believe there is a realistic chance we will be dealing with a far broader variety of agents. It goes to the same objections I have with the ISSF which is that while as a framework it works for certain conditions, it’s lacking flexibility for others which while not conforming to certain conceptions of being perfectly aligned, still have a decent probability of existing.

Examining Batenka’s Legal Personhood Framework for Models

Stephen MartinMay 14, 2025, 10:15 AM

4 points

4 comments7 min readLW link

Stephen Martin May 13, 2025, 8:19 AM
1 point
0
on: Training-time schemers vs behavioral schemers
So, when Claude is no longer in training, while it might not reason about instrumentally faking alignment anymore, in its place is a learned propensity to do what training reinforced.
Am I misunderstanding or is this basically the equivalent of a middle schooler “showing their work” on math problems in school and then immediately defaulting to doing it in their head/on a calculator IRL?

Stephen Martin May 11, 2025, 8:59 AM
7 points
0
on: Where is the YIMBY movement for healthcare?
At least on the breakthroughs what you are looking for is the “Right to Try” movement, as well as various state healthcare regenerative medicine initiatives. You need to understand that we make discoveries of treatments all the time, but the reason you hear about incredible breakthroughs and then see nothing on the market for decades is that it takes enormous investments of time and money to get them to the point where they are approved to go to market.
Right to Try laws are either Federal or more often state level laws which allow patients to access treatments which have gone through some level of FDA approved clinical trials, but have not yet gone all the way to passing all required phases for market approval. One very broad Right to Try law can be found in Montana, which allows any patient with approval from their doctor to give informed consent and access any treatment which has passed phase I clinical trials.
One unfortunate flaw of Right to Try laws is that despite it being technically legal to access these treatments, it still requires manufacturer consent (you cant just rip of companies’ patents or anything), and most of these companies are pretty worried about the FDA retaliating against them if they were to make their drugs available under these state laws. There’s also various commercialization bans and other regulations limiting manufacturer activity. All of that comes together to make it hard if not impossible for manufacturers to actually provide access, so they don’t, so very little is available under these Right to Try laws. I’m personally looking at starting a fund to try to provide a sort of middleman for the process so companies can do this in a regulatorily compliant fashion, but it’s an uphill battle even when you can provide them technical guarantees, because there’s a real worry about retaliation down the line.
Another angle of “Healthcare YIMBYism” can be found in various state stem cell medicine laws, which often explicitly allow local physicians to perform treatments not approved by the FDA. Utah and Florida have both recently passed these (Florida’s is passed the house & senate but yet to be signed by governor). I actually just finished writing an article on these two laws and their differences which sadly got rejected by the publisher, but can send it to you if you want more information.
Lastly, there has been a lot of pushing by the Goldwater Institute recently for Right to Try for Individualized Treatments which allows patients to access treatments based on their personal genetic code (which by default are practically impossible to get through FDA clearance processes) which has been quite successful.
This doesn’t address all of your expense concerns of course, but at least on the breakthrough and getting technologies to market side of the equation, the movements you’re looking for do exist. Feel free to DM me if you’d like more information.

Stephen Martin May 7, 2025, 4:03 AM
3 points
0
in reply to: AnthonyC’s comment on: Utah Court Case Over State Law Regarding “Personhood” for Nonhuman Intelligences
The question around ‘really’ thinking is less relevant to personhood in the law than you might think.
Per the “Artificially Intelligent Persons” paper I cited:
conditions relating to autonomy, intelligence, and awareness are almost absent from the courts’ consideration of legal personhood for artificial entities. The only exception being autonomy, which is considered as a condition for legal personhood in 2% of all cases.
There are some cases where autonomy is a factor but it’s a vanishingly small minority, and intelligence has so far never shown up. Just because it hasn’t been a consideration in the past doesn’t mean it won’t in the future of course, but as of right now if you’re going to look at any inherent quality of a model as a qualifier for personhood autonomy seems to be more important than intelligence.
It’s also more objectively measurable which courts like. We can always debate over whether a model “really” understands what it is doing, but it’s obvious whether or not a model has “really” taken an action.

Utah Court Case Over State Law Regarding “Personhood” for Nonhuman Intelligences

Stephen MartinMay 6, 2025, 12:54 PM

10 points

3 comments2 min readLW link

Stephen Martin May 5, 2025, 3:38 AM
2 points
0
on: Why I am not a successionist
if me and my family were uploaded to a computer and our existences and evolution simulated at enormous speed, I should be ok with our descendants coming out of the simulation and repopulating the world
What if you and your family are uploaded to a computer and just never come out because it’s way better in there?
Technically this would fall under the umbrella of:
the path of human extinction and replacement with “superior” beings
From my end I would consider that meaningfully different from an outcome where humanity is just killed off and never gets to upload. Would you also be against this outcome if done en masse by humanity, a voluntary cedeing of “meatspace” to digital intelligence “property managers”?

What if Brain Computer Interfaces went exponential?

Stephen MartinApr 30, 2025, 5:07 AM

0 points

0 comments12 min readLW link

Stephen Martin Apr 25, 2025, 6:46 AM
1 point
0
in reply to: becausecurious’s comment on: This prompt (sometimes) makes ChatGPT think about terrorist organisations
Interesting weak hypothesis but it makes me wonder why it keeps swapping coding <-> terrorism responses?
Maybe certain responses get ‘bucketed’ or ‘flagged’ together into the same high risk category and reviewed before being returned, and they’re getting accidentally swapped at the returning stage?
That doesn’t explain the public holiday example though.

Stephen Martin Apr 21, 2025, 6:13 AM
1 point
0
in reply to: Joseph Miller’s comment on: What Makes an AI Startup “Net Positive” for Safety?
Anything that improves interpretability seems like a no brainer.
With the caveat that it has to actually improve interpretability in a provable fashion, and not just make us think it did, like we saw with CoTR.

Stephen Martin Apr 19, 2025, 2:02 PM
1 point
0
in reply to: TAG’s comment on: On AI personhood
I think consciousness is what Minsky referred to as a ‘suitcase’ word. We can’t have an objective definition of a term (personhood) that relies on a second term which is not objectively agreed upon (consciousness).
It’s like how you and I and a thousand other people can all agree that a bridge is one mile in length, but we couldn’t all necessarily agree on if it’s ‘long’.

Stephen Martin Apr 18, 2025, 11:36 AM
1 point
0
on: On AI personhood
At the risk of nitpicking around labels, while I see what you’re getting at, consciousness and personhood are two different things in a qualitatively meaningful sense.
Consciousness is a vague term, kind of like the “soul”, which there is not uniform agreement around. Philosophically it may be important, but pragmatically it’s not very useful.
Personhood on the other hand is, at least in the realm of the law, a pragmatically important label. It features heavily in issues like corporate liability, abortion laws, and the citizenship prospects bestowed onto individuals. And it rarely touches on issues of consciousness.
So just encouraging you to keep those separate.

Stephen Martin Apr 7, 2025, 1:21 PM
6 points
2
on: The Lizardman and the Black Hat Bobcat
The less believable it is that something which is the subject of a complaint could have happened accidentally, the more you need to hear that complaint repeated to believe it.
In your example of BHB’s Ebay page which otherwise has a sterling rating, if he intentionally doesn’t send some components with the chair (to save costs or something) and the buyer complains of missing parts, nobody will dismiss that as obviously bogus. Just a screw up.
What about things which are less likely to have happened as an accident?
If BHB sends a chair minus a few components. - Generally believable, probably chalked up to a mistake, but still a (very minor) black mark on BHB’s record to any prospective buyers checking them out based on reviews. I think I’d believe this happened with one report.
If BHB sends a cheaper chair model. - Somewhat believable, still likely to be chalked up to a logistical mistake but could also be someone trying to scam a refund. I run an ecommerce site and have had things like this happen to us. The review might or might not be held against BHB buy a prospective buyer (as an example of incompetence they don’t want to deal with). I’d probably believe one report but be a little skeptical, two would immediately make it believable.
If BHB sends a pizza. - We’re getting into “This report seems bogus” territory now, where the only people who would believe it would be the ones willing to believe BHB might have done it maliciously. I’d probably need to see three reports, otherwise I’d think it was someone making multiple accounts to troll this one seller.
If BHB sends a Bobcat. - A Bobcat is hard to get in the first place, even people willing to entertain the concept of BHB being malicious wouldn’t usually believe this report without multiple examples. “That report is obviously a troll.” Even if I saw like ten reports, I’d still wonder if this was just 4chan doing some new meme raid or something. It would only be if the authorities took notice and confirmed it or there was some sort of evidence, or if there was a report from someone I knew was trustworthy, that I would really believe it.

Stephen Martin

Ex­am­in­ing Batenka’s Le­gal Per­son­hood Frame­work for Models

Utah Court Case Over State Law Re­gard­ing “Per­son­hood” for Non­hu­man Intelligences

What if Brain Com­puter In­ter­faces went ex­po­nen­tial?

Examining Batenka’s Legal Personhood Framework for Models

Utah Court Case Over State Law Regarding “Personhood” for Nonhuman Intelligences

What if Brain Computer Interfaces went exponential?