Sure you can, but people tend to dislike arbitrary and capricious punishments. If you want a roll of the dice to hold up as valid legal reasoning in a sentencing hearing, you’ll have to reform the legal system by quite a bit first.
ulyssessword
Getting sent to jail for 90% as long is still getting sent to jail. There are practical minimums for it (one day, which is honestly a lot shorter than I thought) just like there are practical minimums for pulling someone over. Something that’s 1⁄24 as serious as a one-day offense won’t get you an incarceration record (for one hour), and can be conveniently swept under the rug when dealing with bureaucracy. Same with getting your license revoked: even if it’s for a single minute (as opposed to a year or whatever), you still have to apply to get it reinstated and all that.
I think there are many hidden thresholds when various punishments (or rewards, or simple reactions) become relevant for things. Do you want baggage with “kind of heavy load” marked on it, that requires 1.03x the OSHA-standard amount of strength to handle safely? A fuzzy line between “general wellness” and “diagnosable medical condition” for weight?
Continuous functions only work where the underlying reality is continuous.
Using speeding as an example, going 9 over the limit is de-facto legal. Cops can’t pull you 90% over, so there’s a step-change at 11 over where they start bothering to do it and you’re suddenly de-facto illegal and will face a moderate fine. Similarly, you can’t get sent 90% to jail or have your license 90% revoked (for a single offense), so there are another couple step changes in the punishment.
Same with kind-of writing up an accommodation plan for a sort-of disabled employee, barely retaking a class that you barely failed (or graduating with nearly-honors), or almost serving water that’s almost safe.
I underestimated what METR’s results meant.
Specifically, they don’t let you learn what the model can do and choose to do more of that. If you were deploying AI agents in a practical setting, you could simply choose not to have GPT 5.2 try to find the tallest building (3 minute task, 25% success rate), and instead do audio classification of macaques (6.0 hours, 100% success rate).
A 50% success rate could theoretically cash out at perfect reliability on half of the tasks. In practice it isn’t that far off, with tasks in the 0-10% success rate band being almost as common as 10-90% ones.
Unfortunately, there are two parties involved.
The shepherd may have a 1:100 ratio of warning cost:wolf cost, but his neighbors might only be around 1:10. Giving the twentieth false alarm may be worthwhile to the shepherd, but responding to it wouldn’t be worthwhile for his neighbors. And then nobody would respond to the 21st warning, even if there was a wolf.
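To make the mismatch concrete, here is a minimal sketch of that reasoning. It assumes a simple break-even rule (acting is worthwhile when the estimated chance of a wolf exceeds the cost ratio) and assumes listeners estimate that chance with Laplace's rule of succession. Neither assumption comes from the story; both are only there to illustrate the argument, and the function names are made up.

```python
from fractions import Fraction

def estimated_wolf_probability(false_alarms, true_alarms=0):
    # Assumption: Laplace's rule of succession, (successes + 1) / (trials + 2).
    return Fraction(true_alarms + 1, false_alarms + true_alarms + 2)

def worthwhile(p_wolf, cost_ratio):
    # Assumption: act when expected wolf damage exceeds the cost of acting,
    # i.e. when p_wolf > (cost of acting) / (cost of the wolf).
    return p_wolf > cost_ratio

SHEPHERD_RATIO = Fraction(1, 100)   # warning cost : wolf cost
NEIGHBOR_RATIO = Fraction(1, 10)    # response cost : wolf cost

# The twentieth alarm arrives after nineteen false ones.
p = estimated_wolf_probability(false_alarms=19)
shepherd_warns = worthwhile(p, SHEPHERD_RATIO)
neighbors_respond = worthwhile(p, NEIGHBOR_RATIO)

print(p, shepherd_warns, neighbors_respond)
```

Under those assumptions the shepherd still wants to raise the alarm while the neighbors have already stopped coming.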
https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier
They can detect the problem, but not develop a working exploit. They also had a 2⁄3 false positive rate on the patched version of that function.
They say that’s fine because something other than the (frontier) model can do those steps, but don’t demonstrate the capability anywhere I could see.
I googled the source link here: https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier
I’m also concerned about isolating the code. It’s the difference between finding a needle in a haystack, and distinguishing a needle from a single straw. Their set of models returned 12⁄18 false positives (and 18⁄18 true positives), which suggests terrible specificity to me.
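For reference, a quick sketch of what those counts imply. The 18⁄18 and 12⁄18 figures are from the post; the 1-in-1000 prevalence is a number I made up to stand in for the "haystack" case, not something measured.

```python
def rates(true_pos, false_neg, false_pos, true_neg):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

def precision(sensitivity, specificity, prevalence):
    """Fraction of flagged items that are really vulnerable."""
    hits = sensitivity * prevalence
    false_alarms = (1 - specificity) * (1 - prevalence)
    return hits / (hits + false_alarms)

# Counts from the post: 18/18 flagged the vulnerable code,
# 12/18 also flagged the patched code.
sensitivity, specificity = rates(true_pos=18, false_neg=0, false_pos=12, true_neg=6)

# "Needle vs. a single straw": one vulnerable and one patched function.
isolated = precision(sensitivity, specificity, prevalence=0.5)

# "Needle in a haystack": hypothetical 1 vulnerable function per 1000.
HYPOTHETICAL_PREVALENCE = 0.001  # assumption, for illustration only
haystack = precision(sensitivity, specificity, prevalence=HYPOTHETICAL_PREVALENCE)

print(sensitivity, specificity, isolated, haystack)
```

With specificity at one in three, nearly every flag in the haystack case would be a false alarm, which is why the isolation step matters so much.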
I am willing to bet that despite the supererogatory behavior of these volunteers, the company and executives profited disproportionately.
Disproportionate to what?
They probably didn’t get a 4.2x income multiplier (168 hours / 40 hours), or even a 1.276x one ((2000 hours + 4 weeks * 128 extra hours per week + 40 hours extra vacation) / 2000 hours per normal year), so I don’t think it’s disproportionate to the workers.
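Spelling out that arithmetic as a short sketch (the variable names are mine; the numbers are the ones above, treating a normal year as 2000 hours and the stint as 4 weeks of round-the-clock time):

```python
HOURS_IN_WEEK = 168
NORMAL_WEEK = 40
NORMAL_YEAR = 2000        # hours per normal working year
WEEKS_ON_SITE = 4
EXTRA_VACATION = 40       # hours

# Multiplier if every hour on site were paid like a normal working hour.
weekly_multiplier = HOURS_IN_WEEK / NORMAL_WEEK

# Multiplier over a whole year if the extra hours were simply added on.
extra_per_week = HOURS_IN_WEEK - NORMAL_WEEK   # 128 extra hours per week
yearly_multiplier = (
    NORMAL_YEAR + WEEKS_ON_SITE * extra_per_week + EXTRA_VACATION
) / NORMAL_YEAR

print(weekly_multiplier, yearly_multiplier)
```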
Disproportionate to society as a whole? That would take an absurd amount of price gouging given how valuable the material was, and I feel like someone would’ve at least written an exposé (if they weren’t criminally charged).
A camera that can do facial recognition from outside of national borders doesn’t need to be a petapixel one. A mid-gigapixel camera with good optics can cover an entire city at once (or at least it could if it wasn’t for all the buildings in the way).
The main barrier to petapixel cameras is that they don’t serve your goal of full public monitoring (regardless of whether it’s by the government or by everyone individually).
The constant angular field of view is the disagreement. A camera in the mid-gigapixel to low-terapixel range could cover one city by using an appropriate lens at an arbitrary distance (including space).
Any sensor finer than that would either cover substantial amounts of “boring” area (e.g. nature preserves, agricultural areas), or increase the resolution beyond your target.
I don’t think you understand what a “pixel” is.
If you want to see a 1 km x 1 km area at a resolution of 0.1m, then you will need 10 000 x 10 000 = 100 000 000 points on the image, AKA 100 megapixels. This is (ideally) independent of technology. You can walk around and take fifty 2 MP pictures and stitch them together, you can fly a drone a few hundred meters up and take a wide-angle shot, or you can fly a satellite overhead and take a picture from space. The distance doesn’t matter.
From that, it’s a simple extrapolation that it’ll (again, ideally) take 100 MP/km^2 * 778 km^2 = 77800 megapixels = 77.8 gigapixels to surveil the land area of New York City to that level of detail. Again, that could be from an array of low-resolution cameras, a wide-angle camera nearby, or a telephoto camera far away.
(Also, a brief search suggests facial recognition needs about 3mm (0.003m) resolution to identify individuals, not 100 mm (0.1m))
A 1 petapixel camera could cover an area of 3162 km x 3162 km to that resolution, or roughly the entire United States in a single snapshot (ignoring practicalities like the curve of the earth, of course). It could also be used to count someone’s nosehairs if you set it up differently.
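The same arithmetic as a small sketch, in case anyone wants to plug in other areas or resolutions. The function names are made up for illustration, and it assumes the same ideal case as above: flat ground, one pixel per resolution-sized square, no overlap or optical losses.

```python
import math

def pixels_needed(area_km2, resolution_m):
    """Ideal pixel count to cover an area at a given ground resolution."""
    area_m2 = area_km2 * 1_000_000
    return area_m2 / resolution_m ** 2

def square_side_km(pixels, resolution_m):
    """Side length of the square a sensor covers at that resolution."""
    return math.sqrt(pixels) * resolution_m / 1000

one_km2 = pixels_needed(1, 0.1)           # 1 km x 1 km at 0.1 m
nyc = pixels_needed(778, 0.1)             # NYC land area at 0.1 m
nyc_faces = pixels_needed(778, 0.003)     # NYC at the 3 mm figure
petapixel_side = square_side_km(1e15, 0.1)

print(f"{one_km2 / 1e6:.0f} MP, {nyc / 1e9:.1f} GP, "
      f"{nyc_faces / 1e12:.1f} TP, {petapixel_side:.0f} km")
```

Note that distance to the subject never appears as a parameter, which is the whole point: only area and resolution set the pixel count.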
I think a moderately-skilled person could outperform Claude here, but it’s closer than you might think. Have you thought of running this experiment with a human on the other end?
I occasionally give technical support for industrial automation equipment, and I feel for Claude. It’s so much harder than it looks, even when you have voice+video instead of text+pictures.
As one example of how it can go wrong, I said “Check the cables on the enclosure, and make sure they’re all connected properly.” instead of “Check the three cables on the enclosure (Power, ethernet, remote sensor module), and make sure each of them is connected properly.” and it took us 20 minutes to figure out that it couldn’t communicate with the network because the ethernet cable was completely missing.
There are hundreds of videos about the difficulty of giving precise directions, usually played for comedy. For example, here:
Anecdote time!
In the past couple weeks, I:
stopped driving my 2007 vehicle
rented a 2025 Toyota Corolla for a while
bought a base-model 2022 model year vehicle
The newer ones have much better handling, power, and comfort, along with some fancy features. They also have some significant downgrades. For the Corolla:
The touchscreen for the radio (etc) is too bright. It’s also a touchscreen instead of buttons.
The signal lights don’t react properly. There is no physical feedback when you lock in the switch, and it doesn’t cancel the lights immediately when you unclick the stick.
It gives useless warnings. Whenever a car merges in front of you, it displays a “there is a car merging in front of you” graphic where your speed etc. was on the dashboard.
It gives useless warnings part 2. It can usually detect the speed limit with its cameras, and displays it to you. It can’t detect “Resume Speed”, or “Construction Zone Ends” or any similar sign. When it fails like that or simply doesn’t see a faster speed limit sign, it gives you a solid red warning that you might be speeding.
It gives useless warnings part 3. Yes, I will totally check the back seat for passengers. Thank you, car. Yes, I will also obey all traffic laws and drive safely. Thanks for the reminder.
Visibility is worse, and the warning lights (e.g. blindspot detectors on the side mirrors) only partially mitigate that. It may be specific to the model, but I’ve seen a clear trend from 1998 → 2007 → 2022/2025.
Also, fuel consumption is similar over the decades for me, on similar models. All of the gains to efficiency went to more capability instead.
Re #3: The easiest demonstration is to do it. How well do you think it would go over if you said “I know you like cooperative board/card games. I found The Crew for sale at Central Gaming for $19.99, and you might enjoy playing it. Here’s their address and a $20 bill.” vs. just buying it and giving to them?
Kind of against #4: A gift includes permission to have and use it. This goes double when the recipient is a child, spouse, or anyone else the gift-giver has a stake in. “Why do you think those binoculars are the best way to spend $1000?” Not my choice, it was a gift! “Why do you birdwatch during our hikes?” Because I was gifted the binoculars!
If you haven’t seen it in your investigations, then I doubt if raw timestamps would help.
First, follow the link in the Youtube description to the FBI page, and download the .mp4 file.
Next, open it in VLC (download it first, if needed), activate “Interactive Zoom” mode (Tools → Effects and Filters → Video Effects → Geometry → Interactive Zoom), set it to 800% (small slider below the left corner of the picture-in-picture) and focus on the edge of the small rooftop building.
Last, look at 0:09-0:10. There is clearly an object of some sort on his back, as his torso is larger in that direction than to his front (relative to his head). It’s more consistent with a backpack than a rifle, but that’s unsurprising given the camera. Annotated image here, but it’s much clearer in video form.
(Those aren’t the parts I was mentioning in the previous comment. That was as he was walking across the grass, which (on rereading) wasn’t your point. This is as he is still on the roof.)
It’s visible in several frames as he walks away, otherwise it blends in with his legs. It’s also easy to mistake for another leg.
Given the quality of the camera, that item’s shape is “consistent with” a lot of different items, including a rifle. It could’ve been anything from a jacket to a small suitcase, and any features smaller than a couple inches (such as a rifle barrel) would disappear between the pixels.
I wouldn’t expect to see an identifiable rifle in that low-quality footage, so not seeing it isn’t surprising. Album, where I tried to keep the pixelization consistent (18 pixels tall = 4"/pixel in the video, guessing 28" barrel = 7 output pixels per 430 input pixels). My unidentifiable blob is consistent with a rifle because it was made from a rifle. It’s also consistent with a stick.
when you receive quite a few DMs asking you to bring back 4o and many of the messages are clearly written by 4o it starts to get a bit hair raising.
Am I missing something, or is that impossible? How could it be written by 4o after 4o was taken offline (and before it was reinstated)?
At the point of death, presumably, the person whose labour is seized does not exist. I think that’s a good point to consider, since I also estimate that a significant amount of resistance to the idea of no inheritance assumes the dead person’s will is a moral factor after their death.
Yes, I make that assumption. I believe I’m in very good company there, with both the general public and (many, but not all) decision theories/moral systems recognizing agreements that carry on past death. Why would you think otherwise?
I also don’t agree that you’re effectively limiting people’s power of affecting causes they care about to what the government would do with the money, since people have other causes they care about besides their offspring, even if to a lesser degree, and...
I’m not quite sure what this post’s hypothetical is supposed to be, but sure. Let’s say that charitable donations are fully exempt from the tax.
People don’t care about charity to any substantial extent. Donation rates are around 4%, whereas raising a child averages 15%ish per child for nearly half of a parent’s career, never mind the non-financial investments in their wellbeing. It’s not a complete restriction on giving, but it cuts out the most important one in many people’s lives.
Allowing for charitable donations as an alternative to simple taxation does shift the needle a bit but not enough to substantially alter the argument IMO.
… are free to spend their money while alive to advance those.
No, they absolutely are not. Spending your money before your death is heavily constrained by uncertainty. The chance of sudden unexpected death between 20-64 totals about 1.5% (calculated from here), and the anti-loophole protections would catch more. Even outside of the worst-case scenarios, you will always die before a sufficiently-optimistic estimate (and if you aren’t optimistic enough? Have fun living out your last days while completely broke, I guess.)
A relevant point I don’t have an opinion on is whether the offsprings of a person are better stewards of that person’s former wealth than the government.
To be clear, I was talking about the parents being good stewards by managing the wealth for the benefit of future generations (i.e. Bob, and perhaps his kids). I have opinions about how effective the government would be compared to the children, but those differences pale in comparison to tearing everything down to get the last drop of value out before you die and lose it all.
I see extra complexity every time I try to get coding help.
For background, I’m not much of a programmer. The biggest project I’ve made is a ~1000 line script that almost runs in a straight line from start to finish. It’s enough to turn a week of constant work into five minutes of careful checking, or a few minutes of skilled labor into a single button press, but nothing compared to True Software(TM).
Whenever I would look up how to do something (open a file, create a folder, play a sound), I’d find a 20-line monstrosity given as a minimal example on Stack Overflow, or something worse on Copilot (I’ve since graduated to Claude). After picking apart which sections actually matter, I’d add the two important lines to my code and go on to the next problem.
If LLM code was only 10% more complex than necessary, they’d be obviously superhuman.