I am Andrew Hyer, currently living in New Jersey and working in New York (in the finance industry).
aphyer
“This problem is real” is not itself a useful insight. It is useful to the extent that it might lead to the problem being fixed. And fixing problems is much, much, much harder than identifying them.
Perhaps you have noticed that Bay Area rents are really high. This is, indeed, a problem. But noticing that this is a problem is not a serious contribution to the conversation. If you have successfully identified a real problem, and then proposed a solution that will make it worse, you are not helping on net.
This goes double if someone points out that your solution will make it worse, and you say that you “aren’t even primarily interested in fault analysis at [that] level of granularity” and that them pointing that out will “distract you from the problem being real.”
A: “This is a bad problem! We should solve this problem by giving Group X more power to fix it!”
B: “Actually, it sure looks like this problem is plausibly caused by Group X, and certainly they’re exercising all the power they currently have to make it worse. I’m not sure what you hope to accomplish by giving them more power.”
A: “Bad problems don’t stop being bad just because someone is bad at fault analysis!”
B: “No, they don’t, but listening to your solutions to those bad problems can stop being a good idea because you’re bad at fault analysis.”
I think it’s fairly clearly a small joke.
Slowing down AI progress is not itself valuable.
Slowing down AI progress is valuable if and only if you can slow down AI progress without slowing down AI safety progress.
I have faith in the ability of the US government to put lots of red tape and roadblocks in the way of AI development, require that huge armies of lawyers and compliance officers be deployed by any entity trying to develop AI, etc.
I have no faith whatsoever that this process will not also slow down AI safety progress, and some strong suspicion that it might disproportionately slow down AI safety progress.
Overall I think that, when I look at the current conduct of Anthropic, and the current conduct of the US government, I do not find myself saying ‘oh, if only we had given the US government more control over AI companies!’
The last of these seems very unlike the others? We have a list of potential alignment failures, bypassing safeguards, etc...and then one time when the model carelessly deleted the wrong thing. Is the researcher still really mad at Claude about that or something?
I have a complaint. My perfectly-reasonable request to the LLM was denied for reasons of censorship. Communist censorship! This is literally 1984!
#1 seems to be confusingly worded. Is the banking industry in the US ‘quasi-nationalized’? The pharma industry? Construction? Restaurants? All of these do ‘operate under the protection and audit of government agencies.’ What level of government involvement are you envisioning?
You can have ~~until I figure out how to play the Necrobinder without dying~~ until Sunday, after that I’ll post the solution anyway to not keep other players waiting too long.
No worries, granted. I’ll aim to post the wrapup doc on the 23rd.
I personally strong-downvoted the parent comment. When I remove my strong-downvote from it, I discover that the combined rest of LessWrong voted that comment to a total of +18 karma and +2 agreement as of me writing this.
I do in fact consider it somewhat alarming that the combined rest of LessWrong on net agrees with this comment and wants to reward the author for posting it.
There’s a saying usually applied to diet and exercise that also seems relevant here—that the best diet/exercise program by far is the one you will actually do.
...it seems very hard to keep whoever is involved in organizing/planning the operation from knowing about it 24 hours in advance, and very unlikely that every such person is wealthy enough to ignore a quarter million dollars.
Two of the four dish combinations with Quality=20 had >16 Quality in expectation.
This is true, but is actually something I looked into when making the scenario. The average score of ‘pick a random 20-quality feast from the dataset’ was 15.38, which players did successfully beat.
There was a related writing-constraint that came from me pushing to simplify the ruleset a bit. The original envisioned ruleset was going to give players an additional rule limiting which dishes they were allowed to include.[1]
This would have let me give a substantially larger dataset without worrying that grabbing the best-scoring thing out of the dataset would trivially solve the scenario: if 6-7 dishes were banned, it would be easy for none of the top-scoring feasts to be allowed for you, and/or for the best-scoring feast you were allowed to be a single bit of random luck that would betray you if you repeated it.
When I removed that rule, I needed to cut down on the dataset size to avoid that being a trivial solution, which is what led to the data-starvation. Overall I think something like that wouldn’t have been worth the complexity—just telling players they can include whatever dishes they want is simpler and also feels more realistic in context. Open to other views on that, though.
[1] I had a whole bunch of excuses lined up for this too! One of your companions gets seasick and doesn’t want to go hunt Kraken...another is Good-aligned and will be angry if you kill a Pegasus...
Huh. I believe I tagged them both the same way, but I don’t get tag notifications on my own posts. Can someone who isn’t me comment on whether they got the notification?
I’m very proud of this scenario. (Even if you’re confident you aren’t going to play it, I think you could read the wrapup doc and in particular the section on ‘Bonus Objective’ so you can see what it involved).
It accomplished a few things I think are generally good in these scenarios:
There was underlying structure that players could uncover, which created emergent complexity in the output but made sense with the theme once the underlying ruleset was revealed/discovered.
Human thought about e.g. the theme and what patterns would be reasonable to observe was valuable; the puzzle was not optimally solved just by feeding the data into a model and calling it a day.
Multiple levels of solution were possible, from a decent solution with little effort up to a more-involved solution that went further and dug into the underlying structure.
And also it managed to trick many players with a surprising-yet-thematic twist :P
...oops. It turns out that efficiently solving the Knapsack Problem is hard.
On looking into it, it seems my confusion with Python variables means that, while your soldiers will always find a valid solution if it’s reachable by a few straightforward rules, they will sometimes fail to find the solution if it wouldn’t be reachable like that.[1]
This doesn’t affect the performance of any submitted team (since the teams were evaluated using the same code that the dataset was generated with), but it does mean that the underlying ruleset was messier and less derivable than I’d hoped...sorry :(
[1] In detail: when your soldiers can’t find a solution using simple rules, they were supposed to list each possible target for their biggest shot, and make separate branches for each of those. However, a bug means that the code execution for the first branch removes shots from their available list for the later branches, and so all the later branches are doomed. Therefore they will win only if the first branch works.
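For anyone curious what this class of bug looks like, here is a toy sketch. It is not the scenario's actual code: the function `can_clear`, the `copy_shots` flag, and the rule that a shot destroys a target only when shot >= target are all assumptions made up for illustration. The point is just that reusing one mutable list across branches lets the first branch eat the shots the later branches needed.

```python
def can_clear(shots, targets, copy_shots):
    """Return True if every target can be destroyed.

    Toy rules, assumed for illustration only: each shot is used once, the
    biggest remaining shot is always fired next, and it destroys a target
    only if shot >= target.
    """
    if not targets:
        return True
    if not shots:
        return False
    for i, target in enumerate(targets):
        # One branch per possible target for the biggest shot.
        # Without the copy, every branch mutates the same list (the bug).
        branch_shots = list(shots) if copy_shots else shots
        biggest = max(branch_shots)
        if biggest < target:
            continue  # the biggest available shot cannot destroy this target
        branch_shots.remove(biggest)
        remaining = targets[:i] + targets[i + 1:]
        if can_clear(branch_shots, remaining, copy_shots):
            return True
    return False


# The first branch (biggest shot at the small target) fails; the second works.
buggy = can_clear([5, 2], (2, 5), copy_shots=False)
fixed = can_clear([5, 2], (2, 5), copy_shots=True)
```

With the shared list, the first branch removes the 5 before failing, so the second branch only has the 2 left and is doomed. With a fresh copy per branch, the second branch still has both shots and succeeds.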
I have mixed feelings about this scenario.
I was proud of the underlying mechanics, which I think managed to get interesting and at-least-a-little-realistic effects to emerge from some simple underlying rules.
The theme...at least managed to make me giggle to myself a little as I was writing it.
When players submitted answers to this, though, several people got tricked into getting themselves killed. Out of five answers, two players took extremely safe approaches. Of the three players who were more daring, one submitted an excellent answer while two managed to trick themselves into submitting answers that were worse than random.
From a certain point of view, this is a valuable learning experience, which could teach people not to take drastic risks on limited data.
But I feel like other scenarios in this genre may have taught that lesson better without shooting quite so many players in the foot.
Another downside of this strategy is that a political faction not currently perceived by voters as ‘in power’ has an incentive to use any power they do have to actively worsen the lives of voters, who will blame their opposition.
Your boolean disagreement is relevant because it’s actionable. Suppose that:
Alice thinks calling is +$100 EV
Bob thinks calling is +$10 EV
Claire thinks folding is +$10 EV
David thinks folding is +$100 EV
In this case, Bob is much closer to Claire than to Alice in terms of their beliefs. But Bob agrees with Alice about the correct action, which is often the thing where disagreement actually matters.
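A quick sketch of the same point in code. The encoding is an assumption of mine: I put all four estimates on one scale as "EV of calling", so "folding is +$10 EV" becomes calling at -$10. The helper names (`action`, `belief_gap`, `same_action`) are made up for illustration.

```python
# EV of calling, in dollars. Folding being +$X EV is encoded as calling
# being -$X (an assumption, so all four beliefs sit on one scale).
call_ev = {"Alice": 100, "Bob": 10, "Claire": -10, "David": -100}


def action(ev):
    """The action a player takes given their estimated EV of calling."""
    return "call" if ev > 0 else "fold"


def belief_gap(a, b):
    """How far apart two players' EV estimates are, in dollars."""
    return abs(call_ev[a] - call_ev[b])


def same_action(a, b):
    """Whether two players would actually do the same thing."""
    return action(call_ev[a]) == action(call_ev[b])


# Bob vs. Claire: beliefs $20 apart, but they act differently.
# Bob vs. Alice: beliefs $90 apart, but they act the same.
bob_claire = (belief_gap("Bob", "Claire"), same_action("Bob", "Claire"))
bob_alice = (belief_gap("Bob", "Alice"), same_action("Bob", "Alice"))
```

The continuous distance and the boolean agreement are measuring different things, and the boolean one is what decides whether chips go in the pot.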
(Politics-related examples are left as exercises for the reader).
Not all of them, but at least one. Let’s take the quarantine one first because that’s the one where I think you do well, and then look at the headlines one next. (The AI one doesn’t have enough detail for me to be clear on what’s going on there).
--------------------------------------------------------
I think you’re pretty much on the money for the quarantine one, because:
You are pointing to a relatively-large problem that many people are genuinely unaware of. (Before reading your post I had...uh...some vague idea that some people on a cruise ship had some virus).
You aren’t proposing any working solution—but you’re clear that you don’t have a proposed solution, and also aren’t proposing anything actively harmful.
I do think you may be missing some important points in your quarantine commentary. Like, yes. There are some potential major beneficial effects of these potential carriers being quarantined. By. Uh. Some international organization that can unilaterally detain citizens of multiple different sovereign nations overseas without trial and prevent them from returning home. That seems to me...potentially fraught.
But (as you say), you aren’t proposing solutions at that level of granularity, and you’re being up-front about that. You’re pointing to a situation that looks bad and saying “is there anything we can do about this?” Maybe, on reflection, the answer to this will end up being “it’s going to be a very big diplomatic effort to get even small steps on this problem, and it doesn’t seem likely to succeed.” But that’s not your fault, and everything you have said on this is reasonable.
--------------------------------------------------------
On the other hand, the headlines point I think is a much less encouraging one for your view.
Lots of headlines are misleading. What can we do about this? Should we go harass journalists about it? Or perhaps we should go harass editors about it? Those are both proposed solutions at a fair level of granularity. You...seem to me to rather be suggesting both of them. And both of them are (at least in my opinion) bad ideas.
Headlines are misleading because these businesses need to attract readers in order to continue to exist, and readers demand exaggerated headlines. Talking about which particular employees of a newspaper you should personally harass in order to improve the accuracy of the headlines is like talking about which particular employees of Walmart you should personally harass in order to make them stop selling Twinkies and start selling healthier food.
Here, I think that:
You are pointing to a relatively-small problem that everyone reading your post is already aware of.
You are proposing a solution that seems to me to be actively bad.
and so I think that your overall net score here does not come out positive.
--------------------------------------------------------
Not quite. I’m saying that the value of your contribution has two elements:
Pointing to a problem. You gain points here by pointing out real, large problems that your readers are not already familiar with.
Suggesting a solution. You gain points here by suggesting reasonable solutions, and lose points here by suggesting bad solutions.
If Alice points out a substantial new problem, without particularly proposing a detailed solution beyond ‘can we do something about this’, and people are responding by nitpicking issues like that, I agree that’s bad.
But I think it’s a vastly more common situation for Alice to point out a problem that everyone is already aware of and propose an actively bad solution. And I think that the correct response to that looks almost identical to the nitpicking that annoys you.