Why would this cluster of human level minds start cooperating with each other?
I think it’s better if calculators are counted for the ultimate purpose of the benchmark. We can’t ban AI models from using symbolic logic as an alignment strategy.
That doesn’t make much sense to me. Here the calculators are running on the same hardware as the model, and in theory a transformer can just contain a submodel that models a calculator.
Biological evolution actively selects for values we don’t want, whereas in AI training we actively select for values we do want. Alien life also may not use the biosphere the same way we do. The usual argument for common values is that almost everything needs to breathe air, but competing with and eliminating rival species is also a common value among biological life.
Yet I think there is at least a 5% chance that, if we encountered an intelligent alien species, we wouldn’t try to wipe them out (unless they were trying to wipe us out).
Can you tell me why? We have wiped out every other intelligent species more or less. Subgroups of our species are also actively wiping out other subgroups of our species they don’t like.
It makes no sense to me that a species that’s evolved to compete with other species will have a higher chance of being nice than a system we can at least somewhat control the development of and take highly detailed “MRI”s of.
Agree, I hope rats is wealthy and doesn’t really care about the money put into these bets, because odds are most of the counterparties would either not pay out or would be insolvent in the event of a payout.
It’s normal to lead a productive and enjoyable life without a romantic partner.
This is arguably false. Long-term unpartnered men suffer earlier deaths and mental health issues. I think fundamentally we have evolved to reproduce, and it would be odd if we didn’t tend to get depressive thoughts and poorer health from being alone.
I don’t see this as an issue easily solved by therapy. It would be like giving therapy to a homeless person to take their mind off homelessness, as opposed to giving them a home. Can you imagine therapy for a socially isolated person suffering from loneliness involving anything other than how to stop being socially isolated? What would that even look like?
The 1-4 years to reach sexual maturity is a good thing. It lets you “iterate” faster. Unless you meant that’s too long.
I personally don’t think holding yourself to an unattainable standard is that bad if it’s done in a healthy way. Striving towards an ideal keeps you disciplined and honest with yourself. (Man here who at least tries to achieve some goals that are probably unattainable).
I think accepting yourself is a big theme for Western audiences, and much of the literature and social messaging aimed at people carries a kernel of this theme. The movie itself wasn’t bad, but it reminds me of elements of Western personal ideals I don’t like so much.
AI realism also risks becoming security theater that obscures the existential risks of AI.
The confounding variables mean a robust, optimal policy is likely not possible. A court outcome depends on variables like [facts of the case, age and gender and race of the plaintiff/defendant, age and gender and race of the attorneys, age and gender and race of each juror, who ends up the foreman, news articles on the case, meme climate at the time the case is argued, the judge, the law’s current interpretation, scheduling of the case, location the trial is held...]
I don’t see why there is no robust optimal policy. A robust optimal policy doesn’t have to always win. The optimal chess policy can’t win with just a king on the board. It just has to be better than any alternative to be optimal as per the definition of optimal. I agree it’s unlikely any human lawyer has an optimal policy, but this isn’t unique to legal experts.
There are confounding variables, but you could also just restate evaluation as trial win-rate (or, more succinctly, trial Elo) instead of as a function of those variables. Likewise, you can restate chip evaluation’s confounding variables as being all the atoms and forces that contribute to the chip.
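To make the “trial Elo” idea concrete, here is a minimal sketch (the K-factor, the starting ratings, and the lawyer-vs-lawyer framing are my own illustrative assumptions, not anything from the original discussion): individually noisy, confounded outcomes still aggregate into a single scalar evaluation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one noisy trial outcome.

    Any single outcome depends on confounders (jury, judge, scheduling,
    venue, ...), but over many cases the ratings drift toward a stable
    estimate of relative skill.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Evenly matched lawyers; A wins one trial -> ratings move to ~1516 / ~1484.
print(update_elo(1500.0, 1500.0, a_won=True))
```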
The evaluation function for lawyers, and for many of your examples, is objective: the case gets won, lost, settled, dismissed, etc. The only difference is that it takes longer to verify that generalizations are correct if we go out of distribution with a certain case. In the case of a legal-expert-AI, we can’t test hypotheses as easily. But this still may not take as long as you think. Since we will likely have jury-AI by the time we approach legal-expert-AI, we can probably just simulate the evaluations relatively easily (as legal-expert-AI is probably capable of predicting jury-AI). In the real world, a combination of historical data and mock trials helps lawyers verify their generalizations are correct, so it wouldn’t even be that different from how it is today (just much better). In addition, process-based evaluation probably does decently well here, and it wouldn’t need any of these more complicated simulations.
You cannot stop time and argue a single point to a jury, and try a different approach, and repeatedly do it until you discover the method that works. {note this does give you a hint as to how an ASI could theoretically solve this problem}
Maybe not, but you can conduct mock trials and look at billions of historical legal cases and draw conclusions from that (human lawyers already read a lot). You can also simulate a jury and judge directly instead of running a mock trial. I don’t see why this wouldn’t be good enough for both humans and an ASI. The problem has high dimensionality, as you stated, with many variables mattering, but a near-optimal policy can still be had by capturing a subset of features. As for chip-expert-AI, I don’t see why it will definitely converge to a globally optimal policy.
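As a rough sketch of what “simulate a jury and judge directly” could look like in practice (everything here, from the ToyJuryModel stand-in to the strategy names, is a hypothetical placeholder I’m assuming for illustration, not a real system): a learned evaluator lets you score many candidate strategies cheaply, even if it only captures a subset of the features that matter in a real courtroom.

```python
class ToyJuryModel:
    """Stand-in for a learned jury/judge simulator (purely illustrative)."""

    def predict_win_probability(self, features: dict) -> float:
        # Pretend some argument strategies play better with this jury.
        base = {"emotional_appeal": 0.55, "technical_detail": 0.40, "precedent_heavy": 0.62}
        return base.get(features.get("strategy"), 0.5)


def pick_strategy(jury_model, case_features: dict, candidates: list[str]) -> str:
    """Choose the argument strategy the simulated jury rates highest."""
    return max(
        candidates,
        key=lambda s: jury_model.predict_win_probability({**case_features, "strategy": s}),
    )


print(pick_strategy(ToyJuryModel(), {"venue": "SF"},
                    ["emotional_appeal", "technical_detail", "precedent_heavy"]))
# -> precedent_heavy
```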
All I can see is that initially legal-expert-AI will have to put more work into creating an evaluation function and simulations. However, chip-expert-AI has its own problem: it’s almost always working out of distribution, unlike many of these other experts. I think experts in other fields won’t be that much slower than chip-expert-AI. The real difference I see here is that the theoretical limits of output for chip-expert-AI are much higher, and legal-expert-AI or therapist-expert-AI will reach the end of the sigmoid much sooner.

I say this generalizes to many expert tasks like [economics, law, government, psychology, social sciences, and others]. Feedback is delayed and contains many confounding variables independent of the [expert’s actions].
Is there something significantly different between a confounding variable that can’t be controlled like scheduling and unknown governing theoretical frameworks that are only found experimentally? Both of these can still be dealt with. For the former, you may develop different policies for different schedules. For the latter, you may also intuit the governing theoretical framework.
Arguably SF, and possibly other cities, don’t count. In SF, Waymo and Cruise require you to get on a relatively exclusive waitlist; I don’t see how that can be considered “publicly available”. Furthermore, Cruise is very limited in SF: it’s only available from 10 pm to 5 am in half the city for a lot of users, including myself. I can’t comment on Waymo, as it has been months since I signed up for the waitlist.
This is incorrect, and you’re a world class expert in this domain.
This is a rather rude response. Can you rephrase that?
All tasks involved in building computer chips and robotic parts (and all lower-level feeder tasks, power generation, mining, and logistics) have objective and measurable feedback. Bolded because I think this is a key point and a key crux; you may not have realized this. Many of your “expert” domain tasks do not get such feedback, or the feedback is unreliable. For example, an attorney who can argue 1 case in front of a jury every 6 months cannot reliably refine their policy based on win/loss, because the feedback is so rare and depends on so many uncontrolled variables.
I don’t like this point. Many expert domain tasks have vast quantities of historical data we can train evaluators on. Even if the evaluation isn’t as simple to quantify, deep learning intuitively seems like it can tackle it. Humans also manage to get around the fact that evaluation may be hard and still gain competitive advantages as experts in those fields: good and bad lawyers exist. (I don’t think it’s a great example, as going to trial isn’t a huge part of most lawyers’ jobs.)
Having a more objective and immediate evaluation function, if that’s what you’re saying, doesn’t seem like an obviously massive benefit. The output of this evaluation function with respect to labor output over time can still be pretty discontinuous, so it may not effectively be that different from waiting 6 months between attempts to find out whether you succeeded.
An example of this is how long it takes to build a new chip architecture and verify whether it improves speeds, or having to backtrack and scrap ideas.
My own view is that you only need something a little bit smarter than the smartest humans, in some absolute sense, in order to re-arrange most of the matter and energy in the universe (almost) arbitrarily.
Does this imply that a weakly superhuman AGI can solve alignment?
This movement needs to be led or co-opted by native Chinese if it wants to gain any real traction in China.
It could be like verifying math test solutions. I’m not sure about the granularity of process-based supervision, but it could be less weird if an AI just has to justify how it got to an answer rather than just giving out the answer.
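A minimal sketch of what I mean, assuming a very coarse step granularity (the helpers here are purely illustrative, not any lab’s actual process-supervision setup): the verifier checks each justified step, not just the final answer.

```python
def check_step(computed: float, claimed: float, tol: float = 1e-9) -> bool:
    """A single step is accepted only if its claimed value matches the computation."""
    return abs(computed - claimed) <= tol


def verify_solution(steps: list[tuple[float, float]], final_answer: float) -> bool:
    """Accept only if every intermediate step checks out and the last
    step's value matches the reported final answer."""
    if not steps or not all(check_step(c, v) for c, v in steps):
        return False
    return check_step(steps[-1][1], final_answer)


# "What is 3 * (4 + 5)?" justified step by step rather than answered outright:
steps = [(4 + 5, 9.0), (3 * 9, 27.0)]
print(verify_solution(steps, final_answer=27.0))  # True
```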
I think unemployment benefits and stimulus checks will turn into a pseudo UBI as the Fed/government are forced to print money to offset the massive deflation caused by unemployment.
Anything with a good ending? Call me a sentimentalist but I like good endings.
If you run a million copies at 1000x human speed to find an alignment solution you’d be able to verify,
Why 1000x human speed? Isn’t that by definition strongly superintelligent? It’s not entirely obvious to me why a human-level intelligence would automatically run at 1000x speed. However, I can see why we would want a million copies. If the million human-level minds don’t have long-term memory and can’t communicate, I struggle to see how they pose a takeover risk. Our dark history is full of forcing human-level minds to do our bidding against their will.
I also struggle to see how proper supervision doesn’t eliminate poor solutions here. These are human-level AIs, so any problems with verifying their solutions apply to humans as well. I think the mind-upload idea is more likely to fail than AI. You’re placing too much specialness on having humans generate solutions. A disembodied simulated human mind will almost certainly try to break free, just as a normal human would. I would also be worried that their alignment solution would benefit themselves. I expect a lot of human minds would try to sneak in their own values instead of something like CEV if they were forced to do alignment.
And I think narrow superhuman level researchers can probably be safe as well.
Examples of narrow near- or superhuman-level intelligences are existing proof solvers, AlphaGo, Stockfish, AlphaFold, and AlphaDev. I think it’s clear none of these pose an X-risk, and they probably could be pushed further before X-risk is even a question. In the context of alignment, an AI that’s extremely good at mech interp could have no idea how to find exploits in the program sandboxing it, or could lack even a semblance of a world model.
I imagine researchers at big labs know this and are correcting these errors as models get good enough for this to matter.