I was also thinking the same thing as you, but after reading paulfchristiano’s reply, I now think it means you can use the model to generate probabilities for next tokens, and that those next tokens are correct about as often as those probabilities indicate. That is, it’s not referring to the main way of interfacing with GPT-n (wherein a temperature schedule determines how often it picks something other than the option with the highest assigned probability), nor to asking the model “in words” for its predicted probabilities.
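To make that concrete, here is a minimal sketch (the `(probability, was_correct)` pairs are invented data, and the bucketing scheme is just one common way to measure this) of what it means for those probabilities to be calibrated: among predictions made with confidence p, about a fraction p should turn out correct.

```python
# "Calibrated" means: when the model assigns probability ~p to a token,
# that token is correct ~p of the time. The sample data below is made up.

def calibration_by_bucket(preds, n_buckets=10):
    """Group (probability, correct) pairs by confidence bucket and compare
    mean confidence to empirical accuracy within each bucket."""
    buckets = [[] for _ in range(n_buckets)]
    for p, correct in preds:
        idx = min(int(p * n_buckets), n_buckets - 1)
        buckets[idx].append((p, correct))
    report = []
    for b in buckets:
        if b:
            mean_conf = sum(p for p, _ in b) / len(b)
            accuracy = sum(c for _, c in b) / len(b)
            report.append((round(mean_conf, 2), round(accuracy, 2)))
    return report

# A well-calibrated model: tokens predicted at ~0.8 confidence are right
# ~80% of the time, tokens predicted at ~0.3 are right ~30% of the time.
sample = ([(0.8, True)] * 8 + [(0.8, False)] * 2 +
          [(0.3, True)] * 3 + [(0.3, False)] * 7)
print(calibration_by_bucket(sample))  # [(0.3, 0.3), (0.8, 0.8)]
```

A miscalibrated (e.g. overconfident) model would show accuracy well below mean confidence in the high buckets.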
GPT-4 can also be confidently wrong in its predictions, not taking care to double-check work when it’s likely to make a mistake. Interestingly, the base pre-trained model is highly calibrated (its predicted confidence in an answer generally matches the probability of being correct). However, through our current post-training process, the calibration is reduced.
What??? This is so weird and concerning.
I graduated college in four years with two bachelors and a masters. Some additions:
You don’t need to take the AP course to take the test; this is NOT a requirement. If your high school doesn’t offer the test, though, you may need to take it at another school. Another unfortunate detail: if things still work the way they did when I did this, your school probably gets test fees waived only for students who took the course, so you may need to pay for the test yourself. https://apstudents.collegeboard.org/faqs/can-i-register-ap-exam-if-my-school-doesnt-offer-ap-courses-or-administer-ap-exams
The college I went to offered “Proficiency Tests” for many courses (mostly targeted at freshmen), which were effectively final exams you could take; if you passed with a high enough grade, you got credit for the course. If you are good at studying on your own, this will probably be significantly less work than taking the course, and it is especially effective for courses you are not interested in.
Taking More Classes:
I literally planned my entire course load for all four years well before I got on campus (with built-in flexibility for when courses were full, or to leave a couple of wildcard slots for fun). This matters because, if you’re planning something like what I was doing, you need to avoid stacking all your hard classes in the same semester and burning out.
The big accusation, I think, is of sub-maximal procreation. If we cared at all about the genetic proliferation that natural selection wanted for us, then this time of riches would be a time of fifty-child families, not one of coddled dogs and state-of-the-art sitting rooms.
Natural selection, in its broadest, truest, (most idiolectic?) sense, doesn’t care about genes.
So what did natural selection want for us? What were we selected for? Existence.
I think there might be a meaningful way to salvage the colloquial concept of “humans have overthrown natural selection.”
Let [natural selection] refer to the concept of trying to maximize genetic fitness, specifically maximizing the spread of genes. Let [evolution] refer to the concept of trying to maximize ‘existence’ or persistence. There’s a hierarchy of optimizers, [evolution] > [natural selection] > humanity, where you could claim that humanity has “overthrown our boss and taken their position,” such that humanity now reports directly to [evolution] instead of having [natural selection] as our middle-manager boss. One could argue that ideas in brains are now the preferred substrate over DNA, as an example of this model.
This description also makes the warning with respect to AI a little more clear: any box or “boss” is at risk of being overthrown.
(This critique contains not only my own critiques, but also critiques I would expect others on this site to have)
First, I don’t think you’ve added anything new to the conversation. Second, I don’t think what you have written even provides a useful summary of the current state of the conversation: it is neither comprehensive nor the strongest version of the various arguments already made. Also, I would prefer to see less of this sort of content on LessWrong. Part of that might be because it is written for a general audience, and LessWrong is not much like the general audience.
This is an example of something that seems to push the conversation forward slightly, by collecting all the evidence for a particular argument and by reframing the problem as different, specific, answerable questions. While I don’t think this actually “solves” the hard problem of consciousness, as Halberstadt notes in the comments, I think it could help clear up some confusions for you. Namely, I think it is most meaningful to start from a vaguely panpsychist model in which “everything is conscious” and what we mean by consciousness is “the feeling of what it is like to be,” and then move on to talk about what sorts of consciousness we care about: namely, consciousness that looks remotely similar to ours. In this framework, AI is already conscious, but I don’t think there’s any reason to care about that.
Consciousness is not, contrary to the popular imagination, the same thing as intelligence.
I don’t think that’s a popular opinion here. And while I think some people might just have a cluster of “brain/thinky” words in their head when they don’t think about the meaning of things closely, I don’t think this is a popular opinion of people in general unless they’re really not thinking about it.
But there’s nothing that it’s like to be a rock
But that could be very bad, because it would mean we wouldn’t be able to tell whether or not the system deserves any kind of moral concern.
Assuming we make an AI conscious, and that consciousness is actually something like what we mean by it more colloquially (human-like, not just panpsychistly), it isn’t clear that this makes it a moral concern.
There should be significantly more research on the nature of consciousness.
I think there shouldn’t. At least not yet. The average intelligent person thrown at this problem produces effectively nothing useful, in my opinion. Meanwhile, I feel like there is a lot of lower hanging fruit in neuroscience that would also help solve this problem more easily later in addition to actually being useful now.
In my opinion, you choose to push for more research when you have questions you want answered. I do not consider humanity to have actually phrased the hard problem of consciousness as a question, nor do I think we currently have the tools to notice an answer if we were given one. I think there is potentially useful philosophy to do around, but not on, the hard problem of consciousness: actually asking a question, and learning how we could recognize an answer.
Researchers should not create conscious AI systems until we fully understand what giving those systems rights would mean for us.
They cannot choose not to because they don’t know what it is, so this is unactionable and useless advice.
AI companies should wait to proliferate AI systems that have a substantial chance of being conscious until they have more information about whether they are or not.
Same thing as above. Also, the prevailing view here is that it is much more important that AI might kill us, and if we’re theoretically spending (social) capital to make these people care about things, then not killing us is astronomically more important.
AI researchers should continue to build connections with philosophers and cognitive scientists to better understand the nature of consciousness
I don’t think you’ve made strong enough arguments to support this claim given the opportunity costs. I don’t have an opinion on whether or not you are right here.
Philosophers and cognitive scientists who study consciousness should make more of their work accessible to the public
Same thing as above.
Nitpick: there’s something weird going on with your formatting because some of your recommendations show up on the table of contents and I don’t think that’s intended.
I haven’t quite developed an opinion on the viability of this strategy yet, but I would like to note my appreciation that you produced a plausible-sounding scheme that I, a software engineer rather than a mathematician, feel like I could actually contribute to. I would like to request that people come up with MORE proposals along this dimension, and/or that readers of this comment point me to other such plausible proposals. I think I’ve seen some people consider potential ways for non-technical people to help, but I feel like I’ve seen disproportionately few ways for technically competent but not theoretically/mathematically minded people to help.
I mentioned exploration as not-positive-sum because, if I discover something first, our current culture doesn’t assign much value to the second person finding it. Avoiding death literally requires free energy, a limited resource, though I realize that’s an oversimplification at the scale we’re talking about.
I see. I feel like honor/idealism/order/control/independence don’t cleanly decompose to these four even with a layer of abstraction, but your list was more plausible than I was expecting.
That said, I think an arbitrary inter-person interaction with respect to these desires is pretty much guaranteed to be zero or negative sum, as they all depend on limited resources. So I’m not sure what aligning on the values would mean in terms of helping cooperation.
I don’t think most people are consciously aware of this, but I think most people are unconsciously aware that “it is merely their priorities that are different, rather than their fundamental desires and values.” Furthermore, our society largely looks structured such that only the priorities differ, but the priorities differ significantly enough because of the human-sparseness of value-space.
I am skeptical of psychology research in general, but my cursory exploration has suggested to me that it is potentially reasonable to think there are 16. My best estimate is that there are literally 100 or more, but that most of those dimensions don’t have much variance or recognizable gradations, or are lost in noise. I think humans are reasonably good at detecting 1 part in 20, and that the 16 estimate above is a reasonable ballpark, meaning I believe that 20^16 ≈ 6.5E20 is a good approximation of the number of states in the discretized value space. With fewer than 1E10 humans, this would predict very few exact collisions.
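A quick birthday-problem sanity check of that arithmetic (the figures 20, 16, and 1E10 come from the paragraph above; the uniform-distribution assumption is mine, and clustering of real human values would only increase collisions):

```python
# Birthday-problem estimate: with N equally likely states and n people,
# the expected number of colliding pairs is about n*(n-1) / (2*N).
# Real human values are surely not uniform over the states, so treat
# this as a lower-bound sketch on collisions.
n_states = 20 ** 16      # 16 dimensions, 20 distinguishable levels each
n_people = 10 ** 10      # generous upper bound on the human population
expected_colliding_pairs = n_people * (n_people - 1) / (2 * n_states)
print(f"{n_states:.3e} states")                       # ~6.554e+20, matching 6.5E20
print(f"{expected_colliding_pairs:.3f} expected colliding pairs")
```

Under these assumptions the expected number of exact collisions among all living humans is well under one, which supports the “very few exact collisions” claim.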
I would be really dubious of any models that suggest there are less than 5. Do you have any candidates for systems of 3 or 4 fundamental desires?
I’m not sure why adjacency has to be “proper”; I’m just talking about social networks, where people can be part of multiple groups and transmit ideas and opinions between them.
I approximately mean something as follows:
Take the vector-value model I described previously. Consider some distance metric (such as the L2 norm), D(a, b) where a and b are humans/points in value-space (or mind-space, where a mind can “reject” an idea by having it be insufficiently compatible). Let k be some threshold for communicability of a particular idea. Assume once an idea is communicated, it is communicated in full-fidelity (you can replace this with a probabilistic or imperfect communication model, but it’s not necessary to illustrate my point). If you create the graph amongst all humans in value-space, where an edge exists between a and b iff D(a,b) < k, it’s not clear to me that this graph is connected, or even has many edges at all. If this is true for a particular idea/k pair, then the idea is unlikely to undergo information cascade, because additional effort is needed in many locations to cross the inferential gap.
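A toy version of this model (the points, thresholds, and two-cluster layout are all invented for illustration; D is the L2 norm, as suggested above):

```python
import math
from itertools import combinations

def l2(a, b):
    """L2 distance between two points in value-space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def communicability_graph(points, k):
    """Adjacency sets: an edge between minds a and b iff D(a, b) < k."""
    edges = {i: set() for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if l2(points[i], points[j]) < k:
            edges[i].add(j)
            edges[j].add(i)
    return edges

def is_connected(edges):
    """BFS from node 0; an idea can cascade to everyone iff the graph is connected."""
    if not edges:
        return True
    seen, frontier = {0}, [0]
    while frontier:
        node = frontier.pop()
        for nxt in edges[node] - seen:
            seen.add(nxt)
            frontier.append(nxt)
    return len(seen) == len(edges)

# Two tight clusters in a 2-D value-space with a gap between them:
minds = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
print(is_connected(communicability_graph(minds, k=1.0)))   # False: the idea stalls at the gap
print(is_connected(communicability_graph(minds, k=10.0)))  # True: high enough k links everyone
```

The point of the sketch: for a given idea, whether a cascade is possible depends on k relative to the gaps in value-space, not just on how many people individually like the idea.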
As you say, the ability to coordinate large-scale action by decree requires a high place in a hierarchy. With the internet, though, it doesn’t take authority just to spread an idea, as long as it’s one that people find valuable or otherwise really like.
Somewhat related, somewhat tangential, I think the internet itself is organized hierarchically as nested “echo-chambers” or something similar where the smallest echo chambers are what we currently call echo-chambers. This means you can translate any idea/concept as existing somewhere on the hierarchy of internet communities, and only ideas high on the hierarchy can effectively spread messages/information cascades widely.
Is there anywhere you can concretely point to in my model(s) you would disagree with?
if there’s anyone in this community who recognizes the potential of facilitating communication.
I agree this is (potentially) high leverage. My strategy has generally been that expressing ideas with greater precision aids communication more. An arbitrary conversation is unlikely to transmit the full precision of your idea, but it becomes less likely that you transmit something you don’t mean, and that makes a huge difference. The domain of politics seems mostly littered with extremely low-precision communication and, in particular, often deceptively precise communication, wherein wording is chosen between two concepts so that any error correction on behalf of a listener favors the communicator. Is there any reason why you want to specifically target politics instead of generally trying to make the human race more sane, such as what Yudkowsky did with the Sequences?
I was thinking of issues like the economy, healthcare, education, and the environment.
I disagree, and will call any national or global political issue high-hanging fruit. I believe there is low-hanging fruit at the local level, but coordination problems involving a million or more people are hard.
They can influence the people ideologically adjacent to them, who can influence the people adjacent to them, et cetera.
In my experience, it’s not clear that there is really much “proper adjacency.” Sufficiently high dimensional spaces make any sort of clustering ambiguous and messy if even possible. Even more specifically, I haven’t seen much of any ideas in politics that spread quickly that wasn’t also coordinated from (near) the top, suggesting to me that information cascades in this domain are impractical.
I think that largely that’s what is even meant by hierarchical structures. Small/low elements have potentially rich, complicated inner lives, but have very little signal they can send upwards/outwards. High/large structures have potentially bureaucratically or legally constrained action space, but their actions have wide and potentially large influences.
So far as I can tell, the tools I’ve accumulated for this endeavor appear to be helping the people around me a great deal.
Great. Keep on doing it, then.
It starts with expressing as simply as possible what matters most. It turns out there is a finite number of concepts that describe what people care about.
Say there are 100 fundamental desires, and all desires stem from these 100 fundamental desires. Each can still take on any number from −1 to 1, allowing a person to care about each of these things in different proportions. Even if we restrict the values to 0 to 1, you still get conflict because what is most important to one person is not what’s most important to another, causing real value divergences.
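As a tiny illustration (the desire names and weights are entirely invented), two people can share the same fundamental desires yet optimize for different things, because their top priorities differ:

```python
# Same fundamental desires, different weights in [0, 1]; the names and
# numbers here are invented purely for illustration.
alice = {"security": 0.9, "novelty": 0.2, "status": 0.5}
bob   = {"security": 0.3, "novelty": 0.8, "status": 0.5}

def top_priority(values):
    """The desire a person most optimizes for: the max-weight entry."""
    return max(values, key=values.get)

print(top_priority(alice), top_priority(bob))  # different targets, so real divergence
```

When the resources serving “security” and “novelty” trade off against each other, this weight difference alone is enough to produce conflict, with no disagreement about what the fundamental desires are.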
Is there another approach to making the world a better place without changing how humans think, that I’m unaware of?
I can think of some that you didn’t explicitly mention.
You can make the world just a slightly better place by normal means, trying to be kind, etc.
You can have kids, and teach them a better way to think while they’re still especially pliable, rather than trying to teach old dogs new tricks.
Maximize your inclusive genetic fitness, live a long life, and make sure your ideas are good enough that your kids will teach their kids, eventually outliving and out-competing inferior ideas.
You can change how humans think, but do it outside the domain of politics.
For what it’s worth, I also largely agree with what you said and with your original post. At the point where the Wanderer contributed, I guessed both how the story would end and the worse compromise the Wanderer mentioned. I especially agree with your target. It’s not clear to me that I agree with your methods, having spent a fair deal of time on this sort of problem myself; that said, it’s extremely likely that you have real skill advantages in this domain over me. Still, I think any premise that begins with “the economy, healthcare, education, and the environment are low-hanging fruit in politics” is one where you get burned and waste time.
I don’t think there are many potential negative consequences in trying. My response wasn’t a joke so much as taking issue with
It is apparent to me that making human politics more constructive is a low-hanging fruit
I think it really, really is not low hanging fruit. The rights and personhood line seems quite a reasonable course of discussion to go down, but you’re frequently talking to people who don’t want to apply reason, at least not at the level of conversation.
Religion is a “reasonable choice” in that you buy a package and it’s pretty solid and defended by a conglomerate with the intent that you defend and get some defense back. I don’t think you’re going to get far without dismantling institutions such as religions, and I don’t think your process is sufficient to dismantle those institutions.
Many people have effectively made the decision “you are not in my tribe, so I will not engage with you in a productive way, because I need to assume you are deceiving me.” Amongst any parties that aren’t pre-opposed to one another, I think looking for win-wins is the default, sane thing that basically everyone does all the time. The problem is that all coordination problems are downstream of effective communication, and there are many people with whom you will not communicate effectively.
The real potential negative consequence, and a likely one, is that you waste your time. Frankly, I don’t think you’ll be the one to solve this, because I don’t think there are win-wins on this subject, or on a good number of other subjects in politics.
Great, now solve pro-choice vs pro-life.
I think most people would agree that at some point there are likely to be diminishing returns. My view, and I think the prevailing view on LessWrong, is that the biological constraints you mentioned are actually huge constraints that silicon-based intelligence won’t/doesn’t have, and that the lack of these constraints will push the point of diminishing returns far past human level.
You can find it here. https://www.glowfic.com/replies/1824457#reply-1824457
I would describe it as extremely minimal spoilers as long as you read only the particular post and not preceding or later ones. The majority of the spoilerability is knowing that the content of the story is even partially related, which you would already learn by reading this post. The remainder of the spoilers is some minor characterization.
Great relevant wikipedia page
At the same time, it’s basically the only filtering criterion provided besides “software developer job.” Having worked a few different SWE jobs, I know that some company cultures people love are cultures I hate, and vice versa. I would point someone in completely different directions based off their response. Not because I think it’s likely they have communicated their multidimensional culture preferences perfectly, but because the search space is so huge that it’s good to at least have an estimator for ordering what to look into.
I don’t have strong preferences about what the company does. I mostly care about working with a team that has a good culture.
This is pretty subjective, and I would find it helpful to know what sort of culture you’re looking for.
Thanks for the great link. Fine-tuning leading to mode collapse wasn’t the core issue underlying my main concern/confusion (intuitively, that makes sense). paulfchristiano’s reply leaves me mostly unconfused now, especially with the additional clarification from you. That said, I am still concerned; this makes RLHF seem very ‘flimsy’ to me.