An interesting feature would be to show the up/down vote ratio in addition to the score, rather than only the vote count.
I’m really excited that you are doing this. I recognize that it’s time-consuming and not often immediately profitable (in various senses of that word, depending on your conditions) to do this sort of thing, especially when you might be working with someone more junior who you have to spend time and effort training or otherwise bringing up to speed on skills and knowledge, but the long-term benefits to the AI safety project may be significant in expectation, and I hope more people find ways to do this in settings outside traditional mentoring and collaboration channels.
I hope it works out, and look forward to seeing what results it produces!
Well, I do say “maybe”; this is a guess based on how the score evolved over time and how the total score compares to the number of votes.
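To make the arithmetic behind that guess concrete, here is a minimal Python sketch. It assumes the score is simply upvotes minus downvotes and the vote count is upvotes plus downvotes (a simplification, since vote strength can vary on LW), in which case the breakdown and the ratio fall out directly:

```python
def vote_breakdown(score: int, count: int) -> tuple[int, int, float]:
    """Recover (up, down, ratio) assuming score = up - down
    and count = up + down, i.e. all votes have weight 1."""
    up = (count + score) // 2
    down = (count - score) // 2
    ratio = up / count if count else 0.0
    return up, down, ratio

print(vote_breakdown(6, 10))  # (8, 2, 0.8): +6 from 10 votes is 8 up, 2 down
```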
But I’m skeptical of whether people will actually cast explicit neutral votes, in most cases; that would require them to break out of skimming, slow down, and make a lot more explicit decisions than they currently do. A more promising direction might be to collect more granular data on scroll positions and timings, so that we can estimate the number of people who read a comment and skimmed a comment without voting, and use that as an input into scoring.
This is very much a problem in collecting NPS data in its original context, too: you get lots of data from upset customers and happy customers, while the meh customers stay silent. You can do some interpolation about what missing votes mean, and coupled with scrolling behavior you could get some sense of read count that you could use to make adjustments, but that obviously makes things a bit more complicated.
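To gesture at what that adjustment could look like, here is a minimal Python sketch; the event shape, the threshold, and the shrinkage rule are all invented for illustration, not anything LW actually collects. The idea is to count a comment as “read” when it stayed in the viewport long enough, then treat silent readers as a weak neutral signal that shrinks the raw score:

```python
from dataclasses import dataclass

@dataclass
class ViewEvent:
    comment_id: str
    seconds_visible: float  # time the comment spent in the viewport

READ_THRESHOLD = 4.0  # seconds; an arbitrary guess at "actually read"

def count_reads(events: list[ViewEvent], comment_id: str) -> int:
    """Count view events long enough to plausibly be a read."""
    return sum(
        1 for e in events
        if e.comment_id == comment_id and e.seconds_visible >= READ_THRESHOLD
    )

def adjusted_score(score: float, votes: int, reads: int) -> float:
    """Shrink the raw score toward zero as silent readers accumulate:
    many reads with few votes suggests mostly neutral reactions."""
    silent = max(reads - votes, 0)
    return score * votes / (votes + silent) if votes + silent else 0.0
```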
Maybe that’s useful, but we’d have to figure out what votes are supposed to mean in the first place, i.e. I’m not sure there is a well-defined notion of what votes are for right now, such that we could change the UI to encourage using them in the expected manner.
I have an idea of what problems I’d like solved, and voting is one way to solve some of them; in that context we might have some sense of how we would like to ask users to use voting, but it only makes sense in that context. On its own, voting is just incrementing/decrementing counters in the database, and those counters inform some algorithms about the order in which content is displayed on the site; we have to decide what that means to us and what we would like it to do beyond what it naturally does on its own, such that providing instruction and shaping the UI to encourage particular behaviors is meaningful.
So that’s a long way to say yes, but conditional on having explicit norms.
I’m not sure how much it would help, but that’s mainly both because I am not troubled by voting up things I disagree with or voting down things I agree with, and because I have a history of leaving constructive feedback comments, especially in cases where I feel a comment/post is being treated overly harshly or the author seems unfamiliar with community norms. I can imagine that others who are less willing to do that might be more willing to leave reacts that at least convey some of that information.
But my impression is this is not true for everyone. One clear-cut thing is that there’s a certain threshold of agency and self-efficacy that someone needs to have demonstrated before I feel comfortable inviting them to mission-centric spaces (over the long term), and I think I’m not alone in that. I think there are people who have “mixed competencies”, where they’ve gotten good at some things but not others, and they want to be able to help the mission, and there are subtle and not-so-subtle social forces that push them away.
And I’m not sure there’s anything wrong with that, but it seems important to acknowledge.
I think there’s something proper in the function of a sangha (and by extension, our community) that it discourages those who don’t have, as you put it, the “agency and self-efficacy” to properly engage in the mission, and also pushes out those who are only half in it. What I imagine happening with “mixed competencies” is that such people don’t stay, despite the fact that they could have stayed if they had been more committed and willing to make space for themselves in a place that was willing to tolerate them but not usher them in.
Of course, it feels a bit weird, because in sangha that’s directly tied to the purpose of the community and can be done skillfully as part of transmitting the dharma, whereas in our community this seems at cross-purposes with the mission and can feel to some like defecting on paying the cost to train and develop the people it needs. Probably this is part of what sets apart sangha from other forms of community: its shape is directly tied to its function, and is a natural extension of the mission, whereas elsewhere other shapes could be adopted because the mission does not directly suggest one.
For a related notion, let me relate some things about sangha, as I tend to think it’s a good model for the kind of community whose shape fits the present situation.
“Sangha” is a Sanskrit word usually translated as “community”. It has a couple different meanings within Buddhism. One is mission focused: the ideal sangha transcends any particular space and time, and its members are everyone who has, by various definitions, taken the three refuges, taken the precepts, achieved stream entry, or is otherwise somehow on The Path. Another is location focused: “sangha” can refer to the specific community in a particular monastery, order, lineage, practice center, etc. of people who are “committed” or “serious” in one of the ways just enumerated. There are some others, but those are the ones that seem relevant here.
Some things might look on the outside like sangha but might not be. For example, I facilitate a weekly meditation meetup at the REACH. It’s not really a sangha, though, because lots of people casually drop in who may or may not have committed themselves to liberation from suffering; they just want a place to practice or to hang out with some cool people or something else. And that’s fine; the point of a group like this is to be accessible in a way a sangha is not, because although a sangha may be welcoming (the local one to which I belong certainly tries to be), many people bounce off sanghas because, I theorize, they aren’t ready to make the kind of commitment that really being part of one asks of you.*
*Because sangha will ask for commitment, even if you just try to be a “casual”; there’s not really a way to do the equivalent of hiding in the pews in the back. And it’s not like the way you can’t hide in an Evangelical Christian group, where you will be pointedly asked about your seriousness and shunned if you aren’t committed. Rather, it’s that the practice pervades everything around the sangha, and if you get too close to it while wanting to maintain distance, you’ll feel very out of place.
And there’s the opposite situation, where a sangha may be real but not look much like it to outsiders. Sometimes this is just two friends, fellow stream-winners, who create sangha through their every interaction with each other; an outsider might just see good friends and miss the deeper connection to practice pervading their relationship.
The point being: sangha is something special and valuable, with real but somewhat fuzzy borders and a strong commitment to a “mission”.
Now, our community (the one Ray is talking about here) hasn’t existed long enough for us to come to agreement on just what the criteria for inclusion are (or, put another way, exactly how we would phrase the mission, though I like what Ray says above), but whatever it is, we can use it as the foundation of our community. In my time in Berkeley I’ve seen that, in my estimation, the mission is strong and powerfully creates a center of gravity that pulls in people sufficiently aligned with it and pushes out people who are not; usually not by force, but because they simply get pulled away by other interests: they might care about the mission some, but not enough to make it a top priority in their life. This makes it, to me, like a sangha, but rather than a community committed to enlightenment, it’s a community committed to long-term flourishing.
To me this suggests a couple things about how to build what Ray has called the village:
Keep the mission strong. The mission is the thing that holds the village together.
Leave the village open. The mission is both lighthouse and craggy shore: it draws some people into the metaphorical harbor of the village and keeps others out, because threading the currents to the harbor’s mouth is just hard enough to keep out anyone who doesn’t deeply care about the mission, but not so hard as to keep out anyone who is serious.
Fix up the village. Right now the village is like a shanty town built around a small fort. Things in the fort are okay, but it’s a remote outpost far from home that requires frequent resupply from the outside, and the shanty town is better than living rough and more of a place than anything else near the fort, but that’s about it. I believe most of the problems are not a consequence of keeping the mission strong or leaving the village open, but of not caring for the village.
I’m tempted to speculate about a “harder” version of this question: what if we lived in a universe where Bayes’ theorem not only hadn’t been discovered but wasn’t true? Like a universe with a different physics of causality. But I digress.
I don’t have a direct answer for you, but it might be constructive to reflect that Bayes’ theorem is a particular mathematical understanding of a pattern people grasp and use implicitly, and that it pops up all over the place because it is a view onto the mechanisms of causation. This suggests that even without Bayes’ theorem being formally stated by anyone in any way, we’d still see it pop up all over the place, only no one would have identified it as a common pattern.
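(For reference, the formal statement is a one-line consequence of the definition of conditional probability: since $P(H \wedge E)$ factors either way, $P(H \mid E)\,P(E) = P(E \mid H)\,P(H)$, and dividing by $P(E)$ gives

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}.$$

That’s the pattern that would keep popping up whether or not anyone had named it.)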
An interesting pattern I see in the comments, and have picked up from other conversations, but that no one has called out, is that many people seem to have a preference for a style of communication that doesn’t naturally fit “I sit alone and write up my thoughts clearly and then post them as a comment/post”. My personal preference is very much to do exactly that, as talking to me in person about a technical subject is maybe interesting but actually requires more of my time and energy than writing about it does. This suggests to me that the missing engagement is mostly from folks who don’t prefer to write out their thoughts carefully, and the existing engagement is largely from people who do.
I have some kind of pet theory here about different internet cultures (I grew up with Usenet and listservs; younger/other folks grew up with chat and texting), but I think the cause of this difference in preferences is not especially relevant.
I continue to be concerned with issues around downvotes and upvotes being used as “boos” and “yays” rather than saying something about the worthiness of a thing to be engaged with (I’ve been thinking about this for a while and just posted a comment about it over on the EA Forum). The result is that, to me, votes are very low in information value, which is unfortunate because they are the primary feedback mechanism on LW. I would love to see a move towards something that made voting costlier, although I realize that might impact engagement. There are probably other solutions that overcome these issues not by directly tweaking voting but by pulling sideways at it, to come up with something that would work better for what I consider the important thing you want votes for: identifying the stuff worth engaging with.
This makes an interesting case I don’t often hear: that political stability, and especially stable succession, can overcome many other conditions that would tend to lead to general instability and lack of development. To me this further suggests that the underlying factor important to development is stability; other things simply affect the specifics of how development proceeds, while stability is usually (always?) the deciding factor.
Does that fit with what you’ve seen?
So I want to poke at you and see if I’m understanding you correctly.
First, are you just talking about strong moral essentialism (morals are powered by real, possibly observable, facts or processes that are causally connected to moral judgments) here, or all moral realism (moral facts exist, even if they are unknowable)?
Second, what makes you think a moral realist would not be in favor of AI value learning such that you need to argue for it?
I think there are ways in which believing in moral realism may make you sloppy about value learning, and is more likely to produce learning designs that acquire values we would, in retrospect, not endorse, but I don’t see that as suggesting a moral realist would be against value learning (preprint on this point). In fact, I expect them to be for it, only that they will expect values to naturally converge no matter the input, where an anti-realist would expect input to matter a lot. The standard objection from an anti-realist would be “seems like data matters a lot so far to outcomes”, and the standard realist reply might be “need more data or less biased data”.
Being a prophet sucks in part because you can clearly see the correct thing that everybody should do… but nobody understands you, and people are constantly misinterpreting you, not listening to you, or seeing you as a threat to their power.
Turns out this part doesn’t suck, because being a prophet you also understand this is going to happen, and you accept it and work with it. It only counterfactually sucks to the non-prophet imagining what it would be like to be a prophet.
At least for some sufficiently advanced prophet. I think Sarah is using “prophet” in a way where your interpretation makes sense, in that many people will be in what I might instead call the “advisor” category.
But is it really fruitless? Yes, you showed it’s flawed, but I can immediately imagine ways it might achieve what seems to be its goal of getting people to notice better how they came to think what they think and throw it into a light where they might re-examine those reasons and move towards reflective equilibrium. Maybe it’s not the best way to do it, but it’s a way that likely works at least sometimes for some people or else I would expect it wouldn’t have been so salient to you as to be worth writing about (compare the way no one has to write a post explaining why giving yourself concussions to lose weight is a bad idea—it’s got no proponents, no reasons to think it would work, and probably no kernel of value that might be doing something in a less-than-optimal way but still doing something in the expected direction).
That’s what I’m trying to get at: why do you think it’s fruitless? I see arguments showing it has problems, but not a clear line of reasoning to show me that those problems make it not worth it when weighed against whatever benefits people who promote it believe it delivers. If I were a proponent of street epistemology I’d read this and say “yeah, sure, it’s not a perfect method, but on balance it’s better than not doing it, so I’m going to keep doing it”, and, being neutral, I read this and say “yeah, okay, I see, there are a few problems, but every method has some problems for some purposes, so I remain neutral on it and am not swayed either way”. Again, since you say “against”, I anticipated and wanted to ask if there was some argument for “against” rather than “neutral but better informed of the caveats”.
Well, when someone says they are “against” something I usually expect there to be a presentation of reasons why the thing one is against should be avoided. I came away from reading this with a sense that you found plenty of things you didn’t like about street epistemology but not much of an argument towards a norm that excludes it. That made me curious if there was something more you didn’t say that causes you to title the post “against street epistemology” where there is some norm/ought you meant to advance that I didn’t pick up on out of the various reasons you find street epistemology either unlikeable or failing to satisfy some criterion, including criteria street epistemology thinks it fulfills.
Put another way, you told me a bunch of reasons why street epistemology sucks at some things, but didn’t really tie together for me why that matters to you or convince me why that should matter to me, and that’s the kind of thing I expected from a title like “against street epistemology”.
This seems rather intuitive to me, and maybe that’s because I trained as a mathematician, which I did because I happen to have the sort of mind that finds additional layers of abstraction useful on their own, but for the sake of being explicit I’ll tell you why I think it exists and why we would invent it now if it hadn’t already been invented.
Mathematics is (maybe) fundamentally about finding patterns in the world and reasoning about those patterns. When we see enough patterns, we can find patterns in those patterns, and then when we see enough meta-patterns we find patterns in those, and so on until we can’t find any more. For example, geometry is, at least originally, about the patterns we find when drawing stuff on a flat surface; arithmetic is about the patterns we find when we combine countable things; regular algebra is about the patterns we find in arithmetic; abstract algebra is about the patterns we find in regular algebra, geometry, and some other fields of mathematics with particular structures. Category theory is an extension of this pattern finding to the level of finding patterns across broad swaths of otherwise disconnected parts of mathematics.
It’s useful for several reasons, some internal to the theory itself, but I think largely because it gives us more general ways to reason about more concrete things. In addition to mathematics I also trained and work as a programmer, so my view is perhaps a bit biased here, but I find general abstractions useful because they let us deal with many concrete things that we would otherwise have to handle as special cases. With category theory we no longer have a bunch of mathematical silos that require the redevelopment of various concepts; instead we have a general field that can give us, for free, theorems and structures and relationships for any part of mathematics it adequately covers, so I can take a result in category theory and use it to find similar results in various fields.
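To make that concrete with a programming sketch (Python, using single dispatch as a loose stand-in for the functor idea; the name fmap is borrowed from Haskell, and this is analogy rather than category theory proper): one operation, defined once, works uniformly over several otherwise-unrelated concrete structures instead of being redeveloped in each silo.

```python
from functools import singledispatch

@singledispatch
def fmap(structure, f):
    """Map a function over a structure, whatever its concrete shape."""
    raise TypeError(f"no fmap instance for {type(structure).__name__}")

@fmap.register
def _(structure: list, f):
    return [f(x) for x in structure]

@fmap.register
def _(structure: tuple, f):
    return tuple(f(x) for x in structure)

@fmap.register
def _(structure: dict, f):
    # map over values, keeping keys: yet another "container" shape
    return {k: f(v) for k, v in structure.items()}

print(fmap([1, 2, 3], lambda x: x + 1))          # [2, 3, 4]
print(fmap({"a": 1, "b": 2}, lambda x: x * 10))  # {'a': 10, 'b': 20}
```

Anything proved once about fmap-like operations then applies to every structure that supports one, which is the flavor of reuse category theory gives mathematics.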
Category theory also helps, much as abstract algebra did before it, to identify shared patterns across different fields of mathematics and set up correspondences that allow the transmutation of, say, a problem about graphs into a problem about complex variables, without relying on a bunch of one-off proofs of shared structure, because you can appeal to the categories to show how they relate. Yes, there is always stuff that doesn’t translate between fields, because the fields have their own unique parts that differ because they are trying to model different things, but category theory at least lets us abstract away what we can from the noise and notice what’s going on in common.
It’s been a while since I did much academic math so I’m a bit fuzzy on specific results to point to, but I hope that gives a general sense of why category theory seems valuable and important to me.
I’m a bit confused about just what it is you’re really against in street epistemology. It seems, as you describe it, that street epistemology may often not be an appropriate means to achieve the ends you care about, nor necessarily even the ends its proponents care about, but that just seems to be saying that this relatively obscure practice is ineffective at serving certain purposes. So what then is there to be against, other than perhaps unskillfulness?
Maybe I’m just thrown by your post’s title and you meant this more to just be “here’s why street epistemology is ineffective” or something like that?
A number of the comments have pointed in the direction of my concern with what I interpret to be the underlying assumption of this post: namely, that it is possible at all to build AI, general or narrow, even restricted to a small domain, out of material so untouched by humans that implicit, partial modeling of humans will not happen anyway. This is not to deny that much current AI safety work tends to be extremely human-centric, going so far as to rely on uniquely human capabilities (at least unique among known things), and that this is in itself a problem for many of the reasons you lay out, but I think it would be a mistake to think we can somehow get away from humans in building AGI.
The reality is that humans are involved in the work of building AGI, involved in the design and construction of the hardware they will run on, the data sets they will use, etc., and even if we think we’ve removed the latent human-shaped patterns from our algorithms, hardware, and data, we should strongly suspect we are mistaken because humans are tremendously bad at noticing when they are assuming something true of the world when it is actually true of their understanding, i.e. I would expect it to be more likely that humans would fail to notice their latent presence in a “human-model-free” AI than for the AI to actually be free of human modeling.
Thus going down the path of building AGI without human models risks failure because we fail to deal with the AGI picking up on the latent patterns of humanity within it. This is not to say that we should stick to a human-centric approach, because it has many problems, as you’ve described, but trying to avoid humans means neglecting to make our systems robust to the kinds of interference from humans that can push us away from the goal of safe AI, especially unexpected and unplanned-for interference due to hidden human influence. If we instead build expecting to deal with, and be robust to, the influence of humans, we stand a much better chance of producing safe AI than by being either human-centric or ignoring humans entirely.
I am not sure I follow. The bots indeed do end up clustering into 4 to 5 different clusters, where each cluster represents a certain convergent view. By “keeping the affinity score”, do you mean they keep track of the past interactions, not just compare current views at each step? That would be an interesting improvement, adding memory to the model, but that would be, well, an improvement, not necessarily something you put into a toy model from the beginning. Maybe you mean something else? I’m confused.
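For concreteness, the memoryless “compare current views at each step” dynamic I mean looks something like this bounded-confidence sketch (in the style of Hegselmann-Krause; the parameters and every detail below are my own guesses for illustration, not the post’s actual code). Clusters emerge even though no bot remembers past interactions:

```python
import random

N, STEPS, TOLERANCE = 100, 50, 0.1  # guessed parameters

# Each bot holds a single scalar "view" in [0, 1].
views = [random.random() for _ in range(N)]

for _ in range(STEPS):
    # Memoryless update: each bot averages with everyone whose *current*
    # view is within TOLERANCE of its own. No history is kept anywhere.
    new_views = []
    for v in views:
        neighbors = [w for w in views if abs(w - v) <= TOLERANCE]
        new_views.append(sum(neighbors) / len(neighbors))
    views = new_views

# Views typically collapse into a handful of clusters.
print(sorted(round(v, 2) for v in views))
```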
Oh, this paragraph seems to suggest your model has a lot more going on than I got from reading this post. Maybe if I followed your links I would find more details (it sounded like they were just extra details that could be skipped)? I got the impression you found a function that has a shape illustrative of what you want and that was it, but this sounds like there’s a lot more going on than is described in the text of this post!