Viliam comments on Lesswrong Potential Changes

Viliam 21 Mar 2016 12:17 UTC
8 points
0

How will we know we have done well (KPI—technical)
Total comments/month
Total word count on posts/month
Total word count on comments/month

This feels wrong to me. I mean, I would like to have a website with a lot of high-quality materials. But given a choice between higher quality and more content, I would prefer higher quality. I am afraid that measuring these KPIs will push us in the opposite direction.

Reading spends time. Optimizing for more content to read means optimizing for spending more time here, and maybe even optimizing for attracting the kind of people who prefer to spend a lot of time debating online. Time spent reading is a cost, not a value. The value is what we get from reading the text. The real thing we should optimize for is “benefits from reading the text, minus time spent reading the text”.

subreddits: …

I think the subreddits should only be created after enough articles for given category were posted (and upvoted). Obviously that requires having one “everything else” subreddit. And the subreddits should reflect the “structure of the thingspace” of the articles.

Otherwise we risk having subreddits that remain empty. Or subreddits with too abstract names, or such that authors are confused where exactly which article belongs. (There will always be some difficult cases, but if the subreddit structure matches the typically written articles, the confusion is minimized.) For example, I wouldn’t know whether talking about algorithms playing Prisonners’ Dilemma belongs to “AI” or “math”, or whether debates of procrastination among rationalists and how to overcome it are “instrumental” or “meta”. By having articles first and subreddits later we automatically receive intensional definition of “things like this”.

Perhaps we could look at some existing highly upvoted articles (except for the original Sequences) and try to classify those. If they can fit into the proposed categories, okay. But maybe we should have a guideline that a new subreddit cannot be created unless at least five already existing articles can be moved there.

Vaniver and others are interested in changing the voting system to something like StackOverflow’s model (privileged voting?).

Upvoting and downvoting should be limited to users already having some karma; not sure about exact numbers, but I would start with e.g. 100 for upvoting, and 200 or 300 for downvoting. This would prevent the most simple ways to game the system, which in its current form is insanely fragile—a single dedicated person could destroy the whole website literally in an afternoon even without scripting. This is especially dangerous considering how much time it takes to fix even the smallest problems here.

EDIT:

It would be nice to have scripts for creating things like Open Thread automatically.

Explicitly invite the following people’s contribution: …

Definitely add PJ Eby to the list. I am strongly convinced that ignoring him was one of the largest mistakes of the LW community. I mean, procrastination is maybe the most frequently mentioned problem on this website, and coincidentally we have an expert on this who also happens to speak our language and share our views in general, but instead of thinking about how to cooperate with him to create maximum value, CFAR rather spent years creating their own curriculum from the scratch which only a few selected people have seen. (I guess a wheel not invented in the Bay Area is not worth trying, despite all the far-mode talk about the virtue of scholarship.)
- Vaniver 21 Mar 2016 16:07 UTC
  6 points
  0
  Parent
  
  I think the subreddits should only be created after enough articles for given category were posted (and upvoted).
  
  Agreed. This is why I shut off main and forced everything into discussion—I don’t think we know enough about how LW will be used to partition things ahead of time. (I’m also pretty skeptical of doing a subreddit split on topics instead of on rules.)
  
  Upvoting and downvoting should be limited to users already having some karma; not sure about exact numbers, but I would start with e.g. 100 for upvoting, and 200 or 300 for downvoting. This would prevent the most simple ways to game the system, which in its current form is insanely fragile—a single dedicated person could destroy the whole website literally in an afternoon even without scripting. This is especially dangerous considering how much time it takes to fix even the smallest problems here.
  
  Currently the limits are 10 for both upvoting and downvoting. We’ve already seen some innocent bystanders hit.
  
  I think you’re underestimating the difficulty in getting up to 100 karma. (One comment made a while ago is that the fragility of the voting system—especially when it comes to serial downvoters—happens in part because of how infrequently good users vote. It is problematic when we exclude people with good taste who don’t contribute much, because that means the base of good votes to overcome is even shallower.)
  - Viliam 21 Mar 2016 20:38 UTC
    3 points
    0
    Parent
    
    We’ve already seen some innocent bystanders hit.
    
    The disabled buttons should have tooltips saying “you need X karma to vote”.
    
    I think you’re underestimating the difficulty in getting up to 100 karma.
    
    Maybe 10 is okay for upvoting, but there needs to be a sufficiently high limit for downvoting, to stop the usual Eugine’s strategy of “post three quotes in rationality thread, get a few upvotes, and immediately use the karma to harass others”. Higher costs of doing things increase the cost of avoiding bans by making new accounts repeatedly.
    - Lumifer 21 Mar 2016 20:57 UTC
      −3 points
      0
      Parent
      
      to stop the usual Eugine’s strategy
      
      Our design docs for LW v2.0 are not written by Eugine, are they?
      - Viliam 21 Mar 2016 22:11 UTC
        3 points
        0
        Parent
        Well, if they cannot even solve the existing problems, then I have to predict that the existing problems will continue to exist, duh.
        
        A different solution would be okay too. But some solution is needed. And I violate the virtue of silence again by saying that as a determined user, I could ruin the website in a weekend, even without scripting. (With scripting, I could make a program that ruins the website at any moment at a click of a button.) Eugine is really picking just the absolutely lowest hanging fruit; and fixing that already creates about 50% of moderators’ work. Multiply this by ten, and LW will not have enough manpower to deal with the problems. A script could multiply it by a million.
        Vaniver 22 Mar 2016 18:44 UTC
        2 points
        0
        Parent
        
        But some solution is needed.
        
        When a user has admin on, they can see the list of users who upvoted or downvoted a comment in the tooltip (which currently shows % positive).
        
        All karma calculations use a ‘weight’ variable that’s stored per user, and can be adjusted at the userpage by a user with admin on.
        
        That means shutting down sockpuppets is two clicks, and discovering them is a mouseover. The main uncertainty is how the weight variable will impact the performance of the site.
        Viliam 22 Mar 2016 21:12 UTC
        −1 points
        0
        Parent
        The “weight” only needs values 0 and 1. And the value 0 can be achieved by disabling the buttons (which has negligible impact on performance) and removing the existing votes (which only happens one per user).
        Lumifer 21 Mar 2016 23:43 UTC
        0 points
        0
        Parent
        
        But some solution is needed.
        
        You need to define and specify the problem first. For example, you were saying that the 10 karma to gain the right to vote is too low—but e.g. in the weaponised scenarios that you mention the number of karma points is pretty much irrelevant. Setting this variable to 100 (or 1000) will not provide much defense.
        
        As usual, step one should be to specify the threat model.
        Viliam 22 Mar 2016 8:08 UTC
        5 points
        0
        Parent
        
        As usual, step one should be to specify the threat model.
        
        I think describing it publicly in details is not a good idea. I have a model of attack which I believe is strong enough that within a day I could reduce your (or anyone else’s) karma to zero, without using my existing account (i.e. pretending that I am a completely new person, or that my old account was banned), and without scripting, assuming I spend the whole day doing this. With scripting, it is merely a question of clicking a button when the script is ready, and the script could be ready in a day or two. After having the script ready, the slowest part would be getting the first 10 karma for the new account, which is quite easy. Which is why I recommend making exactly this part more difficult.
        
        After running the script, to undo the damage it would be necessary to find the account that did it, and make a script that removes all its votes. (Assuming the attack was done with one account. That’s not a reasonable assumption with scripting.) Judging by how “quickly” the support has reacting in the past, that would take about a month. Running the script again would take just one more click. Fixing the problem again would probably take a few days. There is a huge assymetry of effort. And with small modification, version 2.0 of the script could create hundreds of new accounts (now the slowest part would be the attacker typing the captcha for registering the new accounts), which would make defense impossible for a few months, until the necessary changes in code would be implemented.
        
        All I am asking for is to make this vector of attack more costly, by increasing a fucking constant. What else am I supposed to do to convince anyone? Do I have to produce a working prototype of the scipt? There is already enough information here for anyone to connect the dots.
        Lumifer 22 Mar 2016 17:15 UTC
        −1 points
        0
        Parent
        
        All I am asking for is to make this vector of attack more costly, by increasing a fucking constant.
        
        My point is that increasing that constant is not a viable defence against the attacks. You are not putting up a roadblock, merely a microscopic speed bump that a capable attacker will not even notice.
        
        You suggestion is like prohibiting passwords consisting of a single character. Will it help in the case of really stupid people? A bit. Will it help in the case of people actually likely to mount an attack? Not at all. Does it create the impression that you’ve “improved the security”? Yes, and that’s the worst part.
        Viliam 22 Mar 2016 21:19 UTC
        −1 points
        0
        Parent
        In theory, the only values defensible from the first principles are 0, 1, and infinity. In practice, the difference between an hour and a week can be significant.
        
        Will it help in the case of people actually likely to mount an attack? Not at all.
        
        Actually, it can make the attack without scripting quite expensive, and having to write and debug a script can be an obstacle for many people. For example, I am quite tempted to make a proof-of-concept script and fire it at you just to prove my point; and I have already written scripts interacting with websites in the past; but I am still quite likely not to do it, because it would take me a few hours of work. Procrastination, trivial inconveniences, etc.
        
        You suggestion is like prohibiting passwords consisting of a single character.
        
        I believe that CAPTCHA would be a better analogy, because it is an amount of work that has to be done by the user manually, before they are given access to the full functionality. More specifically, it is like changing a one-character CAPTCHA into multiple characters.
        Lumifer 22 Mar 2016 21:34 UTC
        1 point
        0
        Parent
        
        In practice, the difference between an hour and a week can be significant.
        
        Sure, but why do you want to take a roundabout-karma way about it? If you care about slowing attacks down, make it so that no account younger than X days can vote. If you care about a sockpuppet explosion, implement some checks on the front end so that no IP address can create more than Y accounts in Z days (yes, proxies, but that’s another speed bump).
        
        However I feel that all this distracts from a bigger point. LW is in crisis and some people even say it’s dying. This is not because LW is under siege from multiple accounts or sockpuppets. If Eugene goes away LW will still be in crisis. While I’m not in general a big fan of YAGNI, I feel that it’s appropriate here. Focus on important parts first.
        Viliam 23 Mar 2016 7:06 UTC
        −1 points
        0
        Parent
        There is more than one problem with LW. But for me this is just more reason to make one go away quickly by increasing a constant, and then focus on the remaining ones.
        Expand this thread
        Lumifer 23 Mar 2016 14:21 UTC
        1 point
        0
        Parent
        We’re disagreeing about whether increasing that constant will make the problem go away.
  - Lumifer 21 Mar 2016 17:08 UTC
    3 points
    0
    Parent
    
    I think you’re underestimating the difficulty in getting up to 100 karma.
    
    I don’t think straight number limits like this are going to work well. Let’s take two new users Alice and Bob, and stipulate that, using gaming terminology, Alice is a casual and Bob is an elitist jerk. Alice might well take a month or two or three to accumulate 100 karma in the course of her ordinary use of LW. Bob, being who he is, will minmax the process and get his 100 karma in a couple of days.
    
    Managing the power gap between casuals and elite minmaxers is a big problem in multiplayer games and it doesn’t look like an easily solved one.
    - Vaniver 21 Mar 2016 18:18 UTC
      3 points
      0
      Parent
      
      I don’t think straight number limits like this are going to work well.
      
      I think straight number limits give us the most usefulness for the difficulty to implement. If you have other suggestions, I’m interested.
      - Lumifer 21 Mar 2016 18:27 UTC
        5 points
        0
        Parent
        If we are talking about the criteria for the promotion to the full vote-wielding membership of LW, you are not limited to looking just at karma.
        
        For example: Promote to full membership when (net karma > X) AND (number of positive-karma comments > Y) AND (days when posted a positive-karma comment > Z).
        
        Implementation shouldn’t be difficult, given how all these conditions are straightforward SQL queries.
        
        A more general question is the trade-off between false positives and false negatives. Do you want to give the vote to the newbies faster at the cost of some troll vandalism, or do you want to curtail the potential for disruption at the cost of newbies feeling themselves second-class citizens longer?
        gjm 21 Mar 2016 20:42 UTC
        4 points
        0
        Parent
        
        straightforward SQL queries
        
        Very funny.
        Lumifer 21 Mar 2016 21:02 UTC
        2 points
        0
        Parent
        If what should be straightforward SQL queries are too difficult to implement, LW code base is FUBARed anyway.
        
        Anyone wants to write another middle layer which will implement normal SQL on top of that key-value store implemented on top of normal SQL? X-D
        
        A bit more seriously, LW code clearly uses some ORM which, hopefully, makes some sense in some (likely, non-SQL) way. Also reading is not writing and for certain tasks it might make sense to read the underlying Postgres directly without worrying about the cache.
  - PhilGoetz 29 Mar 2016 15:20 UTC
    0 points
    0
    Parent
    I just posted an article to Main. Would you check & see if it appears there for you, too?
    What links here?
    Error's comment on “3 Reasons It’s Irrational to Demand ‘Rationalism’ in Social Justice Activism” by PhilGoetz (29 Mar 2016 20:54 UTC; 30 points)
    - Vaniver 29 Mar 2016 15:49 UTC
      0 points
      0
      Parent
      Hmm. It is visible to me; we put in a safety valve so that people could edit posts that were already in Main, but I’d have to look into more detail to see what happened.
      
      (You shouldn’t have had the option in the dropdown, I think.)
- Houshalter 25 Mar 2016 4:52 UTC
  2 points
  0
  Parent
  
  This feels wrong to me. I mean, I would like to have a website with a lot of high-quality materials. But given a choice between higher quality and more content, I would prefer higher quality. I am afraid that measuring these KPIs will push us in the opposite direction.
  
  But more content equals a higher chance that some of the content is worth reading. You can’t get to gold without churning through lots of sand.
  
  Instead I think there should be decent filtering. It shouldn’t be sorted by new by default, but instead “hot” or “top month” etc.
  
  I think the subreddits should only be created after enough articles for given category were posted (and upvoted). Obviously that requires having one “everything else” subreddit. And the subreddits should reflect the “structure of the thingspace” of the articles.
  
  I second this. In fact I would go further and say there should only be 1 or 2 distinct subreddits. Ideally just 1.
  
  The model for this is Hacker News. They only have one main section. And no definition of what belongs there except maybe “things of interest to hackers” it’s filled with links of all kinds of content from politics to new web frameworks.
  
  I think lesswrong could do something like that successfully. The only reason it isn’t is because, see above, new content like that is discouraged.
  
  It would be nice to have scripts for creating things like Open Thread automatically.
  
  Lesswrong currently uses (a highly outdated version of) reddit’s api, so writing bots to do various tasks shouldn’t be too difficult, and doesn’t require access to Lesswrong’s code.
- gjm 21 Mar 2016 12:35 UTC
  1 point
  0
  Parent
  I don’t know about CFAR, but my sense is that if the LW community as a whole ignored PJ Eby it wasn’t because of Bay Area prejudice (what fraction of LW people have, or had, any idea where he lives?) but because the style of his writing was offputting to many here.
  
  I mean, for instance, his habit of putting everything important in boldface, which feels kinda patronizing (and I think LW people tend to be extra-sensitive to that). And IIRC he used too many exclamation marks! The whole schtick pattern-matches to “empty-headed wannabe lifestyle guru”!
  
  Having said that, I just had a quick historical look and it seems like from ~2013 (which is as far back as I looked) he hasn’t been doing that much, and hasn’t been ignored any more than other LW contributors. But perhaps he also hasn’t been posting much about his lifestyle-guru/therapist/coach stuff either. (I can easily believe that the unusual writing style goes with the self-help territory rather than being something he just does all the time.)
  - Vaniver 21 Mar 2016 14:11 UTC
    1 point
    0
    Parent
    
    I don’t know about CFAR, but my sense is that if the LW community as a whole ignored PJ Eby it wasn’t because of Bay Area prejudice (what fraction of LW people have, or had, any idea where he lives?) but because the style of his writing was offputting to many here.
    
    I think this is the main factor. I didn’t find his style offputting, at least to the degree others did, but I notice that I never went on an archive-binge of what he’d written.
  - Viliam 21 Mar 2016 13:44 UTC
    0 points
    0
    Parent
    The evidence against “empty-headed” is that his articles and comments often got highly upvoted on LW.
    - gjm 21 Mar 2016 17:59 UTC
      2 points
      0
      Parent
      I am arguing not that PJE is in fact empty-headed but that his writing style may have felt like that of someone empty-headed and that, if in fact he was ignored and neglected, this may be why.
      
      But I’m a bit confused now, because if his articles and comments were highly upvoted on LW I don’t think I understand in what sense you can say that “ignoring him was one of the largest mistakes of the LW community”. (Of course it could still be a mistake made by, say, CFAR.)
      - Viliam 21 Mar 2016 21:59 UTC
        1 point
        0
        Parent
        After noticing that procrastination is a serious problems for many aspiring rationalists, and that we have a domain expert on LW, the reasonable approach would be to invite him to make a lecture for CFAR seminars. (And then of course use the standard CFAR methods to measure the impact of the lecture.) Motivation is a multiplier; if the lessons actually work, CFAR would get a huge bonus not only by having these lessons for their students, but also by using them for themselves; and maybe even for the folks at MIRI to build the mechanical god faster.
        
        If the negotiation fails, there are still backup options, such as having someone infiltrate his lessons, steal the material, and modify it to avoid copyright issues. (Shouldn’t be difficult. LW is accused all the time of inventing new names for the existing concepts, which is exactly what needs to be done here, because only names can be copyrighted and trademarked, not the concepts themselves.) But I would expect the negotiation to be successful, because PJE is already okay with publishing articles on LW, so whatever he is trying to achieve by that, he would achieve even more of it by cooperating with CFAR.
        
        Maybe some kind of cooperation actually happened, I just haven’t heard about it, in which case I apologize to everyone concerned.
        
        I sincerely believe that in his area of work, PJE is doing the same kind of high-quality work as Eliezer did in writing the Sequences. Joining high motivation with avoiding biases seems like a super powerful combo, like the royal road to winning at life. I am quite sensitive about reading bullshit, and the field of motivation is 99% bullshit. Yet PJE somehow manages to read all those books, extract the 1% that makes sense, and explain it separately from the rest. I have listened to a few of his lectures, and read his unfinished book, and I don’t remember finding anything that I would be ashamed to tell at a LW meetup. There are people who swallow the bullshit completely; there are also people who believe that there is a dilemma between lying to yourself and being more productive or refusing to lie to yourself at a cost of losing productivity (and then explain why they choose one side over the other, or vice versa), but PJE always takes apart the stuff that experimentally works from the incorrect explanation that surrounds it, in a way that makes sense to me.
        
        Most wannabe rationalists avoid the emotional topics and pretend they don’t exist. The Vulcan stereotype exists for a reason, and many explanations why this is not how we do rationality feel like “the lady doth protest too much”. Our culture rewards trying to explain away emotions by using pseudomathematical bullshit such as “hyperbolic discounting” (oh, you used two scientifically sounding words, that’s neat; but you also completely failed to explain why some people procrastinate while others don’t, or why a short exercise can switch a person from avoiding work to doing the work). This is our collective blind spot; our motivated stopping before stepping on an unfamiliar territory. Back to the safety of abstractions and equations; even if we are forced to use equations as metaphors, so the actual benefits of doing maths are not there, it still feels safer at home.
        
        Unfortunately, this is one of the situations where I don’t believe I could actually convince anyone. I mean, not just admit verbally that I may have a point, but to actually change their “aliefs” (which is by the way yet another safe word for emotions).
- Elo 21 Mar 2016 12:59 UTC
  0 points
  0
  Parent
  
  these KPIs
  
  I have a lot of KPI’s because I realise some will not be effective. as we know; what gets measured gets optimised for; which is why I think having so many different measures will help make it hard to select for the wrong goals. By at least watching all of them; I expect we are likely to be able to make progress.
  
  “benefits from reading the text, minus time spent reading the text”.
  
  Agree. But how? if you have a better metric for measuring that I would gladly try to implement it; until then—I came up with the best possible solutions I could.
  
  subreddits
  
  I am thinking tags might be an easer implemented and stronger solution. maybe two layers of tags; one for “content tags” and one for “sorting tags”. The content tags will be anything (as per the current system). The sorting tags will be a set number of possible tags and clearly visible everywhere for sorting posts by, and posting into.
  
  Some sorting tags will also be able to be auto-assigned i.e. +10karma score. which can then be automatically aggregated to an RSS feed.
  
  Upvoting and downvoting
  
  I like these ideas.
  
  scripts for creating things like Open Thread automatically.
  
  Yes and no; particularly no on meetups. I don’t want dead meetups to appear on the meetup schedule, I was thinking an opt in email, “the last time you planned this meetup was 2 weeks ago; would you like to set one for two weeks now; reply “yes” to this email to confirm a meetup with the same location and this date and time.”
  
  A weekly thread can be automated; a monthly thread will find less use being automated. But certainly an option.
  
  PJ eby added.
  
  Edit: On second thought; if you want to just remove those particular KPI’s or “discount their validity a lot” I can also do that.
  - Viliam 21 Mar 2016 13:40 UTC
    −1 points
    0
    Parent
    I’m thinking, but not sure, whether watching average karma would be a good idea.
    
    Or maybe some curve that would transform article karma, something like articles with positive karma would get “karma − 10” points, and articles with zero or negative karma would get constant “-10″ points, and measuring a sum of that. (The rationale is that we subtract a few points as a cost of time spent reading; but we don’t penalize the negative-karma articles too much, because skipping an article with −100 karma is just as easy as skipping an article with −5 karma.)
    - Elo 21 Mar 2016 20:13 UTC
      1 point
      0
      Parent
      when was the last time you skipped an article, comment or post because it was negative? (not that you are a typical user) (and maybe this is worthy of a poll in the OT)
      
      This seems reasonable in general, aside from that minor quibble.
      - gwillen 22 Mar 2016 9:04 UTC
        1 point
        0
        Parent
        As one anecdata point, I do generally skip articles with much negative karma. I read via RSS, so I just hit ‘mark read’ on them. LW users are not big downvoters, most of the time, so if something has more than a few downvotes, I have found that I probably don’t want to read it.
        
        And of course, comments with a score of −3 are hidden by default, so many people probably don’t read them.

Viliam comments on Lesswrong Potential Changes

A weekly thread can be automated; a monthly thread will find less use being automated. But certainly an option.