Expanding on Ruby’s comment with some more detail, after talking to some other Lightcone team members:
Those of us with access to database credentials (which is all the core team members, in theory) would be physically able to run those queries without getting sign-off from another Lightcone team member. We don’t look at the contents of users’ DMs without their permission unless we get complaints about spam or harassment. In those cases we also try to take care to only look at the minimum information necessary to determine whether the complaint is valid, and this has happened extremely rarely[1]. Similarly, we don’t read the contents or titles of users’ never-published[2] drafts. We also don’t look at users’ votes except when conducting investigations into suspected voting misbehavior like targeted downvoting or brigading; when we do, we’re careful to only look at the minimum amount of information necessary to render a judgment, and we try to minimize the number of moderators who conduct any given investigation.
I don’t recall ever having done it; Habryka remembers having done it once.
We do see drafts that were previously published and then redrafted in certain moderation views. Some users will post something that gets downvoted and then redraft it; we consider this reasonable because other users will have seen the post and it could easily have been archived by e.g. archive.org in the meantime.
I occasionally incidentally see drafts by following our automated error-logging to the page where the error occurred, which could be the edit-post page, and in those cases I have looked enough to check things like whether it contains embeds, whether collaborative editing is turned on, etc. In those cases I try not to read the actual content. I don’t think I’ve ever stumbled onto a draft dramapost this way, but if I did I would treat it as confidential until it was published. (I wouldn’t do this with a DM.)
Is there an immutable (or at least “not mutable by the person accessing the database”) access log which will show which queries were run by which users who have database credentials? If there is, I suspect that mentioning that will alleviate many concerns.
No. It turns out after a bit of digging that this might be technically possible even though we’re a ~7-person team, but it’d still be additional overhead and I’m not sure I buy that the concerns it’d be alleviating are that reasonable[1].
Not a confident claim. I personally wouldn’t be that reassured by the mere existence of such a log in this case, compared to my baseline level of trust in the other admins, but obviously my epistemic state is different from that of someone who doesn’t work on the site. Still, I claim that it would not substantially reduce the (annualized) likelihood of an admin illicitly looking at someone’s drafts/DMs/votes; take that as you will. I’d be much more reassured (in terms of relative risk reduction, not absolute) by the actual inability of admins to run such queries without a second admin’s thumbs-up, but that would impose an enormous burden on our ability to do our jobs day-to-day without a pretty impractical level of investment in new tooling (after which I expect the burden would merely be “very large”).
I think it would be feasible to increase the friction on improper access, but it’s basically impossible to do in a way that’s loophole-free. The set of people with database credentials is almost identical to the set of people who do development on the site’s software. So we wouldn’t be capturing a log of only queries typed in manually; we’d be capturing a log of mostly queries run by their modified locally-running webserver, typically connected to a database populated with a mirror snapshot of the prod DB but occasionally connected to the actual prod DB.
Thanks for the response; my personal concerns[1] would be somewhat alleviated, without any technical changes, by:
Lightcone Infrastructure explicitly promising not to look at private messages unless a counterparty agrees to that (e.g., because a counterparty reports spam);
Everyone with such access explicitly promising to tell others at Lightcone Infrastructure when they access any private content (DMs, drafts).
Talking to a friend about an incident made me lose trust in LW’s privacy unless it explicitly promises that privacy.
Second one seems reasonable.
Clarifying in the first case: If Bob signs up and DMs 20 users, and one reports spam, are you saying that we can only check his DM, or that at this time we can then check a few others (if we wish to)?
TBH the main thing that helps with in practice is that it forces teams to get off the “emailed spreadsheet of shared passwords” model of access management. Which mainly becomes useful if someone is leaving the team in a hurry under less than ideal circumstances.
“That problem is not on the urgent/important Pareto frontier” is absolutely a valid answer though, especially since AFAIK LW doesn’t store any data more sensitive than passwords / a few home addresses.