On Social Media Nazi Bars, Tradeoffs, And The Impossibility Of Content Moderation At Scale
A few weeks ago I wrote about an interview that Substack CEO Chris Best did about his company's new offering, Substack Notes, and his unwillingness to answer questions about specific content moderation hypotheticals. As I said at the time, the worst part was Best's refusal to own up to what he was effectively describing as the site's content moderation plan: that it would be quite open to hosting the speech of almost anyone, no matter how terrible. That's a decision you can make (in the US at least), but if you're going to make it, you have to be willing to own it and be clear about it, which Best was unwilling to do.
I compared it to the "Nazi bar" problem that has been widely discussed on social media in the past: if you own a bar and don't kick the Nazis out up front, you get a reputation as a "Nazi bar" that is difficult to shake.
It was interesting to see the response to this piece. Some people got mad, claiming it was unfair to call Best a Nazi, even though I was not doing that. As in the story of the Nazi bar, no one is claiming that the bar owner is a Nazi, just that the public reputation of his bar will be that it's a Nazi bar. That was the larger point. Your reputation is what you allow, and if your stance is that you don't want to get involved at all and you'll allow such things, that's the reputation that's going to stick.
I wasn't calling Best a Nazi or a Nazi sympathizer. I was saying that if he can't answer a straightforward question like the one Nilay Patel asked him, Nazis are going to interpret that as a sign that he's welcoming them in, and they will act accordingly. So too will people who don't want to be seen hanging out at the Nazi bar. The vaunted "marketplace of ideas" includes the ability for a large group of people to say "we don't want to be associated with that at all…" and to find somewhere else to go.
And this brings us to Bluesky. I've written a bunch about Bluesky, going back to Jack Dorsey's initial announcement, which cited my paper, among others, as part of the inspiration for betting on protocols.
As Bluesky has gained a lot of attention over the past week or so, there have been a lot of questions raised about its content moderation plans. A lot of people, in particular, seem confused by its plans for composable moderation, which we spoke about a few weeks ago. I’ve even had a few people suggest to me that Bluesky’s plans represented a similar kind of “Nazi bar” problem as Best’s interview did, in particular because their initial reference implementation shows “hate speech” as a toggle.
I’ve also seen some people claim (falsely) that Bluesky would refuse to remove Nazis based on this. I think there is some confusion here, and it’s important to go deeper on how this might work. I have no direct insight into Bluesky’s plans. And they will likely make big mistakes, because everyone in this space makes mistakes. It’s impossible not to. And, who knows, perhaps they will run into their own Nazi bar problem, but I think there are some differences that are worth exploring here. And those differences suggest that Bluesky is better positioned not to be the Nazi bar.
The first is that, as I noted in the original piece about Best, there's a big difference between a centralized service and its moderation choices, and a decentralized protocol. Bluesky is a bit confusing to some because it's trying to do both things. Its larger goal is to build, promote, and support the open AT Protocol as an open social media protocol for a decentralized social media system with portable identity. Bluesky itself is a reference app for the protocol, showing how things can be done, and, as such, it has to handle content moderation to avoid Bluesky itself running into the Nazi bar problem. And, at least so far, it seems to be doing that.
The team at Bluesky seems to recognize this. Unlike Best, they're not refusing to answer the question; they're talking openly about the challenges here, and so far they have been willing to remove truly disruptive participants, as CEO Jay Graber notes here:
But, they definitely also recognize that content moderation at scale is impossible to do well, and believe that they need a different approach. And, again, the team at Bluesky recognizes at least some of the challenges facing them:
But, this is where things get potentially more interesting. Under a traditional centralized social media setup, there is one single decision maker who has to make the calls. And then you're in a sort of benevolent dictator setup (or at least you hope it's benevolent, since the malicious dictator threat is very real).
And this is where we go on a little tangent about content moderation: again, it's not just difficult. It's not just "hard" to do. It's impossible to do well. The people who are moderated will, with rare exceptions, disagree with your moderation decisions. And, while many people think that there are a whole bunch of obvious cases and just a few that are a little fuzzy, the reality (this is where the scale comes in) is that there are a ton of borderline cases that all come down to very subjective calls over what does or does not violate a policy.
To some extent, going straight to the "Nazi" example is unfair, because there's a huge spectrum between the user who is a hateful bigot deliberately trying to cause trouble and the helpful user who is trying to do good. There's a very wide range in the middle, and where people draw their own lines will differ massively. Some of it may be inadvertent or ignorant assholery. Some of it may just be trolling. Sometimes there are jokes that some people find funny and others find threatening. Sometimes people are just scared and lash out out of fear or confusion. Some people feel cornered and get defensive when they should be looking inward.
Humans are fucking messy.
And this is where the protocol approach with composable moderation becomes a lot more interesting. The most extreme calls, the ones where there are legal requirements, such as child sexual abuse material and copyright infringement, can be handled with removal at the protocol level. But as you start moving up into the murkier areas, where many of the calls are subjective (not so much "is this person a Nazi" but more along the lines of "is this person deliberately trolling, or just uninformed…"), the composable moderation system lets (1) end users make their own rules and (2) any number of third parties build tools that work with those rules.
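To make that concrete, here's a minimal sketch of the idea in TypeScript. Everything in it is hypothetical (the types and function names are invented for illustration, not taken from Bluesky's actual API): content carries labels, and each user decides what should happen when a given label is present.

```typescript
// Hypothetical sketch of label-based, user-controlled moderation.
// None of these names come from the real AT Protocol / Bluesky API.

type Action = "show" | "warn" | "hide";

interface Post {
  uri: string;
  text: string;
  labels: string[]; // e.g. ["spam"], ["trolling"], or []
}

// Each user chooses what happens for each label; anything they
// haven't configured falls back to their default action.
interface UserPreferences {
  defaultAction: Action;
  byLabel: Record<string, Action>;
}

function resolveAction(post: Post, prefs: UserPreferences): Action {
  let result: Action = "show";
  for (const label of post.labels) {
    const action = prefs.byLabel[label] ?? prefs.defaultAction;
    // The strictest applicable action wins: hide > warn > show.
    if (action === "hide") return "hide";
    if (action === "warn") result = "warn";
  }
  return result;
}

// Two users, same post, different outcomes: the rule lives with
// the user, not with a single central decision maker.
const post: Post = { uri: "at://example/post/1", text: "…", labels: ["trolling"] };

const tolerant: UserPreferences = { defaultAction: "show", byLabel: {} };
const strict: UserPreferences = { defaultAction: "show", byLabel: { trolling: "hide" } };

console.log(resolveAction(post, tolerant)); // "show"
console.log(resolveAction(post, strict));   // "hide"
```

The point of the sketch is simply that two people can look at the exact same post and get different outcomes, because the rule travels with the user rather than with the host.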
Some people may (for perfectly good reasons, bad reasons, or no reason at all) have no tolerance for any kind of ignorance. Others may be more open to it, perhaps hoping to guide ignorance toward knowledge. Just as an example, outside of the "hateful" space, we've talked before about things like "eating disorder" communities. One of the notable things there was that when those communities were on more mainstream services, people who had recovered from an eating disorder would often go back to those communities and provide help and support to those who needed it. When those communities were booted from the mainstream services, that became much more difficult: the communities grew angrier and more insular, and there was less ability for people to help those in need.
That is, there will still need to be some decision making at the protocol level (this is something that people who insist on "totally censorship-proof" systems seem to miss: if you do this, eventually the government is going to shut you down for hosting CSAM), but the more of the decision making that can be pushed to a different level, and the more control put in the hands of the user, the better.
This allows, first of all, for more competition to provide better moderation, but it also allows for variance in preferences, which is what you see in the simple version Bluesky has implemented. The biggest decisions can be made at the protocol level, but above that, let there be competing approaches and more user control. It's unclear exactly where Bluesky the service will come down in the end, but the early indications are that the service-level "Bluesky" will be more aggressive in moderating, while the protocol-level "AT Protocol" will be more open.
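One way to picture that layering, again as a purely illustrative sketch rather than anything Bluesky has actually built, is as an order of precedence: protocol-level removals can't be overridden by anyone downstream, the service applies its own labels, and user preferences shape what finally gets displayed.

```typescript
// Illustrative precedence for layered moderation decisions.
// Hypothetical names; not drawn from any real Bluesky / AT Protocol code.

type Verdict = "removed" | "hidden" | "warned" | "visible";

interface ModerationContext {
  removedAtProtocolLevel: boolean; // e.g. CSAM or copyright takedowns
  serviceLabels: string[];         // labels applied by the app/service
  userHides: Set<string>;          // labels this user chose to hide
  userWarns: Set<string>;          // labels this user chose to warn on
}

function decide(ctx: ModerationContext): Verdict {
  // 1. Protocol-level removals are absolute; no downstream layer can undo them.
  if (ctx.removedAtProtocolLevel) return "removed";

  // 2. User preferences apply on top of whatever the service labeled.
  if (ctx.serviceLabels.some((l) => ctx.userHides.has(l))) return "hidden";
  if (ctx.serviceLabels.some((l) => ctx.userWarns.has(l))) return "warned";

  // 3. Otherwise the post is shown.
  return "visible";
}

console.log(
  decide({
    removedAtProtocolLevel: false,
    serviceLabels: ["graphic-media"],
    userHides: new Set<string>(),
    userWarns: new Set(["graphic-media"]),
  })
); // "warned"
```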
And… that's probably how it should be. Even the worst people should be able to use a telephone or email. But enabling competition at the service level AND at the moderation level creates more of the vaunted "marketplace of ideas," where (unlike what some people think the marketplace of ideas is about), if you're regularly a disruptive, disingenuous, or malicious asshole, you are much more likely to get less (or possibly no) attention from the popular moderation services and algorithms. Those are the consequences of your own actions. But you don't get banned from the protocol.
To some extent, we've already seen this play out (in a slightly different form) with Mastodon. Truly awful sites like Gab, and ridiculously pathetic sites like Truth Social, both use the underlying ActivityPub protocol and open source Mastodon code, but they have been defederated from the rest of the fediverse. They still get to use the underlying technology, but they don't get to use it to be obnoxiously disruptive to the main userbase, which wants nothing to do with them.
With AT Protocol, and the concept of composable moderation, this can be taken even further. Rather than just having to choose your server and be subject to the whims of that server admin's moderation choices (or the pressure from other instances, which keeps many instances in check and aligned), the AT Protocol setup allows for a more granular and fluid system, with a lot more user empowerment, without having to resort to banning certain users from the technology entirely.
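In practice, that granularity might look something like independent labeling services that a user subscribes to, or ignores. The sketch below is hypothetical TypeScript (the interfaces and names are invented for illustration, not the AT Protocol's actual labeling API), but it captures the point: only the labelers you've chosen have any effect on your view, and swapping labelers doesn't require switching servers.

```typescript
// Hypothetical sketch of third-party labeling services a user can
// subscribe to (or not). All names here are invented for illustration.

interface LabelingService {
  id: string;
  // A labeler inspects a post and returns whatever labels it applies.
  label(postUri: string): string[];
}

// Two independent labelers with different priorities.
const antiHarassmentLabeler: LabelingService = {
  id: "labeler.example/anti-harassment",
  label: (uri) => (uri.endsWith("/99") ? ["harassment"] : []),
};

const spamLabeler: LabelingService = {
  id: "labeler.example/spam",
  label: () => [],
};

// Only labels from services the user actually subscribes to count,
// so changing your moderation "lens" never means changing servers.
function labelsFor(postUri: string, subscriptions: LabelingService[]): string[] {
  return subscriptions.flatMap((svc) => svc.label(postUri));
}

console.log(labelsFor("at://example/post/99", [antiHarassmentLabeler])); // ["harassment"]
console.log(labelsFor("at://example/post/99", [spamLabeler]));           // []
```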
This will never satisfy some people, who will continue to insist that the only way to stop a "bad" person is to ban them from basically any opportunity to use communications infrastructure. However, I disagree for multiple reasons. First, as noted above, outside of the worst of the worst, deciding who is "good" and who is "bad" is way more complicated, fraught, and subjective than people like to admit, and where and how you draw those lines will differ for almost everyone. And people who are quick to draw those lines should realize that… some other day, someone who dislikes you might be drawing those lines too. Second, as the eating disorder case study demonstrated, there's a lot more complexity and nuance here than many people believe.
That's why a decentralized solution is so much better than a centralized one. With a decentralized system, you don't have to worry about getting cut out yourself, either. Everyone gets to set their own rules, their own conditions, and their own preferences. And, if you're correct that the truly awful people are truly awful, then it's likely that most moderation tools and most servers will treat them as such, and you can rely on that, rather than having them cut off at the underlying protocol level.
It's also interesting to see how the decentralized social media protocol nostr is handling this. While it appears that some of the initial thinking behind it was that nothing should ever be taken down, many are recognizing how impossible that is, and they're now having really thoughtful discussions about "bottom up content moderation" specifically to avoid the "Nazi bar" problem.
Eventually in the process, thoughtful people recognize that a community needs some level of norms and rules. The question is how those norms are created, how they are implemented, and how (and by whom) they are enforced. A decentralized system allows for much greater control by end users, letting them have the systems and communities that more closely match their own preferences, rather than requiring a centralized authority to handle everything and somehow live up to everyone's expectations.
As such, you may end up with results like Mastodon/ActivityPub, where “Nazi bar” areas still form, but they are wholly separated from other users. Or you may end up with a result where the worst users are still there, shouting into the wind with no one bothering to listen, because no one wants to hear them. Or, possibly, it will be something else entirely as people experiment with new approaches enabled by a composable moderation system.
I'll add one other note on that, because when I've discussed this, people sometimes highlight that there are other kinds of risks beyond direct harassment, and that just blocking a user does not stop them from harassing, or encouraging or directing harassment against, someone else. This is absolutely true. But this kind of setup also allows for better tooling to monitor such behavior without the target having to be exposed to it directly. This could take the form of Block Party's "lockout folder," where a trusted third party reviews the harassing messages you've been receiving rather than you having to go through them yourself. Or, conceivably, other monitoring and warning services could pop up that track people who are doing awful things, try to keep them from succeeding, and alert the proper people if things require escalation.
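As a rough illustration of that kind of tooling (hypothetical TypeScript, with a toy classifier standing in for whatever a real service would use; this is not how Block Party actually works under the hood), messages that trip a harassment check get diverted into a queue for a trusted reviewer instead of landing in front of the person being targeted:

```typescript
// Hypothetical "lockout folder" style routing: suspected harassment is
// held for a trusted reviewer so the target never has to read it directly.

interface Message {
  from: string;
  text: string;
}

interface Inbox {
  visible: Message[];       // what the user actually sees
  heldForReview: Message[]; // what a trusted delegate reviews instead
}

// Toy stand-in for whatever classifier or blocklist a real tool would use.
function looksLikeHarassment(msg: Message): boolean {
  return /\byou people\b|\bgo back\b/i.test(msg.text);
}

function routeMessage(inbox: Inbox, msg: Message): void {
  if (looksLikeHarassment(msg)) {
    inbox.heldForReview.push(msg);
  } else {
    inbox.visible.push(msg);
  }
}

const inbox: Inbox = { visible: [], heldForReview: [] };
routeMessage(inbox, { from: "friend", text: "lunch tomorrow?" });
routeMessage(inbox, { from: "troll", text: "you people should just go back…" });

console.log(inbox.visible.length);       // 1
console.log(inbox.heldForReview.length); // 1
```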
In short, decentralizing things, allowing many different approaches, and building open systems and tooling doesn't solve every problem. But it presents some creative ways to handle the Nazi bar problem that seem likely to be a lot more effective than living in denial and staring blankly into the Zoom screen as a reporter asks you a fairly basic question about how you'll handle racist assholes on your platform.