Whenever we have managed Facebook pages for businesses or individuals that are targets of online reputation attacks, I’ve been struck by how odd it is that Facebook doesn’t provide an option to have all comments automatically held for moderation, so that one may review them and approve or disapprove them before they are published. It seems like a glaringly obvious need for a great many businesses, and I can see that others out there would also like this option.
I’ve developed a workaround that essentially provides the ability to set all your Facebook page comments to be automatically moderated, and we’re providing this solution for free — so read on!
Facebook’s reluctance to provide this feature seems obvious, intentional, and more than a bit presumptuously elitist. There’s a fairly strong activist philosophy in Silicon Valley called “Information Wants To Be Free” that many there share to varying degrees. I think that one aspect of this philosophy leads to a belief that all expression of opinion should be allowed anytime, anywhere, and adherents believe that social media should generally be unrestricted. (I’m not a total critic of these concepts, but I do note that there’s huge hypocrisy in Silicon Valley as to how corporations impose this philosophy upon others while applying a more conservative philosophy to themselves.)
Where Facebook is concerned, they’ve allowed companies to create pages for communicating with people on Facebook (in the case of local businesses, Facebook pages are automatically created, removing the companies’ option to not participate), but they haven’t provided full control of the pages to the companies involved: all pages allow visitors to add comments to the pages’ posts. This creates dilemmas in many cases. Some companies do not have sufficient budget for frequent monitoring of and responding to Facebook comments (in addition to the myriad other social media sites where online users congregate). In other cases, companies and organizations may by their nature be lightning rods for people posting snarky and malicious things, which makes managing a Facebook account costly. While the expression of opinion is a vital human right, a Facebook page is a strange mixture of an organization’s communications arm and a public forum owned by the audience. Companies may want to participate to a degree in order to communicate with their audience on Facebook, but not if it creates a highly visible billboard where hostile people are constantly beating them up.
Facebook’s lack of an option to allow organizations to review all comments prior to publication is a big irritation for many. Major brands would prefer to be able to de-risk potential damage to their online image and reputation if they’re going to expend resources in promoting themselves on Facebook and engaging with the public there. The only option has been to allow any and all crazies to add comments to pages willy-nilly; companies can hide them if they choose, but only after the fact, when a significant number of page visitors may already have seen them. This has resulted in distribution of spam, offensive materials, hate speech, trolling, harassment, defamation, misinformation, and things that just detract from a brand’s message. Facebook’s apparent refusal to provide comment moderation leaves companies with few choices: hire social media managers to spend time reviewing and moderating comments after the fact, or walk away from Facebook.
The Solution: Automatic Hiding Of All Facebook Comments
Facebook has two page management features that can be leveraged in combination to provide the ability to make all comments be automatically hidden until a page manager may review them and chooses to make them visible or not.
The first feature is the ability to hide visitor comments. When you mouse over a visitor’s comment on your page, a little “X” appears in the upper right side of the comment. By clicking on this tool, you can hide the comment so that no other visitors will be able to see the comment. What’s brilliant about this tool is that the person who posted the comment, and their friends on Facebook, will all still see the comment as though it was still visible on the page! This is cool, because if the commenter was a hater, they’re still off chuckling to themselves, thinking they’ve given you a black eye. (Once you’ve hidden a comment, you can still access it and there are additional tool links displayed below the comment which can allow you to completely delete the comment, report the comment to Facebook, or ban that visitor from your page.)
The second feature is the ability to automatically hide comments containing particular keywords you specify (under your page’s settings, called “Page Moderation”). This is a very strong feature if your organization has had some negative event or has specific topics negatively associated with it: add the words involved to your list of words to be blocked, and any comment submitted that contains one of those words will be automatically hidden unless and until you review it and unblock it.
The ability to block keywords gave me the idea: what if we blocked the entire dictionary of words?! Of course, that’d be quite a large list, and chances are that Facebook has some upper limit in place that would not allow one to use the entire dictionary. I looked, and indeed, Facebook limits the blocked keywords list to a maximum of 10,000 characters. But this is still a generously large amount of text.
In the past, I’ve studied and developed applications involving natural language processing and from that I knew that it’s not really necessary to add the entire dictionary of words. We only need to add the most-commonly-used English words, and this would cause the vast majority of comments to be automatically hidden for moderation.
I’ve developed this list, and you’re welcome to download and use it.
Right-click to save list to your computer, or click to load it in your browser:
Technical Details of the Moderate All Comments Keywords List
There are a number of lists out there of “most-used English words”. These are based upon what’s called a “corpus” or specific body of text, and they are more or less useful for a particular application, depending upon what you intend to use it for. The list I’ve developed was for American English — it’s likely still quite applicable for other English-speaking countries as well, although the relative popularity of words will differ some. Even so, I’d suspect that somewhere around 90% of all comments posted on Facebook pages would likely get caught with just the first 100 to 200 most-popular words in my list.
Just glance at the first ten words from the list to get an idea:
The list is provided in order of relative popularity, with the most-popular English words at the beginning. As you can see, some of these words are so popular that you’d be hard-put to find any Facebook comments that DIDN’T contain one of them.
When I searched for the most-common English words, I found Brigham Young University’s Corpus of Contemporary American English (“COCA”). This is probably one of the best sets of data for analyzing frequencies of words in the written language, because it’s large and uses current text samples from 1990 through 2015, so it is not based on older, unrepresentative language usage patterns of many decades ago. The COCA corpus contains more than 520 million words of text and the sources were equally divided between spoken, fiction, popular magazines, newspapers, and academic texts.
Of course, quite a bit of that originating data is from professionally written and edited source material like books, magazines and newspapers, so it’s actually not completely ideal for our purpose. It would be much better if we had a list of popular words drawn from a few years of Facebook users’ comments, because it’s very likely that the informal text in comments has different usage patterns than all of those formally written sources.
Even so, this doesn’t have to be “perfectly perfect” in order to function very effectively. We can expect that the most-common, simple parts of speech as shown in the list of the first ten words above will effectively catch a great majority of comments. And, we only have to have a single word from a user’s comment match up with our list in order to get it caught by our filter — and most comments have more than one word in them. So, the list we’ve developed is really a sort of overkill for halting comments from being published.
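To make the "one matching word is enough" point concrete, here is a minimal Python sketch that estimates what share of a set of comments would be caught by the first N words of a ranked list. Everything in it is an assumption for illustration: the function names are hypothetical, the ranked words are placeholder common words rather than the actual list, the sample comments are made up, and the matching rule (case-insensitive whole words) is only a guess at how Facebook's filter behaves.

```python
import re

def would_be_hidden(comment, blocked):
    """Return True if any whole word in the comment is in the blocked set.

    Assumption: case-insensitive, whole-word matching. Facebook's real
    matching rules are not documented here and may differ.
    """
    words = re.findall(r"[a-z']+", comment.lower())
    return any(w in blocked for w in words)

def coverage(comments, ranked_words, top_n):
    """Fraction of comments caught when blocking the first top_n ranked words."""
    blocked = set(ranked_words[:top_n])
    caught = sum(would_be_hidden(c, blocked) for c in comments)
    return caught / len(comments)

# Placeholder data: NOT the real keyword list and NOT real Facebook comments.
ranked_words = ["the", "and", "to", "a", "is", "you", "this"]
sample_comments = [
    "This place is terrible",
    "Love it!",
    "Worst service in the city",
    "Thanks to everyone who came",
]

for n in (1, 3, 7):
    print(n, coverage(sample_comments, ranked_words, n))
```

With this toy data the caught share rises from 0.25 to 0.5 to 0.75 as more words are blocked. Very short comments such as "Love it!" are the ones that slip through until the list gets longer.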
Despite the COCA being a pretty ideal data source, there are issues that made it insufficient in and of itself. First, the people who created the word frequency lists out of the COCA data require payment for their list (http://www.wordfrequency.info/purchase.asp). They also license it such that I could not provide you with a free version. They do provide a free list of the top 5,000 words which they will allow one to repost, but it’s a list of lemmas (the headwords or primary base forms of words) rather than a list of all raw words by usage. So, for instance, all the different inflections or tenses of a word are rolled up into one, rather than each one being listed separately. Verb forms like “was”, “are”, “am”, and “is” are all represented merely by “be” in that 5k list. Further, that list also has tons of duplicate words in it, because a word that has multiple definitions may be repeated for each different meaning. For example, the word “one” can be found three times in the first 1,000 words. Apparently it’s intended more for academic use than for what we need here. Even so, I did use a portion of that list.
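The practical problem with a lemma-only list can be shown with a small Python sketch. It assumes the filter matches whole words exactly, which is an assumption about Facebook's behavior rather than something confirmed here. The `matched_words` helper and the sample comment are hypothetical.

```python
import re

def matched_words(comment, blocked):
    """List the words in a comment that appear in the blocked set.

    Assumption: exact, case-insensitive whole-word matching.
    """
    words = re.findall(r"[a-z']+", comment.lower())
    return [w for w in words if w in blocked]

# A lemma list carries only the headword; a raw-word list carries each form.
lemma_list = {"be"}
raw_word_list = {"be", "was", "are", "am", "is"}

comment = "The food was cold and the staff are rude"  # made-up example

print(matched_words(comment, lemma_list))     # []
print(matched_words(comment, raw_word_list))  # ['was', 'are']
```

Under that assumption, the headword "be" alone never fires on this comment, while the list of actual forms does.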
The better list I found was a list of the top 1,000 most-common English words on GitHub, provided by David Norman. That list contains the individual inflected forms of words, such as those of the verb “to be”, and had only one or two duplicate words and maybe one mistake. But I wanted to take advantage of the full 10,000 character limit provided by Facebook for their keyword list field, and David’s list only took up half of that limit.
So, I combined David’s list with the top 5,000 list from WordFrequency.info. That required further processing. Both lists naturally overlap quite a lot. I eliminated the duplicates from the WordFrequency data, and also converted the list to the comma-delimited format required by Facebook.
After counting characters, I trimmed the final list to 1,525 words so that it fits within the limit.
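The combining, de-duplicating, and trimming steps can be sketched roughly as follows in Python. This is not the exact process I used, just an illustration under stated assumptions: the function name `build_keyword_list` is hypothetical, the first list is given priority in the ordering, entries are joined with a bare comma and no space, and the tiny word lists in the demo are placeholders.

```python
def build_keyword_list(primary, secondary, max_chars=10_000):
    """Merge two ranked word lists into one comma-delimited string.

    Words from `primary` come first, duplicates are dropped (keeping the
    first occurrence), and the result is cut at a whole-word boundary so
    that it never exceeds `max_chars` characters.
    """
    seen = set()
    merged = []
    for word in list(primary) + list(secondary):
        w = word.strip().lower()
        if w and w not in seen:
            seen.add(w)
            merged.append(w)

    kept = []
    length = 0
    for w in merged:
        extra = len(w) + (1 if kept else 0)  # +1 for the comma separator
        if length + extra > max_chars:
            break
        kept.append(w)
        length += extra
    return ",".join(kept)

# Placeholder inputs with a tiny limit so the trimming is visible.
primary = ["the", "of", "and", "was"]
secondary = ["the", "be", "and", "one", "one"]
result = build_keyword_list(primary, secondary, max_chars=20)
print(result)       # the,of,and,was,be
print(len(result))  # 17
```

Cutting at a word boundary, rather than slicing the string at the character limit, avoids ending the list with a truncated fragment of a word.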
This is a brute-force approach, and isn’t entirely ideal, of course. It’s a bit of a hack around Facebook’s lack of a proper moderation option for comments. The better option would be for Facebook to go ahead and provide that for their pages.
Part of me hopes that enough companies add this extensive list of keywords to their Facebook comment moderation that it becomes a processing burden for Facebook’s servers, and thereby convinces them to develop default comment moderation as a feature. (If they don’t want to go all the way, Facebook could allow an option to hold all comments in moderation for a week, to give companies an opportunity to review, hide, or delete them before they are made public.) I’m sure Facebook developers didn’t imagine or prepare for large numbers of page managers opting to use the full 10k allowance for keyword filtration, and if sufficient numbers of pages set this up it might impact server processes. I don’t wish to annoy Facebook, but I do hope they provide page managers with a bit more editorial control over what gets published on their pages.