Crafting Effective Content Policies: A Conversation with Sabrina (Pascoe) Puls, Director of Trust & Safety
Keeping users safe is a complex task for all online platforms. In this blog post, we examine the critical role content policies play in ensuring online safety. And to help us do that, we’ll hear from Sabrina (Pascoe) Puls, TrustLab’s Director of Trust and Safety. Sabrina explains how content policies work behind the scenes to help protect users and platforms by preventing online harms like misinformation, hate speech, and more. Sabrina also reveals the often overlooked challenges that come with developing and enforcing these rules and provides suggestions for avoiding these pitfalls.
What exactly is Content Policy, and what role does it play in online safety / content moderation?
Content Policies are essentially the rules of the road for digital platforms. They tell users what they can and cannot do on your platform, and they also advise internal teams on how to respond if a user violates one of your policies (e.g. removing content, banning a user from your platform, or even corresponding with law enforcement, depending on the severity of the violation).
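To make that mapping from severity to response a bit more concrete, here is a minimal, hypothetical sketch in Python. The categories, severity levels, and action names are purely illustrative assumptions, not TrustLab's or any specific platform's taxonomy.

```python
from enum import Enum

class Severity(Enum):
    LOW = 1       # e.g. borderline or first-time violations
    MEDIUM = 2    # e.g. clear policy violations
    HIGH = 3      # e.g. severe harm, potential legal obligations

# Hypothetical mapping from severity to enforcement actions.
# Real taxonomies are far more granular and policy-specific.
ENFORCEMENT_ACTIONS = {
    Severity.LOW: ["warn_user", "reduce_distribution"],
    Severity.MEDIUM: ["remove_content", "strike_account"],
    Severity.HIGH: ["remove_content", "ban_account", "escalate_to_law_enforcement"],
}

def actions_for(violation_severity: Severity) -> list[str]:
    """Return the enforcement actions a reviewer or automated system might apply."""
    return ENFORCEMENT_ACTIONS[violation_severity]

print(actions_for(Severity.HIGH))
# ['remove_content', 'ban_account', 'escalate_to_law_enforcement']
```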
How Are Content Policies Different from Community Guidelines?
Content policies and Community Guidelines are more or less synonymous. One thing companies can struggle with is determining how transparent to be in their Community Guidelines. I think generally platforms and Trust & Safety teams want to be as transparent as possible with their users, but we also need to make sure that we’re not sharing so much information or detail that it’s easy to evade enforcement of these policies or circumvent measures we’ve put in place to keep users safe. Fraud is a great example. If you’re too transparent about how you identify fraud on your platform, bad actors will just pivot and try something else.
Who creates Content Policies?
It kind of depends. Often when platforms are just starting out, there aren’t necessarily Trust & Safety teams or policy writers in place to advise or guide the company on how to create effective content policies. It may not seem like a major issue at the time, but not having well-written policies in place can lead to a variety of issues down the road…
- User Experience: Not having clear content policies in place can negatively impact user experience. If users don’t know why their content is being moderated the way that it is, they might jump to a competing, more transparent platform.
- Regulatory Compliance: There could be legal implications to not having well-articulated content policies. If you operate or have users within the EU, you're beholden to the Digital Services Act, which mandates published content policies and metrics associated with content moderation.
- Data Insights: Speaking of metrics, not having well-formulated content policies could really impact your ability to anticipate and mitigate various forms of online and offline harm. Prevalence (the share of sampled content or views that violates a given policy), for example, can help your team stay abreast of new abuse trends online and know what to prioritize and where to invest more resources; there's a quick sketch of how prevalence can be estimated after this list.
- Financial Implications: Lastly, not having well-structured content policies can really impact a company's bottom line. Content policy experts know how to develop policies in a way that supports automation, which can save newer platforms millions of dollars per year on manual review.
There are plenty of other potential implications, but I think these are the major ones for platforms to consider.
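As a small illustration of the Data Insights point above, here is a minimal sketch of how a team might estimate prevalence from a labeled random sample, with a simple confidence interval. The sample data and violation rate are made up for illustration; real measurement programs involve careful sampling design and reviewer quality controls.

```python
import math
import random

# Hypothetical labeled sample: True means a reviewer found a policy violation.
# In practice this comes from randomly sampling content (or views) and labeling it.
random.seed(42)
reviewed_sample = [random.random() < 0.03 for _ in range(2_000)]

def estimate_prevalence(labels: list[bool], z: float = 1.96) -> tuple[float, float, float]:
    """Return (prevalence, lower bound, upper bound) using a normal-approximation interval."""
    n = len(labels)
    p = sum(labels) / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - margin), min(1.0, p + margin)

prevalence, low, high = estimate_prevalence(reviewed_sample)
print(f"Estimated prevalence: {prevalence:.2%} (95% CI: {low:.2%} to {high:.2%})")
```

Tracking this number per policy area over time is what lets a team spot emerging abuse trends and decide where to invest.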
Are all Content Policies made the same?
Definitely not! And that's one of the things I enjoy most about working at TrustLab! We can either develop solutions based on your existing policies, or our Policy team can help you create custom policies from scratch. This is important because off-the-shelf classifiers often aren't able to take the nuances of your specific platform into consideration. For example, if you're an online dating platform, you might take a very different approach to moderating Adult & Sexual content than an online marketplace focused on the provision of goods and services. In the latter case, one user is hiring another, and sexual content policies need to be a lot more conservative.
What are some of the most common types of content policies that platforms typically need to develop?
Broadly speaking, there are a handful of abuse areas that most platforms either outright forbid or censor to some degree in order to keep users safe. Examples include:
- Policies around derogatory content or hate speech. These policies are created to protect users from being targeted based on a protected class, like race, ethnicity, gender, sexual orientation, etc. But context plays a really important role here. Marginalized communities may reclaim terms that were once considered pejorative, and you have to create these policies and enforcement mechanisms in such a way that you're not accidentally silencing the communities you set out to protect.
- Another example is Harassment and Violence policies. Depending on your platform, your policies around violent content could look extremely different. If you're hosting lots of fictional content (e.g. video games, movies, TV shows), you'll likely have a much more lenient stance on violent imagery than a dating app that may be using these signals to identify potential bad actors. Another issue to consider is real-world violence. In general, users have varying opinions about the appropriateness of sharing images or videos of real-world harm in order to raise awareness of injustice or war. You have to decide as a platform what your stance is going to be on these issues.
- An area that can be particularly challenging is Misinformation. As you can probably tell just by watching the news, people have very strong and disparate opinions on misinformation. Over the last five years or so, many tech company executives have been asked to explain to Congress how they moderate misinformation on their platforms, with Republican members feeling these platforms are doing too much and Democratic representatives arguing that they're not doing enough. We could probably write an entire article on Misinformation alone, but to keep it brief, I'll just say that most platforms tend to focus on moderating a few types of misinformation: medical misinformation, election misinformation, climate change misinformation, and denial of significant historical events. One last thing to note is that moderating misinformation has only become more challenging with advancements in generative AI.
- Other categories to be mindful of include Adult & Sexual content and Fraud (which we discussed a bit earlier), as well as Sexual Exploitation & Abuse. If you haven’t already established a relationship with the National Center for Missing & Exploited Children (also known as NCMEC), I would definitely prioritize that.
What are some common pitfalls during the policy creation and launch process?
The best advice I can give in this regard is to identify all your key stakeholders from the beginning and actively solicit and incorporate their feedback. Believe me, it’s not fun spending weeks or months on research and development, just to realize that what you’re suggesting may not be technically or operationally feasible. This can look different depending on your company, but in general I would recommend meeting with your Legal team, Product and Engineering, and PR/communications.
What strategies have you found most effective for acquiring cross-functional buy-in, particularly from stakeholders who may have conflicting priorities?
Definitely create a Policy Standard Operating Procedure (SOP) before you get into content policy creation. A Policy SOP essentially outlines each phase of the policy process (from research and development all the way to quality assurance) and assigns varying levels of responsibility to different stakeholders, such as Legal, Product/Eng, etc. Getting cross-functional buy-in on the process itself will make getting alignment on specific content policies ten times easier down the road.
With the growing reliance on automation for content moderation, how can we address potential biases in AI systems?
I would say one of the first things a platform should do before implementing automated content moderation is to establish (i) your ethical principles around AI and (ii) a system that holds your team accountable for these principles. Some examples of things to consider include:
- Privacy - are you compliant with relevant legislation like GDPR or the California Consumer Privacy Act (CCPA)?
- Fairness - does policy enforcement remain consistent across various social groups? (There's a small illustrative check after this list.)
- Explainability - are you able to explain why the model made the decision it did or have you fallen into a bit of a “black box” scenario?
- Transparency - are your users aware of your platform’s use of AI and how it might impact their experience?
- Accountability - who within your company is responsible for the ongoing oversight of AI systems? Are users able to appeal decisions made by models that they disagree with?
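As one way to make the Fairness question measurable, here is a minimal sketch that compares enforcement rates across hypothetical user groups and flags large disparities. The data, group labels, and the 1.5x disparity threshold are all illustrative assumptions; a real fairness audit also has to account for label quality, appeal outcomes, and privacy-safe handling of sensitive attributes.

```python
from collections import defaultdict

# Hypothetical moderation decisions: (user_group, was_enforced).
# In practice, group labels must be collected and handled under strict privacy controls.
decisions = [
    ("group_a", True), ("group_a", False), ("group_a", False), ("group_a", False),
    ("group_b", True), ("group_b", True), ("group_b", False), ("group_b", False),
]

def enforcement_rates(records):
    """Compute the share of content enforced against, per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [enforced, total]
    for group, enforced in records:
        counts[group][0] += int(enforced)
        counts[group][1] += 1
    return {group: enforced / total for group, (enforced, total) in counts.items()}

rates = enforcement_rates(decisions)
baseline = min(rates.values())
for group, rate in rates.items():
    # Flag groups whose enforcement rate is more than 1.5x the lowest observed rate.
    flag = "REVIEW" if baseline > 0 and rate / baseline > 1.5 else "ok"
    print(f"{group}: enforcement rate {rate:.0%} [{flag}]")
```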
What future trends do you foresee in the realm of content moderation and Trust & Safety? How should companies prepare to adapt their policies accordingly?
A newer trend is definitely leveraging generative AI to create classifiers. Content policy writers will need to adapt their writing styles to help product and engineering teams develop prompts, not just traditional policies or guidelines; a rough sketch of what that can look like is below.
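As a rough illustration of "policies as prompts," the sketch below embeds the written policy text directly in a classification prompt sent to a language model. The policy wording is purely illustrative, and `call_llm` is a placeholder for whatever model API your platform actually uses, not a real library call.

```python
# Hypothetical example of turning a written content policy into a classification prompt.

HATE_SPEECH_POLICY = """
Content is violating if it attacks or demeans a person or group on the basis of a
protected attribute (e.g. race, ethnicity, gender, sexual orientation).
Reclaimed or self-referential uses by members of the targeted group are not violating.
"""

PROMPT_TEMPLATE = """You are a content moderation assistant.
Policy:
{policy}

Classify the following post as VIOLATING or NON_VIOLATING, and give a one-sentence reason.
Post: {post}
Answer:"""

def classify(post: str, call_llm) -> str:
    """Build the prompt from the written policy and return the model's raw answer."""
    prompt = PROMPT_TEMPLATE.format(policy=HATE_SPEECH_POLICY.strip(), post=post)
    return call_llm(prompt)

# Example usage with a stubbed model call:
if __name__ == "__main__":
    fake_llm = lambda prompt: "NON_VIOLATING - the post does not target a protected group."
    print(classify("I love this new video game!", fake_llm))
```

Notice that the quality of the classifier now depends heavily on how precisely the policy itself is written, which is exactly why policy writers and engineering teams need to collaborate on prompt development.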