Content vs. Actor: Looking Into Online Safety Signals


In the realm of online safety, Trust and Safety teams face a dual challenge: They must keep tabs on both bad content and the bad actors who spread it.

Keeping tabs on these two types of signals is the yin and yang of online safety. You can't have one without the other if you want to create a truly safe digital space.

However, one of these signal types presents a greater challenge and holds more strategic importance than the other: actor-level signals.

In this deep dive, we'll break down the differences between content-level and actor-level signals, and we'll even peek into the potential of cross-platform signals.

We'll also discuss how the future of online safety lies in integrated approaches. This isn't just about monitoring different signals simultaneously - it's about fostering unprecedented collaboration between humans and AI.

Let’s get started. 

Content-Level Signals

Content-level signals focus on specific pieces of content, rather than the users who share them.

This might be what most people think of when they think about “content moderation” and the work of Trust & Safety teams: analyzing text, images, video, or other pieces of content, and assessing whether that content complies with the platform's policies, along with local and international regulations.

In short, focusing on content-level signals means determining whether any given piece of content has the potential to cause harm to users.

Content-Level Signals: Tools & Techniques

To manage content-level signals, most platforms use automated systems or human moderation - or some combination of the two.

Automated (often off-the-shelf) systems can quickly process huge amounts of content and flag any potentially harmful content, based on a general set of industry standards.

Human moderators can be more precise, handling nuance and platform-specific policies (what one platform deems harmful to its users may differ from what another does).

Off-the-shelf solutions often produce false positives and struggle to pick up on nuance, requiring extra engineering work and constant tweaking.

Hiring external human moderation teams (BPOs) gets more expensive as a platform's user base and content volume grow.
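To make that division of labor concrete, here is a minimal sketch of a triage step that routes content based on an automated harm score. The classifier score, thresholds, and action names are hypothetical placeholders for illustration, not a reference to any specific vendor or tool.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    REMOVE = "remove"          # clear policy violation: remove automatically
    HUMAN_REVIEW = "review"    # uncertain or nuanced: escalate to a moderator
    ALLOW = "allow"            # low risk: publish without review


@dataclass
class ContentItem:
    content_id: str
    text: str


def triage(item: ContentItem, harm_score: float,
           remove_threshold: float = 0.95,
           review_threshold: float = 0.60) -> Action:
    """Route a piece of content based on an automated harm score (0.0-1.0).

    The thresholds here are illustrative: each platform tunes them against
    its own policies and its tolerance for false positives.
    """
    if harm_score >= remove_threshold:
        return Action.REMOVE
    if harm_score >= review_threshold:
        return Action.HUMAN_REVIEW
    return Action.ALLOW


# Example: a 0.7 score is not confident enough to auto-remove,
# so the item is queued for a human moderator instead.
print(triage(ContentItem("c-123", "..."), harm_score=0.7))  # Action.HUMAN_REVIEW
```

The middle band is where the cost trade-off described above lives: the wider the human-review range, the more accurate (and expensive) moderation becomes.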

In our 2024 Online Safety Report, we found that striking the right balance between these two approaches remains a challenge for Trust & Safety teams.

No tool can do it all, and the appetite to invest in Trust & Safety is often slim.

Looking Beyond Content-Level Signals

Monitoring content-level signals is an essential part of daily moderation. 

It means that any potentially harmful content can be removed – or restricted in reach – before it has the chance to cause any real damage.

The problem is that looking at content-level signals means focusing on the content itself, rather than the users posting the content.

Users who post harmful content are often repeat offenders. 

A recent study into prevalent harmful content on Instagram, TikTok, and Pinterest identified “a core number of power users” responsible for creating and disseminating some of the most harmful material on the platforms.

Are “power users” like this going to be put off by sustained efforts to moderate their content? Or are they just going to double down, and constantly seek out new ways to exploit the system? 

This is why Trust & Safety teams cannot rely solely on content-level signals.

Content-level analysis does not address recurrent problematic behavior by users, the very pattern that could be used to address issues pre-emptively.

This is where actor-level signals come into play.

Actor-Level Signals

Actor-level signals focus on the way users behave on platforms, rather than on the specific content they post.

By monitoring a user’s posting history and their patterns of interaction on a platform, T&S teams can identify users who might repeatedly engage in harmful behavior.

Actor-level signals also allow T&S teams to identify content that might seem innocuous, but which could indicate a wider pattern of abuse. 

For example, an automated content moderation system might not see any harm in a message that reads “what school did you go to?” Yet this seemingly innocent message could form part of a long-term grooming process.

It can be obvious whether or not any given piece of content is “harmful”. Actor-level signals, by contrast, can be far more subtle.

It’s the standard moderation paradox: The constant struggle to protect users online without impinging on their freedom of expression.

Integrating Content and Actor Signals for Improved Online Safety

Focusing entirely on content-level signals turns moderation into an endless game of whack-a-mole. 

No matter what you do to clamp down on harmful content, there will always be bad actors on your platforms looking for ways to exploit your systems. Plus, some forms of online abuse are so subtle that automated systems may overlook them completely.

Yet through balancing content and actor-level signals, Trust & Safety teams can work towards achieving far-reaching safety on their platforms:

  • Monitoring content-level signals allows T&S teams to remove harmful content that has already been posted.
  • Monitoring actor-level signals allows T&S teams to prevent online abuse from occurring before it’s too late.

Advanced moderation tools can analyze multiple data points and behavioral indicators to help T&S teams spot patterns and correlations within user data. This means they can identify possible signs of abuse that would otherwise have gone completely unnoticed by their automated classifiers.

A single, isolated risk indicator might be just that – an isolated incident. It’s only when this single indicator is compared with a user’s wider behavior that T&S teams can determine whether this user poses a risk.

When assessing actor-level signals, T&S teams might set thresholds for harmful behavior: If a user gets a certain number of “strikes”, their profile can be flagged, and the moderators can take appropriate action.
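As a simplified sketch of what such a threshold system might look like in practice, consider the snippet below. The violation types, strike weights, and flag threshold are invented for illustration; a real policy would be far more granular and would feed into a moderator review queue rather than a print statement.

```python
from collections import defaultdict

# Illustrative weights per violation type (hypothetical values).
STRIKE_WEIGHTS = {
    "spam": 1,
    "harassment": 2,
    "grooming_indicator": 3,
}

FLAG_THRESHOLD = 5  # hypothetical: flag the account for review at 5+ points


class ActorSignalTracker:
    """Accumulates weighted strikes per user and flags accounts over a threshold."""

    def __init__(self):
        self.strikes = defaultdict(int)

    def record_violation(self, user_id: str, violation_type: str) -> bool:
        """Record a violation and return True if the account should be flagged."""
        self.strikes[user_id] += STRIKE_WEIGHTS.get(violation_type, 1)
        return self.strikes[user_id] >= FLAG_THRESHOLD


tracker = ActorSignalTracker()
tracker.record_violation("user-42", "spam")                           # 1 point, no flag
tracker.record_violation("user-42", "harassment")                     # 3 points, no flag
flagged = tracker.record_violation("user-42", "grooming_indicator")   # 6 points
print(flagged)  # True: profile is surfaced for moderator review
```

The point of weighting and accumulating signals is exactly the one made above: a single low-severity incident stays below the threshold, while a pattern of behavior crosses it.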

A focus on specific signals allows for a deep analysis of user behavior without impinging too much on user privacy.

It’s less a case of monitoring every single action on the platform and more a case of taking a closer look once certain flags have been raised.

Cross-Platform Actor Signals

So, monitoring content-level signals allows T&S teams to remove harmful content that’s already out there. 

Monitoring actor-level signals is a more proactive approach to online safety. It’s all about discovering and addressing deeper abuse patterns from persistent bad actors.  

Yet bad actors do not just limit themselves to a single platform. 

A predator might leverage personal details they obtained on one platform in order to groom or extort a user on a different platform, for instance.

It’s for this reason that a few participants in our Online Safety Report expressed a strong desire for a system for monitoring actor-level signals across platforms. 

This way, they could identify potentially problematic users even before they entered their platform. Or, they could take pre-emptive action against a user on their platform for harmful behavior they had exhibited on a different platform.

Although these signals do not, in themselves, constitute proof of abuse, they can act as red flags that lead to further investigation and, if necessary, allow T&S teams to take pre-emptive action.
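Conceptually, one way such cross-platform sharing could work is for platforms to exchange only salted hashes of actioned account identifiers rather than raw user data. The sketch below is purely illustrative and assumes a hypothetical shared registry; it does not describe any existing system, and the salt and identifiers are placeholders.

```python
import hashlib

# Hypothetical salt agreed between participating platforms, so the same
# identifier hashes to the same value everywhere without being shared raw.
SHARED_SALT = b"example-consortium-salt"


def hash_identifier(identifier: str) -> str:
    """Hash an account identifier (e.g. a verified email) before sharing it."""
    return hashlib.sha256(SHARED_SALT + identifier.encode("utf-8")).hexdigest()


# Hypothetical registry of hashes submitted by other platforms for accounts
# they have actioned for severe abuse.
cross_platform_registry = {
    hash_identifier("banned-user@example.com"),
}


def is_flagged_elsewhere(identifier: str) -> bool:
    """Check a new sign-up against the shared registry. A hit is a red flag
    that warrants further review, not automatic proof of abuse."""
    return hash_identifier(identifier) in cross_platform_registry


print(is_flagged_elsewhere("banned-user@example.com"))  # True  -> review
print(is_flagged_elsewhere("new-user@example.com"))     # False
```

A hit in such a registry would only ever be a starting point for review, consistent with the caveat above that these signals are red flags rather than proof.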

The Future of Online Safety Lies in Integrated Approaches

Content moderation requires an integrated approach. 

This means balancing the swiftness and efficiency offered by automation tools with the insights and nuances that can only come from human input.

It also means that T&S teams can no longer get by through monitoring content alone. 

They must also monitor actor-level signals so as to address deeper patterns of abuse from repeat offenders.

As the landscape of online safety evolves, Trust & Safety teams need innovative solutions that can keep pace with emerging challenges. 

This is where cutting-edge tools like ModerateAI come into play, offering a powerful blend of AI efficiency and human expertise.

Developed by industry veterans from Google, YouTube, and Reddit, ModerateAI simplifies complex workflows, enhances quality, and reduces costs – all while keeping your users safer than ever before.

Join the beta waitlist for ModerateAI here! 

Watch the ModerateAI demo: 

Meet the Author

Carmo Braga da Costa

Head of Content at TrustLab
