Video and Audio Moderation Software

Intel’s Bleep lets gamers filter video game audio in real time to screen out hate speech. Players can set sliders to hear all, some or none of what Intel classifies as sexism, racism, misogyny and white nationalism.

Modulate’s ToxMod, meanwhile, is billed as the world’s only full-coverage voice moderation software. It aims to understand the nuances of voice, going beyond transcription to consider emotion, speech acts and listener responses.


Voice Moderation

Voice moderation involves understanding the nuances of human speech. It’s not just the words you hear, but how they are said: the speaker’s tone, their mood, and the context of their relationship with the listener. It’s an incredibly complex and challenging task, but it must be done accurately and quickly.
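
To make that multi-signal idea concrete, here is a minimal sketch that blends a transcript-level toxicity score with tone-of-voice and listener-reaction scores. The signal names, weights and thresholds are all hypothetical illustrations, not taken from any particular product.

```python
from dataclasses import dataclass

@dataclass
class VoiceClipSignals:
    """Signals a voice moderation system might extract from one clip."""
    transcript_toxicity: float  # 0.0-1.0, from a text classifier on the transcript
    vocal_aggression: float     # 0.0-1.0, from a tone/emotion model on the raw audio
    listener_distress: float    # 0.0-1.0, e.g. inferred from other speakers' reactions

def risk_score(signals: VoiceClipSignals) -> float:
    """Blend text and audio signals into one risk score (illustrative weights)."""
    return (0.5 * signals.transcript_toxicity
            + 0.3 * signals.vocal_aggression
            + 0.2 * signals.listener_distress)

# Identical words, very different risk: trash talk said playfully vs. screamed.
playful = VoiceClipSignals(transcript_toxicity=0.4, vocal_aggression=0.1, listener_distress=0.1)
hostile = VoiceClipSignals(transcript_toxicity=0.4, vocal_aggression=0.9, listener_distress=0.8)
print(risk_score(playful))  # 0.25 -> likely fine
print(risk_score(hostile))  # 0.63 -> worth escalating
```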

With the advent of audio- and video-centric social apps (TikTok, Instagram’s Reels, Discord’s Stage Channels) and virtual gaming platforms, users have begun to organize into communities that use both text and audio chats. Many of these platforms also let users create their own servers for group discussion or for broadcasting live video streams.

But despite these new tools, the technology currently available for voice moderation is lagging far behind what’s available for text-based social media. Most systems focus on keywords or other text-based methods that are difficult to apply to voice. ToxMod, developed by Modulate, goes beyond these limitations with a proactive solution that independently catches toxicity as it emerges, allowing game companies to keep players safe and build more positive communities.

Audio Scanning

Unlike images and text, audio files can conceal NSFW threats in many ways. They are also harder to moderate because they require consideration of tonality, background noise, accents and languages. These risks can hurt brands by exposing them to legal issues, bruised reputations and degraded user experiences.

As with text moderation, the process starts with an AI tool that transcribes the audio file’s content (in near, but not quite, real time). The transcription is then run through WebPurify’s profanity and intent filters to ensure that any known NSFW phrases are caught and blocked.

Any other submissions that aren’t clear-cut or may have double meanings are escalated to a human for review. This optimizes SLAs, reduces human moderators’ workload and still allows close inspection of nuanced content that would be challenging for AI to understand on its own. The process is particularly important for music, speech and podcasts, where copyright infringement is a significant risk.
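
As a rough sketch of that transcribe–filter–escalate flow: the `transcribe` function, word lists and routing labels below are hypothetical placeholders, not WebPurify’s actual API.

```python
# Illustrative transcribe -> filter -> escalate pipeline for audio moderation.
KNOWN_NSFW = {"slur1", "slur2"}    # stand-in for a maintained blocklist
AMBIGUOUS = {"shoot", "screw"}     # words whose intent depends on context

def transcribe(audio_path: str) -> str:
    """Placeholder for a speech-to-text step (e.g. an ASR model or API)."""
    raise NotImplementedError

def moderate_audio(audio_path: str) -> str:
    words = set(transcribe(audio_path).lower().split())
    if words & KNOWN_NSFW:
        return "blocked"        # clear-cut violation: block automatically
    if words & AMBIGUOUS:
        return "human_review"   # double meanings: escalate to a moderator
    return "approved"
```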

Video Scanning

In addition to detecting explicit visual content, video moderation software also scans for offensive speech patterns and can flag videos that violate your terms of service. This enables staff to quickly identify and take action on inappropriate content without needing to review each individual frame of a video.
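
In practice, a scanner typically samples frames and feeds them to a classifier rather than having staff inspect every frame. Here is a minimal sketch using OpenCV for frame extraction, with the image classifier left as a hypothetical placeholder:

```python
import cv2  # OpenCV, for reading frames from a video file

def classify_frame(frame) -> float:
    """Placeholder for an image model returning a 0.0-1.0 NSFW probability."""
    raise NotImplementedError

def flag_video(path: str, sample_every: int = 30, threshold: float = 0.8) -> bool:
    """Sample one frame in every `sample_every`; flag if any frame looks NSFW."""
    capture = cv2.VideoCapture(path)
    index = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            if index % sample_every == 0 and classify_frame(frame) >= threshold:
                return True
            index += 1
    finally:
        capture.release()
    return False
```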

Unitary’s audio, image and text moderation algorithms can automatically detect the most objectionable content from over 25,000 frames of video per second, with human-level accuracy. The solution also identifies users who have violated your terms of service in the past, so future posts can be flagged for review more quickly.

In interlaced scanning, each frame is split into odd and even lines (two fields), which are transmitted alternately for 1/60 of a second each. Because only half the lines refresh at a time, the alternation between fields can produce visible flicker. Progressive scanning draws every line of each frame in sequence, which eliminates that flicker, though at the cost of roughly twice the bandwidth for the same resolution and frame rate.
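
The bandwidth trade-off falls out of simple line counting; here is the arithmetic for standard-definition numbers (480 visible lines, 60 fields or frames per second):

```python
# Lines transmitted per second, standard-definition example.
lines_per_frame = 480

# Interlaced (480i): 60 fields/s, each carrying half the lines.
interlaced = 60 * (lines_per_frame // 2)  # 14,400 lines/s

# Progressive (480p60): 60 full frames/s.
progressive = 60 * lines_per_frame        # 28,800 lines/s

print(progressive / interlaced)           # 2.0 -> double the line rate
```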

Multimedia Moderation

With apps like TikTok, Instagram’s Reels and Discord’s Stage Channels letting users connect with one another through audio, video and text chats, brands need to be prepared to moderate these spaces. Moderation tools for these platforms lag behind those for text-based chat, so a clear, concise set of guidelines is critical for maintaining a safe environment.

Creating these guidelines is challenging, as a single word or phrase can carry different meanings for people of different backgrounds and cultures. Whether something counts as hate speech, for example, can depend on context and tone.

This is where AI can be useful: it can scan the words in a conversation and flag any that appear offensive. Letting the machine handle the more obvious violations saves moderators time, so they can focus on more complex and nuanced cases. That is why brands seeking a content moderation partner should look for an organization that takes a hybrid approach, understanding the power of technology while respecting and supporting its human moderators.
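
One common way to implement that hybrid split is confidence-based routing: the model acts alone only when it is very sure, and everything in between goes to a person. The thresholds below are illustrative assumptions, not a standard.

```python
def route(model_confidence_offensive: float) -> str:
    """Route a flagged message by model confidence (illustrative thresholds)."""
    if model_confidence_offensive >= 0.95:
        return "auto_remove"   # obvious violation: machine handles it
    if model_confidence_offensive <= 0.05:
        return "auto_approve"  # obviously benign: no human time spent
    return "human_review"      # nuanced middle ground: escalate to a moderator
```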
