Data denoising

How Chattermill cleans certain data sources to make them easier to work with

Written by Mikhail Dubov
Updated over a week ago

Some CX datasets, notably Social Media and Support Conversations, can contain a lot of noise that is not relevant to analysis of the data. Examples of noise are:

  • Canned, automated messages on Support channels

  • Social posts that do not contain any valid Customer Feedback (such as News articles and miscellaneous mentions of your company name)

Chattermill's denoising service helps you focus on the most important parts of every conversation, improving your support capabilities and overall data quality.

This feature is deployed at the start of our overall Lyra AI data pipeline. It filters out irrelevant content and keeps only the meaningful parts of your conversations, so the data you analyse is as accurate and useful as possible.

How Chattermill denoising works

Our denoising service scans your Support and Social interactions and identifies the essential parts of the conversation: the active voice of your customers. It removes noise such as:

Support Data:

  • Email signatures

  • Duplicate messages

  • Irrelevant details

  • Canned messages

Social Data:

  • Miscellaneous mentions of your company

  • News Articles / Spam containing no active Voice of Customer feedback

  • Comments containing only a social handle (e.g. @Acme)

We built a set of proprietary AI models that work across Support data (chats, emails, and tickets in a customer support context) and Social data (covering most of the popular Social channels, including Twitter, Instagram, TikTok, Facebook, and Reddit, among others).

Our models handle each response differently depending on the type of data. For example, if we are analyzing Support data, our models identify the specific segment of the response that contains the active Voice of Customer feedback and disregard all other segments, such as email signatures, out-of-office replies, and date and time stamps.
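To make the idea of keeping one segment and discarding the rest concrete, here is a minimal Python sketch. It is not how Chattermill's service works: the real service uses proprietary AI models, whereas this example swaps in a few simple hand-written rules. The function name `extract_customer_voice`, the signature markers, and the timestamp pattern are all assumptions made up for illustration.

```python
import re

# Hypothetical rules for illustration only. The actual service relies on
# proprietary AI models, not on patterns like these.
SIGNATURE_START = re.compile(
    r"^(--\s*$|best regards|kind regards|sent from my)", re.IGNORECASE
)
TIMESTAMP_LINE = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}")


def extract_customer_voice(message: str) -> dict:
    """Return the original message alongside a rule-based denoised version."""
    kept = []
    for line in message.splitlines():
        stripped = line.strip()
        if SIGNATURE_START.match(stripped):
            # Treat everything from the signature onwards as noise.
            break
        if not stripped or TIMESTAMP_LINE.match(stripped):
            # Skip blank lines and lines that start with a date and time stamp.
            continue
        kept.append(stripped)
    # Keep the original text so nothing is lost by denoising.
    return {"original": message, "denoised": " ".join(kept)}


email = (
    "2024-01-05 10:32\n"
    "My order arrived damaged.\n"
    "Please send a replacement.\n"
    "\n"
    "Best regards,\n"
    "Jane"
)
result = extract_customer_voice(email)
print(result["denoised"])
```

In this sketch the timestamp line and the signature are dropped, while the two sentences of actual feedback are kept and the untouched original is returned next to them.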

For Social data, we analyse each response in its entirety and assign it a probability score of being noisy. We then review the data and set an appropriate threshold for filtering out noisy responses.

In both approaches, the original responses are preserved should you need to access them.
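The threshold step for Social data can be pictured with a short Python sketch. Everything in it is assumed for illustration: the `ScoredComment` type, the `split_by_threshold` function, the example scores, and the 0.8 threshold are hypothetical, and the scores would in practice come from a model rather than being typed in by hand. Filtered comments are returned rather than deleted, mirroring the point that original responses are preserved.

```python
from dataclasses import dataclass


@dataclass
class ScoredComment:
    text: str
    # Assumed to be produced by an upstream model, in the range [0, 1].
    noise_probability: float


def split_by_threshold(comments, threshold=0.8):
    """Split comments into (kept, filtered) using a noise threshold.

    The 0.8 default is an arbitrary illustrative value, not a real setting.
    """
    kept = [c for c in comments if c.noise_probability < threshold]
    filtered = [c for c in comments if c.noise_probability >= threshold]
    return kept, filtered


comments = [
    ScoredComment("The new app update keeps crashing on login", 0.05),
    ScoredComment("@Acme", 0.97),
    ScoredComment("Acme announces quarterly results", 0.60),
]
kept, filtered = split_by_threshold(comments, threshold=0.8)
print([c.text for c in kept])
print([c.text for c in filtered])
```

Lowering the threshold (say to 0.5 in this example) would also filter out the news-style comment, which is why reviewing the data before choosing the threshold matters.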

Why denoise data?

  1. Enhanced Data Quality: Focuses on the key parts of conversations for more accurate data and readability.

  2. Consistent Formatting: Processes all interactions uniformly, whether they are chats, emails, or tickets.

  3. Improved Analysis: Provides cleaner, more relevant data for better insights and decision-making.

Benefits for You

  • Save Time: Quickly access the important information without sifting through irrelevant content.

  • Increase Efficiency: Streamlined data helps your support team work more effectively.

  • Improved Understanding of Customer Needs: Clearer insights allow for more accurate responses to customer needs.
