When it comes to contact center analytics, the most fruitful information lies in knowing what the customer talks about and what caused them to call – something that is often captured only in approximate terms by an interactive voice response system and the contact center agents. In this blog, we talk about how techniques from the discipline of machine learning can offer considerable benefits.
Contact centers are frequently viewed as a necessary expense for a company. Many companies see contact center costs rising and quality declining, and may not even be aware of missed opportunities such as upselling or developing customer analytics insights.
When a customer calls, the most applicable and reliable source of truth is the actual call recording, the call transcript, and sometimes the notes taken by the agent. Natural language is difficult to analyze even with ideal data, and off-the-shelf solutions are rarely good enough for the domain-specific language used to describe a company's offerings.
So what does an ideal approach look like? Supplementing your current call center software and call analytics with a mix of supervised and unsupervised machine learning techniques can deliver more valuable and actionable insights.
A high-performing and versatile technique for achieving the ideal solution begins with a semantic numerical representation of the language used – this can be a call transcript automatically generated from recorded calls, or the free-text notes your contact center agents take during the call. The best representation for these use cases is called a text embedding, which is created by training a special kind of neural network on all the text data you want to analyze (read more about text embeddings here, by Malaikannan Sankarasubbu).
This model allows you to represent every call – and even every word occurring in the text – as a high-dimensional vector. None of this is achieved by instructing the computer to follow hand-written rules or by compiling a massive dictionary, but simply by having it read all the transcripts or notes itself.
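To make the vector idea concrete, here is a minimal sketch of how similarity between embedded words is typically measured. The four-dimensional vectors and the words below are invented for illustration – real embeddings are learned by the neural network and usually have hundreds of dimensions – but the cosine-similarity calculation is the standard one.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: values near 1.0 mean similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 4-dimensional word embeddings (illustrative values only).
vec = {
    "refund":     [0.90, 0.10, 0.05, 0.20],
    "chargeback": [0.85, 0.15, 0.10, 0.25],
    "router":     [0.05, 0.90, 0.80, 0.10],
}

print(cosine_similarity(vec["refund"], vec["chargeback"]))  # high: related billing terms
print(cosine_similarity(vec["refund"], vec["router"]))      # low: unrelated terms
```

Because "refund" and "chargeback" tend to occur in similar contexts in the transcripts, a trained embedding places them close together, while "router" ends up far away – no dictionary or rulebook required.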
A straightforward unsupervised learning task on these new features is clustering, which reveals which calls (as vectors) group together and are therefore most similar. By inspecting a sample of the calls or running a summarization algorithm over each group, a topic can be assigned to each cluster; repeating this process within the largest clusters yields more specific answers about why customers called. For the first time, the contact center has an entirely fact-based, data-backed root-cause analysis of its highest-volume call segments – information critical to reducing call volume and expenses.
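The clustering step can be sketched in a few lines. This is an assumption-laden toy example: the two-dimensional "call embeddings" and the comments describing each call are invented, and it uses scikit-learn's k-means as one possible clustering algorithm (real pipelines might prefer density-based or hierarchical methods and far higher-dimensional vectors).

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical call embeddings, one row per call (2-D only for illustration).
call_vectors = np.array([
    [0.90, 0.10],  # "I was double charged this month"
    [0.85, 0.20],  # "please refund my last bill"
    [0.10, 0.90],  # "my internet keeps dropping"
    [0.15, 0.80],  # "no service since Monday"
])

# Group the calls into 2 clusters by vector similarity.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(call_vectors)
print(labels)  # billing calls land in one cluster, outage calls in the other
```

After clustering, an analyst (or a summarizer) only needs to label a handful of clusters instead of reading every call, and cluster sizes immediately show which issues drive the most volume.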
When it comes to supervised learning tasks, any predictive model could benefit from incorporating the natural language understanding the machine has learned from the transcripts or notes. One example is predicting customer churn directly related to poor service within the contact center; the actual call transcripts (embedded as vectors) make an excellent addition to traditional structured data features such as call length, hold time, or flags set by the agent.
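Combining the two feature types is mechanically simple: concatenate the structured columns with the embedding columns and train any classifier on the result. The sketch below assumes scikit-learn and uses invented numbers – four customers, three structured features (call length, hold time, escalation flag), and a tiny 2-D transcript embedding – purely to show the shape of the approach.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical structured features: [call length (min), hold time (min), escalated?]
structured = np.array([
    [5.0,  1.0, 0],
    [30.0, 12.0, 1],
    [4.0,  0.0, 0],
    [25.0, 15.0, 1],
])

# Hypothetical transcript embeddings for the same four calls.
embeddings = np.array([
    [0.10, 0.90],
    [0.95, 0.10],
    [0.20, 0.80],
    [0.90, 0.20],
])

churned = np.array([0, 1, 0, 1])  # did the customer later leave?

# Concatenate structured and text-derived features into one matrix.
X = np.hstack([structured, embeddings])
model = LogisticRegression().fit(X, churned)
probs = model.predict_proba(X)[:, 1]  # per-customer churn probability
```

In practice the embedding columns often carry signal the structured features miss – a polite but frustrated transcript looks very different to the model than a short routine call, even if both have similar hold times.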
However, a more sophisticated way to take advantage of this new natural language model is to predict churn arising from any dissatisfaction-related cause, since that dissatisfaction can now be identified at a detailed level from the conversations between customers and agents. For example, engineering may be aware of some intermittent network latency issues, the contact center agents may individually know that support calls about outages are coming in, and finance may see that the churn rate will lead to a narrow miss of Q2 earnings. But it is the data-driven analytics solution that can "read" and "understand" millions of contact center calls and notes, combine them with customer-level churn events and other structured data, and report to senior management that x% of revenue will be lost to departing customers in the next two months if engineering does not resolve the issue it already knows about. Many within the organization could propose this hypothesis, but none could definitively show it – or independently discover it – without understanding what all the calls were about, a monumental task that only the learned machine can do.
For post-call, general-purpose contact center analytics, structured data alone falls short of expectations in the era of data science; adding the actual transcripts and notes by training a text embedding system reveals many new opportunities and boosts the usefulness of several machine learning applications.
Have you experienced similar or other challenges in implementing a smart contact center? I welcome you to share your thoughts.