The Evolution of Text Analytics and Its Role in the New Era of Conversational Intelligence

For the past several years, the concept of integrated customer experience (a holistic approach to managing and optimizing the entire customer journey across all touchpoints and interactions) has grown in awareness among the CX and marketing communities. One of the key requirements of integrated CX is the ability to tap into and analyze all available data sources to understand customer sentiment and pain points.

For a long time now, we've been able to derive actionable insights from single documents of text or voice data, like individual reviews, news articles, tweets (X posts), or survey responses. But while some channels like call centers, email exchanges, and chat logs have been at the heart of most companies' CX programs, shockingly few companies do much with these treasure troves of untapped insights. And for good reason: Until recently, viable technology solutions to effectively analyze this particular form of unstructured data just haven't existed.

But now, due to advances in natural language processing (NLP), machine learning (ML) and other artificial intelligence (AI) technologies, we're able to embark on a new era of conversational intelligence—that is, the ability for computers to be conversationally aware and derive insights and recommend actions based on communications between organizations and their customers. To understand how we've gotten to this point, it would be helpful to take a look at the evolution of text analytics and NLP technologies as they relate to understanding human communication.

Text analytics began 20-plus years ago with well-formatted, grammatically precise, explanatory news content and internal corporate documentation. These documents have clear communication goals, take the time to define terms, and directly present useful information. In this setting, text analytics aims to extract the information already being offered for consumption by a broad audience of varying backgrounds. Writers use emotive language to express their opinions, which is tracked and boiled down into sentiment metrics. Proper nouns like company names are presented with proper capitalization, allowing for (relatively) easy named entity recognition.

With the rise of social media, the amount of timely content available for processing ballooned. Entire fields of software came into existence, such as social media monitoring, promising improved value by allowing marketing teams to react instantaneously to developing controversies. However, social media users' informal and ever-changing language required rethinking text analytics tools and approaches. Much of social media is intended for a specific audience with its own linguistic quirks and assumed knowledge. Unlike trained reporters, who follow precise style guides to make understanding easy for man and machine alike, social media users write informally and often playfully. Suddenly, sarcasm, hyperbole, and clever allusions in place of formally stated names broke the algorithms and models designed and trained around formal content.

While this development required tweaking algorithms and annotating new datasets, the core approach of text analytics remained relatively unchanged. Writers were writing in a new way, but their goal was still communication with a broad audience. Businesses wanted to understand the sentiments of their customers, and their customers wanted to share their positive and negative experiences. However, just as the emergence of social media sites unlocked the opportunity to apply text analytics to informal content, recent expansions in the capabilities of chatbots, plus declining costs and increasing accuracy in voice transcription, are bringing conversational data to the forefront of text analytics, along with new challenges that demand new approaches.

Conversational data is unlike other forms of content in that the speakers try to accomplish specific goals only tangentially related to communicating facts and opinions. When returning a shoe, I might be pleased or disappointed by the product, and I might share those opinions with the agent, but primarily, I'm trying to obtain a refund. Thus, traditional assumptions that the overall tone of a document tells us about the speaker's opinions do not necessarily apply. Where it was previously sufficient to essentially count up positive and negative sentences and assume they reflected the speaker's mood, the order and interplay of polarity matter greatly in a conversation.

There is some nuance to an article that starts upbeat and turns negative vs. one that leads with the problems and ends on a high note. But that is nothing compared to the difference between a happy caller who leaves angry and an angry caller who leaves satisfied. Time is suddenly a much more significant aspect of text analytics and needs better representation as a first-class concept.
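To make the point about ordering concrete, here is a minimal Python sketch. It is illustrative only and does not describe any particular product's method. The per-statement polarity scores (+1 for positive, -1 for negative) and the function names `overall_score` and `trajectory` are assumptions made for this example. It shows two calls that look identical when polarity is simply averaged, yet move in opposite directions once time is taken into account.

```python
from statistics import mean


def overall_score(scores):
    """Order-blind view: the average polarity of every statement."""
    return mean(scores)


def trajectory(scores):
    """Order-aware view: mean polarity of the second half of the
    conversation minus mean polarity of the first half.
    Expects at least two scores, listed in the order they were spoken."""
    mid = len(scores) // 2
    return mean(scores[mid:]) - mean(scores[:mid])


# Hypothetical per-statement polarity for two callers.
happy_then_angry = [1.0, 1.0, -1.0, -1.0]
angry_then_happy = [-1.0, -1.0, 1.0, 1.0]

# Both calls average out to neutral...
print(overall_score(happy_then_angry), overall_score(angry_then_happy))

# ...but the trajectories point in opposite directions.
print(trajectory(happy_then_angry), trajectory(angry_then_happy))
```

A real system would use a richer representation than a two-way split, but even this crude measure separates the caller who leaves angry from the one who leaves satisfied.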

There is an inherent tension to handling conversations with traditional text analytics techniques. If you run the entire document through an engine, text analytics will report on the statements of both speakers. However, in most situations, a business cares about each separately: just the agents to evaluate whether they're following best practices or the customers to understand their goals, preferences, and satisfaction levels. But if you run the document split into separate utterances or a single channel, you lose the larger context of the conversation. Even more than in social media, a conversation heavily uses shared knowledge and references that come up during the exchange. Often, correctly understanding the meaning of a statement requires a reference back to what the other speaker just said. "No" can convey a powerful sentiment in a positive or negative direction amid a conversation.

Social media demanded a loosening of our expectations around what properly formatted content looks like and retraining on newly annotated datasets. Conversational intelligence has required new organizational structures for analyzing and reporting text. We've restructured our unit of analysis around a section, a series of statements by a single speaker. These sections might be embedded into complex graphs, as in an email chain broken with sub-conversations branching off the initial message or as a customer issue travels across systems between phone calls, email exchanges, and internal ticketing systems. Where results would previously be attributed to an entire document or to a specific topic or entity, we can now better model the shifting nature of a conversation, ideally from a dissatisfied customer transformed into an advocate through exceptional customer service.
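As a rough illustration of what a section-based unit of analysis could look like, the Python sketch below groups consecutive statements by the same speaker into sections while preserving conversation order. The `Section` class, the `to_sections` function, and the sample call are hypothetical names and data invented for this example. This is a simplified sketch, not a description of a production implementation.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Section:
    """A run of consecutive statements by a single speaker."""
    speaker: str
    statements: list


def to_sections(utterances):
    """Group (speaker, text) pairs, given in conversation order,
    into sections of consecutive statements by the same speaker."""
    return [
        Section(speaker, [text for _, text in group])
        for speaker, group in groupby(utterances, key=lambda u: u[0])
    ]


# Hypothetical transcript of a return call.
call = [
    ("customer", "These shoes don't fit."),
    ("customer", "I'd like a refund."),
    ("agent", "I'm sorry to hear that."),
    ("agent", "I've issued the refund."),
    ("customer", "Great, thank you!"),
]

sections = to_sections(call)

# Analyze one side without discarding the surrounding context:
# each section keeps its position in the full conversation.
customer_sections = [s for s in sections if s.speaker == "customer"]
for s in sections:
    print(s.speaker, s.statements)
```

Because the sections stay in order, results can be reported per speaker while the neighboring sections remain available for context, such as what the agent said just before the customer replied.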

While this sectional approach to text analytics is ultimately a means to an end, we believe it's one of the most important advances in the field in the past 10 years. It expands the ability of users to understand the direction and flow of information, whether it's getting better or worse, the various sides of a discussion, who is locked into a position, and whose thoughts are evolving. Understanding the full context of a complex discussion as well as the flow of a discussion, especially if we're moving to a positive resolution, gives us a powerful new tool for improving customer experiences.


Paul Barba is chief scientist at InMoment, where he is focused on applying machine learning, natural language processing, and other artificial intelligence technologies to solve the challenges related to analyzing mountains of unstructured feedback data in the customer experience (CX) market. Barba has spearheaded the integration of generative AI and large language models into InMoment's NLP stack and continues to drive that development as new capabilities come to market. He has nearly two decades of experience in diverse areas of NLP and machine learning, from sentiment analysis and machine summarization to genetic programming and bootstrapping algorithms, and is continuing to bring cutting-edge research to solve everyday business problems while working on new big ideas to push the entire field forward. He earned a degree in computer science and mathematics from the University of Massachusetts at Amherst.