Conversation Analysis in Chatbots

by Michael Szul on

In our last conversational software post, we talked about the different types of conversations: pairs, stories, therapy, etc. We did this because conversation develops different patterns depending on the context, the reason, and the expected outcome. There are also different ways to look at conversation when it comes to critical analysis.

I took a short break from our chatbot discussion during the recent pandemic, and have been writing more about remote work and DevOps. I still have more to say on that subject, but for now, I'm going to try to rotate posts.

First, we're not talking about language acquisition and learning. People too often mistake chatbots for artificial intelligence. A chatbot on its own is incapable of inferring intent. Its primary focus is on continued dialog. Inferring intent is the domain of natural language understanding (NLU), which is a component often integrated with chatbots.

This brings up an important distinction. In past posts, I was adamant about using the term "conversational software" instead of "chatbot." A chatbot is a very specific tool for interaction, and we can easily reduce it to dialog management. Conversational software often takes a broader approach, encompassing multiple tools for engaging in conversation rather than relying on dialog management in isolation.

In addition to setting aside language acquisition, we're also not talking about theories of competence. Noam Chomsky has done exceptional work in linguistic theory and grammar, but it holds little place in the context of chatbots and conversational software.

What about discourse analysis? You could, in theory, apply principles of discourse analysis to conversational software, and I actually think this is a worthy pursuit. Discourse analysis encompasses all forms of symbolic communication (e.g., speech, writing, sign language), and it does concern itself with participatory conversation and the social implications of such interactions. However, it is a broad subject, and many of its components would be left untouched when applied to tools and software.

As you likely could tell by the title of this post, we're going to look at conversational software in terms of conversation analysis, and as we build our prototype chatbot software, we're going to compare implementation with theory.

Why conversation analysis?

Conversation analysis is an analytical tool focused on the human process of conversation, and its defined methodology revolves around interaction. As a theory, it observes the visible, physical nature of conversation, categorizing its steps and documenting its outcomes. It is not a theory that depends on consciousness, intelligence, or learning. This is key. It's an examination of a process, and software can duplicate a process quite easily. If we pattern a chatbot framework on conversation analysis, we depend only on the behavior of the process. This allows us to duplicate the behavior without implying intelligence.

What is conversation analysis?

Conversation analysis is a systematic analysis of talk that is produced as a result of normal everyday interactions. This talk is referred to as 'talk-in-interaction'. Conversation analysis refers to the study of orders of talk-in-interaction that takes place with any individual and in any setting. […] Conversation analysis, therefore, tries to understand the hidden rules, meanings or structures that create such an order in a conversation.

If we wanted to get fancy, we could trace it back to its roots: ethnomethodology.

Conversation analysis is very simply the study of how people interact through conversation, and the discipline of conversation analysis helps us categorize and understand the parts of conversation. From a structural perspective, conversation analysis is concerned with turns, sequences, repairs, and actions. In fact, "turn-taking" is considered the centerpiece of conversation analysis, where each party takes a turn in a conversation. We'll take a more in-depth look at turn-taking in the next post.

In the Microsoft Bot Framework, each round trip from person to chatbot is referred to as a "turn," and the framework uses a "turn context" to contextualize the software's approach to this form of turn-taking. We'll look at examples of different chatbot frameworks as we build our prototype.
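To make the idea of a "turn" concrete before we get to a real framework, here is a minimal, framework-free sketch in Python. Every name in it (`Turn`, `SketchTurnContext`, `run_turn`, `echo_handler`) is hypothetical and invented for illustration. It is not the Bot Framework's API, just one way to picture a round trip with some context attached.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    """One round trip: what the person said and what the bot sent back."""
    user_text: str
    replies: List[str] = field(default_factory=list)


@dataclass
class SketchTurnContext:
    """Hypothetical per-turn context; NOT the Bot Framework's turn context."""
    turn: Turn
    history: List[Turn]  # turns completed before this one

    def send(self, text: str) -> None:
        # Replies are recorded on the current turn only.
        self.turn.replies.append(text)


def run_turn(history: List[Turn], user_text: str,
             handler: Callable[[SketchTurnContext], None]) -> Turn:
    """Process one turn: build the context, let the handler reply, record it."""
    turn = Turn(user_text=user_text)
    handler(SketchTurnContext(turn=turn, history=history))
    history.append(turn)
    return turn


def echo_handler(ctx: SketchTurnContext) -> None:
    # The turn number comes from how many turns were completed earlier.
    ctx.send(f"Turn {len(ctx.history) + 1}: you said '{ctx.turn.user_text}'")


history: List[Turn] = []
run_turn(history, "hello", echo_handler)
run_turn(history, "how are you?", echo_handler)
```

The point of the sketch is that the handler never sees the whole conversation as one blob. It sees the current turn plus whatever history the context chooses to expose, which is roughly the role a turn context plays.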

In terms of sequence or organization of conversation parts, we already talked about one of these parts in the last post when we looked at question/answer pairs. In conversation analysis, this category is referred to as adjacency pairs and encompasses questions/answers, offers/refusals (or acceptance), compliments/acknowledgements, etc. This category describes the most common back-and-forth between individuals, and the process of an adjacency pair sequence is the easiest to capture in standard software development.
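As a rough illustration of why adjacency pairs are easy to capture in software, the sketch below tracks whether a first pair part is still waiting on its second part. The pair types come from the paragraph above. The `PairTracker` class, its labels, and the assumption that each utterance arrives already tagged with an act are all hypothetical simplifications, not part of any framework.

```python
from typing import Dict, Optional, Tuple

# First pair part -> the second pair parts that can complete it.
# The labels are illustrative; a real system would have to classify utterances.
ADJACENCY_PAIRS: Dict[str, Tuple[str, ...]] = {
    "question": ("answer",),
    "offer": ("acceptance", "refusal"),
    "compliment": ("acknowledgement",),
}


class PairTracker:
    """Tracks whether a first pair part is still waiting on its second part."""

    def __init__(self) -> None:
        self.open_first_part: Optional[str] = None

    def observe(self, act: str) -> str:
        """Classify an act as 'opened', 'closed', or 'unpaired'."""
        if self.open_first_part is not None:
            if act in ADJACENCY_PAIRS[self.open_first_part]:
                self.open_first_part = None
                return "closed"
        if act in ADJACENCY_PAIRS:
            # Simplification: a new first part replaces any open one.
            self.open_first_part = act
            return "opened"
        # Anything else leaves an open first part still waiting.
        return "unpaired"


tracker = PairTracker()
acts = ["offer", "refusal", "question", "greeting", "answer"]
results = [tracker.observe(act) for act in acts]
```

Here the offer is closed by the refusal, and the question stays open across the unrelated greeting until the answer arrives. Handling nested pairs would need a stack rather than a single slot, which this sketch deliberately leaves out.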

Like with turn-taking, we'll discuss adjacency pairs and how they are implemented in different frameworks when we detail dialog management in our prototype.

There are other forms of sequence organization that are less straightforward than adjacency pairs, such as sequence expansion and preference organization. As we move through this series, we'll bring these up as they relate to chatbots and conversational software. As a quick example, sequence expansion includes a concept of "silence," which has contextual meaning. In person, the meaning of this silence can usually be inferred from the conversation or from body language. With chatbots, inferring the meaning of silence is more difficult, but many chatbot frameworks (and chat applications in general) compensate with things like a typing indicator, which you can kick off while waiting for a long-running process to finish. This gives the user an indication that something is happening on the other side despite the silence.
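The typing indicator idea can be sketched with plain `asyncio`. This is only an illustration under stated assumptions: `send` is a hypothetical callback standing in for whatever a given framework uses to push events to the user, the string `"typing"` stands in for a real typing activity, and `slow_lookup` fakes a long-running process with a short sleep.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical transport: an async callback that delivers one event to the user.
Send = Callable[[str], Awaitable[None]]


async def reply_with_typing(send: Send, work: Awaitable[str],
                            interval: float = 0.05) -> None:
    """Emit 'typing' events until `work` finishes, then send its result."""
    task = asyncio.ensure_future(work)
    while not task.done():
        await send("typing")
        # Wait for the work to finish, but no longer than `interval`.
        await asyncio.wait({task}, timeout=interval)
    await send(task.result())


async def demo() -> List[str]:
    sent: List[str] = []

    async def send(event: str) -> None:
        sent.append(event)

    async def slow_lookup() -> str:
        await asyncio.sleep(0.12)  # stand-in for a long-running process
        return "Here is your report."

    await reply_with_typing(send, slow_lookup())
    return sent


events = asyncio.run(demo())
```

The user sees one or more typing events followed by the actual reply. The exact number of typing events depends on timing, so the sketch makes no promise about it.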

I've purposely left out a discussion of conversation repair and action formation. These are advanced concepts that conversational software has not yet tackled effectively. We'll get to these later in this series, and suggest ways to address them.

This post was in no way meant to be an exhaustive look at conversation analysis, but instead a very brief introduction to get you thinking about conversational structure. In the next conversational software post, we'll take a much deeper look at turn-taking in conversation and in software. Throughout this series, we will continue to expand on conversation analysis concepts as we approach them while prototyping our own software. You can think of this entire series as one about both conversation analysis and conversational software: we'll expand our understanding of both as we go.