Custom Bot Framework Prompts with the Recognizers Text Suite

by Michael Szul on

In my last blog post, we talked about building custom middleware, and how easy that is to do with the Bot Framework using Cognitive Services. Through this, I was able to introduce you to the Bot Builder Community's Text Analytics Middleware package for Node.js. There are some other packages we're building out for the Bot Builder Community as well. Gary Pretty put together a nice couple of custom prompts for the .NET library, and I had it on my to-do list to port them over to the Node.js package.

Building custom prompts is pretty easy, as well, and it allows you to work with some of the specific dialog interfaces in the TypeScript source for the Bot Framework.

To create your own prompts, all you have to do is extend the base Prompt<T> class. Let's create our own prompt that asks for an email address. Sounds like a simple text prompt, right? Maybe, but that puts a lot of assumptions on what the user's response is going to be. If we ask the user for their email address they could respond with "michael@szul.us" or they could respond with "my email is michael@szul.us" or "it's michael@szul.us" even. We want to be able to ensure that we get the email address, and none of the other clutter.

Quick note: This example uses a text recognizer from Microsoft's Recognizers Text suite, and the finished prompt is available from the Bot Builder Community project in the dialog prompts package.

You are going to need three packages to build this prompt. If you are building the prompt as a part of your chatbot (a separate file that you are just pathing to) then you already have two of them installed, but just for completeness' sake:

npm install botbuilder-core --save
npm install botbuilder-dialogs --save
npm install @microsoft/recognizers-text-sequence --save

The first two are a part of the Bot Framework, and they give you the classes and interfaces you need to create a prompt. The third one you will learn more about in part 2 of this book, but it's a package that allows you to recognize patterns in text.

Create a file named email.ts, and add some import statements.

import * as recognizers from "@microsoft/recognizers-text-sequence";
import { Activity, InputHints, TurnContext } from "botbuilder-core";
import { Prompt, PromptOptions, PromptRecognizerResult, PromptValidator } from "botbuilder-dialogs";

Let's create our class:

export class EmailPrompt extends Prompt<string> {
    public defaultLocale: string | undefined;
    constructor(dialogId: string, validator?: PromptValidator<string>, defaultLocale?: string) {
        super(dialogId, validator);
        this.defaultLocale = defaultLocale;
    }
    ...
}

The EmailPrompt class extends the Prompt class, which itself actually extends the base Dialog class. A prompt is just a dialog with some special add-ons. We declare a public property of defaultLocale. Our constructor accepts a dialog ID (which is a string), a validator, and a default locale that we assign to the public property. The dialog ID and the validator are then passed to the base class' constructor with super().
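The optional validator is where you can add rules that go beyond "is this an email address". Here is a minimal sketch of what one might look like, assuming you only want to accept addresses from a single domain. So that it runs without the Bot Framework installed, it declares small local stand-ins (RecognizedLike and ValidatorContextLike) that mirror only the recognized property of the validator context. The domain and the function names are made up for illustration.

```typescript
// Local stand-ins (assumptions) that mirror only the parts of the
// botbuilder-dialogs validator context this sketch touches.
interface RecognizedLike<T> {
    succeeded: boolean;
    value?: T;
}

interface ValidatorContextLike<T> {
    recognized: RecognizedLike<T>;
}

// Hypothetical rule: only accept addresses at this domain.
const ALLOWED_DOMAIN = "szul.us";

// Synchronous check, kept separate so it is easy to test on its own.
function isAllowedEmail(recognized: RecognizedLike<string>): boolean {
    if (!recognized.succeeded || recognized.value === undefined) {
        return false;
    }
    const suffix = "@" + ALLOWED_DOMAIN;
    // Compare case-insensitively against the end of the address.
    return recognized.value.toLowerCase().slice(-suffix.length) === suffix;
}

// A validator is an async function that resolves to true (accept the value)
// or false (the prompt asks again).
async function domainValidator(prompt: ValidatorContextLike<string>): Promise<boolean> {
    return isAllowedEmail(prompt.recognized);
}
```

With the real types in place, a function like this should be usable as the second constructor argument, for example new EmailPrompt("emailPrompt", domainValidator).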

Now let's add the onPrompt() code needed to capture prompting:

protected async onPrompt(context: TurnContext, state: any, options: PromptOptions, isRetry: boolean): Promise<void> {
    if (isRetry && options.retryPrompt) {
        await context.sendActivity(options.retryPrompt, undefined, InputHints.ExpectingInput);
    } else if (options.prompt) {
        await context.sendActivity(options.prompt, undefined, InputHints.ExpectingInput);
    }
}

This is a protected asynchronous method that accepts the turn context, the state, any options, and whether or not this is a retry. If this is a retry and a retry prompt was supplied, it sends the retry prompt; otherwise it sends the initial prompt, if there is one. Either way, the message goes out with an input hint telling the channel that we are expecting input.
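The branching is small, but one case is easy to miss: a retry with no retryPrompt configured falls through to the original prompt. The sketch below pulls that decision into a plain function so you can see each case. PromptOptionsLike and selectPromptText are hypothetical names used only here, and the stand-in uses plain strings where the real options also accept activity objects.

```typescript
// Local stand-in (assumption) for the two PromptOptions fields onPrompt() reads.
interface PromptOptionsLike {
    prompt?: string;
    retryPrompt?: string;
}

// Hypothetical helper: returns the text onPrompt() would send, or undefined
// when nothing would be sent at all.
function selectPromptText(options: PromptOptionsLike, isRetry: boolean): string | undefined {
    if (isRetry && options.retryPrompt) {
        return options.retryPrompt;
    }
    if (options.prompt) {
        return options.prompt;
    }
    return undefined;
}

const emailOptions: PromptOptionsLike = {
    prompt: "What is your email address?",
    retryPrompt: "Sorry, I didn't catch an email address. Could you try again?"
};

// First turn: the initial prompt is chosen.
const firstAsk = selectPromptText(emailOptions, false);
// Retry turn: the retry prompt is chosen because one was supplied.
const retryAsk = selectPromptText(emailOptions, true);
// Retry turn without a retryPrompt: falls back to the initial prompt.
const fallbackAsk = selectPromptText({ prompt: "What is your email address?" }, true);
```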

Prompts return a recognizer result from an onRecognize() method. This method is used to capture the user's message, perform any necessary operations on it, and then return the result. Our email prompt looks like this:

protected async onRecognize(context: TurnContext, state: any, options: PromptOptions): Promise<PromptRecognizerResult<string>> {
    // Assume failure until the recognizer finds an email address.
    const result: PromptRecognizerResult<string> = { succeeded: false };
    const activity: Activity = context.activity;
    const utterance: string = activity.text;
    // Prefer the activity's locale, then the prompt's default, then "en-us".
    const locale: string = activity.locale || this.defaultLocale || "en-us";
    const results = recognizers.recognizeEmail(utterance, locale);
    if (results.length > 0 && results[0].resolution != null) {
        try {
            result.succeeded = true;
            // Keep only the resolved email address, not the whole utterance.
            result.value = results[0].resolution.value;
        }
        catch(e) { }
    }
    return result;
}

What is this actually doing? The result constant is the recognizer result that we return, and we initially set the succeeded property to false. We then set the activity variable to the turn context activity that we received, and then get the user's input by accessing the text property from that. We then need to set the locale. We need this in order to pass the appropriate culture to the email recognizer.

The real work here is being done by the next line:

const results = recognizers.recognizeEmail(utterance, locale);
      

This is passing the user's input and the locale (or culture) to the recognizeEmail() function from the recognizer text suite that we imported. If this yields a result set with at least one item, and the resolution of the first item is not null, then we set the succeeded property of the result constant to true, and set its value property to the value of that resolution. Then we return the result.
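To make that check concrete without installing anything, here is a rough sketch that runs the same decision against a stand-in recognizer. The fakeRecognizeEmail() function is not the library: it is a deliberately simple regular expression that returns objects shaped like the results the prompt reads (text, typeName, and a resolution with a value), and that shape is itself an assumption limited to the fields used here. All of the names are hypothetical.

```typescript
// Minimal shape of one recognizer result, limited to the fields this sketch reads (assumption).
interface ModelResultLike {
    text: string;
    typeName: string;
    resolution: { [key: string]: any } | null;
}

// Local stand-in for the prompt's recognizer result.
interface PromptResultLike<T> {
    succeeded: boolean;
    value?: T;
}

// Stand-in for recognizers.recognizeEmail(): a simple pattern for illustration,
// NOT the one the Recognizers Text suite uses.
function fakeRecognizeEmail(utterance: string): ModelResultLike[] {
    const matches: string[] = utterance.match(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g) || [];
    return matches.map((m) => ({ text: m, typeName: "email", resolution: { value: m } }));
}

// Same decision as onRecognize(): succeed only when there is at least one
// result and its resolution is present, then keep just the resolved value.
function toPromptResult(results: ModelResultLike[]): PromptResultLike<string> {
    const result: PromptResultLike<string> = { succeeded: false };
    const first = results[0]; // undefined when the array is empty
    if (first && first.resolution != null) {
        result.succeeded = true;
        result.value = first.resolution.value;
    }
    return result;
}

// The three phrasings from earlier, plus one reply with no address in it.
const replies = [
    "michael@szul.us",
    "my email is michael@szul.us",
    "it's michael@szul.us",
    "I'd rather not say"
];
const promptResults = replies.map((text) => toPromptResult(fakeRecognizeEmail(text)));
```

The first three replies should all resolve to the bare address, while the last one leaves succeeded as false, which is what causes a prompt to ask again.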

When you use the prompt in a dialog, it might look something like this:

dialogs.add(new WaterfallDialog("email", [
    async (step) => {
        return await step.prompt("emailPrompt", "What is your email address?");
    },
    async (step) => {
        await step.context.sendActivity(`Your email is: ${step.result}`);
        return await step.endDialog();
    }
]));

dialogs.add(new EmailPrompt("emailPrompt"));

The above example consists of two dialogs added to a dialog set. You should be familiar with this syntax from the chapter on dialogs and the examples of prompts earlier in this chapter. When the user is prompted for their email address, they could enter "my email is michael@szul.us", but the second function in the above waterfall would still send back a message that says "Your email is: michael@szul.us". The step.result value is not the entire string that the user entered, but just the parsed-out email address.

If you want to learn a little more about custom prompts and text recognizers, I encourage you to take a look at what we're putting together for the Bot Builder Community.