
Voicebot and Chatbot Design: Flexible conversational interfaces with Amazon Alexa, Google Home, and Facebook Messenger

Rachel Batish
eBook Sep 2018 296 pages 1st Edition

Chapter 1. Conversational UI is our Future

Conversational user interface (UI) is changing the way that we interact. Intelligent assistants, chatbots, and voice-enabled devices, such as Amazon Alexa and Google Home, offer a new, natural, and intuitive human-machine interaction and open up a whole new world for us as humans. Chatbots and voicebots ease, speed up, and improve daily tasks. They increase our efficiency and, compared to humans, they are also very cost-effective for the businesses employing them.

This chapter will address the concept of conversational UIs by initially exploring what they are, how they evolved, what they offer, their challenges, and how they will develop in the future. The chapter provides an introduction to the conversational world. We will take a look at how UI has developed over the years and the difference between voice control, chatbots, virtual assistants and conversational solutions.

What is conversational UI?

Broadly speaking, conversational UI is a new form of interaction with computers that tries to mimic a "natural human conversation." To understand what this means, we can turn to the good old Oxford Dictionary and search for the definition of a conversation:

con·ver·sa·tion

/ˌkänvərˈsāSH(ə)n/ noun

A talk, especially an informal one, between two or more people, in which news and ideas are exchanged.

On Wikipedia (https://en.wikipedia.org/wiki/Conversation), I found some interesting additions. There, conversation is defined a little more broadly: "An interactive communication between two or more people… the development of conversational skills and etiquette is an important part of socialization."

The development of conversational skills in a new language is a frequent focus of language teaching and learning. If we sum up the two definitions, we can agree that a conversation must be:

  1. Some type of communication (a talk)
  2. Between two or more people
  3. Interactive: ideas and thoughts must be exchanged
  4. Part of a socialization process
  5. Focused on learning and teaching

Now if we go back to our definition of conversational UI, we can easily identify the gaps between the classic definition of a conversation and what we define today as conversational UI.

Conversational UI, as opposed to the preceding definition:

  1. Doesn't have to be oral: it can be in writing (for example, chatbots).
  2. Is not just between people and is limited to two sides: at least one participant is a computer, and the conversation rarely involves more than two participants.
  3. Is less interactive: it's hard to say whether ideas are exchanged between the two participants.
  4. Is thought of as unsocialized, since we are dealing with computers and not people.

However, the two main components are already there. Conversational UI:

  1. Is a medium of communication that enables natural conversation between two entities.
  2. Is about learning and teaching: by leveraging Natural Language Understanding (NLU), Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL), computers continue to learn and develop their understanding capabilities.

The gaps that we identified above represent the future evolution of conversational UI. While it seems like there is a long way to go for us to actually be able to truly replace human-to-human interaction, with today's and future technologies, those gaps will close sooner than we think. However, let's start by taking a look at how human-computer interaction evolved over the last 50 years, before we try to predict the future.

The evolution of conversational UI

Conversational UI is part of a long evolution of human-machine interaction. The interface of this communication has evolved tremendously over the years, mostly thanks to technology improvements, but also through the imagination and vision of humans.

Science fiction books and movies have predicted different forms of humanized interaction with machines for decades (some of the best-known examples are Star Wars, 2001: A Space Odyssey, and Star Trek). However, computing power was extremely scarce and expensive, so spending it on UIs wasn't a high priority. Today, when our smartphones have more computing power than a supercomputer had in the past, human-machine interaction can be much more natural and intuitive. In this section, we will review the evolution of computer UI, from the textual through to the graphical and all the way to the conversational UI.

Textual interface

For many years, a textual interface was the only way to interact with computers. The textual interface used commands with a strict format and evolved into free natural language text.

Figure 1: A simple textual interaction based on commands

A good example of a common use of textual interaction is search engines. Today, if I type a sentence such as search for a hotel in NYC on Google or Bing (or any other search engine for that matter), the search engine will provide me with a list of relevant hotels in NYC.

Figure 2: Modern textual UI: Google's search engine

Graphical user interface (GUI)

A later evolution of human-machine interface was the GUI. This interface mimics the way that we perform mechanical tasks in "real life" and replaces the textual interaction.

Figure 3: The GUI mimicking real-life actions

With this interface, for example, to enable/disable an action or specific capability, we will click a button on a screen, using a mouse (instead of writing a textual command line), mimicking a mechanical action of turning on or off a real device.

Figure 4: Microsoft Word is changing the way we interact with personal computers

The GUI became extremely popular during the 90s, with the introduction of Microsoft Windows, which became the most popular operating system for personal computers. The following evolution of GUIs came with the introduction of touchscreen devices, which eliminated the need for mediators, such as the mouse, and provided a more direct and natural way of interacting with a computer.

Figure 5: Touchscreens are eliminating the mouse

Figure 6: Touchscreens allow scrolling and clicking, mimicking manual actions

Conversational UI

The latest evolution of computer-human interaction is the conversational UI. As defined above, a conversational interaction is a new form of communication between humans and machines that includes a series of questions and answers, if not an actual exchange of thoughts.

Figure 7: The CNN Facebook Messenger chatbot

In the conversational interface, we experience, once again, a form of two-sided communication, where the user asks a question and the computer responds with an answer. In many ways, this is similar to the textual interface we introduced earlier (see the example of the search engine). In this case, however, the end user is not searching for information on the internet but is instead interacting in a one-to-one format with someone who delivers the answer. That someone is a humanized computer entity called a bot.

The conversational UI mimics a text/voice interaction with a friend/service provider. Though still not a true conversation as defined in the Oxford Dictionary, it provides a free and natural experience that gets the closest to a human-human interaction that we have seen yet.

Figure 8: The Expedia Facebook Messenger chatbot

Voice-enabled conversational UI

A sub-category in the field of conversational UI is voice-enabled conversational UI. Whereas the shift from textual to GUI and then from GUI to conversational is defined as evolution, conversational voice interaction is a full paradigm shift. This new way to interact with machines, using nothing but our voice – our most basic communication and expression tool – takes human-machine relationships to a whole new level.

Computers are now capable of recognizing our voice, "understanding" our requests, responding back, and even replying with suggestions and recommendations. Being a natural interaction method for humans, voice makes it easy for young people and adults to engage with computers, in a limit-free environment.

Figure 9: Amazon Alexa and Google Home are voice-enabled devices that facilitate conversational interactions between humans and machines

The stack of conversational UI

The building blocks required to develop a modern and interactive conversational application include:

  • Speech recognition (for voicebots)
  • NLU
  • Conversational level:
    • Dictionary/samples
    • Context
    • Business logic

In this section, we will walk through the "journey" of a conversational interaction along the conversational stack.

Figure 10: The conversational stack: voice recognition, NLU, and context

Voice recognition technology

Voice recognition (also known as speech recognition or speech-to-text) transcribes voice into text. The computer captures our voice with a microphone and provides a text transcription of the words. Using a simple level of text processing, we can develop a voice control feature with simple commands, such as "turn left" or "call John." Leading providers of speech recognition today include Nuance, Amazon, IBM Watson, Google, Microsoft, and Apple.
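As a rough illustration of the "simple level of text processing" mentioned above, the following sketch matches an already-transcribed sentence against a small list of fixed commands. It assumes the speech recognition step has produced plain text; the command list and the parse_command name are hypothetical, chosen only for this example.

```python
import re

# Hypothetical command grammar: each pattern maps a transcript to an action name.
COMMANDS = [
    (re.compile(r"^turn (left|right)$"), "turn"),
    (re.compile(r"^call (\w+)$"), "call"),
]


def parse_command(transcript):
    """Return (action, argument) for a recognized command, or None."""
    # Normalize the transcript: trim, lowercase, drop trailing punctuation.
    text = transcript.strip().lower().rstrip(".!?")
    for pattern, action in COMMANDS:
        match = pattern.match(text)
        if match:
            return action, match.group(1)
    return None


print(parse_command("Turn left"))             # a recognized command
print(parse_command("Call John."))            # a recognized command
print(parse_command("I need to fly to NYC"))  # not a command: needs NLU
```

Anything outside the fixed patterns falls through to None, which is where the NLU layer described next comes in.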

NLU

To achieve a higher level of understanding, beyond simple commands, we must include a layer of NLU. NLU fulfills the task of reading comprehension. The computer "reads the text" (in a voicebot, it will be the transcribed text from the speech recognition) and then tries to grasp the user's intent behind it and translate it into concrete steps.

Let's take a look at a travel bot as an example. The system identifies two individual intents:

  1. Flight booking – BookFlight
  2. Hotel booking – BookHotel

When a user asks to book a flight, the NLU layer is what helps the bot to understand that the intent behind the user's request is BookFlight. However, since people don't talk like computers, and since our goal is to create a humanized experience (and not a computerized one), the NLU layer should understand or be able to connect various requests to a specific intent.

Another example is when a user says, "I need to fly to NYC." The NLU layer is expected to understand that the user's intent is to book a flight. A more complex request for our NLU to understand would be when a user says, "I'm travelling again."

Similarly, the NLU should connect the user's sentence to the BookFlight intent. This is a much more complex task, since the bot can't identify the word flight in the sentence or a destination out of a list of cities or states. Therefore, the sentence is more difficult for the bot to understand.
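To make the difference between the two requests concrete, here is a deliberately naive sketch of intent detection based on keywords. It is not how a real NLU service works; the keyword sets and the detect_intent name are assumptions made for this example. It finds the BookFlight intent when a word such as "fly" is present, and fails on "I'm travelling again", which contains no such cue.

```python
import re

# Hypothetical keyword lists for the two intents of the travel bot.
INTENT_KEYWORDS = {
    "BookFlight": {"flight", "fly", "flying"},
    "BookHotel": {"hotel", "room", "accommodation"},
}


def detect_intent(utterance):
    """Return the first intent whose keywords appear in the utterance, else None."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return None


print(detect_intent("I need to fly to NYC"))  # keyword "fly" is present
print(detect_intent("I'm travelling again"))  # no keyword: the naive matcher gives up
```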

Computer science considers NLU to be an "AI-hard" problem (Roman V. Yampolskiy, Turing Test as a Defining Feature of AI-Completeness, in Artificial Intelligence, Evolutionary Computation and Metaheuristics (AIECM)), meaning that even with AI (powered by deep learning) developers are still struggling to provide a high-quality solution. Calling a problem AI-hard means that it cannot be solved by a simple, specific algorithm, because solving it requires dealing with unexpected circumstances, as in any real-world problem. In NLU, those unexpected circumstances are the various configurations of words and sentences in an endless number of languages and dialects. Some leading providers of NLU are Dialogflow (previously api.ai, acquired by Google), wit.ai (acquired by Facebook), Amazon, IBM Watson, and Microsoft.

Dictionaries/samples

To build a good NLU layer that can understand people, we must provide a broad and comprehensive sample set of concepts and categories in a subject area or domain. Simply put, we need to provide a list of associated samples or, even better, a collection of possible sentences for each single intent (request) that a user can activate on our bot. If we go back to our travel example, we would need to build a comprehensive dictionary, as you can see in the following table:

User says (samples)         | Related intent
----------------------------+---------------
I want to book my travel    | BookFlight
I want to book a flight     | BookFlight
I need a flight             | BookFlight
Please book a hotel room    | BookHotel
I need accommodation        | BookHotel

Building these dictionaries, or sets of samples, can be a tough and Sisyphean task. It is domain-specific and language-specific and, as such, requires different configurations and tweaks from one use case to another, and from one language to another. Unlike the GUI, where the user is restricted to the choices shown on the screen, the conversational UI is unique, since it offers the user an unlimited experience. However, as such, it is also very difficult to pre-configure to a level of perfection (see the AI-hard problem above). Therefore, the more samples we provide, the better the bot's NLU layer will be able to understand different requests from a user. Beware of the catch-22 in this case: the more intents we build, the more samples are required, and all those samples can easily lead to intents overlapping. For example, when a user says, "I need help," they might mean they want to contact support, but they also might require help on how to use the app.
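The idea of matching a request against a set of samples per intent can be sketched as follows. This is only a toy: it scores word overlap (Jaccard similarity) between the user's sentence and each sample, whereas real NLU services use trained models. The sample sentences are the ones from the travel example, with the hotel intent named BookHotel; the function names and the 0.5 threshold are assumptions made for illustration.

```python
import re

# Samples per intent, taken from the travel example.
SAMPLES = {
    "BookFlight": [
        "I want to book my travel",
        "I want to book a flight",
        "I need a flight",
    ],
    "BookHotel": [
        "Please book a hotel room",
        "I need accommodation",
    ],
}


def tokens(sentence):
    """Lowercase the sentence and return its set of words."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_intent(utterance, threshold=0.5):
    """Return the intent with the closest sample, or None below the threshold."""
    words = tokens(utterance)
    scores = {
        intent: max(jaccard(words, tokens(sample)) for sample in samples)
        for intent, samples in SAMPLES.items()
    }
    intent = max(scores, key=scores.get)
    return intent if scores[intent] >= threshold else None


print(best_intent("I need a flight please"))
print(best_intent("I need help"))
```

With these samples, "I need help" shares two of four distinct words with "I need accommodation" and so reaches the 0.5 threshold for the hotel intent, a small demonstration of how samples from different intents can overlap and mislead the bot.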

Context

Contextual conversation is one of the toughest challenges in conversational interaction. Being able to understand context is what makes a bot's interaction a humanized one. As mentioned previously, at its minimum, conversational UI is a series of questions and answers. However, adding a contextual aspect to it is what makes it a "true" conversational experience. By enabling context understanding, the bot can keep track of the conversation in its different stages and relate, and make a connection between, different requests. The entire flow of the conversation is taken into consideration and not just the last request.

In every conversational bot we build – either as a chatbot or a voicebot – the interaction will have two sides:

The end user will ask, Can I book a flight?

The bot will respond, Yes. The bot might also add, Do you want to fly international?

The end user can then approve this or respond by saying, No, domestic.

A contextual conversation is very different from a simple Q&A. For the preceding scenario, there were multiple different ways the user could have responded and the bot must be able to deal with all those different flows.

State machine

One way of dealing with different flows is to use a state machine. This popular and simple way to describe context connects each state (phase) of the conversation to the next state, depending on the user's reaction.

Figure 11: Building contextual conversation using a state machine works better for simple use-cases and flows
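A state machine for the short flight dialogue above could be sketched as a lookup table from (state, user reply) to (next state, bot response). The state names and the follow-up prompts about the destination are hypothetical; only the first exchange comes from the example conversation.

```python
# (current state, normalized user reply) -> (next state, bot response)
TRANSITIONS = {
    ("start", "can i book a flight"): (
        "ask_scope", "Yes. Do you want to fly international?"),
    ("ask_scope", "yes"): (
        "ask_destination", "Which country are you flying to?"),
    ("ask_scope", "no, domestic"): (
        "ask_destination", "Which city are you flying to?"),
}


def step(state, user_reply):
    """Advance the conversation by one turn; stay in place on unknown input."""
    key = (state, user_reply.strip().lower().rstrip(".?!"))
    if key not in TRANSITIONS:
        return state, "Sorry, I didn't understand that."
    return TRANSITIONS[key]


state, reply = step("start", "Can I book a flight?")
print(state, "->", reply)
state, reply = step(state, "No, domestic.")
print(state, "->", reply)
```

Every reply the bot should accept needs its own entry in the table, which is manageable for a flow this small.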

However, the advantage of a state machine is also its disadvantage. This methodology forces us to map every possible conversational flow in advance. While it is very easy to use for building simple use cases, it becomes extremely difficult to understand and maintain over time, and it is impractical for more complicated flows (flight booking, for example, is a complex flow that can't realistically be supported using a state machine). Another problem with the state machine method is that, even for simple use cases, we still need to duplicate much of the work to support multiple use cases that share the same response.

Figure 12: The disadvantage of using a state machine methodology when building complex flows

Event-driven contextual approach

The event-driven contextual approach is a more suitable method for today's conversational UI. It lets the users express themselves in an unlimited flow and doesn't force them through a specific flow. Understanding that it's impossible to map the entire conversational flow in advance, the event-driven contextual approach focuses on the context of the user's request to gather all the information it needs in an unstructured way by minimizing all other options.

Using this methodology, the user leads the conversation and the machine analyzes the data and completes the flow at the back. This method allows us to depart from the restricting GUI state machine flow and provide human-level interaction.

In this example, the machine knows that it needs the following parameters to complete a flight booking:

  • Departure location
  • Destination
  • Date
  • Airline

The user in this case can fluently say, I want to book a flight to NYC, or I want to fly from SF to NYC tomorrow, or I want to fly with Delta.

For each of these flows, the machine will return to the user to collect the missing information:

User says                             | Information bot collects                        | Information bot requests          | User replies
--------------------------------------+-------------------------------------------------+-----------------------------------+-----------------------------
I want to book a flight to NYC        | Destination: NYC                                | Departure location, Date, Airline | Tomorrow, from SF with Delta
I want to fly from SF to NYC tomorrow | Departure: SF, Destination: NYC, Date: Tomorrow | Airline                           | With Delta
I want to fly with Delta to NYC       | Destination: NYC, Airline: Delta                | Departure location, Date          | From NY, tomorrow

By building a conversational flow in an event-driven contextual approach, we succeed in mimicking our interaction with a human agent. When booking a flight with a travel agent, I start the conversation and provide the details that I know. The agent, in return, will ask me only for the missing details and won't force me to state each detail at a certain time.
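The event-driven flow described above can be sketched as a small slot-filling loop: each user message contributes whatever parameters it contains, and the bot asks only for what is still missing. The vocabularies, the regular expressions, and the Booking class below are assumptions made for this example; a real bot would rely on its NLU layer to extract the parameters.

```python
import re

REQUIRED_SLOTS = ["departure", "destination", "date", "airline"]

# Hypothetical, tiny vocabularies standing in for real entity recognition.
CITIES = {"sf", "nyc", "ny"}
AIRLINES = {"delta"}
DATES = {"today", "tomorrow"}


def extract_slots(utterance):
    """Pull any booking parameters out of a single user message."""
    text = utterance.lower()
    slots = {}
    for city in re.findall(r"\bfrom (\w+)", text):
        if city in CITIES:
            slots["departure"] = city.upper()
    for city in re.findall(r"\bto (\w+)", text):
        if city in CITIES:
            slots["destination"] = city.upper()
    for word in re.findall(r"\w+", text):
        if word in DATES:
            slots["date"] = word
        elif word in AIRLINES:
            slots["airline"] = word.title()
    return slots


class Booking:
    """Keeps the context of one booking conversation."""

    def __init__(self):
        self.slots = {}

    def handle(self, utterance):
        """Merge new information and return the parameters still missing."""
        self.slots.update(extract_slots(utterance))
        return [name for name in REQUIRED_SLOTS if name not in self.slots]


booking = Booking()
print(booking.handle("I want to book a flight to NYC"))  # what the bot must still ask for
print(booking.handle("Tomorrow, from SF with Delta"))    # nothing left to ask
print(booking.slots)
```

The order in which the user supplies the details does not matter: the same code handles "I want to fly from SF to NYC tomorrow" by asking only for the airline.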

Business logic/dynamic data

At this stage, I think we can agree that building a conversational UI is not an easy task. In fact, many bots today don't use NLU and avoid free-speech interaction. We had great expectations of chatbots and with those high expectations came a great disappointment. This is why many chatbots and voicebots today provide mostly simple Q&A flows.

Figure 13: The Michael Kors Facebook Messenger bot: conversational UI minimized to a simple Q&A flow with no contextuality

Most of those bots have a limited offering, and the business logic is connected to two or three specific use cases, such as opening hours or a phone number, no matter what the user is asking for. In other very popular chat interfaces, bots still lean on the GUI, offering a menu selection and eliminating free text.

Figure 14: The Michael Kors Facebook Messenger bot: forcing graphic UI on conversational UI mediums

However, if we are building a true conversational communication between our bot and our users, we must make sure that we connect it to a dynamic business logic. So, after we have enabled speech recognition, worked on our NLU, built samples, and developed an event-driven contextual flow, it is time to connect our bot to dynamic data. To reach real-time data, and to be able to run transactions, our bot needs to connect to the business logic of our application. This can be done through the usage of APIs to your backend systems.

Going back to our flight booking bot, we would need to retrieve real-time data on when the next flight from SF to NYC is, what seats are still available, and what the price is for the flight. Our APIs can also help us to complete the order and approve a payment. If you are lacking APIs for some of the needed data and functions, you can develop new ones or use screen-scraping techniques to avoid a complex development.
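Connecting the bot to business logic might look like the sketch below. The InMemoryFlightApi class stands in for a real backend API, and every name, flight number, and price in it is invented for illustration; in practice the search call would be an HTTP request to your own systems.

```python
from dataclasses import dataclass


@dataclass
class Flight:
    number: str
    departure: str
    destination: str
    departs_at: str   # zero-padded "HH:MM", so string order equals time order
    seats_left: int
    price_usd: float


class InMemoryFlightApi:
    """Hypothetical stand-in for a backend flight-search API."""

    def __init__(self, flights):
        self._flights = list(flights)

    def search(self, departure, destination):
        return [f for f in self._flights
                if f.departure == departure and f.destination == destination]


def answer_next_flight(api, departure, destination):
    """Turn live backend data into the bot's reply."""
    available = [f for f in api.search(departure, destination) if f.seats_left > 0]
    if not available:
        return f"Sorry, I found no available flights from {departure} to {destination}."
    flight = min(available, key=lambda f: f.departs_at)
    return (f"The next flight from {departure} to {destination} is {flight.number} "
            f"at {flight.departs_at}, with {flight.seats_left} seats left "
            f"for ${flight.price_usd:.2f}.")


api = InMemoryFlightApi([
    Flight("XX100", "SF", "NYC", "09:30", 0, 320.0),   # sold out
    Flight("XX200", "SF", "NYC", "13:15", 4, 289.5),
    Flight("XX300", "SF", "NYC", "18:40", 12, 250.0),
])
print(answer_next_flight(api, "SF", "NYC"))
```

Keeping the data access behind a small interface like search means the conversational layer does not change when the stub is replaced by a real API.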

Challenges and gaps in conversational UI

Conversational UI is still new to us and, as such, there are still challenges and gaps that prevent it from reaching its full potential. Technology has improved greatly over the years to get us to where we are, but we are still far from HAL 9000, the computer in the movie 2001: A Space Odyssey that interacts freely with the ship's astronaut crew and controls the systems of the Discovery One spacecraft while appearing to think and feel. We must also keep in mind that even HAL had some malfunctions. In this section, I will list the five main challenges that technology and bot designers will have to address in the next few years.

NLU is an AI-hard problem

The more sophisticated, natural, and humanized human-machine interaction becomes, the harder it is to build and develop. While creating a simple command-line text-based interface can be done by any developer, a high-quality UI in the form of a chatbot or voicebot requires many experts, including chat and voice designers and NLU specialists, both of which are very hard to find.

Natural language understanding is the attempt to mimic reading comprehension by a machine. It is a subtopic of AI and, as mentioned earlier, it is an AI-hard (or AI-complete) problem. An AI-hard problem is equivalent to solving the central AI problem: making computers as intelligent as people (https://en.wikipedia.org/wiki/AI-complete). Why is it so difficult? As discussed above, when responding to a conversational UI, there is an infinite number of unknown and unexpected features in the input, within an infinite number of options of syntactic and semantic schemes to apply to it. This means that when we chat or talk to a bot, just as when we talk to another person, we are unlimited in what we can say. We are not restricted to keeping to a specific GUI path: we are free to ask about anything and everything.

One way to tackle the NLU AI-hard issue is to focus and limit the computer's understanding to a specific theme, subject, or use case. When I go to the doctor, I'm probably not going to consult with them about the return I will yield when investing in the NY stock exchange. When I visit the doctor, I am within a specific context: I don't feel well, I need a new prescription for a medication, and so on. In fact, just within a doctor scenario, there are so many use cases that we will have to predefine that it makes sense to break them down into sub-use cases, to help improve our NLU in sub-domain contexts (pediatrics, gynecology, oncology, and so on).

If we go back to our travel example, we can train the NLU layer of our bot to be able to respond to everything related to the booking of flights. In this case, we mimic a possible conversation between the user and a travel agent. While a human travel agent can help us with additional tasks, such as finding a hotel, planning our trip, and more, in this use case we will stay within the context of booking flights to maximize the experience and the responses.

Accuracy level

A major derivative of the NLU problem is the accuracy level of the conversation. Even when limiting our bot to a specific use case, the need to cover all possible requests, in each form of language, makes it very hard to create a good user experience (UX). In fact, a 2017 report put the failure rate of Facebook Messenger chatbots at around 70% of interactions (https://www.fool.com/investing/2017/02/28/facebook-incs-chatbots-hit-a-70-failure-rate.aspx). While users are willing to try to address their needs quickly with an automated system, they are unforgiving once the system fails to serve them.

The accuracy of the level of understanding is dependent on the number of preconfigured samples in the bot. Those samples are sentences that users say that represent their request or intent. The bot, thereafter, translates them into actions. For every request, there are hundreds of such sentences. For complex requests, where there are also many parameters involved (such as our flight booking bot example), there are thousands, if not tens of thousands of them. This remains an unsolved problem today and, as a result, many bots today offer a poor experience to their users, which stays within very limited boundaries.

From GUI to CUI and VUI

The transition from GUI to conversational UI (CUI), as well as to conversational user experience (CUX), and voice user experience (VUX) introduces many challenges within this paradigm shift that we are witnessing. Beyond the unlimited options that we discussed above, as part of the AI-hard problem raised around NLU, when building a conversational UI, and especially a voice UI and UX, there is a challenge of exposing the user to your offer in a screenless environment.

When I go to the store, I can see all the items I can choose from and purchase, and I can ask the salesperson for more help. A good salesperson will help me and recommend items that they think I should be made aware of in the store. When I shop online, I can view all the items that are available for me to purchase and can also search for something specific and browse through the various results. Here, as well, I can get recommendations, sometimes based on my previous purchases, in different graphical forms such as pop-ups or newsletters. Exposing the user to your offering within a text or a voice conversational UI is extremely difficult. Just as a conversational UI is limited in nature (focusing on specific use cases, within a certain context), the ways to expose users to what you offer, or to how you can help them, are limited as well.

Chatbots

Many chatbots offer a menu-based interaction, providing options to choose from. This way, the conversation is limited to a specific flow (state machine supported), but the added value is that the user can be exposed to additional information. The problem with this solution is that it inherits the GUI experience into the CUI and very often offers very little value.

Figure 15: The Expedia Facebook Messenger bot: is a menu-based interaction conversational?

Voicebots

In the case of voicebots, we often witness a "help" section, which provides the user with a list of actions they can perform when talking to the bot. This will be in the form of an introduction to the application, offering a few examples of what the user can ask. Going back to our flight example, imagine that a user says, "Ok Google, open travel bot." The first response can be, "Welcome to Travel Bot! How can I help you? You can ask me: what is the next flight to NYC from SF?" In addition, voice-enabled devices, such as Amazon Alexa and Google Home, provide users with an instruction card that gives some examples of questions. The companies also send out a weekly newsletter with new capabilities.

Figure 16: The Amazon Echo jump-start card for first-time users, which exposes users to basic capabilities

Non-implicit contextual conversation

I mentioned a couple of times the need to build contextual conversational UI and UX, and I will dedicate a full chapter (Chapter 3, Building a Killer Conversational App) to this in the book. Being a major challenge in today's conversational UI development, I believe that it deserves one more mention in this section.

We expect bots to replace humans, not computers. The conversational UI mimics my interaction with a human, whether through text or voice. Even when we limit the interaction to a specific use case and include all possible sample sentences that could prompt a question, there is one thing that is very difficult to predict within a contextual conversation: non-implicit requests, that is, requests that the user never states outright and that the listener has to infer.

If I call my travel agent and excitedly tell her that my daughter's 6th birthday is coming up, she might "do the math" and understand that we are planning a family trip to Disneyland. She will then extract all the parameters needed to complete my request:

  1. Dates
  2. Number of people/adults/kids
  3. Flights
  4. Hotels for the dates
  5. Car rental
  6. Allergies and more…

Even though I haven't explicitly requested her help to plan a trip to Disneyland, the travel agent will be able to connect the dots and respond to my request. Training a machine to do that, that is, to react to non-implicit requests, remains a huge challenge in today's technology stack. However, the good news is that AI technologies and, more specifically, machine learning and deep learning, will become very useful in the next couple of years for tackling this challenge.

Security and privacy

One very controversial aspect when discussing chatbots and voicebots is security and, more specifically, the privacy around it. In today's world, chatbot and voicebot platforms are controlled by some of the leading corporations and our data and information become their assets. Although Google, Amazon, and Facebook have been collecting private data for quite a while (whenever we searched the web, purchased items on Amazon, or just posted something on Facebook), now those companies "listen" to us outside of the web/app environment: they are in our homes and in every private message. Recently, Amazon Alexa was accused of recording a private conversation of a man at his home and sending it to his boss, without that person's consent.

The "constantly listening" functionality reminds many of George Orwell's 1984 and the party-monitoring telescreen that was designed to simultaneously broadcast entertainment and listen in to people's conversations to detect disorders. Although Orwell's telescreen was used by a tyranny to control its people, whereas today's solutions are owned by commercial corporations, one cannot help but wonder what the implications of using such devices will be in the future.

Conversational channels controlled by the above corporations have also become a challenge for businesses that are forced into running their customers' interactions through third-party channels. Where five years ago businesses were reluctant to shift their data centers to the cloud, today that reluctance has little meaning, since data is being transferred through additional channels anyway.

This is important for us to understand when we design our chatbots and voicebots. Mainly, we should protect our customers' data and, where needed, obey the relevant country's/state's regulations. We should make sure we are not asking for specific data, such as SSN or credit card numbers and, for the time being, use complementary ways to get that, such as rerouting the user to a secure site to complete registration.
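One small precaution in that direction is to mask sensitive-looking numbers before a message is logged or passed on, and to use the same check as a cue to reroute the user to a secure site. The sketch below is only an illustration: the patterns cover US-style SSNs and 13 to 16 digit card numbers, the function names are hypothetical, and pattern matching of this kind is not a substitute for meeting the applicable regulations.

```python
import re

# US-style SSN: 3 digits, 2 digits, 4 digits, separated by dashes.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 16 digits, optionally separated by single spaces or dashes.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact(message):
    """Mask SSN-like and card-like numbers in a user message."""
    message = SSN_PATTERN.sub("[REDACTED SSN]", message)
    return CARD_PATTERN.sub("[REDACTED CARD]", message)


def contains_sensitive(message):
    """True if the message appears to contain an SSN or a card number."""
    return bool(SSN_PATTERN.search(message) or CARD_PATTERN.search(message))


print(redact("My SSN is 123-45-6789"))
print(redact("Card 4111 1111 1111 1111 please"))
print(contains_sensitive("I want to fly to NYC"))
```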

Summary

Intelligent assistants, chatbots, voicebots, and voice-enabled devices, such as Amazon Echo and Google Home, have stormed into our lives, offering many ways to improve daily tasks through natural human-computer communication. In fact, some of the applications that we use today already take advantage of voice/chat-enabled interaction to ease our lives. Whether we are turning the lights on and off in our living room with a simple voice command or shopping online with a Facebook Messenger bot, conversational UI makes our interactions more focused and efficient.

Fast-forward from today, we can assume that conversational UI, and more specifically voice-enabled communication, will replace all interactions with computers. In the movie Her (2013), written and directed by Spike Jonze, an unseen computer bot communicates with the main character using voice. This voicebot (voiced by Scarlett Johansson) assists, guides, and advises the main character on any possible matter. It is a personal assistant on steroids.

Its knowledge is unlimited, it continues to learn all the time, it can hold a conversation (a true exchange of ideas), and in the end it can even understand feelings (although it still doesn't feel anything itself). However, as we've seen above, with current technology, real-life conversational UI still lacks many of the components seen in Her and faces unsolved challenges and open questions. The experience is limited for the user, as it still mostly lacks context, and bots are far from understanding feelings or social situations.

Nevertheless, with all the limitations we experience today, creating a supercomputer that knows everything is more within reach than creating a super-knowledgeable person. Technology, whether in the form of advanced AI, ML, or DL methodologies, will solve most of those challenges and make the progress needed to build successful bot assistants.

What might take a bit more time to transform is human skepticism: conversational UI is also limited because its users are still very skeptical of it. Aware of its limitations, we stick to what works best and tend not to challenge it too much. When comparing child-bot interaction with that of adults, it is clear that while adults stay within specific boundaries of usage, children interact with the bots as if they were real adult humans, knowledgeable about almost everything. It might be a classic chicken-or-the-egg dilemma, but one thing is for sure: the journey has started and there's no going back.


Key benefits

  • Build AI chatbots and voicebots using practical and accessible toolkits
  • Design and create voicebots that really shine in front of humans
  • Work with familiar appliances like Alexa, Google Home, and FB Messenger
  • Design for UI success across different industries and use cases

Description

We are entering the age of conversational interfaces, where we will interact with AI bots using chat and voice. But how do we create a good conversation? How do we design and build voicebots and chatbots that can carry successful conversations in the real world? In this book, Rachel Batish introduces us to the world of conversational applications, bots, and AI. You’ll discover how - with little technical knowledge - you can build successful and meaningful conversational UIs. You’ll find detailed guidance on how to build and deploy bots on the leading conversational platforms, including Amazon Alexa, Google Home, and Facebook Messenger. You’ll then learn key design aspects for building conversational UIs that will really succeed and shine in front of humans. You’ll discover how your AI bots can become part of a meaningful conversation with humans, using techniques such as persona shaping and tone analysis. For successful bots in the real world, you’ll explore important use cases and examples where humans interact with bots. With examples across finance, travel, and e-commerce, you’ll see how you can create successful conversational UIs in any sector. Expand your horizons further as Rachel shares with you her insights into cutting-edge voicebot and chatbot technologies, and how the future might unfold. Join in right now and start building successful, high-impact bots!

Who is this book for?

This book is for you if you want to deepen your appreciation of UI and how conversational UIs - driven by artificial intelligence - are transforming the way humans interact with computers, appliances, and the everyday world around us. This book works with the major UI toolkits available today, so you do not need deep programming knowledge to build the bots in this book: a basic familiarity with markup languages and JavaScript will give you everything you need to start building cutting-edge conversational UIs.

What you will learn

  • Build your own AI voicebots and chatbots
  • Use familiar appliances like Alexa, Google Home, and Facebook Messenger
  • Master the elements of conversational user interfaces
  • Key design techniques to make your bots successful
  • Use tone analysis to deepen UI conversation for humans
  • Create voicebots and UIs designed for real-world situations
  • Insightful case studies in finance, travel, and e-commerce
  • Cutting-edge technology and insight into the future of AI bots

Product Details

Publication date: Sep 29, 2018
Length: 296 pages
Edition: 1st
Language: English
ISBN-13: 9781789136883




Table of Contents

13 Chapters
1. Conversational UI is our Future
2. How Not to Build Your Next Chat and Voicebots
3. Building a Killer Conversational App
4. Designing for Amazon Alexa and Google Home
5. Designing a Facebook Messenger Chatbot
6. Contextual Design – Can We Make a Bot Feel More Human?
7. Building Personalities – Your Bot Can Be a Better Human
8. A View into Vertical-Specific Bots – Financial Institutions
9. Travel and E-Commerce Bots – Use Cases and Implementation
10. Conversational Design Project – A Step-By-Step Guide
11. Summary
Other Books You May Enjoy
Index

Customer reviews

Rating distribution
4.6 out of 5 (7 Ratings)
5 star 71.4%
4 star 14.3%
3 star 14.3%
2 star 0%
1 star 0%

Top Reviews

Luciana, Mar 27, 2019
Rating: 5 out of 5
I was looking for a quick read on Bot design, NOT development. This book provided a great overview on all the considerations required to design a solid bot.
Amazon Verified review

Avishay Pariz, Oct 26, 2018
Rating: 5 out of 5
Great book. Written very well with simple explanations and visuals to make things very clear. Also up to date and have examples that very relevant for today and the near future.
Amazon Verified review

Teri, Oct 31, 2018
Rating: 5 out of 5
In this voice-first world, technology is changing so fast... and Rachel covers it all! She covers everything from definitions of basic voice user interface terms to tutorials on how to design effective conversation experiences. This is the ultimate resource for anyone wanting to learn about the latest technology of conversational design.
Amazon Verified review

Amazon Customer, Oct 28, 2018
Rating: 5 out of 5
Rachel is a preiminent thought leader in the conversational interface space. Rachel leverages her real world experience consulting enterprise companies at Conversation One and injects her actionable insights into this book. She expertly articulates easy to understand strategies, best practices, and methodologies. I highly recommend this book for novices and experts alike working in the chatbot/voice space.
Amazon Verified review

Bojan Poturica, May 07, 2020
Rating: 5 out of 5
It is not an easy task to find quality books and information on chatbots and voice bots, but this book is really great to read. It gives you a great overview and saves lot of time searching for the information.
Amazon Verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the eBook to be usable for you, the reader, with our need to protect the rights of us as publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title.
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Card, or PayPal).

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book, go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats, such as PDF and ePub. In the future, this may well change with trends and developments in technology, but please note that our PDFs are not in Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower priced than print
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.