Chatbot Using Deep Learning

Tenzin Wangdu
5 min read · May 4, 2021

--

What is a Chatbot?

A chatbot is a conversational agent that interacts with users using natural language. It is also often described as an expression of the interaction between humans and machines. On the backend, it identifies the user’s request and returns a response.

Background

The first chatbot, ELIZA, was developed at MIT in 1966. ELIZA was a simple decision-tree program that could answer only a few questions. Today, chatbots are part of everyday life through messenger apps, voice assistants, and much more, and they are quickly replacing humans in technical support and customer service.

Benefits

Almost every business can use a chatbot for its website or app, and companies across the private sector need chatbots for their customer service. A chatbot can stand in for customer service agents with 24-hour availability and help businesses save money.

If you want to check my presentation on this project, here is the link.

Different methods to create a chatbot

  • Creating a fixed set of patterns and responses
  • Using RASA Framework
  • Creating your own Framework

Creating your own data/intents

My notebook for the first method is here.

Intents are categories for the text of the user’s input. For example, ‘Hi’ would be a greeting intent, and ‘How can you help me?’ would be a help intent. You can create different intents for different purposes, each with its own set of responses. This approach is great for FAQs and is easy to build.

First Step: Create Intents based on the type of FAQ you want

Below is a short Apple Stock price FAQ that I created.

Tag: Possible classes of user intention for asking a question

Pattern: The ways in which users usually ask questions relating to a particular tag — the more, the merrier

Response: Predefined responses for each tag in the dataset from which the model can choose to respond to a particular question

Context: Contextual words relating to a tag for easy and better classification of what the user intends with their request
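Putting those four fields together, a minimal intents file for an Apple stock FAQ might be built like this (the tags, patterns, and responses here are illustrative placeholders, not the exact ones from my file):

```python
import json

# Illustrative intents: every tag, pattern, and response below is a placeholder
intents = {
    "intents": [
        {
            "tag": "greeting",
            "patterns": ["Hi", "Hello", "Hey there"],
            "responses": ["Hello! Ask me anything about Apple stock."],
            "context": [""]
        },
        {
            "tag": "price",
            "patterns": ["What is the Apple stock price?",
                         "How much does AAPL cost?"],
            "responses": ["You can check the live AAPL price on any finance site."],
            "context": ["stock"]
        }
    ]
}

# Write the intents to the JSON file that the rest of the code loads
with open("intents.json", "w") as f:
    json.dump(intents, f, indent=2)
```

Each entry in the `intents` list is one tag with its patterns, responses, and context words, matching the four fields described above.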

Next Step: Load the file and pre-process the data

## Loading the file
import json
with open("intents.json") as file:
    data = json.load(file)

For pre-processing, you need to convert the patterns into arrays of numbers for the modeling step. Below are the preprocessing steps applied to the patterns in the data:

Tokenize: Separates each pattern into a list of words

Stemming: Reduces similar words like “programs”, “programmer”, and “programming” to their base word “program”

Removing special characters: I use the normalize method to strip non-ASCII characters from the words

import unicodedata
for word in new_words:
    word = unicodedata.normalize('NFKD', word).encode('ascii', 'ignore').decode('utf-8', 'ignore')
    words.append(word)

Labels: We create a label for each list of words using the tags from the intents file that we created above.

Creating a bag of words in binary to train the model

With the word list created during preprocessing, we turn each pattern into an array of numbers.

for x, doc in enumerate(docs_x):
    bag = []
    wrds = [stemmer.stem(w) for w in doc]
    for w in words:
        if w in wrds:
            bag.append(1)
        else:
            bag.append(0)
    output_row = out_empty[:]
    output_row[labels.index(docs_y[x])] = 1
    training.append(bag)
    output.append(output_row)
## switching the lists into arrays for input into the model
training = np.array(training)
output = np.array(output)
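The whole preprocessing stage can be sketched end to end. To keep this snippet runnable without nltk’s downloads, a naive whitespace tokenizer and a crude suffix-stripping “stemmer” stand in for nltk’s `word_tokenize` and `LancasterStemmer`; the resulting `words`, `labels`, `training`, and `output` have the same shapes as in the code above.

```python
import numpy as np

def tokenize(text):
    # naive stand-in for nltk.word_tokenize
    return text.lower().replace("?", "").replace("!", "").split()

def stem(word):
    # crude stand-in for a real stemmer: strip a common suffix
    for suffix in ("ing", "ers", "er", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# a tiny illustrative intents dict (placeholder tags and patterns)
intents = {"intents": [
    {"tag": "greeting", "patterns": ["Hi", "Hello there"]},
    {"tag": "price", "patterns": ["What is the stock price?"]},
]}

words, labels, docs_x, docs_y = [], [], [], []
for intent in intents["intents"]:
    for pattern in intent["patterns"]:
        tokens = tokenize(pattern)
        words.extend(tokens)
        docs_x.append(tokens)   # the tokenized pattern
        docs_y.append(intent["tag"])  # its tag
    labels.append(intent["tag"])

words = sorted(set(stem(w) for w in words))
labels = sorted(labels)

# one binary bag-of-words row per pattern, one one-hot label row per pattern
training, output = [], []
out_empty = [0] * len(labels)
for x, doc in enumerate(docs_x):
    wrds = [stem(w) for w in doc]
    bag = [1 if w in wrds else 0 for w in words]
    output_row = out_empty[:]
    output_row[labels.index(docs_y[x])] = 1
    training.append(bag)
    output.append(output_row)

training = np.array(training)
output = np.array(output)
print(training.shape, output.shape)  # one row per pattern; columns per word / per label
```

In the real notebook, the tokenizer and stemmer come from nltk, but the bag-of-words bookkeeping is the same.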

Modeling: Deep Neural Network

For this model, I used the tflearn module to build and train a neural network. Since the dataset is very small, I used 1000 epochs, because the model can get through them very quickly.

## defines the input shape for the model
net = tflearn.input_data(shape = [None, len(training[0])])
## adding 2 hidden layers to the neural network
## (use more hidden layers for more complex problems)
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 8)
## the output layer's softmax activation gives a probability for each neuron
net = tflearn.fully_connected(net, len(output[0]), activation = 'softmax')
net = tflearn.regression(net)
## Deep Neural Network model
model = tflearn.DNN(net)
model.fit(training, output, n_epoch = 1000, batch_size = 8, show_metric = True)

Building the Chatbot

Now, we need to convert the user’s input into a format the machine understands, so we need a function that creates an array from the user’s input.

def bag_of_words(s, words):
    bag = [0 for _ in range(len(words))]

    s_words = nltk.word_tokenize(s)
    s_words = [stemmer.stem(word.lower()) for word in s_words]

    for se in s_words:
        for i, w in enumerate(words):
            if w == se:
                bag[i] = 1
    return np.array(bag)

This function takes an input string “s” and the list of words that you preprocessed above. It tokenizes and stems the input, then turns it into a binary array for the machine to understand: if a word from your input is in the list, the corresponding position becomes 1, and 0 if it is not.
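To see the bag-of-words conversion in isolation, here is a dependency-free version (a plain lowercase/split tokenizer stands in for nltk’s tokenizer and stemmer, which is enough when the vocabulary already holds lowercase words):

```python
def bag_of_words(s, words):
    # 1 where a vocabulary word appears in the input, 0 everywhere else
    bag = [0] * len(words)
    s_words = [w.lower() for w in s.split()]
    for se in s_words:
        for i, w in enumerate(words):
            if w == se:
                bag[i] = 1
    return bag

# hypothetical vocabulary for demonstration
vocab = ["hello", "hi", "price", "stock", "what"]
print(bag_of_words("What is the stock price", vocab))  # -> [0, 0, 1, 1, 1]
```

Words outside the vocabulary (“is”, “the”) are simply ignored, which is why a fixed word list from preprocessing matters.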

Now you can create a chatbot function that can interact with you.

def chat():
    print('Hello, I am an Apple Stock FAQ Bot! How may I help you? (type q to stop)')
    while True:
        inp = input('You: ')
        if inp.lower() == 'q':
            break
        ## the model gives out a probability for each intent
        results = model.predict([bag_of_words(inp, words)])[0]
        ## index of the largest probability, so we can display the best answer
        results_index = np.argmax(results)
        tag = labels[results_index]
        if results[results_index] > 0.7:
            for tg in data['intents']:
                if tg['tag'] == tag:
                    responses = tg['responses']
            print('Chatbot: ' + random.choice(responses))
        else:
            print("Chatbot: Sorry! For better help, here is the customer service number: 877-360-5390 (U.S. toll-free)")

This function runs bag_of_words to convert your input into a binary array, then uses the model to predict. The model returns a probability for each intent; the function takes the intent with the highest probability, and if that probability is greater than 70%, it prints one of that intent’s responses. Otherwise, it falls back to the customer service number.
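The argmax-and-threshold step can be illustrated on its own. Assuming the model returned these made-up probabilities for three hypothetical intents, the bot answers only when the best score clears 0.7:

```python
import numpy as np

labels = ["greeting", "help", "price"]          # hypothetical intent labels
results = np.array([0.05, 0.10, 0.85])          # made-up model probabilities

results_index = int(np.argmax(results))          # index of the best intent
tag = labels[results_index]
if results[results_index] > 0.7:
    print("answering with intent:", tag)         # confident enough to answer
else:
    print("falling back to customer service")    # below the 70% threshold
```

With a vector like [0.4, 0.35, 0.25] the same code would take the fallback branch, since no intent clears the threshold.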

RASA Framework

Rasa is a framework for developing AI-powered, industrial-grade chatbots. It’s really powerful and is used by developers worldwide to create chatbots and contextual assistants. Rasa has many dependency issues, so I suggest you either create a new virtual environment or use Google Colab. Here is the link to the guide to install Rasa on your computer.

Rasa lets you design stories so you can keep conversing with the bot: it remembers answers from earlier in the conversation, so the chatbot can control the flow of the dialogue. This is useful for businesses such as hotels and airlines, where a customer can book a room or a flight over several turns.

When you create a Rasa project, it comes with some files that you need to edit in order to customize it to your liking.

RASA Files

config.yml: all the machine learning settings
domain.yml: an overview of intents, responses, and custom actions
stories.md: examples of intent/action sequences
nlu.md: examples of intents/entities
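For reference, Rasa 1.x uses a Markdown format for its training data. A minimal illustrative pair of nlu.md and stories.md snippets (the intent and action names here are placeholders, not from my project) could look like:

```md
<!-- nlu.md: example user messages per intent -->
## intent:greet
- hi
- hello there

## intent:ask_price
- what is the apple stock price?
- how much is AAPL today?

<!-- stories.md: an example intent/action sequence -->
## price question
* greet
  - utter_greet
* ask_price
  - utter_price
```

Each `## intent:` block supplies training examples for one intent, and each story lists the intents (`*`) and bot actions (`-`) of one sample conversation.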

RASA Pipeline

  • Whitespace Tokenizer (using whitespaces as a separator)
  • Count Vectors Featurizer (creates a bag-of-words representation of user messages, intents, and responses)
  • N-gram from 1–4
  • NLU model (Natural language understanding)

You can edit the pipeline in the config.yml file to optimize it for your chatbot. Here is the notebook for it.
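A config.yml pipeline matching the components listed above might look like this in Rasa 1.x (a sketch, not the exact file from my project; component names can differ between Rasa versions):

```yaml
language: en
pipeline:
- name: WhitespaceTokenizer          # whitespaces as the separator
- name: CountVectorsFeaturizer       # bag-of-words over whole words
- name: CountVectorsFeaturizer       # character n-grams from 1 to 4
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: EmbeddingIntentClassifier    # the NLU intent model
```

The second featurizer adds the 1–4 character n-grams mentioned above, which helps the classifier tolerate typos.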

Below is the GitHub link for the project if you want to check it out.

https://github.com/tw1270/Chatbot

https://www.linkedin.com/in/tenzin-wangdu/
