A chatbot is a computer program that holds a conversation with a user through auditory or textual methods. In short, it acts as a conversational partner. In this tutorial, we will create a very simple chatbot application that tells you a little about lung cancer. So, let's get started!
Prerequisites
Before you go ahead, please note that there are a few prerequisites for this tutorial. You should have some basic knowledge of machine learning, as well as basic programming knowledge in any language (preferably Python). You should also have some familiarity with NLTK (the Natural Language Toolkit), a leading platform for building Python programs that work with human language data. Apart from this, the article is beginner-friendly and easy to follow. We will use Google Colab to write our code, but you can write it in any code editor of your choice.
Chatbot
A chatbot is a piece of software, found in applications such as virtual assistants (Siri, Alexa, Google Assistant, etc.), websites, or other platforms, that converses with users to understand their needs and then helps them perform a particular task. Chatbots fall into two main types: self-learning and rule-based chatbots.
- Self-learning chatbots use machine learning and artificial intelligence algorithms to recognize characteristics of the input they get from users and use them later on. These characteristics are obtained from trained models, usually in the form of high-dimensional vectors. They generally cope with varied input better than rule-based chatbots.
- Rule-based chatbots base their responses on a fixed set of rules. A rule-based approach is simple and a good place to start, but it can fail to handle a user's more complex questions.
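To make the rule-based idea concrete, here is a minimal sketch in plain Python. The patterns, the canned replies, and the names `RULES`, `FALLBACK` and `rule_based_reply` are all made up for illustration; they are not part of the application built in this tutorial.

```python
import re

# Hypothetical rules for illustration: each pattern maps to a canned reply.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b"), "Hello! How can I help you?"),
    (re.compile(r"\b(bye|goodbye)\b"), "Goodbye!"),
    (re.compile(r"\bthank(s| you)\b"), "You are welcome."),
]

FALLBACK = "Sorry, I do not understand what you are saying."


def rule_based_reply(message):
    """Return the reply of the first rule whose pattern matches the message."""
    text = message.lower()
    for pattern, answer in RULES:
        if pattern.search(text):
            return answer
    # No rule matched: this is where a rule-based bot runs out of answers.
    return FALLBACK
```

A greeting such as "Hello there" hits the first rule, while a question like "What causes lung cancer?" matches nothing and falls through to the fallback, which is exactly the weakness described above.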
Our Application
We will build a chatbot buddy that acts as a doctor and answers your questions about lung cancer. It follows the self-learning approach in a simple form: it responds by selecting the sentence that is most similar to the question the user has asked. The application takes a text about lung cancer and uses it to converse with the user and answer their queries about the disease. The chatbot's knowledge is limited to the content of the web page that the text is taken from.
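The "most similar sentence" idea can be sketched without any libraries. The snippet below is a simplified, hand-rolled version of TF-IDF weighting and cosine similarity; the helper names (`tokenize`, `tfidf_vectors`, `cosine`, `most_similar`) are invented for this illustration, and the weighting formula is an assumption that will not match scikit-learn's numbers exactly.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it on whitespace (deliberately crude)."""
    return sentence.lower().split()


def tfidf_vectors(sentences):
    """Build one {word: weight} dict per sentence using tf * idf."""
    docs = [Counter(tokenize(s)) for s in sentences]
    n_docs = len(docs)
    # Document frequency: in how many sentences does each word appear?
    df = Counter(word for doc in docs for word in doc)
    return [
        {word: count * (math.log(n_docs / df[word]) + 1.0)
         for word, count in doc.items()}
        for doc in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(question, sentences):
    """Return (best sentence, score); a score of 0.0 means no word overlap."""
    vectors = tfidf_vectors(sentences + [question])
    query = vectors[-1]
    # zip stops at the last real sentence, so the question is not compared to itself.
    scored = [(cosine(query, vec), sent) for vec, sent in zip(vectors, sentences)]
    best_score, best_sentence = max(scored, key=lambda pair: pair[0])
    return best_sentence, best_score
```

Calling `most_similar("what does smoking do", sentences)` on a handful of sentences returns the one that shares the word "smoking" with the question. A question with no words in common with any sentence gets a score of 0.0, which is the case where a bot should admit it does not understand.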
Python Implementation
    # Installing essential packages
    # (run these in a terminal, or prefix each with "!" in a Colab cell)
    # pip3 install nltk
    # pip3 install newspaper3k

    # Importing relevant libraries
    import string

    import nltk
    from newspaper import Article
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Downloading NLTK packages
    # (depending on your NLTK version, 'punkt_tab' and 'omw-1.4' may also be needed)
    nltk.download('punkt')
    nltk.download('wordnet')

    # Getting the article and extracting its text
    paper = Article('https://www.mayoclinic.org/diseases-conditions/lung-cancer/symptoms-causes/syc-20374620')
    paper.download()
    paper.parse()
    paper.nlp()
    text = paper.text
    # print(text)

    # Splitting the article text into a list of sentences
    tokens = nltk.sent_tokenize(text)
    # print(tokens)

    # Translation table that maps every punctuation character to None (removes it)
    no_punctuation = dict((ord(punct), None) for punct in string.punctuation)

    # Lemmatizer: groups inflected or variant forms of a word under one base form
    lemmatizer = nltk.stem.WordNetLemmatizer()


    # Returns the lemmatized, lower-case words of a text after punctuation removal
    def lemmatizeWord(text):
        words = nltk.word_tokenize(text.lower().translate(no_punctuation))
        return [lemmatizer.lemmatize(word) for word in words]

    # print(lemmatizeWord(text))


    def reply(user_reply):
        # User's question
        user_reply = user_reply.lower()
        # Chatbot's reply to the user's question
        chatbot_reply = ''
        # Append the user's question to the list of sentences
        tokens.append(user_reply)
        # TF-IDF weighs how often a word occurs in a sentence against how
        # common it is across all sentences, so rare words count for more
        TfidfVector = TfidfVectorizer(tokenizer=lemmatizeWord, stop_words="english")
        # Converting the sentences to matrix form
        tfidf = TfidfVector.fit_transform(tokens)
        # Similarity between the user's question (last row) and every sentence
        scores = cosine_similarity(tfidf[-1], tfidf)
        # The highest score is the question matched against itself,
        # so the index of the most similar article sentence is second from the end
        index = scores.argsort()[0][-2]
        # Flatten the scores into a single list and sort in ascending order
        flat = scores.flatten()
        flat.sort()
        best_score = flat[-2]
        # print(best_score)
        # If no sentence is similar to the user's question, best_score is 0
        if best_score == 0:
            chatbot_reply = chatbot_reply + "Sorry, I do not understand what you are saying."
        else:
            chatbot_reply = chatbot_reply + tokens[index]
        # Remove the user's question (the last element) from the list again
        tokens.pop()
        # print(chatbot_reply)
        return chatbot_reply


    def main():
        var = 1
        print("Chatbot: Hi, I am your doctor chatbot buddy. I specialise in Lung cancer (sort of).What is your name?. PS, if you wish to quit this conversation, type 'quit'.")
        user_name = input("User: ")
        if user_name == "quit":
            var = 0
            print("Chatbot: Oh thats ok. We can talk later")
        else:
            print("Chatbot: Well hello " + user_name + ". What do you want to know about Lung cancer?")
        while var == 1:
            user_reply = input(user_name + ": ")
            user_reply = user_reply.lower()
            if user_reply != "quit":
                if user_reply == "thanks" or user_reply == "thank you":
                    var = 0
                    print("Chatbot: You are welcome.")
                else:
                    print("Chatbot: " + reply(user_reply))
            else:
                var = 0
                print("Chatbot: Oh thats ok. We can talk later.")


    main()
Output:
    Chatbot: Hi, I am your doctor chatbot buddy. I specialise in Lung cancer (sort of).What is your name?. PS, if you wish to quit this conversation, type 'quit'.
    User: Junaid
    Chatbot: Well hello Junaid. What do you want to know about Lung cancer?
    Junaid: I suffer from shortness of breath? Do you think I have lung cancer?
    Chatbot: Fluid accumulating in the chest can cause shortness of breath.
    Junaid: So what causes lung cancer?
    Chatbot: How smoking causes lung cancer Doctors believe smoking causes lung cancer by damaging the cells that line the lungs.
    Junaid: Nooo :3
    Chatbot: Sorry, I do not understand what you are saying.
    Junaid: quit
    Chatbot: Oh thats ok. We can talk later.
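The "Sorry, I do not understand" line in the transcript comes from the selection step: the question is scored against every sentence, its match with itself is ignored, and a best score of 0 triggers the fallback. The sketch below isolates that step with plain Python lists. It is a variant, not the tutorial's exact code: it drops the question's own score explicitly instead of relying on it sorting last, and the name `pick_reply` is hypothetical.

```python
FALLBACK = "Sorry, I do not understand what you are saying."


def pick_reply(scores, sentences):
    """Pick the sentence with the highest score, ignoring the question itself.

    scores[i] is the similarity between the question and sentences[i];
    the question is assumed to be the last element of both lists.
    """
    candidate_scores = scores[:-1]  # drop the question's own score
    if not candidate_scores:
        return FALLBACK
    best_index = max(range(len(candidate_scores)),
                     key=candidate_scores.__getitem__)
    if candidate_scores[best_index] == 0:
        # Nothing in the text shares a meaningful word with the question.
        return FALLBACK
    return sentences[best_index]
```

With scores such as `[0.0, 0.6, 1.0]` the function returns the second sentence, because the final 1.0 is only the question compared with itself. With `[0.0, 0.0, 1.0]` it returns the fallback message, as in the "Nooo :3" exchange.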
Applications of Chatbots
Chatbots have become popular in the past few years, and businesses have discovered innovative ways to put them to use. Chatbots can help users to:
- Book flights/tickets for travel
- Order food
- Do market research
- Make product inquiries
- Get health care information
- Find companionship
For your chatbot to produce high-quality output, you need lots and lots of high-quality text data. Most of the time, it is very difficult for individuals or small and medium-sized companies to collect data in such large quantities while maintaining high quality. Therefore, it is often more efficient to find a service that does this laborious work for you. We could be your perfect solution!
Here at DATUMO, we crowdsource our tasks to diverse users located around the globe to ensure both quality and quantity on time. Moreover, our in-house managers double-check the quality of the collected or processed data.
Chatbots are in high demand and are used extensively nowadays. Many businesses use them to automate their customer service and support departments, at least to some extent. If you visit a well-established Facebook page, chances are it has a Messenger chatbot equipped to provide you with generic information. Where the chatbot fails, you are diverted to a human for support instead. This lowers the company's need for customer support staff, which can increase its profits as well. This is just one example; you will find chatbots used by many large enterprises to answer common customer queries automatically, 24/7.
To sum it all up, we started with an introduction to what chatbots are and discussed the two main types, self-learning and rule-based. We built a small medical application about lung cancer in Python using the self-learning approach. Lastly, we discussed some other common applications of chatbots and how businesses today are using them creatively to attract customers.