Building a Chatbot from WhatsApp Conversations: A Step-by-Step Tutorial
In this post, we walk through building a chatbot from exported WhatsApp conversations using Python. We’ll cover text preprocessing, building a word cloud, and training a chatbot using machine learning techniques. By the end of this tutorial, you’ll have a working chatbot that can generate responses based on the sender’s identity.
Data Preparation: Reading WhatsApp Chats into a Pandas DataFrame
- Import necessary libraries, including pandas for data manipulation.
- Read the WhatsApp chat file into a Pandas DataFrame, ensuring proper column naming and formatting.
- Clean and organize the data, extracting timestamps, senders, and messages.
To export the history of a conversation that you’ve had on WhatsApp, you need to open the conversation on your phone. Once you’re on the conversation screen, you can access the export menu:
- Tap the three dots (⋮) in the top right corner to open the main menu.
- Choose More to bring up additional menu options.
- Select Export chat to create a TXT export of your conversation.
In the stitched-together screenshots below, you can see the three consecutive steps numbered and outlined in red:
Filtering Messages within a Timestamp Range:
- Define a timestamp range to filter the DataFrame based on specific dates and times.
- Filter messages from a particular sender within the specified timestamp range.
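The two filtering steps above could be sketched as follows. This is a minimal example under assumptions: the timestamp layout `[12/31/23, 10:15:30]`, the helper name `filter_messages`, and the sample rows are all illustrative and not taken from a real export, so the format string will likely need adjusting to your locale.

```python
import pandas as pd

# Assumed timestamp layout once the brackets are stripped, e.g. "12/31/23, 10:15:30"
TIMESTAMP_FORMAT = "%m/%d/%y, %H:%M:%S"


def filter_messages(df, start, end, sender=None):
    """Return rows whose timestamp lies in [start, end], optionally for one sender."""
    cleaned = df["Timestamp"].str.strip("[]")
    # Unparseable timestamps become NaT and are excluded by the range check
    parsed = pd.to_datetime(cleaned, format=TIMESTAMP_FORMAT, errors="coerce")
    mask = parsed.between(pd.Timestamp(start), pd.Timestamp(end))
    if sender is not None:
        mask &= (df["Sender"] == sender)
    return df[mask]


# Small made-up sample standing in for the parsed chat
sample = pd.DataFrame({
    "Timestamp": ["[12/30/23, 09:00:00]", "[12/31/23, 10:15:30]", "[01/02/24, 18:45:10]"],
    "Sender": ["John Doe", "Jane Roe", "John Doe"],
    "Message": ["Morning", "Hello", "Happy new year"],
})
filtered = filter_messages(sample, "2023-12-31", "2024-01-31", sender="John Doe")
print(filtered)
```

Parsing the strings into real datetimes before comparing avoids the pitfalls of comparing timestamps as text, where "12/31/23" would sort after "01/02/24".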
import pandas as pd

# Raw string so the backslashes in the Windows path are not treated as escapes
chat_file_path = r'D:\DS\chatbot\chat\_chat.txt'
timestamps = []
senders = []
messages = []
# Read the WhatsApp chat file line by line
with open(chat_file_path, 'r', encoding='utf-8') as file:
    for line in file:
        # Split the line into date, time, and the remaining "sender: message" text
        parts = line.strip().split(' ', 2)
        # Ensure the line has at least three parts
        if len(parts) >= 3:
            timestamps.append(parts[0] + ' ' + parts[1])
            # partition yields an empty message when the line has no colon
            sender, _, message = parts[2].partition(':')
            senders.append(sender)
            messages.append(message.strip())
        else:
            print(f"Skipping line: {line}")
df = pd.DataFrame({'Timestamp': timestamps, 'Sender': senders, 'Message': messages})
df
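Splitting on spaces treats every line as a new message, so a message that spans several lines ends up as extra rows with a wrong timestamp and sender. A regex-based alternative might look like the sketch below. It assumes lines shaped like `[12/31/23, 10:15:30] John Doe: Hello`; the pattern and the name `parse_chat_lines` are illustrative, and unlike the code above it stores the timestamp without the surrounding brackets.

```python
import re

# Assumed line shape: "[12/31/23, 10:15:30] John Doe: Hello"
LINE_PATTERN = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s(?P<sender>[^:]+):\s?(?P<message>.*)$"
)


def parse_chat_lines(lines):
    """Turn raw export lines into (timestamp, sender, message) tuples."""
    records = []
    for raw in lines:
        line = raw.strip()
        match = LINE_PATTERN.match(line)
        if match:
            records.append([match["timestamp"], match["sender"], match["message"]])
        elif records and line:
            # No timestamp prefix: treat the line as a continuation of the previous message
            records[-1][2] += "\n" + line
    return [tuple(record) for record in records]
```

The resulting tuples can be loaded with `pd.DataFrame(records, columns=['Timestamp', 'Sender', 'Message'])`, and an open file object can be passed directly as `lines`.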
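def remove_non_message_text(export_text_lines):
    # Drop the first and last entries of the export, then filter out media placeholders
    messages = export_text_lines[1:-1]
    filter_out_msgs = ("<Media omitted>",)
    return tuple(msg for msg in messages if msg not in filter_out_msgs)

if __name__ == "__main__":
    # remove_chat_metadata is not defined in this post; it is assumed to take the
    # export path and return a tuple of message strings with dates and senders stripped
    message_corpus = remove_chat_metadata("chat.txt")
    cleaned_corpus = remove_non_message_text(message_corpus)
    print(cleaned_corpus)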
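The script above calls `remove_chat_metadata`, which is never defined here. One possible version is sketched below. It is a hypothetical implementation: the regular expression assumes prefixes shaped like `9/16/22, 06:34 - Martin: ` or `[9/16/22, 06:34:10] Martin: `, and real exports vary by platform and locale.

```python
import re

# Assumed prefix shapes (they vary by platform and locale):
#   "9/16/22, 06:34 - Martin: "
#   "[9/16/22, 06:34:10] Martin: "
METADATA_PATTERN = re.compile(
    r"^\[?\d{1,2}/\d{1,2}/\d{2,4},\s\d{1,2}:\d{2}(?::\d{2})?\]?"  # date and time
    r"(?:\s-)?\s"                                                  # optional " -" separator
    r"[^:\n]+:\s",                                                 # sender name and ": "
    flags=re.MULTILINE,
)


def strip_metadata(text):
    """Remove the date/time/sender prefix from every line of an export."""
    return tuple(METADATA_PATTERN.sub("", text).split("\n"))


def remove_chat_metadata(chat_export_file):
    """Read an exported chat and return its lines without metadata prefixes."""
    with open(chat_export_file, "r", encoding="utf-8") as corpus_file:
        return strip_metadata(corpus_file.read())
```

Keeping the text-level function separate from the file reader makes the pattern easy to try on a few pasted lines before running it on a full export.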
Training a Chatbot Using ChatterBot:
- Import ChatterBot and related modules for training a chatbot.
- Prepare a DataFrame with ‘Sender’ and ‘Message’ columns for training data.
- Train separate Naive Bayes classifiers for different senders using TF-IDF vectorization.
- Create and train individual chatbots for each sender.
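ChatterBot's `ListTrainer` treats each item in a list as a reply to the item before it, so a list containing only one sender's messages teaches the bot to answer that sender with their own next message. One way to prepare more conversational training data is to pair each message from the target sender with the message it follows. The helper below is a sketch; the name `build_reply_pairs` and the input shape, `(sender, message)` tuples in chat order, are assumptions made for illustration.

```python
def build_reply_pairs(rows, target_sender):
    """Collect [prompt, reply] pairs in which target_sender answers someone else.

    rows is an iterable of (sender, message) tuples in chat order.
    """
    pairs = []
    previous = None
    for sender, message in rows:
        if not message:
            # Skip empty messages so they never become prompts or replies
            continue
        if previous is not None:
            prev_sender, prev_message = previous
            if sender == target_sender and prev_sender != target_sender:
                pairs.append([prev_message, message])
        previous = (sender, message)
    return pairs
```

With the DataFrame from earlier, the rows could come from `zip(df['Sender'], df['Message'])`, and each resulting pair could be passed to a `ListTrainer`'s `train` method as its own short conversation.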
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer
# Replace "John Doe" with "xyz"
df['Sender'] = df['Sender'].replace('John Doe', 'xyz')
# Train a sender classifier on xyz's messages.
# Note: every training label here is 'xyz', so this classifier can only ever predict 'xyz'.
xyz_messages = df[df['Sender'] == 'xyz']
tfidf_vectorizer_xyz = TfidfVectorizer()
features_xyz = tfidf_vectorizer_xyz.fit_transform(xyz_messages['Message'])
classifier_xyz = MultinomialNB()
classifier_xyz.fit(features_xyz, xyz_messages['Sender'])
# Create chatbot for xyz
chatbot_xyz = ChatBot('XYZChatBot')
xyz_messages_list = xyz_messages['Message'].tolist()
trainer_xyz = ListTrainer(chatbot_xyz)
trainer_xyz.train(xyz_messages_list)
# classify_sender is assumed here: a minimal helper inferred from the call below
def classify_sender(text, classifier, vectorizer):
    return classifier.predict(vectorizer.transform([text]))[0]

# Use the new classifier and chatbot in the interactive chat loop
while True:
    user_input = input("You: ")
    if user_input.lower() == 'exit':
        print("Chatbot: Goodbye!")
        break
    sender_xyz = classify_sender(user_input, classifier_xyz, tfidf_vectorizer_xyz)
    if sender_xyz == 'xyz':
        response = chatbot_xyz.get_response(user_input)
        print(f"Chatbot (xyz): {response.text}")
    else:
        print("Chatbot: I'm sorry, I don't recognize the sender of this message.")
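The classifier above is fit on messages from a single sender, so it has only one label to choose from and the "unrecognized sender" branch is never reached. For the classification step to be meaningful, the model needs examples from at least two senders. The sketch below shows one way to do that with the same TF-IDF and Naive Bayes components. The function names, the second sender `abc`, and the training sentences are made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB


def train_sender_classifier(messages, senders):
    """Fit TF-IDF features and a Naive Bayes model on messages from all senders."""
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(messages)
    classifier = MultinomialNB()
    classifier.fit(features, senders)
    return classifier, vectorizer


def predict_sender(text, classifier, vectorizer):
    """Return the sender label the model considers most likely for text."""
    return classifier.predict(vectorizer.transform([text]))[0]


# Made-up training data for illustration
messages = [
    "see you at the gym",
    "gym at six tomorrow",
    "the report is due friday",
    "send me the report draft",
]
senders = ["xyz", "xyz", "abc", "abc"]
classifier, vectorizer = train_sender_classifier(messages, senders)
print(predict_sender("are you going to the gym", classifier, vectorizer))
```

With the chat DataFrame, the inputs would be `df['Message']` and `df['Sender']` for all rows rather than a single sender's subset.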
Conclusion:
- Summarize the key steps in building the WhatsApp chatbot.
- Discuss potential improvements and extensions to enhance the chatbot’s capabilities.