Downloading the therapy bot session text dataset
This section covers downloading and setting up the dataset used for NLP in this chapter.
Getting ready
The dataset that we will use in this chapter is based on interactions between a therapy bot and visitors to an online therapy website. It contains 100 interactions, and each interaction is tagged as either escalate or do_not_escalate. If the discussion warrants a more serious conversation, the bot tags it as escalate, meaning it should be handed to an individual. Otherwise, the bot continues the discussion with the user.
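Once the file is available locally, the two-tag scheme described above can be sanity-checked with a few lines of Python. The following is a minimal sketch, not the chapter's own code: the column name label and the sample rows are assumptions made for illustration, so adjust label_column to match the actual header in the CSV file.

```python
import csv
import io
from collections import Counter

# The two tags named in the dataset description.
EXPECTED_TAGS = {"escalate", "do_not_escalate"}


def count_tags(csv_file, label_column="label"):
    """Count how often each tag appears in an open CSV file.

    label_column is a hypothetical column name; the real header
    in the dataset may differ.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter(row[label_column].strip() for row in reader)
    # Fail loudly if a tag outside the two expected values shows up.
    unexpected = set(counts) - EXPECTED_TAGS
    if unexpected:
        raise ValueError(f"Unexpected tags found: {sorted(unexpected)}")
    return counts


# Made-up rows standing in for the real file, for illustration only.
sample = io.StringIO(
    "chat,label\n"
    "I feel fine today,do_not_escalate\n"
    "I need to talk to someone now,escalate\n"
)
print(count_tags(sample))
```

To run the same check against the downloaded file, pass an open file handle instead of the in-memory sample, for example open("TherapyBotSession.csv", newline="", encoding="utf-8"). The counts across both tags should then add up to the 100 interactions mentioned above.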
How it works...
This section walks through the steps for downloading the chatbot data.
- Access the dataset from the following GitHub repository: https://github.com/asherif844/ApacheSparkDeepLearningCookbook/tree/master/CH07/data
- Once you arrive at the repository, right-click on the file seen in the following screenshot:
- Download TherapyBotSession.csv and save it to the same local directory as the Jupyter notebook SparkSession...
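As an alternative to downloading through the browser, the steps above can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions: the helper names raw_url_from_tree and download are hypothetical, and the raw file URL is derived from the repository link above using GitHub's usual raw.githubusercontent.com pattern, which assumes the file still sits at that path on the master branch.

```python
import shutil
import urllib.request
from pathlib import Path

# Repository folder and file name taken from the steps above.
TREE_URL = (
    "https://github.com/asherif844/ApacheSparkDeepLearningCookbook"
    "/tree/master/CH07/data"
)
FILENAME = "TherapyBotSession.csv"


def raw_url_from_tree(tree_url, filename):
    """Build a raw-download URL from a GitHub 'tree' folder URL.

    Assumption: the branch name contains no slashes, so the text
    after '/tree/' is '<branch>/<path inside the repository>'.
    """
    prefix = "https://github.com/"
    if not tree_url.startswith(prefix) or "/tree/" not in tree_url:
        raise ValueError("Expected a GitHub URL containing '/tree/'")
    owner_repo, branch_and_path = tree_url[len(prefix):].split("/tree/", 1)
    return (
        "https://raw.githubusercontent.com/"
        f"{owner_repo}/{branch_and_path.rstrip('/')}/{filename}"
    )


def download(url, destination):
    """Stream the file at url to destination and return the path."""
    destination = Path(destination)
    with urllib.request.urlopen(url) as response, destination.open("wb") as out:
        shutil.copyfileobj(response, out)
    return destination


# Example call (requires network access), run from the notebook's directory:
# download(raw_url_from_tree(TREE_URL, FILENAME), FILENAME)
```

Running the commented call from a notebook cell saves the file into the notebook's working directory, which matches the location the last step asks for.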