Getting our data ready for the chatbot
To get our data ready, we first need to source it and then process it into a clean, usable state.
Selecting our data sources
For this project, we need to give our LLM agent several types of context-specific knowledge so that it can handle the expected question types:
- Hotel information: This comes from our online travel agent hotel dataset, which represents the hotels we want to recommend to our users. For hotel recommendations, we’ll use a small subset of the dataset at https://www.kaggle.com/datasets/raj713335/tbo-hotels-dataset. This dataset contains information on 1,000,000+ hotels from different countries and regions, such as their rates, reviews, amenities, location, and star ratings. The data was collected from various sources, such as hotel websites, online travel agencies, and review...