Grounding Responses
Hallucinations are one of the key problems with large language models (LLMs). In this chapter, we discuss what hallucination means and how you can reduce how often it occurs. We cover closed-book and open-book question answering, and why retrieval-augmented generation (RAG) is gaining popularity. If the concept of RAG is new to you, don't worry: we'll explain it in this chapter.
We'll also look at a managed Google Cloud service, Vertex AI Agent Builder, that enables you to build RAG-based applications over a custom corpus of data or documents. A classical RAG application consists of two steps: first, retrieving passages relevant to the query from a large corpus of documents, and then passing these passages as context in a prompt so the LLM can generate a complete answer. We'll discuss the key steps of building a RAG application and focus on ways to improve context preparation for...
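The two steps above can be sketched in a few lines of Python. This is a toy illustration, not Vertex AI Agent Builder: the corpus, the word-overlap retriever, and the prompt template are all hypothetical stand-ins for a real vector store and LLM call.

```python
# Minimal sketch of the two classical RAG steps: (1) retrieve relevant
# passages from a corpus, (2) pass them as context in a prompt to an LLM.
# The corpus and scoring function are toy examples; a production system
# would use embeddings and a vector database for retrieval.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Step 1: rank passages by simple word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Step 2: place the retrieved passages as context in the prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

# Hypothetical three-document corpus.
corpus = [
    "Vertex AI Agent Builder is a managed Google Cloud service.",
    "RAG retrieves relevant passages before generation.",
    "The moon orbits the Earth.",
]

query = "What does RAG retrieve?"
prompt = build_prompt(query, retrieve(query, corpus))
print(prompt)
```

In a real application, `prompt` would be sent to an LLM; grounding comes from the model answering from the retrieved context rather than from its parametric memory alone.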