Overcoming hallucinations in LLMs
LLMs are trained on large amounts of publicly available data (see Appendix 1 for more details on how LLMs are trained). By design, they absorb the information they are explicitly given in the prompt and the information they have seen during training. By default, LLMs don’t have access to any external information (except for what they have already memorized), and in most cases they’re autoregressive models: they predict output words or tokens one by one by looking at the previous input, which limits their reasoning capabilities. We’ll see some examples of how we can expand an LLM’s reasoning capabilities with agentic workflows in Chapters 9, 10, and 11.
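To make the autoregressive part concrete, here is a minimal sketch of token-by-token decoding. It assumes the Hugging Face transformers library and the publicly available gpt2 checkpoint, chosen purely for illustration: at every step the model sees only the prompt plus the tokens it has already produced, and picks the next token from a probability distribution over its vocabulary.

```python
# Minimal sketch of autoregressive decoding (assumes transformers + gpt2,
# used here only as an illustrative example).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Generate one token at a time: each step conditions only on the tokens
# produced so far -- the model has no access to any external information.
for _ in range(5):
    with torch.no_grad():
        logits = model(input_ids).logits
    next_token = logits[0, -1].argmax()  # greedy: pick the most probable token
    input_ids = torch.cat([input_ids, next_token.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```

The loop never consults anything outside the model's weights and the growing sequence of tokens, which is exactly why the output can only reflect what the model memorized during training.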
Simply put, LLMs respond to a prompt using the information from their training. Because they reproduce human language so effectively, their answers sound very credible even when they are just a probabilistic continuation of the prompt...
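We can see the "probabilistic continuation" directly by inspecting the model's next-token distribution. The sketch below again assumes transformers and gpt2 for illustration; the answer the model gives is simply whichever continuation it finds most probable, whether or not it is true.

```python
# Peek at the next-token probability distribution (assumes transformers + gpt2,
# used here only as an illustrative example).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The first person to walk on the Moon was"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits[0, -1]  # scores for the next token only
probs = torch.softmax(logits, dim=-1)

# Print the five most probable continuations and their probabilities.
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx)!r:>15}  {p.item():.3f}")
```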