You're reading from Generative AI Application Integration Patterns Integrate large language models into your applications

Product type Paperback

Published in Sep 2024

Publisher Packt

ISBN-13 9781835887608

Length 218 pages

Edition 1st Edition

Languages

Python

Tools

TensorFlow

Concepts

Artificial Intelligence

Authors (2):

Luis Lopez Soria

Juan Pablo Bustos

Table of Contents (13) Chapters

Preface

1. Introduction to Generative AI Patterns

2. Identifying Generative AI Use Cases

3. Designing Patterns for Interacting with Generative AI

4. Generative AI Batch and Real-Time Integration Patterns

5. Integration Pattern: Batch Metadata Extraction

6. Integration Pattern: Batch Summarization

7. Integration Pattern: Real-Time Intent Classification

8. Integration Pattern: Real-Time Retrieval Augmented Generation

9. Operationalizing Generative AI Integration Patterns

10. Embedding Responsible AI into Your GenAI Applications

11. Other Books You May Enjoy

12. Index

Predictive AI vs generative AI use case ideation

Predictive AI refers to systems that analyze data to identify patterns and make forecasts or classifications about future events. In contrast, generative AI models create new synthetic content like images, text, or code based on the patterns gleaned from their training data. For example, with predictive AI, you can confidently identify if an image contains a cat or not, whereas with generative AI you can create an image of a cat from a text prompt, modify an existing image to include a cat where there was none, or generate a creative text blurb about a cat.

AI-focused product innovation spans several phases of the product development lifecycle. With the emergence of generative AI, the starting point has shifted: instead of first compiling training data to train traditional ML models, teams can begin with flexible pre-trained models.

Foundation models such as Google’s PaLM 2 and Gemini, OpenAI’s GPT and DALL-E, and Stability AI’s Stable Diffusion provide broad foundations that enable rapid prototype development. Their versatile capabilities lower the barrier for experimenting with novel AI applications.

Previously, data curation and training a model from scratch could take months before viability could be assessed. Now a proof of concept can be built within days, without the need to fine-tune a foundation model.

This generative approach facilitates more iterative concept validation. After quickly building an initial prototype powered by the baseline model, developers can collect niche training data and perform knowledge transfer via techniques like distillation to customize later versions; we cover distillation in depth later in the book. The foundation model already encodes general patterns, which makes it a useful starting point both for the first prototype and for later, customized iterations.

In contrast, the predictive modeling approach requires upfront data gathering and training before any application testing. This more linear progression limits early-stage flexibility. However, predictive systems can efficiently learn specialized correlations and achieve high-confidence inference once substantial data exists.

Leveraging versatile generative foundations supports rapid prototyping and use case exploration. Later, once sufficient data exists, custom predictive modeling can boost performance on narrow tasks. Blending these AI approaches capitalizes on their complementary strengths throughout the model deployment lifecycle.

Beyond basic prompt engineering with a foundation model, several more complex auxiliary techniques can enhance its capabilities. Examples include Chain-of-Thought (CoT) and ReAct, which empower the model to not only reason about a situation but also define and evaluate a course of action.

ReAct, presented in the paper ReAct: Synergizing Reasoning and Acting in Language Models (https://arxiv.org/abs/2210.03629), addresses the current disconnect between LLMs’ language understanding and their ability to make decisions. While LLMs excel at tasks like comprehension and question answering, their reasoning and action-taking skills (for example, generating action plans or adapting to unforeseen situations) are often treated separately.

ReAct bridges this gap by prompting LLMs to generate both “reasoning traces,” detailing the model’s thought process, and task-specific actions in an interleaved manner. This tight coupling allows the model to leverage reasoning for planning, execution monitoring, and error handling, while simultaneously using actions to gather additional information from external sources like knowledge bases or environments. This integrated approach demonstrably improves LLM performance in both language and decision-making tasks.

For example, in question-answering and fact-verification tasks, ReAct combats common issues like hallucination and error propagation by interacting with a simple Wikipedia API. This interaction allows the model to generate more transparent and trustworthy solutions than methods that lack reasoning or action components. LLM hallucinations are generated content that seems plausible yet is factually unsupported. Various papers aim to address this phenomenon. For example, A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions reviews approaches for both detecting and mitigating hallucinations. Another good example of a mitigation technique is covered in the paper Chain-of-Verification Reduces Hallucination in Large Language Models (https://arxiv.org/pdf/2309.11495.pdf). At the time of writing, hallucination mitigation is a rapidly changing field.

Both CoT and ReAct rely on prompting: feeding the LLM carefully crafted instructions that guide its thought process. CoT, as presented in the paper Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (https://arxiv.org/abs/2201.11903), focuses on eliciting a chain of intermediate reasoning steps before the final answer, mimicking human thinking. Imagine prompting the model with: “I want to bake a cake. First, I need flour. Where can I find some?” Rather than jumping straight to a conclusion, the model spells out its steps: check the pantry first and, if there is no flour, buy some at the store. Each step builds on the previous one, forming a logical chain of actions and decisions.
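The CoT idea described above can be sketched as a small prompt builder. This is a minimal illustration, not the paper's code: the names `Exemplar`, `build_cot_prompt`, and `extract_answer`, the "Q:/A:" layout, and the "The answer is" marker are all assumptions chosen for this example, and the arithmetic exemplar is made up. No model is called; the sketch only shows how worked reasoning is placed in the prompt and how a final answer might be read back.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

ANSWER_MARKER = "The answer is "  # hypothetical convention for marking the final answer


@dataclass
class Exemplar:
    question: str
    reasoning: str  # worked intermediate steps shown to the model
    answer: str


def build_cot_prompt(exemplars: Sequence[Exemplar], question: str) -> str:
    """Assemble a few-shot prompt whose examples include their reasoning."""
    parts = [
        f"Q: {ex.question}\nA: {ex.reasoning} {ANSWER_MARKER}{ex.answer}."
        for ex in exemplars
    ]
    # Leave the last answer open so the model continues with its own steps.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


def extract_answer(completion: str) -> Optional[str]:
    """Return the text after the last answer marker, or None if it is absent."""
    idx = completion.rfind(ANSWER_MARKER)
    if idx == -1:
        return None
    return completion[idx + len(ANSWER_MARKER):].strip().rstrip(".")


if __name__ == "__main__":
    demo = Exemplar(
        question="A recipe needs 2 cups of flour per cake. How much for 3 cakes?",
        reasoning="Each cake needs 2 cups. 3 cakes need 3 * 2 = 6 cups.",
        answer="6 cups",
    )
    print(build_cot_prompt([demo], "How much flour is needed for 5 cakes?"))
```

The resulting string would be sent to whichever model is in use; because the exemplar shows its reasoning, the model is nudged to write out its own steps before answering.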

ReAct takes things a step further, integrating action into the reasoning loop. Think of it as a dynamic dance between thought and action. The LLM not only reasons about the situation but also interacts with the world, fetching information or taking concrete steps, and then updates its reasoning based on the results. It’s like the model simultaneously planning a trip and checking maps to adjust the route if it hits a roadblock.
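The reason-act-observe cycle described above can be sketched as a short loop. This is a simplified illustration under stated assumptions, not the paper's implementation: `react_loop`, `scripted_llm`, and `fake_search` are hypothetical names, the model is replaced by a scripted stand-in, and the search tool is a dictionary lookup rather than a real Wikipedia API call. The "Thought / Action / Observation" text layout and the `finish[...]` action are loosely modeled on the format used in the ReAct paper.

```python
import re
from typing import Callable, Dict

# Matches lines such as: Action: search[capital of France]
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")


def react_loop(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Alternate model-written Thought/Action steps with tool Observations."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # expected shape: "Thought: ...\nAction: tool[input]"
        transcript += step + "\n"
        match = ACTION_RE.search(step)
        if match is None:
            break  # no parsable action, so stop
        name, arg = match.group(1), match.group(2)
        if name == "finish":
            return arg  # the model declared its final answer
        tool = tools.get(name)
        observation = tool(arg) if tool else f"Unknown tool: {name}"
        # Feed the result back so the next thought can build on it.
        transcript += f"Observation: {observation}\n"
    return "No final answer produced"


def scripted_llm(transcript: str) -> str:
    """Stand-in for a real model: chooses the next step from the transcript."""
    if "Observation:" not in transcript:
        return "Thought: I should look this up.\nAction: search[capital of France]"
    return "Thought: The observation answers the question.\nAction: finish[Paris]"


def fake_search(query: str) -> str:
    """Stand-in for an external knowledge source such as a Wikipedia lookup."""
    facts = {"capital of France": "Paris is the capital of France."}
    return facts.get(query, "No results.")


if __name__ == "__main__":
    print(react_loop("What is the capital of France?", scripted_llm, {"search": fake_search}))
```

In a real system, `llm` would wrap a call to an actual model and `tools` would map action names to real lookups; the step budget guards against loops that never reach a final answer.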

This powerful synergy between reasoning and action unlocks a new realm of possibilities for LLMs. CoT and ReAct tackle challenges like error propagation (jumping to the wrong conclusions based on faulty assumptions) by allowing the model to trace its logic and correct course. They also improve transparency, making the LLM’s thought process clear and understandable.

In other words, large language models (LLMs) are like brilliant linguists, adept at understanding and generating text. But when it comes to real-world tasks demanding reasoning and action, they often stumble. Here’s where techniques like CoT and ReAct enter the scene, transforming LLMs into reasoning powerhouses.

Imagine an LLM helping diagnose a complex disease. CoT could guide it through a logical chain of symptoms and examinations, while ReAct could prompt it to consult medical databases or run simulations. This not only leads to more accurate diagnoses but also enables doctors to understand the LLM’s reasoning, fostering trust and collaboration.

Applications like these are what drive us to keep building and investing in this technology. Before we dive deep into the patterns needed to generate business value with generative AI, let’s take a step back and look at some initial concepts.
